Sample records for pseudogene structural analysis

  1. Processed pseudogenes of human endogenous retroviruses generated by LINEs: their integration, stability, and distribution.

    PubMed

    Pavlícek, Adam; Paces, Jan; Elleder, Daniel; Hejnar, Jirí

    2002-03-01

    We report here the presence of numerous processed pseudogenes derived from the W family of endogenous retroviruses in the human genome. These pseudogenes are structurally colinear with the retroviral mRNA followed by a poly(A) tail. Our analysis of insertion sites of HERV-W processed pseudogenes shows a strong preference for the insertion motif of long interspersed nuclear element (LINE) retrotransposons. The genomic distribution, stability during evolution, and frequent truncations at the 5' end resemble those of the pseudogenes generated by LINEs. We therefore suggest that HERV-W processed pseudogenes arose by multiple and independent LINE-mediated retrotransposition of retroviral mRNA. These data document that the majority of HERV-W copies are actually nontranscribed promoterless pseudogenes. The current search for HERV-Ws associated with several human diseases should concentrate on a small subset of transcriptionally competent elements.
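
    A rough illustration of the structural hallmark named in this record (a sequence colinear with the mRNA, followed by a poly(A) tail) is sketched below. It only measures the run of adenines at the 3' end of a sequence. The function names and the 10-nucleotide threshold are assumptions chosen for illustration and are not taken from the cited study.

```python
def trailing_poly_a_length(seq: str) -> int:
    """Return the number of consecutive A residues at the 3' end of seq."""
    s = seq.upper()
    # rstrip removes only the trailing run of 'A', so the length
    # difference equals the length of that run.
    return len(s) - len(s.rstrip("A"))


def has_poly_a_tail(seq: str, min_len: int = 10) -> bool:
    """Crude check: does seq end in at least min_len adenines?

    min_len is a hypothetical threshold, not a published criterion.
    """
    return trailing_poly_a_length(seq) >= min_len
```

    A real screen would also have to tolerate mismatches within the tail and check colinearity with the source mRNA, which this sketch does not attempt.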

  2. Differentially-Expressed Pseudogenes in HIV-1 Infection.

    PubMed

    Gupta, Aditi; Brown, C Titus; Zheng, Yong-Hui; Adami, Christoph

    2015-09-29

    Not all pseudogenes are transcriptionally silent as previously thought. Pseudogene transcripts, although not translated, contribute to the non-coding RNA pool of the cell that regulates the expression of other genes. Pseudogene transcripts can also directly compete with the parent gene transcripts for mRNA stability and other cell factors, modulating their expression levels. Tissue-specific and cancer-specific differential expression of these "functional" pseudogenes has been reported. To ascertain potential pseudogene:gene interactions in HIV-1 infection, we analyzed transcriptomes from infected and uninfected T-cells and found that 21 pseudogenes are differentially expressed in HIV-1 infection. This is interesting because parent genes of one-third of these differentially-expressed pseudogenes are implicated in HIV-1 life cycle, and parent genes of half of these pseudogenes are involved in different viral infections. Our bioinformatics analysis identifies candidate pseudogene:gene interactions that may be of significance in HIV-1 infection. Experimental validation of these interactions would establish that retroviruses exploit this newly-discovered layer of host gene expression regulation for their own benefit.

  3. Mobile genetic element proliferation and gene inactivation impact over the genome structure and metabolic capabilities of Sodalis glossinidius, the secondary endosymbiont of tsetse flies

    PubMed Central

    2010-01-01

    Background Genome reduction is a common evolutionary process in symbiotic and pathogenic bacteria. This process has been extensively characterized in bacterial endosymbionts of insects, where primary mutualistic bacteria represent the most extreme cases of genome reduction as a consequence of a massive process of gene inactivation and loss during their evolution from free-living ancestors. Sodalis glossinidius, the secondary endosymbiont of tsetse flies, contains one of the few complete genomes of bacteria at the very beginning of the symbiotic association, making it possible to evaluate the relative impact of mobile genetic element proliferation and gene inactivation on the structure and functional capabilities of this bacterial endosymbiont during the transition to a host-dependent lifestyle. Results A detailed characterization of mobile genetic elements and pseudogenes reveals a massive presence of different types of prophage elements together with five different families of IS elements that have proliferated across the genome of Sodalis glossinidius at different levels. In addition, a detailed survey of intergenic regions allowed the characterization of 1501 pseudogenes, a much higher number than the 972 pseudogenes described in the original annotation. Pseudogene structure reveals a minor impact of mobile genetic element proliferation in the process of gene inactivation, with most pseudogenes originating from multiple frameshift mutations and premature stop codons. The comparison of metabolic profiles of Sodalis glossinidius and the tsetse fly primary endosymbiont Wigglesworthia glossinidia based on their whole gene and pseudogene repertoires revealed a novel case of pathway inactivation, the arginine biosynthesis, in Sodalis glossinidius together with a possible case of metabolic complementation with Wigglesworthia glossinidia for thiamine biosynthesis. Conclusions The complete re-analysis of the genome sequence of Sodalis glossinidius reveals novel insights into the evolutionary transition from a free-living ancestor to a host-dependent lifestyle, with a massive proliferation of mobile genetic elements mainly of phage origin, although with minor impact on the process of gene inactivation that is taking place in this bacterial genome. The metabolic analysis of the whole endosymbiotic consortia of tsetse flies has revealed a possible phenomenon of metabolic complementation between primary and secondary endosymbionts that can help explain the co-existence of both bacterial endosymbionts in the context of the tsetse host. PMID:20649993
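
    The two inactivating lesions named in this record, frameshift mutations and premature stop codons, can be illustrated with a minimal sketch. It assumes the input is a coding sequence read in frame from its first base and uses the three stop codons of the standard and bacterial genetic codes. The function names are hypothetical, and this is not the annotation procedure used by the authors.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_premature_stops(cds: str) -> list[int]:
    """Return 0-based nucleotide offsets of in-frame stop codons
    that occur before the last complete codon of cds."""
    seq = cds.upper()
    n_codons = len(seq) // 3
    # The last complete codon is skipped: a stop there is the
    # normal terminator, not a premature one.
    return [
        i * 3
        for i in range(n_codons - 1)
        if seq[i * 3 : i * 3 + 3] in STOP_CODONS
    ]


def length_suggests_frameshift(cds: str) -> bool:
    """A length not divisible by three is a crude hint that an
    insertion or deletion has disrupted the reading frame."""
    return len(cds) % 3 != 0
```

    For example, `find_premature_stops("ATGAAATAAGGGTGA")` reports the internal TAA at offset 6. A length check alone misses compensating indels, so in practice an alignment against an intact ortholog would be needed.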

  4. Differentially-Expressed Pseudogenes in HIV-1 Infection

    PubMed Central

    Gupta, Aditi; Brown, C. Titus; Zheng, Yong-Hui; Adami, Christoph

    2015-01-01

    Not all pseudogenes are transcriptionally silent as previously thought. Pseudogene transcripts, although not translated, contribute to the non-coding RNA pool of the cell that regulates the expression of other genes. Pseudogene transcripts can also directly compete with the parent gene transcripts for mRNA stability and other cell factors, modulating their expression levels. Tissue-specific and cancer-specific differential expression of these “functional” pseudogenes has been reported. To ascertain potential pseudogene:gene interactions in HIV-1 infection, we analyzed transcriptomes from infected and uninfected T-cells and found that 21 pseudogenes are differentially expressed in HIV-1 infection. This is interesting because parent genes of one-third of these differentially-expressed pseudogenes are implicated in HIV-1 life cycle, and parent genes of half of these pseudogenes are involved in different viral infections. Our bioinformatics analysis identifies candidate pseudogene:gene interactions that may be of significance in HIV-1 infection. Experimental validation of these interactions would establish that retroviruses exploit this newly-discovered layer of host gene expression regulation for their own benefit. PMID:26426037

  5. Structural characterization and chromosomal location of the mouse macrophage migration inhibitory factor gene and pseudogenes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bozza, M.; Gerard, C.; Kolakowski, L.F. Jr.

    1995-06-10

    Macrophage migration inhibitory factor, MIF, is a cytokine released by T-lymphocytes, macrophages, and the pituitary gland that serves to integrate peripheral and central inflammatory responses. Ubiquitous expression and developmental regulation suggest that MIF may have additional roles outside of the immune system. Here we report the structure and chromosomal location of the mouse Mif gene and the partial characterization of five Mif pseudogenes. The mouse Mif gene spans less than 0.7 kb of chromosomal DNA and is composed of three exons. A comparison between the mouse and the human genes shows a similar gene structure and common regulatory elements in both promoter regions. The mouse Mif gene maps to the middle region of chromosome 10, between Bcr and S100b, which have been mapped to human chromosomes 22q11 and 21q22.3, respectively. The entire sequence of two pseudogenes demonstrates the absence of introns, the presence of the 5' untranslated region of the cDNA, a 3' poly(A) tail, and the lack of sequence similarity with untranscribed regions of the gene. The five pseudogenes are highly homologous to the cDNA, but contain a variable number of mutations that would produce mutated or truncated MIF-like proteins. Phylogenetic analyses of MIF genes and pseudogenes indicate several independent genetic events that can account for multiple genomic integrations. Three of the Mif pseudogenes were also mapped by interspecific backcross to chromosomes 1, 9, and 17. These results suggest that Mif pseudogenes originated by retrotransposition. 46 refs., 5 figs., 1 tab.

  6. A nuclear ribosomal DNA pseudogene in triatomines opens a new research field of fundamental and applied implications in Chagas disease.

    PubMed

    Zuriaga, María Angeles; Mas-Coma, Santiago; Bargues, María Dolores

    2015-05-01

    A pseudogene, designated as "ps(5.8S+ITS-2)", paralogous to the 5.8S gene and internal transcribed spacer (ITS)-2 of the nuclear ribosomal DNA (rDNA), has been recently found in many triatomine species distributed throughout North America, Central America and northern South America. Among characteristics used as criteria for pseudogene verification, secondary structures and free energy are highlighted, showing a lower fit between minimum free energy, partition function and centroid structures, although in given cases the fit only appeared to be slightly lower. The unique characteristics of "ps(5.8S+ITS-2)" as a processed or retrotransposed pseudogenic unit of the ghost type are reviewed, with emphasis on its potential functionality compared to the functionality of genes and spacers of the normal rDNA operon. Besides the technical problem of the risk for erroneous sequence results, the usefulness of "ps(5.8S+ITS-2)" for specimen classification, phylogenetic analyses and systematic/taxonomic studies should be highlighted, based on consistency and retention index values, which in pseudogenic sequence trees were higher than in functional sequence trees. Additionally, intraindividual, interpopulational and interspecific differences in pseudogene amount and the fact that it is a pseudogene in the nuclear rDNA suggest a potential relationship with fitness, behaviour and adaptability of triatomine vectors and consequently its potential utility in Chagas disease epidemiology and control.

  7. Identification and analysis of unitary pseudogenes: historic and contemporary gene losses in humans and other primates

    PubMed Central

    2010-01-01

    Background Unitary pseudogenes are a class of unprocessed pseudogenes without functioning counterparts in the genome. They constitute only a small fraction of annotated pseudogenes in the human genome. However, as they represent distinct functional losses over time, they shed light on the unique features of humans in primate evolution. Results We have developed a pipeline to detect human unitary pseudogenes through analyzing the global inventory of orthologs between the human genome and its mammalian relatives. We focus on gene losses along the human lineage after the divergence from rodents about 75 million years ago. In total, we identify 76 unitary pseudogenes, including previously annotated ones, and many novel ones. By comparing each of these to its functioning ortholog in other mammals, we can approximately date the creation of each unitary pseudogene (that is, the gene 'death date') and show that for our group of 76, the functional genes appear to be disabled at a fairly uniform rate throughout primate evolution - not all at once, correlated, for instance, with the 'Alu burst'. Furthermore, we identify 11 unitary pseudogenes that are polymorphic - that is, they have both nonfunctional and functional alleles currently segregating in the human population. Comparing them with their orthologs in other primates, we find that two of them are in fact pseudogenes in non-human primates, suggesting that they represent cases of a gene being resurrected in the human lineage. Conclusions This analysis of unitary pseudogenes provides insights into the evolutionary constraints faced by different organisms and the timescales of functional gene loss in humans. PMID:20210993

  8. PMS2 gene mutational analysis: direct cDNA sequencing to circumvent pseudogene interference.

    PubMed

    Wimmer, Katharina; Wernstedt, Annekatrin

    2014-01-01

    The presence of highly homologous pseudocopies can compromise the mutation analysis of a gene of interest. In particular, when using PCR-based strategies, pseudogene co-amplification has to be effectively prevented. This is often achieved by using primers designed to be parental gene specific according to the reference sequence and by applying stringent PCR conditions. However, there are cases in which this approach is of limited utility. For example, it has been shown that the PMS2 gene exchanges sequences with one of its pseudogenes, named PMS2CL. This results in functional PMS2 alleles containing pseudogene-derived sequences at their 3'-end and in nonfunctional PMS2CL pseudogene alleles that contain gene-derived sequences. Hence, the paralogues cannot be distinguished according to the reference sequence. This shortcoming can be effectively circumvented by using direct cDNA sequencing. This approach is based on the selective amplification of PMS2 transcripts in two overlapping 1.6-kb RT-PCR products. In addition to avoiding pseudogene co-amplification and allele dropout, this method also has the advantage of effectively identifying deletions, splice mutations, and de novo retrotransposon insertions that escape detection by most DNA-based mutation analysis protocols.

  9. Using secondary structure to identify ribosomal numts: cautionary examples from the human genome.

    PubMed

    Olson, Link E; Yoder, Anne D

    2002-01-01

    The identification of inadvertently sequenced mitochondrial pseudogenes (numts) is critical to any study employing mitochondrial DNA sequence data. Failure to discriminate numts correctly can confound phylogenetic reconstruction and studies of molecular evolution. This is especially problematic for ribosomal mtDNA genes. Unlike protein-coding loci, whose pseudogenes tend to accumulate diagnostic frameshift or premature stop mutations, functional ribosomal genes are not constrained to maintain a reading frame and can accumulate insertion-deletion events of varying length, particularly in nonpairing regions. Several authors have advocated using structural features of the transcribed rRNA molecule to differentiate functional mitochondrial rRNA genes from their nuclear paralogs. We explored this approach using the mitochondrial 12S rRNA gene and three known 12S numts from the human genome in the context of anthropoid phylogeny and the inferred secondary structure of primate 12S rRNA. Contrary to expectation, each of the three human numts exhibits striking concordance with secondary structure models, with little, if any, indication of their pseudogene status, and would likely escape detection based on structural criteria alone. Furthermore, we show that the unwitting inclusion of a particularly ancient (18-25 Myr old) and surprisingly cryptic human numt in a phylogenetic analysis would yield a well-supported but dramatically incorrect conclusion regarding anthropoid relationships. Though we endorse the use of secondary structure models for inferring positional homology wholeheartedly, we caution against reliance on structural criteria for the discrimination of rRNA numts, given the potential fallibility of this approach.

  10. Extensive 5.8S nrDNA polymorphism in Mammillaria (Cactaceae) with special reference to the identification of pseudogenic internal transcribed spacer regions.

    PubMed

    Harpke, Doerte; Peterson, Angela

    2008-05-01

    The internal transcribed spacer (ITS) region (ITS1, 5.8S rDNA, ITS2) represents the most widely applied nuclear marker in eukaryotic phylogenetics. Although this region has been assumed to evolve in concert, the number of investigations revealing high degrees of intra-individual polymorphism connected with the presence of pseudogenes has risen. The 5.8S rDNA is the most important diagnostic marker for functionality of the ITS region. In Mammillaria, intra-individual 5.8S rDNA polymorphisms of up to 36% and up to nine different types have been found. Twenty-eight of 30 cloned genomic Mammillaria sequences were identified as putative pseudogenes. For the identification of pseudogenic ITS regions, in addition to formal tests based on substitution rates, we attempted to focus on functional features of the 5.8S rDNA (5.8S motif, secondary structure). The importance of functional data for the identification of pseudogenes is outlined and discussed. The identification of pseudogenes is essential, because they may cause erroneous phylogenies and taxonomic problems.

  11. Nuclear rDNA pseudogenes in Chagas disease vectors: evolutionary implications of a new 5.8S+ITS-2 paralogous sequence marker in triatomines of North, Central and northern South America.

    PubMed

    Bargues, M Dolores; Zuriaga, M Angeles; Mas-Coma, Santiago

    2014-01-01

    A pseudogene, paralogous to rDNA 5.8S and ITS-2, is described in Meccus dimidiata dimidiata, M. d. capitata, M. d. maculipennis, M. d. hegneri, M. sp. aff. dimidiata, M. p. phyllosoma, M. p. longipennis, M. p. pallidipennis, M. p. picturata, M. p. mazzottii, Triatoma mexicana, Triatoma nitida and Triatoma sanguisuga, covering North America, Central America and northern South America. Such a nuclear rDNA pseudogene is very rare. In the 5.8S gene, criteria for pseudogene identification included length variability, lower GC content, mutations regarding the functional uniform sequence, and relatively high base substitutions in evolutionary conserved sites. At ITS-2 level, criteria were the shorter sequence and large proportion of insertions and deletions (indels). Pseudogenic 5.8S and ITS-2 secondary structures were different from the functional foldings and different from one another, showing less negative values for minimum free energy (mfe) and centroid predictions, and lower fit between mfe, partition function, and centroid structures. A complete characterization indicated a processed pseudogenic unit of the ghost type, escaping from rDNA concerted evolution and with functionality subject to constraints instead of evolving free by neutral drift. Despite a high indel number, low mutation number and an evolutionary rate similar to the functional ITS-2, that pseudogene distinguishes different taxa and furnishes coherent phylogenetic topologies with resolution similar to the functional ITS-2. The discovery of a pseudogene in many phylogenetically related species is unique in animals and allowed for an estimation of its palaeobiogeographical origin based on molecular clock data, inheritance pathways, evolutionary rate and pattern, and geographical spread. In addition to the technical risk to be considered henceforth, this relict pseudogene, designated as "ps(5.8S+ITS-2)", proves to be a valuable marker for specimen classification, phylogenetic analyses, and systematic/taxonomic studies. It opens a new research field, Chagas disease epidemiology and control included, given its potential relationships with triatomine fitness, behaviour and adaptability.
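
    One of the 5.8S criteria listed in this record, lower GC content, is simple to compute. The sketch below returns the GC fraction of a sequence and compares a candidate against a functional reference. The function names are hypothetical, ambiguous bases such as N are ignored by assumption, and the record itself gives no numeric cutoff, so none is applied here.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among unambiguous bases (A, C, G, T).

    Returns 0.0 when the sequence contains no unambiguous bases.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def has_lower_gc(candidate: str, functional: str) -> bool:
    """True if the candidate sequence has a lower GC fraction
    than the functional reference sequence."""
    return gc_content(candidate) < gc_content(functional)
```

    GC content alone would not identify a pseudogene. In the record it is one criterion among several, alongside length variability, substitutions at conserved sites and secondary-structure fit.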

  12. Molecular analysis of the glucocerebrosidase gene locus

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Winfield, S.L.; Martin, B.M.; Fandino, A.

    1994-09-01

    Gaucher disease is due to a deficiency in the activity of the lysosomal enzyme glucocerebrosidase. Both the functional gene for this enzyme and a pseudogene are located in close proximity on chromosome 1q21. Analysis of the mutations present in patient samples has suggested interaction between the functional gene and the pseudogene in the origin of mutant genotypes. To investigate the involvement of regions flanking the functional gene and pseudogene in the origin of mutations found in Gaucher disease, a YAC clone containing DNA from this locus has been subcloned and characterized. The original YAC containing approximately 360 kb was truncated with the use of fragmentation plasmids to about 85 kb. A lambda library derived from this YAC was screened to obtain clones containing glucocerebrosidase sequences. PCR amplification was used to identify subclones containing 5', central, or 3' sequences of the functional gene or of the pseudogene. Clones spanning the entire distance from the last exon of the functional gene to intron 1 of the pseudogene, the 5' end of the functional gene and 16 kb of 5' flanking region and approximately 15 kb of 3' flanking region of the pseudogene were sequenced. Sequence data from 48 kb of intergenic and flanking regions of the glucocerebrosidase gene and its pseudogene has been generated. A large number of Alu sequences and several simple repeats have been found. Two of these repeats exhibit fragment length polymorphism. There is almost 100% homology between the 3' flanking regions of the functional gene and the pseudogene, extending to about 4 kb past the termination codons. A much lower degree of homology is observed in the 5' flanking region. Patient samples are currently being screened for polymorphisms in these flanking regions.

  13. The HMGA1 Pseudogene 7 Induces miR-483 and miR-675 Upregulation by Activating Egr1 through a ceRNA Mechanism

    PubMed Central

    De Martino, Marco; Azzariti, Amalia; Arra, Claudio; Fusco, Alfredo; Esposito, Francesco

    2017-01-01

    Several studies have established that pseudogene mRNAs can work as competing endogenous RNAs and, when deregulated, play a key role in the onset of human neoplasias. Recently, we have isolated two HMGA1 pseudogenes, HMGA1P6 and HMGA1P7. These pseudogenes have a critical role in cancer progression, acting as micro RNA (miRNA) sponges for HMGA1 and other cancer-related genes. HMGA1 pseudogenes were found overexpressed in several human carcinomas, and their expression levels positively correlate with an advanced cancer stage and a poor prognosis. In order to investigate the molecular alterations following HMGA1 pseudogene 7 overexpression, we carried out miRNA sequencing analysis on HMGA1P7 overexpressing mouse embryonic fibroblasts. Intriguingly, the most upregulated miRNAs were miR-483 and miR-675 that have been described as key regulators in cancer progression. Here, we report that HMGA1P7 upregulates miR-483 and miR-675 through a competing endogenous RNA mechanism with Egr1, a transcriptional factor that positively regulates miR-483 and miR-675 expression. PMID:29149041

  14. Mansonella ozzardi mitogenome and pseudogene characterisation provides new perspectives on filarial parasite systematics and CO-1 barcoding.

    PubMed

    Crainey, James Lee; Marín, Michel Abanto; Silva, Túllio Romão Ribeiro da; de Medeiros, Jansen Fernandes; Pessoa, Felipe Arley Costa; Santos, Yago Vinícius; Vicente, Ana Carolina Paulo; Luz, Sérgio Luiz Bessa

    2018-04-18

    Despite the broad distribution of M. ozzardi in Latin America and the Caribbean, there is still very little DNA sequence data available to study this neglected parasite's epidemiology. Mitochondrial DNA (mtDNA) sequences, especially the cytochrome oxidase (CO1) gene's barcoding region, have been targeted successfully for filarial diagnostics and for epidemiological, ecological and evolutionary studies. MtDNA-based studies can, however, be compromised by unrecognised mitochondrial pseudogenes, such as Numts. Here, we have used shot-gun Illumina-HiSeq sequencing to recover the first complete Mansonella genus mitogenome and to identify several mitochondrial-origin pseudogenes. Mitogenome phylogenetic analysis placed M. ozzardi in the Onchocercidae "ONC5" clade and suggested that Mansonella parasites are more closely related to Wuchereria and Brugia genera parasites than they are to Loa genus parasites. DNA sequence alignments, BLAST searches and conceptual translations have been used to complement phylogenetic analysis showing that M. ozzardi from the Amazon and Caribbean regions are near-identical and that previously reported Peruvian M. ozzardi CO1 reference sequences are probably of pseudogene origin. In addition to adding a much-needed resource to the Mansonella genus's molecular tool-kit and providing evidence that some M. ozzardi CO1 sequence deposits are pseudogenes, our results suggest that all Neotropical M. ozzardi parasites are closely related.

  15. Not so pseudo: the evolutionary history of protein phosphatase 1 regulatory subunit 2 and related pseudogenes

    PubMed Central

    2013-01-01

    Background Pseudogenes are traditionally considered “dead” genes, therefore lacking biological functions. This view has however been challenged during the last decade. This is the case of the Protein phosphatase 1 regulatory subunit 2 (PPP1R2) or inhibitor-2 gene family, for which several incomplete copies exist scattered throughout the genome. Results In this study, the pseudogenization process of PPP1R2 was analyzed. Ten PPP1R2-related pseudogenes (PPP1R2P1-P10), highly similar to PPP1R2, were retrieved from the human genome assembly present in the databases. The phylogenetic analysis of mammalian PPP1R2 and related pseudogenes suggested that PPP1R2P7 and PPP1R2P9 retroposons appeared before the great mammalian radiation, while the remaining pseudogenes are primate-specific and retroposed at different times during Primate evolution. Although considered inactive, four of these pseudogenes seem to be transcribed and possibly possess biological functions. Given the role of PPP1R2 in sperm motility, the presence of these proteins was assessed in human sperm, and two PPP1R2-related proteins were detected, PPP1R2P3 and PPP1R2P9. Signatures of negative and positive selection were also detected in PPP1R2P9, further suggesting a role as a functional protein. Conclusions The results show that contrary to initial observations PPP1R2-related pseudogenes are not simple bystanders of the evolutionary process but may rather be at the origin of genes with novel functions. PMID:24195737

  16. Whole genome sequencing of the fish pathogen Francisella noatunensis subsp. orientalis Toba04 gives novel insights into Francisella evolution and pathogenecity

    PubMed Central

    2012-01-01

    Background Francisella is a genus of gram-negative bacteria highly virulent in fish and humans, in which F. tularensis causes the serious disease tularaemia in humans. Recently Francisella species have been reported to cause mortality in aquaculture species like Atlantic cod and tilapia. We have completed the sequencing and draft assembly of the Francisella noatunensis subsp. orientalis Toba04 strain isolated from farmed tilapia. Compared to other available Francisella genomes, it is most similar to the genome of Francisella philomiragia subsp. philomiragia, a free-living bacterium not virulent to humans. Results The genome is rearranged compared to the available Francisella genomes even though we found no IS-elements in the genome. Nearly 16% of the predicted ORFs are pseudogenes. Computational pathway analysis indicates that a number of the metabolic pathways are disrupted due to pseudogenes. Comparing the novel genome with other available Francisella genomes, we found around 2.5% of unique genes present in Francisella noatunensis subsp. orientalis Toba04 and a list of genes uniquely present in the human-pathogenic Francisella subspecies. Most of these genes might have been transferred from bacterial species through horizontal gene transfer. Comparative analysis between human and fish pathogens also provides insights into genes responsible for pathogenicity. Our analysis of pseudogenes indicates that the evolution of pseudogenes in the Francisella subspecies from tilapia is old, with a large number of pseudogenes having more than one inactivating mutation. Conclusions The fish pathogen has lost non-essential genes some time ago. Evolutionary analysis of the Francisella genomes strongly suggests that human and fish pathogenic Francisella species have evolved independently from free-living metabolically competent Francisella species. These findings will contribute to understanding the evolution of Francisella species and pathogenesis. PMID:23131096

  17. The Sequence and Analysis of Duplication Rich Human Chromosome 16

    DOE R&D Accomplishments Database

    Martin, Joel; Han, Cliff; Gordon, Laurie A.; Terry, Astrid; Prabhakar, Shyam; She, Xinwei; Xie, Gary; Hellsten, Uffe; Man Chan, Yee; Altherr, Michael; Couronne, Olivier; Aerts, Andrea; Bajorek, Eva; Black, Stacey; Blumer, Heather; Branscomb, Elbert; Brown, Nancy C.; Bruno, William J.; Buckingham, Judith M.; Callen, David F.; Campbell, Connie S.; Campbell, Mary L.; Campbell, Evelyn W.; Caoile, Chenier; Challacombe, Jean F.; Chasteen, Leslie A.; Chertkov, Olga; Chi, Han C.; Christensen, Mari; Clark, Lynn M.; Cohn, Judith D.; Denys, Mirian; Detter, John C.; Dickson, Mark; Dimitrijevic-Bussod, Mira; Escobar, Julio; Fawcett, Joseph J.; Flowers, Dave; Fotopulos, Dea; Glavina, Tijana; Gomez, Maria; Gonzales, Eidelyn; Goodstein, David; Goodwin, Lynne A.; Grady, Deborah L.; Grigoriev, Igor; Groza, Matthew; Hammon, Nancy; Hawkins, Trevor; Haydu, Lauren; Hildebrand, Carl E.; Huang, Wayne; Israni, Sanjay; Jett, Jamie; Jewett, Phillip E.; Kadner, Kristen; Kimball, Heather; Kobayashi, Arthur; Krawczyk, Marie-Claude; Leyba, Tina; Longmire, Jonathan L.; Lopez, Frederick; Lou, Yunian; Lowry, Steve; Ludeman, Thom; Mark, Graham A.; Mcmurray, Kimberly L.; Meincke, Linda J.; Morgan, Jenna; Moyzis, Robert K.; Mundt, Mark O.; Munk, A. Christine; Nandkeshwar, Richard D.; Pitluck, Sam; Pollard, Martin; Predki, Paul; Parson-Quintana, Beverly; Ramirez, Lucia; Rash, Sam; Retterer, James; Ricke, Darryl O.; Robinson, Donna L.; Rodriguez, Alex; Salamov, Asaf; Saunders, Elizabeth H.; Scott, Duncan; Shough, Timothy; Stallings, Raymond L.; Stalvey, Malinda; Sutherland, Robert D.; Tapia, Roxanne; Tesmer, Judith G.; Thayer, Nina; Thompson, Linda S.; Tice, Hope; Torney, David C.; Tran-Gyamfi, Mary; Tsai, Ming; Ulanovsky, Levy E.; Ustaszewska, Anna; Vo, Nu; White, P. Scott; Williams, Albert L.; Wills, Patricia L.; Wu, Jung-Rung; Wu, Kevin; Yang, Joan; DeJong, Pieter; Bruce, David; Doggett, Norman; Deaven, Larry; Schmutz, Jeremy; Grimwood, Jane; Richardson, Paul; et al.

    2004-01-01

    We report here the 78,884,754 base pairs of finished human chromosome 16 sequence, representing over 99.9 percent of its euchromatin. Manual annotation revealed 880 protein coding genes confirmed by 1,637 aligned transcripts, 19 tRNA genes, 341 pseudogenes and 3 RNA pseudogenes. These genes include metallothionein, cadherin and iroquois gene families, as well as the disease genes for polycystic kidney disease and acute myelomonocytic leukemia. Several large-scale structural polymorphisms spanning hundreds of kilobasepairs were identified and result in gene content differences across humans. One of the unique features of chromosome 16 is its high level of segmental duplication, ranked among the highest of the human autosomes. While the segmental duplications are enriched in the relatively gene poor pericentromere of the p-arm, some are involved in recent gene duplication and conversion events which are likely to have had an impact on the evolution of primates and human disease susceptibility.

  18. Pseudogene redux with new biological significance.

    PubMed

    Salmena, Leonardo

    2014-01-01

    The study of pseudogenes, originally dismissed as genomic relics of evolutionary selection, has seen a resurgence in scientific literature, in addition to being a peculiar topic of discussion in theological debates. For a long time, pseudogenes have been touted as a beacon of natural selection and a definitive proof of evolution due to the slow mutation rate that differentiated them from their parental genes and ultimately caused their genetic demise as functional genes. It now seems that "creationists" have co-opted some recent reports identifying unheralded biological functions to pseudogenes and other noncoding RNAs as evidence to undermine the existence of evolution and supporting intelligent design. This issue of Methods in Molecular Biology focused on pseudogenes will certainly not end, nor enter this debate; however, scientists who are also genomics and pseudogene enthusiasts will certainly appreciate that many scientists are thinking about these particular genetic elements in new and interesting ways. With this new interest in a biological significance and "non-junk" role for pseudogenes and other noncoding RNAs, new methods and approaches are being developed to unlock the mystery of these ancient artifacts we know as pseudogenes. In this brief introductory chapter we highlight the renewed interest in pseudogenes and review a rationale for intensification of pseudogene-related research.

  19. The repertoire of bitter taste receptor genes in canids.

    PubMed

    Shang, Shuai; Wu, Xiaoyang; Chen, Jun; Zhang, Huanxin; Zhong, Huaming; Wei, Qinguo; Yan, Jiakuo; Li, Haotian; Liu, Guangshuai; Sha, Weilai; Zhang, Honghai

    2017-07-01

    Bitter taste receptors (Tas2rs) play important roles in mammalian defense mechanisms by helping animals detect and avoid toxins in food. Although Tas2r genes have been widely studied in several mammals, minimal research has been performed in canids. To analyze the genetic basis of Tas2r genes in canids, we first identified Tas2r genes in the wolf, maned wolf, red fox, corsac fox, Tibetan fox, fennec fox, dhole and African hunting dog. A total of 183 Tas2r genes, consisting of 118 intact genes, 6 partial genes and 59 pseudogenes, were detected. Differences in the pseudogenes were observed among nine canid species. For example, Tas2r4 was a pseudogene in the dog but might play a functional role in other canid species. The Tas2r42 and Tas2r10 genes were pseudogenes in the maned wolf and dhole, respectively, and the Tas2r5 and Tas2r34 genes were pseudogenes in the African hunting dog; however, these genes were intact genes in other canid species. The differences in Tas2r pseudogenes among canids might suggest that the loss of intact Tas2r genes in canid species is species-dependent. We further compared the 183 Tas2r genes identified in this study with Tas2r genes from ten additional carnivorous species to evaluate the potential influence of diet on the evolution of the Tas2r gene repertoire. Phylogenetic analysis revealed that most of the Tas2r genes from the 18 species intermingled across the tree, suggesting that Tas2r genes are conserved among carnivores. Within canids, we found that some Tas2r genes corresponded to the traditional taxonomic groupings, while some did not. PIC analysis showed that the number of Tas2r genes in carnivores exhibited no positive correlation with diet composition, which might be due to the limited number of carnivores included in our study.

  20. Comparative sequence analysis of Mycobacterium leprae and the new leprosy-causing Mycobacterium lepromatosis.

    PubMed

    Han, Xiang Y; Sizer, Kurt C; Thompson, Erika J; Kabanja, Juma; Li, Jun; Hu, Peter; Gómez-Valero, Laura; Silva, Francisco J

    2009-10-01

    Mycobacterium lepromatosis is a newly discovered leprosy-causing organism. Preliminary phylogenetic analysis of its 16S rRNA gene and a few other gene segments revealed significant divergence from Mycobacterium leprae, a well-known cause of leprosy, that justifies the status of M. lepromatosis as a new species. In this study we analyzed the sequences of 20 genes and pseudogenes (22,814 nucleotides). Overall, the level of matching of these sequences with M. leprae sequences was 90.9%, which substantiated the species-level difference; the levels of matching for the 16S rRNA genes and 14 protein-encoding genes were 98.0% and 93.1%, respectively, but the level of matching for five pseudogenes was only 79.1%. Five conserved protein-encoding genes were selected to construct phylogenetic trees and to calculate the numbers of synonymous substitutions (dS values) and nonsynonymous substitutions (dN values) in the two species. Robust phylogenetic trees constructed using concatenated alignment of these genes placed M. lepromatosis and M. leprae in a tight cluster with long terminal branches, implying that the divergence occurred long ago. The dS and dN values were also much higher than those for other closest pairs of mycobacteria. The dS values were 14 to 28% of the dS values for M. leprae and Mycobacterium tuberculosis, a more divergent pair of species. These results thus indicate that M. lepromatosis and M. leprae diverged approximately 10 million years ago. The M. lepromatosis pseudogenes analyzed that were also pseudogenes in M. leprae showed nearly neutral evolution, and their relative ages were similar to those of M. leprae pseudogenes, suggesting that they were pseudogenes before divergence. Taken together, the results described above indicate that M. lepromatosis and M. leprae diverged from a common ancestor after the massive gene inactivation event described previously for M. leprae.
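
    The "level of matching" figures in the abstract above are percent-identity values. As a minimal sketch of how such a figure can be computed from a pairwise alignment, the hypothetical helper `percent_identity` below counts matching columns and skips gapped ones. The gap handling and the toy sequences are assumptions made for illustration; they are not the authors' method or data.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned, non-gap columns at which two sequences match.

    Assumption: columns containing a gap ('-') in either sequence are
    excluded from the comparison. Other conventions exist.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Made-up toy alignments, not real M. leprae / M. lepromatosis sequence.
toy_gene = ("ATGGCCAAGT", "ATGGCTAAGT")
toy_pseudogene = ("ATGGCCAAGT", "ATGACTA-GT")

gene_identity = percent_identity(*toy_gene)
pseudogene_identity = percent_identity(*toy_pseudogene)
print(f"gene: {gene_identity:.1f}%  pseudogene: {pseudogene_identity:.1f}%")
```

    In the toy data the pseudogene pair scores lower than the gene pair, the same direction as the contrast the abstract reports between pseudogenes and protein-encoding genes.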

  1. Structure and content of the major histocompatibility complex (MHC) class I regions of the great anthropoid apes.

    PubMed

    Venditti, C P; Lawlor, D A; Sharma, P; Chorney, M J

    1996-09-01

    The origins of the functional class I genes predated human speciation, a phenomenon known as trans-speciation. The retention of class Ia orthologues within the great apes, however, has not been paralleled by studies designed to examine the pseudogene content, organization, and structure of their class I regions. Therefore, we have begun the systematic characterization of the Old World primate MHCs. The numbers and sizes of fragments harboring class I sequences were similar among the chimpanzee, gorilla, and human genomes tested. Both of the gorillas included in our study possessed genomic fragments carrying orthologues of the recently evolved HLA-H pseudogene identical to those found in the human. The overall megabase restriction fragment patterns of humans and chimpanzees appeared slightly more similar to each other, although the HLA-A subregional megabase variants may have been generated following the emergence of Homo sapiens. Based on the results of this initial study, it is difficult to generate a firm species tree and to determine human's closest evolutionary neighbor. Nevertheless, an analysis of MHC subregional similarities and differences in the hominoid apes may ultimately aid in localizing and identifying MHC haplotype-associated disease genes such as idiopathic hemochromatosis.

  2. Rapid differentiation of Staphylococcus aureus isolates harbouring egc loci with pseudogenes psient1 and psient2 and the selu or seluv gene using PCR-RFLP.

    PubMed

    Collery, Mark M; Smyth, Cyril J

    2007-02-01

    The egc locus of Staphylococcus aureus harbours two enterotoxin genes (seg and sei) and three enterotoxin-like genes (selm, seln and selo). Between the sei and seln genes are located two pseudogenes, psient1 and psient2, or the selu or seluv gene. While these two alternative sei-seln intergenic regions can be distinguished by PCR, to date, DNA sequencing has been the only confirmatory option because of the very high degree of sequence similarity between egc loci bearing the pseudogenes and the selu or seluv gene. In silico restriction enzyme digestion of genomic regions encompassing the egc locus from the 3' end of the sei gene through the 5' first quarter of the seln gene allowed pseudogene- and selu- or seluv-bearing egc loci to be distinguished by PCR-RFLP. Experimental application of these findings demonstrated that endonuclease HindIII cleaved PCR amplimers bearing pseudogenes but not those with a selu or seluv gene, while selu- or seluv-bearing amplimers were susceptible to cleavage by endonuclease HphI, but not by endonuclease HindIII. The restriction enzyme BccI cleaved selu- or seluv-harbouring amplimers at a unique restriction site created by their signature 15 bp insertion compared with pseudogene-bearing amplimers, thereby allowing distinction of these egc loci. PCR-RFLP analysis using these restriction enzymes provides a rapid, easy to interpret alternative to DNA sequencing for verification of PCR findings on the nature of an egc locus type, and can also be used for the primary identification of the intergenic sei-seln egc locus type.
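
    The in silico digestion step described above can be illustrated with a small sketch. It searches a sequence for the HindIII recognition site (AAGCTT, cut after the first A) and reports fragment lengths, which is what distinguishes a cleaved amplimer from an uncleaved one on a gel. The function names and the two toy sequences are hypothetical; they are not egc locus sequence, and a real analysis would normally use an established restriction-analysis tool.

```python
HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence (palindromic)
HINDIII_OFFSET = 1       # HindIII cuts after the first base: A^AGCTT


def cut_positions(seq: str, site: str = HINDIII_SITE,
                  offset: int = HINDIII_OFFSET) -> list[int]:
    """Return 0-based cut coordinates for every occurrence of `site`."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq: str, site: str = HINDIII_SITE,
                     offset: int = HINDIII_OFFSET) -> list[int]:
    """Lengths of the fragments produced by a complete digest of a linear sequence."""
    bounds = [0] + cut_positions(seq, site, offset) + [len(seq)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]


# Made-up toy amplimers: one carries a HindIII site, the other does not.
toy_cut = "GATTACA" + "AAGCTT" + "CCGGTTAA"
toy_uncut = "GATTACA" + "AAGCAT" + "CCGGTTAA"

print(fragment_lengths(toy_cut))    # two fragments
print(fragment_lengths(toy_uncut))  # one intact fragment
```

    Because the HindIII site is palindromic, scanning one strand is sufficient here; a non-palindromic site such as those of HphI or BccI would also need a search on the reverse complement.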

  3. Differentiation of the glucocerebrosidase gene from pseudogene by long-template PCR: Implications for Gaucher disease

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tayebi, N.; Cushner, S.; Sidransky, E.

    1996-09-01

    We describe the use of long-template PCR to differentiate the glucocerebrosidase gene from its pseudogene, which will simplify molecular diagnostic testing and the detection of known and new mutations in patients with Gaucher disease. Gaucher disease results from the inherited deficiency of the lysosomal enzyme, glucocerebrosidase. Sixteen kilobases downstream of the glucocerebrosidase gene is a pseudogene, which is approximately 2 kb shorter and has >96% identity to the coding regions of the functional gene. Many mutations encountered in Gaucher patients are identical to sequences ordinarily found only in the pseudogene, and some result from recombination between the gene and pseudogene. Thus, for diagnostic purposes it is essential to differentiate between sequences from the gene and pseudogene. 9 refs., 1 fig.

  4. Clinical analysis of PMS2: mutation detection and avoidance of pseudogenes.

    PubMed

    Vaughn, Cecily P; Robles, Jorge; Swensen, Jeffrey J; Miller, Christine E; Lyon, Elaine; Mao, Rong; Bayrak-Toydemir, Pinar; Samowitz, Wade S

    2010-05-01

    Germline mutation detection in PMS2, one of four mismatch repair genes associated with Lynch syndrome, is greatly complicated by the presence of numerous pseudogenes. We used a modification of a long-range PCR method to evaluate PMS2 in 145 clinical samples. This modification avoids potential interference from the pseudogene PMS2CL by utilizing a long-range product spanning exons 11-15, with the forward primer anchored in exon 10, an exon not shared by PMS2CL. Large deletions were identified by MLPA. Pathogenic PMS2 mutations were identified in 22 of 59 patients whose tumors showed isolated loss of PMS2 by immunohistochemistry (IHC), the IHC profile most commonly associated with a germline PMS2 mutation. Three additional patients with pathogenic mutations were identified from 53 samples without IHC data. Thirty-seven percent of the identified mutations were large deletions encompassing one or more exons. In 27 patients whose tumors showed absence of either another protein or combination of proteins, no pathogenic mutations were identified. We conclude that modified long-range PCR can be used to preferentially amplify the PMS2 gene and avoid pseudogene interference, thus providing a clinically useful germline analysis of PMS2. Our data also support the use of IHC screening to direct germline testing of PMS2. (c) 2010 Wiley-Liss, Inc.

  5. A mammalian pseudogene lncRNA at the interface of inflammation and anti-inflammatory therapeutics

    PubMed Central

    Rapicavoli, Nicole A; Qu, Kun; Zhang, Jiajing; Mikhail, Megan; Laberge, Remi-Martin; Chang, Howard Y

    2013-01-01

    Pseudogenes are thought to be inactive gene sequences, but recent evidence of extensive pseudogene transcription raised the question of potential function. Here we discover and characterize the sets of mouse lncRNAs induced by inflammatory signaling via TNFα. TNFα regulates hundreds of lncRNAs, including 54 pseudogene lncRNAs, several of which show exquisitely selective expression in response to specific cytokines and microbial components in a NF-κB-dependent manner. Lethe, a pseudogene lncRNA, is selectively induced by proinflammatory cytokines via NF-κB or glucocorticoid receptor agonist, and functions in negative feedback signaling to NF-κB. Lethe interacts with NF-κB subunit RelA to inhibit RelA DNA binding and target gene activation. Lethe level decreases with organismal age, a physiological state associated with increased NF-κB activity. These findings suggest that expression of pseudogene lncRNAs is actively regulated and that these lncRNAs constitute functional regulators of inflammatory signaling. DOI: http://dx.doi.org/10.7554/eLife.00762.001 PMID:23898399

  6. Immunoglobulin Genomics in the Guinea Pig (Cavia porcellus)

    PubMed Central

    Guo, Yongchen; Bao, Yonghua; Meng, Qingwen; Hu, Xiaoxiang; Meng, Qingyong; Ren, Liming; Li, Ning; Zhao, Yaofeng

    2012-01-01

    In science, the guinea pig is known as one of the gold standards for modeling human disease. It is especially important as a molecular and cellular biology model for studying the human immune system, as its immunological genes are more similar to human genes than are those of mice. The utility of the guinea pig as a model organism can be further enhanced by further characterization of the genes encoding components of the immune system. Here, we report the genomic organization of the guinea pig immunoglobulin (Ig) heavy and light chain genes. The guinea pig IgH locus is located in genomic scaffolds 54 and 75, and spans approximately 6,480 kb. 507 VH segments (94 potentially functional genes and 413 pseudogenes), 41 DH segments, six JH segments, four constant region genes (μ, γ, ε, and α), and one reverse δ remnant fragment were identified within the two scaffolds. Many VH pseudogenes were found within the guinea pig, and likely constituted a potential donor pool for gene conversion during evolution. The Igκ locus mapped to a 4,029 kb region of scaffold 37 and 24 is composed of 349 Vκ (111 potentially functional genes and 238 pseudogenes), three Jκ and one Cκ genes. The Igλ locus spans 1,642 kb in scaffold 4 and consists of 142 Vλ (58 potentially functional genes and 84 pseudogenes) and 11 Jλ -Cλ clusters. Phylogenetic analysis suggested the guinea pig’s large germline VH gene segments appear to form limited gene families. Therefore, this species may generate antibody diversity via a gene conversion-like mechanism associated with its pseudogene reserves. PMID:22761756
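
    The segment counts quoted above can be checked and summarized with a few lines of arithmetic. This sketch uses only the numbers stated in the abstract; the dictionary layout and the helper name `pseudogene_fraction` are illustrative choices, not part of the study.

```python
# (potentially functional, pseudogenes, reported total) per V segment class,
# taken from the figures quoted in the abstract.
v_segments = {
    "VH": (94, 413, 507),
    "Vkappa": (111, 238, 349),
    "Vlambda": (58, 84, 142),
}


def pseudogene_fraction(functional: int, pseudo: int) -> float:
    """Share of V segments in a locus that are annotated as pseudogenes."""
    return pseudo / (functional + pseudo)


fractions = {}
for name, (functional, pseudo, total) in v_segments.items():
    # The two categories should add up to the reported total.
    assert functional + pseudo == total, name
    fractions[name] = pseudogene_fraction(functional, pseudo)
    print(f"{name}: {fractions[name]:.1%} pseudogenes")
```

    On these figures the heavy-chain locus has the largest pseudogene share of the three loci, which fits the abstract's suggestion that VH pseudogenes could serve as a donor pool for gene conversion.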

  7. Differences in selection drive olfactory receptor genes in different directions in dogs and wolf.

    PubMed

    Chen, Rui; Irwin, David M; Zhang, Ya-Ping

    2012-11-01

    The olfactory receptor (OR) gene family is the largest gene family found in mammalian genomes. It is known to evolve through a birth-and-death process. Here, we characterized the sequences of 16 segregating OR pseudogenes in the samples of the wolf and the Chinese village dog (CVD) and compared them with the sequences from dogs of different breeds. Our results show that the segregating OR pseudogenes in breed dogs are under strong purifying selection, while evolving neutrally in the CVD, and show a more complicated pattern in the wolf. In the wolf, we found a trend to remove deleterious polymorphisms and accumulate nondeleterious polymorphisms. On the basis of protein structure of the ORs, we found that the distribution of different types of polymorphisms (synonymous, nonsynonymous, tolerated, and untolerated) varied greatly between the wolf and the breed dogs. In summary, our results suggest that different forms of selection have acted on the segregating OR pseudogenes in the CVD since domestication, breed dogs after breed formation, and ancestral wolf population, which has driven the evolution of these genes in different directions.

  8. Immunoglobulin genomics in the guinea pig (Cavia porcellus).

    PubMed

    Guo, Yongchen; Bao, Yonghua; Meng, Qingwen; Hu, Xiaoxiang; Meng, Qingyong; Ren, Liming; Li, Ning; Zhao, Yaofeng

    2012-01-01

    In science, the guinea pig is known as one of the gold standards for modeling human disease. It is especially important as a molecular and cellular biology model for studying the human immune system, as its immunological genes are more similar to human genes than are those of mice. The utility of the guinea pig as a model organism can be further enhanced by further characterization of the genes encoding components of the immune system. Here, we report the genomic organization of the guinea pig immunoglobulin (Ig) heavy and light chain genes. The guinea pig IgH locus is located in genomic scaffolds 54 and 75, and spans approximately 6,480 kb. 507 V(H) segments (94 potentially functional genes and 413 pseudogenes), 41 D(H) segments, six J(H) segments, four constant region genes (μ, γ, ε, and α), and one reverse δ remnant fragment were identified within the two scaffolds. Many V(H) pseudogenes were found within the guinea pig, and likely constituted a potential donor pool for gene conversion during evolution. The Igκ locus mapped to a 4,029 kb region of scaffold 37 and 24 is composed of 349 V(κ) (111 potentially functional genes and 238 pseudogenes), three J(κ) and one C(κ) genes. The Igλ locus spans 1,642 kb in scaffold 4 and consists of 142 V(λ) (58 potentially functional genes and 84 pseudogenes) and 11 J(λ) -C(λ) clusters. Phylogenetic analysis suggested the guinea pig's large germline V(H) gene segments appear to form limited gene families. Therefore, this species may generate antibody diversity via a gene conversion-like mechanism associated with its pseudogene reserves.

  9. The rDNA ITS region in the lessepsian marine angiosperm Halophila stipulacea (Forssk.) Aschers. (Hydrocharitaceae): intragenomic variability and putative pseudogenic sequences.

    PubMed

    Ruggiero, Maria Valeria; Procaccini, Gabriele

    2004-01-01

    Halophila stipulacea is a dioecious marine angiosperm, widely distributed along the western coasts of the Indian Ocean and the Red Sea. This species is thought to be a Lessepsian immigrant that entered the Mediterranean Sea from the Red Sea after the opening of the Suez Canal (1869). Previous studies have revealed both high phenotypic and genetic variability in Halophila stipulacea populations from the western Mediterranean basin. In order to test the hypothesis of a Lessepsian introduction, we compare genetic polymorphism between putative native (Red Sea) and introduced (Mediterranean) populations through rDNA ITS region (ITS1-5.8S-ITS2) sequence analysis. A high degree of intraindividual variability of ITS sequences was found. Most of the intragenomic polymorphism was due to pseudogenic sequences, present in almost all individuals. Features of ITS functional sequences and pseudogenes are described. Possible causes for the lack of homogenization of ITS paralogues within individuals are discussed.

  10. Intron-exon organization of the active human protein S gene PSα and its pseudogene PSβ: Duplication and silencing during primate evolution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ploos van Amstel, H.; Reitsma, P.H.; van der Logt, C.P.

    The human protein S locus on chromosome 3 consists of two protein S genes, PSα and PSβ. Here the authors report the cloning and characterization of both genes. Fifteen exons of the PSα gene were identified that together code for protein S mRNA as derived from the reported protein S cDNAs. Analysis by primer extension of liver protein S mRNA, however, reveals the presence of two mRNA forms that differ in the length of their 5′-noncoding region. Both transcripts contain a 5′-noncoding region longer than found in the protein S cDNAs. The two products may arise from alternative splicing of an additional intron in this region or from the usage of two start sites for transcription. The intron-exon organization of the PSα gene fully supports the hypothesis that the protein S gene is the product of an evolutional assembling process in which gene modules coding for structural/functional protein units also found in other coagulation proteins have been put upstream of the ancestral gene of a steroid hormone binding protein. The PSβ gene is identified as a pseudogene. It contains a large variety of detrimental aberrations, viz., the absence of exon I, a splice site mutation, three stop codons, and a frame shift mutation. Overall the two genes PSα and PSβ show between their exonic sequences 96.5% homology. Southern analysis of primate DNA showed that the duplication of the ancestral protein S gene has occurred after the branching of the orangutan from the African apes. A nonsense mutation that is present in the pseudogene of man also could be identified in one of the two protein S genes of both chimpanzee and gorilla. This implicates that silencing of one of the two protein S genes must have taken place before the divergence of the three African apes.

  11. New Implications on Genomic Adaptation Derived from the Helicobacter pylori Genome Comparison

    PubMed Central

    Lara-Ramírez, Edgar Eduardo; Segura-Cabrera, Aldo; Guo, Xianwu; Yu, Gongxin; García-Pérez, Carlos Armando; Rodríguez-Pérez, Mario A.

    2011-01-01

    Background Helicobacter pylori has a reduced genome and lives in a tough environment for long-term persistence. It evolved with its particular characteristics for biological adaptation. Because several H. pylori genome sequences are available, comparative analysis could help to better understand genomic adaptation of this particular bacterium. Principal Findings We analyzed nine H. pylori genomes with emphasis on microevolution from a different perspective. Inversion was an important factor to shape the genome structure. Illegitimate recombination not only led to genomic inversion but also inverted fragment duplication, both of which contributed to the creation of new genes and gene family, and further, homological recombination contributed to events of inversion. Based on the information of genomic rearrangement, the first genome scaffold structure of H. pylori last common ancestor was produced. The core genome consists of 1186 genes, of which 22 genes could particularly adapt to human stomach niche. H. pylori contains high proportion of pseudogenes whose genesis was principally caused by homopolynucleotide (HPN) mutations. Such mutations are reversible and facilitate the control of gene expression through the change of DNA structure. The reversible mutations and a quasi-panmictic feature could allow such genes or gene fragments frequently transferred within or between populations. Hence, pseudogenes could be a reservoir of adaptation materials and the HPN mutations could be favorable to H. pylori adaptation, leading to HPN accumulation on the genomes, which corresponds to a special feature of Helicobacter species: extremely high HPN composition of genome. Conclusion Our research demonstrated that both genome content and structure of H. pylori have been highly adapted to its particular life style. PMID:21387011

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Popp, R.A.; Lalley, P.A.; Whitney, J.B.

    A genetic polymorphism for a Bgl I endonuclease site near the α-globin-like pseudogene α-4 of C57BL/6 and C3H/HeN mice was used to show that α-4 was not affected by three independent mutations in which the adult globin genes α-1 and α-2 were deleted. These results indicated that α-4 might not be located adjacent to the adult α-globin genes on chromosome 11. Restriction endonuclease analysis of DNA of a primary clone of a Chinese hamster-mouse somatic cell hybrid that had lost mouse chromosomes 11 and 18 showed that this clone lacked the adult murine globin genes α-1 and α-2 but it did contain the α-globin-like pseudogenes α-3 and α-4. These results indicated that the adult α-globin genes and α-globin-like pseudogenes are not located on the same chromosome. Similar analyses of several other Chinese hamster-mouse somatic cell hybrids that had segregated other mouse chromosomes indicated that the α-globin-like pseudogenes α-3 and α-4 are located on mouse chromosomes 15 and 17, respectively. These data explain why α-3 and α-4 were not affected by the three independently induced deletion-type mutations that cause α-thalassemia in the mouse.

  13. A systematic evaluation of expression of HERV-W elements; influence of genomic context, viral structure and orientation

    PubMed Central

    2011-01-01

    Background One member of the W family of human endogenous retroviruses (HERV) appears to have been functionally adopted by the human host. Nevertheless, a highly diversified and regulated transcription from a range of HERV-W elements has been observed in human tissues and cells. Aberrant expression of members of this family has also been associated with human disease such as multiple sclerosis (MS) and schizophrenia. It is not known whether this broad expression of HERV-W elements represents transcriptional leakage or specific transcription initiated from the retroviral promoter in the long terminal repeat (LTR) region. Therefore, potential influences of genomic context, structure and orientation on the expression levels of individual HERV-W elements in normal human tissues were systematically investigated. Results Whereas intronic HERV-W elements with a pseudogene structure exhibited a strong anti-sense orientation bias, intronic elements with a proviral structure and solo LTRs did not. Although a highly variable expression across tissues and elements was observed, systematic effects of context, structure and orientation were also observed. Elements located in intronic regions appeared to be expressed at higher levels than elements located in intergenic regions. Intronic elements with proviral structures were expressed at higher levels than those elements bearing hallmarks of processed pseudogenes or solo LTRs. Relative to their corresponding genes, intronic elements integrated on the sense strand appeared to be transcribed at higher levels than those integrated on the anti-sense strand. Moreover, the expression of proviral elements appeared to be independent from that of their corresponding genes. Conclusions Intronic HERV-W provirus integrations on the sense strand appear to have elicited a weaker negative selection than pseudogene integrations of transcripts from such elements. Our current findings suggest that the previously observed diversified and tissue-specific expression of elements in the HERV-W family is the result of both directed transcription (involving both the LTR and internal sequence) and leaky transcription of HERV-W elements in normal human tissues. PMID:21226900

  14. Isolation of CYP3A5P cDNA from human liver: a reflection of a novel cytochrome P-450 pseudogene.

    PubMed

    Schuetz, J D; Guzelian, P S

    1995-03-14

    We have isolated, from a human liver cDNA library, a 1627 bp CYP3A5 cDNA variant (CYP3A5P) that contains several large insertions, deletions, and in-frame termination codons. By comparison with the genomic structure of other CYP3A genes, the major insertions in CYP3A5P cDNA demarcate the inferred sites of several CYP3A5 exons. The segments inserted in CYP3A5P have no homology with splice donor acceptor sites. It is unlikely that CYP3A5P cDNA represents an artifact of the cloning procedures since Southern blot analysis of human genomic DNA disclosed that CYP3A5P cDNA hybridized with a DNA fragment distinct from fragments that hybridized with either CYP3A5, CYP3A3 or CYP3A4. Moreover, analysis of adult human liver RNA on Northern blots hybridized with a CYP3A5P cDNA fragment revealed the presence of an mRNA with the predicted size of CYP3A5P. We conclude that CYP3A5P cDNA was derived from a separate gene, CYP3A5P, most likely a pseudogene evolved from CYP3A5.

  15. Evolution of a pseudogene: exclusive survival of a functional mitochondrial nad7 gene supports Haplomitrium as the earliest liverwort lineage and proposes a secondary loss of RNA editing in Marchantiidae.

    PubMed

    Groth-Malonek, Milena; Wahrmund, Ute; Polsakiewicz, Monika; Knoop, Volker

    2007-04-01

    Gene transfer from the mitochondrion into the nucleus is a corollary of the endosymbiont hypothesis. The frequent and independent transfer of genes for mitochondrial ribosomal proteins is well documented with many examples in angiosperms, whereas transfer of genes for components of the respiratory chain is a rarity. A notable exception is the nad7 gene, encoding subunit 7 of complex I, in the liverwort Marchantia polymorpha, which resides as a full-length, intron-carrying and transcribed, but nonspliced pseudogene in the chondriome, whereas its functional counterpart is nuclear encoded. To elucidate the patterns of pseudogene degeneration, we have investigated the mitochondrial nad7 locus in 12 other liverworts of broad phylogenetic distribution. We find that the mitochondrial nad7 gene is nonfunctional in 11 of them. However, the modes of pseudogene degeneration vary: whereas point mutations, accompanied by single-nucleotide indels, predominantly introduce stop codons into the reading frame in marchantiid liverworts, larger indels introduce frameshifts in the simple thalloid and leafy jungermanniid taxa. Most notably, however, the mitochondrial nad7 reading frame appears to be intact in the isolated liverwort genus Haplomitrium. Its functional expression is shown by cDNA analysis identifying typical RNA-editing events to reconstitute conserved codon identities and also confirming functional splicing of the 2 liverwort-specific group II introns. We interpret our results 1) to indicate the presence of a functional mitochondrial nad7 gene in the earliest land plants and strongly supporting a basal placement of Haplomitrium among the liverworts, 2) to indicate different modes of pseudogene degeneration and chondriome evolution in the later branching liverwort clades, 3) to suggest a surprisingly long maintenance of a nonfunctional gene in the presumed oldest group of land plants, and 4) to support the model of a secondary loss of RNA-editing activity in marchantiid liverworts.

  16. The molecular dynamics of long noncoding RNA control of transcription in PTEN and its pseudogene

    PubMed Central

    Lister, Nicholas; Shevchenko, Galina; Walshe, James L.; Groen, Jessica; Johnsson, Per; Vidarsdóttir, Linda; Grander, Dan; Ataide, Sandro F.; Morris, Kevin V.

    2017-01-01

    RNA has been found to interact with chromatin and modulate gene transcription. In human cells, little is known about how long noncoding RNAs (lncRNAs) interact with target loci in the context of chromatin. We find here, using the phosphatase and tensin homolog (PTEN) pseudogene as a model system, that antisense lncRNAs interact first with a 5′ UTR-containing promoter-spanning transcript, which is then followed by the recruitment of DNA methyltransferase 3a (DNMT3a), ultimately resulting in the transcriptional and epigenetic control of gene expression. Moreover, we find that the lncRNA and promoter-spanning transcript interaction are based on a combination of structural and sequence components of the antisense lncRNA. These observations suggest, on the basis of this one example, that evolutionary pressures may be placed on RNA structure more so than sequence conservation. Collectively, the observations presented here suggest a much more complex and vibrant RNA regulatory world may be operative in the regulation of gene expression. PMID:28847966

  17. A new CYP21A1P/CYP21A2 chimeric gene identified in an Italian woman suffering from classical congenital adrenal hyperplasia form

    PubMed Central

    Concolino, Paola; Mello, Enrica; Minucci, Angelo; Giardina, Emiliano; Zuppi, Cecilia; Toscano, Vincenzo; Capoluongo, Ettore

    2009-01-01

    Background More than 90% of Congenital Adrenal Hyperplasia (CAH) cases are associated with mutations in the 21-hydroxylase gene (CYP21A2) in the HLA class III area on the short arm of chromosome 6p21.3. In this region, a 30 kb deletion produces a non functional chimeric gene with its 5' and 3' ends corresponding to CYP21A1P pseudogene and CYP21A2, respectively. To date, five different CYP21A1P/CYP21A2 chimeric genes have been found and characterized in recent studies. In this paper, we describe a new CYP21A1P/CYP21A2 chimera (CH-6) found in an Italian CAH patient. Methods Southern blot analysis and CYP21A2 sequencing were performed on the patient. In addition, in order to isolate the new CH-6 chimeric gene, two different strategies were used. Results The CYP21A2 sequencing analysis showed that the patient was homozygote for the g.655C/A>G mutation and heterozygote for the p.P30L missense mutation. In addition, the promoter sequence revealed the presence, in heterozygosis, of 13 SNPs generally produced by microconversion events between gene and pseudogene. Southern blot analysis showed that the woman was heterozygote for the classic 30-kb deletion producing a new CYP21A1P/CYP21A2 chimeric gene (CH-6). The hybrid junction site was located between the end of intron 2 pseudogene, after the g.656C/A>G mutation, and the beginning of exon 3, before the 8 bp deletion. Consequently, CH-6 carries three mutations: the weak pseudogene promoter region, the p.P30L and the g.655C/A>G splice mutation. Conclusion We describe a new CYP21A1P/CYP21A2 chimera (CH-6), associated with the HLA-B15, DR13 haplotype, in a young Italian CAH patient. PMID:19624807

  18. Stem cell regulatory function mediated by expression of a novel mouse Oct4 pseudogene

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lin, Huey; Shabbir, Arsalan; Molnar, Merced

    2007-03-30

    Multiple pseudogenes have been proposed for embryonic stem (ES) cell-specific genes, and their abundance suggests that some of these potential pseudogenes may be functional. ES cell-specific expression of Oct4 regulates stem cell pluripotency and self-renewing state. Although Oct4 expression has been reported in adult tissues during gene reprogramming, the detected Oct4 signal might be contributed by Oct4 pseudogenes. Among the multiple Oct4 transcripts characterized here is a ~1 kb clone derived from P19 embryonal carcinoma stem cells, which shares a ~87% sequence homology with the parent Oct4 gene, and has the potential of encoding an 80-amino acid product (designated as Oct4P1). Adenoviral expression of Oct4P1 in mesenchymal stem cells promotes their proliferation and inhibits their osteochondral differentiation. These dual effects of Oct4P1 are reminiscent of the stem cell regulatory function of the parent Oct4, and suggest that Oct4P1 may be a functional pseudogene or a novel Oct4-related gene with a unique function in stem cells.

  19. Genomic comparison of the closely-related Salmonella enterica serovars enteritidis, dublin and gallinarum

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Matthews, T. David; Schmieder, Robert; Silva, Genivaldo G. Z.

    The Salmonella enterica serovars Enteritidis, Dublin, and Gallinarum are closely related but differ in virulence and host range. To identify the genetic elements responsible for these differences and to better understand how these serovars are evolving, we sequenced the genomes of Enteritidis strain LK5 and Dublin strain SARB12 and compared these genomes to the publicly available Enteritidis P125109, Dublin CT 02021853 and Dublin SD3246 genome sequences. We also compared the publicly available Gallinarum genome sequences from biotype Gallinarum 287/91 and Pullorum RKS5078. Using bioinformatic approaches, we identified single nucleotide polymorphisms, insertions, deletions, and differences in prophage and pseudogene content between strains belonging to the same serovar. Through our analysis we also identified several prophage cargo genes and pseudogenes that affect virulence and may contribute to a host-specific, systemic lifestyle. These results strongly argue that the Enteritidis, Dublin and Gallinarum serovars of Salmonella enterica evolve by acquiring new genes through horizontal gene transfer, followed by the formation of pseudogenes. As a result, the loss of genes necessary for a gastrointestinal lifestyle ultimately leads to a systemic lifestyle and niche exclusion in the host-specific serovars.

  20. Genomic comparison of the closely-related Salmonella enterica serovars enteritidis, dublin and gallinarum

    DOE PAGES

    Matthews, T. David; Schmieder, Robert; Silva, Genivaldo G. Z.; ...

    2015-06-03

    The Salmonella enterica serovars Enteritidis, Dublin, and Gallinarum are closely related but differ in virulence and host range. To identify the genetic elements responsible for these differences and to better understand how these serovars are evolving, we sequenced the genomes of Enteritidis strain LK5 and Dublin strain SARB12 and compared these genomes to the publicly available Enteritidis P125109, Dublin CT 02021853 and Dublin SD3246 genome sequences. We also compared the publicly available Gallinarum genome sequences from biotype Gallinarum 287/91 and Pullorum RKS5078. Using bioinformatic approaches, we identified single nucleotide polymorphisms, insertions, deletions, and differences in prophage and pseudogene content between strains belonging to the same serovar. Through our analysis we also identified several prophage cargo genes and pseudogenes that affect virulence and may contribute to a host-specific, systemic lifestyle. These results strongly argue that the Enteritidis, Dublin and Gallinarum serovars of Salmonella enterica evolve by acquiring new genes through horizontal gene transfer, followed by the formation of pseudogenes. As a result, the loss of genes necessary for a gastrointestinal lifestyle ultimately leads to a systemic lifestyle and niche exclusion in the host-specific serovars.

  1. Genomic Comparison of the Closely-Related Salmonella enterica Serovars Enteritidis, Dublin and Gallinarum

    PubMed Central

    Matthews, T. David; Schmieder, Robert; Silva, Genivaldo G. Z.; Busch, Julia; Cassman, Noriko; Dutilh, Bas E.; Green, Dawn; Matlock, Brian; Heffernan, Brian; Olsen, Gary J.; Farris Hanna, Leigh; Schifferli, Dieter M.; Maloy, Stanley; Dinsdale, Elizabeth A.; Edwards, Robert A.

    2015-01-01

    The Salmonella enterica serovars Enteritidis, Dublin, and Gallinarum are closely related but differ in virulence and host range. To identify the genetic elements responsible for these differences and to better understand how these serovars are evolving, we sequenced the genomes of Enteritidis strain LK5 and Dublin strain SARB12 and compared these genomes to the publicly available Enteritidis P125109, Dublin CT 02021853 and Dublin SD3246 genome sequences. We also compared the publicly available Gallinarum genome sequences from biotype Gallinarum 287/91 and Pullorum RKS5078. Using bioinformatic approaches, we identified single nucleotide polymorphisms, insertions, deletions, and differences in prophage and pseudogene content between strains belonging to the same serovar. Through our analysis we also identified several prophage cargo genes and pseudogenes that affect virulence and may contribute to a host-specific, systemic lifestyle. These results strongly argue that the Enteritidis, Dublin and Gallinarum serovars of Salmonella enterica evolve by acquiring new genes through horizontal gene transfer, followed by the formation of pseudogenes. The loss of genes necessary for a gastrointestinal lifestyle ultimately leads to a systemic lifestyle and niche exclusion in the host-specific serovars. PMID:26039056

  2. Detection of two distinct forms of apoC-I in great apes.

    PubMed

    Puppione, Donald L; Ryan, Christopher M; Bassilian, Sara; Souda, Puneet; Xiao, Xinshu; Ryder, Oliver A; Whitelegge, Julian P

    2010-03-01

    ApoC-I, the smallest of the soluble apolipoproteins, associates with both TG-rich lipoproteins and HDL. Mass spectral analyses of human apoC-I previously had demonstrated that in the circulation there are two forms, either a 57 amino acid protein or a 55 amino acid protein, due to the loss of two amino acids from the N-terminus. In our analyses of the apolipoproteins of the other great apes by mass spectrometry, four forms of apoC-I were detected. Two of these showed a high degree of identity to the mature and truncated forms of human apoC-I. The other two were homologous to the virtual protein and its truncated form that are encoded by a human pseudogene. In humans, the genes for apoC-I and its pseudogene are located on chromosome 19, the pseudogene being 2.5 kb downstream from the apoC-I gene. Based on the similarity between the apoC-I gene and the pseudogene, it has been concluded that the latter arose from the former as a result of gene duplication approximately 35 million years ago. Interestingly, the virtual protein encoded by the pseudogene is acidic, not basic like apoC-I. In the chimpanzee, there also are two genes for apoC-I, the one upstream encodes a basic protein and the downstream gene, rather than being a pseudogene, encodes an acidic protein (P86336). In addition to reporting on the molecular masses of great ape apoC-I, we were able to clearly demonstrate by "Top-down" sequencing that the acidic form arose from a separate gene. In our analyses, we have measured the molecular masses of apoC-I associated with the HDL of the following great apes: bonobo (Pan paniscus), chimpanzee (Pan troglodytes), and the Sumatran orangutan (Pongo abelii). Genomic variations in chromosome 19 among great apes, baboons and macaques as they relate to both genes for apoC-I and the pseudogene are compared and discussed.

  3. Noise-induced multistability in the regulation of cancer by genes and pseudogenes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Petrosyan, K. G., E-mail: pkaren@phys.sinica.edu.tw; Hu, Chin-Kun, E-mail: huck@phys.sinica.edu.tw; National Center for Theoretical Sciences, National Tsing Hua University, Hsinchu 30013, Taiwan

    2016-07-28

    We extend a previously introduced model of stochastic gene regulation of cancer to a nonlinear case having both gene and pseudogene messenger RNAs (mRNAs) self-regulated. The model consists of stochastic Boolean genetic elements and possesses noise-induced multistability (multimodality). We obtain analytical expressions for probabilities for the case of constant but finite number of microRNA molecules which act as a noise source for the competing gene and pseudogene mRNAs. The probability distribution functions display both the global bistability regime as well as even-odd number oscillations for a certain range of model parameters. Statistical characteristics of the mRNA’s level fluctuations are evaluated. The obtained results of the extended model advance our understanding of the process of stochastic gene and pseudogene expressions that is crucial in regulation of cancer.

  4. RNA-based mutation analysis identifies an unusual MSH6 splicing defect and circumvents PMS2 pseudogene interference.

    PubMed

    Etzler, J; Peyrl, A; Zatkova, A; Schildhaus, H-U; Ficek, A; Merkelbach-Bruse, S; Kratz, C P; Attarbaschi, A; Hainfellner, J A; Yao, S; Messiaen, L; Slavc, I; Wimmer, K

    2008-02-01

    Heterozygous germline mutations in one of the mismatch repair (MMR) genes MLH1, MSH2, MSH6, and PMS2 cause hereditary nonpolyposis colorectal cancer (HNPCC) or Lynch syndrome, a dominantly inherited cancer susceptibility syndrome. Recent reports provide evidence for a novel recessively inherited cancer syndrome with constitutive MMR deficiency due to biallelic germline mutations in one of the MMR genes. MMR-deficiency (MMR-D) syndrome is characterized by childhood brain tumors, hematological and/or gastrointestinal malignancies, and signs of neurofibromatosis type 1 (NF1). We established an RNA-based mutation detection assay for the four MMR genes, since 1) a number of splicing defects may escape detection by the analysis of genomic DNA, and 2) DNA-based mutation detection in the PMS2 gene is severely hampered by the presence of multiple highly similar pseudogenes, including PMS2CL. Using this assay, which is based on direct cDNA sequencing of RT-PCR products, we investigated two families with children suspected to suffer from MMR-D syndrome. We identified a homozygous complex MSH6 splicing alteration in the index patients of the first family and a novel homozygous PMS2 mutation (c.182delA) in the index patient of the second family. Furthermore, we demonstrate, by the analysis of a PMS2/PMS2CL "hybrid" allele carrier, that RNA-based PMS2 testing effectively avoids the caveats of genomic DNA amplification approaches; i.e., pseudogene coamplification as well as allelic dropout, and will, thus, allow more sensitive mutation analysis in MMR deficiency and in HNPCC patients with PMS2 defects. (c) 2007 Wiley-Liss, Inc.

  5. Expression of homing endonuclease gene and insertion-like element in sea anemone mitochondrial genomes: Lesson learned from Anemonia viridis.

    PubMed

    Chi, Sylvia Ighem; Urbarova, Ilona; Johansen, Steinar D

    2018-04-30

    The mitochondrial genomes of sea anemones are dynamic in structure. Invasion by genetic elements, such as self-catalytic group I introns or insertion-like sequences, contribute to sea anemone mitochondrial genome expansion and complexity. By using next generation sequencing we investigated the complete mtDNAs and corresponding transcriptomes of the temperate sea anemone Anemonia viridis and its closer tropical relative Anemonia majano. Two versions of fused homing endonuclease gene (HEG) organization were observed among the Actiniidae sea anemones; in-frame gene fusion and pseudo-gene fusion. We provided support for the pseudo-gene fusion organization in Anemonia species, resulting in a repressed HEG from the COI-884 group I intron. orfA, a putative protein-coding gene with insertion-like features, was present in both Anemonia species. Interestingly, orfA and COI expression were significantly up-regulated upon long-term environmental stress corresponding to low seawater pH conditions. This study provides new insights to the dynamics of sea anemone mitochondrial genome structure and function. Copyright © 2018 Elsevier B.V. All rights reserved.

  6. Generation and reactivation of T-cell receptor A joining region pseudogenes in primates

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Thiel, C.; Lanchbury, J.S.; Otting, N.

    1996-06-01

    Tandemly duplicated T-cell receptor (Tcr) AJ (Jα) segments contribute significantly to TCRA chain junctional region diversity in mammals. Since only limited data exists on TCRA diversity in nonhuman primates, we examined the TCRAJ regions of 37 chimpanzee and 71 rhesus macaque TCRA cDNA clones derived from inverse polymerase chain reaction on peripheral blood mononuclear cell cDNA of healthy animals. Twenty-five different TCRAJ regions were characterized in the chimpanzee and 36 in the rhesus macaque. Each bears a close structural relationship to an equivalent human TCRAJ region. Conserved amino acid motifs are shared between all three species. There are indications that differences between nonhuman primates and humans exist in the generation of TCRAJ pseudogenes. The nucleotide and amino acid sequences of the various characterized TCRAJ of each species are reported and we compare our results to the available information on human genomic sequences. Although we provide evidence of dynamic processes modifying TCRAJ segments during primate evolution, their repertoire and primary structure appears to be relatively conserved. 21 refs., 2 figs.

  7. The vestigial olfactory receptor subgenome of odontocete whales: phylogenetic congruence between gene-tree reconciliation and supermatrix methods.

    PubMed

    McGowen, Michael R; Clark, Clay; Gatesy, John

    2008-08-01

    The macroevolutionary transition of whales (cetaceans) from a terrestrial quadruped to an obligate aquatic form involved major changes in sensory abilities. Compared to terrestrial mammals, the olfactory system of baleen whales is dramatically reduced, and in toothed whales is completely absent. We sampled the olfactory receptor (OR) subgenomes of eight cetacean species from four families. A multigene tree of 115 newly characterized OR sequences from these eight species and published data for Bos taurus revealed a diverse array of class II OR paralogues in Cetacea. Evolution of the OR gene superfamily in toothed whales (Odontoceti) featured a multitude of independent pseudogenization events, supporting anatomical evidence that odontocetes have lost their olfactory sense. We explored the phylogenetic utility of OR pseudogenes in Cetacea, concentrating on delphinids (oceanic dolphins), the product of a rapid evolutionary radiation that has been difficult to resolve in previous studies of mitochondrial DNA sequences. Phylogenetic analyses of OR pseudogenes using both gene-tree reconciliation and supermatrix methods yielded fully resolved, consistently supported relationships among members of four delphinid subfamilies. Alternative minimizations of gene duplications, gene duplications plus gene losses, deep coalescence events, and nucleotide substitutions plus indels returned highly congruent phylogenetic hypotheses. Novel DNA sequence data for six single-copy nuclear loci and three mitochondrial genes (> 5000 aligned nucleotides) provided an independent test of the OR trees. Nucleotide substitutions and indels in OR pseudogenes showed a very low degree of homoplasy in comparison to mitochondrial DNA and, on average, provided more variation than single-copy nuclear DNA. Our results suggest that phylogenetic analysis of the large OR superfamily will be effective for resolving relationships within Cetacea whether supermatrix or gene-tree reconciliation procedures are used.

  8. The c.1364C>A (p.A455E) Mutation in the CFTR Pseudogene Results in an Incorrectly Assigned Carrier Status by a Commonly Used Screening Platform.

    PubMed

    Deeb, Kristin K; Metcalf, James D; Sesock, Kaitlin M; Shen, Junqing; Wensel, Christine A; Rippel, Larisa I; Smith, Michelle; Chapman, Mark S; Zhang, Shulin

    2015-07-01

    Cystic fibrosis (CF) is one of the most common recessive conditions among whites, with an estimated carrier frequency of 1 in 25 in the United States. Population-based CF carrier screening was implemented in the United States in 2001. The number of mutations screened by each laboratory may vary; however, the 23 most common CF mutations recommended for screening by the American College of Medical Genetics and American College of Obstetricians and Gynecologists are included in all platforms. The c.1364C>A (p.A455E) mutation located in exon 10 of the CFTR gene is one of the 23 mutations. Because CFTR exon 10 and its flanking intronic regions are duplicated and transposed onto several other chromosomes of the human genome during evolution and function as unprocessed pseudogenes, variations in the CFTR pseudogenes may confound CF screening results for mutations located in exon 10 of the CFTR gene. We report an incorrectly identified carrier status for the c.1364C>A (p.A455E) mutation in a healthy individual using the Hologic InPlex CF assay. Further analysis revealed that the mutation resides in one of the CFTR pseudogenes. Because most commercial kits and laboratory-developed tests for CF carrier screening involve a short amplicon encompassing this mutation, this finding suggests that individuals with the c.1364C>A (p.A455E) mutation may require further investigation to avoid a false assignment of CF carrier status. Copyright © 2015 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  9. An evolutionary insight into the hatching strategies of pipefish and seahorse embryos.

    PubMed

    Kawaguchi, Mari; Nakano, Yuko; Kawahara-Miki, Ryouka; Inokuchi, Mayu; Yorifuji, Makiko; Okubo, Ryohei; Nagasawa, Tatsuki; Hiroi, Junya; Kono, Tomohiro; Kaneko, Toyoji

    2016-03-01

    Syngnathiform fishes carry their eggs in a brood structure found in males. The brood structure differs from species to species: seahorses carry eggs within an enclosed brood pouch, messmate pipefish carry eggs in a semi-brood pouch, and alligator pipefish carry eggs in an egg compartment on the abdomen. These egg protection strategies were established during syngnathiform evolution. In the present study, we compared the hatching mode of protected embryos of three species. Electron microscopic observations revealed that alligator pipefish and messmate pipefish egg envelopes were thicker than those of seahorses, suggesting that the seahorse produces a weaker envelope. Furthermore, molecular genetic analysis revealed that these two pipefishes possessed the egg envelope-digesting enzymes, high choriolytic enzyme (HCE), and low choriolytic enzyme (LCE), as do many euteleosts. In seahorses, however, only HCE gene expression was detected. When searching the entire seahorse genome by high-throughput DNA sequencing, we did not find a functional LCE gene and only a trace of the LCE gene exon was found, confirming that the seahorse LCE gene was pseudogenized during evolution. Finally, we estimated the size and number of hatching gland cells expressing hatching enzyme genes by whole-mount in situ hybridization. The seahorse cells were the smallest of the three species but the most numerous. These results suggest that the isolation of eggs from the external environment by paternal bearing might have made the egg envelope thinner, after which the hatching enzyme genes became pseudogenized. J. Exp. Zool. (Mol. Dev. Evol.) 9999B:XX-XX, 2016. © 2016 Wiley Periodicals, Inc.

  10. Gaucher disease: A G⁺¹→A⁺¹ IVS2 splice donor site mutation causing exon 2 skipping in the acid β-glucosidase mRNA

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    He, Guo-Shun; Grabowski, G.A.

    1992-10-01

    Gaucher disease is the most frequent lysosomal storage disease and the most prevalent Jewish genetic disease. About 30 identified missense mutations are causal to the defective activity of acid β-glucosidase in this disease. cDNAs were characterized from a moderately affected 9-year-old Ashkenazi Jewish Gaucher disease type 1 patient whose 80-year-old, enzyme-deficient, 1226G (Asn³⁷⁰→Ser [N370S]) homozygous grandfather was nearly asymptomatic. Sequence analyses revealed four populations of cDNAs with either the 1226G mutation, an exact exon 2 (ΔEX2) deletion, a deletion of exon 2 and the first 115 bp of exon 3 (ΔEX2-3), or a completely normal sequence. About 50% of the cDNAs were the ΔEX2, the ΔEX2-3, and the normal cDNAs, in a ratio of 6:3:1. Specific amplification and characterization of exon 2 and 5′ and 3′ intronic flanking sequences from the structural gene demonstrated clones with either the normal sequence or with a G⁺¹→A⁺¹ transition at the exon 2/intron 2 boundary. This mutation destroyed the splice donor consensus site (U1 binding site) for mRNA processing. This transition also was present at the corresponding exon/intron boundary of the highly homologous pseudogene. This new mutation, termed “IVS2 G⁺¹,” is the first in the Ashkenazi Jewish population. The occurrence of this “pseudogene”-type mutation in the structural gene indicates the role of acid β-glucosidase pseudogene and structural gene rearrangements in the pathogenesis of this disease. 33 refs., 8 figs., 1 tab.

  11. Complete Genome Sequence of Treponema paraluiscuniculi, Strain Cuniculi A: The Loss of Infectivity to Humans Is Associated with Genome Decay

    PubMed Central

    Šmajs, David; Zobaníková, Marie; Strouhal, Michal; Čejková, Darina; Dugan-Rocha, Shannon; Pospíšilová, Petra; Norris, Steven J.; Albert, Tom; Qin, Xiang; Hallsworth-Pepin, Kym; Buhay, Christian; Muzny, Donna M.; Chen, Lei; Gibbs, Richard A.; Weinstock, George M.

    2011-01-01

    Treponema paraluiscuniculi is the causative agent of rabbit venereal spirochetosis. It is not infectious to humans, although its genome structure is very closely related to other pathogenic Treponema species including Treponema pallidum subspecies pallidum, the etiological agent of syphilis. In this study, the genome sequence of Treponema paraluiscuniculi, strain Cuniculi A, was determined by a combination of several high-throughput sequencing strategies. Whereas the overall size (1,133,390 bp), arrangement, and gene content of the Cuniculi A genome closely resembled those of the T. pallidum genome, the T. paraluiscuniculi genome contained a markedly higher number of pseudogenes and gene fragments (51). In addition to pseudogenes, 33 divergent genes were also found in the T. paraluiscuniculi genome. A set of 32 (out of 84) affected genes encoded proteins of known or predicted function in the Nichols genome. These proteins included virulence factors, gene regulators and components of DNA repair and recombination. The majority (52 or 61.9%) of the Cuniculi A pseudogenes and divergent genes were of unknown function. Our results indicate that T. paraluiscuniculi has evolved from a T. pallidum-like ancestor and adapted to a specialized host-associated niche (rabbits) during loss of infectivity to humans. The genes that are inactivated or altered in T. paraluiscuniculi are candidates for virulence factors important in the infectivity and pathogenesis of T. pallidum subspecies. PMID:21655244

  12. pseudoMap: an innovative and comprehensive resource for identification of siRNA-mediated mechanisms in human transcribed pseudogenes.

    PubMed

    Chan, Wen-Ling; Yang, Wen-Kuang; Huang, Hsien-Da; Chang, Jan-Gowth

    2013-01-01

    RNA interference (RNAi) is a gene silencing process within living cells, which is controlled by the RNA-induced silencing complex in a sequence-specific manner. In flies and mice, the pseudogene transcripts can be processed into short interfering RNAs (siRNAs) that regulate protein-coding genes through the RNAi pathway. Following these findings, we constructed an innovative and comprehensive database to elucidate siRNA-mediated mechanisms in human transcribed pseudogenes (TPGs). To investigate TPGs producing siRNAs that regulate protein-coding genes, we mapped the TPGs to small RNAs (sRNAs) that were supported by publicly available deep sequencing data from various sRNA libraries and constructed the TPG-derived siRNA-target interactions. In addition, we showed that TPGs can act as targets for miRNAs that actually regulate the parental gene. To enable the systematic compilation and updating of these results and additional information, we have developed a database, pseudoMap, capturing various types of information, including sequence data, TPG and cognate annotation, deep sequencing data, RNA-folding structure, gene expression profiles, miRNA annotation and target prediction. To our knowledge, pseudoMap is the first database to demonstrate two mechanisms of human TPGs: encoding siRNAs and decoying miRNAs that target the parental gene. pseudoMap is freely accessible at http://pseudomap.mbc.nctu.edu.tw/.

  13. Uncovering the molecular organization of unusual highly scattered 5S rDNA: The case of Chariesterus armatus (Heteroptera).

    PubMed

    Bardella, Vanessa Bellini; Cabral-de-Mello, Diogo Cavalcanti

    2018-03-10

    One cluster of 5S rDNA per haploid genome is the most common pattern among Heteroptera. However, in Chariesterus armatus, highly scattered signals were noticed. We isolated and characterized the entire 5S rDNA unit of C. armatus, aiming at a deeper knowledge of the molecular organization of the 5S rDNA among Heteroptera and at understanding possible causes and consequences of 5S rDNA chromosomal spreading. For a comparative analysis, we applied the same approach to Holymenia histrio, in which 5S rDNA is restricted to one bivalent. Multiple 5S rDNA variants were observed in both species, though they were more variable in C. armatus, with some of the variants corresponding to pseudogenes. These pseudogenes suggest a birth-and-death mechanism, though homogenization was also observed (concerted evolution), indicating evolution through a mixed model. Association between transposable elements and 5S rDNA was not observed, suggesting spreading of 5S rDNA through other mechanisms, like ectopic recombination. Scattered organization is rare for 5S rDNA, and such organization in the C. armatus genome could have led to the high diversification of sequences, favoring their pseudogenization. Copyright © 2017. Published by Elsevier B.V.

  14. The Chloroplast Genome of Symplocarpus renifolius: A Comparison of Chloroplast Genome Structure in Araceae.

    PubMed

    Choi, Kyoung Su; Park, Kyu Tae; Park, SeonJoo

    2017-11-16

    Symplocarpus renifolius is a member of Araceae family that is extraordinarily diverse in appearance. Previous studies on chloroplast genomes in Araceae were focused on duckweeds (Lemnoideae) and root crops (Colocasia, commonly known as taro). Here, we determined the chloroplast genome of Symplocarpus renifolius and compared the factors, such as genes and inverted repeat (IR) junctions and performed phylogenetic analysis using other Araceae species. The chloroplast genome of S. renifolius is 158,521 bp and includes 113 genes. A comparison among the Araceae chloroplast genomes showed that infA in Lemna, Spirodela, Wolffiella, Wolffia, Dieffenbachia and Colocasia has been lost or has become a pseudogene and has only been retained in Symplocarpus. In the Araceae chloroplast DNA (cpDNA), psbZ is retained. However, psbZ duplication occurred in Wolffia species and tandem repeats were noted around the duplication regions. A comparison of the IR junction in Araceae species revealed the presence of ycf1 and rps15 in the small single copy region, whereas duckweed species contained ycf1 and rps15 in the IR region. The phylogenetic analyses of the chloroplast genomes revealed that Symplocarpus are a basal group and are sister to the other Araceae species. Consequently, infA deletion or pseudogene events in Araceae occurred after the divergence of Symplocarpus and aquatic plants (duckweeds) in Araceae and duplication events of rps15 and ycf1 occurred in the IR region.

  15. Characterization of a novel gene at the Gaucher disease locus spanning the region between the glucocerebrosidase (GC) pseudogene and thrombospondin (TSP)3

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ginns, E.I.; Winfield, S.; Sidransky, E.

    1994-09-01

    The human GC locus on chromosome 1q21 encompasses a 7 kb functional gene encoding the enzyme deficient in Gaucher disease, and a highly homologous sequence 16 kb downstream that has the properties of a pseudogene. A novel gene, gene X, spanning the 6 kb region between the pseudogene and TSP3 has been identified and characterized in the mouse, and appears to be critical for normal embryonic development. As in the mouse, the human gene X is located 5′ to the TSP3 gene and two genes are transcribed divergently from a bidirectional promoter; the direction of transcription of gene X and GC is convergent. However, in the human, gene X and GC are separated by gene X and GC pseudogenes that are the consequence of a gene duplication. The gene X pseudogene lacks the first exon and part of the second exon of the functional gene and may not be transcribed. Northern blot analyses indicate that gene X is transcribed in both normal individuals and in patients with Gaucher disease, but the function of this gene is still unknown. The possibility that mutations in gene X could account for some of the diversity of symptoms encountered in individuals with the more atypical presentations of Gaucher disease is under investigation.

  16. Avoidance of pseudogene interference in the detection of 3' deletions in PMS2.

    PubMed

    Vaughn, Cecily P; Hart, Kimberly J; Samowitz, Wade S; Swensen, Jeffrey J

    2011-09-01

    Lynch syndrome is characterized by mutations in the mismatch repair genes MLH1, MSH2, MSH6, and PMS2. In PMS2, detection of mutations is confounded by numerous pseudogenes. Detection of 3' deletions is particularly complicated by the pseudogene PMS2CL, which has strong similarity to PMS2 exons 9 and 11-15, due to extensive gene conversion. A newly designed multiplex ligation-dependent probe amplification (MLPA) kit incorporates probes for variants found in both PMS2 and PMS2CL. This provides detection of deletions, but does not allow localization of deletions to the gene or pseudogene. To address this, we have developed a methodology incorporating reference samples with known copy numbers of variants, and paired MLPA results with sequencing of PMS2 and PMS2CL. We tested a subset of clinically indicated samples for which mutations were either unidentified or not fully characterized using existing methods. We identified eight unrelated patients with deletions encompassing exons 9-15, 11-15, 13-15, 14-15, and 15. By incorporating specific, characterized reference samples and sequencing the gene and pseudogene it is possible to identify deletions in this region of PMS2 and provide clinically relevant results. This methodology represents a significant advance in the diagnosis of patients with Lynch syndrome caused by PMS2 mutations. © 2011 Wiley-Liss, Inc.

  17. Gene duplication and phylogeography of North American members of the Hart Park serogroup of avian rhabdoviruses.

    PubMed

    Allison, Andrew B; Mead, Daniel G; Palacios, Gustavo F; Tesh, Robert B; Holmes, Edward C

    2014-01-05

    Flanders virus (FLAV) and Hart Park virus (HPV) are rhabdoviruses that circulate in mosquito-bird cycles in the eastern and western United States, respectively, and constitute the only two North American representatives of the Hart Park serogroup. Previously, it was suggested that FLAV is unique among the rhabdoviruses in that it contains two pseudogenes located between the P and M genes, while the cognate sequence for HPV has been lacking. Herein, we demonstrate that FLAV and HPV do not contain pseudogenes in this region, but encode three small functional proteins designated as U1-U3 that apparently arose by gene duplication. To further investigate the U1-U3 region, we conducted the first large-scale evolutionary analysis of a member of the Hart Park serogroup by analyzing over 100 spatially and temporally distinct FLAV isolates. Our phylogeographic analysis demonstrates that although FLAV appears to be slowly evolving, phylogenetically divergent lineages co-circulate sympatrically. © 2013 Published by Elsevier Inc.

  18. Pan-Genomic Analysis Provides Insights into the Genomic Variation and Evolution of Salmonella Paratyphi A

    PubMed Central

    Chen, Chunxia; Cui, Xiaoying; Yu, Jun; Xiao, Jingfa; Kan, Biao

    2012-01-01

    Salmonella Paratyphi A (S. Paratyphi A) is a highly adapted, human-specific pathogen that causes paratyphoid fever. Cases of paratyphoid fever have recently been increasing, and the disease is becoming a major public health concern, especially in Eastern and Southern Asia. To investigate the genomic variation and evolution of S. Paratyphi A, a pan-genomic analysis was performed on five newly sequenced S. Paratyphi A strains and two other reference strains. A whole genome comparison revealed that the seven genomes are collinear and that their organization is highly conserved. The high rate of substitutions in part of the core genome indicates that there are frequent homologous recombination events. Based on the changes in the pan-genome size and cluster number (both in the core functional genes and core pseudogenes), it can be inferred that the sharply increasing number of pseudogene clusters may have strong correlation with the inactivation of functional genes, and indicates that the S. Paratyphi A genome is being degraded. PMID:23028950

  19. Asymmetric histone modifications between the original and derived loci of human segmental duplications

    PubMed Central

    Zheng, Deyou

    2008-01-01

    Background Sequencing and annotation of several mammalian genomes have revealed that segmental duplications are a common architectural feature of primate genomes; in fact, about 5% of the human genome is composed of large blocks of interspersed segmental duplications. These segmental duplications have been implicated in genomic copy-number variation, gene novelty, and various genomic disorders. However, the molecular processes involved in the evolution and regulation of duplicated sequences remain largely unexplored. Results In this study, the profile of about 20 histone modifications within human segmental duplications was characterized using high-resolution, genome-wide data derived from a ChIP-Seq study. The analysis demonstrates that derivative loci of segmental duplications often differ significantly from the original with respect to many histone methylations. Further investigation showed that genes are present three times more frequently in the original than in the derivative, whereas pseudogenes exhibit the opposite trend. These asymmetries tend to increase with the age of segmental duplications. The uneven distribution of genes and pseudogenes does not, however, fully account for the asymmetry in the profile of histone modifications. Conclusion The first systematic analysis of histone modifications between segmental duplications demonstrates that two seemingly 'identical' genomic copies are distinct in their epigenomic properties. Results here suggest that local chromatin environments may be implicated in the discrimination of derived copies of segmental duplications from their originals, leading to a biased pseudogenization of the new duplicates. The data also indicate that further exploration of the interactions between histone modification and sequence degeneration is necessary in order to understand the divergence of duplicated sequences. PMID:18598352

  20. Molecular evolution and diversification of snake toxin genes, revealed by analysis of intron sequences.

    PubMed

    Fujimi, T J; Nakajyo, T; Nishimura, E; Ogura, E; Tsuchiya, T; Tamiya, T

    2003-08-14

    The genes encoding erabutoxin (short chain neurotoxin) isoforms (Ea, Eb, and Ec), LsIII (long chain neurotoxin) and a novel long chain neurotoxin pseudogene were cloned from a Laticauda semifasciata genomic library. Short and long chain neurotoxin genes were also cloned from the genome of Laticauda laticaudata, a closely related species of L. semifasciata, by PCR. A putative matrix attached region (MAR) sequence was found in the intron I of the LsIII gene. Comparative analysis of 11 structurally relevant snake toxin genes (three-finger-structure toxins) revealed the molecular evolution of these toxins. Three-finger-structure toxin genes diverged from a common ancestor through two types of evolutionary pathways (long and short types), early in the course of evolution. At a later stage of evolution in each gene, the accumulation of mutations in the exons, especially exon II, by accelerated evolution may have caused the increased diversification in their functions. It was also revealed that the putative MAR sequence found in the LsIII gene was integrated into the gene after the species-level divergence.

  1. Mycobacterium leprae: genes, pseudogenes and genetic diversity

    PubMed Central

    Singh, Pushpendra; Cole, Stewart T

    2011-01-01

    Leprosy, which has afflicted human populations for millennia, results from infection with Mycobacterium leprae, an unculturable pathogen with an exceptionally long generation time. Considerable insight into the biology and drug resistance of the leprosy bacillus has been obtained from genomics. M. leprae has undergone reductive evolution and pseudogenes now occupy half of its genome. Comparative genomics of four different strains revealed remarkable conservation of the genome (99.995% identity) yet uncovered 215 polymorphic sites, mainly single nucleotide polymorphisms, and a handful of new pseudogenes. Mapping these polymorphisms in a large panel of strains defined 16 single nucleotide polymorphism-subtypes that showed strong geographical associations and helped retrace the evolution of M. leprae. PMID:21162636

  2. Occurrence of mitochondrial CO1 pseudogenes in Neocalanus plumchrus (Crustacea: Copepoda): Hybridization indicated by recombined nuclear mitochondrial pseudogenes

    PubMed Central

    Lin, Ya-Ying

    2017-01-01

    A portion of the mitochondrial cytochrome c oxidase I gene was sequenced using both genomic DNA (gDNA) and complementary DNA (cDNA) from three planktonic copepod Neocalanus species (N. cristatus, N. plumchrus, and N. flemingeri). Small but critical sequence differences in CO1 were observed between gDNA and cDNA from N. plumchrus. Furthermore, careful observation revealed the presence of recombination between sequences in gDNA from N. plumchrus. Moreover, a chimera of the N. cristatus and N. plumchrus sequences was obtained from N. plumchrus gDNA. The observed phenomena can be best explained by the preferential amplification of the nuclear mitochondrial pseudogenes from gDNA of N. plumchrus. Two conclusions can be drawn from the observations. First, nuclear mitochondrial pseudogenes are pervasive in N. plumchrus. Second, a mating between a female N. cristatus and a male N. plumchrus produced viable offspring, which further backcrossed to a N. plumchrus individual. These observations not only demonstrate intriguing mating behavior in these species, but also emphasize the importance of careful interpretation of species marker sequences amplified from gDNA. PMID:28231343

  3. Antigenic variation of Anaplasma marginale msp2 occurs by combinatorial gene conversion.

    PubMed

    Brayton, Kelly A; Palmer, Guy H; Lundgren, Anna; Yi, Jooyoung; Barbet, Anthony F

    2002-03-01

    The rickettsial pathogen Anaplasma marginale establishes lifelong persistent infection in the mammalian reservoir host, during which time immune escape variants continually arise in part because of variation in the expressed copy of the immunodominant outer membrane protein MSP2. A key question is how the small 1.2 Mb A. marginale genome generates sufficient variants to allow long-term persistence in an immunocompetent reservoir host. The recombination of whole pseudogenes into the single msp2 expression site has been previously identified as one method of generating variants, but is inadequate to generate the number of variants required for persistent infection. In the present study, we demonstrate that recombination of a whole pseudogene is followed by a second level of variation in which small segments of pseudogenes recombine into the expression site by gene conversion. Evidence for four short sequential changes in the hypervariable region of msp2 coupled with the identification of nine pseudogenes from a single strain of A. marginale provides for a combinatorial number of possible expressed MSP2 variants sufficient for lifelong persistence.

  4. The complete plastid genome sequence of Eustrephus latifolius (Asparagaceae: Lomandroideae).

    PubMed

    Kim, Hyoung Tae; Kim, Jung Sung; Kim, Joo-Hwan

    2016-01-01

    The complete chloroplast (cp) genome sequence of Eustrephus latifolius was determined for the first time in subfamily Lomandroideae of family Asparagaceae. It was 159,736 bp and contained a large single copy region (82,403 bp) and a small single copy region (13,607 bp), which were separated by two inverted repeat regions (31,863 bp each). In total, 132 genes were identified, consisting of 83 coding genes, 8 rRNA genes, 38 tRNA genes, and 3 pseudogenes. rpl23 and clpP were pseudogenes due to sequence deletions. Among 23 genes containing introns, rps12 and ycf3 contained two introns and the rest had just one intron. The intact ycf68 was identified within an intron of trnI-GAU. The amino acid sequence was almost identical with that of Phoenix dactylifera in Arecales. Ycf1 of E. latifolius was completely located in the IR. This was similar to the cp genome structure of Lemna minor, Spirodela polyrhiza, Wolffiella lingulata, and Wolffia australiana in Alismatales.
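
    The region lengths and gene counts quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes the usual quadripartite plastome layout (one large single copy region, one small single copy region, two inverted repeat copies), and the function name is hypothetical rather than taken from the paper.

```python
# Figures copied from the abstract; the quadripartite layout (LSC + SSC + 2 x IR)
# is an assumption of this sketch, and plastome_length is a hypothetical helper.
LSC_BP = 82_403          # large single copy region
SSC_BP = 13_607          # small single copy region
IR_BP = 31_863           # length of ONE inverted repeat copy
REPORTED_TOTAL_BP = 159_736

GENE_COUNTS = {"coding": 83, "rRNA": 8, "tRNA": 38, "pseudogene": 3}
REPORTED_GENE_TOTAL = 132


def plastome_length(lsc_bp: int, ssc_bp: int, ir_bp: int) -> int:
    """Total length of a quadripartite plastome with two IR copies."""
    return lsc_bp + ssc_bp + 2 * ir_bp


if __name__ == "__main__":
    total_bp = plastome_length(LSC_BP, SSC_BP, IR_BP)
    print(total_bp, total_bp == REPORTED_TOTAL_BP)
    gene_total = sum(GENE_COUNTS.values())
    print(gene_total, gene_total == REPORTED_GENE_TOTAL)
```

    Both checks agree with the reported totals, which also confirms that the 31,863 bp figure refers to a single inverted repeat copy.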

  5. Non-concerted ITS evolution in Mammillaria (Cactaceae).

    PubMed

    Harpke, Doerte; Peterson, Angela

    2006-12-01

    Molecular studies of 21 species of the large Cactaceae genus Mammillaria representing a variety of intrageneric taxonomic levels revealed a high degree of intra-individual polymorphism of the internal transcribed spacer region (ITS1, 5.8S rDNA, ITS2). Only a few of these ITS copies belong to apparently functional genes, whereas most are probably non-functional (pseudogenes). As a multiple gene family, the ITS region is subjected to concerted evolution. However, the high degree of intra-individual polymorphism of up to 36% in ITS1 and up to 35% in ITS2 suggests a non-concerted evolution of these loci in Mammillaria. Conserved angiosperm motifs of ITS1 and ITS2 were compared between genomic and cDNA ITS clones of Mammillaria. Some of these motifs (e.g., ITS1 motif 1, 'TGGT' within ITS2) in combination with the determination of GC-content, length comparisons of the spacers and ITS2 secondary structure (helices II and III) are helpful in the identification of pseudogene rDNA regions.

  6. Sex bias in copy number variation of olfactory receptor gene family depends on ethnicity.

    PubMed

    Shadravan, Farideh

    2013-01-01

    Gender plays a pivotal role in the human genetic identity and is also manifested in many genetic disorders particularly mental retardation. In this study its effect on copy number variation (CNV), known to cause genetic disorders was explored. As the olfactory receptor (OR) repertoire comprises the largest human gene family, it was selected for this study, which was carried out within and between three populations, derived from 150 individuals from the 1000 Genomes Project. Analysis of 3872 CNVs detected among 791 OR loci, in which 307 loci showed CNV, revealed the following novel findings: Sex bias in CNV was significantly more prevalent in uncommon than common CNV variants of OR pseudogenes, in which the male genome showed more CNVs; and in one-copy number loss compared to complete deletion of OR pseudogenes; both findings implying a more recent evolutionary role for gender. Sex bias in copy number gain was also detected. Another novel finding was that the observed sex bias was largely dependent on ethnicity and was in general absent in East Asians. Using a CNV public database for sick children (International Standard Cytogenomic Array Consortium) the application of these findings for improving clinical molecular diagnostics is discussed by showing an example of sex bias in CNV among kids with autism. Additional clinical relevance is discussed, as the most polymorphic CNV-enriched OR cluster in the human genome, located on chr 15q11.2, is found near the Prader-Willi syndrome/Angelman syndrome bi-directionally imprinted region associated with two well-known mental retardation syndromes. As olfaction represents the primitive cognition in most mammals, arguably in competition with the development of a larger brain, the extensive retention of OR pseudogenes in females of this study, might point to a parent-of-origin indirect regulatory role for OR pseudogenes in the embryonic development of human brain. Thus any perturbation in the temporal regulation of olfactory system could lead to developmental delay disorders including mental retardation.

  7. Mapping of aldose reductase gene sequences to human chromosomes 1, 3, 7, 9, 11, and 13

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bateman, J.B.; Kojis, T.; Heinzmann, C.

    1993-09-01

    Aldose reductase (alditol:NAD(P)+ 1-oxidoreductase; EC 1.1.1.21) (AR) catalyzes the reduction of several aldehydes, including that of glucose, to the corresponding sugar alcohol. Using a complementary DNA clone encoding human AR, the authors mapped the gene sequences to human chromosomes 1, 3, 7, 9, 11, 13, 14, and 18 by somatic cell hybridization. By in situ hybridization analysis, sequences were localized to human chromosomes 1q32-q43, 3p12, 7q31-q35, 9q22, 11p14-p15, and 13q14-q21. As a putative functional AR gene has been mapped to chromosome 7 and a putative pseudogene to chromosome 3, the sequences on the other seven chromosomes may represent other active genes, non-aldose reductase homologous sequences, or pseudogenes. 24 refs., 3 figs., 2 tabs.

  8. Rapid evolutionary change of common bean (Phaseolus vulgaris L) plastome, and the genomic diversification of legume chloroplasts

    PubMed Central

    Guo, Xianwu; Castillo-Ramírez, Santiago; González, Víctor; Bustos, Patricia; Luís Fernández-Vázquez, José; Santamaría, Rosa Isela; Arellano, Jesús; Cevallos, Miguel A; Dávila, Guillermo

    2007-01-01

    Background Fabaceae (legumes) is one of the largest families of flowering plants, and some members are important crops. In contrast to what we know about their great diversity or economic importance, our knowledge at the genomic level of chloroplast genomes (cpDNAs or plastomes) for these crops is limited. Results We sequenced the complete genome of the common bean (Phaseolus vulgaris cv. Negro Jamapa) chloroplast. The plastome of P. vulgaris is a 150,285 bp circular molecule. It has gene content similar to that of other legume plastomes, but contains two pseudogenes, rpl33 and rps16. A distinct inversion occurred at the junction points of trnH-GUG/rpl14 and rps19/rps8, as in adzuki bean [1]. These two pseudogenes and the inversion were confirmed in 10 varieties representing the two domestication centers of the bean. Genomic comparative analysis indicated that inversions generally occur in legume plastomes and the magnitude and localization of insertions/deletions (indels) also vary. The analysis of repeat sequences demonstrated that patterns and sequences of tandem repeats had an important impact on sequence diversification between legume plastomes and tandem repeats did not belong to dispersed repeats. Interestingly, P. vulgaris plastome had higher evolutionary rates of change on both genomic and gene levels than G. max, which could be the consequence of pressure from both mutation and natural selection. Conclusion Legume chloroplast genomes are widely diversified in gene content, gene order, indel structure, abundance and localization of repetitive sequences, intracellular sequence exchange and evolutionary rates. The P. vulgaris plastome is a rapidly evolving genome. PMID:17623083

  9. The Chloroplast Genome of Symplocarpus renifolius: A Comparison of Chloroplast Genome Structure in Araceae

    PubMed Central

    Park, Kyu Tae

    2017-01-01

    Symplocarpus renifolius is a member of Araceae family that is extraordinarily diverse in appearance. Previous studies on chloroplast genomes in Araceae were focused on duckweeds (Lemnoideae) and root crops (Colocasia, commonly known as taro). Here, we determined the chloroplast genome of Symplocarpus renifolius and compared the factors, such as genes and inverted repeat (IR) junctions and performed phylogenetic analysis using other Araceae species. The chloroplast genome of S. renifolius is 158,521 bp and includes 113 genes. A comparison among the Araceae chloroplast genomes showed that infA in Lemna, Spirodela, Wolffiella, Wolffia, Dieffenbachia and Colocasia has been lost or has become a pseudogene and has only been retained in Symplocarpus. In the Araceae chloroplast DNA (cpDNA), psbZ is retained. However, psbZ duplication occurred in Wolffia species and tandem repeats were noted around the duplication regions. A comparison of the IR junction in Araceae species revealed the presence of ycf1 and rps15 in the small single copy region, whereas duckweed species contained ycf1 and rps15 in the IR region. The phylogenetic analyses of the chloroplast genomes revealed that Symplocarpus are a basal group and are sister to the other Araceae species. Consequently, infA deletion or pseudogene events in Araceae occurred after the divergence of Symplocarpus and aquatic plants (duckweeds) in Araceae and duplication events of rps15 and ycf1 occurred in the IR region. PMID:29144427

  10. Are Synonymous Substitutions in Flowering Plant Mitochondria Neutral?

    PubMed

    Wynn, Emily L; Christensen, Alan C

    2015-10-01

    Angiosperm mitochondrial genes appear to have very low mutation rates, while non-gene regions expand, diverge, and rearrange quickly. One possible explanation for this disparity is that synonymous substitutions in plant mitochondrial genes are not truly neutral and selection keeps their occurrence low. If this were true, the explanation for the disparity in mutation rates in genes and non-genes needs to consider selection as well as mechanisms of DNA repair. Rps14 is co-transcribed with cob and rpl5 in most plant mitochondrial genomes, but in some genomes, rps14 has been duplicated to the nucleus leaving a pseudogene in the mitochondria. This provides an opportunity to compare neutral substitution rates in pseudogenes with synonymous substitution rates in the orthologs. Genes and pseudogenes of rps14 have been aligned among different species and the mutation rates have been calculated. Neutral substitution rates in pseudogenes and synonymous substitution rates in genes are significantly different, providing evidence that synonymous substitutions in plant mitochondrial genes are not completely neutral. The non-neutrality is not sufficient to completely explain the exceptionally low mutation rates in land plant mitochondrial genomes, but selective forces appear to play a small role.

  11. Genomic gigantism: DNA loss is slow in mountain grasshoppers.

    PubMed

    Bensasson, D; Petrov, D A; Zhang, D X; Hartl, D L; Hewitt, G M

    2001-02-01

    Several studies have shown DNA loss to be inversely correlated with genome size in animals. These studies include a comparison between Drosophila and the cricket, Laupala, but there has been no assessment of DNA loss in insects with very large genomes. Podisma pedestris, the brown mountain grasshopper, has a genome over 100 times as large as that of Drosophila and 10 times as large as that of Laupala. We used 58 paralogous nuclear pseudogenes of mitochondrial origin to study the characteristics of insertion, deletion, and point substitution in P. pedestris and Italopodisma. In animals, these pseudogenes are "dead on arrival"; they are abundant in many different eukaryotes, and their mitochondrial origin simplifies the identification of point substitutions accumulated in nuclear pseudogene lineages. There appears to be a mononucleotide repeat within the 643-bp pseudogene sequence studied that acts as a strong hot spot for insertions or deletions (indels). Because the data for other insect species did not contain such an unusual region, hot spots were excluded from species comparisons. The rate of DNA loss relative to point substitution appears to be considerably and significantly lower in the grasshoppers studied than in Drosophila or Laupala. This suggests that the inverse correlation between genome size and the rate of DNA loss can be extended to comparisons between insects with large or gigantic genomes (i.e., Laupala and Podisma). The low rate of DNA loss implies that in grasshoppers, the accumulation of point mutations is a more potent force for obscuring ancient pseudogenes than their loss through indel accumulation, whereas the reverse is true for Drosophila. The main factor contributing to the difference in the rates of DNA loss estimated for grasshoppers, crickets, and Drosophila appears to be deletion size. Large deletions are relatively rare in Podisma and Italopodisma.

  12. The eukaryotic cofactor for the human immunodeficiency virus type 1 (HIV-1) Rev protein, eIF-5A, maps to chromosome 17p12-p13: Three eIF-5A pseudogenes map to 10q23.3, 17q25, and 19q13.2

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Steinkasserer, A.; Koettnitz, K.; Hauber, J.

    1995-02-10

    The eukaryotic initiation factor 5A (eIF-5A) has been identified as an essential cofactor for the HIV-1 transactivator protein Rev. Rev plays a key role in the complex regulation of HIV-1 gene expression and thereby in the generation of infectious virus particles. Expression of eIF-5A is vital for Rev function, and inhibition of this interaction leads to a block of the viral replication cycle. In humans, four different eIF-5A genes have been identified. One codes for the eIF-5A protein and the other three are pseudogenes. Using a panel of somatic rodent-human cell hybrids in combination with fluorescence in situ hybridization analysis, we show that the four genes map to three different chromosomes. The coding eIF-5A gene (EIF5A) maps to 17p12-p13, and the three pseudogenes EIF5AP1, EIF5AP2, and EIF5AP3 map to 10q23.3, 17q25, and 19q13.2, respectively. This is the first localization report for a eukaryotic cofactor for a regulatory HIV-1 protein. 16 refs., 1 fig.

  13. A case of early onset rectal cancer of Lynch syndrome with a novel deleterious PMS2 mutation.

    PubMed

    Nomura, Sachio; Fujimoto, Yoshiya; Yamamoto, Noriko; Sato, Yuri; Ashihara, Yuumi; Kita, Mizuho; Yamaguchi, Junya; Ishikawa, Yuichi; Ueno, Masashi; Arai, Masami

    2015-10-01

    Heterozygous deleterious mutation of the PMS2 gene is a cause of Lynch syndrome, an autosomal dominant cancer disease. However, the frequency of PMS2 mutation is rare compared with that of the other causative genes; MSH2, MLH1 and MSH6. PMS2 mutation has so far only been reported once from a Japanese facility. Detection of PMS2 mutation is relatively complicated due to the existence of 15 highly homologous pseudogenes, and its gene conversion event with the pseudogene PMS2CL. Therefore, for PMS2 mutation analysis, it is crucial to clearly distinguish PMS2 from its pseudogenes. We report here a novel deleterious 11 bp deletion mutation of exon 11 of PMS2 distinguished from PMS2CL in a 34-year-old Japanese female with rectal cancer. PMS2 mutated at c.1492del11 results in a truncated 500 amino acid protein rather than the wild-type protein of 862 amino acids. This is supported by the fact that, although there is usually concordance between MLH1 and PMS2 expression, cells were immunohistochemically positive for MLH1, whereas PMS2 could not be immunohistochemically stained using an anti-C-terminal PMS2 antibody, or effective PMS2 mRNA degradation with NMD caused by the frameshift mutation. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
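
    The frameshift reasoning behind the truncated product can be illustrated with a little arithmetic on the coordinates given in the abstract (an 11 bp deletion at coding position 1492). This is a minimal sketch, not the authors' analysis: the helper names are hypothetical, and the exact position of the premature stop codon depends on the downstream sequence, which is not available here.

```python
# Coordinates copied from the abstract (c.1492del11); the helper names are
# hypothetical and exist only for this illustration.
DELETION_START = 1492    # 1-based position in the coding sequence
DELETION_LENGTH = 11     # deleted bases


def is_frameshift(deleted_bases: int) -> bool:
    """A deletion shifts the reading frame unless its length is a multiple of 3."""
    return deleted_bases % 3 != 0


def codon_of(cds_position: int) -> int:
    """1-based codon number containing a 1-based coding-sequence position."""
    return (cds_position - 1) // 3 + 1


if __name__ == "__main__":
    print(is_frameshift(DELETION_LENGTH))   # True: 11 is not divisible by 3
    print(codon_of(DELETION_START))         # 498: first codon touched by the deletion
```

    Under these assumptions the reading frame is disrupted from codon 498 onward, which is in line with the reported truncated protein of 500 amino acids compared with 862 in the wild type.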

  14. The Evolution of Ribosomal DNA: Divergent Paralogues and Phylogenetic Implications

    PubMed Central

    Buckler-IV, E. S.; Ippolito, A.; Holtsford, T. P.

    1997-01-01

    Although nuclear ribosomal DNA (rDNA) repeats evolve together through concerted evolution, some genomes contain a considerable diversity of paralogous rDNA. This diversity includes not only multiple functional loci but also putative pseudogenes and recombinants. We examined the occurrence of divergent paralogues and recombinants in Gossypium, Nicotiana, Tripsacum, Winteraceae, and Zea ribosomal internal transcribed spacer (ITS) sequences. Some of the divergent paralogues are probably rDNA pseudogenes, since they have low predicted secondary structure stability, high substitution rates, and many deamination-driven substitutions at methylation sites. Under standard PCR conditions, the low stability paralogues amplified well, while many high-stability paralogues amplified poorly. Under highly denaturing PCR conditions (i.e., with dimethylsulfoxide), both low- and high-stability paralogues amplified well. We also found recombination between divergent paralogues. For phylogenetics, divergent ribosomal paralogues can aid in reconstructing ancestral states and thus serve as good outgroups. Divergent paralogues can also provide companion rDNA phylogenies. However, phylogeneticists must discriminate among families of divergent paralogues and recombinants or suffer from muddled and inaccurate organismal phylogenies. PMID:9055091

  15. Emerging Putative Associations between Non-Coding RNAs and Protein-Coding Genes in Neuropathic Pain: Added Value from Reusing Microarray Data

    PubMed Central

    Raju, Hemalatha B.; Tsinoremas, Nicholas F.; Capobianco, Enrico

    2016-01-01

    Regeneration of injured nerves is likely occurring in the peripheral nervous system, but not in the central nervous system. Although protein-coding gene expression has been assessed during nerve regeneration, little is currently known about the role of non-coding RNAs (ncRNAs). This leaves open questions about the potential effects of ncRNAs at transcriptome level. Due to the limited availability of human neuropathic pain (NP) data, we have identified the most comprehensive time-course gene expression profile referred to sciatic nerve (SN) injury and studied in a rat model using two neuronal tissues, namely dorsal root ganglion (DRG) and SN. We have developed a methodology to identify differentially expressed bioentities starting from microarray probes and repurposing them to annotate ncRNAs, while analyzing the expression profiles of protein-coding genes. The approach is designed to reuse microarray data and perform first profiling and then meta-analysis through three main steps. First, we used contextual analysis to identify what we considered putative or potential protein-coding targets for selected ncRNAs. Relevance was therefore assigned to differential expression of neighbor protein-coding genes, with neighborhood defined by a fixed genomic distance from long or antisense ncRNA loci, and of parental genes associated with pseudogenes. Second, connectivity among putative targets was used to build networks, in turn useful to conduct inference at interactomic scale. Last, network paths were annotated to assess relevance to NP. We found significant differential expression in long-intergenic ncRNAs (32 lincRNAs in SN and 8 in DRG), antisense RNA (31 asRNA in SN and 12 in DRG), and pseudogenes (456 in SN and 56 in DRG). In particular, contextual analysis centered on pseudogenes revealed some targets with known association to neurodegeneration and/or neurogenesis processes. While modules of the olfactory receptors were clearly identified in protein–protein interaction networks, other connectivity paths were identified between proteins already investigated in studies on disorders, such as Parkinson, Down syndrome, Huntington disease, and Alzheimer. Our findings suggest the importance of reusing gene expression data by meta-analysis approaches. PMID:27803687

  16. Emerging Putative Associations between Non-Coding RNAs and Protein-Coding Genes in Neuropathic Pain: Added Value from Reusing Microarray Data.

    PubMed

    Raju, Hemalatha B; Tsinoremas, Nicholas F; Capobianco, Enrico

    2016-01-01

    Regeneration of injured nerves is likely occurring in the peripheral nervous system, but not in the central nervous system. Although protein-coding gene expression has been assessed during nerve regeneration, little is currently known about the role of non-coding RNAs (ncRNAs). This leaves open questions about the potential effects of ncRNAs at transcriptome level. Due to the limited availability of human neuropathic pain (NP) data, we have identified the most comprehensive time-course gene expression profile referred to sciatic nerve (SN) injury and studied in a rat model using two neuronal tissues, namely dorsal root ganglion (DRG) and SN. We have developed a methodology to identify differentially expressed bioentities starting from microarray probes and repurposing them to annotate ncRNAs, while analyzing the expression profiles of protein-coding genes. The approach is designed to reuse microarray data and perform first profiling and then meta-analysis through three main steps. First, we used contextual analysis to identify what we considered putative or potential protein-coding targets for selected ncRNAs. Relevance was therefore assigned to differential expression of neighbor protein-coding genes, with neighborhood defined by a fixed genomic distance from long or antisense ncRNA loci, and of parental genes associated with pseudogenes. Second, connectivity among putative targets was used to build networks, in turn useful to conduct inference at interactomic scale. Last, network paths were annotated to assess relevance to NP. We found significant differential expression in long-intergenic ncRNAs (32 lincRNAs in SN and 8 in DRG), antisense RNA (31 asRNA in SN and 12 in DRG), and pseudogenes (456 in SN and 56 in DRG). In particular, contextual analysis centered on pseudogenes revealed some targets with known association to neurodegeneration and/or neurogenesis processes. While modules of the olfactory receptors were clearly identified in protein-protein interaction networks, other connectivity paths were identified between proteins already investigated in studies on disorders, such as Parkinson, Down syndrome, Huntington disease, and Alzheimer. Our findings suggest the importance of reusing gene expression data by meta-analysis approaches.

  17. Pseudogene PHBP1 promotes esophageal squamous cell carcinoma proliferation by increasing its cognate gene PHB expression.

    PubMed

    Feng, Feiyue; Qiu, Bin; Zang, Ruochuan; Song, Peng; Gao, Shugeng

    2017-04-25

    Natural antisense transcripts (NATs), one of the most diverse classes of long noncoding RNAs (lncRNAs), have been shown to be involved in fundamental biological processes in humans. Here, we report that human prohibitin pseudogene 1 (PHBP1) was upregulated in esophageal squamous cell carcinoma (ESCC), and that increased PHBP1 expression in ESCC was associated with advanced clinical stage. Functional experiments showed that PHBP1 knockdown inhibited ESCC cell proliferation, colony formation and xenograft tumor growth in vitro and in vivo by causing cell-cycle arrest at the G1-G0 phase. Mechanistic analysis revealed that the PHBP1 transcript, an antisense transcript of PHB, is partially complementary to PHB mRNA and forms an RNA-RNA hybrid with it, consequently increasing PHB expression at both the mRNA and protein levels. Furthermore, PHBP1 expression is strongly correlated with PHB expression in ESCC tissues. Collectively, this study elucidates an important role of PHBP1 in promoting ESCC, partly via increasing PHB expression.

  18. Complete Sequence and Analysis of Coconut Palm (Cocos nucifera) Mitochondrial Genome.

    PubMed

    Aljohi, Hasan Awad; Liu, Wanfei; Lin, Qiang; Zhao, Yuhui; Zeng, Jingyao; Alamer, Ali; Alanazi, Ibrahim O; Alawad, Abdullah O; Al-Sadi, Abdullah M; Hu, Songnian; Yu, Jun

    2016-01-01

    Coconut (Cocos nucifera L.), a member of the palm family (Arecaceae), is one of the most economically important crops in tropics, serving as an important source of food, drink, fuel, medicine, and construction material. Here we report an assembly of the coconut (C. nucifera, Oman local Tall cultivar) mitochondrial (mt) genome based on next-generation sequencing data. This genome, 678,653bp in length and 45.5% in GC content, encodes 72 proteins, 9 pseudogenes, 23 tRNAs, and 3 ribosomal RNAs. Within the assembly, we find that the chloroplast (cp) derived regions account for 5.07% of the total assembly length, including 13 proteins, 2 pseudogenes, and 11 tRNAs. The mt genome has a relatively large fraction of repeat content (17.26%), including both forward (tandem) and inverted (palindromic) repeats. Sequence variation analysis shows that the Ti/Tv ratio of the mt genome is lower as compared to that of the nuclear genome and neutral expectation. By combining public RNA-Seq data for coconut, we identify 734 RNA editing sites supported by at least two datasets. In summary, our data provides the second complete mt genome sequence in the family Arecaceae, essential for further investigations on mitochondrial biology of seed plants.
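
    Two of the summary statistics quoted in the abstract above, GC content and the transition/transversion (Ti/Tv) ratio, are simple to compute. The sketch below is a generic illustration, not the authors' pipeline; the function names and the toy inputs are assumptions.

```python
from typing import Iterable, Tuple

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return (seq.count("G") + seq.count("C")) / counted


def ti_tv_ratio(substitutions: Iterable[Tuple[str, str]]) -> float:
    """Ratio of transitions to transversions over (ref, alt) base pairs.

    A transition stays within purines (A<->G) or within pyrimidines
    (C<->T); every other substitution is a transversion.
    """
    transitions = transversions = 0
    for ref, alt in substitutions:
        ref, alt = ref.upper(), alt.upper()
        if ref == alt:
            continue  # not a substitution
        pair = {ref, alt}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if transversions == 0:
        return float("inf")
    return transitions / transversions


# Toy inputs, for illustration only.
print(gc_content("ATGC"))                                  # 0.5
print(ti_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")]))  # 2.0
```

    With four possible transitions and eight possible transversions, equal rates for all substitution types would give a Ti/Tv of 0.5, which is one common meaning of a neutral expectation; the abstract does not say which baseline was used.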

  19. Horizontal transfer of DNA from the mitochondrial to the plastid genome and its subsequent evolution in milkweeds (apocynaceae).

    PubMed

    Straub, Shannon C K; Cronn, Richard C; Edwards, Christopher; Fishbein, Mark; Liston, Aaron

    2013-01-01

    Horizontal gene transfer (HGT) of DNA from the plastid to the nuclear and mitochondrial genomes of higher plants is a common phenomenon; however, plastid genomes (plastomes) are highly conserved and have generally been regarded as impervious to HGT. We sequenced the 158 kb plastome and the 690 kb mitochondrial genome of common milkweed (Asclepias syriaca [Apocynaceae]) and found evidence of intracellular HGT for a 2.4-kb segment of mitochondrial DNA to the rps2-rpoC2 intergenic spacer of the plastome. The transferred region contains an rpl2 pseudogene and is flanked by plastid sequence in the mitochondrial genome, including an rpoC2 pseudogene, which likely provided the mechanism for HGT back to the plastome through double-strand break repair involving homologous recombination. The plastome insertion is restricted to tribe Asclepiadeae of subfamily Asclepiadoideae, whereas the mitochondrial rpoC2 pseudogene is present throughout the subfamily, which confirms that the plastid to mitochondrial HGT event preceded the HGT to the plastome. Although the plastome insertion has been maintained in all lineages of Asclepiadoideae, it shows minimal evidence of transcription in A. syriaca and is likely nonfunctional. Furthermore, we found recent gene conversion of the mitochondrial rpoC2 pseudogene in Asclepias by the plastid gene, which reflects continued interaction of these genomes.

  20. Pseudogenization of the tooth gene enamelysin (MMP20) in the common ancestor of extant baleen whales

    PubMed Central

    Meredith, Robert W.; Gatesy, John; Cheng, Joyce; Springer, Mark S.

    2011-01-01

    Whales in the suborder Mysticeti are filter feeders that use baleen to sift zooplankton and small fish from ocean waters. Adult mysticetes lack teeth, although tooth buds are present in foetal stages. Cladistic analyses suggest that functional teeth were lost in the common ancestor of crown-group Mysticeti. DNA sequences for the tooth-specific genes, ameloblastin (AMBN), enamelin (ENAM) and amelogenin (AMEL), have frameshift mutations and/or stop codons in this taxon, but none of these molecular cavities are shared by all extant mysticetes. Here, we provide the first evidence for pseudogenization of a tooth gene, enamelysin (MMP20), in the common ancestor of living baleen whales. Specifically, pseudogenization resulted from the insertion of a CHR-2 SINE retroposon in exon 2 of MMP20. Genomic and palaeontological data now provide congruent support for the loss of enamel-capped teeth on the common ancestral branch of crown-group mysticetes. The new data for MMP20 also document a polymorphic stop codon in exon 2 of the pygmy sperm whale (Kogia breviceps), which has enamel-less teeth. These results, in conjunction with the evidence for pseudogenization of MMP20 in Hoffmann's two-toed sloth (Choloepus hoffmanni), another enamel-less species, support the hypothesis that the only unique, non-overlapping function of the MMP20 gene is in enamel formation. PMID:20861053

  1. Horizontal Transfer of DNA from the Mitochondrial to the Plastid Genome and Its Subsequent Evolution in Milkweeds (Apocynaceae)

    PubMed Central

    Straub, Shannon C.K.; Cronn, Richard C.; Edwards, Christopher; Fishbein, Mark; Liston, Aaron

    2013-01-01

    Horizontal gene transfer (HGT) of DNA from the plastid to the nuclear and mitochondrial genomes of higher plants is a common phenomenon; however, plastid genomes (plastomes) are highly conserved and have generally been regarded as impervious to HGT. We sequenced the 158 kb plastome and the 690 kb mitochondrial genome of common milkweed (Asclepias syriaca [Apocynaceae]) and found evidence of intracellular HGT for a 2.4-kb segment of mitochondrial DNA to the rps2–rpoC2 intergenic spacer of the plastome. The transferred region contains an rpl2 pseudogene and is flanked by plastid sequence in the mitochondrial genome, including an rpoC2 pseudogene, which likely provided the mechanism for HGT back to the plastome through double-strand break repair involving homologous recombination. The plastome insertion is restricted to tribe Asclepiadeae of subfamily Asclepiadoideae, whereas the mitochondrial rpoC2 pseudogene is present throughout the subfamily, which confirms that the plastid to mitochondrial HGT event preceded the HGT to the plastome. Although the plastome insertion has been maintained in all lineages of Asclepiadoideae, it shows minimal evidence of transcription in A. syriaca and is likely nonfunctional. Furthermore, we found recent gene conversion of the mitochondrial rpoC2 pseudogene in Asclepias by the plastid gene, which reflects continued interaction of these genomes. PMID:24029811

  2. Insight into the evolution and origin of leprosy bacilli from the genome sequence of Mycobacterium lepromatosis

    PubMed Central

    Singh, Pushpendra; Benjak, Andrej; Schuenemann, Verena J.; Herbig, Alexander; Avanzi, Charlotte; Busso, Philippe; Nieselt, Kay; Krause, Johannes; Vera-Cabrera, Lucio; Cole, Stewart T.

    2015-01-01

    Mycobacterium lepromatosis is an uncultured human pathogen associated with diffuse lepromatous leprosy and a reactional state known as Lucio's phenomenon. By using deep sequencing with and without DNA enrichment, we obtained the near-complete genome sequence of M. lepromatosis present in a skin biopsy from a Mexican patient, and compared it with that of Mycobacterium leprae, which has undergone extensive reductive evolution. The genomes display extensive synteny and are similar in size (∼3.27 Mb). Protein-coding genes share 93% nucleotide sequence identity, whereas pseudogenes are only 82% identical. The events that led to pseudogenization of 50% of the genome likely occurred before divergence from their most recent common ancestor (MRCA), and both M. lepromatosis and M. leprae have since accumulated new pseudogenes or acquired specific deletions. Functional comparisons suggest that M. lepromatosis has lost several enzymes required for amino acid synthesis whereas M. leprae has a defective heme pathway. M. lepromatosis has retained all functions required to infect the Schwann cells of the peripheral nervous system and therefore may also be neuropathogenic. A phylogeographic survey of 227 leprosy biopsies by differential PCR revealed that 221 contained M. leprae whereas only six, all from Mexico, harbored M. lepromatosis. Phylogenetic comparisons indicate that M. lepromatosis is closer than M. leprae to the MRCA, and a Bayesian dating analysis suggests that they diverged from their MRCA approximately 13.9 Mya. Thus, despite their ancient separation, the two leprosy bacilli are remarkably conserved and still cause similar pathologic conditions. PMID:25831531

  3. The Argonaute protein TbAGO1 contributes to large and mini-chromosome segregation and is required for control of RIME retroposons and RHS pseudogene-associated transcripts.

    PubMed

    Durand-Dubief, Mickaël; Absalon, Sabrina; Menzer, Linda; Ngwabyt, Sandra; Ersfeld, Klaus; Bastin, Philippe

    2007-12-01

    The protist Trypanosoma brucei possesses a single Argonaute gene called TbAGO1 that is necessary for RNAi silencing. We previously showed that in strain 427, TbAGO1 knock-out leads to a slow growth phenotype and to chromosome segregation defects. Here we report that the slow growth phenotype is linked to defects in segregation of both large and mini-chromosome populations, with large chromosomes being the most affected. These phenotypes are completely reversed upon inducible re-expression of TbAGO1 fused to GFP, demonstrating their link with TbAGO1. Trypanosomes that do not express TbAGO1 show a general increase in the abundance of transcripts derived from the short retroposon RIME (Ribosomal Interspersed Mobile Element). Supplementary large RIME transcripts emerge in the absence of RNAi, a phenomenon coupled to the disappearance of short transcripts. These fluctuations are reversed by inducible expression of GFP::TbAGO1. Furthermore, we use a combination of Northern blots, RT-PCR and sequencing to reveal that RNAi controls expression of transcripts derived from RHS (Retrotransposon Hot Spot) pseudogenes (RHS genes with retro-element(s) integrated within their coding sequence). Absence of RNAi also leads to an increase of steady-state transcripts from regular RHS genes (those without retro-element), indicating a role for pseudogene in control of gene expression. However, analysis of retroposon abundance and arrangement in the genome of multiple clonal cell lines of TbAGO1-/- failed to reveal movement of mobile elements despite the increased amounts of retroposon transcripts.

  4. [Structural organization of 5S ribosomal DNA of Rosa rugosa].

    PubMed

    Tynkevych, Iu O; Volkov, R A

    2014-01-01

    In order to clarify the molecular organization of the genomic region encoding 5S rRNA in the diploid species Rosa rugosa, several 5S rDNA repeated units were cloned and sequenced. Analysis of the obtained sequences revealed that only one length variant of 5S rDNA repeated units, which contains intact promoter elements in the intergenic spacer region (IGS) and appears to be transcriptionally active, is present in the genome. Additionally, a limited number of 5S rDNA pseudogenes lacking a portion of the coding sequence and the complete IGS was detected. A high level of sequence similarity (from 93.7 to 97.5%) between the IGS of the major 5S rDNA variants of East Asian R. rugosa and North American R. nitida was found, indicating comparatively recent divergence of these species.

  5. Evolution of Genes Involved in Gamete Interaction: Evidence for Positive Selection, Duplications and Losses in Vertebrates

    PubMed Central

    Callebaut, Isabelle; Laurin, Michel; Pascal, Géraldine; Poupon, Anne; Goudet, Ghylène; Monget, Philippe

    2012-01-01

    Genes encoding proteins involved in sperm-egg interaction and fertilization exhibit a particularly fast evolution and may participate in prezygotic species isolation [1], [2]. Some of them (ZP3, ADAM1, ADAM2, ACR and CD9) have individually been shown to evolve under positive selection [3], [4], suggesting a role of positive Darwinian selection on sperm-egg interaction. However, the genes involved in this biological function have not been systematically and exhaustively studied with an evolutionary perspective, in particular across vertebrates with internal and external fertilization. Here we show that 33 genes among the 69 that have been experimentally shown to be involved in fertilization in at least one taxon in vertebrates are under positive selection. Moreover, we identified 17 pseudogenes and 39 genes that have at least one duplicate in one species. For 15 genes, we found neither positive selection, nor gene copies or pseudogenes. Genes of teleosts, especially genes involved in sperm-oolemma fusion, appear to be more frequently under positive selection than genes of birds and eutherians. In contrast, pseudogenization, gene loss and gene gain are more frequent in eutherians. Thus, each of the 19 studied vertebrate species exhibits a unique signature characterized by gene gain and loss, as well as position of amino acids under positive selection. Reflecting these clade-specific signatures, teleosts and eutherian mammals are recovered as clades in a parsimony analysis. Interestingly the same analysis places Xenopus apart from teleosts, with which it shares the primitive external fertilization, and locates it along with amniotes (which share internal fertilization), suggesting that external or internal environmental conditions of germ cell interaction may not be the unique factors that drive the evolution of fertilization genes. 
Our work should improve our understanding of the fertilization process and of the establishment of reproductive barriers, for example by offering new leads for experiments on genes identified as positively selected. PMID:22957080

  6. Molecular characterization of the celiac disease epitope domains in α-gliadin genes in Aegilops tauschii and hexaploid wheats (Triticum aestivum L.).

    PubMed

    Xie, Zhenze; Wang, Congyan; Wang, Ke; Wang, Shunli; Li, Xiaohui; Zhang, Zhao; Ma, Wujun; Yan, Yueming

    2010-11-01

    Nineteen novel full-ORF α-gliadin genes and 32 pseudogenes containing at least one stop codon were cloned and sequenced from three Aegilops tauschii accessions (T15, T43 and T26) and two bread wheat cultivars (Gaocheng 8901 and Zhongyou 9507). Analysis of three typical α-gliadin genes (Gli-At4, Gli-G1 and Gli-Z4) revealed some InDels and a considerable number of SNPs among them. Most of the pseudogenes resulted from C-to-T changes that generated in-frame TAG or TAA stop codons. The putative proteins of both the Gli-At3 and Gli-Z7 genes contained an extra cysteine residue in the unique domain II. Analysis of toxic epitopes among the 19 deduced α-gliadins demonstrated that 14 contained 1-5 T cell stimulatory toxic epitopes, while the other 5 did not contain any. The number of glutamine residues in the two specific polyglutamine domains ranged from 7 to 27, indicating high variation in length. According to the numbers of the 4 T cell stimulatory toxic epitopes and of glutamine residues in the two polyglutamine domains, 2 of the 19 α-gliadin genes were assigned to chromosome 6A, 5 to chromosome 6B and 12 to chromosome 6D. These results were consistent with those from wheat cv. Chinese Spring nulli-tetrasomic and phylogenetic analyses. Secondary structure prediction showed that all α-gliadins had a high content of β-strands and that most of the α-helices and β-strands were present in the two unique domains. Phylogenetic analysis demonstrated that α-gliadin genes had high homology with γ-gliadin, B-hordein, and LMW-GS genes and that they diverged approximately 39 MYA. Finally, the five α-gliadin genes were successfully expressed in E. coli, and their expression level reached its maximum 4 h after induction with IPTG, indicating that the α-gliadin genes can be expressed at a high level under the control of the T7 promoter.
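
    The abstract above attributes most of the pseudogenes to C-to-T changes that create in-frame TAG or TAA stop codons. The sketch below enumerates which sense codons a single coding-strand C-to-T change turns into a stop, and locates the first in-frame stop in a reading frame. Function names and example sequences are assumptions for illustration; changes on the opposite strand are not considered.

```python
from typing import Dict, Optional

STOP_CODONS = ("TAA", "TAG", "TGA")


def c_to_t_stop_sources() -> Dict[str, str]:
    """Map each sense codon to the stop codon a single C->T change produces."""
    sources = {}
    for stop in STOP_CODONS:
        for i, base in enumerate(stop):
            if base == "T":
                # Undo the change: put a C back where the stop has a T.
                codon = stop[:i] + "C" + stop[i + 1:]
                if codon not in STOP_CODONS:
                    sources[codon] = stop
    return sources


def first_inframe_stop(orf: str) -> Optional[int]:
    """0-based codon index of the first in-frame stop codon, or None."""
    orf = orf.upper()
    for i in range(0, len(orf) - 2, 3):
        if orf[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


print(c_to_t_stop_sources())
# {'CAA': 'TAA', 'CAG': 'TAG', 'CGA': 'TGA'}

# Toy reading frames: intact (stop only at the end) vs. CAG -> TAG at codon 1.
print(first_inframe_stop("ATGCAGCAATAA"))  # 3
print(first_inframe_stop("ATGTAGCAATAA"))  # 1
```

    CAA and CAG both encode glutamine, so glutamine-rich sequences such as the polyglutamine domains described here offer many positions where one C-to-T change yields TAA or TAG.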

  7. Human Nanog pseudogene8 promotes the proliferation of gastrointestinal cancer cells

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Uchino, Keita, E-mail: uchino13@intmed1.med.kyushu-u.ac.jp; Hirano, Gen; Hirahashi, Minako

    2012-09-10

    There is emerging evidence that human solid tumor cells originate from cancer stem cells (CSCs). In cancer cell lines, tumor-initiating CSCs are mainly found in the side population (SP) that has the capacity to extrude dyes such as Hoechst 33342. We found that Nanog is expressed specifically in SP cells of human gastrointestinal (GI) cancer cells. Nucleotide sequencing revealed that NanogP8 but not Nanog was expressed in GI cancer cells. Transfection of NanogP8 into GI cancer cell lines promoted cell proliferation, while its inhibition by anti-Nanog siRNA suppressed the proliferation. Immunohistochemical staining of primary GI cancer tissues revealed NanogP8 protein to be strongly expressed in 3 out of 60 cases. In these cases, NanogP8 was found especially in an infiltrative part of the tumor, in proliferating cells with Ki67 expression. These data suggest that NanogP8 is involved in GI cancer development in a fraction of patients, in whom it presumably acts by supporting CSC proliferation. -- Highlights: ► Nanog maintains pluripotency by regulating embryonic stem cells differentiation. ► Nanog is expressed in cancer stem cells of human gastrointestinal cancer cells. ► Nucleotide sequencing revealed that Nanog pseudogene8 but not Nanog was expressed. ► Nanog pseudogene8 promotes cancer stem cells proliferation. ► Nanog pseudogene8 is involved in gastrointestinal cancer development.

  8. Differential expression of Oct4 variants and pseudogenes in normal urothelium and urothelial cancer.

    PubMed

    Wezel, Felix; Pearson, Joanna; Kirkwood, Lisa A; Southgate, Jennifer

    2013-10-01

    The transcription factor octamer-binding protein 4 (Oct4; encoded by POU5F1) has a key role in maintaining embryonic stem cell pluripotency during early embryonic development and it is required for generation of induced pluripotent stem cells. Controversy exists concerning Oct4 expression in somatic tissues, with reports that Oct4 is expressed in normal and in neoplastic urothelium carrying implications for a bladder cancer stem cell phenotype. Here, we show that the pluripotency-associated Oct4A transcript was absent from cultures of highly regenerative normal human urothelial cells and from low-grade to high-grade urothelial carcinoma cell lines, whereas alternatively spliced variants and transcribed pseudogenes were expressed in abundance. Immunolabeling and immunoblotting studies confirmed the absence of Oct4A in normal and neoplastic urothelial cells and tissues, but indicated the presence of alternative isoforms or potentially translated pseudogenes. The stable forced expression of Oct4A in normal human urothelial cells in vitro profoundly inhibited growth and affected morphology, but protein expression was rapidly down-regulated. Our findings demonstrate that pluripotency-associated isoform Oct4A is not expressed by normal or malignant human urothelium and therefore is unlikely to play a role in a cancer stem cell phenotype. However, our findings also indicate that urothelium expresses a variety of other Oct4 splice-variant isoforms and transcribed pseudogenes that warrant further study. Copyright © 2013 American Society for Investigative Pathology. Published by Elsevier Inc. All rights reserved.

  9. Next-Generation Sequencing of Protein-Coding and Long Non-protein-Coding RNAs in Two Types of Exosomes Derived from Human Whole Saliva.

    PubMed

    Ogawa, Yuko; Tsujimoto, Masafumi; Yanoshita, Ryohei

    2016-01-01

    Exosomes are small extracellular vesicles containing microRNAs and mRNAs that are produced by various types of cells. We previously used ultrafiltration and size-exclusion chromatography to isolate two types of human salivary exosomes (exosomes I, II) that are different in size and proteomes. We showed that salivary exosomes contain large repertoires of small RNAs. However, precise information regarding long RNAs in salivary exosomes has not been fully determined. In this study, we investigated the compositions of protein-coding RNAs (pcRNAs) and long non-protein-coding RNAs (lncRNAs) of exosome I, exosome II and whole saliva (WS) by next-generation sequencing technology. Although 11% of all RNAs were commonly detected among the three samples, the compositions of reads mapping to known RNAs were similar. The most abundant pcRNA is ribosomal RNA protein, and pcRNAs of some salivary proteins such as S100 calcium-binding protein A8 (protein S100-A8) were present in salivary exosomes. Interestingly, lncRNAs of pseudogenes (presumably, processed pseudogenes) were abundant in exosome I, exosome II and WS. Translationally controlled tumor protein gene, which plays an important role in cell proliferation, cell death and immune responses, was highly expressed as pcRNA and pseudogenes in salivary exosomes. Our results show that salivary exosomes contain various types of RNAs such as pseudogenes and small RNAs, and may mediate intercellular communication by transferring these RNAs to target cells as gene expression regulators.

  10. The human cytochrome P450 3A locus. Gene evolution by capture of downstream exons.

    PubMed

    Finta, C; Zaphiropoulos, P G

    2000-12-30

    Using a bacterial artificial chromosome (BAC) clone, we have mapped the human cytochrome P450 3A (CYP3A) locus containing the genes encoding for CYP3A4, CYP3A5 and CYP3A7. The genes lie in a head-to-tail orientation in the order of 3A4, 3A7 and 3A5. In both intergenic regions (3A4-3A7 and 3A7-3A5), we have detected several additional cytochrome P450 3A exons, forming two CYP3A pseudogenes. These pseudogenes have the same orientation as the CYP3A genes. To our surprise, a 3A7 mRNA species has been detected in which the exons 2 and 13 of one of the pseudogenes (the one that is downstream of 3A7) are spliced after the 3A7 terminal exon. This results in an mRNA molecule that consists of the 13 3A7 exons and two additional exons at the 3' end. The additional two exons originating from the pseudogene are in an altered reading frame and consequently have the capability to code a completely different amino acid sequence than the canonical CYP3A exons 2 and 13. These findings may represent a generalized evolutionary process with genes having the potential to capture neighboring sequences and use them as functional exons.

  11. Genome-wide A-to-I RNA editing in fungi independent of ADAR enzymes

    PubMed Central

    Liu, Huiquan; Wang, Qinhu; He, Yi; Chen, Lingfeng; Hao, Chaofeng; Jiang, Cong; Li, Yang; Dai, Yafeng; Kang, Zhensheng; Xu, Jin-Rong

    2016-01-01

    Yeasts and filamentous fungi do not have adenosine deaminase acting on RNA (ADAR) orthologs and are believed to lack A-to-I RNA editing, which is the most prevalent editing of mRNA in animals. However, during this study with the PUK1 (FGRRES_01058) pseudokinase gene important for sexual reproduction in Fusarium graminearum, we found that two tandem stop codons, UA1831GUA1834G, in its kinase domain were changed to UG1831GUG1834G by RNA editing in perithecia. To confirm A-to-I editing of PUK1 transcripts, strand-specific RNA-seq data were generated with RNA isolated from conidia, hyphae, and perithecia. PUK1 was almost specifically expressed in perithecia, and 90% of transcripts were edited to UG1831GUG1834G. Genome-wide analysis identified 26,056 perithecium-specific A-to-I editing sites. Unlike those in animals, 70.5% of A-to-I editing sites in F. graminearum occur in coding regions, and more than two-thirds of them result in amino acid changes, including editing of 69 PUK1-like pseudogenes with stop codons in ORFs. PUK1 orthologs and other pseudogenes also displayed stage-specific expression and editing in Neurospora crassa and F. verticillioides. Furthermore, F. graminearum differs from animals in the sequence preference and structure selectivity of A-to-I editing sites. Whereas A's embedded in RNA stems are targeted by ADARs, RNA editing in F. graminearum preferentially targets A's in hairpin loops, which is similar to the anticodon loop of tRNA targeted by adenosine deaminases acting on tRNA (ADATs). Overall, our results showed that A-to-I RNA editing occurs specifically during sexual reproduction and mainly in the coding regions in filamentous ascomycetes, involving adenosine deamination mechanisms distinct from metazoan ADARs. PMID:26934920
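
    The editing event described in the abstract above can be illustrated with a toy sequence. Inosine is read as guanosine during translation, so A-to-I editing of the middle base of UAG gives the tryptophan codon UGG. In this sketch the edited base is written as G; the function names and the six-base sequence are assumptions, and positions are 0-based offsets in the toy string, not the transcript coordinates 1831 and 1834 from the abstract.

```python
from typing import Iterable, List

RNA_STOP_CODONS = {"UAA", "UAG", "UGA"}


def edit_a_to_i(rna: str, positions: Iterable[int]) -> str:
    """Apply A-to-I editing at the given 0-based positions.

    The edited base is written as G, because inosine pairs like guanosine
    when the codon is decoded.
    """
    bases = list(rna.upper())
    for pos in positions:
        if bases[pos] != "A":
            raise ValueError(f"position {pos} is {bases[pos]}, not A")
        bases[pos] = "G"
    return "".join(bases)


def inframe_stops(rna: str) -> List[str]:
    """Stop codons found when reading the sequence in frame from position 0."""
    rna = rna.upper()
    codons = [rna[i:i + 3] for i in range(0, len(rna) - 2, 3)]
    return [codon for codon in codons if codon in RNA_STOP_CODONS]


tandem_stops = "UAGUAG"                      # two tandem UAG stop codons
edited = edit_a_to_i(tandem_stops, [1, 4])   # edit the A in each codon

print(inframe_stops(tandem_stops))  # ['UAG', 'UAG']
print(edited)                       # UGGUGG
print(inframe_stops(edited))        # []
```

    After editing, neither codon terminates translation, which is how edited transcripts of a gene annotated as a pseudogene can carry an open reading frame.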

  12. Comparison of Intracellular "Ca. Endomicrobium Trichonymphae" Genomovars Illuminates the Requirement and Decay of Defense Systems against Foreign DNA.

    PubMed

    Izawa, Kazuki; Kuwahara, Hirokazu; Kihara, Kumiko; Yuki, Masahiro; Lo, Nathan; Itoh, Takehiko; Ohkuma, Moriya; Hongoh, Yuichi

    2016-10-13

    "Candidatus Endomicrobium trichonymphae" (Bacteria; Elusimicrobia) is an obligate intracellular symbiont of the cellulolytic protist genus Trichonympha in the termite gut. A previous genome analysis of "Ca Endomicrobium trichonymphae" phylotype Rs-D17 (genomovar Ri2008), obtained from a Trichonympha agilis cell in the gut of the termite Reticulitermes speratus, revealed that its genome is small (1.1 Mb) and contains many pseudogenes; it is in the course of reductive genome evolution. Here we report the complete genome sequence of another Rs-D17 genomovar, Ti2015, obtained from a different T. agilis cell present in an R. speratus gut. These two genomovars share most intact protein-coding genes and pseudogenes, showing 98.6% chromosome sequence similarity. However, characteristic differences were found in their defense systems, which comprised restriction-modification and CRISPR/Cas systems. The repertoire of intact restriction-modification systems differed between the genomovars, and two of the three CRISPR/Cas loci in genomovar Ri2008 are pseudogenized or missing in genomovar Ti2015. These results suggest relaxed selection pressure for maintaining these defense systems. Nevertheless, the remaining CRISPR/Cas system in each genomovar appears to be active; none of the "spacer" sequences (112 in Ri2008 and 128 in Ti2015) were shared whereas the "repeat" sequences were identical. Furthermore, we obtained draft genomes of three additional endosymbiotic Endomicrobium phylotypes from different host protist species, and discovered multiple, intact CRISPR/Cas systems in each genome. Collectively, unlike bacteriome endosymbionts in insects, the Endomicrobium endosymbionts of termite-gut protists appear to require defense against foreign DNA, although the required level of defense has likely been reduced during their intracellular lives. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  13. HnRNP A3 genes and pseudogenes in the vertebrate genomes.

    PubMed

    Makeyev, Aleksandr V; Kim, Chang Bae; Ruddle, Frank H; Enkhmandakh, Badam; Erdenechimeg, Lkhamsuren; Bayarsaihan, Dashzeveg

    2005-04-01

    The hnRNP A/B type proteins are abundant nuclear factors that bind to Pol II transcripts and are involved in numerous RNA-related activities. To date most data on the hnRNP A/B family have been obtained with recombinant proteins and cell cultures. Further characterization can result from an examination of the impact of various modifications in intact functional loci; however, such characterization is hampered by the presence of numerous and widely dispersed hnRNP A/B-related sequences in the mammalian genome. We have found hnRNP A3, a poorly recognized member of the hnRNP A/B family, among candidate transcription factors that interact with the regulatory region of the Hoxc8 gene and screened the human and mouse genomes for genes that encode hnRNP A3. We demonstrate that the sequence reported previously as the human hnRNP A3 gene (Accession number S63912) and located on 10p11.1 belongs to a processed pseudogene of the functional intron-containing locus HNRPA3, which we have identified on 2q31.2. We have also identified its murine orthologs on mouse chromosome 2D and rat chromosome 3q23. Alternative splices were revealed at the N-terminus and in the middle of hnRNP A3. 14 and 28 additional loci in the human and mouse genome, respectively, were mapped and identified as hnRNP A3 processed pseudogenes. In addition, we have found and compared hnRNP A3 orthologous genes in Gallus gallus, Xenopus tropicalis, and Danio rerio. The present in silico analysis serves as a necessary step toward a further functional characterization of hnRNP A3. (c) 2005 Wiley-Liss, Inc.

  14. Complete Sequence and Analysis of Coconut Palm (Cocos nucifera) Mitochondrial Genome

    PubMed Central

    Aljohi, Hasan Awad; Liu, Wanfei; Lin, Qiang; Zhao, Yuhui; Zeng, Jingyao; Alamer, Ali; Alanazi, Ibrahim O.; Alawad, Abdullah O.; Al-Sadi, Abdullah M.; Hu, Songnian; Yu, Jun

    2016-01-01

    Coconut (Cocos nucifera L.), a member of the palm family (Arecaceae), is one of the most economically important crops in tropics, serving as an important source of food, drink, fuel, medicine, and construction material. Here we report an assembly of the coconut (C. nucifera, Oman local Tall cultivar) mitochondrial (mt) genome based on next-generation sequencing data. This genome, 678,653bp in length and 45.5% in GC content, encodes 72 proteins, 9 pseudogenes, 23 tRNAs, and 3 ribosomal RNAs. Within the assembly, we find that the chloroplast (cp) derived regions account for 5.07% of the total assembly length, including 13 proteins, 2 pseudogenes, and 11 tRNAs. The mt genome has a relatively large fraction of repeat content (17.26%), including both forward (tandem) and inverted (palindromic) repeats. Sequence variation analysis shows that the Ti/Tv ratio of the mt genome is lower as compared to that of the nuclear genome and neutral expectation. By combining public RNA-Seq data for coconut, we identify 734 RNA editing sites supported by at least two datasets. In summary, our data provides the second complete mt genome sequence in the family Arecaceae, essential for further investigations on mitochondrial biology of seed plants. PMID:27736909

  15. Complete Sequence and Comparative Analysis of the Chloroplast Genome of Coconut Palm (Cocos nucifera)

    PubMed Central

    Huang, Ya-Yi; Matzke, Antonius J. M.; Matzke, Marjori

    2013-01-01

    Coconut, a member of the palm family (Arecaceae), is one of the most economically important trees used by mankind. Despite its diverse morphology, coconut is recognized taxonomically as only a single species (Cocos nucifera L.). There are two major coconut varieties, tall and dwarf, the latter of which displays traits resulting from selection by humans. We report here the complete chloroplast (cp) genome of a dwarf coconut plant, and describe the gene content and organization, inverted repeat fluctuations, repeated sequence structure, and occurrence of RNA editing. Phylogenetic relationships of monocots were inferred based on 47 chloroplast protein-coding genes. Potential nodes for events of gene duplication and pseudogenization related to inverted repeat fluctuation were mapped onto the tree using parsimony criteria. We compare our findings with those from other palm species for which complete cp genome sequences are available. PMID:24023703

  16. Complete sequence and comparative analysis of the chloroplast genome of coconut palm (Cocos nucifera).

    PubMed

    Huang, Ya-Yi; Matzke, Antonius J M; Matzke, Marjori

    2013-01-01

    Coconut, a member of the palm family (Arecaceae), is one of the most economically important trees used by mankind. Despite its diverse morphology, coconut is recognized taxonomically as only a single species (Cocos nucifera L.). There are two major coconut varieties, tall and dwarf, the latter of which displays traits resulting from selection by humans. We report here the complete chloroplast (cp) genome of a dwarf coconut plant, and describe the gene content and organization, inverted repeat fluctuations, repeated sequence structure, and occurrence of RNA editing. Phylogenetic relationships of monocots were inferred based on 47 chloroplast protein-coding genes. Potential nodes for events of gene duplication and pseudogenization related to inverted repeat fluctuation were mapped onto the tree using parsimony criteria. We compare our findings with those from other palm species for which complete cp genome sequences are available.

  17. RNA editing makes mistakes in plant mitochondria: editing loses sense in transcripts of a rps19 pseudogene and in creating stop codons in coxI and rps3 mRNAs of Oenothera.

    PubMed Central

    Schuster, W; Brennicke, A

    1991-01-01

    An intact gene for the ribosomal protein S19 (rps19) is absent from Oenothera mitochondria. The conserved rps19 reading frame found in the mitochondrial genome is interrupted by a termination codon. This rps19 pseudogene is cotranscribed with the downstream rps3 gene and is edited on both sides of the translational stop. Editing, however, changes the amino acid sequence at positions that were well conserved before editing. Other strange editings create translational stops in open reading frames coding for functional proteins. In coxI and rps3 mRNAs CGA codons are edited to UGA stop codons only five and three codons, respectively, downstream to the initiation codon. These aberrant editings in essential open reading frames and in the rps19 pseudogene appear to have been shifted to these positions from other editing sites. These observations suggest a requirement for a continuous evolutionary constraint on the editing specificities in plant mitochondria. PMID:1762921

  18. Loss of Olfactory Receptor Genes Coincides with the Acquisition of Full Trichromatic Vision in Primates

    PubMed Central

    Wiebe, Victor; Przeworski, Molly; Lancet, Doron; Pääbo, Svante

    2004-01-01

    Olfactory receptor (OR) genes constitute the molecular basis for the sense of smell and are encoded by the largest gene family in mammalian genomes. Previous studies suggested that the proportion of pseudogenes in the OR gene family is significantly larger in humans than in other apes and significantly larger in apes than in the mouse. To investigate the process of degeneration of the olfactory repertoire in primates, we estimated the proportion of OR pseudogenes in 19 primate species by surveying randomly chosen subsets of 100 OR genes from each species. We find that apes, Old World monkeys and one New World monkey, the howler monkey, have a significantly higher proportion of OR pseudogenes than do other New World monkeys or the lemur (a prosimian). Strikingly, the howler monkey is also the only New World monkey to possess full trichromatic vision, along with Old World monkeys and apes. Our findings suggest that the deterioration of the olfactory repertoire occurred concomitant with the acquisition of full trichromatic color vision in primates. PMID:14737185
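
    The sampling design described above (a random subset of 100 OR genes per species, scored as intact or pseudogene) amounts to estimating a binomial proportion. The following is a minimal sketch of that estimate with a Wilson score interval. The function name, the choice of interval, and the example counts are assumptions made for illustration; they are not taken from the study.

```python
import math


def pseudogene_fraction(n_pseudo, n_sampled, z=1.96):
    """Return (fraction, lower, upper) for a sampled pseudogene proportion.

    Uses the Wilson score interval; z=1.96 corresponds to roughly 95% coverage.
    """
    if n_sampled <= 0 or not 0 <= n_pseudo <= n_sampled:
        raise ValueError("need 0 <= n_pseudo <= n_sampled and n_sampled > 0")
    p = n_pseudo / n_sampled
    z2 = z * z
    denom = 1 + z2 / n_sampled
    center = (p + z2 / (2 * n_sampled)) / denom
    half = z * math.sqrt(p * (1 - p) / n_sampled + z2 / (4 * n_sampled ** 2)) / denom
    return p, center - half, center + half


# Hypothetical counts per 100 sampled genes (NOT data from the study).
example_counts = {"species_A": 30, "species_B": 15}
for name, k in example_counts.items():
    frac, lo, hi = pseudogene_fraction(k, 100)
    print(f"{name}: {frac:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

    With only 100 genes per species the intervals are wide, so differences between lineages need a formal test rather than a comparison of point estimates alone.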

  19. Structure of novel rat major histocompatibility complex class II genes RT1.Ha and Hb

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Arimura, Yutaka; Tang, Wei Ran; Koda, Toshiaki

    1995-03-01

    We have cloned the novel rat MHC class II genes, RT1.Ha and Hb, which are homologous to human HLA-DPA and DPB. RT1.Hb is a pseudogene, whereas RT1.Ha is apparently intact and may have transcriptional potential. In addition, with an RT1.Ha probe, we detected a single Southern hybridization band in the genome of the mouse. This finding may afford an opportunity to analyze the HLA-DPA homologue in the mouse genome. 18 refs., 4 figs., 1 tab.

  20. The Trouble with MEAM2: Implications of Pseudogenes on Species Delimitation in the Globally Invasive Bemisia tabaci (Hemiptera: Aleyrodidae) Cryptic Species Complex.

    PubMed

    Tay, Wee Tek; Elfekih, Samia; Court, Leon N; Gordon, Karl H J; Delatte, Hélène; De Barro, Paul J

    2017-10-01

    Molecular species identification using suboptimal PCR primers can over-estimate species diversity due to coamplification of nuclear mitochondrial (NUMT) DNA/pseudogenes. For the agriculturally important whitefly Bemisia tabaci cryptic pest species complex, species identification depends primarily on characterization of the mitochondrial DNA cytochrome oxidase I (mtDNA COI) gene. The lack of robust PCR primers for the mtDNA COI gene can undermine correct species identification which in turn compromises management strategies. This problem is identified in the B. tabaci Africa/Middle East/Asia Minor clade which comprises the globally invasive Mediterranean (MED) and Middle East Asia Minor I (MEAM1) species, Middle East Asia Minor 2 (MEAM2), and the Indian Ocean (IO) species. Initially identified from the Indian Ocean island of Réunion, MEAM2 has since been reported from Japan, Peru, Turkey and Iraq. We identified MEAM2 individuals from a Peruvian population via Sanger sequencing of the mtDNA COI gene. In attempting to characterize the MEAM2 mitogenome, we instead characterized mitogenomes of MEAM1. We also report on the mitogenomes of MED, AUS, and IO thereby increasing genomic resources for members of this complex. Gene synteny (i.e., same gene composition and orientation) was observed with published B. tabaci cryptic species mitogenomes. Pseudogene fragments matching MEAM2 partial mtDNA COI gene exhibited low frequency single nucleotide polymorphisms that matched low copy number DNA fragments (<3%) of MEAM1 genomes, whereas presence of internal stop codons, loss of expected stop codons and poor primer annealing sites, all suggested MEAM2 as a pseudogene artifact and so not a real species. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
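
    One of the pseudogene signals mentioned above, the presence of internal stop codons in a putative mtDNA COI fragment, can be screened for with a few lines of code. The sketch below assumes the invertebrate mitochondrial genetic code (NCBI translation table 5), in which TAA and TAG are stop codons and TGA encodes tryptophan. The function name and the toy sequences are hypothetical and are not data from the study.

```python
def internal_stops(seq, frame=0, stops=("TAA", "TAG")):
    """Return 0-based codon indices of in-frame stop codons, excluding the last codon.

    The default stops assume the invertebrate mitochondrial code
    (NCBI table 5), where TGA encodes Trp rather than a stop.
    """
    seq = seq.upper()
    # Split into complete codons only; a trailing partial codon is ignored.
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    # The final codon is skipped because a terminal stop is expected there.
    return [i for i, codon in enumerate(codons[:-1]) if codon in stops]


# Toy sequences for illustration only.
print(internal_stops("ATGTGAGGATAA"))  # TGA is not a stop under table 5
print(internal_stops("ATGTAAGGATAA"))  # TAA at codon index 1 is an internal stop
```

    A hit from such a scan is only one line of evidence: the reading frame must be known, and the abstract combines it with primer-site quality and read-frequency data before calling a sequence a NUMT.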

  1. Major taste loss in carnivorous mammals

    PubMed Central

    Jiang, Peihua; Josue, Jesusa; Li, Xia; Glaser, Dieter; Li, Weihua; Brand, Joseph G.; Margolskee, Robert F.; Reed, Danielle R.; Beauchamp, Gary K.

    2012-01-01

    Mammalian sweet taste is primarily mediated by the type 1 taste receptor Tas1r2/Tas1r3, whereas Tas1r1/Tas1r3 act as the principal umami taste receptor. Bitter taste is mediated by a different group of G protein-coupled receptors, the Tas2rs, numbering 3 to ∼66, depending on the species. We showed previously that the behavioral indifference of cats toward sweet-tasting compounds can be explained by the pseudogenization of the Tas1r2 gene, which encodes the Tas1r2 receptor. To examine the generality of this finding, we sequenced the entire coding region of Tas1r2 from 12 species in the order Carnivora. Seven of these nonfeline species, all of which are exclusive meat eaters, also have independently pseudogenized Tas1r2 caused by ORF-disrupting mutations. Fittingly, the purifying selection pressure is markedly relaxed in these species with a pseudogenized Tas1r2. In behavioral tests, the Asian otter (defective Tas1r2) showed no preference for sweet compounds, but the spectacled bear (intact Tas1r2) did. In addition to the inactivation of Tas1r2, we found that sea lion Tas1r1 and Tas1r3 are also pseudogenized, consistent with their unique feeding behavior, which entails swallowing food whole without chewing. The extensive loss of Tas1r receptor function is not restricted to the sea lion: the bottlenose dolphin, which evolved independently from the sea lion but displays similar feeding behavior, also has all three Tas1rs inactivated, and may also lack functional bitter receptors. These data provide strong support for the view that loss of taste receptor function in mammals is widespread and directly related to feeding specializations. PMID:22411809

  2. Analysis of Complete Genomes of Propionibacterium acnes Reveals a Novel Plasmid and Increased Pseudogenes in an Acne Associated Strain

    PubMed Central

    Fitz-Gibbon, Sorel; Tomida, Shuta; Li, Huiying

    2013-01-01

    The human skin harbors a diverse community of bacteria, including the Gram-positive, anaerobic bacterium Propionibacterium acnes. P. acnes has historically been linked to the pathogenesis of acne vulgaris, a common skin disease affecting over 80% of all adolescents in the US. To gain insight into potential P. acnes pathogenic mechanisms, we previously sequenced the complete genome of a P. acnes strain HL096PA1 that is highly associated with acne. In this study, we compared its genome to the first published complete genome KPA171202. HL096PA1 harbors a linear plasmid, pIMPLE-HL096PA1. This is the first described P. acnes plasmid. We also observed a five-fold increase of pseudogenes in HL096PA1, several of which encode proteins in carbohydrate transport and metabolism. In addition, our analysis revealed a few island-like genomic regions that are unique to HL096PA1 and a large genomic inversion spanning the ribosomal operons. Together, these findings offer a basis for understanding P. acnes virulent properties, host adaptation mechanisms, and its potential role in acne pathogenesis at the strain level. Furthermore, the plasmid identified in HL096PA1 may potentially provide a new opportunity for P. acnes genetic manipulation and targeted therapy against specific disease-associated strains. PMID:23762865

  3. Rye B chromosomes encode a functional Argonaute-like protein with in vitro slicer activities similar to its A chromosome paralog.

    PubMed

    Ma, Wei; Gabriel, Tobias Sebastian; Martis, Mihaela Maria; Gursinsky, Torsten; Schubert, Veit; Vrána, Jan; Doležel, Jaroslav; Grundlach, Heidrun; Altschmied, Lothar; Scholz, Uwe; Himmelbach, Axel; Behrens, Sven-Erik; Banaei-Moghaddam, Ali Mohammad; Houben, Andreas

    2017-01-01

    B chromosomes (Bs) are supernumerary, dispensable parts of the nuclear genome, which appear in many different species of eukaryote. So far, Bs have been considered to be genetically inert elements without any functional genes. Our comparative transcriptome analysis and the detection of active RNA polymerase II (RNAPII) in the proximity of B chromatin demonstrate that the Bs of rye (Secale cereale) contribute to the transcriptome. In total, 1954 and 1218 B-derived transcripts with an open reading frame were expressed in generative and vegetative tissues, respectively. In addition to B-derived transposable element transcripts, a high percentage of short transcripts without detectable similarity to known proteins and gene fragments from A chromosomes (As) were found, suggesting an ongoing gene erosion process. In vitro analysis of the A- and B-encoded AGO4B protein variants demonstrated that both possess RNA slicer activity. These data demonstrate unambiguously the presence of a functional AGO4B gene on Bs and that these Bs carry both functional protein coding genes and pseudogene copies. Thus, B-encoded genes may provide an additional level of gene control and complexity in combination with their related A-located genes. Hence, physiological effects, associated with the presence of Bs, may partly be explained by the activity of B-located (pseudo)genes. © 2016 IPK Gatersleben. New Phytologist © 2016 New Phytologist Trust.

  4. Largest vertebrate vomeronasal type 1 receptor gene repertoire in the semiaquatic platypus.

    PubMed

    Grus, Wendy E; Shi, Peng; Zhang, Jianzhi

    2007-10-01

    Vertebrate vomeronasal chemoreception plays important roles in many aspects of an organism's daily life, such as mating, territoriality, and foraging. Vomeronasal type 1 receptors (V1Rs) and vomeronasal type 2 receptors (V2Rs), 2 large families of G protein-coupled receptors, serve as vomeronasal receptors to bind to various pheromones and odorants. Contrary to the previous observations of reduced olfaction in aquatic and semiaquatic mammals, we here report the surprising finding that the platypus, a semiaquatic monotreme, has the largest V1R repertoire and nearly largest combined repertoire of V1Rs and V2Rs of all vertebrates surveyed, with 270 intact genes and 579 pseudogenes in the V1R family and 15 intact genes, 55 potentially intact genes, and 57 pseudogenes in the V2R family. Phylogenetic analysis shows a remarkable expansion of the V1R repertoire and a moderate expansion of the V2R repertoire in platypus since the separation of monotremes from placentals and marsupials. Our results challenge the view that olfaction is unimportant to aquatic mammals and call for further study into the role of vomeronasal reception in platypus physiology and behavior.

  5. Extensive gene conversion at the PMS2 DNA mismatch repair locus.

    PubMed

    Hayward, Bruce E; De Vos, Michel; Valleley, Elizabeth M A; Charlton, Ruth S; Taylor, Graham R; Sheridan, Eamonn; Bonthron, David T

    2007-05-01

    Mutations of the PMS2 DNA repair gene predispose to a characteristic range of malignancies, with either childhood onset (when both alleles are mutated) or a partially penetrant adult onset (if heterozygous). These mutations have been difficult to detect, due to interference from a family of pseudogenes located on chromosome 7. One of these, the PMS2CL pseudogene, lies within a 100-kb inverted duplication (inv dup), 700 kb centromeric to PMS2 itself on 7p22. Here, we show that the reference genomic sequences cannot be relied upon to distinguish PMS2 from PMS2CL, because of sequence transfer between the two loci. The 7p22 inv dup occurred prior to the divergence of modern ape species (15 million years ago [Mya]), but has undergone extensive sequence homogenization. This process appears to be ongoing, since there is considerable allelic diversity within the duplicated region, much of it derived from sequence exchange between PMS2 and PMS2CL. This sequence diversity can result in both false-positive and false-negative mutation analysis at this locus. Great caution is still needed in the design and interpretation of PMS2 mutation screens. 2007 Wiley-Liss, Inc.

  6. Cloning and characterization of the promoter regions from the parent and paralogous creatine transporter genes.

    PubMed

    Ndika, Joseph D T; Lusink, Vera; Beaubrun, Claudine; Kanhai, Warsha; Martinez-Munoz, Cristina; Jakobs, Cornelis; Salomons, Gajja S

    2014-01-10

    Interconversion between phosphocreatine and creatine, catalyzed by creatine kinase, is crucial in the supply of ATP to tissues with high energy demand. Creatine's importance has been established by its use as an ergogenic aid in sport, as well as the development of intellectual disability in patients with congenital creatine deficiency. Creatine biosynthesis is complemented by dietary creatine uptake. Intracellular transport of creatine is carried out by a creatine transporter protein (CT1/CRT/CRTR) encoded by the SLC6A8 gene. Most tissues express this gene, with highest levels detected in skeletal muscle and kidney. There are lower levels of the gene detected in colon, brain, heart, testis and prostate. The mechanism(s) by which this regulation occurs is still poorly understood. A duplicated unprocessed pseudogene of SLC6A8, SLC6A10P, has been mapped to chromosome 16p11.2 (contains the entire SLC6A8 gene, plus 2293 bp of 5' flanking sequence and its entire 3'UTR). Expression of SLC6A10P has so far only been shown in human testis and brain. The function of SLC6A10P is still unclear. In a patient with autism, a chromosomal breakpoint that intersects the 5' flanking region of SLC6A10P was identified, suggesting that SLC6A10P is a non-coding RNA involved in autism. Our aim was to investigate the presence of cis-acting factor(s) that regulate expression of the creatine transporter, as well as to determine if these factors are functionally conserved upstream of the creatine transporter pseudogene. Via gene-specific PCR, cloning and functional luciferase assays we identified a 1104 bp sequence proximal to the mRNA start site of the SLC6A8 gene with promoter activity in five cell types. The corresponding 5' flanking sequence (1050 bp) on the pseudogene also had promoter activity in all 5 cell lines. Surprisingly the pseudogene promoter was stronger than that of its parent gene in 4 of the cell lines tested. To the best of our knowledge, this is the first experimental evidence of a pseudogene with stronger promoter activity than its parental gene. © 2013.

  7. Characterization of trh2 harbouring Vibrio parahaemolyticus strains isolated in Germany.

    PubMed

    Bechlars, Silke; Jäckel, Claudia; Diescher, Susanne; Wüstenhagen, Doreen A; Kubick, Stefan; Dieckmann, Ralf; Strauch, Eckhard

    2015-01-01

    Vibrio parahaemolyticus is a recognized human enteropathogen. Thermostable direct hemolysin (TDH) and TDH-related hemolysin (TRH) as well as the type III secretion system 2 (T3SS2) are considered as major virulence factors. As tdh positive strains are not detected in coastal waters of Germany, we focused on the characterization of trh positive strains, which were isolated from mussels, seawater and patients in Germany. Ten trh harbouring V. parahaemolyticus strains from Germany were compared to twenty-one trh positive strains from other countries. The complete trh sequences revealed clustering into three different types: trh1 and trh2 genes and a pseudogene Ψtrh. All German isolates possessed alleles of the trh2 gene. MLST analysis indicated a close relationship to Norwegian isolates suggesting that these strains belong to the autochthonous microflora of Northern Europe seawaters. Strains carrying the pseudogene Ψtrh were negative for T3SS2β effector vopC. Transcription of trh and vopC genes was analyzed under different growth conditions. Trh2 gene expression was not altered by bile while trh1 genes were inducible. VopC could be induced by urea in trh2 bearing strains. Most trh1 carrying strains were hemolytic against sheep erythrocytes while all trh2 positive strains did not show any hemolytic activity. TRH variants were synthesized in a prokaryotic cell-free system and their hemolytic activity was analyzed. TRH1 was active against sheep erythrocytes while TRH2 variants were not active at all. Our study reveals a high diversity among trh positive V. parahaemolyticus strains. The function of TRH2 hemolysins and the role of the pseudogene Ψtrh as pathogenicity factors are questionable. To assess the pathogenic potential of V. parahaemolyticus strains a differentiation of trh variants and the detection of T3SS2β components like vopC would improve the V. parahaemolyticus diagnostics and could lead to a refinement of the risk assessment in food analyses and clinical diagnostics.

  8. Mutational analysis of a patient with mucopolysaccharidosis type VII, and identification of pseudogenes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shipley, J.M.; Klinkenberg, M.; Wu, B.M.

    1993-03-01

    PCR of cDNA produced from patient fibroblasts allowed the authors to determine the paternal mutation in the first patient reported with β-glucuronidase-deficiency mucopolysaccharidosis type VII (MPS VII). The G→T transversion 1,881 bp downstream of the ATG translation initiation codon destroys an MboII restriction site and converts Trp627 to Cys (W627C). Digestion of genomic DNA PCR fragments with MboII indicated that the patient and the father were heterozygous for this missense mutation in exon 12. Failure to find cDNAs from patient RNA which did not contain this mutation suggested that the maternal mutation leads to greatly reduced synthesis or reduced stability of mRNA from the mutant allele. In order to identify the maternal mutation, it was necessary to analyze genomic sequences. This approach was complicated by the finding of multiple unprocessed pseudogenes and/or closely related genes. Using PCR with a panel of human/rodent hybrid cell lines, the authors found that these pseudogenes were present over chromosomes 5-7, 20, and 22 and the Y chromosome. Conditions were defined which allowed them to amplify and characterize genomic sequences for the true β-glucuronidase gene despite this background of related sequences. The patient proved to be heterozygous for a second mutation, in which a C→T transition introduces a termination codon (R356STOP) in exon 7. The mother was also heterozygous for this mutation. Expression of a cDNA containing the maternal mutation produced no enzyme activity, as expected. Expression of the paternal mutation in COS-7 cells produced a surprisingly high (65% of control) level of activity. However, activity was 13% of control in transiently transfected murine MPS VII cells. The level of activity of this mutant allele appears to correlate with the level of overexpression. 39 refs., 5 figs., 1 tab.

  9. Intersecting transcriptomic profiling technologies and long non-coding RNA function in lung adenocarcinoma: discovery, mechanisms, and therapeutic applications

    PubMed Central

    Castillo, Jonathan; Stueve, Theresa R.; Marconett, Crystal N.

    2017-01-01

    Previously thought of as junk transcripts and pseudogene remnants, long non-coding RNAs (lncRNAs) have come into their own over the last decade as an essential component of cellular activity, regulating a plethora of functions within multicellular organisms. lncRNAs are now known to participate in development, cellular homeostasis, immunological processes, and the development of disease. With the advent of next generation sequencing technology, hundreds of thousands of lncRNAs have been identified. However, movement beyond mere discovery to the understanding of molecular processes has been stymied by the complicated genomic structure, tissue-restricted expression, and diverse regulatory roles lncRNAs play. In this review, we will focus on lncRNAs involved in lung cancer, the most common cause of cancer-related death in the United States and worldwide. We will summarize their various methods of discovery, provide consensus rankings of deregulated lncRNAs in lung cancer, and describe in detail the limited functional analysis that has been undertaken so far. PMID:29113413

  10. Complete Chloroplast Genome of Pinus massoniana (Pinaceae): Gene Rearrangements, Loss of ndh Genes, and Short Inverted Repeats Contraction, Expansion.

    PubMed

    Ni, ZhouXian; Ye, YouJu; Bai, Tiandao; Xu, Meng; Xu, Li-An

    2017-09-11

    The chloroplast genome (CPG) of Pinus massoniana, a member of the genus Pinus (Pinaceae) and a primary source of turpentine, was sequenced and analyzed in terms of gene rearrangements, loss of ndh genes, and the contraction and expansion of short inverted repeats (IRs). The P. massoniana CPG has a typical quadripartite structure that includes a large single copy (LSC) region (65,563 bp), a small single copy (SSC) region (53,230 bp) and two IRs (IRa and IRb, 485 bp each). A total of 108 unique genes were identified, including 73 protein-coding genes, 31 tRNAs, and 4 rRNAs. Most of the 81 simple sequence repeats (SSRs) identified in the CPG were mononucleotide motifs of A/T types and located in non-coding regions. Comparisons with related species revealed an inversion (21,556 bp) in the LSC region; the P. massoniana CPG lacks all 11 intact ndh genes (four ndh genes lost completely; five remaining as truncated pseudogenes; and the other two ndh genes remaining as pseudogenes because of short insertions or deletions). A pair of short IRs was found instead of large IRs, and size variations among pine species were observed, which resulted from short insertions or deletions and non-synchronized variations between "IRa" and "IRb". The results of phylogenetic analyses based on whole CPG sequences of 16 conifers indicated that the whole CPG sequences could be used as a powerful tool in phylogenetic analyses.
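
    The figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below sums the four regions of the quadripartite structure, the three gene categories, and the three fates of the ndh genes. The total genome length is not stated in the abstract; the value computed here is only what the listed region lengths imply, assuming both IR copies are 485 bp. The variable names are chosen for illustration.

```python
# Region lengths in bp as listed in the abstract (IRa and IRb assumed 485 bp each).
regions = {"LSC": 65_563, "SSC": 53_230, "IRa": 485, "IRb": 485}
implied_total = sum(regions.values())

# Unique gene categories as listed in the abstract.
genes = {"protein_coding": 73, "tRNA": 31, "rRNA": 4}

# Fates of the ndh genes as listed in the abstract.
ndh = {"lost_completely": 4, "truncated_pseudogene": 5, "indel_pseudogene": 2}

print(f"implied genome length: {implied_total} bp")   # 119763
print(f"unique genes: {sum(genes.values())}")         # 108
print(f"ndh genes accounted for: {sum(ndh.values())}")  # 11
```

    The gene and ndh tallies match the totals given in the text (108 and 11), which is a quick way to confirm that no category was dropped when the abstract was transcribed.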

  11. AtTCTP2, an Arabidopsis thaliana homolog of Translationally Controlled Tumor Protein, enhances in vitro plant regeneration

    PubMed Central

    Toscano-Morales, Roberto; Xoconostle-Cázares, Beatriz; Cabrera-Ponce, José L.; Hinojosa-Moya, Jesús; Ruiz-Salas, Jorge L.; Galván-Gordillo, Santiago V.; Guevara-González, Ramón G.; Ruiz-Medrano, Roberto

    2015-01-01

    The Translationally Controlled Tumor Protein (TCTP) is a central regulator of cell proliferation and differentiation in animals, and probably also in plants. Arabidopsis harbors two TCTP genes, AtTCTP1 (At3g16640), which is an important mitotic regulator, and AtTCTP2 (At3g05540), which is considered a pseudogene. Nevertheless, we have obtained evidence suggesting that this gene is functional. Indeed, a T-DNA insertion mutant, SALK_045146, displays a lethal phenotype during early rosette stage. Also, both the AtTCTP2 promoter and structural gene are functional, and heterozygous plants show delayed development. AtTCTP1 cannot compensate for the loss of AtTCTP2, since the accumulation levels of the AtTCTP1 transcript are even higher in heterozygous plants than in wild-type plants. Leaf explants transformed with Agrobacterium rhizogenes harboring AtTCTP2, but not AtTCTP1, led to whole plant regeneration with a high frequency. Insertion of a sequence present in AtTCTP1 but absent in AtTCTP2 demonstrates that it suppresses the capacity for plant regeneration; also, this phenomenon is enhanced by the presence of TCTP (AtTCTP1 or 2) in the nuclei of root cells. This confirms that AtTCTP2 is not a pseudogene and suggests the involvement of certain TCTP isoforms in vegetative reproduction in some plant species. PMID:26191065

  12. Pseudogenization of the MCP-2/CCL8 chemokine gene in European rabbit (genus Oryctolagus), but not in species of Cottontail rabbit (Sylvilagus) and Hare (Lepus)

    PubMed Central

    2012-01-01

    Background: Recent studies in humans have highlighted the importance of the monocyte chemotactic proteins (MCP) in leukocyte trafficking and their effects in inflammatory processes, tumor progression, and HIV-1 infection. In European rabbit (Oryctolagus cuniculus) one of the prime MCP targets, the chemokine receptor CCR5, underwent a unique structural alteration. Until now, no homologues of the MCP-2/CCL8a, MCP-3/CCL7 or MCP-4/CCL13 genes have been reported for this species. This is interesting, because at least the first two genes are expressed in most, if not all, mammals studied, and appear to be implicated in a variety of important chemokine ligand-receptor interactions. By assessing the Rabbit Whole Genome Sequence (WGS) data we have searched for orthologs of the mammalian genes of the MCP-Eotaxin cluster. Results: We have localized the orthologs of these chemokine genes in the genome of European rabbit and compared them to those of leporid genera which do (i.e. Oryctolagus and Bunolagus) or do not share the CCR5 alteration with European rabbit (i.e. Lepus and Sylvilagus). Of the Rabbit orthologs of the CCL8, CCL7, and CCL13 genes only the last two were potentially functional, although showing some structural anomalies at the protein level. The ortholog of MCP-2/CCL8 appeared to be pseudogenized by deleterious nucleotide substitutions affecting exon1 and exon2. By analyzing both genomic and cDNA products, these studies were extended to wild specimens of four genera of the Leporidae family: Oryctolagus, Bunolagus, Lepus, and Sylvilagus. It appeared that the anomalies of the MCP-3/CCL7 and MCP-4/CCL13 proteins are shared among the different species of leporids. In contrast, whereas MCP-2/CCL8 was pseudogenized in every studied specimen of the Oryctolagus - Bunolagus lineage, this gene was intact in species of the Lepus - Sylvilagus lineage, and was, at least in Lepus, correctly transcribed. Conclusion: The biological function of a gene was often revealed in situations of dysfunction or gene loss. Infections with Myxoma virus (MYXV) tend to be fatal in European rabbit (genus Oryctolagus), while being harmless in Hares (genus Lepus) and benign in Cottontail rabbit (genus Sylvilagus), the natural hosts of the virus. This communication should stimulate research on a possible role of MCP-2/CCL8 in poxvirus related pathogenicity. PMID:22894773

  13. Foxo3 activity promoted by non-coding effects of circular RNA and Foxo3 pseudogene in the inhibition of tumor growth and angiogenesis.

    PubMed

    Yang, W; Du, W W; Li, X; Yee, A J; Yang, B B

    2016-07-28

    It has recently been shown that the upregulation of a pseudogene specific to a protein-coding gene could function as a sponge to bind multiple potential targeting microRNAs (miRNAs), resulting in increased gene expression. Similarly, it was recently demonstrated that circular RNAs can function as sponges for miRNAs, and could upregulate expression of mRNAs containing an identical sequence. Furthermore, some mRNAs are now known to not only translate protein, but also function to sponge miRNA binding, facilitating gene expression. Collectively, these appear to be effective mechanisms to ensure gene expression and protein activity. Here we show that expression of a member of the forkhead family of transcription factors, Foxo3, is regulated by the Foxo3 pseudogene (Foxo3P), and Foxo3 circular RNA, both of which bind to eight miRNAs. We found that the ectopic expression of the Foxo3P, Foxo3 circular RNA and Foxo3 mRNA could all suppress tumor growth and cancer cell proliferation and survival. Our results showed that at least three mechanisms are used to ensure protein translation of Foxo3, which reflects an essential role of Foxo3 and its corresponding non-coding RNAs.

  14. New steroid 5alpha-reductase type I (SRD5A1) homologous sequences on human chromosomes 6 and 8.

    PubMed

    Eminović, I; Liović, M; Prezelj, J; Kocijancic, A; Rozman, D; Komel, R

    2001-01-01

    To date, two genes encoding 5alpha-reductase isoenzymes are known (type I, type II), and one type I pseudogene. The divergent localization of these genes and the still not fully understood function of the encoded enzymes, as well as the perplexing results we obtained after sequencing SRD5A1 gene fragments PCR-amplified from genomic DNA, made us assume that, in addition to the known SRD5A1 gene, one or more different human 5alpha-reductase type I coding genes may exist. Our research provides the first evidence for the existence of two new SRD5A1-related, previously unidentified sequences in the human genome. These sequences, which were localized to chromosomes 6 and 8, are highly homologous (>99%) to SRD5A1, and also do not contain any deletions or insertions that are otherwise a characteristic of the SRD5AP1 pseudogene. Our results imply that these sequences may be either coding parts of yet unknown, active SRD5A1 genes, and/or of previously unidentified pseudogenes. These findings additionally support data of Chen et al. who confirmed the existence of various SRD5A1 proteins in cultured human skin cells.

  15. Vomeronasal and Olfactory Structures in Bats Revealed by DiceCT Clarify Genetic Evidence of Function

    PubMed Central

    Yohe, Laurel R.; Hoffmann, Simone; Curtis, Abigail

    2018-01-01

    The degree to which molecular and morphological loss of function occurs synchronously during the vestigialization of traits is not well understood. The mammalian vomeronasal system, a sense critical for mediating many social and reproductive behaviors, is highly conserved across mammals. New World Leaf-nosed bats (Phyllostomidae) are under strong selection to maintain a functional vomeronasal system such that most phyllostomids possess a distinct vomeronasal organ and an intact TRPC2, a gene encoding a protein primarily involved in vomeronasal sensory neuron signal transduction. Recent genetic evidence, however, shows that TRPC2 is a pseudogene in some Caribbean nectarivorous phyllostomids. The loss-of-function mutations suggest the sensory neural tissue of the vomeronasal organ is absent in these species despite strong selection on this gene in its mainland relatives, but the anatomy was unknown in most Caribbean nectarivorous phyllostomids until this study. We used diffusible iodine-based contrast-enhanced computed tomography (diceCT) to test whether the vomeronasal and main olfactory anatomy of several phyllostomid species matched genetic evidence of function, providing insight into whether loss of a structure is linked to pseudogenization of a molecular component of the system. The vomeronasal organ is indeed rudimentary or absent in species with a disrupted TRPC2 gene. Caribbean nectar-feeders also exhibit derived olfactory turbinal morphology and a large olfactory recess that differs from closely related bats that have an intact vomeronasal organ, which may hint that the main olfactory system may compensate for loss. We emphasize non-invasive diceCT is capable of detecting the vomeronasal organ, providing a feasible approach for quantifying mammalian chemosensory anatomy across species. PMID:29867373

  16. Are Synonymous Sites in Primates and Rodents Functionally Constrained?

    PubMed

    Price, Nicholas; Graur, Dan

    2016-01-01

    It has been claimed that synonymous sites in mammals are under selective constraint. Furthermore, in many studies the selective constraint at such sites in primates was claimed to be more stringent than that in rodents. Given the larger effective population sizes in rodents than in primates, the theoretical expectation is that selection in rodents would be more effective than that in primates. To resolve this contradiction between expectations and observations, we used processed pseudogenes as a model for strict neutral evolution, and estimated selective constraint on synonymous sites using the rate of substitution at pseudosynonymous and pseudononsynonymous sites in pseudogenes as the neutral expectation. After controlling for the effects of GC content, our results were similar to those from previous studies, i.e., synonymous sites in primates exhibited evidence for higher selective constraint than those in rodents. Specifically, our results indicated that in primates up to 24% of synonymous sites could be under purifying selection, while in rodents synonymous sites evolved neutrally. To further control for shifts in GC content, we estimated selective constraint at fourfold degenerate sites using a maximum parsimony approach. This allowed us to estimate selective constraint using mutational patterns that cause a shift in GC content (GT ↔ TG, CT ↔ TC, GA ↔ AG, and CA ↔ AC) and ones that do not (AT ↔ TA and CG ↔ GC). Using this approach, we found that synonymous sites evolve neutrally in both primates and rodents. Apparent deviations from neutrality were caused by a higher rate of C → A and C → T mutations in pseudogenes. Such differences are most likely caused by the shift in GC content experienced by pseudogenes. We conclude that previous estimates according to which 20-40% of synonymous sites in primates were under selective constraint were most likely artifacts of the biased pattern of mutation.
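    The idea of measuring constraint against a neutral pseudogene rate can be sketched as below. This is a minimal illustration, assuming constraint is expressed as the fractional reduction of the synonymous substitution rate relative to the pseudogene rate, which is one common definition and not necessarily the exact estimator used in the study. The function name and the example rates are hypothetical.

```python
def selective_constraint(k_syn: float, k_neutral: float) -> float:
    """Return the fraction of sites inferred to be under purifying selection.

    k_syn     -- substitutions per site at synonymous sites (hypothetical input)
    k_neutral -- substitutions per site at the matching pseudogene sites,
                 taken as the neutral expectation (hypothetical input)
    """
    if k_neutral <= 0:
        raise ValueError("neutral rate must be positive")
    # Constraint is the shortfall of the observed rate relative to neutrality.
    # A rate at or above the neutral rate is reported as zero constraint.
    return max(0.0, 1.0 - k_syn / k_neutral)


if __name__ == "__main__":
    # Illustrative numbers only: a synonymous rate that is 76% of the
    # neutral rate corresponds to roughly 24% of sites being constrained.
    print(round(selective_constraint(0.076, 0.100), 2))
```

    Clamping at zero is a design choice for this sketch; an analysis interested in faster-than-neutral evolution would keep the signed value.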

  17. The Complete Nucleotide Sequence of the Human Immunoglobulin Heavy Chain Variable Region Locus

    PubMed Central

    Matsuda, Fumihiko; Ishii, Kazuo; Bourvagnet, Patrice; Kuma, Kei-ichi; Hayashida, Hidenori; Miyata, Takashi; Honjo, Tasuku

    1998-01-01

    The complete nucleotide sequence of the 957-kb DNA of the human immunoglobulin heavy chain variable (VH) region locus was determined and 43 novel VH segments were identified. The region contains 123 VH segments classifiable into seven different families, of which 79 are pseudogenes. Of the 44 VH segments with an open reading frame, 39 are expressed as heavy chain proteins and 1 as mRNA, while the remaining 4 are not found in immunoglobulin cDNAs. Combinatorial diversity of VH region was calculated to be ∼6,000. Conservation of the promoter and recombination signal sequences was observed to be higher in functional VH segments than in pseudogenes. Phylogenetic analysis of 114 VH segments clearly showed clustering of the VH segments of each family. However, an independent branch in the tree contained a single VH, V4-44.1P, sharing similar levels of homology to human VH families and to those of other vertebrates. Comparison between different copies of homologous units that appear repeatedly across the locus clearly demonstrates that dynamic DNA reorganization of the locus took place at least eight times between 133 and 10 million years ago. One nonimmunoglobulin gene of unknown function was identified in the intergenic region. PMID:9841928

  18. Structural and functional partitioning of bread wheat chromosome 3B.

    PubMed

    Choulet, Frédéric; Alberti, Adriana; Theil, Sébastien; Glover, Natasha; Barbe, Valérie; Daron, Josquin; Pingault, Lise; Sourdille, Pierre; Couloux, Arnaud; Paux, Etienne; Leroy, Philippe; Mangenot, Sophie; Guilhot, Nicolas; Le Gouis, Jacques; Balfourier, Francois; Alaux, Michael; Jamilloux, Véronique; Poulain, Julie; Durand, Céline; Bellec, Arnaud; Gaspin, Christine; Safar, Jan; Dolezel, Jaroslav; Rogers, Jane; Vandepoele, Klaas; Aury, Jean-Marc; Mayer, Klaus; Berges, Hélène; Quesneville, Hadi; Wincker, Patrick; Feuillet, Catherine

    2014-07-18

    We produced a reference sequence of the 1-gigabase chromosome 3B of hexaploid bread wheat. By sequencing 8452 bacterial artificial chromosomes in pools, we assembled a sequence of 774 megabases carrying 5326 protein-coding genes, 1938 pseudogenes, and 85% of transposable elements. The distribution of structural and functional features along the chromosome revealed partitioning correlated with meiotic recombination. Comparative analyses indicated high wheat-specific inter- and intrachromosomal gene duplication activities that are potential sources of variability for adaption. In addition to providing a better understanding of the organization, function, and evolution of a large and polyploid genome, the availability of a high-quality sequence anchored to genetic maps will accelerate the identification of genes underlying important agronomic traits. Copyright © 2014, American Association for the Advancement of Science.

  19. Confirmation of translatability and functionality certifies the dual endothelin1/VEGFsp receptor (DEspR) protein.

    PubMed

    Herrera, Victoria L M; Steffen, Martin; Moran, Ann Marie; Tan, Glaiza A; Pasion, Khristine A; Rivera, Keith; Pappin, Darryl J; Ruiz-Opazo, Nelson

    2016-06-14

    In contrast to rat and mouse databases, the NCBI gene database lists the human dual-endothelin1/VEGFsp receptor (DEspR, formerly Dear) as a unitary transcribed pseudogene due to a stop [TGA]-codon at codon#14 in automated DNA and RNA sequences. However, re-analysis is needed given prior single gene studies detected a tryptophan [TGG]-codon#14 by manual Sanger sequencing, demonstrated DEspR translatability and functionality, and since the demonstration of actual non-translatability through expression studies, the standard-of-excellence for pseudogene designation, has not been performed. Re-analysis must meet UNIPROT criteria for demonstration of a protein's existence at the highest (protein) level, which a priori, would override DNA- or RNA-based deductions. To dissect the nucleotide sequence discrepancy, we performed Maxam-Gilbert sequencing and reviewed 727 RNA-seq entries. To comply with the highest level multiple UNIPROT criteria for determining DEspR's existence, we performed various experiments using multiple anti-DEspR monoclonal antibodies (mAbs) targeting distinct DEspR epitopes with one spanning the contested tryptophan [TGG]-codon#14, assessing: (a) DEspR protein expression, (b) predicted full-length protein size, (c) sequence-predicted protein-specific properties beyond codon#14: receptor glycosylation and internalization, (d) protein-partner interactions, and (e) DEspR functionality via DEspR-inhibition effects. Maxam-Gilbert sequencing and some RNA-seq entries demonstrate two guanines, hence a tryptophan [TGG]-codon#14 within a compression site spanning an error-prone compression sequence motif. Western blot analysis using anti-DEspR mAbs targeting distinct DEspR epitopes detect the identical glycosylated 17.5 kDa pull-down protein. Decrease in DEspR-protein size after PNGase-F digest demonstrates post-translational glycosylation, concordant with the consensus-glycosylation site beyond codon#14. 
Like other small single-transmembrane proteins, mass spectrometry analysis of anti-DEspR mAb pull-down proteins do not detect DEspR, but detect DEspR-protein interactions with proteins implicated in intracellular trafficking and cancer. FACS analyses also detect DEspR-protein in different human cancer stem-like cells (CSCs). DEspR-inhibition studies identify DEspR-roles in CSC survival and growth. Live cell imaging detects fluorescently-labeled anti-DEspR mAb targeted-receptor internalization, concordant with the single internalization-recognition sequence also located beyond codon#14. Data confirm translatability of DEspR, the full-length DEspR protein beyond codon#14, and elucidate DEspR-specific functionality. Along with detection of the tryptophan [TGG]-codon#14 within an error-prone compression site, cumulative data demonstrating DEspR protein existence fulfill multiple UNIPROT criteria, thus refuting its pseudogene designation.

  20. Analysis of the Complete Chloroplast Genome of a Medicinal Plant, Dianthus superbus var. longicalyncinus, from a Comparative Genomics Perspective.

    PubMed

    Raman, Gurusamy; Park, SeonJoo

    2015-01-01

    Dianthus superbus var. longicalycinus is an economically important traditional Chinese medicinal plant that is also used for ornamental purposes. In this study, D. superbus was compared to its closely related family of Caryophyllaceae chloroplast (cp) genomes such as Lychnis chalcedonica and Spinacia oleracea. D. superbus had the longest large single copy (LSC) region (82,805 bp), with some variations in the inverted repeat region A (IRA)/LSC regions. The IRs underwent both expansion and constriction during evolution of the Caryophyllaceae family; however, intense variations were not identified. The pseudogene ribosomal protein subunit S19 (rps19) was identified at the IRA/LSC junction, but was not present in the cp genome of other Caryophyllaceae family members. The translation initiation factor IF-1 (infA) and ribosomal protein subunit L23 (rpl23) genes were absent from the Dianthus cp genome. When the cp genome of Dianthus was compared with 31 other angiosperm lineages, the infA gene was found to have been lost in most members of rosids, solanales of asterids and Lychnis of Caryophyllales, whereas rpl23 gene loss or pseudogenization had occurred exclusively in Caryophyllales. Nevertheless, the cp genome of Dianthus and Spinacia has two introns in the proteolytic subunit of ATP-dependent protease (clpP) gene, but Lychnis has lost introns from the clpP gene. Furthermore, phylogenetic analysis of individual protein-coding genes infA and rpl23 revealed that gene loss or pseudogenization occurred independently in the cp genome of Dianthus. Molecular phylogenetic analysis also demonstrated a sister relationship between Dianthus and Lychnis based on 78 protein-coding sequences. The results presented herein will contribute to studies of the evolution, molecular biology and genetic engineering of the medicinal and ornamental plant, D. superbus var. longicalycinus.

  1. Analysis of the Complete Chloroplast Genome of a Medicinal Plant, Dianthus superbus var. longicalyncinus, from a Comparative Genomics Perspective

    PubMed Central

    Raman, Gurusamy; Park, SeonJoo

    2015-01-01

    Dianthus superbus var. longicalycinus is an economically important traditional Chinese medicinal plant that is also used for ornamental purposes. In this study, D. superbus was compared to its closely related family of Caryophyllaceae chloroplast (cp) genomes such as Lychnis chalcedonica and Spinacia oleracea. D. superbus had the longest large single copy (LSC) region (82,805 bp), with some variations in the inverted repeat region A (IRA)/LSC regions. The IRs underwent both expansion and constriction during evolution of the Caryophyllaceae family; however, intense variations were not identified. The pseudogene ribosomal protein subunit S19 (rps19) was identified at the IRA/LSC junction, but was not present in the cp genome of other Caryophyllaceae family members. The translation initiation factor IF-1 (infA) and ribosomal protein subunit L23 (rpl23) genes were absent from the Dianthus cp genome. When the cp genome of Dianthus was compared with 31 other angiosperm lineages, the infA gene was found to have been lost in most members of rosids, solanales of asterids and Lychnis of Caryophyllales, whereas rpl23 gene loss or pseudogenization had occurred exclusively in Caryophyllales. Nevertheless, the cp genome of Dianthus and Spinacia has two introns in the proteolytic subunit of ATP-dependent protease (clpP) gene, but Lychnis has lost introns from the clpP gene. Furthermore, phylogenetic analysis of individual protein-coding genes infA and rpl23 revealed that gene loss or pseudogenization occurred independently in the cp genome of Dianthus. Molecular phylogenetic analysis also demonstrated a sister relationship between Dianthus and Lychnis based on 78 protein-coding sequences. The results presented herein will contribute to studies of the evolution, molecular biology and genetic engineering of the medicinal and ornamental plant, D. superbus var. longicalycinus. PMID:26513163

  2. Regional differences in mitochondrial DNA methylation in human post-mortem brain tissue.

    PubMed

    Devall, Matthew; Smith, Rebecca G; Jeffries, Aaron; Hannon, Eilis; Davies, Matthew N; Schalkwyk, Leonard; Mill, Jonathan; Weedon, Michael; Lunnon, Katie

    2017-01-01

    DNA methylation is an important epigenetic mechanism involved in gene regulation, with alterations in DNA methylation in the nuclear genome being linked to numerous complex diseases. Mitochondrial DNA methylation is a phenomenon that is receiving ever-increasing interest, particularly in diseases characterized by mitochondrial dysfunction; however, most studies have been limited to the investigation of specific target regions. Analyses spanning the entire mitochondrial genome have been limited, potentially due to the amount of input DNA required. Further, mitochondrial genetic studies have been previously confounded by nuclear-mitochondrial pseudogenes. Methylated DNA Immunoprecipitation Sequencing is a technique widely used to profile DNA methylation across the nuclear genome; however, reads mapped to mitochondrial DNA are often discarded. Here, we have developed an approach to control for nuclear-mitochondrial pseudogenes within Methylated DNA Immunoprecipitation Sequencing data. We highlight the utility of this approach in identifying differences in mitochondrial DNA methylation across regions of the human brain and pre-mortem blood. We were able to correlate mitochondrial DNA methylation patterns between the cortex, cerebellum and blood. We identified 74 nominally significant differentially methylated regions (p < 0.05) in the mitochondrial genome, between anatomically separate cortical regions and the cerebellum in matched samples (N = 3 matched donors). Further analysis identified eight significant differentially methylated regions between the total cortex and cerebellum after correcting for multiple testing. Using unsupervised hierarchical clustering analysis of the mitochondrial DNA methylome, we were able to identify tissue-specific patterns of mitochondrial DNA methylation between blood, cerebellum and cortex.
Our study represents a comprehensive analysis of the mitochondrial methylome using pre-existing Methylated DNA Immunoprecipitation Sequencing data to identify brain region-specific patterns of mitochondrial DNA methylation.

  3. Harnessing Gene Conversion in Chicken B Cells to Create a Human Antibody Sequence Repertoire

    PubMed Central

    Schusser, Benjamin; Yi, Henry; Collarini, Ellen J.; Izquierdo, Shelley Mettler; Harriman, William D.; Etches, Robert J.; Leighton, Philip A.

    2013-01-01

    Transgenic chickens expressing human sequence antibodies would be a powerful tool to access human targets and epitopes that have been intractable in mammalian hosts because of tolerance to conserved proteins. To foster the development of the chicken platform, it is beneficial to validate transgene constructs using a rapid, cell culture-based method prior to generating fully transgenic birds. We describe a method for the expression of human immunoglobulin variable regions in the chicken DT40 B cell line and the further diversification of these genes by gene conversion. Chicken VL and VH loci were knocked out in DT40 cells and replaced with human VK and VH genes. To achieve gene conversion of human genes in chicken B cells, synthetic human pseudogene arrays were inserted upstream of the functional human VK and VH regions. Proper expression of chimeric IgM comprised of human variable regions and chicken constant regions is shown. Most importantly, sequencing of DT40 genetic variants confirmed that the human pseudogene arrays contributed to the generation of diversity through gene conversion at both the Igl and Igh loci. These data show that engineered pseudogene arrays produce a diverse pool of human antibody sequences in chicken B cells, and suggest that these constructs will express a functional repertoire of chimeric antibodies in transgenic chickens. PMID:24278246

  4. Long-range PCR facilitates the identification of PMS2-specific mutations.

    PubMed

    Clendenning, Mark; Hampel, Heather; LaJeunesse, Jennifer; Lindblom, Annika; Lockman, Jan; Nilbert, Mef; Senter, Leigha; Sotamaa, Kaisa; de la Chapelle, Albert

    2006-05-01

    Mutations within the DNA mismatch repair gene, "postmeiotic segregation increased 2" (PMS2), have been associated with a predisposition to hereditary nonpolyposis colorectal cancer (HNPCC; Lynch syndrome). The presence of a large family of highly homologous PMS2 pseudogenes has made previous attempts to sequence PMS2 very difficult. Here, we describe a novel method that utilizes long-range PCR as a way to preferentially amplify PMS2 and not the pseudogenes. A second, exon-specific, amplification from diluted long-range products enables us to obtain a clean sequence that shows no evidence of pseudogene contamination. This method has been used to screen a cohort of patients whose tumors were negative for the PMS2 protein by immunohistochemistry and had not shown any mutations within the MLH1 gene. Sequencing of the PMS2 gene from 30 colorectal and 11 endometrial cancer patients identified 10 novel sequence changes as well as 17 sequence changes that had previously been identified. In total, putative pathologic mutations were detected in 11 of the 41 families. Among these were five novel mutations, c.705+1G>T, c.736_741del6ins11, c.862_863del, c.1688G>T, and c.2007-1G>A. We conclude that PMS2 mutation detection in selected Lynch syndrome and Lynch syndrome-like patients is both feasible and desirable. Published 2006 Wiley-Liss, Inc.

  5. RExPrimer: an integrated primer designing tool increases PCR effectiveness by avoiding 3' SNP-in-primer and mis-priming from structural variation

    PubMed Central

    2009-01-01

    Background Polymerase chain reaction (PCR) is very useful in many areas of molecular biology research. It is commonly observed that PCR success is critically dependent on design of an effective primer pair. Current tools for primer design do not adequately address the problem of PCR failure due to mis-priming on target-related sequences and structural variations in the genome. Methods We have developed an integrated graphical web-based application for primer design, called RExPrimer, which was written in Python language. The software uses Primer3 as the primer designing core algorithm. Locally stored sequence information and genomic variant information were hosted on MySQLv5.0 and were incorporated into RExPrimer. Results RExPrimer provides many functionalities for improved PCR primer design. Several databases, namely annotated human SNP databases, insertion/deletion (indel) polymorphisms database, pseudogene database, and structural genomic variation databases were integrated into RExPrimer, enabling an effective without-leaving-the-website validation of the resulting primers. By incorporating these databases, the primers reported by RExPrimer avoid mis-priming to related sequences (e.g. pseudogene, segmental duplication) as well as possible PCR failure because of structural polymorphisms (SNP, indel, and copy number variation (CNV)). To prevent mismatching caused by unexpected SNPs in the designed primers, in particular the 3' end (SNP-in-Primer), several SNP databases covering the broad range of population-specific SNP information are utilized to report SNPs present in the primer sequences. Population-specific SNP information also helps customize primer design for a specific population. Furthermore, RExPrimer offers a graphical user-friendly interface through the use of scalable vector graphic image that intuitively presents resulting primers along with the corresponding gene structure. 
In this study, we demonstrated the program effectiveness in successfully generating primers for strong homologous sequences. Conclusion The improvements for primer design incorporated into RExPrimer were demonstrated to be effective in designing primers for challenging PCR experiments. Integration of SNP and structural variation databases allows for robust primer design for a variety of PCR applications, irrespective of the sequence complexity in the region of interest. This software is freely available at http://www4a.biotec.or.th/rexprimer. PMID:19958502
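    The SNP-in-primer check described in this abstract can be illustrated with a short sketch. It is not RExPrimer's implementation: the function name, the 0-based coordinate convention, and the default window of 5 bases at the 3' end are all assumptions made for illustration.

```python
from typing import Iterable, List


def snps_near_3prime(primer_start: int, primer_len: int,
                     snp_positions: Iterable[int],
                     strand: str = "+", window: int = 5) -> List[int]:
    """List known SNP positions that fall in the 3'-terminal bases of a primer.

    Coordinates are 0-based genomic positions (an assumption of this sketch).
    A '+' strand primer has its 3' end at the highest coordinate it covers;
    a '-' strand primer has its 3' end at the lowest coordinate.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    primer_end = primer_start + primer_len  # exclusive end of the primer
    if strand == "+":
        lo, hi = primer_end - window, primer_end
    else:
        lo, hi = primer_start, primer_start + window
    # Never look outside the primer itself, even if window > primer_len.
    lo, hi = max(lo, primer_start), min(hi, primer_end)
    return sorted(p for p in snp_positions if lo <= p < hi)


if __name__ == "__main__":
    # Hypothetical primer covering positions 100..119 and four known SNPs.
    known_snps = [101, 117, 119, 120]
    print(snps_near_3prime(100, 20, known_snps, strand="+"))
    print(snps_near_3prime(100, 20, known_snps, strand="-"))
```

    A primer for which the returned list is non-empty would be a candidate for redesign, since a mismatch close to the 3' end is the case the abstract singles out.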

  6. Genomic organization of the human gene (CA5) and pseudogene for mitochondrial carbonic anhydrase V and their localization to chromosomes 16q and 16p

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nagao, Yoshiro; Sly, W.S.; Batanian, J.R.

    1995-08-10

    Carbonic anhydrase V (CA V) is expressed in mitochondrial matrix in liver and several other tissues. It is of interest for its putative roles in providing bicarbonate to carbamoyl phosphate synthetase for ureagenesis and to pyruvate carboxylase for gluconeogenesis and its possible importance in explaining certain inherited metabolic disorders with hyperammonemia and hypoglycemia. Following the recent characterization of the cDNA for human CA V, we report the isolation of the human gene from two λ genomic libraries and its characterization. The CA V gene (CA5) is approximately 50 kb long and contains 7 exons and 6 introns. The exon-intron boundaries are found in positions identical to those determined for the previously described CA II, CA III, and CA VII genes. Like the CA VII gene, CA5 does not contain typical TATA and CAAT promoter elements in the 5′ flanking region but does contain a TTTAA sequence 147 nucleotides upstream of the initiation codon. CA5 also contains a 12-bp GT-rich segment beginning 13 bp downstream of the polyadenylation signal in the 3′ untranslated region of exon 7. FISH analysis allowed CA5 to be assigned to chromosome 16q24.3. An unprocessed pseudogene containing sequence homologous to exons 3-7 and introns 3-6 was also isolated and was assigned by FISH analysis to chromosome 16p11.2-p12. 22 refs., 4 figs., 1 tab.

  7. Molecular evolution of the insect chemoreceptor gene superfamily in Drosophila melanogaster.

    PubMed

    Robertson, Hugh M; Warr, Coral G; Carlson, John R

    2003-11-25

    The insect chemoreceptor superfamily in Drosophila melanogaster is predicted to consist of 62 odorant receptor (Or) and 68 gustatory receptor (Gr) proteins, encoded by families of 60 Or and 60 Gr genes through alternative splicing. We include two previously undescribed Or genes and two previously undescribed Gr genes; two previously predicted Or genes are shown to be alternative splice forms. Three polymorphic pseudogenes and one highly defective pseudogene are recognized. Phylogenetic analysis reveals deep branches connecting multiple highly divergent clades within the Gr family, and the Or family appears to be a single highly expanded lineage within the superfamily. The genes are spread throughout the Drosophila genome, with some relatively recently diverged genes still clustered in the genome. The Gr5a gene on the X chromosome, which encodes a receptor for the sugar trehalose, has transposed from one such tandem cluster of six genes at cytological location 64, as has Gr61a, and all eight of these receptors might bind sugars. Analysis of intron evolution suggests that the common ancestor consisted of a long N-terminal exon encoding transmembrane domains 1-5 followed by three exons encoding transmembrane domains 6-7. As many as 57 additional introns have been acquired idiosyncratically during the evolution of the superfamily, whereas the ancestral introns and some of the older idiosyncratic introns have been lost at least 48 times independently. Altogether, these patterns of molecular evolution suggest that this is an ancient superfamily of chemoreceptors, probably dating back at least to the origin of the arthropods.

  8. Molecular evolution of the insect chemoreceptor gene superfamily in Drosophila melanogaster

    PubMed Central

    Robertson, Hugh M.; Warr, Coral G.; Carlson, John R.

    2003-01-01

    The insect chemoreceptor superfamily in Drosophila melanogaster is predicted to consist of 62 odorant receptor (Or) and 68 gustatory receptor (Gr) proteins, encoded by families of 60 Or and 60 Gr genes through alternative splicing. We include two previously undescribed Or genes and two previously undescribed Gr genes; two previously predicted Or genes are shown to be alternative splice forms. Three polymorphic pseudogenes and one highly defective pseudogene are recognized. Phylogenetic analysis reveals deep branches connecting multiple highly divergent clades within the Gr family, and the Or family appears to be a single highly expanded lineage within the superfamily. The genes are spread throughout the Drosophila genome, with some relatively recently diverged genes still clustered in the genome. The Gr5a gene on the X chromosome, which encodes a receptor for the sugar trehalose, has transposed from one such tandem cluster of six genes at cytological location 64, as has Gr61a, and all eight of these receptors might bind sugars. Analysis of intron evolution suggests that the common ancestor consisted of a long N-terminal exon encoding transmembrane domains 1-5 followed by three exons encoding transmembrane domains 6-7. As many as 57 additional introns have been acquired idiosyncratically during the evolution of the superfamily, whereas the ancestral introns and some of the older idiosyncratic introns have been lost at least 48 times independently. Altogether, these patterns of molecular evolution suggest that this is an ancient superfamily of chemoreceptors, probably dating back at least to the origin of the arthropods. PMID:14608037

  9. Characterization of trh2 Harbouring Vibrio parahaemolyticus Strains Isolated in Germany

    PubMed Central

    Bechlars, Silke; Jäckel, Claudia; Diescher, Susanne; Wüstenhagen, Doreen A.; Kubick, Stefan; Dieckmann, Ralf; Strauch, Eckhard

    2015-01-01

    Background Vibrio parahaemolyticus is a recognized human enteropathogen. Thermostable direct hemolysin (TDH) and TDH-related hemolysin (TRH) as well as the type III secretion system 2 (T3SS2) are considered as major virulence factors. As tdh positive strains are not detected in coastal waters of Germany, we focused on the characterization of trh positive strains, which were isolated from mussels, seawater and patients in Germany. Results Ten trh harbouring V. parahaemolyticus strains from Germany were compared to twenty-one trh positive strains from other countries. The complete trh sequences revealed clustering into three different types: trh1 and trh2 genes and a pseudogene Ψtrh. All German isolates possessed alleles of the trh2 gene. MLST analysis indicated a close relationship to Norwegian isolates suggesting that these strains belong to the autochthonous microflora of Northern Europe seawaters. Strains carrying the pseudogene Ψtrh were negative for T3SS2β effector vopC. Transcription of trh and vopC genes was analyzed under different growth conditions. Trh2 gene expression was not altered by bile while trh1 genes were inducible. VopC could be induced by urea in trh2 bearing strains. Most trh1 carrying strains were hemolytic against sheep erythrocytes while all trh2 positive strains did not show any hemolytic activity. TRH variants were synthesized in a prokaryotic cell-free system and their hemolytic activity was analyzed. TRH1 was active against sheep erythrocytes while TRH2 variants were not active at all. Conclusion Our study reveals a high diversity among trh positive V. parahaemolyticus strains. The function of TRH2 hemolysins and the role of the pseudogene Ψtrh as pathogenicity factors are questionable. To assess the pathogenic potential of V. parahaemolyticus strains a differentiation of trh variants and the detection of T3SS2β components like vopC would improve the V. parahaemolyticus diagnostics and could lead to a refinement of the risk assessment in food analyses and clinical diagnostics. PMID:25799574

  10. Quantitation of heteroplasmy of mtDNA sequence variants identified in a population of AD patients and controls by array-based resequencing.

    PubMed

    Coon, Keith D; Valla, Jon; Szelinger, Szabolics; Schneider, Lonnie E; Niedzielko, Tracy L; Brown, Kevin M; Pearson, John V; Halperin, Rebecca; Dunckley, Travis; Papassotiropoulos, Andreas; Caselli, Richard J; Reiman, Eric M; Stephan, Dietrich A

    2006-08-01

    The role of mitochondrial dysfunction in the pathogenesis of Alzheimer's disease (AD) has been well documented. Though evidence for the role of mitochondria in AD seems incontrovertible, the impact of mitochondrial DNA (mtDNA) mutations in AD etiology remains controversial. Though mutations in mitochondrially encoded genes have repeatedly been implicated in the pathogenesis of AD, many of these studies have been plagued by lack of replication as well as potential contamination of nuclear-encoded mitochondrial pseudogenes. To assess the role of mtDNA mutations in the pathogenesis of AD, while avoiding the pitfalls of nuclear-encoded mitochondrial pseudogenes encountered in previous investigations and showcasing the benefits of a novel resequencing technology, we sequenced the entire coding region (15,452 bp) of mtDNA from 19 extremely well-characterized AD patients and 18 age-matched, unaffected controls utilizing a new, reliable, high-throughput array-based resequencing technique, the Human MitoChip. High-throughput, array-based DNA resequencing of the entire mtDNA coding region from platelets of 37 subjects revealed the presence of 208 loci displaying a total of 917 sequence variants. There were no statistically significant differences in overall mutational burden between cases and controls, however, 265 independent sites of statistically significant change between cases and controls were identified. Changed sites were found in genes associated with complexes I (30.2%), III (3.0%), IV (33.2%), and V (9.1%) as well as tRNA (10.6%) and rRNA (14.0%). Despite their statistical significance, the subtle nature of the observed changes makes it difficult to determine whether they represent true functional variants involved in AD etiology or merely naturally occurring dissimilarity. 
Regardless, this study demonstrates the tremendous value of this novel mtDNA resequencing platform, which avoids the pitfalls of erroneously amplifying nuclear-encoded mtDNA pseudogenes, and our proposed analysis paradigm, which utilizes the availability of raw signal intensity values for each of the four potential alleles to facilitate quantitative estimates of mtDNA heteroplasmy. This information provides a potential new target for burgeoning diagnostics and therapeutics that could truly assist those suffering from this devastating disorder.
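    The use of per-allele signal intensities to quantify heteroplasmy can be sketched as follows. This is only an illustration under a simple assumption, namely that each allele's share of the total signal at a site approximates its share of mtDNA copies; the study's actual analysis paradigm may differ, and the function name and intensity values are hypothetical.

```python
from typing import Dict, Tuple


def allele_fractions(intensities: Dict[str, float]) -> Dict[str, float]:
    """Convert raw A/C/G/T signal intensities at one site into fractions."""
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("total signal must be positive")
    return {base: value / total for base, value in intensities.items()}


def heteroplasmy(intensities: Dict[str, float]) -> Tuple[str, float]:
    """Return the major allele and the fraction of signal from all other alleles."""
    fractions = allele_fractions(intensities)
    major = max(fractions, key=fractions.get)
    # Everything that is not the major allele counts toward heteroplasmy.
    return major, 1.0 - fractions[major]


if __name__ == "__main__":
    # Hypothetical site where 10% of the signal comes from a second allele.
    site = {"A": 900.0, "C": 100.0, "G": 0.0, "T": 0.0}
    major, level = heteroplasmy(site)
    print(major, round(level, 3))
```

    Real array data would also need background subtraction and a noise threshold before small fractions are read as heteroplasmy; both are omitted here.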

  11. Intraspecific variation in mitochondrial genome sequence, structure, and gene content in Silene vulgaris, an angiosperm with pervasive cytoplasmic male sterility.

    PubMed

    Sloan, Daniel B; Müller, Karel; McCauley, David E; Taylor, Douglas R; Storchová, Helena

    2012-12-01

    In angiosperms, mitochondrial-encoded genes can cause cytoplasmic male sterility (CMS), resulting in the coexistence of female and hermaphroditic individuals (gynodioecy). We compared four complete mitochondrial genomes from the gynodioecious species Silene vulgaris and found unprecedented amounts of intraspecific diversity for plant mitochondrial DNA (mtDNA). Remarkably, only about half of overall sequence content is shared between any pair of genomes. The four mtDNAs range in size from 361 to 429 kb and differ in gene complement, with rpl5 and rps13 being intact in some genomes but absent or pseudogenized in others. The genomes exhibit essentially no conservation of synteny and are highly repetitive, with evidence of reciprocal recombination occurring even across short repeats (< 250 bp). Some mitochondrial genes exhibit atypically high degrees of nucleotide polymorphism, while others are invariant. The genomes also contain a variable number of small autonomously mapping chromosomes, which have only recently been identified in angiosperm mtDNA. Southern blot analysis of one of these chromosomes indicated a complex in vivo structure consisting of both monomeric circles and multimeric forms. We conclude that S. vulgaris harbors an unusually large degree of variation in mtDNA sequence and structure and discuss the extent to which this variation might be related to CMS. © 2012 The Authors. New Phytologist © 2012 New Phytologist Trust.

  12. Organization of the SUC gene family in Saccharomyces.

    PubMed Central

    Carlson, M; Botstein, D

    1983-01-01

    The SUC gene family of yeast (Saccharomyces) includes six structural genes for invertase (SUC1 through SUC5 and SUC7) found at unlinked chromosomal loci. A given yeast strain does not usually carry SUC+ alleles at all six loci; the natural negative alleles are called suc0 alleles. Cloned SUC2 DNA probes were used to investigate the physical structure of the SUC gene family in laboratory strains, commercial wine strains, and different Saccharomyces species. The active SUC+ genes are homologous. The suc0 allele at the SUC2 locus (suc2(0)) in some strains is a silent gene or pseudogene. Other SUC loci carrying suc0 alleles appear to lack SUC DNA sequences. These findings imply that SUC genes have transposed to different chromosomal locations in closely related Saccharomyces strains. PMID:6843548

  13. Long non-coding RNA phosphatase and tensin homolog pseudogene 1 suppresses osteosarcoma cell growth via the phosphoinositide 3-kinase/protein kinase B signaling pathway.

    PubMed

    Yan, Bin; Wubuli, Aikepaer; Liu, Yidong; Wang, Xin

    2018-06-01

    Osteosarcoma is a common type of human carcinoma, which exhibits a high metastasis and recurrence rate. Previous studies have indicated that long non-coding RNA phosphatase and tensin homolog pseudogene 1 (lnPTENP1) has tumor suppressive action by modulating PTEN expression in different types of tumor cells. However, the potential mechanism by which lnPTENP1 has an effect in osteosarcoma cells remains elusive. In the present study, the role of lnPTENP1 in osteosarcoma cells was investigated and the possible mechanisms by which it functions were explored. It was revealed that lnPTENP1 transfection significantly inhibited osteosarcoma cell growth, proliferation, migration and invasion. LnPTENP1 transfection also significantly promoted apoptosis in Mg63 cells treated with tunicamycin. Further analysis revealed that lnPTENP1 transfection regulated osteosarcoma cell growth via the PI3K/AKT signaling pathway. In vivo assays revealed that lnPTENP1 transfection significantly inhibited osteosarcoma tumor growth and significantly increased the protein expression and phosphorylation levels of PI3K and AKT. In conclusion, the results of the present study indicated that lnPTENP1 may inhibit osteosarcoma cell growth via the PI3K/AKT signaling pathway, which may be a potential novel target for human osteosarcoma therapy.

  14. The 5S rDNA in two Abracris grasshoppers (Ommatolampidinae: Acrididae): molecular and chromosomal organization.

    PubMed

    Bueno, Danilo; Palacios-Gimenez, Octavio Manuel; Martí, Dardo Andrea; Mariguela, Tatiane Casagrande; Cabral-de-Mello, Diogo Cavalcanti

    2016-08-01

    The 5S ribosomal DNA (rDNA) sequences are subject to dynamic evolution at chromosomal and molecular levels, evolving through concerted and/or birth-and-death fashion. Among grasshoppers, the chromosomal location for this sequence was established for some species, but little molecular information was obtained to infer evolutionary patterns. Here, we integrated data from chromosomal and nucleotide sequence analysis for 5S rDNA in two Abracris species aiming to identify evolutionary dynamics. For both species, two arrays were identified, a larger sequence (named type-I) that consisted of the entire 5S rDNA gene plus NTS (non-transcribed spacer) and a smaller (named type-II) with truncated 5S rDNA gene plus short NTS that was considered a pseudogene. For type-I sequences, the gene corresponding region contained the internal control region and poly-T motif and the NTS presented partial transposable elements. Between the species, nucleotide differences for type-I were noticed, while type-II was identical, suggesting pseudogenization in a common ancestor. From a chromosomal point of view, type-II was placed in one bivalent, while type-I occurred in multiple copies in distinct chromosomes. In Abracris, the evolution of 5S rDNA was apparently influenced by the chromosomal distribution of clusters (single or multiple location), resulting in a mixed mechanism integrating concerted and birth-and-death evolution depending on the unit.

  15. An improved strategy and a useful housekeeping gene for RNA analysis from formalin-fixed, paraffin-embedded tissues by PCR.

    PubMed

    Finke, J; Fritzen, R; Ternes, P; Lange, W; Dölken, G

    1993-03-01

    Specific amplification of nucleic acid sequences by PCR has been extensively used for the detection of gene rearrangements and gene expression. Although successful amplification of DNA sequences has been carried out with DNA prepared from formalin-fixed, paraffin-embedded (FFPE) tissues, there are only a few reports regarding RNA analysis in this kind of material. We describe a procedure for RNA extraction from different types of FFPE tissues, involving digestion with proteinase K followed by guanidinium-thiocyanate acid phenol extraction and DNase I digestion. These RNA preparations are suitable for PCR analysis of mRNA and even of intronless genes. Furthermore, the universally expressed porphobilinogen deaminase mRNA proved to be useful as a positive control because of the lack of pseudogenes.

  16. Structure-based analysis of five novel disease-causing mutations in 21-hydroxylase-deficient patients.

    PubMed

    Minutolo, Carolina; Nadra, Alejandro D; Fernández, Cecilia; Taboas, Melisa; Buzzalino, Noemí; Casali, Bárbara; Belli, Susana; Charreau, Eduardo H; Alba, Liliana; Dain, Liliana

    2011-01-11

    Congenital adrenal hyperplasia (CAH) due to 21-hydroxylase deficiency is the most frequent inborn error of metabolism, and accounts for 90-95% of CAH cases. The affected enzyme, P450C21, is encoded by the CYP21A2 gene, located together with a 98% nucleotide sequence identity CYP21A1P pseudogene, on chromosome 6p21.3. Even though most patients carry CYP21A1P-derived mutations, an increasing number of novel and rare mutations in disease causing alleles were found in the last years. In the present work, we describe five CYP21A2 novel mutations, p.R132C, p.149C, p.M283V, p.E431K and a frameshift g.2511_2512delGG, in four non-classical and one salt wasting patients from Argentina. All novel point mutations are located in CYP21 protein residues that are conserved throughout mammalian species, and none of them were found in control individuals. The putative pathogenic mechanisms of the novel variants were analyzed in silico. A three-dimensional CYP21 structure was generated by homology modeling and the protein design algorithm FoldX was used to calculate changes in stability of CYP21A2 protein. Our analysis revealed changes in protein stability or in the surface charge of the mutant enzymes, which could be related to the clinical manifestation found in patients.

  17. Structure-Based Analysis of Five Novel Disease-Causing Mutations in 21-Hydroxylase-Deficient Patients

    PubMed Central

    Fernández, Cecilia; Taboas, Melisa; Buzzalino, Noemí; Casali, Bárbara; Belli, Susana; Charreau, Eduardo H.; Alba, Liliana; Dain, Liliana

    2011-01-01

    Congenital adrenal hyperplasia (CAH) due to 21-hydroxylase deficiency is the most frequent inborn error of metabolism, and accounts for 90–95% of CAH cases. The affected enzyme, P450C21, is encoded by the CYP21A2 gene, located together with a 98% nucleotide sequence identity CYP21A1P pseudogene, on chromosome 6p21.3. Even though most patients carry CYP21A1P-derived mutations, an increasing number of novel and rare mutations in disease causing alleles were found in the last years. In the present work, we describe five CYP21A2 novel mutations, p.R132C, p.149C, p.M283V, p.E431K and a frameshift g.2511_2512delGG, in four non-classical and one salt wasting patients from Argentina. All novel point mutations are located in CYP21 protein residues that are conserved throughout mammalian species, and none of them were found in control individuals. The putative pathogenic mechanisms of the novel variants were analyzed in silico. A three-dimensional CYP21 structure was generated by homology modeling and the protein design algorithm FoldX was used to calculate changes in stability of CYP21A2 protein. Our analysis revealed changes in protein stability or in the surface charge of the mutant enzymes, which could be related to the clinical manifestation found in patients. PMID:21264314

  18. The complete mitochondrial genome of Hydra vulgaris (Hydroida: Hydridae).

    PubMed

    Pan, Hong-Chun; Fang, Hong-Yan; Li, Shi-Wei; Liu, Jun-Hong; Wang, Ying; Wang, An-Tai

    2014-12-01

    The complete mitochondrial genome of Hydra vulgaris (Hydroida: Hydridae) is composed of two linear DNA molecules. The mitochondrial DNA (mtDNA) molecule 1 is 8010 bp long and contains six protein-coding genes, large subunit rRNA, methionine and tryptophan tRNAs, two pseudogenes consisting respectively of a partial copy of COI, and terminal sequences at two ends of the linear mtDNA, while the mtDNA molecule 2 is 7576 bp long and contains seven protein-coding genes, small subunit rRNA, methionine tRNA, a pseudogene consisting of a partial copy of COI and terminal sequences at two ends of the linear mtDNA. COI gene begins with GTG as start codon, whereas other 12 protein-coding genes start with a typical ATG initiation codon. In addition, all protein-coding genes are terminated with TAA as stop codon.

  19. Growth hormone deficiency with advanced bone age: phenotypic interaction between GHRH receptor and CYP21A2 mutations diagnosed by sanger and whole exome sequencing.

    PubMed

    Correa, Fernanda A; França, Marcela M; Fang, Qing; Ma, Qianyi; Bachega, Tania A; Rodrigues, Andresa; Ozel, Bilge A; Li, Jun Z; Mendonca, Berenice B; Jorge, Alexander A L; Carvalho, Luciani R; Camper, Sally A; Arnhold, Ivo J P

    2017-12-01

    Isolated growth hormone deficiency (IGHD) is the most common pituitary hormone deficiency and, clinically, patients have delayed bone age. High sequence similarity between CYP21A2 gene and CYP21A1P pseudogene poses difficulties for exome sequencing interpretation. A 7.5 year-old boy born to second-degree cousins presented with severe short stature (height SDS -3.7) and bone age of 6 years. Clonidine and combined pituitary stimulation tests revealed GH deficiency. Pituitary MRI was normal. The patient was successfully treated with rGH. Surprisingly, at 10.8 years, his bone age had advanced to 13 years, but physical exam, LH and testosterone levels remained prepubertal. An ACTH stimulation test disclosed a non-classic congenital adrenal hyperplasia due to 21-hydroxylase deficiency explaining the bone age advancement and, therefore, treatment with cortisone acetate was added. The genetic diagnosis of a homozygous mutation in GHRHR (p.Leu144His), a homozygous CYP21A2 mutation (p.Val282Leu) and CYP21A1P pseudogene duplication was established by Sanger sequencing, MLPA and whole-exome sequencing. We report the unusual clinical presentation of a patient born to consanguineous parents with two recessive endocrine diseases: non-classic congenital adrenal hyperplasia modifying the classical GH deficiency phenotype. We used a method of paired read mapping aided by neighbouring mis-matches to overcome the challenges of exome-sequencing in the presence of a pseudogene.

  20. Loss or major reduction of umami taste sensation in pinnipeds

    NASA Astrophysics Data System (ADS)

    Sato, Jun J.; Wolsan, Mieczyslaw

    2012-08-01

    Umami is one of the basic tastes that humans and other vertebrates can perceive. This taste is elicited by L-amino acids and thus has a special role of detecting nutritious, protein-rich food. The T1R1 + T1R3 heterodimer acts as the principal umami receptor. The T1R1 protein is encoded by the Tas1r1 gene. We report multiple inactivating (pseudogenizing) mutations in exon 3 of this gene from four phocid and two otariid species (Pinnipedia). Jiang et al. (Proc Natl Acad Sci U S A 109:4956-4961, 2012) reported two inactivating mutations in exons 2 and 6 of this gene from another otariid species. These findings suggest lost or greatly reduced umami sensory capabilities in these species. The widespread occurrence of a nonfunctional Tas1r1 pseudogene in this clade of strictly carnivorous mammals is surprising. We hypothesize that factors underlying the pseudogenization of Tas1r1 in pinnipeds may be driven by the marine environment to which these carnivorans (Carnivora) have adapted and may include: the evolutionary change in diet from tetrapod prey to fish and cephalopods (because cephalopods and living fish contain little or no synergistic inosine 5'-monophosphate that greatly enhances umami taste), the feeding behavior of swallowing food whole without mastication (because the T1R1 + T1R3 receptor is distributed on the tongue and palate), and the saltiness of sea water (because a high concentration of sodium chloride masks umami taste).

  1. Evolutionary history and metabolic insights of ancient mammalian uricases

    PubMed Central

    Kratzer, James T.; Lanaspa, Miguel A.; Murphy, Michael N.; Cicerchi, Christina; Graves, Christina L.; Tipton, Peter A.; Ortlund, Eric A.; Johnson, Richard J.; Gaucher, Eric A.

    2014-01-01

    Uricase is an enzyme involved in purine catabolism and is found in all three domains of life. Curiously, uricase is not functional in some organisms despite its role in converting highly insoluble uric acid into 5-hydroxyisourate. Of particular interest is the observation that apes, including humans, cannot oxidize uric acid, and it appears that multiple, independent evolutionary events led to the silencing or pseudogenization of the uricase gene in ancestral apes. Various arguments have been made to suggest why natural selection would allow the accumulation of uric acid despite the physiological consequences of crystallized monosodium urate acutely causing liver/kidney damage or chronically causing gout. We have applied evolutionary models to understand the history of primate uricases by resurrecting ancestral mammalian intermediates before the pseudogenization events of this gene family. Resurrected proteins reveal that ancestral uricases have steadily decreased in activity since the last common ancestor of mammals gave rise to descendent primate lineages. We were also able to determine the 3D distribution of amino acid replacements as they accumulated during evolutionary history by crystallizing a mammalian uricase protein. Further, ancient and modern uricases were stably transfected into HepG2 liver cells to test one hypothesis that uricase pseudogenization allowed ancient frugivorous apes to rapidly convert fructose into fat. Finally, pharmacokinetics of an ancient uricase injected in rodents suggest that our integrated approach provides the foundation for an evolutionarily-engineered enzyme capable of treating gout and preventing tumor lysis syndrome in human patients. PMID:24550457

  2. An in-depth comparison of the porcine, murine and human inflammasomes; lessons from the porcine genome and transcriptome.

    PubMed

    Dawson, Harry D; Smith, Allen D; Chen, Celine; Urban, Joseph F

    2017-04-01

    Emerging evidence suggests that swine are a scientifically acceptable intermediate species between rodents and humans to model immune function relevant to humans. The swine genome has recently been sequenced and several preliminary structural and functional analyses of the porcine immunome have been published. Herein we provide an expanded in silico analysis using an improved assembly of the porcine transcriptome that provides an in depth analysis of genes that are related to inflammasomes, responses to Toll-like receptor ligands, and M1 macrophage polarization and Escherichia coli as a model organism. Comparisons of the expansion or contraction of orthologous gene families indicated more similar rates and classes of genes in humans and pigs than in mice; however several novel porcine or artiodactyl-specific paralogs or pseudogenes were identified. Conservation of homology and structural motifs of orthologs revealed that the overall similarity to human proteins was significantly higher for pigs compared to mouse. Despite these similarities, two out of four canonical inflammasome pathways, Absent in melanoma 2 (AIM2) and NLR family and CARD domain containing 4 (NLRC4), were found to be missing in pigs. Pig M1 Mφ polarization in response to interferon-γ (IFN-γ) and lipopolysaccharide (LPS) was assessed, via the transcriptome, using next generation sequencing. Our analysis revealed predominantly human-like responses; however, some mouse-like responses were observed, as well as induction of numerous pig or artiodactyl-specific genes. This work supports using swine to model both human immunological and inflammatory responses to infection. However, caution must be exercised as pigs differ from humans in several fundamental pathways.

  3. Unit-length line-1 transcripts in human teratocarcinoma cells.

    PubMed Central

    Skowronski, J; Fanning, T G; Singer, M F

    1988-01-01

    We have characterized the approximately 6.5-kilobase cytoplasmic poly(A)+ Line-1 (L1) RNA present in a human teratocarcinoma cell line, NTera2D1, by primer extension and by analysis of cloned cDNAs. The bulk of the RNA begins (5' end) at the residue previously identified as the 5' terminus of the longest known primate genomic L1 elements, presumed to represent "unit" length. Several of the cDNA clones are close to 6 kilobase pairs, that is, close to full length. The partial sequences of 18 cDNA clones and full sequence of one (5,975 base pairs) indicate that many different genomic L1 elements contribute transcripts to the 6.5-kilobase cytoplasmic poly(A)+ RNA in NTera2D1 cells because no 2 of the 19 cDNAs analyzed had identical sequences. The transcribed elements appear to represent a subset of the total genomic L1s, a subset that has a characteristic consensus sequence in the 3' noncoding region and a high degree of sequence conservation throughout. Two open reading frames (ORFs) of 1,122 (ORF1) and 3,852 (ORF2) bases, flanked by about 800 and 200 bases of sequence at the 5' and 3' ends, respectively, can be identified in the cDNAs. Both ORFs are in the same frame, and they are separated by 33 bases bracketed by two conserved in-frame stop codons. ORF 2 is interrupted by at least one randomly positioned stop codon in the majority of the cDNAs. The data support proposals suggesting that the human L1 family includes one or more functional genes as well as an extraordinarily large number of pseudogenes whose ORFs are broken by stop codons. The cDNA structures suggest that both genes and pseudogenes are transcribed. At least one of the cDNAs (cD11), which was sequenced in its entirety, could, in principle, represent an mRNA for production of the ORF1 polypeptide. The similarity of mammalian L1s to several recently described invertebrate movable elements defines a new widely distributed class of elements which we term class II retrotransposons. PMID:2454389

  4. Complete chloroplast genome sequence of common bermudagrass (Cynodon dactylon (L.) Pers.) and comparative analysis within the family Poaceae

    PubMed Central

    Huang, Ya-Yi; Cho, Shu-Ting; Haryono, Mindia; Kuo, Chih-Horng

    2017-01-01

    Common bermudagrass (Cynodon dactylon (L.) Pers.) belongs to the subfamily Chloridoideae of the Poaceae family, one of the most important plant families ecologically and economically. This grass has a long connection with human culture but its systematics is relatively understudied. In this study, we sequenced and investigated the chloroplast genome of common bermudagrass, which is 134,297 bp in length with two single copy regions (LSC: 79,732 bp; SSC: 12,521 bp) and a pair of inverted repeat (IR) regions (21,022 bp). The annotation contains a total of 128 predicted genes, including 82 protein-coding, 38 tRNA, and 8 rRNA genes. Additionally, our in silico analyses identified 10 sets of repeats longer than 20 bp and predicted the presence of 36 RNA editing sites. Overall, the chloroplast genome of common bermudagrass resembles those from other Poaceae lineages. Compared to most angiosperms, the accD gene and the introns of both clpP and rpoC1 genes are missing. Additionally, the ycf1, ycf2, ycf15, and ycf68 genes are pseudogenized and two genome rearrangements exist. Our phylogenetic analysis based on 47 chloroplast protein-coding genes supported the placement of common bermudagrass within Chloridoideae. Our phylogenetic character mapping based on the parsimony principle further indicated that the loss of the accD gene and clpP introns, the pseudogenization of four ycf genes, and the two rearrangements occurred only once after the most recent common ancestor of the Poaceae diverged from other monocots, which could explain the unusual long branch leading to the Poaceae when phylogeny is inferred based on chloroplast sequences. PMID:28617867

  5. Complete chloroplast genome sequence of common bermudagrass (Cynodon dactylon (L.) Pers.) and comparative analysis within the family Poaceae.

    PubMed

    Huang, Ya-Yi; Cho, Shu-Ting; Haryono, Mindia; Kuo, Chih-Horng

    2017-01-01

    Common bermudagrass (Cynodon dactylon (L.) Pers.) belongs to the subfamily Chloridoideae of the Poaceae family, one of the most important plant families ecologically and economically. This grass has a long connection with human culture but its systematics is relatively understudied. In this study, we sequenced and investigated the chloroplast genome of common bermudagrass, which is 134,297 bp in length with two single copy regions (LSC: 79,732 bp; SSC: 12,521 bp) and a pair of inverted repeat (IR) regions (21,022 bp). The annotation contains a total of 128 predicted genes, including 82 protein-coding, 38 tRNA, and 8 rRNA genes. Additionally, our in silico analyses identified 10 sets of repeats longer than 20 bp and predicted the presence of 36 RNA editing sites. Overall, the chloroplast genome of common bermudagrass resembles those from other Poaceae lineages. Compared to most angiosperms, the accD gene and the introns of both clpP and rpoC1 genes are missing. Additionally, the ycf1, ycf2, ycf15, and ycf68 genes are pseudogenized and two genome rearrangements exist. Our phylogenetic analysis based on 47 chloroplast protein-coding genes supported the placement of common bermudagrass within Chloridoideae. Our phylogenetic character mapping based on the parsimony principle further indicated that the loss of the accD gene and clpP introns, the pseudogenization of four ycf genes, and the two rearrangements occurred only once after the most recent common ancestor of the Poaceae diverged from other monocots, which could explain the unusual long branch leading to the Poaceae when phylogeny is inferred based on chloroplast sequences.

  6. Phylogenetic appearance of Neuropeptide S precursor proteins in tetrapods

    PubMed Central

    Reinscheid, Rainer K.

    2007-01-01

    Sleep and emotional behavior are two hallmarks of vertebrate animal behavior, implying that specialized neuronal circuits and dedicated neurochemical messengers may have been developed during evolution to regulate such complex behaviors. Neuropeptide S (NPS) is a newly identified peptide transmitter that activates a typical G protein-coupled receptor. Central administration of NPS produces profound arousal, enhances wakefulness and suppresses all stages of sleep. In addition, NPS can alleviate behavioral responses to stress by producing anxiolytic-like effects. A bioinformatic analysis of current genome databases revealed that the NPS peptide precursor gene is present in all vertebrates with the exception of fish. A high level of sequence conservation, especially of aminoterminal structures was detected, indicating stringent requirements for agonist-induced receptor activation. Duplication of the NPS precursor gene was only found in one out of two marsupial species with sufficient genome coverage (Monodelphis domestica; opossum), indicating that the duplicated opossum NPS sequence might have arisen as an isolated event. Pharmacological analysis of both Monodelphis NPS peptides revealed that only the closely related NPS peptide retained agonistic activity at NPS receptors. The duplicated precursor might be either a pseudogene or could have evolved different receptor selectivity. Together, these data show that NPS is a relatively recent gene in vertebrate evolution whose appearance might coincide with its specialized physiological functions in terrestrial vertebrates. PMID:17293003

  7. Coincidence of synteny breakpoints with malignancy-related deletions on human chromosome 3

    PubMed Central

    Kost-Alimova, Maria; Kiss, Hajnalka; Fedorova, Ludmila; Yang, Ying; Dumanski, Jan P.; Klein, George; Imreh, Stefan

    2003-01-01

    We have found previously that during tumor growth intact human chromosome 3 transferred into tumor cells regularly loses certain 3p regions, among them the ≈1.4-Mb common eliminated region 1 (CER1) at 3p21.3. Fluorescence in situ hybridization analysis of 12 mouse orthologous loci revealed that CER1 splits into two segments in mouse and therefore contains a murine/human conservation breakpoint region (CBR). Several breaks occurred in tumors within the region surrounding the CBR, and this sequence has features that characterize unstable chromosomal regions: deletions in yeast artificial chromosome clones, late replication, gene and segment duplications, and pseudogene insertions. Sequence analysis of the entire 3p12-22 revealed that other cancer-associated deletions (regions eliminated from monochromosomal hybrids carrying an intact chromosome 3 during tumor growth and homozygous deletions found in human tumors) colocalized nonrandomly with murine/human CBRs and were characterized by an increased number of local gene duplications and murine/human conservation mismatches (single genes that do not match into the conserved chromosomal segment). The CBR within CER1 contains a simple tandem TATAGA repeat capable of forming a 40-bp-long secondary hairpin-like structure. This repeat is nonrandomly localized within the other tumor-associated deletions and in the vicinity of 3p12-22 CBRs. PMID:12738884

  8. Gene flow contributes to diversification of the major fungal pathogen Candida albicans.

    PubMed

    Ropars, Jeanne; Maufrais, Corinne; Diogo, Dorothée; Marcet-Houben, Marina; Perin, Aurélie; Sertour, Natacha; Mosca, Kevin; Permal, Emmanuelle; Laval, Guillaume; Bouchier, Christiane; Ma, Laurence; Schwartz, Katja; Voelz, Kerstin; May, Robin C; Poulain, Julie; Battail, Christophe; Wincker, Patrick; Borman, Andrew M; Chowdhary, Anuradha; Fan, Shangrong; Kim, Soo Hyun; Le Pape, Patrice; Romeo, Orazio; Shin, Jong Hee; Gabaldon, Toni; Sherlock, Gavin; Bougnoux, Marie-Elisabeth; d'Enfert, Christophe

    2018-06-08

    Elucidating population structure and levels of genetic diversity and recombination is necessary to understand the evolution and adaptation of species. Candida albicans is the second most frequent agent of human fungal infections worldwide, causing high mortality rates. Here we present the genomic sequences of 182 C. albicans isolates collected worldwide, including commensal isolates, as well as ones responsible for superficial and invasive infections, constituting the largest dataset to date for this major fungal pathogen. Although C. albicans shows a predominantly clonal population structure, we find evidence of gene flow between previously known and newly identified genetic clusters, supporting the occurrence of (para)sexuality in nature. A highly clonal lineage, which experimentally shows reduced fitness, has undergone pseudogenization in genes required for virulence and morphogenesis, which may explain its niche restriction. Candida albicans thus takes advantage of both clonality and gene flow to diversify.

  9. Complete Chloroplast Genome Sequence of Holoparasite Cistanche deserticola (Orobanchaceae) Reveals Gene Loss and Horizontal Gene Transfer from Its Host Haloxylon ammodendron (Chenopodiaceae)

    PubMed Central

    Qiao, Qin; Ren, Zhumei; Zhao, Jiayuan; Yonezawa, Takahiro; Hasegawa, Masami; Crabbe, M. James C; Li, Jianqiang; Zhong, Yang

    2013-01-01

    Background: The central function of chloroplasts is to carry out photosynthesis, and its gene content and structure are highly conserved across land plants. Parasitic plants, which have reduced photosynthetic ability, suffer gene losses from the chloroplast (cp) genome accompanied by the relaxation of selective constraints. Compared with the rapid rise in the number of cp genome sequences of photosynthetic organisms, there are limited data sets from parasitic plants. Principal Findings/Significance: Here we report the complete sequence of the cp genome of Cistanche deserticola, a holoparasitic desert species belonging to the family Orobanchaceae. The cp genome of C. deserticola is greatly reduced both in size (102,657 bp) and in gene content, indicating that all genes required for photosynthesis suffer from gene loss and pseudogenization, except for psbM. The striking difference from other holoparasitic plants is that it retains almost a full set of tRNA genes, and it has lower dN/dS for most genes than another close holoparasitic plant, E. virginiana, suggesting that Cistanche deserticola has undergone fewer losses, either due to a reduced level of holoparasitism, or to a recent switch to this life history. We also found that the rpoC2 gene was present in two copies within C. deserticola. Its own copy has much shortened and turned out to be a pseudogene. Another copy, which was not located in its cp genome, was a homolog of the host plant, Haloxylon ammodendron (Chenopodiaceae), suggesting that it was acquired from its host via a horizontal gene transfer. PMID:23554920

  10. Complete chloroplast genome sequence of holoparasite Cistanche deserticola (Orobanchaceae) reveals gene loss and horizontal gene transfer from its host Haloxylon ammodendron (Chenopodiaceae).

    PubMed

    Li, Xi; Zhang, Ti-Cao; Qiao, Qin; Ren, Zhumei; Zhao, Jiayuan; Yonezawa, Takahiro; Hasegawa, Masami; Crabbe, M James C; Li, Jianqiang; Zhong, Yang

    2013-01-01

    The central function of chloroplasts is to carry out photosynthesis, and its gene content and structure are highly conserved across land plants. Parasitic plants, which have reduced photosynthetic ability, suffer gene losses from the chloroplast (cp) genome accompanied by the relaxation of selective constraints. Compared with the rapid rise in the number of cp genome sequences of photosynthetic organisms, there are limited data sets from parasitic plants. PRINCIPAL FINDINGS/SIGNIFICANCE: Here we report the complete sequence of the cp genome of Cistanche deserticola, a holoparasitic desert species belonging to the family Orobanchaceae. The cp genome of C. deserticola is greatly reduced both in size (102,657 bp) and in gene content, indicating that all genes required for photosynthesis suffer from gene loss and pseudogenization, except for psbM. The striking difference from other holoparasitic plants is that it retains almost a full set of tRNA genes, and it has lower dN/dS for most genes than another close holoparasitic plant, E. virginiana, suggesting that Cistanche deserticola has undergone fewer losses, either due to a reduced level of holoparasitism, or to a recent switch to this life history. We also found that the rpoC2 gene was present in two copies within C. deserticola. Its own copy has much shortened and turned out to be a pseudogene. Another copy, which was not located in its cp genome, was a homolog of the host plant, Haloxylon ammodendron (Chenopodiaceae), suggesting that it was acquired from its host via a horizontal gene transfer.

  11. Mammalian keratin associated proteins (KRTAPs) subgenomes: disentangling hair diversity and adaptation to terrestrial and aquatic environments.

    PubMed

    Khan, Imran; Maldonado, Emanuel; Vasconcelos, Vítor; O'Brien, Stephen J; Johnson, Warren E; Antunes, Agostinho

    2014-09-10

    Adaptation of mammals to terrestrial life was facilitated by the unique vertebrate trait of body hair, which occurs in a range of morphological patterns. Keratin associated proteins (KRTAPs), the major structural hair shaft proteins, are largely responsible for hair variation. We exhaustively characterized the KRTAP gene family in 22 mammalian genomes, confirming the existence of 30 KRTAP subfamilies evolving at different rates with varying degrees of diversification and homogenization. Within the two major classes of KRTAPs, the high cysteine (HS) subfamily experienced strong concerted evolution, high rates of gene conversion/recombination and high GC content. In contrast, high glycine-tyrosine (HGT) KRTAPs showed evidence of positive selection and low rates of gene conversion/recombination. Species with more hair and of higher complexity tended to have more KRTAP genes (gene expansion). The sloth, with long and coarse hair, had the most KRTAP genes (175 with 141 being intact). By contrast, the "hairless" dolphin had 35 KRTAPs and the highest pseudogenization rate (74% relative to the 19% mammalian average). Unique hair-related phenotypes, such as scales (armadillo) and spines (hedgehog), were correlated with changes in KRTAPs. Gene expression variation probably also influences hair diversification patterns; for example, humans have the same KRTAP repertoire as apes, but much less hair. We hypothesize that differences in KRTAP gene repertoire and gene expression, together with distinct rates of gene conversion/recombination, pseudogenization and positive selection, are likely responsible for micro and macro-phenotypic hair diversification among mammals in response to adaptations to ecological pressures.

  12. Whole-genome analyses of speciation events in pathogenic Brucellae

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chain, Patrick S. G.; Comerci, Diego J.; Tolmasky, Marcelo E.

    Despite their high DNA identity and a proposal to group classical Brucella species as biovars of Brucella melitensis, the commonly recognized Brucella species can be distinguished by distinct biochemical and fatty acid characters, as well as by a marked host range (e.g., Brucella suis for swine, B. melitensis for sheep and goats, and Brucella abortus for cattle). Here we present the genome of B. abortus 2308, the virulent prototype biovar 1 strain, and its comparison to the two other human pathogenic Brucella species and to B. abortus field isolate 9-941. The global distribution of pseudogenes, deletions, and insertions supports previous indications that B. abortus and B. melitensis share a common ancestor that diverged from B. suis. With the exception of a dozen genes, the genetic complements of both B. abortus strains are identical, whereas the three species differ in gene content and pseudogenes. The pattern of species-specific gene inactivations affecting transcriptional regulators and outer membrane proteins suggests that these inactivations may play an important role in the establishment of host specificity and may have been a primary driver of speciation in the genus Brucella. Despite being nonmotile, the brucellae contain flagellum gene clusters and display species-specific flagellar gene inactivations, which lead to the putative generation of different versions of flagellum-derived structures and may contribute to differences in host specificity and virulence. Metabolic changes such as the lack of complete metabolic pathways for the synthesis of numerous compounds (e.g., glycogen, biotin, NAD, and choline) are consistent with adaptation of brucellae to an intracellular life-style.

  13. Whole-genome analyses of the speciation events in the pathogenic Brucellae

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chain, P; Comerci, D; Tolmasky, M

    Despite their high DNA identity and a proposal to group classical Brucella species as biovars of B. melitensis, the commonly recognized Brucella species can be distinguished by distinct biochemical and fatty acid characters as well as by a marked host range (e.g. B. suis for swine, B. melitensis for sheep and goats, B. abortus for cattle). Here we present the genome of B. abortus 2308, the virulent prototype biovar 1 strain, and its comparison to the two other human pathogenic Brucellae species and to the B. abortus field isolate 9-941. The global distribution of pseudogenes, deletions and insertions supports previous indications that B. abortus and B. melitensis share a common ancestor that diverged from B. suis. With the exception of a dozen genes, the genetic complement of both B. abortus strains is identical, whereas the three species differ in gene content and pseudogenes. The pattern of species-specific gene inactivations affecting transcriptional regulators and outer membrane proteins suggests that these inactivations may play an important role in the establishment of host-specificity and may have been a primary driver of speciation in the Brucellae. Despite being non-motile, the Brucellae contain flagellum gene clusters and display species-specific flagellar gene inactivations, which lead to the putative generation of different versions of flagellum-derived structures, and may contribute to differences in host-specificity and virulence. Metabolic changes such as the lack of complete metabolic pathways for the synthesis of numerous compounds (e.g. glycogen, biotin, NAD, and choline) are consistent with adaptation of Brucellae to an intracellular lifestyle.

  14. Insights into mechanisms of bacterial antigenic variation derived from the complete genome sequence of Anaplasma marginale.

    PubMed

    Palmer, Guy H; Futse, James E; Knowles, Donald P; Brayton, Kelly A

    2006-10-01

    Persistence of Anaplasma spp. in the animal reservoir host is required for efficient tick-borne transmission of these pathogens to animals and humans. Using A. marginale infection of its natural reservoir host as a model, persistent infection has been shown to reflect sequential cycles in which antigenic variants emerge, replicate, and are controlled by the immune system. Variation in the immunodominant outer-membrane protein MSP2 is generated by a process of gene conversion, in which unique hypervariable region sequences (HVRs) located in pseudogenes are recombined into a single operon-linked msp2 expression site. Although organisms expressing whole HVRs derived from pseudogenes emerge early in infection, long-term persistent infection is dependent on the generation of complex mosaics in which segments from different HVRs recombine into the expression site. The resulting combinatorial diversity generates the number of variants both predicted and shown to emerge during persistence.

  15. A pseudogene long noncoding RNA network regulates PTEN transcription and translation in human cells

    PubMed Central

    Johnsson, Per; Ackley, Amanda; Vidarsdottir, Linda; Lui, Weng-Onn; Corcoran, Martin; Grandér, Dan; Morris, Kevin V.

    2013-01-01

    PTEN is a tumor suppressor gene that has been shown to be under the regulatory control of a PTEN pseudogene expressed noncoding RNA, PTENpg1. Here, we characterize a previously unidentified PTENpg1 encoded antisense RNA (asRNA), which regulates PTEN transcription and PTEN mRNA stability. We find two PTENpg1 asRNA isoforms, alpha and beta. The alpha isoform functions in trans, localizes to the PTEN promoter, and epigenetically modulates PTEN transcription by the recruitment of DNMT3a and EZH2. In contrast, the beta isoform interacts with PTENpg1 through an RNA:RNA pairing interaction, which affects PTEN protein output via changes of PTENpg1 stability and microRNA sponge activity. Disruption of this asRNA-regulated network induces cell cycle arrest and sensitizes cells to doxorubicin, suggesting a biological function for the respective PTENpg1 expressed asRNAs. PMID:23435381

  16. A pseudogene long-noncoding-RNA network regulates PTEN transcription and translation in human cells.

    PubMed

    Johnsson, Per; Ackley, Amanda; Vidarsdottir, Linda; Lui, Weng-Onn; Corcoran, Martin; Grandér, Dan; Morris, Kevin V

    2013-04-01

    PTEN is a tumor-suppressor gene that has been shown to be under the regulatory control of a PTEN pseudogene expressed noncoding RNA, PTENpg1. Here, we characterize a previously unidentified PTENpg1-encoded antisense RNA (asRNA), which regulates PTEN transcription and PTEN mRNA stability. We find two PTENpg1 asRNA isoforms, α and β. The α isoform functions in trans, localizes to the PTEN promoter and epigenetically modulates PTEN transcription by the recruitment of DNA methyltransferase 3a and Enhancer of Zeste. In contrast, the β isoform interacts with PTENpg1 through an RNA-RNA pairing interaction, which affects PTEN protein output through changes of PTENpg1 stability and microRNA sponge activity. Disruption of this asRNA-regulated network induces cell-cycle arrest and sensitizes cells to doxorubicin, which suggests a biological function for the respective PTENpg1 expressed asRNAs.

  17. Transcriptomic signatures in cartilage ageing

    PubMed Central

    2013-01-01

    Introduction Age is an important factor in the development of osteoarthritis. Microarray studies provide insight into cartilage aging but do not reveal the full transcriptomic phenotype of chondrocytes such as small noncoding RNAs, pseudogenes, and microRNAs. RNA-Seq is a powerful technique for the interrogation of large numbers of transcripts including nonprotein coding RNAs. The aim of the study was to characterise molecular mechanisms associated with age-related changes in gene signatures. Methods RNA for gene expression analysis using RNA-Seq and real-time PCR analysis was isolated from macroscopically normal cartilage of the metacarpophalangeal joints of eight horses; four young donors (4 years old) and four old donors (>15 years old). RNA sequence libraries were prepared following ribosomal RNA depletion and sequencing was undertaken using the Illumina HiSeq 2000 platform. Differentially expressed genes were defined using Benjamini-Hochberg false discovery rate correction with a generalised linear model likelihood ratio test (P < 0.05, expression ratios ± 1.4 log2 fold-change). Ingenuity pathway analysis enabled networks, functional analyses and canonical pathways from differentially expressed genes to be determined. Results In total, the expression of 396 transcribed elements including mRNAs, small noncoding RNAs, pseudogenes, and a single microRNA was significantly different in old compared with young cartilage (± 1.4 log2 fold-change, P < 0.05). Of these, 93 were at higher levels in the older cartilage and 303 were at lower levels in the older cartilage. There was an over-representation of genes with reduced expression relating to extracellular matrix, degradative proteases, matrix synthetic enzymes, cytokines and growth factors in cartilage derived from older donors compared with young donors. In addition, there was a reduction in Wnt signalling in ageing cartilage. Conclusion There was an age-related dysregulation of matrix, anabolic and catabolic cartilage factors. This study has increased our knowledge of transcriptional networks in cartilage ageing by providing a global view of the transcriptome. PMID:23971731
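
    The filtering rule quoted in the Methods above (Benjamini-Hochberg correction, P < 0.05, at least 1.4 log2 fold-change in either direction) can be illustrated with a minimal sketch. It assumes raw p-values and log2 fold-changes are already available; the generalised linear model likelihood ratio test used by the study is not reproduced, and the function names and record layout are hypothetical.

```python
from typing import List, Sequence, Tuple


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_differential(
    records: Sequence[Tuple[str, float, float]],
    alpha: float = 0.05,
    min_abs_log2fc: float = 1.4,
) -> List[str]:
    """Keep names whose adjusted p-value and fold-change pass both cut-offs.

    Each record is (name, log2_fold_change, raw_p_value); this layout is an
    assumption made for the example, not the study's data format.
    """
    adjusted = benjamini_hochberg([p for _, _, p in records])
    return [
        name
        for (name, log2fc, _), q in zip(records, adjusted)
        if q < alpha and abs(log2fc) >= min_abs_log2fc
    ]
```

    Applying the fold-change cut-off to the absolute value keeps both up- and down-regulated elements, matching the "±" in the abstract.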

  18. A transcription map of the regions surrounding the CSF1R locus on human chromosome 5q31: Candidate genes for diastrophic dysplasia

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Clines, G.; Lovett, M.

    1994-09-01

    Diastrophic dysplasia (DTD) is an autosomal recessive disorder of unknown pathogenesis that is characterized by abnormal skeletal and cartilage growth. Phenotypic characteristics of the disorder include short stature, scoliosis, and deformation of the first metacarpal. The diastrophic dysplasia gene has been localized to chromosome 5q31-33, within ~60 kb of the colony stimulating factor 1 receptor gene (CSF1R). We have used direct cDNA selection to build a transcription map across ~250 kb surrounding and including the CSF1R locus. cDNA pools from human placenta, activated T cells, cerebellum, HeLa cells, fetal brain, chondrocytes, chondrosarcomas and osteosarcomas were multiplexed in these selections. After two rounds of selection, an analysis revealed that ~70% of the selected cDNAs were contained within the contig. DNA sequencing and cosmid mapping data from a collection of 310 clones revealed the presence of three new genes in this region that show no appreciable homologies on sequence database searches, as well as cDNA clones from the CSF1R and the PDGFRB loci (another of the known genes in the region). An additional cDNA was found with 100% homology to the gene encoding human ribosomal protein L7 (RPL7). This cDNA comprised ~25% of all selected clones. However, further analysis of the genomic contig revealed the presence of an RPL7 processed pseudogene in very close proximity to the CSF1R and PDGFRB genes. The selection of processed pseudogenes is one previously anticipated artifact of selection methodologies, but has not been previously observed. Mutational analysis of the three new genes is underway in diastrophic dysplasia families, as is derivation of full length cDNA clones and the expansion of this detailed transcription map into a larger genomic contig.

  19. Functional redundancy and/or ongoing pseudogenization among F-box protein genes expressed in Arabidopsis male gametophyte.

    PubMed

    Ikram, Sobia; Durandet, Monique; Vesa, Simona; Pereira, Serge; Guerche, Philippe; Bonhomme, Sandrine

    2014-06-01

    The F-box protein gene family is one of the largest gene families in plants, with almost 700 predicted genes in the model plant Arabidopsis. F-box proteins are key components of the ubiquitin proteasome system that allows targeted protein degradation. Transcriptome analyses indicate that half of these F-box protein genes are expressed in microspore and/or pollen, i.e., during male gametogenesis. To assess the role of F-box protein genes during this crucial developmental step, we selected 34 F-box protein genes recorded as highly and specifically expressed in pollen and isolated corresponding insertion mutants. We checked the expression level of each selected gene by RT-PCR and confirmed pollen expression for 25 genes, but specific expression for only 10 of the 34 F-box protein genes. In addition, we tested the expression level of selected F-box protein genes in 24 mutant lines and showed that 11 of them were null mutants. Transmission analysis of the mutations to the progeny showed that none of the single mutations was gametophytic lethal. These unaffected transmission efficiencies suggested leaky mutations or functional redundancy among F-box protein genes. Cytological observation of the gametophytes in the mutants confirmed these results. Combinations of mutations in F-box protein genes from the same subfamily did not lead to a transmission defect either, further highlighting functional redundancy and/or a high proportion of pseudogenes among these F-box protein genes.

  20. Efficient Detection of Copy Number Mutations in PMS2 Exons with a Close Homolog.

    PubMed

    Herman, Daniel S; Smith, Christina; Liu, Chang; Vaughn, Cecily P; Palaniappan, Selvi; Pritchard, Colin C; Shirts, Brian H

    2018-07-01

    Detection of 3' PMS2 copy-number mutations that cause Lynch syndrome is difficult because of highly homologous pseudogenes. To improve the accuracy and efficiency of clinical screening for these mutations, we developed a new method to analyze standard capture-based, next-generation sequencing data to identify deletions and duplications in PMS2 exons 9 to 15. The approach captures sequences using PMS2 targets, maps sequences randomly among regions with equal mapping quality, counts reads aligned to homologous exons and introns, and flags read count ratios outside of empirically derived reference ranges. The method was trained on 1352 samples, including 8 known positives, and tested on 719 samples, including 17 known positives. Clinical implementation of the first version of this method detected new mutations in the training (N = 7) and test (N = 2) sets that had not been identified by our initial clinical testing pipeline. The described final method showed complete sensitivity in both sample sets and false-positive rates of 5% (training) and 7% (test), dramatically decreasing the number of cases needing additional mutation evaluation. This approach leveraged the differences between gene and pseudogene to distinguish between PMS2 and PMS2CL copy-number mutations. These methods enable efficient and sensitive Lynch syndrome screening for 3' PMS2 copy-number mutations and may be applied similarly to other genomic regions with highly homologous pseudogenes. Copyright © 2018 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
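
    The flagging step described above (count reads on homologous exons, then flag ratios that fall outside empirically derived reference ranges) can be sketched roughly as follows. The abstract does not say which ratio is used, so the gene-copy fraction below, the function name, and the example ranges are all assumptions for illustration rather than the published method.

```python
from typing import Dict, List, Optional, Tuple


def flag_ratio_outliers(
    gene_counts: Dict[str, int],
    homolog_counts: Dict[str, int],
    reference_ranges: Dict[str, Tuple[float, float]],
) -> List[Tuple[str, Optional[float]]]:
    """Flag exons whose gene-copy read fraction leaves its reference range.

    The ratio here is gene reads / (gene reads + homolog reads); this choice
    is hypothetical. Exons with no reads at all are flagged with ratio None.
    """
    flagged: List[Tuple[str, Optional[float]]] = []
    for exon, (low, high) in reference_ranges.items():
        gene = gene_counts.get(exon, 0)
        homolog = homolog_counts.get(exon, 0)
        total = gene + homolog
        if total == 0:
            # No coverage: cannot compute a ratio, so flag for manual review.
            flagged.append((exon, None))
            continue
        ratio = gene / total
        if not (low <= ratio <= high):
            flagged.append((exon, ratio))
    return flagged
```

    In practice the reference ranges would come from a training set of samples with known copy number, as the abstract describes; the sketch only shows the comparison itself.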

  1. Complete plastid genome of Astragalus mongholicus var. nakaianus (Fabaceae).

    PubMed

    Choi, In-Su; Kim, Joo-Hwan; Choi, Byoung-Hee

    2016-07-01

    The first complete plastid genome (plastome) of the largest angiosperm genus, Astragalus, was sequenced for the Korean endangered endemic species A. mongholicus var. nakaianus. Its genome is relatively short (123,633 bp) because it lacks an Inverted Repeat (IR) region. It comprises 110 genes, including four unique rRNAs, 30 tRNAs, and 76 protein-coding genes. Similar to other closely related plastomes, rpl22 and rps16 are absent. The putative pseudogene with abnormal stop codons is atpE. This plastome has no additional inversions when compared with highly variable plastomes from IRLC tribes Fabeae and Trifolieae. Our phylogenetic analysis confirms the non-monophyly of Galegeae.

  2. Approaches to Fungal Genome Annotation

    PubMed Central

    Haas, Brian J.; Zeng, Qiandong; Pearson, Matthew D.; Cuomo, Christina A.; Wortman, Jennifer R.

    2011-01-01

    Fungal genome annotation is the starting point for analysis of genome content. This generally involves the application of diverse methods to identify features on a genome assembly such as protein-coding and non-coding genes, repeats and transposable elements, and pseudogenes. Here we describe tools and methods leveraged for eukaryotic genome annotation with a focus on the annotation of fungal nuclear and mitochondrial genomes. We highlight the application of the latest technologies and tools to improve the quality of predicted gene sets. The Broad Institute eukaryotic genome annotation pipeline is described as one example of how such methods and tools are integrated into a sequencing center’s production genome annotation environment. PMID:22059117

  3. Cytochromes P450

    PubMed Central

    Bak, Søren; Beisson, Fred; Bishop, Gerard; Hamberger, Björn; Höfer, René; Paquette, Suzanne; Werck-Reichhart, Danièle

    2011-01-01

    There are 244 cytochrome P450 genes (and 28 pseudogenes) in the Arabidopsis genome. P450s thus form one of the largest gene families in plants. Contrary to what was initially thought, this family diversification results in very limited functional redundancy and seems to mirror the complexity of plant metabolism. P450s sometimes share less than 20% identity and catalyze extremely diverse reactions leading to the precursors of structural macromolecules such as lignin, cutin, suberin and sporopollenin, or are involved in biosynthesis or catabolism of all hormone and signaling molecules, of pigments, odorants, flavors, antioxidants, allelochemicals and defense compounds, and in the metabolism of xenobiotics. The mechanisms of gene duplication and diversification are getting better understood and together with co-expression data provide leads to functional characterization. PMID:22303269

  4. The complete mitochondrial genome sequence of Malus hupehensis var. pinyiensis.

    PubMed

    Duan, Naibin; Sun, Honghe; Wang, Nan; Fei, Zhangjun; Chen, Xuesen

    2016-07-01

    The complete mitochondrial genome sequence of Malus hupehensis var. pinyiensis, a widely used apple rootstock, was determined using the Illumina high-throughput sequencing approach. The genome is 422,555 bp in length and has a GC content of 45.21%. It is separated by a pair of inverted repeats of 32,504 bp, to form a large single copy region of 213,055 bp and a small single copy region of 144,492 bp. The genome contains 38 protein-coding genes, four pseudogenes, 25 tRNA genes, and three rRNA genes. The genome is 25,608 bp longer than that of M. domestica, and several structural variations between these two mitogenomes were detected.
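
    The region sizes reported above can be checked against the total length with a short calculation, assuming the two inverted-repeat copies and the two single-copy regions together make up the whole genome. The function name is hypothetical, and the M. domestica length is only what the abstract's figures imply.

```python
def genome_length(large_single_copy: int, small_single_copy: int,
                  inverted_repeat: int) -> int:
    """Total length when two inverted-repeat copies separate two single-copy regions."""
    return large_single_copy + small_single_copy + 2 * inverted_repeat


# Figures taken from the abstract above.
total = genome_length(213_055, 144_492, 32_504)
assert total == 422_555  # matches the reported genome length

# Length of the M. domestica mitogenome implied by the 25,608 bp difference.
implied_m_domestica = total - 25_608
```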

  5. Comparative Analysis of Syntenic Genes in Grass Genomes Reveals Accelerated Rates of Gene Structure and Coding Sequence Evolution in Polyploid Wheat

    PubMed Central

    Akhunov, Eduard D.; Sehgal, Sunish; Liang, Hanquan; Wang, Shichen; Akhunova, Alina R.; Kaur, Gaganpreet; Li, Wanlong; Forrest, Kerrie L.; See, Deven; Šimková, Hana; Ma, Yaqin; Hayden, Matthew J.; Luo, Mingcheng; Faris, Justin D.; Doležel, Jaroslav; Gill, Bikram S.

    2013-01-01

    Cycles of whole-genome duplication (WGD) and diploidization are hallmarks of eukaryotic genome evolution and speciation. Polyploid wheat (Triticum aestivum) has had a massive increase in genome size largely due to recent WGDs. How these processes may impact the dynamics of gene evolution was studied by comparing the patterns of gene structure changes, alternative splicing (AS), and codon substitution rates among wheat and model grass genomes. In orthologous gene sets, significantly more acquired and lost exonic sequences were detected in wheat than in model grasses. In wheat, 35% of these gene structure rearrangements resulted in frame-shift mutations and premature termination codons. An increased codon mutation rate in the wheat lineage compared with Brachypodium distachyon was found for 17% of orthologs. The discovery of premature termination codons in 38% of expressed genes was consistent with ongoing pseudogenization of the wheat genome. The rates of AS within the individual wheat subgenomes (21%–25%) were similar to diploid plants. However, we uncovered a high level of AS pattern divergence between the duplicated homeologous copies of genes. Our results are consistent with the accelerated accumulation of AS isoforms, nonsynonymous mutations, and gene structure rearrangements in the wheat lineage, likely due to genetic redundancy created by WGDs. Whereas these processes mostly contribute to the degeneration of a duplicated genome and its diploidization, they have the potential to facilitate the origin of new functional variations, which, upon selection in the evolutionary lineage, may play an important role in the origin of novel traits. PMID:23124323

  6. The drosomycin multigene family: three-disulfide variants from Drosophila takahashii possess antibacterial activity

    PubMed Central

    Gao, Bin; Zhu, Shunyi

    2016-01-01

    Drosomycin (DRS) is a strictly antifungal peptide in Drosophila melanogaster, which contains four disulfide bridges (DBs), with three buried in the molecular interior and one exposed on the molecular surface to tie the amino- and carboxyl-termini of the molecule together (called the wrapper disulfide bridge, WDB). Based on computational analysis of genomes of Drosophila species belonging to the Oriental lineage, we identified a new multigene family of DRS in Drosophila takahashii that includes a total of 11 DRS-encoding genes (termed DtDRS-1 to DtDRS-11) and a pseudogene. Phylogenetic tree and synteny analyses reveal an orthologous relationship between DtDRSs and DRSs, indicating that orthologous genes of DRS-1, DRS-2, DRS-3 and DRS-6 have undergone duplication in D. takahashii and three amplifications (DtDRS-9 to DtDRS-11) of DRS-3 have lost the WDB. Among the 11 genes, five are transcriptionally active in adult fruitflies. The ortholog of DRS (DtDRS-1) shows high structural and functional similarity to DRS, while two WDB-deficient members display antibacterial activity accompanying complete loss or remarkable reduction of antifungal activity. To the best of our knowledge, this is the first report on the presence of three-disulfide antibacterial DRSs in a specific Drosophila species, suggesting a potential role of DB loss in neofunctionalization of a protein via structural adjustment. PMID:27562645

  7. Complete Chloroplast Genome of the Wollemi Pine (Wollemia nobilis): Structure and Evolution.

    PubMed

    Yap, Jia-Yee S; Rohner, Thore; Greenfield, Abigail; Van Der Merwe, Marlien; McPherson, Hannah; Glenn, Wendy; Kornfeld, Geoff; Marendy, Elessa; Pan, Annie Y H; Wilton, Alan; Wilkins, Marc R; Rossetto, Maurizio; Delaney, Sven K

    2015-01-01

    The Wollemi pine (Wollemia nobilis) is a rare Southern conifer with striking morphological similarity to fossil pines. A small population of W. nobilis was discovered in 1994 in a remote canyon system in the Wollemi National Park (near Sydney, Australia). This population contains fewer than 100 individuals and is critically endangered. Previous genetic studies of the Wollemi pine have investigated its evolutionary relationship with other pines in the family Araucariaceae, and have suggested that the Wollemi pine genome contains little or no variation. However, these studies were performed prior to the widespread use of genome sequencing, and their conclusions were based on a limited fraction of the Wollemi pine genome. In this study, we address this problem by determining the entire sequence of the W. nobilis chloroplast genome. A detailed analysis of the structure of the genome is presented, and the evolution of the genome is inferred by comparison with the chloroplast sequences of other members of the Araucariaceae and the related family Podocarpaceae. Pairwise alignments of whole genome sequences, and the presence of unique pseudogenes, gene duplications and insertions in W. nobilis and Araucariaceae, indicate that the W. nobilis chloroplast genome is most similar to that of its sister taxon Agathis. However, the W. nobilis genome contains an unusually high number of repetitive sequences, and these could be used in future studies to investigate and conserve any remnant genetic diversity in the Wollemi pine.

  8. Systematic analysis of human kinase genes: a large number of genes and alternative splicing events result in functional and structural diversity

    PubMed Central

    Milanesi, Luciano; Petrillo, Mauro; Sepe, Leandra; Boccia, Angelo; D'Agostino, Nunzio; Passamano, Myriam; Di Nardo, Salvatore; Tasco, Gianluca; Casadio, Rita; Paolella, Giovanni

    2005-01-01

    Background Protein kinases are a well defined family of proteins, characterized by the presence of a common kinase catalytic domain and playing a significant role in many important cellular processes, such as proliferation, maintenance of cell shape, and apoptosis. In many members of the family, additional non-kinase domains contribute further specialization, resulting in subcellular localization, protein binding and regulation of activity, among others. About 500 genes encode members of the kinase family in the human genome, and although many of them represent well known genes, a larger number of genes code for proteins of more recent identification, or for unknown proteins identified as kinases only after computational studies. Results A systematic in silico study performed on the human genome led to the identification of 5 genes, on chromosomes 1, 11, 13, 15 and 16 respectively, and 1 pseudogene on chromosome X; some of these genes are reported as kinases from NCBI but are absent in other databases, such as KinBase. Comparative analysis of 483 gene regions and subsequent computational analysis, aimed at identifying unannotated exons, indicates that a large number of kinases may code for alternately spliced forms or be incorrectly annotated. An InterProScan automated analysis was performed to study domain distribution and combination in the various families. At the same time, other structural features were also added to the annotation process, including the putative presence of transmembrane alpha helices, and the cysteine propensity to participate in a disulfide bridge. Conclusion The predicted human kinome was extended by identifying both additional genes and potential splice variants, resulting in a varied panorama where functionality may be searched at the gene and protein level. Structural analysis of kinase protein domains as defined in multiple sources, together with transmembrane alpha helix and signal peptide prediction, provides hints to function assignment. The results of the human kinome analysis are collected in the KinWeb database, available for browsing and searching over the internet, where all results from the comparative analysis and the gene structure annotation are made available, alongside the domain information. Kinases may be searched by domain combinations and the relative genes may be viewed in a graphic browser at various levels of magnification up to gene organization on the full chromosome set. PMID:16351747

  9. Identification and characterisation of the BPI/LBP/PLUNC-like gene repertoire in chickens reveals the absence of a LBP gene

    PubMed Central

    Chiang, Shih-Chieh; Veldhuizen, Edwin J.A.; Barnes, Frances A.; Craven, C. Jeremy; Haagsman, Henk P.; Bingle, Colin D.

    2011-01-01

    Palate, lung and nasal epithelial clone (PLUNC) proteins are structural homologues to the innate defence molecules LPS-binding protein (LBP) and bactericidal/permeability-increasing protein (BPI). PLUNCs make up the largest portion of the wider BPI/LBP/PLUNC-like protein family and are amongst the most rapidly evolving mammalian genes. In this study we systematically identified and characterised BPI/LBP/PLUNC-like protein-encoding genes in the chicken genome. We identified eleven complete genes (and a pseudogene). Five of them are clustered on a >50 kb locus on chromosome 20, immediately adjacent to BPI. In addition to BPI, we have identified presumptive orthologues LPLUNCs 2, 3, 4 and 6, and BPIL-2. We find no evidence for the existence of single domain containing proteins in birds. Strikingly our analysis also suggests that there is no LBP orthologue in chicken. This observation may in part account for the relative resistance to LPS toxicity observed in birds. Our results indicate significant differences between the avian and mammalian repertoires of BPI/LBP/PLUNC-like genes at the genomic and transcriptional levels and provide a framework for further functional analyses of this gene family in chickens. PMID:20959152

  10. The ENCODE Project at UC Santa Cruz.

    PubMed

    Thomas, Daryl J; Rosenbloom, Kate R; Clawson, Hiram; Hinrichs, Angie S; Trumbower, Heather; Raney, Brian J; Karolchik, Donna; Barber, Galt P; Harte, Rachel A; Hillman-Jackson, Jennifer; Kuhn, Robert M; Rhead, Brooke L; Smith, Kayla E; Thakkapallayil, Archana; Zweig, Ann S; Haussler, David; Kent, W James

    2007-01-01

    The goal of the Encyclopedia Of DNA Elements (ENCODE) Project is to identify all functional elements in the human genome. The pilot phase is for comparison of existing methods and for the development of new methods to rigorously analyze a defined 1% of the human genome sequence. Experimental datasets are focused on the origin of replication, DNase I hypersensitivity, chromatin immunoprecipitation, promoter function, gene structure, pseudogenes, non-protein-coding RNAs, transcribed RNAs, multiple sequence alignment and evolutionarily constrained elements. The ENCODE project at UCSC website (http://genome.ucsc.edu/ENCODE) is the primary portal for the sequence-based data produced as part of the ENCODE project. In the pilot phase of the project, over 30 labs provided experimental results for a total of 56 browser tracks supported by 385 database tables. The site provides researchers with a number of tools that allow them to visualize and analyze the data as well as download data for local analyses. This paper describes the portal to the data, highlights the data that has been made available, and presents the tools that have been developed within the ENCODE project. Access to the data and types of interactive analysis that are possible are illustrated through supplemental examples.

  11. Evolution of a Major Drug Metabolizing Enzyme Defect in the Domestic Cat and Other Felidae: Phylogenetic Timing and the Role of Hypercarnivory

    PubMed Central

    Shrestha, Binu; Reed, J. Michael; Starks, Philip T.; Kaufman, Gretchen E.; Goldstone, Jared V.; Roelke, Melody E.; O'Brien, Stephen J.; Koepfli, Klaus-Peter; Frank, Laurence G.; Court, Michael H.

    2011-01-01

    The domestic cat (Felis catus) shows remarkable sensitivity to the adverse effects of phenolic drugs, including acetaminophen and aspirin, as well as structurally-related toxicants found in the diet and environment. This idiosyncrasy results from pseudogenization of the gene encoding UDP-glucuronosyltransferase (UGT) 1A6, the major species-conserved phenol detoxification enzyme. Here, we established the phylogenetic timing of disruptive UGT1A6 mutations and explored the hypothesis that gene inactivation in cats was enabled by minimal exposure to plant-derived toxicants. Fixation of the UGT1A6 pseudogene was estimated to have occurred between 35 and 11 million years ago with all extant Felidae having dysfunctional UGT1A6. Out of 22 additional taxa sampled, representative of most Carnivora families, only brown hyena (Parahyaena brunnea) and northern elephant seal (Mirounga angustirostris) showed inactivating UGT1A6 mutations. A comprehensive literature review of the natural diet of the sampled taxa indicated that all species with defective UGT1A6 were hypercarnivores (>70% dietary animal matter). Furthermore those species with UGT1A6 defects showed evidence for reduced amino acid constraint (increased dN/dS ratios approaching the neutral selection value of 1.0) as compared with species with intact UGT1A6. In contrast, there was no evidence for reduced amino acid constraint for these same species within UGT1A1, the gene encoding the enzyme responsible for detoxification of endogenously generated bilirubin. Our results provide the first evidence suggesting that diet may have played a permissive role in the devolution of a mammalian drug metabolizing enzyme. Further work is needed to establish whether these preliminary findings can be generalized to all Carnivora. PMID:21464924

  12. Early evolutionary colocalization of the nuclear ribosomal 5S and 45S gene families in seed plants: evidence from the living fossil gymnosperm Ginkgo biloba.

    PubMed

    Galián, J A; Rosato, M; Rosselló, J A

    2012-06-01

    In seed plants, the colocalization of the 5S loci within the intergenic spacer (IGS) of the nuclear 45S tandem units is restricted to the phylogenetically derived Asteraceae family. However, fluorescent in situ hybridization (FISH) colocalization of both multigene families has also been observed in other unrelated seed plant lineages. Previous work has identified colocalization of 45S and 5S loci in Ginkgo biloba using FISH, but these observations have not been confirmed recently by sequencing a 1.8 kb IGS. In this work, we report the presence of the 45S-5S linkage in G. biloba, suggesting that in seed plants the molecular events leading to the restructuring of the ribosomal loci are much older than estimated previously. We obtained a 6.0 kb IGS fragment showing structural features of functional sequences, and a single copy of the 5S gene was inserted in the same direction of transcription as the ribosomal RNA genes. We also obtained a 1.8 kb IGS that was a truncate variant of the 6.0 kb IGS lacking the 5S gene. Several lines of evidence strongly suggest that the 1.8 kb variants are pseudogenes that are present exclusively on the satellite chromosomes bearing the 45S-5S genes. The presence of ribosomal IGS pseudogenes best reconciles contradictory results concerning the presence or absence of the 45S-5S linkage in Ginkgo. Our finding that both ribosomal gene families have been unified to a single 45S-5S unit in Ginkgo indicates that an accurate reassessment of the organization of rDNA genes in basal seed plants is necessary.

  13. Recommended nomenclature for five mammalian carboxylesterase gene families: human, mouse, and rat genes and proteins.

    PubMed

    Holmes, Roger S; Wright, Matthew W; Laulederkind, Stanley J F; Cox, Laura A; Hosokawa, Masakiyo; Imai, Teruko; Ishibashi, Shun; Lehner, Richard; Miyazaki, Masao; Perkins, Everett J; Potter, Phillip M; Redinbo, Matthew R; Robert, Jacques; Satoh, Tetsuo; Yamashita, Tetsuro; Yan, Bingfan; Yokoi, Tsuyoshi; Zechner, Rudolf; Maltais, Lois J

    2010-10-01

    Mammalian carboxylesterase (CES or Ces) genes encode enzymes that participate in xenobiotic, drug, and lipid metabolism in the body and are members of at least five gene families. Tandem duplications have added more genes for some families, particularly for mouse and rat genomes, which has caused confusion in naming rodent Ces genes. This article describes a new nomenclature system for human, mouse, and rat carboxylesterase genes that identifies homolog gene families and allocates a unique name for each gene. The guidelines of human, mouse, and rat gene nomenclature committees were followed and "CES" (human) and "Ces" (mouse and rat) root symbols were used followed by the family number (e.g., human CES1). Where multiple genes were identified for a family or where a clash occurred with an existing gene name, a letter was added (e.g., human CES4A; mouse and rat Ces1a) that reflected gene relatedness among rodent species (e.g., mouse and rat Ces1a). Pseudogenes were named by adding "P" and a number to the human gene name (e.g., human CES1P1) or by using a new letter followed by ps for mouse and rat Ces pseudogenes (e.g., Ces2d-ps). Gene transcript isoforms were named by adding the GenBank accession ID to the gene symbol (e.g., human CES1_AB119995 or mouse Ces1e_BC019208). This nomenclature improves our understanding of human, mouse, and rat CES/Ces gene families and facilitates research into the structure, function, and evolution of these gene families. It also serves as a model for naming CES genes from other mammalian species.
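
    The naming rules summarized in this abstract can be illustrated with a small symbol parser. This is only a sketch built from the examples quoted above (CES1, CES4A, Ces1a, CES1P1, Ces2d-ps, CES1_AB119995, Ces1e_BC019208). The function name `parse_ces_symbol`, the regular expressions, and the simplified accession pattern are assumptions made for illustration and are not part of the published nomenclature.

```python
import re

# Simplified accession suffix: "_" + 1-2 capital letters + digits (an assumption,
# chosen only to cover the two examples quoted in the abstract).
_ACC = r"(?:_(?P<accession>[A-Z]{1,2}\d+))?"

# Human: "CES" + family number + optional capital letter + optional "P<number>".
_HUMAN = re.compile(
    r"^CES(?P<family>\d+)(?P<letter>[A-Z])?(?P<pseudo>P\d+)?" + _ACC + "$"
)
# Mouse/rat: "Ces" + family number + optional lowercase letter + optional "-ps".
_RODENT = re.compile(
    r"^Ces(?P<family>\d+)(?P<letter>[a-z])?(?P<pseudo>-ps)?" + _ACC + "$"
)


def parse_ces_symbol(symbol):
    """Split a CES/Ces symbol into its parts; return None if it does not fit."""
    for group, pattern in (("human", _HUMAN), ("mouse/rat", _RODENT)):
        m = pattern.match(symbol)
        if m:
            return {
                "species_group": group,
                "family": int(m.group("family")),
                "letter": m.group("letter"),
                "is_pseudogene": m.group("pseudo") is not None,
                "accession": m.group("accession"),
            }
    return None


if __name__ == "__main__":
    for s in ["CES1", "CES4A", "Ces1a", "CES1P1", "Ces2d-ps", "CES1_AB119995"]:
        print(s, parse_ces_symbol(s))
```

    The human and rodent patterns are kept separate because the abstract gives them different root capitalization and different pseudogene suffixes. A symbol that matches neither pattern returns None rather than a guess.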

  14. Early evolutionary colocalization of the nuclear ribosomal 5S and 45S gene families in seed plants: evidence from the living fossil gymnosperm Ginkgo biloba

    PubMed Central

    Galián, J A; Rosato, M; Rosselló, J A

    2012-01-01

    In seed plants, the colocalization of the 5S loci within the intergenic spacer (IGS) of the nuclear 45S tandem units is restricted to the phylogenetically derived Asteraceae family. However, fluorescent in situ hybridization (FISH) colocalization of both multigene families has also been observed in other unrelated seed plant lineages. Previous work has identified colocalization of 45S and 5S loci in Ginkgo biloba using FISH, but these observations have not been confirmed recently by sequencing a 1.8 kb IGS. In this work, we report the presence of the 45S–5S linkage in G. biloba, suggesting that in seed plants the molecular events leading to the restructuring of the ribosomal loci are much older than estimated previously. We obtained a 6.0 kb IGS fragment showing structural features of functional sequences, and a single copy of the 5S gene was inserted in the same direction of transcription as the ribosomal RNA genes. We also obtained a 1.8 kb IGS that was a truncate variant of the 6.0 kb IGS lacking the 5S gene. Several lines of evidence strongly suggest that the 1.8 kb variants are pseudogenes that are present exclusively on the satellite chromosomes bearing the 45S–5S genes. The presence of ribosomal IGS pseudogenes best reconciles contradictory results concerning the presence or absence of the 45S–5S linkage in Ginkgo. Our finding that both ribosomal gene families have been unified to a single 45S–5S unit in Ginkgo indicates that an accurate reassessment of the organization of rDNA genes in basal seed plants is necessary. PMID:22354111

  15. Ribosomal RNA Genes Contribute to the Formation of Pseudogenes and Junk DNA in the Human Genome.

    PubMed

    Robicheau, Brent M; Susko, Edward; Harrigan, Amye M; Snyder, Marlene

    2017-02-01

    Approximately 35% of the human genome can be identified as sequence devoid of a selected-effect function, and not derived from transposable elements or repeated sequences. We provide evidence supporting a known origin for a fraction of this sequence. We show that: 1) highly degraded, but near full length, ribosomal DNA (rDNA) units, including both 45S and Intergenic Spacer (IGS), can be found at multiple sites in the human genome on chromosomes without rDNA arrays, 2) that these rDNA sequences have a propensity for being centromere proximal, and 3) that sequence at all human functional rDNA array ends is divergent from canonical rDNA to the point that it is pseudogenic. We also show that small sequence strings of rDNA (from 45S + IGS) can be found distributed throughout the genome and are identifiable as an "rDNA-like signal", representing 0.26% of the q-arm of HSA21 and ∼2% of the total sequence of other regions tested. The size of sequence strings found in the rDNA-like signal intergrade into the size of sequence strings that make up the full-length degrading rDNA units found scattered throughout the genome. We conclude that the displaced and degrading rDNA sequences are likely of a similar origin but represent different stages in their evolution towards random sequence. Collectively, our data suggests that over vast evolutionary time, rDNA arrays contribute to the production of junk DNA. The concept that the production of rDNA pseudogenes is a by-product of concerted evolution represents a previously under-appreciated process; we demonstrate here its importance. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  16. New data on epizootiology and genetics of piroplasms based on sequences of small ribosomal subunit and cytochrome b genes.

    PubMed

    Criado, A; Martinez, J; Buling, A; Barba, J C; Merino, S; Jefferies, R; Irwin, P J

    2006-12-20

    As a continuation of our studies on molecular epizootiology of piroplasmosis in Spain and other countries, we present in this contribution the finding of new hosts for some piroplasms, as well as information on their 18S rRNA gene sequences. Genetic data were complemented with sequences of apocytochrome b gene (whenever possible). The following conclusions were drawn from these molecular studies: Theileria annulata is capable of infecting dogs, since it was diagnosed in a symptomatic animal. According to cytochrome b sequences, isolates from cows and dog present slight differences. The same isolates showed, however, identical sequence in the 18S rRNA gene. This exemplifies well the usefulness of the mitochondrial gene for examining infra-specific variation. Babesia bovis is an occasional parasite of equines, since it was detected in two symptomatic horses. We found evidence of genetic polymorphism occurring in the 18S rRNA gene of Spanish T. equi-like and B. ovis isolates. B. bennetti from Spanish seagull is loosely related to B. ovis, and might represent a genetically distinct branch of babesids. A partial sequence of a cytochrome b pseudogene was obtained for the first time in Babesia canis rossi from South Africa. The pseudogene is distantly related to B. bigemina cytochrome b gene. These new findings confirm the ability of some piroplasms to infect multiple hosts, as well as the existence of a relatively wide genetic polymorphisms with respect to the cytochrome b gene. On the other hand, the existence of mtDNA-like pseudogenes of possible nuclear location in piroplasms is interesting due to their possible impact on molecular phylogeny studies.

  17. Pseudogenization of the umami taste receptor gene Tas1r1 in the giant panda coincided with its dietary switch to bamboo.

    PubMed

    Zhao, Huabin; Yang, Jian-Rong; Xu, Huailiang; Zhang, Jianzhi

    2010-12-01

    Although it belongs to the order Carnivora, the giant panda is a vegetarian with 99% of its diet being bamboo. The draft genome sequence of the giant panda shows that its umami taste receptor gene Tas1r1 is a pseudogene, prompting the proposal that the loss of the umami perception explains why the giant panda is herbivorous. To test this hypothesis, we sequenced all six exons of Tas1r1 in another individual of the giant panda and five other carnivores. We found that the open reading frame (ORF) of Tas1r1 is intact in all these carnivores except the giant panda. The rate ratio (ω) of nonsynonymous to synonymous substitutions in Tas1r1 is significantly higher for the giant panda lineage than for other carnivore lineages. Based on the ω change and the observed number of ORF-disrupting substitutions, we estimated that the functional constraint on the giant panda Tas1r1 was relaxed ∼ 4.2 Ma, with its 95% confidence interval between 1.3 and 10 Ma. Our estimate matches the approximate date of the giant panda's dietary switch inferred from fossil records. It is probable that the giant panda's decreased reliance on meat resulted in the dispensability of the umami taste, leading to Tas1r1 pseudogenization, which in turn reinforced its herbivorous life style because of the diminished attraction of returning to meat eating in the absence of Tas1r1. Nonetheless, additional factors are likely involved because herbivores such as cow and horse still retain an intact Tas1r1.

  18. Pseudogenization of the Umami Taste Receptor Gene Tas1r1 in the Giant Panda Coincided with its Dietary Switch to Bamboo

    PubMed Central

    Zhao, Huabin; Yang, Jian-Rong; Xu, Huailiang; Zhang, Jianzhi

    2010-01-01

    Although it belongs to the order Carnivora, the giant panda is a vegetarian with 99% of its diet being bamboo. The draft genome sequence of the giant panda shows that its umami taste receptor gene Tas1r1 is a pseudogene, prompting the proposal that the loss of the umami perception explains why the giant panda is herbivorous. To test this hypothesis, we sequenced all six exons of Tas1r1 in another individual of the giant panda and five other carnivores. We found that the open reading frame (ORF) of Tas1r1 is intact in all these carnivores except the giant panda. The rate ratio (ω) of nonsynonymous to synonymous substitutions in Tas1r1 is significantly higher for the giant panda lineage than for other carnivore lineages. Based on the ω change and the observed number of ORF-disrupting substitutions, we estimated that the functional constraint on the giant panda Tas1r1 was relaxed ∼4.2 Ma, with its 95% confidence interval between 1.3 and 10 Ma. Our estimate matches the approximate date of the giant panda's dietary switch inferred from fossil records. It is probable that the giant panda's decreased reliance on meat resulted in the dispensability of the umami taste, leading to Tas1r1 pseudogenization, which in turn reinforced its herbivorous life style because of the diminished attraction of returning to meat eating in the absence of Tas1r1. Nonetheless, additional factors are likely involved because herbivores such as cow and horse still retain an intact Tas1r1. PMID:20573776

  19. Characterisation of the subtelomeric regions of Giardia lamblia genome isolate WBC6.

    PubMed

    Prabhu, Anjali; Morrison, Hilary G; Martinez, Charles R; Adam, Rodney D

    2007-04-01

    Giardia trophozoites are polyploid and have five chromosomes. The chromosome homologues demonstrate considerable size heterogeneity due to variation in the subtelomeric regions. We used clones from the genome project with telomeric sequence at one end to identify six subtelomeric regions in addition to previously identified subtelomeric regions, to study the telomeric arrangement of the chromosomes. The subtelomeric regions included two retroposons, one retroposon pseudogene, and two vsp genes, in addition to the previously identified subtelomeric regions that include ribosomal DNA repeats. The presence of vsp genes in a subtelomeric region suggests that telomeric rearrangements may contribute to the generation of vsp diversity. These studies of the subtelomeric regions of Giardia may contribute to our understanding of the factors that maintain stability, while allowing diversity in chromosome structure.

  20. A comprehensive list of cloned human DNA sequences

    PubMed Central

    Schmidtke, Jörg; Cooper, David N.

    1987-01-01

    A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3575113

  1. A comprehensive list of cloned human DNA sequences

    PubMed Central

    Schmidtke, Jörg; Cooper, David N.

    1990-01-01

    A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2333227

  2. A comprehensive list of cloned human DNA sequences

    PubMed Central

    Schmidtke, Jörg; Cooper, David N.

    1988-01-01

    A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3368330

  3. A comprehensive list of cloned human DNA sequences

    PubMed Central

    Schmidtke, Jörg; Cooper, David N.

    1989-01-01

    A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2654889

  4. Chromosomal localization and sequence analysis of a human episomal sequence with in vitro differentiating activity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Boccaccio, C.; Deshatrette, J.; Meunier-Rotival, M.

    1994-05-01

    The genomic fragment carrying the human activator of liver function, previously described as an episome capable of inducing differentiation upon transfection into a dedifferentiated rat hepatoma cell line, was mapped on human chromosome 12q24.2-12q24.3. This chromosomal location was indistinguishable by in situ hybridization from that of the gene coding for the hepatic transcription factor HNF1. The sequence of the integrated form of the episome as well as its flanking sequences show that it is rich in retroposons. It contains a human ribosomal protein L21 processed pseudogene, one truncated L1Hs sequence, and 10 Alu repeats, which belong to different subfamilies.

  5. Of the Nine Cytidine Deaminase-Like Genes in Arabidopsis, Eight Are Pseudogenes and Only One Is Required to Maintain Pyrimidine Homeostasis in Vivo

    PubMed Central

    2016-01-01

    CYTIDINE DEAMINASE (CDA) catalyzes the deamination of cytidine to uridine and ammonia in the catabolic route of C nucleotides. The Arabidopsis (Arabidopsis thaliana) CDA gene family comprises nine members, one of which (AtCDA) was shown previously in vitro to encode an active CDA. A possible role in C-to-U RNA editing or in antiviral defense has been discussed for other members. A comprehensive bioinformatic analysis of plant CDA sequences, combined with biochemical functionality tests, strongly suggests that all Arabidopsis CDA family members except AtCDA are pseudogenes and that most plants only require a single CDA gene. Soybean (Glycine max) possesses three CDA genes, but only two encode functional enzymes and just one has very high catalytic efficiency. AtCDA and soybean CDAs are located in the cytosol. The functionality of AtCDA in vivo was demonstrated with loss-of-function mutants accumulating high amounts of cytidine but also CMP, cytosine, and some uridine in seeds. Cytidine hydrolysis in cda mutants is likely caused by NUCLEOSIDE HYDROLASE1 (NSH1) because cytosine accumulation is strongly reduced in a cda nsh1 double mutant. Altered responses of the cda mutants to fluorocytidine and fluorouridine indicate that a dual specific nucleoside kinase is involved in cytidine as well as uridine salvage. CDA mutants display a reduction in rosette size and have fewer leaves compared with the wild type, which is probably not caused by defective pyrimidine catabolism but by the accumulation of pyrimidine catabolism intermediates reaching toxic concentrations. PMID:27208239

  6. Genomic assessment of the evolution of the prion protein gene family in vertebrates.

    PubMed

    Harrison, Paul M; Khachane, Amit; Kumar, Manish

    2010-05-01

    Prion diseases are devastating neurological disorders caused by the propagation of particles containing an alternative beta-sheet-rich form of the prion protein (PrP). Genes paralogous to PrP, called Doppel and Shadoo, have been identified, that also have neuropathological relevance. To aid in the further functional characterization of PrP and its relatives, we annotated completely the PrP gene family (PrP-GF), in the genomes of 42 vertebrates, through combined strategic application of gene prediction programs and advanced remote homology detection techniques (such as HMMs, PSI-TBLASTN and pGenThreader). We have uncovered several previously undescribed paralogous genes and pseudogenes. We find that current high-quality genomic evidence indicates that the PrP relative Doppel, was likely present in the last common ancestor of present-day Tetrapoda, but was lost in the bird lineage, since its divergence from reptiles. Using the new gene annotations, we have defined the consensus of structural features that are characteristic of the PrP and Doppel structures, across diverse Tetrapoda clades. Furthermore, we describe in detail a transcribed pseudogene derived from Shadoo that is conserved across primates, and that overlaps the meiosis gene, SYCE1, thus possibly regulating its expression. In addition, we analysed the locus of PRNP/PRND for significant conservation across the genomic DNA of eleven mammals, and determined the phylogenetic penetration of non-coding exons. The genomic evidence indicates that the second PRNP non-coding exon found in even-toed ungulates and rodents, is conserved in all high-coverage genome assemblies of primates (human, chimp, orang utan and macaque), and is, at least, likely to have fallen out of use during primate speciation. Furthermore, we have demonstrated that the PRNT gene (at the PRNP human locus) is conserved across at least sixteen mammals, and evolves like a long non-coding RNA, fashioned from fragments of ancient, long, interspersed elements. These annotations and evolutionary analyses will be of further use for functional characterisation of the PrP-GF, and will be updatable in a semi-automated fashion as more genomes accumulate. Copyright 2010 Elsevier Inc. All rights reserved.

  7. Chloroplast Genome of the Folk Medicine and Vegetable Plant Talinum paniculatum (Jacq.) Gaertn.: Gene Organization, Comparative and Phylogenetic Analysis.

    PubMed

    Liu, Xia; Li, Yuan; Yang, Hongyuan; Zhou, Boyang

    2018-04-09

    The complete chloroplast (cp) genome of Talinum paniculatum (Caryophyllale), a source of pharmaceutical efficacy similar to ginseng, and a widely distributed and planted edible vegetable, were sequenced and analyzed. The cp genome size of T. paniculatum is 156,929 bp, with a pair of inverted repeats (IRs) of 25,751 bp separated by a large single copy (LSC) region of 86,898 bp and a small single copy (SSC) region of 18,529 bp. The genome contains 83 protein-coding genes, 37 transfer RNA (tRNA) genes, eight ribosomal RNA (rRNA) genes and four pseudogenes. Fifty one (51) repeat units and ninety two (92) simple sequence repeats (SSRs) were found in the genome. The pseudogene rpl23 (Ribosomal protein L23) was insert AATT than other Caryophyllale species by sequence alignment, which located in IRs region. The gene of trnK-UUU (tRNA-Lys) and rpl16 (Ribosomal protein L16) have larger introns in T. paniculatum, and the existence of matK (maturase K) genes, which usually located in the introns of trnK-UUU, rich sequence divergence in Caryophyllale. Complete cp genome comparison with other eight Caryophyllales species indicated that the differences between T. paniculatum and P. oleracea were very slight, and the most highly divergent regions occurred in intergenic spacers. Comparisons of IR boundaries among nine Caryophyllales species showed that T. paniculatum have larger IRs region and the contraction is relatively slight. The phylogenetic analysis among 35 Caryophyllales species and two outgroup species revealed that T. paniculatum and P. oleracea do not belong to the same family. All these results give good opportunities for future identification, barcoding of Talinum species, understanding the evolutionary mode of Caryophyllale cp genome and molecular breeding of T. paniculatum with high pharmaceutical efficacy.
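
    The region sizes reported in this abstract can be checked against the stated genome size, since a quadripartite chloroplast genome consists of one LSC, one SSC and two IR copies. The short sketch below does that arithmetic; the function and variable names are illustrative assumptions, and only the four numbers come from the abstract.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total length (bp) of a plastome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


# Values as reported in the abstract (bp).
LSC, SSC, IR = 86_898, 18_529, 25_751
REPORTED_TOTAL = 156_929

total = quadripartite_length(LSC, SSC, IR)
print(total, total == REPORTED_TOTAL)  # 156929 True

# Share of the genome occupied by the two inverted-repeat copies.
ir_fraction = 2 * IR / total
print(f"IR share: {ir_fraction:.1%}")
```

    The sum matches the reported 156,929 bp, so the four figures in the abstract are internally consistent.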

  8. Intragenomic sequence variation at the ITS1 - ITS2 region and at the 18S and 28S nuclear ribosomal DNA genes of the New Zealand mud snail, Potamopyrgus antipodarum (Hydrobiidae: mollusca)

    USGS Publications Warehouse

    Hoy, Marshal S.; Rodriguez, Rusty J.

    2013-01-01

    Molecular genetic analysis was conducted on two populations of the invasive non-native New Zealand mud snail (Potamopyrgus antipodarum), one from a freshwater ecosystem in Devil's Lake (Oregon, USA) and the other from an ecosystem of higher salinity in the Columbia River estuary (Hammond Harbor, Oregon, USA). To elucidate potential genetic differences between the two populations, three segments of nuclear ribosomal DNA (rDNA), the ITS1-ITS2 regions and the 18S and 28S rDNA genes were cloned and sequenced. Variant sequences within each individual were found in all three rDNA segments. Folding models were utilized for secondary structure analysis and results indicated that there were many sequences which contained structure-altering polymorphisms, which suggests they could be nonfunctional pseudogenes. In addition, analysis of molecular variance (AMOVA) was used for hierarchical analysis of genetic variance to estimate variation within and among populations and within individuals. AMOVA revealed significant variation in the ITS region between the populations and among clones within individuals, while in the 5.8S rDNA significant variation was revealed among individuals within the two populations. High levels of intragenomic variation were found in the ITS regions, which are known to be highly variable in many organisms. More interestingly, intragenomic variation was also found in the 18S and 28S rDNA, which has rarely been observed in animals and is so far unreported in Mollusca. We postulate that in these P. antipodarum populations the effects of concerted evolution are diminished due to the fact that not all of the rDNA genes in their polyploid genome should be essential for sustaining cellular function. This could lead to a lessening of selection pressures, allowing mutations to accumulate in some copies, changing them into variant sequences.                   

  9. Deciphering the Origin, Evolution, and Physiological Function of the Subtelomeric Aryl-Alcohol Dehydrogenase Gene Family in the Yeast Saccharomyces cerevisiae.

    PubMed

    Yang, Dong-Dong; de Billerbeck, Gustavo M; Zhang, Jin-Jing; Rosenzweig, Frank; Francois, Jean-Marie

    2018-01-01

    Homology searches indicate that Saccharomyces cerevisiae strain BY4741 contains seven redundant genes that encode putative aryl-alcohol dehydrogenases (AAD). Yeast AAD genes are located in subtelomeric regions of different chromosomes, and their functional role(s) remain enigmatic. Here, we show that two of these genes, AAD4 and AAD14, encode functional enzymes that reduce aliphatic and aryl-aldehydes concomitant with the oxidation of cofactor NADPH, and that Aad4p and Aad14p exhibit different substrate preference patterns. Other yeast AAD genes are undergoing pseudogenization. The 5' sequence of AAD15 has been deleted from the genome. Repair of an AAD3 missense mutation at the catalytically essential Tyr 73 residue did not result in a functional enzyme. However, ancestral-state reconstruction by fusing Aad6 with Aad16 and by N-terminal repair of Aad10 restores NADPH-dependent aryl-alcohol dehydrogenase activities. Phylogenetic analysis indicates that AAD genes are narrowly distributed in wood-saprophyte fungi and in yeast that occupy lignocellulosic niches. Because yeast AAD genes exhibit activity on veratraldehyde, cinnamaldehyde, and vanillin, they could serve to detoxify aryl-aldehydes released during lignin degradation. However, none of these compounds induce yeast AAD gene expression, and Aad activities do not relieve aryl-aldehyde growth inhibition. Our data suggest an ancestral role for AAD genes in lignin degradation that is degenerating as a result of yeast's domestication and use in brewing, baking, and other industrial applications. IMPORTANCE Functional characterization of hypothetical genes remains one of the chief tasks of the postgenomic era. Although the first Saccharomyces cerevisiae genome sequence was published over 20 years ago, 22% of its estimated 6,603 open reading frames (ORFs) remain unverified. One outstanding example of this category of genes is the enigmatic seven-member AAD family. Here, we demonstrate that proteins encoded by two members of this family exhibit aliphatic and aryl-aldehyde reductase activity, and further that such activity can be recovered from pseudogenized AAD genes via ancestral-state reconstruction. The phylogeny of yeast AAD genes suggests that these proteins may have played an important ancestral role in detoxifying aromatic aldehydes in ligninolytic fungi. However, in yeast adapted to niches rich in sugars, AAD genes become subject to mutational erosion. Our findings shed new light on the selective pressures and molecular mechanisms by which genes undergo pseudogenization. Copyright © 2017 Yang et al.

  10. Deciphering the Origin, Evolution, and Physiological Function of the Subtelomeric Aryl-Alcohol Dehydrogenase Gene Family in the Yeast Saccharomyces cerevisiae

    PubMed Central

    de Billerbeck, Gustavo M.; Zhang, Jin-jing; Rosenzweig, Frank

    2017-01-01

    Homology searches indicate that Saccharomyces cerevisiae strain BY4741 contains seven redundant genes that encode putative aryl-alcohol dehydrogenases (AAD). Yeast AAD genes are located in subtelomeric regions of different chromosomes, and their functional role(s) remain enigmatic. Here, we show that two of these genes, AAD4 and AAD14, encode functional enzymes that reduce aliphatic and aryl-aldehydes concomitant with the oxidation of cofactor NADPH, and that Aad4p and Aad14p exhibit different substrate preference patterns. Other yeast AAD genes are undergoing pseudogenization. The 5′ sequence of AAD15 has been deleted from the genome. Repair of an AAD3 missense mutation at the catalytically essential Tyr73 residue did not result in a functional enzyme. However, ancestral-state reconstruction by fusing Aad6 with Aad16 and by N-terminal repair of Aad10 restores NADPH-dependent aryl-alcohol dehydrogenase activities. Phylogenetic analysis indicates that AAD genes are narrowly distributed in wood-saprophyte fungi and in yeast that occupy lignocellulosic niches. Because yeast AAD genes exhibit activity on veratraldehyde, cinnamaldehyde, and vanillin, they could serve to detoxify aryl-aldehydes released during lignin degradation. However, none of these compounds induce yeast AAD gene expression, and Aad activities do not relieve aryl-aldehyde growth inhibition. Our data suggest an ancestral role for AAD genes in lignin degradation that is degenerating as a result of yeast's domestication and use in brewing, baking, and other industrial applications. IMPORTANCE Functional characterization of hypothetical genes remains one of the chief tasks of the postgenomic era. Although the first Saccharomyces cerevisiae genome sequence was published over 20 years ago, 22% of its estimated 6,603 open reading frames (ORFs) remain unverified. One outstanding example of this category of genes is the enigmatic seven-member AAD family. Here, we demonstrate that proteins encoded by two members of this family exhibit aliphatic and aryl-aldehyde reductase activity, and further that such activity can be recovered from pseudogenized AAD genes via ancestral-state reconstruction. The phylogeny of yeast AAD genes suggests that these proteins may have played an important ancestral role in detoxifying aromatic aldehydes in ligninolytic fungi. However, in yeast adapted to niches rich in sugars, AAD genes become subject to mutational erosion. Our findings shed new light on the selective pressures and molecular mechanisms by which genes undergo pseudogenization. PMID:29079624

  11. Identification and characterization of toll-like receptors (TLRs) in the Chinese tree shrew (Tupaia belangeri chinensis).

    PubMed

    Yu, Dandan; Wu, Yong; Xu, Ling; Fan, Yu; Peng, Li; Xu, Min; Yao, Yong-Gang

    2016-07-01

    In mammals, the toll-like receptors (TLRs) play a major role in initiating innate immune responses against pathogens. Comparison of the TLRs in different mammals may help in understanding TLR-mediated responses and in developing animal models and efficient therapeutic measures for infectious diseases. The Chinese tree shrew (Tupaia belangeri chinensis), a small mammal with a close relationship to primates, is a viable experimental animal for studying viral and bacterial infections. In this study, we characterized the TLR genes (tTLRs) in the Chinese tree shrew and identified 13 putative TLRs, which are orthologs of mammalian TLR1-TLR9 and TLR11-TLR13, and TLR10 was a pseudogene in the tree shrew. Positive selection analyses using the maximum likelihood (ML) method showed that tTLR8 and tTLR9 were under positive selection, which might be associated with adaptation to pathogen challenge. The mRNA expression levels of tTLRs presented an overall low and tissue-specific pattern, and were significantly upregulated upon hepatitis C virus (HCV) infection. tTLR4 and tTLR9 underwent alternative splicing, which leads to different transcripts. Phylogenetic analysis and TLR structure prediction indicated that tTLRs were evolutionarily conserved, which might reflect an ancient mechanism and structure in the innate immune response system. Taken together, TLRs had both conserved and unique features in the Chinese tree shrew. Copyright © 2016 Elsevier Ltd. All rights reserved.

  12. Complete Chloroplast Genome of the Wollemi Pine (Wollemia nobilis): Structure and Evolution

    PubMed Central

    Yap, Jia-Yee S.; Rohner, Thore; Greenfield, Abigail; Van Der Merwe, Marlien; McPherson, Hannah; Glenn, Wendy; Kornfeld, Geoff; Marendy, Elessa; Pan, Annie Y. H.; Wilkins, Marc R.; Rossetto, Maurizio; Delaney, Sven K.

    2015-01-01

    The Wollemi pine (Wollemia nobilis) is a rare Southern conifer with striking morphological similarity to fossil pines. A small population of W. nobilis was discovered in 1994 in a remote canyon system in the Wollemi National Park (near Sydney, Australia). This population contains fewer than 100 individuals and is critically endangered. Previous genetic studies of the Wollemi pine have investigated its evolutionary relationship with other pines in the family Araucariaceae, and have suggested that the Wollemi pine genome contains little or no variation. However, these studies were performed prior to the widespread use of genome sequencing, and their conclusions were based on a limited fraction of the Wollemi pine genome. In this study, we address this problem by determining the entire sequence of the W. nobilis chloroplast genome. A detailed analysis of the structure of the genome is presented, and the evolution of the genome is inferred by comparison with the chloroplast sequences of other members of the Araucariaceae and the related family Podocarpaceae. Pairwise alignments of whole genome sequences, and the presence of unique pseudogenes, gene duplications and insertions in W. nobilis and Araucariaceae, indicate that the W. nobilis chloroplast genome is most similar to that of its sister taxon Agathis. However, the W. nobilis genome contains an unusually high number of repetitive sequences, and these could be used in future studies to investigate and conserve any remnant genetic diversity in the Wollemi pine. PMID:26061691

  13. First draft genome sequencing of indole acetic acid producing and plant growth promoting fungus Preussia sp. BSL10.

    PubMed

    Khan, Abdul Latif; Asaf, Sajjad; Khan, Abdur Rahim; Al-Harrasi, Ahmed; Al-Rawahi, Ahmed; Lee, In-Jung

    2016-05-10

    Preussia sp. BSL10, family Sporormiaceae, was actively producing phytohormone (indole-3-acetic acid) and extra-cellular enzymes (phosphatases and glucosidases). The fungus was also promoting the growth of the arid-land tree Boswellia sacra. Given these prospects, we sequenced its draft genome for the first time. The Illumina-based sequence analysis reveals an approximate genome size of 31.4 Mbp for Preussia sp. BSL10. Based on ab initio gene prediction, a total of 32,312 coding sequences were annotated, consisting of 11,967 coding genes, pseudogenes, and 221 tRNA genes. Furthermore, 321 carbohydrate-active enzymes were predicted and classified into many functional families. Copyright © 2016 Elsevier B.V. All rights reserved.

  14. Polyamines are essential for virulence in Salmonella enterica serovar Gallinarum despite evolutionary decay of polyamine biosynthesis genes.

    PubMed

    Schroll, Casper; Christensen, Jens P; Christensen, Henrik; Pors, Susanne E; Thorndahl, Lotte; Jensen, Peter R; Olsen, John E; Jelsbak, Lotte

    2014-05-14

    Serovars of Salmonella enterica exhibit different host-specificities where some have broad host-ranges and others, like S. Gallinarum and S. Typhi, are host-specific for poultry and humans, respectively. With the recent availability of whole genome sequences it has been reported that host-specificity coincides with accumulation of pseudogenes, indicating adaptation of host-restricted serovars to their narrow niches. Polyamines are small cationic amines and in Salmonella they can be synthesized through two alternative pathways directly from l-ornithine to putrescine and from l-arginine via agmatine to putrescine. The first pathway is not active in S. Gallinarum and S. Typhi, and this prompted us to investigate the importance of polyamines for virulence in S. Gallinarum. Bioinformatic analysis of all sequenced genomes of Salmonella revealed that pseudogene formation of the speC gene was exclusive for S. Typhi and S. Gallinarum and happened through independent events. The remaining polyamine biosynthesis pathway was found to be essential for oral infection with S. Gallinarum since single and double mutants in speB and speE, encoding the pathways from agmatine to putrescine and from putrescine to spermidine, were attenuated. In contrast, speB was dispensable after intraperitoneal challenge, suggesting that putrescine was less important for the systemic phase of the disease. In support of this hypothesis, a ΔspeE;ΔpotCD mutant, unable to synthesize and import spermidine, but with retained ability to import and synthesize putrescine, was attenuated after intraperitoneal infection. We therefore conclude that polyamines are essential for virulence of S. Gallinarum. Furthermore, our results point to distinct roles for putrescine and spermidine during systemic infection. Copyright © 2014 Elsevier B.V. All rights reserved.

  15. Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs.

    PubMed

    Powell, Bradford C; Hutchison, Clyde A

    2006-01-19

    Experimental verification of gene products has not kept pace with the rapid growth of microbial sequence information. However, existing annotations of gene locations contain sufficient information to screen for probable errors. Furthermore, comparisons among genomes become more informative as more genomes are examined. We studied all open reading frames (ORFs) of at least 30 codons from the genomes of 27 sequenced bacterial strains. We grouped the potential peptide sequences encoded from the ORFs by forming Clusters of Orthologous Groups (COGs). We used this grouping in order to find homologous relationships that would not be distinguishable from noise when using simple BLAST searches. Although COG analysis was initially developed to group annotated genes, we applied it to the task of grouping anonymous DNA sequences that may encode proteins. "Mixed COGs" of ORFs (clusters in which some sequences correspond to annotated genes and some do not) are attractive targets when seeking errors of gene prediction. Examination of mixed COGs reveals some situations in which genes appear to have been missed in current annotations and a smaller number of regions that appear to have been annotated as gene loci erroneously. This technique can also be used to detect potential pseudogenes or sequencing errors. Our method uses an adjustable parameter for degree of conservation among the studied genomes (stringency). We detail results for one level of stringency at which we found 83 potential genes which had not previously been identified, 60 potential pseudogenes, and 7 sequences with existing gene annotations that are probably incorrect. Systematic study of sequence conservation offers a way to improve existing annotations by identifying potentially homologous regions where the annotation of the presence or absence of a gene is inconsistent among genomes.

  16. Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs

    PubMed Central

    Powell, Bradford C; Hutchison, Clyde A

    2006-01-01

    Background Experimental verification of gene products has not kept pace with the rapid growth of microbial sequence information. However, existing annotations of gene locations contain sufficient information to screen for probable errors. Furthermore, comparisons among genomes become more informative as more genomes are examined. We studied all open reading frames (ORFs) of at least 30 codons from the genomes of 27 sequenced bacterial strains. We grouped the potential peptide sequences encoded from the ORFs by forming Clusters of Orthologous Groups (COGs). We used this grouping in order to find homologous relationships that would not be distinguishable from noise when using simple BLAST searches. Although COG analysis was initially developed to group annotated genes, we applied it to the task of grouping anonymous DNA sequences that may encode proteins. Results "Mixed COGs" of ORFs (clusters in which some sequences correspond to annotated genes and some do not) are attractive targets when seeking errors of gene prediction. Examination of mixed COGs reveals some situations in which genes appear to have been missed in current annotations and a smaller number of regions that appear to have been annotated as gene loci erroneously. This technique can also be used to detect potential pseudogenes or sequencing errors. Our method uses an adjustable parameter for degree of conservation among the studied genomes (stringency). We detail results for one level of stringency at which we found 83 potential genes which had not previously been identified, 60 potential pseudogenes, and 7 sequences with existing gene annotations that are probably incorrect. Conclusion Systematic study of sequence conservation offers a way to improve existing annotations by identifying potentially homologous regions where the annotation of the presence or absence of a gene is inconsistent among genomes. PMID:16423288

  17. Mitochondrial heteroplasmy and DNA barcoding in Hawaiian Hylaeus (Nesoprosopis) bees (Hymenoptera: Colletidae).

    PubMed

    Magnacca, Karl N; Brown, Mark J F

    2010-06-11

    The past several years have seen a flurry of papers seeking to clarify the utility and limits of DNA barcoding, particularly in areas such as species discovery and paralogy due to nuclear pseudogenes. Heteroplasmy, the coexistence of multiple mitochondrial haplotypes in a single organism, has been cited as a potentially serious problem for DNA barcoding but its effect on identification accuracy has not been tested. In addition, few studies of barcoding have tested a large group of closely-related species with a well-established morphological taxonomy. In this study we examine both of these issues, by densely sampling the Hawaiian Hylaeus bee radiation. Individuals from 21 of the 49 a priori morphologically-defined species exhibited coding sequence heteroplasmy at levels of 1-6% or more. All homoplasmic species were successfully identified by COI using standard methods of analysis, but only 71% of heteroplasmic species. The success rate in identifying heteroplasmic species was increased to 86% by treating polymorphisms as character states rather than ambiguities. Nuclear pseudogenes (numts) were also present in four species, and were distinguishable from heteroplasmic sequences by patterns of nucleotide and amino acid change. Heteroplasmy significantly decreased the reliability of species identification. In addition, the practical issue of dealing with large numbers of polymorphisms- and resulting increased time and labor required - makes the development of DNA barcode databases considerably more complex than has previously been suggested. The impact of heteroplasmy on the utility of DNA barcoding as a bulk specimen identification tool will depend upon its frequency across populations, which remains unknown. However, DNA barcoding is still likely to remain an important identification tool for those species that are difficult or impossible to identify through morphology, as is the case for the ecologically important solitary bee fauna.

  18. Mitochondrial heteroplasmy and DNA barcoding in Hawaiian Hylaeus (Nesoprosopis) bees (Hymenoptera: Colletidae)

    PubMed Central

    Magnacca, Karl N; Brown, Mark J F

    2010-01-01

    Background The past several years have seen a flurry of papers seeking to clarify the utility and limits of DNA barcoding, particularly in areas such as species discovery and paralogy due to nuclear pseudogenes. Heteroplasmy, the coexistence of multiple mitochondrial haplotypes in a single organism, has been cited as a potentially serious problem for DNA barcoding but its effect on identification accuracy has not been tested. In addition, few studies of barcoding have tested a large group of closely-related species with a well-established morphological taxonomy. In this study we examine both of these issues, by densely sampling the Hawaiian Hylaeus bee radiation. Results Individuals from 21 of the 49 a priori morphologically-defined species exhibited coding sequence heteroplasmy at levels of 1-6% or more. All homoplasmic species were successfully identified by COI using standard methods of analysis, but only 71% of heteroplasmic species. The success rate in identifying heteroplasmic species was increased to 86% by treating polymorphisms as character states rather than ambiguities. Nuclear pseudogenes (numts) were also present in four species, and were distinguishable from heteroplasmic sequences by patterns of nucleotide and amino acid change. Conclusions Heteroplasmy significantly decreased the reliability of species identification. In addition, the practical issue of dealing with large numbers of polymorphisms- and resulting increased time and labor required - makes the development of DNA barcode databases considerably more complex than has previously been suggested. The impact of heteroplasmy on the utility of DNA barcoding as a bulk specimen identification tool will depend upon its frequency across populations, which remains unknown. However, DNA barcoding is still likely to remain an important identification tool for those species that are difficult or impossible to identify through morphology, as is the case for the ecologically important solitary bee fauna. PMID:20540728

  19. The olfactory receptor gene repertoires in secondary-adapted marine vertebrates: evidence for reduction of the functional proportions in cetaceans.

    PubMed

    Kishida, Takushi; Kubota, Shin; Shirayama, Yoshihisa; Fukami, Hironobu

    2007-08-22

    An olfactory receptor (OR) multigene family is responsible for the well-developed sense of smell possessed by terrestrial tetrapods. Mammalian OR genes diverged greatly in the terrestrial environment after the fish-tetrapod split, indicating their importance to land habitation. In this study, we analysed OR genes of marine tetrapods (minke whale Balaenoptera acutorostrata, dwarf sperm whale Kogia sima, Dall's porpoise Phocoenoides dalli, Steller's sea lion Eumetopias jubatus and loggerhead sea turtle Caretta caretta) and found that the pseudogene proportions of OR gene repertoires in whales were significantly higher than those in their terrestrial relative (cattle) and also than those in sea lion and sea turtle. On the other hand, the pseudogene proportion of OR sequences in sea lion was not significantly higher than that in its terrestrial relative (dog). This indicates that fully secondarily adapted marine vertebrates (cetaceans) have lost a large number of their OR genes, whereas semi-adapted marine vertebrates (sea lions and sea turtles) have maintained their OR genes, reflecting the importance of the terrestrial environment for these animals.

  20. PTESFinder: a computational method to identify post-transcriptional exon shuffling (PTES) events.

    PubMed

    Izuogu, Osagie G; Alhasan, Abd A; Alafghani, Hani M; Santibanez-Koref, Mauro; Elliott, David J; Jackson, Michael S

    2016-01-13

    Transcripts that have been subject to post-transcriptional exon shuffling (PTES) have an exon order inconsistent with the underlying genomic sequence. These have been identified in a wide variety of tissues and cell types from many eukaryotes, and are now known to be mostly circular, cytoplasmic, and non-coding. Although there is no uniformly ascribed function, several have been shown to be involved in gene regulation. Accurate identification of these transcripts can, however, be difficult due to artefacts from a wide variety of sources. Here, we present a computational method, PTESFinder, to identify these transcripts from high-throughput RNAseq data. Uniquely, it systematically excludes potential artefacts emanating from pseudogenes, segmental duplications, and template switching, and outputs both PTES and canonical exon junction counts to facilitate comparative analyses. In comparison with four existing methods, PTESFinder achieves the highest specificity and comparable sensitivity at a variety of read depths. PTESFinder also identifies between 13% and 41.6% more structures than publicly available methods recently used to identify human circular RNAs. With high sensitivity and specificity, user-adjustable filters that target known sources of false positives, and tailored output to facilitate comparison of transcript levels, PTESFinder will facilitate the discovery and analysis of these poorly understood transcripts.

  1. Xenobiotic-metabolizing enzymes in Bacillus anthracis: molecular and functional analysis of a truncated arylamine N-acetyltransferase isozyme.

    PubMed

    Kubiak, Xavier; Duval, Romain; Pluvinage, Benjamin; Chaffotte, Alain F; Dupret, Jean-Marie; Rodrigues-Lima, Fernando

    2017-07-01

    The arylamine N-acetyltransferases (NATs) are xenobiotic-metabolizing enzymes that play an important role in the detoxification and/or bioactivation of arylamine drugs and xenobiotics. In bacteria, NATs may contribute to the resistance against antibiotics such as isoniazid or sulfamides through their acetylation, which makes this enzyme family a possible drug target. Bacillus anthracis, a bacterial species of clinical significance, expresses three NAT isozymes with distinct structural and enzymatic properties, including an inactive isozyme ((BACAN)NAT3). (BACAN)NAT3 features both a non-canonical Glu residue in its catalytic triad and a truncated C-terminus domain. However, the role these unusual characteristics play in the lack of activity of the (BACAN)NAT3 isozyme remains unclear. Protein engineering, recombinant expression, enzymatic analyses with aromatic amine substrates and phylogenetic analysis approaches were conducted. The deletion of guanine 580 (G580) in the nat3 gene was shown to be responsible for the expression of a truncated (BACAN)NAT3 isozyme. Artificial re-introduction of G580 in the nat3 gene led to a functional enzyme able to acetylate several arylamine drugs displaying structural characteristics comparable with its functional Bacillus cereus homologue ((BACCR)NAT3). Phylogenetic analysis of the nat3 gene in the B. cereus group further indicated that nat3 may constitute a pseudogene of the B. anthracis species. The existence of NATs with distinct properties and evolution in Bacillus species may account for their adaptation to their diverse chemical environments. A better understanding of these isozymes is of importance for their possible use as drug targets. This article is part of a themed section on Drug Metabolism and Antibiotic Resistance in Micro-organisms. To view the other articles in this section visit http://onlinelibrary.wiley.com/doi/10.1111/bph.v174.14/issuetoc. © 2016 The British Pharmacological Society.

  2. Structure and expression of the attacin genes in Hyalophora cecropia.

    PubMed

    Sun, S C; Lindström, I; Lee, J Y; Faye, I

    1991-02-26

    To study the regulation of the immune genes in insects, we have cloned and sequenced the attacin gene locus of the giant silk moth Hyalophora cecropia. The locus contains one acidic and one basic attacin gene as well as two pseudogenes, which are remnants of basic attacin genes. A small insertion element was found within the locus. The two functional attacin genes are transcribed in opposite directions and have two introns inserted at homologous positions. A common sequence, GGGGATTCCT, is found at nucleotide position -48 in the acidic gene and at nucleotide position -58 in the basic gene. Interestingly, this decanucleotide is similar to the consensus of the NF-κB-binding site. Expression studies revealed that both attacins are strongly induced by phorbol 12-myristate 13-acetate, lipopolysaccharide and bacteria. However, only the acidic attacin gene showed a clear response to injury.

  3. yadBC of Yersinia pestis, a new virulence determinant for bubonic plague.

    PubMed

    Forman, Stanislav; Wulff, Christine R; Myers-Morales, Tanya; Cowan, Clarissa; Perry, Robert D; Straley, Susan C

    2008-02-01

    In all Yersinia pestis strains examined, the adhesin/invasin yadA gene is a pseudogene, yet Y. pestis is invasive for epithelial cells. To identify potential surface proteins that are structurally and functionally similar to YadA, we searched the Y. pestis genome for open reading frames with homology to yadA and found three: the bicistronic operon yadBC (YPO1387 and YPO1388 of Y. pestis CO92; y2786 and y2785 of Y. pestis KIM5), which encodes two putative surface proteins, and YPO0902, which lacks a signal sequence and likely is nonfunctional. In this study we characterized yadBC regulation and tested the importance of this operon for Y. pestis adherence, invasion, and virulence. We found that loss of yadBC caused a modest loss of invasiveness for epithelioid cells and a large decrease in virulence for bubonic plague but not for pneumonic plague in mice.

  4. The active gene that encodes human High Mobility Group 1 protein (HMG1) contains introns and maps to chromosome 13

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ferrari, S.; Finelli, P.; Rocchi, M.

    The human genome contains a large number of sequences related to the cDNA for High Mobility Group 1 protein (HMG1), which so far has hampered the cloning and mapping of the active HMG1 gene. We show that the human HMG1 gene contains introns, while the HMG1-related sequences do not and most likely are retrotransposed pseudogenes. We identified eight YACs from the ICI and CEPH libraries that contain the human HMG1 gene. The HMG1 gene is similar in structure to the previously characterized murine homologue and maps to human chromosome 13q12, as determined by in situ hybridization. The mouse Hmg1 gene maps to the telomeric region of murine Chromosome 5, which is syntenic to the human 13q12 band. 18 refs., 3 figs.

  5. Comparative genomic analysis of three Leishmania species that cause diverse human disease

    PubMed Central

    Peacock, Christopher S; Seeger, Kathy; Harris, David; Murphy, Lee; Ruiz, Jeronimo C; Quail, Michael A; Peters, Nick; Adlem, Ellen; Tivey, Adrian; Aslett, Martin; Kerhornou, Arnaud; Ivens, Alasdair; Fraser, Audrey; Rajandream, Marie-Adele; Carver, Tim; Norbertczak, Halina; Chillingworth, Tracey; Hance, Zahra; Jagels, Kay; Moule, Sharon; Ormond, Doug; Rutter, Simon; Squares, Rob; Whitehead, Sally; Rabbinowitsch, Ester; Arrowsmith, Claire; White, Brian; Thurston, Scott; Bringaud, Frédéric; Baldauf, Sandra L; Faulconbridge, Adam; Jeffares, Daniel; Depledge, Daniel P; Oyola, Samuel O; Hilley, James D; Brito, Loislene O; Tosi, Luiz R O; Barrell, Barclay; Cruz, Angela K; Mottram, Jeremy C; Smith, Deborah F; Berriman, Matthew

    2008-01-01

    Leishmania parasites cause a broad spectrum of clinical disease. Here we report the sequencing of the genomes of two species of Leishmania: Leishmania infantum and Leishmania braziliensis. The comparison of these sequences with the published genome of Leishmania major reveals marked conservation of synteny and identifies only ∼200 genes with a differential distribution between the three species. L. braziliensis, contrary to Leishmania species examined so far, possesses components of a putative RNA-mediated interference pathway, telomere-associated transposable elements and spliced leader–associated SLACS retrotransposons. We show that pseudogene formation and gene loss are the principal forces shaping the different genomes. Genes that are differentially distributed between the species encode proteins implicated in host-pathogen interactions and parasite survival in the macrophage. PMID:17572675

  6. An active second dihydrofolate reductase enzyme is not a feature of rat and mouse, but they do have activity in their mitochondria.

    PubMed

    Hughes, Linda; Carton, Robert; Minguzzi, Stefano; McEntee, Gráinne; Deinum, Eva E; O'Connell, Mary J; Parle-McDermott, Anne

    2015-07-08

    The identification of a second functional dihydrofolate reductase enzyme in humans, DHFRL1, led us to consider whether this is also a feature of rodents. We demonstrate that dihydrofolate reductase activity is also a feature of the mitochondria in both rat and mouse but this is not due to a second enzyme. While our phylogenetic analysis revealed that RNA-mediated DHFR duplication events did occur across the mammal tree, the duplicates in brown rat and mouse are likely to be processed pseudogenes. Humans have evolved the need for two separate enzymes while laboratory rats and mice have just one. Copyright © 2015 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

  7. Characterization of the human SDHD gene encoding the small subunit of cytochrome b (cybS) in mitochondrial succinate-ubiquinone oxidoreductase.

    PubMed

    Hirawake, H; Taniwaki, M; Tamura, A; Amino, H; Tomitsuka, E; Kita, K

    1999-08-04

    We have mapped large (cybL) and small (cybS) subunits of cytochrome b in the succinate-ubiquinone oxidoreductase (complex II) of human mitochondria to chromosome 1q21 and 11q23, respectively (H. Hirawake et al., Cytogenet. Cell Genet. 79 (1997) 132-138). In the present study, the human SDHD gene encoding cybS was cloned and characterized. The gene comprises four exons and three introns extending over 19 kb. Sequence analysis of the 5' promoter region showed several motifs for the binding of transcription factors including nuclear respiratory factors NRF-1 and NRF-2 at positions -137 and -104, respectively. In addition to this gene, six pseudogenes of cybS were isolated and mapped on the chromosome.

  8. Comparative genomic analysis of Genlisea (corkscrew plants—Lentibulariaceae) chloroplast genomes reveals an increasing loss of the ndh genes

    PubMed Central

    Silva, Saura R.; Michael, Todd P.; Meer, Elliott J.; Pinheiro, Daniel G.; Miranda, Vitor F. O.

    2018-01-01

    In the carnivorous plant family Lentibulariaceae, all three genome compartments (nuclear, chloroplast, and mitochondria) have some of the highest rates of nucleotide substitutions across angiosperms. While the genera Genlisea and Utricularia have the smallest known flowering plant nuclear genomes, the chloroplast genomes (cpDNA) are mostly structurally conserved except for deletion and/or pseudogenization of the NAD(P)H-dehydrogenase complex (ndh) genes known to be involved in stress conditions of low light or CO2 concentrations. In order to determine how the cpDNA are changing, and to better understand the evolutionary history within the Genlisea genus, we sequenced, assembled and analyzed complete cpDNA from six species (G. aurea, G. filiformis, G. pygmaea, G. repens, G. tuberosa and G. violacea) together with the publicly available G. margaretae cpDNA. In general, the cpDNA structure among the analyzed Genlisea species is highly similar. However, we found that the plastidial ndh genes underwent a progressive process of degradation similar to the other terrestrial Lentibulariaceae cpDNA analyzed to date, but in contrast to the aquatic species. Contrary to current thinking that the terrestrial environment is a more stressful environment and thus requiring the ndh genes, we provide evidence that in the Lentibulariaceae the terrestrial forms have progressive loss while the aquatic forms have the eleven plastidial ndh genes intact. Therefore, the Lentibulariaceae system provides an important opportunity to understand the evolutionary forces that govern the transition to an aquatic environment and may provide insight into how plants manage water stress at a genome scale. PMID:29293597

  9. Comparative genomic analysis of Genlisea (corkscrew plants-Lentibulariaceae) chloroplast genomes reveals an increasing loss of the ndh genes.

    PubMed

    Silva, Saura R; Michael, Todd P; Meer, Elliott J; Pinheiro, Daniel G; Varani, Alessandro M; Miranda, Vitor F O

    2018-01-01

    In the carnivorous plant family Lentibulariaceae, all three genome compartments (nuclear, chloroplast, and mitochondria) have some of the highest rates of nucleotide substitutions across angiosperms. While the genera Genlisea and Utricularia have the smallest known flowering plant nuclear genomes, the chloroplast genomes (cpDNA) are mostly structurally conserved except for deletion and/or pseudogenization of the NAD(P)H-dehydrogenase complex (ndh) genes known to be involved in stress conditions of low light or CO2 concentrations. In order to determine how the cpDNA are changing, and to better understand the evolutionary history within the Genlisea genus, we sequenced, assembled and analyzed complete cpDNA from six species (G. aurea, G. filiformis, G. pygmaea, G. repens, G. tuberosa and G. violacea) together with the publicly available G. margaretae cpDNA. In general, the cpDNA structure among the analyzed Genlisea species is highly similar. However, we found that the plastidial ndh genes underwent a progressive process of degradation similar to the other terrestrial Lentibulariaceae cpDNA analyzed to date, but in contrast to the aquatic species. Contrary to current thinking that the terrestrial environment is a more stressful environment and thus requiring the ndh genes, we provide evidence that in the Lentibulariaceae the terrestrial forms have progressive loss while the aquatic forms have the eleven plastidial ndh genes intact. Therefore, the Lentibulariaceae system provides an important opportunity to understand the evolutionary forces that govern the transition to an aquatic environment and may provide insight into how plants manage water stress at a genome scale.

  10. Horizontal Gene Acquisitions, Mobile Element Proliferation, and Genome Decay in the Host-Restricted Plant Pathogen Erwinia Tracheiphila

    PubMed Central

    Shapiro, Lori R.; Scully, Erin D.; Straub, Timothy J.; Park, Jihye; Stephenson, Andrew G.; Beattie, Gwyn A.; Gleason, Mark L.; Kolter, Roberto; Coelho, Miguel C.; De Moraes, Consuelo M.; Mescher, Mark C.; Zhaxybayeva, Olga

    2016-01-01

    Modern industrial agriculture depends on high-density cultivation of genetically similar crop plants, creating favorable conditions for the emergence of novel pathogens with increased fitness in managed compared with ecologically intact settings. Here, we present the genome sequence of six strains of the cucurbit bacterial wilt pathogen Erwinia tracheiphila (Enterobacteriaceae) isolated from infected squash plants in New York, Pennsylvania, Kentucky, and Michigan. These genomes exhibit a high proportion of recent horizontal gene acquisitions, invasion and remarkable amplification of mobile genetic elements, and pseudogenization of approximately 20% of the coding sequences. These genome attributes indicate that E. tracheiphila recently emerged as a host-restricted pathogen. Furthermore, chromosomal rearrangements associated with phage and transposable element proliferation contribute to substantial differences in gene content and genetic architecture between the six E. tracheiphila strains and other Erwinia species. Together, these data lead us to hypothesize that E. tracheiphila has undergone recent evolution through both genome decay (pseudogenization) and genome expansion (horizontal gene transfer and mobile element amplification). Despite evidence of dramatic genomic changes, the six strains are genetically monomorphic, suggesting a recent population bottleneck and emergence into E. tracheiphila’s current ecological niche. PMID:26992913

  11. Viral unmasking of cellular 5S rRNA pseudogene transcripts induces RIG-I-mediated immunity.

    PubMed

    Chiang, Jessica J; Sparrer, Konstantin M J; van Gent, Michiel; Lässig, Charlotte; Huang, Teng; Osterrieder, Nikolaus; Hopfner, Karl-Peter; Gack, Michaela U

    2018-01-01

    The sensor RIG-I detects double-stranded RNA derived from RNA viruses. Although RIG-I is also known to have a role in the antiviral response to DNA viruses, physiological RNA species recognized by RIG-I during infection with a DNA virus are largely unknown. Using next-generation RNA sequencing (RNAseq), we found that host-derived RNAs, most prominently 5S ribosomal RNA pseudogene 141 (RNA5SP141), bound to RIG-I during infection with herpes simplex virus 1 (HSV-1). Infection with HSV-1 induced relocalization of RNA5SP141 from the nucleus to the cytoplasm, and virus-induced shutoff of host protein synthesis downregulated the abundance of RNA5SP141-interacting proteins, which allowed RNA5SP141 to bind RIG-I and induce the expression of type I interferons. Silencing of RNA5SP141 strongly dampened the antiviral response to HSV-1 and the related virus Epstein-Barr virus (EBV), as well as influenza A virus (IAV). Our findings reveal that antiviral immunity can be triggered by host RNAs that are unshielded following depletion of their respective binding proteins by the virus.

  12. Genome-wide admixture and association study of subclinical atherosclerosis in the Women’s Interagency HIV Study (WIHS)

    PubMed Central

    Shendre, Aditi; Wiener, Howard W.; Irvin, Marguerite R.; Aouizerat, Bradley E.; Overton, Edgar T.; Lazar, Jason; Liu, Chenglong; Hodis, Howard N.; Limdi, Nita A.; Weber, Kathleen M.; Zhi, Degui; Floris-Moore, Michelle A.; Ofotokun, Ighovwerha; Qi, Qibin; Hanna, David B.; Kaplan, Robert C.

    2017-01-01

    Cardiovascular disease (CVD) is a major comorbidity among HIV-infected individuals. Common carotid artery intima-media thickness (cCIMT) is a valid and reliable subclinical measure of atherosclerosis and is known to predict CVD. We performed genome-wide association (GWA) and admixture analysis among 682 HIV-positive and 288 HIV-negative Black, non-Hispanic women from the Women’s Interagency HIV study (WIHS) cohort using a combined and stratified analysis approach. We found some suggestive associations but none of the SNPs reached genome-wide statistical significance in our GWAS analysis. The top GWAS SNPs were rs2280828 in the region intergenic to mediator complex subunit 30 and exostosin glycosyltransferase 1 (MED30 | EXT1) among all women, rs2907092 in the catenin delta 2 (CTNND2) gene among HIV-positive women, and rs7529733 in the region intergenic to family with sequence similarity 5, member C and regulator of G-protein signaling 18 (FAM5C | RGS18) genes among HIV-negative women. The most significant local European ancestry associations were in the region intergenic to the zinc finger and SCAN domain containing 5D gene and NADH: ubiquinone oxidoreductase complex assembly factor 1 (ZSCAN5D | NDUF1) pseudogene on chromosome 19 among all women, in the region intergenic to vomeronasal 1 receptor 6 pseudogene and zinc finger protein 845 (VN1R6P | ZNF845) gene on chromosome 19 among HIV-positive women, and in the region intergenic to the SEC23-interacting protein and phosphatidic acid phosphatase type 2 domain containing 1A (SEC23IP | PPAPDC1A) genes located on chromosome 10 among HIV-negative women. A number of previously identified SNP associations with cCIMT were also observed and included rs2572204 in the ryanodine receptor 3 (RYR3) and an admixture region in the secretion-regulating guanine nucleotide exchange factor (SERGEF) gene. 
    We report several SNPs and gene regions in the GWAS and admixture analysis, some of which are common across HIV-positive and HIV-negative women as demonstrated using meta-analysis, and also across the two analytic approaches (i.e., GWA and admixture). These findings suggest that local European ancestry plays an important role in genetic associations of cCIMT among black women from WIHS along with other environmental factors that are related to CVD and may also be triggered by HIV. These findings warrant confirmation in independent samples. PMID:29206233

  13. Molecular organization and phylogenetic analysis of 5S rDNA in crustaceans of the genus Pollicipes reveal birth-and-death evolution and strong purifying selection.

    PubMed

    Perina, Alejandra; Seoane, David; González-Tizón, Ana M; Rodríguez-Fariña, Fernanda; Martínez-Lage, Andrés

    2011-10-17

    In higher eukaryotes, the 5S ribosomal DNA (5S rDNA) is organized in tandem arrays with repeat units that consist of a transcribed region (5S) and a variable nontranscribed spacer (NTS). Until recently the 5S rDNA was thought to be subject to concerted evolution; however, in several taxa, sequence divergence levels between the 5S and the NTS were found to be higher than expected under this model. Accordingly, many studies have shown that birth-and-death processes and selection can drive the evolution of 5S rDNA. Analyses of 5S rDNA evolution have found several 5S rDNA types in the genome, with low levels of nucleotide variation in the 5S and a highly divergent spacer region. Molecular organization and nucleotide sequence of the 5S ribosomal DNA multigene family (5S rDNA) were investigated in three Pollicipes species in an evolutionary context. The nucleotide sequence variation revealed that several 5S rDNA variants occur in Pollicipes genomes. They are clustered in up to seven different types based on differences in their nontranscribed spacers (NTS). Five different units of 5S rDNA were characterized in P. pollicipes and two different units in P. elegans and P. polymerus. Analysis of these sequences showed that identical types were shared among species and that two pseudogenes were present. We predicted the secondary structure and characterized the upstream and downstream conserved elements. Phylogenetic analysis showed an among-species clustering pattern of 5S rDNA types. These results suggest that the evolution of Pollicipes 5S rDNA is driven by birth-and-death processes with strong purifying selection.

  14. Molecular organization and phylogenetic analysis of 5S rDNA in crustaceans of the genus Pollicipes reveal birth-and-death evolution and strong purifying selection

    PubMed Central

    2011-01-01

    Background: In higher eukaryotes, the 5S ribosomal DNA (5S rDNA) is organized in tandem arrays with repeat units that consist of a transcribed region (5S) and a variable nontranscribed spacer (NTS). Until recently the 5S rDNA was thought to be subject to concerted evolution; however, in several taxa, sequence divergence levels between the 5S and the NTS were found to be higher than expected under this model. Accordingly, many studies have shown that birth-and-death processes and selection can drive the evolution of 5S rDNA. Analyses of 5S rDNA evolution have found several 5S rDNA types in the genome, with low levels of nucleotide variation in the 5S and a highly divergent spacer region. Molecular organization and nucleotide sequence of the 5S ribosomal DNA multigene family (5S rDNA) were investigated in three Pollicipes species in an evolutionary context. Results: The nucleotide sequence variation revealed that several 5S rDNA variants occur in Pollicipes genomes. They are clustered in up to seven different types based on differences in their nontranscribed spacers (NTS). Five different units of 5S rDNA were characterized in P. pollicipes and two different units in P. elegans and P. polymerus. Analysis of these sequences showed that identical types were shared among species and that two pseudogenes were present. We predicted the secondary structure and characterized the upstream and downstream conserved elements. Phylogenetic analysis showed an among-species clustering pattern of 5S rDNA types. Conclusions: These results suggest that the evolution of Pollicipes 5S rDNA is driven by birth-and-death processes with strong purifying selection. PMID:22004418

  15. Evolution of plant virus movement proteins from the 30K superfamily and of their homologs integrated in plant genomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mushegian, Arcady R.; Elena, Santiago F.; The Santa Fe Institute, Santa Fe, NM 87501

    Homologs of Tobacco mosaic virus 30K cell-to-cell movement protein are encoded by diverse plant viruses. Mechanisms of action and evolutionary origins of these proteins remain obscure. We expand the picture of conservation and evolution of the 30K proteins, producing sequence alignment of the 30K superfamily with the broadest phylogenetic coverage thus far and illuminating structural features of the core all-beta fold of these proteins. Integrated copies of pararetrovirus 30K movement genes are prevalent in euphyllophytes, with at least one copy intact in nearly every examined species, and mRNAs detected for most of them. Sequence analysis suggests repeated integrations, pseudogenizations, and positive selection in those provirus genes. An unannotated 30K-superfamily gene in Arabidopsis thaliana genome is likely expressed as a fusion with the At1g37113 transcript. This molecular background of endopararetrovirus gene products in plants may change our view of virus infection and pathogenesis, and perhaps of cellular homeostasis in the hosts. Highlights: • Sequence region shared by plant virus “30K” movement proteins has an all-beta fold. • Most euphyllophyte genomes contain integrated copies of pararetroviruses. • These integrated virus genomes often include intact movement protein genes. • Molecular evidence suggests that these “30K” genes may be selected for function.

  16. Population Genomics of Infectious and Integrated Wolbachia pipientis Genomes in Drosophila ananassae

    PubMed Central

    Choi, Jae Young; Bubnell, Jaclyn E.; Aquadro, Charles F.

    2015-01-01

    Coevolution between Drosophila and its endosymbiont Wolbachia pipientis has many intriguing aspects. For example, Drosophila ananassae hosts two forms of W. pipientis genomes: One being the infectious bacterial genome and the other integrated into the host nuclear genome. Here, we characterize the infectious and integrated genomes of W. pipientis infecting D. ananassae (wAna), by genome sequencing 15 strains of D. ananassae that have either the infectious or integrated wAna genomes. Results indicate evolutionarily stable maternal transmission for the infectious wAna genome suggesting a relatively long-term coevolution with its host. In contrast, the integrated wAna genome showed pseudogene-like characteristics accumulating many variants that are predicted to have deleterious effects if present in an infectious bacterial genome. Phylogenomic analysis of sequence variation together with genotyping by polymerase chain reaction of large structural variations indicated several wAna variants among the eight infectious wAna genomes. In contrast, only a single wAna variant was found among the seven integrated wAna genomes examined in lines from Africa, south Asia, and south Pacific islands suggesting that the integration occurred once from a single infectious wAna genome and then spread geographically. Further analysis revealed that for all D. ananassae we examined with the integrated wAna genomes, the majority of the integrated wAna genomic regions is represented in at least two copies suggesting a double integration or single integration followed by an integrated genome duplication. The possible evolutionary mechanism underlying the widespread geographical presence of the duplicate integration of the wAna genome is an intriguing question remaining to be answered. PMID:26254486

  17. Sequence Polymorphisms and Structural Variations among Four Grapevine (Vitis vinifera L.) Cultivars Representing Sardinian Agriculture

    PubMed Central

    Mercenaro, Luca; Nieddu, Giovanni; Porceddu, Andrea; Pezzotti, Mario; Camiolo, Salvatore

    2017-01-01

    The genetic diversity among grapevine (Vitis vinifera L.) cultivars that underlies differences in agronomic performance and wine quality reflects the accumulation of single nucleotide polymorphisms (SNPs) and small indels as well as larger genomic variations. A combination of high throughput sequencing and mapping against the grapevine reference genome allows the creation of comprehensive sequence variation maps. We used next generation sequencing and bioinformatics to generate an inventory of SNPs and small indels in four widely cultivated Sardinian grape cultivars (Bovale sardo, Cannonau, Carignano and Vermentino). More than 3,200,000 SNPs were identified with high statistical confidence. Some of the SNPs caused the appearance of premature stop codons and thus identified putative pseudogenes. The analysis of SNP distribution along chromosomes led to the identification of large genomic regions with uninterrupted series of homozygous SNPs. We used a digital comparative genomic hybridization approach to identify 6526 genomic regions with significant differences in copy number among the four cultivars compared to the reference sequence, including 81 regions shared between all four cultivars and 4953 specific to single cultivars (representing 1.2 and 75.9% of total copy number variation, respectively). Reads mapping at a distance that was not compatible with the insert size were used to identify a dataset of putative large deletions with cultivar Cannonau revealing the highest number. The analysis of genes mapping to these regions provided a list of candidates that may explain some of the phenotypic differences among the Bovale sardo, Cannonau, Carignano and Vermentino cultivars. PMID:28775732
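
    The copy-number percentages quoted in this abstract can be reproduced from the region counts it gives. The following is only a minimal arithmetic sketch; the variable and function names are illustrative and are not taken from the study.

```python
# Region counts as quoted in the abstract above.
total_regions = 6526      # regions with copy-number differences vs. the reference
shared_by_all = 81        # regions shared between all four cultivars
single_cultivar = 4953    # regions specific to a single cultivar


def percent_of_total(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Illustrative names; the abstract reports these shares as 1.2% and 75.9%.
shared_pct = percent_of_total(shared_by_all, total_regions)
specific_pct = percent_of_total(single_cultivar, total_regions)
print(f"shared by all four: {shared_pct}%  single-cultivar: {specific_pct}%")
```

    Both values agree with the 1.2% and 75.9% reported in the abstract.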

  18. Draft Genome Sequence of Bioactive-Compound-Producing Cyanobacterium Tolypothrix campylonemoides Strain VB511288

    PubMed Central

    Das, Subhadeep; Singh, Deeksha; Madduluri, Madhavi; Chandrababunaidu, Mathu Malar; Gupta, Akash

    2015-01-01

    We report here the draft genome sequence of Tolypothrix campylonemoides VB511288, isolated from building facades in Santiniketan, India. The members of this genus produce several compounds of commercial importance. The draft assembly is 10,627,177 bases in 135 scaffolds, and it contains 7,886 protein-coding genes, 994 pseudogenes, 18 rRNA genes, and 76 tRNA genes. PMID:25838485

  19. A novel polymorphic cytochrome P450 formed by splicing of CYP3A7 and the pseudogene CYP3AP1.

    PubMed

    Rodriguez-Antona, Cristina; Axelson, Magnus; Otter, Charlotta; Rane, Anders; Ingelman-Sundberg, Magnus

    2005-08-05

    The cytochrome P450 3A7 (CYP3A7) is the most abundant CYP in human liver during fetal development and first months of postnatal age, playing an important role in the metabolism of endogenous hormones, drugs, differentiation factors, and potentially toxic and teratogenic substrates. Here we describe and characterize a novel enzyme, CYP3A7.1L, encompassing the CYP3A7.1 protein with the last four carboxyl-terminal amino acids replaced by a unique sequence of 36 amino acids, generated by splicing of CYP3A7 with CYP3AP1 RNA. The corresponding CYP3A7-3AP1 mRNA had a significant expression in liver, kidney, and gastrointestinal tract, and its presence was found to be tissue-specific and dependent on the developmental stage. Heterologous expression in yeast revealed that CYP3A7.1L was a functional enzyme with a specific activity similar to that of CYP3A7.1 and, in some conditions, a different hydroxylation specificity than CYP3A7.1 using dehydroepiandrosterone as a substrate. CYP3A7.1L was found to be polymorphic due to a mutation at position -6 of the first splicing site of CYP3AP1 (CYP3A7_39256T-->A), which abrogates the pseudogene splicing. This polymorphism had pronounced interethnic differences and was in linkage disequilibrium with other functional polymorphisms described in the CYP3A locus: CYP3A7*2 and CYP3A5*1. Therefore, the resulting CYP3A haplotypes express different sets of enzymes within the population. In conclusion, a novel mechanism, consisting of the splicing of the pseudogene CYP3AP1 to CYP3A7, causes the formation of the novel CYP3A7.1L having a different tissue distribution and functional properties than the parent CYP3A7 enzyme, with possible developmental, physiological, and toxicological consequences.

  20. Evolution of Siglec-11 and Siglec-16 Genes in Hominins

    PubMed Central

    Wang, Xiaoxia; Mitra, Nivedita; Cruz, Pedro; Deng, Liwen; Varki, Nissi; Angata, Takashi; Green, Eric D.; Mullikin, Jim; Hayakawa, Toshiyuki; Varki, Ajit

    2012-01-01

    We previously reported a human-specific gene conversion of SIGLEC11 by an adjacent paralogous pseudogene (SIGLEC16P), generating a uniquely human form of the Siglec-11 protein, which is expressed in the human brain. Here, we show that Siglec-11 is expressed exclusively in microglia in all human brains studied—a finding of potential relevance to brain evolution, as microglia modulate neuronal survival, and Siglec-11 recruits SHP-1, a tyrosine phosphatase that modulates microglial biology. Following the recent finding of a functional SIGLEC16 allele in human populations, further analysis of the human SIGLEC11 and SIGLEC16/P sequences revealed an unusual series of gene conversion events between two loci. Two tandem and likely simultaneous gene conversions occurred from SIGLEC16P to SIGLEC11 with a potentially deleterious intervening short segment happening to be excluded. One of the conversion events also changed the 5′ untranslated sequence, altering predicted transcription factor binding sites. Both of the gene conversions have been dated to ∼1–1.2 Ma, after the emergence of the genus Homo, but prior to the emergence of the common ancestor of Denisovans and modern humans about 800,000 years ago, thus suggesting involvement in later stages of hominin brain evolution. In keeping with this, recombinant soluble Siglec-11 binds ligands in the human brain. We also address a second-round more recent gene conversion from SIGLEC11 to SIGLEC16, with the latter showing an allele frequency of ∼0.1–0.3 in a worldwide population study. Initial pseudogenization of SIGLEC16 was estimated to occur at least 3 Ma, which thus preceded the gene conversion of SIGLEC11 by SIGLEC16P. As gene conversion usually disrupts the converted gene, the fact that ORFs of hSIGLEC11 and hSIGLEC16 have been maintained after an unusual series of very complex gene conversion events suggests that these events may have been subject to hominin-specific selection forces. PMID:22383531

  1. Evolutionary diversification of type-2 HDAC structure, function and regulation in Nicotiana tabacum.

    PubMed

    Nicolas-Francès, Valérie; Grandperret, Vincent; Liegard, Benjamin; Jeandroz, Sylvain; Vasselon, Damien; Aimé, Sébastien; Klinguer, Agnès; Lamotte, Olivier; Julio, Emilie; de Borne, François Dorlhac; Wendehenne, David; Bourque, Stéphane

    2018-04-01

    Type-2 HDACs (HD2s) are plant-specific histone deacetylases that play diverse roles during development and in responses to biotic and abiotic stresses. In this study we characterized the six tobacco genes encoding HD2s that mainly differ by the presence or the absence of a typical zinc finger in their C-terminal part. Of particular interest, these HD2 genes exhibit a highly conserved intron/exon structure. We then further investigated the phylogenetic relationships among the HD2 gene family, and proposed a model of the genetic events that led to the organization of the HD2 family in Solanaceae. Absolute quantification of HD2 mRNAs in N. tabacum and in its precursors, N. tomentosiformis and N. sylvestris, did not reveal any pseudogenization of any of the HD2 genes, but rather specific regulation of HD2 expression in these three species. Functional complementation approaches in Arabidopsis thaliana demonstrated that the four zinc finger-containing HD2 proteins exhibit the same biological function in response to salt stress, whereas the two HD2 proteins without zinc finger have different biological function. Copyright © 2018 Elsevier B.V. All rights reserved.

  2. Draft Genome Sequence of Leuconostoc mesenteroides P45 Isolated from Pulque, a Traditional Mexican Alcoholic Fermented Beverage

    PubMed Central

    Riveros-Mckay, Fernando; Campos, Itzia; Giles-Gómez, Martha; Bolívar, Francisco

    2014-01-01

    Leuconostoc mesenteroides P45 was isolated from the traditional Mexican pulque beverage. We report its draft genome sequence, assembled in 6 contigs consisting of 1,874,188 bp and no plasmids. Genome annotation predicted a total of 1,800 genes, 1,687 coding sequences, 52 pseudogenes, 9 rRNAs, 51 tRNAs, 1 noncoding RNA, and 44 frameshifted genes. PMID:25377708
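
    The annotation counts in this abstract can be cross-checked against the stated gene total. The sketch below assumes that the 44 frameshifted genes are a subset of another category (for example, the pseudogenes) and are therefore left out of the sum; that reading is an assumption, not something the abstract states.

```python
# Feature counts as quoted in the abstract above.
annotation_counts = {
    "coding sequences": 1687,
    "pseudogenes": 52,
    "rRNAs": 9,
    "tRNAs": 51,
    "noncoding RNAs": 1,
}
reported_total_genes = 1800

# Assumption: frameshifted genes (44) overlap another category,
# so they are not added to the sum.
frameshifted_genes = 44

category_sum = sum(annotation_counts.values())
print(f"sum of categories: {category_sum}, reported total: {reported_total_genes}")
```

    Under that assumption the five categories add up to the reported total of 1,800 genes.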

  3. Draft Genome Sequence of Bioactive-Compound-Producing Cyanobacterium Tolypothrix campylonemoides Strain VB511288.

    PubMed

    Das, Subhadeep; Singh, Deeksha; Madduluri, Madhavi; Chandrababunaidu, Mathu Malar; Gupta, Akash; Adhikary, Siba Prasad; Tripathy, Sucheta

    2015-04-02

    We report here the draft genome sequence of Tolypothrix campylonemoides VB511288, isolated from building facades in Santiniketan, India. The members of this genus produce several compounds of commercial importance. The draft assembly is 10,627,177 bases in 135 scaffolds, and it contains 7,886 protein-coding genes, 994 pseudogenes, 18 rRNA genes, and 76 tRNA genes. Copyright © 2015 Das et al.

  4. Draft Genome Sequence of Pseudomonas chlororaphis ATCC 9446, a Nonpathogenic Bacterium with Bioremediation and Industrial Potential.

    PubMed

    Moreno-Avitia, Fabian; Lozano, Luis; Utrilla, Jose; Bolívar, Francisco; Escalante, Adelfo

    2017-06-08

    Pseudomonas chlororaphis strain ATCC 9446 is a biocontrol-related organism. We report here its draft genome sequence assembled into 35 contigs consisting of 6,783,030 bp. Genome annotation predicted a total of 6,200 genes, 6,128 coding sequences, 81 pseudogenes, 58 tRNAs, 4 noncoding RNAs (ncRNAs), and 41 frameshifted genes. Copyright © 2017 Moreno-Avitia et al.

  5. Expression patterns of bark beetle cytochromes P450 during host colonization: Likely physiological functions and potential targets for pest management

    Treesearch

    Dezene P. W. Huber; Melissa Erickson; Christian Leutenegger; Joerg Bohlmann; Steven J. Seybold

    2007-01-01

    Cytochromes P450 family genes (P450s) are found in a diverse array of organisms ranging from bacteria to mammals to plants to arthropods. Although there are exceptions to this rule, organisms generally contain a fairly large number of P450 genes and pseudogenes in their genomes. For instance, among arthropods whose genomes are well characterized, the mosquito,

  6. Evolution of the bovine lysozyme gene family: changes in gene expression and reversion of function.

    PubMed

    Irwin, D M

    1995-09-01

    Recruitment of lysozyme to a digestive function in ruminant artiodactyls is associated with amplification of the gene. At least four of the approximately ten genes are expressed in the stomach, and several are expressed in nonstomach tissues. Characterization of additional lysozymelike sequences in the bovine genome has identified most, if not all, of the members of this gene family. There are at least six stomachlike lysozyme genes, two of which are pseudogenes. The stomach lysozyme pseudogenes show a pattern of concerted evolution similar to that of the functional stomach genes. At least four nonstomach lysozyme genes exist. The nonstomach lysozyme genes are not monophyletic. A gene encoding a tracheal lysozyme was isolated, and the stomach lysozyme of advanced ruminants was found to be more closely related to the tracheal lysozyme than to the stomach lysozyme of the camel or other nonstomach lysozyme genes of ruminants. The tracheal lysozyme shares with stomach lysozymes of advanced ruminants the deletion of amino acid 103, and several other adaptive sequence characteristics of stomach lysozymes. I suggest here that tracheal lysozyme has reverted from a functional stomach lysozyme. Tracheal lysozyme then represents a second instance of a change in lysozyme gene expression and function within ruminants.

  7. Detection of steroid 21-hydroxylase alleles using gene-specific PCR and a multiplexed ligation detection reaction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Day, D.J.; Barany, F.; Speiser, P.W.

    Steroid 21-hydroxylase deficiency is the most common cause of congenital adrenal hyperplasia, an inherited inability to synthesize cortisol that occurs in 1 in 10,000-15,000 births. Affected females are born with ambiguous genitalia, a condition that can be ameliorated by administering dexamethasone to the mother for most of gestation. Prenatal diagnosis is required for accurate treatment of affected females as well as for genetic counseling purposes. Approximately 95% of mutations causing this disorder result from recombinations between the gene encoding the 21-hydroxylase enzyme (CYP21) and a linked, highly homologous pseudogene (CYP21P). Approximately 20% of these mutations are gene deletions, and the remainder are gene conversions that transfer any of nine deleterious mutations from the CYP21P pseudogene to CYP21. We describe a methodology for genetic diagnosis of 21-hydroxylase deficiency that utilizes gene-specific PCR amplification in conjunction with thermostable DNA ligase to discriminate single nucleotide variations in a multiplexed ligation detection assay. The assay has been designed to be used with either fluorescent or radioactive detection of ligation products by electrophoresis on denaturing acrylamide gels and is readily adaptable for use in other disease systems. 30 refs., 5 figs.

  8. Proteomic analysis of rodent ribosomes revealed heterogeneity including ribosomal proteins L10-like, L22-like 1, and L39-like.

    PubMed

    Sugihara, Yoshihiko; Honda, Hiroki; Iida, Tomoharu; Morinaga, Takuma; Hino, Shingo; Okajima, Tetsuya; Matsuda, Tsukasa; Nadano, Daita

    2010-03-05

    Heterogeneity of ribosome structure, due to variations in ribosomal protein composition, has been shown to be of physiological significance in plants and yeast. Mammalian genomics have demonstrated numerous genes that are paralogous to genes encoding ribosomal proteins. Although the vast majority are considered to be pseudogenes, mRNA expression of a few paralogues, such as human ribosomal protein L39-like/L39-2, has been reported. In the present study, ribosomes from the liver, mammary gland, and testis of rodents were analyzed using a combination of two-dimensional gel electrophoresis under radical-free and highly reducing conditions, and mass spectrometry. This system allowed identification of 78 ribosomal proteins and Rack1 from a single gel. The degree of heterogeneity was far less than that reported for plant and yeast ribosomes, and was in accord with published biochemical and genetic data for mammalian ribosomes. Nevertheless, an uncharacterized paralogue of ribosomal protein L22, ribosomal protein L22-like 1, was identified as a minor ribosomal component. Ribosomal proteins L10-like and L39-like, paralogues of ribosomal proteins L10 and L39, respectively, were found in ribosomes only from the testis. Reverse transcription-polymerase chain reaction yielded supportive evidence for specific expression of L10-like and L39-like in the testis. Newly synthesized L39-like is likely to be transported to the nucleolus, where ribosome biosynthesis occurs, and then incorporated into translating ribosomes in the cytoplasm. Heterogeneity of mammalian testicular ribosomes is structurally non-negligible, and may offer valuable insights into the function of the customized ribosome.

  9. Chromosomal arrangement of leghemoglobin genes in soybean.

    PubMed Central

    Lee, J S; Brown, G G; Verma, D P

    1983-01-01

    A cluster of four different leghemoglobin (Lb) genes was isolated from AluI-HaeIII and EcoRI genomic libraries of soybean in a set of overlapping clones which together include 45 kilobases (kb) of contiguous DNA. These four genes, including a pseudogene, are present in the same orientation and are arranged in the order: 5'-Lba-Lbc1-Lb psi-Lbc3-3'. The intergenic regions average 2.5 kb. In addition to this main Lb locus, there are other Lb genes which do not appear to be contiguous to this locus. A sequence probably common to the 3' region of Lb loci was found flanking the Lbc3 gene. The 3' flanking region of the main Lb locus also contains a sequence that appears to be expressed more abundantly in root tissue. Another sequence which is primarily expressed in root and leaf is found 5' to two Lb loci. Overall, the main leghemoglobin locus is similar in structure to the mammalian globin gene loci. PMID:6310504

  10. LCR-initiated rearrangements at the IDS locus, completed with Alu-mediated recombination or non-homologous end joining.

    PubMed

    Oshima, Junko; Lee, Jennifer A; Breman, Amy M; Fernandes, Priscilla H; Babovic-Vuksanovic, Dusica; Ward, Patricia A; Wolfe, Lynne A; Eng, Christine M; Del Gaudio, Daniela

    2011-07-01

    Mucopolysaccharidosis type II (MPS II) is caused by mutations in the IDS gene, which encodes the lysosomal enzyme iduronate-2-sulfatase. In ∼20% of MPS II patients the disorder is caused by gross IDS structural rearrangements. We identified two male cases harboring complex rearrangements involving the IDS gene and the nearby pseudogene, IDSP1, which has been annotated as a low-copy repeat (LCR). In both cases the rearrangement included a partial deletion of IDS and an inverted insertion of the neighboring region. In silico analyses revealed the presence of repetitive elements as well as LCRs at the junctions of rearrangements. Our models illustrate two alternative consequences of rearrangements initiated by non-allelic homologous recombination of LCRs: resolution by a second recombination event (that is, Alu-mediated recombination), or resolution by non-homologous end joining repair. These complex rearrangements have the potential to be recurrent and may be present among those MPS II cases with previously uncharacterized aberrations involving IDS.

  11. Molecular diagnostics for hereditary hearing loss in children.

    PubMed

    Sommen, Manou; Wuyts, Wim; Van Camp, Guy

    2017-08-01

    Hearing loss (HL) is the most common birth defect in industrialized countries with far-reaching social, psychological and cognitive implications. It is an extremely heterogeneous disease, complicating molecular testing. The introduction of next-generation sequencing (NGS) has resulted in great progress in diagnostics, making it possible to study all known HL genes in a single assay. The diagnostic yield is currently still limited, but has the potential to increase substantially. Areas covered: In this review the utility of NGS and the problems for comprehensive molecular testing for HL are evaluated and discussed. Expert commentary: Different publications have proven the appropriateness of NGS for molecular testing of heterogeneous diseases such as HL. However, several problems still exist, such as the pseudogenic background of some genes and problematic copy number variant analysis on targeted NGS data. Another main challenge for the future will be the establishment of population-specific mutation spectra to achieve accurate personalized comprehensive molecular testing for HL.

  12. Comparative Proteomics Reveals a Significant Bias Toward Alternative Protein Isoforms with Conserved Structure and Function

    PubMed Central

    Ezkurdia, Iakes; del Pozo, Angela; Frankish, Adam; Rodriguez, Jose Manuel; Harrow, Jennifer; Ashman, Keith; Valencia, Alfonso; Tress, Michael L.

    2012-01-01

    Advances in high-throughput mass spectrometry are making proteomics an increasingly important tool in genome annotation projects. Peptides detected in mass spectrometry experiments can be used to validate gene models and verify the translation of putative coding sequences (CDSs). Here, we have identified peptides that cover 35% of the genes annotated by the GENCODE consortium for the human genome as part of a comprehensive analysis of experimental spectra from two large publicly available mass spectrometry databases. We detected the translation to protein of “novel” and “putative” protein-coding transcripts as well as transcripts annotated as pseudogenes and nonsense-mediated decay targets. We provide a detailed overview of the population of alternatively spliced protein isoforms that are detectable by peptide identification methods. We found that 150 genes expressed multiple alternative protein isoforms. This constitutes the largest set of reliably confirmed alternatively spliced proteins yet discovered. Three groups of genes were highly overrepresented. We detected alternative isoforms for 10 of the 25 possible heterogeneous nuclear ribonucleoproteins, proteins with a key role in the splicing process. Alternative isoforms generated from interchangeable homologous exons and from short indels were also significantly enriched, both in human experiments and in parallel analyses of mouse and Drosophila proteomics experiments. Our results show that a surprisingly high proportion (almost 25%) of the detected alternative isoforms are only subtly different from their constitutive counterparts. Many of the alternative splicing events that give rise to these alternative isoforms are conserved in mouse. It was striking that very few of these conserved splicing events broke Pfam functional domains or would damage globular protein structures. This evidence of a strong bias toward subtle differences in CDS and likely conserved cellular function and structure is remarkable and strongly suggests that the translation of alternative transcripts may be subject to selective constraints. PMID:22446687

  13. Whole-Genome Sequence of Corynebacterium auriscanis Strain CIP 106629 Isolated from a Dog with Bilateral Otitis from the United Kingdom.

    PubMed

    Tiwari, Sandeep; Jamal, Syed Babar; Oliveira, Leticia Castro; Clermont, Dominique; Bizet, Chantal; Mariano, Diego; de Carvalho, Paulo Vinicius Sanches Daltro; Souza, Flavia; Pereira, Felipe Luiz; de Castro Soares, Siomar; Guimarães, Luis C; Dorella, Fernanda; Carvalho, Alex; Leal, Carlos; Barh, Debmalya; Figueiredo, Henrique; Hassan, Syed Shah; Azevedo, Vasco; Silva, Artur

    2016-08-11

    In this work, we describe a set of features of Corynebacterium auriscanis CIP 106629 and details of the draft genome sequence and annotation. The genome is a single 2.5-Mbp circular sequence with 1,797 protein-coding genes, 5 rRNAs, 50 tRNAs, and 403 pseudogenes, and a G+C content of 58.50%.

  14. European Chlamydia abortus livestock isolate genomes reveal unusual stability and limited diversity, reflected in geographical signatures.

    PubMed

    Seth-Smith, H M B; Busó, Leonor Sánchez; Livingstone, M; Sait, M; Harris, S R; Aitchison, K D; Vretou, Evangelia; Siarkou, V I; Laroucau, K; Sachse, K; Longbottom, D; Thomson, N R

    2017-05-04

    Chlamydia abortus (formerly Chlamydophila abortus) is an economically important livestock pathogen, causing ovine enzootic abortion (OEA), and can also cause zoonotic infections in humans affecting pregnancy outcome. Large-scale genomic studies on other chlamydial species are giving insights into the biology of these organisms but have not yet been performed on C. abortus. Our aim was to investigate a broad collection of European isolates of C. abortus, using next generation sequencing methods, looking at diversity, geographic distribution and genome dynamics. Whole genome sequencing was performed on our collection of 57 C. abortus isolates originating primarily from the UK, Germany, France and Greece, but also from Tunisia, Namibia and the USA. Phylogenetic analysis of a total of 64 genomes shows a deep structural division within the C. abortus species with a major clade displaying limited diversity, in addition to a branch carrying two more distantly related Greek isolates, LLG and POS. Within the major clade, seven further phylogenetic groups can be identified, demonstrating geographical associations. The number of variable nucleotide positions across the sampled isolates is significantly lower than those published for C. trachomatis and C. psittaci. No recombination was identified within C. abortus, and no plasmid was found. Analysis of pseudogenes showed lineage specific loss of some functions, notably with several Pmp and TMH/Inc proteins predicted to be inactivated in many of the isolates studied. The diversity within C. abortus appears to be much lower compared to other species within the genus. There are strong geographical signatures within the phylogeny, indicating clonal expansion within areas of limited livestock transport. No recombination has been identified within this species, showing that different species of Chlamydia may demonstrate different evolutionary dynamics, and that the genome of C. abortus is highly stable.

  15. Frequent loss of lineages and deficient duplications accounted for low copy number of disease resistance genes in Cucurbitaceae

    PubMed Central

    2013-01-01

    Background: The sequenced genomes of cucumber, melon and watermelon have relatively few R-genes, with only 70, 75 and 55 copies, respectively. The mechanism for the low copy number of R-genes in Cucurbitaceae genomes remains unknown. Results: Manual annotation of R-genes in the sequenced genomes of Cucurbitaceae species showed that approximately half of them are pseudogenes. Comparative analysis of R-genes showed frequent loss of R-gene loci in different Cucurbitaceae species. Phylogenetic analysis, data mining and PCR cloning using degenerate primers indicated that Cucurbitaceae has a limited number of R-gene lineages (subfamilies). Comparison between R-genes from Cucurbitaceae and those from poplar and soybean suggested frequent loss of R-gene lineages in Cucurbitaceae. Furthermore, the average number of R-genes per lineage in Cucurbitaceae species is approximately 1/3 that in soybean or poplar. Therefore, both loss of lineages and deficient duplications in extant lineages accounted for the low copy number of R-genes in Cucurbitaceae. No extensive chimeras of R-genes were found in any of the sequenced Cucurbitaceae genomes. Nevertheless, one lineage of R-genes from Trichosanthes kirilowii, a wild Cucurbitaceae species, exhibits chimeric structures caused by gene conversions, and may contain a large number of distinct R-genes in natural populations. Conclusions: Cucurbitaceae species have a limited number of R-gene lineages and each genome harbors relatively few R-genes. The scarcity of R-genes in Cucurbitaceae species was due to frequent loss of R-gene lineages and infrequent duplications in extant lineages. The evolutionary mechanisms for the large variation in copy number of R-genes among plant species are discussed. PMID:23682795

  16. Analysis of novel high-molecular-weight prolamins from Leymus multicaulis (Kar. et Kir.) Tzvelev and L. chinensis (Trin. ex Bunge) Tzvelev.

    PubMed

    Hu, Xinkun; Dai, Shoufen; Song, Zhongping; Xu, Dongyang; Wen, Zhaojin; Wei, Yuming; Liu, Dengcai; Zheng, Youliang; Yan, Zehong

    2018-06-01

    Nine novel high-molecular-weight prolamins (HMW-prolamins) were isolated from Leymus multicaulis and L. chinensis. Based on the structure of the repetitive domains, all nine genes were classified as D-hordeins rather than as the high-molecular-weight glutenin subunits (HMW-GSs) that have been previously isolated in Leymus spp. Four genes, Lmul 1.2, 2.4, 2.7, and Lchi 2.5, were verified by bacterial expression, whereas the other five sequences (1.3 types) were classified as pseudogenes. The four Leymus D-hordein proteins had longer N-termini than those of Hordeum spp. [116/118 vs. 110 amino acid (AA) residues], whereas three (Lmul 1.2, 2.4, and 2.7) contained shorter N-termini than those of Ps. juncea (116 vs. 118 AA residues). Furthermore, Lmul 1.2 was identified as the smallest D-hordein, and Lmul 1.2 and 2.7 had additional cysteines. Phylogenetic analysis supported that the nine D-hordeins of Leymus formed two independent clades, with all the 1.3 types clustered with Ps. juncea Ns 1.3, whereas the others were clustered together with the D-hordeins from Hordeum and Ps. juncea and the HMW-GSs from Leymus. Within the clade of four D-hordein genes and HMW-GSs, the HMW-GSs of Leymus formed a separate branch that served as an intermediate between the D-hordeins of Ps. juncea and Leymus. These novel D-hordeins may potentially be used to improve food processing properties, particularly those relating to extra cysteine residues. The findings of the present study also provide basic information for understanding the HMW-prolamins among Triticeae species and expand the sources of D-hordeins from Hordeum to Leymus.

  17. Functional Analysis of the Chaperone-Usher Fimbrial Gene Clusters of Salmonella enterica serovar Typhi.

    PubMed

    Dufresne, Karine; Saulnier-Bellemare, Julie; Daigle, France

    2018-01-01

    The human-specific pathogen Salmonella enterica serovar Typhi causes typhoid, a major public health issue in developing countries. Several aspects of its pathogenesis are still poorly understood. S. Typhi possesses 14 fimbrial gene clusters including 12 chaperone-usher fimbriae (stg, sth, bcf, fim, saf, sef, sta, stb, stc, std, ste, and tcf). These fimbriae are weakly expressed in laboratory conditions and only a few are actually characterized. In this study, expression of all S. Typhi chaperone-usher fimbriae and their potential roles in pathogenesis such as interaction with host cells, motility, or biofilm formation were assessed. All S. Typhi fimbriae were better expressed in minimal broth. Each system was overexpressed and only the fimbrial gene clusters without pseudogenes demonstrated a putative major subunit of about 17 kDa on SDS-PAGE. Six of these (Fim, Saf, Sta, Stb, Std, and Tcf) also show extracellular structure by electron microscopy. The impact of fimbrial deletion in a wild-type strain or addition of each individual fimbrial system to an S. Typhi afimbrial strain was tested for interactions with host cells, biofilm formation and motility. Several fimbriae modified bacterial interactions with human cells (THP-1 and INT-407) and biofilm formation. However, only Fim fimbriae had a deleterious effect on motility when overexpressed. Overall, chaperone-usher fimbriae seem to be an important part of the balance between the different steps (motility, adhesion, host invasion and persistence) of S. Typhi pathogenesis.

  18. Natural Variation in the Pto Pathogen Resistance Gene Within Species of Wild Tomato (Lycopersicon). I. Functional Analysis of Pto Alleles

    PubMed Central

    Rose, Laura E.; Langley, Charles H.; Bernal, Adriana J.; Michelmore, Richard W.

    2005-01-01

    Disease resistance to the bacterial pathogen Pseudomonas syringae pv. tomato (Pst) in the cultivated tomato, Lycopersicon esculentum, and the closely related L. pimpinellifolium is triggered by the physical interaction between plant disease resistance protein, Pto, and the pathogen avirulence protein, AvrPto. To investigate the extent to which variation in the Pto gene is responsible for naturally occurring variation in resistance to Pst, we determined the resistance phenotype of 51 accessions from seven species of Lycopersicon to isogenic strains of Pst differing in the presence of avrPto. One-third of the plants displayed resistance specifically when the pathogen expressed AvrPto, consistent with a gene-for-gene interaction. To test whether this resistance in these species was conferred specifically by the Pto gene, alleles of Pto were amplified and sequenced from 49 individuals and a subset (16) of these alleles was tested in planta using Agrobacterium-mediated transient assays. Eleven alleles conferred a hypersensitive resistance response (HR) in the presence of AvrPto, while 5 did not. Ten amino acid substitutions associated with the absence of AvrPto recognition and HR were identified, none of which had been identified in previous structure-function studies. Additionally, 3 alleles encoding putative pseudogenes of Pto were isolated from two species of Lycopersicon. Therefore, a large proportion, but not all, of the natural variation in the reaction to strains of Pst expressing AvrPto can be attributed to sequence variation in the Pto gene. PMID:15944360

  19. Dynamic evolution at pericentromeres.

    PubMed

    Hall, Anne E; Kettler, Gregory C; Preuss, Daphne

    2006-03-01

    Pericentromeres are exceptional genomic regions: in animals they contain extensive segmental duplications implicated in gene creation, and in plants they sustain rearrangements and insertions uncommon in euchromatin. To examine the mechanisms and patterns of plant pericentromere evolution, we compared pericentromere sequence from four Brassicaceae species separated by <15 million years (Myr). This flowering plant family is ideal for studying relationships between genome reorganization and pericentromere evolution: its members have undergone recent polyploidization and hybridization, with close relatives changing in genome size and chromosome number. Through sequence and hybridization analyses, we examined regions from Arabidopsis arenosa, Capsella rubella, and Olimarabidopsis pumila that are homologous to Arabidopsis thaliana pericentromeres (peri-CENs) III and V, and used FISH to demonstrate they have been maintained near centromere satellite arrays in each species. Sequence analysis revealed a set of highly conserved genes, yet we discovered substantial differences in intergenic length and species-specific changes in sequence content and gene density. We discovered that A. thaliana has undergone recent, significant expansions within its pericentromeres, in some cases measuring hundreds of kilobases; these findings are in marked contrast to euchromatic segments in these species that exhibit only minor length changes. While plant pericentromeres do contain some duplications, we did not find evidence of extensive segmental duplications, as has been documented in primates. Our data support a model in which plant pericentromeres may experience selective pressures distinct from euchromatin, tolerating rapid, dynamic changes in structure and sequence content, including large insertions of mobile elements, 5S rDNA arrays and pseudogenes.

  20. Sheep skeletal muscle transcriptome analysis reveals muscle growth regulatory lncRNAs.

    PubMed

    Chao, Tianle; Ji, Zhibin; Hou, Lei; Wang, Jin; Zhang, Chunlan; Wang, Guizhi; Wang, Jianmin

    2018-01-01

    As widely distributed domestic animals, sheep are an important species and the source of mutton. In this study, we aimed to evaluate the regulatory lncRNAs associated with muscle growth and development between high production mutton sheep (Dorper sheep and Qianhua Mutton Merino sheep) and low production mutton sheep (Small-tailed Han sheep). In total, 39 lncRNAs were found to be differentially expressed. Using co-expression analysis and functional annotation, 1,206 co-expression interactions were found between 32 lncRNAs and 369 genes, and 29 of these lncRNAs were found to be associated with muscle development, metabolism, cell proliferation and apoptosis. lncRNA-mRNA interactions revealed 6 lncRNAs as hub lncRNAs. Moreover, three lncRNAs and their associated co-expressed genes were demonstrated by cis-regulatory gene analyses, and we also found a potential regulatory relationship between the pseudogene lncRNA LOC101121401 and its parent gene FTH1. This study provides a genome-wide resolution of lncRNA and mRNA regulation in muscles from mutton sheep.

  1. Toxin gene determination and evolution in scorpaenoid fish.

    PubMed

    Chuang, Po-Shun; Shiao, Jen-Chieh

    2014-09-01

    In this study, we determine the toxin genes from both cDNA and genomic DNA of four scorpaenoid fish and reconstruct their evolutionary relationship. The deduced protein sequences of the two toxin subunits in Sebastapistes strongia, Scorpaenopsis oxycephala, and Sebastiscus marmoratus are about 700 amino acids long, similar in size to the previously published stonefish (Synanceia horrida and Synanceia verrucosa) and lionfish (Pterois antennata and Pterois volitans) toxins. The intron positions are highly conserved among these species, which indicates that gene finding from a genomic DNA template is applicable. The phylogenetic analysis shows that the two toxin subunits were duplicated prior to the speciation of Scorpaenoidei. The precedence of the gene duplication over speciation indicates that the toxin genes may be common to the whole family of Scorpaeniform. Furthermore, one additional toxin gene has been determined in the genomic DNA of Dendrochirus zebra. The phylogenetic analysis suggests that an additional gene duplication occurred before the speciation of the lionfish (Pteroinae) and a pseudogene may be generally present in the lineage of lionfish.

  2. Limited mitogenomic degradation in response to a parasitic lifestyle in Orobanchaceae

    PubMed Central

    Fan, Weishu; Zhu, Andan; Kozaczek, Melisa; Shah, Neethu; Pabón-Mora, Natalia; González, Favio; Mower, Jeffrey P.

    2016-01-01

    In parasitic plants, the reduction in plastid genome (plastome) size and content is driven predominantly by the loss of photosynthetic genes. The first completed mitochondrial genomes (mitogenomes) from parasitic mistletoes also exhibit significant degradation, but the generality of this observation for other parasitic plants is unclear. We sequenced the complete mitogenome and plastome of the hemiparasite Castilleja paramensis (Orobanchaceae) and compared them with additional holoparasitic, hemiparasitic and nonparasitic species from Orobanchaceae. Comparative mitogenomic analysis revealed minimal gene loss among the seven Orobanchaceae species, indicating the retention of typical mitochondrial function among Orobanchaceae species. Phylogenetic analysis demonstrated that the mobile cox1 intron was acquired vertically from a nonparasitic ancestor, arguing against a role for Orobanchaceae parasites in the horizontal acquisition or distribution of this intron. The C. paramensis plastome has retained nearly all genes except for the recent pseudogenization of four subunits of the NAD(P)H dehydrogenase complex, indicating a very early stage of plastome degradation. These results lend support to the notion that loss of ndh gene function is the first step of plastome degradation in the transition to a parasitic lifestyle. PMID:27808159

  3. Organization of the human ζ-crystallin/quinone reductase gene (CRYZ)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gonzalez, P.; Rao, P.V.; Zigler, J.S. Jr.

    1994-05-15

    ζ-Crystallin is a protein highly expressed in the lens of guinea pigs and camels, where it comprises about 10% of the total soluble protein. It has recently been characterized as a novel quinone oxidoreductase present in a variety of mammalian tissues. The authors report here the isolation and characterization of the human ζ-crystallin gene (CRYZ) and its processed pseudogene. The functional gene is composed of nine exons and spans about 20 kb. The 5′-flanking region of the gene is rich in G and C (58%) and lacks TATA and CAAT boxes. Previous analysis of the guinea pig gene revealed the presence of two different promoters, one responsible for the high lens-specific expression and the other for expression at the enzymatic level in numerous tissues. Comparative analysis with the guinea pig gene shows that a region of ~2.5 kb that includes the promoter responsible for the high expression in the lens in guinea pig is not present in the human gene. 34 refs., 6 figs., 1 tab.

  4. Sheep skeletal muscle transcriptome analysis reveals muscle growth regulatory lncRNAs

    PubMed Central

    Chao, Tianle; Ji, Zhibin; Hou, Lei; Wang, Jin; Zhang, Chunlan

    2018-01-01

    As widely distributed domestic animals, sheep are an important species and the source of mutton. In this study, we aimed to evaluate the regulatory lncRNAs associated with muscle growth and development between high production mutton sheep (Dorper sheep and Qianhua Mutton Merino sheep) and low production mutton sheep (Small-tailed Han sheep). In total, 39 lncRNAs were found to be differentially expressed. Using co-expression analysis and functional annotation, 1,206 co-expression interactions were found between 32 lncRNAs and 369 genes, and 29 of these lncRNAs were found to be associated with muscle development, metabolism, cell proliferation and apoptosis. lncRNA–mRNA interactions revealed 6 lncRNAs as hub lncRNAs. Moreover, three lncRNAs and their associated co-expressed genes were demonstrated by cis-regulatory gene analyses, and we also found a potential regulatory relationship between the pseudogene lncRNA LOC101121401 and its parent gene FTH1. This study provides a genome-wide resolution of lncRNA and mRNA regulation in muscles from mutton sheep. PMID:29666768

  5. Angiogenesis Research to Improve Therapies for Vascular Leak Syndromes, Intra-abdominal Adhesions, and Arterial Injuries

    DTIC Science & Technology

    2008-04-01

    small-cell lung cancer; TFP1, transferrin pseudogene 1; TGFβ, transforming growth factor-β; TNF, tumour-necrosis factor; uPA, urokinase-type plasminogen...Cell 79, 315–328 (1994). 23. Frater-Schroder, M., Risau, W., Hallmann, R., Gautschi, P. & Bohlen, P. Tumor necrosis factor type α, a potent inhibitor...relatively thin (skin) or avascular (cartilage) tissues, where post-implantation vascularization from the host is sufficient. To overcome the problem

  6. Draft Genome Sequence of Leuconostoc mesenteroides P45 Isolated from Pulque, a Traditional Mexican Alcoholic Fermented Beverage.

    PubMed

    Riveros-Mckay, Fernando; Campos, Itzia; Giles-Gómez, Martha; Bolívar, Francisco; Escalante, Adelfo

    2014-11-06

    Leuconostoc mesenteroides P45 was isolated from the traditional Mexican pulque beverage. We report its draft genome sequence, assembled in 6 contigs consisting of 1,874,188 bp and no plasmids. Genome annotation predicted a total of 1,800 genes, 1,687 coding sequences, 52 pseudogenes, 9 rRNAs, 51 tRNAs, 1 noncoding RNA, and 44 frameshifted genes.
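    The annotation counts in this record can be cross-checked with a short tally. The sketch below uses only the numbers quoted in the abstract; the variable and function names are illustrative, and treating the 44 frameshifted genes as a subset of the other categories (rather than an extra category) is an assumption made here so that the sum can be compared with the reported total.

```python
# Feature counts as quoted in the abstract for L. mesenteroides P45.
reported_total = 1800
feature_counts = {
    "coding sequences": 1687,
    "pseudogenes": 52,
    "rRNAs": 9,
    "tRNAs": 51,
    "noncoding RNA": 1,
}
# Assumption: frameshifted genes overlap the categories above,
# so they are recorded here but not added to the sum.
frameshifted_genes = 44


def tally(counts):
    """Return the sum of all per-category feature counts."""
    return sum(counts.values())


total = tally(feature_counts)
pseudogene_fraction = feature_counts["pseudogenes"] / reported_total

print("sum of categories:", total)
print("matches reported total:", total == reported_total)
print(f"pseudogene share of annotated genes: {pseudogene_fraction:.1%}")
```

    Under that assumption the five categories add up to the reported 1,800 genes, which suggests the frameshifted genes are counted within them.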

  7. RNA-Mediated Gene Duplication and Retroposons: Retrogenes, LINEs, SINEs, and Sequence Specificity

    PubMed Central

    2013-01-01

    A substantial number of “retrogenes” that are derived from the mRNA of various intron-containing genes have been reported. A class of mammalian retroposons, long interspersed element-1 (LINE1, L1), has been shown to be involved in the reverse transcription of retrogenes (or processed pseudogenes) and non-autonomous short interspersed elements (SINEs). The 3′-end sequences of various SINEs originated from a corresponding LINE. As the 3′-untranslated regions of several LINEs are essential for retroposition, these LINEs presumably require “stringent” recognition of the 3′-end sequence of the RNA template. However, the 3′-ends of mammalian L1s do not exhibit any similarity to SINEs, except for the presence of 3′-poly(A) repeats. Since the 3′-poly(A) repeats of L1 and Alu SINE are critical for their retroposition, L1 probably recognizes the poly(A) repeats, thereby mobilizing not only Alu SINE but also cytosolic mRNA. Many flowering plants only harbor L1-clade LINEs and a significant number of SINEs with poly(A) repeats, but no homology to the LINEs. Moreover, processed pseudogenes have also been found in flowering plants. I propose that the ancestral L1-clade LINE in the common ancestor of green plants may have recognized a specific RNA template, with stringent recognition then becoming relaxed during the course of plant evolution. PMID:23984183

  8. Genome of ‘Ca. Desulfovibrio trichonymphae’, an H2-oxidizing bacterium in a tripartite symbiotic system within a protist cell in the termite gut

    PubMed Central

    Kuwahara, Hirokazu; Yuki, Masahiro; Izawa, Kazuki; Ohkuma, Moriya; Hongoh, Yuichi

    2017-01-01

    The cellulolytic protist Trichonympha agilis in the termite gut permanently hosts two symbiotic bacteria, ‘Candidatus Endomicrobium trichonymphae’ and ‘Candidatus Desulfovibrio trichonymphae’. The former is an intracellular symbiont, and the latter is almost intracellular but still connected to the outside via a small pore. The complete genome of ‘Ca. Endomicrobium trichonymphae’ has previously been reported, and we here present the complete genome of ‘Ca. Desulfovibrio trichonymphae’. The genome is small (1 410 056 bp), has many pseudogenes, and retains biosynthetic pathways for various amino acids and cofactors, which are partially complementary to those of ‘Ca. Endomicrobium trichonymphae’. An amino acid permease gene has apparently been transferred between the ancestors of these two symbionts; a lateral gene transfer has affected their metabolic capacity. Notably, ‘Ca. Desulfovibrio trichonymphae’ retains the complex system to oxidize hydrogen by sulfate and/or fumarate, while genes for utilizing other substrates common in desulfovibrios are pseudogenized or missing. Thus, ‘Ca. Desulfovibrio trichonymphae’ is specialized to consume hydrogen that may otherwise inhibit fermentation processes in both T. agilis and ‘Ca. Endomicrobium trichonymphae’. The small pore may be necessary to take up sulfate. This study depicts a genome-based model of a multipartite symbiotic system within a cellulolytic protist cell in the termite gut. PMID:27801909

  9. Identification and Evolution of Functional Alleles of the Previously Described Pollen Specific Myrosinase Pseudogene AtTGG6 in Arabidopsis thaliana.

    PubMed

    Fu, Lili; Han, Bingying; Tan, Deguan; Wang, Meng; Ding, Mei; Zhang, Jiaming

    2016-02-22

    Myrosinases are β-thioglucoside glucohydrolases and serve as defense mechanisms against insect pests and pathogens by producing toxic compounds. AtTGG6 in Arabidopsis thaliana was previously reported to be a myrosinase pseudogene but specifically expressed in pollen. However, we found that AlTGG6, an ortholog to AtTGG6 in A. lyrata (an outcrossing relative of A. thaliana) was functional, suggesting that functional AtTGG6 alleles may still exist in A. thaliana. AtTGG6 alleles in 29 A. thaliana ecotypes were cloned and sequenced. Results indicate that ten alleles were functional and encoded Myr II type myrosinase of 512 amino acids, and myrosinase activity was confirmed by overexpressing AtTGG6 in Pichia pastoris. However, the 19 other ecotypes had disabled alleles with highly polymorphic frame-shift mutations and diversified sequences. Thirteen frame-shift mutation types were identified, which occurred independently many times in the evolutionary history within a few thousand years. The functional allele was expressed specifically in pollen similar to the disabled alleles but at a higher expression level, suggesting its role in defense of pollen against insect pests such as pollen beetles. However, the defense function may have become less critical after A. thaliana evolved to self-fertilization, and thus resulted in loss of function in most ecotypes.

  10. The NLP toxin family in Phytophthora sojae includes rapidly evolving groups that lack necrosis-inducing activity.

    PubMed

    Dong, Suomeng; Kong, Guanghui; Qutob, Dinah; Yu, Xiaoli; Tang, Junli; Kang, Jixiong; Dai, Tingting; Wang, Hai; Gijzen, Mark; Wang, Yuanchao

    2012-07-01

    Necrosis- and ethylene-inducing-like proteins (NLP) are widely distributed in eukaryotic and prokaryotic plant pathogens and are considered to be important virulence factors. We identified, in total, 70 potential Phytophthora sojae NLP genes but 37 were designated as pseudogenes. Sequence alignment of the remaining 33 NLP delineated six groups. Three of these groups include proteins with an intact heptapeptide (Gly-His-Arg-His-Asp-Trp-Glu) motif, which is important for necrosis-inducing activity, whereas the motif is not conserved in the other groups. In total, 19 representative NLP genes were assessed for necrosis-inducing activity by heterologous expression in Nicotiana benthamiana. Surprisingly, only eight genes triggered cell death. The expression of the NLP genes in P. sojae was examined, distinguishing 20 expressed and 13 nonexpressed NLP genes. Real-time reverse-transcriptase polymerase chain reaction results indicate that most NLP are highly expressed during cyst germination and infection stages. Amino acid substitution ratios (Ka/Ks) of 33 NLP sequences from four different P. sojae strains resulted in identification of positive selection sites in a distinct NLP group. Overall, our study indicates that expansion and pseudogenization of the P. sojae NLP family results from an ongoing birth-and-death process, and that varying patterns of expression, necrosis-inducing activity, and positive selection suggest that NLP have diversified in function.
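    The Ka/Ks ratio mentioned in this record is conventionally read as follows: a ratio above 1 suggests positive selection, a ratio near 1 suggests neutral evolution, and a ratio below 1 suggests purifying selection. The sketch below illustrates only that gene-wide reading. The function name and the `tolerance` band around 1 are hypothetical, and it does not reproduce the site-level analysis of the study, whose method is not given in the abstract.

```python
def classify_selection(ka, ks, tolerance=0.05):
    """Interpret a gene-wide Ka/Ks ratio.

    ka: nonsynonymous substitutions per nonsynonymous site.
    ks: synonymous substitutions per synonymous site.
    tolerance: hypothetical half-width of the band around 1.0
               that is reported as "neutral".
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    omega = ka / ks
    if omega > 1 + tolerance:
        return "positive"
    if omega < 1 - tolerance:
        return "purifying"
    return "neutral"


# Illustrative inputs only; these values are not taken from the study.
for ka, ks in [(0.30, 0.10), (0.02, 0.20), (0.10, 0.10)]:
    print(ka, ks, classify_selection(ka, ks))
```

    A single gene-wide ratio can hide a few positively selected codons inside an otherwise conserved gene, which is one reason site-level analyses are used in practice.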

  11. Analysis of ZP1 gene reveals differences in zona pellucida composition in carnivores.

    PubMed

    Moros-Nicolás, C; Leza, A; Chevret, P; Guillén-Martínez, A; González-Brusi, L; Boué, F; Lopez-Bejar, M; Ballesta, J; Avilés, M; Izquierdo-Rico, M J

    2018-01-01

    The zona pellucida (ZP) is an extracellular envelope that surrounds mammalian oocytes. This coat participates in the interaction between gametes, induction of the acrosome reaction, block of polyspermy and protection of the oviductal embryo. Previous studies suggested that carnivore ZP was formed by three glycoproteins (ZP2, ZP3 and ZP4), with ZP1 being a pseudogene. However, a recent study in the cat found that all four proteins were expressed. In the present study, in silico and molecular analyses were performed in several carnivores to clarify the ZP composition in this order of mammals. The in silico analysis demonstrated the presence of the ZP1 gene in five carnivores: cheetah, panda, polar bear, tiger and walrus, whereas in the Antarctic fur seal and the Weddell seal there was evidence of pseudogenisation. Molecular analysis showed the presence of four ZP transcripts in ferret ovaries (ZP1, ZP2, ZP3 and ZP4) and three in fox ovaries (ZP2, ZP3 and ZP4). Analysis of the fox ZP1 gene showed the presence of a stop codon. The results strongly suggest that all four ZP genes are expressed in most carnivores, whereas ZP1 pseudogenisation seems to have independently affected three families (Canidae, Otariidae and Phocidae) of the carnivore tree.

  12. Allele Specific shRNA for Nanog, and Its Use to Treat Cancer | NCI Technology Transfer Center | TTC

    Cancer.gov

    The National Cancer Institute announced positive study results indicating that the expression of NanogP8, a pseudogene of Nanog, is upregulated in human colorectal cancer spheroids formed in serum-free medium. The National Cancer Institute's Laboratory of Experimental Carcinogenesis seeks parties of interest to co-develop the use of shRNAs incorporated into a lentiviral vector as a gene therapy to inhibit NanogP8, a retrogene upregulated in several carcinomas.

  13. Non-invasive fetal RHD genotyping for RhD negative women stratified into RHD gene deletion or variant groups: comparative accuracy using two blood collection tube types.

    PubMed

    Hyland, Catherine A; Millard, Glenda M; O'Brien, Helen; Schoeman, Elizna M; Lopez, Genghis H; McGowan, Eunike C; Tremellen, Anne; Puddephatt, Rachel; Gaerty, Kirsten; Flower, Robert L; Hyett, Jonathan A; Gardener, Glenn J

    2017-12-01

    Non-invasive fetal RHD genotyping in Australia to reduce anti-D usage will need to accommodate both prolonged sample transport times and a diverse population demographic harbouring a range of RHD blood group gene variants. We compared RHD genotyping accuracy using two blood sample collection tube types for RhD negative women stratified into deleted RHD gene haplotype and RHD gene variant cohorts. Maternal blood samples were collected into EDTA and cell-free (cf)DNA stabilising (BCT) tubes from two sites, one interstate. Automated DNA extraction and polymerase chain reaction (PCR) were used to amplify RHD exons 5 and 10 and CCR5. Automated analysis flagged maternal RHD variants, which were classified by genotyping. Time between sample collection and processing ranged from 2.9 to 187.5 hours. cfDNA levels increased with time for EDTA (range 0.03-138 ng/μL) but not BCT samples (0.01-3.24 ng/μL). For the 'deleted' cohort (n=647) all fetal RHD genotyping outcomes were concordant, excepting for one unexplained false negative EDTA sample. Matched against cord RhD serology, negative predictive values using BCT and EDTA tubes were 100% and 99.6%, respectively. Positive predictive values were 99.7% for both types. Overall 37.2% of subjects carried an RhD negative baby. The 'variant' cohort (n=15) included one novel RHD and eight hybrid or African pseudogene variants. Review for fetal RHD specific signals, based on one exon, showed three EDTA samples discordant to BCT, attributed to high maternal cfDNA levels arising from prolonged transport times. For the deleted haplotype cohort, fetal RHD genotyping accuracy was comparable for samples collected in EDTA and BCT tubes despite higher cfDNA levels in the EDTA tubes. Capacity to predict fetal RHD genotype for maternal carriers of hybrid or pseudogene RHD variants requires stringent control of cfDNA levels. 
    We conclude that fetal RHD genotyping is feasible in the Australian environment to avoid unnecessary anti-D immunoglobulin prophylaxis. Copyright © 2017. Published by Elsevier B.V.

  14. The Genome of the Obligately Intracellular Bacterium Ehrlichia canis Reveals Themes of Complex Membrane Structure and Immune Evasion Strategies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mavromatis, K; Doyle, C Kuyler; Lykidis, A

    2006-01-01

    Ehrlichia canis, a small obligately intracellular, tick-transmitted, gram-negative, α-proteobacterium, is the primary etiologic agent of globally distributed canine monocytic ehrlichiosis. Complete genome sequencing revealed that the E. canis genome consists of a single circular chromosome of 1,315,030 bp predicted to encode 925 proteins, 40 stable RNA species, 17 putative pseudogenes, and a substantial proportion of noncoding sequence (27%). Interesting genome features include a large set of proteins with transmembrane helices and/or signal sequences and a unique serine-threonine bias associated with the potential for O glycosylation that was prominent in proteins associated with pathogen-host interactions. Furthermore, two paralogous protein families associated with immune evasion were identified, one of which contains poly(G-C) tracts, suggesting that they may play a role in phase variation and facilitation of persistent infections. Genes associated with pathogen-host interactions were identified, including a small group encoding proteins (n = 12) with tandem repeats and another group encoding proteins with eukaryote-like ankyrin domains (n = 7).

  15. The genome of obligately intracellular Ehrlichia canis reveals themes of complex membrane structure and immune evasion strategies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mavromatis, K.; Kuyler Doyle, C.; Lykidis, A.

    2005-09-01

    Ehrlichia canis, a small obligately intracellular, tick-transmitted, gram-negative, α-proteobacterium is the primary etiologic agent of globally distributed canine monocytic ehrlichiosis. Complete genome sequencing revealed that the E. canis genome consists of a single circular chromosome of 1,315,030 bp predicted to encode 925 proteins, 40 stable RNA species, 17 putative pseudogenes, and a substantial proportion of non-coding sequence (27 percent). Interesting genome features include a large set of proteins with transmembrane helices and/or signal sequences, and a unique serine-threonine bias associated with the potential for O-glycosylation that was prominent in proteins associated with pathogen-host interactions. Furthermore, two paralogous protein families associated with immune evasion were identified, one of which contains poly G:C tracts, suggesting that they may play a role in phase variation and facilitation of persistent infections. Proteins associated with pathogen-host interactions were identified including a small group of proteins (12) with tandem repeats and another with eukaryotic-like ankyrin domains (7).

  16. Evolution of the Class IV HD-Zip Gene Family in Streptophytes

    PubMed Central

    Zalewski, Christopher S.; Floyd, Sandra K.; Furumizu, Chihiro; Sakakibara, Keiko; Stevenson, Dennis W.; Bowman, John L.

    2013-01-01

    Class IV homeodomain leucine zipper (C4HDZ) genes are plant-specific transcription factors that, based on phenotypes in Arabidopsis thaliana, play an important role in epidermal development. In this study, we sampled all major extant lineages and their closest algal relatives for C4HDZ homologs and phylogenetic analyses result in a gene tree that mirrors land plant evolution with evidence for gene duplications in many lineages, but minimal evidence for gene losses. Our analysis suggests an ancestral C4HDZ gene originated in an algal ancestor of land plants and a single ancestral gene was present in the last common ancestor of land plants. Independent gene duplications are evident within several lineages including mosses, lycophytes, euphyllophytes, seed plants, and, most notably, angiosperms. In recently evolved angiosperm paralogs, we find evidence of pseudogenization via mutations in both coding and regulatory sequences. The increasing complexity of the C4HDZ gene family through the diversification of land plants correlates to increasing complexity in epidermal characters. PMID:23894141

  17. Current Research on Non-Coding Ribonucleic Acid (RNA).

    PubMed

    Wang, Jing; Samuels, David C; Zhao, Shilin; Xiang, Yu; Zhao, Ying-Yong; Guo, Yan

    2017-12-05

    Non-coding ribonucleic acid (RNA) has without a doubt captured the interest of biomedical researchers. The ability to screen the entire human genome with high-throughput sequencing technology has greatly enhanced the identification, annotation and prediction of the functionality of non-coding RNAs. In this review, we discuss the current landscape of non-coding RNA research and quantitative analysis. Non-coding RNA will be categorized into two major groups by size: long non-coding RNAs and small RNAs. In long non-coding RNA, we discuss regular long non-coding RNA, pseudogenes and circular RNA. In small RNA, we discuss miRNA, transfer RNA, piwi-interacting RNA, small nucleolar RNA, small nuclear RNA, Y RNA, single recognition particle RNA, and 7SK RNA. We elaborate on the origin, detection method, and potential association with disease, putative functional mechanisms, and public resources for these non-coding RNAs. We aim to provide readers with a complete overview of non-coding RNAs and incite additional interest in non-coding RNA research.

  18. Bacteriological and genetic assessment of game meat from Japanese wild boars.

    PubMed

    Naya, Yuka; Horiuchi, Motohiro; Ishiguro, Naotaka; Shinagawa, Morikazu

    2003-01-15

    Bacterial tests were used to assess bacterial contamination of game meat from Japanese wild boars. The bacterial contamination of wild boar meat was less than that of domestic pork, as determined by aerobic plate counts (APC) and coliform counts. None of the meat examined in this study was contaminated by Salmonella or E. coli O-157. To detect adulteration by domestic pig meat or European wild boar meat, 46 samples of game meat sold as Japanese wild boar were examined genetically. A total of 17 samples showed genetic haplotypes of European and Asian domestic pigs in the D-loop of mitochondrial DNA (mtDNA), and 16 samples showed nuclear glucosephosphate isomerase-processed pseudogene (GPIP) genotypes of European domestic pigs. The European GPIP genotypes of these samples were confirmed by PCR-RFLP analysis. These results indicate that some game meat sold as Japanese wild boar is adulterated by cross-breeding between pigs and wild boars or by contamination with meat from domestic pigs or European wild boars.

  19. The evolution of vertebrate Toll-like receptors

    USGS Publications Warehouse

    Roach, J.C.; Glusman, G.; Rowen, L.; Kaur, A.; Purcell, M.K.; Smith, K.D.; Hood, L.E.; Aderem, A.

    2005-01-01

    The complete sequences of Takifugu Toll-like receptor (TLR) loci and gene predictions from many draft genomes enable comprehensive molecular phylogenetic analysis. Strong selective pressure for recognition of and response to pathogen-associated molecular patterns has maintained a largely unchanging TLR recognition in all vertebrates. There are six major families of vertebrate TLRs. This repertoire is distinct from that of invertebrates. TLRs within a family recognize a general class of pathogen-associated molecular patterns. Most vertebrates have exactly one gene ortholog for each TLR family. The family including TLR1 has more species-specific adaptations than other families. A major family including TLR11 is represented in humans only by a pseudogene. Coincidental evolution plays a minor role in TLR evolution. The sequencing phase of this study produced finished genomic sequences for the 12 Takifugu rubripes TLRs. In addition, we have produced > 70 gene models, including sequences from the opossum, chicken, frog, dog, sea urchin, and sea squirt. © 2005 by The National Academy of Sciences of the USA.

  20. Genomic regression of claw keratin, taste receptor and light-associated genes provides insights into biology and evolutionary origins of snakes.

    PubMed

    Emerling, Christopher A

    2017-10-01

    Regressive evolution of anatomical traits often corresponds with the regression of genomic loci underlying such characters. As such, studying patterns of gene loss can be instrumental in addressing questions of gene function, resolving conflicting results from anatomical studies, and understanding the evolutionary history of clades. The evolutionary origins of snakes involved the regression of a number of anatomical traits, including limbs, taste buds and the visual system, and by analyzing serpent genomes, I was able to test three hypotheses associated with the regression of these features. The first concerns two keratins that are putatively specific to claws. Both genes that encode these keratins are pseudogenized/deleted in snake genomes, providing additional evidence of claw-specificity. The second hypothesis is that snakes lack taste buds, an issue complicated by conflicting results in the literature. I found evidence that different snakes have lost one or more taste receptors, but all snakes examined retained at least one gustatory channel. The final hypothesis addressed is that the earliest snakes were adapted to a dim light niche. I found evidence of deleted and pseudogenized genes with light-associated functions in snakes, demonstrating a pattern of gene loss similar to other dim light-adapted clades. Molecular dating estimates suggest that dim light adaptation preceded the loss of limbs, providing some bearing on interpretations of the ecological origins of snakes. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. Seven New Complete Plastome Sequences Reveal Rampant Independent Loss of the ndh Gene Family across Orchids and Associated Instability of the Inverted Repeat/Small Single-Copy Region Boundaries.

    PubMed

    Kim, Hyoung Tae; Kim, Jung Sung; Moore, Michael J; Neubig, Kurt M; Williams, Norris H; Whitten, W Mark; Kim, Joo-Hwan

    2015-01-01

    Earlier research has revealed that the ndh loci have been pseudogenized, truncated, or deleted from most orchid plastomes sequenced to date, including in all available plastomes of the two most species-rich subfamilies, Orchidoideae and Epidendroideae. This study sought to resolve deeper-level phylogenetic relationships among major orchid groups and to refine the history of gene loss in the ndh loci across orchids. The complete plastomes of seven orchids, Oncidium sphacelatum (Epidendroideae), Masdevallia coccinea (Epidendroideae), Sobralia callosa (Epidendroideae), Sobralia aff. bouchei (Epidendroideae), Elleanthus sodiroi (Epidendroideae), Paphiopedilum armeniacum (Cypripedioideae), and Phragmipedium longifolium (Cypripedioideae) were sequenced and analyzed in conjunction with all other available orchid and monocot plastomes. Most ndh loci were found to be pseudogenized or lost in Oncidium, Paphiopedilum and Phragmipedium, but surprisingly, all ndh loci were found to retain full, intact reading frames in Sobralia, Elleanthus and Masdevallia. Character mapping suggests that the ndh genes were present in the common ancestor of orchids but have experienced independent, significant losses at least eight times across four subfamilies. In addition, ndhF gene loss was correlated with shifts in the position of the junction of the inverted repeat (IR) and small single-copy (SSC) regions. The Orchidaceae have unprecedented levels of homoplasy in ndh gene presence/absence, which may be correlated in part with the unusual life history of orchids. These results also suggest that ndhF plays a role in IR/SSC junction stability.

  2. Identification of Complete Repertoire of Apis florea Odorant Receptors Reveals Complex Orthologous Relationships with Apis mellifera

    PubMed Central

    Karpe, Snehal D.; Jain, Rikesh; Brockmann, Axel; Sowdhamini, Ramanathan

    2016-01-01

    We developed a computational pipeline for homology based identification of the complete repertoire of olfactory receptor (OR) genes in the Asian honey bee species, Apis florea. Apis florea is phylogenetically the most basal honey bee species and also the most distant sister species to the Western honey bee Apis mellifera, for which all OR genes had been identified before. Using our pipeline, we identified 180 OR genes in A. florea, which is very similar to the number of ORs identified in A. mellifera (177 ORs). Many characteristics of the ORs including gene structure, synteny of tandemly repeated ORs and basic phylogenetic clustering are highly conserved. The composite phylogenetic tree of A. florea and A. mellifera ORs could be divided into 21 clades which are in harmony with the existing Hymenopteran tree. However, we found a few nonorthologous OR relationships between both species as well as independent pseudogenization of ORs suggesting separate evolutionary changes. Particularly, a subgroup of the OR gene clade XI, which had been hypothesized to code cuticular hydrocarbon receptors showed a high number of species-specific ORs. RNAseq analysis detected a total number of 145 OR transcripts in male and 162 in female antennae. Most of the OR genes were highly expressed on the female antennae. However, we detected five distinct male-biased OR genes, out of which three genes (AfOr11, AfOr18, AfOr170P) were shown to be male-biased in A. mellifera, too, thus corroborating a behavioral function in sex-pheromone communication. PMID:27540087

  3. Contribution of type W human endogenous retroviruses to the human genome: characterization of HERV-W proviral insertions and processed pseudogenes.

    PubMed

    Grandi, Nicole; Cadeddu, Marta; Blomberg, Jonas; Tramontano, Enzo

    2016-09-09

    Human endogenous retroviruses (HERVs) are ancient sequences integrated in the germ line cells and vertically transmitted through the offspring constituting about 8 % of our genome. In time, HERVs accumulated mutations that compromised their coding capacity. A prominent exception is HERV-W locus 7q21.2, producing a functional Env protein (Syncytin-1) coopted for placental syncytiotrophoblast formation. While expression of HERV-W sequences has been investigated for their correlation to disease, an exhaustive description of the group composition and characteristics is still not available and current HERV-W group information derives from studies published a few years ago that, of course, used the rough assemblies of the human genome available at that time. This hampers the comparison and correlation with current human genome assemblies. In the present work we identified and described in detail the distribution and genetic composition of 213 HERV-W elements. The bioinformatics analysis led to the characterization of several previously unreported features and provided a phylogenetic classification of two main subgroups with different age and structural characteristics. New facts on HERV-W genomic context of insertion and co-localization with sequences putatively involved in disease development are also reported. The present work is a detailed overview of the HERV-W contribution to the human genome and provides a robust genetic background useful to clarify HERV-W role in pathologies with poorly understood etiology, representing, to our knowledge, the most complete and exhaustive HERV-W dataset to date.

  4. Pivotal Impacts of Retrotransposon Based Invasive RNAs on Evolution.

    PubMed

    Habibi, Laleh; Salmani, Hamzeh

    2017-01-01

    RNAs have long been described as the mediators of gene expression; they play a vital role in the structure and function of cellular complexes. Although the role of RNAs in the prokaryotes is mainly confined to these basic functions, the effects of these molecules in regulating the gene expression and enzymatic activities have been discovered in eukaryotes. Recently, a high-resolution analysis of the DNA obtained from different organisms has revealed a fundamental impact of the RNAs in shaping the genomes, heterochromatin formation, and gene creation. Deep sequencing of the human genome revealed that about half of our DNA is comprised of repetitive sequences (remnants of transposable element movements) expanded mostly through RNA-mediated processes. ORF2 encoded by L1 retrotransposons is a cellular reverse transcriptase which is mainly responsible for RNA invasion of various transposable elements (L1s, Alus, and SVAs) and cellular mRNAs into the genomic DNA. In addition to increasing retroelement copy number, genomic expansion in association with centromere, telomere, and heterochromatin formation as well as pseudogene creation are the evolutionary consequences of this RNA-based activity. Threatening DNA integrity by disrupting the genes and forming excessive double strand breaks is another effect of this invasion. Therefore, repressive mechanisms have evolved to control the activities of these invasive intracellular RNAs. All these mechanisms now have essential roles in the complex cellular functions. Therefore, it can be concluded that without direct action of RNA networks in shaping the genome and in the development of different cellular mechanisms, the evolution of higher eukaryotes would not be possible.

  5. Pivotal Impacts of Retrotransposon Based Invasive RNAs on Evolution

    PubMed Central

    Habibi, Laleh; Salmani, Hamzeh

    2017-01-01

    RNAs have long been described as the mediators of gene expression; they play a vital role in the structure and function of cellular complexes. Although the role of RNAs in the prokaryotes is mainly confined to these basic functions, the effects of these molecules in regulating the gene expression and enzymatic activities have been discovered in eukaryotes. Recently, a high-resolution analysis of the DNA obtained from different organisms has revealed a fundamental impact of the RNAs in shaping the genomes, heterochromatin formation, and gene creation. Deep sequencing of the human genome revealed that about half of our DNA is comprised of repetitive sequences (remnants of transposable element movements) expanded mostly through RNA-mediated processes. ORF2 encoded by L1 retrotransposons is a cellular reverse transcriptase which is mainly responsible for RNA invasion of various transposable elements (L1s, Alus, and SVAs) and cellular mRNAs into the genomic DNA. In addition to increasing retroelement copy number, genomic expansion in association with centromere, telomere, and heterochromatin formation as well as pseudogene creation are the evolutionary consequences of this RNA-based activity. Threatening DNA integrity by disrupting the genes and forming excessive double strand breaks is another effect of this invasion. Therefore, repressive mechanisms have evolved to control the activities of these invasive intracellular RNAs. All these mechanisms now have essential roles in the complex cellular functions. Therefore, it can be concluded that without direct action of RNA networks in shaping the genome and in the development of different cellular mechanisms, the evolution of higher eukaryotes would not be possible. PMID:29067016

  6. Genetic Analyses of the NF1 Gene in Turkish Neurofibromatosis Type I Patients and Definition of three Novel Variants

    PubMed Central

    Ulusal, SD; Gürkan, H; Atlı, E; Özal, SA; Çiftdemir, M; Tozkır, H; Karal, Y; Güçlü, H; Eker, D; Görker, I

    2017-01-01

    Neurofibromatosis Type I (NF1) is a multisystemic autosomal dominant neurocutaneous disorder predisposing patients to have benign and/or malignant lesions predominantly of the skin, nervous system and bone. Loss of function mutations or deletions of the NF1 gene are responsible for NF1 disease. The involvement of various pathogenic variants, the size of the gene and the presence of pseudogenes make it difficult to analyze. We aimed to report the results of 2 years of multiplex ligation-dependent probe amplification (MLPA) and next generation sequencing (NGS) for genetic diagnosis of NF1 applied at our genetic diagnosis center. The MLPA, semiconductor sequencing and Sanger sequencing were performed in genomic DNA samples from 24 unrelated patients and their affected family members referred to our center suspected of having NF1. In total, three novel and 12 known pathogenic variants and a whole gene deletion were determined. We suggest that next generation sequencing is a practical tool for genetic analysis of NF1. Deletion/duplication analysis with MLPA may also be helpful for patients clinically diagnosed with NF1 who do not have a detectable mutation in NGS. PMID:28924536

  7. Recolonization and radiation in Larix (Pinaceae): evidence from nuclear ribosomal DNA paralogues.

    PubMed

    Wei, Xiao-Xin; Wang, Xiao-Quan

    2004-10-01

    Gene paralogy frequently causes the conflict between gene tree and species tree, but sometimes the coexistence of a few paralogous copies could provide more markers for tracing the phylogeographical process of some organisms. In the present study, nrDNA ITS paralogues were cloned from all but one species of Larix, an Eocene genus having two sections, Larix and Multiserialis, with a huge circumboreal distribution and an Eastern Asia-Western North America disjunction, respectively. A total of 96 distinct clones, excluding five putative pseudogenes or recombinants, were obtained and used in the gene genealogy analysis. The clones from all Eurasian species of section Larix are mixed together, suggesting that recolonization and recent morphological differentiation could have played important roles in the evolution of this section. In contrast, the species diversification of the Eurasian section Multiserialis may result from radiation in the east Himalayas and its vicinity, considering extensive nrDNA founder effects in this group. Our study also suggests that the distribution pattern analysis of members of multiple gene family would be very useful in tracking the evolutionary history of some taxa with recent origin or rapid radiation that cannot be resolved by other molecular markers.

  8. A high quality assembly of the Nile Tilapia (Oreochromis niloticus) genome reveals the structure of two sex determination regions.

    PubMed

    Conte, Matthew A; Gammerdinger, William J; Bartie, Kerry L; Penman, David J; Kocher, Thomas D

    2017-05-02

    Tilapias are the second most farmed fishes in the world and a sustainable source of food. Like many other fish, tilapias are sexually dimorphic and sex is a commercially important trait in these fish. In this study, we developed a significantly improved assembly of the tilapia genome using the latest genome sequencing methods and show how it improves the characterization of two sex determination regions in two tilapia species. A homozygous clonal XX female Nile tilapia (Oreochromis niloticus) was sequenced to 44X coverage using Pacific Biosciences (PacBio) SMRT sequencing. Dozens of candidate de novo assemblies were generated and an optimal assembly (contig NG50 of 3.3Mbp) was selected using principal component analysis of likelihood scores calculated from several paired-end sequencing libraries. Comparison of the new assembly to the previous O. niloticus genome assembly reveals that recently duplicated portions of the genome are now well represented. The overall number of genes in the new assembly increased by 27.3%, including a 67% increase in pseudogenes. The new tilapia genome assembly correctly represents two recent vasa gene duplication events that have been verified with BAC sequencing. A total of 146Mbp of additional transposable element sequence are now assembled, a large proportion of which are recent insertions. Large centromeric satellite repeats are assembled and annotated in cichlid fish for the first time. Finally, the new assembly identifies the long-range structure of both a ~9Mbp XY sex determination region on LG1 in O. niloticus, and a ~50Mbp WZ sex determination region on LG3 in the related species O. aureus. This study highlights the use of long read sequencing to correctly assemble recent duplications and to characterize repeat-filled regions of the genome. The study serves as an example of the need for high quality genome assemblies and provides a framework for identifying sex determining genes in tilapia and related fish species.

  9. Chloroplast Genome Sequence of Pigeonpea (Cajanus cajan (L.) Millspaugh) and Cajanus scarabaeoides (L.) Thouars: Genome Organization and Comparison with Other Legumes

    PubMed Central

    Kaila, Tanvi; Chaduvla, Pavan K.; Saxena, Swati; Bahadur, Kaushlendra; Gahukar, Santosh J.; Chaudhury, Ashok; Sharma, T. R.; Singh, N. K.; Gaikwad, Kishor

    2016-01-01

    Pigeonpea (Cajanus cajan (L.) Millspaugh), a diploid (2n = 22) legume crop with a genome size of 852 Mbp, serves as an important source of human dietary protein especially in South East Asian and African regions. In this study, the draft chloroplast genomes of Cajanus cajan and Cajanus scarabaeoides (L.) Thouars were generated. Cajanus scarabaeoides is an important species of the Cajanus gene pool and has also been used for developing promising CMS system by different groups. A male sterile genotype harboring the C. scarabaeoides cytoplasm was used for sequencing the plastid genome. The cp genome of C. cajan is 152,242bp long, having a quadripartite structure with LSC of 83,455 bp and SSC of 17,871 bp separated by IRs of 25,398 bp. Similarly, the cp genome of C. scarabaeoides is 152,201bp long, having a quadripartite structure in which IRs of 25,402 bp length separates 83,423 bp of LSC and 17,854 bp of SSC. The pigeonpea cp genome contains 116 unique genes, including 30 tRNA, 4 rRNA, 78 predicted protein coding genes and 5 pseudogenes. A 50 kb inversion was observed in the LSC region of pigeonpea cp genome, consistent with other legumes. Comparison of cp genome with other legumes revealed the contraction of IR boundaries due to the absence of rps19 gene in the IR region. Chloroplast SSRs were mined and a total of 280 and 292 cpSSRs were identified in C. scarabaeoides and C. cajan respectively. RNA editing was observed at 37 sites in both C. scarabaeoides and C. cajan, with maximum occurrence in the ndh genes. The pigeonpea cp genome sequence would be beneficial in providing informative molecular markers which can be utilized for genetic diversity analysis and aid in understanding the plant systematics studies among major grain legumes. PMID:28018385

  10. Chloroplast Genome Sequence of Pigeonpea (Cajanus cajan (L.) Millspaugh) and Cajanus scarabaeoides (L.) Thouars: Genome Organization and Comparison with Other Legumes.

    PubMed

    Kaila, Tanvi; Chaduvla, Pavan K; Saxena, Swati; Bahadur, Kaushlendra; Gahukar, Santosh J; Chaudhury, Ashok; Sharma, T R; Singh, N K; Gaikwad, Kishor

    2016-01-01

    Pigeonpea (Cajanus cajan (L.) Millspaugh), a diploid (2n = 22) legume crop with a genome size of 852 Mbp, serves as an important source of human dietary protein especially in South East Asian and African regions. In this study, the draft chloroplast genomes of Cajanus cajan and Cajanus scarabaeoides (L.) Thouars were generated. Cajanus scarabaeoides is an important species of the Cajanus gene pool and has also been used for developing promising CMS system by different groups. A male sterile genotype harboring the C. scarabaeoides cytoplasm was used for sequencing the plastid genome. The cp genome of C. cajan is 152,242bp long, having a quadripartite structure with LSC of 83,455 bp and SSC of 17,871 bp separated by IRs of 25,398 bp. Similarly, the cp genome of C. scarabaeoides is 152,201bp long, having a quadripartite structure in which IRs of 25,402 bp length separates 83,423 bp of LSC and 17,854 bp of SSC. The pigeonpea cp genome contains 116 unique genes, including 30 tRNA, 4 rRNA, 78 predicted protein coding genes and 5 pseudogenes. A 50 kb inversion was observed in the LSC region of pigeonpea cp genome, consistent with other legumes. Comparison of cp genome with other legumes revealed the contraction of IR boundaries due to the absence of rps19 gene in the IR region. Chloroplast SSRs were mined and a total of 280 and 292 cpSSRs were identified in C. scarabaeoides and C. cajan respectively. RNA editing was observed at 37 sites in both C. scarabaeoides and C. cajan, with maximum occurrence in the ndh genes. The pigeonpea cp genome sequence would be beneficial in providing informative molecular markers which can be utilized for genetic diversity analysis and aid in understanding the plant systematics studies among major grain legumes.

  11. Mitis group streptococci express variable pilus islet 2 pili.

    PubMed

    Zähner, Dorothea; Gandhi, Ashish R; Yi, Hong; Stephens, David S

    2011-01-01

    Streptococcus oralis, Streptococcus mitis, and Streptococcus sanguinis are members of the Mitis group of streptococci and agents of oral biofilm, dental plaque and infective endocarditis, disease processes that involve bacteria-bacteria and bacteria-host interactions. Their close relative, the human pathogen S. pneumoniae uses pilus-islet 2 (PI-2)-encoded pili to facilitate adhesion to eukaryotic cells. PI-2 pilus-encoding genetic islets were identified in S. oralis, S. mitis, and S. sanguinis, but were absent from other isolates of these species. The PI-2 islets resembled the genetic organization of the PI-2 islet of S. pneumoniae, but differed in the genes encoding the structural pilus proteins PitA and PitB. Two and three variants of pitA (a pseudogene in S. pneumoniae) and pitB, respectively, were identified that showed ≈20% difference in nucleotide as well as corresponding protein sequence. Species-independent combinations of pitA and pitB variants indicated prior intra- and interspecies horizontal gene transfer events. Polyclonal antisera developed against PitA and PitB of S. oralis type strain ATCC35037 revealed that PI-2 pili in oral streptococci were composed of PitA and PitB. Electronmicrographs showed pilus structures radiating >700 nm from the bacterial surface in the wild type strain, but not in an isogenic PI-2 deletion mutant. Anti-PitB-antiserum only reacted with pili containing the same PitB variant, whereas anti-PitA antiserum was cross-reactive with the other PitA variant. Electronic multilocus sequence analysis revealed that all PI-2-encoding oral streptococci were closely-related and cluster with non-PI-2-encoding S. oralis strains. This is the first identification of PI-2 pili in Mitis group oral streptococci. The findings provide a striking example of intra- and interspecies horizontal gene transfer. The PI-2 pilus diversity provides a possible key to link strain-specific bacterial interactions and/or tissue tropisms with pathogenic traits in the Mitis group streptococci.

  12. Mouse HLA-DPA homologue H2-Pa: A pseudogene that maps between H2-Pb and H2-Oa

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Arimura, Y.; Koda, T.; Kishi, M.

    1996-12-31

    The major histocompatibility complex (MHC) class II subregion contains several subclasses of genes. The classical class II genes, HLA-DP, DQ, and DR homologues, present antigens directly to CD4+ T cells. HLA-DM homologues facilitate the efficacy and transport of antigens to the cell surface by removing the CLIP peptides from the classical class II molecules. HLA-DNA/DOB homologues show unusual expression patterns and limited polymorphism, but their function is yet to be elucidated. 15 refs., 2 figs.

  13. A Search for Gene Fusions/Translocations in Breast Cancer

    DTIC Science & Technology

    2012-10-01

    specific pseudogenes. Subject terms: gene fusions, sequencing, MAST, Notch. ...Figure 5B) as well as in in vivo intravasation and metastasis in chicken chorioallantoic membrane xenograft assay (Figure 5C). In contrast... [Figure residue: bar chart of Sample Frequency (%), 0 to 100, with groups ...mphoblastoid (n = 8), Pancreatic Benign (n = 3), Prostate Benign (n = 18); panel label: Cancer Specific]

  14. Targeting Tumor Oct4 to Deplete Prostate Tumor and Metastasis Initiating Cells

    DTIC Science & Technology

    2016-10-01

    Award Number: W81XWH-13-1-0461. Title: Targeting Tumor Oct4 to Deplete Prostate Tumor- and Metastasis-Initiating Cells. Principal Investigator: Daotai... ...the c-MYC oncogene. POU5F1B is a pseudogene of embryonic Oct4 (POU5F1). A recent study found that tumor Oct4 found in prostate cancer cells is due

  15. Evolutionary constraints and the neutral theory. [mutation-caused nucleotide substitutions in DNA]

    NASA Technical Reports Server (NTRS)

    Jukes, T. H.; Kimura, M.

    1984-01-01

    The neutral theory of molecular evolution postulates that nucleotide substitutions inherently take place in DNA as a result of point mutations followed by random genetic drift. In the absence of selective constraints, the substitution rate reaches the maximum value set by the mutation rate. The rate in globin pseudogenes is about 5 × 10^-9 substitutions per site per year in mammals. Rates slower than this indicate the presence of constraints imposed by negative (natural) selection, which rejects and discards deleterious mutations.
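
    To make the quoted rate concrete, the following minimal sketch applies the standard two-lineage approximation K = 2rt to the pseudogene rate given above. The function name and the 10-million-year split are hypothetical, chosen only for illustration, and the calculation ignores multiple substitutions at the same site.

```python
# Minimal sketch: expected neutral divergence between two lineages.
# Assumption (not from the abstract): simple K = 2 * r * t approximation,
# with no correction for multiple hits at the same site.

NEUTRAL_RATE = 5e-9  # substitutions per site per year (globin pseudogenes, as quoted)


def expected_divergence(rate_per_site_per_year, years_since_split):
    """Expected substitutions per site separating two lineages.

    The factor 2 accounts for substitutions accumulating independently
    along both branches since the split.
    """
    return 2.0 * rate_per_site_per_year * years_since_split


if __name__ == "__main__":
    # Hypothetical split time of 10 million years.
    k = expected_divergence(NEUTRAL_RATE, 10_000_000)
    print(f"{k:.2f} substitutions per site")
```

    Under these assumptions, a functional gene that shows clearly less divergence than this over the same interval would be read as being under selective constraint.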

  16. Massive Losses of Taste Receptor Genes in Toothed and Baleen Whales

    PubMed Central

    Feng, Ping; Zheng, Jinsong; Rossiter, Stephen J.; Wang, Ding; Zhao, Huabin

    2014-01-01

    Taste receptor genes are functionally important in animals, with a surprising exception in the bottlenose dolphin, which shows extensive losses of sweet, umami, and bitter taste receptor genes. To examine the generality of taste gene loss, we examined seven toothed whales and five baleen whales and sequenced the complete repertoire of three sweet/umami (T1Rs) and ten bitter (T2Rs) taste receptor genes. We found all amplified T1Rs and T2Rs to be pseudogenes in all 12 whales, with a shared premature stop codon in 10 of the 13 genes, which demonstrated massive losses of taste receptor genes in the common ancestor of whales. Furthermore, we analyzed three genome sequences from two toothed whales and one baleen whale and found that the sour taste marker gene Pkd2l1 is a pseudogene, whereas the candidate salty taste receptor genes are intact and putatively functional. Additionally, we examined three genes that are responsible for taste signal transduction and found the relaxation of functional constraints on taste signaling pathways along the ancestral branch leading to whales. Together, our results strongly suggest extensive losses of sweet, umami, bitter, and sour tastes in whales, and the relaxation of taste function most likely arose in the common ancestor of whales between 36 and 53 Ma. Therefore, whales represent the first animal group to lack four of five primary tastes, probably driven by the marine environment with high concentration of sodium, the feeding behavior of swallowing prey whole, and the dietary switch from plants to meat in the whale ancestor. PMID:24803572
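
    The pseudogene calls above rest partly on premature stop codons. As a rough illustration of that one criterion, the sketch below scans a coding sequence for an in-frame stop before its final codon. It assumes the standard genetic code and a sequence already in reading frame; the function name and example sequences are hypothetical, and this is not the authors' pipeline.

```python
# Minimal sketch: flag a premature in-frame stop codon in a coding sequence.
# Assumptions (not from the abstract): standard genetic code, sequence is
# already in frame starting at position 0, and the last codon is the
# expected terminal stop.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_premature_stop(cds):
    """Return the 0-based codon index of the first in-frame stop codon
    that occurs before the final codon, or None if there is none."""
    seq = cds.upper()
    n_codons = len(seq) // 3
    # Skip the final codon, which is expected to be the normal stop.
    for i in range(n_codons - 1):
        if seq[3 * i:3 * i + 3] in STOP_CODONS:
            return i
    return None


if __name__ == "__main__":
    intact = "ATGGCTTGGTAA"  # ATG GCT TGG TAA
    broken = "ATGGCTTGATAA"  # ATG GCT TGA TAA (TGG -> TGA)
    print(first_premature_stop(intact))
    print(first_premature_stop(broken))
```

    A real analysis would also need to consider frameshifts and alignment against an intact ortholog; a stop-codon scan alone is only a first filter.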

  17. Mitochondrial DNA as a non-invasive biomarker: Accurate quantification using real time quantitative PCR without co-amplification of pseudogenes and dilution bias

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Malik, Afshan N., E-mail: afshan.malik@kcl.ac.uk; Shahni, Rojeen; Rodriguez-de-Ledesma, Ana

    2011-08-19

    Highlights: Mitochondrial dysfunction is central to many diseases of oxidative stress. 95% of the mitochondrial genome is duplicated in the nuclear genome. Dilution of untreated genomic DNA leads to dilution bias. Unique primers and template pretreatment are needed to accurately measure mitochondrial DNA content. Abstract: Circulating mitochondrial DNA (MtDNA) is a potential non-invasive biomarker of cellular mitochondrial dysfunction, the latter known to be central to a wide range of human diseases. Changes in MtDNA are usually determined by quantification of MtDNA relative to nuclear DNA (Mt/N) using real time quantitative PCR. We propose that the methodology for measuring Mt/N needs to be improved and we have identified that current methods have at least one of the following three problems: (1) As much of the mitochondrial genome is duplicated in the nuclear genome, many commonly used MtDNA primers co-amplify homologous pseudogenes found in the nuclear genome; (2) use of regions from genes such as β-actin and 18S rRNA, which are repetitive and/or highly variable, for qPCR of the nuclear genome leads to errors; and (3) the size difference of mitochondrial and nuclear genomes causes a 'dilution bias' when template DNA is diluted. We describe a PCR-based method using unique regions in the human mitochondrial genome not duplicated in the nuclear genome, a unique single-copy region in the nuclear genome, and template treatment to remove dilution bias, to accurately quantify MtDNA from human samples.
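
    The abstract does not give the formula used for Mt/N. A common way to turn qPCR threshold cycles into a relative copy ratio is the delta-Ct calculation sketched below. It assumes both amplicons amplify with the same efficiency (perfect doubling by default); the function name and Ct values are hypothetical.

```python
# Minimal sketch: relative mitochondrial-to-nuclear ratio from qPCR Ct values.
# Assumptions (not from the abstract): delta-Ct method with identical
# amplification efficiency for the mitochondrial and nuclear amplicons.


def mt_to_nuclear_ratio(ct_mito, ct_nuclear, efficiency=2.0):
    """Relative Mt/N copy ratio.

    A lower Ct means more starting template, so the ratio grows with
    (ct_nuclear - ct_mito). `efficiency` is the fold increase per cycle.
    """
    return efficiency ** (ct_nuclear - ct_mito)


if __name__ == "__main__":
    # Hypothetical Ct values for one sample.
    print(mt_to_nuclear_ratio(ct_mito=18.0, ct_nuclear=26.0))
```

    Primers that co-amplify nuclear pseudogenes would lower the apparent mitochondrial Ct and inflate this ratio, which is the first problem the abstract lists.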

  18. A-to-I RNA editing promotes developmental stage–specific gene and lncRNA expression

    PubMed Central

    Goldstein, Boaz; Agranat-Tamir, Lily; Light, Dean; Ben-Naim Zgayer, Orna; Fishman, Alla; Lamm, Ayelet T.

    2017-01-01

    A-to-I RNA editing is a conserved widespread phenomenon in which adenosine (A) is converted to inosine (I) by adenosine deaminases (ADARs) in double-stranded RNA regions, mainly noncoding. Mutations in ADAR enzymes in Caenorhabditis elegans cause defects in normal development but are not lethal as in human and mouse. Previous studies in C. elegans indicated competition between RNA interference (RNAi) and RNA editing mechanisms, based on the observation that worms that lack both mechanisms do not exhibit defects, in contrast to the developmental defects observed when only RNA editing is absent. To study the effects of RNA editing on gene expression and function, we established a novel screen that enabled us to identify thousands of RNA editing sites in nonrepetitive regions in the genome. These include dozens of genes that are edited at their 3′ UTR region. We found that these genes are mainly germline and neuronal genes, and that they are down-regulated in the absence of ADAR enzymes. Moreover, we discovered that almost half of these genes are edited in a developmental-specific manner, indicating that RNA editing is a highly regulated process. We found that many pseudogenes and other lncRNAs are also extensively down-regulated in the absence of ADARs in the embryo but not in the fourth larval (L4) stage. This down-regulation is not observed upon additional knockout of RNAi. Furthermore, levels of siRNAs aligned to pseudogenes in ADAR mutants are enhanced. Taken together, our results suggest a role for RNA editing in normal growth and development by regulating silencing via RNAi. PMID:28031250

  19. Evolutionary maintenance of selfish homing endonuclease genes in the absence of horizontal transfer.

    PubMed

    Yahara, Koji; Fukuyo, Masaki; Sasaki, Akira; Kobayashi, Ichizo

    2009-11-03

    Homing endonuclease genes are "selfish" mobile genetic elements whose endonuclease promotes the spread of its own gene by creating a break at a specific target site and using the host machinery to repair the break by copying and inserting the gene at this site. Horizontal transfer across the boundary of a species or population within which mating takes place has been thought to be necessary for their evolutionary persistence. This is based on the assumption that they will become fixed in a host population, where opportunities of homing will disappear, and become susceptible to degeneration. To test this hypothesis, we modeled behavior of a homing endonuclease gene that moves during meiosis through double-strand break repair. We mathematically explored conditions for persistence of the homing endonuclease gene and elucidated their parameter dependence as phase diagrams. We found that, if the cost of the pseudogene is lower than that of the homing endonuclease gene, the 2 forms can persist in a population through autonomous periodic oscillation. If the cost of the pseudogene is higher, 2 types of dynamics appear that enable evolutionary persistence: bistability dependent on initial frequency or fixation irrespective of initial frequency. The prediction of long persistence in the absence of horizontal transfer was confirmed by stochastic simulations in finite populations. The average time to extinction of the endonuclease gene was found to be thousands of meiotic generations or more based on realistic parameter values. These results provide a solid theoretical basis for an understanding of these and other extremely selfish elements.

  20. Brucella Genetic Variability in Wildlife Marine Mammals Populations Relates to Host Preference and Ocean Distribution

    PubMed Central

    Suárez-Esquivel, Marcela; Baker, Kate S.; Ruiz-Villalobos, Nazareth; Hernández-Mora, Gabriela; Barquero-Calvo, Elías; González-Barrientos, Rocío; Castillo-Zeledón, Amanda; Jiménez-Rojas, César; Chacón-Díaz, Carlos; Cloeckaert, Axel; Chaves-Olarte, Esteban; Thomson, Nicholas R.; Moreno, Edgardo

    2017-01-01

    Intracellular bacterial pathogens probably arose when their ancestor adapted from a free-living environment to an intracellular one, leading to clonal bacteria with smaller genomes and fewer sources of genetic plasticity. Still, this plasticity is needed to respond to the challenges posed by the host. Members of the Brucella genus are facultative-extracellular intracellular bacteria responsible for causing brucellosis in a variety of mammals. The various species keep different host preferences, virulence, and zoonotic potential despite having 97–99% similarity at genome level. Here, we describe elements of genetic variation in Brucella ceti isolated from wildlife dolphins inhabiting the Pacific Ocean and the Mediterranean Sea. Comparison with isolates obtained from marine mammals from the Atlantic Ocean and the broader Brucella genus showed distinctive traits according to oceanic distribution and preferred host. Marine mammal isolates display genetic variability, represented by an important number of IS711 elements as well as specific IS711 and SNPs genomic distribution clustering patterns. Extensive pseudogenization was found among isolates from marine mammals as compared with terrestrial ones, causing degradation in pathways related to energy, transport of metabolites, and regulation/transcription. Brucella ceti isolates infecting dolphin hosts in particular showed further degradation of metabolite transport pathways as well as pathways related to cell wall/membrane/envelope biogenesis and motility. Thus, gene loss through pseudogenization is a source of genetic variation in Brucella, which, in turn, relates to adaptation to different hosts. This is relevant to understanding the natural history of bacterial diseases, their zoonotic potential, and the impact of human interventions such as domestication. PMID:28854602

  1. Evolutionary maintenance of selfish homing endonuclease genes in the absence of horizontal transfer

    PubMed Central

    Yahara, Koji; Fukuyo, Masaki; Sasaki, Akira; Kobayashi, Ichizo

    2009-01-01

    Homing endonuclease genes are “selfish” mobile genetic elements whose endonuclease promotes the spread of its own gene by creating a break at a specific target site and using the host machinery to repair the break by copying and inserting the gene at this site. Horizontal transfer across the boundary of a species or population within which mating takes place has been thought to be necessary for their evolutionary persistence. This is based on the assumption that they will become fixed in a host population, where opportunities of homing will disappear, and become susceptible to degeneration. To test this hypothesis, we modeled behavior of a homing endonuclease gene that moves during meiosis through double-strand break repair. We mathematically explored conditions for persistence of the homing endonuclease gene and elucidated their parameter dependence as phase diagrams. We found that, if the cost of the pseudogene is lower than that of the homing endonuclease gene, the 2 forms can persist in a population through autonomous periodic oscillation. If the cost of the pseudogene is higher, 2 types of dynamics appear that enable evolutionary persistence: bistability dependent on initial frequency or fixation irrespective of initial frequency. The prediction of long persistence in the absence of horizontal transfer was confirmed by stochastic simulations in finite populations. The average time to extinction of the endonuclease gene was found to be thousands of meiotic generations or more based on realistic parameter values. These results provide a solid theoretical basis for an understanding of these and other extremely selfish elements. PMID:19837694

  2. Seven New Complete Plastome Sequences Reveal Rampant Independent Loss of the ndh Gene Family across Orchids and Associated Instability of the Inverted Repeat/Small Single-Copy Region Boundaries

    PubMed Central

    Moore, Michael J.; Neubig, Kurt M.; Williams, Norris H.; Whitten, W. Mark; Kim, Joo-Hwan

    2015-01-01

    Earlier research has revealed that the ndh loci have been pseudogenized, truncated, or deleted from most orchid plastomes sequenced to date, including in all available plastomes of the two most species-rich subfamilies, Orchidoideae and Epidendroideae. This study sought to resolve deeper-level phylogenetic relationships among major orchid groups and to refine the history of gene loss in the ndh loci across orchids. The complete plastomes of seven orchids, Oncidium sphacelatum (Epidendroideae), Masdevallia coccinea (Epidendroideae), Sobralia callosa (Epidendroideae), Sobralia aff. bouchei (Epidendroideae), Elleanthus sodiroi (Epidendroideae), Paphiopedilum armeniacum (Cypripedioideae), and Phragmipedium longifolium (Cypripedioideae) were sequenced and analyzed in conjunction with all other available orchid and monocot plastomes. Most ndh loci were found to be pseudogenized or lost in Oncidium, Paphiopedilum and Phragmipedium, but surprisingly, all ndh loci were found to retain full, intact reading frames in Sobralia, Elleanthus and Masdevallia. Character mapping suggests that the ndh genes were present in the common ancestor of orchids but have experienced independent, significant losses at least eight times across four subfamilies. In addition, ndhF gene loss was correlated with shifts in the position of the junction of the inverted repeat (IR) and small single-copy (SSC) regions. The Orchidaceae have unprecedented levels of homoplasy in ndh gene presence/absence, which may be correlated in part with the unusual life history of orchids. These results also suggest that ndhF plays a role in IR/SSC junction stability. PMID:26558895

  3. Evolution of Olfactory Receptor Genes in Primates Dominated by Birth-and-Death Process

    PubMed Central

    Dong, Dong; He, Guimei; Zhang, Shuyi

    2009-01-01

    Olfactory receptors (ORs) are a large family of G protein–coupled receptors that detect odorants to generate the sense of smell. They constitute one of the largest multiple gene families in animals including primates. To better understand the variation in odor perception and evolution of OR genes among primates, we computationally identified OR gene repertoires in orangutans, marmosets, and mouse lemurs and investigated the birth-and-death process of OR genes in the primate lineage. The results showed that 1) all the primate species studied have no more than 400 intact OR genes, fewer than rodents and canines; 2) despite the similar number of OR genes in the genome, the makeup of the OR gene repertoires between different primate species is quite different as they had undergone dramatic birth-and-death evolution with extensive gene losses in the lineages leading to current species; 3) apes and Old World monkeys (OWMs) have similar fractions of pseudogenes, whereas New World monkeys (NWMs) have fewer pseudogenes. To measure the selective pressure that had affected the OR gene repertoires in primates, we compared the ratio of nonsynonymous to synonymous substitution rates by using 70 one-to-one orthologous quintets among five primate species. We found that OR genes showed more relaxed selective constraints in apes (humans, chimpanzees, and orangutans) than in OWMs (macaques) and NWMs (marmosets). We concluded that OR gene repertoires in primates have evolved in such a way as to adapt to their respective living environments. Differential selective constraints might play an important role in the OR gene evolution of each primate species. PMID:20333195

  4. Proximal 15q familial euchromatic variant and PWS/AS critical region duplication in the same patient: a cytogenetic pitfall.

    PubMed

    Carelle-Calmels, Nadège; Girard-Lemaire, Françoise; Guérin, Eric; Bieth, Eric; Rudolf, Gabrielle; Biancalana, Valérie; Pecheur, Hélène; Demil, Houria; Schneider, Thierry; de Saint-Martin, Anne; Caron, Olivier; Legrain, Michèle; Gaston, Valérie; Flori, Elisabeth

    2008-01-01

    Cytogenetically detectable elongation of the 15q proximal region can be associated with Prader-Willi/Angelman critical region interstitial duplications or with inherited juxtacentromeric euchromatic variants. The first category has been reported in association with developmental delay and autistic disorders. These pathogenic recurrent duplications are more frequently of maternal origin and originate from unequal meiotic crossovers between chromosome 15 low-copy repeats. 15q juxtacentromeric euchromatic variants reflect polymorphic copy number variations of segments containing pseudogenes and usually segregate without apparent phenotypic consequence. Pathogenic relevant 15q11-q13 duplications are not distinguishable from the innocuous euchromatic variants with conventional cytogenetic methods. We report cytogenetic and molecular studies of a patient with hypotonia, developmental delay and epilepsy, carrying, on the same chromosome 15, both a de novo 15q11-q13 interstitial duplication and an inherited 15q juxtacentromeric amplification from maternal origin. The duplication, initially suspected by fluorescent in situ hybridization (FISH), has been confirmed by molecular studies. The 15q juxtacentromeric region amplification, which segregates in the family for at least three generations, has been confirmed by FISH using BAC probes overlapping the NF1 and GABRA5 pseudogenes. This report emphasizes the importance to distinguish proximal 15q polymorphic variants from clinically significant duplications. In any patient with inherited 15q proximal variant but unexplained developmental delay suggesting 15q11-q13 pathology, a pathogenic rearrangement has to be searched with adapted strategies, in order to detect deletions as well as duplications of this region.

  5. Functional PMS2 Hybrid Alleles Containing a Pseudogene-Specific Missense Variant Trace Back to a Single Ancient Intrachromosomal Recombination Event

    PubMed Central

    Ganster, Christina; Wernstedt, Annekatrin; Kehrer-Sawatzki, Hildegard; Messiaen, Ludwine; Schmidt, Konrad; Rahner, Nils; Heinimann, Karl; Fonatsch, Christa; Zschocke, Johannes; Wimmer, Katharina

    2012-01-01

    Sequence exchange between PMS2 and its pseudogene PMS2CL, embedded in an inverted duplication on chromosome 7p22, has been reported to be an ongoing process that leads to functional PMS2 hybrid alleles containing PMS2- and PMS2CL-specific sequence variants at the 5′-and the 3′-end, respectively. The frequency of PMS2 hybrid alleles, their biological significance, and the mechanisms underlying their formation are largely unknown. Here we show that overall hybrid alleles account for one-third of 384 PMS2 alleles analyzed in individuals of different ethnic backgrounds. Depending on the population, 14–60% of hybrid alleles carry PMS2CL-specific sequences in exons 13–15, the remainder only in exon 15. We show that exons 13–15 hybrid alleles, named H1 hybrid alleles, constitute different haplotypes but trace back to a single ancient intrachromosomal recombination event with crossover. Taking advantage of an ancestral sequence variant specific for all H1 alleles we developed a simple gDNA-based polymerase chain reaction (PCR) assay that can be used to identify H1-allele carriers with high sensitivity and specificity (100 and 99%, respectively). Because H1 hybrid alleles harbor missense variant p.N775S of so far unknown functional significance, we assessed the H1-carrier frequency in 164 colorectal cancer patients. So far, we found no indication that the variant plays a major role with regard to cancer susceptibility. PMID:20186689

  6. Functional PMS2 hybrid alleles containing a pseudogene-specific missense variant trace back to a single ancient intrachromosomal recombination event.

    PubMed

    Ganster, Christina; Wernstedt, Annekatrin; Kehrer-Sawatzki, Hildegard; Messiaen, Ludwine; Schmidt, Konrad; Rahner, Nils; Heinimann, Karl; Fonatsch, Christa; Zschocke, Johannes; Wimmer, Katharina

    2010-05-01

    Sequence exchange between PMS2 and its pseudogene PMS2CL, embedded in an inverted duplication on chromosome 7p22, has been reported to be an ongoing process that leads to functional PMS2 hybrid alleles containing PMS2- and PMS2CL-specific sequence variants at the 5'-and the 3'-end, respectively. The frequency of PMS2 hybrid alleles, their biological significance, and the mechanisms underlying their formation are largely unknown. Here we show that overall hybrid alleles account for one-third of 384 PMS2 alleles analyzed in individuals of different ethnic backgrounds. Depending on the population, 14-60% of hybrid alleles carry PMS2CL-specific sequences in exons 13-15, the remainder only in exon 15. We show that exons 13-15 hybrid alleles, named H1 hybrid alleles, constitute different haplotypes but trace back to a single ancient intrachromosomal recombination event with crossover. Taking advantage of an ancestral sequence variant specific for all H1 alleles we developed a simple gDNA-based polymerase chain reaction (PCR) assay that can be used to identify H1-allele carriers with high sensitivity and specificity (100 and 99%, respectively). Because H1 hybrid alleles harbor missense variant p.N775S of so far unknown functional significance, we assessed the H1-carrier frequency in 164 colorectal cancer patients. So far, we found no indication that the variant plays a major role with regard to cancer susceptibility. (c) 2010 Wiley-Liss, Inc.

  7. Copy number variation at the 7q11.23 segmental duplications is a susceptibility factor for the Williams-Beuren syndrome deletion

    PubMed Central

    Cuscó, Ivon; Corominas, Roser; Bayés, Mònica; Flores, Raquel; Rivera-Brugués, Núria; Campuzano, Victoria; Pérez-Jurado, Luis A.

    2008-01-01

    Large copy number variants (CNVs) have been recently found as structural polymorphisms of the human genome of still unknown biological significance. CNVs are significantly enriched in regions with segmental duplications or low-copy repeats (LCRs). Williams-Beuren syndrome (WBS) is a neurodevelopmental disorder caused by a heterozygous deletion of contiguous genes at 7q11.23 mediated by nonallelic homologous recombination (NAHR) between large flanking LCRs and facilitated by a structural variant of the region, a ∼2-Mb paracentric inversion present in 20%–25% of WBS-transmitting progenitors. We now report that eight out of 180 (4.44%) WBS-transmitting progenitors are carriers of a CNV, displaying a chromosome with large deletion of LCRs. The prevalence of this CNV among control individuals and non-transmitting progenitors is much lower (1%, n = 600), thus indicating that it is a predisposing factor for the WBS deletion (odds ratio 4.6-fold, P = 0.002). LCR duplications were found in 2.22% of WBS-transmitting progenitors but also in 1.16% of controls, which implies a non–statistically significant increase in WBS-transmitting progenitors. We have characterized the organization and breakpoints of these CNVs, encompassing ∼100–300 kb of genomic DNA and containing several pseudogenes but no functional genes. Additional structural variants of the region have also been defined, all generated by NAHR between different blocks of segmental duplications. Our data further illustrate the highly dynamic structure of regions rich in segmental duplications, such as the WBS locus, and indicate that large CNVs can act as susceptibility alleles for disease-associated genomic rearrangements in the progeny. PMID:18292220
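
    The reported odds ratio can be reproduced approximately from the figures in the abstract. The sketch below assumes that "1%, n = 600" corresponds to 6 carriers among 600 controls; the abstract does not state the exact count, so that number and the function name are assumptions.

```python
# Minimal sketch: odds ratio for carrying the LCR-deletion CNV.
# Assumption (not stated exactly in the abstract): 1% of 600 controls
# is taken as 6 carriers.


def odds_ratio(case_pos, case_total, ctrl_pos, ctrl_total):
    """Odds ratio from a 2x2 table given carrier counts and group sizes."""
    case_neg = case_total - case_pos
    ctrl_neg = ctrl_total - ctrl_pos
    return (case_pos * ctrl_neg) / (case_neg * ctrl_pos)


if __name__ == "__main__":
    # 8 of 180 WBS-transmitting progenitors vs an assumed 6 of 600 controls.
    print(round(odds_ratio(8, 180, 6, 600), 2))  # about 4.6
```

    With those counts the calculation gives (8 × 594) / (172 × 6), which agrees with the 4.6-fold figure quoted in the abstract.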

  8. PrionHome: a database of prions and other sequences relevant to prion phenomena.

    PubMed

    Harbi, Djamel; Parthiban, Marimuthu; Gendoo, Deena M A; Ehsani, Sepehr; Kumar, Manish; Schmitt-Ulms, Gerold; Sowdhamini, Ramanathan; Harrison, Paul M

    2012-01-01

    Prions are units of propagation of an altered state of a protein or proteins; prions can propagate from organism to organism, through cooption of other protein copies. Prions contain no necessary nucleic acids, and are important both as pathogenic agents, and as a potential force in epigenetic phenomena. The original prions were derived from a misfolded form of the mammalian Prion Protein PrP. Infection by these prions causes neurodegenerative diseases. Other prions cause non-Mendelian inheritance in budding yeast, and sometimes act as diseases of yeast. We report the bioinformatic construction of the PrionHome, a database of >2000 prion-related sequences. The data was collated from various public and private resources and filtered for redundancy. The data was then processed according to a transparent classification system of prionogenic sequences (i.e., sequences that can make prions), prionoids (i.e., proteins that propagate like prions between individual cells), and other prion-related phenomena. There are eight PrionHome classifications for sequences. The first four classifications are derived from experimental observations: prionogenic sequences, prionoids, other prion-related phenomena, and prion interactors. The second four classifications are derived from sequence analysis: orthologs, paralogs, pseudogenes, and candidate-prionogenic sequences. Database entries list: supporting information for PrionHome classifications, prion-determinant areas (where relevant), and disordered and compositionally-biased regions. Also included are literature references for the PrionHome classifications, transcripts and genomic coordinates, and structural data (including comparative models made for the PrionHome from manually curated alignments). We provide database usage examples for both vertebrate and fungal prion contexts. Using the database data, we have performed a detailed analysis of the compositional biases in known budding-yeast prionogenic sequences, showing that the only abundant bias pattern is for asparagine bias with subsidiary serine bias. We anticipate that this database will be a useful experimental aid and reference resource. It is freely available at: http://libaio.biol.mcgill.ca/prion.

  9. PrionHome: A Database of Prions and Other Sequences Relevant to Prion Phenomena

    PubMed Central

    Harbi, Djamel; Parthiban, Marimuthu; Gendoo, Deena M. A.; Ehsani, Sepehr; Kumar, Manish; Schmitt-Ulms, Gerold; Sowdhamini, Ramanathan; Harrison, Paul M.

    2012-01-01

    Prions are units of propagation of an altered state of a protein or proteins; prions can propagate from organism to organism, through cooption of other protein copies. Prions contain no necessary nucleic acids, and are important both as pathogenic agents, and as a potential force in epigenetic phenomena. The original prions were derived from a misfolded form of the mammalian Prion Protein PrP. Infection by these prions causes neurodegenerative diseases. Other prions cause non-Mendelian inheritance in budding yeast, and sometimes act as diseases of yeast. We report the bioinformatic construction of the PrionHome, a database of >2000 prion-related sequences. The data was collated from various public and private resources and filtered for redundancy. The data was then processed according to a transparent classification system of prionogenic sequences (i.e., sequences that can make prions), prionoids (i.e., proteins that propagate like prions between individual cells), and other prion-related phenomena. There are eight PrionHome classifications for sequences. The first four classifications are derived from experimental observations: prionogenic sequences, prionoids, other prion-related phenomena, and prion interactors. The second four classifications are derived from sequence analysis: orthologs, paralogs, pseudogenes, and candidate-prionogenic sequences. Database entries list: supporting information for PrionHome classifications, prion-determinant areas (where relevant), and disordered and compositionally-biased regions. Also included are literature references for the PrionHome classifications, transcripts and genomic coordinates, and structural data (including comparative models made for the PrionHome from manually curated alignments). We provide database usage examples for both vertebrate and fungal prion contexts. Using the database data, we have performed a detailed analysis of the compositional biases in known budding-yeast prionogenic sequences, showing that the only abundant bias pattern is for asparagine bias with subsidiary serine bias. We anticipate that this database will be a useful experimental aid and reference resource. It is freely available at: http://libaio.biol.mcgill.ca/prion. PMID:22363733

  10. Comparative genomics of Clavibacter michiganensis subspecies, pathogens of important agricultural crops.

    PubMed

    Tambong, James T

    2017-01-01

    Subspecies of Clavibacter michiganensis are important phytobacterial pathogens causing devastating diseases in several agricultural crops. The genome organizations of these pathogens are poorly understood. Here, the complete genomes of 5 subspecies (C. michiganensis subsp. michiganensis, Cmm; C. michiganensis subsp. sepedonicus, Cms; C. michiganensis subsp. nebraskensis, Cmn; C. michiganensis subsp. insidiosus, Cmi and C. michiganensis subsp. capsici, Cmc) were analyzed. This study assessed the taxonomic position of the subspecies based on 16S rRNA and genome-based DNA homology and concludes that there is ample evidence to elevate some of the subspecies to species-level. Comparative genomics analysis indicated distinct genomic features evident on the DNA structural atlases and annotation features. Based on orthologous gene analysis, about 2300 CDSs are shared across all the subspecies; and Cms showed the highest number of subspecies-specific CDS, most of which are mobile elements suggesting that Cms could be more prone to translocation of foreign genes. Cms and Cmi had the highest number of pseudogenes, an indication of potential degenerating genomes. The stress response factors that may be involved in cold/heat shock, detoxification, oxidative stress, osmoregulation, and carbon utilization are outlined. For example, the wco-cluster encoding extracellular polysaccharide II is highly conserved while the sucrose-6-phosphate hydrolase that catalyzes the hydrolysis of sucrose-6-phosphate yielding glucose-6-phosphate and fructose is highly divergent. A unique second form of the enzyme is only present in Cmn NCPPB 2581. Also, twenty-eight plasmid-borne CDSs in the other subspecies were found to have homologues in the chromosomal genome of Cmn which is known not to carry plasmids. These CDSs include pathogenesis-related factors such as Endocellulases E1 and Beta-glucosidase. The results presented here provide insight into the functional organization of the genomes of five core C. michiganensis subspecies, enabling a better understanding of these phytobacteria.

  11. Comparative genomics of Clavibacter michiganensis subspecies, pathogens of important agricultural crops

    PubMed Central

    2017-01-01

    Subspecies of Clavibacter michiganensis are important phytobacterial pathogens causing devastating diseases in several agricultural crops. The genome organizations of these pathogens are poorly understood. Here, the complete genomes of 5 subspecies (C. michiganensis subsp. michiganensis, Cmm; C. michiganensis subsp. sepedonicus, Cms; C. michiganensis subsp. nebraskensis, Cmn; C. michiganensis subsp. insidiosus, Cmi and C. michiganensis subsp. capsici, Cmc) were analyzed. This study assessed the taxonomic position of the subspecies based on 16S rRNA and genome-based DNA homology and concludes that there is ample evidence to elevate some of the subspecies to species-level. Comparative genomics analysis indicated distinct genomic features evident on the DNA structural atlases and annotation features. Based on orthologous gene analysis, about 2300 CDSs are shared across all the subspecies; and Cms showed the highest number of subspecies-specific CDS, most of which are mobile elements suggesting that Cms could be more prone to translocation of foreign genes. Cms and Cmi had the highest number of pseudogenes, an indication of potential degenerating genomes. The stress response factors that may be involved in cold/heat shock, detoxification, oxidative stress, osmoregulation, and carbon utilization are outlined. For example, the wco-cluster encoding extracellular polysaccharide II is highly conserved while the sucrose-6-phosphate hydrolase that catalyzes the hydrolysis of sucrose-6-phosphate yielding glucose-6-phosphate and fructose is highly divergent. A unique second form of the enzyme is only present in Cmn NCPPB 2581. Also, twenty-eight plasmid-borne CDSs in the other subspecies were found to have homologues in the chromosomal genome of Cmn which is known not to carry plasmids. These CDSs include pathogenesis-related factors such as Endocellulases E1 and Beta-glucosidase. The results presented here provide insight into the functional organization of the genomes of five core C. michiganensis subspecies, enabling a better understanding of these phytobacteria. PMID:28319117

  12. A Dual Origin of the Xist Gene from a Protein-Coding Gene and a Set of Transposable Elements

    PubMed Central

    Elisaphenko, Eugeny A.; Kolesnikov, Nikolay N.; Shevchenko, Alexander I.; Rogozin, Igor B.; Nesterova, Tatyana B.; Brockdorff, Neil; Zakian, Suren M.

    2008-01-01

    X-chromosome inactivation, which occurs in female eutherian mammals, is controlled by a complex X-linked locus termed the X-inactivation center (XIC). Previously it was proposed that genes of the XIC evolved, at least in part, as a result of pseudogenization of protein-coding genes. In this study we show that the key XIC gene Xist, which displays fragmentary homology to a protein-coding gene Lnx3, emerged de novo in early eutherians by integration of mobile elements which gave rise to simple tandem repeats. The Xist gene promoter region and four out of ten exons found in eutherians retain homology to exons of the Lnx3 gene. The remaining six Xist exons, including those with simple tandem repeats detectable in their structure, have similarity to different transposable elements. Integration of mobile elements into Xist accompanies the overall evolution of the gene and presumably continues in contemporary eutherian species. Additionally we showed that the combination of remnants of protein-coding sequences and mobile elements is not unique to the Xist gene and is found in other XIC genes producing non-coding nuclear RNA. PMID:18575625

  13. Plastome Evolution in Hemiparasitic Mistletoes

    PubMed Central

    Petersen, Gitte; Cuenca, Argelia; Seberg, Ole

    2015-01-01

    Santalales is an order of plants consisting almost entirely of parasites. Some, such as Osyris, are facultative root parasites whereas others, such as Viscum, are obligate stem parasitic mistletoes. Here, we report the complete plastome sequences of one species of Osyris and three species of Viscum, and we investigate the evolutionary aspects of structural changes and changes in gene content in relation to parasitism. Compared with typical angiosperm plastomes, the four Santalales plastomes are all reduced in size (10–22% compared with Vitis), and they have experienced rearrangements, mostly but not exclusively in the border areas of the inverted repeats. Additionally, a number of protein-coding genes (matK, infA, ccsA, rpl33, and all 11 ndh genes) as well as two transfer RNA genes (trnG-UCC and trnV-UAC) have been pseudogenized or completely lost. Most of the remaining plastid genes have a significantly changed selection pattern compared with other dicots, and the relaxed selection of photosynthesis genes is noteworthy. Although gene loss obviously reduces plastome size, intergenic regions were also shortened. As plastome modifications are generally most prominent in Viscum, they are most likely correlated with the increased nutritional dependence on the host compared with Osyris. PMID:26319577

  14. Intra-isolate genome variation in arbuscular mycorrhizal fungi persists in the transcriptome.

    PubMed

    Boon, E; Zimmerman, E; Lang, B F; Hijri, M

    2010-07-01

    Arbuscular mycorrhizal fungi (AMF) are heterokaryotes with an unusual genetic makeup. Substantial genetic variation occurs among nuclei within a single mycelium or isolate. AMF reproduce through spores that contain varying fractions of this heterogeneous population of nuclei. It is not clear whether this genetic variation on the genome level actually contributes to the AMF phenotype. To investigate the extent to which polymorphisms in nuclear genes are transcribed, we analysed the intra-isolate genomic and cDNA sequence variation of two genes, the large subunit ribosomal RNA (LSU rDNA) of Glomus sp. DAOM-197198 (previously known as G. intraradices) and the POL1-like sequence (PLS) of Glomus etunicatum. For both genes, we find high sequence variation at the genome and transcriptome level. Reconstruction of LSU rDNA secondary structure shows that all variants are functional. Patterns of PLS sequence polymorphism indicate that there is one functional gene copy, PLS2, which is preferentially transcribed, and one gene copy, PLS1, which is a pseudogene. This is the first study that investigates AMF intra-isolate variation at the transcriptome level. In conclusion, it is possible that, in AMF, multiple nuclear genomes contribute to a single phenotype.

  15. Comparative sequence analysis of the X-inactivation center region in mouse, human, and bovine.

    PubMed

    Chureau, Corinne; Prissette, Marine; Bourdet, Agnès; Barbe, Valérie; Cattolico, Laurence; Jones, Louis; Eggen, André; Avner, Philip; Duret, Laurent

    2002-06-01

    We have sequenced to high levels of accuracy 714-kb and 233-kb regions of the mouse and bovine X-inactivation centers (Xic), respectively, centered on the Xist gene. This has provided the basis for a fully annotated comparative analysis of the mouse Xic with the 2.3-Mb orthologous region in human and has allowed a three-way species comparison of the core central region, including the Xist gene. These comparisons have revealed conserved genes, both coding and noncoding, conserved CpG islands and, more surprisingly, conserved pseudogenes. The distribution of repeated elements, especially LINE repeats, in the mouse Xic region when compared to the rest of the genome does not support the hypothesis of a role for these repeat elements in the spreading of X inactivation. Interestingly, an asymmetric distribution of LINE elements on the two DNA strands was observed in the three species, not only within introns but also in intergenic regions. This feature is suggestive of important transcriptional activity within these intergenic regions. In silico prediction followed by experimental analysis has allowed four new genes, Cnbp2, Ftx, Jpx, and Ppnx, to be identified and novel, widespread, complex, and apparently noncoding transcriptional activity to be characterized in a region 5' of Xist that was recently shown to attract histone modification early after the onset of X inactivation.

  16. Tuberculosis patients co-infected with Mycobacterium bovis and Mycobacterium tuberculosis in an urban area of Brazil.

    PubMed

    Silva, Marcio Roberto; Rocha, Adalgiza da Silva; da Costa, Ronaldo Rodrigues; de Alencar, Andrea Padilha; de Oliveira, Vania Maria; Fonseca Júnior, Antônio Augusto; Sales, Mariana Lázaro; Issa, Marina de Azevedo; Filho, Paulo Martins Soares; Pereira, Omara Tereza Vianello; dos Santos, Eduardo Calazans; Mendes, Rejane Silva; Ferreira, Angela Maria de Jesus; Mota, Pedro Moacyr Pinto Coelho; Suffys, Philip Noel; Guimarães, Mark Drew Crosland

    2013-05-01

    In this cross-sectional study, mycobacteria specimens from 189 tuberculosis (TB) patients living in an urban area in Brazil were characterised from 2008-2010 using phenotypic and molecular speciation methods (pncA gene and oxyR pseudogene analysis). Of these samples, 174 isolates simultaneously grew on Löwenstein-Jensen (LJ) and Stonebrink (SB)-containing media and presented phenotypic and molecular profiles of Mycobacterium tuberculosis, whereas 12 had molecular profiles of M. tuberculosis based on the DNA analysis of formalin-fixed paraffin wax-embedded tissue samples (paraffin blocks). One patient produced two sputum isolates, the first of which simultaneously grew on LJ and SB media and presented phenotypic and molecular profiles of M. tuberculosis, and the second of which only grew on SB media and presented phenotypic profiles of Mycobacterium bovis. One patient provided a bronchial lavage isolate, which simultaneously grew on LJ and SB media and presented phenotypic and molecular profiles of M. tuberculosis, but had molecular profiles of M. bovis from paraffin block DNA analysis, and one sample had molecular profiles of M. tuberculosis and M. bovis identified from two distinct paraffin blocks. Moreover, we found a low prevalence (1.6%) of M. bovis among these isolates, which suggests that local health service procedures likely underestimate its real frequency and that it deserves more attention from public health officials.

  17. Characterisation of Antigen B Protein Species Present in the Hydatid Cyst Fluid of Echinococcus canadensis G7 Genotype

    PubMed Central

    Folle, Ana Maite; Kitano, Eduardo S.; Lima, Analía; Gil, Magdalena; Cucher, Marcela; Mourglia-Ettlin, Gustavo; Iwai, Leo K.; Rosenzvit, Mara; Batthyány, Carlos

    2017-01-01

    The larva of cestodes belonging to the Echinococcus granulosus sensu lato (s.l.) complex causes cystic echinococcosis (CE). It is a globally distributed zoonosis with significant economic and public health impact. The most immunogenic and specific Echinococcus-genus antigen for human CE diagnosis is antigen B (AgB), an abundant lipoprotein of the hydatid cyst fluid (HF). The AgB protein moiety (apolipoprotein) is encoded by five genes (AgB1-AgB5), which generate mature 8 kDa proteins (AgB8/1-AgB8/5). These genes seem to be differentially expressed among Echinococcus species. Since AgB immunogenicity lies on its protein moiety, differences in AgB expression within E. granulosus s.l. complex might have diagnostic and epidemiological relevance for discriminating the contribution of distinct species to human CE. Interestingly, AgB2 was proposed as a pseudogene in E. canadensis, which is the second most common cause of human CE, but proteomic studies for verifying it have not been performed yet. Herein, we analysed the protein and lipid composition of AgB obtained from fertile HF of swine origin (E. canadensis G7 genotype). AgB apolipoproteins were identified and quantified using mass spectrometry tools. Results showed that AgB8/1 was the major protein component, representing 71% of total AgB apolipoproteins, followed by AgB8/4 (15.5%), AgB8/3 (13.2%) and AgB8/5 (0.3%). AgB8/2 was not detected. As a methodological control, a parallel analysis detected all AgB apolipoproteins in bovine fertile HF (G1/3/5 genotypes). Overall, E. canadensis AgB comprised mostly AgB8/1 together with a heterogeneous mixture of lipids, and AgB8/2 was not detected despite using high sensitivity proteomic techniques. This endorses genomic data supporting that AgB2 behaves as a pseudogene in G7 genotype. Since recombinant AgB8/2 has been found to be diagnostically valuable for human CE, our findings indicate that its use as antigen in immunoassays could contribute to false negative results in areas where E. canadensis circulates. Furthermore, the presence of anti-AgB8/2 antibodies in serum may represent a useful parameter to rule out E. canadensis infection when human CE is diagnosed. PMID:28045899

  18. Characterisation of Antigen B Protein Species Present in the Hydatid Cyst Fluid of Echinococcus canadensis G7 Genotype.

    PubMed

    Folle, Ana Maite; Kitano, Eduardo S; Lima, Analía; Gil, Magdalena; Cucher, Marcela; Mourglia-Ettlin, Gustavo; Iwai, Leo K; Rosenzvit, Mara; Batthyány, Carlos; Ferreira, Ana María

    2017-01-01

    The larva of cestodes belonging to the Echinococcus granulosus sensu lato (s.l.) complex causes cystic echinococcosis (CE). It is a globally distributed zoonosis with significant economic and public health impact. The most immunogenic and specific Echinococcus-genus antigen for human CE diagnosis is antigen B (AgB), an abundant lipoprotein of the hydatid cyst fluid (HF). The AgB protein moiety (apolipoprotein) is encoded by five genes (AgB1-AgB5), which generate mature 8 kDa proteins (AgB8/1-AgB8/5). These genes seem to be differentially expressed among Echinococcus species. Since AgB immunogenicity lies on its protein moiety, differences in AgB expression within E. granulosus s.l. complex might have diagnostic and epidemiological relevance for discriminating the contribution of distinct species to human CE. Interestingly, AgB2 was proposed as a pseudogene in E. canadensis, which is the second most common cause of human CE, but proteomic studies for verifying it have not been performed yet. Herein, we analysed the protein and lipid composition of AgB obtained from fertile HF of swine origin (E. canadensis G7 genotype). AgB apolipoproteins were identified and quantified using mass spectrometry tools. Results showed that AgB8/1 was the major protein component, representing 71% of total AgB apolipoproteins, followed by AgB8/4 (15.5%), AgB8/3 (13.2%) and AgB8/5 (0.3%). AgB8/2 was not detected. As a methodological control, a parallel analysis detected all AgB apolipoproteins in bovine fertile HF (G1/3/5 genotypes). Overall, E. canadensis AgB comprised mostly AgB8/1 together with a heterogeneous mixture of lipids, and AgB8/2 was not detected despite using high sensitivity proteomic techniques. This endorses genomic data supporting that AgB2 behaves as a pseudogene in G7 genotype. Since recombinant AgB8/2 has been found to be diagnostically valuable for human CE, our findings indicate that its use as antigen in immunoassays could contribute to false negative results in areas where E. canadensis circulates. Furthermore, the presence of anti-AgB8/2 antibodies in serum may represent a useful parameter to rule out E. canadensis infection when human CE is diagnosed.

  19. Draft genome sequence of a KPC-2-producing Klebsiella pneumoniae ST340 carrying blaCTX-M-15 and blaCTX-M-59 genes: a rich genome of mobile genetic elements and genes encoding antibiotic resistance.

    PubMed

    Casella, Tiago; de Morais, Andressa Batista Zequini; de Paula Barcelos, Diego Diniz; Tolentino, Fernanda Modesto; Cerdeira, Louise Teixeira; Bueno, Maria Fernanda Campagnari; Francisco, Gabriela Rodrigues; de Andrade, Leonardo Neves; da Costa Darini, Ana Lucia; de Oliveira Garcia, Doroti; Lincopan, Nilton; Nogueira, Mara Corrêa Lelles

    2018-06-01

    Klebsiella pneumoniae is considered an opportunistic pathogen and an important agent of nosocomial and community infections. It presents the ability to capture and harbour several antimicrobial resistance genes and, in this context, the extensive use of carbapenems to treat serious infections has been responsible for the selection of several resistance genes. This study reports the draft genome sequence of a KPC-2-producing K. pneumoniae strain (Kp10) simultaneously harbouring blaCTX-M-15 and blaCTX-M-59 genes isolated from urine culture of a patient with Parkinson's disease. Classical microbiological methods were applied to isolate and identify the strain, and PCR and sequencing were used to identify and characterise the genes and the genetic environment. Whole-genome sequencing (WGS) was performed using a Nextera XT DNA library and a NextSeq platform. WGS analysis revealed the presence of 5915 coding genes, 46 RNA-encoding genes and 255 pseudogenes. Kp10 belonged to sequence type 340 (ST340) of clonal complex 258 (CC258) and carried 20 transferable genes associated with antimicrobial resistance, comprising seven drug classes. Although the simultaneous presence of different blaCTX-M genes in the same strain is rarely reported, the blaKPC-2, blaCTX-M-15 and blaCTX-M-59 genes were not associated with the same genetic mobile structure in Kp10. These results confirm the capacity of K. pneumoniae to harbour several antimicrobial resistance genes. Thus, this draft genome could help in future epidemiological studies regarding the dissemination of clinically relevant resistance genes. Copyright © 2018 International Society for Chemotherapy of Infection and Cancer. Published by Elsevier Ltd. All rights reserved.

  20. Analysis of the arabinoxylan arabinofuranohydrolase gene family in barley does not support their involvement in the remodelling of endosperm cell walls during development.

    PubMed

    Laidlaw, Hunter K C; Lahnstein, Jelle; Burton, Rachel A; Fincher, Geoffrey B; Jobling, Stephen A

    2012-05-01

    Arabinoxylan arabinofuranohydrolases (AXAHs) are family GH51 enzymes that have been implicated in the removal of arabinofuranosyl residues from the (1,4)-β-xylan backbone of heteroxylans. Five genes encoding barley AXAHs range in size from 4.6 kb to 7.1 kb and each contains 16 introns. The barley HvAXAH genes map to chromosomes 2H, 4H, and 5H. A small cluster of three HvAXAH genes is located on chromosome 4H and there is evidence for gene duplication and the presence of pseudogenes in barley. The cDNAs corresponding to barley and wheat AXAH genes were cloned, and transcript levels of the genes were profiled across a range of tissues at different developmental stages. Two HvAXAH cDNAs that were successfully expressed in Nicotiana benthamiana leaves exhibited similar activities against 4-nitrophenyl α-L-arabinofuranoside, but HvAXAH2 activity was significantly higher against wheat flour arabinoxylan, compared with HvAXAH1. HvAXAH2 also displayed activity against (1,5)-α-L-arabinopentaose and debranched arabinan. Western blotting with an anti-HvAXAH antibody was used to define further the locations of the AXAH enzymes in developing barley grain, where high levels were detected in the outer layers of the grain but little or no protein was detected in the endosperm. The chromosomal locations of the genes do not correspond to any previously identified genomic regions shown to influence heteroxylan structure. The data are therefore consistent with a role for AXAH in depolymerizing arabinoxylans in maternal tissues during grain development, but do not provide compelling evidence for a role in remodelling arabinoxylans during endosperm or coleoptile development in barley as previously proposed.

  1. Systematic analysis and evolution of 5S ribosomal DNA in metazoans.

    PubMed

    Vierna, J; Wehner, S; Höner zu Siederdissen, C; Martínez-Lage, A; Marz, M

    2013-11-01

    Several studies on 5S ribosomal DNA (5S rDNA) have been focused on a subset of the following features in mostly one organism: number of copies, pseudogenes, secondary structure, promoter and terminator characteristics, genomic arrangements, types of non-transcribed spacers and evolution. In this work, we systematically analyzed 5S rDNA sequence diversity in available metazoan genomes, and showed organism-specific and evolutionary-conserved features. Putatively functional sequences (12,766) from 97 organisms allowed us to identify general features of this multigene family in animals. Interestingly, we show that each mammal species has a highly conserved (housekeeping) 5S rRNA type and many variable ones. The genomic organization of 5S rDNA is still under debate. Here, we report the occurrence of several paralog 5S rRNA sequences in 58 of the examined species, and a flexible genome organization of 5S rDNA in animals. We found heterogeneous 5S rDNA clusters in several species, supporting the hypothesis of an exchange of 5S rDNA from one locus to another. A rather high degree of variation of upstream, internal and downstream putative regulatory regions appears to characterize metazoan 5S rDNA. We systematically studied the internal promoters and described three different types of termination signals, as well as variable distances between the coding region and the typical termination signal. Finally, we present a statistical method for detection of linkage among noncoding RNA (ncRNA) gene families. This method showed no evolutionary-conserved linkage among 5S rDNAs and any other ncRNA genes within Metazoa, even though we found 5S rDNA to be linked to various ncRNAs in several clades.

  2. Systematic analysis and evolution of 5S ribosomal DNA in metazoans

    PubMed Central

    Vierna, J; Wehner, S; Höner zu Siederdissen, C; Martínez-Lage, A; Marz, M

    2013-01-01

    Several studies on 5S ribosomal DNA (5S rDNA) have been focused on a subset of the following features in mostly one organism: number of copies, pseudogenes, secondary structure, promoter and terminator characteristics, genomic arrangements, types of non-transcribed spacers and evolution. In this work, we systematically analyzed 5S rDNA sequence diversity in available metazoan genomes, and showed organism-specific and evolutionary-conserved features. Putatively functional sequences (12 766) from 97 organisms allowed us to identify general features of this multigene family in animals. Interestingly, we show that each mammal species has a highly conserved (housekeeping) 5S rRNA type and many variable ones. The genomic organization of 5S rDNA is still under debate. Here, we report the occurrence of several paralog 5S rRNA sequences in 58 of the examined species, and a flexible genome organization of 5S rDNA in animals. We found heterogeneous 5S rDNA clusters in several species, supporting the hypothesis of an exchange of 5S rDNA from one locus to another. A rather high degree of variation of upstream, internal and downstream putative regulatory regions appears to characterize metazoan 5S rDNA. We systematically studied the internal promoters and described three different types of termination signals, as well as variable distances between the coding region and the typical termination signal. Finally, we present a statistical method for detection of linkage among noncoding RNA (ncRNA) gene families. This method showed no evolutionary-conserved linkage among 5S rDNAs and any other ncRNA genes within Metazoa, even though we found 5S rDNA to be linked to various ncRNAs in several clades. PMID:23838690

  3. Characterisation of a collagen gene subfamily from the potato cyst nematode Globodera pallida.

    PubMed

    Gray, L J; Curtis, R H; Jones, J T

    2001-01-24

    We have isolated two full-length genomic DNA sequences, which encode the cuticle collagen proteins GP-COL-1 and GP-COL-2, from the potato cyst nematode Globodera pallida. A third, partial collagen gene ORF termed gp-col-t (t = truncated) has also been isolated and appears to represent an unexpressed pseudogene. The gp-col-1 and gp-col-2 genes both contain three short (<97 bp) introns which disrupt coding regions predicted to specify proteins with molecular weights of 33 and 32.7 kDa respectively. All three sequences show high similarity to each other and to the previously isolated G. pallida cDNA clone gp-col-8. The conserved pattern of cysteine residues and non-(Gly-X-Y)(n) region sequence similarity observed in all four G. pallida genes suggests that these molecules form part of the same subfamily of collagens. Southern analysis indicates that this subfamily is likely to contain further members. The G. pallida collagen sequences show striking similarity to twelve genes from Caenorhabditis elegans which collectively represent the recently classified Group 1a collagen subfamily. No data exists on the function of this subfamily in C. elegans. gp-col-1 and gp-col-2 are developmentally regulated with transcripts of both genes detected in adult virgin and gravid females but not in pre-parasitic second stage juveniles. A similar expression pattern is observed for the Group 1a collagen lemmi 5 from Meloidogyne incognita perhaps indicating a generic link between subfamily and function during the various changes in cuticular structure which accompany nematode growth and reproduction. Immunochemical studies indicate that the GP-COL-1 protein is specifically located in the hypodermis of G. pallida adult females.

  4. Homoeologous cloning of omega-secalin gene family in a wheat 1BL/1RS translocation.

    PubMed

    Chai, Jian Fang; Liu, Xu; Jia, Ji Zeng

    2005-08-01

    Wheat 1BL/1RS translocations are widely planted in China, as well as in most wheat-producing areas of the world, for their disease resistance and high yield. However, 1BL/1RS translocations have poor bread-making quality, caused in part by a family of small monomeric proteins, omega-secalins, which are encoded by genes on 1RS. Based on the published sequence of a rye omega-secalin gene, we designed a pair of primers to cover the whole mature protein coding sequence. A major band could be amplified from 1BL/1RS translocations but not from euploid wheat. Using this primer set, we conducted PCR amplification with high-fidelity Pfu polymerase on the genomic DNAs and cDNAs purified from the 1BL/1RS translocation line Lankao 906. Sequencing analysis indicated that this gene family contains several members, with genes of 1150 bp, 1076 bp, 1075 bp, 1052 bp and 1004 bp, including two pseudogenes and three active genes. The gene transcripts were differentially expressed in developing seeds.

  5. Extracellular RNA profiles with human age.

    PubMed

    Dluzen, Douglas F; Noren Hooten, Nicole; De, Supriyo; Wood, William H; Zhang, Yongqing; Becker, Kevin G; Zonderman, Alan B; Tanaka, Toshiko; Ferrucci, Luigi; Evans, Michele K

    2018-05-24

    Circulating extracellular RNAs (exRNAs) are potential biomarkers of disease. We thus hypothesized that age-related changes in exRNAs can identify age-related processes. We profiled both large and small RNAs in human serum to investigate changes associated with normal aging. exRNA was sequenced in 13 young (30-32 years) and 10 old (80-85 years) African American women to identify all RNA transcripts present in serum. We identified age-related differences in several RNA biotypes, including mitochondrial transfer RNAs, mitochondrial ribosomal RNA, and unprocessed pseudogenes. Age-related differences in unique RNA transcripts were further validated in an expanded cohort. Pathway analysis revealed that EIF2 signaling, oxidative phosphorylation, and mitochondrial dysfunction were among the top pathways shared between young and old. Protein interaction networks revealed distinct clusters of functionally-related protein-coding genes in both age groups. These data provide timely and relevant insight into the exRNA repertoire in serum and its change with aging. Published 2018. This article is a U.S. Government work and is in the public domain in the USA. Aging Cell published by the Anatomical Society and John Wiley & Sons Ltd.

  6. Looking back on a decade of barcoding crustaceans

    PubMed Central

    Raupach, Michael J.; Radulovici, Adriana E.

    2015-01-01

    Species identification represents a pivotal component for large-scale biodiversity studies and conservation planning but represents a challenge for many taxa when using morphological traits only. Consequently, alternative identification methods based on molecular markers have been proposed. In this context, DNA barcoding has become a popular and accepted method for the identification of unknown animals across all life stages by comparison to a reference library. In this review we examine the progress of barcoding studies for the Crustacea using the Web of Science database from 2003 to 2014. All references were classified in terms of taxonomy covered, subject area (identification/library, genetic variability, species descriptions, phylogenetics, methods, pseudogenes/numts), habitat, geographical area, authors, journals, citations, and the use of the Barcode of Life Data Systems (BOLD). Our analysis revealed a total number of 164 barcoding studies for crustaceans with a preference for malacostracan crustaceans, in particular Decapoda, and for building reference libraries in order to identify organisms. So far, BOLD did not establish itself as a popular informatics platform among carcinologists although it offers many advantages for standardized data storage, analyses and publication. PMID:26798245

  7. Identification and functional analysis of the NLP-encoding genes from the phytopathogenic oomycete Phytophthora capsici.

    PubMed

    Chen, Xiao-Ren; Huang, Shen-Xin; Zhang, Ye; Sheng, Gui-Lin; Li, Yan-Peng; Zhu, Feng

    2018-03-23

    Phytophthora capsici is a hemibiotrophic, phytopathogenic oomycete that infects a wide range of crops, resulting in significant economic losses worldwide. By means of a diverse arsenal of secreted effector proteins, hemibiotrophic pathogens may manipulate plant cell death to establish a successful infection and colonization. In this study, we described the analysis of the gene family encoding necrosis- and ethylene-inducing peptide 1 (Nep1)-like proteins (NLPs) in P. capsici, and identified 39 real NLP genes and 26 NLP pseudogenes. Out of the 65 predicted NLP genes, 48 occur in groups with two or more genes, whereas the remainder appears to be singletons distributed randomly among the genome. Phylogenetic analysis of the 39 real NLPs delineated three groups. Key residues/motif important for the effector activities are degenerated in most NLPs, including the nlp24 peptide consisting of the conserved region I (11-aa immunogenic part) and conserved region II (the heptapeptide GHRHDWE motif) that is important for phytotoxic activity. Transcriptional profiling of eight selected NLP genes indicated that they were differentially expressed during the developmental and plant infection phases of P. capsici. Functional analysis of ten cloned NLPs demonstrated that Pc11951, Pc107869, Pc109174 and Pc118548 were capable of inducing cell death in the Solanaceae, including Nicotiana benthamiana and hot pepper. This study provides an overview of the P. capsici NLP gene family, laying a foundation for further elucidating the pathogenicity mechanism of this devastating pathogen.

  8. Organization and transient expression of the gene for human U11 snRNA

    PubMed Central

    Suter-Crazzolara, Clemens; Keller, Walter

    1991-01-01

    The nucleotide sequence of U11 small nuclear RNA, a minor U RNA from HeLa cells, was determined. Computer analysis of the sequence (135 residues) predicts two strong hairpin loops which are separated by seventeen nucleotides containing an Sm binding site (AAUUUUUUGG). A synthetic gene was constructed in which the coding region of U11 RNA is under the control of a T7 promoter. This vector can be used to produce U11 RNA in vitro. Southern hybridization and PCR analysis of HeLa genomic DNA suggest that U11 RNA is encoded by a single copy gene, and that at least three genomic regions could be U11 RNA pseudogenes. A HeLa genomic copy of a U11 gene was isolated by inverted PCR. This gene contains the U11 RNA coding sequence and several sequence elements unique for the U RNA genes. These include a Distal Sequence Element (DSE, ATTTGCATA) present between positions −215 and −223 relative to the start of transcription; a Proximal Sequence Element (PSE, TTCACCTTTACCAAAAATG) located between positions −43 and −63; and a 3′ box (GTTAGGCGAAATATTA) between positions +150 and +166. Transfection of HeLa cells with this gene revealed that it is functioning in vivo and can produce U11 RNA. PMID:1820214

  9. "Orphan" retrogenes in the human genome.

    PubMed

    Ciomborowska, Joanna; Rosikiewicz, Wojciech; Szklarczyk, Damian; Makałowski, Wojciech; Makałowska, Izabela

    2013-02-01

    Gene duplicates generated via retroposition were long thought to be pseudogenized and consequently decayed. However, a significant number of these genes escaped their evolutionary destiny and evolved into functional genes. Despite multiple studies, the number of functional retrogenes in human and other genomes remains unclear. We performed a comparative analysis of human, chicken, and worm genomes to identify "orphan" retrogenes, that is, retrogenes that have replaced their progenitors. We located 25 such candidates in the human genome. All of these genes were previously known, and the majority has been intensively studied. Despite this, they have never been recognized as retrogenes. Analysis revealed that the phenomenon of replacing parental genes with their retrocopies has been taking place over the entire span of animal evolution. This process was often species specific and contributed to interspecies differences. Surprisingly, these retrogenes, which should evolve in a more relaxed mode, are subject to a very strong purifying selection, which is, on average, two and a half times stronger than other human genes. Also, for retrogenes, they do not show a typical overall tendency for a testis-specific expression. Notably, seven of them are associated with human diseases. Recognizing them as "orphan" retrocopies, which have different regulatory machinery than their parents, is important for any disease studies in model organisms, especially when discoveries made in one species are transferred to humans.

  10. Identification of members of the gonadotropin-releasing hormone (GnRH), corticotropin-releasing factor (CRF) families in the genome of the holocephalan, Callorhinchus milii (elephant shark).

    PubMed

    Nock, Tanya G; Chand, Dhan; Lovejoy, David A

    2011-04-01

    The gonadotropin-releasing hormone (GnRH) and corticotropin-releasing factor (CRF) families are two neuropeptide families that are strongly conserved throughout evolution. Recently, the genome of the holocephalan, Callorhinchus milii (elephant shark) has been sequenced. The phylogenetic position of C. milii, along with the relatively slow evolution of the cartilaginous fish suggests that neuropeptides in this species may resemble the earliest gnathostome forms. The genome of the elephant shark was screened, in silico, using the various conserved motifs of both the vertebrate CRF paralogs and the insect diuretic hormone sequences to identify the structure of the C. milii CRF/DH-like peptides. A similar approach was taken to identify the GnRH peptides using conserved motifs in both vertebrate and invertebrate forms. Two CRF peptides, a urotensin-1 peptide and a urocortin 3 peptide were found in the genome. There was only about 50% sequence identity between the two CRF peptides suggesting an early divergence. In addition, the urocortin 2 peptide seems to have been lost and was identified as a pseudogene in C. milii. In contrast to the number of CRF family peptides, only a GnRH-II preprohormone with the conserved mature decapeptide was found. This confirms early studies about the identity of GnRH in the Holocephali, and suggests that the Holocephali and Elasmobranchii differ with respect to GnRH structure and function. Copyright © 2011 Elsevier Inc. All rights reserved.

  11. Wheat CBF gene family: identification of polymorphisms in the CBF coding sequence.

    PubMed

    Mohseni, Sara; Che, Hua; Djillali, Zakia; Dumont, Estelle; Nankeu, Joseph; Danyluk, Jean

    2012-12-01

    Expression of cold-regulated genes needed for protection against freezing stress is mediated, in part, by the CBF transcription factor family. Previous studies with temperate cereals suggested that the CBF gene family in wheat was large, and that CBF genes were at the base of an important low temperature tolerance trait. Therefore, the goal of our study was to identify the CBF repertoire in the freezing-tolerant hexaploid wheat cultivar Norstar, and then to examine if the coding region of CBF genes in two spring cultivars contain polymorphisms that could affect the protein sequence and structure. Our analyses reveal that hexaploid wheat contains a complex CBF family consisting of at least 65 CBF genes of which 60 are known to be expressed in the cultivar Norstar. They represent 27 paralogous genes with 1-3 homeologous copies for the A, B, and D genomes. The cultivar Norstar contains two pseudogenes and at least 24 additional proteins having sequences and (or) structures that deviate from the consensus in the conserved AP2 DNA-binding and (or) C-terminal activation-domains. This suggests that in cultivars such as Norstar, low temperature tolerance may be increased through breeding of additional optimal alleles. The examination of the CBF repertoire present in the two spring cultivars, Chinese Spring and Manitou, reveals that they have additional polymorphisms affecting conserved positions in these domains. Understanding the effects of these polymorphisms will provide additional information for the selection of optimum CBF alleles in Triticeae breeding programs.

  12. A set of highly conserved RNA-binding proteins, alphaCP-1 and alphaCP-2, implicated in mRNA stabilization, are coexpressed from an intronless gene and its intron-containing paralog.

    PubMed

    Makeyev, A V; Chkheidze, A N; Liebhaber, S A

    1999-08-27

    Gene families normally expand by segmental genomic duplication and subsequent sequence divergence. Although copies of partially or fully processed mRNA transcripts are occasionally retrotransposed into the genome, they are usually nonfunctional ("processed pseudogenes"). The two major cytoplasmic poly(C)-binding proteins in mammalian cells, alphaCP-1 and alphaCP-2, are implicated in a spectrum of post-transcriptional controls. These proteins are highly similar in structure and are encoded by closely related mRNAs. Based on this close relationship, we were surprised to find that one of these proteins, alphaCP-2, was encoded by a multiexon gene, whereas the second gene, alphaCP-1, was identical to and colinear with its mRNA. The alphaCP-1 and alphaCP-2 genes were shown to be single copy and were mapped to separate chromosomes. The linkage groups encompassing each of the two loci were concordant between mice and humans. These data suggested that the alphaCP-1 gene was generated by retrotransposition of a fully processed alphaCP-2 mRNA and that this event occurred well before the mammalian radiation. The stringent structural conservation of alphaCP-1 and its ubiquitous tissue distribution suggested that the retrotransposed alphaCP-1 gene was rapidly recruited to a function critical to the cell and distinct from that of its alphaCP-2 progenitor.

  13. Massive losses of taste receptor genes in toothed and baleen whales.

    PubMed

    Feng, Ping; Zheng, Jinsong; Rossiter, Stephen J; Wang, Ding; Zhao, Huabin

    2014-05-06

    Taste receptor genes are functionally important in animals, with a surprising exception in the bottlenose dolphin, which shows extensive losses of sweet, umami, and bitter taste receptor genes. To examine the generality of taste gene loss, we examined seven toothed whales and five baleen whales and sequenced the complete repertoire of three sweet/umami (T1Rs) and ten bitter (T2Rs) taste receptor genes. We found all amplified T1Rs and T2Rs to be pseudogenes in all 12 whales, with a shared premature stop codon in 10 of the 13 genes, which demonstrated massive losses of taste receptor genes in the common ancestor of whales. Furthermore, we analyzed three genome sequences from two toothed whales and one baleen whale and found that the sour taste marker gene Pkd2l1 is a pseudogene, whereas the candidate salty taste receptor genes are intact and putatively functional. Additionally, we examined three genes that are responsible for taste signal transduction and found the relaxation of functional constraints on taste signaling pathways along the ancestral branch leading to whales. Together, our results strongly suggest extensive losses of sweet, umami, bitter, and sour tastes in whales, and the relaxation of taste function most likely arose in the common ancestor of whales between 36 and 53 Ma. Therefore, whales represent the first animal group to lack four of five primary tastes, probably driven by the marine environment with high concentration of sodium, the feeding behavior of swallowing prey whole, and the dietary switch from plants to meat in the whale ancestor. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
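    The abstract above classifies genes as pseudogenes partly on the basis of premature stop codons. A minimal Python sketch of that kind of check follows. It assumes the input is an in-frame coding sequence starting at the first codon; the function name and the toy sequences are hypothetical and are not taken from the study.

```python
# Standard-code stop codons in DNA notation.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_premature_stop(cds):
    """Return the 0-based codon index of the first in-frame stop codon
    that occurs before the final codon, or None if there is none.

    Assumes `cds` is already in frame (hypothetical helper for illustration).
    """
    seq = cds.upper()
    n_codons = len(seq) // 3
    # The last codon is excluded: a stop there is the normal terminator.
    for i in range(n_codons - 1):
        if seq[3 * i:3 * i + 3] in STOP_CODONS:
            return i
    return None


# Toy sequences, invented for this example.
intact = "ATGGCTTGGAAATAA"     # ATG GCT TGG AAA TAA
disrupted = "ATGGCTTGAAAATAA"  # ATG GCT TGA AAA TAA (TGG -> TGA)

print(first_premature_stop(intact))
print(first_premature_stop(disrupted))
```

    A real analysis would also need to handle frameshifts, splice-site lesions, and alignment against an intact ortholog, none of which this sketch attempts.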

  14. Evolution of the beta-amylase gene in the temperate grasses: Non-purifying selection, recombination, semiparalogy, homeology and phylogenetic signal.

    PubMed

    Minaya, Miguel; Díaz-Pérez, Antonio; Mason-Gamer, Roberta; Pimentel, Manuel; Catalán, Pilar

    2015-10-01

    Low-copy nuclear genes (LCNGs) have complex genetic architectures and evolutionary dynamics. However, unlike multicopy nuclear genes, LCNGs are rarely subject to gene conversion or concerted evolution, and they have higher mutation rates than organellar or nuclear ribosomal DNA markers, so they have great potential for improving the robustness of phylogenetic reconstructions at all taxonomic levels. In this study, our first objective is to evaluate the evolutionary dynamics of the LCNG β-amylase by testing for potential pseudogenization, paralogy, homeology, recombination, and phylogenetic incongruence within a broad representation of the main Pooideae lineages. Our second objective is to determine whether β-amylase shows sufficient phylogenetic signal to reconstruct the evolutionary history of the Pooid grasses. A multigenic (ITS, matK, ndhF, trnTL, and trnLF) tree of the study group provided a framework for assessing the β-amylase phylogeny. Eight accessions showed complete absence of selection, suggesting putative pseudogenic copies or other relaxed selection pressures; resolution of Vulpia alopecuros 2x clones indicated its potential (semi) paralogy; and homeologous copies of allopolyploid species Festuca simensis, F. fenas, and F. arundinacea tracked their Mediterranean origin. Two recombination events were found within early-diverged Pooideae lineages, and five within the PACCMAD clade. The unexpected phylogenetic relationships of 37 grass species (26% of the sampled species) highlight the frequent occurrence of non-treelike evolutionary events, so this LCNG should be used with caution as a phylogenetic marker. However, once the pitfalls are identified and removed, the phylogenetic reconstruction of the grasses based on the β-amylase exon+intron positions is optimal at all taxonomic levels. Copyright © 2015 Elsevier Inc. All rights reserved.

  15. An efficient and comprehensive strategy for genetic diagnostics of polycystic kidney disease.

    PubMed

    Eisenberger, Tobias; Decker, Christian; Hiersche, Milan; Hamann, Ruben C; Decker, Eva; Neuber, Steffen; Frank, Valeska; Bolz, Hanno J; Fehrenbach, Henry; Pape, Lars; Toenshoff, Burkhard; Mache, Christoph; Latta, Kay; Bergmann, Carsten

    2015-01-01

    Renal cysts are clinically and genetically heterogeneous conditions. Autosomal dominant polycystic kidney disease (ADPKD) is the most frequent life-threatening genetic disease and mainly caused by mutations in PKD1. The presence of six PKD1 pseudogenes and tremendous allelic heterogeneity make molecular genetic testing challenging requiring laborious locus-specific amplification. Increasing evidence suggests a major role for PKD1 in early and severe cases of ADPKD and some patients with a recessive form. Furthermore it is becoming obvious that clinical manifestations can be mimicked by mutations in a number of other genes with the necessity for broader genetic testing. We established and validated a sequence capture based NGS testing approach for all genes known for cystic and polycystic kidney disease including PKD1. Thereby, we demonstrate that the applied standard mapping algorithm specifically aligns reads to the PKD1 locus and overcomes the complication of unspecific capture of pseudogenes. Employing careful and experienced assessment of NGS data, the method is shown to be very specific and equally sensitive as established methods. An additional advantage over conventional Sanger sequencing is the detection of copy number variations (CNVs). Sophisticated bioinformatic read simulation increased the high analytical depth of the validation study and further demonstrated the strength of the approach. We further raise some awareness of limitations and pitfalls of common NGS workflows when applied in complex regions like PKD1 demonstrating that quality of NGS needs more than high coverage of the target region. By this, we propose a time- and cost-efficient diagnostic strategy for comprehensive molecular genetic testing of polycystic kidney disease which is highly automatable and will be of particular value when therapeutic options for PKD emerge and genetic testing is needed for larger numbers of patients.

  16. A-to-I RNA editing promotes developmental stage-specific gene and lncRNA expression.

    PubMed

    Goldstein, Boaz; Agranat-Tamir, Lily; Light, Dean; Ben-Naim Zgayer, Orna; Fishman, Alla; Lamm, Ayelet T

    2017-03-01

    A-to-I RNA editing is a conserved widespread phenomenon in which adenosine (A) is converted to inosine (I) by adenosine deaminases (ADARs) in double-stranded RNA regions, mainly noncoding. Mutations in ADAR enzymes in Caenorhabditis elegans cause defects in normal development but are not lethal as in human and mouse. Previous studies in C. elegans indicated competition between RNA interference (RNAi) and RNA editing mechanisms, based on the observation that worms that lack both mechanisms do not exhibit defects, in contrast to the developmental defects observed when only RNA editing is absent. To study the effects of RNA editing on gene expression and function, we established a novel screen that enabled us to identify thousands of RNA editing sites in nonrepetitive regions in the genome. These include dozens of genes that are edited at their 3' UTR region. We found that these genes are mainly germline and neuronal genes, and that they are down-regulated in the absence of ADAR enzymes. Moreover, we discovered that almost half of these genes are edited in a developmental-specific manner, indicating that RNA editing is a highly regulated process. We found that many pseudogenes and other lncRNAs are also extensively down-regulated in the absence of ADARs in the embryo but not in the fourth larval (L4) stage. This down-regulation is not observed upon additional knockout of RNAi. Furthermore, levels of siRNAs aligned to pseudogenes in ADAR mutants are enhanced. Taken together, our results suggest a role for RNA editing in normal growth and development by regulating silencing via RNAi. © 2017 Goldstein et al.; Published by Cold Spring Harbor Laboratory Press.

  17. Complex Evolutionary Dynamics of Massively Expanded Chemosensory Receptor Families in an Extreme Generalist Chelicerate Herbivore.

    PubMed

    Ngoc, Phuong Cao Thi; Greenhalgh, Robert; Dermauw, Wannes; Rombauts, Stephane; Bajda, Sabina; Zhurov, Vladimir; Grbić, Miodrag; Van de Peer, Yves; Van Leeuwen, Thomas; Rouzé, Pierre; Clark, Richard M

    2016-12-14

    While mechanisms to detoxify plant produced, anti-herbivore compounds have been associated with plant host use by herbivores, less is known about the role of chemosensory perception in their life histories. This is especially true for generalists, including chelicerate herbivores that evolved herbivory independently from the more studied insect lineages. To shed light on chemosensory perception in a generalist herbivore, we characterized the chemosensory receptors (CRs) of the chelicerate two-spotted spider mite, Tetranychus urticae, an extreme generalist. Strikingly, T. urticae has more CRs than reported in any other arthropod to date. Including pseudogenes, 689 gustatory receptors were identified, as were 136 degenerin/Epithelial Na+ Channels (ENaCs) that have also been implicated as CRs in insects. The genomic distribution of T. urticae gustatory receptors indicates recurring bursts of lineage-specific proliferations, with the extent of receptor clusters reminiscent of those observed in the CR-rich genomes of vertebrates or C. elegans. Although pseudogenization of many gustatory receptors within clusters suggests relaxed selection, a subset of receptors is expressed. Consistent with functions as CRs, the genomic distribution and expression of ENaCs in lineage-specific T. urticae expansions mirrors that observed for gustatory receptors. The expansion of ENaCs in T. urticae to > 3-fold that reported in other animals was unexpected, raising the possibility that ENaCs in T. urticae have been co-opted to fulfill a major role performed by unrelated CRs in other animals. More broadly, our findings suggest an elaborate role for chemosensory perception in generalist herbivores that are of key ecological and agricultural importance. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  18. Differential Retention of Gene Functions in a Secondary Metabolite Cluster.

    PubMed

    Reynolds, Hannah T; Slot, Jason C; Divon, Hege H; Lysøe, Erik; Proctor, Robert H; Brown, Daren W

    2017-08-01

    In fungi, distribution of secondary metabolite (SM) gene clusters is often associated with host- or environment-specific benefits provided by SMs. In the plant pathogen Alternaria brassicicola (Dothideomycetes), the DEP cluster confers an ability to synthesize the SM depudecin, a histone deacetylase inhibitor that contributes weakly to virulence. The DEP cluster includes genes encoding enzymes, a transporter, and a transcription regulator. We investigated the distribution and evolution of the DEP cluster in 585 fungal genomes and found a wide but sporadic distribution among Dothideomycetes, Sordariomycetes, and Eurotiomycetes. We confirmed DEP gene expression and depudecin production in one fungus, Fusarium langsethiae. Phylogenetic analyses suggested 6-10 horizontal gene transfers (HGTs) of the cluster, including a transfer that led to the presence of closely related cluster homologs in Alternaria and Fusarium. The analyses also indicated that HGTs were frequently followed by loss/pseudogenization of one or more DEP genes. Independent cluster inactivation was inferred in at least four fungal classes. Analyses of transitions among functional, pseudogenized, and absent states of DEP genes among Fusarium species suggest enzyme-encoding genes are lost at higher rates than the transporter (DEP3) and regulatory (DEP6) genes. The phenotype of an experimentally-induced DEP3 mutant of Fusarium did not support the hypothesis that selective retention of DEP3 and DEP6 protects fungi from exogenous depudecin. Together, the results suggest that HGT and gene loss have contributed significantly to DEP cluster distribution, and that some DEP genes provide a greater fitness benefit possibly due to a differential tendency to form network connections. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution 2017. This work is written by US Government employees and is in the public domain in the US.

  19. Brucella Genetic Variability in Wildlife Marine Mammals Populations Relates to Host Preference and Ocean Distribution.

    PubMed

    Suárez-Esquivel, Marcela; Baker, Kate S; Ruiz-Villalobos, Nazareth; Hernández-Mora, Gabriela; Barquero-Calvo, Elías; González-Barrientos, Rocío; Castillo-Zeledón, Amanda; Jiménez-Rojas, César; Chacón-Díaz, Carlos; Cloeckaert, Axel; Chaves-Olarte, Esteban; Thomson, Nicholas R; Moreno, Edgardo; Guzmán-Verri, Caterina

    2017-07-01

    Intracellular bacterial pathogens probably arose when their ancestor adapted from a free-living environment to an intracellular one, leading to clonal bacteria with smaller genomes and less sources of genetic plasticity. Still, this plasticity is needed to respond to the challenges posed by the host. Members of the Brucella genus are facultative-extracellular intracellular bacteria responsible for causing brucellosis in a variety of mammals. The various species keep different host preferences, virulence, and zoonotic potential despite having 97-99% similarity at genome level. Here, we describe elements of genetic variation in Brucella ceti isolated from wildlife dolphins inhabiting the Pacific Ocean and the Mediterranean Sea. Comparison with isolates obtained from marine mammals from the Atlantic Ocean and the broader Brucella genus showed distinctive traits according to oceanic distribution and preferred host. Marine mammal isolates display genetic variability, represented by an important number of IS711 elements as well as specific IS711 and SNPs genomic distribution clustering patterns. Extensive pseudogenization was found among isolates from marine mammals as compared with terrestrial ones, causing degradation in pathways related to energy, transport of metabolites, and regulation/transcription. Brucella ceti isolates infecting particularly dolphin hosts, showed further degradation of metabolite transport pathways as well as pathways related to cell wall/membrane/envelope biogenesis and motility. Thus, gene loss through pseudogenization is a source of genetic variation in Brucella, which in turn, relates to adaptation to different hosts. This is relevant to understand the natural history of bacterial diseases, their zoonotic potential, and the impact of human interventions such as domestication. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  20. Independent pseudogenization of CYP2J19 in penguins, owls and kiwis implicates gene in red carotenoid synthesis.

    PubMed

    Emerling, Christopher A

    2018-01-01

    Carotenoids have important roles in bird behavior, including pigmentation for sexual signaling and improving color vision via retinal oil droplets. Yellow carotenoids are diet-derived, but red carotenoids (ketocarotenoids) are typically synthesized from yellow precursors via a carotenoid ketolase. Recent research on passerines has provided evidence that a cytochrome p450 enzyme, CYP2J19, is responsible for this reaction, though it is unclear if this function is phylogenetically restricted. Here I provide evidence that CYP2J19 is the carotenoid ketolase common to Aves using the genomes of 65 birds and the retinal transcriptomes of 15 avian taxa. CYP2J19 is functionally intact and robustly transcribed in all taxa except for several species adapted to foraging in dim light conditions. Two penguins, an owl and a kiwi show evidence of genetic lesions and relaxed selection in their genomic copy of CYP2J19, and six owls show evidence of marked reduction in CYP2J19 retinal transcription compared to nine diurnal avian taxa. Furthermore, one of the owls appears to transcribe a CYP2J19 pseudogene. Notably, none of these taxa are known to use red carotenoids for sexual signaling and several species of owls and penguins represent the only birds known to completely lack red retinal oil droplets. The remaining avian taxa belong to groups known to possess red oil droplets, are known or expected to deposit red carotenoids in skin and/or plumage, and/or frequently forage in bright light. The loss and reduced expression of CYP2J19 is likely an adaptation to maximize retinal sensitivity, given that oil droplets reduce the amount of light available to the retina. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. The ZNF75 zinc finger gene subfamily: Isolation and mapping of the four members in humans and great apes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Villa, A.; Strina, D.; Frattini, A.

    We have previously reported the characterization of the human ZNF75 gene located on Xq26, which has only limited homology (less than 65%) to other ZF genes in the databases. Here, we describe three human zinc finger genes with 86 to 95% homology to ZNF75 at the nucleotide level, which represent all the members of the human ZNF75 subfamily. One of these, ZNF75B, is a pseudogene mapped to chromosome 12q13. The other two, ZNF75A and ZNF75C, maintain an ORF in the sequenced region, and at least the latter is expressed in the U937 cell line. They were mapped to chromosomes 16 and 11, respectively. All these genes are conserved in chimpanzees, gorillas, and orangutans. The ZNF75B homologue is a pseudogene in all three great apes, and in chimpanzee it is located on chromosome 10 (phylogenetic XII), at p13 (corresponding to the human 12q13). The chimpanzee homologue of ZNF75 is also located on the Xq26 chromosome, in the same region, as detected by in situ hybridization. As expected, nucleotide changes were clearly more abundant between human and orangutan than between human and chimpanzee or gorilla homologues. Members of the same class were more similar to each other than to the other homologues within the same species. This suggests that the duplication and/or retrotranscription events occurred in a common ancestor long before great ape speciation. This, together with the existence of at least two genes in cows and horses, suggests a relatively high conservation of this gene family. 20 refs., 5 figs., 1 tab.

  2. Molecular cloning and chromosomal localization of a pseudogene related to the human Acyl-CoA binding protein/diazepam binding inhibitor

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gersuk, V.H.; Rose, T.M.; Todaro, G.J.

    The acyl-CoA binding protein (ACBP) and the diazepam binding inhibitor (DBI) or endozepine are independent isolates of a single 86-amino-acid, 10-kDa protein. ACBP/DBI is highly conserved between species and has been identified in several diverse organisms, including human, cow, rat, frog, duck, insects, plants, and yeast. Although the genomic locus has not yet been cloned in humans, complementary DNA clones with different 5′ ends have been isolated and characterized. These cDNA clones appear to be encoded by a single gene. However, Southern blot analyses, in situ hybridizations, and somatic cell hybrid chromosomal mapping all suggest that there are multiple ACBP/DBI-related sequences in the genome. To identify potential members of this gene family, degenerate oligonucleotides corresponding to highly conserved regions of ACBP/DBI were used to screen a human genomic DNA library using the polymerase chain reaction. A novel gene, DBIP1, that is closely related to ACBP/DBI but is clearly distinct was identified. DBIP1 bears extensive sequence homology to ACBP/DBI but lacks the introns predicted by rat and duck genomic sequence studies. A 1-base deletion in the coding region results in a frameshift and, along with the absence of introns and the lack of a detectable transcript, suggests that DBIP1 is a pseudogene. ACBP/DBI has previously been mapped to chromosome 2, although this was recently disputed, and a chromosome 6 location was suggested. We show that ACBP/DBI is correctly placed on chromosome 2 and that the gene identified on chromosome 6 is DBIP1. 33 refs., 3 figs., 1 tab.

  3. Selective amplification of an mRNA and related pseudogene for a human ADP-ribosylation factor, a guanine nucleotide-dependent protein activator of cholera toxin

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Monaco, L.; Murtagh, J.J.; Newman, K.B.

    1990-03-01

    ADP-ribosylation factors (ARFs) are ~20-kDa proteins that act as GTP-dependent allosteric activators of cholera toxin. With deoxyinosine-containing degenerate oligonucleotide primers corresponding to conserved GTP-binding domains in ARFs, the polymerase chain reaction (PCR) was used to amplify simultaneously from human DNA portions of three ARF genes that include codons for 102 amino acids, with intervening sequences. Amplification products that differed in size because of differences in intron sizes were separated by agarose gel electrophoresis. One amplified DNA contained no introns and had a sequence different from those of known ARFs. Based on this sequence, selective oligonucleotide probes were prepared and used to isolate clone ΨARF 4, a putative ARF pseudogene, from a human genomic library in λ phage EMBL3. Reverse transcription-PCR was then used to clone from human poly(A)+ RNA the cDNA corresponding to the expressed homolog of ΨARF 4, referred to as human ARF 4. It appears that ΨARF 4 arose during human evolution by integration of processed ARF 4 mRNA into the genome. Human ARF 4 differs from previously identified mammalian ARFs 1, 2, and 3. Hybridization of ARF 4-specific oligonucleotide probes with human, bovine, and rat RNA revealed a single 1.8-kilobase mRNA, which was clearly distinguished from the 1.9-kilobase mRNA for ARF 1 in these tissues. The PCR provides a powerful tool for investigating diversity in this and other multigene families, especially with primers targeted at domains believed to have functional significance.

  4. Evaluation of six nucleic acid amplification tests used for diagnosis of Neisseria gonorrhoeae in Russia compared with an international strictly validated real-time porA pseudogene polymerase chain reaction.

    PubMed

    Shipitsyna, E; Zolotoverkhaya, E; Hjelmevoll, S O; Maximova, A; Savicheva, A; Sokolovsky, E; Skogen, V; Domeika, M; Unemo, M

    2009-11-01

    In Russia, laboratory diagnosis of gonorrhoea has been mainly based on microscopy only and, in some settings, relatively rare suboptimal culturing. In recent years, Russian developed and manufactured nucleic acid amplification tests (NAAT) have been implemented for routine diagnosis of Neisseria gonorrhoeae. However, these NAATs have never been validated to any international well-recognized diagnostic NAAT. This study aims to evaluate the performance characteristics of six Russian NAATs for N. gonorrhoeae diagnostics. In total, 496 symptomatic patients were included. Five polymerase chain reaction (PCR) assays and one real-time nucleic acid sequence based amplification (NASBA) assay, developed by three Russian companies, were evaluated on urogenital samples, i.e. cervical and first voided urine (FVU) samples from females (n = 319), urethral and FVU samples from males (n = 127), and extragenital samples, i.e. rectal and pharyngeal samples, from 50 additional female patients with suspicion of gonorrhoea. As reference method, an international strictly validated real-time porA pseudogene PCR was applied. The prevalence of N. gonorrhoeae was 2.7% and 16% among the patients providing urogenital and extragenital samples, respectively. The Russian NAATs and the reference method displayed high level of concordance (99.4-100%). The sensitivities, specificities, positive predictive values and negative predictive values of the Russian tests in different specimens were 66.7-100%, 100%, 100%, and 99.4-100%, respectively. Russian N. gonorrhoeae diagnostic NAATs comprise relatively good performance characteristics. However, larger studies are crucial and, beneficially, the Russian assays should also be evaluated to other international highly sensitive and specific, and ideally Food and Drug Administration approved, NAATs such as Aptima Combo 2 (Gen-Probe).
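
    The four performance measures quoted in this abstract are all ratios taken from a 2x2 table of test results against the reference method. The following is a minimal Python sketch of those definitions; the function name and the example counts are hypothetical illustrations, not data from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 counts.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives against a reference method.
    """
    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # positives that were detected
        "specificity": ratio(tn, tn + fp),  # negatives correctly cleared
        "ppv": ratio(tp, tp + fp),          # positive calls that are right
        "npv": ratio(tn, tn + fn),          # negative calls that are right
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
example = diagnostic_metrics(tp=2, fp=0, fn=1, tn=157)
```

    With few true positives, as in a low-prevalence sample set, a single missed case moves sensitivity a long way while specificity and NPV barely change, which is one reason the abstract calls for larger studies.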

  5. Few mitochondrial DNA sequences are inserted into the turkey (Meleagris gallopavo) nuclear genome: evolutionary analyses and informativity in the domestic lineage.

    PubMed

    Schiavo, G; Strillacci, M G; Ribani, A; Bovo, S; Roman-Ponce, S I; Cerolini, S; Bertolini, F; Bagnato, A; Fontanesi, L

    2018-06-01

    Mitochondrial DNA (mtDNA) insertions have been detected in the nuclear genome of many eukaryotes. These sequences are pseudogenes originated by horizontal transfer of mtDNA fragments into the nuclear genome, producing nuclear DNA sequences of mitochondrial origin (numt). In this study we determined the frequency and distribution of mtDNA-originated pseudogenes in the turkey (Meleagris gallopavo) nuclear genome. The turkey reference genome (Turkey_2.01) was aligned with the reference linearized mtDNA sequence using LAST. A total of 32 numt sequences (corresponding to 18 numt regions derived by unique insertional events) were identified in the turkey nuclear genome (size ranging from 66 to 1415 bp; identity against the modern turkey mtDNA corresponding region ranging from 62% to 100%). Numts were distributed in nine chromosomes and in one scaffold. They derived from parts of 10 mtDNA protein-coding genes, ribosomal genes, the control region and 10 tRNA genes. Seven numt regions reported in the turkey genome were identified in orthologous positions in the Gallus gallus genome and therefore were present in the ancestral genome that in the Cretaceous originated the lineages of the modern crown Galliformes. Five recently integrated turkey numts were validated by PCR in 168 turkeys of six different domestic populations. None of the analysed numts were polymorphic (i.e. absence of the inserted sequence, as reported in numts of recent integration in other species), suggesting that the reticulate speciation model is not useful for explaining the origin of the domesticated turkey lineage. © 2018 Stichting International Foundation for Animal Genetics.

  6. Functional analysis and transcriptional output of the Göttingen minipig genome.

    PubMed

    Heckel, Tobias; Schmucki, Roland; Berrera, Marco; Ringshandl, Stephan; Badi, Laura; Steiner, Guido; Ravon, Morgane; Küng, Erich; Kuhn, Bernd; Kratochwil, Nicole A; Schmitt, Georg; Kiialainen, Anna; Nowaczyk, Corinne; Daff, Hamina; Khan, Azinwi Phina; Lekolool, Isaac; Pelle, Roger; Okoth, Edward; Bishop, Richard; Daubenberger, Claudia; Ebeling, Martin; Certa, Ulrich

    2015-11-14

    In the past decade the Göttingen minipig has gained increasing recognition as animal model in pharmaceutical and safety research because it recapitulates many aspects of human physiology and metabolism. Genome-based comparison of drug targets together with quantitative tissue expression analysis allows rational prediction of pharmacology and cross-reactivity of human drugs in animal models thereby improving drug attrition which is an important challenge in the process of drug development. Here we present a new chromosome level based version of the Göttingen minipig genome together with a comparative transcriptional analysis of tissues with pharmaceutical relevance as basis for translational research. We relied on mapping and assembly of WGS (whole-genome-shotgun sequencing) derived reads to the reference genome of the Duroc pig and predict 19,228 human orthologous protein-coding genes. Genome-based prediction of the sequence of human drug targets enables the prediction of drug cross-reactivity based on conservation of binding sites. We further support the finding that the genome of Sus scrofa contains about ten-times less pseudogenized genes compared to other vertebrates. Among the functional human orthologs of these minipig pseudogenes we found HEPN1, a putative tumor suppressor gene. The genomes of Sus scrofa, the Tibetan boar, the African Bushpig, and the Warthog show sequence conservation of all inactivating HEPN1 mutations suggesting disruption before the evolutionary split of these pig species. We identify 133 Sus scrofa specific, conserved long non-coding RNAs (lncRNAs) in the minipig genome and show that these transcripts are highly conserved in the African pigs and the Tibetan boar suggesting functional significance. Using a new minipig specific microarray we show high conservation of gene expression signatures in 13 tissues with biomedical relevance between humans and adult minipigs. We underline this relationship for minipig and human liver where we could demonstrate similar expression levels for most phase I drug-metabolizing enzymes. Higher expression levels and metabolic activities were found for FMO1, AKR/CRs and for phase II drug metabolizing enzymes in minipig as compared to human. The variability of gene expression in equivalent human and minipig tissues is considerably higher in minipig organs, which is important for study design in case a human target belongs to this variable category in the minipig. The first analysis of gene expression in multiple tissues during development from young to adult shows that the majority of transcriptional programs are concluded four weeks after birth. This finding is in line with the advanced state of human postnatal organ development at comparative age categories and further supports the minipig as model for pediatric drug safety studies. Genome based assessment of sequence conservation combined with gene expression data in several tissues improves the translational value of the minipig for human drug development. The genome and gene expression data presented here are important resources for researchers using the minipig as model for biomedical research or commercial breeding. Potential impact of our data for comparative genomics, translational research, and experimental medicine are discussed.

  7. Exponential decay of GC content detected by strand-symmetric substitution rates influences the evolution of isochore structure.

    PubMed

    Karro, J E; Peifer, M; Hardison, R C; Kollmann, M; von Grünberg, H H

    2008-02-01

    The distribution of guanine and cytosine nucleotides throughout a genome, or the GC content, is associated with numerous features in mammals; understanding the pattern and evolutionary history of GC content is crucial to our efforts to annotate the genome. The local GC content is decaying toward an equilibrium point, but the causes and rates of this decay, as well as the value of the equilibrium point, remain topics of debate. By comparing the results of 2 methods for estimating local substitution rates, we identify 620 Mb of the human genome in which the rates of the various types of nucleotide substitutions are the same on both strands. These strand-symmetric regions show an exponential decay of local GC content at a pace determined by local substitution rates. DNA segments subjected to higher rates experience disproportionately accelerated decay and are AT rich, whereas segments subjected to lower rates decay more slowly and are GC rich. Although we are unable to draw any conclusions about causal factors, the results support the hypothesis proposed by Khelifi A, Meunier J, Duret L, and Mouchiroud D (2006. GC content evolution of the human and mouse genomes: insights from the study of processed pseudogenes in regions of different recombination rates. J Mol Evol. 62:745-752.) that the isochore structure has been reshaped over time. If rate variation were a determining factor, then the current isochore structure of mammalian genomes could result from the local differences in substitution rates. We predict that under current conditions strand-symmetric portions of the human genome will stabilize at an average GC content of 30% (considerably less than the current 42%), thus confirming that the human genome has not yet reached equilibrium.
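
    The decay described in this abstract can be pictured with a simple exponential relaxation toward an equilibrium value. The sketch below is an illustrative assumption, not the authors' model: the function name, the time units, and the rate values are hypothetical, and only the 42% starting point and 30% predicted equilibrium come from the abstract.

```python
import math


def gc_content(t, rate, gc_start=0.42, gc_equilibrium=0.30):
    """GC fraction after time t under plain exponential relaxation.

    rate is a hypothetical local decay constant (per unit of t);
    a larger rate means the region approaches equilibrium sooner.
    """
    excess = gc_start - gc_equilibrium
    return gc_equilibrium + excess * math.exp(-rate * t)


# Compare a slowly and a rapidly substituting region at the same time point.
slow_region = gc_content(t=10.0, rate=0.1)
fast_region = gc_content(t=10.0, rate=1.0)
```

    Under this toy model the fast region sits closer to the equilibrium value at any given time, in line with the abstract's observation that segments with higher substitution rates are AT rich.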

  8. The Genomics of Microbial Domestication in the Fermented Food Environment

    PubMed Central

    Gibbons, John G; Rinker, David C

    2015-01-01

    Shortly after the agricultural revolution, the domestication of bacteria, yeasts, and molds, played an essential role in enhancing the stability, quality, flavor, and texture of food products. These domestication events were likely the result of human food production practices that entailed the continual recycling of isolated microbial communities in the presence of abundant agricultural food sources. We suggest that within these novel agrarian food niches the metabolic requirements of those microbes became regular and predictable resulting in rapid genomic specialization through such mechanisms as pseudogenization, genome decay, interspecific hybridization, gene duplication, and horizontal gene transfer. The ultimate result was domesticated strains of microorganisms with enhanced fermentative capacities. PMID:26338497

  9. Genome-wide analysis of ionotropic receptor gene repertoire in Lepidoptera with an emphasis on its functions of Helicoverpa armigera.

    PubMed

    Liu, Nai-Yong; Xu, Wei; Dong, Shuang-Lin; Zhu, Jia-Ying; Xu, Yu-Xing; Anderson, Alisha

    2018-05-22

    The functions of the Ionotropic Receptor (IR) family have been well studied in Drosophila melanogaster, but only limited information is available in Lepidoptera. Here, we conducted a large-scale genome-wide analysis of the IR gene repertoire in 13 moths and 16 butterflies. Combining a homology-based approach and manual efforts, a total of 996 IR candidates are identified including 31 pseudogenes and 825 full-length sequences, representing the most current comprehensive annotation in lepidopteran species. The phylogeny, expression and sequence characteristics classify Lepidoptera IRs into three sub-families: antennal IRs (A-IRs), divergent IRs (D-IRs) and Lepidoptera-specific IRs (LS-IRs), which is distinct from the case of Drosophila IRs. In comparison to LS-IRs and D-IRs, A-IRs members share a higher degree of protein identity and are distinguished into 16 orthologous groups in the phylogeny, showing conservation of gene structure. Analysis of selective forces on 27 orthologous groups reveals that these lepidopteran IRs have evolved under strong purifying selection (dN/dS≪1). Most notably, lineage-specific gene duplications that contribute primarily to gene number variations across Lepidoptera not only exist in D-IRs, but are present in the two other sub-families including members of IR41a, 76b, 87a, 100a and 100b. Expression profiling analysis reveals that over 80% (21/26) of Helicoverpa armigera A-IRs are expressed more highly in antennae of adults or larvae than other tissues, consistent with its proposed function in olfaction. However, some are also detected in taste organs like proboscises and legs. These results suggest that some A-IRs in H. armigera likely bear a dual function with their involvement in olfaction and gustation. Results from mating experiments show that expression of two HarmIRs (IR1.2 and IR75d) is significantly up-regulated in antennae of mated female moths. However, no expression difference is observed between unmated female and male adults, suggesting an association with female host-searching behaviors. Our current study has greatly extended the IR gene repertoire resource in Lepidoptera, and more importantly, identifies potential IR candidates for olfactory, gustatory and oviposition behaviors in the cotton bollworm. Crown Copyright © 2018. Published by Elsevier Ltd. All rights reserved.

  10. Mechanisms of haplotype divergence at the RGA08 nucleotide-binding leucine-rich repeat gene locus in wild banana (Musa balbisiana).

    PubMed

    Baurens, Franc-Christophe; Bocs, Stéphanie; Rouard, Mathieu; Matsumoto, Takashi; Miller, Robert N G; Rodier-Goud, Marguerite; MBéguié-A-MBéguié, Didier; Yahiaoui, Nabila

    2010-07-16

    Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease and little is known on the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. High level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana.
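
    Divergence times of this kind are commonly derived from synonymous divergence with the molecular-clock relation T = Ks / (2r), where Ks is the number of synonymous substitutions per synonymous site between the two sequences and r is the substitution rate per site per year. The Python sketch below shows only that arithmetic; the abstract does not state the Ks or rate values used, so the numbers here are hypothetical placeholders.

```python
def divergence_time_years(ks, rate_per_site_per_year):
    """Estimate divergence time from synonymous divergence.

    The factor of 2 accounts for substitutions accumulating
    independently along both lineages since they split.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    return ks / (2.0 * rate_per_site_per_year)


# Hypothetical inputs, chosen only to show the scale of the calculation.
estimate = divergence_time_years(ks=0.009, rate_per_site_per_year=4.5e-9)
```

    The estimate scales linearly with Ks and inversely with the assumed rate, so any uncertainty in the rate carries straight through to the divergence time.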

  11. N-terminal Proteomics Assisted Profiling of the Unexplored Translation Initiation Landscape in Arabidopsis thaliana

    PubMed Central

    Ndah, Elvis; Jonckheere, Veronique

    2017-01-01

    Proteogenomics is an emerging research field yet lacking a uniform method of analysis. Proteogenomic studies in which N-terminal proteomics and ribosome profiling are combined, suggest that a high number of protein start sites are currently missing in genome annotations. We constructed a proteogenomic pipeline specific for the analysis of N-terminal proteomics data, with the aim of discovering novel translational start sites outside annotated protein coding regions. In summary, unidentified MS/MS spectra were matched to a specific N-terminal peptide library encompassing protein N termini encoded in the Arabidopsis thaliana genome. After a stringent false discovery rate filtering, 117 protein N termini compliant with N-terminal methionine excision specificity and indicative of translation initiation were found. These include N-terminal protein extensions and translation from transposable elements and pseudogenes. Gene prediction provided supporting protein-coding models for approximately half of the protein N termini. Besides the prediction of functional domains (partially) contained within the newly predicted ORFs, further supporting evidence of translation was found in the recently released Araport11 genome re-annotation of Arabidopsis and computational translations of sequences stored in public repositories. Most interestingly, complementary evidence by ribosome profiling was found for 23 protein N termini. Finally, by analyzing protein N-terminal peptides, an in silico analysis demonstrates the applicability of our N-terminal proteogenomics strategy in revealing protein-coding potential in species with well- and poorly-annotated genomes. PMID:28432195

  12. N-terminal Proteomics Assisted Profiling of the Unexplored Translation Initiation Landscape in Arabidopsis thaliana.

    PubMed

    Willems, Patrick; Ndah, Elvis; Jonckheere, Veronique; Stael, Simon; Sticker, Adriaan; Martens, Lennart; Van Breusegem, Frank; Gevaert, Kris; Van Damme, Petra

    2017-06-01

    Proteogenomics is an emerging research field yet lacking a uniform method of analysis. Proteogenomic studies in which N-terminal proteomics and ribosome profiling are combined, suggest that a high number of protein start sites are currently missing in genome annotations. We constructed a proteogenomic pipeline specific for the analysis of N-terminal proteomics data, with the aim of discovering novel translational start sites outside annotated protein coding regions. In summary, unidentified MS/MS spectra were matched to a specific N-terminal peptide library encompassing protein N termini encoded in the Arabidopsis thaliana genome. After a stringent false discovery rate filtering, 117 protein N termini compliant with N-terminal methionine excision specificity and indicative of translation initiation were found. These include N-terminal protein extensions and translation from transposable elements and pseudogenes. Gene prediction provided supporting protein-coding models for approximately half of the protein N termini. Besides the prediction of functional domains (partially) contained within the newly predicted ORFs, further supporting evidence of translation was found in the recently released Araport11 genome re-annotation of Arabidopsis and computational translations of sequences stored in public repositories. Most interestingly, complementary evidence by ribosome profiling was found for 23 protein N termini. Finally, by analyzing protein N-terminal peptides, an in silico analysis demonstrates the applicability of our N-terminal proteogenomics strategy in revealing protein-coding potential in species with well- and poorly-annotated genomes. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.

  13. Functional analysis and tissue-differential expression of four FAD2 genes in amphidiploid Brassica napus derived from Brassica rapa and Brassica oleracea.

    PubMed

    Lee, Kyeong-Ryeol; In Sohn, Soo; Jung, Jin Hee; Kim, Sun Hee; Roh, Kyung Hee; Kim, Jong-Bum; Suh, Mi Chung; Kim, Hyun Uk

    2013-12-01

    Fatty acid desaturase 2 (FAD2), which resides in the endoplasmic reticulum (ER), plays a crucial role in producing linoleic acid (18:2) through catalyzing the desaturation of oleic acid (18:1) by double bond formation at the delta 12 position. FAD2 catalyzes the first step needed for the production of polyunsaturated fatty acids found in the glycerolipids of cell membranes and the triacylglycerols in seeds. In this study, four FAD2 genes from amphidiploid Brassica napus genome were isolated by PCR amplification, with their enzymatic functions predicted by sequence analysis of the cDNAs. Fatty acid analysis of budding yeast transformed with each of the FAD2 genes showed that BnFAD2-1, BnFAD2-2, and BnFAD2-4 are functional enzymes, whereas BnFAD2-3 is nonfunctional. The four FAD2 genes of B. napus originated from synthetic hybridization of its diploid progenitors Brassica rapa and Brassica oleracea, each of which has two FAD2 genes identical to those of B. napus. The BnFAD2-3 gene of B. napus, a nonfunctional pseudogene mutated by multiple nucleotide deletions and insertions, was inherited from B. rapa. All BnFAD2 isozymes except BnFAD2-3 localized to the ER. Nonfunctional BnFAD2-3 localized to the nucleus and chloroplasts. Four BnFAD2 genes can be classified on the basis of their expression patterns. © 2013.

  14. Comprehensive Mutation Analysis of PMS2 in a Large Cohort of Probands Suspected of Lynch Syndrome or Constitutional Mismatch Repair Deficiency Syndrome.

    PubMed

    van der Klift, Heleen M; Mensenkamp, Arjen R; Drost, Mark; Bik, Elsa C; Vos, Yvonne J; Gille, Hans J J P; Redeker, Bert E J W; Tiersma, Yvonne; Zonneveld, José B M; García, Encarna Gómez; Letteboer, Tom G W; Olderode-Berends, Maran J W; van Hest, Liselotte P; van Os, Theo A; Verhoef, Senno; Wagner, Anja; van Asperen, Christi J; Ten Broeke, Sanne W; Hes, Frederik J; de Wind, Niels; Nielsen, Maartje; Devilee, Peter; Ligtenberg, Marjolijn J L; Wijnen, Juul T; Tops, Carli M J

    2016-11-01

    Monoallelic PMS2 germline mutations cause 5%-15% of Lynch syndrome, a midlife cancer predisposition, whereas biallelic PMS2 mutations cause approximately 60% of constitutional mismatch repair deficiency (CMMRD), a rare childhood cancer syndrome. Recently improved DNA- and RNA-based strategies are applied to overcome problematic PMS2 mutation analysis due to the presence of pseudogenes and frequent gene conversion events. Here, we determined PMS2 mutation detection yield and mutation spectrum in a nationwide cohort of 396 probands. Furthermore, we studied concordance between tumor IHC/MSI (immunohistochemistry/microsatellite instability) profile and mutation carrier state. Overall, we found 52 different pathogenic PMS2 variants explaining 121 Lynch syndrome and nine CMMRD patients. In vitro mismatch repair assays suggested pathogenicity for three missense variants. Ninety-one PMS2 mutation carriers (70%) showed isolated loss of PMS2 in their tumors, for 31 (24%) no or inconclusive IHC was available, and eight carriers (6%) showed discordant IHC (presence of PMS2 or loss of both MLH1 and PMS2). Ten cases with isolated PMS2 loss (10%; 10/97) harbored MLH1 mutations. We confirmed that recently improved mutation analysis provides a high yield of PMS2 mutations in patients with isolated loss of PMS2 expression. Application of universal tumor prescreening methods will however miss some PMS2 germline mutation carriers. © 2016 WILEY PERIODICALS, INC.

  15. Comparative genomic analysis of SET domain family reveals the origin, expansion, and putative function of the arthropod-specific SmydA genes as histone modifiers in insects.

    PubMed

    Jiang, Feng; Liu, Qing; Wang, Yanli; Zhang, Jie; Wang, Huimin; Song, Tianqi; Yang, Meiling; Wang, Xianhui; Kang, Le

    2017-06-01

    The SET domain is an evolutionarily conserved motif present in histone lysine methyltransferases, which are important in the regulation of chromatin and gene expression in animals. In this study, we searched for SET domain-containing genes (SET genes) in all of the 147 arthropod genomes sequenced at the time of carrying out this experiment to understand the evolutionary history by which SET domains have evolved in insects. Phylogenetic and ancestral state reconstruction analysis revealed an arthropod-specific SET gene family, named SmydA, that is ancestral to arthropod animals and specifically diversified during insect evolution. Considering that pseudogenization is the most probable fate of the new emerging gene copies, we provided experimental and evolutionary evidence to demonstrate their essential functions. Fluorescence in situ hybridization analysis and in vitro methyltransferase activity assays showed that the SmydA-2 gene was transcriptionally active and retained the original histone methylation activity. Expression knockdown by RNA interference significantly increased mortality, implying that the SmydA genes may be essential for insect survival. We further showed predominantly strong purifying selection on the SmydA gene family and a potential association between the regulation of gene expression and insect phenotypic plasticity by transcriptome analysis. Overall, these data suggest that the SmydA gene family retains essential functions that may possibly define novel regulatory pathways in insects. This work provides insights into the roles of lineage-specific domain duplication in insect evolution. © The Authors 2017. Published by Oxford University Press.

  16. Comparative genomic analysis of SET domain family reveals the origin, expansion, and putative function of the arthropod-specific SmydA genes as histone modifiers in insects

    PubMed Central

    Jiang, Feng; Liu, Qing; Wang, Yanli; Zhang, Jie; Wang, Huimin; Song, Tianqi; Yang, Meiling

    2017-01-01

    The SET domain is an evolutionarily conserved motif present in histone lysine methyltransferases, which are important in the regulation of chromatin and gene expression in animals. In this study, we searched for SET domain–containing genes (SET genes) in all of the 147 arthropod genomes sequenced at the time of carrying out this experiment to understand the evolutionary history by which SET domains have evolved in insects. Phylogenetic and ancestral state reconstruction analysis revealed an arthropod-specific SET gene family, named SmydA, that is ancestral to arthropod animals and specifically diversified during insect evolution. Considering that pseudogenization is the most probable fate of the new emerging gene copies, we provided experimental and evolutionary evidence to demonstrate their essential functions. Fluorescence in situ hybridization analysis and in vitro methyltransferase activity assays showed that the SmydA-2 gene was transcriptionally active and retained the original histone methylation activity. Expression knockdown by RNA interference significantly increased mortality, implying that the SmydA genes may be essential for insect survival. We further showed predominantly strong purifying selection on the SmydA gene family and a potential association between the regulation of gene expression and insect phenotypic plasticity by transcriptome analysis. Overall, these data suggest that the SmydA gene family retains essential functions that may possibly define novel regulatory pathways in insects. This work provides insights into the roles of lineage-specific domain duplication in insect evolution. PMID:28444351

  17. Analyses of sweet receptor gene (Tas1r2) and preference for sweet stimuli in species of Carnivora.

    PubMed

    Li, Xia; Glaser, Dieter; Li, Weihua; Johnson, Warren E; O'Brien, Stephen J; Beauchamp, Gary K; Brand, Joseph G

    2009-01-01

    The extent to which taste receptor specificity correlates with, or even predicts, diet choice is not known. We recently reported that the insensitivity to sweeteners shown by species of Felidae can be explained by their lack of a functional Tas1r2 gene. To broaden our understanding of the relationship between the structure of the sweet receptors and preference for sugars and artificial sweeteners, we measured responses to 12 sweeteners in 6 species of Carnivora and sequenced the coding regions of Tas1r2 in these same or closely related species. The lion showed no preference for any of the 12 sweet compounds tested, and it possesses the pseudogenized Tas1r2. All other species preferred some of the natural sugars, and their Tas1r2 sequences, having complete open reading frames, predict functional sweet receptors. In addition to preferring natural sugars, the lesser panda also preferred 3 (neotame, sucralose, and aspartame) of the 6 artificial sweeteners. Heretofore, it had been reported that among vertebrates, only Old World simians could taste aspartame. The observation that the lesser panda highly preferred aspartame could be an example of evolutionary convergence in the identification of sweet stimuli.

  18. Ancestral whole-genome duplication in the marine chelicerate horseshoe crabs

    PubMed Central

    Kenny, N J; Chan, K W; Nong, W; Qu, Z; Maeso, I; Yip, H Y; Chan, T F; Kwan, H S; Holland, P W H; Chu, K H; Hui, J H L

    2016-01-01

    Whole-genome duplication (WGD) results in new genomic resources that can be exploited by evolution for rewiring genetic regulatory networks in organisms. In metazoans, WGD occurred before the last common ancestor of vertebrates, and has been postulated as a major evolutionary force that contributed to their speciation and diversification of morphological structures. Here, we have sequenced genomes from three of the four extant species of horseshoe crabs—Carcinoscorpius rotundicauda, Limulus polyphemus and Tachypleus tridentatus. Phylogenetic and sequence analyses of their Hox and other homeobox genes, which encode crucial transcription factors and have been used as indicators of WGD in animals, strongly suggests that WGD happened before the last common ancestor of these marine chelicerates >135 million years ago. Signatures of subfunctionalisation of paralogues of Hox genes are revealed in the appendages of two species of horseshoe crabs. Further, residual homeobox pseudogenes are observed in the three lineages. The existence of WGD in the horseshoe crabs, noted for relative morphological stasis over geological time, suggests that genomic diversity need not always be reflected phenotypically, in contrast to the suggested situation in vertebrates. This study provides evidence of ancient WGD in the ecdysozoan lineage, and reveals new opportunities for studying genomic and regulatory evolution after WGD in the Metazoa. PMID:26419336

  19. A Novel Method for Accurate Operon Predictions in All Sequenced Prokaryotes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Price, Morgan N.; Huang, Katherine H.; Alm, Eric J.

    2004-12-01

    We combine comparative genomic measures and the distance separating adjacent genes to predict operons in 124 completely sequenced prokaryotic genomes. Our method automatically tailors itself to each genome using sequence information alone, and thus can be applied to any prokaryote. For Escherichia coli K12 and Bacillus subtilis, our method is 85 and 83% accurate, respectively, which is similar to the accuracy of methods that use the same features but are trained on experimentally characterized transcripts. In Halobacterium NRC-1 and in Helicobacter pylori, our method correctly infers that genes in operons are separated by shorter distances than they are in E. coli, and its predictions using distance alone are more accurate than distance-only predictions trained on a database of E. coli transcripts. We use microarray data from six phylogenetically diverse prokaryotes to show that combining intergenic distance with comparative genomic measures further improves accuracy and that our method is broadly effective. Finally, we survey operon structure across 124 genomes, and find several surprises: H. pylori has many operons, contrary to previous reports; Bacillus anthracis has an unusual number of pseudogenes within conserved operons; and Synechocystis PCC6803 has many operons even though it has unusually wide spacings between conserved adjacent genes.
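
    The distance-only baseline mentioned in this abstract can be illustrated with a deliberately simplified Python sketch: adjacent genes on the same strand are grouped when the gap between them is below a threshold. This is not the authors' method, which tailors itself to each genome and adds comparative genomic measures; the `Gene` class, the `group_by_distance` function, the fixed 50 bp cutoff, and the coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    start: int   # 1-based start coordinate
    end: int     # inclusive end coordinate
    strand: str  # "+" or "-"


def group_by_distance(genes, max_gap=50):
    """Group adjacent same-strand genes whose intergenic gap is small."""
    operons = []
    for gene in sorted(genes, key=lambda g: g.start):
        previous = operons[-1][-1] if operons else None
        same_strand = previous is not None and previous.strand == gene.strand
        if same_strand and gene.start - previous.end <= max_gap:
            operons[-1].append(gene)   # extend the current candidate operon
        else:
            operons.append([gene])     # start a new candidate operon
    return operons


# Hypothetical gene coordinates for illustration only.
genes = [
    Gene("a", 1, 100, "+"),
    Gene("b", 120, 300, "+"),
    Gene("c", 900, 1200, "+"),
    Gene("d", 1210, 1500, "-"),
]
predicted = [[g.name for g in operon] for operon in group_by_distance(genes)]
```

    A single fixed cutoff is the weak point of such a baseline, since the abstract notes that typical within-operon distances differ between genomes such as H. pylori and E. coli.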

  20. The zebrafish reference genome sequence and its relationship to the human genome.

    PubMed

    Howe, Kerstin; Clark, Matthew D; Torroja, Carlos F; Torrance, James; Berthelot, Camille; Muffato, Matthieu; Collins, John E; Humphray, Sean; McLaren, Karen; Matthews, Lucy; McLaren, Stuart; Sealy, Ian; Caccamo, Mario; Churcher, Carol; Scott, Carol; Barrett, Jeffrey C; Koch, Romke; Rauch, Gerd-Jörg; White, Simon; Chow, William; Kilian, Britt; Quintais, Leonor T; Guerra-Assunção, José A; Zhou, Yi; Gu, Yong; Yen, Jennifer; Vogel, Jan-Hinnerk; Eyre, Tina; Redmond, Seth; Banerjee, Ruby; Chi, Jianxiang; Fu, Beiyuan; Langley, Elizabeth; Maguire, Sean F; Laird, Gavin K; Lloyd, David; Kenyon, Emma; Donaldson, Sarah; Sehra, Harminder; Almeida-King, Jeff; Loveland, Jane; Trevanion, Stephen; Jones, Matt; Quail, Mike; Willey, Dave; Hunt, Adrienne; Burton, John; Sims, Sarah; McLay, Kirsten; Plumb, Bob; Davis, Joy; Clee, Chris; Oliver, Karen; Clark, Richard; Riddle, Clare; Elliot, David; Eliott, David; Threadgold, Glen; Harden, Glenn; Ware, Darren; Begum, Sharmin; Mortimore, Beverley; Mortimer, Beverly; Kerry, Giselle; Heath, Paul; Phillimore, Benjamin; Tracey, Alan; Corby, Nicole; Dunn, Matthew; Johnson, Christopher; Wood, Jonathan; Clark, Susan; Pelan, Sarah; Griffiths, Guy; Smith, Michelle; Glithero, Rebecca; Howden, Philip; Barker, Nicholas; Lloyd, Christine; Stevens, Christopher; Harley, Joanna; Holt, Karen; Panagiotidis, Georgios; Lovell, Jamieson; Beasley, Helen; Henderson, Carl; Gordon, Daria; Auger, Katherine; Wright, Deborah; Collins, Joanna; Raisen, Claire; Dyer, Lauren; Leung, Kenric; Robertson, Lauren; Ambridge, Kirsty; Leongamornlert, Daniel; McGuire, Sarah; Gilderthorp, Ruth; Griffiths, Coline; Manthravadi, Deepa; Nichol, Sarah; Barker, Gary; Whitehead, Siobhan; Kay, Michael; Brown, Jacqueline; Murnane, Clare; Gray, Emma; Humphries, Matthew; Sycamore, Neil; Barker, Darren; Saunders, David; Wallis, Justene; Babbage, Anne; Hammond, Sian; Mashreghi-Mohammadi, Maryam; Barr, Lucy; Martin, Sancha; Wray, Paul; Ellington, Andrew; Matthews, Nicholas; Ellwood, Matthew; Woodmansey, Rebecca; Clark, Graham; Cooper, James D; Cooper, James; Tromans, Anthony; Grafham, Darren; Skuce, Carl; Pandian, Richard; Andrews, Robert; Harrison, Elliot; Kimberley, Andrew; Garnett, Jane; Fosker, Nigel; Hall, Rebekah; Garner, Patrick; Kelly, Daniel; Bird, Christine; Palmer, Sophie; Gehring, Ines; Berger, Andrea; Dooley, Christopher M; Ersan-Ürün, Zübeyde; Eser, Cigdem; Geiger, Horst; Geisler, Maria; Karotki, Lena; Kirn, Anette; Konantz, Judith; Konantz, Martina; Oberländer, Martina; Rudolph-Geiger, Silke; Teucke, Mathias; Lanz, Christa; Raddatz, Günter; Osoegawa, Kazutoyo; Zhu, Baoli; Rapp, Amanda; Widaa, Sara; Langford, Cordelia; Yang, Fengtang; Schuster, Stephan C; Carter, Nigel P; Harrow, Jennifer; Ning, Zemin; Herrero, Javier; Searle, Steve M J; Enright, Anton; Geisler, Robert; Plasterk, Ronald H A; Lee, Charles; Westerfield, Monte; de Jong, Pieter J; Zon, Leonard I; Postlethwait, John H; Nüsslein-Volhard, Christiane; Hubbard, Tim J P; Roest Crollius, Hugues; Rogers, Jane; Stemple, Derek L

    2013-04-25

    Zebrafish have become a popular organism for the study of vertebrate gene function. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes, the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination.

  1. EcoGene 3.0

    PubMed Central

    Zhou, Jindan; Rudd, Kenneth E.

    2013-01-01

    EcoGene (http://ecogene.org) is a database and website devoted to continuously improving the structural and functional annotation of Escherichia coli K-12, one of the most well understood model organisms, represented by the MG1655(Seq) genome sequence and annotations. Major improvements to EcoGene in the past decade include (i) graphic presentations of genome map features; (ii) ability to design Boolean queries and Venn diagrams from EcoArray, EcoTopics or user-provided GeneSets; (iii) the genome-wide clone and deletion primer design tool, PrimerPairs; (iv) sequence searches using a customized EcoBLAST; (v) a Cross Reference table of synonymous gene and protein identifiers; (vi) proteome-wide indexing with GO terms; (vii) EcoTools access to >2000 complete bacterial genomes in EcoGene-RefSeq; (viii) establishment of a MySql relational database; and (ix) use of web content management systems. The biomedical literature is surveyed daily to provide citation and gene function updates. As of September 2012, the review of 37 397 abstracts and articles led to creation of 98 425 PubMed-Gene links and 5415 PubMed-Topic links. Annotation updates to Genbank U00096 are transmitted from EcoGene to NCBI. Experimental verifications include confirmation of a CTG start codon, pseudogene restoration and quality assurance of the Keio strain collection. PMID:23197660

  2. The zebrafish reference genome sequence and its relationship to the human genome

    PubMed Central

    Howe, Kerstin; Clark, Matthew D.; Torroja, Carlos F.; Torrance, James; Berthelot, Camille; Muffato, Matthieu; Collins, John E.; Humphray, Sean; McLaren, Karen; Matthews, Lucy; McLaren, Stuart; Sealy, Ian; Caccamo, Mario; Churcher, Carol; Scott, Carol; Barrett, Jeffrey C.; Koch, Romke; Rauch, Gerd-Jörg; White, Simon; Chow, William; Kilian, Britt; Quintais, Leonor T.; Guerra-Assunção, José A.; Zhou, Yi; Gu, Yong; Yen, Jennifer; Vogel, Jan-Hinnerk; Eyre, Tina; Redmond, Seth; Banerjee, Ruby; Chi, Jianxiang; Fu, Beiyuan; Langley, Elizabeth; Maguire, Sean F.; Laird, Gavin K.; Lloyd, David; Kenyon, Emma; Donaldson, Sarah; Sehra, Harminder; Almeida-King, Jeff; Loveland, Jane; Trevanion, Stephen; Jones, Matt; Quail, Mike; Willey, Dave; Hunt, Adrienne; Burton, John; Sims, Sarah; McLay, Kirsten; Plumb, Bob; Davis, Joy; Clee, Chris; Oliver, Karen; Clark, Richard; Riddle, Clare; Eliott, David; Threadgold, Glen; Harden, Glenn; Ware, Darren; Mortimer, Beverly; Kerry, Giselle; Heath, Paul; Phillimore, Benjamin; Tracey, Alan; Corby, Nicole; Dunn, Matthew; Johnson, Christopher; Wood, Jonathan; Clark, Susan; Pelan, Sarah; Griffiths, Guy; Smith, Michelle; Glithero, Rebecca; Howden, Philip; Barker, Nicholas; Stevens, Christopher; Harley, Joanna; Holt, Karen; Panagiotidis, Georgios; Lovell, Jamieson; Beasley, Helen; Henderson, Carl; Gordon, Daria; Auger, Katherine; Wright, Deborah; Collins, Joanna; Raisen, Claire; Dyer, Lauren; Leung, Kenric; Robertson, Lauren; Ambridge, Kirsty; Leongamornlert, Daniel; McGuire, Sarah; Gilderthorp, Ruth; Griffiths, Coline; Manthravadi, Deepa; Nichol, Sarah; Barker, Gary; Whitehead, Siobhan; Kay, Michael; Brown, Jacqueline; Murnane, Clare; Gray, Emma; Humphries, Matthew; Sycamore, Neil; Barker, Darren; Saunders, David; Wallis, Justene; Babbage, Anne; Hammond, Sian; Mashreghi-Mohammadi, Maryam; Barr, Lucy; Martin, Sancha; Wray, Paul; Ellington, Andrew; Matthews, Nicholas; Ellwood, Matthew; Woodmansey, Rebecca; Clark, Graham; Cooper, James; Tromans, Anthony; Grafham, Darren; Skuce, Carl; Pandian, Richard; Andrews, Robert; Harrison, Elliot; Kimberley, Andrew; Garnett, Jane; Fosker, Nigel; Hall, Rebekah; Garner, Patrick; Kelly, Daniel; Bird, Christine; Palmer, Sophie; Gehring, Ines; Berger, Andrea; Dooley, Christopher M.; Ersan-Ürün, Zübeyde; Eser, Cigdem; Geiger, Horst; Geisler, Maria; Karotki, Lena; Kirn, Anette; Konantz, Judith; Konantz, Martina; Oberländer, Martina; Rudolph-Geiger, Silke; Teucke, Mathias; Osoegawa, Kazutoyo; Zhu, Baoli; Rapp, Amanda; Widaa, Sara; Langford, Cordelia; Yang, Fengtang; Carter, Nigel P.; Harrow, Jennifer; Ning, Zemin; Herrero, Javier; Searle, Steve M. J.; Enright, Anton; Geisler, Robert; Plasterk, Ronald H. A.; Lee, Charles; Westerfield, Monte; de Jong, Pieter J.; Zon, Leonard I.; Postlethwait, John H.; Nüsslein-Volhard, Christiane; Hubbard, Tim J. P.; Crollius, Hugues Roest; Rogers, Jane; Stemple, Derek L.

    2013-01-01

    Zebrafish have become a popular organism for the study of vertebrate gene function [1,2]. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease [3–5]. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes [6], the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination. PMID:23594743

  3. EcoGene 3.0.

    PubMed

    Zhou, Jindan; Rudd, Kenneth E

    2013-01-01

    EcoGene (http://ecogene.org) is a database and website devoted to continuously improving the structural and functional annotation of Escherichia coli K-12, one of the most well understood model organisms, represented by the MG1655(Seq) genome sequence and annotations. Major improvements to EcoGene in the past decade include (i) graphic presentations of genome map features; (ii) ability to design Boolean queries and Venn diagrams from EcoArray, EcoTopics or user-provided GeneSets; (iii) the genome-wide clone and deletion primer design tool, PrimerPairs; (iv) sequence searches using a customized EcoBLAST; (v) a Cross Reference table of synonymous gene and protein identifiers; (vi) proteome-wide indexing with GO terms; (vii) EcoTools access to >2000 complete bacterial genomes in EcoGene-RefSeq; (viii) establishment of a MySql relational database; and (ix) use of web content management systems. The biomedical literature is surveyed daily to provide citation and gene function updates. As of September 2012, the review of 37 397 abstracts and articles led to creation of 98 425 PubMed-Gene links and 5415 PubMed-Topic links. Annotation updates to Genbank U00096 are transmitted from EcoGene to NCBI. Experimental verifications include confirmation of a CTG start codon, pseudogene restoration and quality assurance of the Keio strain collection.

  4. A cluster of novel serotonin receptor 3-like genes on human chromosome 3.

    PubMed

    Karnovsky, Alla M; Gotow, Lisa F; McKinley, Denise D; Piechan, Julie L; Ruble, Cara L; Mills, Cynthia J; Schellin, Kathleen A B; Slightom, Jerry L; Fitzgerald, Laura R; Benjamin, Christopher W; Roberds, Steven L

    2003-11-13

    The ligand-gated ion channel family includes receptors for serotonin (5-hydroxytryptamine, 5-HT), acetylcholine, GABA, and glutamate. Drugs targeting subtypes of these receptors have proven useful for the treatment of various neuropsychiatric and neurological disorders. To identify new ligand-gated ion channels as potential therapeutic targets, drafts of human genome sequence were interrogated. Portions of four novel genes homologous to 5-HT(3A) and 5-HT(3B) receptors were identified within human sequence databases. We named the genes 5-HT(3C1)-5-HT(3C4). Radiation hybrid (RH) mapping localized these genes to chromosome 3q27-28. All four genes shared similar intron-exon organizations and predicted protein secondary structure with 5-HT(3A) and 5-HT(3B). Orthologous genes were detected by Southern blotting in several species including dog, cow, and chicken, but not in rodents, suggesting that these novel genes are not present in rodents or are very poorly conserved. Two of the novel genes are predicted to be pseudogenes, but two other genes are transcribed and spliced to form appropriate open reading frames. The 5-HT(3C1) transcript is expressed almost exclusively in small intestine and colon, suggesting a possible role in the serotonin-responsiveness of the gut.

  5. Automated Update, Revision, and Quality Control of the Maize Genome Annotations Using MAKER-P Improves the B73 RefGen_v3 Gene Models and Identifies New Genes

    PubMed Central

    Law, MeiYee; Childs, Kevin L.; Campbell, Michael S.; Stein, Joshua C.; Olson, Andrew J.; Holt, Carson; Panchy, Nicholas; Lei, Jikai; Jiao, Dian; Andorf, Carson M.; Lawrence, Carolyn J.; Ware, Doreen; Shiu, Shin-Han; Sun, Yanni; Jiang, Ning; Yandell, Mark

    2015-01-01

    The large size and relative complexity of many plant genomes make creation, quality control, and dissemination of high-quality gene structure annotations challenging. In response, we have developed MAKER-P, a fast and easy-to-use genome annotation engine for plants. Here, we report the use of MAKER-P to update and revise the maize (Zea mays) B73 RefGen_v3 annotation build (5b+) in less than 3 h using the iPlant Cyberinfrastructure. MAKER-P identified and annotated 4,466 additional, well-supported protein-coding genes not present in the 5b+ annotation build, added additional untranslated regions to 1,393 5b+ gene models, identified 2,647 5b+ gene models that lack any supporting evidence (despite the use of large and diverse evidence data sets), identified 104,215 pseudogene fragments, and created an additional 2,522 noncoding gene annotations. We also describe a method for de novo training of MAKER-P for the annotation of newly sequenced grass genomes. Collectively, these results lead to the 6a maize genome annotation and demonstrate the utility of MAKER-P for rapid annotation, management, and quality control of grasses and other difficult-to-annotate plant genomes. PMID:25384563

  6. Analyses of Sweet Receptor Gene (Tas1r2) and Preference for Sweet Stimuli in Species of Carnivora

    PubMed Central

    Glaser, Dieter; Li, Weihua; Johnson, Warren E.; O'Brien, Stephen J.; Beauchamp, Gary K.; Brand, Joseph G.

    2009-01-01

    The extent to which taste receptor specificity correlates with, or even predicts, diet choice is not known. We recently reported that the insensitivity to sweeteners shown by species of Felidae can be explained by their lack of a functional Tas1r2 gene. To broaden our understanding of the relationship between the structure of the sweet receptors and preference for sugars and artificial sweeteners, we measured responses to 12 sweeteners in 6 species of Carnivora and sequenced the coding regions of Tas1r2 in these same or closely related species. The lion showed no preference for any of the 12 sweet compounds tested, and it possesses a pseudogenized Tas1r2. All other species preferred some of the natural sugars, and their Tas1r2 sequences, having complete open reading frames, predict functional sweet receptors. In addition to preferring natural sugars, the lesser panda also preferred 3 (neotame, sucralose, and aspartame) of the 6 artificial sweeteners. Heretofore, it had been reported that among vertebrates, only Old World simians could taste aspartame. The observation that the lesser panda highly preferred aspartame could be an example of evolutionary convergence in the identification of sweet stimuli. PMID:19366814
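
    The contrast drawn here between a complete open reading frame and a pseudogenized copy can be sketched as a simple sequence check. This is an illustrative test only, not the authors' procedure: the function name has_complete_orf is hypothetical, and it assumes the standard genetic code, an ATG start codon, and an input that is already a spliced coding sequence in frame.

```python
# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_complete_orf(cds: str) -> bool:
    """Return True if cds starts with ATG, ends with a stop codon,
    has a length divisible by three, and has no internal stop codon.

    Assumption: alternative start codons are not considered.
    """
    seq = cds.upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False  # a frameshift or truncation breaks the reading frame
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # A stop codon before the last position is a premature stop.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])
```

    A sequence that fails this check is only a pseudogene candidate; frameshifts and premature stops can also come from sequencing or assembly errors, so confirmation on independent samples is the usual next step.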

  7. Cold-adapted tubulins in the glacier ice worm, Mesenchytraeus solifugus.

    PubMed

    Tartaglia, Lawrence J; Shain, Daniel H

    2008-11-01

    Glacier ice worms, Mesenchytraeus solifugus and related species, are the only known annelids that survive obligately in glacier ice and snow. One fundamental component of cold temperature adaptation is the ability to polymerize tubulin, which typically depolymerizes at low physiological temperatures (e.g., <10 degrees C) in most temperate species. In this study, we isolated two alpha-tubulin (Msalpha1, Msalpha2) and two beta-tubulin (Msbeta1, Msbeta2) subunits from an ice worm cDNA library, and compared their predicted amino acid sequences with homologues from other cold-adapted organisms (e.g., Antarctic fish, ciliate) in an effort to identify species-specific amino acid substitutions that contribute to cold temperature-dependent tubulin polymerization. Our comparisons and predicted protein structures suggest that ice worm-specific amino acid substitutions stabilize lateral contact associations, particularly between beta-tubulin protofilaments, but these substitutions occur at different positions in comparison with other cold-adapted tubulins. The ice worm tubulin gene family appears relatively small, comprising one primary alpha- and one primary beta-tubulin monomers, though minor isoforms and pseudogenes were identified. Our analyses suggest that variation occurs in the strategies (i.e., species-specific amino acid substitutions, gene number) by which cold-adapted taxa have evolved the ability to polymerize tubulin at low physiological temperatures.

  8. The Plasmodium PHIST and RESA-Like Protein Families of Human and Rodent Malaria Parasites

    PubMed Central

    Moreira, Cristina K.; Naissant, Bernina; Coppi, Alida; Bennett, Brandy L.; Aime, Elena; Franke-Fayard, Blandine; Janse, Chris J.; Coppens, Isabelle; Sinnis, Photini; Templeton, Thomas J.

    2016-01-01

    The phist gene family has members identified across the Plasmodium genus, defined by the presence of a domain of roughly 150 amino acids having conserved aromatic residues and an all alpha-helical structure. The family is highly amplified in P. falciparum, with 65 predicted genes in the genome of the 3D7 isolate. In contrast, in the rodent malaria parasite P. berghei 3 genes are identified, one of which is an apparent pseudogene. Transcripts of the P. berghei phist genes are predominant in schizonts, whereas in P. falciparum transcript profiles span different asexual blood stages and gametocytes. We pursued targeted disruption of P. berghei phist genes in order to characterize a simplistic model for the expanded phist gene repertoire in P. falciparum. Unsuccessful attempts to disrupt P. berghei PBANKA_114540 suggest that this phist gene is essential, while knockout of phist PBANKA_122900 shows an apparent normal progression and non-essential function throughout the life cycle. Epitope-tagging of P. falciparum and P. berghei phist genes confirmed protein export to the erythrocyte cytoplasm and localization with a punctate pattern. Three P. berghei PEXEL/HT-positive exported proteins exhibit at least partial co-localization, in support of a common vesicular compartment in the cytoplasm of erythrocytes infected with rodent malaria parasites. PMID:27022937

  9. Variation in the nrDNA ITS of Pinus subsection Cembroides: implications for molecular systematic studies of pine species complexes.

    PubMed

    Gernandt, D S; Liston, A; Piñero, D

    2001-12-01

    The pinyon pines (Pinus subsection Cembroides), distributed in semiarid regions of the western United States and Mexico, include a mixture of relictual and more recently evolved taxa. To investigate relationships among the pinyons, we screened and partially sequenced 3000-bp clones of the nuclear ribosomal DNA internal transcribed spacer (ITS) region for 16 taxa from subsect. Cembroides and nine representatives from four other subsections of subgenus Strobus. Restriction digests of clones reveal within-individual heterogeneity, suggesting that concerted evolution is operating slowly on the ITS in pine species. Two ITS clones were identified as pseudogenes. Tandem subrepeats in the ITS1 form stem loops comparable to those in other genera of Pinaceae and may be promoting recombination between rDNA repeats, resulting in ITS1 chimeras. Within the pinyon clade, phylogenetic structure is present, but different clones from the same (or different) individuals of a species are polyphyletic, indicating that coalescence of ITS copies within individual genomes predates evolutionary divergence in the group. At the level of subsection and above, the ITS region corresponds well with morphological and cpDNA evidence. Except for P. nelsonii, the pinyons are monophyletic, with both subsect. Cembroides and P. nelsonii forming a clade with the foxtail and bristlecone pines (subsect. Balfourianae) of western North America.

  10. Insights into natural products biosynthesis from analysis of 490 polyketide synthases from Fusarium.

    PubMed

    Brown, Daren W; Proctor, Robert H

    2016-04-01

    Species of the fungus Fusarium collectively cause disease on almost all crop plants and produce numerous natural products (NPs), including some of the mycotoxins of greatest concern to agriculture. Many Fusarium NPs are derived from polyketide synthases (PKSs), large multi-domain enzymes that catalyze sequential condensation of simple carboxylic acids to form polyketides. To gain insight into the biosynthesis of polyketide-derived NPs in Fusarium, we retrieved 488 PKS gene sequences from genome sequences of 31 species of the fungus. In addition to these apparently functional PKS genes, the genomes collectively included 81 pseudogenized PKS genes. Phylogenetic analysis resolved the PKS genes into 67 clades, and based on multiple lines of evidence, we propose that homologs in each clade are responsible for synthesis of a polyketide that is distinct from those synthesized by PKSs in other clades. The presence and absence of PKS genes among the species examined indicated marked differences in distribution of PKS homologs. Comparisons of Fusarium PKS genes and genes flanking them to those from other Ascomycetes provided evidence that Fusarium has the genetic potential to synthesize multiple NPs that are the same or similar to those reported in other fungi, but that have not yet been reported in Fusarium. The results also highlight ways in which such analyses can help guide identification of novel Fusarium NPs and differences in NP biosynthetic capabilities that exist among fungi. Published by Elsevier Inc.

  11. Why does the giant panda eat bamboo? A comparative analysis of appetite-reward-related genes among mammals.

    PubMed

    Jin, Ke; Xue, Chenyi; Wu, Xiaoli; Qian, Jinyi; Zhu, Yong; Yang, Zhen; Yonezawa, Takahiro; Crabbe, M James C; Cao, Ying; Hasegawa, Masami; Zhong, Yang; Zheng, Yufang

    2011-01-01

    The giant panda has an interesting bamboo diet unlike the other species in the order of Carnivora. The umami taste receptor gene T1R1 has been identified as a pseudogene during its genome sequencing project and confirmed using a different giant panda sample. The estimated mutation time for this gene is about 4.2 Myr. Such mutation coincided with the giant panda's dietary change and also reinforced its herbivorous life style. However, as this gene is preserved in herbivores such as cow and horse, we need to look for other reasons behind the giant panda's diet switch. Since taste is part of the reward properties of food related to its energy and nutrition contents, we did a systematic analysis on those genes involved in the appetite-reward system for the giant panda. We extracted the giant panda sequence information for those genes and compared with the human sequence first and then with seven other species including chimpanzee, mouse, rat, dog, cat, horse, and cow. Orthologs in panda were further analyzed based on the coding region, Kozak consensus sequence, and potential microRNA binding of those genes. Our results revealed an interesting dopamine metabolic involvement in the panda's food choice. This finding suggests a new direction for molecular evolution studies behind the panda's dietary switch.

  12. Why Does the Giant Panda Eat Bamboo? A Comparative Analysis of Appetite-Reward-Related Genes among Mammals

    PubMed Central

    Jin, Ke; Xue, Chenyi; Wu, Xiaoli; Qian, Jinyi; Zhu, Yong; Yang, Zhen; Yonezawa, Takahiro; Crabbe, M. James C.; Cao, Ying; Hasegawa, Masami; Zhong, Yang; Zheng, Yufang

    2011-01-01

    Background The giant panda has an interesting bamboo diet unlike the other species in the order of Carnivora. The umami taste receptor gene T1R1 has been identified as a pseudogene during its genome sequencing project and confirmed using a different giant panda sample. The estimated mutation time for this gene is about 4.2 Myr. Such mutation coincided with the giant panda's dietary change and also reinforced its herbivorous life style. However, as this gene is preserved in herbivores such as cow and horse, we need to look for other reasons behind the giant panda's diet switch. Methodology/Principal Findings Since taste is part of the reward properties of food related to its energy and nutrition contents, we did a systematic analysis on those genes involved in the appetite-reward system for the giant panda. We extracted the giant panda sequence information for those genes and compared with the human sequence first and then with seven other species including chimpanzee, mouse, rat, dog, cat, horse, and cow. Orthologs in panda were further analyzed based on the coding region, Kozak consensus sequence, and potential microRNA binding of those genes. Conclusions/Significance Our results revealed an interesting dopamine metabolic involvement in the panda's food choice. This finding suggests a new direction for molecular evolution studies behind the panda's dietary switch. PMID:21818345

  13. Complete genome analysis of Serratia marcescens RSC-14: A plant growth-promoting bacterium that alleviates cadmium stress in host plants

    PubMed Central

    Khan, Abdur Rahim; Park, Gun-Seok; Asaf, Sajjad; Hong, Sung-Jun; Jung, Byung Kwon

    2017-01-01

    Serratia marcescens RSC-14 is a Gram-negative bacterium that was previously isolated from the surface-sterilized roots of the Cd-hyperaccumulator Solanum nigrum. The strain stimulates plant growth and alleviates Cd stress in host plants. To investigate the genetic basis for these traits, the complete genome of RSC-14 was obtained by single-molecule real-time sequencing. The genome of S. marcescens RSC-14 comprised a 5.12-Mbp-long circular chromosome containing 4,593 predicted protein-coding genes, 22 rRNA genes, 88 tRNA genes, and 41 pseudogenes. It contained genes with potential functions in plant growth promotion, including genes involved in indole-3-acetic acid (IAA) biosynthesis, acetoin synthesis, and phosphate solubilization. Moreover, annotation using NCBI and Rapid Annotation using Subsystem Technology identified several genes that encode antioxidant enzymes as well as genes involved in antioxidant production, supporting the observed resistance towards heavy metals, such as Cd. The presence of IAA pathway-related genes and oxidative stress-responsive enzyme genes may explain the plant growth-promoting potential and Cd tolerance, respectively. This is the first report of a complete genome sequence of Cd-tolerant S. marcescens and its plant growth promotion pathway. The whole-genome analysis of this strain clarified the genetic basis underlying its phenotypic and biochemical characteristics, underpinning the beneficial interactions between RSC-14 and plants. PMID:28187139

  14. Genomic structure of rat 3alpha-hydroxysteroid/dihydrodiol dehydrogenase (3alpha-HSD/DD, AKR1C9).

    PubMed

    Lin, H K; Hung, C F; Moore, M; Penning, T M

    1999-11-01

    Rat liver 3alpha-hydroxysteroid/dihydrodiol dehydrogenase (3alpha-HSD/DD) is a member of the aldo-keto reductase (AKR) superfamily. It is involved in the inactivation of steroid hormones and the metabolic activation of polycyclic aromatic hydrocarbons (PAH) by converting trans-dihydrodiols into reactive and redox-active o-quinones. The structure of the 5'-flanking region of the gene and factors involved in the constitutive and regulated expression of this gene have been reported [H.-K. Lin, T.M. Penning, Cloning, sequencing, and functional analysis of the 5'-flanking region of the rat 3alpha-hydroxysteroid/dihydrodiol dehydrogenase gene, Cancer Res. 55 (1995) 4105-4113]. We now describe the complete genomic structure of the rat type 1 3alpha-HSD/DD gene. Charon 4A and P1 genomic clones contained at least three rat genes (type 1, type 2 and type 3 3alpha-HSD/DD) each of which encoded the same open reading frame (ORF) but differed in their exon-intron organization. 5'-RACE confirmed that the type 1 3alpha-HSD/DD gene encodes the dominant transcript in rat liver and it was the regulation of this gene that was previously studied. The rat type 1 3alpha-HSD/DD gene is 30 kb in length and consists of nine exons and eight introns. Exon 9 encodes +931 to 966 bp of the ORF and the 1292 bp 3'-UTR implicated in mRNA stability. This genomic structure is nearly identical to the homologous human genes, type 1 3alpha-HSD (chlordecone reductase/DD4, AKR1C4), type 2 3alpha-HSD (AKR1C3) and type 3 3alpha-HSD (bile-acid binding protein, AKR1C2) genes. Three different cDNAs containing identical ORFs for 3alpha-HSD have been reported suggesting that all three genes may be expressed in rat liver. Using 5' primers corresponding to the 5'-UTRs of the three different cDNAs only one PCR fragment was obtained and corresponded to the type 1 3alpha-HSD/DD gene. These data suggested that the type 2 and type 3 3alpha-HSD/DD genes are not abundantly expressed in rat liver. It is unknown whether the type 2 and type 3 3alpha-HSD/DD genes represent pseudogenes or whether they represent genes that are differentially expressed in other rat tissues.

  15. Decoding the similarities and differences among mycobacterial species

    PubMed Central

    Vedithi, Sundeep Chaitanya; Blundell, Tom L.

    2017-01-01

    Mycobacteriaceae comprises pathogenic species such as Mycobacterium tuberculosis, M. leprae and M. abscessus, as well as non-pathogenic species, for example, M. smegmatis and M. thermoresistibile. Genome comparison and annotation studies provide insights into genome evolutionary relatedness, identify unique and pathogenicity-related genes in each species, and explore new targets that could be used for developing new diagnostics and therapeutics. Here, we present a comparative analysis of ten mycobacterial genomes with the objective of identifying similarities and differences between pathogenic and non-pathogenic species. We identified 1080 core orthologous clusters that were enriched in proteins involved in amino acid and purine/pyrimidine biosynthetic pathways, DNA-related processes (replication, transcription, recombination and repair), RNA-methylation and modification, and cell-wall polysaccharide biosynthetic pathways. For their pathogenicity and survival in the host cell, pathogenic species have gained specific sets of genes involved in repair and protection of their genomic DNA. M. leprae is of special interest because it has the smallest genome (1600 genes and ~1300 pseudogenes) yet remains poorly annotated. More than 75% of the pseudogenes were found to have a functional ortholog in the other mycobacterial genomes and belong to protein families such as transferases, oxidoreductases and hydrolases. PMID:28854187

  16. Comparison of genome degradation in Paratyphi A and Typhi, human-restricted serovars of Salmonella enterica that cause typhoid.

    PubMed

    McClelland, Michael; Sanderson, Kenneth E; Clifton, Sandra W; Latreille, Phil; Porwollik, Steffen; Sabo, Aniko; Meyer, Rekha; Bieri, Tamberlyn; Ozersky, Phil; McLellan, Michael; Harkins, C Richard; Wang, Chunyan; Nguyen, Christine; Berghoff, Amy; Elliott, Glendoria; Kohlberg, Sara; Strong, Cindy; Du, Feiyu; Carter, Jason; Kremizki, Colin; Layman, Dan; Leonard, Shawn; Sun, Hui; Fulton, Lucinda; Nash, William; Miner, Tracie; Minx, Patrick; Delehaunty, Kim; Fronick, Catrina; Magrini, Vincent; Nhan, Michael; Warren, Wesley; Florea, Liliana; Spieth, John; Wilson, Richard K

    2004-12-01

    Salmonella enterica serovars often have a broad host range, and some cause both gastrointestinal and systemic disease. But the serovars Paratyphi A and Typhi are restricted to humans and cause only systemic disease. It has been estimated that Typhi arose in the last few thousand years. The sequence and microarray analysis of the Paratyphi A genome indicates that it is similar to the Typhi genome but suggests that it has a more recent evolutionary origin. Both genomes have independently accumulated many pseudogenes among their approximately 4,400 protein coding sequences: 173 in Paratyphi A and approximately 210 in Typhi. The recent convergence of these two similar genomes on a similar phenotype is subtly reflected in their genotypes: only 30 genes are degraded in both serovars. Nevertheless, these 30 genes include three known to be important in gastroenteritis, which does not occur in these serovars, and four for Salmonella-translocated effectors, which are normally secreted into host cells to subvert host functions. Loss of function also occurs by mutation in different genes in the same pathway (e.g., in chemotaxis and in the production of fimbriae).

  17. Deep Recurrent Neural Network-Based Autoencoders for Acoustic Novelty Detection

    PubMed Central

    Vesperini, Fabio; Schuller, Björn

    2017-01-01

    In the emerging field of acoustic novelty detection, most research efforts are devoted to probabilistic approaches such as mixture models or state-space models. Only recent studies introduced (pseudo-)generative models for acoustic novelty detection with recurrent neural networks in the form of an autoencoder. In these approaches, auditory spectral features of the next short-term frame are predicted from the previous frames by means of Long Short-Term Memory recurrent denoising autoencoders. The reconstruction error between the input and the output of the autoencoder is used as the activation signal to detect novel events. There is no evidence of studies that compare previous efforts to automatically recognize novel events from audio signals and give a broad, in-depth evaluation of recurrent neural network-based autoencoders. The present contribution aims to consistently evaluate our recent novel approaches to fill this white spot in the literature and provide insight through extensive evaluations carried out on three databases: A3Novelty, PASCAL CHiME, and PROMETHEUS. Besides providing an extensive analysis of novel and state-of-the-art methods, the article shows how RNN-based autoencoders outperform statistical approaches by up to an absolute improvement of 16.4% average F-measure over the three databases. PMID:28182121

  18. Major Histocompatibility Complex Genes Map to Two Chromosomes in an Evolutionarily Ancient Reptile, the Tuatara Sphenodon punctatus

    PubMed Central

    Miller, Hilary C.; O’Meally, Denis; Ezaz, Tariq; Amemiya, Chris; Marshall-Graves, Jennifer A.; Edwards, Scott

    2015-01-01

    Major histocompatibility complex (MHC) genes are a central component of the vertebrate immune system and usually exist in a single genomic region. However, considerable differences in MHC organization and size exist between different vertebrate lineages. Reptiles occupy a key evolutionary position for understanding how variation in MHC structure evolved in vertebrates, but information on the structure of the MHC region in reptiles is limited. In this study, we investigate the organization and cytogenetic location of MHC genes in the tuatara (Sphenodon punctatus), the sole extant representative of the early-diverging reptilian order Rhynchocephalia. Sequencing and mapping of 12 clones containing class I and II MHC genes from a bacterial artificial chromosome library indicated that the core MHC region is located on chromosome 13q. However, duplication and translocation of MHC genes outside of the core region was evident, because additional class I MHC genes were located on chromosome 4p. We found a total of seven class I sequences and 11 class II β sequences, with evidence for duplication and pseudogenization of genes within the tuatara lineage. The tuatara MHC is characterized by high repeat content and low gene density compared with other species and we found no antigen processing or MHC framework genes on the MHC gene-containing clones. Our findings indicate substantial differences in MHC organization in tuatara compared with mammalian and avian MHCs and highlight the dynamic nature of the MHC. Further sequencing and annotation of tuatara and other reptile MHCs will determine if the tuatara MHC is representative of nonavian reptiles in general. PMID:25953959

  19. The ubiquitous mitochondrial creatine kinase gene maps to a conserved region on human chromosome 15q15 and mouse chromosome 2 bands F1-F3

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Steeghs, K.; Wieringa, B.; Merkx, G.

    1994-11-01

    Members of the creatine kinase isoenzyme family (CKs; EC 2.7.3.2) are found in mitochondria and specialized subregions of the cytoplasm and catalyze the reversible exchange of high-energy phosphoryl between ATP and phosphocreatine. At least four functionally active genes, which encode the distinct CK subunits CKB, CKM, CKMT1 (ubiquitous), and CKMT2 (sarcomeric), and a variable number of CKB pseudogenes have been identified. Here, we report the use of a CKMT1 containing phage to map the CKMT1 gene by in situ hybridization on both human and mouse chromosomes.

  20. Linkage of genes for laminin B1 and B2 subunits on chromosome 1 in mouse.

    PubMed

    Elliott, R W; Barlow, D; Hogan, B L

    1985-08-01

    We have used cDNA clones for the B1 and B2 subunits of laminin to find restriction fragment length DNA polymorphisms for the genes encoding these polypeptides in the mouse. Three alleles were found for LamB2 and two for LamB1 among the inbred mouse strains. The segregation of these polymorphisms among recombinant inbred strains showed that these genes are tightly linked in the central region of mouse Chromosome 1 between Sas-1 and Ly-m22, 7.4 +/- 3.2 cM distal to the Pep-3 locus. There is no evidence in the mouse for pseudogenes for these proteins.

  1. Functional Annotation, Genome Organization and Phylogeny of the Grapevine (Vitis vinifera) Terpene Synthase Gene Family Based on Genome Assembly, FLcDNA Cloning, and Enzyme Assays

    PubMed Central

    2010-01-01

    Background Terpenoids are among the most important constituents of grape flavour and wine bouquet, and serve as useful metabolite markers in viticulture and enology. Based on the initial 8-fold sequencing of a nearly homozygous Pinot noir inbred line, 89 putative terpenoid synthase genes (VvTPS) were predicted by in silico analysis of the grapevine (Vitis vinifera) genome assembly [1]. The finding of this very large VvTPS family, combined with the importance of terpenoid metabolism for the organoleptic properties of grapevine berries and finished wines, prompted a detailed examination of this gene family at the genomic level as well as an investigation into VvTPS biochemical functions. Results We present findings from the analysis of the up-dated 12-fold sequencing and assembly of the grapevine genome that place the number of predicted VvTPS genes at 69 putatively functional VvTPS, 20 partial VvTPS, and 63 VvTPS probable pseudogenes. Gene discovery and annotation included information about gene architecture and chromosomal location. A dense cluster of 45 VvTPS is localized on chromosome 18. Extensive FLcDNA cloning, gene synthesis, and protein expression enabled functional characterization of 39 VvTPS; this is the largest number of functionally characterized TPS for any species reported to date. Of these enzymes, 23 have unique functions and/or phylogenetic locations within the plant TPS gene family. Phylogenetic analyses of the TPS gene family showed that while most VvTPS form species-specific gene clusters, there are several examples of gene orthology with TPS of other plant species, representing perhaps more ancient VvTPS, which have maintained functions independent of speciation. Conclusions The highly expanded VvTPS gene family underpins the prominence of terpenoid metabolism in grapevine. 
We provide a detailed experimental functional annotation of 39 members of this important gene family in grapevine and comprehensive information about gene structure and phylogeny for the entire currently known VvTPS gene family. PMID:20964856

  2. Identification and expression analysis of two interleukin-23α (p19) isoforms, in rainbow trout Oncorhynchus mykiss and Atlantic salmon Salmo salar.

    PubMed

    Jiang, Yousheng; Husain, Mansourah; Qi, Zhitao; Bird, Steve; Wang, Tiehui

    2015-08-01

    Interleukin (IL)-23 is a heterodimeric IL-12 family cytokine composed of a p19 α-chain, linked to a p40 β-chain that is shared with IL-12. IL-23 is distinguished functionally from IL-12 by its ability to induce the production of IL-17, and differentiation of Th17 cells in mammals. Three isoforms of p40 (p40a, p40b and p40c) have been found in some 3R teleosts. Salmonids also possess three p40 isoforms (p40b1, p40b2 and p40c) although p40a is missing, and two copies (paralogues) of p40b are present that have presumably been retained following the 4R duplication in this fish lineage. Teleost p19 has been discovered recently in zebrafish, but to date there is limited information on expression and modulation of this molecule. In this report we have cloned two p19 paralogues (p19a and p19b) in salmonids, suggesting that a salmonid can possess six potential IL-23 isoforms. Whilst Atlantic salmon has two active p19 genes, the rainbow trout p19b gene may have been pseudogenized. The salmonid p19 translations share moderate identities (22.8-29.9%) to zebrafish and mammalian p19 molecules, but their identity was supported by structural features, a conserved 4 exon/3 intron gene organisation, and phylogenetic tree analysis. The active salmonid p19 genes are highly expressed in blood and gonad. Bacterial (Yersinia ruckeri) and viral infection in rainbow trout induces the expression of p19a, suggesting pathogen-specific induction of IL-23 isoforms. Trout p19a expression was also induced by PAMPs (poly IC and peptidoglycan) and the proinflammatory cytokine IL-1β in primary head kidney macrophages. These data may indicate diverse functional roles of trout IL-23 isoforms in regulating the immune response in fish. Copyright © 2015 Elsevier Ltd. All rights reserved.

  3. Contrasting allelic distribution of CO/Hd1 homologues in Miscanthus sinensis from the East Asian mainland and the Japanese archipelago

    DOE PAGES

    Nagano, Hironori; Clark, Lindsay V.; Zhao, Hua; ...

    2015-06-18

    The genus Miscanthus is a perennial C4 grass native to eastern Asia and is a promising candidate bioenergy crop for cool temperate areas. Flowering time is a crucial factor governing regional and seasonal adaptation; in addition, it is also a key target trait for extending the vegetative phase to improve biomass potential. Homologues of CONSTANS (CO)/Heading date 1(Hd1) were cloned from Miscanthus sinensis and named MsiHd1. Sequences of MsiHd1 homologues were compared among 24 wild M. sinensis accessions from Japan, 14 from China, and three from South Korea. Two to five MsiHd1 alleles in each accession were identified, suggesting that MsiHd1 consists of at least three loci in the Miscanthus genome. Verifying the open reading frame in MsiHd1, they were classified as putative functional alleles without mutations or non-functional alleles caused by indels. The Neighbor-Joining tree indicated that one of the multiple MsiHd1 loci is a pseudogene locus without any functional alleles. The pseudogene locus was named MsiHd1b, and the other loci were considered to be part of the MsiHd1a multi-locus family. Interestingly, in most Japanese accessions 50% or more of the MsiHd1a alleles were non-functional, whereas accessions from the East Asian mainland harboured only functional alleles. Five novel miniature inverted transposable elements (MITEs) (MsiMITE1-MsiMITE5) were observed in MsiHd1a/b. MsiMITE1, detected in exon 1 of MsiHd1a, was only observed in Japanese accessions and its revertant alleles derived from retransposition were predominantly in Chinese accessions. In conclusion, these differences in MsiHd1a show that the dependency on functional MsiHd1a alleles is different between accessions from the East Asian mainland and Japan.

  4. Contrasting allelic distribution of CO/Hd1 homologues in Miscanthus sinensis from the East Asian mainland and the Japanese archipelago

    PubMed Central

    Nagano, Hironori; Clark, Lindsay V.; Zhao, Hua; Peng, Junhua; Yoo, Ji Hye; Heo, Kweon; Yu, Chang Yeon; Anzoua, Kossonou Guillaume; Matsuo, Tomoaki; Sacks, Erik J.; Yamada, Toshihiko

    2015-01-01

    The genus Miscanthus is a perennial C4 grass native to eastern Asia and is a promising candidate bioenergy crop for cool temperate areas. Flowering time is a crucial factor governing regional and seasonal adaptation; in addition, it is also a key target trait for extending the vegetative phase to improve biomass potential. Homologues of CONSTANS (CO)/Heading date 1(Hd1) were cloned from Miscanthus sinensis and named MsiHd1. Sequences of MsiHd1 homologues were compared among 24 wild M. sinensis accessions from Japan, 14 from China, and three from South Korea. Two to five MsiHd1 alleles in each accession were identified, suggesting that MsiHd1 consists of at least three loci in the Miscanthus genome. Verifying the open reading frame in MsiHd1, they were classified as putative functional alleles without mutations or non-functional alleles caused by indels. The Neighbor–Joining tree indicated that one of the multiple MsiHd1 loci is a pseudogene locus without any functional alleles. The pseudogene locus was named MsiHd1b, and the other loci were considered to be part of the MsiHd1a multi-locus family. Interestingly, in most Japanese accessions 50% or more of the MsiHd1a alleles were non-functional, whereas accessions from the East Asian mainland harboured only functional alleles. Five novel miniature inverted transposable elements (MITEs) (MsiMITE1–MsiMITE5) were observed in MsiHd1a/b. MsiMITE1, detected in exon 1 of MsiHd1a, was only observed in Japanese accessions and its revertant alleles derived from retransposition were predominantly in Chinese accessions. These differences in MsiHd1a show that the dependency on functional MsiHd1a alleles is different between accessions from the East Asian mainland and Japan. PMID:26089536

  5. Contrasting allelic distribution of CO/Hd1 homologues in Miscanthus sinensis from the East Asian mainland and the Japanese archipelago.

    PubMed

    Nagano, Hironori; Clark, Lindsay V; Zhao, Hua; Peng, Junhua; Yoo, Ji Hye; Heo, Kweon; Yu, Chang Yeon; Anzoua, Kossonou Guillaume; Matsuo, Tomoaki; Sacks, Erik J; Yamada, Toshihiko

    2015-07-01

    The genus Miscanthus is a perennial C4 grass native to eastern Asia and is a promising candidate bioenergy crop for cool temperate areas. Flowering time is a crucial factor governing regional and seasonal adaptation; in addition, it is also a key target trait for extending the vegetative phase to improve biomass potential. Homologues of CONSTANS (CO)/Heading date 1(Hd1) were cloned from Miscanthus sinensis and named MsiHd1. Sequences of MsiHd1 homologues were compared among 24 wild M. sinensis accessions from Japan, 14 from China, and three from South Korea. Two to five MsiHd1 alleles in each accession were identified, suggesting that MsiHd1 consists of at least three loci in the Miscanthus genome. Verifying the open reading frame in MsiHd1, they were classified as putative functional alleles without mutations or non-functional alleles caused by indels. The Neighbor-Joining tree indicated that one of the multiple MsiHd1 loci is a pseudogene locus without any functional alleles. The pseudogene locus was named MsiHd1b, and the other loci were considered to be part of the MsiHd1a multi-locus family. Interestingly, in most Japanese accessions 50% or more of the MsiHd1a alleles were non-functional, whereas accessions from the East Asian mainland harboured only functional alleles. Five novel miniature inverted transposable elements (MITEs) (MsiMITE1-MsiMITE5) were observed in MsiHd1a/b. MsiMITE1, detected in exon 1 of MsiHd1a, was only observed in Japanese accessions and its revertant alleles derived from retransposition were predominantly in Chinese accessions. These differences in MsiHd1a show that the dependency on functional MsiHd1a alleles is different between accessions from the East Asian mainland and Japan. © The Author 2015. Published by Oxford University Press on behalf of the Society for Experimental Biology.

  6. Differentiated adaptive evolution, episodic relaxation of selective constraints, and pseudogenization of umami and sweet taste genes TAS1Rs in catarrhine primates.

    PubMed

    Liu, Guangjian; Walter, Lutz; Tang, Suni; Tan, Xinxin; Shi, Fanglei; Pan, Huijuan; Roos, Christian; Liu, Zhijin; Li, Ming

    2014-01-01

    Umami and sweet tastes are two important basic taste perceptions that allow animals to recognize diets with nutritious proteins and carbohydrates, respectively. Until recently, analyses of umami and sweet taste were performed on various domestic and wild animals. While most of these studies focused on the pseudogenization of taste genes, which occurs mostly in carnivores and species with absolute feeding specialization, omnivores and herbivores were more or less neglected. Catarrhine primates are a group of herbivorous animals (feeding mostly on plants) with significant divergence in dietary preference, especially the specialized folivorous Colobinae. Here, we conducted the most comprehensive investigation to date of selection pressure on sweet and umami taste genes (TAS1Rs) in catarrhine primates to test whether specific adaptive evolution occurred during their diversification, in association with particular plant diets. We documented significant relaxation of selective constraints on sweet taste gene TAS1R2 in the ancestral branch of Colobinae, which might correlate with their unique ingestion and digestion of leaves. Additionally, we identified positive selection acting on Cercopithecidae lineages for the umami taste gene TAS1R1, on the Cercopithecinae and extant Colobinae and Hylobatidae lineages for TAS1R2, and on Macaca lineages for TAS1R3. Our research further identified several site mutations in Cercopithecidae, Colobinae and Pygathrix that previous studies had found to alter the sensitivity of receptors. The positively selected sites were located mostly on the extra-cellular region of TAS1Rs. Among these positively selected sites, two vital sites for TAS1R1 and four vital sites for TAS1R2 in the extra-cellular region were identified as being responsible for the binding of certain sweet and umami taste molecules through molecular modelling and docking.
Our results suggest that episodic and differentiated adaptive evolution of TAS1Rs pervasively occurred in catarrhine primates, most concentrated upon the extra-cellular region of TAS1Rs.

  7. Contrasting allelic distribution of CO/Hd1 homologues in Miscanthus sinensis from the East Asian mainland and the Japanese archipelago

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nagano, Hironori; Clark, Lindsay V.; Zhao, Hua

    The genus Miscanthus is a perennial C4 grass native to eastern Asia and is a promising candidate bioenergy crop for cool temperate areas. Flowering time is a crucial factor governing regional and seasonal adaptation; in addition, it is also a key target trait for extending the vegetative phase to improve biomass potential. Homologues of CONSTANS (CO)/Heading date 1(Hd1) were cloned from Miscanthus sinensis and named MsiHd1. Sequences of MsiHd1 homologues were compared among 24 wild M. sinensis accessions from Japan, 14 from China, and three from South Korea. Two to five MsiHd1 alleles in each accession were identified, suggesting that MsiHd1 consists of at least three loci in the Miscanthus genome. Verifying the open reading frame in MsiHd1, they were classified as putative functional alleles without mutations or non-functional alleles caused by indels. The Neighbor-Joining tree indicated that one of the multiple MsiHd1 loci is a pseudogene locus without any functional alleles. The pseudogene locus was named MsiHd1b, and the other loci were considered to be part of the MsiHd1a multi-locus family. Interestingly, in most Japanese accessions 50% or more of the MsiHd1a alleles were non-functional, whereas accessions from the East Asian mainland harboured only functional alleles. Five novel miniature inverted transposable elements (MITEs) (MsiMITE1-MsiMITE5) were observed in MsiHd1a/b. MsiMITE1, detected in exon 1 of MsiHd1a, was only observed in Japanese accessions and its revertant alleles derived from retransposition were predominantly in Chinese accessions. In conclusion, these differences in MsiHd1a show that the dependency on functional MsiHd1a alleles is different between accessions from the East Asian mainland and Japan.

  8. International Union of Basic and Clinical Pharmacology. LXXXVIII. G Protein-Coupled Receptor List: Recommendations for New Pairings with Cognate Ligands

    PubMed Central

    Alexander, Stephen P. H.; Sharman, Joanna L.; Pawson, Adam J.; Benson, Helen E.; Monaghan, Amy E.; Liew, Wen Chiy; Mpamhanga, Chidochangu P.; Bonner, Tom I.; Neubig, Richard R.; Pin, Jean Philippe; Spedding, Michael; Harmar, Anthony J.

    2013-01-01

    In 2005, the International Union of Basic and Clinical Pharmacology Committee on Receptor Nomenclature and Drug Classification (NC-IUPHAR) published a catalog of all of the human gene sequences known or predicted to encode G protein-coupled receptors (GPCRs), excluding sensory receptors. This review updates the list of orphan GPCRs and describes the criteria used by NC-IUPHAR to recommend the pairing of an orphan receptor with its cognate ligand(s). The following recommendations are made for new receptor names based on 11 pairings for class A GPCRs: hydroxycarboxylic acid receptors [HCA1 (GPR81) with lactate, HCA2 (GPR109A) with 3-hydroxybutyric acid, HCA3 (GPR109B) with 3-hydroxyoctanoic acid]; lysophosphatidic acid receptors [LPA4 (GPR23), LPA5 (GPR92), LPA6 (P2Y5)]; free fatty acid receptors [FFA4 (GPR120) with omega-3 fatty acids]; chemerin receptor (CMKLR1; ChemR23) with chemerin; CXCR7 (CMKOR1) with chemokines CXCL12 (SDF-1) and CXCL11 (ITAC); succinate receptor (SUCNR1) with succinate; and oxoglutarate receptor [OXGR1 with 2-oxoglutarate]. Pairings are highlighted for an additional 30 receptors in class A where further input is needed from the scientific community to validate these findings. Fifty-seven human class A receptors (excluding pseudogenes) are still considered orphans; information has been provided where there is a significant phenotype in genetically modified animals. In class B, six pairings have been reported by a single publication, with 28 (excluding pseudogenes) still classified as orphans. Seven orphan receptors remain in class C, with one pairing described by a single paper. The objective is to stimulate research into confirming pairings of orphan receptors where there is currently limited information and to identify cognate ligands for the remaining GPCRs. Further information can be found on the IUPHAR Database website (http://www.iuphar-db.org). PMID:23686350

  9. Inactivation of the olfactory marker protein (OMP) gene in river dolphins and other odontocete cetaceans.

    PubMed

    Springer, Mark S; Gatesy, John

    2017-04-01

    Various toothed whales (Odontoceti) are unique among mammals in lacking olfactory bulbs as adults and are thought to be anosmic (lacking the olfactory sense). At the molecular level, toothed whales have high percentages of pseudogenic olfactory receptor genes, but species that have been investigated to date retain an intact copy of the olfactory marker protein gene (OMP), which is highly expressed in olfactory receptor neurons and may regulate the temporal resolution of olfactory responses. One hypothesis for the retention of intact OMP in diverse odontocete lineages is that this gene is pleiotropic with additional functions that are unrelated to olfaction. Recent expression studies provide some support for this hypothesis. Here, we report OMP sequences for representatives of all extant cetacean families and provide the first molecular evidence for inactivation of this gene in vertebrates. Specifically, OMP exhibits independent inactivating mutations in six different odontocete lineages: four river dolphin genera (Platanista, Lipotes, Pontoporia, Inia), sperm whale (Physeter), and harbor porpoise (Phocoena). These results suggest that the only essential role of OMP that is maintained by natural selection is in olfaction, although a non-olfactory role for OMP cannot be ruled out for lineages that retain an intact copy of this gene. Available genome sequences from cetaceans and close outgroups provide evidence of inactivating mutations in two additional genes (CNGA2, CNGA4), which imply further pseudogenization events in the olfactory cascade of odontocetes. Selection analyses demonstrate that evolutionary constraints on all three genes (OMP, CNGA2, CNGA4) have been greatly reduced in Odontoceti, but retain a signature of purifying selection on the stem Cetacea branch and in Mysticeti (baleen whales). 
This pattern is compatible with the 'echolocation-priority' hypothesis for the evolution of OMP, which posits that negative selection was maintained in the common ancestor of Cetacea and was not relaxed significantly until the evolution of echolocation in Odontoceti. Copyright © 2017 Elsevier Inc. All rights reserved.

  10. Are both sympatric species Ilex perado and Ilex canariensis secretly hybridizing? Indication from nuclear markers collected in Tenerife

    PubMed Central

    Manen, Jean-François

    2004-01-01

    Background Intra-specific and intra-individual polymorphism is frequently observed in nuclear markers of Ilex (Aquifoliaceae) and discrepancy between plastid and nuclear phylogenies is the rule in this genus. These observations suggest that inter-specific plastid and/or nuclear introgression played an important role in the evolution of Ilex. With the aim of a precise understanding of the evolution of this genus, two distantly related sympatric species collected in Tenerife (Canary Islands), I. perado and I. canariensis, were studied in detail. Introgression between these two species has never previously been reported. One plastid marker (the atpB-rbcL spacer) and two nuclear markers, the ribosomal internal transcribed spacer (ITS) and the nuclear encoded plastid glutamine synthetase (nepGS), were analyzed for 13 and 27 individuals of I. perado and I. canariensis, respectively. Results The plastid marker is intra-specifically constant and correlated with species identity. On the other hand, whereas the nuclear markers are conserved in I. perado, they are highly polymorphic in I. canariensis. The presence of pseudogenes and recombination in ITS sequences of I. canariensis explains this polymorphism. Ancestral sequence polymorphism with incomplete lineage sorting, or past or recent hybridization with an unknown species, could explain this polymorphism, not resolved by concerted evolution. However, as already reported for many other plants, past or recent introgression of an alien genotype seems the most probable explanation for such a tremendous polymorphism. Conclusions The data do not allow the species introgressing I. canariensis to be determined with certainty, but I. perado is suspected. The introgression would be unilateral, with I. perado as the male donor, and the paternal sequences would be rapidly converted into highly divergent and consequently unidentifiable pseudogenes.
At least, this study allows the establishment of precautionary measures when nuclear markers are used in phylogenetic studies of genera having experienced introgression such as the genus Ilex. PMID:15550175

  11. Asian population frequencies and haplotype distribution of killer cell immunoglobulin-like receptor (KIR) genes among Chinese, Malay, and Indian in Singapore.

    PubMed

    Lee, Yi Chuan; Chan, Soh Ha; Ren, Ee Chee

    2008-11-01

    Killer cell immunoglobulin-like receptor (KIR) gene frequencies have been shown to be distinctly different between populations and contribute to functional variation in the immune response. We have investigated KIR gene frequencies in 370 individuals representing three Asian populations in Singapore and report here the distribution of 14 KIR genes (2DL1, 2DL2, 2DL3, 2DL4, 2DL5, 2DS1, 2DS2, 2DS3, 2DS4, 2DS5, 3DL1, 3DL2, 3DL3, 3DS1) with two pseudogenes (2DP1, 3DP1) among Singapore Chinese (n = 210), Singapore Malay (n = 80), and Singapore Indian (n = 80). Four framework genes (KIR3DL3, 3DP1, 2DL4, 3DL2) and a nonframework pseudogene 2DP1 were detected in all samples, while KIR2DS2, 2DL2, 2DL5, and 2DS5 had the greatest significant variation across the three populations. Fifteen significant linkage patterns, consistent with associations between genes of A and B haplotypes, were observed. Eighty-four distinct KIR profiles were determined in our populations, 38 of which had not been described in other populations. KIR haplotype studies were performed using nine Singapore Chinese families comprising 34 individuals. All genotypes could be resolved into corresponding pairs of existing haplotypes with eight distinct KIR genotypes and eight different haplotypes. The haplotype A2, with a frequency of 63.9%, was dominant in Singapore Chinese, comparable to that reported in Korean and Chinese Han. The A haplotypes predominate in Singapore Chinese, with a ratio of A to B haplotypes of approximately 3:1. Comparison with KIR frequencies in other populations showed that Singapore Chinese shared similar distributions with Chinese Han, Japanese, and Korean; Singapore Indian was found to be comparable with North Indian Hindus, while Singapore Malay resembled the Thai.

  12. The frequency of previously undetectable deletions involving 3' Exons of the PMS2 gene.

    PubMed

    Vaughn, Cecily P; Baker, Christine L; Samowitz, Wade S; Swensen, Jeffrey J

    2013-01-01

    Lynch syndrome is characterized by mutations in one of four mismatch repair genes, MLH1, MSH2, MSH6, or PMS2. Clinical mutation analysis of these genes includes sequencing of exonic regions and deletion/duplication analysis. However, detection of deletions and duplications in PMS2 has previously been confined to Exons 1-11 due to gene conversion between PMS2 and the pseudogene PMS2CL in the remaining 3' exons (Exons 12-15). We have recently described an MLPA-based method that permits detection of deletions of PMS2 Exons 12-15; however, the frequency of such deletions has not yet been determined. To address this question, we tested for 3' deletions in 58 samples that were reported to be negative for PMS2 mutations using previously available methods. All samples were from individuals whose tumors exhibited loss of PMS2 immunohistochemical staining without concomitant loss of MLH1 immunostaining. We identified seven samples in this cohort with deletions in the 3' region of PMS2, including three previously reported samples with deletions of Exons 13-15 (two samples) and Exons 14-15. Also detected were deletions of Exons 12-15, Exon 13, and Exon 14 (two samples). Breakpoint analysis of the intragenic deletions suggests they occurred through Alu-mediated recombination. Our results indicate that ∼12% of samples suspected of harboring a PMS2 mutation based on immunohistochemical staining, for which mutations have not yet been identified, would benefit from testing using the new methodology. Copyright © 2012 Wiley Periodicals, Inc.

  13. A frame-shift mutation of PMS2 is a widespread cause of Lynch syndrome.

    PubMed

    Clendenning, M; Senter, L; Hampel, H; Robinson, K Lagerstedt; Sun, S; Buchanan, D; Walsh, M D; Nilbert, M; Green, J; Potter, J; Lindblom, A; de la Chapelle, A

    2008-06-01

    When compared to the other mismatch repair genes involved in Lynch syndrome, the identification of mutations within PMS2 has been limited (<2% of all identified mutations), yet the immunohistochemical analysis of tumour samples indicates that approximately 5% of Lynch syndrome cases are caused by PMS2. This disparity is primarily due to complications in the study of this gene caused by interference from pseudogene sequences. Using a recently developed method for detecting PMS2 specific mutations, we have screened 99 patients who are likely candidates for PMS2 mutations based on immunohistochemical analysis. We have identified a frequently occurring frame-shift mutation (c.736_741del6ins11) in 12 ostensibly unrelated Lynch syndrome patients (20% of patients we have identified with a deleterious mutation in PMS2, n = 61). These individuals all display the rare allele (population frequency <0.05) at a single nucleotide polymorphism (SNP) in exon 11, and have been shown to possess a short common haplotype, allowing us to calculate that the mutation arose around 1625 years ago (65 generations; 95% confidence interval 22 to 120). Ancestral analysis indicates that this mutation is enriched in individuals with British and Swedish ancestry. We estimate that there are >10 000 carriers of this mutation in the USA alone. The identification of both the mutation and the common haplotype in one Swedish control sample (n = 225), along with evidence that Lynch syndrome associated cancers are rarer than expected in the probands' families, would suggest that this is a prevalent mutation with reduced penetrance.
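
    The dating and proportion figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below assumes a generation time of 25 years; the abstract does not state this value, but it is the one implied by equating 65 generations with about 1625 years. All variable names are illustrative.

```python
# Figures quoted in the abstract.
generations_point = 65          # point estimate of the mutation's age, in generations
generations_ci = (22, 120)      # 95% confidence interval, in generations
carriers_with_founder = 12      # patients carrying c.736_741del6ins11
patients_with_mutation = 61     # patients with a deleterious PMS2 mutation

# Assumption: 25 years per generation (implied by 1625 / 65, not stated explicitly).
YEARS_PER_GENERATION = 25

age_years = generations_point * YEARS_PER_GENERATION
age_ci_years = tuple(g * YEARS_PER_GENERATION for g in generations_ci)
founder_share = carriers_with_founder / patients_with_mutation

print(f"Estimated age: {age_years} years")
print(f"95% CI: {age_ci_years[0]} to {age_ci_years[1]} years")
print(f"Founder mutation share: {founder_share:.1%}")
```

    Under that assumed generation time, the confidence interval of 22 to 120 generations corresponds to roughly 550 to 3000 years, and 12 of 61 patients is just under the 20% quoted.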

  14. Mapping of the Pim-1 oncogene in mouse t-haplotypes and its use to define the relative map positions of the tcl loci t0(t6) and tw12 and the marker tf (tufted).

    PubMed

    Ark, B; Gummere, G; Bennett, D; Artzt, K

    1991-06-01

    Pim-1 is an oncogene activated in mouse T-cell lymphomas induced by Moloney and AKR mink cell focus (MCF) viruses. Pim-1 was previously mapped to chromosome 17 by somatic cell hybrids, and subsequently to the region between the hemoglobin alpha-chain pseudogene 4 (Hba-4ps) and the alpha-crystalline gene (Crya-1) by Southern blot analysis of DNA obtained from panels of recombinant inbred strains. We have now mapped Pim-1 more accurately in t-haplotypes by analysis of recombinant t-chromosomes. The recombinants were derived from Tts6tf/t12 parents backcrossed to + tf/ + tf, and scored for recombination between the loci of T and tf. For simplicity all t-complex lethal genes properly named tcl-tx are shortened to tx. The Pim-1 gene was localized 0.6 cM proximal to the tw12 lethal gene, thus placing the Pim-1 gene 5.2 cM distal to the H-2 region in t-haplotypes. Once mapped, the Pim-1 gene was used as a marker for further genetic analysis of t-haplotypes. tw12 is so close to tf that even with a large number of recombinants it was not possible to determine whether it is proximal or distal to tf. Southern blot analysis of DNA from T-tf recombinants with a separation of tw12 and tf indicated that tw12 is proximal to tf. The mapping of two allelic t-lethals, t0 and t6 with respect to tw12 and tf has also been a problem.(ABSTRACT TRUNCATED AT 250 WORDS)

  15. Phylogenomic Analysis and Dynamic Evolution of Chloroplast Genomes in Salicaceae

    PubMed Central

    Huang, Yuan; Wang, Jun; Yang, Yongping; Fan, Chuanzhu; Chen, Jiahui

    2017-01-01

    Chloroplast genomes of plants are highly conserved in both gene order and gene content. Analysis of the whole chloroplast genome is known to provide much more informative DNA sites and thus generates high resolution for plant phylogenies. Here, we report the complete chloroplast genomes of three Salix species in family Salicaceae. Phylogeny of Salicaceae inferred from complete chloroplast genomes is generally consistent with previous studies but resolved with higher statistical support. Incongruences of phylogeny, however, are observed in genus Populus, which most likely result from homoplasy. By comparing three Salix chloroplast genomes with the published chloroplast genomes of other Salicaceae species, we demonstrate that the synteny and length of chloroplast genomes in Salicaceae are highly conserved but have experienced dynamic evolution among species. We identify seven positively selected chloroplast genes in Salicaceae, which might be related to the adaptive evolution of Salicaceae species. Comparative chloroplast genome analysis within the family also indicates that some chloroplast genes have been lost or have become pseudogenes, from which we infer that these chloroplast genes were horizontally transferred to the nuclear genome. Based on the complete nuclear genome sequences from two Salicaceae species, we find, remarkably, that the entire chloroplast genome has indeed been transferred and integrated into the nuclear genome of the P. trichocarpa reference individual at least once. This observation, along with the presence of large nuclear plastid DNA segments (NUPTs) and of NUPTs containing multiple chloroplast genes in their original chloroplast-genome order, favors the DNA-mediated hypothesis of organelle-to-nucleus DNA transfer. Overall, the phylogenomic analysis using complete chloroplast genomes clearly elucidates the phylogeny of Salicaceae. The identification of positively selected chloroplast genes and dynamic chloroplast-to-nucleus gene transfers in Salicaceae provides resources to better understand the successful adaptation of Salicaceae species. PMID:28676809

  16. The SUVR4 Histone Lysine Methyltransferase Binds Ubiquitin and Converts H3K9me1 to H3K9me3 on Transposon Chromatin in Arabidopsis

    PubMed Central

    Veiseth, Silje V.; Rahman, Mohummad A.; Yap, Kyoko L.; Fischer, Andreas; Egge-Jacobsen, Wolfgang; Reuter, Gunter; Zhou, Ming-Ming; Aalen, Reidunn B.; Thorstensen, Tage

    2011-01-01

    Chromatin structure and gene expression are regulated by posttranslational modifications (PTMs) on the N-terminal tails of histones. Mono-, di-, or trimethylation of lysine residues by histone lysine methyltransferases (HKMTases) can have activating or repressive functions depending on the position and context of the modified lysine. In Arabidopsis, trimethylation of lysine 9 on histone H3 (H3K9me3) is mainly associated with euchromatin and transcribed genes, although low levels of this mark are also detected at transposons and repeat sequences. Besides the evolutionarily conserved SET domain which is responsible for enzyme activity, most HKMTases also contain additional domains which enable them to respond to other PTMs or cellular signals. Here we show that the N-terminal WIYLD domain of the Arabidopsis SUVR4 HKMTase binds ubiquitin and that the SUVR4 product specificity shifts from di- to trimethylation in the presence of free ubiquitin, enabling conversion of H3K9me1 to H3K9me3 in vitro. Chromatin immunoprecipitation and immunocytological analysis showed that SUVR4 in vivo specifically converts H3K9me1 to H3K9me3 at transposons and pseudogenes and has a locus-specific repressive effect on the expression of such elements. Bisulfite sequencing indicates that this repression involves both DNA methylation–dependent and –independent mechanisms. Transcribed genes with high endogenous levels of H3K4me3, H3K9me3, and H2Bub1, but low H3K9me1, are generally unaffected by SUVR4 activity. Our results imply that SUVR4 is involved in the epigenetic defense mechanism by trimethylating H3K9 to suppress potentially harmful transposon activity. PMID:21423664

  17. [Sequencing and analysis of complete genome of rabies viruses isolated from Chinese Ferret-Badger and dog in Zhejiang province].

    PubMed

    Lei, Yong-Liang; Wang, Xiao-Guang; Tao, Xiao-Yan; Li, Hao; Meng, Sheng-Li; Chen, Xiu-Ying; Liu, Fu-Ming; Ye, Bi-Feng; Tang, Qing

    2010-01-01

    By sequencing the full-length genomes of four rabies viruses isolated from Chinese ferret-badger and dog, we analyzed the genetic variation of rabies viruses at the molecular level, obtained information about rabies virus prevalence and variation in Zhejiang, and enriched the genome database of rabies virus street strains isolated in China. Rabies viruses were isolated in suckling mice, overlapping fragments were amplified by RT-PCR, and full-length genomes were assembled. Nucleotide and deduced protein similarities were analyzed, and phylogenetic relationships among strains from Chinese ferret-badger, dog, sika deer, and vole and the vaccine strain in use were determined. The four full-length genomes were sequenced completely and had the same genetic structure, with a length of 11,923 nt or 11,925 nt, including a 58 nt leader, 1353 nt NP, 894 nt PP, 609 nt MP, 1575 nt GP, 6386 nt LP, intergenic regions (IGRs) of 2, 5, and 5 nt, a 423 nt pseudogene-like sequence (psi), and a 70 nt trailer. BLAST and multiple-sequence alignment showed that the four full-length genomes were consistent with the properties of Lyssavirus (family Rhabdoviridae). The nucleotide and amino acid sequences among Chinese strains had the highest similarity, especially among animals of the same species. Among the four full-length genomes, similarity at the amino acid level was markedly higher than at the nucleotide level, so most of the nucleotide mutations in these genomes were synonymous. Compared with the reference rabies viruses, the lengths of the five protein-coding regions were unchanged, and there was no recombination, only a few point mutations. The five proteins thus appeared to be stable. The variation sites and types in the four genomes were similar to those of the reference vaccine or street strains. According to the multiple-sequence and phylogenetic analyses, the four strains belonged to genotype 1 and showed distinct regional characteristics of China. Therefore, these four rabies viruses are likely to be street viruses already existing in the natural world.

  18. A-to-I RNA editing is developmentally regulated and generally adaptive for sexual reproduction in Neurospora crassa

    PubMed Central

    Li, Yang; Chen, Daipeng; Qi, Zhaomei; Wang, Qinhu; Wang, Jianhua; Jiang, Cong; Xu, Jin-Rong

    2017-01-01

    Although fungi lack adenosine deaminase acting on RNA (ADAR) enzymes, adenosine to inosine (A-to-I) RNA editing was reported recently in Fusarium graminearum during sexual reproduction. In this study, we profiled the A-to-I editing landscape and characterized its functional and adaptive properties in the model filamentous fungus Neurospora crassa. A total of 40,677 A-to-I editing sites were identified, and approximately half of them displayed stage-specific editing or editing levels at different sexual stages. RNA-sequencing analysis with the Δstc-1 and Δsad-1 mutants confirmed A-to-I editing occurred before ascus development but became more prevalent during ascosporogenesis. Besides fungal-specific sequence and secondary structure preference, 63.5% of A-to-I editing sites were in the coding regions and 81.3% of them resulted in nonsynonymous recoding, resulting in a significant increase in the proteome complexity. Many genes involved in RNA silencing, DNA methylation, and histone modifications had extensive recoding, including sad-1, sms-3, qde-1, and dim-2. Fifty pseudogenes harbor premature stop codons that require A-to-I editing to encode full-length proteins. Unlike in humans, nonsynonymous editing events in N. crassa are generally beneficial and favored by positive selection. Almost half of the nonsynonymous editing sites in N. crassa are conserved and edited in Neurospora tetrasperma. Furthermore, hundreds of them are conserved in F. graminearum and had higher editing levels. Two unknown genes with editing sites conserved between Neurospora and Fusarium were experimentally shown to be important for ascosporogenesis. This study comprehensively analyzed A-to-I editing in N. crassa and showed that RNA editing is stage-specific and generally adaptive, and may be functionally related to repeat induced point mutation and meiotic silencing by unpaired DNA. PMID:28847945

  19. A colostrum trypsin inhibitor gene expressed in the Cape fur seal mammary gland during lactation.

    PubMed

    Pharo, Elizabeth A; Cane, Kylie N; McCoey, Julia; Buckle, Ashley M; Oosthuizen, W H; Guinet, Christophe; Arnould, John P Y

    2016-03-01

    The colostrum trypsin inhibitor (CTI) gene and transcript were cloned from the Cape fur seal mammary gland and CTI identified by in silico analysis of the Pacific walrus and polar bear genomes (Order Carnivora), and in marine and terrestrial mammals of the Orders Cetartiodactyla (yak, whales, camel) and Perissodactyla (white rhinoceros). Unexpectedly, Weddell seal CTI was predicted to be a pseudogene. Cape fur seal CTI was expressed in the mammary gland of a pregnant multiparous seal, but not in a seal in its first pregnancy. While bovine CTI is expressed for 24-48 h postpartum (pp) and secreted in colostrum only, Cape fur seal CTI was detected for at least 2-3 months pp while the mother was suckling its young on-shore. Furthermore, CTI was expressed in the mammary gland of only one of the lactating seals that was foraging at-sea. The expression of β-casein (CSN2) and β-lactoglobulin II (LGB2), but not CTI in the second lactating seal foraging at-sea suggested that CTI may be intermittently expressed during lactation. Cape fur seal and walrus CTI encode putative small, secreted, N-glycosylated proteins with a single Kunitz/bovine pancreatic trypsin inhibitor (BPTI) domain indicative of serine protease inhibition. Mature Cape fur seal CTI shares 92% sequence identity with Pacific walrus CTI, but only 35% identity with BPTI. Structural homology modelling of Cape fur seal CTI and Pacific walrus trypsin based on the model of the second Kunitz domain of human tissue factor pathway inhibitor (TFPI) and porcine trypsin (Protein Data Bank: 1TFX) confirmed that CTI inhibits trypsin in a canonical fashion. Therefore, pinniped CTI may be critical for preventing the proteolytic degradation of immunoglobulins that are passively transferred from mother to young via colostrum and milk. Copyright © 2015 Elsevier B.V. All rights reserved.

  20. The immunoglobulin heavy chain locus in the platypus (Ornithorhynchus anatinus).

    PubMed

    Gambón-Deza, F; Sánchez-Espinel, C; Magadán-Mompó, S

    2009-08-01

    Immunoglobulin loci in mammals are well known to be organized within a translocon; however, their origin remains unresolved. Four of the five classes of immunoglobulins described in humans and rodents (immunoglobulins M, G, E and A; IgM, IgG, IgE and IgA) were found in marsupials and monotremes (immunoglobulin D, IgD, was not found), thus showing that the genomic structure of antibodies in mammals has remained constant since its origin. We have recently described the genomic organization of the immunoglobulin heavy chain locus in reptiles (IGHM, IGHD and IGHY). These data, together with the characterization of the IGH locus in platypus (Ornithorhynchus anatinus), allow us to elucidate the changes that took place in this genomic region during evolution from reptile to mammal. Thus, by using available genome data, we were able to detect that the platypus IGH locus contains reptilian and mammalian genes. Besides having an IGHD that is very similar to the one in reptiles and an IGHY, platypus also presents the mammal-specific antibody genes IGHG and IGHE, in addition to IGHA. We also detected a pseudogene that originated by recombination between the IGHD and the IGHM (similar to the IGHD2 found in Eublepharis macularius). The analysis of the IGH locus in platypus shows that IGHY was duplicated, firstly by evolving into IGHE and then into IGHG. The IGHA of the platypus has a complex origin, and probably arose by a process of recombination between the IGHM and the IGHY. We detected about 44 VH genes (25 were already described), most of which comprise a single group. When we compared these VH genes with those described in Anolis carolinensis, we found that there is an evolutionary relationship between the VH genes of platypus and the reptilian Group III genes. These results suggest that a fast VH turnover took place in platypus and this gave rise to a family with a high VH gene number and the disappearance of the earlier VH families.

  1. Structure and transcriptional regulation of the major intrinsic protein gene family in grapevine.

    PubMed

    Wong, Darren Chern Jan; Zhang, Li; Merlin, Isabelle; Castellarin, Simone D; Gambetta, Gregory A

    2018-04-11

    The major intrinsic protein (MIP) family is a family of proteins, including aquaporins, which facilitate water and small molecule transport across plasma membranes. In plants, MIPs function in a huge variety of processes including water transport, growth, stress response, and fruit development. In this study, we characterize the structure and transcriptional regulation of the MIP family in grapevine, describing the putative genome duplication events leading to the family structure and characterizing the family's tissue and developmental specific expression patterns across numerous preexisting microarray and RNAseq datasets. Gene co-expression network (GCN) analyses were carried out across these datasets and the promoters of each family member were analyzed for cis-regulatory element structure in order to provide insight into their transcriptional regulation. A total of 29 Vitis vinifera MIP family members (excluding putative pseudogenes) were identified of which all but two were mapped onto Vitis vinifera chromosomes. In this study, segmental duplication events were identified for five plasma membrane intrinsic protein (PIP) and four tonoplast intrinsic protein (TIP) genes, contributing to the expansion of PIPs and TIPs in grapevine. Grapevine MIP family members have distinct tissue and developmental expression patterns and hierarchical clustering revealed two primary groups regardless of the datasets analyzed. Composite microarray and RNA-seq gene co-expression networks (GCNs) highlighted the relationships between MIP genes and functional categories involved in cell wall modification and transport, as well as with other MIPs revealing a strong co-regulation within the family itself. Some duplicated MIP family members have undergone sub-functionalization and exhibit distinct expression patterns and GCNs. Cis-regulatory element (CRE) analyses of the MIP promoters and their associated GCN members revealed enrichment for numerous CREs including AP2/ERFs and NACs. Combining phylogenetic analyses, gene expression profiling, gene co-expression network analyses, and cis-regulatory element enrichment, this study provides a comprehensive overview of the structure and transcriptional regulation of the grapevine MIP family. The study highlights the duplication and sub-functionalization of the family, its strong coordinated expression with genes involved in growth and transport, and the putative classes of TFs responsible for its regulation.

  2. Y chromosome of D. pseudoobscura is not homologous to the ancestral Drosophila Y.

    PubMed

    Carvalho, Antonio Bernardo; Clark, Andrew G

    2005-01-07

    We report a genome-wide search of Y-linked genes in Drosophila pseudoobscura. All six identifiable orthologs of the D. melanogaster Y-linked genes have autosomal inheritance in D. pseudoobscura. Four orthologs were investigated in detail and proved to be Y-linked in D. guanche and D. bifasciata, which shows that less than 18 million years ago the ancestral Drosophila Y chromosome was translocated to an autosome in the D. pseudoobscura lineage. We found 15 genes and pseudogenes in the current Y of D. pseudoobscura, and none are shared with the D. melanogaster Y. Hence, the Y chromosome in the D. pseudoobscura lineage appears to have arisen de novo and is not homologous to the D. melanogaster Y.

  3. Molecular evolution tracks macroevolutionary transitions in Cetacea.

    PubMed

    McGowen, Michael R; Gatesy, John; Wildman, Derek E

    2014-06-01

    Cetacea (whales, dolphins, and porpoises) is a model group for investigating the molecular signature of macroevolutionary transitions. Recent research has begun to reveal the molecular underpinnings of the remarkable anatomical and behavioral transformation in this clade. This shift from terrestrial to aquatic environments is arguably the best-understood major morphological transition in vertebrate evolution. The ancestral body plan and physiology were extensively modified and, in many cases, these crucial changes are recorded in cetacean genomes. Recent studies have highlighted cetaceans as central to understanding adaptive molecular convergence and pseudogene formation. Here, we review current research in cetacean molecular evolution and the potential of Cetacea as a model for the study of other macroevolutionary transitions from a genomic perspective. Copyright © 2014 Elsevier Ltd. All rights reserved.

  4. Our retroviral heritage.

    PubMed

    Patience, C; Wilkinson, D A; Weiss, R A

    1997-03-01

    Darwin could not have foretold that we are descended from viruses as well as from apes. While there is clear evidence that viral diseases, such as polio and rabies, affected ancient civilizations, viruses were not defined until the early years of this century, shortly after the rediscovery of mendelian genetics. That retroviral genomes can oscillate between infectious and genetic modes of transmission seemed preposterous before the discovery of reverse transcription in 1970. Those of us who had earlier provided mendelian evidence for germ-line transmission of retroviruses were subject of friendly ridicule. Today, the shunting of genetic elements between chromosomes and RNA, and the generation of processed pseudogenes, seems commonplace. It is timely, however, to revisit the topic of human endogenous retroviruses-the subject of this article.

  5. Automated update, revision, and quality control of the maize genome annotations using MAKER-P improves the B73 RefGen_v3 gene models and identifies new genes.

    PubMed

    Law, MeiYee; Childs, Kevin L; Campbell, Michael S; Stein, Joshua C; Olson, Andrew J; Holt, Carson; Panchy, Nicholas; Lei, Jikai; Jiao, Dian; Andorf, Carson M; Lawrence, Carolyn J; Ware, Doreen; Shiu, Shin-Han; Sun, Yanni; Jiang, Ning; Yandell, Mark

    2015-01-01

    The large size and relative complexity of many plant genomes make creation, quality control, and dissemination of high-quality gene structure annotations challenging. In response, we have developed MAKER-P, a fast and easy-to-use genome annotation engine for plants. Here, we report the use of MAKER-P to update and revise the maize (Zea mays) B73 RefGen_v3 annotation build (5b+) in less than 3 h using the iPlant Cyberinfrastructure. MAKER-P identified and annotated 4,466 additional, well-supported protein-coding genes not present in the 5b+ annotation build, added additional untranslated regions to 1,393 5b+ gene models, identified 2,647 5b+ gene models that lack any supporting evidence (despite the use of large and diverse evidence data sets), identified 104,215 pseudogene fragments, and created an additional 2,522 noncoding gene annotations. We also describe a method for de novo training of MAKER-P for the annotation of newly sequenced grass genomes. Collectively, these results lead to the 6a maize genome annotation and demonstrate the utility of MAKER-P for rapid annotation, management, and quality control of grasses and other difficult-to-annotate plant genomes. © 2015 American Society of Plant Biologists. All Rights Reserved.

  6. Evidence of birth-and-death evolution of 5S rRNA gene in Channa species (Teleostei, Perciformes).

    PubMed

    Barman, Anindya Sundar; Singh, Mamta; Singh, Rajeev Kumar; Lal, Kuldeep Kumar

    2016-12-01

    In higher eukaryotes, the minor rDNA family codes for 5S rRNA; it is arranged in tandem arrays and comprises a highly conserved 120 bp coding sequence with a variable non-transcribed spacer (NTS). Initially, the 5S rDNA repeats were considered to have evolved by concerted evolution. However, some recent reports, including studies of teleost fishes, suggested that evolution of the 5S rDNA repeats does not fit the concerted evolution model and may instead be explained by a birth-and-death evolution model. To study the mode of evolution of 5S rDNA repeats in Perciformes fish species, the nucleotide sequence and molecular organization of 5S rDNA in five species of the genus Channa were analyzed in the present study. Molecular analyses revealed several variants of 5S rDNA repeats (four types of NTS), and networks created by a neighbor-net algorithm for each type of sequence (I, II, III and IV) did not show clear species-specific clustering. A stable secondary structure was predicted, and upstream and downstream conserved regulatory elements were characterized. Sequence analyses also showed the presence of two putative pseudogenes in Channa marulius. The present study supports the view that 5S rDNA repeats in the genus Channa evolved by a birth-and-death process.

  7. Maintenance and Loss of Duplicated Genes by Dosage Subfunctionalization.

    PubMed

    Gout, Jean-Francois; Lynch, Michael

    2015-08-01

    Whole-genome duplications (WGDs) have contributed to gene-repertoire enrichment in many eukaryotic lineages. However, most duplicated genes are eventually lost and it is still unclear why some duplicated genes are evolutionary successful whereas others quickly turn to pseudogenes. Here, we show that dosage constraints are major factors opposing post-WGD gene loss in several Paramecium species that share a common ancestral WGD. We propose a model where a majority of WGD-derived duplicates preserve their ancestral function and are retained to produce enough of the proteins performing this same ancestral function. Under this model, the expression level of individual duplicated genes can evolve neutrally as long as they maintain a roughly constant summed expression, and this allows random genetic drift toward uneven contributions of the two copies to total expression. Our analysis suggests that once a high level of imbalance is reached, which can require substantial lengths of time, the copy with the lowest expression level contributes a small enough fraction of the total expression that selection no longer opposes its loss. Extension of our analysis to yeast species sharing a common ancestral WGD yields similar results, suggesting that duplicated-gene retention for dosage constraints followed by divergence in expression level and eventual deterministic gene loss might be a universal feature of post-WGD evolution. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
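
    The drift step of the model described above can be illustrated with a toy simulation. This is a minimal sketch and not the authors' analysis: it assumes the two copies split a fixed total expression, that the split changes by small Gaussian steps, and that a copy becomes dispensable once its share falls below an arbitrary threshold. The function name, step size, and threshold are all hypothetical.

```python
import random

def drift_until_imbalance(threshold=0.05, step_sd=0.02, max_steps=1_000_000, seed=0):
    """Toy model: one copy's share of a fixed total expression drifts at random.

    Returns (steps, share), where steps is the number of steps taken before the
    weaker copy's share fell below `threshold`, or None if that never happened.
    """
    rng = random.Random(seed)
    share = 0.5  # both copies start out contributing equally
    for step in range(1, max_steps + 1):
        # Neutral change: the summed expression is constant, so only the split moves.
        share = min(max(share + rng.gauss(0.0, step_sd), 0.0), 1.0)
        if min(share, 1.0 - share) < threshold:
            return step, share
    return None, share

steps, share = drift_until_imbalance()
print(steps, round(min(share, 1.0 - share), 3))
```

    In this sketch a smaller step size or a lower threshold lengthens the wait before one copy becomes dispensable, which mirrors the abstract's point that reaching a high level of imbalance can take a substantial length of time.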

  8. Global proteomic analysis of two tick-borne emerging zoonotic agents: Anaplasma phagocytophilum and Ehrlichia chaffeensis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lin, Mingqun ..; Kikuchi, Takane; Brewer, Heather M.

    2011-02-17

    Anaplasma phagocytophilum and Ehrlichia chaffeensis are obligatory intracellular α-proteobacteria that infect human leukocytes and cause potentially fatal emerging zoonoses. In the present study, we determined global protein expression profiles of these bacteria cultured in the human promyelocytic leukemia cell line, HL-60. Mass spectrometric (MS) analyses identified a total of 1,212 A. phagocytophilum and 1,021 E. chaffeensis proteins, representing 89.3 and 92.3% of the predicted bacterial proteomes, respectively. Nearly all bacterial proteins (~99%) with known functions were expressed, whereas only approximately 80% of hypothetical proteins were detected in infected human cells. Quantitative MS/MS analyses indicated that highly expressed proteins in both bacteria included chaperones, enzymes involved in biosynthesis and metabolism, and outer membrane proteins, such as A. phagocytophilum P44 and E. chaffeensis P28/OMP-1. Among 113 A. phagocytophilum p44 paralogous genes, 110 of them were expressed and 88 of them were encoded by pseudogenes. In addition, bacterial infection of HL-60 cells up-regulated the expression of human proteins involved mostly in cytoskeleton components, vesicular trafficking, cell signaling, and energy metabolism, but down regulated some pattern recognition receptors involved in innate immunity. Our proteomics data represent a comprehensive analysis of A. phagocytophilum and E. chaffeensis proteomes, and provide a quantitative view of human host protein expression profiles regulated by bacterial infection. The availability of these proteomic data will provide new insights into biology and pathogenesis of these obligatory intracellular pathogens.
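
    The coverage percentages in this abstract imply approximate sizes for the predicted proteomes. The sketch below back-calculates them from the quoted counts; the resulting totals are estimates derived here, not figures reported in the abstract, and the variable names are illustrative.

```python
# Detected protein counts and coverage percentages quoted in the abstract.
detected = {
    "A. phagocytophilum": (1212, 89.3),
    "E. chaffeensis": (1021, 92.3),
}

# Back-calculated estimates of the predicted proteome sizes (not reported values).
predicted_sizes = {
    species: round(n_detected / (pct_of_predicted / 100))
    for species, (n_detected, pct_of_predicted) in detected.items()
}

for species, size in predicted_sizes.items():
    print(f"{species}: about {size} predicted proteins")

# Share of the 113 p44 paralogous genes that were expressed.
p44_expressed_share = 110 / 113
print(f"p44 paralogs expressed: {p44_expressed_share:.1%}")
```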

  9. Clinical Validation of Copy Number Variant Detection from Targeted Next-Generation Sequencing Panels.

    PubMed

    Kerkhof, Jennifer; Schenkel, Laila C; Reilly, Jack; McRobbie, Sheri; Aref-Eshghi, Erfan; Stuart, Alan; Rupar, C Anthony; Adams, Paul; Hegele, Robert A; Lin, Hanxin; Rodenhiser, David; Knoll, Joan; Ainsworth, Peter J; Sadikovic, Bekim

    2017-11-01

    Next-generation sequencing (NGS) technology has rapidly replaced Sanger sequencing in the assessment of sequence variations in clinical genetics laboratories. One major limitation of current NGS approaches is the ability to detect copy number variations (CNVs) approximately >50 bp. Because these represent a major mutational burden in many genetic disorders, parallel CNV assessment using alternate supplemental methods, along with the NGS analysis, is normally required, resulting in increased labor, costs, and turnaround times. The objective of this study was to clinically validate a novel CNV detection algorithm using targeted clinical NGS gene panel data. We have applied this approach in a retrospective cohort of 391 samples and a prospective cohort of 2375 samples and found a 100% sensitivity (95% CI, 89%-100%) for 37 unique events and a high degree of specificity to detect CNVs across nine distinct targeted NGS gene panels. This NGS CNV pipeline enables stand-alone first-tier assessment for CNV and sequence variants in a clinical laboratory setting, dispensing with the need for parallel CNV analysis using classic techniques, such as microarray, long-range PCR, or multiplex ligation-dependent probe amplification. This NGS CNV pipeline can also be applied to the assessment of complex genomic regions, including pseudogenic DNA sequences, such as the PMS2CL gene, and to mitochondrial genome heteroplasmy detection. Copyright © 2017 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  10. Complete genome sequence of uropathogenic Escherichia coli isolate UPEC 26-1.

    PubMed

    Subhadra, Bindu; Kim, Dong Ho; Kim, Jaeseok; Woo, Kyungho; Sohn, Kyung Mok; Kim, Hwa-Jung; Han, Kyudong; Oh, Man Hwan; Choi, Chul Hee

    2018-06-01

    Urinary tract infections (UTIs) are among the most common infections in humans, predominantly caused by uropathogenic Escherichia coli (UPEC). The diverse genomes of UPEC strains mostly impede disease prevention and control measures. In this study, we comparatively analyzed the whole genome sequence of a highly virulent UPEC strain, namely UPEC 26-1, which was isolated from urine sample of a patient suffering from UTI in Korea. Whole genome analysis showed that the genome consists of one circular chromosome of 5,329,753 bp, comprising 5064 protein-coding genes, 122 RNA genes (94 tRNA, 22 rRNA and 6 ncRNA genes), and 100 pseudogenes, with an average G+C content of 50.56%. In addition, we identified 8 prophage regions comprising 5 intact, 2 incomplete and 1 questionable ones and 63 genomic islands, suggesting the possibility of horizontal gene transfer in this strain. Comparative genome analysis of UPEC 26-1 with the UPEC strain CFT073 revealed an average nucleotide identity of 99.7%. The genome comparison with CFT073 provides major differences in the genome of UPEC 26-1 that would explain its increased virulence and biofilm formation. Nineteen of the total GIs were unique to UPEC 26-1 compared to CFT073 and nine of them harbored unique genes that are involved in virulence, multidrug resistance, biofilm formation and bacterial pathogenesis. The data from this study will assist in future studies of UPEC strains to develop effective control measures.

  11. High incidence of large deletions in the PMS2 gene in Spanish Lynch syndrome families.

    PubMed

    Brea-Fernández, A J; Cameselle-Teijeiro, J M; Alenda, C; Fernández-Rozadilla, C; Cubiella, J; Clofent, J; Reñé, J M; Anido, U; Milá, M; Balaguer, F; Castells, A; Castellvi-Bel, S; Jover, R; Carracedo, A; Ruiz-Ponte, C

    2014-06-01

    Lynch syndrome (LS) is caused by germline mutations in one of the four mismatch repair (MMR) genes. Defects in this pathway lead to microsatellite instability (MSI) in tumor DNA, which constitutes the molecular hallmark of this disease. Selection of patients for genetic testing in LS is usually based on fulfillment of diagnostic clinical criteria (i.e. Amsterdam criteria or the revised Bethesda guidelines). However, following these criteria PMS2 mutations have probably been underestimated, as their penetrances appear to be lower than those of the other MMR genes. The use of universal MMR study-based strategies, using MSI testing and immunohistochemical (IHC) staining, is one proposed alternative. In addition, germline mutation detection in PMS2 is complicated by the presence of highly homologous pseudogenes. Nevertheless, specific amplification of PMS2 by long-range polymerase chain reaction (PCR) and the improvement of the analysis of large deletions/duplications by multiplex ligation-dependent probe amplification (MLPA) overcome this difficulty. By using both approaches, we analyzed 19 PMS2-suspected carriers who had been selected by clinical or universal strategies and found five large deletions and one frameshift mutation in PMS2 in six patients (31%). Owing to the high incidence of large deletions found in our cohort, we recommend MLPA analysis as the first-line method for searching germline mutations in PMS2. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  12. Mycobacterium leprae RecA is structurally analogous but functionally distinct from Mycobacterium tuberculosis RecA protein.

    PubMed

    Patil, K Neelakanteshwar; Singh, Pawan; Harsha, Sri; Muniyappa, K

    2011-12-01

    Mycobacterium leprae is closely related to Mycobacterium tuberculosis, yet causes a very different illness. Detailed genomic comparison between these two species of mycobacteria reveals that the decaying M. leprae genome contains less than half of the M. tuberculosis functional genes. The reduction of genome size and accumulation of pseudogenes in the M. leprae genome is thought to result from multiple recombination events between related repetitive sequences, which provided the impetus to investigate the recombination-like activities of RecA protein. In this study, we have cloned, over-expressed and purified M. leprae RecA and compared its activities with those of M. tuberculosis RecA. Both proteins, despite being 91% identical at the amino acid level, exhibit strikingly different binding profiles for single-stranded DNA with varying GC contents, and differ in the ability to catalyze the formation of D-loops and to promote DNA strand exchange. The kinetics and the extent of single-stranded DNA-dependent ATPase and coprotease activities were nearly equivalent between these two recombinases. However, the degree of inhibition exerted by a range of ATP:ADP ratios was greater on strand exchange promoted by M. leprae RecA compared to its M. tuberculosis counterpart. Taken together, our results provide insights into the mechanistic aspects of homologous recombination and coprotease activity promoted by M. leprae RecA, and further suggest that it differs from the M. tuberculosis counterpart. These results are consistent with an emerging concept of DNA-sequence influenced structural differences in RecA nucleoprotein filaments and how these differences reflect on the multiple activities associated with RecA protein. Copyright © 2011 Elsevier B.V. All rights reserved.

  13. Rudimentary expression of RYamide in Drosophila melanogaster relative to other Drosophila species points to a functional decline of this neuropeptide gene.

    PubMed

    Veenstra, Jan A; Khammassi, Hela

    2017-04-01

    RYamides are arthropod neuropeptides with unknown function. In 2011 two RYamides were isolated from D. melanogaster as the ligands for the G-protein coupled receptor CG5811. The D. melanogaster gene encoding these neuropeptides is highly unusual, as there are four RYamide encoding exons in the current genome assembly, but an exon encoding a signal peptide is absent. Comparing the D. melanogaster gene structure with those from other species, including D. virilis, suggests that the gene is degenerating. RNAseq data from 1634 short sequence read archives at NCBI containing more than 34 billion spots yielded numerous individual spots that correspond to the RYamide encoding exons, of which a large number include the intron-exon boundary at the start of this exon. Although 72 different sequences have been spliced onto this RYamide encoding exon, none codes for the signal peptide of this gene. Thus, the RNAseq data for this gene reveal only noise and no signal. The very small quantities of peptide recovered during isolation and the absence of credible RNAseq data indicate that the gene is expressed at very low levels, while the RYamide gene structure in D. melanogaster suggests that it might be evolving into a pseudogene. Yet, the identification of the peptides it encodes clearly shows it is still functional. Using region specific antisera, we could localize numerous neurons and enteroendocrine cells in D. willistoni, D. virilis and D. pseudoobscura, but only two adult abdominal neurons in D. melanogaster. Those two neurons project to and innervate the rectal papillae, suggesting that RYamides may be involved in the regulation of water homeostasis. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. Major Histocompatibility Complex Genes Map to Two Chromosomes in an Evolutionarily Ancient Reptile, the Tuatara Sphenodon punctatus.

    PubMed

    Miller, Hilary C; O'Meally, Denis; Ezaz, Tariq; Amemiya, Chris; Marshall-Graves, Jennifer A; Edwards, Scott

    2015-05-07

    Major histocompatibility complex (MHC) genes are a central component of the vertebrate immune system and usually exist in a single genomic region. However, considerable differences in MHC organization and size exist between different vertebrate lineages. Reptiles occupy a key evolutionary position for understanding how variation in MHC structure evolved in vertebrates, but information on the structure of the MHC region in reptiles is limited. In this study, we investigate the organization and cytogenetic location of MHC genes in the tuatara (Sphenodon punctatus), the sole extant representative of the early-diverging reptilian order Rhynchocephalia. Sequencing and mapping of 12 clones containing class I and II MHC genes from a bacterial artificial chromosome library indicated that the core MHC region is located on chromosome 13q. However, duplication and translocation of MHC genes outside of the core region was evident, because additional class I MHC genes were located on chromosome 4p. We found a total of seven class I sequences and 11 class II β sequences, with evidence for duplication and pseudogenization of genes within the tuatara lineage. The tuatara MHC is characterized by high repeat content and low gene density compared with other species and we found no antigen processing or MHC framework genes on the MHC gene-containing clones. Our findings indicate substantial differences in MHC organization in tuatara compared with mammalian and avian MHCs and highlight the dynamic nature of the MHC. Further sequencing and annotation of tuatara and other reptile MHCs will determine if the tuatara MHC is representative of nonavian reptiles in general. Copyright © 2015 Miller et al.

  15. The Linum usitatissimum L. plastome reveals atypical structural evolution, new editing sites, and the phylogenetic position of Linaceae within Malpighiales.

    PubMed

    de Santana Lopes, Amanda; Pacheco, Túlio Gomes; Santos, Karla Gasparini Dos; Vieira, Leila do Nascimento; Guerra, Miguel Pedro; Nodari, Rubens Onofre; de Souza, Emanuel Maltempi; de Oliveira Pedrosa, Fábio; Rogalski, Marcelo

    2018-02-01

    The plastome of Linum usitatissimum was completely sequenced, allowing analyses of the evolution of genome structure, RNA editing sites, and molecular markers, and indicating the position of Linaceae within Malpighiales. Flax (Linum usitatissimum L.) is an economically important crop used as food, feed, and industrial feedstock. It belongs to the Linaceae family, which is noted for high morphological and ecological diversity. Here, we report the complete sequence of the flax plastome, the first species within the Linaceae family to have its plastome sequenced, assembled and characterized in detail. The plastome of flax is a circular DNA molecule of 156,721 bp with a typical quadripartite structure including two IRs of 31,990 bp separating the LSC of 81,767 bp and the SSC of 10,974 bp. It shows two expansion events from IRB to LSC and from IRB to SSC, and a contraction event in the IRA-LSC junction, which significantly changed the size and the gene content of LSC, SSC and IRs. We identified 109 unique genes and 2 pseudogenes (rpl23 and ndhF). The plastome lost the conserved introns of the clpP gene and the complete sequence of the rps16 gene. The clpP, ycf1, and ycf2 genes show high nucleotide and amino acid divergence, but they possibly still retain functionality. Moreover, we also identified 176 SSRs, 20 tandem repeats, and 39 dispersed repeats. We predicted a total of 53 RNA editing sites in 18 genes, of which 32 were not found before in other species. The phylogenetic inference based on 63 plastid protein-coding genes of 38 taxa supports three major clades within the Malpighiales order. One of these clades has flax (Linaceae) sister to the Chrysobalanaceae family, differing from earlier studies that included Linaceae in the euphorbioid clade.
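    The quadripartite structure described in this abstract implies a simple consistency check: the total plastome length should equal the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. The following is a minimal sketch of that check using the lengths quoted in the abstract; the function and variable names are hypothetical and not taken from the study.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length (bp) of a plastome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


# Region lengths (bp) as quoted in the flax plastome abstract.
LSC = 81_767
SSC = 10_974
IR = 31_990
REPORTED_TOTAL = 156_721

total = quadripartite_length(LSC, SSC, IR)
print(f"Computed total: {total} bp")
print(f"Matches reported total: {total == REPORTED_TOTAL}")
```

    The same function can be applied to any other plastome record that reports LSC, SSC and IR lengths.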

  16. Gossypium barbadense genome sequence provides insight into the evolution of extra-long staple fiber and specialized metabolites.

    PubMed

    Liu, Xia; Zhao, Bo; Zheng, Hua-Jun; Hu, Yan; Lu, Gang; Yang, Chang-Qing; Chen, Jie-Dan; Chen, Jun-Jian; Chen, Dian-Yang; Zhang, Liang; Zhou, Yan; Wang, Ling-Jian; Guo, Wang-Zhen; Bai, Yu-Lin; Ruan, Ju-Xin; Shangguan, Xiao-Xia; Mao, Ying-Bo; Shan, Chun-Min; Jiang, Jian-Ping; Zhu, Yong-Qiang; Jin, Lei; Kang, Hui; Chen, Shu-Ting; He, Xu-Lin; Wang, Rui; Wang, Yue-Zhu; Chen, Jie; Wang, Li-Jun; Yu, Shu-Ting; Wang, Bi-Yun; Wei, Jia; Song, Si-Chao; Lu, Xin-Yan; Gao, Zheng-Chao; Gu, Wen-Yi; Deng, Xiao; Ma, Dan; Wang, Sen; Liang, Wen-Hua; Fang, Lei; Cai, Cai-Ping; Zhu, Xie-Fei; Zhou, Bao-Liang; Jeffrey Chen, Z; Xu, Shu-Hua; Zhang, Yu-Gao; Wang, Sheng-Yue; Zhang, Tian-Zhen; Zhao, Guo-Ping; Chen, Xiao-Ya

    2015-09-30

    Of the two cultivated species of allopolyploid cotton, Gossypium barbadense produces extra-long fibers for the production of superior textiles. We sequenced its genome (AD)2 and performed a comparative analysis. We identified three bursts of retrotransposons from 20 million years ago (Mya) and a genome-wide uneven pseudogenization peak at 11-20 Mya, which likely contributed to genomic divergences. Among the 2,483 genes preferentially expressed in fiber, a cell elongation regulator, PRE1, is strikingly At biased and fiber specific, echoing the A-genome origin of spinnable fiber. The expansion of the PRE members implies a genetic factor that underlies fiber elongation. Mature cotton fiber consists of nearly pure cellulose. G. barbadense and G. hirsutum contain 29 and 30 cellulose synthase (CesA) genes, respectively; whereas most of these genes (>25) are expressed in fiber, genes for secondary cell wall biosynthesis exhibited a delayed and higher degree of up-regulation in G. barbadense compared with G. hirsutum, conferring an extended elongation stage and highly active secondary wall deposition during extra-long fiber development. The rapid diversification of sesquiterpene synthase genes in the gossypol pathway exemplifies the chemical diversity of lineage-specific secondary metabolites. The G. barbadense genome advances our understanding of allopolyploidy, which will help improve cotton fiber quality.

  17. Insights into three whole-genome duplications gleaned from the Paramecium caudatum genome sequence.

    PubMed

    McGrath, Casey L; Gout, Jean-Francois; Doak, Thomas G; Yanagi, Akira; Lynch, Michael

    2014-08-01

    Paramecium has long been a model eukaryote. The sequence of the Paramecium tetraurelia genome reveals a history of three successive whole-genome duplications (WGDs), and the sequences of P. biaurelia and P. sexaurelia suggest that these WGDs are shared by all members of the aurelia species complex. Here, we present the genome sequence of P. caudatum, a species closely related to the P. aurelia species group. P. caudatum shares only the most ancient of the three WGDs with the aurelia complex. We found that P. caudatum maintains twice as many paralogs from this early event as the P. aurelia species, suggesting that post-WGD gene retention is influenced by subsequent WGDs and supporting the importance of selection for dosage in gene retention. The availability of P. caudatum as an outgroup allows an expanded analysis of the aurelia intermediate and recent WGD events. Both the Guanine+Cytosine (GC) content and the expression level of preduplication genes are significant predictors of duplicate retention. We find widespread asymmetrical evolution among aurelia paralogs, which is likely caused by gradual pseudogenization rather than by neofunctionalization. Finally, cases of divergent resolution of intermediate WGD duplicates between aurelia species suggest that this process acts as an ongoing reinforcement mechanism of reproductive isolation long after a WGD event. Copyright © 2014 by the Genetics Society of America.

  18. Complete genome sequence of Jiangella gansuensis strain YIM 002T (DSM 44835T), the type species of the genus Jiangella and source of new antibiotic compounds

    DOE PAGES

    Jiao, Jian-Yu; Carro, Lorena; Liu, Lan; ...

    2017-02-03

    Jiangella gansuensis strain YIM 002T is the type strain of the type species of the genus Jiangella, which is at the present time composed of five species, and was isolated from desert soil sample in Gansu Province (China). The five strains of this genus are clustered in a monophyletic group when closer actinobacterial genera are used to infer a 16S rRNA gene sequence phylogeny. The study of this genome is part of the Genomic Encyclopedia of Bacteria and Archaea project, and here we describe the complete genome sequence and annotation of this taxon. The genome of J. gansuensis strain YIM 002T contains a single scaffold of size 5,585,780 bp, which involves 149 pseudogenes, 4905 protein-coding genes and 50 RNA genes, including 2520 hypothetical proteins and 4 rRNA genes. From the investigation of genome sizes of Jiangella species, J. gansuensis shows a smaller size, which indicates this strain might have discarded too much genetic information to adapt to desert environment. Seven new compounds from this bacterium have recently been described; however, its potential should be higher, as secondary metabolite gene cluster analysis predicted 60 gene clusters, including the potential to produce the pristinamycin.

  19. Complete genome sequence of Jiangella gansuensis strain YIM 002T (DSM 44835T), the type species of the genus Jiangella and source of new antibiotic compounds

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jiao, Jian-Yu; Carro, Lorena; Liu, Lan

    Jiangella gansuensis strain YIM 002T is the type strain of the type species of the genus Jiangella, which is at the present time composed of five species, and was isolated from desert soil sample in Gansu Province (China). The five strains of this genus are clustered in a monophyletic group when closer actinobacterial genera are used to infer a 16S rRNA gene sequence phylogeny. The study of this genome is part of the Genomic Encyclopedia of Bacteria and Archaea project, and here we describe the complete genome sequence and annotation of this taxon. The genome of J. gansuensis strain YIM 002T contains a single scaffold of size 5,585,780 bp, which involves 149 pseudogenes, 4905 protein-coding genes and 50 RNA genes, including 2520 hypothetical proteins and 4 rRNA genes. From the investigation of genome sizes of Jiangella species, J. gansuensis shows a smaller size, which indicates this strain might have discarded too much genetic information to adapt to desert environment. Seven new compounds from this bacterium have recently been described; however, its potential should be higher, as secondary metabolite gene cluster analysis predicted 60 gene clusters, including the potential to produce the pristinamycin.

  20. The complete sequence and promoter activity of the human A-raf-1 gene (ARAF1)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lee, J.E.; Beck, T.W.; Brennscheidt, U.

    1994-03-01

    The raf proto-oncogenes encode cytoplasmic protein serine/threonine kinases, which play a critical role in cell growth and development. One of these, A-raf-1 (human gene symbol, ARAF1), which is predominantly expressed in mouse urogenital tissues, has been mapped to an evolutionarily conserved linkage group composed of ARAF1, SYN1, TIMP, and properdin located at human chromosome Xp11.2. The authors have isolated human genomic DNA clones containing the expressed gene (ARAF1) on the X chromosome and a pseudogene (ARAF2) on chromosome 7p12-q11.21. Analysis of the nucleotide sequence from the ARAF1 genomic clones demonstrated that it consists of 16 exons encoded by minimally 10,776 nucleotides. The major transcriptional start site (+1) was determined by RNase protection and primer extension assays. Promoter activity was confirmed by functional assays using DNA fragments fused to a CAT reporter gene. The ARAF1 minimal promoter, located between nucleotides -59 and +93, has a low G + C content and lacks consensus TATA and Inr sequences but shows sequence similarity at position -1 to the E box that is known to interact with USF and TFII-I transcription factors. 65 refs., 7 figs., 1 tab.

  1. Constitutive heterochromatin of chromosome 1 and Duffy blood group alleles in schizophrenia

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kosower, N.S.; Gerad, L.; Goldstein, M.

    1995-04-24

    Cytogenetic analysis was carried out in unrelated schizophrenic patients, unrelated controls and patients and family members in multiplex families. The size-distribution of chromosome 1 heterochromatic region (1qH, C-band variants) among 21 unrelated schizophrenic patients was different from that found in a group of 46 controls. The patient group had 1qH variants of smaller size than the control group (P < 0.01). Incubation of phytohemagglutinin-treated blood lymphocytes with 5-azacytidine (which causes decondensation and extension of the heterochromatin) led to a lesser degree of heterochromatin decondensation in a group of patients than in the controls (7 schizophrenic, 9 controls, P < 0.01). The distribution of phenotypes of Duffy blood group system (whose locus is linked to the 1qH region) among 28 schizophrenic patients was also different from that in the general population. Cosegregation of schizophrenia with a 1qH (C-band) variant and Duffy blood group allele was observed in one of six multiplex families. The overall results suggest that alterations within the Duffy/1qH region are involved in schizophrenia in some cases. This region contains the locus of D5 dopamine receptor pseudogene 2 (1q21.1), which is transcribed in normal lymphocytes. 33 refs., 1 fig., 2 tabs.

  2. The comparative chloroplast genomic analysis of photosynthetic orchids and developing DNA markers to distinguish Phalaenopsis orchids.

    PubMed

    Jheng, Cheng-Fong; Chen, Tien-Chih; Lin, Jhong-Yi; Chen, Ting-Chieh; Wu, Wen-Luan; Chang, Ching-Chun

    2012-07-01

    The chloroplast genome of Phalaenopsis equestris was determined and compared to those of Phalaenopsis aphrodite and Oncidium Gower Ramsey in Orchidaceae. The chloroplast genome of P. equestris is 148,959 bp, and a pair of inverted repeats (25,846 bp) separates the genome into large single-copy (85,967 bp) and small single-copy (11,300 bp) regions. The genome encodes 109 genes, including 4 rRNA, 30 tRNA and 75 protein-coding genes, but loses four ndh genes (ndhA, E, F and H) and seven other ndh genes are pseudogenes. The rate of inter-species variation between the two moth orchids was 0.74% (1107 sites) for single nucleotide substitution and 0.24% for insertions (161 sites; 1388 bp) and deletions (189 sites; 1393 bp). The IR regions have a lower rate of nucleotide substitution (3.5-5.8-fold) and indels (4.3-7.1-fold) than single-copy regions. The intergenic spacers are the most divergent, and based on the length variation of the three intergenic spacers, 11 native Phalaenopsis orchids could be successfully distinguished. The coding genes, IR junction and RNA editing sites are relatively more conserved between the two moth orchids than between those of Phalaenopsis and Oncidium spp. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  3. Implications of Hybridization, NUMTs, and Overlooked Diversity for DNA Barcoding of Eurasian Ground Squirrels

    PubMed Central

    Ermakov, Oleg A.; Simonov, Evgeniy; Surin, Vadim L.; Titov, Sergey V.; Brandler, Oleg V.; Ivanova, Natalia V.; Borisenko, Alex V.

    2015-01-01

    The utility of DNA Barcoding for species identification and discovery has catalyzed a concerted effort to build the global reference library; however, many animal groups of economic or conservation importance remain poorly represented. This study aims to contribute DNA barcode records for all ground squirrel species (Xerinae, Sciuridae, Rodentia) inhabiting Eurasia and to test the efficiency of this approach for species discrimination. Cytochrome c oxidase subunit 1 (COI) gene sequences were obtained for 97 individuals representing 16 ground squirrel species, of which 12 were correctly identified. Taxonomic allocation of some specimens within four species was complicated by geographically restricted mtDNA introgression. Exclusion of individuals with introgressed mtDNA allowed reaching a 91.6% identification success rate. Significant COI divergence (3.5–4.4%) was observed within the most widespread ground squirrel species (Spermophilus erythrogenys, S. pygmaeus, S. suslicus, Urocitellus undulatus), suggesting the presence of cryptic species. A single putative NUMT (nuclear mitochondrial pseudogene) sequence was recovered during molecular analysis; mitochondrial COI from this sample was amplified following re-extraction of DNA. Our data show high discrimination ability of 100 bp COI fragments for Eurasian ground squirrels (84.3%) with no incorrect assessments, underscoring the potential utility of the existing reference library for the development of diagnostic ‘mini-barcodes’. PMID:25617768

  4. Plastome Evolution in the Sole Hemiparasitic Genus Laurel Dodder (Cassytha) and Insights into the Plastid Phylogenomics of Lauraceae

    PubMed Central

    Wu, Chung-Shien; Wang, Ting-Jen; Wu, Chia-Wen; Wang, Ya-Nan

    2017-01-01

    To date, little is known about the evolution of plastid genomes (plastomes) in Lauraceae. As one of the top five largest families in tropical forests, the Lauraceae contain many species that are important ecologically and economically. Lauraceous species also provide wonderful materials to study the evolutionary trajectory in response to parasitism because they contain both nonparasitic and parasitic species. This study compared the plastomes of nine Lauraceous species, including the sole hemiparasitic and herbaceous genus Cassytha (laurel dodder; here represented by Cassytha filiformis). We found differential contractions of the canonical inverted repeat (IR), resulting in two IR types present in Lauraceae. These two IR types reinforce Cryptocaryeae and Neocinnamomum–Perseeae–Laureae as two separate clades. Our data reveal several traits unique to Cas. filiformis, including loss of IRs, loss or pseudogenization of 11 ndh and rpl23 genes, richness of repeats, and accelerated rates of nucleotide substitutions in protein-coding genes. Although Cas. filiformis is low in chlorophyll content, our analysis based on dN/dS ratios suggests that both its plastid house-keeping and photosynthetic genes are under strong selective constraints. Hence, we propose that short generation time and herbaceous lifestyle rather than reduced photosynthetic ability drive the accelerated rates of nucleotide substitutions in Cas. filiformis. PMID:28985306

  5. Penguins reduced olfactory receptor genes common to other waterbirds

    PubMed Central

    Lu, Qin; Wang, Kai; Lei, Fumin; Yu, Dan; Zhao, Huabin

    2016-01-01

    The sense of smell, or olfaction, is fundamental in the life of animals. However, penguins (Aves: Sphenisciformes) possess relatively small olfactory bulbs compared with most other waterbirds such as Procellariiformes and Gaviiformes. To test whether penguins have a reduced reliance on olfaction, we analyzed the draft genome sequences of the two penguins, which diverged at the origin of the order Sphenisciformes; we also examined six closely related species with available genomes, and identified 29 one-to-one orthologous olfactory receptor genes (i.e. ORs) that are putatively functionally conserved and important across the eight birds. To survey the 29 one-to-one orthologous ORs in penguins and their relatives, we newly generated 34 sequences that are missing from the draft genomes. Through analysis of a total of 378 OR sequences, we found that, among these functionally important ORs common to other waterbirds, penguins have a significantly greater percentage of OR pseudogenes than other waterbirds, suggesting a reduction of olfactory capability. The penguin-specific reduction of olfactory capability arose in the common ancestor of penguins between 23 and 60 Ma, which may have resulted from the aquatic specializations for underwater vision. Our study provides genetic evidence for a possible reduction of reliance on olfaction in penguins. PMID:27527385

  6. Recurrent and founder mutations in the PMS2 gene

    PubMed Central

    Tomsic, Jerneja; Senter, Leigha; Liyanarachchi, Sandya; Clendenning, Mark; Vaughn, Cecily P.; Jenkins, Mark A.; Hopper, John L.; Young, Joanne; Samowitz, Wade; de la Chapelle, Albert

    2012-01-01

    Germline mutations in PMS2 are associated with Lynch syndrome (LS), the most common known cause of hereditary colorectal cancer. Mutation detection in PMS2 has been difficult due to the presence of several pseudogenes, but a custom-designed long-range PCR strategy now allows adequate mutation detection. Many mutations are unique. However some mutations are observed repeatedly, across individuals not known to be related, due to the mutation being either recurrent, arising multiple times de novo at hot spots for mutations, or of founder origin, having occurred once in an ancestor. Previously, we observed 36 distinct mutations in a sample of 61 independently ascertained Caucasian probands of mixed European background with PMS2 mutations. Eleven of these mutations were detected in more than one individual not known to be related and of these, six were detected more than twice. These six mutations accounted for 31 (51%) ostensibly unrelated probands. Here we performed genotyping and haplotype analysis in four mutations observed in multiple probands and found two (c.137G>T and exon 10 deletion) to be founder mutations, one (c.903G>T) a probable founder, and one (c.1A>G) where founder mutation status could not be evaluated. We discuss possible explanations for the frequent occurrence of founder mutations in PMS2. PMID:22577899

  7. A massive parallel sequencing workflow for diagnostic genetic testing of mismatch repair genes

    PubMed Central

    Hansen, Maren F; Neckmann, Ulrike; Lavik, Liss A S; Vold, Trine; Gilde, Bodil; Toft, Ragnhild K; Sjursen, Wenche

    2014-01-01

    The purpose of this study was to develop a massive parallel sequencing (MPS) workflow for diagnostic analysis of mismatch repair (MMR) genes using the GS Junior system (Roche). A pathogenic variant in one of four MMR genes (MLH1, PMS2, MSH6, and MSH2) is the cause of Lynch Syndrome (LS), which mainly predisposes to colorectal cancer. We used an amplicon-based sequencing method allowing specific and preferential amplification of the MMR genes including PMS2, of which several pseudogenes exist. The amplicons were pooled at different ratios to obtain coverage uniformity and maximize the throughput of a single GS Junior run. In total, 60 previously identified and distinct variants (substitutions and indels) were sequenced by MPS and successfully detected. The heterozygote detection range was from 19% to 63% and dependent on sequence context and coverage. We were able to distinguish between false-positive and true-positive calls in homopolymeric regions by cross-sample comparison and evaluation of flow signal distributions. In addition, we filtered variants according to a predefined status, which facilitated variant annotation. Our study shows that implementation of MPS in routine diagnostics of LS can accelerate sample throughput and reduce costs without compromising sensitivity, compared to Sanger sequencing. PMID:24689082

  8. Recurrent and founder mutations in the PMS2 gene.

    PubMed

    Tomsic, J; Senter, L; Liyanarachchi, S; Clendenning, M; Vaughn, C P; Jenkins, M A; Hopper, J L; Young, J; Samowitz, W; de la Chapelle, A

    2013-03-01

    Germline mutations in PMS2 are associated with Lynch syndrome (LS), the most common known cause of hereditary colorectal cancer. Mutation detection in PMS2 has been difficult due to the presence of several pseudogenes, but a custom-designed long-range PCR strategy now allows adequate mutation detection. Many mutations are unique. However, some mutations are observed repeatedly across individuals not known to be related due to the mutation being either recurrent, arising multiple times de novo at hot spots for mutations, or of founder origin, having occurred once in an ancestor. Previously, we observed 36 distinct mutations in a sample of 61 independently ascertained Caucasian probands of mixed European background with PMS2 mutations. Eleven of these mutations were detected in more than one individual not known to be related and of these, six were detected more than twice. These six mutations accounted for 31 (51%) ostensibly unrelated probands. Here, we performed genotyping and haplotype analysis in four mutations observed in multiple probands and found two (c.137G>T and exon 10 deletion) to be founder mutations and one (c.903G>T) a probable founder. One (c.1A>G) could not be evaluated for founder mutation status. We discuss possible explanations for the frequent occurrence of founder mutations in PMS2. © 2012 John Wiley & Sons A/S.

  9. Gene duplications in prokaryotes can be associated with environmental adaptation

    PubMed Central

    2010-01-01

    Background Gene duplication is a normal evolutionary process. If there is no selective advantage in keeping the duplicated gene, it is usually reduced to a pseudogene and disappears from the genome. However, some paralogs are retained. These gene products are likely to be beneficial to the organism, e.g. in adaptation to new environmental conditions. The aim of our analysis is to investigate the properties of paralog-forming genes in prokaryotes, and to analyse the role of these retained paralogs by relating gene properties to life style of the corresponding prokaryotes. Results Paralogs were identified in a number of prokaryotes, and these paralogs were compared to singletons of persistent orthologs based on functional classification. This showed that the paralogs were associated with for example energy production, cell motility, ion transport, and defence mechanisms. A statistical overrepresentation analysis of gene and protein annotations was based on paralogs of the 200 prokaryotes with the highest fraction of paralog-forming genes. Biclustering of overrepresented gene ontology terms versus species was used to identify clusters of properties associated with clusters of species. The clusters were classified using similarity scores on properties and species to identify interesting clusters, and a subset of clusters were analysed by comparison to literature data. This analysis showed that paralogs often are associated with properties that are important for survival and proliferation of the specific organisms. This includes processes like ion transport, locomotion, chemotaxis and photosynthesis. However, the analysis also showed that the gene ontology terms sometimes were too general, imprecise or even misleading for automatic analysis. Conclusions Properties described by gene ontology terms identified in the overrepresentation analysis are often consistent with individual prokaryote lifestyles and are likely to give a competitive advantage to the organism. Paralogs and singletons dominate different categories of functional classification, where paralogs in particular seem to be associated with processes involving interaction with the environment. PMID:20961426

  10. Gene duplications in prokaryotes can be associated with environmental adaptation.

    PubMed

    Bratlie, Marit S; Johansen, Jostein; Sherman, Brad T; Huang, Da Wei; Lempicki, Richard A; Drabløs, Finn

    2010-10-20

    Gene duplication is a normal evolutionary process. If there is no selective advantage in keeping the duplicated gene, it is usually reduced to a pseudogene and disappears from the genome. However, some paralogs are retained. These gene products are likely to be beneficial to the organism, e.g. in adaptation to new environmental conditions. The aim of our analysis is to investigate the properties of paralog-forming genes in prokaryotes, and to analyse the role of these retained paralogs by relating gene properties to life style of the corresponding prokaryotes. Paralogs were identified in a number of prokaryotes, and these paralogs were compared to singletons of persistent orthologs based on functional classification. This showed that the paralogs were associated with for example energy production, cell motility, ion transport, and defence mechanisms. A statistical overrepresentation analysis of gene and protein annotations was based on paralogs of the 200 prokaryotes with the highest fraction of paralog-forming genes. Biclustering of overrepresented gene ontology terms versus species was used to identify clusters of properties associated with clusters of species. The clusters were classified using similarity scores on properties and species to identify interesting clusters, and a subset of clusters were analysed by comparison to literature data. This analysis showed that paralogs often are associated with properties that are important for survival and proliferation of the specific organisms. This includes processes like ion transport, locomotion, chemotaxis and photosynthesis. However, the analysis also showed that the gene ontology terms sometimes were too general, imprecise or even misleading for automatic analysis. Properties described by gene ontology terms identified in the overrepresentation analysis are often consistent with individual prokaryote lifestyles and are likely to give a competitive advantage to the organism. 
Paralogs and singletons dominate different categories of functional classification, where paralogs in particular seem to be associated with processes involving interaction with the environment.

  11. Development of a practical NF1 genetic testing method through the pilot analysis of five Japanese families with neurofibromatosis type 1.

    PubMed

    Okumura, Akiko; Ozaki, Mamoru; Niida, Yo

    2015-08-01

    Mutation analysis of NF1, the responsible gene for neurofibromatosis type 1 (NF1), is still difficult due to its large size, lack of mutational hotspots, the presence of many pseudogenes, and its wide spectrum of mutations. To develop a simple and inexpensive NF1 genetic test for clinical use, we analyzed five Japanese families with NF1 as a pilot study. Our original method, CEL endonuclease mediated heteroduplex incision with polyacrylamide gel electrophoresis and silver staining (CHIPS), was optimized for NF1 mutation screening, and reverse transcription polymerase chain reaction (RT-PCR) was performed to determine the effect of transcription. Also, we employed DNA microarray analysis to evaluate the break points of the large deletion. A new nonsense mutation, p.Gln209(∗), was detected in family 1 and the splicing donor site mutation, c.2850+1G>T, was detected in family 2. In family 3, c.4402A>G was detected in exon 34 and the p.Ser1468Gly missense mutation was predicted. However, mRNA analysis revealed that this substitution created an aberrant splicing acceptor site, thereby causing the p.Phe1457(∗) nonsense mutation. In the other two families, type-1 and unique NF1 microdeletions were detected by DNA microarray analysis. Our results show that the combination of CHIPS and RT-PCR effectively screens and characterizes NF1 point mutations, and that both DNA- and RNA-level analyses are required to understand the nature of the NF1 mutation. Our results also suggest the possibility of a higher incidence and unique profile of NF1 large deletions in the Japanese population as compared to previous studies performed in Europe. Copyright © 2014 The Japanese Society of Child Neurology. Published by Elsevier B.V. All rights reserved.

  12. Mechanisms of haplotype divergence at the RGA08 nucleotide-binding leucine-rich repeat gene locus in wild banana (Musa balbisiana)

    PubMed Central

    2010-01-01

    Background Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease and little is known on the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Results Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. Conclusions A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. High level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana. PMID:20637079

  13. Genomic Anatomy of a Premier Major Histocompatibility Complex Paralogous Region on Chromosome 1q21–q22

    PubMed Central

    Shiina, Takashi; Ando, Asako; Suto, Yumiko; Kasai, Fumio; Shigenari, Atsuko; Takishima, Nobusada; Kikkawa, Eri; Iwata, Kyoko; Kuwano, Yuko; Kitamura, Yuka; Matsuzawa, Yumiko; Sano, Kazumi; Nogami, Masahiro; Kawata, Hisako; Li, Suyun; Fukuzumi, Yasuhito; Yamazaki, Masaaki; Tashiro, Hiroyuki; Tamiya, Gen; Kohda, Atsushi; Okumura, Katsuzumi; Ikemura, Toshimichi; Soeda, Eiichi; Mizuki, Nobuhisa; Kimura, Minoru; Bahram, Seiamak; Inoko, Hidetoshi

    2001-01-01

    Human chromosomes 1q21–q25, 6p21.3–22.2, 9q33–q34, and 19p13.1–p13.4 carry clusters of paralogous loci, to date best defined by the flagship 6p MHC region. They have presumably been created by two rounds of large-scale genomic duplications around the time of vertebrate emergence. Phylogenetically, the 1q21–25 region seems most closely related to the 6p21.3 MHC region, as it is only the MHC paralogous region that includes bona fide MHC class I genes, the CD1 and MR1 loci. Here, to clarify the genomic structure of this model MHC paralogous region as well as to gain insight into the evolutionary dynamics of the entire quadriplication process, a detailed analysis of a critical 1.7 megabase (Mb) region was performed. To this end, a composite, deep, YAC, BAC, and PAC contig encompassing all five CD1 genes and linking the centromeric +P5 locus to the telomeric KRTC7 locus was constructed. Within this contig a 1.1-Mb BAC and PAC core segment joining CD1D to FCER1A was fully sequenced and thoroughly analyzed. This led to the mapping of a total of 41 genes (12 expressed genes, 12 possibly expressed genes, and 17 pseudogenes), among which 31 were novel. The latter include 20 olfactory receptor (OR) genes, 9 of which are potentially expressed. Importantly, CD1, SPTA1, OR, and FCERIA belong to multigene families, which have paralogues in the other three regions. Furthermore, it is noteworthy that 12 of the 13 expressed genes in the 1q21–q22 region around the CD1 loci are immunologically relevant. In addition to CD1A-E, these include SPTA1, MNDA, IFI-16, AIM2, BL1A, FY and FCERIA. This functional convergence of structurally unrelated genes is reminiscent of the 6p MHC region, and perhaps represents the emergence of yet another antigen presentation gene cluster, in this case dedicated to lipid/glycolipid antigens rather than antigen-derived peptides. 
[The nucleotide sequence data reported in this paper have been submitted to the DDBJ, EMBL, and GenBank databases under accession nos. AB045357–AB045365.] PMID:11337475

  14. Comparative Sequence Analysis of the X-Inactivation Center Region in Mouse, Human, and Bovine

    PubMed Central

    Chureau, Corinne; Prissette, Marine; Bourdet, Agnès; Barbe, Valérie; Cattolico, Laurence; Jones, Louis; Eggen, André; Avner, Philip; Duret, Laurent

    2002-01-01

    We have sequenced to high levels of accuracy 714-kb and 233-kb regions of the mouse and bovine X-inactivation centers (Xic), respectively, centered on the Xist gene. This has provided the basis for a fully annotated comparative analysis of the mouse Xic with the 2.3-Mb orthologous region in human and has allowed a three-way species comparison of the core central region, including the Xist gene. These comparisons have revealed conserved genes, both coding and noncoding, conserved CpG islands and, more surprisingly, conserved pseudogenes. The distribution of repeated elements, especially LINE repeats, in the mouse Xic region when compared to the rest of the genome does not support the hypothesis of a role for these repeat elements in the spreading of X inactivation. Interestingly, an asymmetric distribution of LINE elements on the two DNA strands was observed in the three species, not only within introns but also in intergenic regions. This feature is suggestive of important transcriptional activity within these intergenic regions. In silico prediction followed by experimental analysis has allowed four new genes, Cnbp2, Ftx, Jpx, and Ppnx, to be identified and novel, widespread, complex, and apparently noncoding transcriptional activity to be characterized in a region 5′ of Xist that was recently shown to attract histone modification early after the onset of X inactivation. [The sequence data described in this paper have been submitted to the EMBL data library under accession nos. AJ421478, AJ421479, AJ421480, and AJ421481. Online supplemental data are available at http://pbil.univ-lyon1.fr/datasets/Xic2002/data.html and www.genome.org.] PMID:12045143

  15. Comparative Genome Analysis Provides Insights into Both the Lifestyle of Acidithiobacillus ferrivorans Strain CF27 and the Chimeric Nature of the Iron-Oxidizing Acidithiobacilli Genomes.

    PubMed

    Tran, Tam T T; Mangenot, Sophie; Magdelenat, Ghislaine; Payen, Emilie; Rouy, Zoé; Belahbib, Hassiba; Grail, Barry M; Johnson, D Barrie; Bonnefoy, Violaine; Talla, Emmanuel

    2017-01-01

    The iron-oxidizing species Acidithiobacillus ferrivorans is one of few acidophiles able to oxidize ferrous iron and reduced inorganic sulfur compounds at low temperatures (<10°C). To complete the genome of At. ferrivorans strain CF27, new sequences were generated, and an updated assembly and functional annotation were undertaken, followed by a comparative analysis with other Acidithiobacillus species whose genomes are publicly available. The At. ferrivorans CF27 genome comprises a 3,409,655 bp chromosome and a 46,453 bp plasmid. At. ferrivorans CF27 possesses genes allowing its adaptation to cold, metal(loid)-rich environments, as well as others that enable it to sense environmental changes, allowing At. ferrivorans CF27 to escape hostile conditions and to move toward favorable locations. Interestingly, the genome of At. ferrivorans CF27 exhibits a large number of genomic islands (mostly containing genes of unknown function), suggesting that a large number of genes has been acquired by horizontal gene transfer over time. Furthermore, several genes specific to At. ferrivorans CF27 have been identified that could be responsible for the phenotypic differences of this strain compared to other Acidithiobacillus species. Most genes located inside At. ferrivorans CF27-specific gene clusters which have been analyzed were expressed by both ferrous iron-grown and sulfur-attached cells, indicating that they are not pseudogenes and may play a role in both situations. Analysis of the taxonomic composition of genomes of the Acidithiobacillia infers that they are chimeric in nature, supporting the premise that they belong to a particular taxonomic class, distinct from other proteobacterial subgroups.

  16. Role of an archaeal PitA transporter in the copper and arsenic resistance of Metallosphaera sedula, an extreme thermoacidophile.

    PubMed

    McCarthy, Samuel; Ai, Chenbing; Wheaton, Garrett; Tevatia, Rahul; Eckrich, Valerie; Kelly, Robert; Blum, Paul

    2014-10-01

    Thermoacidophilic archaea, such as Metallosphaera sedula, are lithoautotrophs that occupy metal-rich environments. In previous studies, an M. sedula mutant lacking the primary copper efflux transporter, CopA, became copper sensitive. In contrast, the basis for supranormal copper resistance remained unclear in the spontaneous M. sedula mutant, CuR1. Here, transcriptomic analysis of copper-shocked cultures indicated that CuR1 had a unique regulatory response to metal challenge corresponding to the upregulation of 55 genes. Genome resequencing identified 17 confirmed mutations unique to CuR1 that were likely to change gene function. Of these, 12 mapped to genes with annotated function associated with transcription, metabolism, or transport. These mutations included 7 nonsynonymous substitutions, 4 insertions, and 1 deletion. One of the insertion mutations mapped to pseudogene Msed_1517 and extended its reading frame an additional 209 amino acids. The extended mutant allele was identified as a homolog of Pho4, a family of phosphate symporters that includes the bacterial PitA proteins. Orthologs of this allele were apparent in related extremely thermoacidophilic species, suggesting M. sedula naturally lacked this gene. Phosphate transport studies combined with physiologic analysis demonstrated M. sedula PitA was a low-affinity, high-velocity secondary transporter implicated in copper resistance and arsenate sensitivity. Genetic analysis demonstrated that spontaneous arsenate-resistant mutants derived from CuR1 all underwent mutation in pitA and nonselectively became copper sensitive. Taken together, these results point to archaeal PitA as a key requirement for the increased metal resistance of strain CuR1 and its accelerated capacity for copper bioleaching. Copyright © 2014, American Society for Microbiology. All Rights Reserved.

  17. Comparative Genome Sequence Analysis of the Bpa/Str Region in Mouse and Man

    PubMed Central

    Mallon, A.-M.; Platzer, M.; Bate, R.; Gloeckner, G.; Botcherby, M.R.M.; Nordsiek, G.; Strivens, M.A.; Kioschis, P.; Dangel, A.; Cunningham, D.; Straw, R.N.A.; Weston, P.; Gilbert, M.; Fernando, S.; Goodall, K.; Hunter, G.; Greystrong, J.S.; Clarke, D.; Kimberley, C.; Goerdes, M.; Blechschmidt, K.; Rump, A.; Hinzmann, B.; Mundy, C.R.; Miller, W.; Poustka, A.; Herman, G.E.; Rhodes, M.; Denny, P.; Rosenthal, A.; Brown, S.D.M.

    2000-01-01

    The progress of human and mouse genome sequencing programs presages the possibility of systematic cross-species comparison of the two genomes as a powerful tool for gene and regulatory element identification. As the opportunities to perform comparative sequence analysis emerge, it is important to develop parameters for such analyses and to examine the outcomes of cross-species comparison. Our analysis used gene prediction and a database search of 430 kb of genomic sequence covering the Bpa/Str region of the mouse X chromosome, and 745 kb of genomic sequence from the homologous human X chromosome region. We identified 11 genes in mouse and 13 genes and two pseudogenes in human. In addition, we compared the mouse and human sequences using pairwise alignment and searches for evolutionary conserved regions (ECRs) exceeding a defined threshold of sequence identity. This approach aided the identification of at least four further putative conserved genes in the region. Comparative sequencing revealed that this region is a mosaic in evolutionary terms, with considerably more rearrangement between the two species than realized previously from comparative mapping studies. Surprisingly, this region showed an extremely high LINE and low SINE content, low G+C content, and yet a relatively high gene density, in contrast to the low gene density usually associated with such regions. [The sequence data described in this paper have been submitted to EMBL under the following accession nos.: Mouse Genomic Sequence: Mouse contig A (AL021127), Mouse contig B (AL049866), BAC41M10 (AL136328), PAC303O11(AL136329). Human Genomic Sequence: Human contig 1 (U82671, U82670), Human contig 2 (U82695).] PMID:10854409

  18. Role of an Archaeal PitA Transporter in the Copper and Arsenic Resistance of Metallosphaera sedula, an Extreme Thermoacidophile

    PubMed Central

    McCarthy, Samuel; Ai, Chenbing; Wheaton, Garrett; Tevatia, Rahul; Eckrich, Valerie; Kelly, Robert

    2014-01-01

    Thermoacidophilic archaea, such as Metallosphaera sedula, are lithoautotrophs that occupy metal-rich environments. In previous studies, an M. sedula mutant lacking the primary copper efflux transporter, CopA, became copper sensitive. In contrast, the basis for supranormal copper resistance remained unclear in the spontaneous M. sedula mutant, CuR1. Here, transcriptomic analysis of copper-shocked cultures indicated that CuR1 had a unique regulatory response to metal challenge corresponding to the upregulation of 55 genes. Genome resequencing identified 17 confirmed mutations unique to CuR1 that were likely to change gene function. Of these, 12 mapped to genes with annotated function associated with transcription, metabolism, or transport. These mutations included 7 nonsynonymous substitutions, 4 insertions, and 1 deletion. One of the insertion mutations mapped to pseudogene Msed_1517 and extended its reading frame an additional 209 amino acids. The extended mutant allele was identified as a homolog of Pho4, a family of phosphate symporters that includes the bacterial PitA proteins. Orthologs of this allele were apparent in related extremely thermoacidophilic species, suggesting M. sedula naturally lacked this gene. Phosphate transport studies combined with physiologic analysis demonstrated M. sedula PitA was a low-affinity, high-velocity secondary transporter implicated in copper resistance and arsenate sensitivity. Genetic analysis demonstrated that spontaneous arsenate-resistant mutants derived from CuR1 all underwent mutation in pitA and nonselectively became copper sensitive. Taken together, these results point to archaeal PitA as a key requirement for the increased metal resistance of strain CuR1 and its accelerated capacity for copper bioleaching. PMID:25092032

  19. Disruption and pseudoautosomal localization of the major histocompatibility complex in monotremes

    PubMed Central

    Dohm, Juliane C; Tsend-Ayush, Enkhjargal; Reinhardt, Richard; Grützner, Frank; Himmelbauer, Heinz

    2007-01-01

    Background The monotremes, represented by the duck-billed platypus and the echidnas, are the most divergent species within mammals, featuring a flamboyant mix of reptilian, mammalian and specialized characteristics. To understand the evolution of the mammalian major histocompatibility complex (MHC), the analysis of the monotreme genome is vital. Results We characterized several MHC containing bacterial artificial chromosome clones from platypus (Ornithorhynchus anatinus) and the short-beaked echidna (Tachyglossus aculeatus) and mapped them onto chromosomes. We discovered that the MHC of monotremes is not contiguous and locates within pseudoautosomal regions of two pairs of their sex chromosomes. The analysis revealed an MHC core region with class I and class II genes on platypus and echidna X3/Y3. Echidna X4/Y4 and platypus Y4/X5 showed synteny to the human distal class III region and beyond. We discovered an intron-containing class I pseudogene on platypus Y4/X5 at a genomic location equivalent to the human HLA-B,C region, suggesting ancestral synteny of the monotreme MHC. Analysis of male meioses from platypus and echidna showed that MHC chromosomes occupy different positions in the meiotic chains of either species. Conclusion Molecular and cytogenetic analyses reveal new insights into the evolution of the mammalian MHC and the multiple sex chromosome system of monotremes. In addition, our data establish the first homology link between chicken microchromosomes and the smallest chromosomes in the monotreme karyotype. Our results further suggest that segments of the monotreme MHC that now reside on separate chromosomes must once have been syntenic and that the complex sex chromosome system of monotremes is dynamic and still evolving. PMID:17727704

  20. RNAseq versus genome-predicted transcriptomes: a large population of novel transcripts identified in an Illumina-454 Hydra transcriptome.

    PubMed

    Wenger, Yvan; Galliot, Brigitte

    2013-03-25

    Evolutionary studies benefit from deep sequencing technologies that generate genomic and transcriptomic sequences from a variety of organisms. Genome sequencing and RNAseq have complementary strengths. In this study, we present the assembly of the most complete Hydra transcriptome to date along with a comparative analysis of the specific features of RNAseq and genome-predicted transcriptomes currently available in the freshwater hydrozoan Hydra vulgaris. To produce an accurate and extensive Hydra transcriptome, we combined Illumina and 454 Titanium reads, giving the primacy to Illumina over 454 reads to correct homopolymer errors. This strategy yielded an RNAseq transcriptome that contains 48'909 unique sequences including splice variants, representing approximately 24'450 distinct genes. Comparative analysis to the available genome-predicted transcriptomes identified 10'597 novel Hydra transcripts that encode 529 evolutionarily-conserved proteins. The annotation of 170 human orthologs points to critical functions in protein biosynthesis, FGF and TOR signaling, vesicle transport, immunity, cell cycle regulation, cell death, mitochondrial metabolism, transcription and chromatin regulation. However, a majority of these novel transcripts encodes short ORFs, at least 767 of them corresponding to pseudogenes. This RNAseq transcriptome also lacks 11'270 predicted transcripts that correspond either to silent genes or to genes expressed below the detection level of this study. We established a simple and powerful strategy to combine Illumina and 454 reads and we produced, with genome assistance, an extensive and accurate Hydra transcriptome. The comparative analysis of the RNAseq transcriptome with genome-predicted transcriptomes led to the identification of large populations of novel as well as missing transcripts that might reflect Hydra-specific evolutionary events.

  1. RNAseq versus genome-predicted transcriptomes: a large population of novel transcripts identified in an Illumina-454 Hydra transcriptome

    PubMed Central

    2013-01-01

    Background Evolutionary studies benefit from deep sequencing technologies that generate genomic and transcriptomic sequences from a variety of organisms. Genome sequencing and RNAseq have complementary strengths. In this study, we present the assembly of the most complete Hydra transcriptome to date along with a comparative analysis of the specific features of RNAseq and genome-predicted transcriptomes currently available in the freshwater hydrozoan Hydra vulgaris. Results To produce an accurate and extensive Hydra transcriptome, we combined Illumina and 454 Titanium reads, giving the primacy to Illumina over 454 reads to correct homopolymer errors. This strategy yielded an RNAseq transcriptome that contains 48’909 unique sequences including splice variants, representing approximately 24’450 distinct genes. Comparative analysis to the available genome-predicted transcriptomes identified 10’597 novel Hydra transcripts that encode 529 evolutionarily-conserved proteins. The annotation of 170 human orthologs points to critical functions in protein biosynthesis, FGF and TOR signaling, vesicle transport, immunity, cell cycle regulation, cell death, mitochondrial metabolism, transcription and chromatin regulation. However, a majority of these novel transcripts encodes short ORFs, at least 767 of them corresponding to pseudogenes. This RNAseq transcriptome also lacks 11’270 predicted transcripts that correspond either to silent genes or to genes expressed below the detection level of this study. Conclusions We established a simple and powerful strategy to combine Illumina and 454 reads and we produced, with genome assistance, an extensive and accurate Hydra transcriptome. The comparative analysis of the RNAseq transcriptome with genome-predicted transcriptomes led to the identification of large populations of novel as well as missing transcripts that might reflect Hydra-specific evolutionary events. PMID:23530871

  2. Comparative Analysis of the Full Genome of Helicobacter pylori Isolate Sahul64 Identifies Genes of High Divergence

    PubMed Central

    Lu, Wei; Wise, Michael J.; Tay, Chin Yen; Windsor, Helen M.; Marshall, Barry J.; Peacock, Christopher

    2014-01-01

    Isolates of Helicobacter pylori can be classified phylogeographically. High genetic diversity and rapid microevolution are a hallmark of H. pylori genomes, a phenomenon that is proposed to play a functional role in persistence and colonization of diverse human populations. To provide further genomic evidence in the lineage of H. pylori and to further characterize diverse strains of this pathogen in different human populations, we report the finished genome sequence of Sahul64, an H. pylori strain isolated from an indigenous Australian. Our analysis identified genes that were highly divergent compared to the 38 publicly available genomes, which include genes involved in the biosynthesis and modification of lipopolysaccharide, putative prophage genes, restriction modification components, and hypothetical genes. Furthermore, the virulence-associated vacA locus is a pseudogene and the cag pathogenicity island (cagPAI) is not present. However, the genome does contain a gene cluster associated with pathogenicity, including dupA. Our analysis found that with the addition of Sahul64 to the 38 genomes, the core genome content of H. pylori is reduced by approximately 14% (∼170 genes) and the pan-genome has expanded from 2,070 to 2,238 genes. We have identified three putative horizontally acquired regions, including one that is likely to have been acquired from the closely related Helicobacter cetorum prior to speciation. Our results suggest that Sahul64, with the absence of cagPAI, highly divergent cell envelope proteins, and a predicted nontransportable VacA protein, could be more highly adapted to ancient indigenous Australian people but with lower virulence potential compared to other sequenced and cagPAI-positive H. pylori strains. PMID:24375107

  3. Comparative analysis of the full genome of Helicobacter pylori isolate Sahul64 identifies genes of high divergence.

    PubMed

    Lu, Wei; Wise, Michael J; Tay, Chin Yen; Windsor, Helen M; Marshall, Barry J; Peacock, Christopher; Perkins, Tim

    2014-03-01

    Isolates of Helicobacter pylori can be classified phylogeographically. High genetic diversity and rapid microevolution are a hallmark of H. pylori genomes, a phenomenon that is proposed to play a functional role in persistence and colonization of diverse human populations. To provide further genomic evidence in the lineage of H. pylori and to further characterize diverse strains of this pathogen in different human populations, we report the finished genome sequence of Sahul64, an H. pylori strain isolated from an indigenous Australian. Our analysis identified genes that were highly divergent compared to the 38 publicly available genomes, which include genes involved in the biosynthesis and modification of lipopolysaccharide, putative prophage genes, restriction modification components, and hypothetical genes. Furthermore, the virulence-associated vacA locus is a pseudogene and the cag pathogenicity island (cagPAI) is not present. However, the genome does contain a gene cluster associated with pathogenicity, including dupA. Our analysis found that with the addition of Sahul64 to the 38 genomes, the core genome content of H. pylori is reduced by approximately 14% (∼170 genes) and the pan-genome has expanded from 2,070 to 2,238 genes. We have identified three putative horizontally acquired regions, including one that is likely to have been acquired from the closely related Helicobacter cetorum prior to speciation. Our results suggest that Sahul64, with the absence of cagPAI, highly divergent cell envelope proteins, and a predicted nontransportable VacA protein, could be more highly adapted to ancient indigenous Australian people but with lower virulence potential compared to other sequenced and cagPAI-positive H. pylori strains.

  4. Molecular cloning of the potato Gro1-4 gene conferring resistance to pathotype Ro1 of the root cyst nematode Globodera rostochiensis, based on a candidate gene approach.

    PubMed

    Paal, Jürgen; Henselewski, Heike; Muth, Jost; Meksem, Khalid; Menéndez, Cristina M; Salamini, Francesco; Ballvora, Agim; Gebhardt, Christiane

    2004-04-01

    The endoparasitic root cyst nematode Globodera rostochiensis causes considerable damage in potato cultivation. In the past, major genes for nematode resistance have been introgressed from related potato species into cultivars. Elucidating the molecular basis of resistance will contribute to the understanding of nematode-plant interactions and assist in breeding nematode-resistant cultivars. The Gro1 resistance locus to G. rostochiensis on potato chromosome VII co-localized with a resistance-gene-like (RGL) DNA marker. This marker was used to isolate from genomic libraries 15 members of a closely related candidate gene family. Analysis of inheritance, linkage mapping, and sequencing reduced the number of candidate genes to three. Complementation analysis by stable potato transformation showed that the gene Gro1-4 conferred resistance to G. rostochiensis pathotype Ro1. Gro1-4 encodes a protein of 1136 amino acids that contains Toll-interleukin 1 receptor (TIR), nucleotide-binding (NB), leucine-rich repeat (LRR) homology domains and a C-terminal domain with unknown function. The deduced Gro1-4 protein differed by 29 amino acid changes from susceptible members of the Gro1 gene family. Sequence characterization of 13 members of the Gro1 gene family revealed putative regulatory elements and a variable microsatellite in the promoter region, insertion of a retrotransposon-like element in the first intron, and a stop codon in the NB coding region of some genes. Sequence analysis of RT-PCR products showed that Gro1-4 is expressed, among other members of the family including putative pseudogenes, in non-infected roots of nematode-resistant plants. RT-PCR also demonstrated that members of the Gro1 gene family are expressed in most potato tissues.

  5. Real Time Optima Tracking Using Harvesting Models of the Genetic Algorithm

    NASA Technical Reports Server (NTRS)

    Baskaran, Subbiah; Noever, D.

    1999-01-01

    Tracking optima in real time propulsion control, particularly for non-stationary optimization problems, is a challenging task. Several approaches have been put forward for such a study, including the numerical method called the genetic algorithm. In brief, this approach is built upon Darwinian-style competition between numerical alternatives displayed in the form of binary strings, or by analogy to 'pseudogenes'. Breeding of improved solutions is an often cited parallel to natural selection in evolutionary or soft computing. In this report we present our results of applying a novel model of a genetic algorithm for tracking optima in propulsion engineering and in real time control. We specialize the algorithm to mission profiling and planning optimizations, both to select reduced propulsion needs through trajectory planning and to explore time or fuel conservation strategies.

  6. Dynamics of actin evolution in dinoflagellates.

    PubMed

    Kim, Sunju; Bachvaroff, Tsvetan R; Handy, Sara M; Delwiche, Charles F

    2011-04-01

    Dinoflagellates have unique nuclei and intriguing genome characteristics with very high DNA content making complete genome sequencing difficult. In dinoflagellates, many genes are found in multicopy gene families, but the processes involved in the establishment and maintenance of these gene families are poorly understood. Understanding the dynamics of gene family evolution in dinoflagellates requires comparisons at different evolutionary scales. Studies of closely related species provide fine-scale information relative to species divergence, whereas comparisons of more distantly related species provide broad context. We selected the actin gene family as a highly expressed conserved gene previously studied in dinoflagellates. Of the 142 sequences determined in this study, 103 were from the two closely related species, Dinophysis acuminata and D. caudata, including full length and partial cDNA sequences as well as partial genomic amplicons. For these two Dinophysis species, at least three types of sequences could be identified. Most copies (79%) were relatively similar and in nucleotide trees, the sequences formed two bushy clades corresponding to the two species. In comparisons within species, only eight to ten nucleotide differences were found between these copies. The two remaining types formed clades containing sequences from both species. One type included the most similar sequences in between-species comparisons with as few as 12 nucleotide differences between species. The second type included the most divergent sequences in comparisons between and within species with up to 93 nucleotide differences between sequences. In all the sequences, most variation occurred in synonymous sites or the 5' UnTranslated Region (UTR), although there was still limited amino acid variation between most sequences. Several potential pseudogenes were found (approximately 10% of all sequences depending on species) with incomplete open reading frames due to frameshifts or early stop codons. Overall, variation in the actin gene family fits best with the "birth and death" model of evolution based on recent duplications, pseudogenes, and incomplete lineage sorting. Divergence between species was similar to variation within species, so that actin may be too conserved to be useful for phylogenetic estimation of closely related species.
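    The abstract above flags potential pseudogenes by incomplete open reading frames caused by frameshifts or early stop codons. The following is a minimal sketch of such a check, not the authors' method: the function name `orf_defects` is hypothetical, the standard nuclear stop codons are assumed, and a length that is not a multiple of three is used only as a crude proxy for a frameshift. Partial sequences would also be flagged by this test.

```python
# Hypothetical helper: lists reasons a coding sequence is not an intact ORF.
# Assumes the standard genetic code stop codons (an assumption, not from the abstract).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_defects(cds):
    """Return a list of defects found in the coding sequence `cds`."""
    seq = cds.upper()
    defects = []
    # Crude frameshift proxy: length is not a whole number of codons.
    if len(seq) % 3 != 0:
        defects.append("frameshift")
    if not seq.startswith("ATG"):
        defects.append("missing start codon")
    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # A stop codon anywhere before the last codon truncates the protein.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        defects.append("premature stop codon")
    if not codons or codons[-1] not in STOP_CODONS:
        defects.append("missing terminal stop codon")
    return defects
```

    An empty list means no defect was detected by these simple criteria; it does not establish that a sequence is a functional gene.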

  7. Genetic evolution of Mycoplasma capricolum subsp. capripneumoniae strains and molecular epidemiology of contagious caprine pleuropneumonia by sequencing of locus H2.

    PubMed

    Lorenzon, S; Wesonga, H; Ygesu, Laikemariam; Tekleghiorgis, Tesfaalem; Maikano, Y; Angaya, M; Hendrikx, P; Thiaucourt, F

    2002-03-01

    Contagious caprine pleuropneumonia (CCPP) is a major threat to goat farming in developing countries. Its exact distribution is not well known, despite the fact that new diagnostic tools such as PCR and competitive ELISA are now available. The authors developed a study of the molecular epidemiology of the disease, based on the amplification of a 2400 bp long fragment containing two duplicated genes coding for a putative membrane protein. The sequence of this fragment, obtained on 19 Mycoplasma capricolum subsp. capripneumoniae (Mccp) strains from various geographical locations, gave 11 polymorphic positions. The three mutations found on gene H2prim were silent and did not appear to induce any amino acid modifications in the putative translated protein. The second gene may be a pseudogene not translated in vivo, as it bore a deletion of the ATG codon found in the other members of the "Mycoplasma mycoides cluster" and as the six mutations evidenced in the Mccp strains would induce modifications in the translated amino acids. In addition, an Mccp strain isolated in the United Arab Emirates showed a deletion of the whole pseudogene, a further indication that this gene is not compulsory for mycoplasma growth. Four lineages were defined, based on the nucleotide sequence. These correlated relatively well with the geographical origin of the strains: North, Central or East Africa. The strain of Turkish origin had a sequence similar to that found in North African strains, while strains isolated in Oman had sequences similar to those of North or East African strains. The latter is possibly due to the regular import of goats of various origins. Similar molecular epidemiology tools have been developed by sequencing the two operons of the 16S rRNA gene or by AFLP. All these various techniques give complementary results. One (16S rRNA) offers the likelihood of a finer identification of strains circulating in a region, another (H2) of determining the geographical origin of the strains. These tools can make a very useful contribution to understanding the epidemiology of CCPP.

  8. Novel Role of 3’UTR-Embedded Alu Elements as Facilitators of Processed Pseudogene Genesis and Host Gene Capture by Viral Genomes

    PubMed Central

    Engel, Pablo; Angulo, Ana

    2016-01-01

    Since the discovery of the high abundance of Alu elements in the human genome, the interest for the functional significance of these retrotransposons has been increasing. Primate Alu and rodent Alu-like elements are retrotransposed by a mechanism driven by the LINE1 (L1) encoded proteins, the same machinery that generates the L1 repeats, the processed pseudogenes (PPs), and other retroelements. Apart from free Alu RNAs, Alus are also transcribed and retrotranscribed as part of cellular gene transcripts, generally embedded inside 3’ untranslated regions (UTRs). Despite different proposed hypotheses, the functional implication of the presence of Alus inside 3’UTRs remains elusive. In this study we hypothesized that Alu elements in 3’UTRs could be involved in the genesis of PPs. By analyzing human genome data we discovered that the existence of 3’UTR-embedded Alu elements is overrepresented in genes source of PPs. In contrast, the presence of other retrotransposable elements in 3’UTRs does not show this PP linked overrepresentation. This research was extended to mouse and rat genomes and the results accordingly reveal overrepresentation of 3’UTR-embedded B1 (Alu-like) elements in PP parent genes. Interestingly, we also demonstrated that the overrepresentation of 3’UTR-embedded Alus is particularly significant in PP parent genes with low germline gene expression level. Finally, we provide data that support the hypothesis that the L1 machinery is also the system that herpesviruses, and possibly other large DNA viruses, use to capture host genes expressed in germline or somatic cells. Altogether our results suggest a novel role for Alu or Alu-like elements inside 3’UTRs as facilitators of the genesis of PPs, particularly in lowly expressed genes. Moreover, we propose that this L1-driven mechanism, aided by the presence of 3’UTR-embedded Alus, may also be exploited by DNA viruses to incorporate host genes to their viral genomes. PMID:28033411

  9. The Genomes of the Fungal Plant Pathogens Cladosporium fulvum and Dothistroma septosporum Reveal Adaptation to Different Hosts and Lifestyles But Also Signatures of Common Ancestry

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    de Wit, Pierre J. G. M.; van der Burgt, Ate; Okmen, Bilal

    2012-05-04

    We sequenced and compared the genomes of the Dothideomycete fungal plant pathogens Cladosporium fulvum (Cfu) (syn. Passalora fulva) and Dothistroma septosporum (Dse) that are closely related phylogenetically, but have different lifestyles and hosts. Although both fungi grow extracellularly in close contact with host mesophyll cells, Cfu is a biotroph infecting tomato, while Dse is a hemibiotroph infecting pine. The genomes of these fungi have a similar set of genes (70% of gene content in both genomes are homologs), but differ significantly in size (Cfu >61.1-Mb; Dse 31.2-Mb), which is mainly due to the difference in repeat content (47.2% in Cfu versus 3.2% in Dse). Recent adaptation to different lifestyles and hosts is suggested by diverged sets of genes. Cfu contains a tomatinase gene that we predict might be required for detoxification of tomatine, while this gene is absent in Dse. Many genes encoding secreted proteins are unique to each species and the repeat-rich areas in Cfu are enriched for these species-specific genes. In contrast, conserved genes suggest common host ancestry. Homologs of Cfu effector genes, including Ecp2 and Avr4, are present in Dse and induce a Cf-Ecp2- and Cf-4-mediated hypersensitive response, respectively. Strikingly, genes involved in production of the toxin dothistromin, a likely virulence factor for Dse, are conserved in Cfu, but their expression differs markedly with essentially no expression by Cfu in planta. Likewise, Cfu has a carbohydrate-degrading enzyme catalog that is more similar to that of necrotrophs or hemibiotrophs and a larger pectinolytic gene arsenal than Dse, but many of these genes are not expressed in planta or are pseudogenized. Overall, comparison of their genomes suggests that these closely related plant pathogens had a common ancestral host but since adapted to different hosts and lifestyles by a combination of differentiated gene content, pseudogenization, and gene regulation.

  10. Comparative Omics-Driven Genome Annotation Refinement: Application across Yersiniae

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rutledge, Alexandra C.; Jones, Marcus B.; Chauhan, Sadhana

    2012-03-27

    Genome sequencing continues to be a rapidly evolving technology, yet most downstream aspects of genome annotation pipelines remain relatively stable or are even being abandoned. To date, the perceived value of manual curation for genome annotations is not offset by the real cost and time associated with the process. In order to balance the large number of sequences generated, the annotation process is now performed almost exclusively in an automated fashion for most genome sequencing projects. One possible way to reduce errors inherent to automated computational annotations is to apply data from 'omics' measurements (i.e. transcriptional and proteomic) to the un-annotated genome with a proteogenomic-based approach. This approach does require additional experimental and bioinformatics methods to include omics technologies; however, the approach is readily automatable and can benefit from rapid developments occurring in those research domains as well. The annotation process can be improved by experimental validation of transcription and translation and aid in the discovery of annotation errors. Here the concept of annotation refinement has been extended to include a comparative assessment of genomes across closely related species, as is becoming common in sequencing efforts. Transcriptomic and proteomic data derived from three highly similar pathogenic Yersiniae (Y. pestis CO92, Y. pestis pestoides F, and Y. pseudotuberculosis PB1/+) was used to demonstrate a comprehensive comparative omic-based annotation methodology. Peptide and oligo measurements experimentally validated the expression of nearly 40% of each strain's predicted proteome and revealed the identification of 28 novel and 68 previously incorrect protein-coding sequences (e.g., observed frameshifts, extended start sites, and translated pseudogenes) within the three current Yersinia genome annotations. Gene loss is presumed to play a major role in Y. pestis acquiring its niche as a virulent pathogen, thus the discovery of many translated pseudogenes underscores a need for functional analyses to investigate hypotheses related to divergence. Refinements included the discovery of a seemingly essential ribosomal protein, several virulence-associated factors, and a transcriptional regulator, among other proteins, most of which are annotated as hypothetical, that were missed during annotation.

  11. Lessons learned from the initial sequencing of the pig genome: comparative analysis of an 8 Mb region of pig chromosome 17

    PubMed Central

    Hart, Elizabeth A; Caccamo, Mario; Harrow, Jennifer L; Humphray, Sean J; Gilbert, James GR; Trevanion, Steve; Hubbard, Tim; Rogers, Jane; Rothschild, Max F

    2007-01-01

    Background We describe here the sequencing, annotation and comparative analysis of an 8 Mb region of pig chromosome 17, which provides a useful test region to assess coverage and quality for the pig genome sequencing project. We report our findings comparing the annotation of draft sequence assembled at different depths of coverage. Results Within this region we annotated 71 loci, of which 53 are orthologous to human known coding genes. When compared to the syntenic regions in human (20q13.13-q13.33) and mouse (chromosome 2, 167.5 Mb-178.3 Mb), this region was found to be highly conserved with respect to gene order. The most notable difference between the three species is the presence of a large expansion of zinc finger coding genes and pseudogenes on mouse chromosome 2 between Edn3 and Phactr3 that is absent from pig and human. All of our annotation has been made publicly available in the Vertebrate Genome Annotation browser, VEGA. We assessed the impact of coverage on sequence assembly across this region and found, as expected, that increased sequence depth resulted in fewer, longer contigs. One-third of our annotated loci could not be fully re-aligned back to the low coverage version of the sequence, principally because the transcripts are fragmented over several contigs. Conclusion We have demonstrated the considerable advantages of sequencing at increased read depths and discuss the implications that lower coverage sequence may have on subsequent comparative and functional studies, particularly those involving complex loci such as GNAS. PMID:17705864

  12. HLA-B40, B18, B27, and B37 allele discrimination using group-specific amplification and SSCP method.

    PubMed

    Bannai, M; Tokunaga, K; Lin, L; Ogawa, A; Fujisawa, K; Juji, T

    1996-04-01

    We developed a system for discriminating HLA-B40, B18, B27, and B37 alleles using a two-step PCR method followed by SSCP analysis. Fragments (0.8 kb) including exon 2, intron 2, and exon 3 were amplified in the first PCR. We used two sets of primers, one specific for HLA-B60-related alleles and the other specific for HLA-B61-related, B18, B27, and B37 alleles. No amplifications of other class I genes or pseudogenes were observed. In the second PCR, exon 2 and exon 3 were amplified separately, using diluents of the first PCR products as templates. HLA-B61-related, B18, B27, B37, and B60-related alleles were clearly discriminated in the SSCP analysis of the second PCR products. In a population study in which B61 alleles were analyzed, B*4003 was detected in two Japanese individuals in addition to two B61 alleles previously reported to occur in Japanese, B*4002 and B*4006. The relative frequencies of B*4002, B*4006, and B*4003 in Japanese were 58, 35, and 6%, respectively. The individuals having B*4003 are the first non-South Americans in whom this allele has been detected. The SSCP banding patterns of 18 HLA-B60-positive Japanese population samples were identical to those of a B*40012 sample for both exon 2 and exon 3. We also demonstrated that the B37 allele occurring in some Japanese is B*3701.

  13. Towards cloning the WAS-gene locus: YAC-contigs and PFGE analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Meindi, A.; Schindelhauer, D.; Hellebrand, H.

    1994-09-01

    Patients with X-linked recessive Wiskott-Aldrich syndrome (WAS) manifest eczema, thrombocytopenia and severe immunodeficiency. Mapping studies place the WAS gene locus between the markers TIMP and DXS255 which both have been shown to be recombinant with the disease locus. Linkage analysis in eight families including a large Swiss family showed tight linkage of the disease to the loci DXS255 and DXS1126 and exclusion of TIMP as well as polymorphic loci adjacent to the OATL1 pseudogene cluster (e.g., DXS6616). Physical mapping with established YAC contigs and a radiation hybrid encompassing the Xp11.22-11.3 region revealed the loci order TIMP-PFC-elk1-DXS1367-DXS6616-OATL1-(DXS11260DXS226)-C5-3-TGE-3, SYP and (DXS255-DXS146). The markers TIMP and C5-3 are contained on the same 1.6 Mb MluI-fragment. A novel expressed sequence (R1) could be placed between elk-1 and the PFC gene while the STS C5-3 could be localized adjacent to DXS1126. The gene cluster around DXS1126 could be connected with the TFE-3 and synaptophysin genes which map on the same 400 kb MluI fragment and two overlapping YACs. The minimum distance between SYP and DXS255 is 1.2 Mb; the maximum distance is 2.2 Mb. Expressed sequences which are obtained from a cosmid contig around DXS1126 and C5-3 are being used for mutation screening in WAS patients.

  14. Phytophthora megakarya and Phytophthora palmivora, Closely Related Causal Agents of Cacao Black Pod Rot, Underwent Increases in Genome Sizes and Gene Numbers by Different Mechanisms

    PubMed Central

    Ali, Shahin S.; Shao, Jonathan; Lary, David J.; Kronmiller, Brent A.; Shen, Danyu; Strem, Mary D.; Amoako-Attah, Ishmael; Akrofi, Andrew Yaw; Begoude, B.A. Didier; ten Hoopen, G. Martijn; Coulibaly, Klotioloma; Kebe, Boubacar Ismaël; Melnick, Rachel L.; Guiltinan, Mark J.; Tyler, Brett M.; Meinhardt, Lyndel W.

    2017-01-01

    Phytophthora megakarya (Pmeg) and Phytophthora palmivora (Ppal) are closely related species causing cacao black pod rot. Although Ppal is a cosmopolitan pathogen, cacao is the only known host of economic importance for Pmeg. Pmeg is more virulent on cacao than Ppal. We sequenced and compared the Pmeg and Ppal genomes and identified virulence-related putative gene models (PGeneM) that may be responsible for their differences in host specificities and virulence. Pmeg and Ppal have estimated genome sizes of 126.88 and 151.23 Mb and PGeneM numbers of 42,036 and 44,327, respectively. The evolutionary histories of Pmeg and Ppal appear quite different. Postspeciation, Ppal underwent whole-genome duplication whereas Pmeg has undergone selective increases in PGeneM numbers, likely through accelerated transposable element-driven duplications. Many PGeneMs in both species failed to match transcripts and may represent pseudogenes or cryptic genetic reservoirs. Pmeg appears to have amplified specific gene families, some of which are virulence-related. Analysis of mycelium, zoospore, and in planta transcriptome expression profiles using neural network self-organizing map analysis generated 24 multivariate and nonlinear self-organizing map classes. Many members of the RxLR, necrosis-inducing phytophthora protein, and pectinase genes families were specifically induced in planta. Pmeg displays a diverse virulence-related gene complement similar in size to and potentially of greater diversity than Ppal but it remains likely that the specific functions of the genes determine each species’ unique characteristics as pathogens. PMID:28186564

  15. Evolution of trace amine associated receptor (TAAR) gene family in vertebrates: lineage-specific expansions and degradations of a second class of vertebrate chemosensory receptors expressed in the olfactory epithelium.

    PubMed

    Hashiguchi, Yasuyuki; Nishida, Mutsumi

    2007-09-01

    The trace amine-associated receptors (TAARs) form a specific family of G protein-coupled receptors in vertebrates. TAARs were initially considered neurotransmitter receptors, but recent study showed that mouse TAARs function as chemosensory receptors in the olfactory epithelium. To clarify the evolutionary dynamics of the TAAR gene family in vertebrates, near-complete repertoires of TAAR genes and pseudogenes were identified from the genomic assemblies of 4 teleost fishes (zebrafish, fugu, stickleback, and medaka), western clawed frogs, chickens, 3 mammals (humans, mice, and opossum), and sea lampreys. Database searches revealed that fishes had many putatively functional TAAR genes (13-109 genes), whereas relatively small numbers of TAAR genes (3-22 genes) were identified in tetrapods. Phylogenetic analysis of these genes indicated that the TAAR gene family was subdivided into 5 subfamilies that diverged before the divergence of ray-finned fishes and tetrapods. In tetrapods, virtually all TAAR genes were located in 1 specific region of their genomes as a gene cluster; however, in fishes, TAAR genes were scattered throughout more than 2 genomic locations. This possibly reflects a whole-genome duplication that occurred in the common ancestor of ray-finned fishes. Expression analysis of zebrafish and stickleback TAAR genes revealed that many TAARs in these fishes were expressed in the olfactory organ, suggesting the relatively high importance of TAARs as chemosensory receptors in fishes. A possible evolutionary history of the vertebrate TAAR gene family was inferred from the phylogenetic and comparative genomic analyses.

  16. [The ENCODE project and functional genomics studies].

    PubMed

    Ding, Nan; Qu, Hongzhu; Fang, Xiangdong

    2014-03-01

    Upon the completion of the Human Genome Project, scientists have been trying to interpret the underlying genomic code for human biology. Since 2003, National Human Genome Research Institute (NHGRI) has invested nearly $0.3 billion and gathered over 440 scientists from more than 32 institutions in the United States, China, United Kingdom, Japan, Spain and Singapore to initiate the Encyclopedia of DNA Elements (ENCODE) project, aiming to identify and analyze all regulatory elements in the human genome. Taking advantage of the development of next-generation sequencing technologies and continuous improvement of experimental methods, ENCODE had made remarkable achievements: identified methylation and histone modification of DNA sequences and their regulatory effects on gene expression through altering chromatin structures, categorized binding sites of various transcription factors and constructed their regulatory networks, further revised and updated database for pseudogenes and non-coding RNA, and identified SNPs in regulatory sequences associated with diseases. These findings help to comprehensively understand information embedded in gene and genome sequences, the function of regulatory elements as well as the molecular mechanism underlying the transcriptional regulation by noncoding regions, and provide extensive data resource for life sciences, particularly for translational medicine. We reviewed the contributions of high-throughput sequencing platform development and bioinformatical technology improvement to the ENCODE project, the association between epigenetics studies and the ENCODE project, and the major achievement of the ENCODE project. We also provided our prospective on the role of the ENCODE project in promoting the development of basic and clinical medicine.

  17. New consensus nomenclature for mammalian keratins

    PubMed Central

    Schweizer, Jürgen; Bowden, Paul E.; Coulombe, Pierre A.; Langbein, Lutz; Lane, E. Birgitte; Magin, Thomas M.; Maltais, Lois; Omary, M. Bishr; Parry, David A.D.; Rogers, Michael A.; Wright, Mathew W.

    2006-01-01

    Keratins are intermediate filament–forming proteins that provide mechanical support and fulfill a variety of additional functions in epithelial cells. In 1982, a nomenclature was devised to name the keratin proteins that were known at that point. The systematic sequencing of the human genome in recent years uncovered the existence of several novel keratin genes and their encoded proteins. Their naming could not be adequately handled in the context of the original system. We propose a new consensus nomenclature for keratin genes and proteins that relies upon and extends the 1982 system and adheres to the guidelines issued by the Human and Mouse Genome Nomenclature Committees. This revised nomenclature accommodates functional genes and pseudogenes, and although designed specifically for the full complement of human keratins, it offers the flexibility needed to incorporate additional keratins from other mammalian species. PMID:16831889

  18. Complete genome sequence of the chromate-reducing bacterium Thermoanaerobacter thermohydrosulfuricus strain BSB-33

    DOE PAGES

    Bhattacharya, Pamela; Barnebey, Adam; Zemla, Marcin; ...

    2015-10-05

    Thermoanaerobacter thermohydrosulfuricus BSB-33 is a thermophilic gram positive obligate anaerobe isolated from a hot spring in West Bengal, India. Unlike other T. thermohydrosulfuricus strains, BSB-33 is able to anaerobically reduce Fe(III) and Cr(VI) optimally at 60 °C. BSB-33 is the first Cr(VI) reducing T. thermohydrosulfuricus genome sequenced and of particular interest for bioremediation of environmental chromium contaminations. Here we discuss features of T. thermohydrosulfuricus BSB-33 and the unique genetic elements that may account for the peculiar metal reducing properties of this organism. The T. thermohydrosulfuricus BSB-33 genome comprises 2597606 bp encoding 2581 protein genes, 12 rRNA, 193 pseudogenes and has a G + C content of 34.20 %. Lastly, putative chromate reductases were identified by comparative analyses with other Thermoanaerobacter and chromate-reducing bacteria.

  19. Nuclear mtDNA pseudogenes as a source of new variants of mitochondrial genes: A case study of Siberian rubythroat Luscinia calliope (muscicapidae, aves).

    PubMed

    Spiridonova, L N; Red'kin, Ya A; Valchuk, O P

    2016-01-01

    First evidence for the presence of copies of mitochondrial cytochrome b gene of the subspecies group Luscinia calliope anadyrensis-L. c. camtschatkensis in the nuclear genome of nominative L. c. calliope was obtained, which indirectly indicates the nuclear origin of the subspecies-specific mitochondrial haplotypes in Siberian rubythroat. This fact clarifies the appearance of mitochondrial haplotypes of eastern subspecies by exchange between the homologous regions of the nuclear and mitochondrial genomes followed by fixation by the founder effect. This is the first study to propose a mechanism of DNA fragment exchange between the nucleus and mitochondria (intergenomic recombination) and to show the role of nuclear copies of mtDNA as a source of new taxon-specific mitochondrial haplotypes, which implies their involvement in the microevolutionary processes and morphogenesis.

  20. Molecular basis of length polymorphism in the human zeta-globin gene complex.

    PubMed Central

    Goodbourn, S E; Higgs, D R; Clegg, J B; Weatherall, D J

    1983-01-01

    The length polymorphism between the human zeta-globin gene and its pseudogene is caused by an allele-specific variation in the copy number of a tandemly repeating 36-base-pair sequence. This sequence is related to a tandemly repeated 14-base-pair sequence in the 5' flanking region of the human insulin gene, which is known to cause length polymorphism, and to a repetitive sequence in intervening sequence (IVS) 1 of the pseudo-zeta-globin gene. Evidence is presented that the latter is also of variable length, probably because of differences in the copy number of the tandem repeat. The homology between the three length polymorphisms may be an indication of the presence of a more widespread group of related sequences in the human genome, which might be useful for generalized linkage studies. PMID:6308667
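    The abstract above attributes allele length differences to variation in the copy number of a 36-base-pair tandem repeat. The arithmetic can be sketched as follows; this is an illustration only, and the function names and the flanking length used in the usage note are hypothetical values, not data from the study.

```python
# Repeat unit length stated in the abstract (36 bp); everything else is illustrative.
REPEAT_UNIT_BP = 36


def allele_length(flank_bp, copies, unit_bp=REPEAT_UNIT_BP):
    """Length of an allele: fixed flanking sequence plus `copies` tandem repeats."""
    if flank_bp < 0 or copies < 0:
        raise ValueError("flank_bp and copies must be non-negative")
    return flank_bp + copies * unit_bp


def copy_number_difference(len_a_bp, len_b_bp, unit_bp=REPEAT_UNIT_BP):
    """Number of repeat units separating two alleles that differ only in copy number."""
    diff = abs(len_a_bp - len_b_bp)
    if diff % unit_bp != 0:
        # The length difference cannot be explained by whole repeat units alone.
        raise ValueError("length difference is not a multiple of the repeat unit")
    return diff // unit_bp
```

    For example, with a hypothetical 500 bp of flanking sequence, alleles carrying 10 and 14 copies would be 860 bp and 1004 bp long, a difference of 144 bp or four repeat units.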

  1. Detection of large scale 3' deletions in the PMS2 gene amongst Colon-CFR participants: have we been missing anything?

    PubMed

    Clendenning, Mark; Walsh, Michael D; Gelpi, Judith Balmana; Thibodeau, Stephen N; Lindor, Noralane; Potter, John D; Newcomb, Polly; LeMarchand, Loic; Haile, Robert; Gallinger, Steve; Hopper, John L; Jenkins, Mark A; Rosty, Christophe; Young, Joanne P; Buchanan, Daniel D

    2013-09-01

    Current screening practices have been able to identify PMS2 mutations in 78 % of cases of colorectal cancer from the Colorectal Cancer Family Registry (Colon CFR) which showed solitary loss of the PMS2 protein. However the detection of large-scale deletions in the 3' end of the PMS2 gene has not been possible due to technical difficulties associated with pseudogene sequences. Here, we utilised a recently described MLPA/long-range PCR-based approach to screen the remaining 22 % (n = 16) of CRC-affected probands for mutations in the 3' end of the PMS2 gene. No deletions encompassing any or all of exons 12 through 15 were identified; therefore, our results suggest that 3' deletions in PMS2 are not a frequent occurrence in such families.

  2. The mouse genome displays highly dynamic populations of KRAB-zinc finger protein genes and related genetic units

    PubMed Central

    Kauzlaric, Annamaria; Ecco, Gabriela; Cassano, Marco; Duc, Julien; Imbeault, Michael; Trono, Didier

    2017-01-01

    KRAB-containing poly-zinc finger proteins (KZFPs) constitute the largest family of transcription factors encoded by mammalian genomes, and growing evidence indicates that they fulfill functions critical to both embryonic development and maintenance of adult homeostasis. KZFP genes underwent broad and independent waves of expansion in many higher vertebrates lineages, yet comprehensive studies of members harbored by a given species are scarce. Here we present a thorough analysis of KZFP genes and related units in the murine genome. We first identified about twice as many elements than previously annotated as either KZFP genes or pseudogenes, notably by assigning to this family an entity formerly considered as a large group of Satellite repeats. We then could delineate an organization in clusters distributed throughout the genome, with signs of recombination, translocation, duplication and seeding of new sites by retrotransposition of KZFP genes and related genetic units (KZFP/rGUs). Moreover, we harvested evidence indicating that closely related paralogs had evolved through both drifting and shifting of sequences encoding for zinc finger arrays. Finally, we could demonstrate that the KAP1-SETDB1 repressor complex tames the expression of KZFP/rGUs within clusters, yet that the primary targets of this regulation are not the KZFP/rGUs themselves but enhancers contained in neighboring endogenous retroelements and that, underneath, KZFPs conserve highly individualized patterns of expression. PMID:28334004

  3. Analyses of chicken immunoglobulin light chain cDNA clones indicate a few germline V lambda genes and allotypes of the C lambda locus.

    PubMed

    Parvari, R; Ziv, E; Lentner, F; Tel-Or, S; Burstein, Y; Schechter, I

    1987-01-01

    cDNA libraries of chicken spleen and Harder gland (a gland enriched with immunocytes) constructed in pBR322 were screened by differential hybridization and by mRNA hybrid-selected translation. Eleven L-chain cDNA clones were identified from which VL probes were prepared and each was annealed with kidney DNA restriction digests. All VL probes revealed the same set of bands, corresponding to about 15 germline VL genes of one subgroup. The nucleotide sequences of six VL clones showed greater than or equal to 85% homology, and the predicted amino acid sequences were identical or nearly identical to the major N-terminal sequence of L-chains in chicken serum. These findings, and the fact that the VL clones were randomly selected from normal lymphoid tissues, strongly indicate that the bulk of chicken L-chains is encoded by a few germline VL genes, probably much less than 15 since many of the VL genes are known to be pseudogenes. Therefore, it is likely that somatic mechanisms operating prior to specific triggering by antigen play a major role in the generation of antibody diversity in chicken. Analysis of the constant region locus (sequencing of CL gene and cDNAs) demonstrate a single CL isotype and suggest the presence of CL allotypes.

  4. Cloning, expression and biochemical characterization of one Epsilon-class (GST-3) and ten Delta-class (GST-1) glutathione S-transferases from Drosophila melanogaster, and identification of additional nine members of the Epsilon class.

    PubMed Central

    Sawicki, Rafał; Singh, Sharda P; Mondal, Ashis K; Benes, Helen; Zimniak, Piotr

    2003-01-01

    From the fruitfly, Drosophila melanogaster, ten members of the cluster of Delta-class glutathione S-transferases (GSTs; formerly denoted as Class I GSTs) and one member of the Epsilon-class cluster (formerly GST-3) have been cloned, expressed in Escherichia coli, and their catalytic properties have been determined. In addition, nine more members of the Epsilon cluster have been identified through bioinformatic analysis but not further characterized. Of the 11 expressed enzymes, seven accepted the lipid peroxidation product 4-hydroxynonenal as substrate, and nine were active in glutathione conjugation of 1-chloro-2,4-dinitrobenzene. Since the enzymically active proteins included the gene products of DmGSTD3 and DmGSTD7 which were previously deemed to be pseudogenes, we investigated them further and determined that both genes are transcribed in Drosophila. Thus our present results indicate that DmGSTD3 and DmGSTD7 are probably functional genes. The existence and multiplicity of insect GSTs capable of conjugating 4-hydroxynonenal, in some cases with catalytic efficiencies approaching those of mammalian GSTs highly specialized for this function, indicates that metabolism of products of lipid peroxidation is a highly conserved biochemical pathway with probable detoxification as well as regulatory functions. PMID:12443531

  5. Detection of novel NF1 mutations and rapid mutation prescreening with Pyrosequencing.

    PubMed

    Brinckmann, Anja; Mischung, Claudia; Bässmann, Ingelore; Kühnisch, Jirko; Schuelke, Markus; Tinschert, Sigrid; Nürnberg, Peter

    2007-12-01

    Neurofibromatosis type 1 (NF1) is caused by mutations in the neurofibromin (NF1) gene. Mutation analysis of NF1 is complicated by its large size, the lack of mutation hotspots, pseudogenes and frequent de novo mutations. Additionally, the search for NF1 mutations on the mRNA level is often hampered by nonsense-mediated mRNA decay (NMD) of the mutant allele. In this study we searched for mutations in a cohort of 38 patients and investigated the relationship between mutation type and allele-specific transcription from the wild-type versus mutant alleles. Quantification of relative mRNA transcript numbers was done by Pyrosequencing, a novel real-time sequencing method whose signals can be quantified very accurately. We identified 21 novel mutations comprising various mutation types. Pyrosequencing detected a definite relationship between allelic NF1 transcript imbalance due to NMD and mutation type in 24 of 29 patients who all carried frame-shift or nonsense mutations. NMD was absent in 5 patients with missense and silent mutations, as well as in 4 patients with splice-site mutations that did not disrupt the reading frame. Pyrosequencing was capable of detecting NMD even when the effects were only moderate. Diagnostic laboratories could thus exploit this effect for rapid prescreening for NF1 mutations as more than 60% of the mutations in this gene disrupt the reading frame and are prone to NMD.

  6. Genome of the Actinomycete Plant Pathogen Clavibacter michiganensis subsp. sepedonicus Suggests Recent Niche Adaptation

    PubMed Central

    Bentley, Stephen D.; Corton, Craig; Brown, Susan E.; Barron, Andrew; Clark, Louise; Doggett, Jon; Harris, Barbara; Ormond, Doug; Quail, Michael A.; May, Georgiana; Francis, David; Knudson, Dennis; Parkhill, Julian; Ishimaru, Carol A.

    2008-01-01

    Clavibacter michiganensis subsp. sepedonicus is a plant-pathogenic bacterium and the causative agent of bacterial ring rot, a devastating agricultural disease under strict quarantine control and zero tolerance in the seed potato industry. This organism appears to be largely restricted to an endophytic lifestyle, proliferating within plant tissues and unable to persist in the absence of plant material. Analysis of the genome sequence of C. michiganensis subsp. sepedonicus and comparison with the genome sequences of related plant pathogens revealed a dramatic recent evolutionary history. The genome contains 106 insertion sequence elements, which appear to have been active in extensive rearrangement of the chromosome compared to that of Clavibacter michiganensis subsp. michiganensis. There are 110 pseudogenes with overrepresentation in functions associated with carbohydrate metabolism, transcriptional regulation, and pathogenicity. Genome comparisons also indicated that there is substantial gene content diversity within the species, probably due to differential gene acquisition and loss. These genomic features and evolutionary dating suggest that there was recent adaptation for life in a restricted niche where nutrient diversity and perhaps competition are low, correlated with a reduced ability to exploit previously occupied complex niches outside the plant. Toleration of factors such as multiplication and integration of insertion sequence elements, genome rearrangements, and functional disruption of many genes and operons seems to indicate that there has been general relaxation of selective pressure on a large proportion of the genome. PMID:18192393

  7. The mouse genome displays highly dynamic populations of KRAB-zinc finger protein genes and related genetic units.

    PubMed

    Kauzlaric, Annamaria; Ecco, Gabriela; Cassano, Marco; Duc, Julien; Imbeault, Michael; Trono, Didier

    2017-01-01

    KRAB-containing poly-zinc finger proteins (KZFPs) constitute the largest family of transcription factors encoded by mammalian genomes, and growing evidence indicates that they fulfill functions critical to both embryonic development and maintenance of adult homeostasis. KZFP genes underwent broad and independent waves of expansion in many higher vertebrate lineages, yet comprehensive studies of members harbored by a given species are scarce. Here we present a thorough analysis of KZFP genes and related units in the murine genome. We first identified about twice as many elements as previously annotated as either KZFP genes or pseudogenes, notably by assigning to this family an entity formerly considered as a large group of Satellite repeats. We then could delineate an organization in clusters distributed throughout the genome, with signs of recombination, translocation, duplication and seeding of new sites by retrotransposition of KZFP genes and related genetic units (KZFP/rGUs). Moreover, we harvested evidence indicating that closely related paralogs had evolved through both drifting and shifting of sequences encoding for zinc finger arrays. Finally, we could demonstrate that the KAP1-SETDB1 repressor complex tames the expression of KZFP/rGUs within clusters, yet that the primary targets of this regulation are not the KZFP/rGUs themselves but enhancers contained in neighboring endogenous retroelements and that, underneath, KZFPs conserve highly individualized patterns of expression.

  8. KIR-HLA distribution in a Vietnamese population from Hanoi.

    PubMed

    Amorim, Leonardo Maldaner; van Tong, Hoang; Hoan, Nghiem Xuan; Vargas, Luciana de Brito; Ribeiro, Enilze Maria de Souza Fonseca; Petzl-Erler, Maria Luiza; Boldt, Angelica B W; Toan, Nguyen Linh; Song, Le Huu; Velavan, Thirumalaisamy P; Augusto, Danillo G

    2018-02-01

    The KIR (killer cell immunoglobulin-like receptors) gene family codifies a group of receptors that recognize human leukocyte antigens (HLA) and modulate natural killer (NK) cells response. Genetic diversity of KIR genes and HLA ligands has not yet been deeply investigated in South East Asia. Here, we characterized KIR gene presence and absence polymorphism of 14 KIR genes and two pseudogenes, as well as the frequencies of the ligands HLA-Bw4, HLA-C1 and HLA-C2 in a Vietnamese population from Hanoi (n = 140). Genotyping was performed by polymerase chain reaction with specific sequence primers (PCR-SSP). We compared KIR frequencies and performed principal component analysis with 43 worldwide populations of different ancestries. KIR carrier frequencies in Vietnamese were similar to those reported for Thai and Chinese Han, but differed significantly from other geographically close populations such as Japanese and South Korean. This similarity was also observed in KIR gene-content genotypes and is in accordance with the origin from Southern China and Thailand proposed for the Vietnamese population. The frequencies of HLA ligands observed in Vietnamese did not differ from those reported for other East-Asian populations (p > .05). Studies regarding KIR-HLA in populations are of prime importance to understand their evolution, function and role in diseases. Copyright © 2017 American Society for Histocompatibility and Immunogenetics. Published by Elsevier Inc. All rights reserved.

  9. Extensive Gains and Losses of Olfactory Receptor Genes in Mammalian Evolution

    PubMed Central

    Niimura, Yoshihito; Nei, Masatoshi

    2007-01-01

    Odor perception in mammals is mediated by a large multigene family of olfactory receptor (OR) genes. The number of OR genes varies extensively among different species of mammals, and most species have a substantial number of pseudogenes. To gain some insight into the evolutionary dynamics of mammalian OR genes, we identified the entire set of OR genes in platypuses, opossums, cows, dogs, rats, and macaques and studied the evolutionary change of the genes together with those of humans and mice. We found that platypuses and primates have <400 functional OR genes while the other species have 800–1,200 functional OR genes. We then estimated the numbers of gains and losses of OR genes for each branch of the phylogenetic tree of mammals. This analysis showed that (i) gene expansion occurred in the placental lineage each time after it diverged from monotremes and from marsupials and (ii) hundreds of gains and losses of OR genes have occurred in an order-specific manner, making the gene repertoires highly variable among different orders. It appears that the number of OR genes is determined primarily by the functional requirement for each species, but once the number reaches the required level, it fluctuates by random duplication and deletion of genes. This fluctuation seems to have been aided by the stochastic nature of OR gene expression. PMID:17684554

  10. Two host cytoplasmic effectors are required for pathogenesis of Phytophthora sojae by suppression of host defenses.

    PubMed

    Liu, Tingli; Ye, Wenwu; Ru, Yanyan; Yang, Xinyu; Gu, Biao; Tao, Kai; Lu, Shan; Dong, Suomeng; Zheng, Xiaobo; Shan, Weixing; Wang, Yuanchao; Dou, Daolong

    2011-01-01

    Phytophthora sojae encodes hundreds of putative host cytoplasmic effectors with conserved FLAK motifs following signal peptides, termed crinkling- and necrosis-inducing proteins (CRN) or Crinkler. Their functions and mechanisms in pathogenesis are mostly unknown. Here, we identify a group of five P. sojae-specific CRN-like genes with high levels of sequence similarity, of which three are putative pseudogenes. Functional analysis shows that the two functional genes encode proteins with predicted nuclear localization signals that induce contrasting responses when expressed in Nicotiana benthamiana and soybean (Glycine max). PsCRN63 induces cell death, while PsCRN115 suppresses cell death elicited by the P. sojae necrosis-inducing protein (PsojNIP) or PsCRN63. Expression of CRN fragments with deleted signal peptides and FLAK motifs demonstrates that the carboxyl-terminal portions of PsCRN63 or PsCRN115 are sufficient for their activities. However, the predicted nuclear localization signal is required for PsCRN63 to induce cell death but not for PsCRN115 to suppress cell death. Furthermore, silencing of the PsCRN63 and PsCRN115 genes in P. sojae stable transformants leads to a reduction of virulence on soybean. Intriguingly, the silenced transformants lose the ability to suppress host cell death and callose deposition on inoculated plants. These results suggest a role for CRN effectors in the suppression of host defense responses.

  11. Molecular Population Genetics of Human CYP3A Locus: Signatures of Positive Selection and Implications for Evolutionary Environmental Medicine

    PubMed Central

    Chen, Xiaoping; Wang, Haijian; Zhou, Gangqiao; Zhang, Xiumei; Dong, Xiaojia; Zhi, Lianteng; Jin, Li; He, Fuchu

    2009-01-01

    Background: The human CYP3A gene cluster codes for cytochrome P450 (CYP) subfamily enzymes that catalyze the metabolism of various exogenous and endogenous chemicals and is an obvious candidate for evolutionary and environmental genomic study. Functional variants in the CYP3A locus may have undergone a selective sweep in response to various environmental conditions. Objective: The goal of this study was to profile the allelic structure across the human CYP3A locus and investigate natural selection on that locus. Methods: From the CYP3A locus spanning 231 kb, we resequenced 54 genomic DNA fragments (a total of 43,675 bases) spanning four genes (CYP3A4, CYP3A5, CYP3A7, and CYP3A43) and two pseudogenes (CYP3AP1 and CYP3AP2), and randomly selected intergenic regions at the CYP3A locus in Africans (24 individuals), Caucasians (24 individuals), and Chinese (29 individuals). We comprehensively investigated the nucleotide diversity and haplotype structure and examined the possible role of natural selection in shaping the sequence variation throughout the gene cluster. Results: Neutrality tests with Tajima’s D, Fu and Li’s D* and F*, and Fay and Wu’s H indicated possible roles of positive selection on the entire CYP3A locus in non-Africans. Sliding-window analyses of nucleotide diversity and frequency spectrum, as well as haplotype diversity and phylogenetically inferred haplotype structure, revealed that CYP3A4 and CYP3A7 had recently undergone or were undergoing a selective sweep in all three populations, whereas CYP3A43 and CYP3A5 were undergoing a selective sweep in non-Africans and Caucasians, respectively. Conclusion: The refined allelic architecture and selection spectrum for the human CYP3A locus highlight that evolutionary dynamics of molecular adaptation may underlie the phenotypic variation of the xenobiotic disposition system and varied predisposition to complex disorders in which xenobiotics play a role. PMID:20019904

  12. Gene duplication and fragmentation in the zebra finch major histocompatibility complex.

    PubMed

    Balakrishnan, Christopher N; Ekblom, Robert; Völker, Martin; Westerdahl, Helena; Godinez, Ricardo; Kotkiewicz, Holly; Burt, David W; Graves, Tina; Griffin, Darren K; Warren, Wesley C; Edwards, Scott V

    2010-04-01

    Due to its high polymorphism and importance for disease resistance, the major histocompatibility complex (MHC) has been an important focus of many vertebrate genome projects. Avian MHC organization is of particular interest because the chicken Gallus gallus, the avian species with the best characterized MHC, possesses a highly streamlined minimal essential MHC, which is linked to resistance against specific pathogens. It remains unclear the extent to which this organization describes the situation in other birds and whether it represents a derived or ancestral condition. The sequencing of the zebra finch Taeniopygia guttata genome, in combination with targeted bacterial artificial chromosome (BAC) sequencing, has allowed us to characterize an MHC from a highly divergent and diverse avian lineage, the passerines. The zebra finch MHC exhibits a complex structure and history involving gene duplication and fragmentation. The zebra finch MHC includes multiple Class I and Class II genes, some of which appear to be pseudogenes, and spans a much more extensive genomic region than the chicken MHC, as evidenced by the presence of MHC genes on each of seven BACs spanning 739 kb. Cytogenetic (FISH) evidence and the genome assembly itself place core MHC genes on as many as four chromosomes with TAP and Class I genes mapping to different chromosomes. MHC Class II regions are further characterized by high endogenous retroviral content. Lastly, we find strong evidence of selection acting on sites within passerine MHC Class I and Class II genes. The zebra finch MHC differs markedly from that of the chicken, the only other bird species with a complete genome sequence. The apparent lack of synteny between TAP and the expressed MHC Class I locus is in fact reminiscent of a pattern seen in some mammalian lineages and may represent convergent evolution. Our analyses of the zebra finch MHC suggest a complex history involving chromosomal fission, gene duplication and translocation in the history of the MHC in birds, and highlight striking differences in MHC structure and organization among avian lineages.

  13. The Ftx Noncoding Locus Controls X Chromosome Inactivation Independently of Its RNA Products.

    PubMed

    Furlan, Giulia; Gutierrez Hernandez, Nancy; Huret, Christophe; Galupa, Rafael; van Bemmel, Joke Gerarda; Romito, Antonio; Heard, Edith; Morey, Céline; Rougeulle, Claire

    2018-05-03

    Accumulation of the Xist long noncoding RNA (lncRNA) on one X chromosome is the trigger for X chromosome inactivation (XCI) in female mammals. Xist expression, which needs to be tightly controlled, involves a cis-acting region, the X-inactivation center (Xic), containing many lncRNA genes that evolved concomitantly to Xist from protein-coding ancestors through pseudogeneization and loss of coding potential. Here, we uncover an essential role for the Xic-linked noncoding gene Ftx in the regulation of Xist expression. We show that Ftx is required in cis to promote Xist transcriptional activation and establishment of XCI. Importantly, we demonstrate that this function depends on Ftx transcription and not on the RNA products. Our findings illustrate the multiplicity of layers operating in the establishment of XCI and highlight the diversity in the modus operandi of the noncoding players. Copyright © 2018 Elsevier Inc. All rights reserved.

  14. Genome-Wide Identification and Mapping of NBS-Encoding Resistance Genes in Solanum tuberosum Group Phureja

    PubMed Central

    Lozano, Roberto; Ponce, Olga; Ramirez, Manuel; Mostajo, Nelly; Orjeda, Gisella

    2012-01-01

    The majority of disease resistance (R) genes identified to date in plants encode a nucleotide-binding site (NBS) and leucine-rich repeat (LRR) domain containing protein. Additional domains such as coiled-coil (CC) and TOLL/interleukin-1 receptor (TIR) domains can also be present. In the recently sequenced Solanum tuberosum group phureja genome we used HMM models and manual curation to annotate 435 NBS-encoding R gene homologs and 142 NBS-derived genes that lack the NBS domain. Highly similar homologs for most previously documented Solanaceae R genes were identified. A surprising ∼41% (179) of the 435 NBS-encoding genes are pseudogenes primarily caused by premature stop codons or frameshift mutations. Alignment of 81.80% of the 577 homologs to S. tuberosum group phureja pseudomolecules revealed non-random distribution of the R-genes; 362 of 470 genes were found in high density clusters on 11 chromosomes. PMID:22493716
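
    The proportions quoted in this abstract can be re-derived from the raw counts it gives. The following is a minimal sketch in Python that uses only the numbers stated above; the variable and function names are illustrative and do not come from the paper.

```python
# Counts quoted in the abstract (names are illustrative, not the authors').
nbs_encoding_genes = 435   # annotated NBS-encoding R gene homologs
pseudogenes = 179          # of those, classified as pseudogenes
clustered_genes = 362      # genes found in high-density clusters
mapped_genes = 470         # genes placed on pseudomolecules


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


pseudogene_share = percent(pseudogenes, nbs_encoding_genes)
clustered_share = percent(clustered_genes, mapped_genes)

print(f"pseudogene share: {pseudogene_share:.1f}%")
print(f"clustered share:  {clustered_share:.1f}%")
```

    The first figure agrees with the approximately 41% stated in the abstract; the second shows that roughly three quarters of the mapped genes fall in clusters.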

  15. Development of novel low-copy nuclear markers for Hieraciinae (Asteraceae) and their perspective for other tribes.

    PubMed

    Krak, Karol; Alvarez, Inés; Caklová, Petra; Costa, Andrea; Chrtek, Jindrich; Fehrer, Judith

    2012-02-01

    The development of three low-copy nuclear markers for low taxonomic level phylogenies in Asteraceae with emphasis on the subtribe Hieraciinae is reported. Marker candidates were selected by comparing a Lactuca complementary DNA (cDNA) library with public DNA sequence databases. Interspecific variation and phylogenetic signal of the selected genes were investigated for diploid taxa from the subtribe Hieraciinae and compared to a reference phylogeny. Their ability to cross-amplify was assessed for other Asteraceae tribes. All three markers had higher variation (2.1-4.5 times) than the internal transcribed spacer (ITS) in Hieraciinae. Cross-amplification was successful in at least seven other tribes of the Asteraceae. Only three cases indicating the presence of paralogs or pseudogenes were detected. The results demonstrate the potential of these markers for phylogeny reconstruction in the Hieraciinae as well as in other Asteraceae tribes, especially for very closely related species.

  16. Epigenetics, chromatin and genome organization: recent advances from the ENCODE project.

    PubMed

    Siggens, L; Ekwall, K

    2014-09-01

    The organization of the genome into functional units, such as enhancers and active or repressed promoters, is associated with distinct patterns of DNA and histone modifications. The Encyclopedia of DNA Elements (ENCODE) project has advanced our understanding of the principles of genome, epigenome and chromatin organization, identifying hundreds of thousands of potential regulatory regions and transcription factor binding sites. Part of the ENCODE consortium, GENCODE, has annotated the human genome with novel transcripts including new noncoding RNAs and pseudogenes, highlighting transcriptional complexity. Many disease variants identified in genome-wide association studies are located within putative enhancer regions defined by the ENCODE project. Understanding the principles of chromatin and epigenome organization will help to identify new disease mechanisms, biomarkers and drug targets, particularly as ongoing epigenome mapping projects generate data for primary human cell types that play important roles in disease. © 2014 The Association for the Publication of the Journal of Internal Medicine.

  17. Creating reference gene annotation for the mouse C57BL6/J genome assembly.

    PubMed

    Mudge, Jonathan M; Harrow, Jennifer

    2015-10-01

    Annotation on the reference genome of the C57BL6/J mouse has been an ongoing project ever since the draft genome was first published. Initially, the principal focus was on the identification of all protein-coding genes, although today the importance of describing long non-coding RNAs, small RNAs, and pseudogenes is recognized. Here, we describe the progress of the GENCODE mouse annotation project, which combines manual annotation from the HAVANA group with Ensembl computational annotation, alongside experimental and in silico validation pipelines from other members of the consortium. We discuss the more recent incorporation of next-generation sequencing datasets into this workflow, including the usage of mass-spectrometry data to potentially identify novel protein-coding genes. Finally, we will outline how the C57BL6/J genebuild can be used to gain insights into the variant sites that distinguish different mouse strains and species.

  18. The complete mitochondrial genome of Chinese green hydra, Hydra sinensis (Hydroida: Hydridae).

    PubMed

    Pan, Hong-Chun; Qian, Xiao-Cheng; Li, Ping; Li, Xiao-Fei; Wang, An-Tai

    2014-02-01

    The complete mitochondrial genome of Chinese green hydra, Hydra sinensis (Hydroida: Hydridae) is a linear molecule of 16,189 bp in length, containing 13 protein-coding genes, small and large subunit ribosomal RNAs, methionine and tryptophan transfer RNAs, a pseudogene consisting of a partial copy of COI and terminal sequences at two ends of the linear mitochondrial DNA. The A + T content of the overall base composition of H-strand is 77.2% (T: 41.7%; C: 10.9%; A: 35.5%; and G: 11.9%). COI and ND1 genes begin with GTG as start codon, while other 11 protein-coding genes start with a typical ATG initiation codon. COII, ATP8, ATP6, COIII, ND5, ND6, ND3, ND1, ND4 and COI genes are terminated with TAA as stop codon, ND4L ends with TAG, ND2 ends with TA and Cyt b ends with T.
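
    The reported A + T content follows directly from the four base percentages in the abstract. Below is a minimal Python sketch of that arithmetic, assuming only the figures quoted above; the names `h_strand_composition`, `at_content` and `total_percent` are illustrative and not taken from the paper.

```python
# H-strand base composition (percent), as quoted in the abstract.
h_strand_composition = {"T": 41.7, "C": 10.9, "A": 35.5, "G": 11.9}


def at_content(composition):
    """Return the combined A + T percentage, rounded to one decimal place."""
    return round(composition["A"] + composition["T"], 1)


def total_percent(composition):
    """Return the sum of all base percentages, rounded to one decimal place."""
    return round(sum(composition.values()), 1)


print("A + T content:", at_content(h_strand_composition))
print("Sum of all bases:", total_percent(h_strand_composition))
```

    The A and T values sum to the 77.2% stated in the abstract, and the four percentages together account for the whole strand.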

  19. CRISPR/Cas9-mediated gene knockout of NANOG and NANOGP8 decreases the malignant potential of prostate cancer cells.

    PubMed

    Kawamura, Norihiko; Nimura, Keisuke; Nagano, Hiromichi; Yamaguchi, Sohei; Nonomura, Norio; Kaneda, Yasufumi

    2015-09-08

    NANOG expression in prostate cancer is highly correlated with cancer stem cell characteristics and resistance to androgen deprivation. However, it is not clear whether NANOG or its pseudogenes contribute to the malignant potential of cancer. We established NANOG- and NANOGP8-knockout DU145 prostate cancer cell lines using the CRISPR/Cas9 system. Knockouts of NANOG and NANOGP8 significantly attenuated malignant potential, including sphere formation, anchorage-independent growth, migration capability, and drug resistance, compared to parental DU145 cells. NANOG and NANOGP8 knockout did not inhibit in vitro cell proliferation, but in vivo tumorigenic potential decreased significantly. These phenotypes were recovered in NANOG- and NANOGP8-rescued cell lines. These results indicate that NANOG and NANOGP8 proteins are expressed in prostate cancer cell lines, and NANOG and NANOGP8 equally contribute to the high malignant potential of prostate cancer.

  20. Detection of large scale 3′ deletions in the PMS2 gene amongst Colon-CFR participants – have we been missing anything?

    PubMed Central

    Clendenning, Mark; Walsh, Michael D; Gelpi, Judith Balmana; Thibodeau, Stephen N.; Lindor, Noralane; Potter, John D.; Newcomb, Polly; LeMarchand, Loic; Haile, Robert; Gallinger, Steve; Hopper, John L.; Jenkins, Mark A.; Rosty, Christophe; Young, Joanne P.; Buchanan, Daniel D.

    2013-01-01

    Current screening practices have been able to identify PMS2 mutations in 78% of cases of colorectal cancer from the Colorectal Cancer Family Registry (Colon CFR) which showed solitary loss of the PMS2 protein. However the detection of large-scale deletions in the 3′ end of the PMS2 gene has not been possible due to technical difficulties associated with pseudogene sequences. Here, we utilised a recently described MLPA/long-range PCR-based approach to screen the remaining 22% (n = 16) of CRC-affected probands for mutations in the 3′ end of the PMS2 gene. No deletions encompassing any or all of exons 12 through 15 were identified; therefore, our results suggest that 3′ deletions in PMS2 are not a frequent occurrence in such families. PMID:23288611

  1. Identification of Potential Prostate Cancer-Related Pseudogenes Based on Competitive Endogenous RNA Network Hypothesis.

    PubMed

    Jiang, Tao; Guo, Junjie; Hu, Zhongchun; Zhao, Ming; Gu, Zhenggang; Miao, Shu

    2018-06-20

    BACKGROUND Long noncoding RNAs (lncRNAs) have been revealed to function as competing endogenous RNAs (ceRNAs), which can seclude the common microRNAs (miRNAs) and hence prevent the miRNAs from binding to their ancestral gene. Nonetheless, the role of lncRNA-mediated ceRNAs in prostate cancer has not yet been elucidated. MATERIAL AND METHODS Using The Cancer Genome Atlas (TCGA) database, lncRNA, miRNA, and mRNA profiles from 499 prostate cancer tissues and 52 normal prostate tissues were analyzed with the R package "DESeq" to identify the differentially expressed RNAs. GO and KEGG pathway analyses were performed using "DAVID6.8" and R packages "Clusterprofile." The ceRNA network in prostate cancer was constructed using miRDB, miRTarBase, and TargetScan databases. Survival analysis was performed with Kaplan-Meier analysis. RESULTS A total of 376 lncRNAs, 33 miRNAs, and 687 mRNAs were identified as significant factors in tumorigenesis. Based on the hypothesis that the ceRNA network (lncRNA-miRNA-mRNA regulatory axis) is involved in prostate cancer and forms competitive interrelations between miRNA and mRNA or lncRNA, we constructed a ceRNA network that included 23 lncRNAs, 6 miRNAs, and 2 mRNAs that were differentially expressed in prostate cancer. Only 3 lncRNAs (LINC00308, LINC00355, and OSTN-AS1) had a significant association with survival (P<0.05). The 3 prostate cancer-specific lncRNA were validated in prostate cancer cell lines PC3 and DU145 using qRT-PCR. CONCLUSIONS We demonstrated the differential lncRNA expression profiles in prostate cancer, which provides new insights for future studies of the ceRNA network and its regulatory mechanisms in prostate cancer.

  2. A functional analysis of the spacer of V(D)J recombination signal sequences.

    PubMed

    Lee, Alfred Ian; Fugmann, Sebastian D; Cowell, Lindsay G; Ptaszek, Leon M; Kelsoe, Garnett; Schatz, David G

    2003-10-01

    During lymphocyte development, V(D)J recombination assembles antigen receptor genes from component V, D, and J gene segments. These gene segments are flanked by a recombination signal sequence (RSS), which serves as the binding site for the recombination machinery. The murine Jbeta2.6 gene segment is a recombinationally inactive pseudogene, but examination of its RSS reveals no obvious reason for its failure to recombine. Mutagenesis of the Jbeta2.6 RSS demonstrates that the sequences of the heptamer, nonamer, and spacer are all important. Strikingly, changes solely in the spacer sequence can result in dramatic differences in the level of recombination. The subsequent analysis of a library of more than 4,000 spacer variants revealed that spacer residues of particular functional importance are correlated with their degree of conservation. Biochemical assays indicate distinct cooperation between the spacer and heptamer/nonamer along each step of the reaction pathway. The results suggest that the spacer serves not only to ensure the appropriate distance between the heptamer and nonamer but also regulates RSS activity by providing additional RAG:RSS interaction surfaces. We conclude that while RSSs are defined by a "digital" requirement for absolutely conserved nucleotides, the quality of RSS function is determined in an "analog" manner by numerous complex interactions between the RAG proteins and the less-well conserved nucleotides in the heptamer, the nonamer, and, importantly, the spacer. Those modulatory effects are accurately predicted by a new computational algorithm for "RSS information content." The interplay between such binary and multiplicative modes of interactions provides a general model for analyzing protein-DNA interactions in various biological systems.

  3. Rice choline monooxygenase (OsCMO) protein functions in enhancing glycine betaine biosynthesis in transgenic tobacco but does not accumulate in rice (Oryza sativa L. ssp. japonica).

    PubMed

    Luo, Di; Niu, Xiangli; Yu, Jinde; Yan, Jun; Gou, Xiaojun; Lu, Bao-Rong; Liu, Yongsheng

    2012-09-01

    Glycine betaine (GB) is a compatible quaternary amine that enables plants to tolerate abiotic stresses, including salt, drought and cold. In plants, GB is synthesized from choline through two successive oxidation steps, catalyzed by choline monooxygenase (CMO) and betaine aldehyde dehydrogenase (BADH), respectively. Rice is considered a typical non-GB accumulating species, although whole-genome sequencing revealed that rice contains orthologs of both CMO and BADH. Several studies have shown that rice has a functional BADH gene, but whether the rice CMO gene (OsCMO) is functional or a pseudogene remains to be elucidated. In the present study, we report the functional characterization of the rice CMO gene. The OsCMO gene was isolated from rice cv. Nipponbare (Oryza sativa L. ssp. japonica) using RT-PCR. Northern blot demonstrated that the transcription of OsCMO is enhanced by salt stress. Overexpressing OsCMO in transgenic tobacco plants results in increased GB content and elevated tolerance to salt stress. Immunoblotting analysis demonstrates that a functional OsCMO protein with correct size was present in transgenic tobacco but rarely accumulated in wild-type rice plants. Surprisingly, a large amount of truncated proteins derived from OsCMO was induced in the rice seedlings in response to salt stresses. This suggests that it is the lack of a functional OsCMO protein that presumably results in non-GB accumulation in the tested rice plant. Expression and transgenic studies demonstrate OsCMO is transcriptionally induced in response to salt stress and functions in increasing glycinebetaine accumulation and enhancing tolerance to salt stress. Immunoblotting analysis suggests that no accumulation of glycinebetaine in the Japonica rice plant presumably results from lack of a functional OsCMO protein.

  4. Analysis of a library of macaque nuclear mitochondrial sequences confirms macaque origin of divergent sequences from old oral polio vaccine samples.

    PubMed

    Vartanian, Jean-Pierre; Wain-Hobson, Simon

    2002-05-28

    Nuclear mtDNA sequences (numts) are a widespread family of paralogs evolving as pseudogenes in chromosomal DNA [Zhang, D. E. & Hewitt, G. M. (1996) TREE 11, 247-251 and Bensasson, D., Zhang, D., Hartl, D. L. & Hewitt, G. M. (2001) TREE 16, 314-321]. When trying to identify the species origin of an unknown DNA sample by way of an mtDNA locus, PCR may amplify both mtDNA and numts. Indeed, occasionally numts dominate confounding attempts at species identification [Bensasson, D., Zhang, D. X. & Hewitt, G. M. (2000) Mol. Biol. Evol. 17, 406-415; Wallace, D. C., et al. (1997) Proc. Natl. Acad. Sci. USA 94, 14900-14905]. Rhesus and cynomolgus macaque mtDNA haplotypes were identified in a study of oral polio vaccine samples dating from the late 1950s [Blancou, P., et al. (2001) Nature (London) 410, 1045-1046]. They were accompanied by a number of putative numts. To confirm that these putative numts were of macaque origin, a library of numts corresponding to a small segment of 12S rDNA locus has been made by using DNA from a Chinese rhesus macaque. A broad distribution was found with up to 30% sequence variation. Phylogenetic analysis showed that the evolutionary trajectories of numts and bona fide mtDNA haplotypes do not overlap with the signal exception of the host species; mtDNA fragments are continually crossing over into the germ line. In the case of divergent mtDNA sequences from old oral polio vaccine samples [Blancou, P., et al. (2001) Nature (London) 410, 1045-1046], all were closely related to numts in the Chinese macaque library.

  5. The clinical phenotype of Lynch syndrome due to germline PMS2 mutations

    PubMed Central

    Senter, Leigha; Clendenning, Mark; Sotamaa, Kaisa; Hampel, Heather; Green, Jane; Potter, John D.; Lindblom, Annika; Lagerstedt, Kristina; Thibodeau, Stephen N.; Lindor, Noralane M.; Young, Joanne; Winship, Ingrid; Dowty, James G.; White, Darren M.; Hopper, John L.; Baglietto, Laura; Jenkins, Mark A.; de la Chapelle, Albert

    2009-01-01

    Background and Aims Although the clinical phenotype of Lynch syndrome (also known as Hereditary Nonpolyposis Colorectal Cancer) has been well described, little is known about disease in PMS2 mutation carriers. Now that mutation detection methods can discern mutations in PMS2 from mutations in its pseudogenes, more mutation carriers have been identified. Information about the clinical significance of PMS2 mutations is crucial for appropriate counseling. Here, we report the clinical characteristics of a large series of PMS2 mutation carriers. Methods We performed PMS2 mutation analysis using long range PCR and MLPA for 99 probands diagnosed with Lynch syndrome-associated tumors showing isolated loss of PMS2 by immunohistochemistry. Penetrance was calculated using a modified segregation analysis adjusting for ascertainment. Results Germline PMS2 mutations were detected in 62% of probands (n = 55 monoallelic; 6 biallelic). Among families with monoallelic PMS2 mutations, 65.5% met revised Bethesda guidelines. Compared with the general population, in mutation carriers, the incidence of colorectal cancer was 5.2 fold higher and the incidence of endometrial cancer was 7.5 fold higher. In North America, this translates to a cumulative cancer risk to age 70 of 15–20% for colorectal cancer, 15% for endometrial cancer, and 25–32% for any Lynch syndrome-associated cancer. No elevated risk for non-Lynch syndrome-associated cancers was observed. Conclusions PMS2 mutations contribute significantly to Lynch syndrome but the penetrance for monoallelic mutation carriers appears to be lower than that for the other mismatch repair genes. Modified counseling and cancer surveillance guidelines for PMS2 mutation carriers are proposed. PMID:18602922
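
The 62% detection rate reported in this abstract follows directly from the carrier counts it gives. A minimal Python check, using only figures quoted in the abstract (the variable names are illustrative and not taken from the study):

```python
# Counts quoted in the abstract: 99 probands, of whom 55 carried
# monoallelic and 6 carried biallelic germline PMS2 mutations.
probands = 99
monoallelic = 55
biallelic = 6

# Total carriers and the fraction of probands in which a mutation was found.
carriers = monoallelic + biallelic
detection_rate = carriers / probands

print(f"{carriers} of {probands} probands = {detection_rate:.1%}")
```

The unrounded value is slightly below the quoted figure, which is consistent with the abstract rounding to the nearest whole percent.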

  6. The clinical phenotype of Lynch syndrome due to germ-line PMS2 mutations.

    PubMed

    Senter, Leigha; Clendenning, Mark; Sotamaa, Kaisa; Hampel, Heather; Green, Jane; Potter, John D; Lindblom, Annika; Lagerstedt, Kristina; Thibodeau, Stephen N; Lindor, Noralane M; Young, Joanne; Winship, Ingrid; Dowty, James G; White, Darren M; Hopper, John L; Baglietto, Laura; Jenkins, Mark A; de la Chapelle, Albert

    2008-08-01

    Although the clinical phenotype of Lynch syndrome (also known as hereditary nonpolyposis colorectal cancer) has been well described, little is known about disease in PMS2 mutation carriers. Now that mutation detection methods can discern mutations in PMS2 from mutations in its pseudogenes, more mutation carriers have been identified. Information about the clinical significance of PMS2 mutations is crucial for appropriate counseling. Here, we report the clinical characteristics of a large series of PMS2 mutation carriers. We performed PMS2 mutation analysis using long-range polymerase chain reaction and multiplex ligation-dependent probe amplification for 99 probands diagnosed with Lynch syndrome-associated tumors showing isolated loss of PMS2 by immunohistochemistry. Penetrance was calculated using a modified segregation analysis adjusting for ascertainment. Germ-line PMS2 mutations were detected in 62% of probands (n = 55 monoallelic; 6 biallelic). Among families with monoallelic PMS2 mutations, 65.5% met revised Bethesda guidelines. Compared with the general population, in mutation carriers, the incidence of colorectal cancer was 5.2-fold higher, and the incidence of endometrial cancer was 7.5-fold higher. In North America, this translates to a cumulative cancer risk to age 70 years of 15%-20% for colorectal cancer, 15% for endometrial cancer, and 25%-32% for any Lynch syndrome-associated cancer. No elevated risk for non-Lynch syndrome-associated cancers was observed. PMS2 mutations contribute significantly to Lynch syndrome, but the penetrance for monoallelic mutation carriers appears to be lower than that for the other mismatch repair genes. Modified counseling and cancer surveillance guidelines for PMS2 mutation carriers are proposed.

  7. Sequence Diversity of Pan troglodytes Subspecies and the Impact of WFDC6 Selective Constraints in Reproductive Immunity

    PubMed Central

    Ferreira, Zélia; Hurle, Belen; Andrés, Aida M.; Kretzschmar, Warren W.; Mullikin, James C.; Cherukuri, Praveen F.; Cruz, Pedro; Gonder, Mary Katherine; Stone, Anne C.; Tishkoff, Sarah; Swanson, Willie J.; Green, Eric D.; Clark, Andrew G.; Seixas, Susana

    2013-01-01

    Recent efforts have attempted to describe the population structure of common chimpanzee, focusing on four subspecies: Pan troglodytes verus, P. t. ellioti, P. t. troglodytes, and P. t. schweinfurthii. However, few studies have pursued the effects of natural selection in shaping their response to pathogens and reproduction. Whey acidic protein (WAP) four-disulfide core domain (WFDC) genes and neighboring semenogelin (SEMG) genes encode proteins with combined roles in immunity and fertility. They display a strikingly high rate of amino acid replacement (dN/dS), indicative of adaptive pressures during primate evolution. In human populations, three signals of selection at the WFDC locus were described, possibly influencing the proteolytic profile and antimicrobial activities of the male reproductive tract. To evaluate the patterns of genomic variation and selection at the WFDC locus in chimpanzees, we sequenced 17 WFDC genes and 47 autosomal pseudogenes in 68 chimpanzees (15 P. t. troglodytes, 22 P. t. verus, and 31 P. t. ellioti). We found a clear differentiation of P. t. verus and estimated the divergence of P. t. troglodytes and P. t. ellioti subspecies in 0.173 Myr; further, at the WFDC locus we identified a signature of strong selective constraints common to the three subspecies in WFDC6—a recent paralog of the epididymal protease inhibitor EPPIN. Overall, chimpanzees and humans do not display similar footprints of selection across the WFDC locus, possibly due to different selective pressures between the two species related to immune response and reproductive biology. PMID:24356879

  8. Rapid bursts of androgen-binding protein (Abp) gene duplication occurred independently in diverse mammals

    PubMed Central

    2008-01-01

    Background The draft mouse (Mus musculus) genome sequence revealed an unexpected proliferation of gene duplicates encoding a family of secretoglobin proteins including the androgen-binding protein (ABP) α, β and γ subunits. Further investigation of 14 α-like (Abpa) and 13 β- or γ-like (Abpbg) undisrupted gene sequences revealed a rich diversity of developmental stage-, sex- and tissue-specific expression. Despite these studies, our understanding of the evolution of this gene family remains incomplete. Questions arise from imperfections in the initial mouse genome assembly and a dearth of information about the gene family structure in other rodents and mammals. Results Here, we interrogate the latest 'finished' mouse (Mus musculus) genome sequence assembly to show that the Abp gene repertoire is, in fact, twice as large as reported previously, with 30 Abpa and 34 Abpbg genes and pseudogenes. All of these have arisen since the last common ancestor with rat (Rattus norvegicus). We then demonstrate, by sequencing homologs from species within the Mus genus, that this burst of gene duplication occurred very recently, within the past seven million years. Finally, we survey Abp orthologs in genomes from across the mammalian clade and show that bursts of Abp gene duplications are not specific to the murid rodents; they also occurred recently in the lagomorph (rabbit, Oryctolagus cuniculus) and ruminant (cattle, Bos taurus) lineages, although not in other mammalian taxa. Conclusion We conclude that Abp genes have undergone repeated bursts of gene duplication and adaptive sequence diversification driven by these genes' participation in chemosensation and/or sexual identification. PMID:18269759

  9. Duplication and selection in the evolution of primate β-defensin genes

    PubMed Central

    Semple, Colin AM; Rolfe, Mark; Dorin, Julia R

    2003-01-01

    Background Innate immunity is the first line of defense against microorganisms in vertebrates and acts by providing an initial barrier to microorganisms and triggering adaptive immune responses. Peptides such as β-defensins are an important component of this defense, providing a broad spectrum of antimicrobial activity against bacteria, fungi, mycobacteria and several enveloped viruses. β-defensins are small cationic peptides that vary in their expression patterns and spectrum of pathogen specificity. Disruptions in β-defensin function have been implicated in human diseases, including cystic fibrosis, and a fuller understanding of the variety, function and evolution of human β-defensins might form the basis for novel therapies. Here we use a combination of laboratory and computational techniques to characterize the main human β-defensin locus on chromosome 8p22-p23. Results In addition to known genes in the region we report the genomic structures and expression patterns of four novel human β-defensin genes and a related pseudogene. These genes show an unusual pattern of evolution, with rapid divergence between second exon sequences that encode the mature β-defensin peptides matched by relative stasis in first exons that encode signal peptides. Conclusions We conclude that the 8p22-p23 locus has evolved by successive rounds of duplication followed by substantial divergence involving positive selection, to produce a diverse cluster of paralogous genes established before the human-baboon divergence more than 23 million years ago. Positive selection, disproportionately favoring alterations in the charge of amino-acid residues, is implicated as driving second exon divergence in these genes. PMID:12734011

  10. Rapid bursts of androgen-binding protein (Abp) gene duplication occurred independently in diverse mammals.

    PubMed

    Laukaitis, Christina M; Heger, Andreas; Blakley, Tyler D; Munclinger, Pavel; Ponting, Chris P; Karn, Robert C

    2008-02-12

    The draft mouse (Mus musculus) genome sequence revealed an unexpected proliferation of gene duplicates encoding a family of secretoglobin proteins including the androgen-binding protein (ABP) alpha, beta and gamma subunits. Further investigation of 14 alpha-like (Abpa) and 13 beta- or gamma-like (Abpbg) undisrupted gene sequences revealed a rich diversity of developmental stage-, sex- and tissue-specific expression. Despite these studies, our understanding of the evolution of this gene family remains incomplete. Questions arise from imperfections in the initial mouse genome assembly and a dearth of information about the gene family structure in other rodents and mammals. Here, we interrogate the latest 'finished' mouse (Mus musculus) genome sequence assembly to show that the Abp gene repertoire is, in fact, twice as large as reported previously, with 30 Abpa and 34 Abpbg genes and pseudogenes. All of these have arisen since the last common ancestor with rat (Rattus norvegicus). We then demonstrate, by sequencing homologs from species within the Mus genus, that this burst of gene duplication occurred very recently, within the past seven million years. Finally, we survey Abp orthologs in genomes from across the mammalian clade and show that bursts of Abp gene duplications are not specific to the murid rodents; they also occurred recently in the lagomorph (rabbit, Oryctolagus cuniculus) and ruminant (cattle, Bos taurus) lineages, although not in other mammalian taxa. We conclude that Abp genes have undergone repeated bursts of gene duplication and adaptive sequence diversification driven by these genes' participation in chemosensation and/or sexual identification.

  11. Pseudogenization of a Sweet-Receptor Gene Accounts for Cats' Indifference toward Sugar

    PubMed Central

    Li, Xia; Li, Weihua; Wang, Hong; Cao, Jie; Maehashi, Kenji; Huang, Liquan; Bachmanov, Alexander A; Reed, Danielle R; Legrand-Defretin, Véronique; Beauchamp, Gary K; Brand, Joseph G

    2005-01-01

    Although domestic cats (Felis silvestris catus) possess an otherwise functional sense of taste, they, unlike most mammals, do not prefer and may be unable to detect the sweetness of sugars. One possible explanation for this behavior is that cats lack the sensory system to taste sugars and therefore are indifferent to them. Drawing on work in mice, demonstrating that alleles of sweet-receptor genes predict low sugar intake, we examined the possibility that genes involved in the initial transduction of sweet perception might account for the indifference to sweet-tasting foods by cats. We characterized the sweet-receptor genes of domestic cats as well as those of other members of the Felidae family of obligate carnivores, tiger and cheetah. Because the mammalian sweet-taste receptor is formed by the dimerization of two proteins (T1R2 and T1R3; gene symbols Tas1r2 and Tas1r3), we identified and sequenced both genes in the cat by screening a feline genomic BAC library and by performing PCR with degenerate primers on cat genomic DNA. Gene expression was assessed by RT-PCR of taste tissue, in situ hybridization, and immunohistochemistry. The cat Tas1r3 gene shows high sequence similarity with functional Tas1r3 genes of other species. Message from Tas1r3 was detected by RT-PCR of taste tissue. In situ hybridization and immunohistochemical studies demonstrate that Tas1r3 is expressed, as expected, in taste buds. However, the cat Tas1r2 gene shows a 247-base pair microdeletion in exon 3 and stop codons in exons 4 and 6. There was no evidence of detectable mRNA from cat Tas1r2 by RT-PCR or in situ hybridization, and no evidence of protein expression by immunohistochemistry. Tas1r2 in tiger and cheetah and in six healthy adult domestic cats all show the similar deletion and stop codons. We conclude that cat Tas1r3 is an apparently functional and expressed receptor but that cat Tas1r2 is an unexpressed pseudogene. A functional sweet-taste receptor heteromer cannot form, and thus the cat lacks the receptor likely necessary for detection of sweet stimuli. This molecular change was very likely an important event in the evolution of the cat's carnivorous behavior. PMID:16103917

  12. Development of a Bioinformatics Framework for the Detection of Gene Conversion and the Analysis of Combinatorial Diversity in Immunoglobulin Heavy Chains in Four Cattle Breeds.

    PubMed

    Walther, Stefanie; Tietze, Manfred; Czerny, Claus-Peter; König, Sven; Diesterbeck, Ulrike S

    2016-01-01

    We have developed a new bioinformatics framework for the analysis of rearranged bovine heavy chain immunoglobulin (Ig) variable regions by combining and refining widely used alignment algorithms. This bioinformatics framework allowed us to investigate alignments of heavy chain framework regions (FRHs) and the separate alignments of FRHs and heavy chain complementarity determining regions (CDRHs) to determine their germline origin in the four cattle breeds Aubrac, German Black Pied, German Simmental, and Holstein Friesian. Now it is also possible to specifically analyze Ig heavy chains possessing exceptionally long CDR3Hs. In order to gain more insight into breed specific differences in Ig combinatorial diversity, somatic hypermutations and putative gene conversions of IgG, we compared the dominantly transcribed variable (IGHV), diversity (IGHD), and joining (IGHJ) segments and their recombination in the four cattle breeds. The analysis revealed the use of 15 different IGHV segments, 21 IGHD segments, and two IGHJ segments with significant different transcription levels within the breeds. Furthermore, there are preferred rearrangements within the three groups of CDR3H lengths. In the sequences of group 2 (CDR3H lengths (L) of 11-47 amino acid residues (aa)) a higher number of recombination was observed than in sequences of group 1 (L≤10 aa) and 3 (L≥48 aa). The combinatorial diversity of germline IGHV, IGHD, and IGHJ-segments revealed 162 rearrangements that were significantly different. The few preferably rearranged gene segments within group 3 CDR3H regions may indicate specialized antibodies because this length is unique in cattle. The most important finding of this study, which was enabled by using the bioinformatics framework, is the discovery of strong evidence for gene conversion as a rare event using pseudogenes fulfilling all definitions for this particular diversification mechanism.
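
The three CDR3H length groups defined in this abstract (group 1: L≤10 aa, group 2: 11-47 aa, group 3: L≥48 aa) can be expressed as a simple classifier. The sketch below is illustrative only: the function names and the example lengths are assumptions, and it is not the authors' framework.

```python
from collections import Counter


def cdr3h_group(length_aa: int) -> int:
    """Return the CDR3H length group (1, 2 or 3) using the abstract's thresholds."""
    if length_aa < 0:
        raise ValueError("CDR3H length cannot be negative")
    if length_aa <= 10:   # group 1: L <= 10 aa
        return 1
    if length_aa <= 47:   # group 2: 11-47 aa
        return 2
    return 3              # group 3: L >= 48 aa


def tally_groups(lengths):
    """Count how many sequences fall into each length group."""
    return Counter(cdr3h_group(n) for n in lengths)


# Hypothetical CDR3H lengths in amino acid residues, for illustration only.
example_lengths = [8, 10, 11, 23, 47, 48, 61]
print(tally_groups(example_lengths))
```

Keeping the thresholds in one function makes the boundary cases (10, 11, 47 and 48 aa) easy to verify against the definitions in the abstract.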

  13. Genome and transcriptome adaptation accompanying emergence of the definitive type 2 host-restricted Salmonella enterica serovar Typhimurium pathovar.

    PubMed

    Kingsley, Robert A; Kay, Sally; Connor, Thomas; Barquist, Lars; Sait, Leanne; Holt, Kathryn E; Sivaraman, Karthi; Wileman, Thomas; Goulding, David; Clare, Simon; Hale, Christine; Seshasayee, Aswin; Harris, Simon; Thomson, Nicholas R; Gardner, Paul; Rabsch, Wolfgang; Wigley, Paul; Humphrey, Tom; Parkhill, Julian; Dougan, Gordon

    2013-08-27

    Salmonella enterica serovar Typhimurium definitive type 2 (DT2) is host restricted to Columba livia (rock or feral pigeon) but is also closely related to S. Typhimurium isolates that circulate in livestock and cause a zoonosis characterized by gastroenteritis in humans. DT2 isolates formed a distinct phylogenetic cluster within S. Typhimurium based on whole-genome-sequence polymorphisms. Comparative genome analysis of DT2 94-213 and S. Typhimurium SL1344, DT104, and D23580 identified few differences in gene content with the exception of variations within prophages. However, DT2 94-213 harbored 22 pseudogenes that were intact in other closely related S. Typhimurium strains. We report a novel in silico approach to identify single amino acid substitutions in proteins that have a high probability of a functional impact. One polymorphism identified using this method, a single-residue deletion in the Tar protein, abrogated chemotaxis to aspartate in vitro. DT2 94-213 also exhibited an altered transcriptional profile in response to culture at 42°C compared to that of SL1344. Such differentially regulated genes included a number involved in flagellum biosynthesis and motility. IMPORTANCE Whereas Salmonella enterica serovar Typhimurium can infect a wide range of animal species, some variants within this serovar exhibit a more limited host range and altered disease potential. Phylogenetic analysis based on whole-genome sequences can identify lineages associated with specific virulence traits, including host adaptation. This study represents one of the first to link pathogen-specific genetic signatures, including coding capacity, genome degradation, and transcriptional responses to host adaptation within a Salmonella serovar. We performed comparative genome analysis of reference and pigeon-adapted definitive type 2 (DT2) S. Typhimurium isolates alongside phenotypic and transcriptome analyses, to identify genetic signatures linked to host adaptation within the DT2 lineage.

  14. Genome sequences and comparative genomics of two Lactobacillus ruminis strains from the bovine and human intestinal tracts

    PubMed Central

    2011-01-01

    Background The genus Lactobacillus is characterized by an extraordinary degree of phenotypic and genotypic diversity, which recent genomic analyses have further highlighted. However, the choice of species for sequencing has been non-random and unequal in distribution, with only a single representative genome from the L. salivarius clade available to date. Furthermore, there is no data to facilitate a functional genomic analysis of motility in the lactobacilli, a trait that is restricted to the L. salivarius clade. Results The 2.06 Mb genome of the bovine isolate Lactobacillus ruminis ATCC 27782 comprises a single circular chromosome, and has a G+C content of 44.4%. In silico analysis identified 1901 coding sequences, including genes for a pediocin-like bacteriocin, a single large exopolysaccharide-related cluster, two sortase enzymes, two CRISPR loci and numerous IS elements and pseudogenes. A cluster of genes related to a putative pilin was identified, and shown to be transcribed in vitro. A high quality draft assembly of the genome of a second L. ruminis strain, ATCC 25644 isolated from humans, suggested a slightly larger genome of 2.138 Mb, that exhibited a high degree of synteny with the ATCC 27782 genome. In contrast, comparative analysis of L. ruminis and L. salivarius identified a lack of long-range synteny between these closely related species. Comparison of the L. salivarius clade core proteins with those of nine other Lactobacillus species distributed across 4 major phylogenetic groups identified the set of shared proteins, and proteins unique to each group. Conclusions The genome of L. ruminis provides a comparative tool for directing functional analyses of other members of the L. salivarius clade, and it increases understanding of the divergence of this distinct Lactobacillus lineage from other commensal lactobacilli. The genome sequence provides a definitive resource to facilitate investigation of the genetics, biochemistry and host interactions of these motile intestinal lactobacilli. PMID:21995554

  15. Development of a Bioinformatics Framework for the Detection of Gene Conversion and the Analysis of Combinatorial Diversity in Immunoglobulin Heavy Chains in Four Cattle Breeds

    PubMed Central

    Czerny, Claus-Peter; König, Sven; Diesterbeck, Ulrike S.

    2016-01-01

    We have developed a new bioinformatics framework for the analysis of rearranged bovine heavy chain immunoglobulin (Ig) variable regions by combining and refining widely used alignment algorithms. This bioinformatics framework allowed us to investigate alignments of heavy chain framework regions (FRHs) and the separate alignments of FRHs and heavy chain complementarity determining regions (CDRHs) to determine their germline origin in the four cattle breeds Aubrac, German Black Pied, German Simmental, and Holstein Friesian. Now it is also possible to specifically analyze Ig heavy chains possessing exceptionally long CDR3Hs. In order to gain more insight into breed specific differences in Ig combinatorial diversity, somatic hypermutations and putative gene conversions of IgG, we compared the dominantly transcribed variable (IGHV), diversity (IGHD), and joining (IGHJ) segments and their recombination in the four cattle breeds. The analysis revealed the use of 15 different IGHV segments, 21 IGHD segments, and two IGHJ segments with significant different transcription levels within the breeds. Furthermore, there are preferred rearrangements within the three groups of CDR3H lengths. In the sequences of group 2 (CDR3H lengths (L) of 11–47 amino acid residues (aa)) a higher number of recombination was observed than in sequences of group 1 (L≤10 aa) and 3 (L≥48 aa). The combinatorial diversity of germline IGHV, IGHD, and IGHJ-segments revealed 162 rearrangements that were significantly different. The few preferably rearranged gene segments within group 3 CDR3H regions may indicate specialized antibodies because this length is unique in cattle. The most important finding of this study, which was enabled by using the bioinformatics framework, is the discovery of strong evidence for gene conversion as a rare event using pseudogenes fulfilling all definitions for this particular diversification mechanism. PMID:27828971

  16. Genetic and physical mapping of 2q35 in the region of NRAMP and IL8R genes: Identification of a polymorphic repeat in exon 2 of NRAMP

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    White, J.K.; Shaw, M.A.; Barton, C.H.

    1994-11-15

    Recent interest has focused on the region of conserved synteny between mouse chromosome 1 and human 2q33-q37, particularly over the region encoding the murine macrophage resistance gene Ity/Lsh/Bcg (candidate Nramp) and members of the Il8r interleukin-8 (IL8) receptor gene cluster. In this paper, identification of a restriction fragment length polymorphism in the Il8RB gene in 35 pedigrees previously typed for markers in the 2q33-37 interval provided evidence (lod scores > 3) for linkage between Il8RB and the 2q34-q35 markers FN1, TNP1, VIL1, and DES. Physical mapping, using yeast artificial chromosomes isolated with VIL1, confirmed that IL8RA, IL8RB and the IL8RB pseudogene map within the NRAMP-VIL1 interval, with the physical distance (155 kb) from 5' LSH to 3' VIL1 representing ~3-fold that observed in the mouse. Partial sequencing of NRAMP confirmed the presence of the N-terminal proline/serine-rich putative SH3 binding domain in exon 2 of the human gene. Further analysis of Brazilian leprosy and visceral leishmaniasis pedigrees identified a rare second allele varying in a 9-nucleotide repeat motif of the exon 2 sequence but segregating independently of the disease phenotype. 38 refs., 4 figs., 3 tabs.

  17. Allying with armored snails: the complete genome of gammaproteobacterial endosymbiont.

    PubMed

    Nakagawa, Satoshi; Shimamura, Shigeru; Takaki, Yoshihiro; Suzuki, Yohey; Murakami, Shun-ichi; Watanabe, Tamaki; Fujiyoshi, So; Mino, Sayaka; Sawabe, Tomoo; Maeda, Takahiro; Makita, Hiroko; Nemoto, Suguru; Nishimura, Shin-Ichiro; Watanabe, Hiromi; Watsuji, Tomo-o; Takai, Ken

    2014-01-01

    Deep-sea vents harbor dense populations of various animals that have their specific symbiotic bacteria. Scaly-foot gastropods, which are snails with mineralized scales covering the sides of its foot, have a gammaproteobacterial endosymbiont in their enlarged esophageal glands and diverse epibionts on the surface of their scales. In this study, we report the complete genome sequencing of gammaproteobacterial endosymbiont. The endosymbiont genome displays features consistent with ongoing genome reduction such as large proportions of pseudogenes and insertion elements. The genome encodes functions commonly found in deep-sea vent chemoautotrophs such as sulfur oxidation and carbon fixation. Stable carbon isotope (13C)-labeling experiments confirmed the endosymbiont chemoautotrophy. The genome also includes an intact hydrogenase gene cluster that potentially has been horizontally transferred from phylogenetically distant bacteria. Notable findings include the presence and transcription of genes for flagellar assembly, through which proteins are potentially exported from bacterium to the host. Symbionts of snail individuals exhibited extreme genetic homogeneity, showing only two synonymous changes in 19 different genes (13 810 positions in total) determined for 32 individual gastropods collected from a single colony at one time. The extremely low genetic individuality in endosymbionts probably reflects that the stringent symbiont selection by host prevents the random genetic drift in the small population of horizontally transmitted symbiont. This study is the first complete genome analysis of gastropod endosymbiont and offers an opportunity to study genome evolution in a recently evolved endosymbiont.

  18. Gene expression profile and immunological evaluation of unique hypothetical unknown proteins of Mycobacterium leprae by using quantitative real-time PCR.

    PubMed

    Kim, Hee Jin; Prithiviraj, Kalyani; Groathouse, Nathan; Brennan, Patrick J; Spencer, John S

    2013-02-01

    The cell-mediated immunity (CMI)-based in vitro gamma interferon release assay (IGRA) of Mycobacterium leprae-specific antigens has potential as a promising diagnostic means to detect those individuals in the early stages of M. leprae infection. Diagnosis of leprosy is a major obstacle toward ultimate disease control and has been compromised in the past by the lack of specific markers. Comparative bioinformatic analysis among mycobacterial genomes identified potential M. leprae-specific proteins called "hypothetical unknowns." Due to massive gene decay and the prevalence of pseudogenes, it is unclear whether any of these proteins are expressed or are immunologically relevant. In this study, we performed cDNA-based quantitative real-time PCR to investigate the expression status of 131 putative open reading frames (ORFs) encoding hypothetical unknowns. Twenty-six of the M. leprae-specific antigen candidates showed significant levels of gene expression compared to that of ESAT-6 (ML0049), which is an important T cell antigen of low abundance in M. leprae. Fifteen of 26 selected antigen candidates were expressed and purified in Escherichia coli. The seroreactivity to these proteins of pooled sera from lepromatous leprosy patients and cavitary tuberculosis patients revealed that 9 of 15 recombinant hypothetical unknowns elicited M. leprae-specific immune responses. These nine proteins may be good diagnostic reagents to improve both the sensitivity and specificity of detection of individuals with asymptomatic leprosy.

  19. Gene Expression Profile and Immunological Evaluation of Unique Hypothetical Unknown Proteins of Mycobacterium leprae by Using Quantitative Real-Time PCR

    PubMed Central

    Prithiviraj, Kalyani; Groathouse, Nathan; Brennan, Patrick J.; Spencer, John S.

    2013-01-01

    The cell-mediated immunity (CMI)-based in vitro gamma interferon release assay (IGRA) of Mycobacterium leprae-specific antigens has potential as a promising diagnostic means to detect those individuals in the early stages of M. leprae infection. Diagnosis of leprosy is a major obstacle toward ultimate disease control and has been compromised in the past by the lack of specific markers. Comparative bioinformatic analysis among mycobacterial genomes identified potential M. leprae-specific proteins called “hypothetical unknowns.” Due to massive gene decay and the prevalence of pseudogenes, it is unclear whether any of these proteins are expressed or are immunologically relevant. In this study, we performed cDNA-based quantitative real-time PCR to investigate the expression status of 131 putative open reading frames (ORFs) encoding hypothetical unknowns. Twenty-six of the M. leprae-specific antigen candidates showed significant levels of gene expression compared to that of ESAT-6 (ML0049), which is an important T cell antigen of low abundance in M. leprae. Fifteen of 26 selected antigen candidates were expressed and purified in Escherichia coli. The seroreactivity to these proteins of pooled sera from lepromatous leprosy patients and cavitary tuberculosis patients revealed that 9 of 15 recombinant hypothetical unknowns elicited M. leprae-specific immune responses. These nine proteins may be good diagnostic reagents to improve both the sensitivity and specificity of detection of individuals with asymptomatic leprosy. PMID:23239802

  20. QuickMap: a public tool for large-scale gene therapy vector insertion site mapping and analysis.

    PubMed

    Appelt, J-U; Giordano, F A; Ecker, M; Roeder, I; Grund, N; Hotz-Wagenblatt, A; Opelz, G; Zeller, W J; Allgayer, H; Fruehauf, S; Laufs, S

    2009-07-01

    Several events of insertional mutagenesis in pre-clinical and clinical gene therapy studies have created intense interest in assessing the genomic insertion profiles of gene therapy vectors. For the construction of such profiles, vector-flanking sequences detected by inverse PCR, linear amplification-mediated-PCR or ligation-mediated-PCR need to be mapped to the host cell's genome and compared to a reference set. Although remarkable progress has been achieved in mapping gene therapy vector insertion sites, public reference sets are lacking, as are the possibilities to quickly detect non-random patterns in experimental data. We developed a tool termed QuickMap, which uniformly maps and analyzes human and murine vector-flanking sequences within seconds (available at www.gtsg.org). Besides information about hits in chromosomes and fragile sites, QuickMap automatically determines insertion frequencies in +/- 250 kb adjacency to genes, cancer genes, pseudogenes, transcription factor and (post-transcriptional) miRNA binding sites, CpG islands and repetitive elements (short interspersed nuclear elements (SINE), long interspersed nuclear elements (LINE), Type II elements and LTR elements). Additionally, all experimental frequencies are compared with the data obtained from a reference set, containing 1 000 000 random integrations ('random set'). Thus, for the first time a tool allowing high-throughput profiling of gene therapy vector insertion sites is available. It provides a basis for large-scale insertion site analyses, which is now urgently needed to discover novel gene therapy vectors with 'safe' insertion profiles.
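
    The window-based comparison described in this abstract can be illustrated with a minimal sketch. It is not QuickMap's implementation: the function name `fraction_near`, the treatment of each feature as a single coordinate, and the toy coordinates are all assumptions made for illustration. It computes the fraction of insertion sites lying within +/- 250 kb of any feature, which can then be computed the same way for a random reference set and compared.

```python
from bisect import bisect_left

WINDOW = 250_000  # +/- 250 kb, the adjacency window named in the abstract


def fraction_near(sites, features, window=WINDOW):
    """Return the fraction of sites within `window` bp of any feature.

    `sites` and `features` are lists of (chromosome, position) tuples.
    Treating a feature as one coordinate is a simplifying assumption.
    """
    by_chrom = {}
    for chrom, pos in features:
        by_chrom.setdefault(chrom, []).append(pos)
    for positions in by_chrom.values():
        positions.sort()

    hits = 0
    for chrom, pos in sites:
        positions = by_chrom.get(chrom, [])
        # First feature at or after the left edge of the window.
        i = bisect_left(positions, pos - window)
        if i < len(positions) and positions[i] <= pos + window:
            hits += 1
    return hits / len(sites) if sites else 0.0


# Hypothetical toy data, not taken from any real experiment.
features = [("chr1", 1_000_000), ("chr2", 5_000_000)]
experimental = [("chr1", 1_200_000), ("chr1", 2_000_000),
                ("chr2", 4_750_000), ("chr3", 100)]
random_set = [("chr1", 3_000_000), ("chr2", 5_100_000),
              ("chr2", 9_000_000), ("chr3", 500)]

exp_fraction = fraction_near(experimental, features)
rnd_fraction = fraction_near(random_set, features)
print(exp_fraction, rnd_fraction)
```

    A real comparison would use a far larger random set (the abstract mentions 1 000 000 random integrations) and a statistical test rather than a direct comparison of two fractions.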

  1. Diversity of killer cell immunoglobulin-like receptor genes in Indonesian populations of Java, Kalimantan, Timor and Irian Jaya.

    PubMed

    Velickovic, M; Velickovic, Z; Panigoro, R; Dunckley, H

    2009-01-01

    Killer cell immunoglobulin-like receptors (KIRs) regulate the activity of natural killer and T cells through interactions with specific human leucocyte antigen class I molecules on target cells. Population studies performed over the last several years have established that KIR gene frequencies (GFs) and genotype content vary considerably among different ethnic groups, indicating the extent of KIR diversity, some of which have also shown the effect of the presence or absence of specific KIR genes in human disease. We have determined the frequencies of 16 KIR genes and pseudogenes and genotypes in 193 Indonesian individuals from Java, East Timor, Irian Jaya (western half of the island of New Guinea) and Kalimantan provinces of Indonesian Borneo. All 16 KIR genes were observed in all four populations. Variation in GFs between populations was observed, except for KIR2DL4, KIR3DL2, KIR3DL3, KIR2DP1 and KIR3DP1 genes, which were present in every individual tested. When comparing KIR GFs between populations, both principal component analysis and a phylogenetic tree showed close clustering of the Kalimantan and Javanese populations, while Irianese populations were clearly separated from the other three populations. Our results indicate a high level of KIR polymorphism in Indonesian populations that probably reflects the large geographical spread of the Indonesian archipelago and the complex evolutionary history and population migration in this region.

  2. Improved multiplex ligation-dependent probe amplification analysis identifies a deleterious PMS2 allele generated by recombination with crossover between PMS2 and PMS2CL.

    PubMed

    Wernstedt, Annekatrin; Valtorta, Emanuele; Armelao, Franco; Togni, Roberto; Girlando, Salvatore; Baudis, Michael; Heinimann, Karl; Messiaen, Ludwine; Staehli, Noemie; Zschocke, Johannes; Marra, Giancarlo; Wimmer, Katharina

    2012-09-01

    Heterozygous PMS2 germline mutations are associated with Lynch syndrome. Up to one third of these mutations are genomic deletions. Their detection is complicated by a pseudogene (PMS2CL), which--owing to extensive interparalog sequence exchange--closely resembles PMS2 downstream of exon 12. A recently redesigned multiplex ligation-dependent probe amplification (MLPA) assay identifies PMS2 copy number alterations with improved reliability when used with reference DNAs containing equal numbers of PMS2- and PMS2CL-specific sequences. We selected eight such reference samples--all publicly available--and used them with this assay to study 13 patients with PMS2-defective colorectal tumors. Three presented deleterious alterations: an Alu-mediated exon deletion; a 125-kb deletion encompassing PMS2 and four additional genes (two with tumor-suppressing functions); and a novel deleterious hybrid PMS2 allele produced by recombination with crossover between PMS2 and PMS2CL, with the breakpoint in intron 10 (the most 5' breakpoint of its kind reported thus far). We discuss mechanisms that might generate this allele in different chromosomal configurations (and their diagnostic implications) and describe an allele-specific PCR assay that facilitates its detection. Our data indicate that the redesigned PMS2 MLPA assay is a valid first-line option. In our series, it identified roughly a quarter of all PMS2 mutations. Copyright © 2012 Wiley Periodicals, Inc.

  3. Improved Multiplex Ligation-Dependent Probe Amplification Analysis Identifies a Deleterious PMS2 Allele Generated by Recombination with Crossover Between PMS2 and PMS2CL

    PubMed Central

    Wernstedt, Annekatrin; Valtorta, Emanuele; Armelao, Franco; Togni, Roberto; Girlando, Salvatore; Baudis, Michael; Heinimann, Karl; Messiaen, Ludwine; Staehli, Noemie; Zschocke, Johannes; Marra, Giancarlo; Wimmer, Katharina

    2012-01-01

    Heterozygous PMS2 germline mutations are associated with Lynch syndrome. Up to one third of these mutations are genomic deletions. Their detection is complicated by a pseudogene (PMS2CL), which – owing to extensive interparalog sequence exchange – closely resembles PMS2 downstream of exon 12. A recently redesigned multiplex ligation-dependent probe amplification (MLPA) assay identifies PMS2 copy number alterations with improved reliability when used with reference DNAs containing equal numbers of PMS2- and PMS2CL-specific sequences. We selected eight such reference samples – all publicly available – and used them with this assay to study 13 patients with PMS2-defective colorectal tumors. Three presented deleterious alterations: an Alu-mediated exon deletion; a 125-kb deletion encompassing PMS2 and four additional genes (two with tumor-suppressing functions); and a novel deleterious hybrid PMS2 allele produced by recombination with crossover between PMS2 and PMS2CL, with the breakpoint in intron 10 (the most 5′ breakpoint of its kind reported thus far). We discuss mechanisms that might generate this allele in different chromosomal configurations (and their diagnostic implications) and describe an allele-specific PCR assay that facilitates its detection. Our data indicate that the redesigned PMS2 MLPA assay is a valid first-line option. In our series, it identified roughly a quarter of all PMS2 mutations. © 2012 Wiley Periodicals, Inc. PMID:22585707

  4. The complete chloroplast genome of Gentiana straminea (Gentianaceae), an endemic species to the Sino-Himalayan subregion.

    PubMed

    Ni, Lianghong; Zhao, Zhili; Xu, Hongxi; Chen, Shilin; Dorje, Gaawe

    2016-02-15

    Endemic to the Sino-Himalayan subregion, the medicinal alpine plant Gentiana straminea is a threatened species. Genetic and molecular data on it are scarce. Here we report the complete chloroplast (cp) genome sequence of G. straminea, the first sequenced member of the family Gentianaceae. The cp genome is 148,991 bp in length, including a large single copy (LSC) region of 81,240 bp, a small single copy (SSC) region of 17,085 bp and a pair of inverted repeats (IRs) of 25,333 bp each. It contains 112 unique genes, including 78 protein-coding genes, 30 tRNAs and 4 rRNAs. The rps16 gene lacks exon 2 between trnK-UUU and trnQ-UUG, making it the first rps16 pseudogene found in nonparasitic plants of the Asterids clade. Sequence analysis revealed the presence of 13 forward repeats, 13 palindrome repeats and 39 simple sequence repeats (SSRs). An entire cp genome comparison study of G. straminea and four other species in Gentianales was carried out. Phylogenetic analyses using maximum likelihood (ML) and maximum parsimony (MP) were performed based on 69 protein-coding genes from 36 species of Asterids. The results strongly supported the position of Gentianaceae as one member of the order Gentianales. The complete chloroplast genome sequence will provide intragenic information for its conservation and contribute to research on the genetic and phylogenetic analyses of Gentianales and Asterids. Copyright © 2015 Elsevier B.V. All rights reserved.
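
    As a quick consistency check on the figures in this abstract, the sketch below adds up the region lengths. It assumes the usual quadripartite chloroplast layout (LSC + SSC + two IR copies) and that 25,333 bp is the length of each inverted repeat; the function name is hypothetical.

```python
# Region lengths in base pairs, as reported in the abstract.
LSC = 81_240
SSC = 17_085
IR = 25_333            # assumed to be the length of EACH inverted repeat
REPORTED_TOTAL = 148_991


def quadripartite_length(lsc, ssc, ir):
    """Total genome length for one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)
print(total, total == REPORTED_TOTAL)
```

    Under those assumptions the regions sum to the reported genome length, which supports reading the IR figure as a per-copy length.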

  5. Mollusk genes encoding lysine tRNA (UUU) contain introns.

    PubMed

    Matsuo, M; Abe, Y; Saruta, Y; Okada, N

    1995-11-20

    New intron-containing genes encoding tRNAs were discovered when genomic DNA isolated from various animal species was amplified by the polymerase chain reaction (PCR) with primers based on sequences of rabbit tRNA(Lys). From sequencing analysis of the products of PCR, we found that introns are present in several genes encoding tRNA(Lys) in mollusks, such as Loligo bleekeri (squid) and Octopus vulgaris (octopus). These introns were specific to genes encoding tRNA(Lys)(UUU) and were not present in genes encoding tRNA(Lys)(CUU). In addition, the sequences of the introns were different from one another. To confirm the results of our initial experiments, we isolated and sequenced genes encoding tRNA(Lys)(CUU) and tRNA(Lys)(UUU). The gene for tRNA(Lys)(UUU) from squid contained an intron, whose sequence was the same as that identified by PCR, and the gene formed a cluster with a corresponding pseudogene. Several DNA regions of 2.1 kb containing this cluster appeared to be tandemly arrayed in the squid genome. By contrast, the gene encoding tRNA(Lys)(CUU) did not contain an intron, as shown also by PCR. The tRNA(Lys)(UUU) that corresponded to the analyzed gene was isolated and characterized. The present study provides the first example of an intron-containing gene encoding a tRNA in mollusks and suggests the universality of introns in such genes in higher eukaryotes.

  6. Evolution reversed: the ability to bind iron restored to the N-lobe of the murine inhibitor of carbonic anhydrase by strategic mutagenesis.

    PubMed

    Mason, Anne B; Judson, Gregory L; Bravo, Maria Cristina; Edelstein, Andrew; Byrne, Shaina L; James, Nicholas G; Roush, Eric D; Fierke, Carol A; Bobst, Cedric E; Kaltashov, Igor A; Daughtery, Margaret A

    2008-09-16

    The murine inhibitor of carbonic anhydrase (mICA) is a member of the superfamily related to the bilobal iron transport protein transferrin (TF), which binds a ferric ion within a cleft in each lobe. Although the gene encoding ICA in humans is classified as a pseudogene, an apparently functional ICA gene has been annotated in mice, rats, cows, pigs, and dogs. All ICAs lack one (or more) of the amino acid ligands in each lobe essential for high-affinity coordination of iron and the requisite synergistic anion, carbonate. The reason why ICA family members have lost the ability to bind iron is potentially related to acquiring a new function(s), one of which is inhibition of certain carbonic anhydrase (CA) isoforms. A recombinant mutant of the mICA (W124R/S188Y) was created with the goal of restoring the ligands required for both anion (Arg124) and iron (Tyr188) binding in the N-lobe. Absorption and fluorescence spectra definitively show that the mutant binds ferric iron in the N-lobe. Electrospray ionization mass spectrometry confirms the presence of both ferric iron and carbonate. At the putative endosomal pH of 5.6, iron is released by two slow processes indicative of high-affinity coordination. Induction of specific iron binding implies that (1) the structure of mICA resembles those of other TF family members and (2) the N-lobe can adopt a conformation in which the cleft closes when iron binds. Because the conformational change in the N-lobe indicated by metal binding does not impact the inhibitory activity of mICA, inhibition of CA was tentatively assigned to the C-lobe. Proof of this assignment is provided by limited trypsin proteolysis of porcine ICA.

  7. The Genomes of Two Bat Species with Long Constant Frequency Echolocation Calls.

    PubMed

    Dong, Dong; Lei, Ming; Hua, Panyu; Pan, Yi-Hsuan; Mu, Shuo; Zheng, Guantao; Pang, Erli; Lin, Kui; Zhang, Shuyi

    2017-01-01

    Bats can perceive the world by using a wide range of sensory systems, and some of the systems have become highly specialized, such as auditory sensory perception. Among bat species, the Old World leaf-nosed bats and horseshoe bats (rhinolophoid bats) possess the most sophisticated echolocation systems. Here, we reported the whole-genome sequencing and de novo assemblies of two rhinolophoid bats: the great leaf-nosed bat (Hipposideros armiger) and the Chinese rufous horseshoe bat (Rhinolophus sinicus). Comparative genomic analyses revealed the adaptation of auditory sensory perception in the rhinolophoid bat lineages, probably resulting from the extreme selectivity used in the auditory processing by these bats. Pseudogenization of some vision-related genes in rhinolophoid bats was observed, suggesting that these genes have undergone relaxed natural selection. An extensive contraction of olfactory receptor gene repertoires was observed in the lineage leading to the common ancestor of bats. Further extensive gene contractions can be observed in the branch leading to the rhinolophoid bats. Such concordance suggested that molecular changes at one sensory gene might have direct consequences for genes controlling for other sensory modalities. To characterize the population genetic structure and patterns of evolution, we re-sequenced the genome of 20 great leaf-nosed bats from four different geographical locations of China. The result showed similar sequence diversity values and little differentiation among populations. Moreover, evidence of genetic adaptations to high altitudes in the great leaf-nosed bats was observed. Taken together, our work provided a useful resource for future research on the evolution of bats. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  8. Evolution of major histocompatibility complex class I and class II genes in the brown bear

    PubMed Central

    2012-01-01

    Background: Major histocompatibility complex (MHC) proteins constitute an essential component of the vertebrate immune response, and are coded by the most polymorphic of the vertebrate genes. Here, we investigated sequence variation and evolution of MHC class I and class II DRB, DQA and DQB genes in the brown bear Ursus arctos to characterise the level of polymorphism, estimate the strength of positive selection acting on them, and assess the extent of gene orthology and trans-species polymorphism in Ursidae. Results: We found 37 MHC class I, 16 MHC class II DRB, four DQB and two DQA alleles. We confirmed the expression of several loci: three MHC class I, two DRB, two DQB and one DQA. MHC class I also contained two clusters of non-expressed sequences. MHC class I and DRB allele frequencies differed between northern and southern populations of the Scandinavian brown bear. The rate of nonsynonymous substitutions (dN) exceeded the rate of synonymous substitutions (dS) at putative antigen binding sites of DRB and DQB loci and, marginally significantly, at MHC class I loci. Models of codon evolution supported positive selection at DRB and MHC class I loci. Both MHC class I and MHC class II sequences showed orthology to gene clusters found in the giant panda Ailuropoda melanoleuca. Conclusions: Historical positive selection has acted on MHC class I, class II DRB and DQB, but not on the DQA locus. The signal of historical positive selection on the DRB locus was particularly strong, which may be a general feature of caniforms. The presence of MHC class I pseudogenes may indicate faster gene turnover in this class through the birth-and-death process. South–north population structure at MHC loci probably reflects origin of the populations from separate glacial refugia. PMID:23031405

  9. Evolution of major histocompatibility complex class I and class II genes in the brown bear.

    PubMed

    Kuduk, Katarzyna; Babik, Wiesław; Bojarska, Katarzyna; Sliwińska, Ewa B; Kindberg, Jonas; Taberlet, Pierre; Swenson, Jon E; Radwan, Jacek

    2012-10-02

    Major histocompatibility complex (MHC) proteins constitute an essential component of the vertebrate immune response, and are coded by the most polymorphic of the vertebrate genes. Here, we investigated sequence variation and evolution of MHC class I and class II DRB, DQA and DQB genes in the brown bear Ursus arctos to characterise the level of polymorphism, estimate the strength of positive selection acting on them, and assess the extent of gene orthology and trans-species polymorphism in Ursidae. We found 37 MHC class I, 16 MHC class II DRB, four DQB and two DQA alleles. We confirmed the expression of several loci: three MHC class I, two DRB, two DQB and one DQA. MHC class I also contained two clusters of non-expressed sequences. MHC class I and DRB allele frequencies differed between northern and southern populations of the Scandinavian brown bear. The rate of nonsynonymous substitutions (dN) exceeded the rate of synonymous substitutions (dS) at putative antigen binding sites of DRB and DQB loci and, marginally significantly, at MHC class I loci. Models of codon evolution supported positive selection at DRB and MHC class I loci. Both MHC class I and MHC class II sequences showed orthology to gene clusters found in the giant panda Ailuropoda melanoleuca. Historical positive selection has acted on MHC class I, class II DRB and DQB, but not on the DQA locus. The signal of historical positive selection on the DRB locus was particularly strong, which may be a general feature of caniforms. The presence of MHC class I pseudogenes may indicate faster gene turnover in this class through the birth-and-death process. South-north population structure at MHC loci probably reflects origin of the populations from separate glacial refugia.

  10. Whole genome sequencing and comparative genomics of closely related Fusarium Head Blight fungi: Fusarium graminearum, F. meridionale and F. asiaticum.

    PubMed

    Walkowiak, Sean; Rowland, Owen; Rodrigue, Nicolas; Subramaniam, Rajagopal

    2016-12-09

    The Fusarium graminearum species complex is composed of many distinct fungal species that cause several diseases in economically important crops, including Fusarium Head Blight of wheat. Despite being closely related, these species and individuals within species have distinct phenotypic differences in toxin production and pathogenicity, with some isolates reported as non-pathogenic on certain hosts. In this report, we compare genomes and gene content of six new isolates from the species complex, including the first available genomes of F. asiaticum and F. meridionale, with four other genomes reported in previous studies. A comparison of genome structure and gene content revealed a 93-99% overlap across all ten genomes. We identified more than 700 kilobase pairs (kb) of single nucleotide polymorphisms (SNPs), insertions, and deletions (indels) within common regions of the genome, which validated the species and genetic populations reported within species. We constructed a non-redundant pan gene list containing 15,297 genes from the ten genomes, and among them 1827 genes, or 12%, were absent in at least one genome. These genes were co-localized in telomeric regions and select regions within chromosomes with a corresponding increase in SNPs and indels. Many are also predicted to encode proteins involved in secondary metabolism and other functions associated with disease. Genes that were common between isolates contained high levels of nucleotide variation and may be pseudogenes, allelic, or under diversifying selection. The genomic resources we have contributed will be useful for the identification of genes that contribute to the phenotypic variation and niche specialization that have been reported among members of the F. graminearum species complex.

  11. Preservation of Eumelanin Hair Pigmentation in Proopiomelanocortin-Deficient Mice on a Nonagouti (a/a) Genetic Background

    PubMed Central

    Slominski, Andrzej; Plonka, Przemyslaw M.; Pisarchik, Alexander; Smart, James L.; Tolle, Virginie; Wortsman, Jacobo; Low, Malcolm J.

    2005-01-01

    The original strain of proopiomelanocortin (POMC)-deficient mice (Pomc−/−) was generated by homologous recombination in 129X1/SvJ (Aw/Aw)-derived embryonic stem cells using a targeting construct that deleted exon 3, encoding all the known functional POMC-derived peptides including αMSH, from the Pomc gene. Although these Pomc−/− mice exhibited adrenal hypoplasia and obesity similar to the syndrome of POMC deficiency in children, their agouti coat color was only subtly altered. To further investigate the mechanism of hair pigmentation in the absence of POMC peptides, we studied wild-type (Pomc+/+), heterozygous (Pomc+/−), and homozygous (Pomc−/−) mice on a nonagouti (a/a) 129;B6 hybrid genetic background. All three genotypes had similar black fur pigmentation with yellow hairs behind the ears, around the nipples, and in the perianal area characteristic of inbred C57BL/6 mice. Histologic and electron paramagnetic resonance spectrometry examination demonstrated that hair follicles in back skin of Pomc−/− mice developed with normal structure and eumelanin pigmentation; corresponding molecular analyses, however, excluded local production of αMSH and ACTH because neither Pomc nor putative Pomc pseudogene mRNAs were detected in the skin. Thus, 129;B6 Pomc null mutant mice produce abundant eumelanin hair pigmentation despite their congenital absence of melanocortin ligands. These results suggest that either the mouse melanocortin receptor 1 has sufficient basal activity to trigger and sustain eumelanogenesis in vivo or that redundant nonmelanocortin pathway(s) compensate for the melanocortin deficiency. Whereas the latter implies feedback control of melanogenesis, it is also possible that the two mechanisms operate jointly in hair follicles. PMID:15564334

  12. Morphological and molecular evidence for a stepwise evolutionary transition from teeth to baleen in mysticete whales.

    PubMed

    Deméré, Thomas A; McGowen, Michael R; Berta, Annalisa; Gatesy, John

    2008-02-01

    The origin of baleen in mysticete whales represents a major transition in the phylogenetic history of Cetacea. This key specialization, a keratinous sieve that enables filter-feeding, permitted exploitation of a new ecological niche and heralded the evolution of modern baleen-bearing whales, the largest animals on Earth. To date, all formally described mysticete fossils conform to two types: toothed species from Oligocene-age rocks (approximately 24 to 34 million years old) and toothless species that presumably utilized baleen to feed (Recent to approximately 30 million years old). Here, we show that several Oligocene toothed mysticetes have nutrient foramina and associated sulci on the lateral portions of their palates; homologous structures in extant mysticetes house vessels that nourish baleen. The simultaneous occurrence of teeth and nutrient foramina implies that both teeth and baleen were present in these early mysticetes. Phylogenetic analyses of a supermatrix that includes extinct taxa and new data for 11 nuclear genes consistently resolve relationships at the base of Mysticeti. The combined data set of 27,340 characters supports a stepwise transition from a toothed ancestor, to a mosaic intermediate with both teeth and baleen, to modern baleen whales that lack an adult dentition but retain developmental and genetic evidence of their ancestral toothed heritage. Comparative sequence data for ENAM (enamelin) and AMBN (ameloblastin) indicate that enamel-specific loci are present in Mysticeti but have degraded to pseudogenes in this group. The dramatic transformation in mysticete feeding anatomy documents an apparently rare, stepwise mode of evolution in which a composite phenotype bridged the gap between primitive and derived morphologies; a combination of fossil and molecular evidence provides a multifaceted record of this macroevolutionary pattern.

  13. Gentle Masking of Low-Complexity Sequences Improves Homology Search

    PubMed Central

    Frith, Martin C.

    2011-01-01

    Detection of sequences that are homologous, i.e. descended from a common ancestor, is a fundamental task in computational biology. This task is confounded by low-complexity tracts (such as atatatatatat), which arise frequently and independently, causing strong similarities that are not homologies. There has been much research on identifying low-complexity tracts, but little research on how to treat them during homology search. We propose to find homologies by aligning sequences with “gentle” masking of low-complexity tracts. Gentle masking means that the match score involving a masked letter is min(0, S), where S is the unmasked score. Gentle masking slightly but noticeably improves the sensitivity of homology search (compared to “harsh” masking), without harming specificity. We show examples in three useful homology search problems: detection of NUMTs (nuclear copies of mitochondrial DNA), recruitment of metagenomic DNA reads to reference genomes, and pseudogene detection. Gentle masking is currently the best way to treat low-complexity tracts during homology search. PMID:22205972
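
    The gentle-masking rule in this abstract can be sketched in a few lines. The sketch assumes the rule caps the score of any pair involving a masked letter at zero, i.e. min(0, S) for unmasked score S. The +1/-1 scoring scheme, the use of lowercase to mark masked letters, and all function names are illustrative assumptions, not the paper's implementation.

```python
def unmasked_score(a: str, b: str) -> int:
    """Toy substitution score (assumption): +1 for a match, -1 for a mismatch."""
    return 1 if a.upper() == b.upper() else -1


def gentle_score(a: str, b: str) -> int:
    """Score a letter pair under gentle masking.

    Lowercase letters are treated as masked (low-complexity). A pair that
    involves a masked letter scores min(0, S): it can still be penalised
    for a mismatch, but it cannot add positive score.
    """
    s = unmasked_score(a, b)
    if a.islower() or b.islower():
        return min(0, s)
    return s


def ungapped_score(x: str, y: str, score) -> int:
    """Sum pairwise scores over two equal-length, already aligned strings."""
    return sum(score(a, b) for a, b in zip(x, y))


# The masked "atatat" tract matches itself perfectly but contributes nothing
# under gentle masking, so only the unmasked ACGT prefix adds to the score.
query = "ACGTatatat"
target = "ACGTatatat"
plain = ungapped_score(query, target, unmasked_score)
gentle = ungapped_score(query, target, gentle_score)
print(plain, gentle)
```

    Because masked positions can still lower the score but never raise it, a spurious alignment built only from a low-complexity tract cannot reach a positive score, while a genuine alignment can still extend through the tract.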

  14. Unusual loss of chymosin in mammalian lineages parallels neo-natal immune transfer strategies.

    PubMed

    Lopes-Marques, Mónica; Ruivo, Raquel; Fonseca, Elza; Teixeira, Ana; Castro, L Filipe C

    2017-11-01

    Gene duplication and loss are powerful drivers of evolutionary change. The role of loss in phenotypic diversification is notably illustrated by the variable enzymatic repertoire involved in vertebrate protein digestion. Among these we find the pepsin family of aspartic proteinases, including chymosin (Cmy). Previous studies demonstrated that Cmy, a neo-natal digestive pepsin, is inactivated in some primates, including humans. This pseudogenization event was hypothesized to result from the acquisition of maternal immune immunoglobulin G (IgG) transfer. By investigating 94 mammalian subgenomes we reveal an unprecedented level of Cmy erosion in placental mammals, with numerous independent events of gene loss taking place in Primates, Dermoptera, Rodentia, Cetacea and Perissodactyla. Our findings strongly suggest that the recurrent inactivation of Cmy correlates with the evolution of the passive transfer of IgG and uncovers a noteworthy case of evolutionary cross-talk between the digestive and the immune system, modulated by gene loss. Copyright © 2017 Elsevier Inc. All rights reserved.

  15. The Florida manatee (Trichechus manatus latirostris) T cell receptor loci exhibit V subgroup synteny and chain-specific evolution

    USGS Publications Warehouse

    Breaux, Breanna; Hunter, Margaret; Cruz-Schneider, Maria Paula; Sena, Leonardo; Bonde, Robert K.; Criscitiello, Michael F.

    2018-01-01

    The Florida manatee (Trichechus manatus latirostris) has limited diversity in the immunoglobulin heavy chain. We therefore investigated the antigen receptor loci of the other arm of the adaptive immune system: the T cell receptor. Manatees are the first species from Afrotheria, a basal eutherian superorder, to have an in-depth characterization of all T cell receptor loci. By annotating the genome and expressed transcripts, we found that each chain has distinct features that correlate with their individual functions. The genomic organization also plays a role in modulating sequence conservation between species. There were extensive V subgroup synteny blocks in the TRA and TRB loci between T. m. latirostris and human. Increased genomic locus complexity correlated with increased locus synteny. We also identified evidence for a VHD pseudogene for the first time in a eutherian mammal. These findings emphasize the value of including species within this basal eutherian radiation in comparative studies.

  16. Genes in one megabase of the HLA class I region

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wei, H.; Fan, Wu-Fang; Xu, Hongxia

    1993-11-15

    To define the gene content of the HLA class I region, cDNA selection was applied to three overlapping yeast artificial chromosomes (YACs) that spanned 1 megabase (Mb) of this region of the human major histocompatibility complex. These YACs extended from the region centromeric to HLA-E to the region telomeric to HLA-F. In addition to the recognized class I genes and pseudogenes and the anonymous non-class-I genes described recently by the authors and others, 20 additional anonymous cDNA clones were identified from this 1-Mb region. The authors also identified a long repetitive DNA element in the region between HLA-B and HLA-E. Homologues of this outside of the HLA complex. The portion of the HLA class I region represented by these YACs shows an average gene density as high as the class II and class III regions. Thus, the high gene density portion of the HLA complex is extended to more than 3 Mb.

  17. Underlying mathematics in diversification of human olfactory receptors in different loci.

    PubMed

    Hassan, Sk Sarif; Choudhury, Pabitra Pal; Goswami, Arunava

    2013-12-01

    By a conservative estimate, approximately 51-105 olfactory receptor (OR) loci are present in the human genome, occurring in clusters. These clusters are apparently unevenly spread as mosaics over 21 pairs of human chromosomes. OR gene families, which are thought to have expanded to provide recognition capability for a huge number of pure and complex odorants, form the largest known multigene family in the human genome. Recent studies have shown that 388 full-length ORs and 414 OR pseudogenes are present in these OR genomic clusters. In this paper, the authors report a classification method for all human ORs based on quantitative sequence information such as the presence of poly-strings of nucleotide bases, long-range correlation and so on. An L-System generated sequence has been taken as an input into a star-model of specific subfamily members, and the resultant sequence has been mapped to a specific OR based on the classification scheme using fractal parameters like Hurst exponent and fractal dimensions.

  18. Advances in esophageal cancer: A new perspective on pathogenesis associated with long non-coding RNAs.

    PubMed

    Huang, Xiaomei; Zhou, Xi; Hu, Qing; Sun, Binyu; Deng, Mingming; Qi, Xiaolong; Lü, Muhan

    2018-01-28

    Esophageal cancer is a malignant digestive tract cancer with high mortality. Although studies have found that esophageal cancer is involved in a complex and important gene regulation network, the pathogenesis remains unclear. The recently described long non-coding RNAs (lncRNAs) are one effective part of the gene regulation network. However, in past decades, lncRNAs were thought to be "transcript noise" or "pseudogenes" and were thus ignored. Early studies indicated that lncRNAs play pivotal roles during evolution. However, in recent years, increasing research has revealed that many lncRNAs are associated with tumorigenesis. In particular, lncRNAs may act as important elements for epigenetic regulation, transcription, post-transcriptional regulation and post-translational modification of proteins. Additionally, they may be novel biomarkers for tumors and therapeutic targets in cancer. Here, we summarize the functions of lncRNAs in esophageal cancer, with an emphasis on lncRNA-mediated regulatory mechanisms that affect the biological characteristics of esophageal cancer. Copyright © 2017 Elsevier B.V. All rights reserved.

  19. The developmental proteome of Drosophila melanogaster

    PubMed Central

    Casas-Vila, Nuria; Bluhm, Alina; Sayols, Sergi; Dinges, Nadja; Dejung, Mario; Altenhein, Tina; Kappei, Dennis; Altenhein, Benjamin; Roignant, Jean-Yves; Butter, Falk

    2017-01-01

    Drosophila melanogaster is a widely used genetic model organism in developmental biology. While this model organism has been intensively studied at the RNA level, a comprehensive proteomic study covering the complete life cycle is still missing. Here, we apply label-free quantitative proteomics to explore proteome remodeling across Drosophila’s life cycle, resulting in 7952 proteins, and provide a high-temporal-resolution embryogenesis proteome of 5458 proteins. Our proteome data enabled us to monitor isoform-specific expression of 34 genes during development, to identify the pseudogene Cyp9f3Ψ as a protein-coding gene, and to obtain evidence of 268 small proteins. Moreover, the comparison with available transcriptomic data uncovered examples of poor correlation between mRNA and protein, underscoring the importance of proteomics to study developmental progression. Data integration of our embryogenesis proteome with tissue-specific data revealed spatial and temporal information for further functional studies of as yet uncharacterized proteins. Overall, our high-resolution proteomes provide a powerful resource and can be explored in detail in our interactive web interface. PMID: 28381612

  20. The Florida manatee (Trichechus manatus latirostris) T cell receptor loci exhibit V subgroup synteny and chain-specific evolution.

    PubMed

    Breaux, Breanna; Hunter, Margaret E; Cruz-Schneider, Maria Paula; Sena, Leonardo; Bonde, Robert K; Criscitiello, Michael F

    2018-08-01

    The Florida manatee (Trichechus manatus latirostris) has limited diversity in the immunoglobulin heavy chain. We therefore investigated the antigen receptor loci of the other arm of the adaptive immune system: the T cell receptor. Manatees are the first species from Afrotheria, a basal eutherian superorder, to have an in-depth characterization of all T cell receptor loci. By annotating the genome and expressed transcripts, we found that each chain has distinct features that correlate with its individual function. The genomic organization also plays a role in modulating sequence conservation between species. There were extensive V subgroup synteny blocks in the TRA and TRB loci between T. m. latirostris and human. Increased genomic locus complexity correlated with increased locus synteny. We also identified evidence for a VHD pseudogene for the first time in a eutherian mammal. These findings emphasize the value of including species within this basal eutherian radiation in comparative studies. Copyright © 2018. Published by Elsevier Ltd.
