Schouten, Henk J; Vande Geest, Henri; Papadimitriou, Sofia; Bemer, Marian; Schaart, Jan G; Smulders, Marinus J M; Perez, Gabino Sanchez; Schijlen, Elio
2017-03-01
Transformation resulted in deletions and translocations at T-DNA inserts, but not in genome-wide small mutations. A tiny T-DNA splinter was detected that probably would remain undetected by conventional techniques. We investigated to what extent Agrobacterium tumefaciens-mediated transformation is mutagenic, on top of inserting T-DNA. To prevent mutations due to in vitro propagation, we applied floral dip transformation of Arabidopsis thaliana. We re-sequenced the genomes of five primary transformants, and compared these to genomic sequences derived from a pool of four wild-type plants. By genome-wide comparisons, we identified ten small mutations in the genomes of the five transgenic plants, not correlated to the positions or number of T-DNA inserts. This mutation frequency is within the range of spontaneous mutations occurring during seed propagation in A. thaliana, as determined earlier. In addition, we detected small as well as large deletions specifically at the T-DNA insert sites. Furthermore, we detected partial T-DNA inserts, one of these a tiny 50-bp fragment originating from a central part of the T-DNA construct used, inserted into the plant genome without flanking other T-DNA. Because of its small size, we named this fragment a T-DNA splinter. As far as we know, this is the first report of such a small T-DNA fragment insert in the absence of any T-DNA border sequence. Finally, we found evidence for translocations from other chromosomes, flanking T-DNA inserts. In this study, we showed that next-generation sequencing (NGS) is a highly sensitive approach to detect T-DNA inserts in transgenic plants.
Small molecules enhance CRISPR genome editing in pluripotent stem cells.
Yu, Chen; Liu, Yanxia; Ma, Tianhua; Liu, Kai; Xu, Shaohua; Zhang, Yu; Liu, Honglei; La Russa, Marie; Xie, Min; Ding, Sheng; Qi, Lei S
2015-02-05
The bacterial CRISPR-Cas9 system has emerged as an effective tool for sequence-specific gene knockout through non-homologous end joining (NHEJ), but it remains inefficient for precise editing of genome sequences. Here we develop a reporter-based screening approach for high-throughput identification of chemical compounds that can modulate precise genome editing through homology-directed repair (HDR). Using our screening method, we have identified small molecules that can enhance CRISPR-mediated HDR efficiency, 3-fold for large fragment insertions and 9-fold for point mutations. Interestingly, we have also observed that a small molecule that inhibits HDR can enhance frame shift insertion and deletion (indel) mutations mediated by NHEJ. The identified small molecules function robustly in diverse cell types with minimal toxicity. The use of small molecules provides a simple and effective strategy to enhance precise genome engineering applications and facilitates the study of DNA repair mechanisms in mammalian cells. Copyright © 2015 Elsevier Inc. All rights reserved.
Jiang, Likun; You, Weiwei; Zhang, Xiaojun; Xu, Jian; Jiang, Yanliang; Wang, Kai; Zhao, Zixia; Chen, Baohua; Zhao, Yunfeng; Mahboob, Shahid; Al-Ghanim, Khalid A; Ke, Caihuan; Xu, Peng
2016-02-01
The small abalone (Haliotis diversicolor) is one of the most important aquaculture species in East Asia. To facilitate gene cloning and characterization, genome analysis, and genetic breeding of this species, we constructed a large-insert bacterial artificial chromosome (BAC) library, which is an important genetic tool for advanced genetics and genomics research. The small abalone BAC library includes 92,610 clones with an average insert size of 120 Kb, equivalent to approximately 7.6× of the small abalone genome. We set up three-dimensional pools and super pools of 18,432 BAC clones for target gene screening using a PCR method. To assess the approach, we screened 12 target genes in these 18,432 BAC clones and identified 16 positive BAC clones. Eight positive BAC clones were then sequenced and assembled with a next-generation sequencing platform. The assembled contigs representing these 8 BAC clones spanned 928 Kb of the small abalone genome, providing the first batch of genome sequences for genome evaluation and characterization. The average GC content of the small abalone genome was estimated as 40.33%. A total of 21 protein-coding genes, including 7 target genes, were annotated in the 8 BACs, which proved the feasibility of the PCR screening approach with three-dimensional pools in the small abalone BAC library. One hundred fifty microsatellite loci were also identified from the sequences for marker development in the future. The BAC library and clone pools provided valuable resources and tools for genetic breeding and conservation of H. diversicolor.
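The coverage figure in the abstract above follows from simple arithmetic: library coverage is the number of clones times the mean insert size, divided by the genome size. The sketch below inverts that relation to show the genome size implied by the reported numbers. The genome size itself is not stated in the abstract, so the result is only an inference from rounded figures, and the function names are hypothetical.

```python
def library_coverage(n_clones, mean_insert_kb, genome_size_kb):
    """Genome equivalents represented by a clone library."""
    return n_clones * mean_insert_kb / genome_size_kb


def implied_genome_size_kb(n_clones, mean_insert_kb, coverage):
    """Genome size (kb) implied by a reported coverage (same relation, inverted)."""
    return n_clones * mean_insert_kb / coverage


# Figures reported in the abstract: 92,610 clones, 120 kb inserts, ~7.6x coverage.
# The genome size below is inferred from them, not quoted from the source.
genome_kb = implied_genome_size_kb(92_610, 120, 7.6)
print(f"implied genome size: {genome_kb / 1e6:.2f} Gb")
```

Under these assumptions the implied genome size is roughly 1.46 Gb; because the insert size and coverage are rounded, the true value could differ by a few percent.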
Landscape of Insertion Polymorphisms in the Human Genome
Onozawa, Masahiro; Goldberg, Liat; Aplan, Peter D.
2015-01-01
Nucleotide substitutions, small (<50 bp) insertions or deletions (indels), and large (>50 bp) deletions are well-known causes of genetic variation within the human genome. We recently reported a previously unrecognized form of polymorphic insertions, termed templated sequence insertion polymorphism (TSIP), in which the inserted sequence was templated from a distant genomic region, and was inserted in the genome through reverse transcription of an RNA intermediate. TSIPs can be grouped into two classes based on nucleotide sequence features at the insertion junctions; class 1 TSIPs show target site duplication, polyadenylation, and preference for insertion at a 5′-TTTT/A-3′ sequence, suggesting a LINE-1 based insertion mechanism, whereas class 2 TSIPs show features consistent with repair of a DNA double strand break by nonhomologous end joining. To gain a more complete picture of TSIPs throughout the human population, we evaluated whole-genome sequence from 52 individuals, and identified 171 TSIPs. Most individuals had 25–30 TSIPs, and common (present in >20% of individuals) TSIPs were found in individuals throughout the world, whereas rare TSIPs tended to cluster in specific geographic regions. The number of rare TSIPs was greater than the number of common TSIPs, suggesting that TSIP generation is an ongoing process. Intriguingly, mitochondrial sequences were a frequent template for class 2 insertions, used more commonly than any nuclear chromosome. Similar to single nucleotide polymorphisms and indels, we suspect that these TSIPs may be important for the generation of human diversity and genetic diseases, and can be useful in tracking historical migration of populations. PMID:25745018
Chintalapati, Manjusha; Dannemann, Michael; Prüfer, Kay
2017-08-04
Small insertions and deletions occur in humans at a lower rate compared to nucleotide changes, but evolve under more constraint than nucleotide changes. While the evolution of insertions and deletions has been investigated using ape outgroups, the now available genome of a Neandertal can shed light on the evolution of indels in more recent times. We used the Neandertal genome together with several primate outgroup genomes to differentiate between human insertion/deletion changes that likely occurred before the split from Neandertals and those that likely arose later. Changes that pre-date the split from Neandertals show a smaller proportion of deletions than those that occurred later. The presence of a Neandertal-shared allele in Europeans or Asians but the absence in Africans was used to detect putatively introgressed indels in Europeans and Asians. A larger proportion of these variants reside in intergenic regions compared to other modern human variants, and some variants are linked to SNPs that have been associated with traits in modern humans. Our results are in agreement with earlier results that suggested that deletions evolve under more constraint than insertions. When considering Neandertal introgressed variants, we find some evidence that negative selection affected these variants more than other variants segregating in modern humans. Among introgressed variants we also identify indels that may influence the phenotype of their carriers. In particular, an introgressed deletion associated with a decrease in the time to menarche may constitute an example of a former Neandertal-specific trait contributing to modern human phenotypic diversity.
Many P-Element Insertions Affect Wing Shape in Drosophila melanogaster
Weber, Kenneth; Johnson, Nancy; Champlin, David; Patty, April
2005-01-01
A screen of random, autosomal, homozygous-viable P-element insertions in D. melanogaster found small effects on wing shape in 11 of 50 lines. The effects were due to single insertions and remained stable and significant for over 5 years, in repeated, high-resolution measurements. All 11 insertions were within or near protein-coding transcription units, none of which were previously known to affect wing shape. Many sites in the genome can affect wing shape. PMID:15545659
Olovnikov, Ivan; Abramov, Yuri; Kalmykova, Alla
2014-01-01
The control of transposable element (TE) activity in germ cells provides genome integrity over generations. A distinct small RNA–mediated pathway utilizing Piwi-interacting RNAs (piRNAs) suppresses TE expression in gonads of metazoans. In the fly, primary piRNAs derive from so-called piRNA clusters, which are enriched in damaged repeated sequences. These piRNAs launch a cycle of TE and piRNA cluster transcript cleavages resulting in the amplification of piRNA and TE silencing. Using genome-wide comparison of TE insertions and ovarian small RNA libraries from two Drosophila strains, we found that individual TEs inserted into euchromatic loci form novel dual-stranded piRNA clusters. Formation of the piRNA-generating loci by active individual TEs provides a more potent silencing response to the TE expansion. Like all piRNA clusters, individual TEs are also capable of triggering the production of endogenous small interfering (endo-si) RNAs. Small RNA production by individual TEs spreads into the flanking genomic regions including coding cellular genes. We show that formation of TE-associated small RNA clusters can down-regulate expression of nearby genes in ovaries. Integration of TEs into the 3′ untranslated region of actively transcribed genes induces piRNA production towards the 3′-end of transcripts, causing the appearance of genic piRNA clusters, a phenomenon that has been reported in different organisms. These data suggest a significant role of TE-associated small RNAs in the evolution of regulatory networks in the germline. PMID:24516406
Canver, Matthew C.; Bauer, Daniel E.; Dass, Abhishek; Yien, Yvette Y.; Chung, Jacky; Masuda, Takeshi; Maeda, Takahiro; Paw, Barry H.; Orkin, Stuart H.
2014-01-01
The clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) 9 nuclease system has provided a powerful tool for genome engineering. Double strand breaks may trigger nonhomologous end joining repair, leading to frameshift mutations, or homology-directed repair using an extrachromosomal template. Alternatively, genomic deletions may be produced by a pair of double strand breaks. The efficiency of CRISPR/Cas9-mediated genomic deletions has not been systematically explored. Here, we present a methodology for the production of deletions in mammalian cells, ranging from 1.3 kb to greater than 1 Mb. We observed a high frequency of intended genomic deletions. Nondeleted alleles are nonetheless often edited with inversions or small insertion/deletions produced at CRISPR recognition sites. Deleted alleles also typically include small insertion/deletions at predicted deletion junctions. We retrieved cells with biallelic deletion at a frequency exceeding that of probabilistic expectation. We demonstrate an inverse relationship between deletion frequency and deletion size. This work suggests that CRISPR/Cas9 is a robust system to produce a spectrum of genomic deletions to allow investigation of genes and genetic elements. PMID:24907273
Boussaha, Mekki; Michot, Pauline; Letaief, Rabia; Hozé, Chris; Fritz, Sébastien; Grohs, Cécile; Esquerré, Diane; Duchesne, Amandine; Philippe, Romain; Blanquet, Véronique; Phocas, Florence; Floriot, Sandrine; Rocha, Dominique; Klopp, Christophe; Capitan, Aurélien; Boichard, Didier
2016-11-15
In recent years, several bovine genome sequencing projects were carried out with the aim of developing genomic tools to improve dairy and beef production efficiency and sustainability. In this study, we describe the first French cattle genome variation dataset obtained by sequencing 274 whole genomes representing several major dairy and beef breeds. This dataset contains over 28 million single nucleotide polymorphisms (SNPs) and small insertions and deletions. Comparisons between sequencing results and SNP array genotypes revealed a very high genotype concordance rate, which indicates the good quality of our data. To our knowledge, this is the first large-scale catalog of small genomic variations in French dairy and beef cattle. This resource will contribute to the study of gene functions and population structure and also help to improve traits through genotype-guided selection.
SvABA: genome-wide detection of structural variants and indels by local assembly.
Wala, Jeremiah A; Bandopadhayay, Pratiti; Greenwald, Noah F; O'Rourke, Ryan; Sharpe, Ted; Stewart, Chip; Schumacher, Steve; Li, Yilong; Weischenfeldt, Joachim; Yao, Xiaotong; Nusbaum, Chad; Campbell, Peter; Getz, Gad; Meyerson, Matthew; Zhang, Cheng-Zhong; Imielinski, Marcin; Beroukhim, Rameen
2018-04-01
Structural variants (SVs), including small insertion and deletion variants (indels), are challenging to detect through standard alignment-based variant calling methods. Sequence assembly offers a powerful approach to identifying SVs, but is difficult to apply at scale genome-wide for SV detection due to its computational complexity and the difficulty of extracting SVs from assembly contigs. We describe SvABA, an efficient and accurate method for detecting SVs from short-read sequencing data using genome-wide local assembly with low memory and computing requirements. We evaluated SvABA's performance on the NA12878 human genome and in simulated and real cancer genomes. SvABA demonstrates superior sensitivity and specificity across a large spectrum of SVs and substantially improves detection performance for variants in the 20-300 bp range, compared with existing methods. SvABA also identifies complex somatic rearrangements with chains of short (<1000 bp) templated-sequence insertions copied from distant genomic regions. We applied SvABA to 344 cancer genomes from 11 cancer types and found that short templated-sequence insertions occur in ∼4% of all somatic rearrangements. Finally, we demonstrate that SvABA can identify sites of viral integration and cancer driver alterations containing medium-sized (50-300 bp) SVs. © 2018 Wala et al.; Published by Cold Spring Harbor Laboratory Press.
El Kafsi, Hela; Binesse, Johan; Loux, Valentin; Buratti, Julien; Boudebbouze, Samira; Dervyn, Rozenn; Hammani, Amal; Maguin, Emmanuelle; van de Guchte, Maarten
2014-07-17
Lactobacillus delbrueckii subsp. lactis CNRZ327 is a dairy bacterium with anti-inflammatory properties both in vitro and in vivo. Here, we report the genome sequence of this bacterium, which appears to contain no fewer than 215 insertion sequence (IS) elements, an exceptionally high number given the small genome size of the strain. Copyright © 2014 El Kafsi et al.
Canver, Matthew C; Bauer, Daniel E; Dass, Abhishek; Yien, Yvette Y; Chung, Jacky; Masuda, Takeshi; Maeda, Takahiro; Paw, Barry H; Orkin, Stuart H
2014-08-01
The clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) 9 nuclease system has provided a powerful tool for genome engineering. Double strand breaks may trigger nonhomologous end joining repair, leading to frameshift mutations, or homology-directed repair using an extrachromosomal template. Alternatively, genomic deletions may be produced by a pair of double strand breaks. The efficiency of CRISPR/Cas9-mediated genomic deletions has not been systematically explored. Here, we present a methodology for the production of deletions in mammalian cells, ranging from 1.3 kb to greater than 1 Mb. We observed a high frequency of intended genomic deletions. Nondeleted alleles are nonetheless often edited with inversions or small insertion/deletions produced at CRISPR recognition sites. Deleted alleles also typically include small insertion/deletions at predicted deletion junctions. We retrieved cells with biallelic deletion at a frequency exceeding that of probabilistic expectation. We demonstrate an inverse relationship between deletion frequency and deletion size. This work suggests that CRISPR/Cas9 is a robust system to produce a spectrum of genomic deletions to allow investigation of genes and genetic elements. © 2014 by The American Society for Biochemistry and Molecular Biology, Inc.
Vishwakarma, Manish K.; Kale, Sandip M.; Sriswathi, Manda; Naresh, Talari; Shasidhar, Yaduru; Garg, Vanika; Pandey, Manish K.; Varshney, Rajeev K.
2017-01-01
Small insertions and deletions (InDels) are the second most prevalent and the most abundant structural variations in plant genomes. In order to deploy these genetic variations for genetic analysis in genus Arachis, we conducted comparative analysis of the draft genome assemblies of both the diploid progenitor species of cultivated tetraploid groundnut (Arachis hypogaea L.) i.e., Arachis duranensis (A subgenome) and Arachis ipaënsis (B subgenome) and identified 515,223 InDels. These InDels include 269,973 insertions identified in A. ipaënsis against A. duranensis and 245,250 deletions in A. duranensis against A. ipaënsis. The majority of the InDels were of single bp (43.7%) and 2–10 bp (39.9%) while the remaining were >10 bp (16.4%). Phylogenetic analysis using genotyping data for 86 (40.19%) polymorphic markers grouped 96 diverse Arachis accessions into eight clusters mostly by the affinity of their genome. This study also provided evidence for the existence of a “K” genome that, although distinct from both the “A” and “B” genomes, is more similar to the “B” genome. The complete homology between A. monticola and A. hypogaea tetraploid taxa showed a very similar genome composition. The above analysis has provided greater insights into the phylogenetic relationship among accessions, genomes, subspecies and sections. These InDel markers are a very useful resource for the groundnut research community for genetic analysis and breeding applications. PMID:29312366
Wu, Chengcang; Proestou, Dina; Carter, Dorothy; Nicholson, Erica; Santos, Filippe; Zhao, Shaying; Zhang, Hong-Bin; Goldsmith, Marian R
2009-01-01
Background: Manduca sexta, Heliothis virescens, and Heliconius erato represent three widely-used insect model species for genomic and fundamental studies in Lepidoptera. Large-insert BAC libraries of these insects are critical resources for many molecular studies, including physical mapping and genome sequencing, but not available to date. Results: We report the construction and characterization of six large-insert BAC libraries for the three species and sampling sequence analysis of the genomes. The six BAC libraries were constructed with two restriction enzymes, two libraries for each species, and each has an average clone insert size ranging from 152–175 kb. We estimated that the genome coverage of each library ranged from 6–9×, with the two combined libraries of each species being equivalent to 13.0–16.3× haploid genomes. The genome coverage, quality and utility of the libraries were further confirmed by library screening using 6–8 putative single-copy probes. To provide a first glimpse into these genomes, we sequenced and analyzed the BAC ends of ~200 clones randomly selected from the libraries of each species. The data revealed that the genomes are AT-rich, contain relatively small fractions of repeat elements with a majority belonging to the category of low complexity repeats, and are more abundant in retro-elements than DNA transposons. Among the species, the H. erato genome is somewhat more abundant in repeat elements and simple repeats than those of M. sexta and H. virescens. The BLAST analysis of the BAC end sequences suggested that the evolution of the three genomes is widely varied, with the genome of H. virescens being the most conserved as a typical lepidopteran, whereas both genomes of H. erato and M. sexta appear to have evolved significantly, resulting in a higher level of species- or evolutionary lineage-specific sequences.
Conclusion: The high-quality and large-insert BAC libraries of the insects, together with the identified BACs containing genes of interest, provide valuable information, resources and tools for comprehensive understanding and studies of the insect genomes and for addressing many fundamental questions in Lepidoptera. The sample of the genomic sequences provides the first insight into the constitution and evolution of the insect genomes. PMID:19558662
Different Evolutionary Paths to Complexity for Small and Large Populations of Digital Organisms
2016-01-01
A major aim of evolutionary biology is to explain the respective roles of adaptive versus non-adaptive changes in the evolution of complexity. While selection is certainly responsible for the spread and maintenance of complex phenotypes, this does not automatically imply that strong selection enhances the chance for the emergence of novel traits, that is, the origination of complexity. Population size is one parameter that alters the relative importance of adaptive and non-adaptive processes: as population size decreases, selection weakens and genetic drift grows in importance. Because of this relationship, many theories invoke a role for population size in the evolution of complexity. Such theories are difficult to test empirically because of the time required for the evolution of complexity in biological populations. Here, we used digital experimental evolution to test whether large or small asexual populations tend to evolve greater complexity. We find that both small and large—but not intermediate-sized—populations are favored to evolve larger genomes, which provides the opportunity for subsequent increases in phenotypic complexity. However, small and large populations followed different evolutionary paths towards these novel traits. Small populations evolved larger genomes by fixing slightly deleterious insertions, while large populations fixed rare beneficial insertions that increased genome size. These results demonstrate that genetic drift can lead to the evolution of complexity in small populations and that purifying selection is not powerful enough to prevent the evolution of complexity in large populations. PMID:27923053
Gladyshev, Eugene A; Arkhipova, Irina R
2009-12-15
Ribosomal DNA genes in many eukaryotes contain insertions of non-LTR retrotransposable elements belonging to the R2 clade. These elements persist in the host genomes by inserting site-specifically into multicopy target sites, thereby avoiding random disruption of single-copy host genes. Here we describe R9 retrotransposons from the R2 clade in the 28S RNA genes of bdelloid rotifers, small freshwater invertebrate animals best known for their long-term asexuality and for their ability to survive repeated cycles of desiccation and rehydration. While the structural organization of R9 elements is highly similar to that of other members of the R2 clade, they are characterized by two distinct features: site-specific insertion into a previously unreported target sequence within the 28S gene, and an unusually long target site duplication of 126 bp. We discuss the implications of these findings in the context of bdelloid genome organization and the mechanisms of target-primed reverse transcription.
Hackman, Sarah; Calvey, Laura; Bernreuter, Kristen; Mark, Mengya Wang; Starnes, Sarah; Batanian, Jacqueline R
2015-09-01
Alveolar rhabdomyosarcoma (ARMS) is a pediatric soft tissue neoplasm with a characteristic translocation, t(2;13)(q35;q14), which is detected in 70-80% of cases. This well-described translocation produces the gene fusion product PAX3-FOXO1. Cryptic rearrangements of this fusion have never before been reported in ARMS. Here we describe a patient with ARMS that showed, by fluorescence in situ hybridization and G-banded chromosomes, a cryptic insertion of 3'FOXO1 into inverted chromosome 2q. The inversion breakpoints were depicted by array comparative genomic hybridization as two small interstitial duplications, one of which involved the PAX3 gene. In addition, the array comparative genomic hybridization results revealed 1q gain, 16q loss, and 11 more small duplications, with one of them involving the FOXO1 gene. Although the pathogenesis in classic ARMS cases is thought to be driven by the 5'PAX3-3'FOXO1 fusion on derivative chromosome 13, here we report a novel cryptic insertion of 3'FOXO1 resulting in a pathogenic fusion with 5'PAX3 on inverted chromosome 2q. Copyright © 2015 Elsevier Inc. All rights reserved.
Michalovova, M; Vyskot, B; Kejnovsky, E
2013-10-01
We analysed the size, relative age and chromosomal localization of nuclear sequences of plastid and mitochondrial origin (NUPTs, nuclear plastid DNA, and NUMTs, nuclear mitochondrial DNA) in six completely sequenced plant species. We found that the largest insertions showed lower divergence from organelle DNA than shorter insertions in all species, indicating their recent origin. The largest NUPT and NUMT insertions were localized in the vicinity of the centromeres in the small genomes of Arabidopsis and rice. They were also present in other chromosomal regions in the large genomes of soybean and maize. Localization of NUPTs and NUMTs correlated positively with distribution of transposable elements (TEs) in Arabidopsis and sorghum, negatively in grapevine and soybean, and did not correlate in rice or maize. We propose a model where new plastid and mitochondrial DNA sequences are inserted close to centromeres and are later fragmented by TE insertions and reshuffled away from the centromere or removed by ectopic recombination. The mode and tempo of TE dynamism determines the turnover of NUPTs and NUMTs resulting in their species-specific chromosomal distributions.
Final technical report for: Insertional Mutagenesis of Brachypodium distachyon DE-AI02-07ER64452
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vogel, John P.
Several bioenergy grasses are poised to become a major source of energy in the United States. Despite their increasing importance, we know little about the basic biology underlying the traits that control the utility of grasses as energy crops. Better knowledge of grass biology (e.g. identification of the genes that control cell wall composition, plant architecture, cell size, cell division, reproduction, nutrient uptake, carbon flux, etc.) could be used to design rational strategies for crop improvement and shorten the time required to domesticate these species. The use of an appropriate model system is an efficient way to gain this knowledge. Brachypodium distachyon is a small annual grass with all the attributes needed to be a modern model organism including simple growth requirements, fast generation time, small stature, small genome size and self-fertility. These attributes led to the recommendation in the DOE’s “Breaking the Biological Barriers to Cellulosic Ethanol: A Joint Research Agenda” report to propose developing and using B. distachyon as a model for energy crops to accelerate their domestication. Strategic investments (e.g. genome sequencing) in B. distachyon by the DOE are now bearing fruit and B. distachyon is being used as a model grass by hundreds of laboratories worldwide. Sequence indexed insertional mutants are an extremely powerful tool for both forward and reverse genetics. They allow researchers to order mutants in any gene tagged in the collection by simply emailing a request. The goal of this project was to create a collection of sequence indexed insertional mutants (T-DNA lines) for the model grass Brachypodium distachyon in order to facilitate research by the scientific community. During the course of this grant we created a collection of 23,649 B. distachyon T-DNA lines and identified 26,112 unique insertion sites.
The collection can be queried through the project website (http://jgi.doe.gov/our-science/science-programs/plant-genomics/brachypodium/brachypodium-t-dna-collection/) and through the Phytozome genome browser (http://phytozome.jgi.doe.gov/pz/portal.html). The collection has been heavily utilized by the research community and, as of October 23, 2015, 223 orders for 12,069 seed packets have been filled. In addition to creating this resource, we also optimized methods for transformation and sequencing DNA flanking insertion sites.
Mobile elements reveal small population size in the ancient ancestors of Homo sapiens.
Huff, Chad D; Xing, Jinchuan; Rogers, Alan R; Witherspoon, David; Jorde, Lynn B
2010-02-02
The genealogies of different genetic loci vary in depth. The deeper the genealogy, the greater the chance that it will include a rare event, such as the insertion of a mobile element. Therefore, the genealogy of a region that contains a mobile element is on average older than that of the rest of the genome. In a simple demographic model, the expected time to most recent common ancestor (TMRCA) is doubled if a rare insertion is present. We test this expectation by examining single nucleotide polymorphisms around polymorphic Alu insertions from two completely sequenced human genomes. The estimated TMRCA for regions containing a polymorphic insertion is two times larger than the genomic average (P < 10^-30), as predicted. Because genealogies that contain polymorphic mobile elements are old, they are shaped largely by the forces of ancient population history and are insensitive to recent demographic events, such as bottlenecks and expansions. Remarkably, the information in just two human DNA sequences provides substantial information about ancient human population size. By comparing the likelihood of various demographic models, we estimate that the effective population size of human ancestors living before 1.2 million years ago was 18,500, and we can reject all models where the ancient effective population size was larger than 26,000. This result implies an unusually small population for a species spread across the entire Old World, particularly in light of the effective population sizes of chimpanzees (21,000) and gorillas (25,000), which each inhabit only one part of a single continent.
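The doubling of the expected TMRCA mentioned in the abstract above can be illustrated with a small simulation. This is a minimal sketch and not the authors' method. It assumes a pair of lineages, a constant population size (so the coalescence time T is exponentially distributed), and a rare insertion that lands on a genealogy with probability proportional to its branch length. The function name and parameters are hypothetical.

```python
import random


def tmrca_ratio_with_insertion(n_draws=200_000, seed=0):
    """Estimate E[T | rare insertion present] / E[T] for a pair of lineages."""
    rng = random.Random(seed)
    # Pairwise coalescence times under constant population size, scaled so
    # that the unconditional mean is 1 (an assumption of this sketch).
    times = [rng.expovariate(1.0) for _ in range(n_draws)]
    # A rare insertion hits a genealogy with probability proportional to its
    # total branch length (2 * T), so each genealogy is weighted by T.
    mean_given_insertion = sum(t * t for t in times) / sum(times)
    unconditional_mean = sum(times) / n_draws
    return mean_given_insertion / unconditional_mean


print(tmrca_ratio_with_insertion())
```

Analytically, weighting an exponential variable by its own length gives E[T^2]/E[T] = 2/1, so the printed ratio should be close to 2. More complex demographic histories would change this factor.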
Lesmana, Harry; Dyer, Lisa; Li, Xia; Denton, James; Griffiths, Jenna; Chonat, Satheesh; Seu, Katie G; Heeney, Matthew M; Zhang, Kejian; Hopkin, Robert J; Kalfa, Theodosia A
2018-03-01
Pyruvate kinase deficiency (PKD) is the most frequent red blood cell enzyme abnormality of the glycolytic pathway and the most common cause of hereditary nonspherocytic hemolytic anemia. Over 250 PKLR-gene mutations have been described, including missense/nonsense, splicing and regulatory mutations, small insertions, and small and gross deletions, causing PKD and hemolytic anemia of variable severity. Alu retrotransposons are the most abundant mobile DNA sequences in the human genome, contributing to almost 11% of its mass. Alu insertions have been associated with a number of human diseases by disrupting either a coding region or a splice signal. Here, we report on two unrelated Middle Eastern patients, both born to consanguineous parents, with transfusion-dependent hemolytic anemia, in whom sequence analysis revealed a homozygous insertion of AluYb9 within exon 6 of the PKLR gene, causing a precipitous decrease in PKLR RNA levels. This Alu element insertion constitutes a previously unrecognized mechanism underlying the pathogenesis of PKD. © 2017 Wiley Periodicals, Inc.
Going, going, gone: predicting the fate of genomic insertions in plant RNA viruses.
Willemsen, Anouk; Carrasco, José L; Elena, Santiago F; Zwart, Mark P
2018-05-10
Horizontal gene transfer is common among viruses, while they also have highly compact genomes and tend to lose artificial genomic insertions rapidly. Understanding the stability of genomic insertions in viral genomes is therefore relevant for explaining and predicting their evolutionary patterns. Here, we revisit a large body of experimental research on a plant RNA virus, tobacco etch potyvirus (TEV), to identify the patterns underlying the stability of a range of homologous and heterologous insertions in the viral genome. We obtained a wide range of estimates for the recombination rate (the rate at which deletions removing the insertion occur), and these appeared to be independent of the type of insertion and its location. Of the factors we considered, recombination rate was the best predictor of insertion stability, although we could not identify the specific sequence characteristics that would help predict insertion instability. We also considered experimentally the possibility that functional insertions lead to higher mutational robustness through increased redundancy. However, our observations suggest that both functional and non-functional increases in genome size decreased the mutational robustness. Our results therefore demonstrate the importance of recombination rates for predicting the long-term stability and evolution of viral RNA genomes and suggest that there are unexpected drawbacks to increases in genome size for mutational robustness.
Johnson, Stephen M.; Eltahla, Auda A.; Aloi, Maria; Aloia, Amanda L.; McDevitt, Christopher A.; Bull, Rowena A.
2017-01-01
ABSTRACT Dengue virus (DENV) is a major global pathogen that causes significant morbidity and mortality in tropical and subtropical areas worldwide. An improved understanding of the regions within the DENV genome and its encoded proteins that are required for the virus replication cycle will expedite the development of urgently required therapeutics and vaccines. We subjected an infectious DENV genome to unbiased insertional mutagenesis and used next-generation sequencing to identify sites that tolerate 15-nucleotide insertions during the virus replication cycle in hepatic cell culture. This revealed that the regions within capsid, NS1, and the 3′ untranslated region were the most tolerant of insertions. In contrast, prM- and NS2A-encoding regions were largely intolerant of insertions. Notably, the multifunctional NS1 protein readily tolerated insertions in regions within the Wing, connector, and β-ladder domains with minimal effects on viral RNA replication and infectious virus production. Using this information, we generated infectious reporter viruses, including a variant encoding the APEX2 electron microscopy tag in NS1 that uniquely enabled high-resolution imaging of its localization to the surface and interior of viral replication vesicles. In addition, we generated a tagged virus bearing an mScarlet fluorescent protein insertion in NS1 that, despite an impact on fitness, enabled live cell imaging of NS1 localization and traffic in infected cells. Overall, this genome-wide profile of DENV genome flexibility may be further dissected and exploited in reporter virus generation and antiviral strategies. IMPORTANCE Regions of genetic flexibility in viral genomes can be exploited in the generation of reporter virus tools and should arguably be avoided in antiviral drug and vaccine design. Here, we subjected the DENV genome to high-throughput insertional mutagenesis to identify regions of genetic flexibility and enable tagged reporter virus generation. 
In particular, the viral NS1 protein displayed remarkable tolerance of small insertions. This genetic flexibility enabled generation of several novel NS1-tagged reporter viruses, including an APEX2-tagged virus that we used in high-resolution imaging of NS1 localization in infected cells by electron microscopy. For the first time, this analysis revealed the localization of NS1 within viral replication factories known as “vesicle packets” (VPs), in addition to its acknowledged localization to the luminal surface of these VPs. Together, this genetic profile of DENV may be further refined and exploited in the identification of antiviral targets and the generation of reporter virus tools. PMID:28956770
Ferlaino, Michael; Rogers, Mark F.; Shihab, Hashem A.; Mort, Matthew; Cooper, David N.; Gaunt, Tom R.; Campbell, Colin
2018-01-01
Background: Small insertions and deletions (indels) have a significant influence in human disease and, in terms of frequency, they are second only to single nucleotide variants as pathogenic mutations. As the majority of mutations associated with complex traits are located outside the exome, it is crucial to investigate the potential pathogenic impact of indels in non-coding regions of the human genome. Results: We present FATHMM-indel, an integrative approach to predict the functional effect, pathogenic or neutral, of indels in non-coding regions of the human genome. Our method exploits various genomic annotations in addition to sequence data. When validated on benchmark data, FATHMM-indel significantly outperforms CADD and GAVIN, state of the art models in assessing the pathogenic impact of non-coding variants. FATHMM-indel is available via a web server at indels.biocompute.org.uk. Conclusions: FATHMM-indel can accurately predict the functional impact and prioritise small indels throughout the whole non-coding genome. PMID:28985712
Ferlaino, Michael; Rogers, Mark F; Shihab, Hashem A; Mort, Matthew; Cooper, David N; Gaunt, Tom R; Campbell, Colin
2017-10-06
Small insertions and deletions (indels) have a significant influence in human disease and, in terms of frequency, they are second only to single nucleotide variants as pathogenic mutations. As the majority of mutations associated with complex traits are located outside the exome, it is crucial to investigate the potential pathogenic impact of indels in non-coding regions of the human genome. We present FATHMM-indel, an integrative approach to predict the functional effect, pathogenic or neutral, of indels in non-coding regions of the human genome. Our method exploits various genomic annotations in addition to sequence data. When validated on benchmark data, FATHMM-indel significantly outperforms CADD and GAVIN, state of the art models in assessing the pathogenic impact of non-coding variants. FATHMM-indel is available via a web server at indels.biocompute.org.uk. FATHMM-indel can accurately predict the functional impact and prioritise small indels throughout the whole non-coding genome.
Natural mutagenesis of human genomes by endogenous retrotransposons.
Iskow, Rebecca C; McCabe, Michael T; Mills, Ryan E; Torene, Spencer; Pittard, W Stephen; Neuwald, Andrew F; Van Meir, Erwin G; Vertino, Paula M; Devine, Scott E
2010-06-25
Two abundant classes of mobile elements, namely Alu and L1 elements, continue to generate new retrotransposon insertions in human genomes. Estimates suggest that these elements have generated millions of new germline insertions in individual human genomes worldwide. Unfortunately, current technologies are not capable of detecting most of these young insertions, and the true extent of germline mutagenesis by endogenous human retrotransposons has been difficult to examine. Here, we describe technologies for detecting these young retrotransposon insertions and demonstrate that such insertions indeed are abundant in human populations. We also found that new somatic L1 insertions occur at high frequencies in human lung cancer genomes. Genome-wide analysis suggests that altered DNA methylation may be responsible for the high levels of L1 mobilization observed in these tumors. Our data indicate that transposon-mediated mutagenesis is extensive in human genomes and is likely to have a major impact on human biology and diseases.
Iida, Takayuki; Itakura, Manabu; Anda, Mizue; Sugawara, Masayuki; Isawa, Tsuyoshi; Okubo, Takashi; Sato, Shusei; Chiba-Kakizaki, Kaori
2015-01-01
Extra-slow-growing bradyrhizobia from root nodules of field-grown soybeans harbor abundant insertion sequences (ISs) and are termed highly reiterated sequence-possessing (HRS) strains. We analyzed the genome organization of HRS strains with the focus on IS distribution and symbiosis island structure. Using pulsed-field gel electrophoresis, we consistently detected several plasmids (0.07 to 0.4 Mb) in the HRS strains (NK5, NK6, USDA135, 2281, USDA123, and T2), whereas no plasmids were detected in the non-HRS strain USDA110. The chromosomes of the six HRS strains (9.7 to 10.7 Mb) were larger than that of USDA110 (9.1 Mb). Using MiSeq sequences of 6 HRS and 17 non-HRS strains mapped to the USDA110 genome, we found that the copy numbers of ISRj1, ISRj2, ISFK1, IS1632, ISB27, ISBj8, and IS1631 were markedly higher in HRS strains. Whole-genome sequencing showed that the HRS strain NK6 had four small plasmids (136 to 212 kb) and a large chromosome (9,780 kb). Strong colinearity was found between 7.4-Mb core regions of the NK6 and USDA110 chromosomes. USDA110 symbiosis islands corresponded mainly to five small regions (S1 to S5) within two variable regions, V1 (0.8 Mb) and V2 (1.6 Mb), of the NK6 chromosome. The USDA110 nif gene cluster (nifDKENXSBZHQW-fixBCX) was split into two regions, S2 and S3, where ISRj1-mediated rearrangement occurred between nifS and nifB. ISs were also scattered in NK6 core regions, and ISRj1 insertion often disrupted some genes important for survival and environmental responses. These results suggest that HRS strains of soybean bradyrhizobia were subjected to IS-mediated symbiosis island shuffling and core genome degradation. PMID:25862225
A universal method for automated gene mapping
Zipperlen, Peder; Nairz, Knud; Rimann, Ivo; Basler, Konrad; Hafen, Ernst; Hengartner, Michael; Hajnal, Alex
2005-01-01
Small insertions or deletions (InDels) constitute a ubiquitous class of sequence polymorphisms found in eukaryotic genomes. Here, we present an automated high-throughput genotyping method that relies on the detection of fragment-length polymorphisms (FLPs) caused by InDels. The protocol utilizes standard sequencers and genotyping software. We have established genome-wide FLP maps for both Caenorhabditis elegans and Drosophila melanogaster that facilitate genetic mapping with a minimum of manual input and at comparatively low cost. PMID:15693948
USDA-ARS's Scientific Manuscript database
Premise of the study: Develop microsatellites from Fothergilla ×intermedia to establish loci capable of distinguishing species and cultivars, and assess genetic diversity for use by ornamental breeders, and for transfer within Hamamelidaceae. Methods and Results: A small insert genomic library enric...
Miura, Naoki; Kucho, Ken-Ichi; Noguchi, Michiko; Miyoshi, Noriaki; Uchiumi, Toshiki; Kawaguchi, Hiroaki; Tanimoto, Akihide
2014-01-01
The microminipig, which weighs less than 10 kg at an early stage of maturity, has been reported as a potential experimental model animal. Its extremely small size and other distinct characteristics suggest the possibility of a number of differences between the genome of the microminipig and that of conventional pigs. In this study, we analyzed the genomes of two healthy microminipigs using the SOLiD™ next-generation sequencing system. We then compared the obtained genomic sequences with a genomic database for the domestic pig (Sus scrofa). The mapping coverage of sequenced tags from the microminipig to conventional pig genomic sequences was greater than 96%, and we detected no clear, substantial genomic variance from these data. The results may indicate that the distinct characteristics of the microminipig derive from small-scale alterations in the genome, such as single nucleotide polymorphisms or translational modifications, rather than large-scale deletion or insertion polymorphisms. Further investigation of the entire genomic sequence of the microminipig with methods enabling deeper coverage is required to elucidate the genetic basis of its distinct phenotypic traits. Copyright © 2014 International Institute of Anticancer Research (Dr. John G. Delinassios), All rights reserved.
Frahry, Matthew Blake; Sun, Cheng; Chong, Rebecca A; Mueller, Rachel Lockridge
2015-02-01
Across the tree of life, species vary dramatically in nuclear genome size. Mutations that add or remove sequences from genomes (insertions or deletions, or indels) are the ultimate source of this variation. Differences in the tempo and mode of insertion and deletion across taxa have been proposed to contribute to evolutionary diversity in genome size. Among vertebrates, most of the largest genomes are found within the salamanders, an amphibian clade with genome sizes ranging from ~14 to ~120 Gb. Salamander genomes have been shown to experience slower rates of DNA loss through small (i.e., <30 bp) deletions than do other vertebrate genomes. However, no studies have addressed DNA loss from salamander genomes resulting from larger deletions. Here, we focus on one type of large deletion: ectopic-recombination-mediated removal of LTR retrotransposon sequences. In ectopic recombination, double-strand breaks are repaired using a "wrong" (i.e., ectopic, or non-allelic) template sequence, typically another locus of similar sequence. When breaks occur within the LTR portions of LTR retrotransposons, ectopic-recombination-mediated repair can produce deletions that remove the internal transposon sequence and the equivalent of one of the two LTR sequences. These deletions leave a signature in the genome: a solo LTR sequence. We compared levels of solo LTRs in the genomes of four salamander species with levels present in five vertebrates with smaller genomes. Our results demonstrate that salamanders have low levels of solo LTRs, suggesting that ectopic-recombination-mediated deletion of LTR retrotransposons occurs more slowly than in other vertebrates with smaller genomes.
Scanning the human genome at kilobase resolution.
Chen, Jun; Kim, Yeong C; Jung, Yong-Chul; Xuan, Zhenyu; Dworkin, Geoff; Zhang, Yanming; Zhang, Michael Q; Wang, San Ming
2008-05-01
Normal genome variation and pathogenic genome alteration frequently affect small regions in the genome. Identifying those genomic changes remains a technical challenge. We report here the development of the DGS (Ditag Genome Scanning) technique for high-resolution analysis of genome structure. The basic features of DGS include (1) use of frequently cutting restriction enzymes to fractionate the genome into small fragments; (2) collection of two tags from the two ends of a given DNA fragment to form a ditag that represents the fragment; (3) application of the 454 sequencing system to obtain a comprehensive ditag sequence collection; (4) determination of the genome origin of ditags by mapping to reference ditags from known genome sequences; (5) use of ditag sequences directly as the sense and antisense PCR primers to amplify the original DNA fragment. To study the relationship between ditags and genome structure, we performed a computational study using the human genome reference sequences as a model, and analyzed the ditags experimentally collected from the well-characterized normal human DNA GM15510 and the leukemic human DNA of Kasumi-1 cells. Our studies show that DGS provides kilobase resolution for studying genome structure with high specificity and high genome coverage. DGS can be applied to validate genome assembly, to compare genome similarity and variation in normal populations, and to identify genomic abnormalities including insertion, inversion, deletion, translocation, and amplification in pathological genomes such as cancer genomes.
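Steps (1), (2) and (4) of the DGS description can be sketched in silico. The following is a minimal illustration only, not the authors' pipeline: the function names, the toy reference sequence, the recognition site `GATC`, the 4-base tag length, and the choice to cut at the start of each site are all assumptions made for the example.

```python
import re

def digest(sequence, site):
    """Split `sequence` at every occurrence of `site`.

    Assumption: the cut is placed at the start of the site; real enzymes
    cut at enzyme-specific offsets within or near the site.
    Returns (start_position, fragment) pairs.
    """
    cuts = [m.start() for m in re.finditer(f"(?={re.escape(site)})", sequence)]
    bounds = [0] + cuts + [len(sequence)]
    return [(s, sequence[s:e]) for s, e in zip(bounds, bounds[1:]) if e > s]

def ditag(fragment, tag_len):
    """Join the first and last `tag_len` bases; None if the fragment is too short."""
    if len(fragment) < 2 * tag_len:
        return None
    return fragment[:tag_len] + fragment[-tag_len:]

def reference_ditags(sequence, site, tag_len):
    """Index reference ditags by sequence -> list of (start, fragment_length)."""
    index = {}
    for start, fragment in digest(sequence, site):
        tag = ditag(fragment, tag_len)
        if tag is not None:
            index.setdefault(tag, []).append((start, len(fragment)))
    return index

# Toy reference (hypothetical) containing two GATC sites.
reference = "AAAACCCCGATCTTTTGGGGAAAAGATCCCCCTTTTGGGG"
index = reference_ditags(reference, "GATC", tag_len=4)

# An observed ditag is located by lookup; a ditag absent from the index
# would flag a fragment whose ends do not match the reference.
observed_hit = index.get("GATCAAAA")
observed_miss = index.get("GATCTTTT")
```

In this sketch a ditag that maps to more than one reference fragment simply returns several locations, so ambiguity is visible to the caller rather than silently resolved.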
2010-10-14
High-Resolution Functional Mapping of the Venezuelan Equine Encephalitis Virus Genome by Insertional Mutagenesis and Massively Parallel Sequencing...Venezuelan equine encephalitis virus (VEEV) genome. We initially used a capillary electrophoresis method to gain insight into the role of the VEEV...Smith JM, Schmaljohn CS (2010) High-Resolution Functional Mapping of the Venezuelan Equine Encephalitis Virus Genome by Insertional Mutagenesis and
Templated sequence insertion polymorphisms in the human genome
NASA Astrophysics Data System (ADS)
Onozawa, Masahiro; Aplan, Peter
2016-11-01
Templated Sequence Insertion Polymorphism (TSIP) is a recently described form of polymorphism recognized in the human genome, in which a sequence that is templated from a distant genomic region is inserted into the genome, seemingly at random. TSIPs can be grouped into two classes based on nucleotide sequence features at the insertion junctions; Class 1 TSIPs show features of insertions that are mediated via the LINE-1 ORF2 protein, including 1) target-site duplication (TSD), 2) polyadenylation 10-30 nucleotides downstream of a “cryptic” polyadenylation signal, and 3) preference for insertion at a 5’-TTTT/A-3’ sequence. In contrast, class 2 TSIPs show features consistent with repair of a DNA double-strand break via insertion of a DNA “patch” that is derived from a distant genomic region. Survey of a large number of normal human volunteers demonstrates that most individuals have 25-30 TSIPs, and that these TSIPs track with specific geographic regions. Similar to other forms of human polymorphism, we suspect that these TSIPs may be important for the generation of human diversity and genetic diseases.
Quantifying the Number of Independent Organelle DNA Insertions in Genome Evolution and Human Health
Martin, William F.
2017-01-01
Fragments of organelle genomes are often found as insertions in nuclear DNA. These fragments of mitochondrial DNA (numts) and plastid DNA (nupts) are ubiquitous components of eukaryotic genomes. They are, however, often edited out during the genome assembly process, leading to systematic underestimation of their frequency. Numts and nupts, once inserted, can become further fragmented through subsequent insertion of mobile elements or other recombinational events that disrupt the continuity of the inserted sequence relative to the genuine organelle DNA copy. Because numts and nupts are typically identified through sequence comparison tools such as BLAST, disruption of insertions into smaller fragments can lead to systematic overestimation of numt and nupt frequencies. Accurate identification of numts and nupts is important, however, both for better understanding of their role during evolution, and for monitoring their increasingly evident role in human disease. Human populations are polymorphic for 141 numt loci, five numts are causal to genetic disease, and cancer genomic studies are revealing an abundance of numts associated with tumor progression. Here, we report investigation of salient parameters involved in obtaining accurate estimates of numt and nupt numbers in genome sequence data. Numts and nupts from 44 sequenced eukaryotic genomes reveal lineage-specific differences in the number, relative age and frequency of insertional events as well as lineage-specific dynamics of their postinsertional fragmentation. Our findings outline the main technical parameters influencing accurate identification and frequency estimation of numts in genomic studies pertinent to both evolution and human health. PMID:28444372
USDA-ARS's Scientific Manuscript database
Simple sequence repeats (SSR) markers were developed from a small insert genomic library for Bipolaris sorokiniana, a mitosporic fungal pathogen that causes spot blotch and root rot in switchgrass. About 59% of sequenced clones (n=384) harbored various SSR motifs. After eliminating the redundant seq...
Whole-genome analysis of a patient with early-stage small-cell lung cancer.
Han, J-Y; Lee, Y-S; Kim, B C; Lee, G K; Lee, S; Kim, E-H; Kim, H-M; Bhak, J
2014-12-01
We performed whole-genome sequencing (WGS) of a case of early-stage small-cell lung cancer (SCLC) to analyze its genomic features. WGS revealed numerous single-nucleotide variations (SNVs), small insertions/deletions and chromosomal abnormalities. Chromosomes 4p, 5q, 13q, 15q, 17p and 22q contained many block deletions. Notably, copy loss was observed in the tumor suppressor genes RB1 and TP53, and copy gain in the oncogene hTERT. Somatic mutations were found in TP53 and CREBBP. Novel nonsynonymous (ns) SNVs in the C6ORF103 and SLC5A4 genes were also found. Sanger sequencing of the SLC5A4 gene in 23 independent SCLC samples showed another nsSNV in the SLC5A4 gene, indicating that nsSNVs in the SLC5A4 gene are recurrent in SCLC. WGS of an early-stage SCLC identified novel recurrent mutations and validated known variations, including copy number variations. These findings provide insight into the genomic landscape contributing to SCLC development.
Guschinskaya, Natalia; Brunel, Romain; Tourte, Maxime; Lipscomb, Gina L; Adams, Michael W W; Oger, Philippe; Charpentier, Xavier
2016-11-08
Transposition mutagenesis is a powerful tool to identify the function of genes, reveal essential genes and generally to unravel the genetic basis of living organisms. However, transposon-mediated mutagenesis has only been successfully applied to a limited number of archaeal species and has never been reported in Thermococcales. Here, we report random insertion mutagenesis in the hyperthermophilic archaeon Pyrococcus furiosus. The strategy takes advantage of the natural transformability of derivatives of the P. furiosus COM1 strain and of in vitro Mariner-based transposition. A transposon bearing a genetic marker is randomly transposed in vitro into genomic DNA that is then used for natural transformation of P. furiosus. A small-scale transposition reaction routinely generates several hundred and up to two thousand transformants. Southern analysis and sequencing showed that the obtained mutants contain a single and random genomic insertion. Polyploidy has been reported in Thermococcales and P. furiosus is suspected of being polyploid. Yet, about half of the mutants obtained on the first selection are homozygous for the transposon insertion. Two rounds of isolation on selective medium were sufficient to obtain gene conversion in initially heterozygous mutants. This transposition mutagenesis strategy will greatly facilitate functional exploration of the Thermococcales genomes.
Large Genomic Fragment Deletions and Insertions in Mouse Using CRISPR/Cas9
Satheka, Achim Cchitvsanzwhoh; Togo, Jacques; An, Yao; Humphrey, Mabwi; Ban, Luying; Ji, Yan; Jin, Honghong; Feng, Xuechao; Zheng, Yaowu
2015-01-01
ZFNs, TALENs and the CRISPR/Cas9 system have been used to generate point mutations and large fragment deletions and insertions in genomic modifications. The CRISPR/Cas9 system is the most flexible and fastest-developing technology and has been used extensively to make mutations in all kinds of organisms. However, most mutations reported to date are small insertions and deletions. In this report, the CRISPR/Cas9 system was used to make large DNA fragment deletions and insertions in mouse, including deletion of the entire Dip2a gene, about 65kb in size, and insertion of a β-galactosidase (lacZ) reporter gene larger than 5kb. About 11.8% (11/93) of transfected and diluted ES clones were positive for the 65kb deletion. High targeting efficiencies in ES cells were also achieved with G418 selection, 46.2% (12/26) and 73.1% (19/26) for the left and right arms respectively. Targeted large fragment deletion efficiency is about 21.4% of live pups or 6.0% of injected embryos. Targeted insertion of the lacZ reporter with a NEO cassette showed a targeting rate of 27.1% (13/48) by ES cell transfection and 11.1% (2/18) by direct zygote injection. The procedures bypass in vitro transcription by direct co-injection of zygotes or co-transfection of embryonic stem cells with circular plasmid DNA. The methods are technically easy, time saving, and cost effective in generating mouse models and will certainly facilitate gene function studies. PMID:25803037
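The efficiencies quoted with explicit counts can be recomputed directly. The short check below uses only the counts given in the abstract; the dictionary labels and the `percent` helper are names introduced here for illustration. The 21.4% and 6.0% figures are omitted because the abstract does not give their underlying counts.

```python
# (positives, total, percentage as reported in the abstract)
reported = {
    "65 kb deletion, ES clones": (11, 93, 11.8),
    "G418 selection, left arm": (12, 26, 46.2),
    "G418 selection, right arm": (19, 26, 73.1),
    "lacZ insertion, ES cell transfection": (13, 48, 27.1),
    "lacZ insertion, zygote injection": (2, 18, 11.1),
}

def percent(positives, total):
    """Efficiency as a percentage rounded to one decimal place."""
    return round(100 * positives / total, 1)

# Compare each recomputed value with the reported one.
recomputed = {label: percent(p, n) for label, (p, n, _) in reported.items()}
consistent = all(recomputed[label] == r for label, (_, _, r) in reported.items())
```

Under one-decimal rounding, each recomputed percentage agrees with the value stated alongside its counts.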
O'Brien, Heath E; Gong, Yunchen; Fung, Pauline; Wang, Pauline W; Guttman, David S
2011-01-01
Next-generation genomic technology has both greatly accelerated the pace of genome research and increased our reliance on draft genome sequences. While groups such as the Genomics Standards Consortium have made strong efforts to promote genome standards, there is still a general lack of uniformity among published draft genomes, leading to challenges for downstream comparative analyses. This lack of uniformity is a particular problem when using standard draft genomes that frequently have large numbers of low-quality sequencing tracts. Here we present a proposal for an "enhanced-quality draft" genome that identifies at least 95% of the coding sequences, thereby effectively providing a full accounting of the genic component of the genome. Enhanced-quality draft genomes are easily attainable through a combination of small- and large-insert next-generation, paired-end sequencing. We illustrate the generation of an enhanced-quality draft genome by re-sequencing the plant pathogenic bacterium Pseudomonas syringae pv. phaseolicola 1448A (Pph 1448A), which has a published, closed genome sequence of 5.93 Mbp. We use a combination of Illumina paired-end and mate-pair sequencing, and surprisingly find that de novo assemblies with 100x paired-end coverage and mate-pair sequencing with as low as 2-5x coverage are substantially better than assemblies based on higher coverage. The rapid and low-cost generation of large numbers of enhanced-quality draft genome sequences will be of particular value for microbial diagnostics and biosecurity, which rely on precise discrimination of potentially dangerous clones from closely related benign strains.
Quantifying the Number of Independent Organelle DNA Insertions in Genome Evolution and Human Health.
Hazkani-Covo, Einat; Martin, William F
2017-05-01
Fragments of organelle genomes are often found as insertions in nuclear DNA. These fragments of mitochondrial DNA (numts) and plastid DNA (nupts) are ubiquitous components of eukaryotic genomes. They are, however, often edited out during the genome assembly process, leading to systematic underestimation of their frequency. Numts and nupts, once inserted, can become further fragmented through subsequent insertion of mobile elements or other recombinational events that disrupt the continuity of the inserted sequence relative to the genuine organelle DNA copy. Because numts and nupts are typically identified through sequence comparison tools such as BLAST, disruption of insertions into smaller fragments can lead to systematic overestimation of numt and nupt frequencies. Accurate identification of numts and nupts is important, however, both for better understanding of their role during evolution, and for monitoring their increasingly evident role in human disease. Human populations are polymorphic for 141 numt loci, five numts are causal to genetic disease, and cancer genomic studies are revealing an abundance of numts associated with tumor progression. Here, we report investigation of salient parameters involved in obtaining accurate estimates of numt and nupt numbers in genome sequence data. Numts and nupts from 44 sequenced eukaryotic genomes reveal lineage-specific differences in the number, relative age and frequency of insertional events as well as lineage-specific dynamics of their postinsertional fragmentation. Our findings outline the main technical parameters influencing accurate identification and frequency estimation of numts in genomic studies pertinent to both evolution and human health. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Wheeler, Bayly S
2013-12-01
Transposons are mobile genetic elements that are a major constituent of most genomes. Organisms regulate transposable element expression, transposition, and insertion site preference, mitigating the genome instability caused by uncontrolled transposition. A recent burst of research has demonstrated the critical role of small non-coding RNAs in regulating transposition in fungi, plants, and animals. While mechanistically distinct, these pathways work through a conserved paradigm. The presence of a transposon is communicated by the presence of its RNA or by its integration into specific genomic loci. These signals are then translated into small non-coding RNAs that guide epigenetic modifications and gene silencing back to the transposon. In addition to being regulated by the host, transposable elements are themselves capable of influencing host gene expression. Transposon expression is responsive to environmental signals, and many transposons are activated by various cellular stresses. TEs can confer local gene regulation by acting as enhancers and can also confer global gene regulation through their non-coding RNAs. Thus, transposable elements can act as stress-responsive regulators that control host gene expression in cis and trans.
Cas9-Guide RNA Directed Genome Editing in Soybean
Li, Zhongsen; Liu, Zhan-Bin; Xing, Aiqiu; Moon, Bryan P.; Koellhoffer, Jessica P.; Huang, Lingxia; Ward, R. Timothy; Clifton, Elizabeth; Falco, S. Carl; Cigan, A. Mark
2015-01-01
The recently discovered bacterial and archaeal adaptive immune system, consisting of clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) endonuclease, has been explored for targeted genome editing in different species. Streptococcus pyogenes Cas9-guide RNA (gRNA) was successfully applied to achieve targeted mutagenesis, gene integration, and gene editing in soybean (Glycine max). Two genomic sites, DD20 and DD43 on chromosome 4, were mutagenized with frequencies of 59% and 76%, respectively. Sequencing randomly selected transgenic events confirmed that the genome modifications were specific to the Cas9-gRNA cleavage sites and consisted of small deletions or insertions. Targeted gene integrations through homology-directed recombination were detected by border-specific polymerase chain reaction analysis for both sites at the callus stage, and one DD43 homology-directed recombination event was transmitted to the T1 generation. T1 progenies of the integration event segregated according to Mendelian laws, and clean homozygous T1 plants with the donor gene precisely inserted at the DD43 target site were obtained. The Cas9-gRNA system was also successfully applied to make a directed P178S mutation of the acetolactate synthase1 gene through in planta gene editing. PMID:26294043
Prats, A C; Sarih, L; Gabus, C; Litvak, S; Keith, G; Darlix, J L
1988-01-01
Retrovirus virions carry a diploid genome associated with a large number of small viral finger protein molecules which are required for encapsidation. Our present results show that finger protein p12 of Rous sarcoma virus (RSV) and p10 of murine leukaemia virus (MuLV) position replication primer tRNA on the replication initiation site (PBS) at the 5' end of the RNA genome. An RSV mutant with a Val-Pro insertion in the finger motif of p12 is able to partially encapsidate genomic RNA but is not infectious because mutated p12 is incapable of positioning the replication primer, tRNATrp. Since all known replication competent retroviruses, and the plant virus CaMV, code for finger proteins analogous to RSV p12 or MuLV p10, the initial stage of reverse transcription in avian, mammalian and human retroviruses and in CaMV is probably controlled in an analogous way. PMID:2458920
2009-10-05
to be located within a small plasmid [11]. The genomic sequence data for the Eklund 17B strain verified the presence of bont/np b within a unique...three BoNT/A1 strains (ATCC 3502, ATCC 19397, Hall) revealed that these strains are nearly identical in genomic organization (data not shown). The
Onozawa, Masahiro; Zhang, Zhenhua; Kim, Yoo Jung; Goldberg, Liat; Varga, Tamas; Bergsagel, P Leif; Kuehl, W Michael; Aplan, Peter D
2014-05-27
We used the I-SceI endonuclease to produce DNA double-strand breaks (DSBs) and observed that a fraction of these DSBs were repaired by insertion of sequences, which we termed "templated sequence insertions" (TSIs), derived from distant regions of the genome. These TSIs were derived from genic, retrotransposon, or telomere sequences and were not deleted from the donor site in the genome, leading to the hypothesis that they were derived from reverse-transcribed RNA. Cotransfection of RNA and an I-SceI expression vector demonstrated insertion of RNA-derived sequences at the DNA-DSB site, and TSIs were suppressed by reverse-transcriptase inhibitors. Both observations support the hypothesis that TSIs were derived from RNA templates. In addition, similar insertions were detected at sites of DNA DSBs induced by transcription activator-like effector nuclease proteins. Whole-genome sequencing of myeloma cell lines revealed additional TSIs, demonstrating that repair of DNA DSBs via insertion was not restricted to experimentally produced DNA DSBs. Analysis of publicly available databases revealed that many of these TSIs are polymorphic in the human genome. Taken together, these results indicate that insertional events should be considered as alternatives to gross chromosomal rearrangements in the interpretation of whole-genome sequence data and that this mutagenic form of DNA repair may play a role in genetic disease, exon shuffling, and mammalian evolution.
Pavlícek, Adam; Paces, Jan; Elleder, Daniel; Hejnar, Jirí
2002-03-01
We report here the presence of numerous processed pseudogenes derived from the W family of endogenous retroviruses in the human genome. These pseudogenes are structurally colinear with the retroviral mRNA followed by a poly(A) tail. Our analysis of insertion sites of HERV-W processed pseudogenes shows a strong preference for the insertion motif of long interspersed nuclear element (LINE) retrotransposons. The genomic distribution, stability during evolution, and frequent truncations at the 5' end resemble those of the pseudogenes generated by LINEs. We therefore suggest that HERV-W processed pseudogenes arose by multiple and independent LINE-mediated retrotransposition of retroviral mRNA. These data document that the majority of HERV-W copies are actually nontranscribed promoterless pseudogenes. The current search for HERV-Ws associated with several human diseases should concentrate on a small subset of transcriptionally competent elements.
Liu, Ruifang; Koyanagi, Kanako O; Chen, Sunlu; Kishima, Yuji
2012-12-01
In plant genomes, the incorporation of DNA segments is not a common method of artificial gene transfer. Nevertheless, various segments of pararetroviruses have been found in plant genomes in recent decades. The rice genome contains a number of segments of endogenous rice tungro bacilliform virus-like sequences (ERTBVs), many of which are present between AT dinucleotide repeats (ATrs). Comparison of genomic sequences between two closely related rice subspecies, japonica and indica, allowed us to verify the preferential insertion of ERTBVs into ATrs. In addition to ERTBVs, the comparative analyses showed that ATrs occasionally incorporate repeat sequences including transposable elements, and a wide range of other sequences. Besides the known genomic sequences, the insertion sequences also represented DNAs of unclear origins together with ERTBVs, suggesting that ATrs have integrated episomal DNAs that would have been suspended in the nucleus. Such insertion DNAs might be trapped by ATrs in the genome in a host-dependent manner. Conversely, other simple mono- and dinucleotide sequence repeats (SSR) were less frequently involved in insertion events relative to ATrs. Therefore, ATrs could be regarded as hot spots of double-strand breaks that induce non-homologous end joining. The insertions within ATrs occasionally generated new gene-related sequences or involved structural modifications of existing genes. Likewise, in a comparison between Arabidopsis thaliana and Arabidopsis lyrata, the insertions preferred ATrs to other SSRs. Therefore ATrs in plant genomes could be considered as genomic dumping sites that have trapped various DNA molecules and may have exerted a powerful evolutionary force. © 2012 The Authors. The Plant Journal © 2012 Blackwell Publishing Ltd.
A non-canonical transferred DNA insertion at the BRI1 locus in Arabidopsis thaliana.
Zhao, Zhong; Zhu, Yan; Erhardt, Mathieu; Ruan, Ying; Shen, Wen-Hui
2009-04-01
Agrobacterium-mediated transformation is widely used in transgenic plant engineering and has proven to be a powerful tool for insertional mutagenesis of the plant genome. The transferred DNA (T-DNA) from Agrobacterium is integrated into the plant genome through illegitimate recombination between the T-DNA and the plant DNA. In contrast to canonical insertion, we report here a locus showing a complex mutation associated with T-DNA insertion at the BRI1 gene in Arabidopsis thaliana. We obtained a mutant line, named salade for its phenotype of dwarf stature and proliferating rosette. Molecular characterization of this mutant revealed that, in addition to T-DNA, a non-T-DNA-localized transposon from bacteria was inserted in the Arabidopsis genome and that a region of more than 11.5 kb of the Arabidopsis genome was deleted at the insertion site. The deleted region contains the brassinosteroid receptor gene BRI1 and the transcription factor gene WRKY13. Our finding reveals non-canonical T-DNA insertion, implicating horizontal gene transfer and urging caution in the use of T-DNA as a mutagen in transgenic research.
Recent amplification and impact of MITEs on the genome of grapevine (Vitis vinifera L.)
Benjak, Andrej; Boué, Stéphanie; Forneck, Astrid
2009-01-01
Miniature inverted-repeat transposable elements (MITEs) are a particular type of defective class II transposons present in genomes as highly homogeneous populations of small elements. Their high copy number and close association to genes make their potential impact on gene evolution particularly relevant. Here, we present a detailed analysis of the MITE families directly related to grapevine “cut-and-paste” transposons. Our results show that grapevine MITEs have transduplicated and amplified genomic sequences, including gene sequences and fragments of other mobile elements. Our results also show that although some of the MITE families were already present in the ancestor of the European and American Vitis wild species, they have been amplified and have been actively transposing accompanying grapevine domestication and breeding. We show that MITEs are abundant in grapevine and some of them are frequently inserted within the untranslated regions of grapevine genes. MITE insertions are highly polymorphic among grapevine cultivars, which frequently generate transcript variability. The data presented here show that MITEs have greatly contributed to the grapevine genetic diversity which has been used for grapevine domestication and breeding. PMID:20333179
The detection of large deletions or duplications in genomic DNA.
Armour, J A L; Barton, D E; Cockburn, D J; Taylor, G R
2002-11-01
While methods for the detection of point mutations and small insertions or deletions in genomic DNA are well established, the detection of larger (>100 bp) genomic duplications or deletions can be more difficult. Most mutation scanning methods use PCR as a first step, but the subsequent analyses are usually qualitative rather than quantitative. Gene dosage methods based on PCR need to be quantitative (i.e., they should report molar quantities of starting material) or semi-quantitative (i.e., they should report gene dosage relative to an internal standard). Without some sort of quantitation, heterozygous deletions and duplications may be overlooked and therefore be under-ascertained. Gene dosage methods provide the additional benefit of reporting allele drop-out in the PCR. This could impact on SNP surveys, where large-scale genotyping may miss null alleles. Here we review recent developments in techniques for the detection of this type of mutation and compare their relative strengths and weaknesses. We emphasize that comprehensive mutation analysis should include scanning for large insertions and deletions and duplications. Copyright 2002 Wiley-Liss, Inc.
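The semi-quantitative approach described above, reporting gene dosage relative to an internal standard, can be illustrated with a dosage quotient. The following is a minimal sketch, not a method taken from the review: the function names, the peak-area inputs and the 0.2 tolerance are all hypothetical choices made for illustration.

```python
def dosage_quotient(test_sample, ref_sample, test_control, ref_control):
    """Relative dosage of a test amplicon.

    The test signal is first normalised to an internal reference amplicon
    from the same reaction, then to the same ratio in a normal control.
    All inputs are signal intensities (e.g. peak areas); hypothetical units.
    """
    return (test_sample / ref_sample) / (test_control / ref_control)


def classify_dosage(dq, tolerance=0.2):
    """Map a dosage quotient to a copy-number call (assumed thresholds)."""
    if abs(dq - 0.5) <= tolerance:
        return "heterozygous deletion"   # one copy instead of two
    if abs(dq - 1.0) <= tolerance:
        return "normal"                  # two copies
    if abs(dq - 1.5) <= tolerance:
        return "heterozygous duplication"  # three copies
    return "ambiguous"


# Hypothetical peak areas for a patient sample and a normal control.
sample_dq = dosage_quotient(520.0, 1000.0, 1010.0, 980.0)
print(round(sample_dq, 2), classify_dosage(sample_dq))
```

A purely qualitative assay would score the sample above as "product present" and miss the halved signal, which is the under-ascertainment the abstract warns about.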
Adrian-Kalchhauser, Irene; Svensson, Ola; Kutschera, Verena E; Alm Rosenblad, Magnus; Pippel, Martin; Winkler, Sylke; Schloissnig, Siegfried; Blomberg, Anders; Burkhardt-Holm, Patricia
2017-02-16
Vertebrate mitochondrial genomes are optimized for fast replication and low cost of RNA expression. Accordingly, they are devoid of introns, are transcribed as polycistrons and contain very little intergenic sequence. Usually, vertebrate mitochondrial genomes measure between 16.5 and 17 kilobases (kb). During genome sequencing projects for two novel vertebrate models, the invasive round goby and the sand goby, we found that the sand goby mitochondrial genome is exceptionally small (16.4 kb), while the mitochondrial genome of the round goby is much larger than expected for a vertebrate. It is 19 kb in size and is thus one of the largest fish and even vertebrate mitochondrial genomes known to date. The expansion is attributable to a sequence insertion downstream of the putative transcriptional start site. This insertion carries traces of repeats from the control region, but is mostly novel. To get more information about this phenomenon, we gathered all available mitochondrial genomes of Gobiidae and of nine gobioid species, performed phylogenetic analyses, analysed gene arrangements, and compared gobiid mitochondrial genome sizes, ecological information and other species characteristics with respect to the mitochondrial phylogeny. This allowed us, among other things, to identify a unique arrangement of tRNAs among Ponto-Caspian gobies. Our results indicate that the round goby mitochondrial genome may contain novel features. Since mitochondrial genome organisation is tightly linked to energy metabolism, these features may be linked to its invasion success. Also, the unique tRNA arrangement among Ponto-Caspian gobies may be helpful in studying the evolution of this highly adaptive and invasive species group. Finally, we find that the phylogeny of gobiids can be further refined by the use of longer stretches of linked DNA sequence.
Windsor, Aaron J.; Schranz, M. Eric; Formanová, Nataša; Gebauer-Jung, Steffi; Bishop, John G.; Schnabelrauch, Domenica; Kroymann, Juergen; Mitchell-Olds, Thomas
2006-01-01
Comparative genomics provides insight into the evolutionary dynamics that shape discrete sequences as well as whole genomes. To advance comparative genomics within the Brassicaceae, we have end sequenced 23,136 medium-sized insert clones from Boechera stricta, a wild relative of Arabidopsis (Arabidopsis thaliana). A significant proportion of these sequences, 18,797, are nonredundant and display highly significant similarity (BLASTn e-value ≤ 10−30) to low copy number Arabidopsis genomic regions, including more than 9,000 annotated coding sequences. We have used this dataset to identify orthologous gene pairs in the two species and to perform a global comparison of DNA regions 5′ to annotated coding regions. On average, the 500 nucleotides upstream to coding sequences display 71.4% identity between the two species. In a similar analysis, 61.4% identity was observed between 5′ noncoding sequences of Brassica oleracea and Arabidopsis, indicating that regulatory regions are not as diverged among these lineages as previously anticipated. By mapping the B. stricta end sequences onto the Arabidopsis genome, we have identified nearly 2,000 conserved blocks of microsynteny (bracketing 26% of the Arabidopsis genome). A comparison of fully sequenced B. stricta inserts to their homologous Arabidopsis genomic regions indicates that indel polymorphisms >5 kb contribute substantially to the genome size difference observed between the two species. Further, we demonstrate that microsynteny inferred from end-sequence data can be applied to the rapid identification and cloning of genomic regions of interest from nonmodel species. These results suggest that among diploid relatives of Arabidopsis, small- to medium-scale shotgun sequencing approaches can provide rapid and cost-effective benefits to evolutionary and/or functional comparative genomic frameworks. PMID:16607030
Chromosome catastrophes involve replication mechanisms generating complex genomic rearrangements
Liu, Pengfei; Erez, Ayelet; Sreenath Nagamani, Sandesh C.; Dhar, Shweta U.; Kołodziejska, Katarzyna E.; Dharmadhikari, Avinash V.; Cooper, M. Lance; Wiszniewska, Joanna; Zhang, Feng; Withers, Marjorie A.; Bacino, Carlos A.; Campos-Acevedo, Luis Daniel; Delgado, Mauricio R.; Freedenberg, Debra; Garnica, Adolfo; Grebe, Theresa A.; Hernández-Almaguer, Dolores; Immken, LaDonna; Lalani, Seema R.; McLean, Scott D.; Northrup, Hope; Scaglia, Fernando; Strathearn, Lane; Trapane, Pamela; Kang, Sung-Hae L.; Patel, Ankita; Cheung, Sau Wai; Hastings, P. J.; Stankiewicz, Paweł; Lupski, James R.; Bi, Weimin
2011-01-01
SUMMARY Complex genomic rearrangements (CGR) consisting of two or more breakpoint junctions have been observed in genomic disorders. Recently, a chromosome catastrophe phenomenon termed chromothripsis, in which numerous genomic rearrangements are apparently acquired in one single catastrophic event, was described in multiple cancers. Here we show that constitutionally acquired CGRs share similarities with cancer chromothripsis. In the 17 CGR cases investigated we observed localization and multiple copy number changes including deletions, duplications and/or triplications, as well as extensive translocations and inversions. Genomic rearrangements involved varied in size and complexities; in one case, array comparative genomic hybridization revealed 18 copy number changes. Breakpoint sequencing identified characteristic features, including small templated insertions at breakpoints and microhomology at breakpoint junctions, which have been attributed to replicative processes. The resemblance between CGR and chromothripsis suggests similar mechanistic underpinnings. Such chromosome catastrophic events appear to reflect basic DNA metabolism operative throughout an organism’s life cycle. PMID:21925314
Bashir, Ali; Bansal, Vikas; Bafna, Vineet
2010-06-18
Massively parallel DNA sequencing technologies have enabled the sequencing of several individual human genomes. These technologies are also being used in novel ways for mRNA expression profiling, genome-wide discovery of transcription-factor binding sites, small RNA discovery, etc. The multitude of sequencing platforms, each with their unique characteristics, pose a number of design challenges, regarding the technology to be used and the depth of sequencing required for a particular sequencing application. Here we describe a number of analytical and empirical results to address design questions for two applications: detection of structural variations from paired-end sequencing and estimating mRNA transcript abundance. For structural variation, our results provide explicit trade-offs between the detection and resolution of rearrangement breakpoints, and the optimal mix of paired-read insert lengths. Specifically, we prove that optimal detection and resolution of breakpoints is achieved using a mix of exactly two insert library lengths. Furthermore, we derive explicit formulae to determine these insert length combinations, enabling a 15% improvement in breakpoint detection at the same experimental cost. On empirical short read data, these predictions show good concordance with Illumina 200 bp and 2 Kbp insert length libraries. For transcriptome sequencing, we determine the sequencing depth needed to detect rare transcripts from a small pilot study. With only 1 Million reads, we derive corrections that enable almost perfect prediction of the underlying expression probability distribution, and use this to predict the sequencing depth required to detect low expressed genes with greater than 95% probability. Together, our results form a generic framework for many design considerations related to high-throughput sequencing. 
We provide software tools (http://bix.ucsd.edu/projects/NGS-DesignTools) to derive platform-independent guidelines for designing sequencing experiments (amount of sequencing, choice of insert length, mix of libraries) for novel applications of next-generation sequencing.
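The question of how deep to sequence in order to detect a low-expressed transcript with greater than 95% probability can be illustrated with a deliberately simple model. This sketch is not the authors' derivation and does not use their software: it assumes each read is drawn independently and comes from the transcript of interest with a fixed probability p, so the chance of seeing at least one read in N reads is 1 - (1 - p)^N. The function names and the example abundance are hypothetical.

```python
import math


def detection_probability(n_reads, p):
    """P(at least one read from a transcript with relative abundance p).

    Assumes independent sampling of reads (a simplifying assumption).
    log1p/expm1 keep the calculation accurate when p is very small.
    """
    return -math.expm1(n_reads * math.log1p(-p))


def reads_needed(p, target=0.95):
    """Smallest read count whose detection probability reaches the target."""
    return math.ceil(math.log(1.0 - target) / math.log1p(-p))


# Hypothetical rare transcript: one read in a million on average.
p_rare = 1e-6
n = reads_needed(p_rare, target=0.95)
print(n, detection_probability(n, p_rare))
```

Under this model the required depth scales roughly as 3/p for a 95% target, which is why the abundance distribution estimated from a pilot run matters so much for planning.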
Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)
Andreassen, Rune; Lunner, Sigbjørn; Høyheim, Bjørn
2009-01-01
Background: Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to helping distinguish between duplicate genome regions and determine correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full length of the cDNA insert has been determined has been small. Results: High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, characterize salmon Kozak motifs, polyadenylation signal variation and to identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion: This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS).
This suggests that the remaining cDNA libraries generated by SGP represent a valuable cCDS FLIc source. The conservation of 7-mers in 3'UTRs indicates that these motifs are functionally important. Identity between some of these 7-mers and miRNA target sequences suggests that they are miRNA targets in Salmo salar transcripts as well. PMID:19878547
Schiavo, Giuseppina; Hoffmann, Orsolya Ivett; Ribani, Anisa; Utzeri, Valerio Joe; Ghionda, Marco Ciro; Bertolini, Francesca; Geraci, Claudia; Bovo, Samuele; Fontanesi, Luca
2017-10-01
Nuclear DNA sequences of mitochondrial origin (numts) are derived by insertion of mitochondrial DNA (mtDNA), into the nuclear genome. In this study, we provide, for the first time, a genome picture of numts inserted in the pig nuclear genome. The Sus scrofa reference nuclear genome (Sscrofa10.2) was aligned with circularized and consensus mtDNA sequences using LAST software. A total of 430 numt sequences that may represent 246 different numt integration events (57 numt regions determined by at least two numt sequences and 189 singletons) were identified, covering about 0.0078% of the nuclear genome. Numt integration events were correlated (0.99) to the chromosome length. The longest numt sequence (about 11 kbp) was located on SSC2. Six numts were sequenced and PCR amplified in pigs of European commercial and local pig breeds, of the Chinese Meishan breed and in European wild boars. Three of them were polymorphic for the presence or absence of the insertion. Surprisingly, the estimated age of insertion of two of the three polymorphic numts was more ancient than that of the speciation time of the Sus scrofa, supporting that these polymorphic sites were originated from interspecies admixture that contributed to shape the pig genome. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Mobile Interspersed Repeats Are Major Structural Variants in the Human Genome
Huang, Cheng Ran Lisa; Schneider, Anna M.; Lu, Yunqi; Niranjan, Tejasvi; Shen, Peilin; Robinson, Matoya A.; Steranka, Jared P.; Valle, David; Civin, Curt I.; Wang, Tao; Wheelan, Sarah J.; Ji, Hongkai; Boeke, Jef D.; Burns, Kathleen H.
2010-01-01
Summary: Characterizing structural variants in the human genome is of great importance, but a genome-wide analysis to detect interspersed repeats has not been done. Thus, the degree to which mobile DNAs contribute to genetic diversity, heritable disease, and oncogenesis remains speculative. We perform transposon insertion profiling by microarray (TIP-chip) to map human L1(Ta) retrotransposons (LINE-1s) genome-wide. This identified numerous novel human L1(Ta) insertional polymorphisms with highly variant allelic frequencies. We also explored TIP-chip's usefulness to identify candidate alleles associated with different phenotypes in clinical cohorts. Our data suggest that the occurrence of new insertions is twice as high as previously estimated, and that these repeats are under-recognized as sources of human genomic and phenotypic diversity. We have just begun to probe the universe of human L1(Ta) polymorphisms, and as TIP-chip is applied to other insertions such as Alu SINEs, it will expand the catalog of genomic variants even further. PMID:20602999
Sinha, Rahul; Goyal, Pankaj; Grapputo, Alessandro
2011-01-01
Background Insertions of spliceosomal introns are very rare events during evolution of vertebrates and the mechanisms governing creation of novel intron(s) remain obscure. Largely, gene structures of melanocortin (MC) receptors are characterized by intron-less architecture. However, recently a few exceptions have been reported in some fishes. This warrants a systematic survey of MC receptors for understanding intron insertion events during vertebrate evolution. Methodology/Principal Findings We have compiled an extended list of MC receptors from different vertebrate genomes with variations in fishes. Notably, the closely linked MC2Rs and MC5Rs from a group of ray-finned fishes have three and one intron insertion(s), respectively, with conserved positions and intron phase. In both genes, one novel insertion was in the highly conserved DRY motif at the end of helix TM3. Further, the proto-splice site MAG↑R is maintained at intron insertion sites in these two genes. However, the orthologs of these receptors from zebrafish and tetrapods are intron-less, suggesting these introns are simultaneously created in selected fishes. Surprisingly, these novel introns are traceable only in four fish genomes. We found that these fish genomes are severely compacted after the separation from zebrafish. Furthermore, we also report novel intron insertions in P2Y receptors and in CHRM3. Finally, we report ultrasmall introns in MC2R genes from selected fishes. Conclusions/Significance The current repository of MC receptors illustrates that fishes have no MC3R ortholog. MC2R, MC5R, P2Y receptors and CHRM3 have novel intron insertions only in ray-finned fishes that underwent genome compaction. These receptors share one intron at an identical position suggestive of being inserted contemporaneously. 
In addition to repetitive elements, genome compaction is now believed to be a new hallmark that promotes intron insertions, as it requires rapid DNA breakage and subsequent repair processes to gain back normal functionality. PMID:21850219
Guo, Bingfu; Guo, Yong; Hong, Huilong; Qiu, Li-Juan
2016-01-01
Molecular characterization of sequence flanking exogenous fragment insertion is essential for safety assessment and labeling of genetically modified organism (GMO). In this study, the T-DNA insertion sites and flanking sequences were identified in two newly developed transgenic glyphosate-tolerant soybeans GE-J16 and ZH10-6 based on whole genome sequencing (WGS) method. More than 22.4 Gb sequence data (∼21 × coverage) for each line was generated on Illumina HiSeq 2500 platform. The junction reads mapped to boundaries of T-DNA and flanking sequences in these two events were identified by comparing all sequencing reads with soybean reference genome and sequence of transgenic vector. The putative insertion loci and flanking sequences were further confirmed by PCR amplification, Sanger sequencing, and co-segregation analysis. All these analyses supported that exogenous T-DNA fragments were integrated in positions of Chr19: 50543767-50543792 and Chr17: 7980527-7980541 in these two transgenic lines. Identification of genomic insertion sites of G2-EPSPS and GAT transgenes will facilitate the utilization of their glyphosate-tolerant traits in soybean breeding program. These results also demonstrated that WGS was a cost-effective and rapid method for identifying sites of T-DNA insertions and flanking sequences in soybean.
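The idea of a junction read, one that maps partly to the soybean genome and partly to the transgenic vector, can be sketched with exact string matching on toy sequences. This is only an illustration of the concept, not the pipeline used in the study; real analyses rely on read aligners and tolerate mismatches. The sequences, the function name and the minimum anchor length are all hypothetical.

```python
def split_junction_read(read, vector, genome, min_anchor=8):
    """Return (breakpoint, orientation) if the read is part genome, part vector.

    Tries every split point that leaves at least `min_anchor` bases on each
    side and requires an exact match of each half (toy assumption).
    """
    for cut in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:cut], read[cut:]
        if left in genome and right in vector:
            return cut, "genome->T-DNA"
        if left in vector and right in genome:
            return cut, "T-DNA->genome"
    return None  # read lies entirely in one source, or matches neither


# Toy reference sequences (hypothetical, far shorter than real data).
genome = "ATGCGTACGTTAGCCTAGGATCCGATTACA"
vector = "CTCGAGTTCAACGGCTATGC"

left_border_read = genome[10:20] + vector[0:10]   # spans genome into insert
right_border_read = vector[10:20] + genome[0:10]  # spans insert into genome
genome_only_read = genome[0:20]

print(split_junction_read(left_border_read, vector, genome))
print(split_junction_read(right_border_read, vector, genome))
print(split_junction_read(genome_only_read, vector, genome))
```

The genomic half of each junction read is what gives the chromosome position of the insert, and the two orientations together bracket the insertion site.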
Highly efficient CRISPR/HDR-mediated knock-in for mouse embryonic stem cells and zygotes.
Wang, Bangmei; Li, Kunyu; Wang, Amy; Reiser, Michelle; Saunders, Thom; Lockey, Richard F; Wang, Jia-Wang
2015-10-01
The clustered regularly interspaced short palindromic repeat (CRISPR) gene editing technique, based on the non-homologous end-joining (NHEJ) repair pathway, has been used to generate gene knock-outs with variable sizes of small insertion/deletions with high efficiency. More precise genome editing, either the insertion or deletion of a desired fragment, can be done by combining the homology-directed-repair (HDR) pathway with CRISPR cleavage. However, HDR-mediated gene knock-in experiments are typically inefficient, and there have been no reports of successful gene knock-in with DNA fragments larger than 4 kb. Here, we describe the targeted insertion of large DNA fragments (7.4 and 5.8 kb) into the genomes of mouse embryonic stem (ES) cells and zygotes, respectively, using the CRISPR/HDR technique without NHEJ inhibitors. Our data show that CRISPR/HDR without NHEJ inhibitors can result in highly efficient gene knock-in, equivalent to CRISPR/HDR with NHEJ inhibitors. Although NHEJ is the dominant repair pathway associated with CRISPR-mediated double-strand breaks (DSBs), and biallelic gene knock-ins are common, NHEJ and biallelic gene knock-ins were not detected. Our results demonstrate that efficient targeted insertion of large DNA fragments without NHEJ inhibitors is possible, a result that should stimulate interest in understanding the mechanisms of high efficiency CRISPR targeting in general.
Istace, Benjamin; Friedrich, Anne; d'Agata, Léo; Faye, Sébastien; Payen, Emilie; Beluche, Odette; Caradec, Claudia; Davidas, Sabrina; Cruaud, Corinne; Liti, Gianni; Lemainque, Arnaud; Engelen, Stefan; Wincker, Patrick; Schacherer, Joseph; Aury, Jean-Marc
2017-02-01
Oxford Nanopore Technologies Ltd (Oxford, UK) have recently commercialized MinION, a small single-molecule nanopore sequencer, that offers the possibility of sequencing long DNA fragments from small genomes in a matter of seconds. The Oxford Nanopore technology is truly disruptive; it has the potential to revolutionize genomic applications due to its portability, low cost, and ease of use compared with existing long reads sequencing technologies. The MinION sequencer enables the rapid sequencing of small eukaryotic genomes, such as the yeast genome. Combined with existing assembler algorithms, near complete genome assemblies can be generated and comprehensive population genomic analyses can be performed. Here, we resequenced the genome of the Saccharomyces cerevisiae S288C strain to evaluate the performance of nanopore-only assemblers. Then we de novo sequenced and assembled the genomes of 21 isolates representative of the S. cerevisiae genetic diversity using the MinION platform. The contiguity of our assemblies was 14 times higher than the Illumina-only assemblies and we obtained one or two long contigs for 65 % of the chromosomes. This high contiguity allowed us to accurately detect large structural variations across the 21 studied genomes. Because of the high completeness of the nanopore assemblies, we were able to produce a complete cartography of transposable elements insertions and inspect structural variants that are generally missed using a short-read sequencing strategy. Our analyses show that the Oxford Nanopore technology is already usable for de novo sequencing and assembly; however, non-random errors in homopolymers require polishing the consensus using an alternate sequencing technology. © The Author 2017. Published by Oxford University Press.
Istace, Benjamin; Friedrich, Anne; d'Agata, Léo; Faye, Sébastien; Payen, Emilie; Beluche, Odette; Caradec, Claudia; Davidas, Sabrina; Cruaud, Corinne; Liti, Gianni; Lemainque, Arnaud; Engelen, Stefan; Wincker, Patrick; Schacherer, Joseph
2017-01-01
Abstract Background: Oxford Nanopore Technologies Ltd (Oxford, UK) have recently commercialized MinION, a small single-molecule nanopore sequencer, that offers the possibility of sequencing long DNA fragments from small genomes in a matter of seconds. The Oxford Nanopore technology is truly disruptive; it has the potential to revolutionize genomic applications due to its portability, low cost, and ease of use compared with existing long reads sequencing technologies. The MinION sequencer enables the rapid sequencing of small eukaryotic genomes, such as the yeast genome. Combined with existing assembler algorithms, near complete genome assemblies can be generated and comprehensive population genomic analyses can be performed. Results: Here, we resequenced the genome of the Saccharomyces cerevisiae S288C strain to evaluate the performance of nanopore-only assemblers. Then we de novo sequenced and assembled the genomes of 21 isolates representative of the S. cerevisiae genetic diversity using the MinION platform. The contiguity of our assemblies was 14 times higher than the Illumina-only assemblies and we obtained one or two long contigs for 65 % of the chromosomes. This high contiguity allowed us to accurately detect large structural variations across the 21 studied genomes. Conclusion: Because of the high completeness of the nanopore assemblies, we were able to produce a complete cartography of transposable elements insertions and inspect structural variants that are generally missed using a short-read sequencing strategy. Our analyses show that the Oxford Nanopore technology is already usable for de novo sequencing and assembly; however, non-random errors in homopolymers require polishing the consensus using an alternate sequencing technology. PMID:28369459
Co-evolution of plant LTR-retrotransposons and their host genomes.
Zhao, Meixia; Ma, Jianxin
2013-07-01
Transposable elements (TEs), particularly, long terminal repeat retrotransposons (LTR-RTs), are the most abundant DNA components in all plant species that have been investigated, and are largely responsible for plant genome size variation. Although plant genomes have experienced periodic proliferation and/or recent burst of LTR-retrotransposons, the majority of LTR-RTs are inactivated by DNA methylation and small RNA-mediated silencing mechanisms, and/or were deleted/truncated by unequal homologous recombination and illegitimate recombination, as suppression mechanisms that counteract genome expansion caused by LTR-RT amplification. LTR-RT DNA is generally enriched in pericentromeric regions of the host genomes, which appears to be the outcomes of preferential insertions of LTR-RTs in these regions and low effectiveness of selection that purges LTR-RT DNA from these regions relative to chromosomal arms. Potential functions of various TEs in their host genomes remain blurry; nevertheless, LTR-RTs have been recognized to play important roles in maintaining chromatin structures and centromere functions and regulation of gene expressions in their host genomes.
The evolution of small insertions and deletions in the coding genes of Drosophila melanogaster.
Chong, Zechen; Zhai, Weiwei; Li, Chunyan; Gao, Min; Gong, Qiang; Ruan, Jue; Li, Juan; Jiang, Lan; Lv, Xuemei; Hungate, Eric; Wu, Chung-I
2013-12-01
Studies of protein evolution have focused on amino acid substitutions with much less systematic analysis of insertions and deletions (indels) in protein coding genes. We hence surveyed 7,500 genes between Drosophila melanogaster and D. simulans, using D. yakuba as an outgroup for this purpose. The evolutionary rate of coding indels is indeed low, at only 3% of that of nonsynonymous substitutions. As coding indels follow a geometric distribution in size and tend to fall in low-complexity regions of proteins, it is unclear whether selection or mutation underlies this low rate. To resolve the issue, we collected genomic sequences from an isogenic African line of D. melanogaster (ZS30) at a high coverage of 70× and analyzed indel polymorphism between ZS30 and the reference genome. In comparing polymorphism and divergence, we found that the divergence to polymorphism ratio (i.e., fixation index) for smaller indels (size ≤ 10 bp) is very similar to that for synonymous changes, suggesting that most of the within-species polymorphism and between-species divergence for indels are selectively neutral. Interestingly, deletions of larger sizes (size ≥ 11 bp and ≤ 30 bp) have a much higher fixation index than synonymous mutations and 44.4% of fixed middle-sized deletions are estimated to be adaptive. To our surprise, this pattern is not found for insertions. Protein indel evolution appears to be in a dynamic flux of neutrally driven expansion (insertions) together with adaptively driven contraction (deletions), and these observations provide important insights for understanding the fitness of new mutations as well as the evolutionary driving forces for genomic evolution in Drosophila species.
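The comparison described above rests on a divergence-to-polymorphism ratio per mutation class, measured against synonymous changes as the neutral reference. A minimal sketch of that arithmetic follows; the function name, the class labels, and all counts are hypothetical placeholders chosen for illustration, not values from the study.

```python
def fixation_index(divergence: int, polymorphism: int) -> float:
    """Ratio of between-species fixed differences to within-species polymorphisms."""
    if polymorphism <= 0:
        raise ValueError("polymorphism count must be positive")
    return divergence / polymorphism


# Hypothetical (divergence, polymorphism) counts, for illustration only.
counts = {
    "synonymous": (1000, 500),
    "small_indel": (40, 20),
    "mid_deletion": (30, 5),
}

ratios = {cls: fixation_index(d, p) for cls, (d, p) in counts.items()}

# Express each class relative to the synonymous (putatively neutral) class:
# a value near 1 is consistent with neutrality, a value well above 1 with
# an excess of fixed differences.
relative = {cls: r / ratios["synonymous"] for cls, r in ratios.items()}
```

With these made-up counts the small-indel class matches the synonymous ratio while the mid-sized deletion class exceeds it, mirroring the qualitative pattern the abstract reports.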
Reduced representation approaches to interrogate genome diversity in large repetitive plant genomes.
Hirsch, Cory D; Evans, Joseph; Buell, C Robin; Hirsch, Candice N
2014-07-01
Technology and software improvements in the last decade now provide methodologies to access the genome sequence of not only a single accession, but also multiple accessions of plant species. This provides a means to interrogate species diversity at the genome level. Ample diversity among accessions in a collection of species can be found, including single-nucleotide polymorphisms, insertions and deletions, copy number variation and presence/absence variation. For species with small, non-repetitive rich genomes, re-sequencing of query accessions is robust, highly informative, and economically feasible. However, for species with moderate to large sized repetitive-rich genomes, technical and economic barriers prevent en masse genome re-sequencing of accessions. Multiple approaches to access a focused subset of loci in species with larger genomes have been developed, including reduced representation sequencing, exome capture and transcriptome sequencing. Collectively, these approaches have enabled interrogation of diversity on a genome scale for large plant genomes, including crop species important to worldwide food security. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Clear Cell Sarcoma of the Kidney (CCSK) is a rare childhood tumor whose molecular pathogenesis remains poorly understood. We analyzed a discovery set of 13 CCSKs for changes in chromosome copy number, mutations, rearrangements, global gene expression and global DNA methylation. No recurrent segmental chromosomal copy number changes or somatic variants (single nucleotide or small insertion/deletion) were identified.
Gallus, Susanne; Janke, Axel
2017-01-01
Phylogenetic reconstruction from transposable elements (TEs) offers an additional perspective to study evolutionary processes. However, detecting phylogenetically informative TE insertions requires tedious experimental work, limiting the power of phylogenetic inference. Here, we analyzed the genomes of seven bear species using high-throughput sequencing data to detect thousands of TE insertions. The newly developed pipeline for TE detection called TeddyPi (TE detection and discovery for Phylogenetic Inference) identified 150,513 high-quality TE insertions in the genomes of ursine and tremarctine bears. By integrating different TE insertion callers and using a stringent filtering approach, the TeddyPi pipeline produced highly reliable TE insertion calls, which were confirmed by extensive in vitro validation experiments. Analysis of single nucleotide substitutions in the flanking regions of the TEs shows that these substitutions correlate with the phylogenetic signal from the TE insertions. Our phylogenomic analyses show that TEs are a major driver of genomic variation in bears and enabled phylogenetic reconstruction of a well-resolved species tree, despite strong signals for incomplete lineage sorting and introgression. The analyses show that the Asiatic black, sun, and sloth bear form a monophyletic clade, in which phylogenetic incongruence originates from incomplete lineage sorting. TeddyPi is open source and can be adapted to various TE and structural variation callers. The pipeline makes it possible to confidently extract thousands of TE insertions even from low-coverage genomes (∼10×) of nonmodel organisms. This opens new possibilities for biologists to study phylogenies and evolutionary processes as well as rates and patterns of (retro-)transposition and structural variation. PMID:28985298
Partier, A; Gay, G; Tassy, C; Beckert, M; Feuillet, C; Barret, P
2017-10-01
A large, 53-kbp, intact DNA fragment was inserted into the wheat (Triticum aestivum L.) genome. FISH analyses of individual transgenic events revealed multiple insertions of intact fragments. Transferring large intact DNA fragments containing clusters of resistance genes or complete metabolic pathways into the wheat genome remains a challenge. In a previous work, we showed that the use of dephosphorylated cassettes for wheat transformation enabled the production of simple integration patterns. Here, we used the same technology to produce a cassette containing a 44-kb Arabidopsis thaliana BAC, flanked by one selection gene and one reporter gene. This 53-kb linear cassette was integrated in the bread wheat (Triticum aestivum L.) genome by biolistic transformation. Our results showed that transgenic plants harboring the entire cassette were generated. The inheritability of the cassette was demonstrated in the T1 and T2 generations. Surprisingly, FISH analysis performed on T1 progeny of independent events identified double genomic insertions of intact fragments in non-homoeologous positions. Inheritability of these double insertions was demonstrated by FISH analysis of the T1 generation. Relative conclusions that can be drawn from molecular or FISH analysis are discussed along with future prospects of the engineering of large fragments for wheat transformation or genome editing.
Shimoda, Yoshikazu; Mitsui, Hisayuki; Kamimatsuse, Hiroko; Minamisawa, Kiwamu; Nishiyama, Eri; Ohtsubo, Yoshiyuki; Nagata, Yuji; Tsuda, Masataka; Shinpo, Sayaka; Watanabe, Akiko; Kohara, Mitsuyo; Yamada, Manabu; Nakamura, Yasukazu; Tabata, Satoshi; Sato, Shusei
2008-01-01
Rhizobia are nitrogen-fixing soil bacteria that establish endosymbiosis with some leguminous plants. The completion of several rhizobial genome sequences provides opportunities for genome-wide functional studies of the physiological roles of many rhizobial genes. In order to carry out genome-wide phenotypic screenings, we have constructed a large mutant library of the nitrogen-fixing symbiotic bacterium, Mesorhizobium loti, by transposon mutagenesis. Transposon insertion mutants were generated using the signature-tagged mutagenesis (STM) technique and a total of 29 330 independent mutants were obtained. Along with the collection of transposon mutants, we have determined the transposon insertion sites for 7892 clones, and confirmed insertions in 3680 non-redundant M. loti genes (50.5% of the total number of M. loti genes). Transposon insertions were randomly distributed throughout the M. loti genome without any bias toward G+C contents of insertion target sites and transposon plasmids used for the mutagenesis. We also show the utility of STM mutants by examining the specificity of signature tags and test screenings for growth- and nodulation-deficient mutants. This defined mutant library allows for genome-wide forward- and reverse-genetic functional studies of M. loti and will serve as an invaluable resource for researchers to further our understanding of rhizobial biology. PMID:18658183
The Nucleotide Excision Repair Pathway Limits L1 Retrotransposition
Servant, Geraldine; Streva, Vincent A.; Derbes, Rebecca S.; Wijetunge, Madushani I.; Neeland, Marc; White, Travis B.; Belancio, Victoria P.; Roy-Engel, Astrid M.; Deininger, Prescott L.
2017-01-01
Long interspersed elements 1 (L1) are active mobile elements that constitute almost 17% of the human genome. They amplify through a “copy-and-paste” mechanism termed retrotransposition, and de novo insertions related to these elements have been reported to cause 0.2% of genetic diseases. Our previous data demonstrated that the endonuclease complex ERCC1-XPF, which cleaves a 3′ DNA flap structure, limits L1 retrotransposition. Although the ERCC1-XPF endonuclease participates in several different DNA repair pathways, such as single-strand annealing, or in telomere maintenance, its recruitment to DNA lesions is best characterized in the nucleotide excision repair (NER) pathway. To determine if the NER pathway prevents the insertion of retroelements in the genome, we monitored the retrotransposition efficiencies of engineered L1 elements in NER-deficient cells and in their complemented versions. Core proteins of the NER pathway, XPD and XPA, and the lesion binding protein, XPC, are involved in limiting L1 retrotransposition. In addition, sequence analysis of recovered de novo L1 inserts and their genomic locations in NER-deficient cells demonstrated the presence of abnormally large duplications at the site of insertion, suggesting that NER proteins may also play a role in the normal L1 insertion process. Here, we propose new functions for the NER pathway in the maintenance of genome integrity: limitation of insertional mutations caused by retrotransposons and the prevention of potentially mutagenic large genomic duplications at the site of retrotransposon insertion events. PMID:28049704
Sanchez-Luque, Francisco J; Richardson, Sandra R; Faulkner, Geoffrey J
2016-01-01
Mobile genetic elements (MGEs) are of critical importance in genomics and developmental biology. Polymorphic and somatic MGE insertions have the potential to impact the phenotype of an individual, depending on their genomic locations and functional consequences. However, the identification of polymorphic and somatic insertions among the plethora of copies residing in the genome presents a formidable technical challenge. Whole genome sequencing has the potential to address this problem; however, its efficacy depends on the abundance of cells carrying the new insertion. Robust detection of somatic insertions present in only a subset of cells within a given sample can also be prohibitively expensive due to a requirement for high sequencing depth. Here, we describe retrotransposon capture sequencing (RC-seq), a sequence capture approach in which Illumina libraries are enriched for fragments containing the 5' and 3' termini of specific MGEs. RC-seq allows the detection of known polymorphic insertions present in an individual, as well as the identification of rare or private germline insertions not previously described. Furthermore, RC-seq can be used to detect and characterize somatic insertions, providing a valuable tool to elucidate the extent and characteristics of MGE activity in healthy tissues and in various disease states.
Gibbon genome and the fast karyotype evolution of small apes.
Carbone, Lucia; Harris, R Alan; Gnerre, Sante; Veeramah, Krishna R; Lorente-Galdos, Belen; Huddleston, John; Meyer, Thomas J; Herrero, Javier; Roos, Christian; Aken, Bronwen; Anaclerio, Fabio; Archidiacono, Nicoletta; Baker, Carl; Barrell, Daniel; Batzer, Mark A; Beal, Kathryn; Blancher, Antoine; Bohrson, Craig L; Brameier, Markus; Campbell, Michael S; Capozzi, Oronzo; Casola, Claudio; Chiatante, Giorgia; Cree, Andrew; Damert, Annette; de Jong, Pieter J; Dumas, Laura; Fernandez-Callejo, Marcos; Flicek, Paul; Fuchs, Nina V; Gut, Ivo; Gut, Marta; Hahn, Matthew W; Hernandez-Rodriguez, Jessica; Hillier, LaDeana W; Hubley, Robert; Ianc, Bianca; Izsvák, Zsuzsanna; Jablonski, Nina G; Johnstone, Laurel M; Karimpour-Fard, Anis; Konkel, Miriam K; Kostka, Dennis; Lazar, Nathan H; Lee, Sandra L; Lewis, Lora R; Liu, Yue; Locke, Devin P; Mallick, Swapan; Mendez, Fernando L; Muffato, Matthieu; Nazareth, Lynne V; Nevonen, Kimberly A; O'Bleness, Majesta; Ochis, Cornelia; Odom, Duncan T; Pollard, Katherine S; Quilez, Javier; Reich, David; Rocchi, Mariano; Schumann, Gerald G; Searle, Stephen; Sikela, James M; Skollar, Gabriella; Smit, Arian; Sonmez, Kemal; ten Hallers, Boudewijn; Terhune, Elizabeth; Thomas, Gregg W C; Ullmer, Brygg; Ventura, Mario; Walker, Jerilyn A; Wall, Jeffrey D; Walter, Lutz; Ward, Michelle C; Wheelan, Sarah J; Whelan, Christopher W; White, Simon; Wilhelm, Larry J; Woerner, August E; Yandell, Mark; Zhu, Baoli; Hammer, Michael F; Marques-Bonet, Tomas; Eichler, Evan E; Fulton, Lucinda; Fronick, Catrina; Muzny, Donna M; Warren, Wesley C; Worley, Kim C; Rogers, Jeffrey; Wilson, Richard K; Gibbs, Richard A
2014-09-11
Gibbons are small arboreal apes that display an accelerated rate of evolutionary chromosomal rearrangement and occupy a key node in the primate phylogeny between Old World monkeys and great apes. Here we present the assembly and analysis of a northern white-cheeked gibbon (Nomascus leucogenys) genome. We describe the propensity for a gibbon-specific retrotransposon (LAVA) to insert into chromosome segregation genes and alter transcription by providing a premature termination site, suggesting a possible molecular mechanism for the genome plasticity of the gibbon lineage. We further show that the gibbon genera (Nomascus, Hylobates, Hoolock and Symphalangus) experienced a near-instantaneous radiation ∼5 million years ago, coincident with major geographical changes in southeast Asia that caused cycles of habitat compression and expansion. Finally, we identify signatures of positive selection in genes important for forelimb development (TBX5) and connective tissues (COL1A1) that may have been involved in the adaptation of gibbons to their arboreal habitat.
In vivo and in vitro disease modeling with CRISPR/Cas9.
Kato, Tomoko; Takada, Shuji
2017-01-01
In the past few years, extensive progress has been made in the development of genome-editing technology. Among several genome-editing tools, the clustered regularly interspaced short palindromic repeat-associated Cas9 nuclease (CRISPR/Cas9) system is particularly widely used owing to the ease of sequence-specific nuclease construction and the highly efficient introduction of mutations. The CRISPR/Cas9 system was originally constructed to induce small insertion and deletion mutations, but various methods have been developed to introduce point mutations, deletions, insertions, chromosomal translocations and so on. These methods should be useful for the reconstruction of disease-causing mutations in cultured cell lines and living organisms to elucidate disease pathogenesis and for disease prevention, treatment and drug discovery. This review summarizes the current technical aspects of the CRISPR/Cas9 system for disease modeling in cultured cells and living organisms, mainly mice. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Development of genetic tools for in vivo virulence analysis of Streptococcus sanguinis.
Turner, Lauren Senty; Das, Sankar; Kanamoto, Taisei; Munro, Cindy L; Kitten, Todd
2009-08-01
Completion of the genome sequence of Streptococcus sanguinis SK36 necessitates tools for further characterization of this species. It is often desirable to insert antibiotic resistance markers and other exogenous genes into the chromosome; therefore, we sought to identify a chromosomal site for ectopic expression of foreign genes, and to verify that insertion into this site did not affect important cellular phenotypes. We designed three plasmid constructs for insertion of erm, aad9 or tetM resistance determinants into a genomic region encoding only a small (65 aa) hypothetical protein. To determine whether this insertion affected important cellular properties, SK36 and its erythromycin-resistant derivative, JFP36, were compared for: (i) growth in vitro, (ii) genetic competence, (iii) biofilm formation and (iv) virulence for endocarditis in the rabbit model of infective endocarditis (IE). The spectinomycin-resistant strain, JFP56, and tetracycline-resistant strain, JFP76, were also tested for virulence in vivo. Insertion of erm did not affect growth, competence or biofilm development of JFP36. Recovery of bacteria from heart valves of co-inoculated rabbits was similar to wild-type for JFP36, JFP56 and JFP76, indicating that IE virulence was not significantly affected. The capacity for mutant complementation in vivo was explored in an avirulent ssaB mutant background. Expression of ssaB from its predicted promoter in the target region restored IE virulence. Thus, the chromosomal site utilized is a good candidate for further manipulations of S. sanguinis. In addition, the resistant strains developed may be further applied as controls to facilitate screening for virulence factors in vivo.
Development of genetic tools for in vivo virulence analysis of Streptococcus sanguinis
Senty Turner, Lauren; Das, Sankar; Kanamoto, Taisei; Munro, Cindy L.; Kitten, Todd
2009-01-01
Completion of the genome sequence of Streptococcus sanguinis SK36 necessitates tools for further characterization of this species. It is often desirable to insert antibiotic resistance markers and other exogenous genes into the chromosome; therefore, we sought to identify a chromosomal site for ectopic expression of foreign genes, and to verify that insertion into this site did not affect important cellular phenotypes. We designed three plasmid constructs for insertion of erm, aad9 or tetM resistance determinants into a genomic region encoding only a small (65 aa) hypothetical protein. To determine whether this insertion affected important cellular properties, SK36 and its erythromycin-resistant derivative, JFP36, were compared for: (i) growth in vitro, (ii) genetic competence, (iii) biofilm formation and (iv) virulence for endocarditis in the rabbit model of infective endocarditis (IE). The spectinomycin-resistant strain, JFP56, and tetracycline-resistant strain, JFP76, were also tested for virulence in vivo. Insertion of erm did not affect growth, competence or biofilm development of JFP36. Recovery of bacteria from heart valves of co-inoculated rabbits was similar to wild-type for JFP36, JFP56 and JFP76, indicating that IE virulence was not significantly affected. The capacity for mutant complementation in vivo was explored in an avirulent ssaB mutant background. Expression of ssaB from its predicted promoter in the target region restored IE virulence. Thus, the chromosomal site utilized is a good candidate for further manipulations of S. sanguinis. In addition, the resistant strains developed may be further applied as controls to facilitate screening for virulence factors in vivo. PMID:19423626
Adrion, Jeffrey R.; Song, Michael J.; Schrider, Daniel R.; Hahn, Matthew W.
2017-01-01
Knowing the rate at which transposable elements (TEs) insert and delete is critical for understanding their role in genome evolution. We estimated spontaneous rates of insertion and deletion for all known, active TE superfamilies present in a set of Drosophila melanogaster mutation-accumulation (MA) lines using whole genome sequence data. Our results demonstrate that TE insertions far outpace TE deletions in D. melanogaster. We found a significant effect of background genotype on TE activity, with higher rates of insertions in one MA line. We also found significant rate heterogeneity between the chromosomes, with both insertion and deletion rates elevated on the X relative to the autosomes. Further, we identified significant associations between TE activity and chromatin state, and tested for associations between TE activity and other features of the local genomic environment such as TE content, exon content, GC content, and recombination rate. Our results provide the most detailed assessment of TE mobility in any organism to date, and provide a useful benchmark both for addressing theoretical predictions of TE dynamics and for exploring large-scale patterns of TE movement in D. melanogaster and other species. PMID:28338986
Sorting genomes by reciprocal translocations, insertions, and deletions.
Qi, Xingqin; Li, Guojun; Li, Shuguang; Xu, Ying
2010-01-01
The problem of sorting by reciprocal translocations (abbreviated as SBT) arises in comparative genomics: find a shortest sequence of reciprocal translocations that transforms one genome Pi into another genome Gamma, with the restriction that Pi and Gamma contain the same genes. SBT has been proved to be polynomial-time solvable, and several polynomial algorithms have been developed. In this paper, we show how to extend Bergeron's SBT algorithm to include insertions and deletions, allowing comparison of genomes containing different genes. In particular, if the gene set of Pi is a subset (or superset, respectively) of the gene set of Gamma, we present an approximation algorithm for transforming Pi into Gamma by reciprocal translocations and deletions (insertions, respectively), providing a sorting sequence with length at most OPT + 2, where OPT is the minimum number of translocations and deletions (insertions, respectively) needed to transform Pi into Gamma; if Pi and Gamma have different genes and neither gene set contains the other, we give a heuristic to transform Pi into Gamma by a shortest sequence of reciprocal translocations, insertions, and deletions, with bounds for the length of the sorting sequence it outputs. At a conceptual level, there is some similarity between our algorithm and the algorithm developed by El Mabrouk, which is used to sort two chromosomes with different gene contents by reversals, insertions, and deletions.
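To make the basic operation concrete, the sketch below applies a single reciprocal translocation to two chromosomes modeled as lists of signed integers (the sign encodes gene orientation). This illustrates the operation only, under the standard prefix-prefix and prefix-suffix definitions from the rearrangement literature; it is not the sorting algorithm, the approximation algorithm, or the heuristic described in the abstract, and the names `translocate` and `reverse_complement` are hypothetical.

```python
from typing import List, Tuple

Chromosome = List[int]  # signed gene labels; the sign encodes orientation


def reverse_complement(segment: Chromosome) -> Chromosome:
    """Reverse a segment and flip the orientation of every gene in it."""
    return [-gene for gene in reversed(segment)]


def translocate(
    x: Chromosome, y: Chromosome, i: int, j: int, prefix_prefix: bool = True
) -> Tuple[Chromosome, Chromosome]:
    """Cut x before index i and y before index j, then exchange ends.

    prefix_prefix=True : (x[:i] + y[j:], y[:j] + x[i:])
    prefix_prefix=False: (x[:i] + rc(y[:j]), rc(x[i:]) + y[j:])
    """
    if not (0 < i < len(x) and 0 < j < len(y)):
        raise ValueError("cut points must fall strictly inside both chromosomes")
    if prefix_prefix:
        return x[:i] + y[j:], y[:j] + x[i:]
    return x[:i] + reverse_complement(y[:j]), reverse_complement(x[i:]) + y[j:]


# Toy genome Pi with two chromosomes whose tails have been swapped.
pi_x, pi_y = [1, 2, 7, 8], [5, 6, 3, 4]
new_x, new_y = translocate(pi_x, pi_y, 2, 2)

# The same kind of cut resolved the other way (prefix-suffix).
alt_x, alt_y = translocate([1, 2, 3], [4, 5, 6], 1, 2, prefix_prefix=False)
```

In the toy case one prefix-prefix translocation restores the ordered chromosomes, so the translocation distance between the two toy genomes is 1. Both variants preserve the total gene content, which is why handling genomes with different gene sets requires the additional insertion and deletion operations discussed in the abstract.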
Alu repeat discovery and characterization within human genomes
Hormozdiari, Fereydoun; Alkan, Can; Ventura, Mario; Hajirasouliha, Iman; Malig, Maika; Hach, Faraz; Yorukoglu, Deniz; Dao, Phuong; Bakhshi, Marzieh; Sahinalp, S. Cenk; Eichler, Evan E.
2011-01-01
Human genomes are now being rapidly sequenced, but not all forms of genetic variation are routinely characterized. In this study, we focus on Alu retrotransposition events and seek to characterize differences in the pattern of mobile insertion between individuals based on the analysis of eight human genomes sequenced using next-generation sequencing. Applying a rapid read-pair analysis algorithm, we discover 4342 Alu insertions not found in the human reference genome and show that 98% of a selected subset (63/64) experimentally validate. Of these new insertions, 89% correspond to AluY elements, suggesting that they arose by retrotransposition. Eighty percent of the Alu insertions have not been previously reported and more novel events were detected in Africans when compared with non-African samples (76% vs. 69%). Using these data, we develop an experimental and computational screen to identify ancestry informative Alu retrotransposition events among different human populations. PMID:21131385
Lammers, Fritjof; Gallus, Susanne; Janke, Axel; Nilsson, Maria A
2017-10-01
Phylogenetic reconstruction from transposable elements (TEs) offers an additional perspective to study evolutionary processes. However, detecting phylogenetically informative TE insertions requires tedious experimental work, limiting the power of phylogenetic inference. Here, we analyzed the genomes of seven bear species using high-throughput sequencing data to detect thousands of TE insertions. The newly developed pipeline for TE detection called TeddyPi (TE detection and discovery for Phylogenetic Inference) identified 150,513 high-quality TE insertions in the genomes of ursine and tremarctine bears. By integrating different TE insertion callers and using a stringent filtering approach, the TeddyPi pipeline produced highly reliable TE insertion calls, which were confirmed by extensive in vitro validation experiments. Analysis of single nucleotide substitutions in the flanking regions of the TEs shows that these substitutions correlate with the phylogenetic signal from the TE insertions. Our phylogenomic analyses show that TEs are a major driver of genomic variation in bears and enabled phylogenetic reconstruction of a well-resolved species tree, despite strong signals for incomplete lineage sorting and introgression. The analyses show that the Asiatic black, sun, and sloth bear form a monophyletic clade, in which phylogenetic incongruence originates from incomplete lineage sorting. TeddyPi is open source and can be adapted to various TE and structural variation callers. The pipeline makes it possible to confidently extract thousands of TE insertions even from low-coverage genomes (∼10×) of nonmodel organisms. This opens new possibilities for biologists to study phylogenies and evolutionary processes as well as rates and patterns of (retro-)transposition and structural variation. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
van der Klift, Heleen M; Tops, Carli M; Hes, Frederik J; Devilee, Peter; Wijnen, Juul T
2012-07-01
Heterozygous germline mutations in the mismatch repair gene PMS2 predispose carriers to Lynch syndrome, an autosomal dominant predisposition to cancer. Here, we present a LINE-1-mediated retrotranspositional insertion in PMS2 as a novel mutation type for Lynch syndrome. This insertion, detected with Southern blot analysis in the genomic DNA of the patient, is characterized as a 2.2 kb long 5' truncated SVA_F element. The insertion is not detectable by current diagnostic testing limited to MLPA and direct Sanger sequencing on genomic DNA. The molecular nature of this insertion could only be resolved in RNA from cultured lymphocytes in which nonsense-mediated RNA decay was inhibited. Our report illustrates the technical problems encountered in the detection of this mutation type. Large heterozygous insertions in particular will remain unnoticed because of preferential amplification of the smaller wild-type allele in genomic DNA, and are probably underreported in the mutation spectra of autosomal dominant disorders. © 2012 Wiley Periodicals, Inc.
Horta-Valerdi, Guillermo; Sanchez-Alonso, Maria Patricia; Perez-Marquez, Victor M; Negrete-Abascal, Erasmo; Vaca-Pacheco, Sergio; Hernandez-Gonzalez, Ismael; Gomez-Lunar, Zulema; Olmedo-Álvarez, Gabriela; Vázquez-Cruz, Candelario
2017-04-13
The draft genome sequence of Avibacterium paragallinarum strain CL serovar C is reported here. The genome comprises 154 contigs corresponding to 2.4 Mb with 41% G+C content and many insertion sequence (IS) elements, a characteristic not previously reported in A. paragallinarum. Copyright © 2017 Horta-Valerdi et al.
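A G+C percentage for a multi-contig draft assembly like the one reported above is conventionally the pooled fraction of G and C bases across all contigs. The sketch below shows that calculation under the assumption that ambiguous bases such as N are excluded from the denominator; the function name and the toy sequences are hypothetical and unrelated to the reported genome.

```python
from typing import Iterable


def gc_content(contigs: Iterable[str]) -> float:
    """Pooled G+C fraction over all contigs, ignoring non-ACGT symbols."""
    gc = 0
    total = 0
    for seq in contigs:
        s = seq.upper()
        gc += s.count("G") + s.count("C")
        total += sum(s.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("no unambiguous bases found")
    return gc / total


# Toy contigs, for illustration only (not sequence from the reported genome).
toy_contigs = ["ATGC", "ggccaatt", "NNAT"]
toy_gc = gc_content(toy_contigs)
```

Pooling counts before dividing weights each contig by its length, which differs from averaging per-contig percentages when contig sizes vary widely.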
Orangutan Alu quiescence reveals possible source element: support for ancient backseat drivers
2012-01-01
Background: Sequence analysis of the orangutan genome revealed that recent proliferative activity of Alu elements has been uncharacteristically quiescent in the Pongo (orangutan) lineage, compared with all previously studied primate genomes. With relatively few young polymorphic insertions, the genomic landscape of the orangutan seemed like the ideal place to search for a driver, or source element, of Alu retrotransposition. Results: Here we report the identification of a nearly pristine insertion possessing all the known putative hallmarks of a retrotranspositionally competent Alu element. It is located in an intronic sequence of the DGKB gene on chromosome 7 and is highly conserved in Hominidae (the great apes), but absent from Hylobatidae (gibbon and siamang). We provide evidence for the evolution of a lineage-specific subfamily of this shared Alu insertion in orangutans and possibly the lineage leading to humans. In the orangutan genome, this insertion contains three orangutan-specific diagnostic mutations which are characteristic of the youngest polymorphic Alu subfamily, AluYe5b5_Pongo. In the Homininae lineage (human, chimpanzee and gorilla), this insertion has acquired three different mutations which are also found in a single human-specific Alu insertion. Conclusions: This seemingly stealth-like amplification, ongoing at a very low rate over millions of years of evolution, suggests that this shared insertion may represent an ancient backseat driver of Alu element expansion. PMID:22541534
Orangutan Alu quiescence reveals possible source element: support for ancient backseat drivers.
Walker, Jerilyn A; Konkel, Miriam K; Ullmer, Brygg; Monceaux, Christopher P; Ryder, Oliver A; Hubley, Robert; Smit, Arian Fa; Batzer, Mark A
2012-04-30
Sequence analysis of the orangutan genome revealed that recent proliferative activity of Alu elements has been uncharacteristically quiescent in the Pongo (orangutan) lineage, compared with all previously studied primate genomes. With relatively few young polymorphic insertions, the genomic landscape of the orangutan seemed like the ideal place to search for a driver, or source element, of Alu retrotransposition. Here we report the identification of a nearly pristine insertion possessing all the known putative hallmarks of a retrotranspositionally competent Alu element. It is located in an intronic sequence of the DGKB gene on chromosome 7 and is highly conserved in Hominidae (the great apes), but absent from Hylobatidae (gibbon and siamang). We provide evidence for the evolution of a lineage-specific subfamily of this shared Alu insertion in orangutans and possibly the lineage leading to humans. In the orangutan genome, this insertion contains three orangutan-specific diagnostic mutations which are characteristic of the youngest polymorphic Alu subfamily, AluYe5b5_Pongo. In the Homininae lineage (human, chimpanzee and gorilla), this insertion has acquired three different mutations which are also found in a single human-specific Alu insertion. This seemingly stealth-like amplification, ongoing at a very low rate over millions of years of evolution, suggests that this shared insertion may represent an ancient backseat driver of Alu element expansion.
A mobile threat to genome stability: The impact of non-LTR retrotransposons upon the human genome
Konkel, Miriam K.; Batzer, Mark A.
2010-01-01
It is now commonly agreed that the human genome is not the stable entity originally presumed. Deletions, duplications, inversions, and insertions are common, and contribute significantly to genomic structural variations (SVs). Their collective impact generates much of the inter-individual genomic diversity observed among humans. Not only do these variations change the structure of the genome; they may also have functional implications, e.g. altered gene expression. Some SVs have been identified as the cause of genetic disorders, including cancer predisposition. Cancer cells are notorious for their genomic instability, and often show genomic rearrangements at the microscopic and submicroscopic level to which transposable elements (TEs) contribute. Here, we review the role of TEs in genome instability, with particular focus on non-LTR retrotransposons. Currently, three non-LTR retrotransposon families – long interspersed element 1 (L1), SVA (short interspersed element (SINE-R), variable number of tandem repeats (VNTR), and Alu), and Alu (a SINE) elements – mobilize in the human genome, and cause genomic instability through both insertion- and post-insertion-based mutagenesis. Due to the abundance and high sequence identity of TEs, they frequently mislead the homologous recombination repair pathway into non-allelic homologous recombination, causing deletions, duplications, and inversions. While less comprehensively studied, non-LTR retrotransposon insertions and TE-mediated rearrangements are probably more common in cancer cells than in healthy tissue. This may be at least partially attributed to the commonly seen global hypomethylation as well as general epigenetic dysfunction of cancer cells. Where possible, we provide examples that impact cancer predisposition and/or development. PMID:20307669
Bhattacharya, D; Surek, B; Rüsing, M; Damberger, S; Melkonian, M
1994-01-01
Group I introns are found in organellar genomes, in the genomes of eubacteria and phages, and in nuclear-encoded rRNAs. The origin and distribution of nuclear-encoded rRNA group I introns are not understood. To elucidate their evolutionary relationships, we analyzed diverse nuclear-encoded small-subunit rRNA group I introns including nine sequences from the green-algal order Zygnematales (Charophyceae). Phylogenetic analyses of group I introns and rRNA coding regions suggest that lateral transfers have occurred in the evolutionary history of group I introns and that, after transfer, some of these elements may form stable components of the host-cell nuclear genomes. The Zygnematales introns, which share a common insertion site (position 1506 relative to the Escherichia coli small-subunit rRNA), form one subfamily of group I introns that has, after its origin, been inherited through common ancestry. Since the first Zygnematales appear in the middle Devonian within the fossil record, the "1506" group I intron presumably has been a stable component of the Zygnematales small-subunit rRNA coding region for 350-400 million years. PMID:7937917
Zhu, Lingxiang; Yan, Zhongqiang; Zhang, Zhaojun; Zhou, Qiming; Zhou, Jinchun; Wakeland, Edward K; Fang, Xiangdong; Xuan, Zhenyu; Shen, Dingxia; Li, Quan-Zhen
2013-01-01
The emergence and rapid spread of multidrug-resistant Acinetobacter baumannii strains has become a major health threat worldwide. To better understand the genetic recombination associated with the acquisition of drug-resistance elements during bacterial infection, we performed complete genome analysis of three newly isolated multidrug-resistant A. baumannii strains from Beijing using next-generation sequencing technology. Whole-genome comparison revealed that all three strains share some common drug-resistance elements, including the carbapenem-resistance bla OXA-23 and tetracycline (tet) resistance islands, but that genome structures are diverse among the strains. Various genomic islands are interspersed across the genome together with transposons and insertions, reflecting the flexibility of recombination during acquisition of the resistance elements. The blood-isolated BJAB07104 and ascites-isolated BJAB0868 are highly similar in genome structure to most global clone II strains, suggesting that these two strains belong to the dominant outbreak strains prevalent worldwide. A large resistance island (RI) of about 121 kb, carrying a cluster of resistance-related genes, was inserted into the ATPase gene in the BJAB07104 and BJAB0868 genomes. A 78-kb insertion element carrying the tra locus and the bla OXA-23 island can either be inserted into one of the tniB genes in the 121-kb RI on the chromosome or be converted into a conjugative plasmid in the two BJAB strains. The third strain in this study, BJAB0715, which was isolated from spinal fluid, is much more divergent from the other two strains. It harbors multiple drug-resistance elements, including a truncated AbaR-22-like RI, on its genome. One unique feature of this strain is that it carries both the bla OXA-23 and bla OXA-58 genes on its genome. In addition, an Acinetobacter lwoffii adeABC efflux element was found inserted into the ATPase position in BJAB0715.
Our comparative analysis of the currently completed Acinetobacter baumannii genomes revealed extensive and dynamic genome organization, which may help the bacteria acquire drug-resistance elements into their genomes.
Chao, Michael C.; Pritchard, Justin R.; Zhang, Yanjia J.; Rubin, Eric J.; Livny, Jonathan; Davis, Brigid M.; Waldor, Matthew K.
2013-01-01
The coupling of high-density transposon mutagenesis to high-throughput DNA sequencing (transposon-insertion sequencing) enables simultaneous and genome-wide assessment of the contributions of individual loci to bacterial growth and survival. We have refined analysis of transposon-insertion sequencing data by normalizing for the effect of DNA replication on sequencing output and using a hidden Markov model (HMM)-based filter to exploit heretofore unappreciated information inherent in all transposon-insertion sequencing data sets. The HMM can smooth variations in read abundance and thereby reduce the effects of read noise, as well as permit fine scale mapping that is independent of genomic annotation and enable classification of loci into several functional categories (e.g. essential, domain essential or ‘sick’). We generated a high-resolution map of genomic loci (encompassing both intra- and intergenic sequences) that are required or beneficial for in vitro growth of the cholera pathogen, Vibrio cholerae. This work uncovered new metabolic and physiologic requirements for V. cholerae survival, and by combining transposon-insertion sequencing and transcriptomic data sets, we also identified several novel noncoding RNA species that contribute to V. cholerae growth. Our findings suggest that HMM-based approaches will enhance extraction of biological meaning from transposon-insertion sequencing genomic data. PMID:23901011
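The idea of smoothing read abundance with an HMM can be illustrated with a toy two-state Viterbi decoder. This is a minimal sketch and not the model described in the abstract: the two states, the Poisson emission means, the `stay` probability, and the example counts are all hypothetical values chosen for illustration.

```python
import math

def poisson_logpmf(k, mean):
    # log P(K = k) for a Poisson distribution with the given mean
    return k * math.log(mean) - mean - math.lgamma(k + 1)

def viterbi_two_state(counts, means=(0.5, 20.0), stay=0.9):
    """Label each site 0 (few reads, e.g. required locus) or 1 (many reads).

    means and stay are illustrative assumptions, not fitted parameters.
    """
    if not counts:
        return []
    log_stay, log_switch = math.log(stay), math.log(1.0 - stay)
    # Uniform prior over the two states at the first site
    score = [math.log(0.5) + poisson_logpmf(counts[0], m) for m in means]
    back = []
    for c in counts[1:]:
        new_score, pointers = [], []
        for s in (0, 1):
            cand = [score[p] + (log_stay if p == s else log_switch) for p in (0, 1)]
            best = 0 if cand[0] >= cand[1] else 1
            pointers.append(best)
            new_score.append(cand[best] + poisson_logpmf(c, means[s]))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical read counts per insertion site along a chromosome
counts = [25, 18, 5, 22, 0, 0, 1, 0, 0, 19, 24]
labels = viterbi_two_state(counts)
```

In this toy input the single dip to 5 reads stays in the high-read state because two state switches cost more than the emission gain, while the sustained run of near-zero counts is labelled as the low-read state. A real analysis would need emission and transition parameters estimated from data.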
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chain, P; Garcia, E
2003-02-06
The goal of this proposed effort was to assess the difficulty in identifying and characterizing virulence candidate genes in an organism for which very limited data exists. This was accomplished by first addressing the finishing phase of draft-sequenced F. tularensis genomes and conducting comparative analyses to determine the coding potential of each genome; to discover the differences in genome structure and content, and to identify potential genes whose products may be involved in the F. tularensis virulence process. The project was divided into three parts: (1) Genome finishing: This part involves determining the order and orientation of the consensus sequences of contigs obtained from Phrap assemblies of random draft genomic sequences. This tedious process consists of linking contig ends using information embedded in each sequence file that relates the sequence to the original cloned insert. Since inserts are sequenced from both ends, we can establish a link between these paired-ends in different contigs and thus order and orient contigs. Since these genomes carry numerous copies of insertion sequences, these repeated elements "confuse" the Phrap assembly program. It is thus necessary to break these contigs apart at the repeated sequences and individually join the proper flanking regions using paired-end information, or using results of comparisons against a similar genome. Larger repeated elements such as the small subunit ribosomal RNA operon require verification with PCR. Tandem repeats require manual intervention and typically rely on single nucleotide polymorphisms to be resolved. Remaining gaps require PCR reactions and sequencing. Once the genomes have been "closed", low quality regions are addressed by resequencing reactions. (2) Genome analysis: The final consensus sequences are processed by combining the results of three gene modelers: Glimmer, Critica and Generation.
The final gene models are submitted to a battery of homology searches and domain prediction programs in order to annotate them (e.g. BLAST, Pfam, TIGRfam, COG, KEGG, InterPro, TMhmm, SignalP). The genome structure is also assessed in terms of G+C content, GC bias (GC skew), and locations of repeated regions (e.g. IS elements) and phage-like genes. (3) Comparative genomics: The results of the various genome analyses are compared between the finished (or almost finished) genomes. Here, we have compared the F. tularensis genomes from the extremely lethal strain Schu4 (subsp. tularensis), the vaccine strain LVS (subsp. holarctica), and strain UT01-4992 of the less virulent, opportunistic subsp. novicida. Regions present in the highly virulent strain that are absent from the other less virulent strains may provide insight into what factors are required for the high level of virulence.
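The G+C content and GC skew measures mentioned in the genome analysis step can be sketched as follows. This is an illustrative example only: it uses the common definition of GC skew, (G - C) / (G + C), and the function names, window size, and toy sequence are assumptions that do not come from the report.

```python
def gc_content(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return gc / total if total else 0.0

def gc_skew(seq):
    """(G - C) / (G + C); returns 0.0 when the window has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0

def sliding_windows(seq, size, step):
    """Yield (start, window) for each full-length window."""
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]

toy = "GGGGATGCATCCCCATAT"  # hypothetical sequence
profile = [(start, gc_skew(window)) for start, window in sliding_windows(toy, 6, 6)]
```

On a real bacterial chromosome the windows would typically span kilobases, and a sign change in the skew profile is often used as a hint for the replication origin or terminus.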
The expanding universe of transposon technologies for gene and cell engineering.
Ivics, Zoltán; Izsvák, Zsuzsanna
2010-12-07
Transposable elements can be viewed as natural DNA transfer vehicles that, similar to integrating viruses, are capable of efficient genomic insertion. The mobility of class II transposable elements (DNA transposons) can be controlled by conditionally providing the transposase component of the transposition reaction. Thus, a DNA of interest (be it a fluorescent marker, a small hairpin (sh)RNA expression cassette, a mutagenic gene trap or a therapeutic gene construct) cloned between the inverted repeat sequences of a transposon-based vector can be used for stable genomic insertion in a regulated and highly efficient manner. This methodological paradigm opened up a number of avenues for genome manipulations in vertebrates, including transgenesis for the generation of transgenic cells in tissue culture, the production of germline transgenic animals for basic and applied research, forward genetic screens for functional gene annotation in model species, and therapy of genetic disorders in humans. Sleeping Beauty (SB) was the first transposon shown to be capable of gene transfer in vertebrate cells, and recent results confirm that SB supports a full spectrum of genetic engineering including transgenesis, insertional mutagenesis, and therapeutic somatic gene transfer both ex vivo and in vivo. The first clinical application of the SB system will help to validate both the safety and efficacy of this approach. In this review, we describe the major transposon systems currently available (with special emphasis on SB), discuss the various parameters and considerations pertinent to their experimental use, and highlight the state of the art in transposon technology in diverse genetic applications.
The CRISPR Spacer Space Is Dominated by Sequences from Species-Specific Mobilomes.
Shmakov, Sergey A; Sitnik, Vassilii; Makarova, Kira S; Wolf, Yuri I; Severinov, Konstantin V; Koonin, Eugene V
2017-09-19
Clustered regularly interspaced short palindromic repeats and CRISPR-associated protein (CRISPR-Cas) systems store the memory of past encounters with foreign DNA in unique spacers that are inserted between direct repeats in CRISPR arrays. For only a small fraction of the spacers, homologous sequences, called protospacers, are detectable in viral, plasmid, and microbial genomes. The rest of the spacers remain the CRISPR "dark matter." We performed a comprehensive analysis of the spacers from all CRISPR-cas loci identified in bacterial and archaeal genomes, and we found that, depending on the CRISPR-Cas subtype and the prokaryotic phylum, protospacers were detectable for 1% to about 19% of the spacers (~7% global average). Among the detected protospacers, the majority, typically 80 to 90%, originated from viral genomes, including proviruses, and among the rest, the most common source was genes that are integrated into microbial chromosomes but are involved in plasmid conjugation or replication. Thus, almost all spacers with identifiable protospacers target mobile genetic elements (MGE). The GC content, as well as dinucleotide and tetranucleotide compositions, of microbial genomes, their spacer complements, and the cognate viral genomes showed a nearly perfect correlation and were almost identical. Given the near absence of self-targeting spacers, these findings are most compatible with the possibility that the spacers, including the dark matter, are derived almost completely from the species-specific microbial mobilomes. IMPORTANCE The principal function of CRISPR-Cas systems is thought to be protection of bacteria and archaea against viruses and other parasitic genetic elements. The CRISPR defense function is mediated by sequences from parasitic elements, known as spacers, that are inserted into CRISPR arrays and then transcribed and employed as guides to identify and inactivate the cognate parasitic genomes.
However, only a small fraction of the CRISPR spacers match any sequences in the current databases, and of these, only a minority correspond to known parasitic elements. We show that nearly all spacers with matches originate from viral or plasmid genomes that are either free or have been integrated into the host genome. We further demonstrate that spacers with no matches have the same properties as those of identifiable origins, strongly suggesting that all spacers originate from mobile elements.
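The comparison of dinucleotide and tetranucleotide compositions described above can be sketched as a correlation between k-mer frequency vectors. This is a minimal illustration and not the authors' pipeline: the function names and the choice of Pearson correlation are assumptions.

```python
import math
from itertools import product

def kmer_freqs(seq, k=2):
    """Relative frequency of every A/C/G/T k-mer, in lexicographic order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip k-mers containing ambiguous bases
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]

def pearson(x, y):
    """Pearson correlation of two equal-length vectors (nan if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else float("nan")
```

Calling `pearson(kmer_freqs(genome, 4), kmer_freqs(spacers, 4))` with a host genome and its concatenated spacers (both hypothetical inputs here) would give one number per genome for the tetranucleotide comparison.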
Whole Wiskott‑Aldrich syndrome protein gene deletion identified by high throughput sequencing.
He, Xiangling; Zou, Runying; Zhang, Bing; You, Yalan; Yang, Yang; Tian, Xin
2017-11-01
Wiskott‑Aldrich syndrome (WAS) is a rare X‑linked recessive immunodeficiency disorder, characterized by thrombocytopenia, small platelets, eczema and recurrent infections associated with increased risk of autoimmunity and malignancy disorders. Mutations in the WAS protein (WASP) gene are responsible for WAS. To date, WASP mutations, including missense/nonsense, splicing, small deletions, small insertions, gross deletions, and gross insertions have been identified in patients with WAS. In addition, WASP‑interacting proteins are suspected in patients with clinical features of WAS, in whom the WASP gene sequence and mRNA levels are normal. The present study aimed to investigate the application of next generation sequencing in definitive diagnosis and clinical therapy for WAS. A 5 month‑old child with WAS who displayed symptoms of thrombocytopenia was examined. Whole exome sequence analysis of genomic DNA showed that the coverage and depth of WASP were extremely low. Quantitative polymerase chain reaction indicated total WASP gene deletion in the proband. In conclusion, high throughput sequencing is useful for the verification of WAS on the genetic profile, and has implications for family planning guidance and establishment of clinical programs.
Kim, Shin-Hee; Nayak, Subhashree; Paldurai, Anandan; Nayak, Baibaswata; Samuel, Arthur; Aplogan, Gilbert L.; Awoume, Kodzo A.; Webby, Richard J.; Ducatez, Mariette F.; Collins, Peter L.
2012-01-01
The complete genome sequence of an African Newcastle disease virus (NDV) strain isolated from a chicken in Togo in 2009 was determined. The genome is 15,198 nucleotides (nt) in length and is classified in genotype VII in the class II cluster. Compared to common vaccine strains, the African strain contains a previously described 6-nt insert in the downstream untranslated region of the N gene and a novel 6-nt insert in the HN-L intergenic region. Genome length differences are a marker of the natural history of NDV. This is the first description of a class II NDV strain with a genome of 15,198 nt and a 6-nt insert in the HN-L intergenic region. Sequence divergence relative to vaccine strains was substantial, likely contributes to outbreaks, and illustrates the continued evolution of new NDV strains in West Africa. PMID:22997417
Fajardo, Diego; Schlautman, Brandon; Steffan, Shawn; Polashock, James; Vorsa, Nicholi; Zalapa, Juan
2014-02-25
This is the first de novo assembly and annotation of a complete mitochondrial genome in the Ericales order from the American cranberry (Vaccinium macrocarpon Ait.). Moreover, only four complete Asterid mitochondrial genomes have been made publicly available. The cranberry mitochondrial genome was assembled and reconstructed from whole genome 454 Roche GS-FLX and Illumina shotgun sequences. Compared with other Asterids, the reconstruction of the genome revealed an average size mitochondrion (459,678 nt) with relatively little repetitive sequences and DNA of plastid origin. The complete mitochondrial genome of cranberry was annotated obtaining a total of 34 genes classified based on their putative function, plus three ribosomal RNAs, and 17 transfer RNAs. Maternal organellar cranberry inheritance was inferred by analyzing gene variation in the cranberry mitochondria and plastid genomes. The annotation of cranberry mitochondrial genome revealed the presence of two copies of tRNA-Sec and a selenocysteine insertion sequence (SECIS) element which were lost in plants during evolution. This is the first report of a land plant possessing selenocysteine insertion machinery at the sequence level. Published by Elsevier B.V.
A mobile threat to genome stability: The impact of non-LTR retrotransposons upon the human genome.
Konkel, Miriam K; Batzer, Mark A
2010-08-01
It is now commonly agreed that the human genome is not the stable entity originally presumed. Deletions, duplications, inversions, and insertions are common, and contribute significantly to genomic structural variations (SVs). Their collective impact generates much of the inter-individual genomic diversity observed among humans. Not only do these variations change the structure of the genome; they may also have functional implications, e.g. altered gene expression. Some SVs have been identified as the cause of genetic disorders, including cancer predisposition. Cancer cells are notorious for their genomic instability, and often show genomic rearrangements at the microscopic and submicroscopic level to which transposable elements (TEs) contribute. Here, we review the role of TEs in genome instability, with particular focus on non-LTR retrotransposons. Currently, three non-LTR retrotransposon families - long interspersed element 1 (L1), SVA (short interspersed element (SINE-R), variable number of tandem repeats (VNTR), and Alu), and Alu (a SINE) elements - mobilize in the human genome, and cause genomic instability through both insertion- and post-insertion-based mutagenesis. Due to the abundance and high sequence identity of TEs, they frequently mislead the homologous recombination repair pathway into non-allelic homologous recombination, causing deletions, duplications, and inversions. While less comprehensively studied, non-LTR retrotransposon insertions and TE-mediated rearrangements are probably more common in cancer cells than in healthy tissue. This may be at least partially attributed to the commonly seen global hypomethylation as well as general epigenetic dysfunction of cancer cells. Where possible, we provide examples that impact cancer predisposition and/or development. Copyright © 2010 Elsevier Ltd. All rights reserved.
Genetics of attention deficit hyperactivity disorder.
Faraone, Stephen V; Larsson, Henrik
2018-06-11
Decades of research show that genes play a vital role in the etiology of attention deficit hyperactivity disorder (ADHD) and its comorbidity with other disorders. Family, twin, and adoption studies show that ADHD runs in families. ADHD's high heritability of 74% motivated the search for ADHD susceptibility genes. Genetic linkage studies show that the effects of DNA risk variants on ADHD must, individually, be very small. Genome-wide association studies (GWAS) have implicated several genetic loci at the genome-wide level of statistical significance. These studies also show that about a third of ADHD's heritability is due to a polygenic component comprising many common variants each having small effects. From studies of copy number variants we have also learned that rare insertions or deletions account for part of ADHD's heritability. These findings have implicated new biological pathways that may eventually have implications for treatment development.
The insertional history of an active family of L1 retrotransposons in humans.
Boissinot, Stéphane; Entezam, Ali; Young, Lynn; Munson, Peter J; Furano, Anthony V
2004-07-01
As humans contain a currently active L1 (LINE-1) non-LTR retrotransposon family (Ta-1), the human genome database likely provides only a partial picture of Ta-1-generated diversity. We used a non-biased method to clone Ta-1 retrotransposon-containing loci from representatives of four ethnic populations. We obtained 277 distinct Ta-1 loci and identified an additional 67 loci in the human genome database. This collection represents approximately 90% of the Ta-1 population in the individuals examined and is thus more representative of the insertional history of Ta-1 than the human genome database, which lacked approximately 40% of our cloned Ta-1 elements. As both polymorphic and fixed Ta-1 elements are as abundant in the GC-poor genomic regions as in ancestral L1 elements, the enrichment of L1 elements in GC-poor areas is likely due to insertional bias rather than selection. Although the chromosomal distribution of Ta-1 inserts is generally a function of chromosomal length and gene density, chromosome 4 significantly deviates from this pattern and has been much more hospitable to Ta-1 insertions than any other chromosome. Also, the intra-chromosomal distribution of Ta-1 elements is not uniform. Ta-1 elements tend to cluster, and the maximal gaps between Ta-1 inserts are larger than would be expected from a model of uniform random insertion. Copyright 2004 Cold Spring Harbor Laboratory Press ISSN
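The comparison of observed maximal gaps with a model of uniform random insertion can be illustrated with a small Monte Carlo simulation. This is a sketch under simplifying assumptions (a single linear chromosome, integer coordinates, hypothetical function names) and not the statistical procedure used in the study.

```python
import random

def max_gap(positions, length):
    """Largest distance between neighbouring inserts, counting both chromosome ends."""
    pts = [0] + sorted(positions) + [length]
    return max(b - a for a, b in zip(pts, pts[1:]))

def simulated_max_gaps(n_inserts, length, n_trials, seed=0):
    """Maximal gaps from n_trials placements of n_inserts uniform random sites."""
    rng = random.Random(seed)
    return [
        max_gap([rng.randrange(length) for _ in range(n_inserts)], length)
        for _ in range(n_trials)
    ]

def empirical_p(observed_gap, null_gaps):
    """Fraction of simulated maximal gaps at least as large as the observed one."""
    return sum(g >= observed_gap for g in null_gaps) / len(null_gaps)
```

A small empirical p-value for the observed maximal gap would indicate that such a large insert-free stretch is unlikely under uniform random insertion, which is the pattern the abstract reports for Ta-1.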
High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.
Inagaki, Soichi; Henry, Isabelle M; Lieberman, Meric C; Comai, Luca
2015-01-01
Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the need for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.
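The notion of read pairs that span a genome/T-DNA junction can be sketched as follows. This is not the tool described in the abstract: the input is a simplified, hypothetical list of already-aligned pairs, the reference name "T-DNA" for the vector sequence is an assumption, and the 500-bp clustering window is arbitrary.

```python
from collections import defaultdict

def junction_pairs(pairs, foreign="T-DNA"):
    """Return pairs with exactly one mate on the foreign sequence.

    Each pair is (name, ref1, pos1, ref2, pos2); the genome-side mate of a
    returned pair approximates the insertion site.
    """
    hits = []
    for name, ref1, pos1, ref2, pos2 in pairs:
        if (ref1 == foreign) != (ref2 == foreign):
            genome_ref, genome_pos = (ref2, pos2) if ref1 == foreign else (ref1, pos1)
            hits.append((name, genome_ref, genome_pos))
    return hits

def cluster_sites(hits, window=500):
    """Group genome-side positions on one chromosome that lie within `window` bp."""
    by_chrom = defaultdict(list)
    for _, chrom, pos in hits:
        by_chrom[chrom].append(pos)
    sites = []
    for chrom, positions in sorted(by_chrom.items()):
        positions.sort()
        start = prev = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - prev > window:
                sites.append((chrom, start, prev, count))
                start, count = pos, 0
            prev = pos
            count += 1
        sites.append((chrom, start, prev, count))
    return sites

# Hypothetical alignments: (pair name, mate 1 reference, position, mate 2 reference, position)
pairs = [
    ("r1", "Chr1", 1000, "T-DNA", 50),
    ("r2", "T-DNA", 4900, "Chr1", 1200),
    ("r3", "Chr1", 3000, "Chr1", 3300),
    ("r4", "T-DNA", 10, "T-DNA", 300),
    ("r5", "Chr3", 70000, "T-DNA", 4800),
]
sites = cluster_sites(junction_pairs(pairs))
```

Each entry of `sites` is (chromosome, first position, last position, supporting pair count). In practice the pairs would be read from an alignment file, and split reads crossing the junction itself would refine the breakpoint.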
Unique transposon landscapes are pervasive across Drosophila melanogaster genomes
Rahman, Reazur; Chirn, Gung-wei; Kanodia, Abhay; Sytnikova, Yuliya A.; Brembs, Björn; Bergman, Casey M.; Lau, Nelson C.
2015-01-01
To understand how transposon landscapes (TLs) vary across animal genomes, we describe a new method called the Transposon Insertion and Depletion AnaLyzer (TIDAL) and a database of >300 TLs in Drosophila melanogaster (TIDAL-Fly). Our analysis reveals pervasive TL diversity across cell lines and fly strains, even for identically named sub-strains from different laboratories such as the ISO1 strain used for the reference genome sequence. On average, >500 novel insertions exist in every lab strain, inbred strains of the Drosophila Genetic Reference Panel (DGRP), and fly isolates in the Drosophila Genome Nexus (DGN). A minority (<25%) of transposon families comprise the majority (>70%) of TL diversity across fly strains. A sharp contrast between insertion and depletion patterns indicates that many transposons are unique to the ISO1 reference genome sequence. Although TL diversity from fly strains reaches asymptotic limits with increasing sequencing depth, rampant TL diversity causes unsaturated detection of TLs in pools of flies. Finally, we show novel transposon insertions negatively correlate with Piwi-interacting RNA (piRNA) levels for most transposon families, except for the highly-abundant roo retrotransposon. Our study provides a useful resource for Drosophila geneticists to understand how transposons create extensive genomic diversity in fly cell lines and strains. PMID:26578579
Insertional mutagenesis using Tnt1 retrotransposon in potato
USDA-ARS's Scientific Manuscript database
Potato is the third most important food crop in the world. However, genetics and genomics research of potato has lagged behind many major crop species due to its autotetraploidy and a highly heterogeneous genome. Insertional mutagenesis using T-DNA or transposable elements, which is available in sev...
2014-01-01
Background: Trichomonas vaginalis is the most prevalent non-viral sexually transmitted parasite. Although the protist is presumed to reproduce asexually, 60% of its haploid genome contains transposable elements (TEs), known contributors to genome variability. The availability of a draft genome sequence and our collection of >200 global isolates of T. vaginalis facilitate the study and analysis of TE population dynamics and their contribution to genomic variability in this protist. Results: We present here a pilot study of a subset of class II Tc1/mariner TEs that belong to the T. vaginalis Tvmar1 family. We report the genetic structure of 19 Tvmar1 loci, their ability to encode a full-length transposase protein, and their insertion frequencies in 94 global isolates from seven regions of the world. While most of the Tvmar1 elements studied exhibited low insertion frequencies, two of the 19 loci (locus 1 and locus 9) show high insertion frequencies of 1.00 and 0.96, respectively. The genetic structuring of the global populations identified by principal component analysis (PCA) of the Tvmar1 loci is in general agreement with published data based on genotyping, showing that Tvmar1 polymorphisms are a robust indicator of T. vaginalis genetic history. Analysis of expression of 22 genes flanking 13 Tvmar1 loci indicated significantly altered expression of six of the genes next to five Tvmar1 insertions, suggesting that the insertions have functional implications for T. vaginalis gene expression. Conclusions: Our study is the first in T. vaginalis to describe Tvmar1 population dynamics and its contribution to genetic variability of the parasite. We show that a majority of our studied Tvmar1 insertion loci exist at very low frequencies in the global population, and insertions are variable between geographical isolates. In addition, we observe that low frequency insertion is related to reduced or abolished expression of flanking genes.
While low insertion frequencies might be expected, we identified two Tvmar1 insertion loci that are fixed across global populations. This observation indicates that Tvmar1 insertion may have differing impacts and fitness costs in the host genome and may play varying roles in the adaptive evolution of T. vaginalis. PMID:24834134
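The insertion frequency of a locus is the fraction of scored isolates that carry the insertion. A minimal sketch of that arithmetic is shown below; the data layout (1 for present, 0 for absent, None for no call) and the example genotypes are hypothetical and are not the study's data.

```python
def insertion_frequencies(calls):
    """Map each locus to the fraction of scored isolates carrying the insertion.

    calls maps locus -> list of 1 (present), 0 (absent) or None (no call).
    Loci with no scored isolates map to None.
    """
    freqs = {}
    for locus, genotypes in calls.items():
        scored = [g for g in genotypes if g is not None]
        freqs[locus] = sum(scored) / len(scored) if scored else None
    return freqs

# Hypothetical presence/absence calls for three loci
calls = {
    "locus_A": [1] * 25,
    "locus_B": [1] * 24 + [0],
    "locus_C": [0, 0, 1, None, 0],
}
freqs = insertion_frequencies(calls)
```

Missing calls are excluded from the denominator, so loci genotyped in different numbers of isolates remain comparable.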
Complete Genomic Structure of the Bloom-forming Toxic Cyanobacterium Microcystis aeruginosa NIES-843
Kaneko, Takakazu; Nakajima, Nobuyoshi; Okamoto, Shinobu; Suzuki, Iwane; Tanabe, Yuuhiko; Tamaoki, Masanori; Nakamura, Yasukazu; Kasai, Fumie; Watanabe, Akiko; Kawashima, Kumiko; Kishida, Yoshie; Ono, Akiko; Shimizu, Yoshimi; Takahashi, Chika; Minami, Chiharu; Fujishiro, Tsunakazu; Kohara, Mitsuyo; Katoh, Midori; Nakazaki, Naomi; Nakayama, Shinobu; Yamada, Manabu; Tabata, Satoshi; Watanabe, Makoto M.
2007-01-01
The nucleotide sequence of the complete genome of a cyanobacterium, Microcystis aeruginosa NIES-843, was determined. The genome of M. aeruginosa is a single, circular chromosome of 5 842 795 base pairs (bp) in length, with an average GC content of 42.3%. The chromosome comprises 6312 putative protein-encoding genes, two sets of rRNA genes, 42 tRNA genes representing 41 tRNA species, and genes for tmRNA, the B subunit of RNase P, SRP RNA, and 6Sa RNA. Forty-five percent of the putative protein-encoding sequences showed sequence similarity to genes of known function, 32% were similar to hypothetical genes, and the remaining 23% had no apparent similarity to reported genes. A total of 688 kb of the genome, equivalent to 11.8% of the entire genome, were composed of both insertion sequences and miniature inverted-repeat transposable elements. This is indicative of a plasticity of the M. aeruginosa genome, through a mechanism that involves homologous recombination mediated by repetitive DNA elements. In addition to known gene clusters related to the synthesis of microcystin and cyanopeptolin, novel gene clusters that may be involved in the synthesis and modification of toxic small polypeptides were identified. Compared with other cyanobacteria, a relatively small number of genes for two component systems and a large number of genes for restriction-modification systems were notable characteristics of the M. aeruginosa genome. PMID:18192279
Lim, Kwang-il; Klimczak, Ryan; Yu, Julie H.; Schaffer, David V.
2010-01-01
Retroviral vectors offer benefits of efficient delivery and stable gene expression; however, their clinical use raises the concerns of insertional mutagenesis and potential oncogenesis due to genomic integration preferences in transcriptional start sites (TSS). We have shifted the integration preferences of retroviral vectors by generating a library of viral variants with a DNA-binding domain inserted at random positions throughout murine leukemia virus Gag-Pol, then selecting for variants that are viable and exhibit altered integration properties. We found seven permissive zinc finger domain (ZFD) insertion sites throughout Gag-Pol, including within p12, reverse transcriptase, and integrase. Comprehensive genome integration analysis showed that several ZFD insertions yielded retroviral vector variants with shifted integration patterns that did not favor TSS. Furthermore, integration site analysis revealed selective integration for numerous mutants. For example, two retroviral variants with a given ZFD at appropriate positions in Gag-Pol strikingly integrated primarily into four common sites out of 3.1 × 10^9 possible human genome locations (P = 4.6 × 10^-29). Our findings demonstrate that insertion of DNA-binding motifs into multiple locations in Gag-Pol can make considerable progress toward engineering safer retroviral vectors that integrate into a significantly narrowed pool of sites on human genome and overcome the preference for TSS. PMID:20616052
Efficient mouse genome engineering by CRISPR-EZ technology.
Modzelewski, Andrew J; Chen, Sean; Willis, Brandon J; Lloyd, K C Kent; Wood, Joshua A; He, Lin
2018-06-01
CRISPR/Cas9 technology has transformed mouse genome editing with unprecedented precision, efficiency, and ease; however, the current practice of microinjecting CRISPR reagents into pronuclear-stage embryos remains rate-limiting. We thus developed CRISPR ribonucleoprotein (RNP) electroporation of zygotes (CRISPR-EZ), an electroporation-based technology that outperforms pronuclear and cytoplasmic microinjection in efficiency, simplicity, cost, and throughput. In C57BL/6J and C57BL/6N mouse strains, CRISPR-EZ achieves 100% delivery of Cas9/single-guide RNA (sgRNA) RNPs, facilitating indel mutations (insertions or deletions), exon deletions, point mutations, and small insertions. In a side-by-side comparison in the high-throughput KnockOut Mouse Project (KOMP) pipeline, CRISPR-EZ consistently outperformed microinjection. Here, we provide an optimized protocol covering sgRNA synthesis, embryo collection, RNP electroporation, mouse generation, and genotyping strategies. Using CRISPR-EZ, a graduate-level researcher with basic embryo-manipulation skills can obtain genetically modified mice in 6 weeks. Altogether, CRISPR-EZ is a simple, economic, efficient, and high-throughput technology that is potentially applicable to other mammalian species.
Sternburg, Erin L; Dias, Kristen C; Karginov, Fedor V
2017-06-16
The CRISPR/Cas9 genome engineering system has revolutionized biology by allowing for precise genome editing with little effort. Guided by a single guide RNA (sgRNA) that confers specificity, the Cas9 protein cleaves both DNA strands at the targeted locus. The DNA break can trigger either non-homologous end joining (NHEJ) or homology directed repair (HDR). NHEJ can introduce small deletions or insertions which lead to frame-shift mutations, while HDR allows for larger and more precise perturbations. Here, we present protocols for generating knockout cell lines by coupling established CRISPR/Cas9 methods with two options for downstream selection/screening. The NHEJ approach uses a single sgRNA cut site and selection-independent screening, where protein production is assessed by dot immunoblot in a high-throughput manner. The HDR approach uses two sgRNA cut sites that span the gene of interest. Together with a provided HDR template, this method can achieve deletion of tens of kb, aided by the inserted selectable resistance marker. The appropriate applications and advantages of each method are discussed.
CRISPR-Cpf1 assisted genome editing of Corynebacterium glutamicum
Jiang, Yu; Qian, Fenghui; Yang, Junjie; Liu, Yingmiao; Dong, Feng; Xu, Chongmao; Sun, Bingbing; Chen, Biao; Xu, Xiaoshu; Li, Yan; Wang, Renxiao; Yang, Sheng
2017-01-01
Corynebacterium glutamicum is an important industrial metabolite producer that is difficult to genetically engineer. Although the Streptococcus pyogenes (Sp) CRISPR-Cas9 system has been adapted for genome editing of multiple bacteria, it cannot be introduced into C. glutamicum. Here we report a Francisella novicida (Fn) CRISPR-Cpf1-based genome-editing method for C. glutamicum. CRISPR-Cpf1, combined with single-stranded DNA (ssDNA) recombineering, precisely introduces small changes into the bacterial genome at efficiencies of 86–100%. Large gene deletions and insertions are also obtained using an all-in-one plasmid consisting of FnCpf1, CRISPR RNA, and homologous arms. The two CRISPR-Cpf1-assisted systems enable N iterative rounds of genome editing in 3N+4 or 3N+2 days. A proof-of-concept, codon saturation mutagenesis at G149 of γ-glutamyl kinase relieves L-proline inhibition using Cpf1-assisted ssDNA recombineering. Thus, CRISPR-Cpf1-based genome editing provides a highly efficient tool for genetic engineering of Corynebacterium and other bacteria that cannot utilize the Sp CRISPR-Cas9 system. PMID:28469274
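The turnaround times quoted in this abstract (3N+4 or 3N+2 days for N iterative rounds) are easy to tabulate. The following is a minimal sketch; the function name and the `fixed_days` parameter are hypothetical, and the abstract does not say which of the two systems corresponds to which formula, so both are simply shown side by side.

```python
def total_days(n_rounds: int, fixed_days: int) -> int:
    """Days needed for n_rounds of iterative editing under a 3N + fixed_days schedule."""
    if n_rounds < 1:
        raise ValueError("n_rounds must be at least 1")
    return 3 * n_rounds + fixed_days


# Compare the two schedules quoted in the abstract for a few values of N.
for n in (1, 2, 5):
    print(f"N={n}: 3N+4 -> {total_days(n, 4)} days, 3N+2 -> {total_days(n, 2)} days")
```

Under either schedule each additional round costs three days; only the fixed overhead differs.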
Liu, Siyang; Huang, Shujia; Rao, Junhua; Ye, Weijian; Krogh, Anders; Wang, Jun
2015-01-01
Comprehensive recognition of genomic variation in one individual is important for understanding disease and developing personalized medication and treatment. Many tools based on DNA re-sequencing exist for identification of single nucleotide polymorphisms, small insertions and deletions (indels) as well as large deletions. However, these approaches consistently display a substantial bias against the recovery of complex structural variants and novel sequence in individual genomes and do not provide interpretation information such as the annotation of ancestral state and formation mechanism. We present a novel approach implemented in a single software package, AsmVar, to discover, genotype and characterize different forms of structural variation and novel sequence from population-scale de novo genome assemblies up to nucleotide resolution. Application of AsmVar to several human de novo genome assemblies captures a wide spectrum of structural variants and novel sequences present in the human population in high sensitivity and specificity. Our method provides a direct solution for investigating structural variants and novel sequences from de novo genome assemblies, facilitating the construction of population-scale pan-genomes. Our study also highlights the usefulness of the de novo assembly strategy for definition of genome structure.
2009-01-01
Background Insertional mutagenesis is an effective method for functional genomic studies in various organisms. It can rapidly generate easily tractable mutations. A large-scale insertional mutagenesis with the piggyBac (PB) transposon is currently performed in mice at the Institute of Developmental Biology and Molecular Medicine (IDM), Fudan University in Shanghai, China. This project is carried out via collaborations among multiple groups overseeing interconnected experimental steps and generates a large volume of experimental data continuously. Therefore, the project calls for an efficient database system for recording, management, statistical analysis, and information exchange. Results This paper presents a database application called MP-PBmice (insertional mutation mapping system of PB Mutagenesis Information Center), which is developed to serve the on-going large-scale PB insertional mutagenesis project. A lightweight enterprise-level development framework Struts-Spring-Hibernate is used here to ensure constructive and flexible support to the application. The MP-PBmice database system has three major features: strict access-control, efficient workflow control, and good expandability. It supports the collaboration among different groups that enter data and exchange information on daily basis, and is capable of providing real time progress reports for the whole project. MP-PBmice can be easily adapted for other large-scale insertional mutation mapping projects and the source code of this software is freely available at http://www.idmshanghai.cn/PBmice. Conclusion MP-PBmice is a web-based application for large-scale insertional mutation mapping onto the mouse genome, implemented with the widely used framework Struts-Spring-Hibernate. This system is already in use by the on-going genome-wide PB insertional mutation mapping project at IDM, Fudan University. PMID:19958505
High-throughput analysis of T-DNA location and structure using sequence capture
Inagaki, Soichi; Henry, Isabelle M.; Lieberman, Meric C.; ...
2015-10-07
Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the needs for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. As a result, our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.
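The idea of finding read pairs that span a genome/T-DNA junction can be illustrated with a small sketch. This is not the authors' tool, whose implementation the abstract does not describe. It assumes each mate has already been aligned to a combined reference (genome plus T-DNA vectors), and every type, function, and sequence name below is hypothetical.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class Mate(NamedTuple):
    ref: str  # name of the reference sequence the mate aligned to
    pos: int  # leftmost alignment position on that reference


class ReadPair(NamedTuple):
    name: str
    mate1: Mate
    mate2: Mate


def junction_pairs(pairs: Iterable[ReadPair], tdna_refs: Set[str]) -> List[Tuple[str, str, int]]:
    """Return (pair name, genome reference, genome position) for every pair
    in which exactly one mate aligns to a T-DNA reference."""
    hits = []
    for pair in pairs:
        m1_is_tdna = pair.mate1.ref in tdna_refs
        m2_is_tdna = pair.mate2.ref in tdna_refs
        if m1_is_tdna != m2_is_tdna:  # one mate on T-DNA, the other on the genome
            genome_mate = pair.mate2 if m1_is_tdna else pair.mate1
            hits.append((pair.name, genome_mate.ref, genome_mate.pos))
    return hits


# Made-up alignments, for illustration only.
example_pairs = [
    ReadPair("r1", Mate("Chr1", 10500), Mate("Chr1", 10750)),       # both mates on the genome
    ReadPair("r2", Mate("Chr3", 88210), Mate("T-DNA_A", 40)),       # spans a junction
    ReadPair("r3", Mate("T-DNA_A", 300), Mate("T-DNA_A", 520)),     # both mates inside the T-DNA
    ReadPair("r4", Mate("T-DNA_B", 15), Mate("Chr5", 4021)),        # spans a junction
]
hits = junction_pairs(example_pairs, {"T-DNA_A", "T-DNA_B"})
print(hits)
```

Clustering the genome-side positions of such pairs would then point to candidate insertion sites; split reads that cross the junction itself are not handled in this sketch.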
Garazha, Andrew; Ivanova, Alena; Suntsova, Maria; Malakhova, Galina; Roumiantsev, Sergey; Zhavoronkov, Alex; Buzdin, Anton
2015-01-01
Endogenous retroviruses (ERVs) and LTR retrotransposons (LRs) occupy ∼8% of human genome. Deep sequencing technologies provide clues to understanding of functional relevance of individual ERVs/LRs by enabling direct identification of transcription factor binding sites (TFBS) and other landmarks of functional genomic elements. Here, we performed the genome-wide identification of human ERVs/LRs containing TFBS according to the ENCODE project. We created the first interactive ERV/LRs database that groups the individual inserts according to their familial nomenclature, number of mapped TFBS and divergence from their consensus sequence. Information on any particular element can be easily extracted by the user. We also created a genome browser tool, which enables quick mapping of any ERV/LR insert according to genomic coordinates, known human genes and TFBS. These tools can be used to easily explore functionally relevant individual ERV/LRs, and for studying their impact on the regulation of human genes. Overall, we identified ∼110,000 ERV/LR genomic elements having TFBS. We propose a hypothesis of "domestication" of ERV/LR TFBS by the genome milieu including subsequent stages of initial epigenetic repression, partial functional release, and further mutation-driven reshaping of TFBS in tight coevolution with the enclosing genomic loci.
Population and clinical genetics of human transposable elements in the (post) genomic era
Rishishwar, Lavanya; Wang, Lu; Clayton, Evan A.; Mariño-Ramírez, Leonardo; McDonald, John F.; Jordan, I. King
2017-01-01
ABSTRACT Recent technological developments—in genomics, bioinformatics and high-throughput experimental techniques—are providing opportunities to study ongoing human transposable element (TE) activity at an unprecedented level of detail. It is now possible to characterize genome-wide collections of TE insertion sites for multiple human individuals, within and between populations, and for a variety of tissue types. Comparison of TE insertion site profiles between individuals captures the germline activity of TEs and reveals insertion site variants that segregate as polymorphisms among human populations, whereas comparison among tissue types ascertains somatic TE activity that generates cellular heterogeneity. In this review, we provide an overview of these new technologies and explore their implications for population and clinical genetic studies of human TEs. We cover both recent published results on human TE insertion activity as well as the prospects for future TE studies related to human evolution and health. PMID:28228978
Konkel, Miriam K; Walker, Jerilyn A; Hotard, Ashley B; Ranck, Megan C; Fontenot, Catherine C; Storer, Jessica; Stewart, Chip; Marth, Gabor T; Batzer, Mark A
2015-08-29
The goal of the 1000 Genomes Consortium is to characterize human genome structural variation (SV), including forms of copy number variations such as deletions, duplications, and insertions. Mobile element insertions, particularly Alu elements, are major contributors to genomic SV among humans. During the pilot phase of the project we experimentally validated 645 (611 intergenic and 34 exon targeted) polymorphic "young" Alu insertion events, absent from the human reference genome. Here, we report high resolution sequencing of 343 (322 unique) recent Alu insertion events, along with their respective target site duplications, precise genomic breakpoint coordinates, subfamily assignment, percent divergence, and estimated A-rich tail lengths. All the sequenced Alu loci were derived from the AluY lineage with no evidence of retrotransposition activity involving older Alu families (e.g., AluJ and AluS). AluYa5 is currently the most active Alu subfamily in the human lineage, followed by AluYb8, and many others including three newly identified subfamilies we have termed AluYb7a3, AluYb8b1, and AluYa4a1. This report provides the structural details of 322 unique Alu variants from individual human genomes collectively adding about 100 kb of genomic variation. Many Alu subfamilies are currently active in human populations, including a surprising level of AluY retrotransposition. Human Alu subfamilies exhibit continuous evolution with potential drivers sprouting new Alu lineages. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Rapid discrimination of sequences flanking and within T-DNA insertions in the Arabidopsis genome.
Ponce, M R; Quesada, V; Micol, J L
1998-05-01
An improvement to previous methods for recovering Arabidopsis thaliana genomic DNA flanking T-DNA insertions is presented that allows for the avoidance of some of the cloning difficulties caused by the concatameric nature of T-DNA inserts. The principle of the procedure is to categorize by size restriction fragments of mutant DNA, produced in separate digestions with NdeI and Bst1107I. Given that the sites for these two enzymes are contiguous within the pGV3850:1003 T-DNA construct, the restriction fragments obtained fall into two categories: those showing identical size in both digestions, which correspond to sequences internal to T-DNA concatamers; and those of different sizes, that contain the junctions between plant DNA and the T-DNA insert. Such a criterion makes it possible to easily distinguish the digestion products corresponding to internal T-DNA parts, which do not deserve further attention, and those which presumably include a segment of the locus of interest. Discrimination between restriction fragments of genomic mutant DNA can be made on rescued plasmids, inverse PCR amplification products or bands in a genomic blot.
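The size-comparison criterion described in this abstract can be sketched as follows. The fragment sizes are invented, and the function name and the tolerance parameter (to allow for gel measurement error) are assumptions rather than part of the published procedure.

```python
from typing import Dict, List, Sequence


def classify_fragments(
    nde1_sizes: Sequence[int],
    bst1107i_sizes: Sequence[int],
    tolerance_bp: int = 0,
) -> Dict[str, List[int]]:
    """Split fragment sizes into those shared by both digests (internal to
    T-DNA concatamers) and those unique to one digest (candidate junctions)."""

    def has_match(size: int, others: Sequence[int]) -> bool:
        return any(abs(size - other) <= tolerance_bp for other in others)

    return {
        "internal": sorted(s for s in nde1_sizes if has_match(s, bst1107i_sizes)),
        "junction_NdeI": sorted(s for s in nde1_sizes if not has_match(s, bst1107i_sizes)),
        "junction_Bst1107I": sorted(s for s in bst1107i_sizes if not has_match(s, nde1_sizes)),
    }


# Hypothetical fragment sizes (bp) from the two separate digestions.
result = classify_fragments([4200, 2500, 6100], [4200, 2500, 3300])
print(result)
```

Fragments listed under the two junction keys are the ones that would merit further analysis, since fragments of identical size in both digestions are taken to be internal T-DNA sequence.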
Insertion sequence diversity in archaea.
Filée, J; Siguier, P; Chandler, M
2007-03-01
Insertion sequences (ISs) can constitute an important component of prokaryotic (bacterial and archaeal) genomes. Over 1,500 individual ISs are included at present in the ISfinder database (www-is.biotoul.fr), and these represent only a small portion of those in the available prokaryotic genome sequences and those that are being discovered in ongoing sequencing projects. In spite of this diversity, the transposition mechanisms of only a few of these ubiquitous mobile genetic elements are known, and these are all restricted to those present in bacteria. This review presents an overview of ISs within the archaeal kingdom. We first provide a general historical summary of the known properties and behaviors of archaeal ISs. We then consider how transposition might be regulated in some cases by small antisense RNAs and by termination codon readthrough. This is followed by an extensive analysis of the IS content in the sequenced archaeal genomes present in the public databases as of June 2006, which provides an overview of their distribution among the major archaeal classes and species. We show that the diversity of archaeal ISs is very great and comparable to that of bacteria. We compare archaeal ISs to known bacterial ISs and find that most are clearly members of families first described for bacteria. Several cases of lateral gene transfer between bacteria and archaea are clearly documented, notably for methanogenic archaea. However, several archaeal ISs do not have bacterial equivalents but can be grouped into Archaea-specific groups or families. In addition to ISs, we identify and list nonautonomous IS-derived elements, such as miniature inverted-repeat transposable elements. Finally, we present a possible scenario for the evolutionary history of ISs in the Archaea.
Schiavo, G; Strillacci, M G; Ribani, A; Bovo, S; Roman-Ponce, S I; Cerolini, S; Bertolini, F; Bagnato, A; Fontanesi, L
2018-06-01
Mitochondrial DNA (mtDNA) insertions have been detected in the nuclear genome of many eukaryotes. These sequences are pseudogenes originated by horizontal transfer of mtDNA fragments into the nuclear genome, producing nuclear DNA sequences of mitochondrial origin (numt). In this study we determined the frequency and distribution of mtDNA-originated pseudogenes in the turkey (Meleagris gallopavo) nuclear genome. The turkey reference genome (Turkey_2.01) was aligned with the reference linearized mtDNA sequence using LAST. A total of 32 numt sequences (corresponding to 18 numt regions derived by unique insertional events) were identified in the turkey nuclear genome (size ranging from 66 to 1415 bp; identity against the modern turkey mtDNA corresponding region ranging from 62% to 100%). Numts were distributed in nine chromosomes and in one scaffold. They derived from parts of 10 mtDNA protein-coding genes, ribosomal genes, the control region and 10 tRNA genes. Seven numt regions reported in the turkey genome were identified in orthologous positions in the Gallus gallus genome and therefore were present in the ancestral genome that gave rise, in the Cretaceous, to the lineages of the modern crown Galliformes. Five recently integrated turkey numts were validated by PCR in 168 turkeys of six different domestic populations. None of the analysed numts were polymorphic (i.e. absence of the inserted sequence, as reported in numts of recent integration in other species), suggesting that the reticulate speciation model is not useful for explaining the origin of the domesticated turkey lineage. © 2018 Stichting International Foundation for Animal Genetics.
Arnaiz, Olivier; Mathy, Nathalie; Baudry, Céline; Malinsky, Sophie; Aury, Jean-Marc; Denby Wilkes, Cyril; Garnier, Olivier; Labadie, Karine; Lauderdale, Benjamin E.; Le Mouël, Anne; Marmignon, Antoine; Nowacki, Mariusz; Poulain, Julie; Prajer, Malgorzata; Wincker, Patrick; Meyer, Eric; Duharcourt, Sandra; Duret, Laurent; Bétermier, Mireille; Sperling, Linda
2012-01-01
Insertions of parasitic DNA within coding sequences are usually deleterious and are generally counter-selected during evolution. Thanks to nuclear dimorphism, ciliates provide unique models to study the fate of such insertions. Their germline genome undergoes extensive rearrangements during development of a new somatic macronucleus from the germline micronucleus following sexual events. In Paramecium, these rearrangements include precise excision of unique-copy Internal Eliminated Sequences (IES) from the somatic DNA, requiring the activity of a domesticated piggyBac transposase, PiggyMac. We have sequenced Paramecium tetraurelia germline DNA, establishing a genome-wide catalogue of ∼45,000 IESs, in order to gain insight into their evolutionary origin and excision mechanism. We obtained direct evidence that PiggyMac is required for excision of all IESs. Homology with known P. tetraurelia Tc1/mariner transposons, described here, indicates that at least a fraction of IESs derive from these elements. Most IES insertions occurred before a recent whole-genome duplication that preceded diversification of the P. aurelia species complex, but IES invasion of the Paramecium genome appears to be an ongoing process. Once inserted, IESs decay rapidly by accumulation of deletions and point substitutions. Over 90% of the IESs are shorter than 150 bp and present a remarkable size distribution with a ∼10 bp periodicity, corresponding to the helical repeat of double-stranded DNA and suggesting DNA loop formation during assembly of a transpososome-like excision complex. IESs are equally frequent within and between coding sequences; however, excision is not 100% efficient and there is selective pressure against IES insertions, in particular within highly expressed genes. 
We discuss the possibility that ancient domestication of a piggyBac transposase favored subsequent propagation of transposons throughout the germline by allowing insertions in coding sequences, a fraction of the genome in which parasitic DNA is not usually tolerated. PMID:23071448
Burke, W D; Calalang, C C; Eickbush, T H
1987-01-01
Two classes of DNA elements interrupt a fraction of the rRNA repeats of Bombyx mori. We have analyzed by genomic blotting and sequence analysis one class of these elements which we have named R2. These elements occupy approximately 9% of the rDNA units of B. mori and appear to be homologous to the type II rDNA insertions detected in Drosophila melanogaster. Approximately 25 copies of R2 exist within the B. mori genome, of which at least 20 are located at a precise location within otherwise typical rDNA units. Nucleotide sequence analysis has revealed that the 4.2-kilobase-pair R2 element has a single large open reading frame, occupying over 82% of the total length of the element. The central region of this 1,151-amino-acid open reading frame shows homology to the reverse transcriptase enzymes found in retroviruses and certain transposable elements. Amino acid homology of this region is highest to the mobile line 1 elements of mammals, followed by the mitochondrial type II introns of fungi, and the pol gene of retroviruses. Less homology exists with transposable elements of D. melanogaster and Saccharomyces cerevisiae. Two additional regions of sequence homology between L1 and R2 elements were also found outside the reverse transcriptase region. We suggest that the R2 elements are retrotransposons that are site specific in their insertion into the genome. Such mobility would enable these elements to occupy a small fraction of the rDNA units of B. mori despite their continual elimination from the rDNA locus by sequence turnover. PMID:2439905
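The statement that the open reading frame covers over 82% of the R2 element can be checked from the two figures given in the abstract. The sketch below assumes the rounded element length of 4.2 kb (taken as 4,200 bp) and three nucleotides per codon, without counting a stop codon; the variable names are illustrative.

```python
ELEMENT_BP = 4_200    # 4.2-kilobase-pair R2 element (rounded value from the abstract)
ORF_CODONS = 1_151    # length of the open reading frame in amino acids

orf_bp = ORF_CODONS * 3                 # nucleotides encoding the ORF
orf_pct = 100.0 * orf_bp / ELEMENT_BP   # share of the element occupied by the ORF

print(f"ORF spans {orf_bp} bp, about {orf_pct:.1f}% of the element")
# prints "ORF spans 3453 bp, about 82.2% of the element"
```

Because the element length is rounded, the percentage is approximate, but it is consistent with the figure reported in the abstract.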
Shi, Xue; Zeng, Haiyang; Xue, Yadong; Luo, Meizhong
2011-10-11
Large-insert BAC and BIBAC libraries are important tools for structural and functional genomics studies of eukaryotic genomes. To facilitate the construction of BAC and BIBAC libraries and the transfer of complete large BAC inserts into BIBAC vectors, which is desired in positional cloning, we developed a pair of new BAC and BIBAC vectors. The new BAC vector pIndigoBAC536-S and the new BIBAC vector BIBAC-S have the following features: 1) both contain two 18-bp non-palindromic I-SceI sites in an inverted orientation at positions that flank an identical DNA fragment containing the lacZ selection marker and the cloning site. Large DNA inserts can be excised from the vectors as single fragments by cutting with I-SceI, allowing the inserts to be easily sized. More importantly, because the two vectors contain different antibiotic resistance genes for transformant selection and produce the same non-complementary 3' protruding ATAA ends by I-SceI that suppress self- and inter-ligations, the exchange of intact large genomic DNA inserts between the BAC and BIBAC vectors is straightforward; 2) both were constructed as high-copy composite vectors. Reliable linearized and dephosphorylated original low-copy pIndigoBAC536-S and BIBAC-S vectors that are ready for library construction can be prepared from the high-copy composite vectors pHZAUBAC1 and pHZAUBIBAC1, respectively, without the need for additional preparation steps or special reagents, thus simplifying the construction of BAC and BIBAC libraries. BIBAC clones constructed with the new BIBAC-S vector are stable in both E. coli and Agrobacterium. The vectors can be accessed through our website http://GResource.hzau.edu.cn. The two new vectors and their respective high-copy composite vectors can largely facilitate the construction and characterization of BAC and BIBAC libraries. The transfer of complete large genomic DNA inserts from one vector to the other is made straightforward.
Incomplete Lineage Sorting and Hybridization Statistics for Large-Scale Retroposon Insertion Data
Kuritzin, Andrej; Kischka, Tabea
2016-01-01
Ancient retroposon insertions can be used as virtually homoplasy-free markers to reconstruct the phylogenetic history of species. Inherited, orthologous insertions in related species offer reliable signals of a common origin of the given species. One prerequisite for such a phylogenetically informative insertion is that the inserted element was fixed in the ancestral population before speciation; if not, polymorphically inserted elements may lead to random distributions of presence/absence states during speciation and possibly to apparently conflicting reconstructions of their ancestry. Fortunately, such misleading fixed cases are relatively rare but nevertheless, need to be considered. Here, we present novel, comprehensive statistical models applicable for (1) analyzing any pattern of rare genomic changes, (2) testing and differentiating conflicting phylogenetic reconstructions based on rare genomic changes caused by incomplete lineage sorting or/and ancestral hybridization, and (3) differentiating between search strategies involving genome information from one or several lineages. When the new statistics are applied, in non-conflicting cases a minimum of three elements present in both of two species and absent in a third group are considered significant support (p<0.05) for the branching of the third from the other two, if all three of the given species are screened equally for genome or experimental data. Five elements are necessary for significant support (p<0.05) if a diagnostic locus derived from only one of three species is screened, and no conflicting markers are detected. Most potentially conflicting patterns can be evaluated for their significance and ancestral hybridization can be distinguished from incomplete lineage sorting by considering symmetric or asymmetric distribution of rare genomic changes among possible tree configurations. 
Additionally, we provide an R-application to make the new KKSC insertion significance test available for the scientific community at http://retrogenomics.uni-muenster.de:3838/KKSC_significance_test/. PMID:26967525
The complete chloroplast genome of Capsicum annuum var. glabriusculum using Illumina sequencing.
Raveendar, Sebastin; Na, Young-Wang; Lee, Jung-Ro; Shim, Donghwan; Ma, Kyung-Ho; Lee, Sok-Young; Chung, Jong-Wook
2015-07-20
Chloroplast (cp) genome sequences provide a valuable source for DNA barcoding. Molecular phylogenetic studies have concentrated on DNA sequencing of conserved gene loci. However, this approach is time consuming and more difficult to implement when gene organization differs among species. Here we report the complete re-sequencing of the cp genome of Capsicum pepper (Capsicum annuum var. glabriusculum) using the Illumina platform. The total length of the cp genome is 156,817 bp with a 37.7% overall GC content. A pair of inverted repeats (IRs) of 50,284 bp were separated by a small single copy (SSC; 18,948 bp) and a large single copy (LSC; 87,446 bp). The number of cp genes in C. annuum var. glabriusculum is the same as that in other Capsicum species. Variations in the lengths of LSC; SSC and IR regions were the main contributors to the size variation in the cp genome of this species. A total of 125 simple sequence repeat (SSR) and 48 insertions or deletions variants were found by sequence alignment of Capsicum cp genome. These findings provide a foundation for further investigation of cp genome evolution in Capsicum and other higher plants.
Kuhn, Alexandre; Ong, Yao Min; Cheng, Ching-Yu; Wong, Tien Yin; Quake, Stephen R; Burkholder, William F
2014-06-03
Insertions of the human-specific subfamily of LINE-1 (L1) retrotransposon are highly polymorphic across individuals and can critically influence the human transcriptome. We hypothesized that L1 insertions could represent genetic variants determining important human phenotypic traits, and performed an integrated analysis of L1 elements and single nucleotide polymorphisms (SNPs) in several human populations. We found that a large fraction of L1s were in high linkage disequilibrium with their surrounding genomic regions and that they were well tagged by SNPs. However, L1 variants were only partially captured by SNPs on standard SNP arrays, so that their potential phenotypic impact would be frequently missed by SNP array-based genome-wide association studies. We next identified potential phenotypic effects of L1s by looking for signatures of natural selection linked to L1 insertions; significant extended haplotype homozygosity was detected around several L1 insertions. This finding suggests that some of these L1 insertions may have been the target of recent positive selection.
Ogier, Jean-Claude; Pagès, Sylvie; Bisch, Gaëlle; Chiapello, Hélène; Médigue, Claudine; Rouy, Zoé; Teyssier, Corinne; Vincent, Stéphanie; Tailliez, Patrick; Givaudan, Alain; Gaudriault, Sophie
2014-01-01
Bacteria of the genus Xenorhabdus are symbionts of soil entomopathogenic nematodes of the genus Steinernema. This symbiotic association constitutes an insecticidal complex active against a wide range of insect pests. Unlike other Xenorhabdus species, Xenorhabdus poinarii is avirulent when injected into insects in the absence of its nematode host. We sequenced the genome of the X. poinarii strain G6 and the closely related but virulent X. doucetiae strain FRM16. G6 had a smaller genome (500–700 kb smaller) than virulent Xenorhabdus strains and lacked genes encoding potential virulence factors (hemolysins, type 5 secretion systems, enzymes involved in the synthesis of secondary metabolites, and toxin–antitoxin systems). The genomes of all the X. poinarii strains analyzed here had a similar small size. We did not observe the accumulation of pseudogenes, insertion sequences or decrease in coding density usually seen as a sign of genomic erosion driven by genetic drift in host-adapted bacteria. Instead, genome reduction of X. poinarii seems to have been mediated by the excision of genomic blocks from the flexible genome, as reported for the genomes of attenuated free pathogenic bacteria and some facultative mutualistic bacteria growing exclusively within hosts. This evolutionary pathway probably reflects the adaptation of X. poinarii to a specific host. PMID:24904010
Biological Characterization of CVRM2-BAC, A Recombinant CV1988 Virus Containing an REV LTR Insertion
USDA-ARS's Scientific Manuscript database
It has been previously reported that avian retroviruses, i.e. avian leukosis virus (ALV) and reticuloendotheliosis virus (REV), integrate in the Marek’s disease virus genome affecting MDV pathogenicity. RM-2 is an attenuated serotype 1 MDV virus generated by insertion of the REV LTR in the genome of...
Fast ancestral gene order reconstruction of genomes with unequal gene content.
Feijão, Pedro; Araujo, Eloi
2016-11-11
During evolution, genomes are modified by large scale structural events, such as rearrangements, deletions or insertions of large blocks of DNA. Of particular interest, in order to better understand how this type of genomic evolution happens, is the reconstruction of ancestral genomes, given a phylogenetic tree with extant genomes at its leaves. One way of solving this problem is to assume a rearrangement model, such as Double Cut and Join (DCJ), and find a set of ancestral genomes that minimizes the number of events on the input tree. Since this problem is NP-hard for most rearrangement models, exact solutions are practical only for small instances, and heuristics have to be used for larger datasets. This type of approach can be called event-based. Another common approach is based on finding conserved structures between the input genomes, such as adjacencies between genes, possibly also assigning weights that indicate a measure of confidence or probability that this particular structure is present on each ancestral genome, and then finding a set of non conflicting adjacencies that optimize some given function, usually trying to maximize total weight and minimizing character changes in the tree. We call this type of methods homology-based. In previous work, we proposed an ancestral reconstruction method that combines homology- and event-based ideas, using the concept of intermediate genomes, that arise in DCJ rearrangement scenarios. This method showed better rate of correctly reconstructed adjacencies than other methods, while also being faster, since the use of intermediate genomes greatly reduces the search space. Here, we generalize the intermediate genome concept to genomes with unequal gene content, extending our method to account for gene insertions and deletions of any length. 
In many of the simulated datasets, our proposed method had better results than MLGO and MGRA, two state-of-the-art algorithms for ancestral reconstruction with unequal gene content, while running much faster, making it more scalable to larger datasets. Studying ancestral reconstruction problems in a new light, using the concept of intermediate genomes, allows the design of very fast algorithms by greatly reducing the solution search space, while also giving very good results. The algorithms introduced in this paper were implemented in open-source software called RINGO (ancestral Reconstruction with INtermediate GenOmes), available at https://github.com/pedrofeijao/RINGO.
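The homology-based idea mentioned in the abstract above, comparing gene adjacencies between genomes, can be illustrated with a minimal Python sketch. This is not the RINGO implementation: the function names, the signed-integer gene encoding, and the toy genomes are all assumptions made for illustration, and only linear chromosomes are handled.

```python
from typing import FrozenSet, List, Set, Tuple

Extremity = Tuple[int, str]           # (gene id, "h" for head or "t" for tail)
Adjacency = FrozenSet[Extremity]      # unordered pair of neighbouring extremities


def extremities(gene: int) -> Tuple[Extremity, Extremity]:
    """Return (left end, right end) of a signed gene as read along the chromosome."""
    g = abs(gene)
    # A forward gene is read tail -> head; a reversed gene is read head -> tail.
    return ((g, "t"), (g, "h")) if gene > 0 else ((g, "h"), (g, "t"))


def adjacencies(chromosome: List[int]) -> Set[Adjacency]:
    """Collect adjacencies between consecutive genes of one linear chromosome."""
    result: Set[Adjacency] = set()
    for left, right in zip(chromosome, chromosome[1:]):
        result.add(frozenset({extremities(left)[1], extremities(right)[0]}))
    return result


# Hypothetical toy genomes (signed gene orders), not data from the paper.
genome_a = [1, 2, 3, 4]
genome_b = [1, -3, -2, 4]   # segment 2..3 inverted relative to genome_a
genome_c = [1, 2, 4]        # gene 3 absent: unequal gene content

conserved_ab = adjacencies(genome_a) & adjacencies(genome_b)
conserved_ac = adjacencies(genome_a) & adjacencies(genome_c)
print(sorted(sorted(adj) for adj in conserved_ab))
print(sorted(sorted(adj) for adj in conserved_ac))
```

In this toy case the inversion preserves only the adjacency inside the inverted segment, while the deletion preserves only the adjacency upstream of the missing gene. Weighting such adjacencies and choosing a non-conflicting subset, as the abstract describes, is beyond this sketch.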
Evolution of histone 2A for chromatin compaction in eukaryotes
Macadangdang, Benjamin R; Oberai, Amit; Spektor, Tanya; Campos, Oscar A; Sheng, Fang; Carey, Michael F; Vogelauer, Maria; Kurdistani, Siavash K
2014-01-01
During eukaryotic evolution, genome size has increased disproportionately to nuclear volume, necessitating greater degrees of chromatin compaction in higher eukaryotes, which have evolved several mechanisms for genome compaction. However, it is unknown whether histones themselves have evolved to regulate chromatin compaction. Analysis of histone sequences from 160 eukaryotes revealed that the H2A N-terminus has systematically acquired arginines as genomes expanded. Insertion of arginines into their evolutionarily conserved position in H2A of a small-genome organism increased linear compaction by as much as 40%, while their absence markedly diminished compaction in cells with large genomes. This effect was recapitulated in vitro with nucleosomal arrays using unmodified histones, indicating that the H2A N-terminus directly modulates the chromatin fiber likely through intra- and inter-nucleosomal arginine–DNA contacts to enable tighter nucleosomal packing. Our findings reveal a novel evolutionary mechanism for regulation of chromatin compaction and may explain the frequent mutations of the H2A N-terminus in cancer. DOI: http://dx.doi.org/10.7554/eLife.02792.001 PMID:24939988
Scala, Valeria; Grottoli, Alessandro; Aiese Cigliano, Riccardo; Anzar, Irantzu; Beccaccioli, Marzia; Fanelli, Corrado; Dall'Asta, Chiara; Battilani, Paola; Reverberi, Massimo; Sanseverino, Walter
2017-05-31
Fusarium verticillioides causes ear rot disease in maize and its contamination with fumonisins, mycotoxins harmful for humans and livestock. Lipids, and their oxidized forms, may drive the fate of this disease. In a previous study, we have explored the role of oxylipins in this interaction by deleting by standard transformation procedures a linoleate diol synthase-coding gene, lds1, in F. verticillioides. A profound phenotypic diversity in the mutants generated has prompted us to investigate more deeply the whole genome of two lds1-deleted strains. Bioinformatics analyses pinpoint significant differences in the genome sequences emerged between the wild type and the lds1-mutants further than those trivially attributable to the deletion of the lds1 locus, such as single nucleotide polymorphisms, small deletion/insertion polymorphisms and structural variations. Results suggest that the effect of a (theoretically) punctual transformation event might have enhanced the natural mechanisms of genomic variability and that transformation practices, commonly used in the reverse genetics of fungi, may potentially be responsible for unexpected, stochastic and henceforth off-target rearrangements throughout the genome.
Scala, Valeria; Grottoli, Alessandro; Aiese Cigliano, Riccardo; Anzar, Irantzu; Beccaccioli, Marzia; Fanelli, Corrado; Dall’Asta, Chiara; Battilani, Paola; Reverberi, Massimo; Sanseverino, Walter
2017-01-01
Fusarium verticillioides causes ear rot disease in maize and its contamination with fumonisins, mycotoxins harmful for humans and livestock. Lipids, and their oxidized forms, may drive the fate of this disease. In a previous study, we have explored the role of oxylipins in this interaction by deleting by standard transformation procedures a linoleate diol synthase-coding gene, lds1, in F. verticillioides. A profound phenotypic diversity in the mutants generated has prompted us to investigate more deeply the whole genome of two lds1-deleted strains. Bioinformatics analyses pinpoint significant differences in the genome sequences emerged between the wild type and the lds1-mutants further than those trivially attributable to the deletion of the lds1 locus, such as single nucleotide polymorphisms, small deletion/insertion polymorphisms and structural variations. Results suggest that the effect of a (theoretically) punctual transformation event might have enhanced the natural mechanisms of genomic variability and that transformation practices, commonly used in the reverse genetics of fungi, may potentially be responsible for unexpected, stochastic and henceforth off-target rearrangements throughout the genome. PMID:28561789
Guérillot, Romain; Siguier, Patricia; Gourbeyre, Edith; Chandler, Michael; Glaser, Philippe
2014-01-01
Transposable elements (TEs) are major components of both prokaryotic and eukaryotic genomes and play a significant role in their evolution. In this study, we have identified new prokaryotic DDE transposase families related to the eukaryotic Mutator-like transposases. These genes were retrieved by cascade PSI-Blast using as initial query the transposase of the streptococcal integrative and conjugative element (ICE) TnGBS2. By combining secondary structure predictions and protein sequence alignments, we predicted the DDE catalytic triad and the DNA-binding domain recognizing the terminal inverted repeats. Furthermore, we systematically characterized the organization and the insertion specificity of the TEs relying on these prokaryotic Mutator-like transposases (p-MULT) for their mobility. Strikingly, two distant TE families target their integration upstream σA dependent promoters. This allowed us to identify a transposase sequence signature associated with this unique insertion specificity and to show that the dissymmetry between the two inverted repeats is responsible for the orientation of the insertion. Surprisingly, while DDE transposases are generally associated with small and simple transposons such as insertion sequences (ISs), p-MULT encoding TEs show an unprecedented diversity with several families of IS, transposons, and ICEs ranging in size from 1.1 to 52 kb. PMID:24418649
Weiser, Keith C.; Liu, Bin; Hansen, Gwenn M.; Skapura, Darlene; Hentges, Kathryn E.; Yarlagadda, Sujatha; Morse III, Herbert C.
2007-01-01
AKXD recombinant inbred (RI) strains develop a variety of leukemias and lymphomas due to somatically acquired insertions of retroviral DNA into the genome of hematopoietic cells that can mutate cellular proto-oncogenes and tumor suppressor genes. We generated a new set of tumors from nine AKXD RI strains selected for their propensity to develop B-cell tumors, the most common type of human hematopoietic cancers. We employed a PCR technique called viral insertion site amplification (VISA) to rapidly isolate genomic sequence at the site of provirus insertion. Here we describe 550 VISA sequence tags (VSTs) that identify 74 common insertion sites (CISs), of which 21 have not been identified previously. Several suspected proto-oncogenes and tumor suppressor genes lie near CISs, providing supportive evidence for their roles in cancer. Furthermore, numerous previously uncharacterized genes lie near CISs, providing a pool of candidate disease genes for future research. Pathway analysis of candidate genes identified several signaling pathways as common and powerful routes to blood cancer, including Notch, E-protein, NFκB, and Ras signaling. Misregulation of several Notch signaling genes was confirmed by quantitative RT-PCR. Our data suggest that analyses of insertional mutagenesis on a single genetic background are biased toward the identification of cooperating mutations. This tumor collection represents the most comprehensive study of the genetics of B-cell leukemia and lymphoma development in mice. We have deposited the VST sequences, CISs in a genome viewer, histopathology, and molecular tumor typing data in a public web database called VISION (Viral Insertion Sites Identifying Oncogenes), which is located at http://www.mouse-genome.bcm.tmc.edu/vision. PMID:17926094
Weiser, Keith C; Liu, Bin; Hansen, Gwenn M; Skapura, Darlene; Hentges, Kathryn E; Yarlagadda, Sujatha; Morse III, Herbert C; Justice, Monica J
2007-10-01
AKXD recombinant inbred (RI) strains develop a variety of leukemias and lymphomas due to somatically acquired insertions of retroviral DNA into the genome of hematopoietic cells that can mutate cellular proto-oncogenes and tumor suppressor genes. We generated a new set of tumors from nine AKXD RI strains selected for their propensity to develop B-cell tumors, the most common type of human hematopoietic cancers. We employed a PCR technique called viral insertion site amplification (VISA) to rapidly isolate genomic sequence at the site of provirus insertion. Here we describe 550 VISA sequence tags (VSTs) that identify 74 common insertion sites (CISs), of which 21 have not been identified previously. Several suspected proto-oncogenes and tumor suppressor genes lie near CISs, providing supportive evidence for their roles in cancer. Furthermore, numerous previously uncharacterized genes lie near CISs, providing a pool of candidate disease genes for future research. Pathway analysis of candidate genes identified several signaling pathways as common and powerful routes to blood cancer, including Notch, E-protein, NFkappaB, and Ras signaling. Misregulation of several Notch signaling genes was confirmed by quantitative RT-PCR. Our data suggest that analyses of insertional mutagenesis on a single genetic background are biased toward the identification of cooperating mutations. This tumor collection represents the most comprehensive study of the genetics of B-cell leukemia and lymphoma development in mice. We have deposited the VST sequences, CISs in a genome viewer, histopathology, and molecular tumor typing data in a public web database called VISION (Viral Insertion Sites Identifying Oncogenes), which is located at http://www.mouse-genome.bcm.tmc.edu/vision.
Development of a CRISPR/Cas9 genome editing toolbox for Corynebacterium glutamicum.
Liu, Jiao; Wang, Yu; Lu, Yujiao; Zheng, Ping; Sun, Jibin; Ma, Yanhe
2017-11-16
Corynebacterium glutamicum is an important industrial workhorse and advanced genetic engineering tools are urgently demanded. Recently, the clustered regularly interspaced short palindromic repeats (CRISPR) and their CRISPR-associated proteins (Cas) have revolutionized the field of genome engineering. The CRISPR/Cas9 system that utilizes NGG as protospacer adjacent motif (PAM) and has good targeting specificity can be developed into a powerful tool for efficient and precise genome editing of C. glutamicum. Herein, we developed a versatile CRISPR/Cas9 genome editing toolbox for C. glutamicum. Cas9 and gRNA expression cassettes were reconstituted to combat Cas9 toxicity and facilitate effective termination of gRNA transcription. Co-transformation of Cas9 and gRNA expression plasmids was exploited to overcome high-frequency mutation of cas9, allowing not only highly efficient gene deletion and insertion with plasmid-borne editing templates (efficiencies up to 60.0 and 62.5%, respectively) but also simple and time-saving operation. Furthermore, CRISPR/Cas9-mediated ssDNA recombineering was developed to precisely introduce small modifications and single-nucleotide changes into the genome of C. glutamicum with efficiencies over 80.0%. Notably, double-locus editing was also achieved in C. glutamicum. This toolbox works well in several C. glutamicum strains including the widely-used strains ATCC 13032 and ATCC 13869. In this study, we developed a CRISPR/Cas9 toolbox that could facilitate markerless gene deletion, gene insertion, precise base editing, and double-locus editing in C. glutamicum. The CRISPR/Cas9 toolbox holds promise for accelerating the engineering of C. glutamicum and advancing its application in the production of biochemicals and biofuels.
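The NGG protospacer adjacent motif (PAM) requirement mentioned in the abstract above lends itself to a short illustration. The following Python sketch scans both strands of a DNA string for candidate protospacers immediately followed by NGG. It is not the toolbox described in the paper: the function names are hypothetical, the 20-nt spacer length is an assumption that the abstract does not state, and no specificity or off-target scoring is attempted.

```python
import re
from typing import List, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_ngg_targets(seq: str, spacer_len: int = 20) -> List[Tuple[str, int, str]]:
    """Return (strand, start, protospacer) for each protospacer followed by NGG.

    `start` is the 0-based offset on the scanned strand, so for "-" hits it
    refers to the reverse-complemented sequence. spacer_len=20 is an assumption.
    """
    hits: List[Tuple[str, int, str]] = []
    # Zero-width lookahead so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % spacer_len)
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for match in pattern.finditer(s):
            hits.append((strand, match.start(), match.group(1)))
    return hits


# Hypothetical 23-nt example: a 20-nt protospacer followed by the PAM "AGG".
demo = "ACGTACGTACGTACGTACGT" + "AGG"
print(find_ngg_targets(demo))
```

Candidates found this way would still need to be checked for uniqueness in the target genome before use.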
A Forward Genetic Screening for Prostate Cancer Progression Genes
2012-10-01
sequence reads. For verifying the prevalence of insertions in tumors, PCR was performed on genomic DNA corresponding to 15 insertional mutations using...and has been utilized with great effect in many organisms, from the bacterium to the fruit fly Drosophila melanogaster [1,2]. The Sleeping Beauty (SB...TX SL JC TN. References 1. Cooley L, Kelley R, Spradling A (1988) Insertional mutagenesis of the Drosophila genome with single P elements. Science
Evolutionary genomics of miniature inverted-repeat transposable elements (MITEs) in Brassica.
Nouroz, Faisal; Noreen, Shumaila; Heslop-Harrison, J S
2015-12-01
Miniature inverted-repeat transposable elements (MITEs) are truncated derivatives of autonomous DNA transposons, and are dispersed abundantly in most eukaryotic genomes. We aimed to characterize various MITE families in Brassica in terms of their presence, sequence characteristics and evolutionary activity. Dot plot analyses involving comparison of homoeologous bacterial artificial chromosome (BAC) sequences allowed identification of 15 novel families of mobile MITEs. Of these, 5 were Stowaway-like with TA Target Site Duplications (TSDs), 4 Tourist-like with TAA/TTA TSDs, 5 Mutator-like with 9-10 bp TSDs and 1 novel MITE (BoXMITE1) flanked by 3 bp TSDs. Our data suggested that there are about 30,000 MITE-related sequences in Brassica rapa and B. oleracea genomes. In situ hybridization showed one abundant family was dispersed in the A-genome, while another was located near 45S rDNA sites. PCR analysis using primers flanking sequences of MITE elements detected MITE insertion polymorphisms between and within the three Brassica (AA, BB, CC) genomes, with many insertions being specific to single genomes and others showing evidence of more recent evolutionary insertions. Our BAC sequence comparison strategy enables identification of evolutionarily active MITEs with no prior knowledge of MITE sequences. The details of MITE families reported in Brassica enable their identification, characterization and annotation. Insertion polymorphisms of MITEs and their transposition activity indicate an important mechanism of genome evolution and diversification. MITE families derived from known Mariner, Harbinger and Mutator DNA transposons were discovered, as well as some novel structures. The identification of Brassica MITEs will have broad applications in Brassica genomics, breeding, hybridization and phylogeny through their use as DNA markers.
Dictyostelium mobile elements: strategies to amplify in a compact genome.
Winckler, T; Dingermann, T; Glöckner, G
2002-12-01
Dictyostelium discoideum is a eukaryotic microorganism that is attractive for the study of fundamental biological phenomena such as cell-cell communication, formation of multicellularity, cell differentiation and morphogenesis. Large-scale sequencing of the D. discoideum genome has provided new insights into evolutionary strategies evolved by transposable elements (TEs) to settle in compact microbial genomes and to maintain active populations over evolutionary time. The high gene density (about 1 gene/2.6 kb) of the D. discoideum genome leaves limited space for selfish molecular invaders to move and amplify without causing deleterious mutations that eradicate their host. Targeting of transfer RNA (tRNA) gene loci appears to be a generally successful strategy for TEs residing in compact genomes to insert away from coding regions. In D. discoideum, tRNA gene-targeted retrotransposition has evolved independently at least three times by both non-long terminal repeat (LTR) retrotransposons and retrovirus-like LTR retrotransposons. Unlike the nonspecifically inserting D. discoideum TEs, which have a strong tendency to insert into preexisting TE copies and form large and complex clusters near the ends of chromosomes, the tRNA gene-targeted retrotransposons have managed to occupy 75% of the tRNA gene loci spread on chromosome 2 and represent 80% of the TEs recognized on the assembled central 6.5-Mb part of chromosome 2. In this review we update the available information about D. discoideum TEs which emerges both from previous work and current large-scale genome sequencing, with special emphasis on the fact that tRNA genes are principal determinants of retrotransposon insertions into the D. discoideum genome.
Tnt1 Retrotransposon Mutagenesis: A Tool for Soybean Functional Genomics
Cui, Yaya; Barampuram, Shyam; Stacey, Minviluz G.; Hancock, C. Nathan; Findley, Seth; Mathieu, Melanie; Zhang, Zhanyuan; Parrott, Wayne A.; Stacey, Gary
2013-01-01
Insertional mutagenesis is a powerful tool for determining gene function in both model and crop plant species. Tnt1, the transposable element of tobacco (Nicotiana tabacum) cell type 1, is a retrotransposon that replicates via an RNA copy that is reverse transcribed and integrated elsewhere in the plant genome. Based on studies in a variety of plants, Tnt1 appears to be inactive in normal plant tissue but can be reactivated by tissue culture. Our goal was to evaluate the utility of the Tnt1 retrotransposon as a mutagenesis strategy in soybean (Glycine max). Experiments showed that the Tnt1 element was stably transformed into soybean plants by Agrobacterium tumefaciens-mediated transformation. Twenty-seven independent transgenic lines carrying Tnt1 insertions were generated. Southern-blot analysis revealed that the copy number of transposed Tnt1 elements ranged from four to 19 insertions, with an average of approximately eight copies per line. These insertions showed Mendelian segregation and did not transpose under normal growth conditions. Analysis of 99 Tnt1 flanking sequences revealed insertions into 62 (62%) annotated genes, indicating that the element preferentially inserts into protein-coding regions. Tnt1 insertions were found in all 20 soybean chromosomes, indicating that Tnt1 transposed throughout the soybean genome. Furthermore, fluorescence in situ hybridization experiments validated that Tnt1 inserted into multiple chromosomes. Passage of transgenic lines through two different tissue culture treatments resulted in Tnt1 transposition, significantly increasing the number of insertions per line. Thus, our data demonstrate the Tnt1 retrotransposon to be a powerful system that can be used for effective large-scale insertional mutagenesis in soybean. PMID:23124322
USDA-ARS's Scientific Manuscript database
Marek’s disease (MD) is a lymphoproliferative disease of chicken caused by serotype 1 MD virus (MDV). Vaccination of commercial poultry has drastically reduced losses from MD and the poultry industry cannot be sustained without the use of vaccines. Retrovirus insertion into herpesviruses genome is a...
Fu, Liezhen; Wen, Luan; Luu, Nga; Shi, Yun-Bo
2016-01-01
Genome editing with designer nucleases such as TALEN and CRISPR/Cas enzymes has broad applications. Delivery of these designer nucleases into organisms induces various genetic mutations including deletions, insertions and nucleotide substitutions. Characterizing those mutations is critical for evaluating the efficacy and specificity of targeted genome editing. While a number of methods have been developed to identify the mutations, none other than sequencing allows the identification of the most desired mutations, i.e., out-of-frame insertions/deletions that disrupt genes. Here we report a simple and efficient method to visualize and quantify the efficiency of genomic mutations induced by genome-editing. Our approach is based on the expression of a two-color fusion protein in a vector that allows the insertion of the edited region in the genome in between the two color moieties. We show that our approach not only easily identifies developing animals with desired mutations but also efficiently quantifies the mutation rate in vivo. Furthermore, by using LacZα and GFP as the color moieties, our approach can even eliminate the need for a fluorescent microscope, allowing the analysis with simple bright field visualization. Such an approach will greatly simplify the screen for effective genome-editing enzymes and identify the desired mutant cells/animals. PMID:27748423
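The abstract above singles out out-of-frame insertions/deletions as the most desired mutations. The underlying arithmetic is that an indel shifts the reading frame when its net length change is not a multiple of three. The Python sketch below classifies alleles on that basis. It is not the two-color reporter method itself: the function names and the example lengths are hypothetical, and the check ignores substitutions and in-frame changes that introduce stop codons.

```python
from typing import Iterable


def indel_effect(ref_len: int, edited_len: int) -> str:
    """Classify an edited allele by its net length change relative to the reference."""
    net = edited_len - ref_len
    if net == 0:
        return "no net length change"
    # Python's % returns a non-negative result for a positive modulus,
    # so deletions (negative net) are handled by the same test.
    return "in-frame" if net % 3 == 0 else "out-of-frame"


def out_of_frame_fraction(ref_len: int, edited_lens: Iterable[int]) -> float:
    """Fraction of alleles whose net length change shifts the reading frame."""
    lens = list(edited_lens)
    if not lens:
        return 0.0
    shifted = sum(indel_effect(ref_len, n) == "out-of-frame" for n in lens)
    return shifted / len(lens)


# Hypothetical allele lengths for a 60-bp target region (illustration only).
alleles = [60, 59, 57, 64, 53]
for n in alleles:
    print(n, indel_effect(60, n))
print(out_of_frame_fraction(60, alleles))
```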
SINE Retrotransposition: Evaluation of Alu Activity and Recovery of De Novo Inserts.
Ade, Catherine; Roy-Engel, Astrid M
2016-01-01
Mobile element activity is of great interest due to its impact on genomes. However, the types of mobile elements that inhabit any given genome are remarkably varied. Among the different varieties of mobile elements, the Short Interspersed Elements (SINEs) populate many genomes, including many mammalian species. Although SINEs are parasites of Long Interspersed Elements (LINEs), SINEs have been highly successful in both the primate and rodent genomes. When comparing copy numbers in mammals, SINEs have been vastly more successful than other nonautonomous elements, such as the retropseudogenes and SVA. Interestingly, in the human genome the copy number of Alu (a primate SINE) outnumbers LINE-1 (L1) copies 2 to 1. Estimates suggest that the retrotransposition rate for Alu is tenfold higher than LINE-1 with about 1 insert in every twenty births. Furthermore, Alu-induced mutagenesis is responsible for the majority of the documented instances of human retroelement insertion-induced disease. However, little is known on what contributes to these observed differences between SINEs and LINEs. The development of an assay to monitor SINE retrotransposition in culture has become an important tool for the elucidation of some of these differences. In this chapter, we present details of the SINE retrotransposition assay and the recovery of de novo inserts. We also focus on the nuances that are unique to the SINE assay.
Genome-wide analysis of Tol2 transposon reintegration in zebrafish.
Kondrychyn, Igor; Garcia-Lecea, Marta; Emelyanov, Alexander; Parinov, Sergey; Korzh, Vladimir
2009-09-08
Tol2, a member of the hAT family of transposons, has become a useful tool for genetic manipulation of model animals, but information about its interactions with vertebrate genomes is still limited. Furthermore, published reports on Tol2 have mainly been based on random integration of the transposon system after co-injection of a plasmid DNA harboring the transposon and a transposase mRNA. It is important to understand how Tol2 would behave upon activation after integration into the genome. We performed a large-scale enhancer trap (ET) screen and generated 338 insertions of the Tol2 transposon-based ET cassette into the zebrafish genome. These insertions were generated by remobilizing the transposon from two different donor sites in two transgenic lines. We found that 39% of Tol2 insertions occurred in transcription units, mostly into introns. Analysis of the transposon target sites revealed no strict specificity at the DNA sequence level. However, Tol2 was prone to target AT-rich regions with weak palindromic consensus sequences centered at the insertion site. Our systematic analysis of sequential remobilizations of the Tol2 transposon from two independent sites within a vertebrate genome has revealed properties such as a tendency to integrate into transcription units and into AT-rich palindrome-like sequences. This information will influence the development of various applications involving DNA transposons and Tol2 in particular.
Xiang, Xiaoyu; Huang, Xiaoxing; Wang, Haina; Huang, Li
2015-01-01
Plasmids occur frequently in Archaea. A novel plasmid (denoted pTC1) containing typical conjugation functions has been isolated from Sulfolobus tengchongensis RT8-4, a strain obtained from a hot spring in Tengchong, China, and characterized. The plasmid is a circular double-stranded DNA molecule of 20,417 bp. Among a total of 26 predicted pTC1 ORFs, 23 have homologues in other known Sulfolobus conjugative plasmids (CPs). pTC1 resembles other Sulfolobus CPs in genome architecture, and is most highly conserved in the genomic region encoding conjugation functions. However, attempts to demonstrate experimentally the capacity of the plasmid for conjugational transfer were unsuccessful. A survey revealed that pTC1 and its closely related plasmid variants were widespread in the geothermal area of Tengchong. Variations of the plasmids at the target sites for transposition by an insertion sequence (IS) and a miniature inverted-repeat transposable element (MITE) were readily detected. The IS was efficiently inserted into the pTC1 genome, and the inserted sequence was inactivated and degraded more frequently in an imprecise manner than in a precise manner. These results suggest that the host organism has evolved a strategy to maintain a balance between the insertion and elimination of mobile genetic elements to permit genomic plasticity while inhibiting their fast spreading. PMID:25686154
Xiang, Xiaoyu; Huang, Xiaoxing; Wang, Haina; Huang, Li
2015-02-12
Plasmids occur frequently in Archaea. A novel plasmid (denoted pTC1) containing typical conjugation functions has been isolated from Sulfolobus tengchongensis RT8-4, a strain obtained from a hot spring in Tengchong, China, and characterized. The plasmid is a circular double-stranded DNA molecule of 20,417 bp. Among a total of 26 predicted pTC1 ORFs, 23 have homologues in other known Sulfolobus conjugative plasmids (CPs). pTC1 resembles other Sulfolobus CPs in genome architecture, and is most highly conserved in the genomic region encoding conjugation functions. However, attempts to demonstrate experimentally the capacity of the plasmid for conjugational transfer were unsuccessful. A survey revealed that pTC1 and its closely related plasmid variants were widespread in the geothermal area of Tengchong. Variations of the plasmids at the target sites for transposition by an insertion sequence (IS) and a miniature inverted-repeat transposable element (MITE) were readily detected. The IS was efficiently inserted into the pTC1 genome, and the inserted sequence was inactivated and degraded more frequently in an imprecise manner than in a precise manner. These results suggest that the host organism has evolved a strategy to maintain a balance between the insertion and elimination of mobile genetic elements to permit genomic plasticity while inhibiting their fast spreading.
Complete Chloroplast Genome of the Wollemi Pine (Wollemia nobilis): Structure and Evolution.
Yap, Jia-Yee S; Rohner, Thore; Greenfield, Abigail; Van Der Merwe, Marlien; McPherson, Hannah; Glenn, Wendy; Kornfeld, Geoff; Marendy, Elessa; Pan, Annie Y H; Wilton, Alan; Wilkins, Marc R; Rossetto, Maurizio; Delaney, Sven K
2015-01-01
The Wollemi pine (Wollemia nobilis) is a rare Southern conifer with striking morphological similarity to fossil pines. A small population of W. nobilis was discovered in 1994 in a remote canyon system in the Wollemi National Park (near Sydney, Australia). This population contains fewer than 100 individuals and is critically endangered. Previous genetic studies of the Wollemi pine have investigated its evolutionary relationship with other pines in the family Araucariaceae, and have suggested that the Wollemi pine genome contains little or no variation. However, these studies were performed prior to the widespread use of genome sequencing, and their conclusions were based on a limited fraction of the Wollemi pine genome. In this study, we address this problem by determining the entire sequence of the W. nobilis chloroplast genome. A detailed analysis of the structure of the genome is presented, and the evolution of the genome is inferred by comparison with the chloroplast sequences of other members of the Araucariaceae and the related family Podocarpaceae. Pairwise alignments of whole genome sequences, and the presence of unique pseudogenes, gene duplications and insertions in W. nobilis and Araucariaceae, indicate that the W. nobilis chloroplast genome is most similar to that of its sister taxon Agathis. However, the W. nobilis genome contains an unusually high number of repetitive sequences, and these could be used in future studies to investigate and conserve any remnant genetic diversity in the Wollemi pine.
Global mapping of transposon location.
Gabriel, Abram; Dapprich, Johannes; Kunkel, Mark; Gresham, David; Pratt, Stephen C; Dunham, Maitreya J
2006-12-15
Transposable genetic elements are ubiquitous, yet their presence or absence at any given position within a genome can vary between individual cells, tissues, or strains. Transposable elements have profound impacts on host genomes by altering gene expression, assisting in genomic rearrangements, causing insertional mutations, and serving as sources of phenotypic variation. Characterizing a genome's full complement of transposons requires whole genome sequencing, precluding simple studies of the impact of transposition on interindividual variation. Here, we describe a global mapping approach for identifying transposon locations in any genome, using a combination of transposon-specific DNA extraction and microarray-based comparative hybridization analysis. We use this approach to map the repertoire of endogenous transposons in different laboratory strains of Saccharomyces cerevisiae and demonstrate that transposons are a source of extensive genomic variation. We also apply this method to mapping bacterial transposon insertion sites in a yeast genomic library. This unique whole genome view of transposon location will facilitate our exploration of transposon dynamics, as well as defining bases for individual differences and adaptive potential.
CRISPR-based screening of genomic island excision events in bacteria.
Selle, Kurt; Klaenhammer, Todd R; Barrangou, Rodolphe
2015-06-30
Genomic analysis of Streptococcus thermophilus revealed that mobile genetic elements (MGEs) likely contributed to gene acquisition and loss during evolutionary adaptation to milk. Clustered regularly interspaced short palindromic repeats-CRISPR-associated genes (CRISPR-Cas), the adaptive immune system in bacteria, limits genetic diversity by targeting MGEs including bacteriophages, transposons, and plasmids. CRISPR-Cas systems are widespread in streptococci, suggesting that the interplay between CRISPR-Cas systems and MGEs is one of the driving forces governing genome homeostasis in this genus. To investigate the genetic outcomes resulting from CRISPR-Cas targeting of integrated MGEs, in silico prediction revealed four genomic islands without essential genes in lengths from 8 to 102 kbp, totaling 7% of the genome. In this study, the endogenous CRISPR3 type II system was programmed to target the four islands independently through plasmid-based expression of engineered CRISPR arrays. Targeting lacZ within the largest 102-kbp genomic island was lethal to wild-type cells and resulted in a reduction of up to 2.5-log in the surviving population. Genotyping of Lac(-) survivors revealed variable deletion events between the flanking insertion-sequence elements, all resulting in elimination of the Lac-encoding island. Chimeric insertion sequence footprints were observed at the deletion junctions after targeting all of the four genomic islands, suggesting a common mechanism of deletion via recombination between flanking insertion sequences. These results established that self-targeting CRISPR-Cas systems may direct significant evolution of bacterial genomes on a population level, influencing genome homeostasis and remodeling.
A High-Throughput Arabidopsis Reverse Genetics System
Sessions, Allen; Burke, Ellen; Presting, Gernot; Aux, George; McElver, John; Patton, David; Dietrich, Bob; Ho, Patrick; Bacwaden, Johana; Ko, Cynthia; Clarke, Joseph D.; Cotton, David; Bullis, David; Snell, Jennifer; Miguel, Trini; Hutchison, Don; Kimmerly, Bill; Mitzel, Theresa; Katagiri, Fumiaki; Glazebrook, Jane; Law, Marc; Goff, Stephen A.
2002-01-01
A collection of Arabidopsis lines with T-DNA insertions in known sites was generated to increase the efficiency of functional genomics. A high-throughput modified thermal asymmetric interlaced (TAIL)-PCR protocol was developed and used to amplify DNA fragments flanking the T-DNA left borders from ∼100,000 transformed lines. A total of 85,108 TAIL-PCR products from 52,964 T-DNA lines were sequenced and compared with the Arabidopsis genome to determine the positions of T-DNAs in each line. Predicted T-DNA insertion sites, when mapped, showed a bias against predicted coding sequences. Predicted insertion mutations in genes of interest can be identified using Arabidopsis Gene Index name searches or by BLAST (Basic Local Alignment Search Tool) search. Insertions can be confirmed by simple PCR assays on individual lines. Predicted insertions were confirmed in 257 of 340 lines tested (76%). This resource has been named SAIL (Syngenta Arabidopsis Insertion Library) and is available to the scientific community at www.tmri.org. PMID:12468722
Characterization of a new high copy Stowaway family MITE, BRAMI-1 in Brassica genome
2013-01-01
Background: Miniature inverted-repeat transposable elements (MITEs) are expected to play important roles in the evolution of genes and genomes in plants, especially in highly duplicated plant genomes. Various MITE families and their roles in plants have been characterized. However, there have been fewer studies of MITE families and their potential roles in the evolution of the recently triplicated Brassica genome. Results: We identified a new MITE family, BRAMI-1, belonging to the Stowaway super-family in the Brassica genome. In silico mapping revealed that 697 members are dispersed throughout the euchromatic regions of the B. rapa pseudo-chromosomes. Among them, 548 members (78.6%) are located in gene-rich regions, less than 3 kb from genes. In addition, we identified 516 and 15 members in the 470 Mb and 15 Mb genomic shotgun sequences currently available for B. oleracea and B. napus, respectively. The resulting estimated copy numbers for the entire genomes were 1440, 1464 and 2490 in B. rapa, B. oleracea and B. napus, respectively. By contrast, only 70 members of the related Arabidopsis ATTIRTA-1 MITE family were identified in the Arabidopsis genome. Phylogenetic analysis revealed that BRAMI-1 elements proliferated in the Brassica genus after divergence from the Arabidopsis lineage. MITE insertion polymorphism (MIP) was inspected for 50 BRAMI-1 members, revealing high levels of insertion polymorphism between and within species of Brassica that clarify BRAMI-1 activation periods up to the present. Comparative analysis of the 71 genes harbouring the BRAMI-1 elements with their non-insertion paralogs (NIPs) showed that the BRAMI-1 insertions mainly reside in non-coding sequences and that the expression levels of genes with the elements differ from those of their NIPs. Conclusion: A Stowaway family MITE, named BRAMI-1, was gradually amplified and remains present in more than 1400 copies in each of three Brassica species.
Overall, 78% of the members were identified in gene-rich regions, and it is assumed that they may contribute to the evolution of duplicated genes in the highly duplicated Brassica genome. The resulting MIPs can serve as a good source of DNA markers for Brassica crops because the insertions are highly dispersed in the gene-rich euchromatin region and are polymorphic between or within species. PMID:23547712
Preparation and screening of an arrayed human genomic library generated with the P1 cloning system.
Shepherd, N S; Pfrogner, B D; Coulby, J N; Ackerman, S L; Vaidyanathan, G; Sauer, R H; Balkenhol, T C; Sternberg, N
1994-01-01
We describe here the construction and initial characterization of a 3-fold coverage genomic library of the human haploid genome that was prepared using the bacteriophage P1 cloning system. The cloned DNA inserts were produced by size fractionation of a Sau3AI partial digest of high molecular weight genomic DNA isolated from primary cells of human foreskin fibroblasts. The inserts were cloned into the pAd10sacBII vector and packaged in vitro into P1 phage. These were used to generate recombinant bacterial clones, each of which was picked robotically from an agar plate into a well of a 96-well microtiter dish, grown overnight, and stored at -70 °C. The resulting library, designated DMPC-HFF#1 series A, consists of approximately 130,000-140,000 recombinant clones that were stored in 1500 microtiter dishes. To screen the library, clones were combined in a pooling strategy and specific loci were identified by PCR analysis. On average, the library contains two or three different clones for each locus screened. To date we have identified a total of 17 clones containing the hypoxanthine-guanine phosphoribosyltransferase, human serum albumin-human alpha-fetoprotein, p53, cyclooxygenase I, human apurinic endonuclease, beta-polymerase, and DNA ligase I genes. The cloned inserts average 80 kb in size and range from 70 to 95 kb, with one 49-kb insert and one 62-kb insert. PMID:8146166
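The clone count and mean insert size reported above can be related to nominal genome coverage with a short calculation. The sketch below is illustrative only: the haploid genome size of 3.0e9 bp is an assumption not stated in the abstract, the function names are hypothetical, and the probability estimate uses the standard Clarke-Carbon approximation, which the authors do not say they used. The result is an upper bound, since it treats every clone as carrying a full-length insert.

```python
def library_coverage(n_clones, mean_insert_bp, genome_bp):
    """Nominal fold coverage: total cloned DNA divided by genome size."""
    return n_clones * mean_insert_bp / genome_bp


def prob_locus_present(n_clones, mean_insert_bp, genome_bp):
    """Clarke-Carbon estimate of the chance that a given locus is in at least one clone."""
    fraction = mean_insert_bp / genome_bp  # fraction of the genome held by one clone
    return 1.0 - (1.0 - fraction) ** n_clones


GENOME_BP = 3.0e9        # assumed haploid human genome size (not from the abstract)
MEAN_INSERT_BP = 80_000  # mean insert size reported in the abstract

for n_clones in (130_000, 140_000):  # clone-count range reported in the abstract
    cov = library_coverage(n_clones, MEAN_INSERT_BP, GENOME_BP)
    p = prob_locus_present(n_clones, MEAN_INSERT_BP, GENOME_BP)
    print(f"{n_clones} clones: ~{cov:.2f}-fold coverage, P(locus present) ~ {p:.3f}")
```

Under these assumptions the nominal coverage comes out somewhat above 3-fold, which is broadly in line with the reported average of two or three clones per screened locus once clones without usable inserts are allowed for.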
Genome-Wide Transposon Mutagenesis in Pathogenic Leptospira Species
Murray, Gerald L.; Morel, Viviane; Cerqueira, Gustavo M.; Croda, Julio; Srikram, Amporn; Henry, Rebekah; Ko, Albert I.; Dellagostin, Odir A.; Bulach, Dieter M.; Sermswan, Rasana W.; Adler, Ben; Picardeau, Mathieu
2009-01-01
Leptospira interrogans is the most common cause of leptospirosis in humans and animals. Genetic analysis of L. interrogans has been severely hindered by a lack of tools for genetic manipulation. Recently we developed the mariner-based transposon Himar1 to generate the first defined mutants in L. interrogans. In this study, a total of 929 independent transposon mutants were obtained and the location of insertion determined. Of these mutants, 721 were located in the protein coding regions of 551 different genes. While sequence analysis of transposon insertion sites indicated that transposition occurred in an essentially random fashion in the genome, 25 unique transposon mutants were found to exhibit insertions into genes encoding 16S or 23S rRNAs, suggesting these genes are insertional hot spots in the L. interrogans genome. In contrast, loci containing notionally essential genes involved in lipopolysaccharide and heme biosynthesis showed few transposon insertions. The effect of gene disruption on the virulence of a selected set of defined mutants was investigated using the hamster model of leptospirosis. Two attenuated mutants with disruptions in hypothetical genes were identified, thus validating the use of transposon mutagenesis for the identification of novel virulence factors in L. interrogans. This library provides a valuable resource for the study of gene function in L. interrogans. Combined with the genome sequences of L. interrogans, this provides an opportunity to investigate genes that contribute to pathogenesis and will provide a better understanding of the biology of L. interrogans. PMID:19047402
A small indel mutation in an anthocyanin transporter causes variegated colouration of peach flowers.
Cheng, Jun; Liao, Liao; Zhou, Hui; Gu, Chao; Wang, Lu; Han, Yuepeng
2015-12-01
The ornamental peach cultivar 'Hongbaihuatao (HBH)' can simultaneously bear pink, red, and variegated flowers on a single tree. Anthocyanin content in pink flowers is extremely low, being only 10% that of a red flower. Surprisingly, the expression of anthocyanin structural and potential regulatory genes in white flowers was not significantly lower than that in both pink and red flowers. However, proteomic analysis revealed a GST encoded by a gene, regulator involved in anthocyanin transport (Riant), which is expressed in the red flower, but almost undetectable in the variegated flower. The Riant gene contains an insertion-deletion (indel) polymorphism in exon 3. In white flowers, the Riant gene is interrupted by a 2-bp insertion in the last exon, which causes a frameshift and a premature stop codon. In contrast, both pink and red flowers that arise from bud sports are heterozygous for the Riant locus, with one functional allele due to the 2-bp deletion or a novel 1-bp insertion. Southern blot analysis indicated that the Riant gene occurs in a single copy in the peach genome and it is not interrupted by a transposon. The function of the Riant gene was confirmed by its ectopic expression in the Arabidopsis tt19 mutant, where it complements the anthocyanin phenotype, but not the proanthocyanidin pigmentation in seed coat. Collectively, these results indicate that a small indel mutation in the Riant gene, which is not the result of a transposon insertion or excision, causes variegated colouration of peach flowers. © The Author 2015. Published by Oxford University Press on behalf of the Society for Experimental Biology. PMID:26357885
Deng, Youjin; Zhang, Qihui; Ming, Ray; Lin, Longji; Lin, Xiangzhi; Lin, Yiying; Li, Xiao; Xie, Baogui; Wen, Zhiqiang
2016-06-30
Hypomyces aurantius is a mycoparasite that causes cobweb disease, one of the most serious diseases of cultivated mushrooms. Intra-species identification is vital for disease control; however, the lack of genomic data makes development of molecular markers challenging. The small size, high copy number, and high mutation rate of the fungal mitochondrial genome make it a good candidate for intra- and inter-species differentiation. In this study, the mitochondrial genome of H. aurantius H.a0001 was determined from genomic DNA using Illumina sequencing. The roughly 72 kb genome shows all major features found in other Hypocreales: 14 common protein genes, large and small subunit rRNA genes, and 27 tRNA genes. Gene arrangement comparison showed that gene order in Hypocreales mitochondria is relatively conserved, with the exception of Acremonium chrysogenum and Acremonium implicatum. Mitochondrial genome comparison also revealed that intron length primarily contributes to mitogenome size variation. Seventeen introns were detected in six conserved genes: five in cox1, four in rnl, three in cob, two each in atp6 and cox3, and one in cox2. Four introns were found to contain two introns or open reading frames: cox3-i2 is a twintron containing two group IA type introns; cox2-i1 is a group IB intron encoding two homing endonucleases; and cox1-i4 and cox1-i3 both contain two open reading frames (ORFs). Analyses combining secondary intronic structures, insertion sites, and similarities of homing endonuclease genes reveal two group IA introns arranged side by side within cox3-i2. Mitochondrial data for H. aurantius provide the basis for further studies relating to population genetics and species identification. PMID:27376282
Tajaddod, Mansoureh; Tanzer, Andrea; Licht, Konstantin; Wolfinger, Michael T; Badelt, Stefan; Huber, Florian; Pusch, Oliver; Schopoff, Sandy; Janisiw, Michael; Hofacker, Ivo; Jantsch, Michael F
2016-10-25
Short interspersed elements (SINEs) represent the most abundant group of non-long-terminal repeat transposable elements in mammalian genomes. In primates, Alu elements are the most prominent and homogenous representatives of SINEs. Due to their frequent insertion within or close to coding regions, SINEs have been suggested to play a crucial role during genome evolution. Moreover, Alu elements within mRNAs have also been reported to control gene expression at different levels. Here, we undertake a genome-wide analysis of insertion patterns of human Alus within transcribed portions of the genome. Multiple, nearby insertions of SINEs within one transcript are more abundant in tandem orientation than in inverted orientation. Indeed, analysis of transcriptome-wide expression levels of 15 ENCODE cell lines suggests a cis-repressive effect of inverted Alu elements on gene expression. Using reporter assays, we show that the negative effect of inverted SINEs on gene expression is independent of known sensors of double-stranded RNAs. Instead, transcriptional elongation seems impaired, leading to reduced mRNA levels. Our study suggests that there is a bias against multiple SINE insertions that can promote intramolecular base pairing within a transcript. Moreover, at a genome-wide level, mRNAs harboring inverted SINEs are less expressed than mRNAs harboring single or tandemly arranged SINEs. Finally, we demonstrate a novel mechanism by which inverted SINEs can impact on gene expression by interfering with RNA polymerase II.
Utilization of next generation sequencing for analyzing transgenic insertions in plum
USDA-ARS's Scientific Manuscript database
When utilizing transgenic plants, it is useful to know how many copies of the genes were inserted and the locations of these insertions in the genome. This information can provide important insights for the interpretation of transgene expression and the resulting phenotype. Traditionally, these qu...
Bartels, Daniela; Kespohl, Sebastian; Albaum, Stefan; Drüke, Tanja; Goesmann, Alexander; Herold, Julia; Kaiser, Olaf; Pühler, Alfred; Pfeiffer, Friedhelm; Raddatz, Günter; Stoye, Jens; Meyer, Folker; Schuster, Stephan C
2005-04-01
We provide the graphical tool BACCardI for the construction of virtual clone maps from standard assembler output files or BLAST based sequence comparisons. This new tool has been applied to numerous genome projects to solve various problems including (a) validation of whole genome shotgun assemblies, (b) support for contig ordering in the finishing phase of a genome project, and (c) intergenome comparison between related strains when only one of the strains has been sequenced and a large insert library is available for the other. The BACCardI software can seamlessly interact with various sequence assembly packages. Genomic assemblies generated from sequence information need to be validated by independent methods such as physical maps. The time-consuming task of building physical maps can be circumvented by virtual clone maps derived from read pair information of large insert libraries.
S Elements: A Family of Tc1-like Transposons in the Genome of Drosophila melanogaster
Merriman, P. J.; Grimes, C. D.; Ambroziak, J.; Hackett, D. A.; Skinner, P.; Simmons, M. J.
1995-01-01
The S elements form a diverse family of long-inverted-repeat transposons within the genome of Drosophila melanogaster. These elements vary in size and sequence, the longest consisting of 1736 bp with 234-bp inverted terminal repeats. The longest open reading frame in an intact S element could encode a 345-amino acid polypeptide. This polypeptide is homologous to the transposases of the mariner-Tc1 superfamily of transposable elements. S elements are ubiquitous in D. melanogaster populations and also appear to be present in the genomes of two sibling species; however, they seem to be absent from 17 other Drosophila species that were examined. Within D. melanogaster strains, there are, on average, 37.4 cytologically detectable S elements per diploid genome. These elements are scattered throughout the chromosomes, but several sites in both the euchromatin and β heterochromatin are consistently occupied. The discovery of an S-element-insertion mutation and a reversion of this mutation indicates that S elements are at least occasionally mobile in the D. melanogaster genome. These elements seem to insert at an AT dinucleotide within a short palindrome and apparently duplicate that dinucleotide upon insertion. PMID:8601484
Han, Guomin; Shao, Qian; Li, Cuiping; Zhao, Kai; Jiang, Li; Fan, Jun; Jiang, Haiyang; Tao, Fang
2018-05-01
Aspergillus flavus often invades many important crops and produces harmful aflatoxins both at the preharvest stage and during storage. The regulation mechanism of aflatoxin biosynthesis in this fungus has not been well explored, mainly due to the lack of an efficient transformation method for constructing a genome-wide gene mutant library. This challenge was resolved in this study, where a reliable and efficient Agrobacterium tumefaciens-mediated transformation (ATMT) protocol for A. flavus NRRL 3357 was established. The results showed that removal of multinucleate conidia, to collect a homogenous sample of uninucleate conidia for use as the transformation material, is the key step in this procedure. A. tumefaciens strain AGL-1 harboring the ble gene for zeocin resistance under the control of the gpdA promoter from A. nidulans is suitable for genetic transformation of this fungus. We successfully generated A. flavus transformants with an efficiency of ∼60 positive transformants per 10^6 conidia using our protocol. A small-scale insertional mutant library (∼1,000 mutants) was constructed using this method, and several of the resulting mutants lacked both conidia production and aflatoxin biosynthesis capacity. Southern blotting analysis demonstrated that the majority of the transformants contained a single T-DNA insert in the genome. To the best of our knowledge, this is the first report of genetic transformation of A. flavus via ATMT, and our protocol provides an effective tool for construction of genome-wide gene mutant libraries for functional analysis of important genes in A. flavus.
Chen, Bo-Ruei; Hale, Devin C; Ciolek, Peter J; Runge, Kurt W
2012-05-03
Barcodes are unique DNA sequence tags that can be used to specifically label individual mutants. The barcode-tagged open reading frame (ORF) haploid deletion mutant collections in the budding yeast Saccharomyces cerevisiae and the fission yeast Schizosaccharomyces pombe allow for high-throughput mutant phenotyping because the relative growth of mutants in a population can be determined by monitoring the proportions of their associated barcodes. While these mutant collections have greatly facilitated genome-wide studies, mutations in essential genes are not present, and the roles of these genes are not as easily studied. To further support genome-scale research in S. pombe, we generated a barcode-tagged fission yeast insertion mutant library that has the potential of generating viable mutations in both essential and non-essential genes and can be easily analyzed using standard molecular biological techniques. An insertion vector containing a selectable ura4+ marker and a random barcode was used to generate a collection of 10,000 fission yeast insertion mutants stored individually in 384-well plates and as six pools of mixed mutants. Individual barcodes are flanked by Sfi I recognition sites and can be oligomerized in a unique orientation to facilitate barcode sequencing. Independent genetic screens on a subset of mutants suggest that this library contains a diverse collection of single insertion mutations. We present several approaches to determine insertion sites. This collection of S. pombe barcode-tagged insertion mutants is well-suited for genome-wide studies. Because insertion mutations may eliminate, reduce or alter the function of essential and non-essential genes, this library will contain strains with a wide range of phenotypes that can be assayed by their associated barcodes. The design of the barcodes in this library allows for barcode sequencing using next generation or standard benchtop cloning approaches.
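The idea that relative mutant growth can be read off from barcode proportions can be sketched as a log2 fold change in barcode frequency between two samples of a pooled culture. This is a minimal illustration, not the authors' analysis pipeline: the function name, the pseudocount, and the example barcode counts are all hypothetical.

```python
import math


def barcode_log2_fold_change(before, after, pseudocount=1.0):
    """Log2 change in each barcode's share of the pool between two samples.

    `before` and `after` map barcode -> read count. A pseudocount (an
    assumption of this sketch) keeps barcodes with zero reads finite.
    """
    barcodes = set(before) | set(after)
    n = len(barcodes)
    total_before = sum(before.values()) + pseudocount * n
    total_after = sum(after.values()) + pseudocount * n
    changes = {}
    for bc in barcodes:
        f_before = (before.get(bc, 0) + pseudocount) / total_before
        f_after = (after.get(bc, 0) + pseudocount) / total_after
        changes[bc] = math.log2(f_after / f_before)
    return changes


# Hypothetical read counts for three barcoded mutants before and after growth.
before = {"bc_A": 100, "bc_B": 100, "bc_C": 100}
after = {"bc_A": 180, "bc_B": 100, "bc_C": 20}

for bc, lfc in sorted(barcode_log2_fold_change(before, after).items()):
    print(f"{bc}: log2 fold change = {lfc:+.2f}")
```

A positive value means the mutant's share of the pool increased (faster relative growth), and a negative value means it was depleted.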
Han, Mengxue; Sun, Qibao; Zhou, Junyong; Qiu, Huarong; Guo, Jing; Lu, Lijuan; Mu, Wenlei; Sun, Jun
2017-09-01
Insertion of a solo LTR, which possesses strong bidirectional, stem-specific promoter activities, is associated with the evolution of a dwarfing apple spur mutation. Spur mutations in apple scions revolutionized global apple production. Since long terminal repeat (LTR) retrotransposons are closely associated with natural mutations, the inter-retrotransposon-amplified polymorphism technique and genome walking were used to find sequences in the apple genome based on these LTRs. In 'Red Delicious' spur mutants, a novel 2190-bp insertion was identified as a spur-specific solo LTR (sLTR) located at the 1038th nucleotide of another sLTR, which was 1536 bp in length. This insertion-within-an-insertion was localized within a preexisting Gypsy-50 retrotransposon at position 3,762,767 on chromosome 4. The analysis of transcriptional activity of the two sLTRs (the 2190- and 1536-bp inserts) indicated that the 2190-bp sLTR is a promoter capable of bidirectional transcription. GUS expression in the 2190-bp-sense and 2190-bp-antisense transgenic lines was prominent in stems. In contrast, no promoter activity from either the sense or the antisense strand of the 1536-bp sLTR was detected. From ~150 kb of DNA on each side of the 2190-bp sLTR insertion site, corresponding to 300 kb of the 'Golden Delicious' genome, 23 genes were predicted. Ten genes had predicted functions that could affect shoot development. This first report of an sLTR insertion associated with the evolution of an apple spur mutation will facilitate apple breeding, cloning of spur-related genes, and discovery of mechanisms behind dwarf habit.
Charles, Mathieu; Belcram, Harry; Just, Jérémy; Huneau, Cécile; Viollet, Agnès; Couloux, Arnaud; Segurens, Béatrice; Carter, Meredith; Huteau, Virginie; Coriton, Olivier; Appels, Rudi; Samain, Sylvie; Chalhoub, Boulos
2008-01-01
Transposable elements (TEs) constitute >80% of the wheat genome but their dynamics and contribution to size variation and evolution of wheat genomes (Triticum and Aegilops species) remain unexplored. In this study, 10 genomic regions have been sequenced from wheat chromosome 3B and used to constitute, along with all publicly available genomic sequences of wheat, 1.98 Mb of sequence (from 13 BAC clones) of the wheat B genome and 3.63 Mb of sequence (from 19 BAC clones) of the wheat A genome. Analysis of TE sequence proportions (as percentages), ratios of complete to truncated copies, and estimation of insertion dates of class I retrotransposons showed that specific types of TEs have undergone waves of differential proliferation in the B and A genomes of wheat. While both genomes show similar rates and relatively ancient proliferation periods for the Athila retrotransposons, the Copia retrotransposons proliferated more recently in the A genome whereas Gypsy retrotransposon proliferation is more recent in the B genome. It was possible to estimate for the first time the proliferation periods of the abundant CACTA class II DNA transposons, relative to that of the three main retrotransposon superfamilies. Proliferation of these TEs started prior to and overlapped with that of the Athila retrotransposons in both genomes. However, they also proliferated during the same periods as Gypsy and Copia retrotransposons in the A genome, but not in the B genome. As estimated from their insertion dates and confirmed by PCR-based tracing analysis, the majority of differential proliferation of TEs in B and A genomes of wheat (87 and 83%, respectively), leading to rapid sequence divergence, occurred prior to the allotetraploidization event that brought them together in Triticum turgidum and Triticum aestivum, <0.5 million years ago. More importantly, the allotetraploidization event appears to have neither enhanced nor repressed retrotranspositions. 
We discuss the apparent proliferation of TEs as resulting from their insertion, removal, and/or combinations of both evolutionary forces. PMID:18780739
Vincent, Antony T; Trudel, Mélanie V; Freschi, Luca; Nagar, Vandan; Gagné-Thivierge, Cynthia; Levesque, Roger C; Charette, Steve J
2016-01-12
Aeromonads make up a group of Gram-negative bacteria that includes human and fish pathogens. The Aeromonas salmonicida species has the peculiarity of including five known subspecies. However, few studies of the genomes of A. salmonicida subspecies have been reported to date. We sequenced the genomes of additional A. salmonicida isolates, including three from India, using next-generation sequencing in order to gain a better understanding of the genomic and phylogenetic links between A. salmonicida subspecies. Their relative phylogenetic positions were confirmed by a core genome phylogeny based on 1645 gene sequences. The Indian isolates, which formed a sub-group together with A. salmonicida subsp. pectinolytica, were able to grow at both 18 °C and 37 °C, unlike the A. salmonicida psychrophilic isolates that did not grow at 37 °C. Amino acid frequencies, GC content, tRNA composition, loss and gain of genes during evolution, pseudogenes as well as genes under positive selection and the mobilome were studied to explain this intraspecies dichotomy. Insertion sequences appeared to be an important driving force that locked the psychrophilic strains into their particular lifestyle in order to conserve their genomic integrity. This observation, based on comparative genomics, is in agreement with previous results showing that insertion sequence mobility induced by heat in A. salmonicida subspecies causes genomic plasticity, resulting in a deleterious effect on the virulence of the bacterium. We provide a proof-of-concept that selfish DNAs play a major role in the evolution of bacterial species by modeling genomes.
Willett-Brozick, J E; Savul, S A; Richey, L E; Baysal, B E
2001-08-01
Constitutional chromosomal translocations are relatively common causes of human morbidity, yet the DNA double-strand break (DSB) repair mechanisms that generate them are incompletely understood. We cloned, sequenced and analyzed the breakpoint junctions of a familial constitutional reciprocal translocation t(9;11)(p24;q23). Within the 10-kb region flanking the breakpoints, chromosome 11 had 25% repeat elements, whereas chromosome 9 had 98% repeats, 95% of which were L1-type LINE elements. The breakpoints occurred within an L1-type repeat element at 9p24 and at the 3'-end of an Alu sequence at 11q23. At the breakpoint junction of derivative chromosome 9, we discovered an unusually large 41-bp insertion, which showed 100% identity to 12S mitochondrial DNA (mtDNA) between nucleotides 896 and 936 of the mtDNA sequence. Analysis of the human genome failed to show the preexistence of the inserted sequence at normal chromosomes 9 and 11 breakpoint junctions or elsewhere in the genome, strongly suggesting that the insertion was derived from human mtDNA and captured into the junction during the DSB repair process. To our knowledge, these findings represent the first observation of spontaneous germ line insertion of modern human mtDNA sequences and suggest that DSB repair may play a role in inter-organellar gene transfer in vivo. Our findings also provide evidence for a previously unrecognized insertional mechanism in human, by which non-mobile extra-chromosomal fragments can be inserted into the genome at DSB repair junctions.
Identification, variation and transcription of pneumococcal repeat sequences
2011-01-01
Background Small interspersed repeats are commonly found in many bacterial chromosomes. Two families of repeats (BOX and RUP) have previously been identified in the genome of Streptococcus pneumoniae, a nasopharyngeal commensal and respiratory pathogen of humans. However, little is known about the role they play in pneumococcal genetics. Results Analysis of the genome of S. pneumoniae ATCC 700669 revealed the presence of a third repeat family, which we have named SPRITE. All three repeats are present at a reduced density in the genome of the closely related species S. mitis. However, they are almost entirely absent from all other streptococci, although a set of elements related to the pneumococcal BOX repeat was identified in the zoonotic pathogen S. suis. In conjunction with information regarding their distribution within the pneumococcal chromosome, this suggests that it is unlikely that these repeats are specialised sequences performing a particular role for the host, but rather that they constitute parasitic elements. However, comparing insertion sites between pneumococcal sequences indicates that they appear to transpose at a much lower rate than IS elements. Some large BOX elements in S. pneumoniae were found to encode open reading frames on both strands of the genome, whilst another was found to form a composite RNA structure with two T box riboswitches. In multiple cases, such BOX elements were demonstrated as being expressed using directional RNA-seq and RT-PCR. Conclusions BOX, RUP and SPRITE repeats appear to have proliferated extensively throughout the pneumococcal chromosome during the species' past, but novel insertions are currently occurring at a relatively slow rate. Through their extensive secondary structures, they seem likely to affect the expression of genes with which they are co-transcribed. Software for annotation of these repeats is freely available from ftp://ftp.sanger.ac.uk/pub/pathogens/strep_repeats/. PMID:21333003
[The principle and application of the single-molecule real-time sequencing technology].
Yanhu, Liu; Lu, Wang; Li, Yu
2015-03-01
The last decade witnessed the explosive development of third-generation sequencing strategies, including single-molecule real-time sequencing (SMRT), true single-molecule sequencing (tSMS) and single-molecule nanopore DNA sequencing. In this review, we summarize the principle, performance and application of the SMRT sequencing technology. Compared with the traditional Sanger method and the next-generation sequencing (NGS) technologies, the SMRT approach has several advantages, including long read length, high speed, PCR-free library preparation and the capability of direct detection of epigenetic modifications. However, its low accuracy, most of which results from insertions and deletions, is a notable disadvantage, so the raw sequence data need to be corrected before assembly. To date, SMRT has been a good fit for de novo genomic sequencing and high-quality assemblies of small genomes. In the future, it is expected to play an important role in epigenetics, transcriptomic sequencing, and assemblies of large genomes.
USDA-ARS's Scientific Manuscript database
We have expanded upon a previously reported comparative genomics approach using a read-depth (JaRMs) and a hybrid read-pair, split-read (RAPTR-SV) copy number variation (CNV) detection method that uses read alignments to the cattle reference genome in order to identify species-specific genomic rearr...
Guo, Jinchao; Yang, Litao; Liu, Xin; Guan, Xiaoyan; Jiang, Lingxi; Zhang, Dabing
2009-08-26
Genetically modified (GM) papaya (Carica papaya L.), Huanong No. 1, was approved for commercialization in Guangdong province, China in 2006, and the development of a Huanong No. 1 papaya detection method is necessary for implementing genetically modified organism (GMO) labeling regulations. In this study, we report the characterization of the exogenous integration of GM Huanong No. 1 papaya by means of conventional polymerase chain reaction (PCR) and thermal asymmetric interlaced (TAIL)-PCR strategies. The results suggested that one intact copy of the initial construct was integrated in the papaya genome, which probably resulted in one deletion (38 bp in size) of the host genomic DNA. Also, one unintended insertion of a 92 bp truncated NptII fragment was observed at the 5' end of the exogenous insert. Furthermore, we revealed the 5' and 3' flanking sequences between the insert DNA and the papaya genomic DNA, and developed event-specific qualitative and quantitative PCR assays for GM Huanong No. 1 papaya based on the 5' integration flanking sequence. The relative limit of detection (LOD) of the qualitative PCR assay was about 0.01% in 100 ng of total papaya genomic DNA, corresponding to about 25 copies of papaya haploid genome. In the quantitative PCR, the limits of detection and quantification (LOD and LOQ) were as low as 12.5 and 25 copies of papaya haploid genome, respectively. In practical sample quantification, the quantified biases between the test and true values of three samples ranged from 0.44% to 4.41%. Collectively, we propose that all of these results are useful for the identification and quantification of Huanong No. 1 papaya and its derivatives.
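The link between the relative LOD (0.01% of 100 ng) and the quoted figure of about 25 haploid genome copies can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the conversion of 978 Mb per picogram of DNA is a commonly used approximation that does not come from the abstract.

```python
# Hypothetical helpers; none of these names come from the abstract.
MB_PER_PG = 978.0  # assumed, commonly used conversion: 1 pg of DNA is roughly 978 Mb


def target_mass_pg(total_dna_ng, gm_fraction):
    """Mass of GM-derived DNA (in pg) in a sample of total_dna_ng nanograms."""
    return total_dna_ng * 1000.0 * gm_fraction


def implied_haploid_genome_pg(total_dna_ng, gm_fraction, copies):
    """Haploid genome mass (pg) implied by a stated copy number."""
    return target_mass_pg(total_dna_ng, gm_fraction) / copies


# Figures quoted in the abstract: 0.01% of 100 ng, about 25 copies.
gm_mass = target_mass_pg(100.0, 0.0001)                   # about 10 pg of GM DNA
per_genome = implied_haploid_genome_pg(100.0, 0.0001, 25)  # about 0.4 pg per copy
implied_size_mb = per_genome * MB_PER_PG                   # roughly 390 Mb

print(f"GM DNA at the LOD: {gm_mass:.1f} pg")
print(f"Implied haploid genome: {per_genome:.2f} pg (about {implied_size_mb:.0f} Mb)")
```

Under that assumed conversion, the stated copy number implies a haploid genome of roughly 0.4 pg, which is the kind of sanity check one might run before comparing LOD values between assays.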
Complete Chloroplast Genome of the Wollemi Pine (Wollemia nobilis): Structure and Evolution
Yap, Jia-Yee S.; Rohner, Thore; Greenfield, Abigail; Van Der Merwe, Marlien; McPherson, Hannah; Glenn, Wendy; Kornfeld, Geoff; Marendy, Elessa; Pan, Annie Y. H.; Wilkins, Marc R.; Rossetto, Maurizio; Delaney, Sven K.
2015-01-01
The Wollemi pine (Wollemia nobilis) is a rare Southern conifer with striking morphological similarity to fossil pines. A small population of W. nobilis was discovered in 1994 in a remote canyon system in the Wollemi National Park (near Sydney, Australia). This population contains fewer than 100 individuals and is critically endangered. Previous genetic studies of the Wollemi pine have investigated its evolutionary relationship with other pines in the family Araucariaceae, and have suggested that the Wollemi pine genome contains little or no variation. However, these studies were performed prior to the widespread use of genome sequencing, and their conclusions were based on a limited fraction of the Wollemi pine genome. In this study, we address this problem by determining the entire sequence of the W. nobilis chloroplast genome. A detailed analysis of the structure of the genome is presented, and the evolution of the genome is inferred by comparison with the chloroplast sequences of other members of the Araucariaceae and the related family Podocarpaceae. Pairwise alignments of whole genome sequences, and the presence of unique pseudogenes, gene duplications and insertions in W. nobilis and Araucariaceae, indicate that the W. nobilis chloroplast genome is most similar to that of its sister taxon Agathis. However, the W. nobilis genome contains an unusually high number of repetitive sequences, and these could be used in future studies to investigate and conserve any remnant genetic diversity in the Wollemi pine. PMID:26061691
Identification and Classification of Conserved RNA Secondary Structures in the Human Genome
Pedersen, Jakob Skou; Bejerano, Gill; Siepel, Adam; Rosenbloom, Kate; Lindblad-Toh, Kerstin; Lander, Eric S; Kent, Jim; Miller, Webb; Haussler, David
2006-01-01
The discoveries of microRNAs and riboswitches, among others, have shown functional RNAs to be biologically more important and genomically more prevalent than previously anticipated. We have developed a general comparative genomics method based on phylogenetic stochastic context-free grammars for identifying functional RNAs encoded in the human genome and used it to survey an eight-way genome-wide alignment of the human, chimpanzee, mouse, rat, dog, chicken, zebra-fish, and puffer-fish genomes for deeply conserved functional RNAs. At a loose threshold for acceptance, this search resulted in a set of 48,479 candidate RNA structures. This screen finds a large number of known functional RNAs, including 195 miRNAs, 62 histone 3′UTR stem loops, and various types of known genetic recoding elements. Among the highest-scoring new predictions are 169 new miRNA candidates, as well as new candidate selenocysteine insertion sites, RNA editing hairpins, RNAs involved in transcript auto regulation, and many folds that form singletons or small functional RNA families of completely unknown function. While the rate of false positives in the overall set is difficult to estimate and is likely to be substantial, the results nevertheless provide evidence for many new human functional RNAs and present specific predictions to facilitate their further characterization. PMID:16628248
The genetic map of finger millet, Eleusine coracana.
Dida, Mathews M; Srinivasachary; Ramakrishnan, Sujatha; Bennetzen, Jeffrey L; Gale, Mike D; Devos, Katrien M
2007-01-01
Restriction fragment length polymorphism (RFLP), amplified fragment length polymorphism (AFLP), expressed-sequenced tag (EST), and simple sequence repeat (SSR) markers were used to generate a genetic map of the tetraploid finger millet (Eleusine coracana subsp. coracana) genome (2n = 4x = 36). Because levels of variation in finger millet are low, the map was generated in an inter-subspecific F(2) population from a cross between E. coracana subsp. coracana cv. Okhale-1 and its wild progenitor E. coracana subsp. africana acc. MD-20. Duplicated loci were used to identify homoeologous groups. Assignment of linkage groups to the A and B genome was done by comparing the hybridization patterns of probes in Okhale-1, MD-20, and Eleusine indica acc. MD-36. E. indica is the A genome donor to E. coracana. The maps span 721 cM on the A genome and 787 cM on the B genome and cover all 18 finger millet chromosomes, at least partially. To facilitate the use of marker-assisted selection in finger millet, a first set of 82 SSR markers was developed. The SSRs were identified in small-insert genomic libraries generated using methylation-sensitive restriction enzymes. Thirty-one of the SSRs were mapped. Application of the maps and markers in hybridization-based breeding programs will expedite the improvement of finger millet.
The clinical applications of genome editing in HIV.
Wang, Cathy X; Cannon, Paula M
2016-05-26
HIV/AIDS has long been at the forefront of the development of gene- and cell-based therapies. Although conventional gene therapy approaches typically involve the addition of anti-HIV genes to cells using semirandomly integrating viral vectors, newer genome editing technologies based on engineered nucleases are now allowing more precise genetic manipulations. The possible outcomes of genome editing include gene disruption, which has been most notably applied to the CCR5 coreceptor gene, or the introduction of small mutations or larger whole gene cassette insertions at a targeted locus. Disruption of CCR5 using zinc finger nucleases was the first-in-human application of genome editing and remains the most clinically advanced platform, with 7 completed or ongoing clinical trials in T cells and hematopoietic stem/progenitor cells (HSPCs). Here we review the laboratory and clinical findings of CCR5 editing in T cells and HSPCs for HIV therapy and summarize other promising genome editing approaches for future clinical development. In particular, recent advances in the delivery of genome editing reagents and the demonstration of highly efficient homology-directed editing in both T cells and HSPCs are expected to spur the development of even more sophisticated applications of this technology for HIV therapy. © 2016 by The American Society of Hematology.
Baculovirus-based genome editing in primary cells.
Mansouri, Maysam; Ehsaei, Zahra; Taylor, Verdon; Berger, Philipp
2017-03-01
Genome editing in eukaryotes has become easier in recent years with the development of nucleases that induce double strand breaks in DNA at user-defined sites. CRISPR/Cas9-based genome editing is currently one of the most powerful strategies. In the simplest case, a nuclease (e.g. Cas9) and a target-defining guide RNA (gRNA) are transferred into a target cell. Non-homologous end joining (NHEJ) repair of the DNA break following Cas9 cleavage can lead to inactivation of the target gene. Specific repair or insertion of DNA with Homology Directed Repair (HDR) needs the simultaneous delivery of a repair template. Recombinant Lentivirus or Adenovirus genomes have enough capacity for a nuclease coding sequence and the gRNA but are usually too small to also carry large targeting constructs. We recently showed that a baculovirus-based multigene expression system (MultiPrime) can be used for genome editing in primary cells since it possesses the necessary capacity to carry the nuclease and gRNA expression constructs and the HDR targeting sequences. Here we present new Acceptor plasmids for MultiPrime that allow simplified cloning of baculoviruses for genome editing and we show their functionality in primary cells with limited life span and induced pluripotent stem cells (iPS). Copyright © 2017 Elsevier Inc. All rights reserved.
Chi, Sylvia Ighem; Urbarova, Ilona; Johansen, Steinar D
2018-04-30
The mitochondrial genomes of sea anemones are dynamic in structure. Invasion by genetic elements, such as self-catalytic group I introns or insertion-like sequences, contribute to sea anemone mitochondrial genome expansion and complexity. By using next generation sequencing we investigated the complete mtDNAs and corresponding transcriptomes of the temperate sea anemone Anemonia viridis and its closer tropical relative Anemonia majano. Two versions of fused homing endonuclease gene (HEG) organization were observed among the Actiniidae sea anemones; in-frame gene fusion and pseudo-gene fusion. We provided support for the pseudo-gene fusion organization in Anemonia species, resulting in a repressed HEG from the COI-884 group I intron. orfA, a putative protein-coding gene with insertion-like features, was present in both Anemonia species. Interestingly, orfA and COI expression were significantly up-regulated upon long-term environmental stress corresponding to low seawater pH conditions. This study provides new insights to the dynamics of sea anemone mitochondrial genome structure and function. Copyright © 2018 Elsevier B.V. All rights reserved.
Brewer, Megan H.; Chaudhry, Rabia; Qi, Jessica; Kidambi, Aditi; Drew, Alexander P.; Ryan, Monique M.; Subramanian, Gopinath M.; Young, Helen K.; Zuchner, Stephan; Reddel, Stephen W.; Nicholson, Garth A.; Kennerson, Marina L.
2016-01-01
With the advent of whole exome sequencing, cases where no pathogenic coding mutations can be found are increasingly being observed in many diseases. In two large, distantly-related families that mapped to the Charcot-Marie-Tooth neuropathy CMTX3 locus at chromosome Xq26.3-q27.3, all coding mutations were excluded. Using whole genome sequencing we found a large DNA interchromosomal insertion within the CMTX3 locus. The 78 kb insertion originates from chromosome 8q24.3, segregates fully with the disease in the two families, and is absent from the general population as well as 627 neurologically normal chromosomes from in-house controls. Large insertions into chromosome Xq27.1 are known to cause a range of diseases and this is the first neuropathy phenotype caused by an interchromosomal insertion at this locus. The CMTX3 insertion represents an understudied pathogenic structural variation mechanism for inherited peripheral neuropathies. Our finding highlights the importance of considering all structural variation types when studying unsolved inherited peripheral neuropathy cases with no pathogenic coding mutations. PMID:27438001
Copy number determination of genetically-modified hematopoietic stem cells.
Schuesler, Todd; Reeves, Lilith; Kalle, Christof von; Grassman, Elke
2009-01-01
Human gene transfer with gammaretroviral, murine leukemia virus (MLV) based vectors has been shown to effectively insert and express transgene sequences at a level of therapeutic benefit. However, there are numerous reports of disruption of the normal cellular processes caused by the viral insertion, even of replication deficient gammaretroviral vectors. Current gammaretroviral and lentiviral vectors do not control the site of insertion into the genome, hence, the possibility of disruption of the target cell genome. Risk related to viral insertions is linked to the number of insertions of the transgene into the cellular DNA, as has been demonstrated for replication competent and replication deficient retroviruses in experiments. At high number of insertions per cell, cell transformation due to vector induced activation of proto-oncogenes is more likely to occur, in particular since more than one transforming event is needed for oncogenesis. Thus, determination of the vector copy number in bulk transduced populations, individual colony forming units, and tissue from the recipient of the transduced cells is an increasingly important safety assay and has become a standard, though not straightforward assay, since the inception of quantitative PCR.
Time- and Cost-Efficient Identification of T-DNA Insertion Sites through Targeted Genomic Sequencing
Lepage, Étienne; Zampini, Éric; Boyle, Brian; Brisson, Normand
2013-01-01
Forward genetic screens enable the unbiased identification of genes involved in biological processes. In Arabidopsis, several mutant collections are publicly available, which greatly facilitates such practice. Most of these collections were generated by agrotransformation of a T-DNA at random sites in the plant genome. However, precise mapping of T-DNA insertion sites in mutants isolated from such screens is a laborious and time-consuming task. Here we report a simple, low-cost and time efficient approach to precisely map T-DNA insertions simultaneously in many different mutants. By combining sequence capture, next-generation sequencing and 2D-PCR pooling, we developed a new method that allowed the rapid localization of T-DNA insertion sites in 55 out of 64 mutant plants isolated in a screen for gyrase inhibition hypersensitivity. PMID:23951038
Widespread and evolutionary analysis of a MITE family Monkey King in Brassicaceae.
Dai, Shutao; Hou, Jinna; Long, Yan; Wang, Jing; Li, Cong; Xiao, Qinqin; Jiang, Xiaoxue; Zou, Xiaoxiao; Zou, Jun; Meng, Jinling
2015-06-19
Miniature inverted repeat transposable elements (MITEs) are important components of eukaryotic genomes, with hundreds of families and many copies, which may play important roles in gene regulation and genome evolution. However, few studies have investigated the molecular mechanisms involved. In our previous study, a Tourist-like MITE, Monkey King, was identified from the promoter region of a flowering time gene, BnFLC.A10, in Brassica napus. Based on this MITE, the characteristics and potential roles on gene regulation of the MITE family were analyzed in Brassicaceae. The characteristics of the Tourist-like MITE family Monkey King in Brassicaceae, including its distribution, copies and insertion sites in the genomes of major Brassicaceae species were analyzed in this study. Monkey King was actively amplified in Brassica after divergence from Arabidopsis, which was indicated by the prompt increase in copy number and by phylogenetic analysis. The genomic variations caused by Monkey King insertions, both intra- and inter-species in Brassica, were traced by PCR amplification. Genomic sequence analysis showed that most complete Monkey King elements are located in gene-rich regions, less than 3kb from genes, in both the B. rapa and A. thaliana genomes. Sixty-seven Brassica expressed sequence tags carrying Monkey King fragments were also identified from the NCBI database. Bisulfite sequencing identified specific DNA methylation of cytosine residues in the Monkey King sequence. A fragment containing putative TATA-box motifs in the MITE sequence could bind with nuclear protein(s) extracted from leaves of B. napus plants. A Monkey King-related microRNA, bna-miR6031, was identified in the microRNA database. In transgenic A. thaliana, when the Monkey King element was inserted upstream of 35S promoter, the promoter activity was weakened. 
Monkey King, a Brassicaceae Tourist-like MITE family, has amplified relatively recently and has induced intra- and inter-species genomic variations in Brassica. Monkey King elements are most abundant in the vicinity of genes and may have a substantial effect on genome-wide gene regulation in Brassicaceae. Monkey King insertions potentially regulate gene expression and genome evolution through epigenetic modification and new regulatory motif production.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Golden, Susan S
2008-10-16
The aim of this project was to inactivate each locus of the genome of the cyanobacterium Synechococcus elongatus PCC 7942 and screen resulting mutants for altered circadian phenotypes. The immediate goal was to identify all open reading frames (ORFs) that contribute to circadian timing. An additional result was to create a complete archived set of mutagenesis templates, of great utility for the wider research community, that will allow inactivation of any given locus in the genome of S. elongatus. Clones that carry segments of the S. elongatus genome were saturated with transposon insertions in vitro. We completed saturation mutagenesis of the chromosome (~2800 ORFs). The positions of insertions were sequenced for 17,767 mutagenized clones. Each individual insertion into the S. elongatus DNA in a cosmid or plasmid is a substrate for mutagenesis of the genome via homologous recombination. Because the complete insertion mutation clone set is 5-7 fold redundant, we produced a streamlined set of clones that contains one insertion mutation per locus in the genome, a unigene set. All clones are archived as Escherichia coli stocks frozen in glycerol in 96-well plates at -85ºC and as replicas of these plates on Whatman CloneSaver cards. Each of the mutagenesis substrates from the unigene set has been recombined into the chromosome of wild-type S. elongatus and these cyanobacterial mutants have been archived at -85ºC as well. S. elongatus insertion mutants defective for more than 1400 independent genes have been screened in luciferase reporter gene backgrounds to evaluate the effect of each mutation on circadian rhythms of gene expression. For the first 700 genes tested, mutagenesis of 71 different ORFs resulted in circadian phenotypes. The mutagenesis project also created insertion mutations in the endogenous large plasmid of S. elongatus, pANL. The sequence of pANL revealed two potential addiction cassettes that appear to account for selection for plasmid persistence. 
Genetic experiments confirmed that these regions are present on all sub-sets of the plasmid that can replace wild-type pANL. Analysis of mutants defective in each of the remaining ~1400 genes for defects in circadian rhythms will be completed with support from another agency as part of a larger project on circadian rhythms in this cyanobacterium.
2011-01-01
Background One of the key goals of oak genomics research is to identify genes of adaptive significance. This information may help to improve the conservation of adaptive genetic variation and the management of forests to increase their health and productivity. Deep-coverage large-insert genomic libraries are a crucial tool for attaining this objective. We report herein the construction of a BAC library for Quercus robur, its characterization and an analysis of BAC end sequences. Results The EcoRI library generated consisted of 92,160 clones, 7% of which had no insert. Levels of chloroplast and mitochondrial contamination were below 3% and 1%, respectively. Mean clone insert size was estimated at 135 kb. The library represents 12 haploid genome equivalents and, the likelihood of finding a particular oak sequence of interest is greater than 99%. Genome coverage was confirmed by PCR screening of the library with 60 unique genetic loci sampled from the genetic linkage map. In total, about 20,000 high-quality BAC end sequences (BESs) were generated by sequencing 15,000 clones. Roughly 5.88% of the combined BAC end sequence length corresponded to known retroelements while ab initio repeat detection methods identified 41 additional repeats. Collectively, characterized and novel repeats account for roughly 8.94% of the genome. Further analysis of the BESs revealed 1,823 putative genes suggesting at least 29,340 genes in the oak genome. BESs were aligned with the genome sequences of Arabidopsis thaliana, Vitis vinifera and Populus trichocarpa. One putative collinear microsyntenic region encoding an alcohol acyl transferase protein was observed between oak and chromosome 2 of V. vinifera. Conclusions This BAC library provides a new resource for genomic studies, including SSR marker development, physical mapping, comparative genomics and genome sequencing. BES analysis provided insight into the structure of the oak genome. 
These sequences will be used in the assembly of a future genome sequence for oak. PMID:21645357
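The statement that 12 haploid genome equivalents give a better than 99% chance of recovering any given sequence can be reproduced with the classical library-coverage estimate. The sketch below assumes the Clarke-Carbon style formula P = 1 - (1 - f)^N, with f the fraction of the genome carried by one clone and N the number of insert-bearing clones; the abstract does not say which formula the authors used, and all function names here are hypothetical.

```python
import math


def insert_bearing_clones(total_clones, empty_fraction):
    """Number of clones that actually carry an insert."""
    return round(total_clones * (1.0 - empty_fraction))


def recovery_probability(coverage, n_clones):
    """P(a given locus is in at least one clone), assuming uniform random
    sampling: 1 - (1 - f)^N, where f = coverage / N is the genome fraction
    carried by a single clone."""
    f = coverage / n_clones
    return 1.0 - (1.0 - f) ** n_clones


def recovery_probability_poisson(coverage):
    """Large-library approximation of the same quantity: 1 - exp(-coverage)."""
    return 1.0 - math.exp(-coverage)


# Figures quoted in the abstract: 92,160 clones, 7% without insert, 12x coverage.
usable = insert_bearing_clones(92_160, 0.07)
p_exact = recovery_probability(12.0, usable)
p_approx = recovery_probability_poisson(12.0)

print(f"insert-bearing clones: {usable}")
print(f"P(locus represented): {p_exact:.6f} (Poisson approximation {p_approx:.6f})")
```

Both forms give a probability well above 0.99 at 12-fold coverage, consistent with the abstract. The estimate assumes clones are drawn uniformly from the genome, which real EcoRI libraries only approximate.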
Gao, Yuanzheng; Guo, Xiuming; Santostefano, Katherine; Wang, Yanlin; Reid, Tammy; Zeng, Desmond; Terada, Naohiro; Ashizawa, Tetsuo; Xia, Guangbin
2016-08-01
Myotonic dystrophy type 1 (DM1) is caused by expanded Cytosine-Thymine-Guanine (CTG) repeats in the 3'-untranslated region (3' UTR) of the Dystrophia myotonica protein kinase (DMPK) gene, for which there is no effective therapy. The objective of this study is to develop genome therapy in human DM1 induced pluripotent stem (iPS) cells to eliminate mutant transcripts and reverse the phenotypes for developing autologous stem cell therapy. The general approach involves targeted insertion of polyA signals (PASs) upstream of DMPK CTG repeats, which will lead to premature termination of transcription and elimination of toxic mutant transcripts. Insertion of PASs was mediated by homologous recombination triggered by site-specific transcription activator-like effector nuclease (TALEN)-induced double-strand break. We found that genome-treated DM1 iPS cells continue to maintain pluripotency. The insertion of PASs led to elimination of mutant transcripts, complete disappearance of nuclear RNA foci and reversal of aberrant splicing in linear-differentiated neural stem cells, cardiomyocytes, and teratoma tissues. In conclusion, genome therapy by insertion of PASs upstream of the expanded DMPK CTG repeats prevented the production of toxic mutant transcripts and reversed phenotypes in DM1 iPS cells and their progeny. These genetically-treated iPS cells will have broad clinical application in developing autologous stem cell therapy for DM1.
Mercenaro, Luca; Nieddu, Giovanni; Porceddu, Andrea; Pezzotti, Mario; Camiolo, Salvatore
2017-01-01
The genetic diversity among grapevine (Vitis vinifera L.) cultivars that underlies differences in agronomic performance and wine quality reflects the accumulation of single nucleotide polymorphisms (SNPs) and small indels as well as larger genomic variations. A combination of high throughput sequencing and mapping against the grapevine reference genome allows the creation of comprehensive sequence variation maps. We used next generation sequencing and bioinformatics to generate an inventory of SNPs and small indels in four widely cultivated Sardinian grape cultivars (Bovale sardo, Cannonau, Carignano and Vermentino). More than 3,200,000 SNPs were identified with high statistical confidence. Some of the SNPs caused the appearance of premature stop codons and thus identified putative pseudogenes. The analysis of SNP distribution along chromosomes led to the identification of large genomic regions with uninterrupted series of homozygous SNPs. We used a digital comparative genomic hybridization approach to identify 6526 genomic regions with significant differences in copy number among the four cultivars compared to the reference sequence, including 81 regions shared between all four cultivars and 4953 specific to single cultivars (representing 1.2 and 75.9% of total copy number variation, respectively). Reads mapping at a distance that was not compatible with the insert size were used to identify a dataset of putative large deletions with cultivar Cannonau revealing the highest number. The analysis of genes mapping to these regions provided a list of candidates that may explain some of the phenotypic differences among the Bovale sardo, Cannonau, Carignano and Vermentino cultivars. PMID:28775732
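The idea of using read pairs that map at a distance incompatible with the library insert size can be illustrated with a toy filter. This is a minimal sketch, not the authors' pipeline: it assumes the mapped span of each read pair is already available as a plain number, and the function name, the example values and the robust cutoff (median plus a multiple of the median absolute deviation) are all illustrative choices.

```python
from statistics import median


def deletion_candidates(spans, n_mad=5.0):
    """Return indices of read pairs whose mapped span is far larger than
    the typical insert size.

    spans : mapped distance (bp) between the outer ends of each read pair
    n_mad : how many median absolute deviations above the median count as
            'not compatible with the insert size' (an arbitrary choice here)
    """
    typical = median(spans)
    mad = median(abs(s - typical) for s in spans)
    cutoff = typical + n_mad * max(mad, 1.0)  # guard against a zero MAD
    return [i for i, s in enumerate(spans) if s > cutoff]


# Made-up spans: six pairs near a 300 bp insert, one pair spanning about 5 kb.
example_spans = [300, 310, 295, 305, 290, 5300, 302]
print(deletion_candidates(example_spans))
```

A pair mapping much farther apart than the insert size suggests that sequence present in the reference is missing in the sample. In practice several independent pairs supporting the same interval would be required before calling a deletion.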
Gustavsson, Peter; Förster, Alisa; Hofmeister, Wolfgang; Wincent, Josephine; Zachariadis, Vasilios; Anderlid, Britt-Marie; Nordgren, Ann; Mäkitie, Outi; Wirta, Valtteri; Käller, Max; Vezzi, Francesco; Lupski, James R; Nordenskjöld, Magnus; Lundberg, Elisabeth Syk; Carvalho, Claudia M. B.; Lindstrand, Anna
2016-01-01
Most balanced translocations are thought to result mechanistically from non-homologous end joining (NHEJ) or, in rare cases of recurrent events, from nonallelic homologous recombination (NAHR). Here, we use low coverage mate pair whole genome sequencing to fine map rearrangement breakpoint junctions in both phenotypically normal and affected translocation carriers. In total, 46 junctions from 22 carriers of balanced translocations were characterized. Genes were disrupted in 48% of the breakpoints; recessive genes in four normal carriers and known dominant intellectual disability genes in three affected carriers. Finally, seven candidate disease genes were disrupted in five carriers with neurocognitive disabilities (SVOPL, SUSD1, TOX, NCALD, SLC4A10) and one XX-male carrier with Tourette syndrome (LYPD6, GPC5). Breakpoint junction analyses revealed microhomology and small templated insertions in a substantive fraction of the analyzed translocations (17.4%; n=4); an observation that was substantiated by reanalysis of 37 previously published translocation junctions. Microhomology associated with templated insertions is a characteristic seen in the breakpoint junctions of rearrangements mediated by the error-prone replication-based repair mechanisms (RBMs). Our data imply that a mechanism involving template switching might contribute to the formation of at least 15% of the interchromosomal translocation events. PMID:27862604
On the inversion-indel distance
2013-01-01
Background The inversion distance, that is the distance between two unichromosomal genomes with the same content allowing only inversions of DNA segments, can be computed thanks to a pioneering approach of Hannenhalli and Pevzner in 1995. In 2000, El-Mabrouk extended the inversion model to allow the comparison of unichromosomal genomes with unequal contents, thus insertions and deletions of DNA segments besides inversions. However, an exact algorithm was presented only for the case in which we have insertions alone and no deletion (or vice versa), while a heuristic was provided for the symmetric case, that allows both insertions and deletions and is called the inversion-indel distance. In 2005, Yancopoulos, Attie and Friedberg started a new branch of research by introducing the generic double cut and join (DCJ) operation, that can represent several genome rearrangements (including inversions). Among others, the DCJ model gave rise to two important results. First, it has been shown that the inversion distance can be computed in a simpler way with the help of the DCJ operation. Second, the DCJ operation originated the DCJ-indel distance, that allows the comparison of genomes with unequal contents, considering DCJ, insertions and deletions, and can be computed in linear time. Results In the present work we put these two results together to solve an open problem, showing that, when the graph that represents the relation between the two compared genomes has no bad components, the inversion-indel distance is equal to the DCJ-indel distance. We also give a lower and an upper bound for the inversion-indel distance in the presence of bad components. PMID:24564182
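For readers unfamiliar with the DCJ model mentioned above, the following is a minimal sketch of the plain DCJ distance for two circular unichromosomal genomes with the same gene content. It relies on the standard result (not stated in the abstract) that for circular genomes the distance equals N - C, where N is the number of genes and C is the number of cycles in the adjacency graph. It does not handle insertions, deletions, linear chromosomes, or the "bad components" discussed in the abstract; the function names are hypothetical.

```python
def extremities(gene):
    """Return (left, right) extremities of a signed gene as (id, 't'|'h')."""
    g = abs(gene)
    if gene > 0:
        return (g, "t"), (g, "h")  # forward gene reads tail -> head
    return (g, "h"), (g, "t")      # reversed gene reads head -> tail


def adjacency_map(genome):
    """Map each extremity to its neighbor in a circular genome."""
    partner = {}
    n = len(genome)
    for i in range(n):
        _, right = extremities(genome[i])
        left, _ = extremities(genome[(i + 1) % n])
        partner[right] = left
        partner[left] = right
    return partner


def dcj_distance_circular(a, b):
    """DCJ distance N - C for circular genomes with identical gene content."""
    if sorted(map(abs, a)) != sorted(map(abs, b)):
        raise ValueError("genomes must have the same gene content")
    pa, pb = adjacency_map(a), adjacency_map(b)
    seen = set()
    cycles = 0
    for start in pa:
        if start in seen:
            continue
        cycles += 1
        x = start
        # Walk the cycle, alternating an adjacency of A with one of B.
        while x not in seen:
            seen.add(x)
            y = pa[x]
            seen.add(y)
            x = pb[y]
    return len(a) - cycles


if __name__ == "__main__":
    print(dcj_distance_circular([1, 2, 3], [1, -2, 3]))
```

Signed integers encode gene orientation, so `[1, -2, 3]` is `[1, 2, 3]` with gene 2 inverted.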
Tol2 transposon-mediated transgenesis in Xenopus tropicalis.
Hamlet, Michelle R Johnson; Yergeau, Donald A; Kuliyev, Emin; Takeda, Masatoshi; Taira, Masanori; Kawakami, Koichi; Mead, Paul E
2006-09-01
The diploid frog Xenopus tropicalis is becoming a powerful developmental genetic model system. Sequencing of the X. tropicalis genome is nearing completion and several labs are embarking on mutagenesis screens. We are interested in developing insertional mutagenesis strategies in X. tropicalis. Transposon-mediated insertional mutagenesis, once used exclusively in plants and invertebrate systems, is now more widely applicable to vertebrates. The first step in developing transposons as tools for mutagenesis is to demonstrate that these mobile elements function efficiently in the target organism. Here, we show that the Medaka fish transposon, Tol2, is able to stably integrate into the X. tropicalis genome and will serve as a powerful tool for insertional mutagenesis strategies in the frog.
2009-01-01
Background With the publication of the draft chicken genome and the recent production of several BAC clone libraries from non-avian reptiles and birds, it is now possible to undertake more detailed comparative genomic studies in Reptilia. Of interest in particular are the genomic events that transformed the large, repeat-rich genomes of mammals and non-avian reptiles into the minimalist chicken genome. We have used paired BAC end sequences (BESs) from the American alligator (Alligator mississippiensis), painted turtle (Chrysemys picta) and emu (Dromaius novaehollandiae) to investigate patterns of sequence divergence, gene and retroelement content, and microsynteny between these species and chicken. Results From a total of 11,967 curated BESs, we successfully mapped 725, 773 and 2597 sequences in alligator, turtle, and emu, respectively, to sites in the draft chicken genome using a stringent BLAST protocol. Most commonly, sequences mapped to a single site in the chicken genome. Of 1675, 1828 and 2936 paired BESs obtained for alligator, turtle, and emu, respectively, a total of 34 (alligator, 2%), 24 (turtle, 1.3%) and 479 (emu, 16.3%) pairs were found to map with high confidence and in the correct orientation and with BAC-sized intermarker distances to single chicken chromosomes, including 25 such paired hits in emu mapping to the chicken Z chromosome. By determining the insert sizes of a subset of BAC clones from these three species, we also found a significant correlation between the intermarker distance in alligator and turtle and in chicken, with slopes as expected on the basis of the ratio of the genome sizes. 
Conclusion Our results suggest that a large number of small-scale chromosomal rearrangements and deletions in the lineage leading to chicken have drastically reduced the number of detected syntenies observed between the chicken and alligator, turtle, and emu genomes and imply that small deletions occurring widely throughout the genomes of reptilian and avian ancestors led to the ~50% reduction in genome size observed in birds compared to reptiles. We have also mapped and identified likely gene regions in hundreds of new BAC clones from these species. PMID:19607659
Chapus, Charles; Edwards, Scott V
2009-07-14
With the publication of the draft chicken genome and the recent production of several BAC clone libraries from non-avian reptiles and birds, it is now possible to undertake more detailed comparative genomic studies in Reptilia. Of interest in particular are the genomic events that transformed the large, repeat-rich genomes of mammals and non-avian reptiles into the minimalist chicken genome. We have used paired BAC end sequences (BESs) from the American alligator (Alligator mississippiensis), painted turtle (Chrysemys picta) and emu (Dromaius novaehollandiae) to investigate patterns of sequence divergence, gene and retroelement content, and microsynteny between these species and chicken. From a total of 11,967 curated BESs, we successfully mapped 725, 773 and 2597 sequences in alligator, turtle, and emu, respectively, to sites in the draft chicken genome using a stringent BLAST protocol. Most commonly, sequences mapped to a single site in the chicken genome. Of 1675, 1828 and 2936 paired BESs obtained for alligator, turtle, and emu, respectively, a total of 34 (alligator, 2%), 24 (turtle, 1.3%) and 479 (emu, 16.3%) pairs were found to map with high confidence and in the correct orientation and with BAC-sized intermarker distances to single chicken chromosomes, including 25 such paired hits in emu mapping to the chicken Z chromosome. By determining the insert sizes of a subset of BAC clones from these three species, we also found a significant correlation between the intermarker distance in alligator and turtle and in chicken, with slopes as expected on the basis of the ratio of the genome sizes. 
Our results suggest that a large number of small-scale chromosomal rearrangements and deletions in the lineage leading to chicken have drastically reduced the number of detected syntenies observed between the chicken and alligator, turtle, and emu genomes and imply that small deletions occurring widely throughout the genomes of reptilian and avian ancestors led to the ~50% reduction in genome size observed in birds compared to reptiles. We have also mapped and identified likely gene regions in hundreds of new BAC clones from these species.
USDA-ARS's Scientific Manuscript database
The American cranberry (Vaccinium macrocarpon Ait.) mitochondrial genome was assembled and reconstructed from whole genome 454 Roche GS-FLX and Illumina shotgun sequences. Compared with other Asterids, the reconstruction of the genome revealed an average size mitochondrion (459,678 nt) with comparat...
2013-01-01
Background Artificial selection played an important role in the origin of modern Glycine max cultivars from the wild soybean Glycine soja. To elucidate the consequences of artificial selection accompanying the domestication and modern improvement of soybean, 25 new and 30 published whole-genome re-sequencing accessions, which represent wild, domesticated landrace, and Chinese elite soybean populations were analyzed. Results A total of 5,102,244 single nucleotide polymorphisms (SNPs) and 707,969 insertion/deletions were identified. Among the SNPs detected, 25.5% were not described previously. We found that artificial selection during domestication led to more pronounced reduction in the genetic diversity of soybean than the switch from landraces to elite cultivars. Only a small proportion (2.99%) of the whole genomic regions appear to be affected by artificial selection for preferred agricultural traits. The selection regions were not distributed randomly or uniformly throughout the genome. Instead, clusters of selection hotspots in certain genomic regions were observed. Moreover, a set of candidate genes (4.38% of the total annotated genes) significantly affected by selection underlying soybean domestication and genetic improvement were identified. Conclusions Given the uniqueness of the soybean germplasm sequenced, this study drew a clear picture of human-mediated evolution of the soybean genomes. The genomic resources and information provided by this study would also facilitate the discovery of genes/loci underlying agronomically important traits. PMID:23984715
The Transposable Element Mariner Mediates Germline Transformation in Drosophila Melanogaster
Lidholm, D. A.; Lohe, A. R.; Hartl, D. L.
1993-01-01
A vector for germline transformation in Drosophila melanogaster was constructed using the transposable element mariner. The vector, denoted pMlwB, contains a mariner element disrupted by an insertion containing the wild-type white gene from D. melanogaster, the β-galactosidase gene from Escherichia coli and sequences that enable plasmid replication and selection in E. coli. The white gene is controlled by the promoter of the D. melanogaster gene for heat-shock protein 70, and the β-galactosidase gene is flanked upstream by the promoter of the transposable element P as well as that of mariner. The MlwB element was introduced into the germline of D. melanogaster by co-injection into embryos with an active mariner element, Mos1, which codes for a functional transposase and serves as a helper. Two independent germline insertions were isolated and characterized. The results show that the MlwB element inserted into the genome in a mariner-dependent manner with the termini of the inverted repeats inserted at a TA dinucleotide. Both insertions exhibit an unexpected degree of germline and somatic stability, even in the presence of an active mariner element in the genetic background. These results demonstrate that the mariner transposable element, which is small (1286 bp) and relatively homogeneous in size among different copies, is nevertheless capable of promoting the insertion of the large (13.2 kb) MlwB element. Because of the widespread phylogenetic distribution of mariner among insects, these results suggest that mariner might provide a wide host-range transformation vector for insects. PMID:8394264
Parasitism and the retrotransposon life cycle in plants: a hitchhiker's guide to the genome.
Sabot, F; Schulman, A H
2006-12-01
LTR (long terminal repeat) retrotransposons are the main components of higher plant genomic DNA. They have shaped their host genomes through insertional mutagenesis and by effects on genome size, gene expression and recombination. These Class I transposable elements are closely related to retroviruses such as the HIV by their structure and presumptive life cycle. However, the retrotransposon life cycle has been closely investigated in few systems. For retroviruses and retrotransposons, individual defective copies can parasitize the activity of functional ones. However, some LTR retrotransposon groups as a whole, such as large retrotransposon derivatives and terminal repeats in miniature, are non-autonomous even though their genomic insertion patterns remain polymorphic between organismal accessions. Here, we examine what is known of the retrotransposon life cycle in plants, and in that context discuss the role of parasitism and complementation between and within retrotransposon groups.
MSeq-CNV: accurate detection of Copy Number Variation from Sequencing of Multiple samples.
Malekpour, Seyed Amir; Pezeshk, Hamid; Sadeghi, Mehdi
2018-03-05
Currently, only a few tools are capable of detecting genome-wide Copy Number Variations (CNVs) based on sequencing of multiple samples. Although aberrations in mate pair insertion sizes provide additional hints for CNV detection based on multiple samples, the majority of the current tools rely only on the depth of coverage. Here, we propose a new algorithm (MSeq-CNV) which allows detecting common CNVs across multiple samples. MSeq-CNV applies a mixture density for modeling aberrations in depth of coverage and abnormalities in the mate pair insertion sizes. Each component in this mixture density applies a Binomial distribution for modeling the number of mate pairs with aberration in the insertion size and also a Poisson distribution for emitting the read counts, in each genomic position. MSeq-CNV is applied to simulated data and to real data from six HapMap individuals with high-coverage sequencing in the 1000 Genomes Project. These individuals include a CEU trio of European ancestry and a YRI trio of Nigerian ethnicity. Ancestry of these individuals is studied by clustering the identified CNVs. MSeq-CNV is also applied for detecting CNVs in two samples with low-coverage sequencing in the 1000 Genomes Project and six samples from the Simons Genome Diversity Project.
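The mixture density described above, with a Poisson term for read counts and a Binomial term for aberrant mate pairs, can be sketched for a single genomic position as follows. This is an illustrative reading of the abstract, not the MSeq-CNV implementation: the component parameters (weight, Poisson mean, aberrant-pair probability) are made-up values, the two emissions are assumed independent within a component, and no parameter fitting or multi-sample modeling is attempted.

```python
import math


def poisson_logpmf(k, lam):
    """log P(K = k) for K ~ Poisson(lam), lam > 0."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def binomial_logpmf(k, n, p):
    """log P(K = k) for K ~ Binomial(n, p), 0 < p < 1."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)


def component_posteriors(read_count, n_pairs, n_aberrant, components):
    """Posterior probability of each mixture component at one position.

    components: list of (weight, poisson_mean, aberrant_pair_prob);
    all three values are hypothetical in this sketch.
    """
    logs = [
        math.log(w)
        + poisson_logpmf(read_count, lam)
        + binomial_logpmf(n_aberrant, n_pairs, p)
        for w, lam, p in components
    ]
    # Log-sum-exp for numerical stability.
    m = max(logs)
    log_total = m + math.log(sum(math.exp(v - m) for v in logs))
    return [math.exp(v - log_total) for v in logs]


if __name__ == "__main__":
    # Two illustrative states: heterozygous deletion and normal copy number.
    states = [(0.1, 15.0, 0.40), (0.9, 30.0, 0.02)]
    print(component_posteriors(read_count=14, n_pairs=20, n_aberrant=9,
                               components=states))
```

Combining both signals in one likelihood is what lets low depth and many aberrant mate pairs reinforce each other when assigning a position to a copy-number state.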
Loong, Herbert H; Raymond, Victoria M; Shiotsu, Yukimasa; Chua, Daniel T T; Teo, Peter M L; Yung, Tony; Skrzypczak, Stan; Lanman, Richard B; Mok, Tony S K
2018-05-07
Genomic profiling of cell-free circulating tumor DNA (ctDNA) is a potential alternative to repeat invasive biopsy in patients with advanced cancer. We report the first real-world cohort of comprehensive genomic assessments of patients with non-small-cell lung cancer (NSCLC) in a Chinese population. We performed a retrospective analysis of patients with advanced or metastatic NSCLC whose physician requested ctDNA-based genomic profiling using the Guardant360 platform from January 2016 to June 2017. Guardant360 includes all 4 major types of genomic alterations (point mutations, insertion-deletion alterations, fusions, and amplifications) in 73 genes. Genomic profiling was performed in 76 patients from Hong Kong during the 18-month study period (median age, 59.5 years; 41 men and 35 women). The histologic types included adenocarcinoma (n = 10), NSCLC, not otherwise specified (n = 58), and squamous cell carcinoma (n = 8). In the adenocarcinoma and NSCLC, not otherwise specified, combined group, 62 of the 68 patients (91%) had variants identified (range, 1-12; median, 3), of whom 26 (42%) had ≥ 1 of the 7 National Comprehensive Cancer Network-recommended lung adenocarcinoma genomic targets. Concurrent driver and resistance mutations were identified in 6 of 13 patients with EGFR driver mutations and in 3 of 5 patients with EML4-ALK fusions. All 8 patients with squamous cell carcinoma had multiple variants identified (range, 1-20; median, 6), including FGFR1 amplification and ERBB2 (HER2) amplification. PIK3CA amplification occurred in combination with either FGFR1 or ERBB2 (HER2) amplification or alone. Genomic profiling using ctDNA analysis detected alterations in most patients with advanced-stage NSCLC, with targetable aberrations and resistance mechanisms identified. This approach has demonstrated its feasibility in Asia.
Stable zymomonas mobilis xylose and arabinose fermenting strains
Zhang, Min [Lakewood, CO]; Chou, Yat-Chen [Taipei, TW]
2008-04-08
The present invention briefly includes a transposon for stable insertion of foreign genes into a bacterial genome, comprising at least one operon having structural genes encoding enzymes selected from the group consisting of xylAxylB, araBAD and tal/tkt, and at least one promoter for expression of the structural genes in the bacterium, a pair of inverted insertion sequences, the operons contained inside the insertion sequences, and a transposase gene located outside of the insertion sequences. A plasmid shuttle vector for transformation of foreign genes into a bacterial genome, comprising at least one operon having structural genes encoding enzymes selected from the group consisting of xylAxylB, araBAD and tal/tkt, at least one promoter for expression of the structural genes in the bacterium, and at least two DNA fragments having homology with a gene in the bacterial genome to be transformed, is also provided. The transposon and shuttle vectors are useful in constructing significantly different Zymomonas mobilis strains, according to the present invention, which are useful in the conversion of the cellulose derived pentose sugars into fuels and chemicals, using traditional fermentation technology, because they are stable for expression in a non-selection medium.
Traverse, Charles C.
2017-01-01
ABSTRACT Advances in sequencing technologies have enabled direct quantification of genome-wide errors that occur during RNA transcription. These errors occur at rates that are orders of magnitude higher than rates during DNA replication, but due to technical difficulties such measurements have been limited to single-base substitutions and have not yet quantified the scope of transcription insertions and deletions. Previous reporter gene assay findings suggested that transcription indels are produced exclusively by elongation complex slippage at homopolymeric runs, so we enumerated indels across the protein-coding transcriptomes of Escherichia coli and Buchnera aphidicola, which differ widely in their genomic base compositions and incidence of repeat regions. As anticipated from prior assays, transcription insertions prevailed in homopolymeric runs of A and T; however, transcription deletions arose in much more complex sequences and were rarely associated with homopolymeric runs. By reconstructing the relocated positions of the elongation complex as inferred from the sequences inserted or deleted during transcription, we show that continuation of transcription after slippage hinges on the degree of nucleotide complementarity within the RNA:DNA hybrid at the new DNA template location. PMID:28851848
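Because the abstract turns on whether transcription indels fall inside homopolymeric runs, a small sketch of how such runs can be located in a sequence may help. It is not the authors' method; the function name and the minimum run length of 4 are assumptions chosen for illustration.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=4):
    """Return (start, base, length) for each run of one base >= min_len.

    Positions are 0-based. min_len is an arbitrary illustrative threshold.
    """
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


if __name__ == "__main__":
    for run in homopolymer_runs("GATTTTTCAAAAG"):
        print(run)
```

An indel position can then be classified as inside or outside a run by checking whether it falls in any `[start, start + length)` interval.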
Wang, Yan; Zhang, Wen; Liu, Zhijian; Fu, Xingli; Yuan, Jiaqi; Zhao, Jieji; Lin, Yuan; Shen, Quan; Wang, Xiaochun; Deng, Xutao; Delwart, Eric; Shan, Tongling; Yang, Shixing
2018-05-21
Recombination occurs frequently between enteroviruses (EVs) which are classified within the same species of the Picornaviridae family. Here, using viral metagenomics, the genomes of two recombinant EV-Gs (strains EVG 01/NC_CHI/2014 and EVG 02/NC_CHI/2014) found in the feces of pigs from a swine farm in China are described. The two strains are characterized by distinct insertion of a papain-like protease gene from toroviruses classified within the Coronaviridae family. According to recent reports the site of the torovirus protease insertion was located at the 2C/3A junction region in EVG 02/NC_CHI/2014. For the other variant EVG 01/NC_CHI/2014, the inserted protease sequence replaced the entire viral capsid protein region up to the VP1/2A junction. These two EV-G strains were highly prevalent in the same pig farm with all animals shedding the full-length genome (EVG 02/NC_CHI/2014) while 65% also shed the capsid deletion mutant (EVG 01/NC_CHI/2014). A helper-defective virus relationship between the two co-circulating EV-G recombinants is hypothesized.
Insertion and deletion mutagenesis of the human cytomegalovirus genome
DOE Office of Scientific and Technical Information (OSTI.GOV)
Spaete, R.R.; Mocarski, E.S.
1987-10-01
Studies on human cytomegalovirus (CMV) have been limited by a paucity of molecular genetic techniques available for manipulating the viral genome. The authors have developed methods for site-specific insertion and deletion mutagenesis of CMV utilizing a modified Escherichia coli lacZ gene as a genetic marker. The lacZ gene was placed under the control of the major β gene regulatory signals and inserted into the viral genome by homologous recombination, disrupting one of two copies of this β gene within the L-component repeats of CMV DNA. They observed high-level expression of β-galactosidase by the recombinant in a temporally authentic manner, with levels of this enzyme approaching 1% of total protein in infected cells. Thus, CMV is an efficient vector for high-level expression of foreign gene products in human cells. Using back selection of lacZ-deficient virus in the presence of the chromogenic substrate 5-bromo-4-chloro-3-indolyl β-D-galactoside, they generated random endpoint deletion mutants. Analysis of these mutants revealed that CMV DNA sequences flanking the insert had been removed, thereby establishing this approach as a means of determining whether sequences flanking a lacZ insertion are dispensable for viral growth. In an initial test of the methods, they have shown that 7800 base pairs of one copy of L-component repeat sequences can be deleted without affecting viral growth in human fibroblasts.
Moyer, Tyler C; Holland, Andrew J
2015-01-01
The ability to rapidly and specifically modify the genome of mammalian cells has been a long-term goal of biomedical researchers. Recently, the clustered, regularly interspaced, short palindromic repeats (CRISPR)/Cas9 system from bacteria has been exploited for genome engineering in human cells. The CRISPR system directs the RNA-guided Cas9 nuclease to a specific genomic locus to induce a DNA double-strand break that may be subsequently repaired by homology-directed repair using an exogenous DNA repair template. Here we describe a protocol using CRISPR/Cas9 to achieve bi-allelic insertion of a point mutation in human cells. Using this method, homozygous clonal cell lines can be constructed in 5-6 weeks. This method can also be adapted to insert larger DNA elements, such as fluorescent proteins and degrons, at defined genomic locations. CRISPR/Cas9 genome engineering offers exciting applications in both basic science and translational research.
Zhu, Hongwen; Shang, Dandan; Sun, Miao; Choi, Sunju; Liu, Qing; Hao, Jiajie; Figuera, Luis E.; Zhang, Feng; Choy, Kwong Wai; Ao, Yang; Liu, Yang; Zhang, Xiao-Lin; Yue, Fengzhen; Wang, Ming-Rong; Jin, Li; Patel, Pragna I.; Jing, Tao; Zhang, Xue
2011-01-01
X-linked congenital generalized hypertrichosis (CGH), an extremely rare condition characterized by universal overgrowth of terminal hair, was first mapped to chromosome Xq24-q27.1 in a Mexican family. However, the underlying genetic defect remains unknown. We ascertained a large Chinese family with an X-linked congenital hypertrichosis syndrome combining CGH, scoliosis, and spina bifida and mapped the disease locus to a 5.6 Mb critical region within the interval defined by the previously reported Mexican family. Through the combination of a high-resolution copy-number variation (CNV) scan and targeted genomic sequencing, we identified an interchromosomal insertion at Xq27.1 of a 125,577 bp intragenic fragment of COL23A1 on 5q35.3, with one X breakpoint within and the other very close to a human-specific short palindromic sequence located 82 kb downstream of SOX3. In the Mexican family, we found an interchromosomal insertion at the same Xq27.1 site of a 300,036 bp genomic fragment on 4q31.2, encompassing PRMT10 and TMEM184C and involving parts of ARHGAP10 and EDNRA. Notably, both of the two X breakpoints were within the short palindrome. The two palindrome-mediated insertions fully segregate with the CGH phenotype in each of the families, and the CNV gains of the respective autosomal genomic segments are not present in the public database and were not found in 1274 control individuals. Analysis of control individuals revealed deletions ranging from 173 bp to 9104 bp at the site of the insertions with no phenotypic consequence. Taken together, our results strongly support the pathogenicity of the identified insertions and establish X-linked congenital hypertrichosis syndrome as a genomic disorder. PMID:21636067
Saranathan, Rajagopalan; Pagal, Sudhakar; Sawant, Ajit R; Tomar, Archana; Madhangi, M; Sah, Suresh; Satti, Annapurna; Arunkumar, K P; Prashanth, K
2017-10-03
Acinetobacter baumannii is an important human pathogen and is considered a major threat due to its extreme drug resistance. In this study, the genome of a hyper-virulent MDR strain PKAB07 of A. baumannii isolated from an Indian patient was sequenced and analyzed to understand its mechanisms of virulence, resistance and evolution. Comparative genome analysis of PKAB07 revealed virulence and resistance related genes scattered throughout the genome, instead of being organized as an island, indicating the highly mosaic nature of the genome. Many intermittent horizontal gene transfer events and insertion sequence (IS) element insertions were identified that augment the resistance machinery and elevate SNP densities in A. baumannii, eventually aiding its swift evolution. ISAba1, the most widely distributed insertion sequence in A. baumannii, was found at multiple sites in PKAB07. Out of many ISAba1 insertions, we identified novel insertions in 9 different genes wherein insertional inactivation of adeN (tetR type regulator) was significant. To assess the significance of this disruption in A. baumannii, adeN mutant and complement strains were constructed in A. baumannii ATCC 17978 strain and studied. Biofilm levels were abrogated in the adeN knockout when compared with the wild type and complemented strain of adeN knockout. Virulence of the adeN knockout mutant strain was observed to be high, which was validated by in vitro experiments and a Galleria mellonella infection model. The overexpression of adeJ, a major component of the AdeIJK efflux pump, observed in the adeN knockout strain could be the possible reason for the elevated virulence in the adeN mutant and the PKAB07 strain. Knocking out adeN in the ATCC strain led to increased resistance and virulence on par with PKAB07. Disruption of the tetR type regulator adeN by ISAba1 has consequently led to elevated virulence in this pathogen.
Wachter, Shaun; Raghavan, Rahul; Wachter, Jenny; Minnick, Michael F
2018-04-11
Coxiella burnetii is a Gram-negative gammaproteobacterium and zoonotic agent of Q fever. C. burnetii's genome contains an abundance of pseudogenes and numerous selfish genetic elements. MITEs (miniature inverted-repeat transposable elements) are non-autonomous transposons that occur in all domains of life and are thought to be insertion sequences (ISs) that have lost their transposase function. Like most transposable elements (TEs), MITEs are thought to play an active role in evolution by altering gene function and expression through insertion and deletion activities. However, information regarding bacterial MITEs is limited. We describe two MITE families discovered during research on small non-coding RNAs (sRNAs) of C. burnetii. Two sRNAs, CbsR3 and CbsR13, were found to originate from a novel MITE family, termed QMITE1. Another sRNA, CbsR16, was found to originate from a separate and novel MITE family, termed QMITE2. Members of each family occur ~50 times within the strains evaluated. QMITE1 is a typical MITE of 300-400 bp with short (2-3 nt) direct repeats (DRs) of variable sequence and is often found overlapping annotated open reading frames (ORFs). Additionally, QMITE1 elements possess sigma-70 promoters and are transcriptionally active at several loci, potentially influencing expression of nearby genes. QMITE2 is smaller (150-190 bp), but has longer (7-11 nt) DRs of variable sequence and is mainly found in the 3' untranslated region of annotated ORFs and intergenic regions. QMITE2 contains a GTAG repetitive extragenic palindrome (REP) that serves as a target for IS1111 TE insertion. Both QMITE1 and QMITE2 display inter-strain linkage and sequence conservation, suggesting that they are adaptive and existed before divergence of C. burnetii strains. We have discovered two novel MITE families of C. burnetii. Our finding that MITEs serve as a source for sRNAs is novel. 
QMITE2 has a unique structure and occurs in large or small versions with unique DRs that display linkage and sequence conservation between strains, allowing for tracking of genomic rearrangements. QMITE1 and QMITE2 copies are hypothesized to influence expression of neighboring genes involved in DNA repair and virulence through transcriptional interference and ribonuclease processing.
Short and long-term genome stability analysis of prokaryotic genomes.
Brilli, Matteo; Liò, Pietro; Lacroix, Vincent; Sagot, Marie-France
2013-05-08
Gene organization dynamics is actively studied because it provides useful evolutionary information, makes functional annotation easier and often helps characterize pathogens. There is therefore a strong interest in understanding the variability of this trait and the possible correlations with life-style. Two kinds of events affect genome organization: on the one hand, translocations and recombinations change the relative position of genes shared by two genomes (i.e. the backbone gene order); on the other, insertions and deletions leave the backbone gene order unchanged but alter the gene neighborhoods by breaking the syntenic regions. A complete picture of genome organization evolution therefore requires accounting for both kinds of events. We developed an approach where we model chromosomes as graphs on which we compute different stability estimators; we consider genome rearrangements as well as the effect of gene insertions and deletions. In the first part of the paper, we fit a measure of backbone gene order conservation (hereinafter called backbone stability) against phylogenetic distance for over 3000 genome comparisons, improving existing models for the divergence in time of backbone stability. Intra- and inter-specific comparisons were treated separately to focus on different time-scales. The use of multiple genomes of the same species allowed us to identify genomes with diverging gene order with respect to their conspecifics. The inter-species analysis indicates that pathogens are more often unstable than non-pathogens. In the second part of the text, we show that in pathogens, gene content dynamics (insertions and deletions) have a much more dramatic effect on genome organization stability than backbone rearrangements. In this work, we studied genome organization divergence taking into account the contribution of both genome order rearrangements and genome content dynamics.
By studying species with multiple sequenced genomes available, we were able to explore genome organization stability at different time-scales and to find significant differences for pathogen and non-pathogen species. The output of our framework also allows the identification of conserved gene clusters and/or partial occurrences thereof, making it possible to explore how gene clusters assembled during evolution.
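The distinction drawn in this abstract between backbone rearrangements and gene content changes can be illustrated with a minimal sketch. It is not the authors' stability estimator, which the abstract does not specify; it assumes circular chromosomes, ignores gene orientation, and uses hypothetical gene names. The score is the fraction of backbone adjacencies of one genome that also occur in the other.

```python
def backbone(order, shared):
    """Keep only the genes shared by both genomes, preserving their order."""
    return [gene for gene in order if gene in shared]

def adjacencies(order):
    """Unordered neighbor pairs on a circular chromosome."""
    n = len(order)
    return {frozenset((order[i], order[(i + 1) % n])) for i in range(n)}

def backbone_conservation(genome_a, genome_b):
    """Fraction of backbone adjacencies of genome_a also present in genome_b."""
    shared = set(genome_a) & set(genome_b)
    if len(shared) < 2:
        return 0.0  # no adjacency can be defined
    adj_a = adjacencies(backbone(genome_a, shared))
    adj_b = adjacencies(backbone(genome_b, shared))
    return len(adj_a & adj_b) / len(adj_a)

# Hypothetical gene orders: genome_b carries one inserted gene ("x")
# and has g4 and g5 swapped relative to genome_a.
genome_a = ["g1", "g2", "g3", "g4", "g5"]
genome_b = ["g1", "x", "g2", "g3", "g5", "g4"]
score = backbone_conservation(genome_a, genome_b)
```

In this toy case the inserted gene breaks the g1-g2 neighborhood in genome_b but leaves the backbone score untouched, whereas the swap of g4 and g5 lowers it, which mirrors the two kinds of events the abstract separates.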
Draft genome sequences of Actinomyces timonensis strain 7400942T and its prophage.
Gorlas, Aurore; Gimenez, Grégory; Raoult, Didier; Roux, Véronique
2012-12-01
A draft genome sequence of Actinomyces timonensis, an anaerobic bacterium isolated from a human clinical osteoarticular sample, is described here. CRISPR-associated proteins, insertion sequence, and toxin-antitoxin loci were found on the genome. A new virus or provirus, AT-1, was characterized.
Hypervariable and highly divergent intron-exon organizations in the chordate Oikopleura dioica.
Edvardsen, Rolf B; Lerat, Emmanuelle; Maeland, Anne Dorthea; Flåt, Mette; Tewari, Rita; Jensen, Marit F; Lehrach, Hans; Reinhardt, Richard; Seo, Hee-Chan; Chourrout, Daniel
2004-10-01
Oikopleura dioica is a pelagic tunicate with a very small genome and a very short life cycle. In order to investigate the intron-exon organizations in Oikopleura, we have isolated and characterized ribosomal protein EF-1alpha, Hox, and alpha-tubulin genes. Their intron positions have been compared with those of the same genes from various invertebrates and vertebrates, including four species with entirely sequenced genomes. Oikopleura genes, like Caenorhabditis genes, have introns at a large number of nonconserved positions, which must originate from late insertions or intron sliding of ancient insertions. Both species exhibit hypervariable intron-exon organization within their alpha-tubulin gene family. This is due to localization of most nonconserved intron positions in single members of this gene family. The hypervariability and divergence of intron positions in Oikopleura and Caenorhabditis may be related to the predominance of short introns, the processing of which is not very dependent upon the exonic environment compared to large introns. Also, both species have an undermethylated genome, and the control of methylation-induced point mutations imposes a control on exon size, at least in vertebrate genes. That introns placed at such variable positions in Oikopleura or C. elegans may serve a specific purpose is not easy to infer from our current knowledge and hypotheses on intron functions. We propose that new introns are retained in species with very short life cycles, because illegitimate exchanges including gene conversion are repressed. We also speculate that introns placed at gene-specific positions may contribute to suppressing these exchanges and thereby favor their own persistence.
Adachi, Kaori
2014-03-01
At the Division of Functional Genomics, Research Center for Bioscience and Technology, Tottori University, we have been making an effort to establish a genetic testing facility that can provide the same screening procedures conducted worldwide. Direct Sequencing of PCR products is the main method to detect point mutations, small deletions and insertions. Multiplex Ligation-dependent Probe Amplification (MLPA) was used to detect large deletions or insertions. Expansion of the repeat was analyzed for triplet repeat diseases. Original primers were constructed for 41 diseases when the reported primers failed to amplify the gene. Prediction of functional effects of human nsSNPs (PolyPhen) was used for evaluation of novel mutations. From January 2000 to September 2013, a total of 1,006 DNA samples were subjected to genetic testing in the Division of Functional Genomics, Research Center for Bioscience and Technology, Tottori University. The hospitals that requested genetic testing were located in 43 prefectures in Japan and in 11 foreign countries. The genetic testing covered 62 diseases, and mutations were detected in 287 out of 1,006 with an average mutation detection rate of 24.7%. There were 77 samples for prenatal diagnosis. The number of samples has rapidly increased since 2010. In 2013, the next-generation sequencers were introduced in our facility and are expected to provide more comprehensive genetic testing in the near future. Nowadays, genetic testing is a popular and powerful tool for diagnosis of many genetic diseases. Our genetic testing should be further expanded in the future.
Allying with armored snails: the complete genome of gammaproteobacterial endosymbiont.
Nakagawa, Satoshi; Shimamura, Shigeru; Takaki, Yoshihiro; Suzuki, Yohey; Murakami, Shun-ichi; Watanabe, Tamaki; Fujiyoshi, So; Mino, Sayaka; Sawabe, Tomoo; Maeda, Takahiro; Makita, Hiroko; Nemoto, Suguru; Nishimura, Shin-Ichiro; Watanabe, Hiromi; Watsuji, Tomo-o; Takai, Ken
2014-01-01
Deep-sea vents harbor dense populations of various animals that have their specific symbiotic bacteria. Scaly-foot gastropods, which are snails with mineralized scales covering the sides of its foot, have a gammaproteobacterial endosymbiont in their enlarged esophageal glands and diverse epibionts on the surface of their scales. In this study, we report the complete genome sequencing of gammaproteobacterial endosymbiont. The endosymbiont genome displays features consistent with ongoing genome reduction such as large proportions of pseudogenes and insertion elements. The genome encodes functions commonly found in deep-sea vent chemoautotrophs such as sulfur oxidation and carbon fixation. Stable carbon isotope ((13)C)-labeling experiments confirmed the endosymbiont chemoautotrophy. The genome also includes an intact hydrogenase gene cluster that potentially has been horizontally transferred from phylogenetically distant bacteria. Notable findings include the presence and transcription of genes for flagellar assembly, through which proteins are potentially exported from bacterium to the host. Symbionts of snail individuals exhibited extreme genetic homogeneity, showing only two synonymous changes in 19 different genes (13 810 positions in total) determined for 32 individual gastropods collected from a single colony at one time. The extremely low genetic individuality in endosymbionts probably reflects that the stringent symbiont selection by host prevents the random genetic drift in the small population of horizontally transmitted symbiont. This study is the first complete genome analysis of gastropod endosymbiont and offers an opportunity to study genome evolution in a recently evolved endosymbiont.
Identification and Characterization of Domesticated Bacterial Transposases
Gallie, Jenna; Rainey, Paul B.
2017-01-01
Selfish genetic elements, such as insertion sequences and transposons, are found in most genomes. Transposons are usually identifiable by their high copy number within genomes. In contrast, REP-associated tyrosine transposases (RAYTs), a recently described class of bacterial transposase, are typically present at just one copy per genome. This suggests that RAYTs no longer copy themselves and thus they no longer function as a typical transposase. Motivated by this possibility we interrogated thousands of fully sequenced bacterial genomes in order to determine patterns of RAYT diversity, their distribution across chromosomes and accessory elements, and rate of duplication. RAYTs encompass exceptional diversity and are divisible into at least five distinct groups. They possess features more similar to housekeeping genes than insertion sequences, are predominantly vertically transmitted and have persisted through evolutionary time to the point where they are now found in 24% of all species for which at least one fully sequenced genome is available. Overall, the genomic distribution of RAYTs suggests that they have been coopted by host genomes to perform a function that benefits the host cell. PMID:28910967
Long interspersed element-1 (LINE-1): passenger or driver in human neoplasms?
Rodić, Nemanja; Burns, Kathleen H
2013-03-01
LINE-1 (L1) retrotransposons make up a significant portion of human genomes, with an estimated 500,000 copies per genome. Like other retrotransposons, L1 retrotransposons propagate through RNA sequences that are reverse transcribed into DNA sequences, which are integrated into new genomic loci. L1 somatic insertions have the potential to disrupt the transcriptome by inserting into or nearby genes. By mutating genes and playing a role in epigenetic dysregulation, L1 transposons may contribute to tumorigenesis. Studies of the "mobilome" have lagged behind other tumor characterizations at the sequence, transcript, and epigenetic levels. Here, we consider evidence that L1 retrotransposons may sometimes drive human tumorigenesis.
Nuclear Mitochondrial DNA Activates Replication in Saccharomyces cerevisiae
Chatre, Laurent; Ricchetti, Miria
2011-01-01
The nuclear genome of eukaryotes is colonized by DNA fragments of mitochondrial origin, called NUMTs. These insertions have been associated with a variety of germ-line diseases in humans. The significance of this uptake of potentially dangerous sequences into the nuclear genome is unclear. Here we provide functional evidence that sequences of mitochondrial origin promote nuclear DNA replication in Saccharomyces cerevisiae. We show that NUMTs are rich in key autonomously replicating sequence (ARS) consensus motifs, whose mutation results in the reduction or loss of DNA replication activity. Furthermore, 2D-gel analysis of the mrc1 mutant exposed to hydroxyurea shows that several NUMTs function as late chromosomal origins. We also show that NUMTs located close to or within ARS provide key sequence elements for replication. Thus NUMTs can act as independent origins, when inserted in an appropriate genomic context or affect the efficiency of pre-existing origins. These findings show that migratory mitochondrial DNAs can impact on the replication of the nuclear region they are inserted in. PMID:21408151
2012-01-01
Background: Barcodes are unique DNA sequence tags that can be used to specifically label individual mutants. The barcode-tagged open reading frame (ORF) haploid deletion mutant collections in the budding yeast Saccharomyces cerevisiae and the fission yeast Schizosaccharomyces pombe allow for high-throughput mutant phenotyping because the relative growth of mutants in a population can be determined by monitoring the proportions of their associated barcodes. While these mutant collections have greatly facilitated genome-wide studies, mutations in essential genes are not present, and the roles of these genes are not as easily studied. To further support genome-scale research in S. pombe, we generated a barcode-tagged fission yeast insertion mutant library that has the potential of generating viable mutations in both essential and non-essential genes and can be easily analyzed using standard molecular biological techniques. Results: An insertion vector containing a selectable ura4+ marker and a random barcode was used to generate a collection of 10,000 fission yeast insertion mutants stored individually in 384-well plates and as six pools of mixed mutants. Individual barcodes are flanked by Sfi I recognition sites and can be oligomerized in a unique orientation to facilitate barcode sequencing. Independent genetic screens on a subset of mutants suggest that this library contains a diverse collection of single insertion mutations. We present several approaches to determine insertion sites. Conclusions: This collection of S. pombe barcode-tagged insertion mutants is well-suited for genome-wide studies. Because insertion mutations may eliminate, reduce or alter the function of essential and non-essential genes, this library will contain strains with a wide range of phenotypes that can be assayed by their associated barcodes. The design of the barcodes in this library allows for barcode sequencing using next generation or standard benchtop cloning approaches. PMID:22554201
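The idea that relative growth can be read from barcode proportions can be sketched as follows. This is an illustration only, not the authors' analysis pipeline: the barcode strings, the read lists and the pseudocount are all hypothetical, and real data would first need barcode extraction and error handling.

```python
import math
from collections import Counter

def barcode_proportions(reads):
    """Share of each barcode among all barcode reads in one sample."""
    counts = Counter(reads)
    total = sum(counts.values())
    return {barcode: n / total for barcode, n in counts.items()}

def log2_fold_change(before, after, pseudo=1e-6):
    """log2 change in proportion per barcode; the pseudocount avoids log(0)."""
    barcodes = set(before) | set(after)
    return {
        bc: math.log2((after.get(bc, 0.0) + pseudo) / (before.get(bc, 0.0) + pseudo))
        for bc in barcodes
    }

# Hypothetical barcode reads from a mutant pool before and after selection.
reads_before = ["AAA", "CCC", "GGG", "TTT"]
reads_after = ["AAA", "AAA", "CCC", "GGG"]

change = log2_fold_change(
    barcode_proportions(reads_before), barcode_proportions(reads_after)
)
```

A positive value marks a mutant whose barcode became more frequent in the pool, and a strongly negative value marks one that dropped out.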
Morimoto, Tomomi; Arii, Jun; Akashi, Hiroomi; Kawaguchi, Yasushi
2009-03-01
Information on sites in HSV genomes at which foreign gene(s) can be inserted without disrupting viral genes or affecting properties of the parental virus are important for basic research on HSV and development of HSV-based vectors for human therapy. The intergenic region between HSV-1 UL3 and UL4 genes has been reported to satisfy the requirements for such an insertion site. The UL3 and UL4 genes are oriented toward the intergenic region and, therefore, insertion of a foreign gene(s) into the region between the UL3 and UL4 polyadenylation signals should not disrupt any viral genes or transcriptional units. HSV-1 and HSV-2 each have more than 10 additional regions structurally similar to the intergenic region between UL3 and UL4. In the studies reported here, it has been demonstrated that insertion of a reporter gene expression cassette into several of the HSV-1 and HSV-2 intergenic regions has no effect on viral growth in cell culture or virulence in mice, suggesting that these multiple intergenic regions may be suitable HSV sites for insertion of foreign genes.
Jheng, Cheng-Fong; Chen, Tien-Chih; Lin, Jhong-Yi; Chen, Ting-Chieh; Wu, Wen-Luan; Chang, Ching-Chun
2012-07-01
The chloroplast genome of Phalaenopsis equestris was determined and compared to those of Phalaenopsis aphrodite and Oncidium Gower Ramsey in Orchidaceae. The chloroplast genome of P. equestris is 148,959 bp, and a pair of inverted repeats (25,846 bp) separates the genome into large single-copy (85,967 bp) and small single-copy (11,300 bp) regions. The genome encodes 109 genes, including 4 rRNA, 30 tRNA and 75 protein-coding genes, but loses four ndh genes (ndhA, E, F and H) and seven other ndh genes are pseudogenes. The rate of inter-species variation between the two moth orchids was 0.74% (1107 sites) for single nucleotide substitution and 0.24% for insertions (161 sites; 1388 bp) and deletions (189 sites; 1393 bp). The IR regions have a lower rate of nucleotide substitution (3.5-5.8-fold) and indels (4.3-7.1-fold) than single-copy regions. The intergenic spacers are the most divergent, and based on the length variation of the three intergenic spacers, 11 native Phalaenopsis orchids could be successfully distinguished. The coding genes, IR junction and RNA editing sites are relatively more conserved between the two moth orchids than between those of Phalaenopsis and Oncidium spp. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
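The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes the usual quadripartite layout (large single-copy region, small single-copy region, two inverted-repeat copies) and, for the substitution rate, assumes the full genome length as the denominator, which the abstract does not state.

```python
# Values as quoted in the abstract (P. equestris chloroplast genome).
TOTAL_BP = 148_959
IR_BP = 25_846    # length of one inverted repeat; two copies are present
LSC_BP = 85_967   # large single-copy region
SSC_BP = 11_300   # small single-copy region

def quadripartite_total(lsc, ssc, ir):
    """Genome length implied by LSC + SSC + two inverted-repeat copies."""
    return lsc + ssc + 2 * ir

def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole

assembled_bp = quadripartite_total(LSC_BP, SSC_BP, IR_BP)
gene_total = 4 + 30 + 75                       # rRNA + tRNA + protein-coding
substitution_rate = percent(1107, TOTAL_BP)    # assumed denominator: genome length
```

Under these assumptions the region lengths add up to the reported genome size, the gene classes add up to the reported gene count, and the substitution rate rounds to the quoted 0.74%.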
Habegger, Lukas; Balasubramanian, Suganthi; Chen, David Z; Khurana, Ekta; Sboner, Andrea; Harmanci, Arif; Rozowsky, Joel; Clarke, Declan; Snyder, Michael; Gerstein, Mark
2012-09-01
The functional annotation of variants obtained through sequencing projects is generally assumed to be a simple intersection of genomic coordinates with genomic features. However, complexities arise for several reasons, including the differential effects of a variant on alternatively spliced transcripts, as well as the difficulty in assessing the impact of small insertions/deletions and large structural variants. Taking these factors into consideration, we developed the Variant Annotation Tool (VAT) to functionally annotate variants from multiple personal genomes at the transcript level as well as obtain summary statistics across genes and individuals. VAT also allows visualization of the effects of different variants, integrates allele frequencies and genotype data from the underlying individuals and facilitates comparative analysis between different groups of individuals. VAT can either be run through a command-line interface or as a web application. Finally, in order to enable on-demand access and to minimize unnecessary transfers of large data files, VAT can be run as a virtual machine in a cloud-computing environment. VAT is implemented in C and PHP. The VAT web service, Amazon Machine Image, source code and detailed documentation are available at vat.gersteinlab.org.
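The point about alternatively spliced transcripts can be illustrated with a small sketch. It is not VAT code and does not use VAT's interface; the transcript names, exon coordinates and variant position are hypothetical, and coordinates are treated as 1-based inclusive.

```python
from typing import NamedTuple, Tuple

class Transcript(NamedTuple):
    name: str
    exons: Tuple[Tuple[int, int], ...]  # (start, end), 1-based inclusive

def classify(position, transcript):
    """Label a variant position relative to a single transcript."""
    first = min(start for start, _ in transcript.exons)
    last = max(end for _, end in transcript.exons)
    if not first <= position <= last:
        return "outside"
    if any(start <= position <= end for start, end in transcript.exons):
        return "exonic"
    return "intronic"

def annotate(position, transcripts):
    """Annotate one variant against every transcript of a gene."""
    return {t.name: classify(position, t) for t in transcripts}

# Two hypothetical isoforms of one gene; the short one skips the middle exon.
ISOFORMS = (
    Transcript("tx_long", ((100, 200), (300, 400), (500, 600))),
    Transcript("tx_short", ((100, 200), (500, 600))),
)
effects = annotate(350, ISOFORMS)
```

The same coordinate is exonic in one isoform and intronic in the other, which is why a gene-level intersection alone can misstate a variant's effect.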
Error Correcting Optical Mapping Data.
Mukherjee, Kingshuk; Washimkar, Darshan; Muggli, Martin D; Salmela, Leena; Boucher, Christina
2018-05-26
Optical mapping is a unique system that is capable of producing high-resolution, high-throughput genomic map data that gives information about the structure of a genome [21]. Recently it has been used for scaffolding contigs and assembly validation for large-scale sequencing projects, including the maize [32], goat [6], and amborella [4] genomes. However, a major impediment in the use of this data is the variety and quantity of errors in the raw optical mapping data, which are called Rmaps. The challenges associated with using Rmap data are analogous to dealing with insertions and deletions in the alignment of long reads. Moreover, they are arguably harder to tackle since the data is numerical and susceptible to inaccuracy. We develop cOMET to error correct Rmap data, which to the best of our knowledge is the only optical mapping error correction method. Our experimental results demonstrate that cOMET has high precision and corrects 82.49% of insertion errors and 77.38% of deletion errors in Rmap data generated from the E. coli K-12 reference genome. Out of the deletion errors corrected, 98.26% are true errors. Similarly, out of the insertion errors corrected, 82.19% are true errors. It also successfully scales to large genomes, improving the quality of 78% and 99% of the Rmaps in the plum and goat genomes, respectively. Lastly, we show the utility of error correction by demonstrating how it improves the assembly of Rmap data. Error corrected Rmap data results in an assembly that is more contiguous, and covers a larger fraction of the genome.
Young, Michael; Artsatbanov, Vladislav; Beller, Harry R.; Chandra, Govind; Chater, Keith F.; Dover, Lynn G.; Goh, Ee-Been; Kahan, Tamar; Kaprelyants, Arseny S.; Kyrpides, Nikos; Lapidus, Alla; Lowry, Stephen R.; Lykidis, Athanasios; Mahillon, Jacques; Markowitz, Victor; Mavromatis, Konstantinos; Mukamolova, Galina V.; Oren, Aharon; Rokem, J. Stefan; Smith, Margaret C. M.; Young, Danielle I.; Greenblatt, Charles L.
2010-01-01
Micrococcus luteus (NCTC2665, “Fleming strain”) has one of the smallest genomes of free-living actinobacteria sequenced to date, comprising a single circular chromosome of 2,501,097 bp (G+C content, 73%) predicted to encode 2,403 proteins. The genome shows extensive synteny with that of the closely related organism, Kocuria rhizophila, from which it was taxonomically separated relatively recently. Despite its small size, the genome harbors 73 insertion sequence (IS) elements, almost all of which are closely related to elements found in other actinobacteria. An IS element is inserted into the rrs gene of one of only two rrn operons found in M. luteus. The genome encodes only four sigma factors and 14 response regulators, a finding indicative of adaptation to a rather strict ecological niche (mammalian skin). The high sensitivity of M. luteus to β-lactam antibiotics may result from the presence of a reduced set of penicillin-binding proteins and the absence of a wblC gene, which plays an important role in the antibiotic resistance in other actinobacteria. Consistent with the restricted range of compounds it can use as a sole source of carbon for energy and growth, M. luteus has a minimal complement of genes concerned with carbohydrate transport and metabolism and its inability to utilize glucose as a sole carbon source may be due to the apparent absence of a gene encoding glucokinase. Uniquely among characterized bacteria, M. luteus appears to be able to metabolize glycogen only via trehalose and to make trehalose only via glycogen. It has very few genes associated with secondary metabolism. In contrast to most other actinobacteria, M. luteus encodes only one resuscitation-promoting factor (Rpf) required for emergence from dormancy, and its complement of other dormancy-related proteins is also much reduced. M. luteus is capable of long-chain alkene biosynthesis, which is of interest for advanced biofuel production; a three-gene cluster essential for this metabolism has been identified in the genome. PMID:19948807
Matsunaga, Taichi; Yamashita, Jun K
2014-02-07
Specific gene knockout and rescue experiments are powerful tools in developmental and stem cell biology. Nevertheless, the experiments require multiple steps of molecular manipulation for gene knockout and subsequent rescue procedures. Here we report an efficient and single step strategy to generate gene knockout-rescue system in pluripotent stem cells by promoter insertion with CRISPR/Cas9 genome editing technology. We inserted a tetracycline-regulated inducible gene promoter (tet-OFF/TRE-CMV) upstream of the endogenous promoter region of vascular endothelial growth factor receptor 2 (VEGFR2/Flk1) gene, an essential gene for endothelial cell (EC) differentiation, in mouse embryonic stem cells (ESCs) with homologous recombination. Both homo- and hetero-inserted clones were efficiently obtained through a simple selection with a drug-resistant gene. The insertion of TRE-CMV promoter disrupted endogenous Flk1 expression, resulting in null mutation in homo-inserted clones. When the inserted TRE-CMV promoter was activated with doxycycline (Dox) depletion, Flk1 expression was sufficiently recovered from the downstream genomic Flk1 gene. Whereas EC differentiation was almost completely perturbed in homo-inserted clones, Flk1 rescue with TRE-CMV promoter activation restored EC appearance, indicating that phenotypic changes in EC differentiation can be successfully reproduced with this knockout-rescue system. Thus, this promoter insertion strategy with CRISPR/Cas9 would be a novel attractive method for knockout-rescue experiments. Copyright © 2014 Elsevier Inc. All rights reserved.
Contribution of transposable elements in the plant's genome.
Sahebi, Mahbod; Hanafi, Mohamed M; van Wijnen, Andre J; Rice, David; Rafii, M Y; Azizi, Parisa; Osman, Mohamad; Taheri, Sima; Bakar, Mohd Faizal Abu; Isa, Mohd Noor Mat; Noor, Yusuf Muhammad
2018-07-30
Plants maintain extensive growth flexibility under different environmental conditions, allowing them to continuously and rapidly adapt to alterations in their environment. A large portion of many plant genomes consists of transposable elements (TEs) that create new genetic variations within plant species. Different types of mutations may be created by TEs in plants. Many TEs can avoid the host's defense mechanisms and survive alterations in transposition activity, internal sequence and target site. Thus, plant genomes are expected to utilize a variety of mechanisms to tolerate TEs that are near or within genes. TEs affect the expression of not only nearby genes but also unlinked inserted genes. TEs can create new promoters, leading to novel expression patterns or alternative coding regions to generate alternate transcripts in plant species. TEs can also provide novel cis-acting regulatory elements that act as enhancers or inserts within original enhancers that are required for transcription. Thus, the regulation of plant gene expression is strongly managed by the insertion of TEs into nearby genes. TEs can also lead to chromatin modifications and thereby affect gene expression in plants. TEs are able to generate new genes and modify existing gene structures by duplicating, mobilizing and recombining gene fragments. They can also facilitate cellular functions by sharing their transposase-coding regions. Hence, TE insertions can not only act as simple mutagens but can also alter the elementary functions of the plant genome. Here, we review recent discoveries concerning the contribution of TEs to gene expression in plant genomes and discuss the different mechanisms by which TEs can affect plant gene expression and reduce host defense mechanisms. Copyright © 2018 Elsevier B.V. All rights reserved.
Endogenous Retroviruses: With Us and Against Us
Meyer, Thomas J.; Rosenkrantz, Jimi L.; Carbone, Lucia; Chavez, Shawn L.
2017-04-01
Mammalian genomes are scattered with thousands of copies of endogenous retroviruses (ERVs), mobile genetic elements that are relics of ancient retroviral infections. After inserting copies into the germ line of a host, most ERVs accumulate mutations that prevent the normal assembly of infectious viral particles, becoming trapped in host genomes and unable to leave to infect other cells. While most copies of ERVs are inactive, some are transcribed and encode the proteins needed to generate new insertions at novel loci. In some cases, old copies are removed via recombination and other mechanisms. This creates a shifting landscape of ERV copies within host genomes. New insertions can disrupt normal expression of nearby genes via directly inserting into key regulatory elements or by containing regulatory motifs within their sequences. Further, the transcriptional silencing of ERVs via epigenetic modification may result in changes to the epigenetic regulation of adjacent genes. In these ways, ERVs can be potent sources of regulatory disruption as well as genetic innovation. Here, we provide a brief review of the association between ERVs and gene expression, especially as observed in pre-implantation development and placentation. Moreover, we will describe the roles ERVs may play in somatic tissues, mostly in the context of human disease, including cancer, neurodegenerative disorders, and schizophrenia. Lastly, we discuss the recent discovery that some ERVs may have been pressed into the service of their host genomes to aid in the innate immune response to exogenous viral infections.
Zhang, Xiaobing; Tang, Qiaoling; Wang, Xujing; Wang, Zhixing
2016-01-01
In this study, the flanking sequence of an inserted fragment conferring glyphosate tolerance on transgenic cotton line BG2-7 was analyzed by thermal asymmetric interlaced polymerase chain reaction (TAIL-PCR) and standard PCR. The results showed apparent insertion of the exogenous gene into chromosome D10 of the Gossypium hirsutum L. genome, as the left and right borders of the inserted fragment are nucleotides 61,962,952 and 61,962,921 of chromosome D10, respectively. In addition, a 31-bp cotton microsatellite sequence was noted between the genome sequence and the 5' end of the exogenous gene. In total, 84 and 298 bp were deleted from the left and right borders of the exogenous gene, respectively, with 30 bp deleted from the cotton chromosome at the insertion site. According to the flanking sequence obtained, several pairs of event-specific detection primers were designed to amplify sequence between the 5' end of the exogenous gene and the cotton genome junction region as well as between the 3' end and the cotton genome junction region. Based on screening tests, the 5'-end primers GTCATAACGTGACTCCCTTAATTCTCC/CCTATTACACGGCTATGC and 3'-end primers TCCTTTCGCTTTCTTCCCTT/ACACTTACATGGCGTCTTCT were used to detect the respective BG2-7 event-specific primers. The limit of detection of the former primers reached 44 copies, and that of the latter primers reached 88 copies. The results of this study provide useful data for assessment of BG2-7 safety and for accelerating its industrialization.
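The coordinate and primer details in this abstract lend themselves to a quick sanity check. The sketch below assumes that the two reported border coordinates are the last retained genomic bases on either side of the insert, so that the deleted chromosomal bases are those lying strictly between them; the primer labels are placeholders, since the abstract does not say which primer of each pair is forward or reverse.

```python
def bases_between(left_border, right_border):
    """Count genomic bases lying strictly between two flanking coordinates."""
    low, high = sorted((left_border, right_border))
    return max(high - low - 1, 0)

def gc_fraction(seq):
    """Fraction of G and C bases in a primer sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

# Primer sequences as quoted in the abstract; the dictionary keys are
# placeholder labels, not names used by the authors.
PRIMERS = {
    "5p_first": "GTCATAACGTGACTCCCTTAATTCTCC",
    "5p_second": "CCTATTACACGGCTATGC",
    "3p_first": "TCCTTTCGCTTTCTTCCCTT",
    "3p_second": "ACACTTACATGGCGTCTTCT",
}

deleted_bp = bases_between(61_962_952, 61_962_921)
primer_summary = {
    name: (len(seq), round(gc_fraction(seq), 3)) for name, seq in PRIMERS.items()
}
```

Under the stated assumption the two border coordinates imply the 30-bp chromosomal deletion reported in the abstract, and the summary gives each primer's length and GC fraction for comparison when designing similar event-specific assays.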
Short interspersed elements (SINEs) are a major source of canine genomic diversity.
Wang, Wei; Kirkness, Ewen F
2005-12-01
SINEs are retrotransposons that have enjoyed remarkable reproductive success during the course of mammalian evolution, and have played a major role in shaping mammalian genomes. Previously, an analysis of survey-sequence data from an individual dog (a poodle) indicated that canine genomes harbor a high frequency of alleles that differ only by the absence or presence of a SINEC_Cf repeat. Comparison of this survey-sequence data with a draft genome sequence of a distinct dog (a boxer) has confirmed this prediction, and revealed the chromosomal coordinates for >10,000 loci that are bimorphic for SINEC_Cf insertions. Analysis of SINE insertion sites from the genomes of nine additional dogs indicates that 3%-5% are absent from either the poodle or boxer genome sequences--suggesting that an additional 10,000 bimorphic loci could be readily identified in the general dog population. We describe a methodology that can be used to identify these loci, and could be adapted to exploit these bimorphic loci for genotyping purposes. Approximately half of all annotated canine genes contain SINEC_Cf repeats, and these elements are occasionally transcribed. When transcribed in the antisense orientation, they provide splice acceptor sites that can result in incorporation of novel exons. The high frequency of bimorphic SINE insertions in the dog population is predicted to provide numerous examples of allele-specific transcription patterns that will be valuable for the study of differential gene expression among multiple dog breeds.
Creation and genomic analysis of irradiation hybrids in Populus
Matthew S. Zinkgraf; K. Haiby; M.C. Lieberman; L. Comai; I.M. Henry; Andrew Groover
2016-01-01
Establishing efficient functional genomic systems for creating and characterizing genetic variation in forest trees is challenging. Here we describe protocols for creating novel gene-dosage variation in Populus through gamma-irradiation of pollen, followed by genomic analysis to identify chromosomal regions that have been deleted or inserted in...
Liao, Hsiao-Mei; Niu, Dau-Ming; Chen, Yan-Jang; Fang, Jye-Siung; Chen, Shih-Jen; Chen, Chia-Hsiang
2011-01-01
Nance-Horan syndrome (NHS) is a rare X-linked disorder characterized by congenital cataracts, dental anomalies and mental retardation. The disease has been linked to a novel gene termed NHS located at Xp22.13. The majority of pathogenic mutations of the disease include nonsense mutations and small deletions and insertions that lead to truncation of the NHS protein. In this study, we identified a microdeletion of ∼ 0.92 Mb at Xp22.13 detected by array-based comparative genomic hybridization in two brothers presenting congenital cataract, dental anomalies, facial dysmorphisms and mental retardation. The deleted region encompasses the REPS2, NHS, SCML1 and RAI2 genes, and was transmitted from their carrier mother who presented only mild cataract. Our findings are in line with several recent case reports to indicate that genomic rearrangement involving the NHS gene is an important genetic etiology underlying NHS.
Horizontal Transfer Can Drive a Greater Transposable Element Load in Large Populations.
Groth, Sam B; Blumenstiel, Justin P
2017-01-01
Genomes are comprised of contrasting domains of euchromatin and heterochromatin, and transposable elements (TEs) play an important role in defining these genomic regions. Therefore, understanding the forces that control TE abundance can help us understand the chromatin landscape of the genome. What determines the burden of TEs in populations? Some have proposed that drift plays a determining role. In small populations, mildly deleterious TE insertion alleles are allowed to fix, leading to increased copy number. However, it is not clear how the rate of exposure to new TE families, via horizontal transfer (HT), can contribute to broader patterns of genomic TE abundance. Here, using simulation and analytical approaches, we show that when the effects of drift are weak, exposure rate to new TE families via HT can be an important determinant of genomic copy number. If population exposure rate is proportional to population size, larger populations are expected to have a higher rate of exposure to rare HT events. This leads to the counterintuitive prediction that larger populations may carry a higher TE load. We also find that increased rates of recombination can lead to greater probabilities of TE establishment. This work has implications for our understanding of the evolution of chromatin landscapes, genome defense by RNA silencing, and recombination rates. © The American Genetic Association 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Genetic resources offer efficient tools for rice functional genomics research.
Lo, Shuen-Fang; Fan, Ming-Jen; Hsing, Yue-Ie; Chen, Liang-Jwu; Chen, Shu; Wen, Ien-Chie; Liu, Yi-Lun; Chen, Ku-Ting; Jiang, Mirng-Jier; Lin, Ming-Kuang; Rao, Meng-Yen; Yu, Lin-Chih; Ho, Tuan-Hua David; Yu, Su-May
2016-05-01
Rice is an important crop and major model plant for monocot functional genomics studies. With the establishment of various genetic resources for rice genomics, the next challenge is to systematically assign functions to predicted genes in the rice genome. Compared with the robustness of genome sequencing and bioinformatics techniques, progress in understanding the function of rice genes has lagged, hampering the utilization of rice genes for cereal crop improvement. The use of transfer DNA (T-DNA) insertional mutagenesis offers the advantage of uniform distribution throughout the rice genome, but preferentially in gene-rich regions, resulting in direct gene knockout or activation of genes within 20-30 kb up- and downstream of the T-DNA insertion site and high gene tagging efficiency. Here, we summarize the recent progress in functional genomics using the T-DNA-tagged rice mutant population. We also discuss important features of T-DNA activation- and knockout-tagging and promoter-trapping of the rice genome in relation to mutant and candidate gene characterizations and how to more efficiently utilize rice mutant populations and datasets for high-throughput functional genomics and phenomics studies by forward and reverse genetics approaches. These studies may facilitate the translation of rice functional genomics research to improvements of rice and other cereal crops. © 2015 John Wiley & Sons Ltd.
ISEScan: automated identification of insertion sequence elements in prokaryotic genomes.
Xie, Zhiqun; Tang, Haixu
2017-11-01
The insertion sequence (IS) elements are the smallest but most abundant autonomous transposable elements in prokaryotic genomes, which play a key role in prokaryotic genome organization and evolution. With the fast growing genomic data, it is becoming increasingly critical for biology researchers to be able to accurately and automatically annotate ISs in prokaryotic genome sequences. The available automatic IS annotation systems are either providing only incomplete IS annotation or relying on the availability of existing genome annotations. Here, we present a new IS elements annotation pipeline to address these issues. ISEScan is a highly sensitive software pipeline based on profile hidden Markov models constructed from manually curated IS elements. ISEScan performs better than existing IS annotation systems when tested on prokaryotic genomes with curated annotations of IS elements. Applying it to 2784 prokaryotic genomes, we report the global distribution of IS families across taxonomic clades in Archaea and Bacteria. ISEScan is implemented in Python and released as an open source software at https://github.com/xiezhq/ISEScan. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.
Identification of genomic indels and structural variations using split reads
2011-01-01
Background: Recent studies have demonstrated the genetic significance of insertions, deletions, and other more complex structural variants (SVs) in the human population. With the development of the next-generation sequencing technologies, high-throughput surveys of SVs on the whole-genome level have become possible. Here we present split-read identification, calibrated (SRiC), a sequence-based method for SV detection. Results: We start by mapping each read to the reference genome in standard fashion using gapped alignment. Then to identify SVs, we score each of the many initial mappings with an assessment strategy designed to take into account both sequencing and alignment errors (e.g. scoring more highly events gapped in the center of a read). All current SV calling methods have multilevel biases in their identifications due to both experimental and computational limitations (e.g. calling more deletions than insertions). A key aspect of our approach is that we calibrate all our calls against synthetic data sets generated from simulations of high-throughput sequencing (with realistic error models). This allows us to calculate sensitivity and the positive predictive value under different parameter-value scenarios and for different classes of events (e.g. long deletions vs. short insertions). We run our calculations on representative data from the 1000 Genomes Project. Coupling the observed numbers of events on chromosome 1 with the calibrations gleaned from the simulations (for different length events) allows us to construct a relatively unbiased estimate for the total number of SVs in the human genome across a wide range of length scales. We estimate in particular that an individual genome contains ~670,000 indels/SVs.
Conclusions: Compared with the existing read-depth and read-pair approaches for SV identification, our method can pinpoint the exact breakpoints of SV events, reveal the actual sequence content of insertions, and cover the whole size spectrum for deletions. Moreover, with the advent of the third-generation sequencing technologies that produce longer reads, we expect our method to be even more useful. PMID:21787423
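The two ideas in this abstract (extracting gapped events from an alignment and favouring gaps near the read centre, then calibrating calls against simulated truth) can be illustrated with a minimal Python sketch. This is not the SRiC implementation: the function names, the dictionary layout and the linear centrality score are assumptions made for illustration, and only the standard CIGAR notation and the usual definitions of sensitivity and positive predictive value are relied on.

```python
import re

# Standard CIGAR operations: length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def gapped_events(cigar, ref_start):
    """Return one record per insertion or deletion found in a CIGAR string."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Operations that consume read bases define the read length.
    read_len = sum(n for n, op in ops if op in "MIS=X")
    events = []
    ref_pos, read_pos = ref_start, 0
    for n, op in ops:
        if op in "ID":
            events.append({
                "kind": "insertion" if op == "I" else "deletion",
                "ref_pos": ref_pos,        # reference coordinate of the gap
                "length": n,
                "read_offset": read_pos,   # read bases to the left of the gap
                "read_len": read_len,
            })
        if op in "M=XDN":
            ref_pos += n                   # consumes reference
        if op in "M=XIS":
            read_pos += n                  # consumes read
    return events


def centrality(event):
    """Hypothetical score: 1.0 for a gap mid-read, 0.0 at either read end."""
    frac = event["read_offset"] / event["read_len"]
    return 1.0 - abs(2.0 * frac - 1.0)


def sensitivity_ppv(called, truth):
    """Compare called events with simulated truth (both given as sets)."""
    tp = len(called & truth)
    sensitivity = tp / len(truth) if truth else 0.0
    ppv = tp / len(called) if called else 0.0
    return sensitivity, ppv


if __name__ == "__main__":
    for ev in gapped_events("50M10D50M", ref_start=1000):
        print(ev["kind"], ev["ref_pos"], ev["length"], centrality(ev))
```

Repeating `sensitivity_ppv` per event class (for example, long deletions versus short insertions) on simulated reads would give the kind of class-specific calibration table the abstract describes.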
Bacterial Artificial Chromosome Libraries for Mouse Sequencing and Functional Analysis
Osoegawa, Kazutoyo; Tateno, Minako; Woon, Peng Yeong; Frengen, Eirik; Mammoser, Aaron G.; Catanese, Joseph J.; Hayashizaki, Yoshihide; de Jong, Pieter J.
2000-01-01
Bacterial artificial chromosome (BAC) and P1-derived artificial chromosome (PAC) libraries providing a combined 33-fold representation of the murine genome have been constructed using two different restriction enzymes for genomic digestion. A large-insert PAC library was prepared from the 129S6/SvEvTac strain in a bacterial/mammalian shuttle vector to facilitate functional gene studies. For genome mapping and sequencing, we prepared BAC libraries from the 129S6/SvEvTac and the C57BL/6J strains. The average insert sizes for the three libraries range between 130 kb and 200 kb. Based on the numbers of clones and the observed average insert sizes, we estimate each library to have slightly in excess of 10-fold genome representation. The average number of clones found after hybridization screening with 28 probes was in the range of 9–14 clones per marker. To explore the fidelity of the genomic representation in the three libraries, we analyzed three contigs, each established after screening with a single unique marker. New markers were established from the end sequences and screened against all the contig members to determine if any of the BACs and PACs are chimeric or rearranged. Only one chimeric clone and six potential deletions have been observed after extensive analysis of 113 PAC and BAC clones. Seventy-one of the 113 clones were conclusively nonchimeric because both end markers or sequences were mapped to the other confirmed contig members. We could not exclude chimerism for the remaining 41 clones because one or both of the insert termini did not contain unique sequence to design markers. The low rate of chimerism, ∼1%, and the low level of detected rearrangements support the anticipated usefulness of the BAC libraries for genome research. [The sequence data described in this paper have been submitted to the GenBank data library under accession numbers AQ797173–AQ797398.] PMID:10645956
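The representation estimate mentioned above follows from simple arithmetic: fold coverage is the number of clones times the mean insert size, divided by the genome size. The Python sketch below works through that relation. The genome size of 2.7 Gb and the clone count are assumptions chosen for illustration and are not figures from the abstract; only the 1-in-113 chimerism count is taken from it.

```python
import math


def fold_coverage(n_clones, mean_insert_bp, genome_bp):
    """Expected genome representation of a clone library."""
    return n_clones * mean_insert_bp / genome_bp


def clones_for_coverage(target_fold, mean_insert_bp, genome_bp):
    """Smallest number of clones giving at least the target representation."""
    return math.ceil(target_fold * genome_bp / mean_insert_bp)


if __name__ == "__main__":
    GENOME_BP = 2_700_000_000   # assumed mouse genome size, not from the abstract
    INSERT_BP = 150_000         # hypothetical value within the reported 130-200 kb range

    n = clones_for_coverage(10, INSERT_BP, GENOME_BP)
    print("clones needed for 10-fold:", n)
    print("coverage check:", fold_coverage(n, INSERT_BP, GENOME_BP))

    # Chimerism rate from the counts given in the abstract: 1 of 113 clones.
    print("chimerism rate (%):", round(100 * 1 / 113, 1))
```

Under this simple model a unique probe is expected to hit about as many clones as the fold coverage, which fits the reported 9–14 clones per marker for libraries of slightly more than 10-fold representation.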
Evolutionary transgenomics: prospects and challenges.
Correa, Raul; Baum, David A
2015-01-01
Many advances in our understanding of the genetic basis of species differences have arisen from transformation experiments, which allow us to study the effect of genes from one species (the donor) when placed in the genetic background of another species (the recipient). Such interspecies transformation experiments are usually focused on candidate genes - genes that, based on work in model systems, are suspected to be responsible for certain phenotypic differences between the donor and recipient species. We suggest that the high efficiency of transformation in a few plant species, most notably Arabidopsis thaliana, combined with the small size of typical plant genes and their cis-regulatory regions allow implementation of a screening strategy that does not depend upon a priori candidate gene identification. This approach, transgenomics, entails moving many large genomic inserts of a donor species into the wild type background of a recipient species and then screening for dominant phenotypic effects. As a proof of concept, we recently conducted a transgenomic screen that analyzed more than 1100 random, large genomic inserts of the Alabama gladecress Leavenworthia alabamica for dominant phenotypic effects in the A. thaliana background. This screen identified one insert that shortens fruit and decreases A. thaliana fertility. In this paper we discuss the principles of transgenomic screens and suggest methods to help minimize the frequencies of false positive and false negative results. We argue that, because transgenomics avoids committing in advance to candidate genes it has the potential to help us identify truly novel genes or cryptic functions of known genes. Given the valuable knowledge that is likely to be gained, we believe the time is ripe for the plant evolutionary community to invest in transgenomic screens, at least in the mustard family Brassicaceae where many species are amenable to efficient transformation.
Evolutionary genomics: transdomain gene transfers.
Bordenstein, Seth R
2007-11-06
Biologists have until now conceded that bacterial gene transfer to multicellular animals is relatively uncommon in Nature. A new study showing promiscuous insertions of bacterial endosymbiont genes into invertebrate genomes ushers in a shift in this paradigm.
Park, Doori; Park, Su-Hyun; Ban, Yong Wook; Kim, Youn Shic; Park, Kyoung-Cheul; Kim, Nam-Soo; Kim, Ju-Kon; Choi, Ik-Young
2017-08-15
Genetically modified crops (GM crops) have been developed to improve the agricultural traits of modern crop cultivars. Safety assessments of GM crops are of paramount importance in research at developmental stages and before releasing transgenic plants into the marketplace. Sequencing technology is developing rapidly, with higher output and labor efficiencies, and will eventually replace existing methods for the molecular characterization of genetically modified organisms. To detect the transgenic insertion locations in three GM rice genomes, Illumina sequencing reads were mapped to the rice genome and to the plasmid sequence and classified accordingly. Reads that mapped to both were then used to characterize the junction sites between plant and transgene sequence by sequence alignment. Herein, we present a next generation sequencing (NGS)-based molecular characterization method, using transgenic rice plants SNU-Bt9-5, SNU-Bt9-30, and SNU-Bt9-109. Specifically, using bioinformatics tools, we detected the precise insertion locations and copy numbers of transfer DNA, genetic rearrangements, and the absence of backbone sequences, which were equivalent to results obtained from Southern blot analyses. NGS methods have been suggested as an effective means of characterizing and detecting transgenic insertion locations in genomes. Our results demonstrate the use of a combination of NGS technology and bioinformatics approaches that offers cost- and time-effective methods for assessing the safety of transgenic plants.
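The read-classification step described in this abstract can be sketched in Python as follows. This is a simplified illustration, not the authors' pipeline: alignments are represented as hypothetical `(read_id, reference_name, position)` tuples rather than parsed from real alignment files, and the reference name `"plasmid"` and the 100-bp clustering window are assumptions.

```python
from collections import defaultdict


def classify_reads(alignments, plasmid_name="plasmid"):
    """Label each read as host_only, transgene_only or junction."""
    kinds_by_read = defaultdict(set)
    for read_id, ref, _pos in alignments:
        kinds_by_read[read_id].add("transgene" if ref == plasmid_name else "host")
    classes = {}
    for read_id, kinds in kinds_by_read.items():
        if kinds == {"host", "transgene"}:
            classes[read_id] = "junction"       # read spans plant and transgene
        elif kinds == {"transgene"}:
            classes[read_id] = "transgene_only"
        else:
            classes[read_id] = "host_only"
    return classes


def junction_sites(alignments, classes, plasmid_name="plasmid", window=100):
    """Cluster host-side positions of junction reads into candidate sites.

    Returns a list of (chromosome, start, end, supporting_alignments).
    """
    hits = sorted(
        (ref, pos)
        for read_id, ref, pos in alignments
        if classes[read_id] == "junction" and ref != plasmid_name
    )
    sites = []
    for chrom, pos in hits:
        if sites and sites[-1][0] == chrom and pos - sites[-1][2] <= window:
            c, start, _end, n = sites[-1]
            sites[-1] = (c, start, pos, n + 1)  # extend the current cluster
        else:
            sites.append((chrom, pos, pos, 1))  # open a new cluster
    return sites


if __name__ == "__main__":
    demo = [
        ("r1", "chr3", 1000), ("r1", "plasmid", 50),
        ("r2", "chr3", 1040), ("r2", "plasmid", 60),
        ("r3", "chr3", 5000),
        ("r4", "plasmid", 300),
    ]
    labels = classify_reads(demo)
    print(labels)
    print(junction_sites(demo, labels))
```

In practice the candidate sites would then be refined by aligning the junction reads themselves to find the exact breakpoints, and the number of independent sites gives a first estimate of insert copy number.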
Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana.
Lin, X; Kaul, S; Rounsley, S; Shea, T P; Benito, M I; Town, C D; Fujii, C Y; Mason, T; Bowman, C L; Barnstead, M; Feldblyum, T V; Buell, C R; Ketchum, K A; Lee, J; Ronning, C M; Koo, H L; Moffat, K S; Cronin, L A; Shen, M; Pai, G; Van Aken, S; Umayam, L; Tallon, L J; Gill, J E; Adams, M D; Carrera, A J; Creasy, T H; Goodman, H M; Somerville, C R; Copenhaver, G P; Preuss, D; Nierman, W C; White, O; Eisen, J A; Salzberg, S L; Fraser, C M; Venter, J C
1999-12-16
Arabidopsis thaliana (Arabidopsis) is unique among plant model organisms in having a small genome (130-140 Mb), excellent physical and genetic maps, and little repetitive DNA. Here we report the sequence of chromosome 2 from the Columbia ecotype in two gap-free assemblies (contigs) of 3.6 and 16 megabases (Mb). The latter represents the longest published stretch of uninterrupted DNA sequence assembled from any organism to date. Chromosome 2 represents 15% of the genome and encodes 4,037 genes, 49% of which have no predicted function. Roughly 250 tandem gene duplications were found in addition to large-scale duplications of about 0.5 and 4.5 Mb between chromosomes 2 and 1 and between chromosomes 2 and 4, respectively. Sequencing of nearly 2 Mb within the genetically defined centromere revealed a low density of recognizable genes, and a high density and diverse range of vestigial and presumably inactive mobile elements. More unexpected is what appears to be a recent insertion of a continuous stretch of 75% of the mitochondrial genome into chromosome 2.
Transposable element distribution, abundance and role in genome size variation in the genus Oryza.
Zuccolo, Andrea; Sebastian, Aswathy; Talag, Jayson; Yu, Yeisoo; Kim, HyeRan; Collura, Kristi; Kudrna, Dave; Wing, Rod A
2007-08-29
The genus Oryza is composed of 10 distinct genome types, 6 diploid and 4 polyploid, and includes the world's most important food crop - rice (Oryza sativa [AA]). Genome size variation in the Oryza is more than 3-fold and ranges from 357 Mbp in Oryza glaberrima [AA] to 1283 Mbp in the polyploid Oryza ridleyi [HHJJ]. Because repetitive elements are known to play a significant role in genome size variation, we constructed random sheared small insert genomic libraries from 12 representative Oryza species and conducted a comprehensive study of the repetitive element composition, distribution and phylogeny in this genus. Particular attention was paid to the role played by the most important classes of transposable elements (Long Terminal Repeat retrotransposons, Long Interspersed Nuclear Elements, helitrons, DNA transposable elements) in shaping these genomes and in contributing to genome size variation. We identified the elements primarily responsible for the most striking genome size variation in Oryza. We demonstrated how Long Terminal Repeat retrotransposons belonging to the same families have proliferated to very different extents in various species. We also showed that the pool of Long Terminal Repeat retrotransposons is substantially conserved and ubiquitous throughout the Oryza, and so its origin is ancient and its existence predates the speciation events that originated the genus. Finally, we described the peculiar behavior of repeats in the species Oryza coarctata [HHKK], whose placement in the Oryza genus is controversial. Long Terminal Repeat retrotransposons are the major component of the Oryza genomes analyzed and, along with polyploidization, are the most important contributors to the genome size variation across the Oryza genus. Two families of Ty3-gypsy elements (RIRE2 and Atlantys) account for a significant portion of the genome size variations present in the Oryza genus.
An expanding universe of the non-coding genome in cancer biology.
Xue, Bin; He, Lin
2014-06-01
Neoplastic transformation is caused by accumulation of genetic and epigenetic alterations that ultimately convert normal cells into tumor cells with uncontrolled proliferation and survival, unlimited replicative potential and invasive growth [Hanahan,D. et al. (2011) Hallmarks of cancer: the next generation. Cell, 144, 646-674]. Although the majority of the cancer studies have focused on the functions of protein-coding genes, emerging evidence has started to reveal the importance of the vast non-coding genome, which constitutes more than 98% of the human genome. A number of non-coding RNAs (ncRNAs) derived from the 'dark matter' of the human genome exhibit cancer-specific differential expression and/or genomic alterations, and it is increasingly clear that ncRNAs, including small ncRNAs and long ncRNAs (lncRNAs), play an important role in cancer development by regulating protein-coding gene expression through diverse mechanisms. In addition to ncRNAs, nearly half of the mammalian genomes consist of transposable elements, particularly retrotransposons. Once depicted as selfish genomic parasites that propagate at the expense of host fitness, retrotransposon elements could also confer regulatory complexity to the host genomes during development and disease. Reactivation of retrotransposons in cancer, while capable of causing insertional mutagenesis and genome rearrangements to promote oncogenesis, could also alter host gene expression networks to favor tumor development. Taken together, the functional significance of non-coding genome in tumorigenesis has been previously underestimated, and diverse transcripts derived from the non-coding genome could act as integral functional components of the oncogene and tumor suppressor network. © The Author 2014. Published by Oxford University Press.
L1-Associated Genomic Regions are Deleted in Somatic Cells of the Healthy Human Brain
Erwin, Jennifer A.; Paquola, Apuã C.M.; Singer, Tatjana; Gallina, Iryna; Novotny, Mark; Quayle, Carolina; Bedrosian, Tracy; Ivanio, Francisco; Butcher, Cheyenne R.; Herdy, Joseph R.; Sarkar, Anindita; Lasken, Roger S.; Muotri, Alysson R.; Gage, Fred H.
2016-01-01
The healthy human brain is a mosaic of varied genomes. L1 retrotransposition is known to create mosaicism by inserting L1 sequences into new locations of somatic cell genomes. Using a machine learning-based, single-cell sequencing approach, we discovered that Somatic L1-Associated Variants (SLAVs) are actually composed of two classes: L1 retrotransposition insertions and retrotransposition-independent L1-associated variants. We demonstrate that a subset of SLAVs are, in fact, somatic deletions generated by L1 endonuclease cutting activity. Retrotransposition-independent rearrangements within inherited L1s resulted in the deletion of proximal genomic regions. These rearrangements were resolved by microhomology-mediated repair, which suggests that L1-associated genomic regions are hotspots for somatic copy number variants in the brain and therefore a heritable genetic contributor to somatic mosaicism. We demonstrate that SLAVs are present in crucial neural genes, such as DLG2/PSD93, and affect 44–63% of the cells in the healthy brain. PMID:27618310
Liu, Wanlu; Duttke, Sascha H; Hetzel, Jonathan; Groth, Martin; Feng, Suhua; Gallego-Bartolome, Javier; Zhong, Zhenhui; Kuo, Hsuan Yu; Wang, Zonghua; Zhai, Jixian; Chory, Joanne; Jacobsen, Steven E
2018-03-01
Small RNAs regulate chromatin modifications such as DNA methylation and gene silencing across eukaryotic genomes. In plants, RNA-directed DNA methylation (RdDM) requires 24-nucleotide small interfering RNAs (siRNAs) that bind to ARGONAUTE 4 (AGO4) and target genomic regions for silencing. RdDM also requires non-coding RNAs transcribed by RNA polymerase V (Pol V) that probably serve as scaffolds for binding of AGO4-siRNA complexes. Here, we used a modified global nuclear run-on protocol followed by deep sequencing to capture Pol V nascent transcripts genome-wide. We uncovered unique characteristics of Pol V RNAs, including a uracil (U) common at position 10. This uracil was complementary to the 5' adenine found in many AGO4-bound 24-nucleotide siRNAs and was eliminated in a siRNA-deficient mutant as well as in the ago4/6/9 triple mutant, suggesting that the +10 U signature is due to siRNA-mediated co-transcriptional slicing of Pol V transcripts. Expression of wild-type AGO4 in ago4/6/9 mutants was able to restore slicing of Pol V transcripts, but a catalytically inactive AGO4 mutant did not correct the slicing defect. We also found that Pol V transcript slicing required SUPPRESSOR OF TY INSERTION 5-LIKE (SPT5L), an elongation factor whose function is not well understood. These results highlight the importance of Pol V transcript slicing in RNA-mediated transcriptional gene silencing, which is a conserved process in many eukaryotes.
A novel helper phage enabling construction of genome-scale ORF-enriched phage display libraries.
Gupta, Amita; Shrivastava, Nimisha; Grover, Payal; Singh, Ajay; Mathur, Kapil; Verma, Vaishali; Kaur, Charanpreet; Chaudhary, Vijay K
2013-01-01
Phagemid-based expression of cloned genes fused to the gIIIP coding sequence and rescue using helper phages, such as VCSM13, has been used extensively for constructing large antibody phage display libraries. However, for randomly primed cDNA and gene fragment libraries, this system encounters reading frame problems wherein only one of 18 phages displays the translated foreign peptide/protein fused to phagemid-encoded gIIIP. The elimination of phages carrying out-of-frame inserts is vital in order to improve the quality of phage display libraries. In this study, we designed a novel helper phage, AGM13, which carries trypsin-sensitive sites within the linker regions of gIIIP. This renders the phage highly sensitive to trypsin digestion, which abolishes its infectivity. For open reading frame (ORF) selection, the phagemid-borne phages are rescued using AGM13, so that clones with in-frame inserts express fusion proteins with phagemid-encoded trypsin-resistant gIIIP, which becomes incorporated into the phages along with a few copies of AGM13-encoded trypsin-sensitive gIIIP. In contrast, clones with out-of-frame inserts produce phages carrying only AGM13-encoded trypsin-sensitive gIIIP. Trypsin treatment of the phage population renders the phages with out-of-frame inserts non-infectious, whereas phages carrying in-frame inserts remain fully infectious and can hence be enriched by infection. This strategy was applied efficiently at a genome scale to generate an ORF-enriched whole genome fragment library from Mycobacterium tuberculosis, in which nearly 100% of the clones carried in-frame inserts after selection. The ORF-enriched libraries were successfully used for identification of linear and conformational epitopes for monoclonal antibodies specific to mycobacterial proteins.
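The abstract does not derive the 1-in-18 figure. A common way to arrive at it, assumed here, is that a random fragment must be cloned in the correct orientation (1 of 2), start in frame with the upstream sequence (1 of 3) and end in frame with the downstream gIIIP (1 of 3). The short Python sketch below enumerates those combinations; the labels and function name are illustrative.

```python
from fractions import Fraction
from itertools import product


def in_frame_fraction():
    """Fraction of random inserts that are displayable under the assumed model."""
    orientations = ("forward", "reverse")
    upstream_frames = range(3)     # frame offset at the 5' junction
    downstream_frames = range(3)   # frame offset at the 3' junction with gIIIP
    combos = list(product(orientations, upstream_frames, downstream_frames))
    # Only a forward insert that is in frame at both junctions is displayed.
    displayable = [c for c in combos if c == ("forward", 0, 0)]
    return Fraction(len(displayable), len(combos))


if __name__ == "__main__":
    print(in_frame_fraction())  # 1/18
```

The same counting shows why selection matters: without enrichment, 17 of every 18 clones in such a library would carry inserts that cannot be displayed.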
Cybermaterials: materials by design and accelerated insertion of materials
NASA Astrophysics Data System (ADS)
Xiong, Wei; Olson, Gregory B.
2016-02-01
Cybermaterials innovation entails an integration of Materials by Design and accelerated insertion of materials (AIM), which transfers studio ideation into industrial manufacturing. By assembling a hierarchical architecture of integrated computational materials design (ICMD) based on materials genomic fundamental databases, the ICMD mechanistic design models accelerate innovation. We here review progress in the development of linkage models of the process-structure-property-performance paradigm, as well as related design accelerating tools. Extending the materials development capability based on phase-level structural control requires more fundamental investment at the level of the Materials Genome, with focus on improving applicable parametric design models and constructing high-quality databases. Future opportunities in materials genomic research serving both Materials by Design and AIM are addressed.
Insertional mutagenesis in Populus: relevance and feasibility
Victor Busov; Matthias Fladung; Andrew Groover; Steven Strauss
2005-01-01
The recent sequencing of the first tree genome, that of the black cottonwood (Populus trichocarpa), opens a new chapter in tree functional genomics. While the completion of the genome is a milestone, mobilizing this significant resource for better understanding the growth and development of woody perennials will be an even greater undertaking in the...
Vrljicak, Pavle; Tao, Shijie; Varshney, Gaurav K; Quach, Helen Ngoc Bao; Joshi, Adita; LaFave, Matthew C; Burgess, Shawn M; Sampath, Karuna
2016-04-07
DNA transposons and retroviruses are important transgenic tools for genome engineering. An important consideration affecting the choice of transgenic vector is their insertion site preferences. Previous large-scale analyses of Ds transposon integration sites in plants were done on the basis of reporter gene expression or germ-line transmission, making it difficult to discern vertebrate integration preferences. Here, we compare over 1300 Ds transposon integration sites in zebrafish with Tol2 transposon and retroviral integration sites. Genome-wide analysis shows that Ds integration sites in the presence or absence of marker selection are remarkably similar and distributed throughout the genome. No strict motif was found, but a preference for structural features in the target DNA associated with DNA flexibility (Twist, Tilt, Rise, Roll, Shift, and Slide) was observed. Remarkably, this feature is also found in transposon and retroviral integrations in maize and mouse cells. Our findings show that structural features influence the integration of heterologous DNA in genomes, and have implications for targeted genome engineering. Copyright © 2016 Vrljicak et al.
The B73 maize genome: complexity, diversity, and dynamics.
Schnable, Patrick S; Ware, Doreen; Fulton, Robert S; Stein, Joshua C; Wei, Fusheng; Pasternak, Shiran; Liang, Chengzhi; Zhang, Jianwei; Fulton, Lucinda; Graves, Tina A; Minx, Patrick; Reily, Amy Denise; Courtney, Laura; Kruchowski, Scott S; Tomlinson, Chad; Strong, Cindy; Delehaunty, Kim; Fronick, Catrina; Courtney, Bill; Rock, Susan M; Belter, Eddie; Du, Feiyu; Kim, Kyung; Abbott, Rachel M; Cotton, Marc; Levy, Andy; Marchetto, Pamela; Ochoa, Kerri; Jackson, Stephanie M; Gillam, Barbara; Chen, Weizu; Yan, Le; Higginbotham, Jamey; Cardenas, Marco; Waligorski, Jason; Applebaum, Elizabeth; Phelps, Lindsey; Falcone, Jason; Kanchi, Krishna; Thane, Thynn; Scimone, Adam; Thane, Nay; Henke, Jessica; Wang, Tom; Ruppert, Jessica; Shah, Neha; Rotter, Kelsi; Hodges, Jennifer; Ingenthron, Elizabeth; Cordes, Matt; Kohlberg, Sara; Sgro, Jennifer; Delgado, Brandon; Mead, Kelly; Chinwalla, Asif; Leonard, Shawn; Crouse, Kevin; Collura, Kristi; Kudrna, Dave; Currie, Jennifer; He, Ruifeng; Angelova, Angelina; Rajasekar, Shanmugam; Mueller, Teri; Lomeli, Rene; Scara, Gabriel; Ko, Ara; Delaney, Krista; Wissotski, Marina; Lopez, Georgina; Campos, David; Braidotti, Michele; Ashley, Elizabeth; Golser, Wolfgang; Kim, HyeRan; Lee, Seunghee; Lin, Jinke; Dujmic, Zeljko; Kim, Woojin; Talag, Jayson; Zuccolo, Andrea; Fan, Chuanzhu; Sebastian, Aswathy; Kramer, Melissa; Spiegel, Lori; Nascimento, Lidia; Zutavern, Theresa; Miller, Beth; Ambroise, Claude; Muller, Stephanie; Spooner, Will; Narechania, Apurva; Ren, Liya; Wei, Sharon; Kumari, Sunita; Faga, Ben; Levy, Michael J; McMahan, Linda; Van Buren, Peter; Vaughn, Matthew W; Ying, Kai; Yeh, Cheng-Ting; Emrich, Scott J; Jia, Yi; Kalyanaraman, Ananth; Hsia, An-Ping; Barbazuk, W Brad; Baucom, Regina S; Brutnell, Thomas P; Carpita, Nicholas C; Chaparro, Cristian; Chia, Jer-Ming; Deragon, Jean-Marc; Estill, James C; Fu, Yan; Jeddeloh, Jeffrey A; Han, Yujun; Lee, Hyeran; Li, Pinghua; Lisch, Damon R; Liu, Sanzhen; Liu, Zhijie; Nagel, Dawn Holligan; 
McCann, Maureen C; SanMiguel, Phillip; Myers, Alan M; Nettleton, Dan; Nguyen, John; Penning, Bryan W; Ponnala, Lalit; Schneider, Kevin L; Schwartz, David C; Sharma, Anupma; Soderlund, Carol; Springer, Nathan M; Sun, Qi; Wang, Hao; Waterman, Michael; Westerman, Richard; Wolfgruber, Thomas K; Yang, Lixing; Yu, Yeisoo; Zhang, Lifang; Zhou, Shiguo; Zhu, Qihui; Bennetzen, Jeffrey L; Dawe, R Kelly; Jiang, Jiming; Jiang, Ning; Presting, Gernot G; Wessler, Susan R; Aluru, Srinivas; Martienssen, Robert A; Clifton, Sandra W; McCombie, W Richard; Wing, Rod A; Wilson, Richard K
2009-11-20
We report an improved draft nucleotide sequence of the 2.3-gigabase genome of maize, an important crop plant and model for biological research. Over 32,000 genes were predicted, of which 99.8% were placed on reference chromosomes. Nearly 85% of the genome is composed of hundreds of families of transposable elements, dispersed nonuniformly across the genome. These were responsible for the capture and amplification of numerous gene fragments and affect the composition, sizes, and positions of centromeres. We also report on the correlation of methylation-poor regions with Mu transposon insertions and recombination, and copy number variants with insertions and/or deletions, as well as how uneven gene losses between duplicated regions were involved in returning an ancient allotetraploid to a genetically diploid state. These analyses inform and set the stage for further investigations to improve our understanding of the domestication and agricultural improvements of maize.
QuickMap: a public tool for large-scale gene therapy vector insertion site mapping and analysis.
Appelt, J-U; Giordano, F A; Ecker, M; Roeder, I; Grund, N; Hotz-Wagenblatt, A; Opelz, G; Zeller, W J; Allgayer, H; Fruehauf, S; Laufs, S
2009-07-01
Several events of insertional mutagenesis in pre-clinical and clinical gene therapy studies have created intense interest in assessing the genomic insertion profiles of gene therapy vectors. For the construction of such profiles, vector-flanking sequences detected by inverse PCR, linear amplification-mediated-PCR or ligation-mediated-PCR need to be mapped to the host cell's genome and compared to a reference set. Although remarkable progress has been achieved in mapping gene therapy vector insertion sites, public reference sets are lacking, as are the possibilities to quickly detect non-random patterns in experimental data. We developed a tool termed QuickMap, which uniformly maps and analyzes human and murine vector-flanking sequences within seconds (available at www.gtsg.org). Besides information about hits in chromosomes and fragile sites, QuickMap automatically determines insertion frequencies in +/- 250 kb adjacency to genes, cancer genes, pseudogenes, transcription factor and (post-transcriptional) miRNA binding sites, CpG islands and repetitive elements (short interspersed nuclear elements (SINE), long interspersed nuclear elements (LINE), Type II elements and LTR elements). Additionally, all experimental frequencies are compared with the data obtained from a reference set, containing 1 000 000 random integrations ('random set'). Thus, for the first time a tool allowing high-throughput profiling of gene therapy vector insertion sites is available. It provides a basis for large-scale insertion site analyses, which is now urgently needed to discover novel gene therapy vectors with 'safe' insertion profiles.
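The core comparison described here, the fraction of insertion sites lying within ±250 kb of a feature versus the same fraction in a random reference set, can be sketched in Python. This is not QuickMap's code: features are reduced to single coordinates per chromosome, and the function names, data layout and random-set generator are assumptions made for illustration.

```python
import bisect
import random


def fraction_near(insertions, features, window=250_000):
    """Fraction of insertions within `window` bp of any feature.

    insertions: list of (chrom, pos)
    features:   dict mapping chrom -> sorted list of feature positions
    """
    if not insertions:
        return 0.0
    near = 0
    for chrom, pos in insertions:
        sites = features.get(chrom, [])
        # First feature at or to the right of the window's left edge.
        i = bisect.bisect_left(sites, pos - window)
        if i < len(sites) and sites[i] <= pos + window:
            near += 1
    return near / len(insertions)


def random_insertions(chrom_sizes, n, seed=0):
    """Uniform random reference set, chromosomes weighted by length."""
    rng = random.Random(seed)
    chroms = list(chrom_sizes)
    weights = [chrom_sizes[c] for c in chroms]
    picks = rng.choices(chroms, weights=weights, k=n)
    return [(c, rng.randint(1, chrom_sizes[c])) for c in picks]


if __name__ == "__main__":
    genes = {"chr1": [1_000_000, 5_000_000]}          # hypothetical features
    observed = [("chr1", 1_200_000), ("chr1", 3_000_000)]
    background = random_insertions({"chr1": 10_000_000}, 10_000)
    print("observed:", fraction_near(observed, genes))
    print("random:  ", fraction_near(background, genes))
```

A markedly higher observed fraction than the random-set fraction would point to a non-random insertion preference for that feature class; a formal comparison would add a statistical test, which is omitted here.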
Ni, ZhouXian; Ye, YouJu; Bai, Tiandao; Xu, Meng; Xu, Li-An
2017-09-11
The chloroplast genome (CPG) of Pinus massoniana (genus Pinus, Pinaceae), a primary source of turpentine, was sequenced and analyzed in terms of gene rearrangements, ndh gene loss, and the contraction and expansion of short inverted repeats (IRs). The P. massoniana CPG has a typical quadripartite structure comprising a large single-copy (LSC) region (65,563 bp), a small single-copy (SSC) region (53,230 bp) and two IRs (IRa and IRb, 485 bp). A total of 108 unique genes were identified, including 73 protein-coding genes, 31 tRNAs, and 4 rRNAs. Most of the 81 simple sequence repeats (SSRs) identified in the CPG were mononucleotide motifs of the A/T type and were located in non-coding regions. Comparisons with related species revealed an inversion (21,556 bp) in the LSC region; the P. massoniana CPG lacks all 11 intact ndh genes (four ndh genes are lost completely; five remain as truncated pseudogenes; and the other two remain as pseudogenes because of short insertions or deletions). A pair of short IRs was found instead of large IRs, and size variations among pine species were observed, which resulted from short insertions or deletions and non-synchronized variations between "IRa" and "IRb". The results of phylogenetic analyses based on whole CPG sequences of 16 conifers indicated that whole CPG sequences can be used as a powerful tool in phylogenetic analyses.
SV40 host-substituted variants: a new look at the monkey DNA inserts and recombinant junctions.
Singer, Maxine; Winocour, Ernest
2011-04-10
The available monkey genomic data banks were examined in order to determine the chromosomal locations of the host DNA inserts in 8 host-substituted SV40 variant DNAs. Five of the 8 variants contained more than one linked monkey DNA insert per tandem repeat unit and in all cases but one, the 19 monkey DNA inserts in the 8 variants mapped to different locations in the monkey genome. The 50 parental DNAs (32 monkey and 18 SV40 DNA segments) which spanned the crossover and flanking regions that participated in monkey/monkey and monkey/SV40 recombinations were characterized by substantial levels of microhomology of up to 8 nucleotides in length; the parental DNAs also exhibited direct and inverted repeats at or adjacent to the crossover sequences. We discuss how the host-substituted SV40 variants arose and the nature of the recombination mechanisms involved.
LoRTE: Detecting transposon-induced genomic variants using low coverage PacBio long read sequences.
Disdero, Eric; Filée, Jonathan
2017-01-01
Population genomic analysis of transposable elements has greatly benefited from recent advances in sequencing technologies. However, the short length of the reads and the propensity of transposable elements to nest in highly repeated regions of genomes limit the efficiency of bioinformatic tools when Illumina or 454 technologies are used. Fortunately, long read sequencing technologies generating read lengths that may span the entire length of full transposons are now available. However, existing TE population genomic software was not designed to handle long reads, and the development of new dedicated tools is needed. LoRTE is the first tool able to use PacBio long read sequences to identify transposon deletions and insertions between a reference genome and genomes of different strains or populations. Tested against simulated and genuine Drosophila melanogaster PacBio datasets, LoRTE appears to be a reliable and broadly applicable tool to study the dynamic and evolutionary impact of transposable elements using low coverage, long read sequences. LoRTE is an efficient and accurate tool to identify structural genomic variants caused by TE insertion or deletion. LoRTE is available for download at http://www.egce.cnrs-gif.fr/?p=6422.
Luchetti, Andrea; Mantovani, Barbara
2009-12-01
Studies on transposable elements in termites are of interest because their genome is in a permanent condition of inbreeding. In this situation, an increase in transposon copy number should be mainly due to a Muller's ratchet effect, with selection against deleterious insertions playing a major role. Short INterspersed Elements (SINEs) are non-autonomous retrotransposons, known to be stable components of eukaryotic genomes. The SINE Talua, first isolated from Reticulitermes lucifugus (Rhinotermitidae), is the only mobile element described so far in termites. In the present survey, Talua has been found to be widespread in the order Isoptera. In comparison with other non-termite SINEs, Talua diversity and distribution in the Reticulitermes genome demonstrate that Talua is an ancient component of the termite genome and that it is significantly associated with other repeats. In particular, the element is found to be associated with microsatellite motifs, either as their generator or because it inserted near them. Further, two new SINEs and a putative retrotranscriptase-like sequence were found linked to Talua. Talua's genomic distribution is discussed in the light of the available models of transposable element dynamics within inbred genomes, also taking into account the role of SINEs as drivers of genetic diversity in counteracting inbreeding depression.
Evidence for horizontal transfer of mitochondrial DNA to the plastid genome in a bamboo genus.
Ma, Peng-Fei; Zhang, Yu-Xiao; Guo, Zhen-Hua; Li, De-Zhu
2015-06-23
In flowering plants, three genomes (nuclear, mitochondrial, and plastid) coexist and intracellular horizontal transfer of DNA is prevalent, especially from the plastid to the mitochondrion genome. However, the plastid genomes are generally conserved in evolution and have long been considered immune to foreign DNA. Recently, the opposite direction of DNA transfer from the mitochondrial to the plastid genome has been reported in two eudicot lineages. Here we sequenced 6 plastid genomes of bamboos, three of which are neotropical woody species and three are herbaceous ones. Several unusual features were found, including the duplication of trnT-GGU and loss of one copy of rps19 due to contraction of inverted repeats (IRs). The most intriguing was the ~2.7 kb insertion in the plastid IR regions in the three herbaceous bamboos. Furthermore, the insertion was documented to be horizontally transferred from the mitochondrial to the plastid genome. Our study provided evidence of the mitochondrial-to-plastid DNA transfer in the monocots, demonstrating again that this rare event does occur in other angiosperm lineages. However, the mechanism underlying the transfer remains obscure, and more studies in other plants may elucidate it in the future.
Gene replacements and insertions in rice by intron targeting using CRISPR-Cas9.
Li, Jun; Meng, Xiangbing; Zong, Yuan; Chen, Kunling; Zhang, Huawei; Liu, Jinxing; Li, Jiayang; Gao, Caixia
2016-09-12
Sequence-specific nucleases have been exploited to create targeted gene knockouts in various plants, but replacing a fragment and even obtaining gene insertions at specific loci in plant genomes remain a serious challenge. Here, we report efficient intron-mediated site-specific gene replacement and insertion approaches that generate mutations through the non-homologous end joining (NHEJ) pathway using the clustered regularly interspaced short palindromic repeats (CRISPR)-CRISPR-associated protein 9 (Cas9) system. Using a pair of single guide RNAs (sgRNAs) targeting adjacent introns and a donor DNA template including the same pair of sgRNA sites, we achieved gene replacements in the rice endogenous gene 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) at a frequency of 2.0%. We also obtained targeted gene insertions at a frequency of 2.2% using a sgRNA targeting one intron and a donor DNA template including the same sgRNA site. Rice plants harbouring the OsEPSPS gene with the intended substitutions were glyphosate-resistant. Furthermore, the site-specific gene replacements and insertions were faithfully transmitted to the next generation. These newly developed approaches can be generally used to replace targeted gene fragments and to insert exogenous DNA sequences into specific genomic sites in rice and other plants.
Comparative structural analysis of Bru1 region homeologs in Saccharum spontaneum and S. officinarum
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Jisen; Sharma, Anupma; Yu, Qingyi
2016-06-10
Sugarcane is a major sugar and biofuel crop, but genomic research and molecular breeding have lagged behind other major crops due to the complexity of auto-allopolyploid genomes. Sugarcane cultivars are frequently aneuploid with chromosome number ranging from 100 to 130, consisting of 70-80% S. officinarum, 10-20% S. spontaneum, and 10% recombinants between these two species. Analysis of a genomic region in the progenitor autoploid genomes of sugarcane hybrid cultivars will reveal the nature and divergence of homologous chromosomes. To investigate the origin and evolution of haplotypes in the Bru1 genomic regions in sugarcane cultivars, we identified two BAC clones from S. spontaneum and four from S. officinarum and compared them to seven haplotype sequences from sugarcane hybrid R570. The results clarified the origin of the seven homologous haplotypes in R570: four haplotypes originated from S. officinarum, two from S. spontaneum, and one was recombinant. Sequence divergence among the homologous haplotypes, arising from retrotransposon insertions and sequence variations, ranged from 18.2% to 60.5%, with an average of 33.7%. Gene content and gene structure were relatively well conserved among the homologous haplotypes. Exon splitting occurred in haplotypes of the hybrid genome but not in its progenitor genomes. Tajima's D analysis revealed that S. spontaneum haplotypes in the Bru1 genomic regions were under strong directional selection. Numerous inversions, deletions, insertions and translocations were found between haplotypes within each genome. In conclusion, this is the first comparison among haplotypes of a modern sugarcane hybrid and its two progenitors. Tajima's D results emphasized the crucial role of this fungal disease resistance gene for enhancing the fitness of this species and indicated that the brown rust resistance gene in R570 is from S. spontaneum.
Species-specific InDels, sequence similarity and phylogenetic analysis of homologous genes can be used for identifying the origin of S. spontaneum and S. officinarum haplotypes in Saccharum hybrids. Comparison of exon splitting among the homologous haplotypes suggested that the genome rearrangements in Saccharum hybrids S. officinarum would be sufficient for proper genome assembly of this autopolyploid genome. Retrotransposon insertions and sequence variations among the homologous haplotypes may allow sequencing and assembling the autopolyploid Saccharum genomes and the auto-allopolyploid hybrid genomes using whole genome shotgun sequencing.
Saeed, Isaam; Wong, Stephen Q.; Mar, Victoria; Goode, David L.; Caramia, Franco; Doig, Ken; Ryland, Georgina L.; Thompson, Ella R.; Hunter, Sally M.; Halgamuge, Saman K.; Ellul, Jason; Dobrovic, Alexander; Campbell, Ian G.; Papenfuss, Anthony T.; McArthur, Grant A.; Tothill, Richard W.
2014-01-01
Targeted resequencing by massively parallel sequencing has become an effective and affordable way to survey small to large portions of the genome for genetic variation. Despite the rapid development in open source software for analysis of such data, the practical implementation of these tools through construction of sequencing analysis pipelines still remains a challenging and laborious activity, and a major hurdle for many small research and clinical laboratories. We developed TREVA (Targeted REsequencing Virtual Appliance), making pre-built pipelines immediately available as a virtual appliance. Based on virtual machine technologies, TREVA is a solution for rapid and efficient deployment of complex bioinformatics pipelines to laboratories of all sizes, enabling reproducible results. The analyses that are supported in TREVA include: somatic and germline single-nucleotide and insertion/deletion variant calling, copy number analysis, and cohort-based analyses such as pathway and significantly mutated genes analyses. TREVA is flexible and easy to use, and can be customised by Linux-based extensions if required. TREVA can also be deployed on the cloud (cloud computing), enabling instant access without investment overheads for additional hardware. TREVA is available at http://bioinformatics.petermac.org/treva/. PMID:24752294
Biémont, Christian; Nardon, Christiane; Deceliere, Grégory; Lepetit, David; Loevenbruck, Catherine; Vieira, Cristina
2003-01-01
Transposable elements (TEs), which promote various kinds of mutations, constitute a large fraction of the genome. How they invade natural populations and species is therefore of fundamental importance for understanding the dynamics of genetic diversity and genome composition. On the basis of 85 samples of natural populations of Drosophila simulans, we report the distributions of the genome insertion site numbers of nine TEs that were chosen because they have a low average number of sites. Most populations were found to have 0-3 insertion sites, but some of them had a significantly higher number of sites for a given TE. The populations located in regions outside Africa had the highest number of sites for all elements except HMS Beagle and Coral, suggesting a recent increase in the activity of some TEs associated with the colonization patterns of Drosophila simulans. The element Tirant had a very distinctive pattern of distribution: it was identified mainly in populations from East Africa and some islands in the Indian Ocean, and its insertion site number was low in all these populations. The data suggest that the genome of the entire species of Drosophila simulans may be being invaded by TEs from populations in which they are present in high copy number.
On the weight of indels in genomic distances
2011-01-01
Background: Classical approaches to compute the genomic distance are usually limited to genomes with the same content, without duplicated markers. However, differences in the gene content are frequently observed and can reflect important evolutionary aspects. A few polynomial time algorithms that include genome rearrangements, insertions and deletions (or substitutions) were already proposed. These methods often allow a block of contiguous markers to be inserted, deleted or substituted at once but result in distance functions that do not respect the triangular inequality and hence do not constitute metrics. Results: In the present study we discuss the disruption of the triangular inequality in some of the available methods and give a framework to establish an efficient correction for two models recently proposed, one that includes insertions, deletions and double cut and join (DCJ) operations, and one that includes substitutions and DCJ operations. Conclusions: We show that the proposed framework establishes the triangular inequality in both distances, by summing a surcharge on indel operations and on substitutions that depends only on the number of markers affected by these operations. This correction can be applied a posteriori, without interfering with the already available formulas to compute these distances. We claim that this correction leads to distances that are biologically more plausible. PMID:22151784
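To make the metric property at stake concrete, the sketch below checks a table of pairwise distances for violations of the triangular inequality d(x, y) <= d(x, m) + d(m, y). It does not implement the DCJ-indel distance or the surcharge proposed in the paper. The function names and the toy distance values are hypothetical, chosen only to show what a violation looks like.

```python
from itertools import combinations


def pair_distance(dist, a, b):
    """Look up a symmetric distance stored under an unordered pair key."""
    return 0 if a == b else dist[frozenset((a, b))]


def triangle_violations(genomes, dist):
    """Return every (x, m, y) with d(x, y) > d(x, m) + d(m, y)."""
    violations = []
    for a, b, c in combinations(genomes, 3):
        # Each of the three genomes takes a turn as the intermediate one.
        for x, m, y in ((a, b, c), (a, c, b), (b, a, c)):
            if pair_distance(dist, x, y) > pair_distance(dist, x, m) + pair_distance(dist, m, y):
                violations.append((x, m, y))
    return violations


# Hypothetical distances: going A -> B -> C costs 2, yet A -> C directly costs 3.
toy = {
    frozenset(("A", "B")): 1,
    frozenset(("B", "C")): 1,
    frozenset(("A", "C")): 3,
}
found = triangle_violations(["A", "B", "C"], toy)

# The same table with d(A, C) lowered to 2 satisfies the inequality.
fixed = dict(toy)
fixed[frozenset(("A", "C"))] = 2
none_found = triangle_violations(["A", "B", "C"], fixed)
```

A distance function that passes this check for every triple, and is also symmetric and zero only for identical genomes, is a metric in the sense the abstract refers to.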
Insertion and deletion polymorphisms of the ancient AluS family in the human genome.
Kryatova, Maria S; Steranka, Jared P; Burns, Kathleen H; Payer, Lindsay M
2017-01-01
Polymorphic Alu elements account for 17% of structural variants in the human genome. The majority of these belong to the youngest AluY subfamilies, and most structural variant discovery efforts have focused on identifying Alu polymorphisms from these currently retrotranspositionally active subfamilies. In this report we analyze polymorphisms from the evolutionarily older AluS subfamily, whose peak activity was tens of millions of years ago. We annotate the AluS polymorphisms, assess their likely mechanism of origin, and evaluate their contribution to structural variation in the human genome. Of 52 previously reported polymorphic AluS elements ascertained for this study, 48 were confirmed to belong to the AluS subfamily using high stringency subfamily classification criteria. Of these, the majority (77%, 37/48) appear to be deletion polymorphisms. Two polymorphic AluS elements (4%) have features of non-classical Alu insertions and one polymorphic AluS element (2%) likely inserted by a mechanism involving internal priming. Seven AluS polymorphisms (15%) appear to have arisen by the classical target-primed reverse transcription (TPRT) retrotransposition mechanism. These seven TPRT products are 3' intact with 3' poly-A tails, and are flanked by target site duplications; L1 ORF2p endonuclease cleavage sites were also observed, providing additional evidence that these are L1 ORF2p endonuclease-mediated TPRT insertions. Further sequence analysis showed strong conservation of both the RNA polymerase III promoter and SRP9/14 binding sites, important for mediating transcription and interaction with retrotransposition machinery, respectively. This conservation of functional features implies that some of these are fairly recent insertions since they have not diverged significantly from their respective retrotranspositionally competent source elements. 
Of the polymorphic AluS elements evaluated in this report, 15% (7/48) have features consistent with TPRT-mediated insertion, thus suggesting that some AluS elements have been more active recently than previously thought, or that fixation of AluS insertion alleles remains incomplete. These data expand the potential significance of polymorphic AluS elements in contributing to structural variation in the human genome. Future discovery efforts focusing on polymorphic AluS elements are likely to identify more such polymorphisms, and approaches tailored to identify deletion alleles may be warranted.
Mating system shifts and transposable element evolution in the plant genus Capsella.
Agren, J Ågren; Wang, Wei; Koenig, Daniel; Neuffer, Barbara; Weigel, Detlef; Wright, Stephen I
2014-07-16
Despite having predominately deleterious fitness effects, transposable elements (TEs) are major constituents of eukaryote genomes in general and of plant genomes in particular. Although the proportion of the genome made up of TEs varies at least four-fold across plants, the relative importance of the evolutionary forces shaping variation in TE abundance and distributions across taxa remains unclear. Under several theoretical models, mating system plays an important role in governing the evolutionary dynamics of TEs. Here, we use the recently sequenced Capsella rubella reference genome and short-read whole genome sequencing of multiple individuals to quantify abundance, genome distributions, and population frequencies of TEs in three recently diverged species of differing mating system, two self-compatible species (C. rubella and C. orientalis) and their self-incompatible outcrossing relative, C. grandiflora. We detect different dynamics of TE evolution in our two self-compatible species; C. rubella shows a small increase in transposon copy number, while C. orientalis shows a substantial decrease relative to C. grandiflora. The direction of this change in copy number is genome wide and consistent across transposon classes. For insertions near genes, however, we detect the highest abundances in C. grandiflora. Finally, we also find differences in the population frequency distributions across the three species. Overall, our results suggest that the evolution of selfing may have different effects on TE evolution on a short and on a long timescale. Moreover, cross-species comparisons of transposon abundance are sensitive to reference genome bias, and efforts to control for this bias are key when making comparisons across species.
Jia, Xianbo; Lin, Xinjian; Chen, Jichen
2017-11-02
Current genome walking methods are very time consuming, and many produce non-specific amplification products. To amplify the flanking sequences that are adjacent to Tn5 transposon insertion sites in Serratia marcescens FZSF02, we developed a genome walking method based on TAIL-PCR. This PCR method added a 20-cycle linear amplification step before the exponential amplification step to increase the concentration of the target sequences. Products of the linear amplification and the exponential amplification were diluted 100-fold to decrease the concentration of the templates that cause non-specific amplification. Fast DNA polymerase with a high extension speed was used in this method, and an amplification program was used to rapidly amplify long specific sequences. With this linear and exponential TAIL-PCR (LETAIL-PCR), we successfully obtained products larger than 2 kb from Tn5 transposon insertion mutant strains within 3 h. This method can be widely used in genome walking studies to amplify unknown sequences that are adjacent to known sequences.
Targeted gene insertion for molecular medicine.
Voigt, Katrin; Izsvák, Zsuzsanna; Ivics, Zoltán
2008-11-01
Genomic insertion of a functional gene together with suitable transcriptional regulatory elements is often required for long-term therapeutical benefit in gene therapy for several genetic diseases. A variety of integrating vectors for gene delivery exist. Some of them exhibit random genomic integration, whereas others have integration preferences based on attributes of the targeted site, such as primary DNA sequence and physical structure of the DNA, or through tethering to certain DNA sequences by host-encoded cellular factors. Uncontrolled genomic insertion bears the risk of the transgene being silenced due to chromosomal position effects, and can lead to genotoxic effects due to mutagenesis of cellular genes. None of the vector systems currently used in either preclinical experiments or clinical trials displays sufficient preferences for target DNA sequences that would ensure appropriate and reliable expression of the transgene and simultaneously prevent hazardous side effects. We review in this paper the advantages and disadvantages of both viral and non-viral gene delivery technologies, discuss mechanisms of target site selection of integrating genetic elements (viruses and transposons), and suggest distinct molecular strategies for targeted gene delivery.
Majira, Amel; Domin, Monique; Grandjean, Olivier; Gofron, Krystyna; Houba-Hérin, Nicole
2002-10-01
A seedling lethal mutant of Nicotiana plumbaginifolia (sdl-1) was isolated by transposon tagging using a maize Dissociation (Ds) element. The insertion mutation was produced by direct co-transformation of protoplasts with two plasmids: one containing Ds and a second with an Ac transposase gene. sdl-1 seedlings exhibit several phenotypes: swollen organs, short hypocotyls in light and dark conditions, and enlarged and multinucleated cells, that altogether suggest cell growth defects. Mutant cells are able to proliferate under in vitro culture conditions. Genomic DNA sequences bordering the transposon were used to recover cDNA from the normal allele. Complementation of the mutant phenotype with the cDNA confirmed that the transposon had caused the mutation. The Ds element was inserted into the first exon of the open reading frame and the homozygous mutant lacked detectable transcript. Phenocopies of the mutant were obtained by an antisense approach. SDL-1 encodes a novel protein found in several plant genomes but apparently missing from animal and fungal genomes; the protein is highly conserved and has a potential plastid targeting motif.
Suzuki, Hidetsugu; Asahara, Hiroshi
2015-08-01
Genome editing is a genetic technology by which any DNA sequence is inserted, replaced or deleted. Genome editing has been making rapid progress recently, with the development of new techniques such as ZFN, TALEN and CRISPR/Cas9. Genome editing can be applied to various fields ranging from the production of knockout animals to gene therapy. This section summarizes these new genome editing technologies and their applications.
Marcon, Helena Sanches; Domingues, Douglas Silva; Silva, Juliana Costa; Borges, Rafael Junqueira; Matioli, Fábio Filippi; Fontes, Marcos Roberto de Mattos; Marino, Celso Luis
2015-08-14
In the Eucalyptus genus, studies on genome composition and transposable elements (TEs) are particularly scarce. Nearly half of the recently released Eucalyptus grandis genome is composed of retrotransposons, and these data provide an important opportunity to understand TE dynamics in the Eucalyptus genome and transcriptome. We characterized nine families of transcriptionally active LTR retrotransposons from the Copia and Gypsy superfamilies in the Eucalyptus grandis genome and we depicted genomic distribution and copy number in two Eucalyptus species. We also evaluated genomic polymorphism and transcriptional profile in three organs of five Eucalyptus species. We observed contrasting genomic and transcriptional behavior in the same family among different species. RLC_egMax_1 was the most prevalent family and RLC_egAngela_1 was the family with the lowest copy number. Most families of both superfamilies have insertions that occurred <3 million years ago, except one Copia family, RLC_egBianca_1. Protein theoretical models suggest different properties between Copia and Gypsy domains. IRAP and REMAP markers suggested genomic polymorphisms among Eucalyptus species. Using EST analysis and qRT-PCRs, we observed transcriptional activity in several tissues and in all evaluated species. In some families, osmotic stress increases transcript values. Our strategy was successful in isolating transcriptionally active retrotransposons in Eucalyptus, and each family has a particular genomic and transcriptional pattern. Overall, our results show that retrotransposon activity has differentially affected the genome and transcriptome among Eucalyptus species.
Luo, Ming; Gilbert, Brian; Ayliffe, Michael
2016-07-01
Mutagenesis continues to play an essential role in understanding plant gene function and, in some instances, provides an opportunity for plant improvement. The development of gene editing technologies such as TALENs and zinc fingers has revolutionised the targeted mutation specificity that can now be achieved. The CRISPR/Cas9 system is the most recent addition to gene editing technologies and arguably the simplest, requiring only two components: a small guide RNA molecule (sgRNA) and the Cas9 endonuclease protein, which form a complex that recognises and cleaves a specific 20 bp target site present in a genome. Target specificity is determined by complementary base pairing between the sgRNA and the target site sequence, enabling highly specific, targeted mutation to be readily engineered. Upon target site cleavage, error-prone endogenous repair mechanisms produce small insertions/deletions at the target site, usually resulting in loss of gene function. CRISPR/Cas9 gene editing has been rapidly adopted in plants and successfully undertaken in numerous species including major crop species. Its applications are not restricted to mutagenesis, and target site cleavage can be exploited to promote sequence insertion or replacement by recombination. The multiple applications of this technology in plants are described.
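As a rough illustration of what locating a 20 bp target site involves, the sketch below lists candidate sites on one strand of a DNA string. It relies on an assumption that is not stated in the abstract: that the nuclease is the commonly used Streptococcus pyogenes Cas9, which requires an NGG motif (the PAM) immediately 3' of the target. The function name is hypothetical, and the reverse strand and off-target scoring are deliberately left out.

```python
import re


def find_target_sites(seq, guide_len=20):
    """List (start, target, pam) for each site followed by an NGG PAM.

    Only the given strand is scanned. The NGG requirement is an assumption
    (S. pyogenes Cas9) and is not taken from the abstract.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Toy sequence, invented for illustration: 20 A's, a TGG PAM, then filler.
example = "A" * 20 + "TGG" + "CCCC"
sites = find_target_sites(example)
```

In practice a guide design tool would also scan the reverse complement and rank candidates by predicted specificity; this sketch shows only the pattern-matching step.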
Child Development and Structural Variation in the Human Genome
ERIC Educational Resources Information Center
Zhang, Ying; Haraksingh, Rajini; Grubert, Fabian; Abyzov, Alexej; Gerstein, Mark; Weissman, Sherman; Urban, Alexander E.
2013-01-01
Structural variation of the human genome sequence is the insertion, deletion, or rearrangement of stretches of DNA sequence sized from around 1,000 to millions of base pairs. Over the past few years, structural variation has been shown to be far more common in human genomes than previously thought. Very little is currently known about the effects…
USDA-ARS's Scientific Manuscript database
Copy number variations (CNVs) are large insertions, deletions or duplications in the genome that vary between members of a species and are known to affect a wide variety of phenotypic traits. In this study, we identified CNVs in a population of bulls using low coverage next-generation sequence data....
Successful Gene Tagging in Lettuce Using the Tnt1 Retrotransposon from Tobacco
Mazier, Marianne; Botton, Emmanuel; Flamain, Fabrice; Bouchet, Jean-Paul; Courtial, Béatrice; Chupeau, Marie-Christine; Chupeau, Yves; Maisonneuve, Brigitte; Lucas, Hélène
2007-01-01
The tobacco (Nicotiana tabacum) element Tnt1 is one of the few identified active retrotransposons in plants. These elements possess unique properties that make them ideal genetic tools for gene tagging. Here, we demonstrate the feasibility of gene tagging using the retrotransposon Tnt1 in lettuce (Lactuca sativa), which is the largest genome tested for retrotransposon mutagenesis so far. Of 10 different transgenic bushes carrying a complete Tnt1 containing T-DNA, eight contained multiple transposed copies of Tnt1. The number of transposed copies of the element per plant was particularly high, the smallest number being 28. Tnt1 transposition in lettuce can be induced by a very simple in vitro culture protocol. Tnt1 insertions were stable in the progeny of the primary transformants and could be segregated genetically. Characterization of the sequences flanking some insertion sites revealed that Tnt1 often inserted into genes. The progeny of some primary transformants showed phenotypic alterations due to recessive mutations. One of these mutations was due to Tnt1 insertion in the gibberellin 3β-hydroxylase gene. Taken together, these results indicate that Tnt1 is a powerful tool for insertion mutagenesis especially in plants with a large genome. PMID:17351058
Mutational Dynamics of Aroid Chloroplast Genomes
Ahmed, Ibrar; Biggs, Patrick J.; Matthews, Peter J.; Collins, Lesley J.; Hendy, Michael D.; Lockhart, Peter J.
2012-01-01
A characteristic feature of eukaryote and prokaryote genomes is the co-occurrence of nucleotide substitution and insertion/deletion (indel) mutations. Although similar observations have also been made for chloroplast DNA, genome-wide associations have not been reported. We determined the chloroplast genome sequences for two morphotypes of taro (Colocasia esculenta; family Araceae) and compared these with four publicly available aroid chloroplast genomes. Here, we report the extent of genome-wide association between direct and inverted repeats, indels, and substitutions in these aroid chloroplast genomes. We suggest that alternative but not mutually exclusive hypotheses explain the mutational dynamics of chloroplast genome evolution. PMID:23204304
Yohn, Chris T; Jiang, Zhaoshi; McGrath, Sean D; Hayden, Karen E; Khaitovich, Philipp; Johnson, Matthew E; Eichler, Marla Y; McPherson, John D; Zhao, Shaying; Pääbo, Svante; Eichler, Evan E
2005-04-01
Retroviral infections of the germline have the potential to episodically alter gene function and genome structure during the course of evolution. Horizontal transmissions between species have been proposed, but little evidence exists for such events in the human/great ape lineage of evolution. Based on analysis of finished BAC chimpanzee genome sequence, we characterize a retroviral element (Pan troglodytes endogenous retrovirus 1 [PTERV1]) that has become integrated in the germline of African great ape and Old World monkey species but is absent from humans and Asian ape genomes. We unambiguously map 287 retroviral integration sites and determine that approximately 95.8% of the insertions occur at non-orthologous regions between closely related species. Phylogenetic analysis of the endogenous retrovirus reveals that the gorilla and chimpanzee elements share a monophyletic origin with a subset of the Old World monkey retroviral elements, but that the average sequence divergence exceeds neutral expectation for a strictly nuclear inherited DNA molecule. Within the chimpanzee, there is a significant integration bias against genes, with only 14 of these insertions mapping within intronic regions. Six out of ten of these genes, for which there are expression data, show significant differences in transcript expression between human and chimpanzee. Our data are consistent with a retroviral infection that bombarded the genomes of chimpanzees and gorillas independently and concurrently, 3-4 million years ago. We speculate on the potential impact of such recent events on the evolution of humans and great apes.
Generation and validation of homozygous fluorescent knock-in cells using CRISPR-Cas9 genome editing.
Koch, Birgit; Nijmeijer, Bianca; Kueblbeck, Moritz; Cai, Yin; Walther, Nike; Ellenberg, Jan
2018-06-01
Gene tagging with fluorescent proteins is essential for investigations of the dynamic properties of cellular proteins. CRISPR-Cas9 technology is a powerful tool for inserting fluorescent markers into all alleles of the gene of interest (GOI) and allows functionality and physiological expression of the fusion protein. It is essential to evaluate such genome-edited cell lines carefully in order to preclude off-target effects caused by (i) incorrect insertion of the fluorescent protein, (ii) perturbation of the fusion protein by the fluorescent proteins or (iii) nonspecific genomic DNA damage by CRISPR-Cas9. In this protocol, we provide a step-by-step description of our systematic pipeline to generate and validate homozygous fluorescent knock-in cell lines. We have used the paired Cas9D10A nickase approach to efficiently insert tags into specific genomic loci via homology-directed repair (HDR) with minimal off-target effects. It is time-consuming and costly to perform whole-genome sequencing of each cell clone to check for spontaneous genetic variations occurring in mammalian cell lines. Therefore, we have developed an efficient validation pipeline of the generated cell lines consisting of junction PCR, Southern blotting analysis, Sanger sequencing, microscopy, western blotting analysis and live-cell imaging for cell-cycle dynamics. This protocol takes between 6 and 9 weeks. With this protocol, up to 70% of the targeted genes can be tagged homozygously with fluorescent proteins, thus resulting in physiological levels and phenotypically functional expression of the fusion proteins.
Selfish DNA in protein-coding genes of Rickettsia.
Ogata, H; Audic, S; Barbe, V; Artiguenave, F; Fournier, P E; Raoult, D; Claverie, J M
2000-10-13
Rickettsia conorii, the aetiological agent of Mediterranean spotted fever, is an intracellular bacterium transmitted by ticks. Preliminary analyses of the nearly complete genome sequence of R. conorii have revealed 44 occurrences of a previously undescribed palindromic repeat (150 base pairs long) throughout the genome. Unexpectedly, this repeat was found inserted in-frame within 19 different R. conorii open reading frames likely to encode functional proteins. We found the same repeat in proteins of other Rickettsia species. The finding of a mobile element inserted in many unrelated genes suggests the potential role of selfish DNA in the creation of new protein sequences.
Dron, M; Hartmann, C; Rode, A; Sevignac, M
1985-01-01
We have characterized a 1.7 kb sequence, containing a tRNA Leu2 gene shared by the ct and mt genomes of Brassica oleracea. The two sequences are completely homologous except in two short regions where two distinct gene conversion events have occurred between two sets of direct repeats leading to the insertion of 5 bp in the T loop of the mt copy of the ct gene. This is the first evidence that gene conversion represents the initial evolutionary step in inactivation of transferred ct genes in the mt genome. We also indicate that organelle DNA transfer by organelle fusion is an ongoing process which could be useful in genetic engineering. PMID:4080548
Genomic deletions created upon LINE-1 retrotransposition.
Gilbert, Nicolas; Lutz-Prigge, Sheila; Moran, John V
2002-08-09
LINE-1 (L1) retrotransposition continues to impact the human genome, yet little is known about how L1 integrates into DNA. Here, we developed a plasmid-based rescue system and have used it to recover 37 new L1 retrotransposition events from cultured human cells. Sequencing of the insertions revealed the usual L1 structural hallmarks; however, in four instances, retrotransposition generated large target site deletions. Remarkably, three of those resulted in the formation of chimeric L1s, containing the 5' end of an endogenous L1 fused precisely to our engineered L1. Thus, our data demonstrate multiple pathways for L1 integration in cultured cells, and show that L1 is not simply an insertional mutagen, but that its retrotransposition can result in significant deletions of genomic sequence.
Suboptimal Doses of Raltegravir Cause Aberrant HIV Integrations | Center for Cancer Research
When a cell is infected with HIV, a DNA copy of the HIV genome is inserted into that cell’s chromosomal DNA. This insertion reaction is carried out by the viral enzyme integrase (IN) and involves two distinct steps: removal of two nucleotides from each 3’ end of the viral DNA, followed by the strand transfer reaction, in which the viral DNA ends are inserted into the host
Hendre, Prasad S.; Aggarwal, Ramesh K.
2014-01-01
Coffee breeding and improvement efforts can be greatly facilitated by availability of a large repository of simple sequence repeats (SSRs) based microsatellite markers, which provides efficiency and high-resolution in genetic analyses. This study was aimed to improve SSR availability in coffee by developing new genic−/genomic-SSR markers using in-silico bioinformatics and streptavidin-biotin based enrichment approach, respectively. The expressed sequence tag (EST) based genic microsatellite markers (EST-SSRs) were developed using the publicly available dataset of 13,175 unigene ESTs, which showed a distribution of 1 SSR/3.4 kb of coffee transcriptome. Genomic SSRs, on the other hand, were developed from an SSR-enriched small-insert partial genomic library of robusta coffee. In total, 69 new SSRs (44 EST-SSRs and 25 genomic SSRs) were developed and validated as suitable genetic markers. Diversity analysis of selected coffee genotypes revealed these to be highly informative in terms of allelic diversity and PIC values, and eighteen of these markers (∼27%) could be mapped on a robusta linkage map. Notably, the markers described here also revealed a very high cross-species transferability. In addition to the validated markers, we have also designed primer pairs for 270 putative EST-SSRs, which are expected to provide another ca. 200 useful genetic markers considering the high success rate (88%) of marker conversion of similar pairs tested/validated in this study. PMID:25461752
The CRISPR Spacer Space Is Dominated by Sequences from Species-Specific Mobilomes
Shmakov, Sergey A.; Sitnik, Vassilii; Makarova, Kira S.; Wolf, Yuri I.; Severinov, Konstantin V.
2017-01-01
Clustered regularly interspaced short palindromic repeats and CRISPR-associated protein (CRISPR-Cas) systems store the memory of past encounters with foreign DNA in unique spacers that are inserted between direct repeats in CRISPR arrays. For only a small fraction of the spacers, homologous sequences, called protospacers, are detectable in viral, plasmid, and microbial genomes. The rest of the spacers remain the CRISPR “dark matter.” We performed a comprehensive analysis of the spacers from all CRISPR-cas loci identified in bacterial and archaeal genomes, and we found that, depending on the CRISPR-Cas subtype and the prokaryotic phylum, protospacers were detectable for 1% to about 19% of the spacers (~7% global average). Among the detected protospacers, the majority, typically 80 to 90%, originated from viral genomes, including proviruses, and among the rest, the most common source was genes that are integrated into microbial chromosomes but are involved in plasmid conjugation or replication. Thus, almost all spacers with identifiable protospacers target mobile genetic elements (MGE). The GC content, as well as dinucleotide and tetranucleotide compositions, of microbial genomes, their spacer complements, and the cognate viral genomes showed a nearly perfect correlation and were almost identical. Given the near absence of self-targeting spacers, these findings are most compatible with the possibility that the spacers, including the dark matter, are derived almost completely from the species-specific microbial mobilomes. PMID:28928211
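The compositional comparison described in this abstract rests on two simple statistics, GC content and dinucleotide frequencies. The following is a minimal sketch of how such values could be computed for a single sequence; the function names `gc_content` and `dinucleotide_freqs` are hypothetical and are not taken from the study, which does not describe its implementation here.

```python
from collections import Counter

def gc_content(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt

def dinucleotide_freqs(seq):
    """Relative frequency of each overlapping dinucleotide.

    Pairs containing ambiguous symbols (e.g. N) are skipped.
    """
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    pairs = [p for p in pairs if set(p) <= set("ACGT")]
    total = len(pairs)
    if total == 0:
        return {}
    return {pair: n / total for pair, n in Counter(pairs).items()}
```

Computing these values separately for a host genome, its spacer set, and its viruses would give the kind of vectors that can then be compared by correlation.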
TEs or not TEs? That is the evolutionary question.
Vaknin, Keren; Goren, Amir; Ast, Gil
2009-10-23
Transposable elements (TEs) have contributed a wide range of functional sequences to their host genomes. A recent paper in BMC Molecular Biology discusses the creation of new transcripts by transposable element insertion upstream of retrocopies and the involvement of such insertions in tissue-specific post-transcriptional regulation.
Dong, Fengping; Xie, Kabin; Chen, Yueying; Yang, Yinong; Mao, Yingwei
2016-01-01
CRISPR/Cas9 has been widely used for genomic editing in many organisms. Many human diseases are caused by multiple mutations. The CRISPR/Cas9 system provides a potential tool to introduce multiple mutations in a genome. To mimic complicated genomic variants in human diseases, such as multiple gene deletions or mutations, two or more small guide RNAs (sgRNAs) need to be introduced together. This can be achieved by separate Pol III promoters in a construct. However, limited enzyme sites and increased insertion size lower the efficiency of construct assembly. Here, we report a strategy to quickly assemble multiple sgRNAs in one construct using a polycistronic-tRNA-gRNA (PTG) strategy. Taking advantage of the endogenous tRNA processing system in mammalian cells, we efficiently express multiple sgRNAs driven by only one Pol III promoter. Using an all-in-one construct carrying PTG, we disrupt the deacetylase domain in multiple histone deacetylases (HDACs) in human cells simultaneously. We demonstrate that multiple HDAC deletions significantly affect the activation of the Wnt-signaling pathway. Thus, this method enables efficient targeting of multiple genes and provides a useful tool to establish mutated cells mimicking human diseases. PMID:27890617
Habegger, Lukas; Balasubramanian, Suganthi; Chen, David Z.; Khurana, Ekta; Sboner, Andrea; Harmanci, Arif; Rozowsky, Joel; Clarke, Declan; Snyder, Michael; Gerstein, Mark
2012-01-01
Summary: The functional annotation of variants obtained through sequencing projects is generally assumed to be a simple intersection of genomic coordinates with genomic features. However, complexities arise for several reasons, including the differential effects of a variant on alternatively spliced transcripts, as well as the difficulty in assessing the impact of small insertions/deletions and large structural variants. Taking these factors into consideration, we developed the Variant Annotation Tool (VAT) to functionally annotate variants from multiple personal genomes at the transcript level as well as obtain summary statistics across genes and individuals. VAT also allows visualization of the effects of different variants, integrates allele frequencies and genotype data from the underlying individuals and facilitates comparative analysis between different groups of individuals. VAT can either be run through a command-line interface or as a web application. Finally, in order to enable on-demand access and to minimize unnecessary transfers of large data files, VAT can be run as a virtual machine in a cloud-computing environment. Availability and Implementation: VAT is implemented in C and PHP. The VAT web service, Amazon Machine Image, source code and detailed documentation are available at vat.gersteinlab.org. Contact: lukas.habegger@yale.edu or mark.gerstein@yale.edu Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:22743228
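The point about alternatively spliced transcripts can be illustrated with a toy example in which the same variant position is exonic in one transcript and intronic in another. This is only a sketch under stated assumptions: the function `annotate_by_transcript` and the coordinate convention (1-based, inclusive exon intervals) are hypothetical and do not represent VAT's actual interface or data model.

```python
def annotate_by_transcript(pos, transcripts):
    """Classify a single-base variant position against each transcript.

    transcripts maps a transcript name to a list of (start, end) exon
    intervals, 1-based and inclusive (an assumed convention).
    """
    result = {}
    for name, exons in transcripts.items():
        tx_start = min(start for start, _ in exons)
        tx_end = max(end for _, end in exons)
        if any(start <= pos <= end for start, end in exons):
            result[name] = "exonic"
        elif tx_start <= pos <= tx_end:
            result[name] = "intronic"
        else:
            result[name] = "outside"
    return result

# Two hypothetical isoforms of one gene; tx2 has an extra middle exon.
isoforms = {
    "tx1": [(100, 200), (300, 400)],
    "tx2": [(100, 200), (240, 260), (300, 400)],
}
print(annotate_by_transcript(250, isoforms))
```

A gene-level intersection would report this variant once; a transcript-level annotation keeps both outcomes, which is the distinction the abstract draws.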
NASA Astrophysics Data System (ADS)
Zhan, Aibin; Bao, Zhenmin; Hu, Xiaoli; Lu, Wei; Hu, Jingjie
2009-06-01
Microsatellite markers have become one of the most important molecular tools used in many fields of research. A large number of microsatellite markers are required for whole-genome surveys in molecular ecology, quantitative genetics and genomics. Therefore, versatile, low-cost, efficient and time- and labor-saving methods are needed to develop large panels of microsatellite markers. In this study, we used Zhikong scallop (Chlamys farreri) as the target species to compare the efficiency of five methods derived from three strategies for microsatellite marker development. The results showed that the strategy of constructing a small-insert genomic DNA library resulted in poor efficiency, while the microsatellite-enriched strategy greatly improved the isolation efficiency. Although the public-database mining strategy is time- and cost-saving, it is difficult to obtain a large number of microsatellite markers with it, mainly due to the limited sequence data of non-model species deposited in public databases. Based on the results of this study, we recommend two methods, the microsatellite-enriched library construction method and the FIASCO-colony hybridization method, for large-scale microsatellite marker development. Both methods were derived from the microsatellite-enriched strategy. The experimental results obtained from Zhikong scallop also provide a reference for microsatellite marker development in other species with large genomes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bai, Xiaodong; Zhang, Jianhua; Ewing, Adam
Phytoplasmas (Candidatus Phytoplasma, Class Mollicutes) cause disease in hundreds of economically important plants, and are obligately transmitted by sap-feeding insects of the order Hemiptera, mainly leafhoppers and psyllids. The 706,569-bp chromosome and four plasmids of aster yellows phytoplasma strain witches broom (AY-WB) were sequenced and compared to the onion yellows phytoplasma strain M (OY-M) genome. The phytoplasmas have small repeat-rich genomes. The repeated DNAs are organized into large clusters, potential mobile units (PMUs), which contain tra5 insertion sequences (ISs), and specialized sigma factors and membrane proteins. So far, PMUs are unique to phytoplasmas. Compared to mycoplasmas, phytoplasmas lack several recombination and DNA modification functions, and therefore phytoplasmas probably use different mechanisms of recombination, likely involving PMUs, for the creation of variability, allowing phytoplasmas to adjust to the diverse environments of plants and insects. The irregular GC skews and presence of ISs and large repeated sequences in the AY-WB and OY-M genomes are indicative of high genomic plasticity. Nevertheless, segments of ~250 kb, located between genes lplA and glnQ, are syntenic between the two phytoplasmas, and contain the majority of the metabolic genes and no ISs. AY-WB is further along in the reductive evolution process than OY-M. The AY-WB genome is ~154 kb smaller than the OY-M genome, primarily as a result of fewer multicopy sequences, including PMUs. Further, AY-WB lacks genes that are truncated and are part of incomplete pathways in OY-M. This is the first comparative phytoplasma genome analysis and report of the existence of PMUs in phytoplasma genomes.
Quadros, Rolen M; Miura, Hiromi; Harms, Donald W; Akatsuka, Hisako; Sato, Takehito; Aida, Tomomi; Redder, Ronald; Richardson, Guy P; Inagaki, Yutaka; Sakai, Daisuke; Buckley, Shannon M; Seshacharyulu, Parthasarathy; Batra, Surinder K; Behlke, Mark A; Zeiner, Sarah A; Jacobi, Ashley M; Izu, Yayoi; Thoreson, Wallace B; Urness, Lisa D; Mansour, Suzanne L; Ohtsuka, Masato; Gurumurthy, Channabasavaiah B
2017-05-17
Conditional knockout mice and transgenic mice expressing recombinases, reporters, and inducible transcriptional activators are key for many genetic studies and comprise over 90% of mouse models created. Conditional knockout mice are generated using labor-intensive methods of homologous recombination in embryonic stem cells and are available for only ~25% of all mouse genes. Transgenic mice generated by random genomic insertion approaches pose problems of unreliable expression, and thus there is a need for targeted-insertion models. Although CRISPR-based strategies were reported to create conditional and targeted-insertion alleles via one-step delivery of targeting components directly to zygotes, these strategies are quite inefficient. Here we describe Easi-CRISPR (Efficient additions with ssDNA inserts-CRISPR), a targeting strategy in which long single-stranded DNA donors are injected with pre-assembled crRNA + tracrRNA + Cas9 ribonucleoprotein (ctRNP) complexes into mouse zygotes. We show for over a dozen loci that Easi-CRISPR generates correctly targeted conditional and insertion alleles in 8.5-100% of the resulting live offspring. Easi-CRISPR solves the major problem of animal genome engineering, namely the inefficiency of targeted DNA cassette insertion. The approach is robust, succeeding for all tested loci. It is versatile, generating both conditional and targeted insertion alleles. Finally, it is highly efficient, as treating an average of only 50 zygotes is sufficient to produce a correctly targeted allele in up to 100% of live offspring. Thus, Easi-CRISPR offers a comprehensive means of building large-scale Cre-LoxP animal resources.
Efficient CRISPR/Cas9-Based Genome Engineering in Human Pluripotent Stem Cells.
Kime, Cody; Mandegar, Mohammad A; Srivastava, Deepak; Yamanaka, Shinya; Conklin, Bruce R; Rand, Tim A
2016-01-01
Human pluripotent stem cells (hPS cells) are rapidly emerging as a powerful tool for biomedical discovery. The advent of human induced pluripotent stem cells (hiPS cells) with human embryonic stem (hES)-cell-like properties has led to hPS cells with disease-specific genetic backgrounds for in vitro disease modeling and drug discovery as well as mechanistic and developmental studies. To fully realize this potential, it will be necessary to modify the genome of hPS cells with precision and flexibility. Pioneering experiments utilizing site-specific double-strand break (DSB)-mediated genome engineering tools, including zinc finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs), have paved the way to genome engineering in previously recalcitrant systems such as hPS cells. However, these methods are technically cumbersome and require significant expertise, which has limited adoption. A major recent advance involving the clustered regularly interspaced short palindromic repeats (CRISPR) endonuclease has dramatically simplified the effort required for genome engineering and will likely be adopted widely as the most rapid and flexible system for genome editing in hPS cells. In this unit, we describe commonly practiced methods for CRISPR endonuclease genomic editing of hPS cells into cell lines containing genomes altered by insertion/deletion (indel) mutagenesis or insertion of recombinant genomic DNA. Copyright © 2016 John Wiley & Sons, Inc.
Impact of retrotransposons in pluripotent stem cells.
Tanaka, Yoshiaki; Chung, Leeyup; Park, In-Hyun
2012-12-01
Retrotransposons, which constitute approximately 40% of the human genome, have the capacity to 'jump' across the genome. Their mobility contributes to oncogenesis, evolution, and genomic plasticity of the host genome. Induced pluripotent stem cells as well as embryonic stem cells are more susceptible than differentiated cells to genomic aberrations including insertion, deletion and duplication. Recent studies have revealed specific behaviors of retrotransposons in pluripotent cells. Here, we review recent progress in understanding retrotransposons and provide a perspective on the relationship between retrotransposons and genomic variation in pluripotent stem cells.
Kolk, A H; Noordhoek, G T; de Leeuw, O; Kuijper, S; van Embden, J D
1994-01-01
For the detection of Mycobacterium tuberculosis by PCR, the IS6110 sequence was used. A modified target was constructed by insertion of 56 nucleotides in the IS6110 insertion element of Mycobacterium bovis BCG. This modified insertion sequence was integrated into the genome of Mycobacterium smegmatis, a mycobacterium species which does not contain the IS6110 element. When DNA from the modified M. smegmatis 1008 strain was amplified with IS6110-specific primers INS1 and INS2, a band of 301 bp was seen on agarose gel, whereas the PCR product of M. tuberculosis complex DNA was a 245-bp fragment with these primers. The addition of a small number of M. smegmatis 1008 cells to clinical samples before DNA purification enables the detection of problems which may be due to the loss of DNA in the isolation procedure or to the presence of inhibitors. The presence of inhibitors of the amplification reaction can be confirmed by the addition of M. smegmatis 1008 DNA after the DNA isolation procedure. Furthermore, competition between the different target DNAs of M. smegmatis 1008 DNA and M. tuberculosis complex DNA enables the estimation of the number of IS6110 elements in the clinical sample. PMID:8051267
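The band sizes in this abstract follow from simple arithmetic: the 245-bp M. tuberculosis complex product plus the 56-nucleotide insert gives the 301-bp internal control product. The sketch below shows one way the lanes of such an assay might be read. The decision rules, the tolerance value, and the function name `interpret_lane` are illustrative assumptions, not the authors' published criteria.

```python
TB_BAND_BP = 245                              # product from M. tuberculosis complex DNA
INSERT_BP = 56                                # nucleotides inserted into IS6110
CONTROL_BAND_BP = TB_BAND_BP + INSERT_BP      # modified M. smegmatis 1008 target

def interpret_lane(band_sizes_bp, tolerance_bp=5):
    """Interpret the bands of one gel lane (assumed, simplified rules).

    tolerance_bp is a hypothetical allowance for sizing error on the gel.
    """
    def has_band(target):
        return any(abs(size - target) <= tolerance_bp for size in band_sizes_bp)

    if has_band(TB_BAND_BP):
        # The control band may be weak or absent here because the two
        # targets compete for the same primers.
        return "positive: M. tuberculosis complex target amplified"
    if has_band(CONTROL_BAND_BP):
        return "negative: internal control amplified, no target band"
    return "invalid: no amplification, possible DNA loss or inhibition"

print(CONTROL_BAND_BP)
print(interpret_lane([301]))
```

A lane with neither band is the case the internal control is designed to expose, since a plain negative could not be told apart from a failed reaction without it.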
2014-01-01
Background Small insertion and deletion polymorphisms (Indels) are the second most common mutations in the human genome, after Single Nucleotide Polymorphisms (SNPs). Recent studies have shown that they have significant influence on genetic variation by altering human traits and can cause multiple human diseases. In particular, many Indels that occur in protein coding regions are known to impact the structure or function of the protein. A major challenge is to predict the effects of these Indels and to distinguish between deleterious and neutral variants. When an Indel occurs within a coding region, it can be either frameshifting (FS) or non-frameshifting (NFS). FS-Indels either modify the complete C-terminal region of the protein or result in premature termination of translation. NFS-Indels insert/delete multiples of three nucleotides leading to the insertion/deletion of one or more amino acids. Results In order to study the relationships between NFS-Indels and Mendelian diseases, we characterized NFS-Indels according to numerous structural, functional and evolutionary parameters. We then used these parameters to identify specific characteristics of disease-causing and neutral NFS-Indels. Finally, we developed a new machine learning approach, KD4i, that can be used to predict the phenotypic effects of NFS-Indels. Conclusions We demonstrate in a large-scale evaluation that the accuracy of KD4i is comparable to existing state-of-the-art methods. However, a major advantage of our approach is that we also provide the reasons for the predictions, in the form of a set of rules. The rules are interpretable by non-expert humans and they thus represent new knowledge about the relationships between the genotype and phenotypes of NFS-Indels and the causative molecular perturbations that result in the disease. PMID:24742296
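The frameshifting versus non-frameshifting distinction described above depends only on whether the net length change is a multiple of three. A minimal sketch, assuming VCF-style reference and alternate allele strings and a hypothetical function name `classify_coding_indel` (this is not part of KD4i):

```python
def classify_coding_indel(ref, alt):
    """Return "NFS" if the length change is a multiple of three, else "FS".

    ref and alt are the reference and alternate allele strings; the
    variant is assumed to lie entirely within a coding region.
    """
    delta = len(alt) - len(ref)
    if delta == 0:
        raise ValueError("alleles have equal length; not an indel")
    # Python's % returns a non-negative result for a positive modulus,
    # so deletions (negative delta) are handled by the same test.
    return "NFS" if delta % 3 == 0 else "FS"

print(classify_coding_indel("A", "ACTG"))   # 3-nt insertion
print(classify_coding_indel("ACG", "A"))    # 2-nt deletion
```

The sketch ignores complications such as indels that span a splice site or introduce a stop codon in frame, which would need transcript-aware handling.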
L1-associated genomic regions are deleted in somatic cells of the healthy human brain.
Erwin, Jennifer A; Paquola, Apuã C M; Singer, Tatjana; Gallina, Iryna; Novotny, Mark; Quayle, Carolina; Bedrosian, Tracy A; Alves, Francisco I A; Butcher, Cheyenne R; Herdy, Joseph R; Sarkar, Anindita; Lasken, Roger S; Muotri, Alysson R; Gage, Fred H
2016-12-01
The healthy human brain is a mosaic of varied genomes. Long interspersed element-1 (LINE-1 or L1) retrotransposition is known to create mosaicism by inserting L1 sequences into new locations of somatic cell genomes. Using a machine learning-based, single-cell sequencing approach, we discovered that somatic L1-associated variants (SLAVs) are composed of two classes: L1 retrotransposition insertions and retrotransposition-independent L1-associated variants. We demonstrate that a subset of SLAVs comprises somatic deletions generated by L1 endonuclease cutting activity. Retrotransposition-independent rearrangements in inherited L1s resulted in the deletion of proximal genomic regions. These rearrangements were resolved by microhomology-mediated repair, which suggests that L1-associated genomic regions are hotspots for somatic copy number variants in the brain and therefore a heritable genetic contributor to somatic mosaicism. We demonstrate that SLAVs are present in crucial neural genes, such as DLG2 (also called PSD93), and affect 44-63% of the cells in the healthy brain.
Tsai, Chia-Ti; Hsieh, Chia-Shan; Chang, Sheng-Nan; Chuang, Eric Y.; Ueng, Kwo-Chang; Tsai, Chin-Feng; Lin, Tsung-Hsien; Wu, Cho-Kai; Lee, Jen-Kuang; Lin, Lian-Yu; Wang, Yi-Chih; Yu, Chih-Chieh; Lai, Ling-Ping; Tseng, Chuen-Den; Hwang, Juey-Jen; Chiang, Fu-Tien; Lin, Jiunn-Lee
2016-01-01
Atrial fibrillation (AF) is the most common sustained cardiac arrhythmia. Previous genome-wide association studies had identified single-nucleotide polymorphisms in several genomic regions to be associated with AF. In the human genome, copy number variations (CNVs) are known to contribute to disease susceptibility. Using a genome-wide multistage approach to identify AF susceptibility CNVs, we here show that a common 4,470-bp diallelic CNV in the first intron of potassium interacting channel 1 gene (KCNIP1) is strongly associated with AF in Taiwanese populations (odds ratio=2.27 for insertion allele; P=6.23 × 10^-24). KCNIP1 insertion is associated with higher KCNIP1 mRNA expression. KCNIP1-encoded protein potassium interacting channel 1 (KCHIP1) is physically associated with potassium Kv channels and modulates atrial transient outward current in cardiac myocytes. Overexpression of KCNIP1 results in inducible AF in zebrafish. In conclusion, a common CNV in the KCNIP1 gene is a genetic predictor of AF risk, possibly pointing to a functional pathway. PMID:26831368
Young, Robert S
2016-07-01
Frequent evolutionary birth and death events have created a large quantity of biologically important, lineage-specific DNA within mammalian genomes. The birth and death of DNA sequences is so frequent that the total number of these insertions and deletions in the human population remains unknown, although there are differences between these groups, e.g. transposable elements contribute predominantly to sequence insertion. Functional turnover - where the activity of a locus is specific to one lineage, but the underlying DNA remains conserved - can also drive birth and death. However, this does not appear to be a major driver of divergent transcriptional regulation. Both sequence and functional turnover have contributed to the birth and death of thousands of functional promoters in the human and mouse genomes. These findings reveal the pervasive nature of evolutionary birth and death and suggest that lineage-specific regions may play an important but previously underappreciated role in human biology and disease. © 2016 The Authors BioEssays Published by WILEY Periodicals, Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mei, Ya-Fang, E-mail: ya-fang.mei@umu.se
2016-10-15
Conventional adenovirus vectors harboring E1 or E3 deletions followed by the insertion of an exogenous gene show considerably reduced virion stability. Here, we report strategies to generate complete replication-competent Ad11p (RCAd11p) vectors that overcome the above disadvantage. A GFP cassette was successfully introduced either upstream of E1A or in the E3A region. The resulting vectors showed high expression levels of the hexon and E1 genes and also strongly induced the cytopathic effect in targeted cells. When harboring oversized genomes, the RCAd11pE1 and RCAd11pE3 vectors showed significantly improved heat stability in comparison to Ad11pwt; of the three, RCAd11pE3 was the most tolerant to heat treatment. Electron microscopy showed that RCAd11pE3, RCAd11pE1, Ad11pwt, and Ad11pE1 Del manifested dominant, moderate, minimum, or no full virus particles, respectively, after heat treatment at 47 °C for 5 h. Our results demonstrated that both genome size and the insertion site in the viral genome affect virion stability. -- Highlights: •Replicating adenovirus 11p GFP vectors at the E1 or E3 region were generated. •RCAd11pE3 and RCAd11pE1 vectors manifested significantly improved heat stability. •RCAd11pE3 and RCAd11pE1 showed more full viral particles than Ad11pwt after heating. •We demonstrated that both genome size and the insertion site affect virion stability.
Transposable elements contribute to activation of maize genes in response to abiotic stress.
Makarevitch, Irina; Waters, Amanda J; West, Patrick T; Stitzer, Michelle; Hirsch, Candice N; Ross-Ibarra, Jeffrey; Springer, Nathan M
2015-01-01
Transposable elements (TEs) account for a large portion of the genome in many eukaryotic species. Despite their reputation as "junk" DNA or genomic parasites deleterious for the host, TEs have complex interactions with host genes and the potential to contribute to regulatory variation in gene expression. It has been hypothesized that TEs and genes they insert near may be transcriptionally activated in response to stress conditions. The maize genome, with many different types of TEs interspersed with genes, provides an ideal system to study the genome-wide influence of TEs on gene regulation. To analyze the magnitude of the TE effect on gene expression response to environmental changes, we profiled gene and TE transcript levels in maize seedlings exposed to a number of abiotic stresses. Many genes exhibit up- or down-regulation in response to these stress conditions. The analysis of TE families inserted within upstream regions of up-regulated genes revealed that between four and nine different TE families are associated with up-regulated gene expression in each of these stress conditions, affecting up to 20% of the genes up-regulated in response to abiotic stress, and as many as 33% of genes that are only expressed in response to stress. Expression of many of these same TE families also responds to the same stress conditions. The analysis of the stress-induced transcripts and proximity of the transposon to the gene suggests that these TEs may provide local enhancer activities that stimulate stress-responsive gene expression. Our data on allelic variation for insertions of several of these TEs show strong correlation between the presence of TE insertions and stress-responsive up-regulation of gene expression. Our findings suggest that TEs provide an important source of allelic regulatory variation in gene response to abiotic stress in maize.
Genome Engineering in Bacillus anthracis Using Cre Recombinase
Pomerantsev, Andrei P.; Sitaraman, Ramakrishnan; Galloway, Craig R.; Kivovich, Violetta; Leppla, Stephen H.
2006-01-01
Genome engineering is a powerful method for the study of bacterial virulence. With the availability of the complete genomic sequence of Bacillus anthracis, it is now possible to inactivate or delete selected genes of interest. However, many current methods for disrupting or deleting more than one gene require use of multiple antibiotic resistance determinants. In this report we used an approach that temporarily inserts an antibiotic resistance marker into a selected region of the genome and subsequently removes it, leaving the target region (a single gene or a larger genomic segment) permanently mutated. For this purpose, a spectinomycin resistance cassette flanked by bacteriophage P1 loxP sites oriented as direct repeats was inserted within a selected gene. After identification of strains having the spectinomycin cassette inserted by a double-crossover event, a thermo-sensitive plasmid expressing Cre recombinase was introduced at the permissive temperature. Cre recombinase action at the loxP sites excised the spectinomycin marker, leaving a single loxP site within the targeted gene or genomic segment. The Cre-expressing plasmid was then removed by growth at the restrictive temperature. The procedure could then be repeated to mutate additional genes. In this way, we sequentially mutated two pairs of genes: pepM and spo0A, and mcrB and mrr. Furthermore, loxP sites introduced at distant genes could be recombined by Cre recombinase to cause deletion of large intervening regions. In this way, we deleted the capBCAD region of the pXO2 plasmid and the entire 30 kb of chromosomal DNA between the mcrB and mrr genes, and in the latter case we found that the 32 intervening open reading frames were not essential to growth. PMID:16369025
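The marker-removal step described above can be illustrated with a small string-level sketch. It is only a toy model, not the authors' protocol: it assumes the canonical 34-bp loxP sequence, treats the locus as a plain DNA string, and uses a hypothetical helper `cre_excise` that collapses everything between two directly repeated sites into a single remaining site.

```python
# Toy model of Cre-mediated excision between two loxP sites in direct orientation.
# The flanking and marker sequences below are made up for illustration.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # canonical 34-bp loxP site


def cre_excise(seq: str, site: str = LOXP) -> str:
    """Remove the segment between the first two direct-repeat sites,
    leaving a single copy of the site behind."""
    first = seq.find(site)
    if first == -1:
        return seq  # no site present, nothing to recombine
    second = seq.find(site, first + len(site))
    if second == -1:
        return seq  # only one site present, nothing to excise
    # Keep everything through the first site, then resume after the second site.
    return seq[: first + len(site)] + seq[second + len(site):]


# Hypothetical locus: target gene halves flanking a floxed resistance cassette.
left, marker, right = "GGGAAA", "TTTTCCCCTTTT", "AAAGGG"
locus = left + LOXP + marker + LOXP + right
excised = cre_excise(locus)
print(excised == left + LOXP + right)
```

Because a single site remains after excision, the gene stays interrupted while the resistance marker is gone, which is why the same marker can be reused for the next target.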
Maumus, Florian; Blanc, Guillaume
2016-12-14
The nucleocytoplasmic large DNA viruses (NCLDV) are a group of extremely complex double-stranded DNA viruses, which are major parasites of a variety of eukaryotes. Recent studies showed that certain unicellular eukaryotes contain fragments of NCLDV DNA integrated in their genome, although, surprisingly, many of these organisms were not previously shown to be infected by NCLDVs. These findings prompted us to search the genome of Acanthamoeba castellanii strain Neff (Neff), one of the most prolific hosts in the discovery of giant NCLDVs, for possible DNA inserts of viral origin. We report the identification of 267 markers of lateral gene transfer with viruses, approximately half of which are clustered in Neff genome regions of viral origin, are transcriptionally inactive, or exhibit nucleotide-composition signatures suggestive of a foreign origin. The integrated viral genes had diverse origins among relatives of viruses that infect Neff, including Mollivirus, Pandoravirus, Marseillevirus, Pithovirus, and Mimivirus. However, phylogenetic analysis suggests the existence of a yet-undiscovered family of amoeba-infecting NCLDV in addition to the five already characterized. The active transcription of some apparently anciently integrated virus-like genes suggests that some viral genes might have been domesticated during the amoeba evolution. These insights confirm that genomic insertion of NCLDV DNA is a common theme in eukaryotes. This gene flow contributed to fertilizing the eukaryotic gene repertoire and participated in the occurrence of orphan genes, a long-standing issue in genomics. Search for viral inserts in eukaryotic genomes followed by environmental screening of the original viruses should be used to isolate radically new NCLDVs. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Lo, Te-Wen; Pickle, Catherine S; Lin, Steven; Ralston, Edward J; Gurling, Mark; Schartner, Caitlin M; Bian, Qian; Doudna, Jennifer A; Meyer, Barbara J
2013-10-01
Exploitation of custom-designed nucleases to induce DNA double-strand breaks (DSBs) at genomic locations of choice has transformed our ability to edit genomes, regardless of their complexity. DSBs can trigger either error-prone repair pathways that induce random mutations at the break sites or precise homology-directed repair pathways that generate specific insertions or deletions guided by exogenously supplied DNA. Prior editing strategies using site-specific nucleases to modify the Caenorhabditis elegans genome achieved only the heritable disruption of endogenous loci through random mutagenesis by error-prone repair. Here we report highly effective strategies using TALE nucleases and RNA-guided CRISPR/Cas9 nucleases to induce error-prone repair and homology-directed repair to create heritable, precise insertion, deletion, or substitution of specific DNA sequences at targeted endogenous loci. Our robust strategies are effective across nematode species diverged by 300 million years, including necromenic nematodes (Pristionchus pacificus), male/female species (Caenorhabditis species 9), and hermaphroditic species (C. elegans). Thus, genome-editing tools now exist to transform nonmodel nematode species into genetically tractable model organisms. We demonstrate the utility of our broadly applicable genome-editing strategies by creating reagents generally useful to the nematode community and reagents specifically designed to explore the mechanism and evolution of X chromosome dosage compensation. By developing an efficient pipeline involving germline injection of nuclease mRNAs and single-stranded DNA templates, we engineered precise, heritable nucleotide changes both close to and far from DSBs to gain or lose genetic function, to tag proteins made from endogenous genes, and to excise entire loci through targeted FLP-FRT recombination.
Mitochondrial DNA transfer to the nucleus generates extensive insertion site variation in maize.
Lough, Ashley N; Roark, Leah M; Kato, Akio; Ream, Thomas S; Lamb, Jonathan C; Birchler, James A; Newton, Kathleen J
2008-01-01
Mitochondrial DNA (mtDNA) insertions into nuclear chromosomes have been documented in a number of eukaryotes. We used fluorescence in situ hybridization (FISH) to examine the variation of mtDNA insertions in maize. Twenty overlapping cosmids, representing the 570-kb maize mitochondrial genome, were individually labeled and hybridized to root tip metaphase chromosomes from the B73 inbred line. A minimum of 15 mtDNA insertion sites on nine chromosomes were detectable using this method. One site near the centromere on chromosome arm 9L was identified by a majority of the cosmids. To examine variation in nuclear mitochondrial DNA sequences (NUMTs), a mixture of labeled cosmids was applied to chromosome spreads of ten diverse inbred lines: A188, A632, B37, B73, BMS, KYS, Mo17, Oh43, W22, and W23. The number of detectable NUMTs varied dramatically among the lines. None of the tested inbred lines other than B73 showed the strong hybridization signal on 9L, suggesting that there is a recent mtDNA insertion at this site in B73. Different sources of B73 and W23 were examined for NUMT variation within inbred lines. Differences were detectable, suggesting that mtDNA is continuously being either incorporated into or lost from the maize nuclear genome. The results indicate that mtDNA insertions represent a major source of nuclear chromosomal variation.
de Souza, Flávio S.J.; Franchini, Lucía F.; Rubinstein, Marcelo
2013-01-01
Transposable elements (TEs) are mobile genetic sequences that can jump around the genome from one location to another, behaving as genomic parasites. TEs have been particularly effective in colonizing mammalian genomes, and such heavy TE load is expected to have conditioned genome evolution. Indeed, studies conducted both at the gene and genome levels have uncovered TE insertions that seem to have been co-opted (or exapted) by providing transcription factor binding sites (TFBSs) that serve as promoters and enhancers, leading to the hypothesis that TE exaptation is a major factor in the evolution of gene regulation. Here, we critically review the evidence for exaptation of TE-derived sequences as TFBSs, promoters, enhancers, and silencers/insulators both at the gene and genome levels. We classify the functional impact attributed to TE insertions into four categories of increasing complexity and argue that so far very few studies have conclusively demonstrated exaptation of TEs as transcriptional regulatory regions. We also contend that many genome-wide studies dealing with TE exaptation in recent lineages of mammals are still inconclusive and that the hypothesis of rapid transcriptional regulatory rewiring mediated by TE mobilization must be taken with caution. Finally, we suggest experimental approaches that may help attribute higher-order functions to candidate exapted TEs. PMID:23486611
Nieminen, Mikko; Tuuri, Timo; Savilahti, Harri
2010-10-01
Human embryonic stem cells are pluripotent cells derived from early human embryo and retain a potential to differentiate into all adult cell types. They provide vast opportunities in cell replacement therapies and are expected to become significant tools in drug discovery as well as in the studies of cellular and developmental functions of human genes. The progress in applying different types of DNA recombination reactions for genome modification in a variety of eukaryotic cell types has provided means to utilize recombination-based strategies also in human embryonic stem cells. Homologous recombination-based methods, particularly those utilizing extended homologous regions and those employing zinc finger nucleases to boost genomic integration, have shown their usefulness in efficient genome modification. Site-specific recombination systems are potent genome modifiers, and they can be used to integrate DNA into loci that contain an appropriate recombination signal sequence, either naturally occurring or suitably pre-engineered. Non-homologous recombination can be used to generate random integrations in genomes relatively effortlessly, albeit with a moderate efficiency and precision. DNA transposition-based strategies offer substantially more efficient random strategies and provide means to generate single-copy insertions, thus potentiating the generation of genome-wide insertion libraries applicable in genetic screens. 2010 Elsevier Inc. All rights reserved.
Diversity and structure of PIF/Harbinger-like elements in the genome of Medicago truncatula
Grzebelus, Dariusz; Lasota, Slawomir; Gambin, Tomasz; Kucherov, Gregory; Gambin, Anna
2007-01-01
Background Transposable elements constitute a significant fraction of plant genomes. The PIF/Harbinger superfamily includes DNA transposons (class II elements) carrying terminal inverted repeats and producing a 3 bp target site duplication upon insertion. The presence of an ORF coding for the DDE/DDD transposase, required for transposition, is characteristic of the autonomous PIF/Harbinger-like elements. Based on the above features, PIF/Harbinger-like elements were identified in several plant genomes and divided into several evolutionary lineages. Availability of a significant portion of Medicago truncatula genomic sequence allowed for mining PIF/Harbinger-like elements, starting from a single previously described element MtMaster. Results Twenty-two putative autonomous elements, i.e. carrying an ORF coding for TPase and complete terminal inverted repeats, and 67 non-autonomous PIF/Harbinger-like elements were found in the genome of M. truncatula. They were divided into five families, MtPH-A5, MtPH-A6, MtPH-D, MtPH-E, and MtPH-M, corresponding to three previously identified and two new lineages. The largest families, MtPH-A6 and MtPH-M, were further divided into four and three subfamilies, respectively. Non-autonomous elements were usually direct deletion derivatives of the putative autonomous element; however, other types of rearrangements, including inversions and nested insertions, were also observed. An interesting structural characteristic, the presence of 60 bp tandem repeats, was observed in a group of elements of subfamily MtPH-A6-4. Some families could be related to miniature inverted repeat elements (MITEs). The presence of empty loci (RESites), paralogous to those flanking the identified transposable elements, both autonomous and non-autonomous, as well as the presence of transposon insertion related size polymorphisms, confirmed that some of the mined elements were capable of transposition.
Conclusion The population of PIF/Harbinger-like elements in the genome of M. truncatula is diverse. A detailed intra-family comparison of the elements' structure proved that they proliferated in the genome generally following the model of abortive gap repair. However, the presence of tandem repeats facilitated more pronounced rearrangements of the element internal regions. The insertion polymorphism of the MtPH elements and related MITE families in different populations of M. truncatula, if further confirmed experimentally, could be used as a source of molecular markers complementary to other marker systems. PMID:17996080
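The two structural hallmarks named in the background (terminal inverted repeats and a 3 bp target site duplication) lend themselves to a simple check. The sketch below is illustrative only and is not the mining pipeline used in the study: the function names, the perfect-match requirement, and the toy sequences are all assumptions.

```python
# Minimal check for the two class II hallmarks: a target site duplication (TSD)
# flanking the element and terminal inverted repeats (TIRs) at its ends.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_tsd(genome: str, start: int, end: int, tsd_len: int = 3) -> bool:
    """True if the tsd_len bases before `start` equal those after `end`
    (element occupies genome[start:end])."""
    if start < tsd_len or end + tsd_len > len(genome):
        return False
    return genome[start - tsd_len:start] == genome[end:end + tsd_len]


def tir_length(element: str, max_len: int = 50) -> int:
    """Length of the perfect TIR: how far the 5' end matches the
    reverse complement of the 3' end."""
    rc = revcomp(element)
    limit = min(max_len, len(element) // 2)
    n = 0
    while n < limit and element[n] == rc[n]:
        n += 1
    return n


# Hypothetical element: 7-bp TIRs around an arbitrary internal region.
tir = "GGCCAGT"
element = tir + "AAAACCCC" + revcomp(tir)
genome = "CCGG" + "TAA" + element + "TAA" + "GGTT"
start = 7
end = start + len(element)
print(has_tsd(genome, start, end), tir_length(element))
```

Real elements usually carry imperfect TIRs, so a practical scan would tolerate mismatches; the perfect-match version is kept here for brevity.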
Repeat-associated plasticity in the Helicobacter pylori RD Gene Family
USDA-ARS's Scientific Manuscript database
Repetitive DNA facilitates genomic flexibility via increased recombination, deletion, and insertion. The bacterium Helicobacter pylori is remarkable for its ability to persist in the human stomach for decades without provoking sterilizing immunity. Examining the genomes of two H. pylori strains, we d...
Ramp, Kristina; Skiba, Martin; Karger, Axel; Mettenleiter, Thomas C; Römer-Oberdörfer, Angela
2011-02-01
Members of the order Mononegavirales express their genes in a transcription gradient from 3' to 5'. To assess how this impacts on expression of a foreign transgene, the haemagglutinin (HA) of highly pathogenic avian influenza virus (HPAIV) A/chicken/Vietnam/P41/05 (subtype H5N1) was inserted between the phosphoprotein (P) and matrix protein (M), M and fusion protein (F), or F and haemagglutinin-neuraminidase protein (HN) genes of attenuated Newcastle disease virus (NDV) Clone 30. In addition, the gene encoding the neuraminidase of HPAIV A/duck/Vietnam/TG24-01/05 (subtype H5N1) was inserted into the NDV genome either alone or in combination with the HA gene. All recombinants replicated well in embryonated chicken eggs. The expression levels of HA-specific mRNA and protein were quantified by Northern blot analysis and mass spectrometry, with good correlation. HA expression levels differed only moderately and were highest in the recombinant carrying the HA insertion between the F and HN genes of NDV.
Shao, Wei; Shan, Jigui; Kearney, Mary F; Wu, Xiaolin; Maldarelli, Frank; Mellors, John W; Luke, Brian; Coffin, John M; Hughes, Stephen H
2016-07-04
The NCI Retrovirus Integration Database is a MySQL-based relational database created for storing and retrieving comprehensive information about retroviral integration sites, primarily, but not exclusively, HIV-1. The database is accessible to the public for submission or extraction of data originating from experiments aimed at collecting information related to retroviral integration sites including: the site of integration into the host genome, the virus family and subtype, the origin of the sample, gene exons/introns associated with integration, and proviral orientation. Information about the references from which the data were collected is also stored in the database. Tools are built into the website that can be used to map the integration sites to the UCSC Genome Browser, to plot the integration site patterns on a chromosome, and to display provirus LTRs in their inserted genome sequence. The website is robust, user friendly, and allows users to query the database and analyze the data dynamically. Available at: https://rid.ncifcrf.gov or http://home.ncifcrf.gov/hivdrp/resources.htm
Genomic characterization of two large Alu-mediated rearrangements of the BRCA1 gene.
Peixoto, Ana; Pinheiro, Manuela; Massena, Lígia; Santos, Catarina; Pinto, Pedro; Rocha, Patrícia; Pinto, Carla; Teixeira, Manuel R
2013-02-01
To determine whether a large genomic rearrangement is actually novel and to gain insight into the mutational mechanism responsible for its occurrence, molecular characterization with breakpoint identification is mandatory. We here report the characterization of two large deletions involving the BRCA1 gene. The first rearrangement harbored an 89,664-bp deletion comprising exon 7 of the BRCA1 gene to exon 11 of the NBR1 gene (c.441+1724_oNBR1:c.1073+480del). Two highly homologous Alu elements were found in the genomic sequences flanking the deletion breakpoints. Furthermore, a 20-bp overlapping sequence at the breakpoint junction was observed, suggesting that the most likely mechanism for the occurrence of this rearrangement was nonallelic homologous recombination. The second rearrangement fully characterized at the nucleotide level was a BRCA1 exons 11-15 deletion (c.671-319_4677-578delinsAlu). The case harbored a 23,363-bp deletion with an Alu element inserted at the breakpoints of the deleted region. As the inserted Alu element belongs to a still active AluY family, the observed rearrangement could be due to an insertion-mediated deletion mechanism caused by Alu retrotransposition. To conclude, we describe the breakpoints of two novel large deletions involving the BRCA1 gene, and analysis of their genomic context allowed us to gain insight into the respective mutational mechanisms.
Wang, Guodong; Ellendorff, Ursula; Kemp, Ben; Mansfield, John W.; Forsyth, Alec; Mitchell, Kathy; Bastas, Kubilay; Liu, Chun-Ming; Woods-Tör, Alison; Zipfel, Cyril; de Wit, Pierre J.G.M.; Jones, Jonathan D.G.; Tör, Mahmut; Thomma, Bart P.H.J.
2008-01-01
Receptor-like proteins (RLPs) are cell surface receptors that typically consist of an extracellular leucine-rich repeat domain, a transmembrane domain, and a short cytoplasmatic tail. In several plant species, RLPs have been found to play a role in disease resistance, such as the tomato (Solanum lycopersicum) Cf and Ve proteins and the apple (Malus domestica) HcrVf2 protein that mediate resistance against the fungal pathogens Cladosporium fulvum, Verticillium spp., and Venturia inaequalis, respectively. In addition, RLPs play a role in plant development; Arabidopsis (Arabidopsis thaliana) TOO MANY MOUTHS (TMM) regulates stomatal distribution, while Arabidopsis CLAVATA2 (CLV2) and its functional maize (Zea mays) ortholog FASCINATED EAR2 regulate meristem maintenance. In total, 57 RLP genes have been identified in the Arabidopsis genome and a genome-wide collection of T-DNA insertion lines was assembled. This collection was functionally analyzed with respect to plant growth and development and sensitivity to various stress responses, including susceptibility toward pathogens. A number of novel developmental phenotypes were revealed for our CLV2 and TMM insertion mutants. In addition, one AtRLP gene was found to mediate abscisic acid sensitivity and another AtRLP gene was found to influence nonhost resistance toward Pseudomonas syringae pv phaseolicola. This genome-wide collection of Arabidopsis RLP gene T-DNA insertion mutants provides a tool for future investigations into the biological roles of RLPs. PMID:18434605
Kuhn, Alexandre; Ong, Yao Min; Quake, Stephen R; Burkholder, William F
2015-07-08
Like other structural variants, transposable element insertions can be highly polymorphic across individuals. Their functional impact, however, remains poorly understood. Current genome-wide approaches for genotyping insertion-site polymorphisms based on targeted or whole-genome sequencing remain very expensive and can lack accuracy, hence new large-scale genotyping methods are needed. We describe a high-throughput method for genotyping transposable element insertions and other types of structural variants that can be assayed by breakpoint PCR. The method relies on next-generation sequencing of multiplex, site-specific PCR amplification products and read count-based genotype calls. We show that this method is flexible, efficient (it does not require rounds of optimization), cost-effective and highly accurate. This method can benefit a wide range of applications from the routine genotyping of animal and plant populations to the functional study of structural variants in humans.
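A read count-based genotype call of the kind mentioned above can be sketched as follows. This is not the authors' published procedure: it assumes each site is assayed with one amplicon spanning the insertion breakpoint and one spanning the empty site, and the depth and allele-fraction thresholds are hypothetical values chosen for illustration.

```python
# Sketch of a read count-based genotype call for one insertion site.
# All thresholds are illustrative assumptions, not values from the paper.
def call_genotype(
    ins_reads: int,
    empty_reads: int,
    min_depth: int = 20,
    low: float = 0.2,
    high: float = 0.8,
) -> str:
    """Classify a site from reads supporting the insertion breakpoint
    versus reads supporting the empty (uninserted) site."""
    total = ins_reads + empty_reads
    if total < min_depth:
        return "no_call"  # too few reads to call reliably
    frac = ins_reads / total  # fraction of reads supporting the insertion
    if frac <= low:
        return "hom_absent"
    if frac >= high:
        return "hom_present"
    return "het"


for counts in [(0, 100), (95, 5), (48, 52), (3, 4)]:
    print(counts, call_genotype(*counts))
```

In practice the two amplicons may amplify with different efficiencies, so thresholds would likely need calibration per site against samples of known genotype.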
Insertion sequences enrichment in extreme Red sea brine pool vent.
Elbehery, Ali H A; Aziz, Ramy K; Siam, Rania
2017-03-01
Mobile genetic elements are major agents of genome diversification and evolution. Few studies have addressed their characteristics, including abundance, and role in extreme habitats. The Red Sea brine pools are among the rare natural habitats exposed to multiple extreme conditions, including high temperature, salinity and concentration of heavy metals. We assessed the abundance and distribution of different mobile genetic elements in four Red Sea brine pools including the world's largest known multiple-extreme deep-sea environment, the Red Sea Atlantis II Deep. We report a gradient in the abundance of mobile genetic elements, dramatically increasing in the harshest environment of the pool. Additionally, we identified a strong association between the abundance of insertion sequences and extreme conditions, being highest in the harshest and deepest layer of the Red Sea Atlantis II Deep. Our comparative analyses of mobile genetic elements in secluded, extreme and relatively non-extreme environments suggest that insertion sequences predominantly contribute to polyextremophile genome plasticity.
Genome editing for crop improvement: Challenges and opportunities
Abdallah, Naglaa A; Prakash, Channapatna S; McHughen, Alan G
2015-01-01
Genome or gene editing includes several new techniques to help scientists precisely modify genome sequences. The techniques also enable us to alter the regulation of gene expression patterns in a pre-determined region and facilitate novel insights into the functional genomics of an organism. The emergence of genome editing has brought considerable excitement, especially among agricultural scientists, because of its simplicity, precision and power, as it offers new opportunities to develop improved crop varieties with clear-cut addition of valuable traits or removal of undesirable traits. Research is underway to improve crop varieties with higher yields, strengthen stress tolerance, disease and pest resistance, decrease input costs, and increase nutritional value. Genome editing encompasses a wide variety of tools using either a site-specific recombinase (SSR) or a site-specific nuclease (SSN) system. Both systems require recognition of a known sequence. The SSN system generates single or double strand DNA breaks and activates endogenous DNA repair pathways. SSR technologies, such as the Cre/loxP and Flp/FRT mediated systems, are able to knock down or knock in genes in the genome of eukaryotes, depending on the orientation of the specific sites (loxP, FRT, etc.) flanking the target site. There are 4 main classes of SSN developed to cleave genomic sequences: mega-nucleases (homing endonucleases), zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and the CRISPR/Cas nuclease system (clustered regularly interspaced short palindromic repeat/CRISPR-associated protein). Recombinase-mediated genome engineering depends on the recombinase (sub-)family and target site and induces high frequencies of homologous recombination.
Improving crops with gene editing provides a range of options: altering only a few nucleotides from the billions found in the genomes of living cells, altering the full allele, or inserting a new gene in a targeted region of the genome. Gene editing is more precise than either conventional crop breeding methods or standard genetic engineering methods. Thus this technology is a very powerful tool that can be used toward securing the world's food supply. In addition to improving the nutritional value of crops, it is the most effective way to produce crops that can resist pests and thrive in tough climates. There are 3 types of modifications produced by genome editing: Type I includes altering a few nucleotides, Type II involves replacing an allele with a pre-existing one, and Type III allows for the insertion of new gene(s) in predetermined regions in the genome. Because most genome-editing techniques can leave behind traces of DNA alterations evident in a small number of nucleotides, crops created through gene editing could avoid the stringent regulation procedures commonly associated with GM crop development. For this reason many scientists believe plants improved with the more precise gene editing techniques will be more acceptable to the public than transgenic plants. With genome editing comes the promise of new crops being developed more rapidly with a very low risk of off-target effects. It can be performed in any laboratory with any crop, even those that have complex genomes and are not easily bred using conventional methods. PMID:26930114
Bentley, Stephen D.; Corton, Craig; Brown, Susan E.; Barron, Andrew; Clark, Louise; Doggett, Jon; Harris, Barbara; Ormond, Doug; Quail, Michael A.; May, Georgiana; Francis, David; Knudson, Dennis; Parkhill, Julian; Ishimaru, Carol A.
2008-01-01
Clavibacter michiganensis subsp. sepedonicus is a plant-pathogenic bacterium and the causative agent of bacterial ring rot, a devastating agricultural disease under strict quarantine control and zero tolerance in the seed potato industry. This organism appears to be largely restricted to an endophytic lifestyle, proliferating within plant tissues and unable to persist in the absence of plant material. Analysis of the genome sequence of C. michiganensis subsp. sepedonicus and comparison with the genome sequences of related plant pathogens revealed a dramatic recent evolutionary history. The genome contains 106 insertion sequence elements, which appear to have been active in extensive rearrangement of the chromosome compared to that of Clavibacter michiganensis subsp. michiganensis. There are 110 pseudogenes with overrepresentation in functions associated with carbohydrate metabolism, transcriptional regulation, and pathogenicity. Genome comparisons also indicated that there is substantial gene content diversity within the species, probably due to differential gene acquisition and loss. These genomic features and evolutionary dating suggest that there was recent adaptation for life in a restricted niche where nutrient diversity and perhaps competition are low, correlated with a reduced ability to exploit previously occupied complex niches outside the plant. Toleration of factors such as multiplication and integration of insertion sequence elements, genome rearrangements, and functional disruption of many genes and operons seems to indicate that there has been general relaxation of selective pressure on a large proportion of the genome. PMID:18192393
Cui, Yubao; Yu, Lili
2016-12-01
The clustered regularly-interspaced short palindromic repeats (CRISPR) structural family functions as an acquired immune system in prokaryotes. Gene editing techniques have co-opted CRISPR and the associated Cas nucleases to allow for the precise genetic modification of human cells, zebrafish, mice, and other eukaryotes. Indeed, this approach has been used to induce a variety of modifications including directed insertion/deletion (InDel) of bases, gene knock-in, introduction of mutations in both alleles of a target gene, and deletion of small DNA fragments. Thus, CRISPR technology offers a precise molecular tool for directed genome modification with a range of potential applications; further, its high mutation efficiency, simple process, and low cost provide additional advantages over prior editing techniques. This paper will provide an overview of the basic structure and function of the CRISPR gene editing system as well as current and potential applications to research on parasites. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Aboklaish, Ali F.; Dordet-Frisoni, Emilie; Citti, Christine; Toleman, Mark A; Glass, John I.; Spiller, O. Brad
2015-01-01
While transposon mutagenesis has been successfully used for Mycoplasma spp. to disrupt and determine non-essential genes, previous attempts with Ureaplasma spp. have been unsuccessful. Using a polyethylene glycol-transformation enhancing protocol, we were able to transform three separate serovars of Ureaplasma parvum with a Tn4001-based mini-transposon plasmid containing a gentamicin resistance selection marker. Despite the large degree of homology between Ureaplasma parvum and Ureaplasma urealyticum, all attempts to transform the latter in parallel failed, with the exception of a single clinical U. urealyticum isolate. PCR probing and sequencing were used to confirm transposon insertion into the bacterial genome and identify disrupted genes. Transformation of prototype serovar 3 consistently resulted in transfer only of sequence between the mini-transposon inverted repeats, but some strains showed additional sequence transfer. Transposon insertion occurred randomly in the genome resulting in unique disruption of genes UU047, UU390, UU440, UU450, UU520, UU526, UU582 for single clones from a panel of screened clones. An intergenic insertion between genes UU187 and UU188 was also characterised. Two phenotypic alterations were observed in the mutated strains: Disruption of a DEAD-box RNA helicase (UU582) altered growth kinetics, while the U. urealyticum strain lost resistance to serum attack coincident with disruption of gene UUR10_137 and loss of expression of a 41 kDa protein. Transposon mutagenesis was used successfully to insert single copies of a mini-transposon into the genome and disrupt genes leading to phenotypic changes in Ureaplasma parvum strains. This method can now be used to deliver exogenous genes for expression and determine essential genes for Ureaplasma parvum replication in culture and experimental models. PMID:25444567
Yoshida, Asuka; Samal, Siba K.
2017-01-01
Avian paramyxovirus serotype 3 (APMV-3) causes infection in a wide variety of avian species, but it does not cause apparent disease in chickens. In contrast, APMV-1, also known as Newcastle disease virus (NDV), can cause severe disease in chickens. Currently, natural low virulence strains of NDV are used as live-attenuated vaccines throughout the world. NDV is also being evaluated as a vaccine vector against poultry pathogens. However, due to routine vaccination programs, chickens often possess pre-existing antibodies against NDV, which may cause the chickens to be less sensitive to recombinant NDV vaccines expressing antigens of other avian pathogens. Therefore, it may be possible for an APMV-3 vector vaccine to circumvent this issue. In this study, we determined the optimal insertion site in the genome of APMV-3 for high level expression of a foreign gene. We generated recombinant APMV-3 viruses expressing the green fluorescent protein (GFP) by inserting the GFP gene at five different intergenic regions in the genome. The levels of GFP transcription and translation were evaluated. Interestingly, the levels of GFP transcription and translation did not follow the 3′-to-5′ attenuation mechanism of non-segmented, negative-sense RNA viruses. Insertion of the GFP gene into the P-M gene junction resulted in a higher level of GFP expression than insertion into the upstream N-P gene junction. Unlike in NDV, insertion of GFP did not attenuate the growth efficiency of APMV-3. Thus, APMV-3 could be a more useful vaccine vector for avian pathogens than NDV. PMID:28473820
Kelleher, Erin S; Barbash, Daniel A
2013-08-01
The Piwi-interacting RNA (piRNA) pathway defends animal genomes against the harmful consequences of transposable element (TE) infection by imposing small-RNA-mediated silencing. Because silencing is targeted by TE-derived piRNAs, piRNA production is posited to be central to the evolution of genome defense. We harnessed genomic data sets from Drosophila melanogaster, including genome-wide measures of piRNA, mRNA, and genomic abundance, along with estimates of age structure and risk of ectopic recombination, to address fundamental questions about the functional and evolutionary relationships between TE families and their regulatory piRNAs. We demonstrate that mRNA transcript abundance, robustness of "ping-pong" amplification, and representation in piRNA clusters together explain the majority of variation in piRNA abundance between TE families, providing the first robust statistical support for the prevailing model of piRNA biogenesis. Intriguingly, we also discover that the most transpositionally active TE families, with the greatest capacity to induce harmful mutations or disrupt gametogenesis, are not necessarily the most abundant among piRNAs. Rather, the level of piRNA targeting is largely independent of recent transposition rate for active TE families, but is rapidly lost for inactive TEs. These observations are consistent with population genetic theory that suggests a limited selective advantage for host repression of transposition. Additionally, we find no evidence that piRNA targeting responds to selection against a second major cost of TE infection: ectopic recombination between TE insertions. Our observations confirm the pivotal role of piRNA-mediated silencing in defending the genome against selfish transposition, yet also suggest limits to the optimization of host genome defense.
Issa, Mohammad Nouh; Ashhab, Yaqoub
2016-09-22
Brucella melitensis Rev.1 is an avirulent strain that is widely used as a live vaccine to control brucellosis in small ruminants. Although an assembled draft version of Rev.1 genome has been available since 2009, this genome has not been investigated to characterize this important vaccine. In the present work, we used the draft genome of Rev.1 to perform a thorough genomic comparison and sequence analysis to identify and characterize the panel of its unique genetic markers. The draft genome of Rev.1 was compared with genome sequences of 36 different Brucella melitensis strains from the Brucella project of the Broad Institute of MIT and Harvard. The comparative analyses revealed 32 genetic alterations (30 SNPs, 1 single-bp insertion and 1 single-bp deletion) that are exclusively present in the Rev.1 genome. In silico analyses showed that 9 out of the 17 non-synonymous mutations are deleterious. Three ABC transporters are among the disrupted genes that can be linked to virulence attenuation. Out of the 32 mutations, 11 Rev.1 specific markers were selected to test their potential to discriminate Rev.1 using a bi-directional allele-specific PCR assay. Six markers were able to distinguish between Rev.1 and a set of control strains. We succeeded in identifying a panel of 32 genome-specific markers of the B. melitensis Rev.1 vaccine strain. Extensive in silico analysis showed that a considerable number of these mutations could severely affect the function of the associated genes. In addition, some of the discovered markers were able to discriminate Rev.1 strain from a group of control strains using practical PCR tests that can be applied in resource-limited settings. Copyright © 2016 Elsevier Ltd. All rights reserved.
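The core of the comparison described above, keeping only alterations found in the vaccine strain and in none of the comparison strains, can be sketched as a set difference. This is a minimal illustration under stated assumptions, not the authors' pipeline: the variant tuples, strain names, and the function `exclusive_variants` are hypothetical, and variants are assumed to have already been called against a common reference.

```python
from typing import Dict, Set, Tuple

# A variant is (contig, position, reference allele, alternative allele).
Variant = Tuple[str, int, str, str]

def exclusive_variants(target: Set[Variant],
                       others: Dict[str, Set[Variant]]) -> Set[Variant]:
    """Return variants present in the target strain and absent from every other strain."""
    seen_elsewhere: Set[Variant] = set().union(*others.values())
    return target - seen_elsewhere

# Hypothetical toy data, not taken from the study.
rev1 = {("chrI", 1200, "A", "G"), ("chrI", 5400, "C", "T"), ("chrII", 310, "G", "GA")}
controls = {
    "strain_16M": {("chrI", 5400, "C", "T")},
    "strain_X": {("chrII", 999, "T", "C")},
}

markers = exclusive_variants(rev1, controls)
print(sorted(markers))
```

Candidate markers found this way would still need experimental confirmation, as the abstract's allele-specific PCR step indicates (only six of eleven tested markers discriminated Rev.1).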
Mobile element biology – new possibilities with high-throughput sequencing
Xing, Jinchuan; Witherspoon, David J.; Jorde, Lynn B.
2014-01-01
Mobile elements compose more than half of the human genome, but until recently their large-scale detection was time-consuming and challenging. With the development of new high-throughput sequencing technologies, the complete spectrum of mobile element variation in humans can now be identified and analyzed. Thousands of new mobile element insertions have been discovered, yielding new insights into mobile element biology, evolution, and genomic variation. We review several high-throughput methods, with an emphasis on techniques that specifically target mobile element insertions in humans, and we highlight recent applications of these methods in evolutionary studies and in the analysis of somatic alterations in human cancers. PMID:23312846
Lee, Hae-Lim; Jansen, Robert K; Chumley, Timothy W; Kim, Ki-Joong
2007-05-01
The chloroplast (cp) DNA sequence of Jasminum nudiflorum (Oleaceae-Jasmineae) is completed and compared with the large single-copy region sequences from 6 related species. The cp genomes of the tribe Jasmineae (Jasminum and Menodora) show several distinctive rearrangements, including inversions, gene duplications, insertions, inverted repeat expansions, and gene and intron losses. The ycf4-psaI region in Jasminum section Primulina was relocated as a result of 2 overlapping inversions of 21,169 and 18,414 bp. The 1st, larger inversion is shared by all members of the Jasmineae indicating that it occurred in the common ancestor of the tribe. Similar rearrangements were also identified in the cp genome of Menodora. In this case, 2 fragments including ycf4 and rps4-trnS-ycf3 genes were moved by 2 additional inversions of 14 and 59 kb that are unique to Menodora. Other rearrangements in the Oleaceae are confined to certain regions of the Jasminum and Menodora cp genomes, including the presence of highly repeated sequences and duplications of coding and noncoding sequences that are inserted into clpP and between rbcL and psaI. These insertions are correlated with the loss of 2 introns in clpP and a serial loss of segments of accD. The loss of the accD gene and clpP introns in both the monocot family Poaceae and the eudicot family Oleaceae are clearly independent evolutionary events. However, their genome organization is surprisingly similar despite the distant relationship of these 2 angiosperm families.
Ambroset, Chloé; Coluzzi, Charles; Guédon, Gérard; Devignes, Marie-Dominique; Loux, Valentin; Lacroix, Thomas; Payot, Sophie; Leblond-Bourget, Nathalie
2016-01-01
Recent genome analyses suggest that integrative and conjugative elements (ICEs) are widespread in bacterial genomes and therefore play an essential role in horizontal transfer. However, only a few of these elements are precisely characterized and correctly delineated within sequenced bacterial genomes. Even though previous analysis showed the presence of ICEs in some species of Streptococci, the global prevalence and diversity of ICEs was not analyzed in this genus. In this study, we searched for ICEs in the completely sequenced genomes of 124 strains belonging to 27 streptococcal species. These exhaustive analyses revealed 105 putative ICEs and 26 slightly decayed elements whose limits were assessed and whose insertion site was identified. These ICEs were grouped in seven distinct unrelated or distantly related families, according to their conjugation modules. Integration of these streptococcal ICEs is catalyzed either by a site-specific tyrosine integrase, a low-specificity tyrosine integrase, a site-specific single serine integrase, a triplet of site-specific serine integrases or a DDE transposase. Analysis of their integration site led to the detection of 18 target-genes for streptococcal ICE insertion including eight that had not been identified previously (ftsK, guaA, lysS, mutT, rpmG, rpsI, traG, and ebfC). It also suggests that all specificities have evolved to minimize the impact of the insertion on the host. This overall analysis of streptococcal ICEs emphasizes their prevalence and diversity and demonstrates that exchanges or acquisitions of conjugation and recombination modules are frequent. PMID:26779141
Geurts, Aron M; Collier, Lara S; Geurts, Jennifer L; Oseth, Leann L; Bell, Matthew L; Mu, David; Lucito, Robert; Godbout, Susan A; Green, Laura E; Lowe, Scott W; Hirsch, Betsy A; Leinwand, Leslie A; Largaespada, David A
2006-01-01
Previous studies of the Sleeping Beauty (SB) transposon system, as an insertional mutagen in the germline of mice, have used reverse genetic approaches. These studies have led to its proposed use for regional saturation mutagenesis by taking a forward-genetic approach. Thus, we used the SB system to mutate a region of mouse Chromosome 11 in a forward-genetic screen for recessive lethal and viable phenotypes. This work represents the first reported use of an insertional mutagen in a phenotype-driven approach. The phenotype-driven approach was successful in both recovering visible and behavioral mutants, including dominant limb and recessive behavioral phenotypes, and allowing for the rapid identification of candidate gene disruptions. In addition, a high frequency of recessive lethal mutations arose as a result of genomic rearrangements near the site of transposition, resulting from transposon mobilization. The results suggest that the SB system could be used in a forward-genetic approach to recover interesting phenotypes, but that local chromosomal rearrangements should be anticipated in conjunction with single-copy, local transposon insertions in chromosomes. Additionally, these mice may serve as a model for chromosome rearrangements caused by transposable elements during the evolution of vertebrate genomes. PMID:17009875
USDA-ARS's Scientific Manuscript database
Miniature inverted-repeat transposable elements (MITEs) are non-autonomous transposons (devoid of a transposase gene, tps) that mediate insertion/deletion of genomic DNA in bacterial genomes and thereby influence gene function. No transposon has yet been reported in “Candidatus Liberibacter asiaticus”, an alpha-pr...
Survey of microsatellite DNA in pine
Craig S. Echt; P. May-Marquardt
1997-01-01
A large insert genomic library from eastern white pine (Pinus strobus) was probed for the microsatellite motifs (AC)n and (AG)n, all 10 trinucleotide motifs, and 22 of the 33 possible tetranucleotide motifs. For comparison with a species from a different subgenus, a loblolly pine (Pinus taeda) genomic...
Metcalfe, Cushla J.; Casane, Didier
2013-01-01
Very large genomes, that is, those above 20 Gb, are rare but widely distributed throughout the eukaryotes. They are found within the diatoms, dinoflagellates, metazoans and green plants, but so far have not been found in the excavates. There is a known positive correlation between genome size and the proportion of the genome composed of transposable elements (TEs). Very large genomes may therefore be expected to be almost entirely composed of TEs. Of the large genomes examined, in the angiosperms, gymnosperms and the dinoflagellates only a small portion of the genome was identified as TEs; most of each genome was unidentified and may consist of novel or diverse TEs. In the salamanders and lungfish, 25 to 47% of the genome was identifiable as retrotransposons, that is, TEs that copy themselves before insertion. However, the predominant class of TEs found in the lungfish was not the same as that found in the salamanders. The limited data currently available therefore suggest that the diversity and abundance of TEs is variable between taxa with large genomes, similar to patterns found in taxa with smaller genomes. Based on results from the human genome, we suggest that the ‘missing’ portion of the lungfish and salamander genomes consists of old, highly divergent, and therefore inactive copies of TEs. The data available indicate that, unlike plants with large genomes, neither the lungfish nor the salamanders show an increased risk of extinction. Based on a slow rate of DNA loss in salamanders it has been suggested that the large salamander genome is the result of run-away genome expansion involving genome size increases via TE proliferation associated with reduced recombination rate. We know of no studies on DNA loss or recombination rates in lungfish genomes; however, a similar scenario could describe the process of genome expansion in the lungfish. A series of waves of TE transposition and sequence decay would describe the pattern of TE content seen in both the lungfish and the salamanders. The lungfish and salamanders, therefore, may accommodate their large load of TEs because these TEs have accumulated gradually over a long period of time and have been subject to inactivation and decay. PMID:24616835
Cerveau, Nicolas; Leclercq, Sébastien; Leroy, Elodie; Bouchon, Didier; Cordaux, Richard
2011-01-01
Transposable elements (TE) are one of the major driving forces of genome evolution, raising the question of the long-term dynamics underlying their evolutionary success. Long-term TE evolution can readily be reconstructed in eukaryotes, thanks to many degraded copies constituting genomic fossil records of past TE proliferations. By contrast, bacterial genomes usually experience high sequence turnover and short TE retention times, thereby obscuring ancient TE evolutionary patterns. We found that Wolbachia bacterial genomes contain 52–171 insertion sequence (IS) TEs. IS account for 11% of Wolbachia wRi, which is one of the highest IS genomic coverage reported in prokaryotes to date. We show that many IS groups are currently expanding in various Wolbachia genomes and that IS horizontal transfers are frequent among strains, which can explain the apparent synchronicity of these IS proliferations. Remarkably, >70% of Wolbachia IS are nonfunctional. They constitute an unusual bacterial IS genomic fossil record providing direct empirical evidence for a long-term IS evolutionary dynamics following successive periods of intense transpositional activity. Our results show that comprehensive IS annotations have the potential to provide new insights into prokaryote TE evolution and, more generally, prokaryote genome evolution. Indeed, the identification of an important IS genomic fossil record in Wolbachia demonstrates that IS elements are not always of recent origin, contrary to the conventional view of TE evolution in prokaryote genomes. Our results also raise the question whether the abundance of IS fossils is specific to Wolbachia or it may be a general, albeit overlooked, feature of prokaryote genomes. PMID:21940637
2013-01-01
Background: Persistent infection with Penaeus stylirostris densovirus (PstDNV) (also called IHHNV) and its non-infectious inserts in the genome of the black tiger shrimp, Penaeus monodon (P. monodon), are commonly found without apparent disease. Here, we introduced a multiplex PCR method to differentiate shrimp with viral inserts from those with the infectious virus. The method allowed us to study the effect of pre-infection with IHHNV, in comparison to IHHNV inserts, on WSSV resistance in P. monodon. Results: A multiplex PCR system was developed to amplify the entire IHHNV genome, ensuring accurate diagnosis. Field samples containing IHHNV DNA templates as low as 20 pg, or the equivalent of 150 viral copies, can be detected by this method. By challenging the two groups of diagnosed shrimp with WSSV, we found that shrimp with IHHNV infection and those with viral inserts responded to WSSV differently. Considering cumulative mortality, the average time to death of shrimp in the IHHNV-infected group (day 14) was significantly delayed relative to that (day 10) of the IHHNV-inserted group. Real-time PCR analysis of WSSV copy number indicated a lower amount of WSSV in the IHHNV-infected group than in the virus-inserted group. The ratio of IHHNV:WSSV copy number in all determined IHHNV-infected samples ranged from approximately 4 to 300-fold. Conclusion: The multiplex PCR assay developed herein proved optimal for convenient differentiation of shrimp specimens with real IHHNV infection from those with insert types. Diagnosed shrimp were also found to exhibit different WSSV tolerance. After exposure to WSSV, the naturally IHHNV-pre-infected P. monodon were less susceptible to WSSV and, consequently, survived longer than the IHHNV-inserted shrimp. PMID:23414329
The infinite sites model of genome evolution.
Ma, Jian; Ratan, Aakrosh; Raney, Brian J; Suh, Bernard B; Miller, Webb; Haussler, David
2008-09-23
We formalize the problem of recovering the evolutionary history of a set of genomes that are related to an unseen common ancestor genome by operations of speciation, deletion, insertion, duplication, and rearrangement of segments of bases. The problem is examined in the limit as the number of bases in each genome goes to infinity. In this limit, the chromosomes are represented by continuous circles or line segments. For such an infinite-sites model, we present a polynomial-time algorithm to find the most parsimonious evolutionary history of any set of related present-day genomes.
Schröder, Christiane; Bleidorn, Christoph; Hartmann, Stefanie; Tiedemann, Ralph
2009-12-15
Investigating the dog genome we found 178965 introns with a moderate length of 200-1000 bp. A screening of these sequences against 23 different repeat libraries to find insertions of short interspersed elements (SINEs) detected 45276 SINEs. Virtually all of these SINEs (98%) belong to the tRNA-derived Can-SINE family. Can-SINEs arose about 55 million years ago before Carnivora split into two basal groups, the Caniformia (dog-like carnivores) and the Feliformia (cat-like carnivores). Genome comparisons of dog and cat recovered 506 putatively informative SINE loci for caniformian phylogeny. In this study we show how to use such genome information of model organisms to research the phylogeny of related non-model species of interest. Investigating a dataset including representatives of all major caniformian lineages, we analysed 24 randomly chosen loci for 22 taxa. All loci were amplifiable and revealed 17 parsimony-informative SINE insertions. The screening for informative SINE insertions yields a large amount of sequence information, in particular of introns, which contain reliable phylogenetic information as well. A phylogenetic analysis of intron- and SINE sequence data provided a statistically robust phylogeny which is congruent with the absence/presence pattern of our SINE markers. This phylogeny strongly supports a sistergroup relationship of Musteloidea and Pinnipedia. Within Pinnipedia, we see strong support from bootstrapping and the presence of a SINE insertion for a sistergroup relationship of the walrus with the Otariidae.
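A presence/absence matrix of SINE insertions can be screened for parsimony-informative loci with the usual criterion for binary characters: at least two taxa carry the insertion and at least two lack it. The sketch below illustrates that criterion only; the taxon labels and the 0/1 matrix are invented for illustration and are not data from the study, and `None` is an assumed coding for loci that could not be scored.

```python
from typing import Dict, List, Optional

def is_parsimony_informative(states: List[Optional[int]]) -> bool:
    """A binary character is informative if each state occurs in at least two taxa."""
    present = sum(1 for s in states if s == 1)
    absent = sum(1 for s in states if s == 0)  # None (unscored) is ignored
    return present >= 2 and absent >= 2

def informative_loci(matrix: Dict[str, List[Optional[int]]]) -> List[int]:
    """Return the indices of loci (columns) that are parsimony-informative."""
    n_loci = len(next(iter(matrix.values())))
    return [
        i for i in range(n_loci)
        if is_parsimony_informative([row[i] for row in matrix.values()])
    ]

# Hypothetical matrix: 1 = SINE present, 0 = absent, None = not scored.
example = {
    "dog":    [1, 1, 0, 1],
    "walrus": [1, 0, 0, 1],
    "seal":   [1, 0, 0, None],
    "otter":  [0, 0, 0, 0],
    "cat":    [0, 0, 0, 0],
}

print(informative_loci(example))  # → [0, 3]
```

Locus 1 is present in a single taxon and locus 2 in none, so neither can group taxa under parsimony.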
Efficient gene editing in Corynebacterium glutamicum using the CRISPR/Cas9 system.
Peng, Feng; Wang, Xinyue; Sun, Yang; Dong, Guibin; Yang, Yankun; Liu, Xiuxia; Bai, Zhonghu
2017-11-14
Corynebacterium glutamicum (C. glutamicum) has traditionally been used as a microbial cell factory for the industrial production of many amino acids and other industrially important commodities. C. glutamicum has recently been established as a host for recombinant protein expression; however, some intrinsic disadvantages could be improved by genetic modification. Gene editing techniques, such as deletion, insertion, or replacement, are important tools for modifying chromosomes. In this research, we report a CRISPR/Cas9 system in C. glutamicum for rapid and efficient genome editing, including gene deletion and insertion. The system consists of two plasmids: one containing a target-specific guide RNA and a homologous sequence to a target gene, the other expressing Cas9 protein. With high efficiency (up to 100%), this system was used to disrupt the porB, mepA, clpX and Ncgl0911 genes, which affect the ability to express proteins. The porB- and mepA-deletion strains had enhanced expression of green fluorescent protein compared with the wild-type strain. This system can also be used to engineer point mutations and gene insertions. In this study, we adapted the CRISPR/Cas9 system from S. pyogenes to gene deletion, point mutation and insertion in C. glutamicum. Compared with published genome modification methods, methods based on the CRISPR/Cas9 system can rapidly and efficiently achieve genome editing. Our research provides a powerful tool for facilitating the study of gene function, metabolic pathways, and enhanced productivity in C. glutamicum.
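Designing a target-specific guide RNA starts with locating candidate target sites. The sketch below scans the forward strand for 20-nt protospacers followed by an NGG motif, the canonical PAM of S. pyogenes Cas9. It is an illustration only: the abstract does not state how guides were chosen, the function name `find_ngg_targets` and the example sequence are hypothetical, and a real design would also scan the reverse strand and check for off-target matches.

```python
from typing import List, Tuple

def find_ngg_targets(seq: str, guide_len: int = 20) -> List[Tuple[int, str, str]]:
    """Return (start, protospacer, PAM) for each forward-strand site followed by NGG."""
    seq = seq.upper()
    hits = []
    # The last valid start leaves room for the protospacer plus a 3-nt PAM.
    for i in range(len(seq) - guide_len - 2):
        pam = seq[i + guide_len : i + guide_len + 3]
        if pam[1:] == "GG":
            hits.append((i, seq[i : i + guide_len], pam))
    return hits

# Hypothetical 36-nt sequence, not a C. glutamicum gene.
demo = "TTGACCATGGCTAGCTAGGATCCGATCGATTGGAAC"
for start, protospacer, pam in find_ngg_targets(demo):
    print(start, protospacer, pam)  # → 10 CTAGCTAGGATCCGATCGAT TGG
```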
Mechanism for DNA transposons to generate introns on genomic scales
Huff, Jason T.; Zilberman, Daniel; Roy, Scott W.
2017-01-01
Discovered four decades ago, the existence of introns was one of the most unexpected findings in molecular biology1. Introns are sequences interrupting genes that must be removed as part of mRNA production. Genome sequencing projects have documented that most eukaryotic genes contain at least one and frequently many introns2,3. Comparison of these genomes reveals a history of long evolutionary periods with little intron gain punctuated by episodes of rapid, extensive gain2,3. However, no detailed mechanism for such episodic intron generation has been empirically supported on a sufficient scale, despite several proposals4–8. Here we show how short non-autonomous DNA transposons independently generated hundreds to thousands of introns in the prasinophyte Micromonas pusilla and the pelagophyte Aureococcus anophagefferens. Each transposon carries one splice site. The other splice site is co-opted from gene sequence duplicated upon transposon insertion, allowing perfect splicing out of RNA. The distributions of sequences that can be co-opted are biased with respect to codons, and phasing of transposon-generated introns is similarly biased. These transposons insert between preexisting nucleosomes, so that multiple nearby insertions generate nucleosome-sized intervening segments. Thus, transposon insertion and sequence co-option may explain the intron phase biases2 and prevalence of nucleosome-sized exons9 observed in eukaryotes. Overall, the two independent examples of proliferating elements illustrate a general DNA transposon mechanism plausibly accounting for episodes of rapid, extensive intron gain during eukaryotic evolution2,3. PMID:27760113
McCarthy, Alex J; Stabler, Richard A; Taylor, Peter W
2018-04-01
Escherichia coli K1 strains are major causative agents of invasive disease of newborn infants. The age dependency of infection can be reproduced in neonatal rats. Colonization of the small intestine following oral administration of K1 bacteria leads rapidly to invasion of the blood circulation; bacteria that avoid capture by the mesenteric lymphatic system and evade antibacterial mechanisms in the blood may disseminate to cause organ-specific infections such as meningitis. Some E. coli K1 surface constituents, in particular the polysialic acid capsule, are known to contribute to invasive potential, but a comprehensive picture of the factors that determine the fully virulent phenotype has not emerged so far. We constructed a library and constituent sublibraries of ∼775,000 Tn5 transposon mutants of E. coli K1 strain A192PP and employed transposon-directed insertion site sequencing (TraDIS) to identify genes required for fitness for infection of 2-day-old rats. Transposon insertions were lacking in 357 genes following recovery on selective agar; these genes were considered essential for growth in nutrient-replete medium. Colonization of the midsection of the small intestine was facilitated by 167 E. coli K1 gene products. Restricted bacterial translocation across epithelial barriers precluded TraDIS analysis of gut-to-blood and blood-to-brain transits; 97 genes were required for survival in human serum. This study revealed that a large number of bacterial genes, many of which were not previously associated with systemic E. coli K1 infection, are required to realize full invasive potential. IMPORTANCE Escherichia coli K1 strains cause life-threatening infections in newborn infants. They are acquired from the mother at birth and colonize the small intestine, from where they invade the blood and central nervous system. It is difficult to obtain information from acutely ill patients that sheds light on physiological and bacterial factors determining invasive disease. 
Key aspects of naturally occurring age-dependent human infection can be reproduced in neonatal rats. Here, we employ transposon-directed insertion site sequencing to identify genes essential for the in vitro growth of E. coli K1 and genes that contribute to the colonization of susceptible rats. The presence of bottlenecks to invasion of the blood and cerebrospinal compartments precluded insertion site sequencing analysis, but we identified genes for survival in serum. Copyright © 2018 McCarthy et al.
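The essentiality call described above rests on counting transposon insertions per gene and flagging genes with none. A simplified sketch of that counting step follows. It is not the TraDIS analysis used in the study: the coordinates, gene names, and functions are hypothetical, and real analyses typically normalize insertion counts by gene length and apply a statistical threshold rather than a strict zero cutoff.

```python
from bisect import bisect_left, bisect_right
from typing import Dict, List, Tuple

def insertions_per_gene(insertion_sites: List[int],
                        genes: Dict[str, Tuple[int, int]]) -> Dict[str, int]:
    """Count insertion sites falling within each gene's inclusive (start, end) interval."""
    sites = sorted(insertion_sites)
    return {
        name: bisect_right(sites, end) - bisect_left(sites, start)
        for name, (start, end) in genes.items()
    }

def candidate_essential(counts: Dict[str, int]) -> List[str]:
    """Genes with no insertions; a simplification of a real essentiality call."""
    return sorted(gene for gene, n in counts.items() if n == 0)

# Hypothetical insertion coordinates and gene intervals.
sites = [120, 450, 455, 2100]
genes = {"geneA": (100, 500), "geneB": (600, 1500), "geneC": (2000, 2500)}

counts = insertions_per_gene(sites, genes)
print(counts)                       # → {'geneA': 3, 'geneB': 0, 'geneC': 1}
print(candidate_essential(counts))  # → ['geneB']
```

Sorting the sites once and using binary search keeps the count for each gene logarithmic in the library size, which matters for libraries of several hundred thousand mutants.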
McCarthy, Alex J.
2018-01-01
ABSTRACT Escherichia coli K1 strains are major causative agents of invasive disease of newborn infants. The age dependency of infection can be reproduced in neonatal rats. Colonization of the small intestine following oral administration of K1 bacteria leads rapidly to invasion of the blood circulation; bacteria that avoid capture by the mesenteric lymphatic system and evade antibacterial mechanisms in the blood may disseminate to cause organ-specific infections such as meningitis. Some E. coli K1 surface constituents, in particular the polysialic acid capsule, are known to contribute to invasive potential, but a comprehensive picture of the factors that determine the fully virulent phenotype has not emerged so far. We constructed a library and constituent sublibraries of ∼775,000 Tn5 transposon mutants of E. coli K1 strain A192PP and employed transposon-directed insertion site sequencing (TraDIS) to identify genes required for fitness for infection of 2-day-old rats. Transposon insertions were lacking in 357 genes following recovery on selective agar; these genes were considered essential for growth in nutrient-replete medium. Colonization of the midsection of the small intestine was facilitated by 167 E. coli K1 gene products. Restricted bacterial translocation across epithelial barriers precluded TraDIS analysis of gut-to-blood and blood-to-brain transits; 97 genes were required for survival in human serum. This study revealed that a large number of bacterial genes, many of which were not previously associated with systemic E. coli K1 infection, are required to realize full invasive potential. IMPORTANCE Escherichia coli K1 strains cause life-threatening infections in newborn infants. They are acquired from the mother at birth and colonize the small intestine, from where they invade the blood and central nervous system. 
It is difficult to obtain information from acutely ill patients that sheds light on physiological and bacterial factors determining invasive disease. Key aspects of naturally occurring age-dependent human infection can be reproduced in neonatal rats. Here, we employ transposon-directed insertion site sequencing to identify genes essential for the in vitro growth of E. coli K1 and genes that contribute to the colonization of susceptible rats. The presence of bottlenecks to invasion of the blood and cerebrospinal compartments precluded insertion site sequencing analysis, but we identified genes for survival in serum. PMID:29339415
Gene and enhancer trap tagging of vascular-expressed genes in poplar trees
Andrew Groover; Joseph R. Fontana; Gayle Dupper; Caiping Ma; Robert Martienssen; Steven Strauss; Richard Meilan
2004-01-01
We report a gene discovery system for poplar trees based on gene and enhancer traps. Gene and enhancer trap vectors carrying the β-glucuronidase (GUS) reporter gene were inserted into the poplar genome via Agrobacterium tumefaciens transformation, where they reveal the expression pattern of genes at or near the insertion sites. Because GUS...
Galindo-González, Leonardo; Mhiri, Corinne; Grandbastien, Marie-Angèle; Deyholos, Michael K
2016-12-07
Initial characterization of the flax genome showed that Ty1-copia retrotransposons are abundant, with several members being recently inserted, and in close association with genes. Recent insertions indicate a potential for ongoing transpositional activity that can create genomic diversity among accessions, cultivars or varieties. The polymorphisms generated constitute a good source of molecular markers that may be associated with phenotype if the insertions alter gene activity. Flax, where accessions are bred mainly for seed nutritional properties or for fibers, constitutes a good model for studying the relationship of transpositional activity with diversification and breeding. In this study, we estimated copy number and used a type of transposon display known as Sequence-Specific Amplification Polymorphisms (SSAPs), to characterize six families of Ty1-copia elements across 14 flax accessions. Polymorphic insertion sites were sequenced to find insertions that could potentially alter gene expression, and a preliminary test was performed with selected genes bearing transposable element (TE) insertions. Quantification of six families of Ty1-copia elements indicated different abundances among TE families and between flax accessions, which suggested diverse transpositional histories. SSAPs showed a high level of polymorphism in most of the evaluated retrotransposon families, with a trend towards higher levels of polymorphism in low-copy number families. Ty1-copia insertion polymorphisms among cultivars allowed a general distinction between oil and fiber types, and between spring and winter types, demonstrating their utility in diversity studies. Characterization of polymorphic insertions revealed an overwhelming association with genes, with insertions disrupting exons, introns or within 1 kb of coding regions. 
A preliminary test of potential transcriptional disruption by TEs in four selected genes, evaluated in three different tissues, showed one case in which the insertion had a significant impact on gene expression. We demonstrated that specific Ty1-copia families have been active since breeding commenced in flax. The retrotransposon-derived polymorphism can be used to separate flax types, and the close association of many insertions with genes defines a good source of potential mutations that could be associated with phenotypic changes, resulting in diversification processes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Venken, Koen J. T.; Popodi, Ellen; Holtzman, Stacy L.
We describe a molecularly defined duplication kit for the X chromosome of Drosophila melanogaster. A set of 408 overlapping P[acman] BAC clones was used to create small duplications (average length 88 kb) covering the 22-Mb sequenced portion of the chromosome. The BAC clones were inserted into an attP docking site on chromosome 3L using ΦC31 integrase, allowing direct comparison of different transgenes. The insertions complement 92% of the essential and viable mutations and deletions tested, demonstrating that almost all Drosophila genes are compact and that the current annotations of the genome are reasonably accurate. Moreover, almost all genes are tolerated at twice the normal dosage. Finally, we more precisely mapped two regions at which duplications cause diplo-lethality in males. This collection comprises the first molecularly defined duplication set to cover a whole chromosome in a multicellular organism. The work presented removes a long-standing barrier to genetic analysis of the Drosophila X chromosome, will greatly facilitate functional assays of X-linked genes in vivo, and provides a model for functional analyses of entire chromosomes in other species.
Billeter, M A; Naim, H Y; Udem, S A
2009-01-01
An overview is given on the development of technologies to allow reverse genetics of RNA viruses, i.e., the rescue of viruses from cDNA, with emphasis on nonsegmented negative-strand RNA viruses (Mononegavirales), as exemplified for measles virus (MV). Primarily, these technologies allowed site-directed mutagenesis, enabling important insights into a variety of aspects of the biology of these viruses. Concomitantly, foreign coding sequences were inserted to (a) allow localization of virus replication in vivo through marker gene expression, (b) develop candidate multivalent vaccines against measles and other pathogens, and (c) create candidate oncolytic viruses. The vector use of these viruses was experimentally encouraged by the pronounced genetic stability of the recombinants unexpected for RNA viruses, and by the high load of insertable genetic material, in excess of 6 kb. The known assets, such as the small genome size of the vector in comparison to DNA viruses proposed as vectors, the extensive clinical experience of attenuated MV as vaccine with a proven record of high safety and efficacy, and the low production cost per vaccination dose are thus favorably complemented.
Families of transposable elements, population structure and the origin of species.
Jurka, Jerzy; Bao, Weidong; Kojima, Kenji K
2011-09-19
Eukaryotic genomes harbor diverse families of repetitive DNA derived from transposable elements (TEs) that are able to replicate and insert into genomic DNA. The biological role of TEs remains unclear, although they have profound mutagenic impact on eukaryotic genomes and the origin of repetitive families often correlates with speciation events. We present a new hypothesis to explain the observed correlations based on classical concepts of population genetics. The main thesis presented in this paper is that the TE-derived repetitive families originate primarily by genetic drift in small populations derived mostly by subdivisions of large populations into subpopulations. We outline the potential impact of the emerging repetitive families on genetic diversification of different subpopulations, and discuss implications of such diversification for the origin of new species. Several testable predictions of the hypothesis are examined. First, we focus on the prediction that the number of diverse families of TEs fixed in a representative genome of a particular species positively correlates with the cumulative number of subpopulations (demes) in the historical metapopulation from which the species has emerged. Furthermore, we present evidence indicating that human AluYa5 and AluYb8 families might have originated in separate proto-human subpopulations. We also revisit prior evidence linking the origin of repetitive families to mammalian phylogeny and present additional evidence linking repetitive families to speciation based on mammalian taxonomy. Finally, we discuss evidence that mammalian orders represented by the largest numbers of species may be subject to relatively recent population subdivisions and speciation events. The hypothesis implies that subdivision of a population into small subpopulations is the major step in the origin of new families of TEs as well as of new species. 
The origin of new subpopulations is likely to be driven by the availability of new biological niches, consistent with the hypothesis of punctuated equilibria. The hypothesis also has implications for the ongoing debate on the role of genetic drift in genome evolution.
Tranchida-Lombardo, Valentina; Aiese Cigliano, Riccardo; Anzar, Irantzu; Landi, Simone; Palombieri, Samuela; Colantuono, Chiara; Bostan, Hamed; Termolino, Pasquale; Aversano, Riccardo; Batelli, Giorgia; Cammareri, Maria; Carputo, Domenico; Chiusano, Maria Luisa; Conicella, Clara; Consiglio, Federica; D'Agostino, Nunzio; De Palma, Monica; Di Matteo, Antonio; Grandillo, Silvana; Sanseverino, Walter; Tucci, Marina; Grillo, Stefania
2017-11-14
Tomato is a high value crop and the primary model for fleshy fruit development and ripening. Breeding priorities include increased fruit quality, shelf life and tolerance to stresses. To contribute towards this goal, we re-sequenced the genomes of Corbarino (COR) and Lucariello (LUC) landraces, which both possess the traits of plant adaptation to water deficit, prolonged fruit shelf-life and good fruit quality. Through the newly developed pipeline Reconstructor, we generated the genome sequences of COR and LUC using datasets of 65.8 M and 56.4 M of 30-150 bp paired-end reads, respectively. New contigs including reads that could not be mapped to the tomato reference genome were assembled, and a total of 43,054 and 44,579 gene loci were annotated in COR and LUC. Both genomes showed novel regions with similarity to Solanum pimpinellifolium and Solanum pennellii. In addition to small deletions and insertions, 2,000 and 1,700 single nucleotide polymorphisms (SNPs) could exert potentially disruptive effects on 1,371 and 1,201 genes in COR and LUC, respectively. A detailed survey of the SNPs occurring in genes related to fruit quality, shelf life and stress tolerance identified several candidates of potential relevance. Variations in ethylene response components may contribute to the peculiar phenotypes of COR and LUC. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Whole-genome sequencing of Atacama skeleton shows novel mutations linked with dysplasia
Bhattacharya, Sanchita; Li, Jian; Sockell, Alexandra; Kan, Matthew J.; Bava, Felice A.; Chen, Shann-Ching; Ávila-Arcos, María C.; Ji, Xuhuai; Smith, Emery; Asadi, Narges B.; Lachman, Ralph S.; Lam, Hugo Y.K.; Bustamante, Carlos D.; Butte, Atul J.; Nolan, Garry P.
2018-01-01
Over a decade ago, the Atacama humanoid skeleton (Ata) was discovered in the Atacama region of Chile. The Ata specimen carried a strange phenotype (6-in stature, fewer than expected ribs, elongated cranium, and accelerated bone age), leading to speculation that this was a preserved nonhuman primate, a human fetus harboring genetic mutations, or even an extraterrestrial. We previously reported that it was human by DNA analysis with an estimated bone age of about 6–8 yr at the time of demise. To determine the possible genetic drivers of the observed morphology, DNA from the specimen was subjected to whole-genome sequencing using the Illumina HiSeq platform with an average 11.5× coverage of 101-bp, paired-end reads. In total, 3,356,569 single nucleotide variations (SNVs), 518,365 insertions and deletions (indels), and 1047 structural variations (SVs) were detected relative to the human reference genome. Here, we present the detailed whole-genome analysis showing that Ata is a female of human origin, likely of Chilean descent, and its genome harbors mutations in genes (COL1A1, COL2A1, KMT2D, FLNB, ATR, TRIP11, PCNT) previously linked with diseases of small stature, rib anomalies, cranial malformations, premature joint fusion, and osteochondrodysplasia (also known as skeletal dysplasia). Together, these findings provide a molecular characterization of Ata's peculiar phenotype, which likely results from multiple known and novel putative gene mutations affecting bone development and ossification. PMID:29567674
Phylogenetics of modern birds in the era of genomics
Edwards, Scott V; Bryan Jennings, W; Shedlock, Andrew M
2005-01-01
In the 14 years since the first higher-level bird phylogenies based on DNA sequence data, avian phylogenetics has witnessed the advent and maturation of the genomics era, the completion of the chicken genome and a suite of technologies that promise to add considerably to the agenda of avian phylogenetics. In this review, we summarize current approaches and data characteristics of recent higher-level bird studies and suggest a number of as yet untested molecular and analytical approaches for the unfolding tree of life for birds. A variety of comparative genomics strategies, including adoption of objective quality scores for sequence data, analysis of contiguous DNA sequences provided by large-insert genomic libraries, and the systematic use of retroposon insertions and other rare genomic changes all promise an integrated phylogenetics that is solidly grounded in genome evolution. The avian genome is an excellent testing ground for such approaches because of the more balanced representation of single-copy and repetitive DNA regions than in mammals. Although comparative genomics has a number of obvious uses in avian phylogenetics, its application to large numbers of taxa poses a number of methodological and infrastructural challenges, and can be greatly facilitated by a ‘community genomics’ approach in which the modest sequencing throughputs of single PI laboratories are pooled to produce larger, complementary datasets. Although the polymerase chain reaction era of avian phylogenetics is far from complete, the comparative genomics era—with its ability to vastly increase the number and type of molecular characters and to provide a genomic context for these characters—will usher in a host of new perspectives and opportunities for integrating genome evolution and avian phylogenetics. PMID:16024355
Comparative genomics of wild type yeast strains unveils important genome diversity
Carreto, Laura; Eiriz, Maria F; Gomes, Ana C; Pereira, Patrícia M; Schuller, Dorit; Santos, Manuel AS
2008-01-01
Background: Genome variability generates phenotypic heterogeneity and is of relevance for adaptation to environmental change, but the extent of such variability in natural populations is still poorly understood. For example, selected Saccharomyces cerevisiae strains are variable at the ploidy level, have gene amplifications, changes in chromosome copy number, and gross chromosomal rearrangements. This suggests that genome plasticity provides important genetic diversity upon which natural selection mechanisms can operate. Results: In this study, we have used wild-type S. cerevisiae (yeast) strains to investigate genome variation in natural and artificial environments. We have used comparative genome hybridization on array (aCGH) to characterize the genome variability of 16 yeast strains, of laboratory and commercial origin, isolated from vineyards and wine cellars, and from opportunistic human infections. Interestingly, sub-telomeric instability was associated with the clinical phenotype, while Ty element insertion regions determined genomic differences of natural wine fermentation strains. Copy number depletion of ASP3 and YRF1 genes was found in all wild-type strains. Other gene families involved in transmembrane transport, sugar and alcohol metabolism or drug resistance had copy number changes, which also distinguished wine from clinical isolates. Conclusion: We have isolated and genotyped more than 1000 yeast strains from natural environments and carried out an aCGH analysis of 16 strains representative of distinct genotype clusters. Important genomic variability was identified between these strains, in particular in sub-telomeric regions and in Ty-element insertion sites, suggesting that this type of genome variability is the main source of genetic diversity in natural populations of yeast. The data highlight the usefulness of yeast as a model system to unravel intraspecific natural genome diversity and to elucidate how natural selection shapes the yeast genome.
PMID:18983662
Resources for Biological Annotation of the Drosophila Genome
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gerald M. Rubin
2005-08-08
This project supported seed money for the development of cDNA and genetic resources to support studies of the Drosophila melanogaster genome. Key publications supported by this work that provide additional detail: (1) ''The Drosophila gene collection: identification of putative full-length cDNAs for 70% of D. melanogaster genes''; and (2) ''The Berkeley Drosophila Genome Project gene disruption project: Single P-element insertions mutating 25% of vital Drosophila genes''.
Dubois, Emeline; Bischerour, Julien; Marmignon, Antoine; Mathy, Nathalie; Régnier, Vinciane; Bétermier, Mireille
2012-01-01
Sequences related to transposons constitute a large fraction of extant genomes, but insertions within coding sequences have generally not been tolerated during evolution. Thanks to their unique nuclear dimorphism and to their original mechanism of programmed DNA elimination from their somatic nucleus (macronucleus), ciliates are emerging model organisms for the study of the impact of transposable elements on genomes. The germline genome of the ciliate Paramecium, located in its micronucleus, contains thousands of short intervening sequences, the IESs, which interrupt 47% of genes. Recent data provided support to the hypothesis that an evolutionary link exists between Paramecium IESs and Tc1/mariner transposons. During development of the macronucleus, IESs are excised precisely thanks to the coordinated action of PiggyMac, a domesticated piggyBac transposase, and of the NHEJ double-strand break repair pathway. A PiggyMac homolog is also required for developmentally programmed DNA elimination in another ciliate, Tetrahymena. Here, we present an overview of the life cycle of these unicellular eukaryotes and of the developmentally programmed genome rearrangements that take place at each sexual cycle. We discuss how ancient domestication of a piggyBac transposase might have allowed Tc1/mariner elements to spread throughout the germline genome of Paramecium, without strong counterselection against insertion within genes. PMID:22888464
Gowda, Malali
2016-01-01
Blast disease caused by the Magnaporthe species is a major factor affecting the productivity of rice, wheat and millets. This study was aimed at generating genomic information for rice and non-rice Magnaporthe isolates to understand the extent of genetic variation. We have sequenced the whole genome of the Magnaporthe isolates, infecting rice (leaf and neck), finger millet (leaf and neck), foxtail millet (leaf) and buffel grass (leaf). Rice and finger millet isolates infecting both leaf and neck tissues were sequenced, since the damage and yield loss caused due to neck blast is much higher as compared to leaf blast. The genome-wide comparison was carried out to study the variability in gene content, candidate effectors, repeat element distribution, genes involved in carbohydrate metabolism and SNPs. The analysis of repeat element footprints revealed some genes such as naringenin, 2-oxoglutarate 3-dioxygenase being targeted by Pot2 and Occan, in isolates from different host species. Some repeat insertions were host-specific while other insertions were randomly shared between isolates. The distributions of repeat elements, secretory proteins, CAZymes and SNPs showed significant variation across host-specific lineages of Magnaporthe indicating an independent genome evolution orchestrated by multiple genomic factors. PMID:27658241
Sakai, Hiroaki; Kanamori, Hiroyuki; Arai-Kichise, Yuko; Shibata-Hatta, Mari; Ebana, Kaworu; Oono, Youko; Kurita, Kanako; Fujisawa, Hiroko; Katagiri, Satoshi; Mukai, Yoshiyuki; Hamada, Masao; Itoh, Takeshi; Matsumoto, Takashi; Katayose, Yuichi; Wakasa, Kyo; Yano, Masahiro; Wu, Jianzhong
2014-01-01
Having a deep genetic structure evolved during its domestication and adaptation, the Asian cultivated rice (Oryza sativa) displays considerable physiological and morphological variations. Here, we describe deep whole-genome sequencing of the aus rice cultivar Kasalath by using the advanced next-generation sequencing (NGS) technologies to gain a better understanding of the sequence and structural changes among highly differentiated cultivars. The de novo assembled Kasalath sequences represented 91.1% (330.55 Mb) of the genome and contained 35 139 expressed loci annotated by RNA-Seq analysis. We detected 2 787 250 single-nucleotide polymorphisms (SNPs) and 7393 large insertion/deletion (indel) sites (>100 bp) between Kasalath and Nipponbare, and 2 216 251 SNPs and 3780 large indels between Kasalath and 93-11. Extensive comparison of the gene contents among these cultivars revealed similar rates of gene gain and loss. We detected at least 7.39 Mb of inserted sequences and 40.75 Mb of unmapped sequences in the Kasalath genome in comparison with the Nipponbare reference genome. Mapping of the publicly available NGS short reads from 50 rice accessions proved the necessity and the value of using the Kasalath whole-genome sequence as an additional reference to capture the sequence polymorphisms that cannot be discovered by using the Nipponbare sequence alone. PMID:24578372
Walsh, Tom; Lee, Ming K.; Casadei, Silvia; Thornton, Anne M.; Stray, Sunday M.; Pennil, Christopher; Nord, Alex S.; Mandell, Jessica B.; Swisher, Elizabeth M.; King, Mary-Claire
2010-01-01
Inherited loss-of-function mutations in the tumor suppressor genes BRCA1, BRCA2, and multiple other genes predispose to high risks of breast and/or ovarian cancer. Cancer-associated inherited mutations in these genes are collectively quite common, but individually rare or even private. Genetic testing for BRCA1 and BRCA2 mutations has become an integral part of clinical practice, but testing is generally limited to these two genes and to women with severe family histories of breast or ovarian cancer. To determine whether massively parallel, “next-generation” sequencing would enable accurate, thorough, and cost-effective identification of inherited mutations for breast and ovarian cancer, we developed a genomic assay to capture, sequence, and detect all mutations in 21 genes, including BRCA1 and BRCA2, with inherited mutations that predispose to breast or ovarian cancer. Constitutional genomic DNA from subjects with known inherited mutations, ranging in size from 1 to >100,000 bp, was hybridized to custom oligonucleotides and then sequenced using a genome analyzer. Analysis was carried out blind to the mutation in each sample. Average coverage was >1200 reads per base pair. After filtering sequences for quality and number of reads, all single-nucleotide substitutions, small insertion and deletion mutations, and large genomic duplications and deletions were detected. There were zero false-positive calls of nonsense mutations, frameshift mutations, or genomic rearrangements for any gene in any of the test samples. This approach enables widespread genetic testing and personalized risk assessment for breast and ovarian cancer. PMID:20616022
Pace, John K; Sen, Shurjo K; Batzer, Mark A; Feschotte, Cédric
2009-05-01
DNA double-strand breaks (DSBs) are a common form of cellular damage that can lead to cell death if not repaired promptly. Experimental systems have shown that DSB repair in eukaryotic cells is often imperfect and may result in the insertion of extra chromosomal DNA or the duplication of existing DNA at the breakpoint. These events are thought to be a source of genomic instability and human diseases, but it is unclear whether they have contributed significantly to genome evolution. Here we developed an innovative computational pipeline that takes advantage of the repetitive structure of genomes to detect repair-mediated duplication events (RDs) that occurred in the germline and created insertions of at least 50 bp of genomic DNA. Using this pipeline we identified over 1,000 probable RDs in the human genome. Of these, 824 were intra-chromosomal, closely linked duplications of up to 619 bp bearing the hallmarks of the synthesis-dependent strand-annealing repair pathway. This mechanism has duplicated hundreds of sequences predicted to be functional in the human genome, including exons, UTRs, intron splice sites and transcription factor binding sites. Dating of the duplication events using comparative genomics and experimental validation revealed that the mechanism has operated continuously but with decreasing intensity throughout primate evolution. The mechanism has produced species-specific duplications in all primate species surveyed and is contributing to genomic variation among humans. Finally, we show that RDs have also occurred, albeit at a lower frequency, in non-primate mammals and other vertebrates, indicating that this mechanism has been an important force shaping vertebrate genome evolution.
Masking as an effective quality control method for next-generation sequencing data analysis.
Yun, Sajung; Yun, Sijung
2014-12-13
Next generation sequencing produces base calls with low quality scores that can affect the accuracy of identifying simple nucleotide variation calls, including single nucleotide polymorphisms and small insertions and deletions. Here we compare the effectiveness of two data preprocessing methods, masking and trimming, and the accuracy of simple nucleotide variation calls on whole-genome sequence data from Caenorhabditis elegans. Masking substitutes low quality base calls with 'N's (undetermined bases), whereas trimming removes low quality bases, which results in shorter read lengths. We demonstrate that masking is more effective than trimming in reducing the false-positive rate in single nucleotide polymorphism (SNP) calling. However, neither preprocessing method changed the false-negative rate in SNP calling to a statistically significant degree compared with analysis without preprocessing. False-positive and false-negative rates for small insertions and deletions did not differ between masking and trimming. We recommend masking over trimming as the more effective preprocessing method for next generation sequencing data analysis, since masking reduces the false-positive rate in SNP calling without sacrificing the false-negative rate, although trimming is currently more commonly used in the field. The Perl script for masking is available at http://code.google.com/p/subn/. The sequencing data used in the study were deposited in the Sequence Read Archive (SRX450968 and SRX451773).
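The difference between the two preprocessing methods described in this abstract can be illustrated with a minimal Python sketch. This is not the authors' Perl script: the function names, the Phred+33 quality encoding, the threshold of 20, and the choice to trim only from the 3' end are all assumptions made for illustration.

```python
def phred33(qual_string):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(c) - 33 for c in qual_string]


def mask_read(seq, quals, min_q=20):
    """Masking: replace each base whose quality is below min_q with 'N'.

    The read keeps its original length. min_q=20 is an illustrative choice.
    """
    return "".join(b if q >= min_q else "N" for b, q in zip(seq, quals))


def trim_read(seq, quals, min_q=20):
    """Trimming (one simple variant, assumed here): drop bases from the
    3' end until the last remaining base has quality >= min_q.

    The read gets shorter; low-quality bases inside the read are kept.
    """
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]


if __name__ == "__main__":
    seq = "ACGTACGT"
    quals = phred33("IIII#II#")  # 'I' decodes to 40, '#' decodes to 2
    print(mask_read(seq, quals))      # masked read, same length as input
    print(trim_read(seq, quals)[0])   # trimmed read, shorter than input
```

In this toy read, masking hides both low-quality positions while preserving read length, whereas 3'-end trimming shortens the read and leaves the internal low-quality base in place. Other trimming strategies (for example sliding-window trimming) behave differently.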
Ebert, Matthias; Laaß, Sebastian; Burghartz, Melanie; Petersen, Jörn; Koßmehl, Sebastian; Wöhlbrand, Lars; Rabus, Ralf; Wittmann, Christoph; Jahn, Dieter
2013-01-01
Anaerobic growth and survival are integral parts of the life cycle of many marine bacteria. To identify genes essential for the anoxic life of Dinoroseobacter shibae, a transposon library was screened for strains impaired in anaerobic denitrifying growth. Transposon insertions in 35 chromosomal and 18 plasmid genes were detected. The essential contribution of plasmid genes to anaerobic growth was confirmed with plasmid-cured D. shibae strains. A combined transcriptome and proteome approach identified oxygen tension-regulated genes. Transposon insertion sites of a total of 1,527 mutants without an anaerobic growth phenotype were determined to identify anaerobically induced but not essential genes. A surprisingly small overlap of only three genes (napA, phaA, and the Na+/Pi antiporter gene Dshi_0543) between anaerobically essential and induced genes was found. Interestingly, transposon mutations in genes involved in dissimilatory and assimilatory nitrate reduction (napA, nasA) and corresponding cofactor biosynthesis (genomic moaB, moeB, and dsbC and plasmid-carried dsbD and ccmH) were found to cause anaerobic growth defects. In contrast, mutation of anaerobically induced genes encoding proteins required for the later denitrification steps (nirS, nirJ, nosD), dimethyl sulfoxide reduction (dmsA1), and fermentation (pdhB1, arcA, aceE, pta, acs) did not result in decreased anaerobic growth under the conditions tested. Additional essential components (ferredoxin, cccA) of the anaerobic electron transfer chain and central metabolism (pdhB) were identified. Another surprise was the importance of sodium gradient-dependent membrane processes and genomic rearrangements via viruses, transposons, and insertion sequence elements for anaerobic growth. 
These processes and the observed contributions of cell envelope restructuring (lysM, mipA, fadK), C4-dicarboxylate transport (dctM1, dctM3), and protease functions to anaerobic growth require further investigation to unravel the novel underlying adaptation strategies. PMID:23974024
The Essential Genome of Escherichia coli K-12
2018-01-01
ABSTRACT Transposon-directed insertion site sequencing (TraDIS) is a high-throughput method coupling transposon mutagenesis with short-fragment DNA sequencing. It is commonly used to identify essential genes. Single gene deletion libraries are considered the gold standard for identifying essential genes. Currently, the TraDIS method has not been benchmarked against such libraries, and therefore, it remains unclear whether the two methodologies are comparable. To address this, a high-density transposon library was constructed in Escherichia coli K-12. Essential genes predicted from sequencing of this library were compared to existing essential gene databases. To decrease false-positive identification of essential genes, statistical data analysis included corrections for both gene length and genome length. Through this analysis, new essential genes and genes previously incorrectly designated essential were identified. We show that manual analysis of TraDIS data reveals novel features that would not have been detected by statistical analysis alone. Examples include short essential regions within genes, orientation-dependent effects, and fine-resolution identification of genome and protein features. Recognition of these insertion profiles in transposon mutagenesis data sets will assist genome annotation of less well characterized genomes and provides new insights into bacterial physiology and biochemistry. PMID:29463657
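The idea of correcting for gene length when calling essential genes from transposon insertion data can be sketched as follows. This is a simplified illustration, not the statistical analysis used in the study: the function names, the gene coordinates, and the fixed cutoff are hypothetical, and the genome-length correction mentioned in the abstract is not implemented.

```python
from bisect import bisect_left, bisect_right


def insertion_indices(genes, insert_sites):
    """Return unique insertion sites per base pair for each gene.

    genes: dict mapping gene name -> (start, end), 1-based inclusive
           coordinates (hypothetical annotation format).
    insert_sites: iterable of genomic positions with a transposon insertion.
    Dividing by gene length keeps long genes from looking non-essential
    simply because they offer more targets.
    """
    sites = sorted(set(insert_sites))
    indices = {}
    for name, (start, end) in genes.items():
        n_sites = bisect_right(sites, end) - bisect_left(sites, start)
        indices[name] = n_sites / (end - start + 1)
    return indices


def candidate_essential(indices, cutoff):
    """Flag genes whose insertion index falls below an assumed cutoff.

    Real analyses usually derive the cutoff from the observed distribution
    of insertion indices; here it is simply supplied by the caller.
    """
    return sorted(g for g, idx in indices.items() if idx < cutoff)


if __name__ == "__main__":
    genes = {"geneA": (1, 100), "geneB": (101, 300)}
    sites = [5, 20, 20, 60, 90, 250]  # duplicate positions count once
    idx = insertion_indices(genes, sites)
    print(idx)
    print(candidate_essential(idx, cutoff=0.01))
```

A per-gene index of this kind cannot reveal the features the abstract attributes to manual inspection, such as short essential regions within otherwise dispensable genes or orientation-dependent effects; those require looking at where insertions fall inside each gene.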
Graf, Louis; Kim, Yae Jin; Cho, Ga Youn; Miller, Kathy Ann
2017-01-01
Coccophora langsdorfii (Turner) Greville (Fucales) is an intertidal brown alga that is endemic to Northeast Asia and increasingly endangered by habitat loss and climate change. We sequenced the complete circular plastid and mitochondrial genomes of C. langsdorfii. The circular plastid genome is 124,450 bp and contains 139 protein-coding, 28 tRNA and 6 rRNA genes. The circular mitochondrial genome is 35,660 bp and contains 38 protein-coding, 25 tRNA and 3 rRNA genes. The structure and gene content of the C. langsdorfii plastid genome is similar to those of other species in the Fucales. The plastid genomes of brown algae in other orders share similar gene content but exhibit large structural recombination. The large in-frame insert in the cox2 gene in the mitochondrial genome of C. langsdorfii is typical of other brown algae. We explored the effect of this insertion on the structure and function of the cox2 protein. We estimated the usefulness of 135 plastid genes and 35 mitochondrial genes for developing molecular markers. This study shows that 29 organellar genes will prove efficient for resolving brown algal phylogeny. In addition, we propose a new molecular marker suitable for the study of intraspecific genetic diversity that should be tested in a large survey of populations of C. langsdorfii. PMID:29095864
Bai, Xiaodong; Zhang, Jianhua; Ewing, Adam; Miller, Sally A.; Jancso Radek, Agnes; Shevchenko, Dmitriy V.; Tsukerman, Kiryl; Walunas, Theresa; Lapidus, Alla; Campbell, John W.; Hogenhout, Saskia A.
2006-01-01
Phytoplasmas (“Candidatus Phytoplasma,” class Mollicutes) cause disease in hundreds of economically important plants and are obligately transmitted by sap-feeding insects of the order Hemiptera, mainly leafhoppers and psyllids. The 706,569-bp chromosome and four plasmids of aster yellows phytoplasma strain witches' broom (AY-WB) were sequenced and compared to the onion yellows phytoplasma strain M (OY-M) genome. The phytoplasmas have small repeat-rich genomes. This comparative analysis revealed that the repeated DNAs are organized into large clusters of potential mobile units (PMUs), which contain tra5 insertion sequences (ISs) and genes for specialized sigma factors and membrane proteins. So far, these PMUs appear to be unique to phytoplasmas. Compared to mycoplasmas, phytoplasmas lack several recombination and DNA modification functions, and therefore, phytoplasmas may use different mechanisms of recombination, likely involving PMUs, for the creation of variability, allowing phytoplasmas to adjust to the diverse environments of plants and insects. The irregular GC skews and the presence of ISs and large repeated sequences in the AY-WB and OY-M genomes are indicative of high genomic plasticity. Nevertheless, segments of ∼250 kb located between the lplA and glnQ genes are syntenic between the two phytoplasmas and contain the majority of the metabolic genes and no ISs. AY-WB appears to be further along in the reductive evolution process than OY-M. The AY-WB genome is ∼154 kb smaller than the OY-M genome, primarily as a result of fewer multicopy sequences, including PMUs. Furthermore, AY-WB lacks genes that are truncated and are part of incomplete pathways in OY-M. PMID:16672622
González, Leonardo Galindo; Deyholos, Michael K
2012-11-21
Background Flax (Linum usitatissimum L.) is an important crop for the production of bioproducts derived from its seed and stem fiber. Transposable elements (TEs) are widespread in plant genomes and are a key component of their evolution. The availability of a genome assembly of flax (Linum usitatissimum) affords new opportunities to explore the diversity of TEs and their relationship to genes and gene expression. Results Four de novo repeat identification algorithms (PILER, RepeatScout, LTR_finder and LTR_STRUC) were applied to the flax genome assembly. The resulting library of flax repeats was combined with the RepBase Viridiplantae division and used with RepeatMasker to identify TE coverage in the genome. LTR retrotransposons were the most abundant TEs (17.2% genome coverage), followed by Long Interspersed Nuclear Element (LINE) retrotransposons (2.10%) and Mutator DNA transposons (1.99%). Comparison of putative flax TEs to flax transcript databases indicated that TEs are not highly expressed in flax. However, the presence of recent insertions, defined by 100% intra-element LTR similarity, provided evidence for recent TE activity. Spatial analysis showed TE-rich regions, gene-rich regions, and regions with similar gene and TE density. Monte Carlo simulations for the 71 largest scaffolds (≥ 1 Mb each) did not show any regional differences in the frequency of TE overlap with gene coding sequences. However, differences between TE superfamilies were found in their proximity to genes. Genes within TE-rich regions also appeared to have lower transcript expression, based on EST abundance. When LTR elements were compared, Copia elements showed more diversity, more recent insertions and more conserved domains than Gypsy elements, demonstrating their importance in genome evolution.
Conclusions The calculated 23.06% TE coverage of the flax WGS assembly is at the low end of the range of TE coverages reported in other eudicots, although this estimate does not include TEs likely found in unassembled repetitive regions of the genome. Since enrichment for TEs in genomic regions was associated with reduced expression of neighbouring genes, and many members of the Copia LTR superfamily are inserted close to coding regions, we suggest Copia elements have a greater influence on recent flax genome evolution while Gypsy elements have become residual and highly mutated. PMID:23171245
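As a quick check on the coverage figures quoted in this abstract, the short sketch below sums the three named TE classes and subtracts them from the reported total. It assumes all four percentages are expressed on the same basis (fraction of the assembly), which the abstract implies but does not state outright; the function name is hypothetical.

```python
def remaining_coverage(total_pct: float, named_pct: dict[str, float]) -> tuple[float, float]:
    """Return (sum of the named classes, coverage left for all other classes)."""
    named_sum = round(sum(named_pct.values()), 2)
    return named_sum, round(total_pct - named_sum, 2)


# Percentages as quoted in the abstract.
named = {
    "LTR retrotransposons": 17.2,
    "LINE retrotransposons": 2.10,
    "Mutator DNA transposons": 1.99,
}
named_sum, other = remaining_coverage(23.06, named)
print(f"named classes: {named_sum}%  other TE classes: {other}%")
```

On that assumption the three named classes account for 21.29% of the assembly, leaving about 1.77% for all remaining TE classes.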
Forsman, Päivi; Alatossava, Tapani
1991-01-01
The genomes of four Lactobacillus delbrueckii subsp. lactis bacteriophages were characterized by restriction endonuclease mapping, Southern hybridization, and heteroduplex analysis. The phages were isolated from different cheese processing plants in Finland between 1950 and 1972. All four phages had a small isometric head and a long noncontractile tail. Two different types of genome (double-stranded DNA) organization existed among the different phages, the pac type and the cos type, corresponding to alternative types of phage DNA packaging. Three phages belonged to the pac type, and a fourth was a cos-type phage. The pac-type phages were genetically closely related. In the genomes of the pac-type phages, three putative insertion/deletions (0.7 to 0.8 kb, 1.0 kb, and 1.5 kb) and one other region (0.9 kb) containing clustered base substitutions were discovered and localized. At the phenotype level, three main differences were observed among the pac-type phages. These concerned two minor structural proteins and the efficiency of phage DNA packaging. The genomes of the pac-type phages showed only weak homology with that of the cos-type phage. Phage-related DNA, probably a defective prophage, was located in the chromosome of the host strain sensitive to the cos-type phage. This DNA exhibited homology under stringent conditions to the pac-type phages. PMID:16348513
Putterman, D G; Gryczan, T J; Dubnau, D; Day, L A
1983-01-01
The genome of Pf3, a filamentous single-stranded DNA bacteriophage of Pseudomonas aeruginosa (a gram-negative organism), was cloned into pBD214, a plasmid cloning vector of Bacillus subtilis (a gram-positive organism). Cloning in the gram-positive organism was done to avoid anticipated lethal effects. The entire Pf3 genome was inserted in each orientation at a unique BclI site within a thymidylate synthetase gene (from B. subtilis phage beta 22) on the plasmid. Additional clones were made by inserting EcoRI fragments of Pf3 DNA into a unique EcoRI site within this gene. PMID:6306273
Aschard, Hugues; Cattoir, Vincent; Yoder-Himes, Deborah; Lory, Stephen; Pier, Gerald B.
2013-01-01
High-throughput sequencing of transposon (Tn) libraries created within entire genomes identifies and quantifies the contribution of individual genes and operons to the fitness of organisms in different environments. We used insertion-sequencing (INSeq) to analyze the contribution to fitness of all non-essential genes in the chromosome of Pseudomonas aeruginosa strain PA14 based on a library of ∼300,000 individual Tn insertions. In vitro growth in LB provided a baseline for comparison with the survival of the Tn insertion strains following 6 days of colonization of the murine gastrointestinal tract as well as a comparison with Tn-inserts subsequently able to systemically disseminate to the spleen following induction of neutropenia. Sequencing was performed following DNA extraction from the recovered bacteria, digestion with the MmeI restriction enzyme that hydrolyzes DNA 16 bp away from the end of the Tn insert, and fractionation into oligonucleotides of 1,200–1,500 bp that were prepared for high-throughput sequencing. Changes in frequency of Tn inserts into the P. aeruginosa genome were used to quantify in vivo fitness resulting from loss of a gene. 636 genes had <10 sequencing reads in LB, thus defined as unable to grow in this medium. During in vivo infection there were major losses of strains with Tn inserts in almost all known virulence factors, as well as respiration, energy utilization, ion pumps, nutritional genes and prophages. Many new candidates for virulence factors were also identified. There were consistent changes in the recovery of Tn inserts in genes within most operons and Tn insertions into some genes enhanced in vivo fitness. Strikingly, 90% of the non-essential genes were required for in vivo survival following systemic dissemination during neutropenia. These experiments resulted in the identification of the P. aeruginosa strain PA14 genes necessary for optimal survival in the mucosal and systemic environments of a mammalian host. PMID:24039572
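The read-count logic described in this abstract can be sketched as follows. The threshold of 10 reads comes from the abstract; everything else (the gene names, the counts, the pseudocount, and the use of a log2 ratio of read frequencies as the fitness measure) is an illustrative assumption and not necessarily what the authors computed.

```python
import math


def genes_below_threshold(read_counts: dict[str, int], min_reads: int = 10) -> list[str]:
    """Genes whose Tn-insertion reads fall below the cutoff in the baseline medium."""
    return sorted(gene for gene, n in read_counts.items() if n < min_reads)


def log2_fold_change(input_reads: int, output_reads: int,
                     input_total: int, output_total: int,
                     pseudocount: float = 1.0) -> float:
    """log2 of (output frequency / input frequency); negative means reduced fitness."""
    in_freq = (input_reads + pseudocount) / input_total
    out_freq = (output_reads + pseudocount) / output_total
    return math.log2(out_freq / in_freq)


# Hypothetical baseline (LB) read counts per gene.
lb_counts = {"geneA": 3, "geneB": 250, "geneC": 9, "geneD": 10}
print(genes_below_threshold(lb_counts))

# Hypothetical gene with 99 reads in LB and 24 reads after in vivo passage,
# each out of 1,000 total reads.
print(log2_fold_change(99, 24, 1_000, 1_000))
```

The pseudocount avoids taking the logarithm of zero when a mutant is lost entirely in the output pool.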
Sanseverino, Walter; Hénaff, Elizabeth; Vives, Cristina; Pinosio, Sara; Burgos-Paz, William; Morgante, Michele; Ramos-Onsins, Sebastián E; Garcia-Mas, Jordi; Casacuberta, Josep Maria
2015-10-01
The availability of extensive databases of crop genome sequences should allow analysis of crop variability at an unprecedented scale, which should have an important impact in plant breeding. However, up to now the analysis of genetic variability at the whole-genome scale has been mainly restricted to single nucleotide polymorphisms (SNPs). This is a strong limitation as structural variation (SV) and transposon insertion polymorphisms are frequent in plant species and have had an important mutational role in crop domestication and breeding. Here, we present the first comprehensive analysis of melon genetic diversity, which includes a detailed analysis of SNPs, SV, and transposon insertion polymorphisms. The variability found among seven melon varieties representing the species diversity, and including wild accessions and highly bred lines, is relatively high due in part to the marked divergence of some lineages. The diversity is distributed nonuniformly across the genome, being lower at the extremes of the chromosomes and higher in the pericentromeric regions, which is compatible with the effect of purifying selection and recombination forces over functional regions. Additionally, this variability is greatly reduced among elite varieties, probably due to selection during breeding. We have found some chromosomal regions showing a high differentiation of the elite varieties versus the rest, which could be considered as strongly selected candidate regions. Our data also suggest that transposons and SV may be at the origin of an important fraction of the variability in melon, which highlights the importance of analyzing all types of genetic variability to understand crop genome evolution. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Babenko, Vladimir N; Makunin, Igor V; Brusentsova, Irina V; Belyaeva, Elena S; Maksimov, Daniil A; Belyakin, Stepan N; Maroy, Peter; Vasil'eva, Lyubov A; Zhimulev, Igor F
2010-05-21
Eukaryotic genomes are organized in extended domains with distinct features intimately linking genome structure, replication pattern and chromatin state. Recently we identified a set of long late replicating euchromatic regions that are underreplicated in salivary gland polytene chromosomes of D. melanogaster. Here we demonstrate that these underreplicated regions (URs) have a low density of P-element and piggyBac insertions compared to the genome average or neighboring regions. In contrast, Minos-based transposons show no paucity in URs but have a strong bias to testis-specific genes. We estimated the suppression level in 2,852 stocks carrying a single P-element by analysis of eye color determined by the mini-white marker gene and demonstrate that the proportion of suppressed transgenes in URs is more than three times higher than in the flanking regions or the genomic average. The suppressed transgenes reside in intergenic, genic or promoter regions of the annotated genes. We speculate that the low insertion frequency of P-elements and piggyBacs in URs partially results from suppression of transgenes that potentially could prevent identification of transgenes due to complete suppression of the marker gene. In a similar manner, the proportion of suppressed transgenes is higher in loci replicating late or very late in Kc cells and these loci have a lower density of P-elements and piggyBac insertions. In transgenes with two marker genes suppression of mini-white gene in eye coincides with suppression of yellow gene in bristles. Our results suggest that the late replication domains have a high inactivation potential apparently linked to the silenced or closed chromatin state in these regions, and that such inactivation potential is largely maintained in different tissues.
Polymorphic integrations of an endogenous gammaretrovirus in the mule deer genome.
Elleder, Daniel; Kim, Oekyung; Padhi, Abinash; Bankert, Jason G; Simeonov, Ivan; Schuster, Stephan C; Wittekindt, Nicola E; Motameny, Susanne; Poss, Mary
2012-03-01
Endogenous retroviruses constitute a significant genomic fraction in all mammalian species. Typically they are evolutionarily old and fixed in the host species population. Here we report on a novel endogenous gammaretrovirus (CrERVγ; for cervid endogenous gammaretrovirus) in the mule deer (Odocoileus hemionus) that is insertionally polymorphic among individuals from the same geographical location, suggesting that it has a more recent evolutionary origin. Using PCR-based methods, we identified seven CrERVγ proviruses and demonstrated that they show various levels of insertional polymorphism in mule deer individuals. One CrERVγ provirus was detected in all mule deer sampled but was absent from white-tailed deer, indicating that this virus originally integrated after the split of the two species, which occurred approximately one million years ago. There are, on average, 100 CrERVγ copies in the mule deer genome based on quantitative PCR analysis. A CrERVγ provirus was sequenced and contained intact open reading frames (ORFs) for three virus genes. Transcripts were identified covering the entire provirus. CrERVγ forms a distinct branch of the gammaretrovirus phylogeny, with the closest relatives of CrERVγ being endogenous gammaretroviruses from sheep and pig. We demonstrated that white-tailed deer (Odocoileus virginianus) and elk (Cervus canadensis) DNA contain proviruses that are closely related to mule deer CrERVγ in a conserved region of pol; more distantly related sequences can be identified in the genome of another member of the Cervidae, the muntjac (Muntiacus muntjak). The discovery of a novel transcriptionally active and insertionally polymorphic retrovirus in mammals could provide a useful model system to study the dynamic interaction between the host genome and an invading retrovirus.
2009-01-01
Background The diploid woodland strawberry (Fragaria vesca) is an attractive system for functional genomics studies. Its small stature, fast regeneration time, efficient transformability and small genome size, together with substantial EST and genomic sequence resources make it an ideal reference plant for Fragaria and other herbaceous perennials. Most importantly, this species shares gene sequence similarity and genomic microcolinearity with other members of the Rosaceae family, including large-statured tree crops (such as apple, peach and cherry), and brambles and roses as well as with the cultivated octoploid strawberry, F. ×ananassa. F. vesca may be used to quickly address questions of gene function relevant to these valuable crop species. Although some F. vesca lines have been shown to be substantially homozygous, in our hands plants in purportedly homozygous populations exhibited a range of morphological and physiological variation, confounding phenotypic analyses. We also found the genotype of a named variety, thought to be well-characterized and even sold commercially, to be in question. An easy to grow, standardized, inbred diploid Fragaria line with documented genotype that is available to all members of the research community will facilitate comparison of results among laboratories and provide the research community with a necessary tool for functionally testing the large amount of sequence data that will soon be available for peach, apple, and strawberry. Results A highly inbred line, YW5AF7, of a diploid strawberry Fragaria vesca f. semperflorens line called "Yellow Wonder" (Y2) was developed and examined. Botanical descriptors were assessed for morphological characterization of this genotype. The plant line was found to be rapidly transformable using established techniques and media formulations. Conclusion The development of the documented YW5AF7 line provides an important tool for Rosaceae functional genomic analyses. 
These day-neutral plants have a small genome, a seed to seed cycle of 3.0 - 3.5 months, and produce fruit in 7.5 cm pots in a growth chamber. YW5AF7 is runnerless and therefore easy to maintain in the greenhouse, forms abundant branch crowns for vegetative propagation, and produces highly aromatic yellow fruit throughout the year in the greenhouse. F. vesca can be transformed with Agrobacterium tumefaciens, making these plants suitable for insertional mutagenesis, RNAi and overexpression studies that can be compared against a stable baseline of phenotypic descriptors and can be readily genetically substantiated. PMID:19878589
Luis F. Larrondo; Paulo Canessa; Rafael Vicuna; Philip Stewart; Amber Vanden Wymelenberg; Dan Cullen
2007-01-01
We describe the structure, organization, and transcriptional impact of repetitive elements within the lignin-degrading basidiomycete, Phanerochaete chrysosporium. Searches of the P. chrysosporium genome revealed five copies of pce1, a 1,750-nt non-autonomous, class II element. Alleles encoding a putative glucosyltransferase and a cytochrome P450 harbor pce insertions...
USDA-ARS's Scientific Manuscript database
Newcastle disease virus (NDV), avian paramyxovirus type 1, has been developed as a vector to express foreign genes for vaccine and gene therapy purposes. The foreign genes are usually inserted into a non-coding region of the NDV genome as an independent transcription unit (ITU), which potentially a...
Luo, Meizhong; Kim, Hyeran; Kudrna, Dave; Sisneros, Nicholas B; Lee, So-Jeong; Mueller, Christopher; Collura, Kristi; Zuccolo, Andrea; Buckingham, E Bryan; Grim, Suzanne M; Yanagiya, Kazuyo; Inoko, Hidetoshi; Shiina, Takashi; Flajnik, Martin F; Wing, Rod A; Ohta, Yuko
2006-05-03
Sharks are members of the taxonomic class Chondrichthyes, the oldest living jawed vertebrates. Genomic studies of this group, in comparison to representative species in other vertebrate taxa, will allow us to theorize about the fundamental genetic, developmental, and functional characteristics in the common ancestor of all jawed vertebrates. In order to obtain mapping and sequencing data for comparative genomics, we constructed a bacterial artificial chromosome (BAC) library for the nurse shark, Ginglymostoma cirratum. The BAC library consists of 313,344 clones with an average insert size of 144 kb, covering ~4.5 × 10¹⁰ bp and thus providing an 11-fold coverage of the haploid genome. BAC end sequence analyses revealed, in addition to LINEs and SINEs commonly found in other animal and plant genomes, two new groups of nurse shark-specific repetitive elements, NSRE1 and NSRE2 that seem to be major components of the nurse shark genome. Screening the library with single-copy or multi-copy gene probes showed 6-28 primary positive clones per probe of which 50-90% were true positives, demonstrating that the BAC library is representative of the different regions of the nurse shark genome. Furthermore, some BAC clones contained multiple genes, making physical mapping feasible. We have constructed a deep-coverage, high-quality, large insert, and publicly available BAC library for a cartilaginous fish. It will be very useful to the scientific community interested in shark genomic structure, comparative genomics, and functional studies. We found two new groups of repetitive elements specific to the nurse shark genome, which may contribute to the architecture and evolution of the nurse shark genome.
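The coverage figures in this abstract follow from simple arithmetic, sketched below. The clone count, mean insert size and fold coverage are taken from the abstract; the haploid genome size is not stated there and is only what these numbers imply. The function name is hypothetical.

```python
def library_coverage(n_clones: int, mean_insert_bp: int,
                     fold_coverage: float) -> tuple[int, float]:
    """Return (total cloned DNA in bp, haploid genome size in bp implied by the coverage)."""
    total_bp = n_clones * mean_insert_bp
    implied_genome_bp = total_bp / fold_coverage
    return total_bp, implied_genome_bp


total_bp, genome_bp = library_coverage(313_344, 144_000, 11)
print(f"total cloned DNA: {total_bp:.2e} bp")          # 4.51e+10
print(f"implied haploid genome: {genome_bp:.2e} bp")   # 4.10e+09
```

The product of clone number and mean insert size reproduces the quoted total of roughly 4.5 × 10¹⁰ bp, and dividing by the 11-fold coverage implies a haploid genome of about 4.1 Gb.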
Use of BAC clones as standardized reagents for Marek’s disease virus research
USDA-ARS's Scientific Manuscript database
The cloning of the Marek’s disease virus (MDV) genome as an infectious bacterial artificial chromosome (BAC) clone has led to major advances through our ability to study individual gene function by making precise insertions and deletions in the viral genome. We believe that MDV BAC clones will repl...
Inserting new technology into small missions
NASA Technical Reports Server (NTRS)
Deutsch, L. J.
2001-01-01
Part of what makes small missions small is that they have less money. Executing missions at low cost implies extensive use of cost sharing with other missions or use of existing solutions. However, in order to create many small missions, new technology must be developed, applied, and assimilated. Luckily, there are methods for creating new technology and inserting it into faster-better-cheaper (FBC) missions.
Coleman, John W; Wright, Kevin J; Wallace, Olivia L; Sharma, Palka; Arendt, Heather; Martinez, Jennifer; DeStefano, Joanne; Zamb, Timothy P; Zhang, Xinsheng; Parks, Christopher L
2015-03-01
Advancement of new vaccines based on live viral vectors requires sensitive assays to analyze in vivo replication, gene expression and genetic stability. In this study, attenuated canine distemper virus (CDV) was used as a vaccine delivery vector and duplex 2-step quantitative real-time RT-PCR (RT-qPCR) assays specific for genomic RNA (gRNA) or mRNA have been developed that concurrently quantify coding sequences for the CDV nucleocapsid protein (N) and a foreign vaccine antigen (SIV Gag). These amplicons, which had detection limits of about 10 copies per PCR reaction, were used to show that abdominal cavity lymphoid tissues were a primary site of CDV vector replication in infected ferrets, and importantly, CDV gRNA or mRNA was undetectable in brain tissue. In addition, the gRNA duplex assay was adapted for monitoring foreign gene insert genetic stability during in vivo replication by analyzing the ratio of CDV N and SIV gag genomic RNA copies over the course of vector infection. This measurement was found to be a sensitive probe for assessing the in vivo genetic stability of the foreign gene insert. Copyright © 2014 Elsevier B.V. All rights reserved.
A multi-landing pad DNA integration platform for mammalian cell engineering
Gaidukov, Leonid; Wroblewska, Liliana; Teague, Brian; Nelson, Tom; Zhang, Xin; Liu, Yan; Jagtap, Kalpana; Mamo, Selamawit; Tseng, Wen Allen; Lowe, Alexis; Das, Jishnu; Bandara, Kalpanie; Baijuraj, Swetha; Summers, Nevin M; Zhang, Lin; Weiss, Ron
2018-01-01
Abstract Engineering mammalian cell lines that stably express many transgenes requires the precise insertion of large amounts of heterologous DNA into well-characterized genomic loci, but current methods are limited. To facilitate reliable large-scale engineering of CHO cells, we identified 21 novel genomic sites that supported stable long-term expression of transgenes, and then constructed cell lines containing one, two or three ‘landing pad’ recombination sites at selected loci. By using a highly efficient BxB1 recombinase along with different selection markers at each site, we directed recombinase-mediated insertion of heterologous DNA to selected sites, including targeting all three with a single transfection. We used this method to controllably integrate up to nine copies of a monoclonal antibody, representing about 100 kb of heterologous DNA in 21 transcriptional units. Because the integration was targeted to pre-validated loci, recombinant protein expression remained stable for weeks and additional copies of the antibody cassette in the integrated payload resulted in a linear increase in antibody expression. Overall, this multi-copy site-specific integration platform allows for controllable and reproducible insertion of large amounts of DNA into stable genomic sites, which has broad applications for mammalian synthetic biology, recombinant protein production and biomanufacturing. PMID:29617873
Himar1 Transposon for Efficient Random Mutagenesis in Aggregatibacter actinomycetemcomitans
Ding, Qinfeng; Tan, Kai Soo
2017-01-01
Aggregatibacter actinomycetemcomitans is the primary etiological agent of aggressive periodontal disease. Identification of novel virulence factors at the genome-wide level is hindered by lack of efficient genetic tools to perform mutagenesis in this organism. The Himar1 mariner transposon is known to yield a random distribution of insertions in an organism’s genome with requirement for only a TA dinucleotide target and is independent of host-specific factors. However, the utility of this system in A. actinomycetemcomitans is unknown. In this study, we found that Himar1 transposon mutagenesis occurs at a high frequency (×10⁻⁴), and can be universally applied to wild-type A. actinomycetemcomitans strains of serotypes a, b, and c. The Himar1 transposon inserts were stably inherited in A. actinomycetemcomitans transconjugants in the absence of antibiotics. A library of 16,000 mutant colonies of A. actinomycetemcomitans was screened for reduced biofilm formation. Mutants with transposon inserts in genes encoding pilus, putative ion transporters, multidrug resistant proteins, transcription regulators and enzymes involved in the synthesis of extracellular polymeric substance, bacterial metabolism and stress response were discovered in this screen. Our results demonstrated the utility of the Himar1 mutagenesis system as a novel genetic tool for functional genomic analysis in A. actinomycetemcomitans. PMID:29018421
Easi-CRISPR for creating knock-in and conditional knockout mouse models using long ssDNA donors.
Miura, Hiromi; Quadros, Rolen M; Gurumurthy, Channabasavaiah B; Ohtsuka, Masato
2018-01-01
CRISPR/Cas9-based genome editing can easily generate knockout mouse models by disrupting the gene sequence, but its efficiency for creating models that require either insertion of exogenous DNA (knock-in) or replacement of genomic segments is very poor. The majority of mouse models used in research involve knock-in (reporters or recombinases) or gene replacement (e.g., conditional knockout alleles containing exons flanked by LoxP sites). A few methods for creating such models have been reported that use double-stranded DNA as donors, but their efficiency is typically 1-10% and therefore not suitable for routine use. We recently demonstrated that long single-stranded DNAs (ssDNAs) serve as very efficient donors, both for insertion and for gene replacement. We call this method efficient additions with ssDNA inserts-CRISPR (Easi-CRISPR) because it is a highly efficient technology (efficiency is typically 30-60% and reaches as high as 100% in some cases). The protocol takes ∼2 months to generate the founder mice.
NASA Astrophysics Data System (ADS)
Panicali, Dennis; Paoletti, Enzo
1982-08-01
We have constructed recombinant vaccinia viruses containing the thymidine kinase gene from herpes simplex virus. The gene was inserted into the genome of a variant of vaccinia virus that had undergone spontaneous deletion as well as into the 120-megadalton genome of the large prototypic vaccinia variant. This was accomplished via in vivo recombination by cotransfection of eukaryotic tissue culture cells with cloned BamHI-digested thymidine kinase gene from herpes simplex virus containing flanking vaccinia virus DNA sequences and infectious rescuing vaccinia virus. Pure populations of the recombinant viruses were obtained by replica filter techniques or by growth of the recombinant virus in biochemically selective medium. The herpes simplex virus thymidine kinase gene, as an insert in vaccinia virus, is transcribed in vivo and in vitro, and the fidelity of in vivo transcription into a functional gene product was detected by the phosphorylation of 5-[125I]iodo-2'-deoxycytidine.
Despott, Edward J; Murino, Alberto; Bourikas, Leonidas; Nakamura, Masanao; Ramachandra, Vino; Fraser, Chris
2015-05-01
Spiral enteroscopy is a recently introduced technology alternative to balloon-assisted enteroscopy for examination of the small bowel. To compare small bowel insertion depths and procedure duration by spiral enteroscopy and double-balloon enteroscopy performed in the same cohort of patients, in immediate succession, using the same method of insertion depth estimation. A prospective, back-to-back comparative study was performed in 15 patients. Spiral enteroscopy procedures were performed first and a tattoo was placed to mark the most distal point. Double-balloon enteroscopy passed the tattoo placed at spiral enteroscopy in 14/15 cases (93%). Median insertion depths for double-balloon enteroscopy and spiral enteroscopy were 265cm and 175cm, respectively (P=0.004). Median time to achieve maximal depth of insertion was significantly shorter for spiral enteroscopy compared with double-balloon enteroscopy (24min vs. 45min, respectively; P=0.0005). However, in 14 patients no differences were found in median time to reach the same insertion depth (P=0.28). Double-balloon enteroscopy achieved significantly greater small bowel insertion depth than spiral enteroscopy. Although overall double-balloon enteroscopy procedure duration was longer, the time taken to reach the same small bowel insertion depth by both spiral enteroscopy and double-balloon enteroscopy was similar. Copyright © 2015 Editrice Gastroenterologica Italiana S.r.l. Published by Elsevier Ltd. All rights reserved.
Design and construction of functional AAV vectors.
Gray, John T; Zolotukhin, Serge
2011-01-01
Using the basic principles of molecular biology and laboratory techniques presented in this chapter, researchers should be able to create a wide variety of AAV vectors for both clinical and basic research applications. Basic vector design concepts are covered for both protein coding gene expression and small non-coding RNA gene expression cassettes. AAV plasmid vector backbones (available via AddGene) are described, along with critical sequence details for a variety of modular expression components that can be inserted as needed for specific applications. Protocols are provided for assembling the various DNA components into AAV vector plasmids in Escherichia coli, as well as for transferring these vector sequences into baculovirus genomes for large-scale production of AAV in the insect cell production system.
Hatmaker, E. Anne; Wadl, Phillip A.; Mantooth, Kristie; Scheffler, Brian E.; Ownley, Bonnie H.; Trigiano, Robert N.
2015-01-01
Premise of the study: We developed microsatellites from Fothergilla ×intermedia to establish loci capable of distinguishing species and cultivars, and to assess genetic diversity for use by ornamental breeders and to transfer within Hamamelidaceae. Methods and Results: We sequenced a small insert genomic library enriched for microsatellites to develop 12 polymorphic microsatellite loci. The number of alleles detected ranged from four to 15 across five genera within Hamamelidaceae. Shannon’s information index ranged from 0.07 to 0.14. Conclusions: These microsatellite loci provide a set of markers to evaluate genetic diversity of natural and cultivated collections and assist ornamental plant breeders for genetic studies of five popular genera of woody ornamental plants. PMID:25909044
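The diversity statistic reported above can be illustrated with a minimal sketch. It assumes the common definition of Shannon's information index for a locus, I = -sum(p_i * ln p_i) over allele frequencies p_i; the abstract does not state which software or formula variant the authors used, and the function name and allele counts below are hypothetical.

```python
import math


def shannon_information_index(allele_counts):
    """Return I = -sum(p_i * ln p_i) over alleles with non-zero counts.

    allele_counts is a list of observed counts per allele at one locus
    (hypothetical input format, not taken from the study).
    """
    total = sum(allele_counts)
    if total <= 0:
        raise ValueError("allele_counts must contain at least one observation")
    index = 0.0
    for count in allele_counts:
        if count > 0:
            p = count / total          # allele frequency
            index -= p * math.log(p)   # natural log, as in the usual definition
    return index
```

Under this definition a locus with a single allele scores 0, and the index grows as alleles become more numerous and more evenly distributed.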
CRISPR-Cas9D10A Nickase-Assisted Genome Editing in Lactobacillus casei
Song, Xin; Huang, He; Xiong, Zhiqiang
2017-01-01
ABSTRACT Lactobacillus casei has drawn increasing attention as a health-promoting probiotic, while effective genetic manipulation tools are often not available, e.g., the single-gene knockout in L. casei still depends on the classic homologous recombination-dependent double-crossover strategy, which is quite labor-intensive and time-consuming. In the present study, a rapid and precise genome editing plasmid, pLCNICK, was established for L. casei genome engineering based on CRISPR-Cas9D10A. In addition to the P23-Cas9D10A and Pldh-sgRNA (single guide RNA) expression cassettes, pLCNICK includes the homologous arms of the target gene as repair templates. The ability and efficiency of chromosomal engineering using pLCNICK were evaluated by in-frame deletions of four independent genes and chromosomal insertion of an enhanced green fluorescent protein (eGFP) expression cassette at the LC2W_1628 locus. The efficiencies associated with in-frame deletions and chromosomal insertion ranged from 25 to 62%. pLCNICK has been proved to be an effective, rapid, and precise tool for genome editing in L. casei, and its potential application in other lactic acid bacteria (LAB) is also discussed in this study. IMPORTANCE The lack of efficient genetic tools has limited the investigation and biotechnological application of many LAB. The CRISPR-Cas9D10A nickase-based genome editing in Lactobacillus casei, an important food industrial microorganism, was demonstrated in this study. This genetic tool allows efficient single-gene deletion and insertion to be accomplished by one-step transformation, and the cycle time is reduced to 9 days. It facilitates a rapid and precise chromosomal manipulation in L. casei and overcomes some limitations of previous methods. This editing system can serve as a basic technological platform and offers the possibility to start a comprehensive investigation on L. casei.
As a broad-host-range plasmid, pLCNICK has the potential to be adapted to other Lactobacillus species for genome editing. PMID:28864652
Fulton, Benjamin O; Sachs, David; Schwarz, Megan C; Palese, Peter; Evans, Matthew J
2017-08-01
The molecular constraints affecting Zika virus (ZIKV) evolution are not well understood. To investigate ZIKV genetic flexibility, we used transposon mutagenesis to add 15-nucleotide insertions throughout the ZIKV MR766 genome and subsequently deep sequenced the viable mutants. Few ZIKV insertion mutants replicated, which likely reflects a high degree of functional constraints on the genome. The NS1 gene exhibited distinct mutational tolerances at different stages of the screen. This result may define regions of the NS1 protein that are required for the different stages of the viral life cycle. The ZIKV structural genes showed the highest degree of insertional tolerance. Although the envelope (E) protein exhibited particular flexibility, the highly conserved envelope domain II (EDII) fusion loop of the E protein was intolerant of transposon insertions. The fusion loop is also a target of pan-flavivirus antibodies that are generated against other flaviviruses and neutralize a broad range of dengue virus and ZIKV isolates. The genetic restrictions identified within the epitopes in the EDII fusion loop likely explain the sequence and antigenic conservation of these regions in ZIKV and among multiple flaviviruses. Thus, our results provide insights into the genetic restrictions on ZIKV that may affect the evolution of this virus. IMPORTANCE Zika virus recently emerged as a significant human pathogen. Determining the genetic constraints on Zika virus is important for understanding the factors affecting viral evolution. We used a genome-wide transposon mutagenesis screen to identify where mutations were tolerated in replicating viruses. We found that the genetic regions involved in RNA replication were mostly intolerant of mutations. The genes coding for structural proteins were more permissive to mutations. Despite the flexibility observed in these regions, we found that epitopes bound by broadly reactive antibodies were genetically constrained. 
This finding may explain the genetic conservation of these epitopes among flaviviruses. Copyright © 2017 American Society for Microbiology.
In and out of the rRNA genes: characterization of Pokey elements in the sequenced Daphnia genome
2013-01-01
Background Only a few transposable elements are known to exhibit site-specific insertion patterns, including the well-studied R-element retrotransposons that insert into specific sites within the multigene rDNA. The only known rDNA-specific DNA transposon, Pokey (superfamily: piggyBac) is found in the freshwater microcrustacean, Daphnia pulex. Here, we present a genome-wide analysis of Pokey based on the recently completed whole genome sequencing project for D. pulex. Results Phylogenetic analysis of Pokey elements recovered from the genome sequence revealed the presence of four lineages corresponding to two divergent autonomous families and two related lineages of non-autonomous miniature inverted repeat transposable elements (MITEs). The MITEs are also found at the same 28S rRNA gene insertion site as the Pokey elements, and appear to have arisen as deletion derivatives of autonomous elements. Several copies of the full-length Pokey elements may be capable of producing an active transposase. Surprisingly, both families of Pokey possess a series of 200 bp repeats upstream of the transposase that is derived from the rDNA intergenic spacer (IGS). The IGS sequences within the Pokey elements appear to be evolving in concert with the rDNA units. Finally, analysis of the insertion sites of Pokey elements outside of rDNA showed a target preference for sites similar to the specific sequence that is targeted within rDNA. Conclusions Based on the target site preference of Pokey elements and the concerted evolution of a segment of the element with the rDNA unit, we propose an evolutionary path by which the ancestors of Pokey elements have invaded the rDNA niche. We discuss how specificity for the rDNA unit may have evolved and how this specificity has played a role in the long-term survival of these elements in the subgenus Daphnia. PMID:24059783
The association between cecal insertion time and colorectal neoplasm detection
2013-01-01
Background Information on the impact of cecal insertion time on colorectal neoplasm detection is limited. Our objective was to determine the association between cecal insertion time and colorectal neoplasm detection rate in colonoscopy screening. Methods We performed a cross-sectional study of 12,679 consecutive subjects aged 40–79 years undergoing screening colonoscopy in routine health check-ups at the Center for Health Promotion of the Samsung Medical Center from December 2007 to June 2009. Fixed effects logistic regression conditioning on colonoscopist was used to eliminate confounding due to differences in technical ability and other characteristics across colonoscopists. Results The mean cecal insertion time was 5.9 minutes (SD, 4.4 minutes). We identified 4,249 (33.5%) participants with colorectal neoplasms, of whom 1,956 had small single adenomas (<5 mm), 595 had medium single adenomas (5–9 mm), and 1,699 had multiple adenomas or advanced colorectal neoplasms. The overall rates of colorectal neoplasm detection by quartiles of cecal insertion time were 36.8%, 33.4%, 32.7%, and 31.0%, respectively (p trend <0.001). The odds of small single colorectal adenoma detection were 16% lower (adjusted OR 0.84; 95% CI 0.71 to 0.99) in the fourth compared to the first quartile of insertion time (p trend 0.005). Insertion time was not associated with the detection rate of single adenomas ≥5 mm, multiple adenomas or advanced colorectal neoplasms. Conclusion Shorter insertion times were associated with increased rates of detection of small colorectal adenomas <5 mm. Cecal insertion time may be clinically relevant as missed small colorectal adenomas may progress to more advanced lesions. PMID:23915303
Merhej, Vicky; Raoult, Didier
2012-01-01
Darwin's theory about the evolution of species has been the object of considerable dispute. In this review, we have described seven key principles in Darwin's book The Origin of Species and tried to present how genomics challenges each of these concepts and improves our knowledge about evolution. Darwin believed that species evolution consists of a positive directional selection ensuring the “survival of the fittest.” The most developed state of the species is characterized by increasing complexity. Darwin proposed the theory of “descent with modification” according to which all species evolve from a single common ancestor through a gradual process of small modification of their vertical inheritance. Finally, the process of evolution can be depicted in the form of a tree. However, microbial genomics showed that evolution is better described as the “biological changes over time.” The mode of change is not unidirectional and does not necessarily favor advantageous mutations to increase fitness; rather, it is subject to random selection as a result of catastrophic stochastic processes. Complexity is not necessarily the completion of development: several complex organisms have gone extinct and many microbes including bacteria with intracellular lifestyle have streamlined highly effective genomes. Genomes evolve through large events of gene deletions, duplications, insertions, and genome rearrangements rather than a gradual adaptive process. Genomes are dynamic and chimeric entities with gene repertoires that result from vertical and horizontal acquisitions as well as de novo gene creation. The chimeric character of microbial genomes excludes the possibility of finding a single common ancestor for all the genes recorded currently. Genomes are collections of genes with different evolutionary histories that cannot be represented by a single tree of life (TOL).
A forest, a network or a rhizome of life may be more accurate to represent evolutionary relationships among species. PMID:22973559
Pestoides F, an atypical Yersinia pestis strain from the former Soviet Union.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garcia, Emilio; Worsham, Patricia; Bearden, S.
2007-01-01
Unlike the classical Yersinia pestis strains, members of an atypical group of Y. pestis from Central Asia, denominated Y. pestis subspecies caucasica (also known as one of several pestoides types), are distinguished by a number of characteristics including their ability to ferment rhamnose and melibiose, their lack of the small plasmid encoding the plasminogen activator (pla) and pesticin, and their exceptionally large variants of the virulence plasmid pMT (encoding murine toxin and capsular antigen). We have obtained the entire genome sequence of Y. pestis Pestoides F, an isolate from the former Soviet Union that has enabled us to carry out a comprehensive genome-wide comparison of this organism's genomic content against the six published sequences of Y. pestis and their Y. pseudotuberculosis ancestor. Based on classical glycerol fermentation (+ve) and nitrate reduction (+ve) Y. pestis Pestoides F is an isolate that belongs to the biovar antiqua. This strain is unusual in other characteristics such as the fact that it carries a non-consensus V antigen (lcrV) sequence, and that unlike other Pla(-) strains, Pestoides F retains virulence by the parenteral and aerosol routes. The chromosome of Pestoides F is 4,517,345 bp in size comprising some 3,936 predicted coding sequences, while its pCD and pMT plasmids are 71,507 bp and 137,010 bp in size respectively. Comparison of chromosome-associated genes in Pestoides F with those in the other sequenced Y. pestis strains reveals differences ranging from strain-specific rearrangements, insertions, deletions, single nucleotide polymorphisms, and a unique distribution of insertion sequences. There is a single approximately 7 kb unique region in the chromosome not found in any of the completed Y. pestis strains sequenced to date, but which is present in the Y. pseudotuberculosis ancestor. Taken together, these findings are consistent with Pestoides F being derived from the most ancient lineage of Y. pestis yet sequenced.
Pestoides F, an Atypical Yersinia pestis Strain from the Former Soviet Union
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garcia, E; Worsham, P; Bearden, S
2007-01-05
Unlike the classical Yersinia pestis strains, members of an atypical group of Y. pestis from Central Asia, denominated Y. pestis subspecies caucasica (also known as one of several pestoides types), are distinguished by a number of characteristics including their ability to ferment rhamnose and melibiose, their lacking the small plasmid encoding the plasminogen activator (pla) and pesticin, and their exceptionally large variants of the virulence plasmid pMT (encoding murine toxin and capsular antigen). We have obtained the entire genome sequence of Y. pestis Pestoides F, an isolate from the former Soviet Union that has enabled us to carry out a comprehensive genome-wide comparison of this organism's genomic content against the six published sequences of Y. pestis and their Y. pseudotuberculosis ancestor. Based on classical glycerol fermentation (+ve) and nitrate reduction (+ve) Y. pestis Pestoides F is an isolate that belongs to the biovar antiqua. This strain is unusual in other characteristics such as the fact that it carries a non-consensus V antigen (lcrV) sequence, and that unlike other Pla(-) strains, Pestoides F retains virulence by the parenteral and aerosol routes. The chromosome of Pestoides F is 4,517,345 bp in size comprising some 3,936 predicted coding sequences, while its pCD and pMT plasmids are 71,507 bp and 137,010 bp in size respectively. Comparison of chromosome-associated genes in Pestoides F with those in the other sequenced Y. pestis strains reveals a series of differences ranging from strain-specific rearrangements, insertions, deletions, single nucleotide polymorphisms, and a unique distribution of insertion sequences. There is a single approximately 7 kb unique region in the chromosome not found in any of the completed Y. pestis strains sequenced to date, but which is present in the Y. pseudotuberculosis ancestor. Taken together, these findings are consistent with Pestoides F being derived from the most ancient lineage of Y. pestis yet sequenced.
Traverse, Charles C; Ochman, Howard
2017-08-29
Advances in sequencing technologies have enabled direct quantification of genome-wide errors that occur during RNA transcription. These errors occur at rates that are orders of magnitude higher than rates during DNA replication, but due to technical difficulties such measurements have been limited to single-base substitutions and have not yet quantified the scope of transcription insertions and deletions. Previous reporter gene assay findings suggested that transcription indels are produced exclusively by elongation complex slippage at homopolymeric runs, so we enumerated indels across the protein-coding transcriptomes of Escherichia coli and Buchnera aphidicola, which differ widely in their genomic base compositions and incidence of repeat regions. As anticipated from prior assays, transcription insertions prevailed in homopolymeric runs of A and T; however, transcription deletions arose in much more complex sequences and were rarely associated with homopolymeric runs. By reconstructing the relocated positions of the elongation complex as inferred from the sequences inserted or deleted during transcription, we show that continuation of transcription after slippage hinges on the degree of nucleotide complementarity within the RNA:DNA hybrid at the new DNA template location. IMPORTANCE The high level of mistakes generated during transcription can result in the accumulation of malfunctioning and misfolded proteins which can alter global gene regulation and in the expenditure of energy to degrade these nonfunctional proteins. The transcriptome-wide occurrence of base substitutions has been elucidated in bacteria, but information on transcription insertions and deletions (errors that potentially have more dire effects on protein function) is limited to reporter gene constructs. Here, we capture the transcriptome-wide spectrum of insertions and deletions in Escherichia coli and Buchnera aphidicola and show that they occur at rates approaching those of base substitutions.
Knowledge of the full extent of sequences subject to transcription indels supports a new model of bacterial transcription slippage, one that relies on the number of complementary bases between the transcript and the DNA template to which it slipped. Copyright © 2017 Traverse and Ochman.
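Because the abstract above contrasts indels inside and outside homopolymeric runs, a short sketch of how such runs might be located in a template sequence may help. This is only an illustration: the function name, the DNA alphabet, and the minimum run length of 4 are assumptions, and the abstract does not describe the authors' actual pipeline or thresholds.

```python
import re


def homopolymer_runs(seq, min_len=4):
    """Yield (start, base, length) for every run of a single base
    that is at least min_len long (0-based start coordinate).

    min_len=4 is an arbitrary illustrative threshold.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # One captured base followed by at least (min_len - 1) repeats of it.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    for match in pattern.finditer(seq.upper()):
        yield match.start(), match.group(1), len(match.group(0))
```

An indel call could then be labelled as homopolymer-associated when its position falls inside one of the reported intervals.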
Haines, Bryan; Hughes, James; Corbett, Mark; Shaw, Marie; Innes, Josie; Patel, Leena; Gecz, Jozef; Clayton-Smith, Jill; Thomas, Paul
2015-05-01
46,XX male sex reversal occurs in approximately 1:20,000 live births and is most commonly caused by interchromosomal translocations of the Y-linked sex-determining gene, SRY. Rearrangements of the closely related SOX3 gene on the X chromosome are also associated with 46,XX male sex reversal. It has been hypothesized that sex reversal in the latter is caused by ectopic expression of SOX3 in the developing urogenital ridge where it triggers male development by acting as an analog of SRY. However, altered regulation of SOX3 in individuals with XX male sex reversal has not been demonstrated. Here we report a boy with SRY-negative XX male sex reversal who was diagnosed at birth with a small phallus, mixed gonads, and borderline-normal T. Molecular characterization of the affected individual was performed using array comparative genomic hybridization, fluorescent in situ hybridization of metaphase chromosomes, whole-genome sequencing, and RT-PCR expression analysis of lymphoblast cell lines. The affected male carries an ∼774-kb insertion translocation from chromosome 1 into a human-specific palindromic sequence 82 kb distal to SOX3. Importantly, robust SOX3 expression was identified in cells derived from the affected individual but not from control XX or XY cells, indicating that the translocation has a direct effect on SOX3 regulation. This is the first demonstration of altered SOX3 expression in an individual with XX male sex reversal and suggests that SOX3 can substitute for SRY to initiate male development in humans.
Steige, Kim A.; Reimegård, Johan; Koenig, Daniel; Scofield, Douglas G.; Slotte, Tanja
2015-01-01
The selfing syndrome constitutes a suite of floral and reproductive trait changes that have evolved repeatedly across many evolutionary lineages in response to the shift to selfing. Convergent evolution of the selfing syndrome suggests that these changes are adaptive, yet our understanding of the detailed molecular genetic basis of the selfing syndrome remains limited. Here, we investigate the role of cis-regulatory changes during the recent evolution of the selfing syndrome in Capsella rubella, which split from the outcrosser Capsella grandiflora less than 200 ka. We assess allele-specific expression (ASE) in leaves and flower buds at a total of 18,452 genes in three interspecific F1 C. grandiflora x C. rubella hybrids. Using a hierarchical Bayesian approach that accounts for technical variation using genomic reads, we find evidence for extensive cis-regulatory changes. On average, 44% of the assayed genes show evidence of ASE; however, only 6% show strong allelic expression biases. Flower buds, but not leaves, show an enrichment of cis-regulatory changes in genomic regions responsible for floral and reproductive trait divergence between C. rubella and C. grandiflora. We further detected an excess of heterozygous transposable element (TE) insertions near genes with ASE, and TE insertions targeted by uniquely mapping 24-nt small RNAs were associated with reduced expression of nearby genes. Our results suggest that cis-regulatory changes have been important during the recent adaptive floral evolution in Capsella and that differences in TE dynamics between selfing and outcrossing species could be important for rapid regulatory divergence in association with mating system shifts. PMID:26318184
Generation of Knock-in Mouse by Genome Editing.
Fujii, Wataru
2017-01-01
Knock-in mice are useful for evaluating endogenous gene expressions and functions in vivo. Instead of the conventional gene-targeting method using embryonic stem cells, an exogenous DNA sequence can be inserted into the target locus in the zygote using genome editing technology. In this chapter, I describe the generation of epitope-tagged mice using engineered endonuclease and single-stranded oligodeoxynucleotide through the mouse zygote as an example of how to generate a knock-in mouse by genome editing.
Howard, Thomas P.; Hayward, Andrew P.; Tordillos, Anthony; Fragoso, Christopher; Moreno, Maria A.; Tohme, Joe; Kausch, Albert P.; Mottinger, John P.; Dellaporta, Stephen L.
2014-01-01
Since their initial discovery, transposons have been widely used as mutagens for forward and reverse genetic screens in a range of organisms. The problems of high copy number and sequence divergence among related transposons have often limited the efficiency at which tagged genes can be identified. A method was developed to identify the locations of Mutator (Mu) transposons in the Zea mays genome using a simple enrichment method combined with genome resequencing to identify transposon junction fragments. The sequencing library was prepared from genomic DNA by digesting with a restriction enzyme that cuts within a perfectly conserved motif of the Mu terminal inverted repeats (TIR). Paired-end reads containing Mu TIR sequences were computationally identified and chromosomal sequences flanking the transposon were mapped to the maize reference genome. This method has been used to identify Mu insertions in a number of alleles and to isolate the previously unidentified lazy plant1 (la1) gene. The la1 gene is required for the negatively gravitropic response of shoots and mutant plants lack the ability to sense gravity. Using bioinformatic and fluorescence microscopy approaches, we show that the la1 gene encodes a cell membrane and nuclear localized protein. Our Mu-Taq method is readily adaptable to identify the genomic locations of any insertion of a known sequence in any organism using any sequencing platform. PMID:24498020
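The computational step described above (finding reads that contain the transposon end and keeping the adjacent chromosomal sequence for mapping) can be sketched roughly as follows. This is a simplified illustration, not the published Mu-Taq pipeline: it uses exact string matching on one strand, the function names and the 20-base minimum flank are assumptions, and the motif used in any example is a placeholder rather than the real Mu TIR sequence.

```python
def flank_after_motif(read, motif, min_flank=20):
    """Return the sequence that follows the first occurrence of motif
    in read, or None if the motif is absent or the flank is too short
    to be mapped reliably (min_flank is an illustrative cutoff)."""
    pos = read.find(motif)
    if pos == -1:
        return None
    flank = read[pos + len(motif):]
    return flank if len(flank) >= min_flank else None


def junction_flanks(reads, motif, min_flank=20):
    """Collect usable flanking sequences from an iterable of reads."""
    flanks = []
    for read in reads:
        flank = flank_after_motif(read, motif, min_flank)
        if flank is not None:
            flanks.append(flank)
    return flanks
```

In practice the retained flanks would be handed to a read aligner to place them on the reference genome, and a real implementation would also need to handle sequencing errors and the reverse-complement strand.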
Straub, Shannon C.K.; Cronn, Richard C.; Edwards, Christopher; Fishbein, Mark; Liston, Aaron
2013-01-01
Horizontal gene transfer (HGT) of DNA from the plastid to the nuclear and mitochondrial genomes of higher plants is a common phenomenon; however, plastid genomes (plastomes) are highly conserved and have generally been regarded as impervious to HGT. We sequenced the 158 kb plastome and the 690 kb mitochondrial genome of common milkweed (Asclepias syriaca [Apocynaceae]) and found evidence of intracellular HGT for a 2.4-kb segment of mitochondrial DNA to the rps2–rpoC2 intergenic spacer of the plastome. The transferred region contains an rpl2 pseudogene and is flanked by plastid sequence in the mitochondrial genome, including an rpoC2 pseudogene, which likely provided the mechanism for HGT back to the plastome through double-strand break repair involving homologous recombination. The plastome insertion is restricted to tribe Asclepiadeae of subfamily Asclepiadoideae, whereas the mitochondrial rpoC2 pseudogene is present throughout the subfamily, which confirms that the plastid to mitochondrial HGT event preceded the HGT to the plastome. Although the plastome insertion has been maintained in all lineages of Asclepiadoideae, it shows minimal evidence of transcription in A. syriaca and is likely nonfunctional. Furthermore, we found recent gene conversion of the mitochondrial rpoC2 pseudogene in Asclepias by the plastid gene, which reflects continued interaction of these genomes. PMID:24029811
Halász, Júlia; Kodad, Ossama; Hegedűs, Attila
2014-07-01
Miniature inverted-repeat transposable elements (MITEs) are known to contribute to the evolution of plants, but only limited information is available for MITEs in the Prunus genome. We identified a MITE that has been named Falling Stones, FaSt. All structural features (349-bp size, 82-bp terminal inverted repeats and 9-bp target site duplications) are consistent with this MITE being a putative member of the Mutator transposase superfamily. FaSt showed a preferential accumulation in the short AT-rich segments of the euchromatin region of the peach genome. DNA sequencing and pollination experiments have been performed to confirm that the nested insertion of FaSt into the S-haplotype-specific F-box gene of apricot resulted in the breakdown of self-incompatibility (SI). A bioinformatics-based survey of the known Rosaceae and other genomes and a newly designed polymerase chain reaction (PCR) assay verified the Prunoideae-specific occurrence of FaSt elements. Phylogenetic analysis suggested a recent activity of FaSt in the Prunus genome. The occurrence of a nested insertion in the apricot genome further supports the recent activity of FaSt in response to abiotic stress conditions. This study reports on a presumably active non-autonomous Mutator element in Prunus that exhibits a major indirect genome shaping force through inducing loss-of-function mutation in the SI locus. © 2014 The Authors The Plant Journal © 2014 John Wiley & Sons Ltd.
Wang, Chao; Shi, Xue; Liu, Lin; Li, Haiyan; Ammiraju, Jetty S S; Kudrna, David A; Xiong, Wentao; Wang, Hao; Dai, Zhaozhao; Zheng, Yonglian; Lai, Jinsheng; Jin, Weiwei; Messing, Joachim; Bennetzen, Jeffrey L; Wing, Rod A; Luo, Meizhong
2013-11-01
Maize is one of the most important food crops and a key model for genetics and developmental biology. A genetically anchored and high-quality draft genome sequence of maize inbred B73 has been obtained to serve as a reference sequence. To facilitate evolutionary studies in maize and its close relatives, much like the Oryza Map Alignment Project (OMAP) (www.OMAP.org) bacterial artificial chromosome (BAC) resource did for the rice community, we constructed BAC libraries for maize inbred lines Zheng58, Chang7-2, and Mo17 and maize wild relatives Zea mays ssp. parviglumis and Tripsacum dactyloides. Furthermore, to extend functional genomic studies to maize and sorghum, we also constructed binary BAC (BIBAC) libraries for the maize inbred B73 and the sorghum landrace Nengsi-1. The BAC/BIBAC vectors facilitate transfer of large intact DNA inserts from BAC clones to the BIBAC vector and functional complementation of large DNA fragments. These seven Zea Map Alignment Project (ZMAP) BAC/BIBAC libraries have average insert sizes ranging from 92 to 148 kb, organellar DNA from 0.17 to 2.3%, empty vector rates between 0.35 and 5.56%, and genome equivalents of 4.7- to 8.4-fold. The usefulness of the Parviglumis and Tripsacum BAC libraries was demonstrated by mapping clones to the reference genome. Novel genes and alleles present in these ZMAP libraries can now be used for functional complementation studies and positional or homology-based cloning of genes for translational genomics.
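The "genome equivalents" figure quoted above is, in the usual convention, the total cloned nuclear DNA divided by the genome size. A minimal sketch of that arithmetic follows; the function and parameter names are hypothetical, and subtracting both empty-vector and organellar clones before computing coverage is an assumption, since the abstract does not say how the authors corrected their estimates.

```python
def genome_equivalents(n_clones, mean_insert_kb, genome_size_mb,
                       empty_rate=0.0, organellar_rate=0.0):
    """Estimate library coverage in genome equivalents.

    n_clones        total number of clones in the library
    mean_insert_kb  average insert size in kilobases
    genome_size_mb  haploid genome size in megabases
    empty_rate      fraction of clones without an insert (assumed correction)
    organellar_rate fraction of clones carrying organellar DNA (assumed correction)
    """
    usable_clones = n_clones * (1.0 - empty_rate - organellar_rate)
    return usable_clones * mean_insert_kb / (genome_size_mb * 1000.0)
```

For example, with purely illustrative numbers, 100,000 clones averaging 120 kb would cover a 2,400 Mb genome about five times before any correction for empty or organellar clones.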
Guo, Xiaosen; Brenner, Max; Zhang, Xuemei; Laragione, Teresina; Tai, Shuaishuai; Li, Yanhong; Bu, Junjie; Yin, Ye; Shah, Anish A.; Kwan, Kevin; Li, Yingrui; Jun, Wang; Gulko, Pércio S.
2013-01-01
DA (D-blood group of Palm and Agouti, also known as Dark Agouti) and F344 (Fischer) are two inbred rat strains with differences in several phenotypes, including susceptibility to autoimmune disease models and inflammatory responses. While these strains have been extensively studied, little information is available about the DA and F344 genomes, as only the Brown Norway (BN) and spontaneously hypertensive rat strains have been sequenced to date. Here we report the sequencing of the DA and F344 genomes using next-generation Illumina paired-end read technology and the first de novo assembly of a rat genome. DA and F344 were sequenced with an average depth of 32-fold, covered 98.9% of the BN reference genome, and included 97.97% of known rat ESTs. New sequences could be assigned to 59 million positions with previously unknown data in the BN reference genome. Differences between DA, F344, and BN included 19 million positions in novel scaffolds, 4.09 million single nucleotide polymorphisms (SNPs) (including 1.37 million new SNPs), 458,224 short insertions and deletions, and 58,174 structural variants. Genetic differences between DA, F344, and BN, including high-impact SNPs and short insertions and deletions affecting >2500 genes, are likely to account for most of the phenotypic variation between these strains. The new DA and F344 genome sequencing data should facilitate gene discovery efforts in rat models of human disease. PMID:23695301
Novel insertion mutation in a non-Jewish Caucasian type 1 Gaucher disease patient
DOE Office of Scientific and Technical Information (OSTI.GOV)
Choy, F.Y.M.; Humphries, M.L.; Ferreira, P.
1997-01-20
Gaucher disease is the most prevalent lysosomal storage disorder. It is autosomal recessive, resulting in lysosomal glucocerebrosidase deficiency. Three clinical forms of Gaucher disease have been described: type 1 (nonneuronopathic), type 2 (acute neuronopathic), and type 3 (subacute neuronopathic). We performed PCR-thermal cycle sequence analysis of glucocerebrosidase genomic DNA and identified a novel mutation in a non-Jewish type 1 Gaucher disease patient. It is a C insertion in exon 3 at cDNA nucleotide position 122 and genomic nucleotide position 1626. This mutation causes a frameshift and, subsequently, four of the five codons immediately downstream of the insertion were changed while the sixth was converted to a stop codon, resulting in premature termination of protein translation. The 122CC insertion abolishes a Cac8I restriction endonuclease cleavage site, allowing a convenient and reliable method for detection using RFLP analysis of PCR-amplified glucocerebrosidase genomic DNA. The mutation in the other Gaucher allele was found to be an A→G substitution at glucocerebrosidase cDNA nucleotide position 1226 that so far has only been reported among type 1 Gaucher disease patients. Since mutation 122CC causes a frameshift and early termination of protein translation, it most likely results in a meaningless transcript and subsequently no residual glucocerebrosidase enzyme activity. We speculate that mutation 122CC may result in a worse prognosis than mutations associated with partial activity. When present in the homozygous form, it could be a lethal allele similar to what has been postulated for the other known insertion mutation, 84GG. Our patient, who is a compound heterozygote 122CC/1226G, has moderately severe type 1 Gaucher disease. Her clinical response to Ceredase® therapy that began 31 months ago has been favorable, though incomplete. 30 refs., 3 figs., 2 tabs.
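The frameshift mechanism described in this abstract can be illustrated with a small Python sketch. The coding sequence below is an invented toy example, not the glucocerebrosidase gene, and the helper names are hypothetical; the point is only that inserting a single base shifts every downstream codon and can bring a stop codon into frame.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def insert_base(cds: str, position: int, base: str) -> str:
    """Return cds with one base inserted before the 0-based position."""
    return cds[:position] + base + cds[position:]


# Toy sequence (hypothetical): eight codons with no in-frame stop.
wild_type = "ATGGCACTGATGACTAAGGTTCAT"
mutant = insert_base(wild_type, 6, "C")  # single C insertion after codon 2

wild_protein = translate(wild_type)
mutant_protein = translate(mutant)
```

In this toy case the codons after the insertion encode different amino acids and a TAA stop enters the reading frame after three altered codons, so the mutant product is truncated, which is the same kind of outcome the abstract reports for 122CC.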
Russo, Alice G; Eden, John-Sebastian; Enosi Tuipulotu, Daniel; Shi, Mang; Selechnik, Daniel; Shine, Richard; Rollins, Lee Ann; Holmes, Edward C; White, Peter A
2018-06-13
Cane toads are a notorious invasive species, inhabiting over 1.2 million km² of Australia and threatening native biodiversity. Release of pathogenic cane toad viruses is one possible biocontrol strategy yet is currently hindered by the poorly described cane toad virome. Metatranscriptomic analysis of 16 cane toad livers revealed the presence of a novel and full-length picornavirus, Rhimavirus A (RhiV-A), a member of a reptile- and amphibian-specific cluster of the Picornaviridae basal to the Kobuvirus-like group. In the combined liver transcriptome, we also identified a complete genome sequence of a distinct epsilonretrovirus, R. marina endogenous retrovirus (RMERV). The recently sequenced cane toad genome contains eight complete RMERV proviruses, as well as 21 additional truncated insertions. The oldest full-length RMERV provirus was estimated to have inserted 1.9 MYA. To screen for these viral sequences in additional toads, we analysed publicly available transcriptomes from six diverse Australian locations. RhiV-A transcripts were identified in toads sampled from three locations across 1,000 km of Australia, stretching to the current Western Australia (WA) invasion front, whilst RMERV transcripts were observed at all six sites. Lastly, we scanned the cane toad genome for non-retroviral endogenous viral elements, finding three sequences related to small DNA viruses in the family Circoviridae. This shows ancestral circoviral infection with subsequent genomic integration. The identification of these current and past viral infections enriches our knowledge of the cane toad virome, an understanding of which will facilitate future work on infection and disease in this important invasive species.
Importance: Cane toads are poisonous amphibians which were introduced to Australia in 1935 for insect control. Since then, their population has increased dramatically, and they now threaten many native Australian species. One potential method to control the population is to release a cane toad virus with high mortality, yet few cane toad viruses have been characterised. This study samples cane toads from different Australian locations and uses an RNA sequencing and computational approach to find new viruses. We report novel complete picornavirus and retrovirus sequences which were genetically similar to viruses infecting frogs, reptiles and fish. Using data generated in other studies, we show that these viral sequences are present in cane toads from distinct Australian locations. Three sequences related to circoviruses were also found in the toad genome. The identification of new viral sequences will aid future studies which investigate their prevalence and potential as agents for biocontrol. Copyright © 2018 American Society for Microbiology.
Murukarthick, Jayakodi; Sampath, Perumal; Lee, Sang Choon; Choi, Beom-Soon; Senthil, Natesan; Liu, Shengyi; Yang, Tae-Jin
2014-06-20
MITE, TRIM and SINEs are miniature form transposable elements (mTEs) that are ubiquitous and dispersed throughout entire plant genomes. Tens of thousands of members cause insertion polymorphism at both the inter- and intra- species level. Therefore, mTEs are valuable targets and resources for development of markers that can be utilized for breeding, genetic diversity and genome evolution studies. Taking advantage of the completely sequenced genomes of Brassica rapa and B. oleracea, characterization of mTEs and building a curated database are prerequisite to extending their utilization for genomics and applied fields in Brassica crops. We have developed BrassicaTED as a unique web portal containing detailed characterization information for mTEs of Brassica species. At present, BrassicaTED has datasets for 41 mTE families, including 5894 and 6026 members from 20 MITE families, 1393 and 1639 members from 5 TRIM families, 1270 and 2364 members from 16 SINE families in B. rapa and B. oleracea, respectively. BrassicaTED offers different sections to browse structural and positional characteristics for every mTE family. In addition, we have added data on 289 MITE insertion polymorphisms from a survey of seven Brassica relatives. Genes with internal mTE insertions are shown with detailed gene annotation and microarray-based comparative gene expression data in comparison with their paralogs in the triplicated B. rapa genome. This database also includes a novel tool, K BLAST (Karyotype BLAST), for clear visualization of the locations for each member in the B. rapa and B. oleracea pseudo-genome sequences. BrassicaTED is a newly developed database of information regarding the characteristics and potential utility of mTEs including MITE, TRIM and SINEs in B. rapa and B. oleracea. The database will promote the development of desirable mTE-based markers, which can be utilized for genomics and breeding in Brassica species. 
BrassicaTED will be a valuable repository for scientists and breeders, promoting efficient research on Brassica species. BrassicaTED can be accessed at http://im-crop.snu.ac.kr/BrassicaTED/index.php.
New traits in crops produced by genome editing techniques based on deletions.
van de Wiel, C C M; Schaart, J G; Lotz, L A P; Smulders, M J M
2017-01-01
One of the most promising New Plant Breeding Techniques is genome editing (also called gene editing) with the help of a programmable site-directed nuclease (SDN). In this review, we focus on SDN-1, which is the generation of small deletions or insertions (indels) at a precisely defined location in the genome with zinc finger nucleases (ZFN), TALENs, or CRISPR-Cas9. The programmable nuclease is used to induce a double-strand break in the DNA, while the repair is left to the plant cell itself, and mistakes are introduced, while the cell is repairing the double-strand break using the relatively error-prone NHEJ pathway. From a biological point of view, it could be considered as a form of targeted mutagenesis. We first discuss improvements and new technical variants for SDN-1, in particular employing CRISPR-Cas, and subsequently explore the effectiveness of targeted deletions that eliminate the function of a gene, as an approach to generate novel traits useful for improving agricultural sustainability, including disease resistances. We compare them with examples of deletions that resulted in novel functionality as known from crop domestication and classical mutation breeding (both using radiation and chemical mutagens). Finally, we touch upon regulatory and access and benefit sharing issues regarding the plants produced.
Greiner, Stephan; Wang, Xi; Rauwolf, Uwe; Silber, Martina V.; Mayer, Klaus; Meurer, Jörg; Haberer, Georg; Herrmann, Reinhold G.
2008-01-01
The flowering plant genus Oenothera is uniquely suited for studying molecular mechanisms of speciation. It assembles an intriguing combination of genetic features, including permanent translocation heterozygosity, biparental transmission of plastids, and a general interfertility of well-defined species. This allows an exchange of plastids and nuclei between species often resulting in plastome–genome incompatibility. For evaluation of its molecular determinants we present the complete nucleotide sequences of the five basic, genetically distinguishable plastid chromosomes of subsection Oenothera (=Euoenothera) of the genus, which are associated in distinct combinations with six basic genomes. Sizes of the chromosomes range from 163 365 bp (plastome IV) to 165 728 bp (plastome I), display between 96.3% and 98.6% sequence similarity and encode a total of 113 unique genes. Plastome diversification is caused by an abundance of nucleotide substitutions, small insertions, deletions and repetitions. The five plastomes deviate from the general ancestral design of plastid chromosomes of vascular plants by a subsection-specific 56 kb inversion within the large single-copy segment. This inversion disrupted operon structures and predates the divergence of the subsection presumably 1 My ago. Phylogenetic relationships suggest plastomes I–III in one clade, while plastome IV appears to be closest to the common ancestor. PMID:18299283
Mu, John C.; Tootoonchi Afshar, Pegah; Mohiyuddin, Marghoob; Chen, Xi; Li, Jian; Bani Asadi, Narges; Gerstein, Mark B.; Wong, Wing H.; Lam, Hugo Y. K.
2015-01-01
A high-confidence, comprehensive human variant set is critical in assessing accuracy of sequencing algorithms, which are crucial in precision medicine based on high-throughput sequencing. Although recent works have attempted to provide such a resource, they still do not encompass all major types of variants including structural variants (SVs). Thus, we leveraged the massive high-quality Sanger sequences from the HuRef genome to construct by far the most comprehensive gold set of a single individual, which was cross validated with deep Illumina sequencing, population datasets, and well-established algorithms. It was a necessary effort to completely reanalyze the HuRef genome as its previously published variants were mostly reported five years ago, suffering from compatibility, organization, and accuracy issues that prevent their direct use in benchmarking. Our extensive analysis and validation resulted in a gold set with high specificity and sensitivity. In contrast to the current gold sets of the NA12878 or HS1011 genomes, our gold set is the first that includes small variants, deletion SVs and insertion SVs up to a hundred thousand base-pairs. We demonstrate the utility of our HuRef gold set to benchmark several published SV detection tools. PMID:26412485
Srinivasachary; Dida, Mathews M; Gale, Mike D; Devos, Katrien M
2007-08-01
Finger millet is an allotetraploid (2n = 4x = 36) grass that belongs to the Chloridoideae subfamily. A comparative analysis has been carried out to determine the relationship of the finger millet genome with that of rice. Six of the nine finger millet homoeologous groups corresponded to a single rice chromosome each. Each of the remaining three finger millet groups was orthologous to two rice chromosomes, and in all three cases one rice chromosome was inserted into the centromeric region of a second rice chromosome to give the finger millet chromosomal configuration. All observed rearrangements were, among the grasses, unique to finger millet and, possibly, the Chloridoideae subfamily. Gene orders between rice and finger millet were highly conserved, with rearrangements being limited largely to single marker transpositions and small putative inversions encompassing at most three markers. Only some 10% of markers mapped to non-syntenic positions in rice and finger millet and the majority of these were located in the distal 14% of chromosome arms, supporting a possible correlation between recombination and sequence evolution as has previously been observed in wheat. A comparison of the organization of finger millet, Panicoideae and Pooideae genomes relative to rice allowed us to infer putative ancestral chromosome configurations in the grasses.
Tenebrio molitor antifreeze protein gene identification and regulation.
Qin, Wensheng; Walker, Virginia K
2006-02-15
The yellow mealworm, Tenebrio molitor, is a freeze susceptible, stored product pest. Its winter survival is facilitated by the accumulation of antifreeze proteins (AFPs), encoded by a small gene family. We have now isolated 11 different AFP genomic clones from 3 genomic libraries. All the clones had a single coding sequence, with no evidence of intervening sequences. Three genomic clones were further characterized. All have putative TATA box sequences upstream of the coding regions and multiple potential poly(A) signal sequences downstream of the coding regions. A TmAFP regulatory region, B1037, conferred transcriptional activity when ligated to a luciferase reporter sequence and after transfection into an insect cell line. A 143 bp core promoter including a TATA box sequence was identified. Its promoter activity was increased 4.4 times by inserting an exotic 245 bp intron into the construct, similar to the enhancement of transgenic expression seen in several other systems. The addition of a duplication of the first 120 bp sequence from the 143 bp core promoter decreased promoter activity by half. Although putative hormonal response sequences were identified, none of the five hormones tested enhanced reporter activity. These studies on the mechanisms of AFP transcriptional control are important for the consideration of any transfer of freeze-resistance phenotypes to beneficial hosts.
Active Transposition in Genomes
Huang, Cheng Ran Lisa; Burns, Kathleen H.; Boeke, Jef D.
2013-01-01
Transposons are DNA sequences capable of moving in genomes. Early evidence showed their accumulation in many species and suggested their continued activity in at least isolated organisms. In the past decade, with the development of various genomic technologies, it has become abundantly clear that ongoing activity is the rule rather than the exception. Active transposons of various classes are observed throughout plants and animals, including humans. They continue to create new insertions, have an enormous variety of structural and functional impact on genes and genomes, and play important roles in genome evolution. Transposon activities have been identified and measured by employing various strategies. Here, we summarize evidence of current transposon activity in various plant and animal genomes. PMID:23145912
Liu, Yan; Wu, Bin; Weinstock, George; Walker, David H.; Yu, Xue-jie
2014-01-01
Louse borne typhus (also called epidemic typhus) was one of man's major scourges, and epidemics of the disease can be reignited when social, economic, or political systems are disrupted. The fear of a bioterrorist attack using the etiologic agent of typhus, Rickettsia prowazekii, was a reality. An attenuated typhus vaccine, R. prowazekii Madrid E strain, was observed to revert to virulence as demonstrated by isolation of the virulent revertant Evir strain from animals which were inoculated with Madrid E strain. The mechanism of the mutation in R. prowazekii that affects the virulence of the vaccine was not known. We sequenced the genome of the virulent revertant Evir strain and compared its genome sequence with the genome sequences of its parental strain, Madrid E. We found that only a single nucleotide in the entire genome was different between the vaccine strain Madrid E and its virulent revertant strain Evir. The mutation is a single nucleotide insertion in the methyltransferase gene (also known as PR028) in the vaccine strain that inactivated the gene. We also confirmed that the vaccine strain E did not cause fever in guinea pigs and the virulent revertant strain Evir caused fever in guinea pigs. We concluded that a single nucleotide insertion in the methyltransferase gene of R. prowazekii attenuated the R. prowazekii vaccine strain E. This suggested that an irreversible insertion or deletion mutation in the methyl transferase gene of R. prowazekii is required for Madrid E to be considered a safe vaccine. PMID:25412248
Global Genomic Diversity of Oryza sativa Varieties Revealed by Comparative Physical Mapping
Wang, Xiaoming; Kudrna, David A.; Pan, Yonglong; Wang, Hao; Liu, Lin; Lin, Haiyan; Zhang, Jianwei; Song, Xiang; Goicoechea, Jose Luis; Wing, Rod A.; Zhang, Qifa; Luo, Meizhong
2014-01-01
Bacterial artificial chromosome (BAC) physical maps embedding a large number of BAC end sequences (BESs) were generated for Oryza sativa ssp. indica varieties Minghui 63 (MH63) and Zhenshan 97 (ZS97) and were compared with the genome sequences of O. sativa ssp. japonica cv. Nipponbare and O. sativa ssp. indica cv. 93-11. The comparisons exhibited substantial diversities in terms of large structural variations and small substitutions and indels. Genome-wide BAC-sized and contig-sized structural variations were detected, and the shared variations were analyzed. In the expansion regions of the Nipponbare reference sequence, in comparison to the MH63 and ZS97 physical maps, as well as to the previously constructed 93-11 physical map, the amounts and types of the repeat contents, and the outputs of gene ontology analysis, were significantly different from those of the whole genome. Using the physical maps of four wild Oryza species from OMAP (http://www.omap.org) as a control, we detected many conserved and divergent regions related to the evolution process of O. sativa. Between the BESs of MH63 and ZS97 and the two reference sequences, a total of 1532 polymorphic simple sequence repeats (SSRs), 71,383 SNPs, 1767 multiple nucleotide polymorphisms, 6340 insertions, and 9137 deletions were identified. This study provides independent whole-genome resources for intra- and intersubspecies comparisons and functional genomics studies in O. sativa. Both the comparative physical maps and the GBrowse, which integrated the QTL and molecular markers from GRAMENE (http://www.gramene.org) with our physical maps and analysis results, are open to the public through our Web site (http://gresource.hzau.edu.cn/resource/resource.html). PMID:24424778
Arashida, Ryo; Kakizawa, Shigeyuki; Hoshi, Ayaka; Ishii, Yoshiko; Jung, Hee-Young; Kagiwada, Satoshi; Yamaji, Yasuyuki; Oshima, Kenro; Namba, Shigetou
2008-04-01
Phytoplasmas are phloem-limited plant pathogens that are transmitted by insect vectors and are associated with diseases in hundreds of plant species. Despite their small sizes, phytoplasma genomes have repeat-rich sequences, which are due to several genes that are encoded as multiple copies. These multiple genes exist in a gene cluster, the potential mobile unit (PMU). PMUs are present at several distinct regions in the phytoplasma genome. The multicopy genes encoded by PMUs (herein named mobile unit genes [MUGs]) and similar genes elsewhere in the genome (herein named fundamental genes [FUGs]) are likely to have the same function based on their annotations. In this manuscript we show evidence that MUGs and FUGs do not cluster together within the same clade. Each MUG is in a cluster with a short branch length, suggesting that MUGs are recently diverged paralogs, whereas the origin of FUGs is different from that of MUGs. We also compared the genome structures around the lplA gene in two derivative lines of the 'Candidatus Phytoplasma asteris' OY strain, the severe-symptom line W (OY-W) and the mild-symptom line M (OY-M). The gene organizations of the nucleotide sequences upstream of the lplA genes of OY-W and OY-M were dramatically different. The tra5 insertion sequence, an element of PMUs, was found only in this region in OY-W. These results suggest that transposition of entire PMUs and PMU sections has occurred frequently in the OY phytoplasma genome. The difference in the pathogenicities of OY-W and OY-M might be caused by the duplication and transposition of PMUs, followed by genome rearrangement.
Beare, Paul A.; Samuel, James E.; Howe, Dale; Virtaneva, Kimmo; Porcella, Stephen F.; Heinzen, Robert A.
2006-01-01
Coxiella burnetii, a gram-negative obligate intracellular bacterium, causes human Q fever and is considered a potential agent of bioterrorism. Distinct genomic groups of C. burnetii are revealed by restriction fragment-length polymorphisms (RFLP). Here we comprehensively define the genetic diversity of C. burnetii by hybridizing the genomes of 20 RFLP-grouped and four ungrouped isolates from disparate sources to a high-density custom Affymetrix GeneChip containing all open reading frames (ORFs) of the Nine Mile phase I (NMI) reference isolate. We confirmed the relatedness of RFLP-grouped isolates and showed that two ungrouped isolates represent distinct genomic groups. Isolates contained up to 20 genomic polymorphisms consisting of 1 to 18 ORFs each. These were mostly complete ORF deletions, although partial deletions, point mutations, and insertions were also identified. A total of 139 chromosomal and plasmid ORFs were polymorphic among all C. burnetii isolates, representing ca. 7% of the NMI coding capacity. Approximately 67% of all deleted ORFs were hypothetical, while 9% were annotated in NMI as nonfunctional (e.g., frameshifted). The remaining deleted ORFs were associated with diverse cellular functions. The only deletions associated with isogenic NMI variants of attenuated virulence were previously described large deletions containing genes involved in lipopolysaccharide (LPS) biosynthesis, suggesting that these polymorphisms alone are responsible for the lower virulence of these variants. Interestingly, a variant of the Australia QD isolate producing truncated LPS had no detectable deletions, indicating LPS truncation can occur via small genetic changes. Our results provide new insight into the genetic diversity and virulence potential of Coxiella species. PMID:16547017
Elevated mitochondrial genome variation after 50 generations of radiation exposure in a wild rodent.
Baker, Robert J; Dickins, Benjamin; Wickliffe, Jeffrey K; Khan, Faisal A A; Gaschak, Sergey; Makova, Kateryna D; Phillips, Caleb D
2017-09-01
Currently, the effects of chronic, continuous low dose environmental irradiation on the mitochondrial genome of resident small mammals are unknown. Using the bank vole (Myodes glareolus) as a model system, we tested the hypothesis that approximately 50 generations of exposure to the Chernobyl environment has significantly altered genetic diversity of the mitochondrial genome. Using deep sequencing, we compared mitochondrial genomes from 131 individuals from reference sites with radioactive contamination comparable to that present in northern Ukraine before the 26 April 1986 meltdown, to populations where substantial fallout was deposited following the nuclear accident. Population genetic variables revealed significant differences among populations from contaminated and uncontaminated localities. Therefore, we rejected the null hypothesis of no significant genetic effect from 50 generations of exposure to the environment created by the Chernobyl meltdown. Samples from contaminated localities exhibited significantly higher numbers of haplotypes and polymorphic loci, elevated genetic diversity, and a significantly higher average number of substitutions per site across mitochondrial gene regions. Observed genetic variation was dominated by synonymous mutations, which may indicate a history of purifying selection against nonsynonymous or insertion/deletion mutations. These significant differences were not attributable to sample size artifacts. The observed increase in mitochondrial genomic diversity in voles from radioactive sites is consistent with the possibility that chronic, continuous irradiation resulting from the Chernobyl disaster has produced an accelerated mutation rate in this species over the last 25 years. Our results, being the first to demonstrate this phenomenon in a wild mammalian species, are important for understanding genetic consequences of exposure to low-dose radiation sources.
Whole-genome sequencing of Atacama skeleton shows novel mutations linked with dysplasia.
Bhattacharya, Sanchita; Li, Jian; Sockell, Alexandra; Kan, Matthew J; Bava, Felice A; Chen, Shann-Ching; Ávila-Arcos, María C; Ji, Xuhuai; Smith, Emery; Asadi, Narges B; Lachman, Ralph S; Lam, Hugo Y K; Bustamante, Carlos D; Butte, Atul J; Nolan, Garry P
2018-04-01
Over a decade ago, the Atacama humanoid skeleton (Ata) was discovered in the Atacama region of Chile. The Ata specimen carried a strange phenotype (6-in stature, fewer than expected ribs, elongated cranium, and accelerated bone age), leading to speculation that this was a preserved nonhuman primate, human fetus harboring genetic mutations, or even an extraterrestrial. We previously reported that it was human by DNA analysis with an estimated bone age of about 6-8 yr at the time of demise. To determine the possible genetic drivers of the observed morphology, DNA from the specimen was subjected to whole-genome sequencing using the Illumina HiSeq platform with an average 11.5× coverage of 101-bp, paired-end reads. In total, 3,356,569 single nucleotide variations (SNVs) were found as compared to the human reference genome, 518,365 insertions and deletions (indels), and 1047 structural variations (SVs) were detected. Here, we present the detailed whole-genome analysis showing that Ata is a female of human origin, likely of Chilean descent, and its genome harbors mutations in genes (COL1A1, COL2A1, KMT2D, FLNB, ATR, TRIP11, PCNT) previously linked with diseases of small stature, rib anomalies, cranial malformations, premature joint fusion, and osteochondrodysplasia (also known as skeletal dysplasia). Together, these findings provide a molecular characterization of Ata's peculiar phenotype, which likely results from multiple known and novel putative gene mutations affecting bone development and ossification. © 2018 Bhattacharya et al.; Published by Cold Spring Harbor Laboratory Press.
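As a rough illustration of what "11.5× coverage of 101-bp reads" implies, the sketch below applies the standard relation coverage = (number of reads × read length) / genome size. The genome size of 3.1 Gb is an assumption introduced here, not a figure from the abstract, and the function name is hypothetical; the result is only an order-of-magnitude estimate of reads contributing to coverage.

```python
# Back-of-the-envelope sketch: reads needed for a given average coverage.
# coverage = n_reads * read_length / genome_size  (standard relation)

def reads_for_coverage(coverage, genome_size_bp, read_length_bp):
    """Return the number of reads N such that N * L / G equals `coverage`."""
    return coverage * genome_size_bp / read_length_bp

# ASSUMPTION: approximate human genome size; not stated in the abstract.
GENOME_SIZE_BP = 3.1e9

# Values taken from the abstract: 11.5x average coverage, 101-bp reads.
n_reads = reads_for_coverage(11.5, GENOME_SIZE_BP, 101)
print(f"approx. {n_reads:.2e} reads")
```

Under these assumptions the estimate is on the order of 3.5 × 10^8 individual reads; the true number of raw reads would differ with mapping rate and duplicates.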
An Inducible, Isogenic Cancer Cell Line System for Targeting the State of Mismatch Repair Deficiency
Bailis, Julie M.; Gordon, Marcia L.; Gurgel, Jesse L.; Komor, Alexis C.; Barton, Jacqueline K.; Kirsch, Ilan R.
2013-01-01
The DNA mismatch repair system (MMR) maintains genome stability through recognition and repair of single-base mismatches and small insertion-deletion loops. Inactivation of the MMR pathway causes microsatellite instability and the accumulation of genomic mutations that can cause or contribute to cancer. In fact, 10-20% of certain solid and hematologic cancers are MMR-deficient. MMR-deficient cancers do not respond to some standard of care chemotherapeutics because of presumed increased tolerance of DNA damage, highlighting the need for novel therapeutic drugs. Toward this goal, we generated isogenic cancer cell lines for direct comparison of MMR-proficient and MMR-deficient cells. We engineered NCI-H23 lung adenocarcinoma cells to contain a doxycycline-inducible shRNA designed to suppress the expression of the mismatch repair gene MLH1, and compared single cell subclones that were uninduced (MLH1-proficient) versus induced for the MLH1 shRNA (MLH1-deficient). Here we present the characterization of these MMR-inducible cell lines and validate a novel class of rhodium metalloinsertor compounds that differentially inhibit the proliferation of MMR-deficient cancer cells. PMID:24205301
Soifer, Harris S; Zaragoza, Adriana; Peyvan, Maany; Behlke, Mark A; Rossi, John J
2005-01-01
Long interspersed nuclear elements (LINE-1 or L1) comprise 17% of the human genome, although only 80-100 L1s are considered retrotransposition-competent (RC-L1). Despite their small number, RC-L1s are still potential hazards to genome integrity through insertional mutagenesis, unequal recombination and chromosome rearrangements. In this study, we provide several lines of evidence that the LINE-1 retrotransposon is susceptible to RNA interference (RNAi). First, double-stranded RNA (dsRNA) generated in vitro from an L1 template is converted into functional short interfering RNA (siRNA) by DICER, the RNase III enzyme that initiates RNAi in human cells. Second, pooled siRNA from in vitro cleavage of L1 dsRNA, as well as synthetic L1 siRNA, targeting the 5'-UTR leads to sequence-specific mRNA degradation of an L1 fusion transcript. Finally, both synthetic and pooled siRNA suppressed retrotransposition from a highly active RC-L1 clone in cell culture assay. Our report is the first to demonstrate that a human transposable element is subjected to RNAi.
Mutation detection using automated fluorescence-based sequencing.
Montgomery, Kate T; Iartchouck, Oleg; Li, Li; Perera, Anoja; Yassin, Yosuf; Tamburino, Alex; Loomis, Stephanie; Kucherlapati, Raju
2008-04-01
The development of high-throughput DNA sequencing techniques has made direct DNA sequencing of PCR-amplified genomic DNA a rapid and economical approach to the identification of polymorphisms that may play a role in disease. Point mutations as well as small insertions or deletions are readily identified by DNA sequencing. The mutations may be heterozygous (occurring in one allele while the other allele retains the normal sequence) or homozygous (occurring in both alleles). Sequencing alone cannot discriminate between true homozygosity and apparent homozygosity due to the loss of one allele due to a large deletion. In this unit, strategies are presented for using PCR amplification and automated fluorescence-based sequencing to identify sequence variation. The size of the project and laboratory preference and experience will dictate how the data is managed and which software tools are used for analysis. A high-throughput protocol is given that has been used to search for mutations in over 200 different genes at the Harvard Medical School - Partners Center for Genetics and Genomics (HPCGG, http://www.hpcgg.org/). Copyright 2008 by John Wiley & Sons, Inc.
Wicker, Thomas; Yu, Yeisoo; Haberer, Georg; Mayer, Klaus F. X.; Marri, Pradeep Reddy; Rounsley, Steve; Chen, Mingsheng; Zuccolo, Andrea; Panaud, Olivier; Wing, Rod A.; Roffler, Stefan
2016-01-01
DNA (class 2) transposons are mobile genetic elements which move within their 'host' genome through excising and re-inserting elsewhere. Although the rice genome contains tens of thousands of such elements, their actual role in evolution is still unclear. Analysing over 650 transposon polymorphisms in the rice species Oryza sativa and Oryza glaberrima, we find that DNA repair following transposon excisions is associated with an increased number of mutations in the sequences neighbouring the transposon. Indeed, the 3,000 bp flanking the excised transposons can contain over 10 times more mutations than the genome-wide average. Since DNA transposons preferably insert near genes, this is correlated with increases in mutation rates in coding sequences and regulatory regions. Most importantly, we find this phenomenon also in maize, wheat and barley. Thus, these findings suggest that DNA transposon activity is a major evolutionary force in grasses which provide the basis of most food consumed by humankind. PMID:27599761
Hulse-Kemp, Amanda M; Maheshwari, Shamoni; Stoffel, Kevin; Hill, Theresa A; Jaffe, David; Williams, Stephen R; Weisenfeld, Neil; Ramakrishnan, Srividya; Kumar, Vijay; Shah, Preyas; Schatz, Michael C; Church, Deanna M; Van Deynze, Allen
2018-01-01
Linked-Read sequencing technology has recently been employed successfully for de novo assembly of human genomes; however, the utility of this technology for complex plant genomes is unproven. We evaluated the technology for this purpose by sequencing the 3.5-gigabase (Gb) diploid pepper (Capsicum annuum) genome with a single Linked-Read library. Plant genomes, including pepper, are characterized by long, highly similar repetitive sequences. Accordingly, significant effort is used to ensure that the sequenced plant is highly homozygous and the resulting assembly is a haploid consensus. With a phased assembly approach, we targeted a heterozygous F1 derived from a wide cross to assess the ability to derive both haplotypes and characterize a pungency gene with a large insertion/deletion. The Supernova software generated a highly ordered, more contiguous sequence assembly than all currently available C. annuum reference genomes. Over 83% of the final assembly was anchored and oriented using four publicly available de novo linkage maps. A comparison of the annotation of conserved eukaryotic genes indicated the completeness of assembly. The validity of the phased assembly is further demonstrated with the complete recovery of both 2.5-Kb insertion/deletion haplotypes of the PUN1 locus in the F1 sample that represents pungent and nonpungent peppers, as well as nearly full recovery of the BUSCO2 gene set within each of the two haplotypes. The most contiguous pepper genome assembly to date has been generated which demonstrates that Linked-Read library technology provides a tool to de novo assemble complex highly repetitive heterozygous plant genomes. This technology can provide an opportunity to cost-effectively develop high-quality genome assemblies for other complex plants and compare structural and gene differences through accurate haplotype reconstruction.
Drancourt, M
2012-03-01
With plague being not only a subject of interest for historians, but still a disease of public health concern in several countries, mainly in Africa, there were hopes that analyses of the Yersinia pestis genomes would put an end to this deadly epidemic pathogen. Genomics revealed that Y. pestis isolates evolved from Yersinia pseudotuberculosis in Central Asia some millennia ago, after the acquisition of two Y. pestis-specific plasmids balanced genomic reduction parallel with the expansion of insertion sequences, illustrating the modern concept that, except for the acquisition of plasmid-borne toxin-encoding genes, the increased virulence of Y. pestis resulted from gene loss rather than gene acquisition. The telluric persistence of Y. pestis reminds us of this close relationship, and matters in terms of plague epidemiology. Whereas biotype Orientalis isolates spread worldwide, the Antiqua and Medievalis isolates showed more limited expansion. In addition to animal ectoparasites, human ectoparasites such as the body louse may have participated in this expansion and in devastating historical epidemics. The recent analysis of a Black Death genome indicated that it was more closely related to the Orientalis branch than to the Medievalis branch. Modern Y. pestis isolates grossly exhibit the same gene content, but still undergo micro-evolution in geographically limited areas by differing in the genome architecture, owing to inversions near insertion sequences and the stabilization of the YpfPhi prophage in Orientalis biotype isolates. Genomics have provided several new molecular tools for the genotyping and phylogeographical tracing of isolates and description of plague foci. However, genomics and post-genomics approaches have not yet provided new tools for the prevention, diagnosis and management of plague patients and the plague epidemics still raging in some sub-Saharan countries. © 2012 The Author. 
Clinical Microbiology and Infection © 2012 European Society of Clinical Microbiology and Infectious Diseases.
Structural diversity of domain superfamilies in the CATH database.
Reeves, Gabrielle A; Dallman, Timothy J; Redfern, Oliver C; Akpor, Adrian; Orengo, Christine A
2006-07-14
The CATH database of domain structures has been used to explore the structural variation of homologous domains in 294 well populated domain structure superfamilies, each containing at least three sequence diverse relatives. Our analyses confirm some previously detected trends relating sequence divergence to structural variation but for a much larger dataset and in some superfamilies the new data reveal exceptional structural variation. Use of a new algorithm (2DSEC) to analyse variability in secondary structure compositions across a superfamily sheds new light on how structures evolve. 2DSEC detects inserted secondary structures that embellish the core of conserved secondary structures found throughout the superfamily. Analysis showed that for 56% of highly populated superfamilies (>9 sequence diverse relatives), there are twofold or more increases in the numbers of secondary structures in some relatives. In some families fivefold increases occur, sometimes modifying the fold of the domain. Manual inspection of secondary structure insertions or embellishments in 48 particularly variable superfamilies revealed that although these insertions were usually discontiguous in the sequence they were often co-located in 3D resulting in a larger structural motif that often modified the geometry of the active site or the surface conformation promoting diverse domain partnerships and protein interactions. These observations, supported by automatic analysis of all well populated CATH families, suggest that accretion of small secondary structure insertions may provide a simple mechanism for evolving new functions in diverse relatives. Some layered domain architectures (e.g. mainly-beta and alpha-beta sandwiches) that recur highly in the genomes more frequently exploit these types of embellishments to modify function. In these architectures, aggregation occurs most often at the edges, top or bottom of the beta-sheets. 
Information on structural variability across domain superfamilies has been made available through the CATH Dictionary of Homologous Structures (DHS).
USDA-ARS's Scientific Manuscript database
Over 10,000 new mutants have been added to the UniformMu reverse genetics resource in release 7, bringing the total to over 67,000 germinal transposon insertions. These are available in 11,140 independent seed stocks. Close to half of the maize filtered gene set (42%) is represented by at least one ...
Using Cellular Proteins to Reveal Mechanisms of HIV Infection | Center for Cancer Research
A vital step in HIV infection is the insertion of viral DNA into the genome of the host cell. In order for the insertion to occur, viral nucleic acid must be transported through the membrane that separates the main cellular compartment (the cytoplasm) from the nucleus, where the host DNA is located. Scientists are actively studying the mechanism used to transport viral DNA
Novel modes of RNA editing in mitochondria
Moreira, Sandrine; Valach, Matus; Aoulad-Aissa, Mohamed; Otto, Christian; Burger, Gertraud
2016-01-01
Gene structure and expression in diplonemid mitochondria are unparalleled. Genes are fragmented in pieces (modules) that are separately transcribed, followed by the joining of module transcripts to contiguous RNAs. Some instances of unique uridine insertion RNA editing at module boundaries were noted, but the extent and potential occurrence of other editing types remained unknown. Comparative analysis of deep transcriptome and genome data from Diplonema papillatum mitochondria reveals ∼220 post-transcriptional insertions of uridines, but no insertions of other nucleotides nor deletions. In addition, we detect in total 114 substitutions of cytosine by uridine and adenosine by inosine, amassed into unusually compact clusters. Inosines in transcripts were confirmed experimentally. This is the first report of adenosine-to-inosine editing of mRNAs and ribosomal RNAs in mitochondria. In mRNAs, editing causes mostly amino-acid additions and non-synonymous substitutions; in ribosomal RNAs, it permits formation of canonical secondary structures. Two extensively edited transcripts were compared across four diplonemids. The pattern of uridine-insertion editing is strictly conserved, whereas substitution editing has diverged dramatically, but still rendering diplonemid proteins more similar to other eukaryotic orthologs. We posit that RNA editing not only compensates but also sustains, or even accelerates, ultra-rapid evolution of genome structure and sequence in diplonemid mitochondria. PMID:27001515
Insertion Sequence-Caused Large Scale-Rearrangements in the Genome of Escherichia coli
2016-07-18
rearrangements in the genome of Escherichia coli Heewook Lee1,2, Thomas G. Doak3,4, Ellen Popodi3, Patricia L. Foster3 and Haixu Tang1,* 1School of ... and excisions of IS elements and recombination between homologous IS elements identified in a large collection of Escherichia coli mutation accu ... scale rearrangements arose in the Escherichia coli genome during a long-term evolution experiment in a recent study (8). Combining WGSS with
Pietras, D F; Bennett, K L; Siracusa, L D; Woodworth-Gutai, M; Chapman, V M; Gross, K W; Kane-Haas, C; Hastie, N D
1983-01-01
We report the construction of a small library of recombinant plasmids containing Mus musculus repetitive DNA inserts. The repetitive cloned fraction was derived from denatured genomic DNA by reassociation to a Cot value at which repetitive, but not unique, sequences have reannealed followed by exhaustive S1 nuclease treatment to degrade single stranded DNA. Initial characterizations of this library by colony filter hybridizations have led to the identification of a previously undetected M. musculus minor satellite as well as to clones containing M. musculus major satellite sequences. This new satellite is repeated 10-20 times less than the major satellite in the M. musculus genome. It has a repeat length of 130 nucleotides compared with the M. musculus major satellite with a repeat length of 234 nucleotides. Sequence analysis of the minor satellite has shown that it has a 29 base pair region with extensive homology to one of the major satellite repeating subunits. We also show by in situ hybridization that this minor satellite sequence is located at the centromeres and possibly the arms of at least half the M. musculus chromosomes. Sequences related to the minor satellite have been found in the DNA of a related Mus species, Mus spretus, and may represent the major satellite of that species. PMID:6314268
NASA Technical Reports Server (NTRS)
Norga, Koenraad K.; Gurganus, Marjorie C.; Dilda, Christy L.; Yamamoto, Akihiko; Lyman, Richard F.; Patel, Prajal H.; Rubin, Gerald M.; Hoskins, Roger A.; Mackay, Trudy F.; Bellen, Hugo J.
2003-01-01
BACKGROUND: The identification of the function of all genes that contribute to specific biological processes and complex traits is one of the major challenges in the postgenomic era. One approach is to employ forward genetic screens in genetically tractable model organisms. In Drosophila melanogaster, P element-mediated insertional mutagenesis is a versatile tool for the dissection of molecular pathways, and there is an ongoing effort to tag every gene with a P element insertion. However, the vast majority of P element insertion lines are viable and fertile as homozygotes and do not exhibit obvious phenotypic defects, perhaps because of the tendency for P elements to insert 5' of transcription units. Quantitative genetic analysis of subtle effects of P element mutations that have been induced in an isogenic background may be a highly efficient method for functional genome annotation. RESULTS: Here, we have tested the efficacy of this strategy by assessing the extent to which screening for quantitative effects of P elements on sensory bristle number can identify genes affecting neural development. We find that such quantitative screens uncover an unusually large number of genes that are known to function in neural development, as well as genes with yet uncharacterized effects on neural development, and novel loci. CONCLUSIONS: Our findings establish the use of quantitative trait analysis for functional genome annotation through forward genetics. Similar analyses of quantitative effects of P element insertions will facilitate our understanding of the genes affecting many other complex traits in Drosophila.
Hensing, Thomas; Schrock, Alexa B.; Allen, Justin; Sanford, Eric; Gowen, Kyle; Kulkarni, Atul; He, Jie; Suh, James H.; Lipson, Doron; Elvin, Julia A.; Yelensky, Roman; Chalmers, Zachary; Chmielecki, Juliann; Peled, Nir; Klempner, Samuel J.; Firozvi, Kashif; Frampton, Garrett M.; Molina, Julian R.; Menon, Smitha; Brahmer, Julie R.; MacMahon, Heber; Nowak, Jan; Ou, Sai-Hong Ignatius; Zauderer, Marjorie; Ladanyi, Marc; Zakowski, Maureen; Fischbach, Neil; Ross, Jeffrey S.; Stephens, Phil J.; Miller, Vincent A.; Wakelee, Heather
2016-01-01
Introduction. For patients with non-small cell lung cancer (NSCLC) to benefit from ALK inhibitors, sensitive and specific detection of ALK genomic rearrangements is needed. ALK break-apart fluorescence in situ hybridization (FISH) is the U.S. Food and Drug Administration approved and standard-of-care diagnostic assay, but identification of ALK rearrangements by other methods reported in NSCLC cases that tested negative for ALK rearrangements by FISH suggests a significant false-negative rate. We report here a large series of NSCLC cases assayed by hybrid-capture-based comprehensive genomic profiling (CGP) in the course of clinical care. Materials and Methods. Hybrid-capture-based CGP using next-generation sequencing was performed in the course of clinical care of 1,070 patients with advanced lung cancer. Each tumor sample was evaluated for all classes of genomic alterations, including base-pair substitutions, insertions/deletions, copy number alterations and rearrangements, as well as fusions/rearrangements. Results. A total of 47 patients (4.4%) were found to harbor ALK rearrangements, of whom 41 had an EML4-ALK fusion, and 6 had other fusion partners, including 3 previously unreported rearrangement events: EIF2AK-ALK, PPM1B-ALK, and PRKAR1A-ALK. Of 41 patients harboring ALK rearrangements, 31 had prior FISH testing results available. Of these, 20 were ALK FISH positive, and 11 (35%) were ALK FISH negative. Of the latter 11 patients, 9 received crizotinib based on the CGP results, and 7 achieved a response with median duration of 17 months. Conclusion. Comprehensive genomic profiling detected canonical ALK rearrangements and ALK rearrangements with noncanonical fusion partners in a subset of patients with NSCLC with previously negative ALK FISH results. In this series, such patients had durable responses to ALK inhibitors, comparable to historical response rates for ALK FISH-positive cases. 
Implications for Practice: Comprehensive genomic profiling (CGP) that includes hybrid capture and specific baiting of intron 19 of ALK is a highly sensitive, alternative method for identification of drug-sensitive ALK fusions in patients with non-small cell lung cancer (NSCLC) who had previously tested negative using standard ALK fluorescence in situ hybridization (FISH) diagnostic assays. Given the proven benefit of treatment with crizotinib and second-generation ALK inhibitors in patients with ALK fusions, CGP should be considered in patients with NSCLC, including those who have tested negative for other alterations, including negative results using ALK FISH testing. PMID:27245569
Ali, Siraj M; Hensing, Thomas; Schrock, Alexa B; Allen, Justin; Sanford, Eric; Gowen, Kyle; Kulkarni, Atul; He, Jie; Suh, James H; Lipson, Doron; Elvin, Julia A; Yelensky, Roman; Chalmers, Zachary; Chmielecki, Juliann; Peled, Nir; Klempner, Samuel J; Firozvi, Kashif; Frampton, Garrett M; Molina, Julian R; Menon, Smitha; Brahmer, Julie R; MacMahon, Heber; Nowak, Jan; Ou, Sai-Hong Ignatius; Zauderer, Marjorie; Ladanyi, Marc; Zakowski, Maureen; Fischbach, Neil; Ross, Jeffrey S; Stephens, Phil J; Miller, Vincent A; Wakelee, Heather; Ganesan, Shridar; Salgia, Ravi
2016-06-01
For patients with non-small cell lung cancer (NSCLC) to benefit from ALK inhibitors, sensitive and specific detection of ALK genomic rearrangements is needed. ALK break-apart fluorescence in situ hybridization (FISH) is the U.S. Food and Drug Administration approved and standard-of-care diagnostic assay, but identification of ALK rearrangements by other methods reported in NSCLC cases that tested negative for ALK rearrangements by FISH suggests a significant false-negative rate. We report here a large series of NSCLC cases assayed by hybrid-capture-based comprehensive genomic profiling (CGP) in the course of clinical care. Hybrid-capture-based CGP using next-generation sequencing was performed in the course of clinical care of 1,070 patients with advanced lung cancer. Each tumor sample was evaluated for all classes of genomic alterations, including base-pair substitutions, insertions/deletions, copy number alterations and rearrangements, as well as fusions/rearrangements. A total of 47 patients (4.4%) were found to harbor ALK rearrangements, of whom 41 had an EML4-ALK fusion, and 6 had other fusion partners, including 3 previously unreported rearrangement events: EIF2AK-ALK, PPM1B-ALK, and PRKAR1A-ALK. Of 41 patients harboring ALK rearrangements, 31 had prior FISH testing results available. Of these, 20 were ALK FISH positive, and 11 (35%) were ALK FISH negative. Of the latter 11 patients, 9 received crizotinib based on the CGP results, and 7 achieved a response with median duration of 17 months. Comprehensive genomic profiling detected canonical ALK rearrangements and ALK rearrangements with noncanonical fusion partners in a subset of patients with NSCLC with previously negative ALK FISH results. In this series, such patients had durable responses to ALK inhibitors, comparable to historical response rates for ALK FISH-positive cases. 
Comprehensive genomic profiling (CGP) that includes hybrid capture and specific baiting of intron 19 of ALK is a highly sensitive, alternative method for identification of drug-sensitive ALK fusions in patients with non-small cell lung cancer (NSCLC) who had previously tested negative using standard ALK fluorescence in situ hybridization (FISH) diagnostic assays. Given the proven benefit of treatment with crizotinib and second-generation ALK inhibitors in patients with ALK fusions, CGP should be considered in patients with NSCLC, including those who have tested negative for other alterations, including negative results using ALK FISH testing. ©AlphaMed Press.
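The percentages quoted in this abstract can be checked directly from the reported counts. The short sketch below reproduces them; the variable names are illustrative, and the response proportion among crizotinib-treated patients (7 of 9) is derived here from the stated counts rather than quoted from the abstract.

```python
# Reproduce the proportions reported in the abstract from its raw counts.

def pct(numerator, denominator):
    """Return numerator/denominator as a percentage."""
    return 100.0 * numerator / denominator

total_patients = 1070   # patients profiled by CGP
alk_rearranged = 47     # patients with any ALK rearrangement
eml4_alk = 41           # of those, EML4-ALK fusions
with_prior_fish = 31    # patients with prior ALK FISH results
fish_negative = 11      # of those, ALK FISH negative
treated = 9             # FISH-negative patients given crizotinib
responders = 7          # of those, patients who responded

other_partners = alk_rearranged - eml4_alk         # 6, as stated
prevalence = pct(alk_rearranged, total_patients)   # reported as 4.4%
fish_neg_rate = pct(fish_negative, with_prior_fish)  # reported as 35%
response_rate = pct(responders, treated)           # derived, not quoted

print(f"ALK rearrangement prevalence: {prevalence:.1f}%")
print(f"FISH-negative among FISH-tested: {fish_neg_rate:.0f}%")
print(f"Responders among treated FISH-negative: {response_rate:.0f}%")
```

With only 9 treated patients, the derived response proportion carries wide uncertainty and should be read as descriptive only.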
Horvath, Robert
2017-01-01
To avoid negative effects of transposable element (TE) proliferation, plants epigenetically silence TEs using a number of mechanisms, including RNA-directed DNA methylation. These epigenetic modifications can extend outside the boundaries of TE insertions and lead to silencing of nearby genes, resulting in a trade-off between TE silencing and interference with nearby gene regulation. Therefore, purifying selection is expected to remove silenced TE insertions near genes more efficiently and prevent their accumulation within a population. To explore how effects of TE silencing on gene regulation shape purifying selection on TEs, we analyzed whole genome sequencing data from 166 individuals of a large population of the outcrossing species Capsella grandiflora. We found that most TEs are rare, and in chromosome arms, silenced TEs are exposed to stronger purifying selection than those that are not silenced by 24-nucleotide small RNAs, especially with increasing proximity to genes. An age-of-allele test of neutrality on a subset of TEs supports our inference of purifying selection on silenced TEs, suggesting that our results are robust to varying transposition rates. Our results provide new insights into the processes affecting the accumulation of TEs in an outcrossing species and support the view that epigenetic silencing of TEs results in a trade-off between preventing TE proliferation and interference with nearby gene regulation. We also suggest that in the centromeric and pericentromeric regions, the negative aspects of epigenetic TE silencing are missing. PMID:29036316
Kassis, J. A.
1994-01-01
We have previously shown that a 2-kb fragment of engrailed DNA can suppress expression of a linked marker gene, white, in the P element vector CaSpeR. This suppression is dependent on the presence of two copies of engrailed DNA-containing P elements (P[en]) in proximity in the Drosophila genome (either in cis or in trans). In this study, the 2-kb fragment was dissected and found to contain three fragments of DNA which could mediate white suppression [called "pairing-sensitive sites" (PS)]. A PS site was also identified in regulatory DNA from the Drosophila escargot gene. The eye colors of six different P[en] insertions in the escargot gene suggest an interaction between P[en]-encoded and genome-encoded PS sites. I hypothesize that white gene expression from P[en] is repressed by the formation of a protein complex which is initiated at the engrailed PS sites and also requires interactions with flanking genomic DNA. Genes were sought which influence the function of PS sites. Mutations in some Polycomb and trithorax group genes were found to affect the eye color from some P[en] insertion sites. However, different mutations affected expression from different P[en] insertion sites and no one mutation was found to affect expression from all P[en] insertion sites examined. These results suggest that white expression from P[en] is not directly regulated by members of the Polycomb and trithorax group genes, but in some cases can be influenced by them. I propose that engrailed PS sites normally act to promote interactions between distantly located engrailed regulatory sites and the engrailed promoter. PMID:8005412
2011-01-01
BACKGROUND: Twenty-nine Marek's disease virus (MDV) strains were isolated during a 3 year period (2007-2010) from vaccinated and infected chicken flocks in Poland. These strains had caused severe clinical symptoms and lesions. In spite of proper vaccination with mono- or bivalent vaccines against Marek's disease (MD), the chickens developed symptoms of MD with paralysis. Because of this we decided to investigate possible changes and mutations in the field strains that could potentially increase their virulence. We supposed that such mutations may have been caused by recombination with retroviruses of poultry, especially reticuloendotheliosis virus (REV). METHODS: In order to detect the possible reasons for recent changes in virulence of MDV strains, polymerase chain reaction (PCR) analyses for the meq oncogene and for the long-terminal repeat (LTR) region of REV were conducted. The obtained PCR products were sequenced and compared with other MDV and REV strains isolated worldwide and accessible in the GenBank database. RESULTS: Sequencing of the meq oncogene showed a 68 base pair insertion and frame shift within 12 of 24 field strains. Interestingly, the analyses also showed 0.78, 0.8, 0.82, 1.6 kb and other random LTR-REV insertions into the MDV genome in 28 of 29 strains. These genetic inserts were present after passage in chicken embryo kidney cells, suggesting LTR integration into a non-functional region of the MDV genome. CONCLUSION: The results indicate the presence of recombination between MDV and REV under field conditions in Polish chicken farms. The genetic changes within the MDV genome may influence the virus replication and its features in vivo. However, there is no evidence that meq alteration and REV insertions are related to the strains' virulence. PMID:21320336
Genetic Control of Plant Root Colonization by the Biocontrol agent, Pseudomonas fluorescens
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cole, Benjamin J.; Fletcher, Meghan; Waters, Jordan
Plant growth promoting rhizobacteria (PGPR) are a critical component of plant root ecosystems. PGPR promote plant growth by solubilizing inaccessible minerals, suppressing pathogenic microorganisms in the soil, and directly stimulating growth through hormone synthesis. Pseudomonas fluorescens is a well-established PGPR isolated from wheat roots that can also colonize the root system of the model plant, Arabidopsis thaliana. We have created barcoded transposon insertion mutant libraries suitable for genome-wide transposon-mediated mutagenesis followed by sequencing (TnSeq). These libraries consist of over 105 independent insertions, collectively providing loss-of-function mutants for nearly all genes in the P.fluorescens genome. Each insertion mutant can be unambiguouslymore » identified by a randomized 20 nucleotide sequence (barcode) engineered into the transposon sequence. We used these libraries in a gnotobiotic assay to examine the colonization ability of P.fluorescens on A.thaliana roots. Taking advantage of the ability to distinguish individual colonization events using barcode sequences, we assessed the timing and microbial concentration dependence of colonization of the rhizoplane niche. These data provide direct insight into the dynamics of plant root colonization in an in vivo system and define baseline parameters for the systematic identification of the bacterial genes and molecular pathways using TnSeq assays. Having determined parameters that facilitate potential colonization of roots by thousands of independent insertion mutants in a single assay, we are currently establishing a genome-wide functional map of genes required for root colonization in P.fluorescens. 
Importantly, the approach developed and optimized here for P. fluorescens-A. thaliana colonization will be applicable to a wide range of plant-microbe interactions, including biofuel feedstock plants and microbes known or hypothesized to impact on biofuel-relevant traits including biomass productivity and pathogen resistance.
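The barcode-based identification described in this abstract can be illustrated with a minimal sketch. It assumes each sequencing read carries the 20-nucleotide barcode between two constant flanking sequences; the flank sequences and the function name below are hypothetical placeholders, since the abstract does not give the transposon sequence or the authors' pipeline.

```python
import re
from collections import Counter

# Hypothetical constant sequences flanking the barcode; the real transposon
# sequences are not given in the abstract.
LEFT_FLANK = "GATGTC"
RIGHT_FLANK = "CGACCT"

# A barcode is taken to be exactly 20 unambiguous nucleotides between the flanks.
BARCODE = re.compile(LEFT_FLANK + "([ACGT]{20})" + RIGHT_FLANK)


def count_barcodes(reads):
    """Count how often each 20-nt barcode occurs in an iterable of read strings."""
    counts = Counter()
    for read in reads:
        match = BARCODE.search(read)
        if match:  # reads without a recognizable barcode are skipped
            counts[match.group(1)] += 1
    return counts
```

Comparing such counts between the inoculum and the root-associated population is one plausible way to score colonization per mutant; the abstract does not say how the authors quantify it.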
El-Telbany, Ahmed
2012-01-01
Cancer is now known as a disease of genomic alterations. Mutational analysis and genomics profiling in recent years have advanced the field of lung cancer genetics/genomics significantly. It is becoming more accepted now that the identification of genomic alterations in lung cancer can impact therapeutics, especially when the alterations represent “oncogenic drivers” in the processes of tumorigenesis and progression. In this review, we will highlight the key driver oncogenic gene mutations and fusions identified in lung cancer. The review will summarize and report the available demographic and clinicopathological data as well as molecular details behind various lung cancer gene alterations in the context of race. We hope to shed some light into the disparities in the incidence of various genetic mutations among lung cancer patients of different racial backgrounds. As molecularly targeted therapy continues to advance in lung cancer, racial differences in specific genetic/genomic alterations can have an important impact in the choices of therapeutics and in our understanding of the drug sensitivity/resistance profile. The most relevant genes in lung cancer described in this review include the following: EGFR, KRAS, MET, LKB1, BRAF, PIK3CA, ALK, RET, and ROS1. Commonly identified genetic/genomic alterations such as missense or nonsense mutations, small insertions or deletions, alternative splicing, and chromosomal fusion rearrangements were discussed. Relevance in current targeted therapeutic drugs was mentioned when appropriate. We also highlighted various targeted therapeutics that are currently under clinical development, such as the MET inhibitors and antibodies. With the advent of next-generation sequencing, the landscape of genomic alterations in lung cancer is expected to be much transformed and detailed in upcoming years. 
These genomic landscape differences in the context of racial disparities should be emphasized both in tumorigenesis and in drug sensitivity/resistance. It is hoped that such effort will help to diminish racial disparities in lung cancer outcome in the future. PMID:23264847
Kayansamruaj, Pattanapon; Pirarat, Nopadon; Kondo, Hidehiro; Hirono, Ikuo; Rodkhum, Channarong
2015-12-01
Streptococcus agalactiae, or Group B streptococcus (GBS), is a highly virulent pathogen in aquatic animals, causing huge mortalities worldwide. In Thailand, the serotype Ia, β-hemolytic GBS, belonging to sequence type (ST) 7 of clonal complex (CC) 7, was found to be the major cause of streptococcosis outbreaks in fish farms. In this study, we performed an in silico genomic comparison, aiming to investigate the phylogenetic relationship between the pathogenic fish strains of Thai ST7 and other ST7 from different hosts and geographical origins. In general, the genomes of Thai ST7 strains are closely related to other fish ST7s, as the core genome is shared by 92-95% of any individual fish ST7 genome. Among the fish ST7 genomes, we observed only small dissimilarities, based on the analysis of clustered regularly interspaced short palindromic repeats (CRISPRs), surface protein markers, insertion sequence (IS) elements and putative virulence genes. The phylogenetic tree based on single nucleotide polymorphisms (SNPs) of the core genome sequences clearly categorized the ST7 strains according to their geographical and host origins, with the human ST7 being genetically distant from other fish ST7 strains. A pan-genome analysis of ST7 strains detected a 48-kb gene island specifically in the Thai ST7 isolates. The orientations and predicted amino acid sequences of the genes in the island closely matched those of Tn5252, a streptococcal conjugative transposon, in GBS 2603V/R serotype V, Streptococcus pneumoniae and Streptococcus suis. Thus, it was presumed that Thai ST7 acquired this Tn5252 homologue from related streptococci. The close phylogenetic relationship between the fish ST7 strains suggests that these strains were derived from a common ancestor and have diverged in different geographical regions and in different hosts. Copyright © 2015 Elsevier B.V. All rights reserved.
Turmel, Monique; Otis, Christian; Lemieux, Claude
2005-01-01
Background The Streptophyta comprise all land plants and six monophyletic groups of charophycean green algae. Phylogenetic analyses of four genes from three cellular compartments support the following branching order for these algal lineages: Mesostigmatales, Chlorokybales, Klebsormidiales, Zygnematales, Coleochaetales and Charales, with the last lineage being sister to land plants. Comparative analyses of the Mesostigma viride (Mesostigmatales) and land plant chloroplast genome sequences revealed that this genome experienced many gene losses, intron insertions and gene rearrangements during the evolution of charophyceans. On the other hand, the chloroplast genome of Chaetosphaeridium globosum (Coleochaetales) is highly similar to its land plant counterparts in terms of gene content, intron composition and gene order, indicating that most of the features characteristic of land plant chloroplast DNA (cpDNA) were acquired from charophycean green algae. To gain further insight into when the highly conservative pattern displayed by land plant cpDNAs originated in the Streptophyta, we have determined the cpDNA sequences of the distantly related zygnematalean algae Staurastrum punctulatum and Zygnema circumcarinatum. Results The 157,089 bp Staurastrum and 165,372 bp Zygnema cpDNAs encode 121 and 125 genes, respectively. Although both cpDNAs lack an rRNA-encoding inverted repeat (IR), they are substantially larger than Chaetosphaeridium and land plant cpDNAs. This increased size is explained by the expansion of intergenic spacers and introns. The Staurastrum and Zygnema genomes differ extensively from one another and from their streptophyte counterparts at the level of gene order, with the Staurastrum genome more closely resembling its land plant counterparts than does Zygnema cpDNA. Many intergenic regions in Zygnema cpDNA harbor tandem repeats. 
The introns in both Staurastrum (8 introns) and Zygnema (13 introns) cpDNAs represent subsets of those found in land plant cpDNAs. They represent 16 distinct insertion sites, only five of which are shared by the two zygnematalean genomes. Three of these insertion sites have not been identified in Chaetosphaeridium cpDNA. Conclusion The chloroplast genome experienced substantial changes in overall structure, gene order, and intron content during the evolution of the Zygnematales. Most of the features considered earlier as typical of land plant cpDNAs probably originated before the emergence of the Zygnematales and Coleochaetales. PMID:16236178
Garrels, Wiebke; Mátés, Lajos; Holler, Stephanie; Dalda, Anna; Taylor, Ulrike; Petersen, Björn; Niemann, Heiner; Izsvák, Zsuzsanna; Ivics, Zoltán; Kues, Wilfried A.
2011-01-01
Genetic engineering can expand the utility of pigs for modeling human diseases, and for developing advanced therapeutic approaches. However, the inefficient production of transgenic pigs represents a technological bottleneck. Here, we assessed the hyperactive Sleeping Beauty (SB100X) transposon system for enzyme-catalyzed transgene integration into the embryonic porcine genome. The components of the transposon vector system were microinjected as circular plasmids into the cytoplasm of porcine zygotes, resulting in high frequencies of transgenic fetuses and piglets. The transgenic animals showed normal development and persistent reporter gene expression for >12 months. Molecular hallmarks of transposition were confirmed by analysis of 25 genomic insertion sites. We demonstrate germ-line transmission, segregation of individual transposons, and continued, copy number-dependent transgene expression in F1-offspring. In addition, we demonstrate target-selected gene insertion into transposon-tagged genomic loci by Cre-loxP-based cassette exchange in somatic cells followed by nuclear transfer. Transposase-catalyzed transgenesis in a large mammalian species expands the arsenal of transgenic technologies for use in domestic animals and will facilitate the development of large animal models for human diseases. PMID:21897845
Primers-4-Yeast: a comprehensive web tool for planning primers for Saccharomyces cerevisiae.
Yofe, Ido; Schuldiner, Maya
2014-02-01
The budding yeast Saccharomyces cerevisiae is a key model organism of functional genomics, due to its ease and speed of genetic manipulations. In fact, in this yeast, the requirement for homologous sequences for recombination purposes is so small that 40 base pairs (bp) are sufficient. Hence, an enormous variety of genetic manipulations can be performed by simply planning primers with the correct homology, using a defined set of transformation plasmids. Although designing primers for yeast transformations and for the verification of their correct insertion is a common task in all yeast laboratories, primer planning is usually done manually and a tool that would enable easy, automated primer planning for the yeast research community is still lacking. Here we introduce Primers-4-Yeast, a web tool that allows primers to be designed in batches for S. cerevisiae gene-targeting transformations, and for the validation of correct insertions. This novel tool enables fast, automated, accurate primer planning for large sets of genes, introduces consistency in primer planning and is therefore suggested to serve as a standard in yeast research. Primers-4-Yeast is available at: http://www.weizmann.ac.il/Primers-4-Yeast Copyright © 2013 John Wiley & Sons, Ltd.
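To make the 40 bp homology idea concrete, here is a minimal sketch of how a gene-targeting forward primer could be assembled: a 40-nucleotide homology arm copied from the genome immediately upstream of the insertion point, followed by a short sequence that anneals to the transformation cassette. This is an illustration only, not the Primers-4-Yeast algorithm; the function name, its parameters and the cassette-annealing sequence are assumptions.

```python
HOMOLOGY_LENGTH = 40  # bp of homology the abstract states is sufficient


def forward_primer(genome, insert_pos, cassette_anneal):
    """Build a 5'->3' forward primer for insertion at 0-based index insert_pos.

    The primer is the 40 nt immediately upstream of insert_pos (homology arm)
    followed by cassette_anneal, a hypothetical sequence that anneals to the
    transformation plasmid.
    """
    if insert_pos < HOMOLOGY_LENGTH:
        raise ValueError("not enough upstream sequence for a 40 nt homology arm")
    arm = genome[insert_pos - HOMOLOGY_LENGTH:insert_pos]
    return arm + cassette_anneal
```

A reverse primer would be built the same way from the reverse complement of the 40 nt downstream of the insertion point; batch planning then amounts to applying these functions over a list of target positions.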
2013-01-01
Background The field of cancer genomics has rapidly adopted next-generation sequencing (NGS) in order to study and characterize malignant tumors with unprecedented resolution. In particular for cancer, one is often trying to identify somatic mutations – changes specific to a tumor and not within an individual’s germline. However, false positive and false negative detections often result from lack of sufficient variant evidence, contamination of the biopsy by stromal tissue, sequencing errors, and the erroneous classification of germline variation as tumor-specific. Results We have developed a generalized Bayesian analysis framework for matched tumor/normal samples with the purpose of identifying tumor-specific alterations such as single nucleotide mutations, small insertions/deletions, and structural variation. We describe our methodology, and discuss its application to other types of paired-tissue analysis such as the detection of loss of heterozygosity as well as allelic imbalance. We also demonstrate the high level of sensitivity and specificity in discovering simulated somatic mutations, for various combinations of a) genomic coverage and b) emulated heterogeneity. Conclusion We present a Java-based implementation of our methods named Seurat, which is made available for free academic use. We have demonstrated and reported on the discovery of different types of somatic change by applying Seurat to an experimentally-derived cancer dataset using our methods; and have discussed considerations and practices regarding the accurate detection of somatic events in cancer genomes. Seurat is available at https://sites.google.com/site/seuratsomatic. PMID:23642077
Christoforides, Alexis; Carpten, John D; Weiss, Glen J; Demeure, Michael J; Von Hoff, Daniel D; Craig, David W
2013-05-04
The field of cancer genomics has rapidly adopted next-generation sequencing (NGS) in order to study and characterize malignant tumors with unprecedented resolution. In particular for cancer, one is often trying to identify somatic mutations--changes specific to a tumor and not within an individual's germline. However, false positive and false negative detections often result from lack of sufficient variant evidence, contamination of the biopsy by stromal tissue, sequencing errors, and the erroneous classification of germline variation as tumor-specific. We have developed a generalized Bayesian analysis framework for matched tumor/normal samples with the purpose of identifying tumor-specific alterations such as single nucleotide mutations, small insertions/deletions, and structural variation. We describe our methodology, and discuss its application to other types of paired-tissue analysis such as the detection of loss of heterozygosity as well as allelic imbalance. We also demonstrate the high level of sensitivity and specificity in discovering simulated somatic mutations, for various combinations of a) genomic coverage and b) emulated heterogeneity. We present a Java-based implementation of our methods named Seurat, which is made available for free academic use. We have demonstrated and reported on the discovery of different types of somatic change by applying Seurat to an experimentally-derived cancer dataset using our methods; and have discussed considerations and practices regarding the accurate detection of somatic events in cancer genomes. Seurat is available at https://sites.google.com/site/seuratsomatic.
Construction of BAC Libraries from Flow-Sorted Chromosomes.
Šafář, Jan; Šimková, Hana; Doležel, Jaroslav
2016-01-01
Cloned DNA libraries in bacterial artificial chromosome (BAC) are the most widely used form of large-insert DNA libraries. BAC libraries are typically represented by ordered clones derived from genomic DNA of a particular organism. In the case of large eukaryotic genomes, whole-genome libraries consist of a hundred thousand to a million clones, which make their handling and screening a daunting task. The labor and cost of working with whole-genome libraries can be greatly reduced by constructing a library derived from a smaller part of the genome. Here we describe construction of BAC libraries from mitotic chromosomes purified by flow cytometric sorting. Chromosome-specific BAC libraries facilitate positional gene cloning, physical mapping, and sequencing in complex plant genomes.
A filtering method to generate high quality short reads using Illumina paired-end technology.
Eren, A Murat; Vineis, Joseph H; Morrison, Hilary G; Sogin, Mitchell L
2013-01-01
Consensus between independent reads improves the accuracy of genome and transcriptome analyses; however, lack of consensus between very similar sequences in metagenomic studies can and often does represent natural variation of biological significance. The common use of machine-assigned quality scores on next generation platforms does not necessarily correlate with accuracy. Here, we describe using the overlap of paired-end, short sequence reads to identify error-prone reads in marker gene analyses and their contribution to spurious OTUs following clustering analysis using QIIME. Our approach can also reduce error in shotgun sequencing data generated from libraries with small, tightly constrained insert sizes. The open-source implementation of this algorithm in the Python programming language with user instructions can be obtained from https://github.com/meren/illumina-utils.
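The overlap idea can be sketched as follows, under the simplifying assumption that the insert is short enough for the two reads of a pair to overlap completely and have equal length. The code is a standalone illustration and does not reproduce the illumina-utils implementation; the function names and the `max_mismatches` threshold are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over the alphabet ACGTN."""
    return seq.translate(COMPLEMENT)[::-1]


def overlap_mismatches(read1, read2):
    """Count positions where read1 disagrees with the reverse complement of read2.

    Assumes the two reads of the pair overlap completely.
    """
    mate = reverse_complement(read2)
    if len(mate) != len(read1):
        raise ValueError("sketch assumes fully overlapping reads of equal length")
    return sum(a != b for a, b in zip(read1, mate))


def keep_pair(read1, read2, max_mismatches=0):
    """Keep a pair only if the overlapping bases agree within the threshold."""
    return overlap_mismatches(read1, read2) <= max_mismatches
```

Pairs whose two reads disagree in the overlap are flagged as error-prone regardless of their machine-assigned quality scores, which is the point the abstract makes about quality scores not necessarily tracking accuracy.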
O'Neill, F J; Gao, Y; Xu, X
1993-11-01
The DNAs of polyomaviruses ordinarily exist as a single circular molecule of approximately 5000 base pairs. Variants of SV40, BKV and JCV have been described which contain two complementing defective DNA molecules. These defectives, which form a bipartite genome structure, contain either the viral early region or the late region. The defectives have the unique property of being able to tolerate variable sized reiterations of regulatory and terminus region sequences, and portions of the coding region. They can also exchange coding region sequences with other polyomaviruses. It has been suggested that the bipartite genome structure might be a stage in the evolution of polyomaviruses which can uniquely sustain genome and sequence diversity. However, it is not known if the regulatory and terminus region sequences are highly mutable. Also, it is not known if the bipartite genome structure is reversible and what the conditions might be which would favor restoration of the monomolecular genome structure. We addressed the first question by sequencing the reiterated regulatory and terminus regions of E- and L-SV40 DNAs. This revealed a large number of mutations in the regulatory regions of the defective genomes, including deletions, insertions, rearrangements and base substitutions. We also detected insertions and base substitutions in the T-antigen gene. We addressed the second question by introducing into permissive simian cells, E- and L-SV40 genomes which had been engineered to contain only a single regulatory region. Analysis of viral DNA from transfected cells demonstrated recombined genomes containing a wild type monomolecular DNA structure. However, the complete defectives, containing reiterated regulatory regions, could often compete away the wild type genomes. The recombinant monomolecular genomes were isolated, cloned and found to be infectious. 
All of the DNA alterations identified in one of the regulatory regions of E-SV40 DNA were present in the recombinant monomolecular genomes. These and other findings indicate that the bipartite genome state can sustain many mutations which wtSV40 cannot directly sustain. However, the mutations can later be introduced into the wild type genomes when the E- and L-SV40 DNAs recombine to generate a new monomolecular genome structure.
Inserts Automatically Lubricate Ball Bearings
NASA Technical Reports Server (NTRS)
Hager, J. A.
1983-01-01
Inserts on ball-separator ring of ball bearings provide continuous film of lubricant on ball surfaces. Inserts are machined or molded. Small inserts in ball pockets provide steady supply of lubricant. Technique is utilized on equipment for which maintenance is often poor and lubrication interval is uncertain, such as household appliances, automobiles, and marine engines.
Targeted mutagenesis in sea urchin embryos using TALENs.
Hosoi, Sayaka; Sakuma, Tetsushi; Sakamoto, Naoaki; Yamamoto, Takashi
2014-01-01
Genome editing with engineered nucleases such as zinc-finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs) has been reported in various animals. We previously described ZFN-mediated targeted mutagenesis and insertion of reporter genes in sea urchin embryos. In this study, we demonstrate that TALENs can induce mutagenesis at specific genomic loci of sea urchin embryos. Injection of TALEN mRNAs targeting the HpEts transcription factor into fertilized eggs resulted in the impairment of skeletogenesis. Sequence analyses of the mutations showed that deletions and/or insertions occurred at the HpEts target site in the TALEN mRNAs-injected embryos. The results suggest that targeted gene disruption using TALENs is feasible in sea urchin embryos. © 2013 The Authors Development, Growth & Differentiation © 2013 Japanese Society of Developmental Biologists.
Mateos, L M; Schäfer, A; Kalinowski, J; Martin, J F; Pühler, A
1996-10-01
Conjugative transfer of mobilizable derivatives of the Escherichia coli narrow-host-range plasmids pBR322, pBR325, pACYC177, and pACYC184 from E. coli to species of the gram-positive genera Corynebacterium and Brevibacterium resulted in the integration of the plasmids into the genomes of the recipient bacteria. Transconjugants appeared at low frequencies and reproducibly with a delay of 2 to 3 days compared with matings with replicative vectors. Southern analysis of corynebacterial transconjugants and nucleotide sequences from insertion sites revealed that integration occurs at different locations and that different parts of the vector are involved in the process. Integration is not dependent on indigenous insertion sequence elements but results from recombination between very short homologous DNA segments (8 to 12 bp) present in the vector and in the host DNA. In the majority of the cases (90%), integration led to cointegrate formation, and in some cases, deletions or rearrangements occurred during the recombination event. Insertions were found to be quite stable even in the absence of selective pressure.
In vivo blunt-end cloning through CRISPR/Cas9-facilitated non-homologous end-joining
Geisinger, Jonathan M.; Turan, Sören; Hernandez, Sophia; Spector, Laura P.; Calos, Michele P.
2016-01-01
The CRISPR/Cas9 system facilitates precise DNA modifications by generating RNA-guided blunt-ended double-strand breaks. We demonstrate that guide RNA pairs generate deletions that are repaired with a high level of precision by non-homologous end-joining in mammalian cells. We present a method called knock-in blunt ligation for exploiting these breaks to insert exogenous PCR-generated sequences in a homology-independent manner without loss of additional nucleotides. This method is useful for making precise additions to the genome such as insertions of marker gene cassettes or functional elements, without the need for homology arms. We successfully utilized this method in human and mouse cells to insert fluorescent protein cassettes into various loci, with efficiencies up to 36% in HEK293 cells without selection. We also created versions of Cas9 fused to the FKBP12-L106P destabilization domain in an effort to improve Cas9 performance. Our in vivo blunt-end cloning method and destabilization-domain-fused Cas9 variant increase the repertoire of precision genome engineering approaches. PMID:26762978
Mateos, L M; Schäfer, A; Kalinowski, J; Martin, J F; Pühler, A
1996-01-01
Conjugative transfer of mobilizable derivatives of the Escherichia coli narrow-host-range plasmids pBR322, pBR325, pACYC177, and pACYC184 from E. coli to species of the gram-positive genera Corynebacterium and Brevibacterium resulted in the integration of the plasmids into the genomes of the recipient bacteria. Transconjugants appeared at low frequencies and reproducibly with a delay of 2 to 3 days compared with matings with replicative vectors. Southern analysis of corynebacterial transconjugants and nucleotide sequences from insertion sites revealed that integration occurs at different locations and that different parts of the vector are involved in the process. Integration is not dependent on indigenous insertion sequence elements but results from recombination between very short homologous DNA segments (8 to 12 bp) present in the vector and in the host DNA. In the majority of the cases (90%), integration led to cointegrate formation, and in some cases, deletions or rearrangements occurred during the recombination event. Insertions were found to be quite stable even in the absence of selective pressure. PMID:8824624
Proels, Reinhard K; Roitsch, Thomas
2006-03-01
Very few CACTA transposon-like sequences have been described in Solanaceae species. Sequence information has been restricted to partial transposase (TPase)-like fragments, and no target gene of CACTA-like transposon insertion has been described in tomato to date. In this manuscript, we report on a CACTA transposon-like insertion in intron I of tomato (Lycopersicon esculentum) invertase gene Lin5 and TPase-like sequences of several Solanaceae species. Consensus primers deduced from the TPase region of the tomato CACTA transposon-like element allowed the amplification of similar sequences from various Solanaceae species of different subfamilies including Solaneae (Solanum tuberosum), Cestreae (Nicotiana tabacum) and Datureae (Datura stramonium). This demonstrates the ubiquitous presence of CACTA-like elements in Solanaceae genomes. The obtained partial sequences are highly conserved, and allow further detection and detailed analysis of CACTA-like transposons throughout Solanaceae species. CACTA-like transposon sequences make possible the evaluation of their use for genome analysis, functional studies of genes and the evolutionary relationships between plant species.
Fiston-Lavier, Anna-Sophie; Barrón, Maite G; Petrov, Dmitri A; González, Josefa
2015-02-27
Transposable elements (TEs) constitute the most active, diverse and ancient component in a broad range of genomes. Complete understanding of genome function and evolution cannot be achieved without a thorough understanding of TE impact and biology. However, in-depth analysis of TEs still represents a challenge due to the repetitive nature of these genomic entities. In this work, we present a broadly applicable and flexible tool: T-lex2. T-lex2 is the only available software that allows routine, automatic and accurate genotyping of individual TE insertions and estimation of their population frequencies both using individual strain and pooled next-generation sequencing data. Furthermore, T-lex2 also assesses the quality of the calls allowing the identification of misannotated TEs and providing the necessary information to re-annotate them. The flexible and customizable design of T-lex2 allows running it in any genome and for any type of TE insertion. Here, we tested the fidelity of T-lex2 using the fly and human genomes. Overall, T-lex2 represents a significant improvement in our ability to analyze the contribution of TEs to genome function and evolution as well as learning about the biology of TEs. T-lex2 is freely available online at http://sourceforge.net/projects/tlex. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Genome-scale engineering for systems and synthetic biology
Esvelt, Kevin M; Wang, Harris H
2013-01-01
Genome-modification technologies enable the rational engineering and perturbation of biological systems. Historically, these methods have been limited to gene insertions or mutations at random or at a few pre-defined locations across the genome. The handful of methods capable of targeted gene editing suffered from low efficiencies, significant labor costs, or both. Recent advances have dramatically expanded our ability to engineer cells in a directed and combinatorial manner. Here, we review current technologies and methodologies for genome-scale engineering, discuss the prospects for extending efficient genome modification to new hosts, and explore the implications of continued advances toward the development of flexibly programmable chassis, novel biochemistries, and safer organismal and ecological engineering. PMID:23340847
Tn5099, a xylE promoter probe transposon for Streptomyces spp.
Hahn, D R; Solenberg, P J; Baltz, R H
1991-01-01
Tn5099, a promoter probe transposon for Streptomyces spp., was constructed by inserting a promoterless xylE gene and a hygromycin resistance gene into IS493. Tn5099 transposed into different sites in the Streptomyces griseofuscus genome, and the xylE reporter gene was expressed in some of the transposition mutants. Strains containing Tn5099 insertions that gave regulated expression of the xylE gene were identified. PMID:1653213
Georges, Arthur; Li, Qiye; Lian, Jinmin; O'Meally, Denis; Deakin, Janine; Wang, Zongji; Zhang, Pei; Fujita, Matthew; Patel, Hardip R; Holleley, Clare E; Zhou, Yang; Zhang, Xiuwen; Matsubara, Kazumi; Waters, Paul; Graves, Jennifer A Marshall; Sarre, Stephen D; Zhang, Guojie
2015-01-01
The lizards of the family Agamidae are one of the most prominent elements of the Australian reptile fauna. Here, we present a genomic resource built on the basis of a wild-caught male ZZ central bearded dragon Pogona vitticeps. The genomic sequence for P. vitticeps, generated on the Illumina HiSeq 2000 platform, comprised 317 Gbp (179X raw read depth) from 13 insert libraries ranging from 250 bp to 40 kbp. After filtering for low-quality and duplicated reads, 146 Gbp of data (83X) was available for assembly. Exceptionally high levels of heterozygosity (0.85 % of single nucleotide polymorphisms plus sequence insertions or deletions) complicated assembly; nevertheless, 96.4 % of reads mapped back to the assembled scaffolds, indicating that the assembly included most of the sequenced genome. Length of the assembly was 1.8 Gbp in 545,310 scaffolds (69,852 longer than 300 bp), the longest being 14.68 Mbp. N50 was 2.29 Mbp. Genes were annotated on the basis of de novo prediction, similarity to the green anole Anolis carolinensis, Gallus gallus and Homo sapiens proteins, and P. vitticeps transcriptome sequence assemblies, to yield 19,406 protein-coding genes in the assembly, 63 % of which had intact open reading frames. Our assembly captured 99 % (246 of 248) of core CEGMA genes, with 93 % (231) being complete. The quality of the P. vitticeps assembly is comparable or superior to that of other published squamate genomes, and the annotated P. vitticeps genome can be accessed through a genome browser available at https://genomics.canberra.edu.au.
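For readers unfamiliar with the N50 statistic quoted in this abstract, the following minimal sketch computes it from a list of scaffold lengths: N50 is the length of the scaffold at which the cumulative length, taken from the longest scaffold downward, first reaches half of the total assembly length. The function name is illustrative and the code is not taken from the authors' assembly pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length of
    the scaffold at which the running total first reaches half of the sum.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no scaffold lengths given")
```

With toy lengths 10, 8, 6, 4 and 2 (total 30), the running total reaches half of the total at the second scaffold, so the N50 is 8.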
Wild-Type Measles Viruses with Non-Standard Genome Lengths
Bankamp, Bettina; Liu, Chunyu; Rivailler, Pierre; Bera, Jayati; Shrivastava, Susmita; Kirkness, Ewen F.; Bellini, William J.; Rota, Paul A.
2014-01-01
The length of the single stranded, negative sense RNA genome of measles virus (MeV) is highly conserved at 15,894 nucleotides (nt). MeVs can be grouped into 24 genotypes based on the highly variable 450 nucleotides coding for the carboxyl-terminus of the nucleocapsid protein (N-450). Here, we report the genomic sequences of 2 wild-type viral isolates of genotype D4 with genome lengths of 15,900 nt. Both genomes had a 7 nt insertion in the 3′ untranslated region (UTR) of the matrix (M) gene and a 1 nt deletion in the 5′ UTR of the fusion (F) gene. The net gain of 6 nt complies with the rule-of-six required for replication competency of the genomes of morbilliviruses. The insertions and deletion (indels) were confirmed in a patient sample that was the source of one of the viral isolates. The positions of the indels were identical in both viral isolates, even though epidemiological data and the 3 nt differences in N-450 between the two genomes suggested that the viruses represented separate chains of transmission. Identical indels were found in the M-F intergenic regions of 14 additional genotype D4 viral isolates that were imported into the US during 2007–2010. Viral isolates with and without indels produced plaques of similar size and replicated efficiently in A549/hSLAM and Vero/hSLAM cells. This is the first report of wild-type MeVs with genome lengths other than 15,894 nt and demonstrates that the length of the M-F UTR of wild-type MeVs is flexible. PMID:24748123
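The rule-of-six arithmetic in this abstract can be checked with a short sketch: the standard length, the 7 nt insertion and the 1 nt deletion are taken from the text, while the function names are illustrative.

```python
STANDARD_LENGTH = 15_894  # standard MeV genome length in nt, from the abstract


def variant_length(base, insertions, deletions):
    """Genome length after applying insertions and deletions (sizes in nt)."""
    return base + sum(insertions) - sum(deletions)


def obeys_rule_of_six(length):
    """True if the genome length is a multiple of six."""
    return length % 6 == 0


# 7 nt insertion in the M 3' UTR and 1 nt deletion in the F 5' UTR
length = variant_length(STANDARD_LENGTH, insertions=[7], deletions=[1])
print(length, obeys_rule_of_six(length))
```

The 7 nt insertion alone would give a length that is not a multiple of six; it is the net gain of 6 nt that keeps the variant genomes compliant.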
Comparative genomic data of the Avian Phylogenomics Project.
Zhang, Guojie; Li, Bo; Li, Cai; Gilbert, M Thomas P; Jarvis, Erich D; Wang, Jun
2014-01-01
The evolutionary relationships of modern birds are among the most challenging to understand in systematic biology and have been debated for centuries. To address this challenge, we assembled or collected the genomes of 48 avian species spanning most orders of birds, including all Neognathae and two of the five Palaeognathae orders, and used the genomes to construct a genome-scale avian phylogenetic tree and perform comparative genomics analyses (Jarvis et al. in press; Zhang et al. in press). Here we release assemblies and datasets associated with the comparative genome analyses, which include 38 newly sequenced avian genomes plus previously released or simultaneously released genomes of Chicken, Zebra finch, Turkey, Pigeon, Peregrine falcon, Duck, Budgerigar, Adelie penguin, Emperor penguin and the Medium Ground Finch. We hope that this resource will serve future efforts in phylogenomics and comparative genomics. The 38 bird genomes were sequenced using the Illumina HiSeq 2000 platform and assembled using a whole genome shotgun strategy. The 48 genomes were categorized into two groups according to the N50 scaffold size of the assemblies: a high depth group comprising 23 species sequenced at high coverage (>50X) with multiple insert size libraries resulting in N50 scaffold sizes greater than 1 Mb (except the White-throated Tinamou and Bald Eagle); and a low depth group comprising 25 species sequenced at a low coverage (~30X) with two insert size libraries resulting in an average N50 scaffold size of about 50 kb. Repetitive elements comprised 4%-22% of the bird genomes. The assembled scaffolds allowed the homology-based annotation of 13,000-17,000 protein-coding genes in each avian genome relative to chicken, zebra finch and human, as well as comparative and sequence conservation analyses.
Here we release full genome assemblies of 38 newly sequenced avian species, link to genome assembly downloads for 7 of the remaining 10 species, and provide a guide to the genomic data that have been generated and used in our Avian Phylogenomics Project. To the best of our knowledge, the Avian Phylogenomics Project is the biggest vertebrate comparative genomics project to date. The genomic data presented here are expected to accelerate further analyses in many fields, including phylogenetics, comparative genomics, evolution, neurobiology, developmental biology, and other related areas.
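The assemblies in this record are grouped by N50 scaffold size. As a reminder of what that statistic measures, here is a minimal Python sketch using the common definition of N50: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The function name and the toy scaffold lengths are assumptions for illustration, not data from the project.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one scaffold length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first scaffold that brings the running sum to half the total.
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy example (hypothetical lengths in kb): total is 300, half is 150.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

In the toy example the two longest scaffolds (80 + 70) reach half of the total, so the N50 is 70.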
Continuous Influx of Genetic Material from Host to Virus Populations
Gilbert, Clément; Peccoud, Jean; Chateigner, Aurélien; Moumen, Bouziane
2016-01-01
Many genes of large double-stranded DNA viruses have a cellular origin, suggesting that host-to-virus horizontal transfer (HT) of DNA is recurrent. Yet, the frequency of these transfers has never been assessed in viral populations. Here we used ultra-deep DNA sequencing of 21 baculovirus populations extracted from two moth species to show that a large diversity of moth DNA sequences (n = 86) can integrate into viral genomes during the course of a viral infection. The majority of the 86 different moth DNA sequences are transposable elements (TEs, n = 69) belonging to 10 superfamilies of DNA transposons and three superfamilies of retrotransposons. The remaining 17 sequences are moth sequences of unknown nature. In addition to bona fide DNA transposition, we uncover microhomology-mediated recombination as a mechanism explaining integration of moth sequences into viral genomes. Many sequences integrated multiple times at multiple positions along the viral genome. We detected a total of 27,504 insertions of moth sequences in the 21 viral populations and we calculate that on average, 4.8% of viruses harbor at least one moth sequence in these populations. Despite this substantial proportion, no insertion of moth DNA was maintained in any viral population after 10 successive infection cycles. Hence, there is a constant turnover of host DNA inserted into viral genomes each time the virus infects a moth. Finally, we found that at least 21 of the moth TEs integrated into viral genomes underwent repeated horizontal transfers between various insect species, including some lepidopterans susceptible to baculoviruses. Our results identify host DNA influx as a potent source of genetic diversity in viral populations. They also support a role for baculoviruses as vectors of DNA HT between insects, and call for an evaluation of possible gene or TE spread when using viruses as biopesticides or gene delivery vectors. PMID:26829124
Continuous Influx of Genetic Material from Host to Virus Populations.
Gilbert, Clément; Peccoud, Jean; Chateigner, Aurélien; Moumen, Bouziane; Cordaux, Richard; Herniou, Elisabeth A
2016-02-01
Many genes of large double-stranded DNA viruses have a cellular origin, suggesting that host-to-virus horizontal transfer (HT) of DNA is recurrent. Yet, the frequency of these transfers has never been assessed in viral populations. Here we used ultra-deep DNA sequencing of 21 baculovirus populations extracted from two moth species to show that a large diversity of moth DNA sequences (n = 86) can integrate into viral genomes during the course of a viral infection. The majority of the 86 different moth DNA sequences are transposable elements (TEs, n = 69) belonging to 10 superfamilies of DNA transposons and three superfamilies of retrotransposons. The remaining 17 sequences are moth sequences of unknown nature. In addition to bona fide DNA transposition, we uncover microhomology-mediated recombination as a mechanism explaining integration of moth sequences into viral genomes. Many sequences integrated multiple times at multiple positions along the viral genome. We detected a total of 27,504 insertions of moth sequences in the 21 viral populations and we calculate that on average, 4.8% of viruses harbor at least one moth sequence in these populations. Despite this substantial proportion, no insertion of moth DNA was maintained in any viral population after 10 successive infection cycles. Hence, there is a constant turnover of host DNA inserted into viral genomes each time the virus infects a moth. Finally, we found that at least 21 of the moth TEs integrated into viral genomes underwent repeated horizontal transfers between various insect species, including some lepidopterans susceptible to baculoviruses. Our results identify host DNA influx as a potent source of genetic diversity in viral populations. They also support a role for baculoviruses as vectors of DNA HT between insects, and call for an evaluation of possible gene or TE spread when using viruses as biopesticides or gene delivery vectors.
Domb, Katherine; Keidar, Danielle; Yaakov, Beery; Khasdan, Vadim; Kashkush, Khalil
2017-10-27
Natural populations of the tetraploid wild emmer wheat (genome AABB) were previously shown to demonstrate eco-geographically structured genetic and epigenetic diversity. Transposable elements (TEs) might make up a significant part of the genetic and epigenetic variation between individuals and populations because they comprise over 80% of the wild emmer wheat genome. In this study, we performed detailed analyses to assess the dynamics of transposable elements in 50 accessions of wild emmer wheat collected from 5 geographically isolated sites. The analyses included: the copy number variation of TEs among accessions in the five populations, population-unique insertional patterns, and the impact of population-unique/specific TE insertions on structure and expression of genes. We assessed the copy numbers of 12 TE families using real-time quantitative PCR, and found significant copy number variation (CNV) in the 50 wild emmer wheat accessions, in a population-specific manner. In some cases, the CNV difference reached up to 6-fold. However, the CNV was TE-specific, namely some TE families showed higher copy numbers in one or more populations, and other TE families showed lower copy numbers in the same population(s). Furthermore, we assessed the insertional patterns of 6 TE families using transposon display (TD), and observed significant population-specific insertional patterns. The polymorphism levels of TE-insertional patterns reached 92% among all wild emmer wheat accessions, in some cases. In addition, we observed population-specific/unique TE insertions, some of which were located within or close to protein-coding genes, creating allelic variations in a population-specific manner. We also showed that those genes are differentially expressed in wild emmer wheat. 
For the first time, this study shows that TEs proliferate in wild emmer wheat in a population-specific manner, creating new alleles of genes, which contribute to the divergent evolution of homeologous genes from the A and B subgenomes.
77 FR 75089 - Federal Acquisition Regulation; Accelerated Payments to Small Business Subcontractors
Federal Register 2010, 2011, 2012, 2013, 2014
2012-12-19
... link ``Submit a Comment'' that corresponds with FAR Case 2012-031. Follow the instructions provided at... proper documentation from small business subcontractors. The clause will be inserted into all new... of commercial items. * * * * * (d) * * * (4) Insert the clause at 52.232-XX, Providing Accelerated...
Mapping Ribonucleotides Incorporated into DNA by Hydrolytic End-Sequencing.
Orebaugh, Clinton D; Lujan, Scott A; Burkholder, Adam B; Clausen, Anders R; Kunkel, Thomas A
2018-01-01
Ribonucleotides embedded within DNA render the DNA sensitive to the formation of single-stranded breaks under alkaline conditions. Here, we describe a next-generation sequencing method called hydrolytic end sequencing (HydEn-seq) to map ribonucleotides inserted into the genome of Saccharomyces cerevisiae strains deficient in ribonucleotide excision repair. We use this method to map several genomic features in wild-type and replicase variant yeast strains.
Cui, Peng; Ji, Rimutu; Ding, Feng; Qi, Dan; Gao, Hongwei; Meng, He; Yu, Jun; Hu, Songnian; Zhang, Heping
2007-01-01
Background The family Camelidae that evolved in North America during the Eocene survived with two distinct tribes, Camelini and Lamini. To investigate the evolutionary relationship between them and to further understand the evolutionary history of this family, we determined the complete mitochondrial genome sequence of the wild two-humped camel (Camelus bactrianus ferus), the only wild survivor of the Old World camels. Results The mitochondrial genome sequence (16,680 bp) from C. bactrianus ferus contains 13 protein-coding, two rRNA, and 22 tRNA genes as well as a typical control region; this basic structure is shared by all metazoan mitochondrial genomes. Its protein-coding region exhibits codon usage common to all mammals and possesses the three cryptic stop codons shared by all vertebrates. C. bactrianus ferus, like the other mammalian species, does not share a triplet nucleotide insertion (GCC) encoding a proline residue that is found only in the nd1 gene of the New World camelid Lama pacos. The fact that this lineage-specific insertion in the L. pacos mtDNA occurred after the split between the Old and New World camelids suggests that it may have functional implications, since a proline insertion in a protein backbone usually alters protein conformation significantly, and the nd1 gene has not been seen to be as polymorphic as the rest of the ND family genes among camelids. Our phylogenetic study based on complete mitochondrial genomes excluding the control region suggested that the divergence of the two tribes may have occurred in the early Miocene, much earlier than what was deduced from the fossil record (11 million years). An evolutionary history reconstructed for the family Camelidae based on cytb sequences suggested that the split of bactrian camel and dromedary may have occurred in North America before the tribe Camelini migrated from North America to Asia. Conclusion Molecular clock analysis of complete mitochondrial genomes from C. bactrianus ferus and L. pacos suggested that the two tribes diverged from their common ancestor about 25 million years ago, much earlier than what was predicted based on fossil records. PMID:17640355
Potential Links between Hepadnavirus and Bornavirus Sequences in the Host Genome and Cancer.
Honda, Tomoyuki
2017-01-01
Various viruses leave their sequences in the host genomes during infection. Such events occur mainly in retrovirus infection but also sometimes in DNA and non-retroviral RNA virus infections. If viral sequences are integrated into the genomes of germ line cells, the sequences can become inherited as endogenous viral elements (EVEs). The integration events of viral sequences may have oncogenic potential. Because proviral integrations of some retroviruses and/or reactivation of endogenous retroviruses are closely linked to cancers, viral insertions related to non-retroviral viruses also possibly contribute to cancer development. This article focuses on genomic viral sequences derived from two non-retroviral viruses, whose endogenization is already reported, and discusses their possible contributions to cancer. Viral insertions of hepatitis B virus play roles in the development of hepatocellular carcinoma. Endogenous bornavirus-like elements, the only non-retroviral RNA virus-related EVEs found in the human genome, may also be involved in cancer formation. In addition, the possible contribution of the interactions between viruses and retrotransposons, which seem to be a major driving force for generating EVEs related to non-retroviral RNA viruses, to cancers will be discussed. Future studies regarding the possible links described here may open a new avenue for the development of novel therapeutics for tumor virus-related cancers and/or provide novel insights into EVE functions.
Wang, Chun Ming; Lo, Loong Chueng; Feng, Felicia; Gong, Ping; Li, Jian; Zhu, Ze Yuan; Lin, Grace; Yue, Gen Hua
2008-03-25
Barramundi (Lates calcarifer) is an important farmed marine food fish species. Its first generation linkage map has been applied to map QTL for growth traits. To identify genes located in QTL responsible for specific traits, genomic large insert libraries are of crucial importance. We report herein a bacterial artificial chromosome (BAC) library and the mapping of BAC clones to the linkage map. This BAC library consisted of 49,152 clones with an average insert size of 98 kb, representing 6.9-fold haploid genome coverage. Screening the library with 24 microsatellites and 15 ESTs/genes demonstrated that the library had good genome coverage. In addition, 62 novel microsatellites, each isolated from one of 62 BAC clones, were mapped onto the first generation linkage map. A total of 86 BAC clones were anchored on the linkage map with at least one BAC clone on each linkage group. We have constructed the first BAC library for L. calcarifer and mapped 86 BAC clones to the first generation linkage map. This BAC library and the improved linkage map with 302 DNA markers not only supply an indispensable tool for the integration of physical and linkage maps, the fine mapping of QTL and the map-based cloning of genes located in QTL of commercial importance, but also contribute to comparative genomic studies and eventually whole genome sequencing.
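The coverage figure in this abstract follows from simple arithmetic: fold coverage is the total cloned DNA (number of clones times average insert size) divided by the haploid genome size. The Python sketch below rearranges that relation to show the haploid genome size implied by the reported numbers. The genome size itself is not stated in the abstract, so the result is an inference from the three reported values, and the variable names are illustrative.

```python
# Values reported in the abstract above.
N_CLONES = 49_152
AVG_INSERT_KB = 98.0
FOLD_COVERAGE = 6.9

# Total DNA held in the library, in kilobases.
total_insert_kb = N_CLONES * AVG_INSERT_KB

# coverage = total_insert / genome_size, so genome_size = total_insert / coverage.
implied_genome_mb = total_insert_kb / FOLD_COVERAGE / 1_000

print(f"total cloned DNA: {total_insert_kb / 1e6:.2f} Gb")
print(f"implied haploid genome size: {implied_genome_mb:.0f} Mb")
```

Under these assumptions the library holds about 4.8 Gb of cloned DNA, which corresponds to a haploid genome of roughly 700 Mb.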
Nielsen, Tue Kjærgaard; Rasmussen, Morten; Demanèche, Sandrine; Cecillon, Sébastien; Vogel, Timothy M.
2017-01-01
Abstract Bacterial degraders of chlorophenoxy herbicides have been isolated from various ecosystems, including pristine environments. Among these degraders, the sphingomonads constitute a prominent group that displays versatile xenobiotic-degradation capabilities. Four separate sequencing strategies were required to provide the complete sequence of the complex and plastic genome of the canonical chlorophenoxy herbicide-degrading Sphingobium herbicidovorans MH. The genome has an intricate organization of the chlorophenoxy-herbicide catabolic genes sdpA, rdpA, and cadABCD that encode the (R)- and (S)-enantiomer-specific 2,4-dichlorophenoxypropionate dioxygenases and four subunits of a Rieske non-heme iron oxygenase involved in 2-methyl-chlorophenoxyacetic acid degradation, respectively. Several major genomic rearrangements are proposed to help understand the evolution and mobility of these important genes and their genetic context. Single-strain mobilomic sequence analysis uncovered plasmids and insertion sequence-associated circular intermediates in this environmentally important bacterium and enabled the description of evolutionary models for pesticide degradation in strain MH and related organisms. The mobilome presented a complex mosaic of mobile genetic elements including four plasmids and several circular intermediate DNA molecules of insertion-sequence elements and transposons that are central to the evolution of xenobiotics degradation. Furthermore, two individual chromosomally integrated prophages were shown to excise and form free circular DNA molecules. This approach holds great potential for improving the understanding of genome plasticity, evolution, and microbial ecology. PMID:28961970
Targeted Mutagenesis of Guinea Pig Cytomegalovirus Using CRISPR/Cas9-Mediated Gene Editing.
Bierle, Craig J; Anderholm, Kaitlyn M; Wang, Jian Ben; McVoy, Michael A; Schleiss, Mark R
2016-08-01
The cytomegaloviruses (CMVs) are among the most genetically complex mammalian viruses, with viral genomes that often exceed 230 kbp. Manipulation of cytomegalovirus genomes is largely performed using infectious bacterial artificial chromosomes (BACs), which necessitates the maintenance of the viral genome in Escherichia coli and successful reconstitution of virus from permissive cells after transfection of the BAC. Here we describe an alternative strategy for the mutagenesis of guinea pig cytomegalovirus that utilizes clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9)-mediated genome editing to introduce targeted mutations to the viral genome. Transient transfection and drug selection were used to restrict lytic replication of guinea pig cytomegalovirus to cells that express Cas9 and virus-specific guide RNA. The result was highly efficient editing of the viral genome that introduced targeted insertion or deletion mutations to nonessential viral genes. Cotransfection of multiple virus-specific guide RNAs or a homology repair template was used for targeted, markerless deletions of viral sequence or to introduce exogenous sequence by homology-driven repair. As CRISPR/Cas9 mutagenesis occurs directly in infected cells, this methodology avoids selective pressures that may occur during propagation of the viral genome in bacteria and may facilitate genetic manipulation of low-passage or clinical CMV isolates. The cytomegalovirus genome is complex, and viral adaptations to cell culture have complicated the study of infection in vivo. Recombineering of viral bacterial artificial chromosomes enabled the study of recombinant cytomegaloviruses. Here we report the development of an alternative approach using CRISPR/Cas9-based mutagenesis in guinea pig cytomegalovirus, a small-animal model of congenital cytomegalovirus disease.
CRISPR/Cas9 mutagenesis can introduce the same types of mutations to the viral genome as bacterial artificial chromosome recombineering but does so directly in virus-infected cells. CRISPR/Cas9 mutagenesis is not dependent on a bacterial intermediate, and defined viral mutants can be recovered after a limited number of viral genome replications, minimizing the risk of spontaneous mutation. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Enhancers Are Major Targets for Murine Leukemia Virus Vector Integration
De Ravin, Suk See; Su, Ling; Theobald, Narda; Choi, Uimook; Macpherson, Janet L.; Poidinger, Michael; Symonds, Geoff; Pond, Susan M.; Ferris, Andrea L.; Hughes, Stephen H.
2014-01-01
ABSTRACT Retroviral vectors have been used in successful gene therapies. However, in some patients, insertional mutagenesis led to leukemia or myelodysplasia. Both the strong promoter/enhancer elements in the long terminal repeats (LTRs) of murine leukemia virus (MLV)-based vectors and the vector-specific integration site preferences played an important role in these adverse clinical events. MLV integration is known to prefer regions in or near transcription start sites (TSS). Recently, BET family proteins were shown to be the major cellular proteins responsible for targeting MLV integration. Although MLV integration sites are significantly enriched at TSS, only a small fraction of the MLV integration sites (<15%) occur in this region. To resolve this apparent discrepancy, we created a high-resolution genome-wide integration map of more than one million integration sites from CD34+ hematopoietic stem cells transduced with a clinically relevant MLV-based vector. The integration sites form ∼60,000 tight clusters. These clusters comprise ∼1.9% of the genome. The vast majority (87%) of the integration sites are located within histone H3K4me1 islands, a hallmark of enhancers. The majority of these clusters also have H3K27ac histone modifications, which mark active enhancers. The enhancers of some oncogenes, including LMO2, are highly preferred targets for integration without in vivo selection. IMPORTANCE We show that active enhancer regions are the major targets for MLV integration; this means that MLV preferentially integrates in regions that are favorable for viral gene expression in a variety of cell types. The results provide insights for MLV integration target site selection and also explain the high risk of insertional mutagenesis that is associated with gene therapy trials using MLV vectors. PMID:24501411
Liu, Xia; Li, Yuan; Yang, Hongyuan; Zhou, Boyang
2018-04-09
The complete chloroplast (cp) genome of Talinum paniculatum (Caryophyllales), a plant with pharmaceutical efficacy similar to ginseng and a widely distributed and cultivated edible vegetable, was sequenced and analyzed. The cp genome size of T. paniculatum is 156,929 bp, with a pair of inverted repeats (IRs) of 25,751 bp separated by a large single copy (LSC) region of 86,898 bp and a small single copy (SSC) region of 18,529 bp. The genome contains 83 protein-coding genes, 37 transfer RNA (tRNA) genes, eight ribosomal RNA (rRNA) genes and four pseudogenes. Fifty-one (51) repeat units and ninety-two (92) simple sequence repeats (SSRs) were found in the genome. Sequence alignment showed that the pseudogene rpl23 (ribosomal protein L23), which is located in the IR region, carries an AATT insertion relative to other Caryophyllales species. The trnK-UUU (tRNA-Lys) and rpl16 (ribosomal protein L16) genes have larger introns in T. paniculatum, and the matK (maturase K) gene, which is usually located in the intron of trnK-UUU, shows rich sequence divergence in Caryophyllales. Complete cp genome comparison with eight other Caryophyllales species indicated that the differences between T. paniculatum and P. oleracea were very slight, and that the most highly divergent regions occurred in intergenic spacers. Comparisons of IR boundaries among nine Caryophyllales species showed that T. paniculatum has a larger IR region and that the contraction is relatively slight. The phylogenetic analysis among 35 Caryophyllales species and two outgroup species revealed that T. paniculatum and P. oleracea do not belong to the same family. All these results provide good opportunities for future identification and barcoding of Talinum species, for understanding the evolutionary mode of the Caryophyllales cp genome, and for molecular breeding of T. paniculatum with high pharmaceutical efficacy.
Bacterio-opsin mutants of Halobacterium halobium
Betlach, Mary; Pfeifer, Felicitas; Friedman, James; Boyer, Herbert W.
1983-01-01
The bacterio-opsin (bop) gene of Halobacterium halobium R1 has been cloned with about 40 kilobases of flanking genomic sequence. The 40-kilobase segment is derived from the (G+C)-rich fraction of the chromosome and is not homologous to the major (pHH1) or minor endogenous covalently closed circular DNA species of H. halobium. A 5.1-kilobase Pst I fragment containing the bop gene was subcloned in pBR322 and a partial restriction map was determined. Defined restriction fragments of this clone were used as probes to analyze the defects associated with the bop gene in 12 bacterio-opsin mutants. Eleven out of 12 of the mutants examined had inserts ranging from 350 to 3,000 base pairs either in the bop gene or up to 1,400 base pairs upstream. The positions of the inserts were localized to four regions in the 5.1-kilobase genomic fragment: within the gene (one mutant), in a region that overlaps the 5′ end of the gene (seven mutants), and in two different upstream regions (three mutants). Two revertants of the mutant with the most distal insert had an additional insert in the same region. The polar effects of these inserts are discussed in terms of inactivation of a regulatory gene or disruption of part of a coordinately expressed operon. Given the defined nature of the bop mRNA (i.e., it has a 5′ leader sequence of three ribonucleotides), these observations indicate that the bop mRNA might be processed from a large mRNA transcript. PMID:16593291
Ebbie: automated analysis and storage of small RNA cloning data using a dynamic web server
Ebhardt, H Alexander; Wiese, Kay C; Unrau, Peter J
2006-01-01
Background DNA sequencing is used ubiquitously: from deciphering genomes[1] to determining the primary sequence of small RNAs (smRNAs) [2-5]. The cloning of smRNAs is currently the most conventional method to determine the actual sequence of these important regulators of gene expression. Typical smRNA cloning projects involve the sequencing of hundreds to thousands of smRNA clones that are delimited at their 5' and 3' ends by fixed sequence regions. These primers result from the biochemical protocol used to isolate and convert the smRNA into clonable PCR products. Recently we completed a smRNA cloning project involving tobacco plants, where analysis was required for ~700 smRNA sequences[6]. Finding no easily accessible research tool to enter and analyze smRNA sequences we developed Ebbie to assist us with our study. Results Ebbie is a semi-automated smRNA cloning data processing algorithm, which initially searches for any substring within a DNA sequencing text file, which is flanked by two constant strings. The substring, also termed smRNA or insert, is stored in a MySQL and BlastN database. These inserts are then compared using BlastN to locally installed databases allowing the rapid comparison of the insert to both the growing smRNA database and to other static sequence databases. Our laboratory used Ebbie to analyze scores of DNA sequencing data originating from an smRNA cloning project[6]. Through its built-in instant analysis of all inserts using BlastN, we were able to quickly identify 33 groups of smRNAs from ~700 database entries. This clustering allowed the easy identification of novel and highly expressed clusters of smRNAs. Ebbie is available under GNU GPL and currently implemented on Conclusion Ebbie was designed for medium sized smRNA cloning projects with about 1,000 database entries [6-8].Ebbie can be used for any type of sequence analysis where two constant primer regions flank a sequence of interest. 
The reliable storage of inserts, and their annotation in a MySQL database, BlastN[9] comparison of new inserts to dynamic and static databases make it a powerful new tool in any laboratory using DNA sequencing. Ebbie also prevents manual mistakes during the excision process and speeds up annotation and data-entry. Once the server is installed locally, its access can be restricted to protect sensitive new DNA sequencing data. Ebbie was primarily designed for smRNA cloning projects, but can be applied to a variety of RNA and DNA cloning projects[2,3,10,11]. PMID:16584563
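The core step described in this abstract, pulling out any substring that is flanked by two constant strings, can be sketched in a few lines of Python. This is not Ebbie's actual code, which the abstract does not show; the function name, the flank sequences and the toy read below are all hypothetical, and the MySQL storage and BlastN comparison steps are omitted.

```python
import re
from typing import List


def extract_inserts(read: str, left_flank: str, right_flank: str) -> List[str]:
    """Return every non-empty substring lying between the two constant flanks.

    The match is non-greedy, so each insert ends at the first right flank
    that follows its left flank.
    """
    pattern = re.compile(
        re.escape(left_flank) + r"(.+?)" + re.escape(right_flank)
    )
    return pattern.findall(read)


# Hypothetical flanks and read, for illustration only.
LEFT = "GGATCC"
RIGHT = "GAATTC"
read = "NNNN" + LEFT + "ACGTACGTACGTACGTACGTA" + RIGHT + "NNN"
print(extract_inserts(read, LEFT, RIGHT))
```

Escaping the flanks with `re.escape` keeps the sketch safe if a flank ever contains characters that are special in regular expressions; a production tool would also need to handle reverse-complemented reads and sequencing errors in the flanks, which this sketch does not attempt.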
Genomic Diversity of Type B3 Bacteriophages of Caulobacter crescentus.
Ash, Kurt T; Drake, Kristina M; Gibbs, Whitney S; Ely, Bert
2017-07-01
The genomes of the type B3 bacteriophages that infect Caulobacter crescentus are among the largest phage genomes thus far deposited into GenBank with sizes over 200 kb. In this study, we introduce six new bacteriophage genomes which were obtained from phage collected from various water systems in the southeastern United States and from tropical locations across the globe. A comparative analysis of the 12 available genomes revealed a "core genome" which accounts for roughly 1/3 of these bacteriophage genomes and is predominately localized to the head, tail, and lysis gene regions. Despite being isolated from geographically distinct locations, the genomes of these bacteriophages are highly conserved in both genome sequence and gene order. We also identified the insertions, deletions, translocations, and horizontal gene transfer events which are responsible for the genomic diversity of this group of bacteriophages and demonstrated that these changes are not consistent with the idea that modular reassortment of genomes occurs in this group of bacteriophages.
Wang, Yali; Gao, Yuan; Li, Chao; Gao, Hong; Zhang, Cheng-Cai; Xu, Xudong
2018-07-01
Anabaena sp. strain PCC 7120 is a model strain for molecular studies of cell differentiation and patterning in heterocyst-forming cyanobacteria. Subtle differences in heterocyst development have been noticed in different laboratories working on the same organism. In this study, 360 mutations, including single nucleotide polymorphisms (SNPs), small insertions/deletions (indels; 1 to 3 bp), fragment deletions, and transpositions, were identified in the genomes of three substrains. Heterogeneous/heterozygous bases were also identified due to the polyploid nature of the genome and the multicellular morphology but could be completely segregated when plated after filament fragmentation by sonication. hetC is a gene upregulated in developing cells during heterocyst formation in Anabaena sp. strain PCC 7120 and found in approximately half of other heterocyst-forming cyanobacteria. Inactivation of hetC in 3 substrains of Anabaena sp. PCC 7120 led to different phenotypes: the formation of heterocysts, differentiating cells that keep dividing, or the presence of both heterocysts and dividing differentiating cells. The expression of PhetZ-gfp in these hetC mutants also showed different patterns of green fluorescent protein (GFP) fluorescence. Thus, the function of hetC is influenced by the genomic background and epistasis and constitutes an example of evolution under way. IMPORTANCE Our knowledge about the molecular genetics of heterocyst formation, an important cell differentiation process for global N2 fixation, is mostly based on studies with Anabaena sp. strain PCC 7120. Here, we show that rapid microevolution is under way in this strain, leading to phenotypic variations for certain genes related to heterocyst development, such as hetC. This study provides an example of ongoing microevolution, marked by multiple heterogeneous/heterozygous single nucleotide polymorphisms (SNPs), in a multicellular multicopy-genome microorganism.
Copyright © 2018 American Society for Microbiology.
2011-01-01
Background DNA transposons have emerged as indispensable tools for manipulating vertebrate genomes with applications ranging from insertional mutagenesis and transgenesis to gene therapy. To fully explore the potential of two highly active DNA transposons, piggyBac and Tol2, as mammalian genetic tools, we have conducted a side-by-side comparison of the two transposon systems in the same setting to evaluate their advantages and disadvantages for use in gene therapy and gene discovery. Results We have observed that (1) the Tol2 transposase (but not piggyBac) is highly sensitive to molecular engineering; (2) the piggyBac donor with only the 40 bp 3'- and 67 bp 5'-terminal repeat domains is sufficient for effective transposition; and (3) a small amount of piggyBac transposase results in robust transposition, suggesting that the piggyBac transposase is highly active. Performing genome-wide target profiling on data sets obtained by retrieving chromosomal targeting sequences from individual clones, we have identified several piggyBac and Tol2 hotspots and observed that (4) piggyBac and Tol2 display a clear difference in targeting preferences in the human genome. Finally, we have observed that (5) only sites with a particular sequence context can be targeted by either piggyBac or Tol2. Conclusions The non-overlapping targeting preference of piggyBac and Tol2 makes them complementary research tools for manipulating mammalian genomes. PiggyBac is the most promising transposon-based vector system for achieving site-specific targeting of therapeutic genes due to the flexibility of its transposase for being molecularly engineered. Insights from this study will provide a basis for engineering piggyBac transposases to achieve site-specific therapeutic gene targeting. PMID:21447194
Gartemann, Karl-Heinz; Abt, Birte; Bekel, Thomas; Burger, Annette; Engemann, Jutta; Flügel, Monika; Gaigalat, Lars; Goesmann, Alexander; Gräfen, Ines; Kalinowski, Jörn; Kaup, Olaf; Kirchner, Oliver; Krause, Lutz; Linke, Burkhard; McHardy, Alice; Meyer, Folker; Pohle, Sandra; Rückert, Christian; Schneiker, Susanne; Zellermann, Eva-Maria; Pühler, Alfred; Eichenlaub, Rudolf; Kaiser, Olaf; Bartels, Daniela
2008-01-01
Clavibacter michiganensis subsp. michiganensis is a plant-pathogenic actinomycete that causes bacterial wilt and canker of tomato. The nucleotide sequence of the genome of strain NCPPB382 was determined. The chromosome is circular, consists of 3.298 Mb, and has a high G+C content (72.6%). Annotation revealed 3,080 putative protein-encoding sequences; only 26 pseudogenes were detected. Two rrn operons, 45 tRNAs, and three small stable RNA genes were found. The two circular plasmids, pCM1 (27.4 kbp) and pCM2 (70.0 kbp), which carry pathogenicity genes and thus are essential for virulence, have lower G+C contents (66.5 and 67.6%, respectively). In contrast to the genome of the closely related organism Clavibacter michiganensis subsp. sepedonicus, the genome of C. michiganensis subsp. michiganensis lacks complete insertion elements and transposons. The 129-kb chp/tomA region with a low G+C content near the chromosomal origin of replication was shown to be necessary for pathogenicity. This region contains numerous genes encoding proteins involved in uptake and metabolism of sugars and several serine proteases. There is evidence that single genes located in this region, especially genes encoding serine proteases, are required for efficient colonization of the host. Although C. michiganensis subsp. michiganensis grows mainly in the xylem of tomato plants, no evidence for pronounced genome reduction was found. C. michiganensis subsp. michiganensis seems to have as many transporters and regulators as typical soil-inhabiting bacteria. However, the apparent lack of a sulfate reduction pathway, which makes C. michiganensis subsp. michiganensis dependent on reduced sulfur compounds for growth, is probably the reason for the poor survival of C. michiganensis subsp. michiganensis in soil. PMID:18192381
Sung, Yun J; Winkler, Thomas W; de Las Fuentes, Lisa; Bentley, Amy R; Brown, Michael R; Kraja, Aldi T; Schwander, Karen; Ntalla, Ioanna; Guo, Xiuqing; Franceschini, Nora; Lu, Yingchang; Cheng, Ching-Yu; Sim, Xueling; Vojinovic, Dina; Marten, Jonathan; Musani, Solomon K; Li, Changwei; Feitosa, Mary F; Kilpeläinen, Tuomas O; Richard, Melissa A; Noordam, Raymond; Aslibekyan, Stella; Aschard, Hugues; Bartz, Traci M; Dorajoo, Rajkumar; Liu, Yongmei; Manning, Alisa K; Rankinen, Tuomo; Smith, Albert Vernon; Tajuddin, Salman M; Tayo, Bamidele O; Warren, Helen R; Zhao, Wei; Zhou, Yanhua; Matoba, Nana; Sofer, Tamar; Alver, Maris; Amini, Marzyeh; Boissel, Mathilde; Chai, Jin Fang; Chen, Xu; Divers, Jasmin; Gandin, Ilaria; Gao, Chuan; Giulianini, Franco; Goel, Anuj; Harris, Sarah E; Hartwig, Fernando Pires; Horimoto, Andrea R V R; Hsu, Fang-Chi; Jackson, Anne U; Kähönen, Mika; Kasturiratne, Anuradhani; Kühnel, Brigitte; Leander, Karin; Lee, Wen-Jane; Lin, Keng-Hung; Luan, Jian'an; McKenzie, Colin A; Meian, He; Nelson, Christopher P; Rauramaa, Rainer; Schupf, Nicole; Scott, Robert A; Sheu, Wayne H H; Stančáková, Alena; Takeuchi, Fumihiko; van der Most, Peter J; Varga, Tibor V; Wang, Heming; Wang, Yajuan; Ware, Erin B; Weiss, Stefan; Wen, Wanqing; Yanek, Lisa R; Zhang, Weihua; Zhao, Jing Hua; Afaq, Saima; Alfred, Tamuno; Amin, Najaf; Arking, Dan; Aung, Tin; Barr, R Graham; Bielak, Lawrence F; Boerwinkle, Eric; Bottinger, Erwin P; Braund, Peter S; Brody, Jennifer A; Broeckel, Ulrich; Cabrera, Claudia P; Cade, Brian; Caizheng, Yu; Campbell, Archie; Canouil, Mickaël; Chakravarti, Aravinda; Chauhan, Ganesh; Christensen, Kaare; Cocca, Massimiliano; Collins, Francis S; Connell, John M; de Mutsert, Renée; de Silva, H Janaka; Debette, Stephanie; Dörr, Marcus; Duan, Qing; Eaton, Charles B; Ehret, Georg; Evangelou, Evangelos; Faul, Jessica D; Fisher, Virginia A; Forouhi, Nita G; Franco, Oscar H; Friedlander, Yechiel; Gao, He; Gigante, Bruna; Graff, Misa; Gu, C Charles; Gu, Dongfeng; Gupta, Preeti; Hagenaars, Saskia P; Harris, Tamara B; He, Jiang; Heikkinen, Sami; Heng, Chew-Kiat; Hirata, Makoto; Hofman, Albert; Howard, Barbara V; Hunt, Steven; Irvin, Marguerite R; Jia, Yucheng; Joehanes, Roby; Justice, Anne E; Katsuya, Tomohiro; Kaufman, Joel; Kerrison, Nicola D; Khor, Chiea Chuen; Koh, Woon-Puay; Koistinen, Heikki A; Komulainen, Pirjo; Kooperberg, Charles; Krieger, Jose E; Kubo, Michiaki; Kuusisto, Johanna; Langefeld, Carl D; Langenberg, Claudia; Launer, Lenore J; Lehne, Benjamin; Lewis, Cora E; Li, Yize; Lim, Sing Hui; Lin, Shiow; Liu, Ching-Ti; Liu, Jianjun; Liu, Jingmin; Liu, Kiang; Liu, Yeheng; Loh, Marie; Lohman, Kurt K; Long, Jirong; Louie, Tin; Mägi, Reedik; Mahajan, Anubha; Meitinger, Thomas; Metspalu, Andres; Milani, Lili; Momozawa, Yukihide; Morris, Andrew P; Mosley, Thomas H; Munson, Peter; Murray, Alison D; Nalls, Mike A; Nasri, Ubaydah; Norris, Jill M; North, Kari; Ogunniyi, Adesola; Padmanabhan, Sandosh; Palmas, Walter R; Palmer, Nicholette D; Pankow, James S; Pedersen, Nancy L; Peters, Annette; Peyser, Patricia A; Polasek, Ozren; Raitakari, Olli T; Renström, Frida; Rice, Treva K; Ridker, Paul M; Robino, Antonietta; Robinson, Jennifer G; Rose, Lynda M; Rudan, Igor; Sabanayagam, Charumathi; Salako, Babatunde L; Sandow, Kevin; Schmidt, Carsten O; Schreiner, Pamela J; Scott, William R; Seshadri, Sudha; Sever, Peter; Sitlani, Colleen M; Smith, Jennifer A; Snieder, Harold; Starr, John M; Strauch, Konstantin; Tang, Hua; Taylor, Kent D; Teo, Yik Ying; Tham, Yih Chung; Uitterlinden, André G; Waldenberger, Melanie; Wang, Lihua; Wang, Ya X; Wei, Wen Bin; Williams, Christine; Wilson, Gregory; Wojczynski, Mary K; Yao, Jie; Yuan, Jian-Min; Zonderman, Alan B; Becker, Diane M; Boehnke, Michael; Bowden, Donald W; Chambers, John C; Chen, Yii-Der Ida; de Faire, Ulf; Deary, Ian J; Esko, Tõnu; Farrall, Martin; Forrester, Terrence; Franks, Paul W; Freedman, Barry I; Froguel, Philippe; Gasparini, Paolo; Gieger, Christian; Horta, Bernardo Lessa; Hung, Yi-Jen; Jonas, Jost B; Kato, Norihiro; Kooner, Jaspal S; Laakso, Markku; Lehtimäki, Terho; Liang, Kae-Woei; Magnusson, Patrik K E; Newman, Anne B; Oldehinkel, Albertine J; Pereira, Alexandre C; Redline, Susan; Rettig, Rainer; Samani, Nilesh J; Scott, James; Shu, Xiao-Ou; van der Harst, Pim; Wagenknecht, Lynne E; Wareham, Nicholas J; Watkins, Hugh; Weir, David R; Wickremasinghe, Ananda R; Wu, Tangchun; Zheng, Wei; Kamatani, Yoichiro; Laurie, Cathy C; Bouchard, Claude; Cooper, Richard S; Evans, Michele K; Gudnason, Vilmundur; Kardia, Sharon L R; Kritchevsky, Stephen B; Levy, Daniel; O'Connell, Jeff R; Psaty, Bruce M; van Dam, Rob M; Sims, Mario; Arnett, Donna K; Mook-Kanamori, Dennis O; Kelly, Tanika N; Fox, Ervin R; Hayward, Caroline; Fornage, Myriam; Rotimi, Charles N; Province, Michael A; van Duijn, Cornelia M; Tai, E Shyong; Wong, Tien Yin; Loos, Ruth J F; Reiner, Alex P; Rotter, Jerome I; Zhu, Xiaofeng; Bierut, Laura J; Gauderman, W James; Caulfield, Mark J; Elliott, Paul; Rice, Kenneth; Munroe, Patricia B; Morrison, Alanna C; Cupples, L Adrienne; Rao, Dabeeru C; Chasman, Daniel I
2018-03-01
Genome-wide association analysis advanced understanding of blood pressure (BP), a major risk factor for vascular conditions such as coronary heart disease and stroke. Accounting for smoking behavior may help identify BP loci and extend our knowledge of its genetic architecture. We performed genome-wide association meta-analyses of systolic and diastolic BP incorporating gene-smoking interactions in 610,091 individuals. Stage 1 analysis examined ∼18.8 million SNPs and small insertion/deletion variants in 129,913 individuals from four ancestries (European, African, Asian, and Hispanic) with follow-up analysis of promising variants in 480,178 additional individuals from five ancestries. We identified 15 loci that were genome-wide significant (p < 5 × 10⁻⁸) in stage 1 and formally replicated in stage 2. A combined stage 1 and 2 meta-analysis identified 66 additional genome-wide significant loci (13, 35, and 18 loci in European, African, and trans-ancestry, respectively). A total of 56 known BP loci were also identified by our results (p < 5 × 10⁻⁸). Of the newly identified loci, ten showed significant interaction with smoking status, but none of them were replicated in stage 2. Several loci were identified in African ancestry, highlighting the importance of genetic studies in diverse populations. The identified loci show strong evidence for regulatory features and support shared pathophysiology with cardiometabolic and addiction traits. They also highlight a role in BP regulation for biological candidates such as modulators of vascular structure and function (CDKN1B, BCAR1-CFDP1, PXDN, EEA1), ciliopathies (SDCCAG8, RPGRIP1L), telomere maintenance (TNKS, PINX1, AKTIP), and central dopaminergic signaling (MSRA, EBF2). Copyright © 2018 American Society of Human Genetics. All rights reserved.
Begin at the beginning: A BAC-end view of the passion fruit (Passiflora) genome.
Santos, Anselmo Azevedo; Penha, Helen Alves; Bellec, Arnaud; Munhoz, Carla de Freitas; Pedrosa-Harand, Andrea; Bergès, Hélène; Vieira, Maria Lucia Carneiro
2014-09-26
The passion fruit (Passiflora edulis) is a tropical crop of economic importance both for juice production and consumption as fresh fruit. The juice is also used in concentrate blends that are consumed worldwide. However, very little is known about the genome of the species. Therefore, improving our understanding of passion fruit genomics is essential and to some degree a pre-requisite if its genetic resources are to be used more efficiently. In this study, we have constructed a large-insert BAC library and provided the first view on the structure and content of the passion fruit genome, using BAC-end sequence (BES) data as a major resource. The library consisted of 82,944 clones and its levels of organellar DNA were very low. The library represents six haploid genome equivalents, and the average insert size was 108 kb. To check its utility for gene isolation, successful macroarray screening experiments were carried out with probes complementary to eight Passiflora gene sequences available in public databases. BACs harbouring those genes were used in fluorescent in situ hybridizations and unique signals were detected for four BACs in three chromosomes (n=9). Then, we explored 10,000 BES to identify reads likely to contain repetitive mobile elements (19.6% of all BES), simple sequence repeats and putative proteins, and to estimate the GC content (~42%) of the reads. Around 9.6% of all BES were found to have high levels of similarity to plant genes and ontological terms were assigned to more than half of the sequences analysed (940). The vast majority of the top-hits made by our sequences were to Populus trichocarpa (24.8% of the total occurrences), Theobroma cacao (21.6%), Ricinus communis (14.3%), Vitis vinifera (6.5%) and Prunus persica (3.8%). We generated the first large-insert library for a member of Passifloraceae. This BAC library provides a new resource for genetic and genomic studies and represents a valuable tool for future whole-genome studies. Remarkably, a number of BAC-end pair sequences could be mapped to intervals of the sequenced Arabidopsis thaliana, V. vinifera and P. trichocarpa chromosomes, and putative collinear microsyntenic regions were identified.
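The library figures quoted in this abstract (82,944 clones, 108 kb average insert, six haploid genome equivalents) can be related by the usual coverage arithmetic. The sketch below is an illustration only: the abstract does not state a genome size, so the value computed here is merely what those rounded figures imply, and the function names are hypothetical. The probability estimate uses the standard Clarke-Carbon approximation, which assumes random, evenly represented clones.

```python
import math


def implied_genome_size_bp(n_clones, mean_insert_bp, genome_equivalents):
    """Genome size implied by: coverage = n_clones * mean_insert / genome size."""
    return n_clones * mean_insert_bp / genome_equivalents


def probability_of_recovery(n_clones, mean_insert_bp, genome_bp):
    """Clarke-Carbon estimate that a given locus is in at least one clone."""
    fraction = mean_insert_bp / genome_bp
    return 1.0 - math.exp(n_clones * math.log(1.0 - fraction))


# Figures taken from the abstract; everything derived from them is approximate.
n_clones = 82_944
mean_insert_bp = 108_000
coverage = 6

total_bp = n_clones * mean_insert_bp
genome_bp = implied_genome_size_bp(n_clones, mean_insert_bp, coverage)
p_hit = probability_of_recovery(n_clones, mean_insert_bp, genome_bp)

print(f"Total cloned DNA: {total_bp / 1e9:.2f} Gb")
print(f"Implied haploid genome size: {genome_bp / 1e9:.2f} Gb")
print(f"Probability a given locus is represented: {p_hit:.4f}")
```

Because "six genome equivalents" is a rounded figure and the calculation ignores empty or organellar clones, the implied genome size should be read as a rough consistency check rather than a measurement.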
Genome editing using CRISPR/Cas9-based knock-in approaches in zebrafish.
Albadri, Shahad; Del Bene, Filippo; Revenu, Céline
2017-05-15
With its variety of applications, the CRISPR/Cas9 genome editing technology has been rapidly evolving in the last few years. In the zebrafish community, knock-out reports are constantly increasing but insertion studies have so far been more challenging. With this review, we aim to give an overview of homology-directed repair (HDR)-based knock-in generation in zebrafish. We address the critical points and limitations of the procedure, such as the cutting efficiency of the chosen single guide RNA, the use of cas9 mRNA or Cas9 protein, and homology arm size, but also ways to circumvent issues encountered with HDR insertions through the development of homology-independent strategies. While imprecise, these homology-independent mechanisms based on non-homologous end joining (NHEJ) repair have been employed in zebrafish to generate reporter lines or to accurately edit an open reading frame by the use of intron-targeting modifications. Therefore, with higher efficiency and insertion rate, NHEJ-based knock-in seems to be a promising approach to target endogenous loci and to circumvent the limitations of HDR whenever it is possible and appropriate. In this perspective, we propose new strategies to generate cDNA edited or tagged insertions, which once established will constitute a new and versatile toolbox for CRISPR/Cas9-based knock-ins in zebrafish. Copyright © 2017 Elsevier Inc. All rights reserved.
Zhu, Li-Ping; Yue, Xin-Jing; Han, Kui; Li, Zhi-Feng; Zheng, Lian-Shuai; Yi, Xiu-Nan; Wang, Hai-Long; Zhang, You-Ming; Li, Yue-Zhong
2015-07-22
Exotic genes, especially clustered multiple genes for a complex pathway, are normally integrated into the chromosome for heterologous expression. The influences of insertion sites on heterologous expression, and of allotropic expression of exotic genes on the host, remain mostly unclear. We compared the integration and expression efficiencies of single and multiple exotic genes that were inserted into the Myxococcus xanthus genome by transposition and attB-site-directed recombination. While the site-directed integration gave a rather stable chloramphenicol acetyl transferase (CAT) activity, the transposition produced varied CAT enzyme activities. We attempted to integrate the 56-kb gene cluster for the biosynthesis of the antitumor polyketides epothilones into the M. xanthus genome by site-directed recombination but failed, which was determined to be due to the insertion size limitation at the attB site. The transposition technique produced many recombinants with varied production capabilities of epothilones, which, however, did not parallel the transcriptional characteristics of the local sites where the genes were integrated. Comparative transcriptomics analysis demonstrated that the allopatric integrations caused selective changes of host transcriptomes, leading to varied expression of epothilone genes in different mutants. As the size of the inserted fragment increases, transposition becomes the more practicable integration method for the expression of exotic genes. Allopatric integrations selectively change host transcriptomes, which leads to varied expression efficiencies of exotic genes.
[The application of genome editing in identification of plant gene function and crop breeding].
Zhou, Xiang-chun; Xing, Yong-zhong
2016-03-01
Plant genomes can be modified via current biotechnology with high specificity and excellent efficiency. Zinc finger nucleases (ZFN), transcription activator-like effector nucleases (TALEN) and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated 9 (Cas9) system are the key engineered nucleases used in genome editing. Genome editing techniques enable targeted gene mutagenesis, gene knock-out, and gene insertion or replacement at the target sites during the endogenous DNA repair process, including non-homologous end joining (NHEJ) and homologous recombination (HR), triggered by the induction of a DNA double-strand break (DSB). Genome editing has been successfully applied in the genome modification of diverse plant species, such as Arabidopsis thaliana, Oryza sativa, and Nicotiana tabacum. In this review, we summarize the application of genome editing in identification of plant gene function and crop breeding. Moreover, we also discuss aspects of genome editing that need improvement for precise genetic improvement of crops, as directions for further study.
[Genome editing of industrial microorganism].
Zhu, Linjiang; Li, Qi
2015-03-01
Genome editing is defined as highly effective and precise modification of the cellular genome on a large scale. In recent years, such genome-editing methods have developed rapidly in the field of industrial strain improvement. These rapidly evolving methods have thoroughly changed the old, inefficient mode of genetic modification, namely "one modification, one selection marker, and one target site". Highly effective modification modes have been developed, including simultaneous modification of multiple genes; highly effective insertion, replacement, and deletion of target genes at the genome scale; and cut-and-paste of large DNA fragments. These new tools for microbial genome editing will certainly be applied widely, increasing the efficiency of industrial strain improvement and promoting both the transformation of the traditional fermentation industry and the rapid development of novel industrial biotechnology, such as the production of biofuels and biomaterials. This review summarizes the technological principles of these genome-editing methods and their applications, which can benefit the engineering and construction of industrial microorganisms.
Genetic and Functional Diversification of Small RNA Pathways in Plants
Gustafson, Adam M; Kasschau, Kristin D; Lellis, Andrew D; Zilberman, Daniel; Jacobsen, Steven E
2004-01-01
Multicellular eukaryotes produce small RNA molecules (approximately 21–24 nucleotides) of two general types, microRNA (miRNA) and short interfering RNA (siRNA). They collectively function as sequence-specific guides to silence or regulate genes, transposons, and viruses and to modify chromatin and genome structure. Formation or activity of small RNAs requires factors belonging to gene families that encode DICER (or DICER-LIKE [DCL]) and ARGONAUTE proteins and, in the case of some siRNAs, RNA-dependent RNA polymerase (RDR) proteins. Unlike many animals, plants encode multiple DCL and RDR proteins. Using a series of insertion mutants of Arabidopsis thaliana, unique functions for three DCL proteins in miRNA (DCL1), endogenous siRNA (DCL3), and viral siRNA (DCL2) biogenesis were identified. One RDR protein (RDR2) was required for all endogenous siRNAs analyzed. The loss of endogenous siRNA in dcl3 and rdr2 mutants was associated with loss of heterochromatic marks and increased transcript accumulation at some loci. Defects in siRNA-generation activity in response to turnip crinkle virus in dcl2 mutant plants correlated with increased virus susceptibility. We conclude that proliferation and diversification of DCL and RDR genes during evolution of plants contributed to specialization of small RNA-directed pathways for development, chromatin structure, and defense. PMID:15024409
48 CFR 819.7115 - Solicitation provisions.
Code of Federal Regulations, 2010 CFR
2010-10-01
... SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS VA Mentor-Protégé Program 819.7115 Solicitation provisions. (a) Insert 852.219-71, VA Mentor-Protégé Program, in solicitations that include FAR clause 52.219-9, Small Business Subcontracting Plan. (b) Insert 852.219-72, Evaluation Factor for Participation in the VA Mentor...
48 CFR 4.607 - Solicitation provisions and contract clause.
Code of Federal Regulations, 2014 CFR
2014-10-01
... clause. (a) Insert the provision at 52.204-5, Women-Owned Business (Other Than Small Business), in all solicitations that— (1) Are not set aside for small business concerns; (2) Exceed the simplified acquisition...) Insert the provision at 52.204-6, Data Universal Numbering System Number, in solicitations that do not...
48 CFR 4.607 - Solicitation provisions and contract clause.
Code of Federal Regulations, 2013 CFR
2013-10-01
... clause. (a) Insert the provision at 52.204-5, Women-Owned Business (Other Than Small Business), in all solicitations that— (1) Are not set aside for small business concerns; (2) Exceed the simplified acquisition...) Insert the provision at 52.204-6, Data Universal Numbering System Number, in solicitations that do not...
Shewale, Jaiprakash G; Schneida, Elaine; Wilson, Jonathan; Walker, Jerilyn A; Batzer, Mark A; Sinha, Sudhir K
2007-03-01
The human DNA quantification (H-Quant) system, developed for use in human identification, enables quantitation of human genomic DNA in biological samples. The assay is based on real-time amplification of AluYb8 insertions in hominoid primates. The relatively high copy number of subfamily-specific Alu repeats in the human genome enables quantification of very small amounts of human DNA. The oligonucleotide primers present in H-Quant are specific for human DNA and closely related great apes. During the real-time PCR, the SYBR Green I dye binds to the DNA that is synthesized by the human-specific AluYb8 oligonucleotide primers. The fluorescence of the bound SYBR Green I dye is measured at the end of each PCR cycle. The cycle at which the fluorescence crosses the chosen threshold correlates with the quantity of amplifiable DNA in that sample. The minimal sensitivity of the H-Quant system is 7.6 pg/µL of human DNA. The amplicon generated in the H-Quant assay is 216 bp, which is within the same range as the common amplifiable short tandem repeat (STR) amplicons. This amplicon size enables quantitation of amplifiable DNA as opposed to a quantitation of degraded or nonamplifiable DNA of smaller sizes. Development and validation studies were performed on the 7500 real-time PCR system following the Quality Assurance Standards for Forensic DNA Testing Laboratories.
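The relationship described in this abstract, where the threshold cycle (Ct) tracks the amount of amplifiable DNA, is commonly exploited through a standard curve: Ct is fitted as a linear function of log10(quantity) for known standards, and unknowns are read back from the line. The sketch below illustrates that general approach only. It is not the H-Quant software, and the dilution series, Ct values, and function names are invented for illustration.

```python
import math


def fit_standard_curve(quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate quantity from an observed Ct."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical standards (ng/µL) and Ct values; not data from the paper.
standards = [10.0, 1.0, 0.1, 0.01]
standard_cts = [18.0, 21.32, 24.64, 27.96]

slope, intercept = fit_standard_curve(standards, standard_cts)
# Amplification efficiency implied by the slope (1.0 means perfect doubling).
efficiency = 10 ** (-1 / slope) - 1
unknown = quantity_from_ct(23.0, slope, intercept)

print(f"slope={slope:.2f}, intercept={intercept:.2f}, efficiency={efficiency:.2%}")
print(f"Estimated quantity at Ct 23.0: {unknown:.3f} ng/µL")
```

A slope near -3.32 corresponds to roughly one doubling per cycle; real assays typically report the fitted slope and efficiency alongside the quantities so that a poor standard curve can be detected.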
CRISPRDetect: A flexible algorithm to define CRISPR arrays.
Biswas, Ambarish; Staals, Raymond H J; Morales, Sergio E; Fineran, Peter C; Brown, Chris M
2016-05-17
CRISPR (clustered regularly interspaced short palindromic repeats) RNAs provide the specificity for noncoding RNA-guided adaptive immune defence systems in prokaryotes. CRISPR arrays consist of repeat sequences separated by specific spacer sequences. CRISPR arrays have previously been identified in a large proportion of prokaryotic genomes. However, currently available detection algorithms do not utilise recently discovered features regarding CRISPR loci. We have developed a new approach to automatically detect, predict and interactively refine CRISPR arrays. It is available as a web program and command line from bioanalysis.otago.ac.nz/CRISPRDetect. CRISPRDetect discovers putative arrays, extends the array by detecting additional variant repeats, corrects the direction of arrays, refines the repeat/spacer boundaries, and annotates different types of sequence variations (e.g. insertion/deletion) in near identical repeats. Due to these features, CRISPRDetect has significant advantages when compared to existing identification tools. As well as further support for small, medium and large repeats, CRISPRDetect identified a class of arrays with 'extra-large' repeats in bacteria (repeats 44-50 nt). The CRISPRDetect output is integrated with other analysis tools. Notably, the predicted spacers can be directly utilised by CRISPRTarget to predict targets. CRISPRDetect enables more accurate detection of arrays and spacers and its gff output is suitable for inclusion in genome annotation pipelines and visualisation. It has been used to analyse all complete bacterial and archaeal reference genomes.
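The array structure described in this abstract, repeats separated by spacers, can be illustrated with a toy example. The sketch below is not the CRISPRDetect algorithm: it assumes the repeat is already known and matches it exactly, whereas the abstract stresses that real arrays contain variant repeats and need boundary refinement. The sequences and the function name are hypothetical.

```python
def extract_spacers(sequence, repeat):
    """Return start positions of exact, non-overlapping repeat copies
    and the spacer sequences lying between consecutive copies."""
    positions = []
    start = sequence.find(repeat)
    while start != -1:
        positions.append(start)
        # Resume the search just past the copy that was found.
        start = sequence.find(repeat, start + len(repeat))
    spacers = [
        sequence[a + len(repeat):b]
        for a, b in zip(positions, positions[1:])
    ]
    return positions, spacers


# Invented toy sequence: leader + repeat/spacer/repeat/spacer/repeat + tail.
repeat = "GTTTTAGAGCTA"
toy = "AAAC" + repeat + "ACGTACGTAC" + repeat + "TTGCATGCAA" + repeat + "GGG"

positions, spacers = extract_spacers(toy, repeat)
print(positions)
print(spacers)
```

Exact matching keeps the example short; tolerating mismatches and insertions or deletions within repeats is precisely the harder problem the tool described above addresses.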
Zhao, Chaoyang; Shukle, Richard; Navarro-Escalante, Lucio; Chen, Mingshun; Richards, Stephen; Stuart, Jeffrey J
2016-01-01
The genetic tractability of the Hessian fly (HF, Mayetiola destructor) provides an opportunity to investigate the mechanisms insects use to induce plant gall formation. Here we demonstrate that capacity using the newly sequenced HF genome by identifying the gene (vH24) that elicits effector-triggered immunity in wheat (Triticum spp.) seedlings carrying HF resistance gene H24. vH24 was mapped within a 230-kb genomic fragment near the telomere of HF chromosome X1. That fragment contains only 21 putative genes. The best candidate vH24 gene in this region encodes a protein containing a secretion signal and a type-2 serine/threonine protein phosphatase (PP2C) domain. This gene has an H24-virulence associated insertion in its promoter that appears to silence transcription of the gene in H24-virulent larvae. Candidate vH24 is a member of a small family of genes that encode secretion signals and PP2C domains. It belongs to the fraction of genes in the HF genome previously predicted to encode effector proteins. Because PP2C proteins are not normally secreted, our results suggest that these are PP2C effectors that HF larvae inject into wheat cells to redirect, or interfere with, wheat signal transduction pathways. Copyright © 2015 Elsevier Ltd. All rights reserved.
Cheng, Feixiong; Zhao, Junfei; Zhao, Zhongming
2016-07-01
Cancer is often driven by the accumulation of genetic alterations, including single nucleotide variants, small insertions or deletions, gene fusions, copy-number variations, and large chromosomal rearrangements. Recent advances in next-generation sequencing technologies have helped investigators generate massive amounts of cancer genomic data and catalog somatic mutations in both common and rare cancer types. So far, the somatic mutation landscapes and signatures of >10 major cancer types have been reported; however, pinpointing driver mutations and cancer genes from millions of available cancer somatic mutations remains a monumental challenge. To tackle this important task, many methods and computational tools have been developed during the past several years and, thus, a review of its advances is urgently needed. Here, we first summarize the main features of these methods and tools for whole-exome, whole-genome and whole-transcriptome sequencing data. Then, we discuss major challenges like tumor intra-heterogeneity, tumor sample saturation and functionality of synonymous mutations in cancer, all of which may result in false-positive discoveries. Finally, we highlight new directions in studying regulatory roles of noncoding somatic mutations and quantitatively measuring circulating tumor DNA in cancer. This review may help investigators find an appropriate tool for detecting potential driver or actionable mutations in rapidly emerging precision cancer medicine. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
Bardaji, Leire; Añorga, Maite; Jackson, Robert W.; Martínez-Bilbao, Alejandro; Yanguas-Casás, Natalia; Murillo, Jesús
2011-01-01
Mobile genetic elements are widespread in Pseudomonas syringae, and often associate with virulence genes. Genome reannotation of the model bean pathogen P. syringae pv. phaseolicola 1448A identified seventeen types of insertion sequences and two miniature inverted-repeat transposable elements (MITEs) with a biased distribution, representing 2.8% of the chromosome, 25.8% of the 132-kb virulence plasmid and 2.7% of the 52-kb plasmid. Employing an entrapment vector containing sacB, we estimated that transposition frequency oscillated between 2.6×10⁻⁵ and 1.1×10⁻⁶, depending on the clone, although it was stable for each clone after consecutive transfers in culture media. Transposition frequency was similar for bacteria grown in rich or minimal media, and from cells recovered from compatible and incompatible plant hosts, indicating that growth conditions do not influence transposition in strain 1448A. Most of the entrapped insertions contained a full-length IS801 element, with the remaining insertions corresponding to sequences smaller than any transposable element identified in strain 1448A, and collectively identified as miniature sequences. From these, fragments of 229, 360 and 679-nt of the right end of IS801 ended in a consensus tetranucleotide and likely resulted from one-ended transposition of IS801. An average 0.7% of the insertions analyzed consisted of IS801 carrying a fragment of variable size from gene PSPPH_0008/PSPPH_0017, showing that IS801 can mobilize DNA in vivo. Retrospective analysis of complete plasmids and genomes of P. syringae suggests, however, that most fragments of IS801 are likely the result of reorganizations rather than one-ended transpositions, and that this element might preferentially contribute to genome flexibility by generating homologous regions of recombination. 
A further miniature sequence previously found to affect host range specificity and virulence, designated MITEPsy1 (100-nt), represented an average 2.4% of the total number of insertions entrapped in sacB, demonstrating for the first time the mobilization of a MITE in bacteria. PMID:22016774
Fléchard, Maud; Gilot, Philippe
2014-07-01
We have referenced and described Streptococcus agalactiae transposable elements encoding DDE transposases. These elements belonged to nine families of insertion sequences (ISs) and to a family of conjugative transposons (TnGBSs). An overview of the physiological impact of the insertion of all these elements is provided. DDE-transposable elements affect S. agalactiae in a number of aspects of its capability to adapt to various environments and modulate the expression of several virulence genes, the scpB-lmB genomic region and the genes involved in capsule expression and haemolysin transport being the targets of several different mobile elements. The referenced mobile elements modify S. agalactiae behaviour by transferring new gene(s) to its genome, by modifying the expression of neighbouring genes at the integration site or by promoting genomic rearrangements. Transposition of some of these elements occurs in vivo, suggesting that by dynamically regulating some adaptation and/or virulence genes, they improve the ability of S. agalactiae to reach different niches within its host and ensure the 'success' of the infectious process. © 2014 The Authors.
What makes up plant genomes: The vanishing line between transposable elements and genes.
Zhao, Dongyan; Ferguson, Ann A; Jiang, Ning
2016-02-01
The ultimate source of evolution is mutation. As the largest component in plant genomes, transposable elements (TEs) create numerous types of mutations that cannot be mimicked by other genetic mechanisms. When TEs insert into genomic sequences, they influence the expression of nearby genes as well as genes unlinked to the insertion. TEs can duplicate, mobilize, and recombine normal genes or gene fragments, with the potential to generate new genes or modify the structure of existing genes. TEs also donate their transposase coding regions for cellular functions in a process called TE domestication. Despite the host defense against TE activity, a subset of TEs survived and thrived through discreet selection of transposition activity, target site, element size, and the internal sequence. Finally, TEs have established strategies to reduce the efficacy of host defense system by increasing the cost of silencing TEs. This review discusses the recent progress in the area of plant TEs with a focus on the interaction between TEs and genes. Copyright © 2015 Elsevier B.V. All rights reserved.
Suárez, Gabriel A; Renda, Brian A; Dasgupta, Aurko; Barrick, Jeffrey E
2017-09-01
The genomes of most bacteria contain mobile DNA elements that can contribute to undesirable genetic instability in engineered cells. In particular, transposable insertion sequence (IS) elements can rapidly inactivate genes that are important for a designed function. We deleted all six copies of IS1236 from the genome of the naturally transformable bacterium Acinetobacter baylyi ADP1. The natural competence of ADP1 made it possible to rapidly repair deleterious point mutations that arose during strain construction. In the resulting ADP1-ISx strain, the rates of mutations inactivating a reporter gene were reduced by 7- to 21-fold. This reduction was higher than expected from the incidence of new IS1236 insertions found during a 300-day mutation accumulation experiment with wild-type ADP1 that was used to estimate spontaneous mutation rates in the strain. The extra improvement appears to be due in part to eliminating large deletions caused by IS1236 activity, as the point mutation rate was unchanged in ADP1-ISx. Deletion of an error-prone polymerase (dinP) and a DNA damage response regulator (umuDAb [the umuD gene of A. baylyi]) from the ADP1-ISx genome did not further reduce mutation rates. Surprisingly, ADP1-ISx exhibited increased transformability. This improvement may be due to less autolysis and aggregation of the engineered cells than of the wild type. Thus, deleting IS elements from the ADP1 genome led to a greater than expected increase in evolutionary reliability and unexpectedly enhanced other key strain properties, as has been observed for other clean-genome bacterial strains. ADP1-ISx is an improved chassis for metabolic engineering and other applications. IMPORTANCE Acinetobacter baylyi ADP1 has been proposed as a next-generation bacterial host for synthetic biology and genome engineering due to its ability to efficiently take up DNA from its environment during normal growth. 
From the ADP1 genome, we deleted transposable elements that are capable of copying themselves, inserting into other genes, and thereby inactivating them. The resulting "clean-genome" ADP1-ISx strain exhibited larger reductions in the rates of inactivating mutations than expected from spontaneous mutation rates measured via whole-genome sequencing of lineages evolved under relaxed selection. Surprisingly, we also found that IS element activity reduces transformability and is a major cause of cell aggregation and death in wild-type ADP1 grown under normal laboratory conditions. More generally, our results demonstrate that domesticating a bacterial genome by removing mobile DNA elements that have accumulated during evolution in the wild can have unanticipated benefits. Copyright © 2017 American Society for Microbiology.
The Small Nuclear Genomes of Selaginella Are Associated with a Low Rate of Genome Size Evolution.
Baniaga, Anthony E; Arrigo, Nils; Barker, Michael S
2016-06-03
The haploid nuclear genome size (1C DNA) of vascular land plants varies over several orders of magnitude. Much of this observed diversity in genome size is due to the proliferation and deletion of transposable elements. To date, all vascular land plant lineages with extremely small nuclear genomes represent recently derived states, having ancestors with much larger genome sizes. The Selaginellaceae represent an ancient lineage with extremely small genomes. It is unclear how small nuclear genomes evolved in Selaginella. We compared the rates of nuclear genome size evolution in Selaginella and major vascular plant clades in a comparative phylogenetic framework. For the analyses, we collected 29 new flow cytometry estimates of haploid genome size in Selaginella to augment publicly available data. Selaginella possess some of the smallest known haploid nuclear genome sizes, as well as the lowest rate of genome size evolution observed across all vascular land plants included in our analyses. Additionally, our analyses provide strong support for a history of haploid nuclear genome size stasis in Selaginella. Our results indicate that Selaginella, similar to other early diverging lineages of vascular land plants, has relatively low rates of genome size evolution. Further, our analyses highlight that a rapid transition to a small genome size is only one route to an extremely small genome. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Yu, Danna; Fang, Xindong; Storey, Kenneth B; Zhang, Yongpu; Zhang, Jiayong
2016-05-01
The complete mitochondrial genomes of the yellow-bellied slider (Trachemys scripta scripta) and the anoxia-tolerant red-eared slider (Trachemys scripta elegans) turtles were sequenced to analyze gene arrangement. The complete mt genomes of T. s. scripta and T. s. elegans were circular molecules of 16,791 bp and 16,810 bp in length, respectively, and included an A + 1 frameshift insertion in ND3 and ND4L genes. The AT content of the overall base composition of T. s. scripta and T. s. elegans was 61.2%. Nucleotide sequence divergence of the mt-genome (p distance) between T. s. scripta and T. s. elegans was 0.4%. A detailed comparison between the mitochondrial genomes of the two subspecies is shown.
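The two summary statistics quoted in this abstract, AT content and p distance, are simple proportions. The following is a minimal sketch of how they are commonly computed from aligned sequences; it is not the authors' pipeline. The function names and the toy sequences are hypothetical, and skipping sites with gaps or ambiguous bases (pairwise deletion) is an assumed convention.

```python
def at_content(seq: str) -> float:
    """Fraction of A/T bases among unambiguous A, C, G, T positions."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("no unambiguous bases")
    return sum(base in "AT" for base in counted) / len(counted)


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites with a gap or ambiguous base in either sequence are skipped
    (pairwise deletion); this convention is an assumption of the sketch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            if x != y:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy aligned fragments, invented for illustration only.
seq_a = "ATGCATTA-A"
seq_b = "ATGCGTTACA"
print(f"AT content of seq_a: {at_content(seq_a):.3f}")
print(f"p distance: {p_distance(seq_a, seq_b):.3f}")
```

Under this definition, a p distance of 0.4% over genomes of roughly 16.8 kb corresponds to differences at a few dozen aligned sites.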
Anton, Brian P.; Mongodin, Emmanuel F.; Agrawal, Sonia; Fomenkov, Alexey; Byrd, Devon R.; Roberts, Richard J.; Raleigh, Elisabeth A.
2015-01-01
We report the complete sequence of ER2796, a laboratory strain of Escherichia coli K-12 that is completely defective in DNA methylation. Because of its lack of any native methylation, it is extremely useful as a host into which heterologous DNA methyltransferase genes can be cloned and the recognition sequences of their products deduced by Pacific Biosciences Single-Molecule Real Time (SMRT) sequencing. The genome was itself sequenced from a long-insert library using the SMRT platform, resulting in a single closed contig devoid of methylated bases. Comparison with K-12 MG1655, the first E. coli K-12 strain to be sequenced, shows an essentially co-linear relationship with no major rearrangements despite many generations of laboratory manipulation. The comparison revealed a total of 41 insertions and deletions, and 228 single base pair substitutions. In addition, the long-read approach facilitated the surprising discovery of four gene conversion events, three involving rRNA operons and one between two cryptic prophages. Such events thus contribute both to genomic homogenization and to bacteriophage diversification. As one of relatively few laboratory strains of E. coli to be sequenced, the genome also reveals the sequence changes underlying a number of classical mutant alleles including those affecting the various native DNA methylation systems. PMID:26010885
Davidson, Rebecca M.; Hasan, Nabeeh A.; de Moura, Vinicius Calado Nogueira; Duarte, Rafael Silva; Jackson, Mary; Strong, Michael
2013-01-01
Rapidly growing, non-tuberculous mycobacteria (NTM) in the Mycobacterium abscessus (MAB) species are emerging pathogens that cause various diseases including skin and respiratory infections. The species has undergone recent taxonomic nomenclature refinement, and is currently recognized as two subspecies, M. abscessus subsp. abscessus (MAB-A) and M. abscessus subsp. bolletii (MAB-B). The recently reported outbreaks of MAB-B in surgical patients in Brazil from 2004 to 2009 and in cystic fibrosis (CF) patients in the United Kingdom (UK) from 2006 to 2012 underscore the need to investigate the genetic diversity of clinical MAB strains. To this end, we sequenced the genomes of two Brazilian MAB-B epidemic isolates (CRM-0019 and CRM-0020) derived from an outbreak of skin infections in Rio de Janeiro, two unrelated MAB strains from patients with pulmonary infections in the United States (US) (NJH8 and NJH11) and one type MAB-B strain (CCUG 48898) and compared them to 25 publicly available genomes of globally diverse MAB strains. Genome-wide analyses of 27,598 core genome single nucleotide polymorphisms (SNPs) revealed that the two Brazilian-derived CRM strains are nearly indistinguishable from one another and are more closely related to UK outbreak isolates infecting CF patients than to strains from the US, Malaysia or France. Comparative genomic analyses of six closely related outbreak strains revealed geographic-specific large-scale insertion/deletion variation that corresponds to bacteriophage insertions and recombination hotspots. Our study integrates new genome sequence data with existing genomic information to explore the global diversity of infectious M. abscessus isolates and to compare clinically relevant outbreak strains from different continents. PMID:24055961
Grandi, Nicole; Cadeddu, Marta; Blomberg, Jonas; Tramontano, Enzo
2016-09-09
Human endogenous retroviruses (HERVs) are ancient sequences integrated in germ line cells and vertically transmitted to offspring, constituting about 8% of our genome. Over time, HERVs accumulated mutations that compromised their coding capacity. A prominent exception is HERV-W locus 7q21.2, producing a functional Env protein (Syncytin-1) coopted for placental syncytiotrophoblast formation. While expression of HERV-W sequences has been investigated for their correlation to disease, an exhaustive description of the group composition and characteristics is still not available, and current HERV-W group information derives from studies published a few years ago that used the rough assemblies of the human genome available at that time. This hampers the comparison and correlation with current human genome assemblies. In the present work we identified and described in detail the distribution and genetic composition of 213 HERV-W elements. The bioinformatics analysis led to the characterization of several previously unreported features and provided a phylogenetic classification of two main subgroups with different age and structural characteristics. New facts on the genomic context of HERV-W insertion and co-localization with sequences putatively involved in disease development are also reported. The present work is a detailed overview of the HERV-W contribution to the human genome and provides a robust genetic background useful to clarify the role of HERV-W in pathologies with poorly understood etiology, representing, to our knowledge, the most complete and exhaustive HERV-W dataset to date.
A very early-branching Staphylococcus aureus lineage lacking the carotenoid pigment staphyloxanthin.
Holt, Deborah C; Holden, Matthew T G; Tong, Steven Y C; Castillo-Ramirez, Santiago; Clarke, Louise; Quail, Michael A; Currie, Bart J; Parkhill, Julian; Bentley, Stephen D; Feil, Edward J; Giffard, Philip M
2011-01-01
Here we discuss the evolution of the northern Australian Staphylococcus aureus isolate MSHR1132 genome. MSHR1132 belongs to the divergent clonal complex 75 lineage. The average nucleotide divergence between orthologous genes in MSHR1132 and typical S. aureus is approximately sevenfold greater than the maximum divergence observed in this species to date. MSHR1132 has a small accessory genome, which includes the well-characterized genomic islands, νSaα and νSaβ, suggesting that these elements were acquired well before the expansion of the typical S. aureus population. Other mobile elements show mosaic structure (the prophage ϕSa3) or evidence of recent acquisition from a typical S. aureus lineage (SCCmec, ICE6013 and plasmid pMSHR1132). There are two differences in gene repertoire compared with typical S. aureus that may be significant clues as to the genetic basis underlying the successful emergence of S. aureus as a pathogen. First, MSHR1132 lacks the genes for production of staphyloxanthin, the carotenoid pigment that confers upon S. aureus its characteristic golden color and protects against oxidative stress. The lack of pigment was demonstrated in 126 of 126 CC75 isolates. Second, a mobile clustered regularly interspaced short palindromic repeat (CRISPR) element is inserted into orfX of MSHR1132. Although common in other staphylococcal species, these elements are very rare within S. aureus and may impact accessory genome acquisition. The CRISPR spacer sequences reveal a history of attempted invasion by known S. aureus mobile elements. There is a case for the creation of a new taxon to accommodate this and related isolates.
BESST--efficient scaffolding of large fragmented assemblies.
Sahlin, Kristoffer; Vezzi, Francesco; Nystedt, Björn; Lundeberg, Joakim; Arvestad, Lars
2014-08-15
The use of short reads from High Throughput Sequencing (HTS) techniques is now commonplace in de novo assembly. Yet, obtaining contiguous assemblies from short reads is challenging, thus making scaffolding an important step in the assembly pipeline. Different algorithms have been proposed but many of them use the number of read pairs supporting a linking of two contigs as an indicator of reliability. This reasoning is intuitive, but fails to account for variation in link count due to contig features. We have also noted that published scaffolders are only evaluated on small datasets using output from only one assembler. Two issues arise from this. Firstly, some of the available tools are not well suited for complex genomes. Secondly, these evaluations provide little support for inferring a software's general performance. We propose a new algorithm, implemented in a tool called BESST, which can scaffold genomes of all sizes and complexities and was used to scaffold the genome of P. abies (20 Gbp). We performed a comprehensive comparison of BESST against the most popular stand-alone scaffolders on a large variety of datasets. Our results confirm that some of the popular scaffolders are not practical to run on complex datasets. Furthermore, no single stand-alone scaffolder outperforms the others on all datasets. However, BESST compares favorably with the other tested scaffolders on GAGE datasets and, moreover, outperforms the other methods when library insert size distribution is wide. We conclude from our results that information sources other than the quantity of links, as is commonly used, can provide useful information about genome structure when scaffolding.
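The link-count heuristic that this abstract critiques can be illustrated with a short sketch. This shows only the naive approach (count read pairs that bridge two contigs, then keep pairs above a threshold); it is not BESST's algorithm, which the abstract does not specify. The function names, the input format, and the threshold are hypothetical.

```python
from collections import Counter


def count_links(read_pairs):
    """Count read pairs whose two mates map to different contigs.

    read_pairs: iterable of (contig_of_mate1, contig_of_mate2) tuples,
    a simplified stand-in for parsed alignment records.
    Returns a Counter keyed by the unordered contig pair.
    """
    links = Counter()
    for contig_1, contig_2 in read_pairs:
        if contig_1 != contig_2:  # pairs within one contig carry no link
            links[tuple(sorted((contig_1, contig_2)))] += 1
    return links


def supported_links(links, min_links):
    """Keep contig pairs joined by at least min_links read pairs."""
    return sorted(pair for pair, n in links.items() if n >= min_links)


# Toy mappings, invented for illustration only.
pairs = [("c1", "c2"), ("c2", "c1"), ("c1", "c1"), ("c2", "c3")]
links = count_links(pairs)
print(supported_links(links, min_links=2))
```

A fixed threshold like this treats every contig pair alike. As the abstract notes, the number of links also varies with contig features, so a true adjacency involving, for example, a short contig may be discarded simply because it attracts fewer read pairs.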
Quantitative Effects of P Elements on Hybrid Dysgenesis in Drosophila Melanogaster
Rasmusson, K. E.; Simmons, M. J.; Raymond, J. D.; McLarnon, C. F.
1990-01-01
Genetic analyses involving chromosomes from seven inbred lines derived from a single M' strain were used to study the quantitative relationships between the incidence and severity of P-M hybrid dysgenesis and the number of genomic P elements. In four separate analyses, the mutability of sn(w), a P element-insertion mutation of the X-linked singed locus, was found to be inversely related to the number of autosomal P elements. Since sn(w) mutability is caused by the action of the P transposase, this finding supports the hypothesis that genomic P elements titrate the transposase present within a cell. Other analyses demonstrated that autosomal transmission ratios were distorted by P element action. In these analyses, the amount of distortion against an autosome increased more or less linearly with the number of P elements carried by the autosome. Additional analyses showed that the magnitude of this distortion was reduced when a second P element-containing autosome was present in the genome. This reduction could adequately be explained by transposase titration; there was no evidence that it was due to repressor molecules binding to P elements and inhibiting their movement. The influence of genomic P elements on the incidence of gonadal dysgenesis was also investigated. Although no simple relationship between the number of P elements and the incidence of the trait could be discerned, it was clear that even a small number of elements could increase the incidence markedly. The failure to find a quantitative relationship between P element number and the incidence of gonadal dysgenesis probably reflects the complex etiology of this trait. PMID:2155853
Functional impact of the human mobilome.
Babatz, Timothy D; Burns, Kathleen H
2013-06-01
The human genome is replete with interspersed repetitive sequences derived from the propagation of mobile DNA elements. Three families of human retrotransposons remain active today: LINE1, Alu, and SVA elements. Since 1988, de novo insertions at previously recognized disease loci have been shown to generate highly penetrant alleles in Mendelian disorders. Only recently has the extent of germline-transmitted retrotransposon insertion polymorphism (RIP) in human populations been fully realized. Also exciting are recent studies of somatic retrotransposition in human tissues and reports of tumor-specific insertions, suggesting roles in tissue heterogeneity and tumorigenesis. Here we discuss mobile elements in human disease with an emphasis on exciting developments from the last several years. Copyright © 2013 Elsevier Ltd. All rights reserved.