Science.gov

Sample records for genomic integration mediated

  1. Exogenous gene integration mediated by genome editing technologies in zebrafish.

    PubMed

    Morita, Hitoshi; Taimatsu, Kiyohito; Yanagi, Kanoko; Kawahara, Atsuo

    2017-03-08

    Genome editing technologies, such as transcription activator-like effector nuclease (TALEN) and the clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein (Cas) systems, can induce DNA double-strand breaks (DSBs) at the targeted genomic locus, leading to frameshift-mediated gene disruption in the process of DSB repair. Recently, technology-induced DSBs and their subsequent repair have been applied to integrate exogenous genes into the targeted genomic locus in various model organisms. In addition to conventional knock-in technology mediated by homology-directed repair (HDR), novel knock-in technologies using refined donor vectors have been developed that rely on other DSB repair mechanisms, including non-homologous end joining (NHEJ) and microhomology-mediated end joining (MMEJ). These improved knock-in technologies should therefore make it possible to freely modify the genomes of model organisms.

  2. Enhanced CRISPR/Cas9-mediated biallelic genome targeting with dual surrogate reporter-integrated donors.

    PubMed

    Wu, Yun; Xu, Kun; Ren, Chonghua; Li, Xinyi; Lv, Huijiao; Han, Furong; Wei, Zehui; Wang, Xin; Zhang, Zhiying

    2017-03-01

    The clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) system has recently emerged as a simple, yet powerful genome engineering tool, which has been widely used for genome modification in various organisms and cell types. However, screening biallelic genome-modified cells is often time-consuming and technically challenging. In this study, we incorporated two different surrogate reporter cassettes into paired donor plasmids, which were used as both the surrogate reporters and the knock-in donors. By applying our dual surrogate reporter-integrated donor system, we demonstrate high frequency of CRISPR/Cas9-mediated biallelic genome integration in both human HEK293T and porcine PK15 cells (34.09% and 18.18%, respectively). Our work provides a powerful genetic tool for assisting the selection and enrichment of cells with targeted biallelic genome modification. © 2017 Federation of European Biochemical Societies.

  3. Altering genomic integrity: heavy metal exposure promotes transposable element-mediated damage

    PubMed Central

    Morales, Maria E.; Servant, Geraldine; Ade, Catherine; Roy-Engel, Astrid M.

    2015-01-01

    Maintenance of genomic integrity is critical for cellular homeostasis and survival. The active transposable elements (TEs), composed primarily of three mobile element lineages (LINE-1, Alu, and SVA), comprise approximately 30% of the mass of the human genome. For the past two decades, studies have shown that TEs contribute significantly to genetic instability and that TE-caused damage is associated with genetic diseases and cancer. Different environmental exposures, including several heavy metals, influence how TEs interact with their host genome, increasing their negative impact. This mini-review provides some basic knowledge on TEs and their contribution to disease, and an overview of the current knowledge on how heavy metals influence TE-mediated damage. PMID:25774044

  4. miR146a-mediated targeting of FANCM during inflammation compromises genome integrity

    PubMed Central

    Kim, Hyun Hee; Lee, Hyun-Seo; Jun, Semo; Cha, Jeong-Heon; Kee, Younghoon; You, Ho Jin; Lee, Jung-Hee

    2016-01-01

    Inflammation is a potent inducer of tumorigenesis. Increased DNA damage or loss of genome integrity is thought to be one of the mechanisms linking inflammation and cancer development. It has been suggested that NF-κB-induced microRNA-146 (miR146a) may be a mediator of the inflammatory response. Based on our initial observation that miR146a overexpression strongly increases DNA damage, we investigated its potential role as a modulator of DNA repair. Here, we demonstrate that FANCM, a component in the Fanconi Anemia pathway, is a novel target of miR146a. miR146a suppressed FANCM expression by directly binding to the 3′ untranslated region of the gene. miR146a-induced downregulation of FANCM was associated with inhibition of FANCD2 monoubiquitination, reduced DNA homologous recombination repair and checkpoint response, failed recovery from replication stress, and increased cellular sensitivity to cisplatin. These phenotypes were recapitulated when miR146a expression was induced by overexpressing the NF-κB subunit p65/RelA or Helicobacter pylori infection in a human gastric cell line; the phenotypes were effectively reversed with an anti-miR146a antagomir. These results suggest that undesired inflammation events caused by a pathogen or over-induction of miR146a can impair genome integrity via suppression of FANCM. PMID:27351285

  5. Site-specific gene integration in rice genome mediated by the FLP-FRT recombination system.

    PubMed

    Nandy, Soumen; Srivastava, Vibha

    2011-08-01

    Plant transformation based on random integration of foreign DNA often generates complex integration structures. Precision in the integration process is necessary to ensure the formation of full-length, single-copy integration. Site-specific recombination systems are versatile tools for precise genomic manipulations such as DNA excision, inversion or integration. The yeast FLP-FRT recombination system has been widely used for DNA excision in higher plants. Here, we report the use of the FLP-FRT system for efficient targeting of a foreign gene into an engineered genomic site in rice. The transgene vector containing a pair of directly oriented FRT sites was introduced by particle bombardment into cells containing the target locus. FLP activity generated by the co-bombarded FLP gene efficiently separated the transgene construct from the vector backbone and integrated the backbone-free construct into the target site. Strong FLP activity, derived from the enhanced FLP protein, FLPe, was important for successful site-specific integration (SSI). The majority of the transgenic events contained a precise integration and expressed the transgene. Interestingly, each transgenic event lacked the co-bombarded FLPe gene, suggesting reversion of the integration structure in the presence of constitutive FLPe expression. Progeny of the precise transgenic lines inherited the stable SSI locus and expressed the transgene. This work demonstrates the application of the FLP-FRT system for site-specific gene integration in plants using rice as a model.

  6. In vivo genome editing via CRISPR/Cas9 mediated homology-independent targeted integration.

    PubMed

    Suzuki, Keiichiro; Tsunekawa, Yuji; Hernandez-Benitez, Reyna; Wu, Jun; Zhu, Jie; Kim, Euiseok J; Hatanaka, Fumiyuki; Yamamoto, Mako; Araoka, Toshikazu; Li, Zhe; Kurita, Masakazu; Hishida, Tomoaki; Li, Mo; Aizawa, Emi; Guo, Shicheng; Chen, Song; Goebl, April; Soligalla, Rupa Devi; Qu, Jing; Jiang, Tingshuai; Fu, Xin; Jafari, Maryam; Esteban, Concepcion Rodriguez; Berggren, W Travis; Lajara, Jeronimo; Nuñez-Delicado, Estrella; Guillen, Pedro; Campistol, Josep M; Matsuzaki, Fumio; Liu, Guang-Hui; Magistretti, Pierre; Zhang, Kun; Callaway, Edward M; Zhang, Kang; Belmonte, Juan Carlos Izpisua

    2016-12-01

    Targeted genome editing via engineered nucleases is an exciting area of biomedical research and holds potential for clinical applications. Despite rapid advances in the field, in vivo targeted transgene integration is still infeasible because current tools are inefficient, especially for non-dividing cells, which compose most adult tissues. This poses a barrier for uncovering fundamental biological principles and developing treatments for a broad range of genetic disorders. Based on clustered regularly interspaced short palindromic repeat/Cas9 (CRISPR/Cas9) technology, here we devise a homology-independent targeted integration (HITI) strategy, which allows for robust DNA knock-in in both dividing and non-dividing cells in vitro and, more importantly, in vivo (for example, in neurons of postnatal mammals). As a proof of concept of its therapeutic potential, we demonstrate the efficacy of HITI in improving visual function using a rat model of the retinal degeneration condition retinitis pigmentosa. The HITI method presented here establishes new avenues for basic research and targeted gene therapies.

  7. In vivo genome editing via CRISPR/Cas9 mediated homology-independent targeted integration

    PubMed Central

    Suzuki, Keiichiro; Tsunekawa, Yuji; Hernandez-Benitez, Reyna; Wu, Jun; Zhu, Jie; Kim, Euiseok J.; Hatanaka, Fumiyuki; Yamamoto, Mako; Araoka, Toshikazu; Li, Zhe; Kurita, Masakazu; Hishida, Tomoaki; Li, Mo; Aizawa, Emi; Guo, Shicheng; Chen, Song; Goebl, April; Soligalla, Rupa Devi; Qu, Jing; Jiang, Tingshuai; Fu, Xin; Jafari, Maryam; Esteban, Concepcion Rodriguez; Berggren, W. Travis; Lajara, Jeronimo; Nuñez-Delicado, Estrella; Guillen, Pedro; Campistol, Josep M.; Matsuzaki, Fumio; Liu, Guang-Hui; Magistretti, Pierre; Zhang, Kun; Callaway, Edward M.; Zhang, Kang; Belmonte, Juan Carlos Izpisua

    2017-01-01

    Targeted genome editing via engineered nucleases is an exciting area of biomedical research and holds potential for clinical applications. Despite rapid advances in the field, in vivo targeted transgene integration is still infeasible because current tools are inefficient, especially for non-dividing cells, which compose most adult tissues. This poses a barrier for uncovering fundamental biological principles and developing treatments for a broad range of genetic disorders. Based on clustered regularly interspaced short palindromic repeat/Cas9 (CRISPR/Cas9) technology, here we devise a homology-independent targeted integration (HITI) strategy, which allows for robust DNA knock-in in both dividing and non-dividing cells in vitro and, more importantly, in vivo (for example, in neurons of postnatal mammals). As a proof of concept of its therapeutic potential, we demonstrate the efficacy of HITI in improving visual function using a rat model of the retinal degeneration condition retinitis pigmentosa. The HITI method presented here establishes new avenues for basic research and targeted gene therapies. PMID:27851729

  8. Exogenous gene can be integrated into Nosema bombycis genome by mediating with a non-transposon vector.

    PubMed

    Guo, Rui; Cao, Guangli; Lu, Yahong; Xue, Renyu; Kumar, Dhiraj; Hu, Xiaolong; Gong, Chengliang

    2016-08-01

    Nosema bombycis, a microsporidium, is the pathogen that causes pebrine disease in silkworms, and its genomic DNA sequence has been determined. Thus far, gene function in microsporidia, including N. bombycis, could not be studied by gain- or loss-of-function approaches. In the present study, we aimed to construct transgenic N. bombycis. Hemocytes of infected silkworms were transfected in vivo with the non-transposon vector pIZT/V5-His, and the blood, in which hemocytes with green fluorescence could be observed, was added to cultured BmN cells. Furthermore, normal BmN cells were infected with germinated N. bombycis, and the infected cells were transfected with pIZT/V5-His. Continuous fluorescence observation revealed N. bombycis with green fluorescence in some N. bombycis-infected cells. Genomic DNA extracted from purified N. bombycis spores was used as the template for PCR with a pair of primers specific for the green fluorescent protein (GFP) gene, and a specific product representing the gfp gene was amplified. Detection of the GFP protein by Western blotting also demonstrated that the gfp gene was inserted into the genome of N. bombycis. These results show that an exogenous gene can be integrated into the N. bombycis genome by means of a non-transposon vector. Our research not only offers a strategy for studying gene function in N. bombycis but also provides an important reference for constructing genetically modified microsporidia for the biocontrol of pests.

  9. Analyses of germline, chromosomally integrated human herpesvirus 6A and B genomes indicate emergent infection and new inflammatory mediators.

    PubMed

    Tweedy, J; Spyrou, M A; Hubacek, P; Kuhl, U; Lassner, D; Gompels, U A

    2015-02-01

    Human herpesvirus-6A (HHV-6A) is rarer than HHV-6B in many infant populations. However, they are similarly prevalent as germline, chromosomally integrated genomes (ciHHV-6A/B). This integrated form affects 0.1-1 % of the human population, where potentially virus gene expression could be in every cell, although virus relationships and health effects are not clear. In a Czech/German patient cohort ciHHV-6A was more common and diverse than ciHHV-6B. Quantitative PCR, nucleotide sequencing and telomeric integration site amplification characterized ciHHV-6 in 44 German myocarditis/cardiomyopathy and Czech malignancy/inflammatory disease (MI) patients plus donors. Comparisons were made to sequences from global virus reference strains, and blood DNA from childhood-infections from Zambia (HHV-6A mainly) and Japan (HHV-6B). The MI cohort were 86 % (18/21) ciHHV-6A, the cardiac cohort 65 % (13/20) ciHHV-6B, suggesting different disease links. Reactivation was supported by findings of 1) recombination between ciHHV-6A and HHV-6B genes in 20 % (4/21) of the MI cohort; 2) expression in a patient subset, of early/late transcripts from the inflammatory mediator genes chemokine receptor U51 and chemokine U83, both identical to ciHHV-6A DNA sequences; and 3) superinfection shown by deep sequencing identifying minor virus-variants only in ciHHV-6A, which expressed transcripts, indicating virus infection reactivates latent ciHHV-6A. Half the MI cohort had more than two copies per cell, median 5.2, indicative of reactivation. Remarkably, the integrated genomes encoded the secreted-active form of virus chemokines, rare in virus from childhood-infections. This shows integrated virus genomes can contribute new human genes with links to inflammatory pathology and supports ciHHV-6A reactivation as a source for emergent infection.

  10. Statistical Methods in Integrative Genomics.

    PubMed

    Richardson, Sylvia; Tseng, George C; Sun, Wei

    2016-06-01

    Statistical methods in integrative genomics aim to answer important biology questions by jointly analyzing multiple types of genomic data (vertical integration) or aggregating the same type of data across multiple studies (horizontal integration). In this article, we introduce different types of genomic data and data resources, and then review statistical methods of integrative genomics, with emphasis on the motivation and rationale of these methods. We conclude with some summary points and future research directions.
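    As an illustration of the "horizontal integration" idea described in this record (aggregating the same type of data across multiple studies), the sketch below pools per-study effect estimates with inverse-variance weights, a standard fixed-effect meta-analysis. It is not taken from the article; the function name and the example numbers are hypothetical.

```python
import math


def fixed_effect_meta(estimates, std_errors):
    """Pool per-study effect estimates using inverse-variance weights.

    estimates  -- effect estimate for one gene from each study
    std_errors -- standard error of each estimate (same order)
    Returns (pooled_estimate, pooled_standard_error).
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate")
    if any(se <= 0 for se in std_errors):
        raise ValueError("standard errors must be positive")

    # More precise studies (smaller standard error) get larger weights.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)

    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


if __name__ == "__main__":
    # Hypothetical log fold-changes for one gene in three studies.
    effects = [0.30, 0.10, 0.25]
    errors = [0.10, 0.20, 0.10]
    pooled, pooled_se = fixed_effect_meta(effects, errors)
    print(f"pooled effect = {pooled:.3f}, standard error = {pooled_se:.3f}")
```

    The fixed-effect model assumes that all studies estimate the same underlying effect; when studies are heterogeneous, a random-effects model would usually be preferred.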

  11. Statistical Methods in Integrative Genomics

    PubMed Central

    Richardson, Sylvia; Tseng, George C.; Sun, Wei

    2016-01-01

    Statistical methods in integrative genomics aim to answer important biology questions by jointly analyzing multiple types of genomic data (vertical integration) or aggregating the same type of data across multiple studies (horizontal integration). In this article, we introduce different types of genomic data and data resources, and then review statistical methods of integrative genomics, with emphasis on the motivation and rationale of these methods. We conclude with some summary points and future research directions. PMID:27482531

  12. Analysis of illegitimate genomic integration mediated by zinc-finger nucleases: implications for specificity of targeted gene correction

    PubMed Central

    2010-01-01

    Background: Formation of site-specific genomic double-strand breaks (DSBs), induced by the expression of a pair of engineered zinc-finger nucleases (ZFNs), dramatically increases the rates of homologous recombination (HR) between a specific genomic target and a donor plasmid. However, for the safe use of ZFN-induced HR in practical applications, possible adverse effects of the technology such as cytotoxicity and genotoxicity need to be well understood. In this work, off-target activity of a pair of ZFNs has been examined by measuring the ratio between HR and illegitimate genomic integration in cells that are growing exponentially, and in cells that have been arrested in the G2/M phase. Results: A reporter cell line that contained consensus ZFN binding sites in an enhanced green fluorescent protein (EGFP) reporter gene was used to measure ratios between HR and non-homologous integration of a plasmid template. Both in human cells (HEK 293) containing the consensus ZFN binding sites and in cells lacking the ZFN binding sites, a 3.5-fold increase in the level of illegitimate integration was observed upon ZFN expression. Since the reporter gene containing the consensus ZFN target sites was found to be intact in cells where illegitimate integration had occurred, increased rates of illegitimate integration most likely resulted from the formation of off-target genomic DSBs. Additionally, in a fraction of the ZFN-treated cells the co-occurrence of both specific HR and illegitimate integration was observed. As a means to minimize unspecific effects, cell cycle manipulation of the target cells by induction of a transient G2/M cell cycle arrest was shown to stimulate the activity of HR while having little effect on the levels of illegitimate integration, thus resulting in a nearly eight-fold increase in the ratio between the two processes. Conclusions: The demonstration that ZFN expression, in addition to stimulating specific gene targeting by HR, leads to increased rates of

  13. Histones and genome integrity.

    PubMed

    Williamson, Wes D; Pinto, Ines

    2012-01-01

    Chromosomes undergo extensive structural rearrangements during the cell cycle, from the most open chromatin state required for DNA replication to the highest level of compaction and condensation essential for mitotic segregation of sister chromatids. It is now widely accepted that chromatin is a highly dynamic structure that participates in all DNA-related functions, including transcription, DNA replication, repair, and mitosis; hence, histones have emerged as key players in these cellular processes. We review here the studies that implicate histones in functions that affect the chromosome cycle, defined as the cellular processes involved in the maintenance, replication, and segregation of chromosomal DNA. Disruption of the chromosome cycle affects the integrity of the cellular genome, leading to aneuploidy, polyploidy or cell death. Histone stoichiometry, mutations that affect the structure of the nucleosome core particle, and mutations that affect the structure and/or modifications of the histone tails, all have a direct impact on the fidelity of chromosome transmission and the integrity of the genome.

  14. DNA-PK-mediated phosphorylation of EZH2 regulates the DNA damage-induced apoptosis to maintain T-cell genomic integrity

    PubMed Central

    Wang, Y; Sun, H; Wang, J; Wang, H; Meng, L; Xu, C; Jin, M; Wang, B; Zhang, Y; Zhang, Y; Zhu, T

    2016-01-01

    EZH2 is a histone methyltransferase whose functions in stem cells and tumor cells are well established. Accumulating evidence shows that EZH2 has critical roles in T cells and could be a promising therapeutic target for several immune diseases. To further reveal novel functions of EZH2 in human T cells, protein co-immunoprecipitation combined with mass spectrometry was conducted, and several previously unknown EZH2-interacting proteins were identified. Of them, we focused on a DNA damage responsive protein, Ku80, because of the limited knowledge regarding EZH2 in the DNA damage response. We then demonstrated that, instead of being methylated by EZH2, Ku80 bridges the interaction between the DNA-dependent protein kinase (DNA-PK) complex and EZH2, thus facilitating EZH2 phosphorylation. Moreover, EZH2 histone methyltransferase activity was enhanced when Ku80 was knocked down or DNA-PK activity was inhibited, suggesting that DNA-PK-mediated EZH2 phosphorylation impairs EZH2 histone methyltransferase activity. On the other hand, EZH2 inhibition increased the DNA damage level at the late phase of T-cell activation, suggesting that EZH2 is involved in genomic integrity maintenance. In conclusion, our study is the first to demonstrate that EZH2 is phosphorylated by the DNA damage responsive complex DNA-PK and regulates DNA damage-mediated T-cell apoptosis, which reveals a novel functional crosstalk between epigenetic regulation and genomic integrity. PMID:27468692

  15. Integration of transcriptomic and genomic data suggests candidate mechanisms for APOE4-mediated pathogenic action in Alzheimer’s disease

    PubMed Central

    Caberlotto, Laura; Marchetti, Luca; Lauria, Mario; Scotti, Marco; Parolo, Silvia

    2016-01-01

    Among the genetic factors known to increase the risk of late-onset Alzheimer’s disease (AD), the presence of the apolipoprotein E4 (APOE4) allele has been recognized as the one with the strongest effect. However, despite decades of research, the pathogenic role of APOE4 in Alzheimer’s disease has not yet been clearly elucidated. In order to investigate the pathogenic action of APOE4, we applied a systems biology approach to the analysis of transcriptomic and genomic data of APOE44 vs. APOE33 allele carriers affected by Alzheimer’s disease. Network analysis combined with a novel technique for biomarker computation allowed the identification of an alteration in aging-associated processes such as inflammation, oxidative stress and metabolic pathways, indicating that APOE4 possibly accelerates pathological processes physiologically induced by aging. Subsequent integration with genomic data indicates that the Notch pathway could be the nodal molecular mechanism altered in APOE44 allele carriers with Alzheimer’s disease. Interestingly, PSEN1 and APP, genes whose mutations are known to be linked to early-onset Alzheimer’s disease, are closely linked to this pathway. In conclusion, the role of APOE4 in inflammation and oxidation through the Notch signaling pathway could be crucial in elucidating the risk factors of Alzheimer’s disease. PMID:27585646

  16. Integrating genomics into Eucalyptus breeding.

    PubMed

    Grattapaglia, Dario

    2004-09-30

    The advent of high throughput genomic technologies has opened new perspectives in the speed, scale and detail with which one can investigate genes, genomes and complex traits in Eucalyptus species. A genomic approach to a more detailed understanding of important metabolic and physiological processes, which affect tree growth and stress resistance, and the identification of genes and their allelic variants, which determine the major chemical and physical features of wood properties, should eventually lead to new opportunities for directed genetic modifications of far-reaching economic impact in forest industry. It should be kept in mind, however, that basic breeding strategies, coupled with sophisticated quantitative methods, breeder's experience and breeder's intuition, will continue to generate significant genetic gains and have a clear measurable impact on production forestry. Even with a much more global view of genetic processes, genomics will only succeed in contributing to the development of improved industrial forests if it is strongly interconnected with intensive fieldwork and creative breeding. Integrated genomic projects involving multi-species expressed sequence tag sequencing and quantitative trait locus detection, single nucleotide polymorphism discovery for association mapping, and the development of a gene-rich physical map for the Eucalyptus genome will quickly move toward linking phenotypes to genes that control the wood formation processes that define industrial-level traits. Exploiting the full power of the superior natural phenotypic variation in wood properties found in Eucalyptus genetic resources will undoubtedly be a key factor to reach this goal.

  17. Integrity Through Mediated Interfaces

    DTIC Science & Technology

    2004-12-01

    1. Scope This contract was aimed at developing the broad set of technologies ...modifications to the document. It was based on our Instrumented Connectors technology that enables external mediators to monitor and respond to...it was recognized that our Instrumented Connector technology could also be used to protect a computer from malicious Email attachments and with the

  18. CRISPR-mediated Ophthalmic Genome Surgery.

    PubMed

    Cho, Galaxy Y; Abdulla, Yazeed; Sengillo, Jesse D; Justus, Sally; Schaefer, Kellie A; Bassuk, Alexander G; Tsang, Stephen H; Mahajan, Vinit B

    2017-09-01

    Clustered regularly interspaced short palindromic repeats (CRISPR) is a genome engineering system with great potential for clinical applications due to its versatility and programmability. This review highlights the development and use of CRISPR-mediated ophthalmic genome surgery in recent years. Diverse CRISPR techniques are in development to target a wide array of ophthalmic conditions, including inherited and acquired conditions. Preclinical disease modeling and recent successes in gene editing suggest potential efficacy of CRISPR as a therapeutic for inherited conditions. In particular, the treatment of Leber congenital amaurosis with CRISPR-mediated genome surgery is expected to reach clinical trials in the near future. Treatment options for inherited retinal dystrophies are currently limited. CRISPR-mediated genome surgery methods may be able to address this unmet need in the future.

  19. Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine

    PubMed Central

    Elsik, Christine G.; Tayal, Aditi; Diesh, Colin M.; Unni, Deepak R.; Emery, Marianne L.; Nguyen, Hung N.; Hagen, Darren E.

    2016-01-01

    We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. PMID:26578564

  20. Yeast Oligo-mediated Genome Engineering (YOGE)

    PubMed Central

    DiCarlo, JE; Conley, AJ; Penttilä, M; Jäntti, J; Wang, HH; Church, GM

    2014-01-01

    High-frequency oligonucleotide-directed recombination engineering (recombineering) has enabled rapid modification of several prokaryotic genomes to date. Here, we present a method for oligonucleotide-mediated recombineering in the model eukaryote and industrial production host S. cerevisiae, which we call Yeast Oligo-mediated Genome Engineering (YOGE). Through a combination of overexpression and knockouts of relevant genes and optimization of transformation and oligonucleotide designs, we achieve high gene modification frequencies at levels that only require screening of dozens of cells. We demonstrate the robustness of our approach in three divergent yeast strains, including those involved in industrial production of bio-based chemicals. Furthermore, YOGE can be iteratively executed via cycling to generate genomic libraries up to 10^5 individuals at each round for diversity generation. YOGE cycling alone, or in combination with phenotypic selections or endonuclease-based negative genotypic selections, can be used to easily generate modified alleles in yeast populations with high frequencies. PMID:24160921

  1. Genome-wide analysis of HPV integration in human cancers reveals recurrent, focal genomic instability

    PubMed Central

    Akagi, Keiko; Li, Jingfeng; Broutian, Tatevik R.; Padilla-Nash, Hesed; Xiao, Weihong; Jiang, Bo; Rocco, James W.; Teknos, Theodoros N.; Kumar, Bhavna; Wangsa, Danny; He, Dandan; Ried, Thomas; Symer, David E.; Gillison, Maura L.

    2014-01-01

    Genomic instability is a hallmark of human cancers, including the 5% caused by human papillomavirus (HPV). Here we report a striking association between HPV integration and adjacent host genomic structural variation in human cancer cell lines and primary tumors. Whole-genome sequencing revealed HPV integrants flanking and bridging extensive host genomic amplifications and rearrangements, including deletions, inversions, and chromosomal translocations. We present a model of “looping” by which HPV integrant-mediated DNA replication and recombination may result in viral–host DNA concatemers, frequently disrupting genes involved in oncogenesis and amplifying HPV oncogenes E6 and E7. Our high-resolution results shed new light on a catastrophic process, distinct from chromothripsis and other mutational processes, by which HPV directly promotes genomic instability. PMID:24201445

  2. Genomics Portals: integrative web-platform for mining genomics data

    PubMed Central

    2010-01-01

    Background: A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Results: Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc.), and the integration with an extensive knowledge base that can be used in such analysis. Conclusion: The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org. PMID:20070909

  3. A New Era of Genome Integration-Simply Cut and Paste!

    PubMed

    Liu, Zihe; Liang, Youyun; Ang, Ee Lui; Zhao, Huimin

    2017-01-24

    Genome integration is a powerful tool in both basic and applied biological research. However, traditional genome integration, which is typically mediated by homologous recombination, has been constrained by low efficiencies and limited host range. In recent years, the emergence of homing endonucleases and programmable nucleases has greatly enhanced integration efficiencies and allowed alternative integration mechanisms such as nonhomologous end joining and microhomology-mediated end joining, enabling integration in hosts deficient in homologous recombination. In this review, we will highlight recent advances and breakthroughs in genome integration methods made possible by programmable nucleases, and their new applications in synthetic biology and metabolic engineering.

  4. Integrative genomic analysis of CREB defines a critical role for transcription factor networks in mediating the fed/fasted switch in liver

    PubMed Central

    2013-01-01

    Background Metabolic homeostasis in mammals critically depends on the regulation of fasting-induced genes by CREB in the liver. Previous genome-wide analysis has shown that only a small percentage of CREB target genes are induced in response to fasting-associated signaling pathways. The precise molecular mechanisms by which CREB specifically targets these genes in response to alternating hormonal cues remain to be elucidated. Results We performed chromatin immunoprecipitation coupled to high-throughput sequencing of CREB in livers from both fasted and re-fed mice. In order to quantitatively compare the extent of CREB-DNA interactions genome-wide between these two physiological conditions we developed a novel, robust analysis method, termed the ‘single sample independence’ (SSI) test that greatly reduced the number of false-positive peaks. We found that CREB remains constitutively bound to its target genes in the liver regardless of the metabolic state. Integration of the CREB cistrome with expression microarrays of fasted and re-fed mouse livers and ChIP-seq data for additional transcription factors revealed that the gene expression switches between the two metabolic states are associated with co-localization of additional transcription factors at CREB sites. Conclusions Our results support a model in which CREB is constitutively bound to thousands of target genes, and combinatorial interactions between DNA-binding factors are necessary to achieve the specific transcriptional response of the liver to fasting. Furthermore, our genome-wide analysis identifies thousands of novel CREB target genes in liver, and suggests a previously unknown role for CREB in regulating ER stress genes in response to nutrient influx. PMID:23682854

  5. Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine.

    PubMed

    Elsik, Christine G; Tayal, Aditi; Diesh, Colin M; Unni, Deepak R; Emery, Marianne L; Nguyen, Hung N; Hagen, Darren E

    2016-01-04

    We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. Bovine Genome Database: integrated tools for genome annotation and discovery.

    PubMed

    Childers, Christopher P; Reese, Justin T; Sundaram, Jaideep P; Vile, Donald C; Dickens, C Michael; Childs, Kevin L; Salih, Hanni; Bennett, Anna K; Hagen, Darren E; Adelson, David L; Elsik, Christine G

    2011-01-01

    The Bovine Genome Database (BGD; http://BovineGenome.org) strives to improve annotation of the bovine genome and to integrate the genome sequence with other genomics data. BGD includes GBrowse genome browsers, the Apollo Annotation Editor, a quantitative trait loci (QTL) viewer, BLAST databases and gene pages. Genome browsers, available for both scaffold and chromosome coordinate systems, display the bovine Official Gene Set (OGS), RefSeq and Ensembl gene models, non-coding RNA, repeats, pseudogenes, single-nucleotide polymorphism, markers, QTL and alignments to complementary DNAs, ESTs and protein homologs. The Bovine QTL viewer is connected to the BGD Chromosome GBrowse, allowing for the identification of candidate genes underlying QTL. The Apollo Annotation Editor connects directly to the BGD Chado database to provide researchers with remote access to gene evidence in a graphical interface that allows editing and creating new gene models. Researchers may upload their annotations to the BGD server for review and integration into the subsequent release of the OGS. Gene pages display information for individual OGS gene models, including gene structure, transcript variants, functional descriptions, gene symbols, Gene Ontology terms, annotator comments and links to National Center for Biotechnology Information and Ensembl. Each gene page is linked to a wiki page to allow input from the research community.

  7. Protecting genome integrity during CRISPR immune adaptation.

    PubMed

    Wright, Addison V; Doudna, Jennifer A

    2016-10-01

    Bacterial CRISPR-Cas systems include genomic arrays of short repeats flanking foreign DNA sequences and provide adaptive immunity against viruses. Integration of foreign DNA must occur specifically to avoid damaging the genome or the CRISPR array, but surprisingly promiscuous activity occurs in vitro. Here we reconstituted full-site DNA integration and show that the Streptococcus pyogenes type II-A Cas1-Cas2 integrase maintains specificity in part through limitations on the second integration step. At non-CRISPR sites, integration stalls at the half-site intermediate, thereby enabling reaction reversal. S. pyogenes Cas1-Cas2 is highly specific for the leader-proximal repeat and recognizes the repeat's palindromic ends, thus fitting a model of independent recognition by distal Cas1 active sites. These findings suggest that DNA-insertion sites are less common than suggested by previous work, thereby preventing toxicity during CRISPR immune adaptation and maintaining host genome integrity.
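
    The "palindromic ends" mentioned in this abstract can be read as the two ends of the repeat being reverse complements of each other. The sketch below is only an illustration of that idea, not the authors' analysis: the sequence `TOY_REPEAT` is invented for the example (it is not the S. pyogenes repeat), and the helper names are hypothetical.

```python
# Toy illustration only: TOY_REPEAT is a made-up sequence, not a real CRISPR repeat.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def inverted_end_length(repeat: str) -> int:
    """Count how many bases at the 5' end pair with the 3' end read backwards."""
    length = 0
    for i in range(len(repeat) // 2):
        # Stop at the first position where the two ends no longer pair.
        if COMPLEMENT[repeat[i]] != repeat[-1 - i]:
            break
        length += 1
    return length


# Hypothetical repeat: 5-base inverted ends around an arbitrary 12-base core.
TOY_REPEAT = "GTTTT" + "AGAGCTATGCTG" + "AAAAC"
print(inverted_end_length(TOY_REPEAT))
```

    A longer run of paired end bases means a longer inverted-repeat motif at the repeat termini; whether and how Cas1-Cas2 reads such a motif is a question for the cited paper, not for this sketch.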

  8. Reverse transcriptase: mediator of genomic plasticity.

    PubMed

    Brosius, J; Tiedge, H

    1995-01-01

    Reverse transcription has been an important mediator of genomic change. This influence dates back more than three billion years, when the RNA genome was converted into the DNA genome. While the current cellular role(s) of reverse transcriptase are not yet completely understood, it has become clear over the last few years that this enzyme is still responsible for generating significant genomic change and that its activities are one of the driving forces of evolution. Reverse transcriptase generates, for example, extra gene copies (retrogenes), using as a template mature messenger RNAs. Such retrogenes do not always end up as nonfunctional pseudogenes but form, after reinsertion into the genome, new unions with resident promoter elements that may alter the gene's temporal and/or spatial expression levels. More frequently, reverse transcriptase produces copies of nonmessenger RNAs, such as small nuclear or cytoplasmic RNAs. Extremely high copy numbers can be generated by this process. The resulting reinserted DNA copies are therefore referred to as short interspersed repetitive elements (SINEs). SINEs have long been considered selfish DNA, littering the genome via exponential propagation but not contributing to the host's fitness. Many SINEs, however, can give rise to novel genes encoding small RNAs, and are the migrant carriers of numerous control elements and sequence motifs that can equip resident genes with novel regulatory elements [Brosius J. and Gould S.J., Proc Natl Acad Sci USA 89, 10706-10710, 1992]. Retrosequences, such as SINEs and portions of retroelements (e.g., long terminal repeats, LTRs), are capable of donating sequence motifs for nucleosome positioning, DNA methylation, transcriptional enhancers and silencers, poly(A) addition sequences, determinants of RNA stability or transport, splice sites, and even amino acid codons for incorporation into open reading frames as novel protein domains. Retroposition can therefore be considered as a major

  9. Integrated genome browser: visual analytics platform for genomics

    PubMed Central

    Norris, David C.; Loraine, Ann E.

    2016-01-01

    Motivation: Genome browsers that support fast navigation through vast datasets and provide interactive visual analytics functions can help scientists achieve deeper insight into biological systems. Toward this end, we developed Integrated Genome Browser (IGB), a highly configurable, interactive and fast open source desktop genome browser. Results: Here we describe multiple updates to IGB, including all-new capabilities to display and interact with data from high-throughput sequencing experiments. To demonstrate, we describe example visualizations and analyses of datasets from RNA-Seq, ChIP-Seq and bisulfite sequencing experiments. Understanding results from genome-scale experiments requires viewing the data in the context of reference genome annotations and other related datasets. To facilitate this, we enhanced IGB’s ability to consume data from diverse sources, including Galaxy, Distributed Annotation and IGB-specific Quickload servers. To support future visualization needs as new genome-scale assays enter wide use, we transformed the IGB codebase into a modular, extensible platform for developers to create and deploy all-new visualizations of genomic data. Availability and implementation: IGB is open source and is freely available from http://bioviz.org/igb. Contact: aloraine@uncc.edu PMID:27153568

  10. Integrated genome browser: visual analytics platform for genomics.

    PubMed

    Freese, Nowlan H; Norris, David C; Loraine, Ann E

    2016-07-15

    Genome browsers that support fast navigation through vast datasets and provide interactive visual analytics functions can help scientists achieve deeper insight into biological systems. Toward this end, we developed Integrated Genome Browser (IGB), a highly configurable, interactive and fast open source desktop genome browser. Here we describe multiple updates to IGB, including all-new capabilities to display and interact with data from high-throughput sequencing experiments. To demonstrate, we describe example visualizations and analyses of datasets from RNA-Seq, ChIP-Seq and bisulfite sequencing experiments. Understanding results from genome-scale experiments requires viewing the data in the context of reference genome annotations and other related datasets. To facilitate this, we enhanced IGB's ability to consume data from diverse sources, including Galaxy, Distributed Annotation and IGB-specific Quickload servers. To support future visualization needs as new genome-scale assays enter wide use, we transformed the IGB codebase into a modular, extensible platform for developers to create and deploy all-new visualizations of genomic data. IGB is open source and is freely available from http://bioviz.org/igb. Contact: aloraine@uncc.edu. © The Author 2016. Published by Oxford University Press.

  11. Transcription as a Threat to Genome Integrity.

    PubMed

    Gaillard, Hélène; Aguilera, Andrés

    2016-06-02

    Genomes undergo different types of sporadic alterations, including DNA damage, point mutations, and genome rearrangements, that constitute the basis for evolution. However, these changes may occur at high levels as a result of cell pathology and trigger genome instability, a hallmark of cancer and a number of genetic diseases. In the last two decades, evidence has accumulated that transcription constitutes an important natural source of DNA metabolic errors that can compromise the integrity of the genome. Transcription can create the conditions for high levels of mutations and recombination by its ability to open the DNA structure and remodel chromatin, making it more accessible to DNA insulting agents, and by its ability to become a barrier to DNA replication. Here we review the molecular basis of such events from a mechanistic perspective with particular emphasis on the role of transcription as a genome instability determinant.

  12. Methods of Genomic Competency Integration in Practice

    PubMed Central

    Jenkins, Jean; Calzone, Kathleen A.; Caskey, Sarah; Culp, Stacey; Weiner, Marsha; Badzek, Laurie

    2015-01-01

    Purpose Genomics is increasingly relevant to health care, necessitating support for nurses to incorporate genomic competencies into practice. The primary aim of this project was to develop, implement, and evaluate a year-long genomic education intervention that trained, supported, and supervised institutional administrator and educator champion dyads to increase nursing capacity to integrate genomics through assessments of program satisfaction and institutional achieved outcomes. Design Longitudinal study of 23 Magnet Recognition Program® Hospitals (21 intervention, 2 controls) participating in a 1-year new competency integration effort aimed at increasing genomic nursing competency and overcoming barriers to genomics integration in practice. Methods Champion dyads underwent genomic training consisting of one in-person kick-off training meeting followed by monthly education webinars. Champion dyads designed institution-specific action plans detailing objectives, methods or strategies used to engage and educate nursing staff, timeline for implementation, and outcomes achieved. Action plans focused on a minimum of seven genomic priority areas: champion dyad personal development; practice assessment; policy content assessment; staff knowledge needs assessment; staff development; plans for integration; and anticipated obstacles and challenges. Action plans were updated quarterly, outlining progress made as well as inclusion of new methods or strategies. Progress was validated through virtual site visits with the champion dyads and chief nursing officers. Descriptive data were collected on all strategies or methods utilized, and timeline for achievement. Descriptive data were analyzed using content analysis. Findings The complexity of the competency content and the uniqueness of social systems and infrastructure resulted in a significant variation of champion dyad interventions. Conclusions Nursing champions can facilitate change in genomic nursing capacity through

  13. PGWD: Integrating Personal Genome for Warfarin Dosing.

    PubMed

    Pan, Yidan; Cheng, Ronghai; Li, Zhoufang; Zhao, Yujun; He, Jiankui

    2016-03-01

    Warfarin is a drug normally used in the prevention of thrombosis and the formation of blood clots. The dosage of warfarin is strongly affected by genetic variants of CYP2C9 and VKORC1 genes. Current technologies for detecting the variants of these genes are mainly based on real-time PCR. In recent years, due to the rapidly dropping cost of whole genome sequencing and genotyping, more and more people get their whole genome sequenced or genotyped. However, current software for warfarin dosing prediction is based on low-throughput genetic information from either real-time PCR or melting curve methods. There is no bioinformatics tool available that can take the high-throughput genome sequencing data as input and determine the accurate dosage of warfarin. Here, we present PGWD, a web tool that analyzes personal genome sequencing data and integrates with clinical information for warfarin dosing.

  14. Integrative bayesian network analysis of genomic data.

    PubMed

    Ni, Yang; Stingo, Francesco C; Baladandayuthapani, Veerabhadran

    2014-01-01

    Rapid development of genome-wide profiling technologies has made it possible to conduct integrative analysis on genomic data from multiple platforms. In this study, we develop a novel integrative Bayesian network approach to investigate the relationships between genetic and epigenetic alterations as well as how these mutations affect a patient's clinical outcome. We take a Bayesian network approach that admits a convenient decomposition of the joint distribution into local distributions. Exploiting the prior biological knowledge about regulatory mechanisms, we model each local distribution as linear regressions. This allows us to analyze multi-platform genome-wide data in a computationally efficient manner. We illustrate the performance of our approach through simulation studies. Our methods are motivated by and applied to a multi-platform glioblastoma dataset, from which we reveal several biologically relevant relationships that have been validated in the literature as well as new genes that could potentially be novel biomarkers for cancer progression.
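
    The decomposition described in this abstract can be written out as follows. This is a generic sketch of a linear Bayesian network, not the authors' exact model: the symbols $x_j$, $\mathrm{pa}(j)$, $\beta_{jk}$ and $\sigma_j^2$ are notation assumed here, and the Gaussian error term is an assumption that the abstract does not state.

```latex
% Joint distribution factorized over the network: each node depends only on its parents.
p(x_1, \dots, x_p) \;=\; \prod_{j=1}^{p} p\bigl(x_j \mid x_{\mathrm{pa}(j)}\bigr)

% Each local distribution modeled as a linear regression on the parent nodes
% (Gaussian noise is an assumption of this sketch).
x_j \;=\; \sum_{k \in \mathrm{pa}(j)} \beta_{jk}\, x_k \;+\; \varepsilon_j,
\qquad \varepsilon_j \sim \mathcal{N}\bigl(0, \sigma_j^2\bigr)
```

    Under this form each node is fitted from its own parents only, so the $p$ regressions can be estimated separately, which is one way to read the abstract's claim of computational efficiency.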

  15. Viral sequences integrated into plant genomes.

    PubMed

    Harper, Glyn; Hull, Roger; Lockhart, Ben; Olszewski, Neil

    2002-01-01

    Sequences of various DNA plant viruses have been found integrated into the host genome. There are two forms of integrant, those that can form episomal viral infections and those that cannot. Integrants of three pararetroviruses, Banana streak virus (BSV), Tobacco vein clearing virus (TVCV), and Petunia vein clearing virus (PVCV), can generate episomal infections in certain hybrid plant hosts in response to stress. In the case of BSV and TVCV, one of the parents contains the integrant but it has not been seen to be activated in that parent; the other parent does not contain the integrant. The number of integrant loci is low for BSV and PVCV and high in TVCV. The structure of the integrants is complex, and it is thought that episomal virus is released by recombination and/or reverse transcription. Geminiviral and pararetroviral sequences are found in plant genomes although not so far associated with a virus disease. It appears that integration of viral sequences is widespread in the plant kingdom and has been occurring for a long period of time.

  16. Integrating Mediators and Moderators in Research Design

    ERIC Educational Resources Information Center

    MacKinnon, David P.

    2011-01-01

    The purpose of this article is to describe mediating variables and moderating variables and provide reasons for integrating them in outcome studies. Separate sections describe examples of moderating and mediating variables and the simplest statistical model for investigating each variable. The strengths and limitations of incorporating mediating…

  17. An Integrated System for Precise Genome Modification in Escherichia coli

    PubMed Central

    Tas, Huseyin; Nguyen, Cac T.; Patel, Ravish; Kim, Neil H.; Kuhlman, Thomas E.

    2015-01-01

    We describe an optimized system for the easy, effective, and precise modification of the Escherichia coli genome. Genome changes are introduced first through the integration of a 1.3 kbp Landing Pad consisting of a gene conferring resistance to tetracycline (tetA) or the ability to metabolize the sugar galactose (galK). The Landing Pad is then excised as a result of double-strand breaks by the homing endonuclease I-SceI, and replaced with DNA fragments bearing the desired change via λ-Red mediated homologous recombination. Repair of the double strand breaks and counterselection against the Landing Pad (using NiCl2 for tetA or 2-deoxy-galactose for galK) allows the isolation of modified bacteria without the use of additional antibiotic selection. We demonstrate the power of this method to make a variety of genome modifications: the exact integration, without any extraneous sequence, of the lac operon (~6.5 kbp) to any desired location in the genome and without the integration of antibiotic markers; the scarless deletion of ribosomal rrn operons (~6 kbp) through either intrachromosomal or oligonucleotide recombination; and the in situ fusion of native genes to fluorescent reporter genes without additional perturbation. PMID:26332675

  18. Integrating sequence, evolution and functional genomics in regulatory genomics

    PubMed Central

    Vingron, Martin; Brazma, Alvis; Coulson, Richard; van Helden, Jacques; Manke, Thomas; Palin, Kimmo; Sand, Olivier; Ukkonen, Esko

    2009-01-01

    With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome. PMID:19226437

  19. Genomic integrity and the ageing brain.

    PubMed

    Chow, Hei-man; Herrup, Karl

    2015-11-01

    DNA damage is correlated with and may drive the ageing process. Neurons in the brain are postmitotic and are excluded from many forms of DNA repair; therefore, neurons are vulnerable to various neurodegenerative diseases. The challenges facing the field are to understand how and when neuronal DNA damage accumulates, how this loss of genomic integrity might serve as a 'time keeper' of nerve cell ageing and why this process manifests itself as different diseases in different individuals.

  20. Integrative Genomics and Computational Systems Medicine

    SciTech Connect

    McDermott, Jason E.; Huang, Yufei; Zhang, Bing; Xu, Hua; Zhao, Zhongming

    2014-01-01

    The exponential growth in generation of large amounts of genomic data from biological samples has driven the emerging field of systems medicine. This field is promising because it improves our understanding of disease processes at the systems level. However, the field is still in its young stage. There exists a great need for novel computational methods and approaches to effectively utilize and integrate various omics data.

  1. Enhancing cancer clonality analysis with integrative genomics

    PubMed Central

    2015-01-01

    Introduction It is understood that cancer is a clonal disease initiated by a single cell, and that metastasis, which is the spread of cancer from the primary site, is also initiated by a single cell. The seemingly natural capability of cancer to adapt dynamically in a Darwinian manner is a primary reason for therapeutic failures. Survival advantages may be induced by cancer therapies and also occur as a result of inherent cell and microenvironmental factors. The selected "more fit" clones outmatch their competition and then become dominant in the tumor via propagation of progeny. This clonal expansion leads to relapse, therapeutic resistance and eventually death. The goal of this study is to develop and demonstrate a more detailed clonality approach by utilizing integrative genomics. Methods Patient tumor samples were profiled by Whole Exome Sequencing (WES) and RNA-seq on an Illumina HiSeq 2500 and methylation profiling was performed on the Illumina Infinium 450K array. STAR and the Haplotype Caller were used for RNA-seq processing. Custom approaches were used for the integration of the multi-omic datasets. Results Reported are major enhancements to CloneViz, which now provides capabilities enabling a formal tumor multi-dimensional clonality analysis by integrating: i) DNA mutations, ii) RNA expressed mutations, and iii) DNA methylation data. RNA and DNA methylation integration were not previously possible, by CloneViz (previous version) or any other clonality method to date. This new approach, named iCloneViz (integrated CloneViz) employs visualization and quantitative methods, revealing an integrative genomic mutational dissection and traceability (DNA, RNA, epigenetics) thru the different layers of molecular structures. Conclusion The iCloneViz approach can be used for analysis of clonal evolution and mutational dynamics of multi-omic data sets. Revealing tumor clonal complexity in an integrative and quantitative manner facilitates improved mutational

  2. Adeno-Associated Virus Type 2 Wild-Type and Vector-Mediated Genomic Integration Profiles of Human Diploid Fibroblasts Analyzed by Third-Generation PacBio DNA Sequencing

    PubMed Central

    Hüser, Daniela; Gogol-Döring, Andreas; Chen, Wei

    2014-01-01

    ABSTRACT Genome-wide analysis of adeno-associated virus (AAV) type 2 integration in HeLa cells has shown that wild-type AAV integrates at numerous genomic sites, including AAVS1 on chromosome 19q13.42. Multiple GAGY/C repeats, resembling consensus AAV Rep-binding sites are preferred, whereas rep-deficient AAV vectors (rAAV) regularly show a random integration profile. This study is the first study to analyze wild-type AAV integration in diploid human fibroblasts. Applying high-throughput third-generation PacBio-based DNA sequencing, integration profiles of wild-type AAV and rAAV are compared side by side. Bioinformatic analysis reveals that both wild-type AAV and rAAV prefer open chromatin regions. Although genomic features of AAV integration largely reproduce previous findings, the pattern of integration hot spots differs from that described in HeLa cells before. DNase-Seq data for human fibroblasts and for HeLa cells reveal variant chromatin accessibility at preferred AAV integration hot spots that correlates with variant hot spot preferences. DNase-Seq patterns of these sites in human tissues, including liver, muscle, heart, brain, skin, and embryonic stem cells further underline variant chromatin accessibility. In summary, AAV integration is dependent on cell-type-specific, variant chromatin accessibility leading to random integration profiles for rAAV, whereas wild-type AAV integration sites cluster near GAGY/C repeats. IMPORTANCE Adeno-associated virus type 2 (AAV) is assumed to establish latency by chromosomal integration of its DNA. This is the first genome-wide analysis of wild-type AAV2 integration in diploid human cells and the first to compare wild-type to recombinant AAV vector integration side by side under identical experimental conditions. Major determinants of wild-type AAV integration represent open chromatin regions with accessible consensus AAV Rep-binding sites. The variant chromatin accessibility of different human tissues or cell types will

  3. Integrating Computer-Mediated Communication Strategy Instruction

    ERIC Educational Resources Information Center

    McNeil, Levi

    2016-01-01

    Communication strategies (CSs) play important roles in resolving problematic second language interaction and facilitating language learning. While studies in face-to-face contexts demonstrate the benefits of communication strategy instruction (CSI), there have been few attempts to integrate computer-mediated communication and CSI. The study…

  5. An Era of CRISPR/ Cas9 Mediated Plant Genome Editing.

    PubMed

    Khurshid, Haris; Jan, Sohail Ahmad; Shinwari, Zabta Khan; Jamal, Muhammad; Shah, Sabir Hussain

    2017-09-07

    Recently, engineered nucleases have revolutionized genome editing to perturb gene expression at specific sites in complex eukaryotic genomes. Three important classes of these genome editing tools are … Moreover, the more recent type II clustered regularly interspaced short palindromic repeats/CRISPR-associated protein (CRISPR/Cas9) system has become the most favored plant genome editing tool for its precision and RNA-based specificity, unlike its counterparts, which rely on protein-based specificity. Plasmid-mediated co-delivery of multiple sgRNAs and Cas9 to the plant cell can simultaneously alter more than one target locus, which enables multiplex genome editing. In this review, we discuss recent advancements in the CRISPR/Cas9 technology mechanism, theory and its applications in plants and agriculture. We also suggest that CRISPR/Cas9, as an effective genome editing tool, has vast potential for crop improvement and for studying gene regulation mechanisms and chromatin remodeling.

  6. Inverted genomic segments and complex triplication rearrangements are mediated by inverted repeats in the human genome

    PubMed Central

    Carvalho, Claudia M. B.; Ramocki, Melissa B.; Pehlivan, Davut; Franco, Luis M.; Gonzaga-Jauregui, Claudia; Fang, Ping; McCall, Alanna; Pivnick, Eniko Karman; Hines-Dowell, Stacy; Seaver, Laurie; Friehling, Linda; Lee, Sansan; Smith, Rosemarie; del Gaudio, Daniela; Withers, Marjorie; Liu, Pengfei; Cheung, Sau Wai; Belmont, John W.; Zoghbi, Huda Y.; Hastings, P. J.; Lupski, James R.

    2011-01-01

    We identified complex genomic rearrangements consisting of intermixed duplications and triplications of genomic segments at both the MECP2 and PLP1 loci. These complex rearrangements were characterized by a triplicated segment embedded within a duplication in 12 unrelated subjects. Interestingly, only two novel breakpoint junctions were generated during each rearrangement formation. Remarkably, all the complex rearrangement products share the common genomic organization duplication-inverted triplication-duplication (DUP-TRP/INV-DUP) wherein the triplicated segment is inverted and located between directly oriented duplicated genomic segments. We provide evidence that the DUP-TRP/INV-DUP structures are mediated by inverted repeats that can be separated by over 300 kb; a genomic architecture that apparently leads to susceptibility to such complex rearrangements. A similar inverted repeat mediated mechanism may underlie structural variation in many other regions of the human genome. We propose a mechanism that involves both homology driven, via inverted repeats, and microhomologous/nonhomologous events. PMID:21964572

  7. Population Genomics of Infectious and Integrated Wolbachia pipientis Genomes in Drosophila ananassae

    PubMed Central

    Choi, Jae Young; Bubnell, Jaclyn E.; Aquadro, Charles F.

    2015-01-01

    Coevolution between Drosophila and its endosymbiont Wolbachia pipientis has many intriguing aspects. For example, Drosophila ananassae hosts two forms of W. pipientis genomes: One being the infectious bacterial genome and the other integrated into the host nuclear genome. Here, we characterize the infectious and integrated genomes of W. pipientis infecting D. ananassae (wAna), by genome sequencing 15 strains of D. ananassae that have either the infectious or integrated wAna genomes. Results indicate evolutionarily stable maternal transmission for the infectious wAna genome suggesting a relatively long-term coevolution with its host. In contrast, the integrated wAna genome showed pseudogene-like characteristics accumulating many variants that are predicted to have deleterious effects if present in an infectious bacterial genome. Phylogenomic analysis of sequence variation together with genotyping by polymerase chain reaction of large structural variations indicated several wAna variants among the eight infectious wAna genomes. In contrast, only a single wAna variant was found among the seven integrated wAna genomes examined in lines from Africa, south Asia, and south Pacific islands suggesting that the integration occurred once from a single infectious wAna genome and then spread geographically. Further analysis revealed that for all D. ananassae we examined with the integrated wAna genomes, the majority of the integrated wAna genomic regions is represented in at least two copies suggesting a double integration or single integration followed by an integrated genome duplication. The possible evolutionary mechanism underlying the widespread geographical presence of the duplicate integration of the wAna genome is an intriguing question remaining to be answered. PMID:26254486

  8. Multidimensional Genome-wide Analyses Show Accurate FVIII Integration by ZFN in Primary Human Cells

    PubMed Central

    Sivalingam, Jaichandran; Kenanov, Dimitar; Han, Hao; Nirmal, Ajit Johnson; Ng, Wai Har; Lee, Sze Sing; Masilamani, Jeyakumar; Phan, Toan Thang; Maurer-Stroh, Sebastian; Kon, Oi Lian

    2016-01-01

    Costly coagulation factor VIII (FVIII) replacement therapy is a barrier to optimal clinical management of hemophilia A. Therapy using FVIII-secreting autologous primary cells is potentially efficacious and more affordable. Zinc finger nucleases (ZFN) mediate transgene integration into the AAVS1 locus but comprehensive evaluation of off-target genome effects is currently lacking. In light of serious adverse effects in clinical trials which employed genome-integrating viral vectors, this study evaluated potential genotoxicity of ZFN-mediated transgenesis using different techniques. We employed deep sequencing of predicted off-target sites, copy number analysis, whole-genome sequencing, and RNA-seq in primary human umbilical cord-lining epithelial cells (CLECs) with AAVS1 ZFN-mediated FVIII transgene integration. We combined molecular features to enhance the accuracy and activity of ZFN-mediated transgenesis. Our data showed a low frequency of ZFN-associated indels, no detectable off-target transgene integrations or chromosomal rearrangements. ZFN-modified CLECs had very few dysregulated transcripts and no evidence of activated oncogenic pathways. We also showed AAVS1 ZFN activity and durable FVIII transgene secretion in primary human dermal fibroblasts, bone marrow- and adipose tissue-derived stromal cells. Our study suggests that, with close attention to the molecular design of genome-modifying constructs, AAVS1 ZFN-mediated FVIII integration in several primary human cell types may be safe and efficacious. PMID:26689265

  9. Integrated Genomic Characterization of Pancreatic Ductal Adenocarcinoma.

    PubMed

    2017-08-14

    We performed integrated genomic, transcriptomic, and proteomic profiling of 150 pancreatic ductal adenocarcinoma (PDAC) specimens, including samples with characteristic low neoplastic cellularity. Deep whole-exome sequencing revealed recurrent somatic mutations in KRAS, TP53, CDKN2A, SMAD4, RNF43, ARID1A, TGFβR2, GNAS, RREB1, and PBRM1. KRAS wild-type tumors harbored alterations in other oncogenic drivers, including GNAS, BRAF, CTNNB1, and additional RAS pathway genes. A subset of tumors harbored multiple KRAS mutations, with some showing evidence of biallelic mutations. Protein profiling identified a favorable prognosis subset with low epithelial-mesenchymal transition and high MTOR pathway scores. Associations of non-coding RNAs with tumor-specific mRNA subtypes were also identified. Our integrated multi-platform analysis reveals a complex molecular landscape of PDAC and provides a roadmap for precision medicine. Copyright © 2017 Elsevier Inc. All rights reserved.

  10. Integrated genomic characterization of endometrial carcinoma.

    PubMed

    Kandoth, Cyriac; Schultz, Nikolaus; Cherniack, Andrew D; Akbani, Rehan; Liu, Yuexin; Shen, Hui; Robertson, A Gordon; Pashtan, Itai; Shen, Ronglai; Benz, Christopher C; Yau, Christina; Laird, Peter W; Ding, Li; Zhang, Wei; Mills, Gordon B; Kucherlapati, Raju; Mardis, Elaine R; Levine, Douglas A

    2013-05-02

    We performed an integrated genomic, transcriptomic and proteomic characterization of 373 endometrial carcinomas using array- and sequencing-based technologies. Uterine serous tumours and ∼25% of high-grade endometrioid tumours had extensive copy number alterations, few DNA methylation changes, low oestrogen receptor/progesterone receptor levels, and frequent TP53 mutations. Most endometrioid tumours had few copy number alterations or TP53 mutations, but frequent mutations in PTEN, CTNNB1, PIK3CA, ARID1A and KRAS and novel mutations in the SWI/SNF chromatin remodelling complex gene ARID5B. A subset of endometrioid tumours that we identified had a markedly increased transversion mutation frequency and newly identified hotspot mutations in POLE. Our results classified endometrial cancers into four categories: POLE ultramutated, microsatellite instability hypermutated, copy-number low, and copy-number high. Uterine serous carcinomas share genomic features with ovarian serous and basal-like breast carcinomas. We demonstrated that the genomic features of endometrial carcinomas permit a reclassification that may affect post-surgical adjuvant treatment for women with aggressive tumours.

  11. Integrated Genomic Characterization of Endometrial Carcinoma

    PubMed Central

    2013-01-01

    Summary We performed an integrated genomic, transcriptomic, and proteomic characterization of 373 endometrial carcinomas using array- and sequencing-based technologies. Uterine serous tumors and ~25% of high-grade endometrioid tumors have extensive copy number alterations, few DNA methylation changes, low ER/PR levels, and frequent TP53 mutations. Most endometrioid tumors have few copy number alterations or TP53 mutations but frequent mutations in PTEN, CTNNB1, PIK3CA, ARID1A, KRAS and novel mutations in the SWI/SNF gene ARID5B. A subset of endometrioid tumors we identified had a dramatically increased transversion mutation frequency, and newly identified hotspot mutations in POLE. Our results classified endometrial cancers into four categories: POLE ultramutated, microsatellite instability hypermutated, copy number low, and copy number high. Uterine serous carcinomas share genomic features with ovarian serous and basal-like breast carcinomas. We demonstrated that the genomic features of endometrial carcinomas permit a reclassification that may impact post-surgical adjuvant treatment for women with aggressive tumors. PMID:23636398

  12. An integrated 3-Dimensional Genome Modeling Engine for data-driven simulation of spatial genome organization

    PubMed Central

    Szałaj, Przemysław; Tang, Zhonghui; Michalski, Paul; Pietal, Michal J.; Luo, Oscar J.; Sadowski, Michał; Li, Xingwang; Radew, Kamen; Ruan, Yijun; Plewczynski, Dariusz

    2016-01-01

    ChIA-PET is a high-throughput mapping technology that reveals long-range chromatin interactions and provides insights into the basic principles of spatial genome organization and gene regulation mediated by specific protein factors. Recently, we showed that a single ChIA-PET experiment provides information at all genomic scales of interest, from the high-resolution locations of binding sites and enriched chromatin interactions mediated by specific protein factors, to the low resolution of nonenriched interactions that reflect topological neighborhoods of higher-order chromosome folding. This multilevel nature of ChIA-PET data offers an opportunity to use multiscale 3D models to study structural-functional relationships at multiple length scales, but doing so requires a structural modeling platform. Here, we report the development of 3D-GNOME (3-Dimensional Genome Modeling Engine), a complete computational pipeline for 3D simulation using ChIA-PET data. 3D-GNOME consists of three integrated components: a graph-distance-based heat map normalization tool, a 3D modeling platform, and an interactive 3D visualization tool. Using ChIA-PET and Hi-C data derived from human B-lymphocytes, we demonstrate the effectiveness of 3D-GNOME in building 3D genome models at multiple levels, including the entire genome, individual chromosomes, and specific segments at megabase (Mb) and kilobase (kb) resolutions of single average and ensemble structures. Further incorporation of CTCF-motif orientation and high-resolution looping patterns in 3D simulation provided additional reliability of potential biologically plausible topological structures. PMID:27789526

  13. An integrated 3-Dimensional Genome Modeling Engine for data-driven simulation of spatial genome organization.

    PubMed

    Szałaj, Przemysław; Tang, Zhonghui; Michalski, Paul; Pietal, Michal J; Luo, Oscar J; Sadowski, Michał; Li, Xingwang; Radew, Kamen; Ruan, Yijun; Plewczynski, Dariusz

    2016-12-01

    ChIA-PET is a high-throughput mapping technology that reveals long-range chromatin interactions and provides insights into the basic principles of spatial genome organization and gene regulation mediated by specific protein factors. Recently, we showed that a single ChIA-PET experiment provides information at all genomic scales of interest, from the high-resolution locations of binding sites and enriched chromatin interactions mediated by specific protein factors, to the low resolution of nonenriched interactions that reflect topological neighborhoods of higher-order chromosome folding. This multilevel nature of ChIA-PET data offers an opportunity to use multiscale 3D models to study structural-functional relationships at multiple length scales, but doing so requires a structural modeling platform. Here, we report the development of 3D-GNOME (3-Dimensional Genome Modeling Engine), a complete computational pipeline for 3D simulation using ChIA-PET data. 3D-GNOME consists of three integrated components: a graph-distance-based heat map normalization tool, a 3D modeling platform, and an interactive 3D visualization tool. Using ChIA-PET and Hi-C data derived from human B-lymphocytes, we demonstrate the effectiveness of 3D-GNOME in building 3D genome models at multiple levels, including the entire genome, individual chromosomes, and specific segments at megabase (Mb) and kilobase (kb) resolutions of single average and ensemble structures. Further incorporation of CTCF-motif orientation and high-resolution looping patterns in 3D simulation provided additional reliability of potential biologically plausible topological structures.

  14. RNA-Mediated Epigenetic Programming of Genome Rearrangements

    PubMed Central

    Nowacki, Mariusz; Shetty, Keerthi; Landweber, Laura F.

    2012-01-01

    RNA, normally thought of as a conduit in gene expression, has a novel mode of action in ciliated protozoa. Maternal RNA templates provide both an organizing guide for DNA rearrangements and a template that can transport somatic mutations to the next generation. This opportunity for RNA-mediated genome rearrangement and DNA repair is profound in the ciliate Oxytricha, which deletes 95% of its germline genome during development in a process that severely fragments its chromosomes and then sorts and reorders the hundreds of thousands of pieces remaining. Oxytricha’s somatic nuclear genome is therefore an epigenome formed through RNA templates and signals arising from the previous generation. Furthermore, this mechanism of RNA-mediated epigenetic inheritance can function across multiple generations, and the discovery of maternal template RNA molecules has revealed new biological roles for RNA and has hinted at the power of RNA molecules to sculpt genomic information in cells. PMID:21801022

  15. Genome integrity, stem cells and hyaluronan

    PubMed Central

    Darzynkiewicz, Zbigniew; Balazs, Endre A.

    2012-01-01

    Faithful preservation of genome integrity is the critical mission of stem cells as well as of germ cells. Reviewed are the following mechanisms involved in protecting DNA in these cells: (a) the efflux machinery that can pump out a variety of genotoxins in an ATP-dependent manner; (b) the mechanisms maintaining minimal metabolic activity, which reduces generation of reactive oxidants, by-products of aerobic respiration; (c) the role of the hypoxic niche of stem cells providing a gradient of variable oxygen tension; (d) (e) the presence of hyaluronan (HA) and HA receptors on stem cells and in the niche; (f) the role of HA in protecting DNA from oxidative damage; (g) the specific function of HA in protecting DNA in stem cells; (h) the interactions of HA with sperm cells and oocytes that also may shield their DNA from oxidative damage; and (i) mechanisms by which HA exerts its anti-oxidant activity. While HA has a multitude of functions, its anti-oxidant capabilities are often overlooked but may be of significance in preserving the integrity of stem and germ cell genomes. PMID:22383371

  16. Site-specific recombination in the chicken genome using Flipase recombinase-mediated cassette exchange.

    PubMed

    Lee, Hong Jo; Lee, Hyung Chul; Kim, Young Min; Hwang, Young Sun; Park, Young Hyun; Park, Tae Sub; Han, Jae Yong

    2016-02-01

    Targeted genome recombination has been applied in diverse research fields and has a wide range of possible applications. In particular, the discovery of specific loci in the genome that support robust and ubiquitous expression of integrated genes and the development of genome-editing technology have facilitated rapid advances in various scientific areas. In this study, we produced transgenic (TG) chickens that can induce recombinase-mediated gene cassette exchange (RMCE), one of the site-specific recombination technologies, and confirmed RMCE in TG chicken-derived cells. As a result, we established TG chicken lines that have Flipase (Flp) recognition target (FRT) pairs in the chicken genome, mediated by piggyBac transposition. The transgene integration patterns were diverse in each TG chicken line, and the integration diversity resulted in diverse levels of expression of exogenous genes in each tissue of the TG chickens. In addition, the replaced gene cassette was expressed successfully and maintained by RMCE in the FRT predominant loci of TG chicken-derived cells. These results indicate that targeted genome recombination technology with RMCE could be adaptable to TG chicken models and that the technology would be applicable to specific gene regulation by cis-element insertion and customized expression of functional proteins at predicted levels without epigenetic influence.

  17. MycoCosm, an Integrated Fungal Genomics Resource

    SciTech Connect

    Shabalov, Igor; Grigoriev, Igor

    2012-03-16

    MycoCosm is a web-based interactive fungal genomics resource, which was first released in March 2010, in response to an urgent call from the fungal community for integration of all fungal genomes and analytical tools in one place (Pan-fungal data resources meeting, Feb 21-22, 2010, Alexandria, VA). MycoCosm integrates genomics data and analysis tools to navigate through over 100 fungal genomes sequenced at JGI and elsewhere. This resource allows users to explore fungal genomes in the context of both genome-centric analysis and comparative genomics, and promotes user community participation in data submission, annotation and analysis. MycoCosm has over 4500 unique visitors/month or 35000+ visitors/year as well as hundreds of registered users contributing their data and expertise to this resource. Its scalable architecture allows significant expansion of the data expected from JGI Fungal Genomics Program, its users, and integration with external resources used by fungal community.

  18. Site-Specific Integration of Exogenous Genes Using Genome Editing Technologies in Zebrafish.

    PubMed

    Kawahara, Atsuo; Hisano, Yu; Ota, Satoshi; Taimatsu, Kiyohito

    2016-05-13

    The zebrafish (Danio rerio) is an ideal vertebrate model to investigate the developmental molecular mechanism of organogenesis and regeneration. Recent innovations in genome editing technologies, such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR associated protein 9 (Cas9) system, have allowed researchers to generate diverse genomic modifications in whole animals and in cultured cells. The CRISPR/Cas9 and TALEN techniques frequently induce DNA double-strand breaks (DSBs) at the targeted gene, resulting in frameshift-mediated gene disruption. As a useful application of genome editing technology, several groups have recently reported efficient site-specific integration of exogenous genes into targeted genomic loci. In this review, we provide an overview of TALEN- and CRISPR/Cas9-mediated site-specific integration of exogenous genes in zebrafish.

  19. Site-Specific Integration of Exogenous Genes Using Genome Editing Technologies in Zebrafish

    PubMed Central

    Kawahara, Atsuo; Hisano, Yu; Ota, Satoshi; Taimatsu, Kiyohito

    2016-01-01

    The zebrafish (Danio rerio) is an ideal vertebrate model to investigate the developmental molecular mechanism of organogenesis and regeneration. Recent innovations in genome editing technologies, such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR associated protein 9 (Cas9) system, have allowed researchers to generate diverse genomic modifications in whole animals and in cultured cells. The CRISPR/Cas9 and TALEN techniques frequently induce DNA double-strand breaks (DSBs) at the targeted gene, resulting in frameshift-mediated gene disruption. As a useful application of genome editing technology, several groups have recently reported efficient site-specific integration of exogenous genes into targeted genomic loci. In this review, we provide an overview of TALEN- and CRISPR/Cas9-mediated site-specific integration of exogenous genes in zebrafish. PMID:27187373

  20. Comparative Genome Analysis in the Integrated Microbial Genomes (IMG) System

    SciTech Connect

    Kyrpides, Nikos C.; Markowitz, Victor M.

    2006-03-01

    Comparative genome analysis is critical for the effective exploration of a rapidly growing number of complete and draft sequences for microbial genomes. The Integrated Microbial Genomes (IMG) system (img.jgi.doe.gov) has been developed as a community resource that provides support for comparative analysis of microbial genomes in an integrated context. IMG allows users to navigate the multidimensional microbial genome data space and focus their analysis on a subset of genes, genomes, and functions of interest. IMG provides graphical viewers, summaries and occurrence profile tools for comparing genes, pathways and functions (terms) across specific genomes. Genes can be further examined using gene neighborhoods and compared with sequence alignment tools.

  1. LINE-1 Retrotransposons: Mediators of Somatic Variation in Neuronal Genomes?

    PubMed Central

    Singer, Tatjana; McConnell, Michael J.; Marchetto, Maria C.N.; Coufal, Nicole G.; Gage, Fred H.

    2010-01-01

    LINE-1 (L1) elements are retrotransposons that insert extra copies of themselves throughout the genome using a “copy and paste” mechanism. L1s have contributed ~20% to total human genome content and are able to influence chromosome integrity and gene expression upon reinsertion. Recent studies show that L1 elements are active and “jumping” during neuronal differentiation. New somatic L1 insertions may generate “genomic plasticity” in neurons by causing variation in genomic DNA sequences and by altering the transcriptome of individual cells. Thus, L1-induced variation may affect neuronal plasticity and behavior. Here, we discuss potential consequences of L1-induced neuronal diversity and propose that a mechanism generating diversity in the brain could broaden the spectrum of behavioral phenotypes that can originate from any single genome. PMID:20471112

  2. TG1 integrase-based system for site-specific gene integration into bacterial genomes.

    PubMed

    Muroi, Tetsurou; Kokuzawa, Takaaki; Kihara, Yoshihiko; Kobayashi, Ryuichi; Hirano, Nobutaka; Takahashi, Hideo; Haruki, Mitsuru

    2013-05-01

    Serine-type phage integrases catalyze unidirectional site-specific recombination between the attachment sites, attP and attB, in the phage and host bacterial genomes, respectively; these integrases and DNA target sites function efficiently when transferred into heterologous cells. We previously developed an in vivo site-specific genomic integration system based on actinophage TG1 integrase that introduces ∼2-kbp DNA into an att site inserted into a heterologous Escherichia coli genome. Here, we analyzed the TG1 integrase-mediated integrations of att site-containing ∼10-kbp DNA into the corresponding att site pre-inserted into various genomic locations; moreover, we developed a system that introduces ∼10-kbp DNA into the genome with an efficiency of ∼10^4 transformants/μg DNA. Integrations of attB-containing DNA into an attP-containing genome were more efficient than integrations of attP-containing DNA into an attB-containing genome, and integrations targeting attP inserted near the replication origin, oriC, and the E. coli "centromere" analogue, migS, were more efficient than those targeting attP within other regions of the genome. Because the genomic region proximal to the oriC and migS sites is located at the extreme poles of the cell during chromosomal segregation, the oriC-migS region may be more exposed to the cytosol than are other regions of the E. coli chromosome. Thus, accessibility of pre-inserted attP to attB-containing incoming DNA may be crucial for the integration efficiency by serine-type integrases in heterologous cells. These results may be beneficial to the development of serine-type integrases-based genomic integration systems for various bacterial species.

  3. CrEdit: CRISPR mediated multi-loci gene integration in Saccharomyces cerevisiae.

    PubMed

    Ronda, Carlotta; Maury, Jérôme; Jakočiunas, Tadas; Jacobsen, Simo Abdessamad Baallal; Germann, Susanne Manuela; Harrison, Scott James; Borodina, Irina; Keasling, Jay D; Jensen, Michael Krogh; Nielsen, Alex Toftgaard

    2015-07-07

    One of the bottlenecks in production of biochemicals and pharmaceuticals in Saccharomyces cerevisiae is stable and homogeneous expression of pathway genes. Integration of genes into the genome of the production organism is often a preferred option when compared to expression from episomal vectors. Existing approaches for achieving stable simultaneous genome integrations of multiple DNA fragments often result in relatively low integration efficiencies and furthermore rely on the use of selection markers. Here, we have developed a novel method, CrEdit (CRISPR/Cas9 mediated genome Editing), which utilizes targeted double strand breaks caused by CRISPR/Cas9 to significantly increase the efficiency of homologous integration in order to edit and manipulate genomic DNA. Using CrEdit, the efficiency and locus specificity of targeted genome integrations reach close to 100% for single gene integration using short homology arms down to 60 base pairs both with and without selection. This enables direct and cost efficient inclusion of homology arms in PCR primers. As a proof of concept, a non-native β-carotene pathway was reconstructed in S. cerevisiae by simultaneous integration of three pathway genes into individual intergenic genomic sites. Using longer homology arms, we demonstrate highly efficient and locus-specific genome integration even without selection with up to 84% correct clones for simultaneous integration of three gene expression cassettes. The CrEdit approach enables fast and cost effective genome integration for engineering of S. cerevisiae. Since the choice of the targeting sites is flexible, CrEdit is a powerful tool for diverse genome engineering applications.

  4. CRISPR Mediated Genome Engineering and its Application in Industry.

    PubMed

    Kaboli, Saeed; Babazada, Hasan

    2017-09-07

    The CRISPR (clustered regularly interspaced short palindromic repeat)-Cas9 (CRISPR-associated nuclease 9) method has been dramatically changing the field of genome engineering. It is a rapid, highly efficient and versatile tool for precise modification of the genome that uses a guide RNA (gRNA) to target Cas9 to a specific sequence. This novel RNA-guided genome-editing technique has become a revolutionary tool in biomedical science and has many innovative applications in different fields. In this review, we briefly introduce the Cas9-mediated genome-editing tool, summarize the recent advances in CRISPR/Cas9 technology to engineer the genomes of a wide variety of organisms, and discuss their applications to the treatment of fungal and viral disease. We also discuss the advantages of CRISPR/Cas9 technology for drug design, the creation of animal models, and food, agricultural and energy sciences. Adoption of the CRISPR/Cas9 technology in biomedical and biotechnological research would create innovative applications not only for breeding strains that exhibit desired traits for specific industrial and medical applications, but also for investigating genome function.

  5. Dissecting direct reprogramming through integrative genomic analysis

    PubMed Central

    Mikkelsen, Tarjei S.; Hanna, Jacob; Zhang, Xiaolan; Ku, Manching; Wernig, Marius; Schorderet, Patrick; Bernstein, Bradley E.; Jaenisch, Rudolf; Lander, Eric S.; Meissner, Alexander

    2009-01-01

    Somatic cells can be reprogrammed to a pluripotent state through the ectopic expression of defined transcription factors. Understanding the mechanism and kinetics of this transformation may shed light on the nature of developmental potency and suggest strategies with improved efficiency or safety. Here we report an integrative genomic analysis of reprogramming of mouse fibroblasts and B lymphocytes. Lineage-committed cells show a complex response to the ectopic expression involving induction of genes downstream of individual reprogramming factors. Fully reprogrammed cells show gene expression and epigenetic states that are highly similar to embryonic stem cells. In contrast, stable partially reprogrammed cell lines show reactivation of a distinctive subset of stem-cell-related genes, incomplete repression of lineage-specifying transcription factors, and DNA hypermethylation at pluripotency-related loci. These observations suggest that some cells may become trapped in partially reprogrammed states owing to incomplete repression of transcription factors, and that DNA de-methylation is an inefficient step in the transition to pluripotency. We demonstrate that RNA inhibition of transcription factors can facilitate reprogramming, and that treatment with DNA methyltransferase inhibitors can improve the overall efficiency of the reprogramming process. PMID:18509334

  6. Nuclear pore complexes in the maintenance of genome integrity.

    PubMed

    Bukata, Lucas; Parker, Stephanie L; D'Angelo, Maximiliano A

    2013-06-01

    Maintaining genome integrity is crucial for successful organismal propagation and for cell and tissue homeostasis. Several processes contribute to safeguarding the genomic information of cells. These include accurate replication of genetic information, detection and repair of DNA damage, efficient segregation of chromosomes, protection of chromosome ends, and proper organization of genome architecture. Interestingly, recent evidence shows that nuclear pore complexes, the channels connecting the nucleus with the cytoplasm, play important roles in these processes suggesting that these multiprotein platforms are key regulators of genome integrity.

  7. MAR-Mediated transgene integration into permissive chromatin and increased expression by recombination pathway engineering.

    PubMed

    Kostyrko, Kaja; Neuenschwander, Samuel; Junier, Thomas; Regamey, Alexandre; Iseli, Christian; Schmid-Siegert, Emanuel; Bosshard, Sandra; Majocchi, Stefano; Le Fourn, Valérie; Girod, Pierre-Alain; Xenarios, Ioannis; Mermod, Nicolas

    2017-02-01

    Untargeted plasmid integration into mammalian cell genomes remains a poorly understood and inefficient process. The formation of plasmid concatemers and their genomic integration has been ascribed either to non-homologous end-joining (NHEJ) or homologous recombination (HR) DNA repair pathways. However, a direct involvement of these pathways has remained unclear. Here, we show that the silencing of many HR factors enhanced plasmid concatemer formation and stable expression of the gene of interest in Chinese hamster ovary (CHO) cells, while the inhibition of NHEJ had no effect. However, genomic integration was decreased by the silencing of specific HR components, such as Rad51, and DNA synthesis-dependent microhomology-mediated end-joining (SD-MMEJ) activities. Genome-wide analysis of the integration loci and junction sequences validated the prevalent use of the SD-MMEJ pathway for transgene integration close to cellular genes, an effect shared with matrix attachment region (MAR) DNA elements that stimulate plasmid integration and expression. Overall, we conclude that SD-MMEJ is the main mechanism driving the illegitimate genomic integration of foreign DNA in CHO cells, and we provide a recombination engineering approach that increases transgene integration and recombinant protein expression in these cells. Biotechnol. Bioeng. 2017;114: 384-396. © 2016 The Authors. Biotechnology and Bioengineering published by Wiley Periodicals, Inc.

  8. T-DNA integration in plants results from polymerase-θ-mediated DNA repair.

    PubMed

    van Kregten, Maartje; de Pater, Sylvia; Romeijn, Ron; van Schendel, Robin; Hooykaas, Paul J J; Tijsterman, Marcel

    2016-10-31

    Agrobacterium tumefaciens is a pathogenic bacterium, which transforms plants by transferring a discrete segment of its DNA, the T-DNA, to plant cells. The T-DNA then integrates into the plant genome. T-DNA biotechnology is widely exploited in the genetic engineering of model plants and crops. However, the molecular mechanism underlying T-DNA integration remains unknown. Here we demonstrate that in Arabidopsis thaliana T-DNA integration critically depends on polymerase theta (Pol θ). We find that TEBICHI/POLQ mutant plants (which have mutated Pol θ), although susceptible to Agrobacterium infection, are resistant to T-DNA integration. Characterization of >10,000 T-DNA-plant genome junctions reveals a distinct signature of Pol θ action and also indicates that 3' end capture at genomic breaks is the prevalent mechanism of T-DNA integration. The primer-template switching ability of Pol θ can explain the molecular patchwork known as filler DNA that is frequently observed at sites of integration. T-DNA integration signatures in other plant species closely resemble those of Arabidopsis, suggesting that Pol-θ-mediated integration is evolutionarily conserved. Thus, Pol θ provides the mechanism for T-DNA random integration into the plant genome, demonstrating a potential to disrupt random integration so as to improve the quality and biosafety of plant transgenesis.

  9. Hotspots of MLV integration in the hematopoietic tumor genome

    PubMed Central

    Tsuruyama, T; Hiratsuka, T; Yamada, N

    2017-01-01

    Extensive research has been performed regarding the integration sites of murine leukemia retrovirus (MLV) for the identification of proto-oncogenes. To date, the overlap of mutations within specific oligonucleotides across different tumor genomes has been regarded as a rare event; however, a recent study of MLV integration into the oncogene Zfp521 suggested the existence of a hotspot oligonucleotide for MLV integration. In the current review, we discuss the hotspots of MLV integration into several genes: c-Myc, Stat5a and N-myc, as well as ZFP521, as examined in tumor genomes. From this, MLV integration convergence within specific oligonucleotides is not necessarily a rare event. This short review aims to promote re-consideration of MLV integration within the tumor genome, which involves both well-known and potentially newly identified and novel mechanisms and specifications. PMID:27721401

  10. Integrated proteomic and genomic analysis of colorectal cancer

    Cancer.gov

    Investigators who analyzed 95 human colorectal tumor samples have determined how gene alterations identified in previous analyses of the same samples are expressed at the protein level. The integration of proteomic and genomic data, or proteogenomics, pro

  11. Integrated Genome-Based Studies of Shewanella Ecophysiology

    SciTech Connect

    Zhou, Jizhong; He, Zhili

    2014-04-08

    As a part of the Shewanella Federation project, we have used integrated genomic, proteomic and computational technologies to study various aspects of energy metabolism of two Shewanella strains from a systems-level perspective.

  12. Integration of new alternative reference strain genome sequences into the Saccharomyces genome database.

    PubMed

    Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C; Dalusag, Kyla; Demeter, Janos; Engel, Stacia; Hellerstedt, Sage T; Karra, Kalpana; Hitz, Benjamin C; Nash, Robert S; Paskov, Kelley; Sheppard, Travis; Skrzypek, Marek; Weng, Shuai; Wong, Edith; Michael Cherry, J

    2016-01-01

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. To provide a wider scope of genetic and phenotypic variation in yeast, the genome sequences and their corresponding annotations from 11 alternative S. cerevisiae reference strains have been integrated into SGD. Genomic and protein sequence information for genes from these strains is now available on the Sequence and Protein tab of the corresponding Locus Summary pages. We illustrate how these genome sequences can be utilized to aid our understanding of strain-specific functional and phenotypic differences. Database URL: www.yeastgenome.org.

  13. Single Crossover-Mediated Markerless Genome Engineering in Clostridium acetobutylicum.

    PubMed

    Lee, Sang-Hyun; Kim, Hyun Ju; Shin, Yong-An; Kim, Kyoung Heon; Lee, Sang Jun

    2016-04-28

    A novel genome-engineering tool in Clostridium acetobutylicum was developed based on single-crossover homologous recombination. A small-sized non-replicable plasmid, pHKO1, was designed for efficient integration into the C. acetobutylicum genome. The integrated pHKO1 plasmid backbone, which included an antibiotic resistance gene, can be excised in vivo by Flp recombinase, leaving a single flippase recognition target sequence in the middle of the targeted gene. Since the pSHL-FLP plasmid, the carrier of the Flp recombinase gene, employed the segregationally unstable pAMβ1 replicon, the plasmid was rapidly cured from the mutant C. acetobutylicum. Consequently, our method makes it easier to engineer C. acetobutylicum.

  14. WheatGenome.info: an integrated database and portal for wheat genome information.

    PubMed

    Lai, Kaitao; Berkman, Paul J; Lorenc, Michal Tadeusz; Duran, Chris; Smits, Lars; Manoli, Sahana; Stiller, Jiri; Edwards, David

    2012-02-01

    Bread wheat (Triticum aestivum) is one of the most important crop plants, globally providing staple food for a large proportion of the human population. However, improvement of this crop has been limited due to its large and complex genome. Advances in genomics are supporting wheat crop improvement. We provide a variety of web-based systems hosting wheat genome and genomic data to support wheat research and crop improvement. WheatGenome.info is an integrated database resource which includes multiple web-based applications. These include a GBrowse2-based wheat genome viewer with BLAST search portal, TAGdb for searching wheat second-generation genome sequence data, wheat autoSNPdb, links to wheat genetic maps using CMap and CMap3D, and a wheat genome Wiki to allow interaction between diverse wheat genome sequencing activities. This system includes links to a variety of wheat genome resources hosted at other research organizations. This integrated database aims to accelerate wheat genome research and is freely accessible via the web interface at http://www.wheatgenome.info/.

  15. Principles and methods of integrative genomic analyses in cancer.

    PubMed

    Kristensen, Vessela N; Lingjærde, Ole Christian; Russnes, Hege G; Vollan, Hans Kristian M; Frigessi, Arnoldo; Børresen-Dale, Anne-Lise

    2014-05-01

    Combined analyses of molecular data, such as DNA copy-number alteration, mRNA and protein expression, point to biological functions and molecular pathways being deregulated in multiple cancers. Genomic, metabolomic and clinical data from various solid cancers and model systems are emerging and can be used to identify novel patient subgroups for tailored therapy and monitoring. The integrative genomics methodologies that are used to interpret these data require expertise in different disciplines, such as biology, medicine, mathematics, statistics and bioinformatics, and they can seem daunting. The objectives, methods and computational tools of integrative genomics that are available to date are reviewed here, as is their implementation in cancer research.

  16. Precise in-frame integration of exogenous DNA mediated by CRISPR/Cas9 system in zebrafish.

    PubMed

    Hisano, Yu; Sakuma, Tetsushi; Nakade, Shota; Ohga, Rie; Ota, Satoshi; Okamoto, Hitoshi; Yamamoto, Takashi; Kawahara, Atsuo

    2015-03-05

    The CRISPR/Cas9 system provides a powerful tool for genome editing in various model organisms, including zebrafish. The establishment of targeted gene-disrupted zebrafish (knockouts) is readily achieved by CRISPR/Cas9-mediated genome modification. Recently, exogenous DNA integration into the zebrafish genome via homology-independent DNA repair was reported, but this integration contained various mutations at the junctions of genomic and integrated DNA. Thus, precise genome modification into targeted genomic loci remains to be achieved. Here, we describe efficient, precise CRISPR/Cas9-mediated integration using a donor vector harbouring short homologous sequences (10-40 bp) flanking the genomic target locus. We succeeded in integrating with high efficiency an exogenous mCherry or eGFP gene into targeted genes (tyrosinase and krtt1c19e) in frame. We found the precise in-frame integration of exogenous DNA without backbone vector sequences when Cas9 cleavage sites were introduced at both sides of the left homology arm, the eGFP sequence and the right homology arm. Furthermore, we confirmed that this precise genome modification was heritable. This simple method enables precise targeted gene knock-in in zebrafish.

  17. [Investigation on the integrative course of genetics and genomics].

    PubMed

    Liu, Zhi-Xiang; Xu, Gang-Biao; Zeng, Chao-Zhen; Wang, Ai-Yun; Wu, Ruo-Yan

    2011-07-01

    Genomics is an important subdiscipline of genetics, and it forms a complete research system based on novel theories and techniques. Incorporating genomics into the undergraduate curriculum responds to the needs of the developing field of genetics. Teaching genomics has significant advantages for developing scientific thinking, strengthening bioethical awareness, and fostering professional interest in undergraduate students. The integration of genomics into genetics is in accordance with the principles of subject development and education. Related textbooks for undergraduate education are currently available in China, and it is feasible to set up an integrative genetics and genomics course by modifying the teaching content of the genetics course, selecting appropriate teaching approaches, and making optimal use of computer-assisted instruction.

  18. Genomic and oncogenic preference of HBV integration in hepatocellular carcinoma

    PubMed Central

    Zhao, Ling-Hao; Liu, Xiao; Yan, He-Xin; Li, Wei-Yang; Zeng, Xi; Yang, Yuan; Zhao, Jie; Liu, Shi-Ping; Zhuang, Xue-Han; Lin, Chuan; Qin, Chen-Jie; Zhao, Yi; Pan, Ze-Ya; Huang, Gang; Liu, Hui; Zhang, Jin; Wang, Ruo-Yu; Yang, Yun; Wen, Wen; Lv, Gui-Shuai; Zhang, Hui-Lu; Wu, Han; Huang, Shuai; Wang, Ming-Da; Tang, Liang; Cao, Hong-Zhi; Wang, Ling; Lee, Tin-Lap; Jiang, Hui; Tan, Ye-Xiong; Yuan, Sheng-Xian; Hou, Guo-Jun; Tao, Qi-Fei; Xu, Qin-Guo; Zhang, Xiu-Qing; Wu, Meng-Chao; Xu, Xun; Wang, Jun; Yang, Huan-Ming; Zhou, Wei-Ping; Wang, Hong-Yang

    2016-01-01

    Hepatitis B virus (HBV) can integrate into the human genome, contributing to genomic instability and hepatocarcinogenesis. Here by conducting high-throughput viral integration detection and RNA sequencing, we identify 4,225 HBV integration events in tumour and adjacent non-tumour samples from 426 patients with HCC. We show that HBV is prone to integrate into rare fragile sites and functional genomic regions including CpG islands. We observe a distinct pattern in the preferential sites of HBV integration between tumour and non-tumour tissues. HBV insertional sites are significantly enriched in the proximity of telomeres in tumours. Recurrent HBV target genes are identified with few that overlap. The overall HBV integration frequency is much higher in tumour genomes of males than in females, with a significant enrichment of integration into chromosome 17. Furthermore, a cirrhosis-dependent HBV integration pattern is observed, affecting distinct targeted genes. Our data suggest that HBV integration has a high potential to drive oncogenic transformation. PMID:27703150

  19. Integrated Microbial Genomes (IMG) System from the DOE Joint Genome Institute (JGI)

    DOE Data Explorer

    The integrated microbial genomes (IMG) system is a data management, analysis and annotation platform for all publicly available genomes. IMG contains both draft and complete JGI microbial genomes integrated with all other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and annotating genomes, genes and functions, individually or in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through quarterly releases. IMG is provided by the DOE-Joint Genome Institute (JGI) and is available from http://img.jgi.doe.gov. [Abstract from The integrated microbial genomes (IMG) system in 2007: data content and analysis tool extensions; Victor M. Markowitz, Ernest Szeto, Krishna Palaniappan, Yuri Grechkin, Ken Chu, I-Min A. Chen, Inna Dubchak, Iain Anderson, Athanasios Lykidis, Konstantinos Mavromatis, Natalia N. Ivanova and Nikos C. Kyrpides; Nucleic Acids Research, 2008, Vol. 36 (Database Issue).] See also the companion system, Integrated Microbial Genomes with Microbiome Samples.

  20. CTXφ contains a hybrid genome derived from tandemly integrated elements

    PubMed Central

    Davis, Brigid M.; Waldor, Matthew K.

    2000-01-01

    CTXφ is a filamentous, temperate bacteriophage whose genome includes ctxAB, the genes that encode cholera toxin. In toxigenic isolates of Vibrio cholerae, tandem arrays of prophage DNA, usually interspersed with the related genetic element RS1, are integrated site-specifically within the chromosome. We have discovered that these arrays routinely yield hybrid virions, composed of DNA from two adjacent prophages or from a prophage and a downstream RS1. Coding sequences are always derived from the 5′ prophage whereas most of an intergenic sequence, intergenic region 1, is always derived from the 3′ element. The presence of tandem elements is required for production of virions: V. cholerae strains that contain a solitary prophage rarely yield CTX virions, and the few virions detected result from imprecise excision of prophage DNA. Thus, generation of the replicative form of CTXφ, pCTX, a step that precedes production of virions, does not depend on reversal of the process for site-specific integration of CTXφ DNA into the V. cholerae chromosome. Production of pCTX also does not depend on RecA-mediated homologous recombination between adjacent prophages. We hypothesize that the CTXφ-specific proteins required for replication of pCTX can also function on a chromosomal substrate, and that, unlike the processes used by other integrating phages, production of pCTX and CTXφ does not require excision of the prophage from the chromosome. Use of this replication strategy maximizes vertical transmission of prophage DNA while still enabling dissemination of CTXφ to new hosts. PMID:10880564

  1. An Integrative Breakage Model of genome architecture, reshuffling and evolution: The Integrative Breakage Model of genome evolution, a novel multidisciplinary hypothesis for the study of genome plasticity.

    PubMed

    Farré, Marta; Robinson, Terence J; Ruiz-Herrera, Aurora

    2015-05-01

    Our understanding of genomic reorganization, the mechanics of genomic transmission to offspring during germ line formation, and how these structural changes contribute to the speciation process and genetic disease is far from complete. Earlier attempts to understand the mechanism(s) and constraints that govern genome remodeling suffered from being too narrowly focused, and failed to provide a unified and encompassing view of how genomes are organized and regulated inside cells. Here, we propose a new multidisciplinary Integrative Breakage Model for the study of genome evolution. The analysis of the high-level structural organization of genomes (nucleome), together with the functional constraints that accompany genome reshuffling, provides insights into the origin and plasticity of genome organization that may assist with the detection and isolation of therapeutic targets for the treatment of complex human disorders. © 2015 WILEY Periodicals, Inc.

  2. Orchidstra: an integrated orchid functional genomics database.

    PubMed

    Su, Chun-lin; Chao, Ya-Ting; Yen, Shao-Hua; Chen, Chun-Yi; Chen, Wan-Chieh; Chang, Yao-Chien Alex; Shih, Ming-Che

    2013-02-01

    A specialized orchid database, named Orchidstra (URL: http://orchidstra.abrc.sinica.edu.tw), has been constructed to collect, annotate and share genomic information for orchid functional genomics studies. The Orchidaceae is a large family of Angiosperms that exhibits extraordinary biodiversity in terms of both the number of species and their distribution worldwide. Orchids exhibit many unique biological features; however, investigation of these traits is currently constrained due to the limited availability of genomic information. Transcriptome information for five orchid species and one commercial hybrid has been included in the Orchidstra database. Altogether, these comprise >380,000 non-redundant orchid transcript sequences, of which >110,000 are protein-coding genes. Sequences from the transcriptome shotgun assembly (TSA) were obtained either from output reads from next-generation sequencing technologies assembled into contigs, or from conventional cDNA library approaches. An annotation pipeline using Gene Ontology, KEGG and Pfam was built to assign gene descriptions and functional annotation to protein-coding genes. Deep sequencing of small RNA was also performed for Phalaenopsis aphrodite to search for microRNAs (miRNAs), extending the information archived for this species to miRNA annotation, precursors and putative target genes. The P. aphrodite transcriptome information was further used to design probes for an oligonucleotide microarray, and expression profiling analysis was carried out. The intensities of hybridized probes derived from microarray assays of various tissues were incorporated into the database as part of the functional evidence. In the future, the content of the Orchidstra database will be expanded with transcriptome data and genomic information from more orchid species.

  3. An integrated approach to structural genomics.

    PubMed

    Heinemann, U; Frevert, J; Hofmann, K; Illing, G; Maurer, C; Oschkinat, H; Saenger, W

    2000-01-01

    Structural genomics aims at determining a set of protein structures that will represent all domain folds present in the biosphere. These structures can be used as the basis for the homology modelling of the majority of all remaining protein domains or, indeed, proteins. Structural genomics therefore promises to provide a comprehensive structural description of the protein universe. To achieve this, a broad scientific effort is required. The Berlin-based "Protein Structure Factory" (PSF) plans to contribute to this effort by setting up a local infrastructure for the low-cost, high-throughput analysis of soluble human proteins. In close collaboration with the German Human Genome Project (DHGP) protein-coding genes will be expressed in Escherichia coli or yeast. Affinity-tagged proteins will be purified semi-automatically for biophysical characterization and structure analysis by X-ray diffraction methods and NMR spectroscopy. In all steps of the structure analysis process, possibilities for automation, parallelization and standardization will be explored. Major new facilities that are created for the PSF include a robotic station for large-scale protein crystallization, an NMR center and an experimental station for protein crystallography at the synchrotron storage ring BESSY II in Berlin.

  4. A physical map of the papaya genome with integrated genetic map and genome sequence

    PubMed Central

    2009-01-01

    Background: Papaya is a major fruit crop in tropical and subtropical regions worldwide and has primitive sex chromosomes controlling sex determination in this trioecious species. The papaya genome was recently sequenced because of its agricultural importance, unique biological features, and successful application of transgenic papaya for resistance to papaya ringspot virus. As a part of the genome sequencing project, we constructed a BAC-based physical map using a high information-content fingerprinting approach to assist whole genome shotgun sequence assembly. Results: The physical map consists of 963 contigs, representing 9.4× genome equivalents, and was integrated with the genetic map and genome sequence using BAC end sequences and a sequence-tagged high-density genetic map. The estimated genome coverage of the physical map is about 95.8%, while 72.4% of the genome was aligned to the genetic map. A total of 1,181 high quality overgo (overlapping oligonucleotide) probes representing conserved sequences in Arabidopsis and genetically mapped loci in Brassica were anchored on the physical map, which provides a foundation for comparative genomics in the Brassicales. The integrated genetic and physical map aligned with the genome sequence revealed recombination hotspots as well as regions suppressed for recombination across the genome, particularly on the recently evolved sex chromosomes. Suppression of recombination spread to the adjacent region of the male specific region of the Y chromosome (MSY), and recombination rates were recovered gradually and then exceeded the genome average. Recombination hotspots were observed at about 10 Mb away on both sides of the MSY, showing 7-fold increase compared with the genome wide average, demonstrating the dynamics of recombination of the sex chromosomes. Conclusion: A BAC-based physical map of papaya was constructed and integrated with the genetic map and genome sequence. The integrated map facilitated the draft genome assembly

  5. Overview of the Integrated Genomic Data system (IGD)

    SciTech Connect

    Hagstrom, R.; Overbeek, R.; Price, M.; Micheals, G.S.; Taylor, R.

    1992-01-01

    In previous work, we developed a database system to support analysis of the E. coli genome. That system provided a pidgin-English query facility, rudimentary pattern-matching capabilities, and the ability to rapidly extract answers to a wide variety of questions about the organization of the E. coli genome. To enable the comparative analysis of the genomes from different species, we have designed and implemented a new prototype database system, called the Integrated Genomic Database (IGD). IGD extends our earlier effort by incorporating a set of curator's tools that facilitate the incorporation of physical and genetic data, together with the results of genome organization analysis, into a common database system. Additional tools for extracting, manipulating, and analyzing data are planned.

  6. Overview of the Integrated Genomic Data system (IGD)

    SciTech Connect

    Hagstrom, R.; Overbeek, R.; Price, M.; Micheals, G.S.; Taylor, R.

    1992-12-01

    In previous work, we developed a database system to support analysis of the E. coli genome. That system provided a pidgin-English query facility, rudimentary pattern-matching capabilities, and the ability to rapidly extract answers to a wide variety of questions about the organization of the E. coli genome. To enable the comparative analysis of the genomes from different species, we have designed and implemented a new prototype database system, called the Integrated Genomic Database (IGD). IGD extends our earlier effort by incorporating a set of curator's tools that facilitate the incorporation of physical and genetic data, together with the results of genome organization analysis, into a common database system. Additional tools for extracting, manipulating, and analyzing data are planned.

  7. IMGD: an integrated platform supporting comparative genomics and phylogenetics of insect mitochondrial genomes

    PubMed Central

    Lee, Wonhoon; Park, Jongsun; Choi, Jaeyoung; Jung, Kyongyong; Park, Bongsoo; Kim, Donghan; Lee, Jaeyoung; Ahn, Kyohun; Song, Wonho; Kang, Seogchan; Lee, Yong-Hwan; Lee, Seunghwan

    2009-01-01

    Background: Sequences and organization of the mitochondrial genome have been used as markers to investigate evolutionary history and relationships in many taxonomic groups. The rapidly increasing mitochondrial genome sequences from diverse insects provide ample opportunities to explore various global evolutionary questions in the superclass Hexapoda. To adequately support such questions, it is imperative to establish an informatics platform that facilitates the retrieval and utilization of available mitochondrial genome sequence data. Results: The Insect Mitochondrial Genome Database (IMGD) is a new integrated platform that archives the mitochondrial genome sequences from 25,747 hexapod species, including 112 completely sequenced and 20 nearly completed genomes and 113,985 partially sequenced mitochondrial genomes. The Species-driven User Interface (SUI) of IMGD supports data retrieval and diverse analyses at multi-taxon levels. The Phyloviewer implemented in IMGD provides three methods for drawing phylogenetic trees and displays the resulting trees on the web. The SNP database incorporated into IMGD presents the distribution of SNPs and INDELs in the mitochondrial genomes of multiple isolates within eight species. A newly developed comparative SNU Genome Browser supports the graphical presentation and interactive interface for the identified SNPs/INDELs. Conclusion: The IMGD provides a solid foundation for the comparative mitochondrial genomics and phylogenetics of insects. All data and functions described here are available at the web site . PMID:19351385

  8. Structures of the CRISPR genome integration complex.

    PubMed

    Wright, Addison V; Liu, Jun-Jie; Knott, Gavin J; Doxzen, Kevin W; Nogales, Eva; Doudna, Jennifer A

    2017-09-15

    CRISPR-Cas systems depend on the Cas1-Cas2 integrase to capture and integrate short foreign DNA fragments into the CRISPR locus, enabling adaptation to new viruses. We present crystal structures of Cas1-Cas2 bound to both donor and target DNA in intermediate and product integration complexes, as well as a cryo-electron microscopy structure of the full CRISPR locus integration complex, including the accessory protein IHF (integration host factor). The structures show unexpectedly that indirect sequence recognition dictates integration site selection by favoring deformation of the repeat and the flanking sequences. IHF binding bends the DNA sharply, bringing an upstream recognition motif into contact with Cas1 to increase both the specificity and efficiency of integration. These results explain how the Cas1-Cas2 CRISPR integrase recognizes a sequence-dependent DNA structure to ensure site-selective CRISPR array expansion during the initial step of bacterial adaptive immunity. Copyright © 2017, American Association for the Advancement of Science.

  9. Identifying potential cancer driver genes by genomic data integration

    PubMed Central

    Chen, Yong; Hao, Jingjing; Jiang, Wei; He, Tong; Zhang, Xuegong; Jiang, Tao; Jiang, Rui

    2013-01-01

    Cancer is a genomic disease associated with a plethora of gene mutations resulting in a loss of control over vital cellular functions. Among these mutated genes, driver genes are defined as being causally linked to oncogenesis, while passenger genes are thought to be irrelevant for cancer development. With increasing numbers of large-scale genomic datasets available, integrating these genomic data to identify driver genes from aberration regions of cancer genomes becomes an important goal of cancer genome analysis and investigations into mechanisms responsible for cancer development. A computational method, MAXDRIVER, is proposed here to identify potential driver genes on the basis of copy number aberration (CNA) regions of cancer genomes, by integrating publicly available human genomic data. MAXDRIVER employs several optimization strategies to construct a heterogeneous network, by means of combining a fused gene functional similarity network, gene-disease associations and a disease phenotypic similarity network. MAXDRIVER was validated to effectively recall known associations among genes and cancers. Previously identified as well as novel driver genes were detected by scanning CNAs of breast cancer, melanoma and liver carcinoma. Three predicted driver genes (CDKN2A, AKT1, RNF139) were found common in these three cancers by comparative analysis. PMID:24346768

  10. Identifying potential cancer driver genes by genomic data integration

    NASA Astrophysics Data System (ADS)

    Chen, Yong; Hao, Jingjing; Jiang, Wei; He, Tong; Zhang, Xuegong; Jiang, Tao; Jiang, Rui

    2013-12-01

    Cancer is a genomic disease associated with a plethora of gene mutations resulting in a loss of control over vital cellular functions. Among these mutated genes, driver genes are defined as being causally linked to oncogenesis, while passenger genes are thought to be irrelevant for cancer development. With increasing numbers of large-scale genomic datasets available, integrating these genomic data to identify driver genes from aberration regions of cancer genomes becomes an important goal of cancer genome analysis and investigations into mechanisms responsible for cancer development. A computational method, MAXDRIVER, is proposed here to identify potential driver genes on the basis of copy number aberration (CNA) regions of cancer genomes, by integrating publicly available human genomic data. MAXDRIVER employs several optimization strategies to construct a heterogeneous network, by means of combining a fused gene functional similarity network, gene-disease associations and a disease phenotypic similarity network. MAXDRIVER was validated to effectively recall known associations among genes and cancers. Previously identified as well as novel driver genes were detected by scanning CNAs of breast cancer, melanoma and liver carcinoma. Three predicted driver genes (CDKN2A, AKT1, RNF139) were found common in these three cancers by comparative analysis.

  11. Transcriptionally active genome regions are preferred targets for retrovirus integration.

    PubMed Central

    Scherdin, U; Rhodes, K; Breindl, M

    1990-01-01

    We have analyzed the transcriptional activity of cellular target sequences for Moloney murine leukemia virus integration in mouse fibroblasts. At least five of the nine random, unselected integration target sequences studied showed direct evidence for transcriptional activity by hybridization to nuclear run-on transcripts prepared from uninfected cells. At least four of the sequences contained multiple recognition sites for several restriction enzymes that cut preferentially in CpG-rich islands, indicating integration into 5' or 3' ends or flanking regions of genes. Assuming that only a minor fraction (less than 20%) of the genome is transcribed in mammalian cells, we calculated the probability that this association of retroviral integration sites with transcribed sequences is due to chance to be very low (1.6 × 10^-2). Thus, our results strongly suggest that transcriptionally active genome regions are preferred targets for retrovirus integration. PMID:2296087
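
    The abstract above does not say how its chance probability was computed. As a rough illustration only, the sketch below assumes a simple binomial model in which each of the nine integration sites independently lands in a transcribed region with probability 0.20 (the abstract's upper bound on the transcribed fraction). The function names and the choice of model are assumptions made here, not the authors' stated method.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_tail(k_min: int, n: int, p: float) -> float:
    """Probability of k_min or more successes in n independent trials."""
    return sum(binomial_pmf(k, n, p) for k in range(k_min, n + 1))


# Figures taken from the abstract: nine sites, at least five transcribed,
# and at most 20% of the genome assumed to be transcribed.
n_sites = 9
n_transcribed = 5
p_transcribed = 0.20

exactly_five = binomial_pmf(n_transcribed, n_sites, p_transcribed)
at_least_five = binomial_tail(n_transcribed, n_sites, p_transcribed)

print(f"P(exactly 5 of 9)  = {exactly_five:.4f}")
print(f"P(at least 5 of 9) = {at_least_five:.4f}")
```

    Under these assumptions the exactly-five term is about 0.0165 and the at-least-five tail is about 0.0196, both of the same order as the value quoted in the abstract. Because 0.20 is an upper bound on the transcribed fraction, a smaller true fraction would make the chance probability lower still.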

  12. Integrated genomic analyses in bronchopulmonary dysplasia.

    PubMed

    Ambalavanan, Namasivayam; Cotten, C Michael; Page, Grier P; Carlo, Waldemar A; Murray, Jeffrey C; Bhattacharya, Soumyaroop; Mariani, Thomas J; Cuna, Alain C; Faye-Petersen, Ona M; Kelly, David; Higgins, Rosemary D

    2015-03-01

    The objective was to identify single-nucleotide polymorphisms (SNPs) and pathways associated with bronchopulmonary dysplasia (BPD), because the risk of requiring O2 at 36 weeks' postmenstrual age is strongly influenced by heritable factors. A genome-wide scan was conducted on 1.2 million genotyped SNPs, and an additional 7 million imputed SNPs, using a DNA repository of extremely low birth weight infants. Genome-wide association and gene set analysis was performed for BPD or death, severe BPD or death, and severe BPD in survivors. Specific targets were validated via the use of gene expression in BPD lung tissue and in mouse models. Of 751 infants analyzed, 428 developed BPD or died. No SNPs achieved genome-wide significance (P < 10^-8), although multiple SNPs in adenosine deaminase, CD44, and other genes were just below P < 10^-6. Of approximately 8000 pathways, 75 were significant at false discovery rate (FDR) <0.1 and P < .001 for BPD/death, 95 for severe BPD/death, and 90 for severe BPD in survivors. The pathway with lowest FDR was miR-219 targets (P = 1.41E-08, FDR 9.5E-05) for BPD/death and phosphorus oxygen lyase activity (includes adenylate and guanylate cyclases) for both severe BPD/death (P = 5.68E-08, FDR 0.00019) and severe BPD in survivors (P = 3.91E-08, FDR 0.00013). Gene expression analysis confirmed significantly increased miR-219 and CD44 in BPD. Pathway analyses confirmed involvement of known pathways of lung development and repair (CD44, phosphorus oxygen lyase activity) and indicated novel molecules and pathways (adenosine deaminase, targets of miR-219) involved in genetic predisposition to BPD. Copyright © 2015 Elsevier Inc. All rights reserved.
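
    The abstract above filters pathways on both a false discovery rate and a raw P value but does not name the FDR procedure used. The sketch below assumes the common Benjamini-Hochberg adjustment and applies the same two thresholds (FDR < 0.1 and P < .001) to a handful of made-up P values. The function name and the example data are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted values (q-values), in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical pathway-level p-values, for illustration only.
pathway_p = [0.0002, 0.0008, 0.039, 0.041, 0.2]
pathway_q = benjamini_hochberg(pathway_p)

# Apply both thresholds quoted in the abstract: FDR < 0.1 and P < .001.
selected = [
    i for i, (p, q) in enumerate(zip(pathway_p, pathway_q))
    if q < 0.1 and p < 0.001
]

for p, q in zip(pathway_p, pathway_q):
    print(f"p = {p:<7} q = {q:.5f}")
print("selected indices:", selected)
```

    With these example values, four of the five pathways pass FDR < 0.1 alone, but only the first two also pass P < .001, which shows why combining the two filters is stricter than either one alone.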

  13. Nuclease-mediated genome editing: At the front-line of functional genomics technology.

    PubMed

    Sakuma, Tetsushi; Woltjen, Knut

    2014-01-01

    Genome editing with engineered endonucleases is rapidly becoming a staple method in developmental biology studies. Engineered nucleases permit random or designed genomic modification at precise loci through the stimulation of endogenous double-strand break repair. Homology-directed repair following targeted DNA damage is mediated by co-introduction of a custom repair template, allowing the derivation of knock-out and knock-in alleles in animal models previously refractory to classic gene targeting procedures. Currently there are three main types of customizable site-specific nucleases delineated by the source mechanism of DNA binding that guides nuclease activity to a genomic target: zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeats (CRISPR). Among these genome engineering tools, characteristics such as the ease of design and construction, mechanism of inducing DNA damage, and DNA sequence specificity all differ, making their application complementary. By understanding the advantages and disadvantages of each method, one may make the best choice for their particular purpose.

  14. Integrated genomic characterization of papillary thyroid carcinoma.

    PubMed

    2014-10-23

    Papillary thyroid carcinoma (PTC) is the most common type of thyroid cancer. Here, we describe the genomic landscape of 496 PTCs. We observed a low frequency of somatic alterations (relative to other carcinomas) and extended the set of known PTC driver alterations to include EIF1AX, PPM1D, and CHEK2 and diverse gene fusions. These discoveries reduced the fraction of PTC cases with unknown oncogenic driver from 25% to 3.5%. Combined analyses of genomic variants, gene expression, and methylation demonstrated that different driver groups lead to different pathologies with distinct signaling and differentiation characteristics. Similarly, we identified distinct molecular subgroups of BRAF-mutant tumors, and multidimensional analyses highlighted a potential involvement of oncomiRs in less-differentiated subgroups. Our results propose a reclassification of thyroid cancers into molecular subtypes that better reflect their underlying signaling and differentiation properties, which has the potential to improve their pathological classification and better inform the management of the disease.

  15. Integrated Genomic Characterization of Papillary Thyroid Carcinoma

    PubMed Central

    Agrawal, Nishant; Akbani, Rehan; Aksoy, B. Arman; Ally, Adrian; Arachchi, Harindra; Asa, Sylvia L.; Auman, J. Todd; Balasundaram, Miruna; Balu, Saianand; Baylin, Stephen B.; Behera, Madhusmita; Bernard, Brady; Beroukhim, Rameen; Bishop, Justin A.; Black, Aaron D.; Bodenheimer, Tom; Boice, Lori; Bootwalla, Moiz S.; Bowen, Jay; Bowlby, Reanne; Bristow, Christopher A.; Brookens, Robin; Brooks, Denise; Bryant, Robert; Buda, Elizabeth; Butterfield, Yaron S.N.; Carling, Tobias; Carlsen, Rebecca; Carter, Scott L.; Carty, Sally E.; Chan, Timothy A.; Chen, Amy Y.; Cherniack, Andrew D.; Cheung, Dorothy; Chin, Lynda; Cho, Juok; Chu, Andy; Chuah, Eric; Cibulskis, Kristian; Ciriello, Giovanni; Clarke, Amanda; Clayman, Gary L.; Cope, Leslie; Copland, John; Covington, Kyle; Danilova, Ludmila; Davidsen, Tanja; Demchok, John A.; DiCara, Daniel; Dhalla, Noreen; Dhir, Rajiv; Dookran, Sheliann S.; Dresdner, Gideon; Eldridge, Jonathan; Eley, Greg; El-Naggar, Adel K.; Eng, Stephanie; Fagin, James A.; Fennell, Timothy; Ferris, Robert L.; Fisher, Sheila; Frazer, Scott; Frick, Jessica; Gabriel, Stacey B.; Ganly, Ian; Gao, Jianjiong; Garraway, Levi A.; Gastier-Foster, Julie M.; Getz, Gad; Gehlenborg, Nils; Ghossein, Ronald; Gibbs, Richard A.; Giordano, Thomas J.; Gomez-Hernandez, Karen; Grimsby, Jonna; Gross, Benjamin; Guin, Ranabir; Hadjipanayis, Angela; Harper, Hollie A.; Hayes, D. Neil; Heiman, David I.; Herman, James G.; Hoadley, Katherine A.; Hofree, Matan; Holt, Robert A.; Hoyle, Alan P.; Huang, Franklin W.; Huang, Mei; Hutter, Carolyn M.; Ideker, Trey; Iype, Lisa; Jacobsen, Anders; Jefferys, Stuart R.; Jones, Corbin D.; Jones, Steven J.M.; Kasaian, Katayoon; Kebebew, Electron; Khuri, Fadlo R.; Kim, Jaegil; Kramer, Roger; Kreisberg, Richard; Kucherlapati, Raju; Kwiatkowski, David J.; Ladanyi, Marc; Lai, Phillip H.; Laird, Peter W.; Lander, Eric; Lawrence, Michael S.; Lee, Darlene; Lee, Eunjung; Lee, Semin; Lee, William; Leraas, Kristen M.; Lichtenberg, Tara M.; Lichtenstein, Lee; Lin, Pei; Ling, Shiyun; Liu, Jinze; Liu, Wenbin; Liu, Yingchun; LiVolsi, Virginia A.; Lu, Yiling; Ma, Yussanne; Mahadeshwar, Harshad S.; Marra, Marco A.; Mayo, Michael; McFadden, David G.; Meng, Shaowu; Meyerson, Matthew; Mieczkowski, Piotr A.; Miller, Michael; Mills, Gordon; Moore, Richard A.; Mose, Lisle E.; Mungall, Andrew J.; Murray, Bradley A.; Nikiforov, Yuri E.; Noble, Michael S.; Ojesina, Akinyemi I.; Owonikoko, Taofeek K.; Ozenberger, Bradley A.; Pantazi, Angeliki; Parfenov, Michael; Park, Peter J.; Parker, Joel S.; Paull, Evan O.; Pedamallu, Chandra Sekhar; Perou, Charles M.; Prins, Jan F.; Protopopov, Alexei; Ramalingam, Suresh S.; Ramirez, Nilsa C.; Ramirez, Ricardo; Raphael, Benjamin J.; Rathmell, W. Kimryn; Ren, Xiaojia; Reynolds, Sheila M.; Rheinbay, Esther; Ringel, Matthew D.; Rivera, Michael; Roach, Jeffrey; Robertson, A. Gordon; Rosenberg, Mara W.; Rosenthall, Matthew; Sadeghi, Sara; Saksena, Gordon; Sander, Chris; Santoso, Netty; Schein, Jacqueline E.; Schultz, Nikolaus; Schumacher, Steven E.; Seethala, Raja R.; Seidman, Jonathan; Senbabaoglu, Yasin; Seth, Sahil; Sharpe, Samantha; Mills Shaw, Kenna R.; Shen, John P.; Shen, Ronglai; Sherman, Steven; Sheth, Margi; Shi, Yan; Shmulevich, Ilya; Sica, Gabriel L.; Simons, Janae V.; Sipahimalani, Payal; Smallridge, Robert C.; Sofia, Heidi J.; Soloway, Matthew G.; Song, Xingzhi; Sougnez, Carrie; Stewart, Chip; Stojanov, Petar; Stuart, Joshua M.; Tabak, Barbara; Tam, Angela; Tan, Donghui; Tang, Jiabin; Tarnuzzer, Roy; Taylor, Barry S.; Thiessen, Nina; Thorne, Leigh; Thorsson, Vésteinn; Tuttle, R. Michael; Umbricht, Christopher B.; Van Den Berg, David J.; Vandin, Fabio; Veluvolu, Umadevi; Verhaak, Roel G.W.; Vinco, Michelle; Voet, Doug; Walter, Vonn; Wang, Zhining; Waring, Scot; Weinberger, Paul M.; Weinstein, John N.; Weisenberger, Daniel J.; Wheeler, David; Wilkerson, Matthew D.; Wilson, Jocelyn; Williams, Michelle; Winer, Daniel A.; Wise, Lisa; Wu, Junyuan; Xi, Liu; Xu, Andrew W.; Yang, Liming; Yang, Lixing; Zack, Travis I.; Zeiger, Martha A.; Zeng, Dong; Zenklusen, Jean Claude; Zhao, Ni; Zhang, Hailei; Zhang, Jianhua; Zhang, Jiashan (Julia); Zhang, Wei; Zmuda, Erik; Zou, Lihua

    2014-01-01

    Summary Papillary thyroid carcinoma (PTC) is the most common type of thyroid cancer. Here, we describe the genomic landscape of 496 PTCs. We observed a low frequency of somatic alterations (relative to other carcinomas) and extended the set of known PTC driver alterations to include EIF1AX, PPM1D and CHEK2 and diverse gene fusions. These discoveries reduced the fraction of PTC cases with unknown oncogenic driver from 25% to 3.5%. Combined analyses of genomic variants, gene expression, and methylation demonstrated that different driver groups lead to different pathologies with distinct signaling and differentiation characteristics. Similarly, we identified distinct molecular subgroups of BRAF-mutant tumors and multidimensional analyses highlighted a potential involvement of oncomiRs in less-differentiated subgroups. Our results propose a reclassification of thyroid cancers into molecular subtypes that better reflect their underlying signaling and differentiation properties, which has the potential to improve their pathological classification and better inform the management of the disease. PMID:25417114

  16. Repetitive genomic sequences as a substrate for homologous integration in the Rhizopus oryzae genome.

    PubMed

    Yuzbashev, Tigran V; Larina, Anna S; Vybornaya, Tatiana V; Yuzbasheva, Evgeniya Y; Gvilava, Ilia T; Sineoky, Sergey P

    2015-06-01

    A vast number of repetitive genomic elements were identified in the genome of Rhizopus oryzae. Such genomic repeats can be used as homologous regions for integration of plasmids. Here, we evaluated the use of two different repeats: the short (575 bp) rptZ, widely distributed (about 34 copies per genome) and the long (2053 bp) rptH, less prevalent (about 15 copies). The plasmid carrying rptZ integrated, but did so through a 2256-bp region of homology to the pyrG locus, a unique genomic sequence. Thus, the length of rptZ was below the minimal requirements for homologous strand exchange in this fungus. In contrast, rptH was used efficiently for homologous integration. The plasmid bearing this repeat integrated in multicopy fashion, with up to 25 copies arranged in tandem. The latter vector, pPyrG-H, could be a valuable tool for integration at homologous sequences, for such purposes as high-level expression of proteins. Copyright © 2015 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.

  17. Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace

    PubMed Central

    Thorvaldsdottir, Helga; Liefeld, Ted; Ocana, Marco; Borges-Rivera, Diego; Pochet, Nathalie; Robinson, James T.; Demchak, Barry; Hull, Tim; Ben-Artzi, Gil; Blankenberg, Daniel; Barber, Galt P.; Lee, Brian T.; Kuhn, Robert M.; Nekrutenko, Anton; Segal, Eran; Ideker, Trey; Reich, Michael; Regev, Aviv; Chang, Howard Y.; Mesirov, Jill P.

    2015-01-01

    Integrative analysis of multiple data types to address complex biomedical questions requires the use of multiple software tools in concert and remains an enormous challenge for most of the biomedical research community. Here we introduce GenomeSpace (http://www.genomespace.org), a cloud-based, cooperative community resource. Seeded as a collaboration of six of the most popular genomics analysis tools, GenomeSpace now supports the streamlined interaction of 20 bioinformatics tools and data resources. To help non-programming users leverage GenomeSpace in integrative analysis, it offers a growing set of ‘recipes’, short workflows involving a few tools and steps to guide investigators through high-utility analysis tasks. PMID:26780094

  18. CRISPR-Cas9-Mediated Genome Editing in Leishmania donovani

    PubMed Central

    Zhang, Wen-Wei

    2015-01-01

    ABSTRACT The prokaryotic CRISPR (clustered regularly interspaced short palindromic repeat)-Cas9, an RNA-guided endonuclease, has been shown to mediate efficient genome editing in a wide variety of organisms. In the present study, the CRISPR-Cas9 system has been adapted to Leishmania donovani, a protozoan parasite that causes fatal human visceral leishmaniasis. We introduced the Cas9 nuclease into L. donovani and generated guide RNA (gRNA) expression vectors by using the L. donovani rRNA promoter and the hepatitis delta virus (HDV) ribozyme. It is demonstrated that L. donovani mainly used homology-directed repair (HDR) and microhomology-mediated end joining (MMEJ) to repair the Cas9 nuclease-created double-strand DNA break (DSB). The nonhomologous end-joining (NHEJ) pathway appears to be absent in L. donovani. With this CRISPR-Cas9 system, it was possible to generate knockouts without selection by insertion of an oligonucleotide donor with stop codons and 25-nucleotide homology arms into the Cas9 cleavage site. Likewise, we disrupted and precisely tagged endogenous genes by inserting a bleomycin drug selection marker and GFP gene into the Cas9 cleavage site. With the use of Hammerhead and HDV ribozymes, a double-gRNA expression vector that further improved gene-targeting efficiency was developed, and it was used to make a precise deletion of the 3-kb miltefosine transporter gene (LdMT). In addition, this study identified a novel single point mutation caused by CRISPR-Cas9 in LdMT (M381T) that led to miltefosine resistance, a concern for the only available oral antileishmanial drug. Together, these results demonstrate that the CRISPR-Cas9 system represents an effective genome engineering tool for L. donovani. PMID:26199327
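
    The selection-free knockout strategy mentioned in this abstract (an oligonucleotide donor carrying stop codons, flanked by 25-nucleotide homology arms, inserted at the Cas9 cleavage site) can be illustrated with a minimal Python sketch. The locus sequence, the stop-codon cassette, and the function name `build_stop_donor` are hypothetical placeholders and are not taken from the study. The sketch also assumes an SpCas9-style blunt cut 3 bp upstream of an NGG PAM on the strand given.

```python
ARM_LENGTH = 25  # homology arm length reported in the abstract

# Hypothetical cassette (assumption): a stop codon in each reading frame
# (TAG in frame 0, TAA in frame 1, TGA in frame 2).
STOP_CASSETTE = "TAGATAAGTGA"


def build_stop_donor(locus: str, pam_start: int, arm: int = ARM_LENGTH,
                     cassette: str = STOP_CASSETTE) -> str:
    """Return left arm + cassette + right arm around the predicted cut site.

    `locus` is the top-strand sequence and `pam_start` is the 0-based index
    of the first PAM base; the cut is assumed to lie 3 bp upstream of the PAM.
    """
    cut = pam_start - 3
    if cut - arm < 0 or cut + arm > len(locus):
        raise ValueError("locus too short for the requested homology arms")
    left, right = locus[cut - arm:cut], locus[cut:cut + arm]
    return left + cassette + right


# Made-up locus containing a TGG PAM (illustration only).
locus = "ATGC" * 8 + "GATTACAGATTACAGATTAC" + "TGG" + "CATG" * 8
donor = build_stop_donor(locus, pam_start=locus.index("TGG"))
print(len(donor))  # 61 = 25 + 11 + 25
```

    A real design would also need to account for guide orientation and strand choice, which this sketch leaves out.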

  19. Integration of cancer genomics with treatment selection: from the genome to predictive biomarkers

    PubMed Central

    Ow, Thomas J.; Sandulache, Vlad C.; Skinner, Heath D.; Myers, Jeffrey N.

    2013-01-01

    The field of cancer genomics is rapidly advancing as new technology provides detailed genetic and epigenetic profiling of human cancers. The amount of new data available describing the genetic make-up of tumors is paralleled by rapid advances in drug discovery and molecular therapy currently under investigation to treat these diseases. This review summarizes the challenges and approaches associated with the integration of genomic data into the development of new biomarkers in the management of cancer. PMID:24037788

  20. A first generation integrated map of the rainbow trout genome.

    PubMed

    Palti, Yniv; Genet, Carine; Luo, Ming-Cheng; Charlet, Aurélie; Gao, Guangtu; Hu, Yuqin; Castaño-Sánchez, Cecilia; Tabet-Canale, Kamila; Krieg, Francine; Yao, Jianbo; Vallejo, Roger L; Rexroad, Caird E

    2011-04-07

    Rainbow trout (Oncorhynchus mykiss) are the most-widely cultivated cold freshwater fish in the world and an important model species for many research areas. Coupling great interest in this species as a research model with the need for genetic improvement of aquaculture production efficiency traits justifies the continued development of genomics research resources. Many quantitative trait loci (QTL) have been identified for production and life-history traits in rainbow trout. An integrated physical and genetic map is needed to facilitate fine mapping of QTL and the selection of positional candidate genes for incorporation in marker-assisted selection (MAS) programs for improving rainbow trout aquaculture production. The first generation integrated map of the rainbow trout genome is composed of 238 BAC contigs anchored to chromosomes of the genetic map. It covers more than 10% of the genome across segments from all 29 chromosomes. Anchoring of 203 contigs to chromosomes of the National Center for Cool and Cold Water Aquaculture (NCCCWA) genetic map was achieved through mapping of 288 genetic markers derived from BAC end sequences (BES), screening of the BAC library with previously mapped markers and matching of SNPs with BES reads. In addition, 35 contigs were anchored to linkage groups of the INRA (French National Institute of Agricultural Research) genetic map through markers that were not informative for linkage analysis in the NCCCWA mapping panel. The ratio of physical to genetic linkage distances varied substantially among chromosomes and BAC contigs with an average of 3,033 Kb/cM. The integrated map described here provides a framework for a robust composite genome map for rainbow trout. This resource is needed for genomic analyses in this research model and economically important species and will facilitate comparative genome mapping with other salmonids and with model fish species. This resource will also facilitate efforts to assemble a whole-genome

  1. A first generation integrated map of the rainbow trout genome

    PubMed Central

    2011-01-01

    Background Rainbow trout (Oncorhynchus mykiss) are the most-widely cultivated cold freshwater fish in the world and an important model species for many research areas. Coupling great interest in this species as a research model with the need for genetic improvement of aquaculture production efficiency traits justifies the continued development of genomics research resources. Many quantitative trait loci (QTL) have been identified for production and life-history traits in rainbow trout. An integrated physical and genetic map is needed to facilitate fine mapping of QTL and the selection of positional candidate genes for incorporation in marker-assisted selection (MAS) programs for improving rainbow trout aquaculture production. Results The first generation integrated map of the rainbow trout genome is composed of 238 BAC contigs anchored to chromosomes of the genetic map. It covers more than 10% of the genome across segments from all 29 chromosomes. Anchoring of 203 contigs to chromosomes of the National Center for Cool and Cold Water Aquaculture (NCCCWA) genetic map was achieved through mapping of 288 genetic markers derived from BAC end sequences (BES), screening of the BAC library with previously mapped markers and matching of SNPs with BES reads. In addition, 35 contigs were anchored to linkage groups of the INRA (French National Institute of Agricultural Research) genetic map through markers that were not informative for linkage analysis in the NCCCWA mapping panel. The ratio of physical to genetic linkage distances varied substantially among chromosomes and BAC contigs with an average of 3,033 Kb/cM. Conclusions The integrated map described here provides a framework for a robust composite genome map for rainbow trout. This resource is needed for genomic analyses in this research model and economically important species and will facilitate comparative genome mapping with other salmonids and with model fish species. This resource will also facilitate efforts to
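
    The figures quoted in this abstract can be cross-checked with a short Python sketch. The contig counts and the average ratio come from the abstract; the helper `physical_span_kb` and the 2 cM example interval are illustrative assumptions. Because the abstract notes that the ratio varied substantially among chromosomes and contigs, any such conversion is only a rough estimate.

```python
NCCCWA_CONTIGS = 203   # contigs anchored to the NCCCWA genetic map
INRA_CONTIGS = 35      # contigs anchored to INRA linkage groups
AVG_KB_PER_CM = 3033   # reported average physical-to-genetic ratio (Kb/cM)

# The two anchored sets add up to the 238 contigs in the integrated map.
total_contigs = NCCCWA_CONTIGS + INRA_CONTIGS


def physical_span_kb(genetic_cm: float, kb_per_cm: float = AVG_KB_PER_CM) -> float:
    """Rough physical size (Kb) of an interval given its genetic length (cM)."""
    return genetic_cm * kb_per_cm


print(total_contigs)          # 238
print(physical_span_kb(2.0))  # 6066.0
```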

  2. Integrated Genomic Analyses of Ovarian Carcinoma

    PubMed Central

    2011-01-01

    Summary The Cancer Genome Atlas (TCGA) project has analyzed mRNA expression, miRNA expression, promoter methylation, and DNA copy number in 489 high-grade serous ovarian adenocarcinomas (HGS-OvCa) and the DNA sequences of exons from coding genes in 316 of these tumors. These results show that HGS-OvCa is characterized by TP53 mutations in almost all tumors (96%); low prevalence but statistically recurrent somatic mutations in 9 additional genes including NF1, BRCA1, BRCA2, RB1, and CDK12; 113 significant focal DNA copy number aberrations; and promoter methylation events involving 168 genes. Analyses delineated four ovarian cancer transcriptional subtypes, three miRNA subtypes, four promoter methylation subtypes, a transcriptional signature associated with survival duration and shed new light on the impact on survival of tumors with BRCA1/2 and CCNE1 aberrations. Pathway analyses suggested that homologous recombination is defective in about half of tumors, and that Notch and FOXM1 signaling are involved in serous ovarian cancer pathophysiology. PMID:21720365

  3. Integrative clinical genomics of advanced prostate cancer

    PubMed Central

    Robinson, Dan; Van Allen, Eliezer M.; Wu, Yi-Mi; Schultz, Nikolaus; Lonigro, Robert J.; Mosquera, Juan-Miguel; Montgomery, Bruce; Taplin, Mary-Ellen; Pritchard, Colin C; Attard, Gerhardt; Beltran, Himisha; Abida, Wassim M.; Bradley, Robert K.; Vinson, Jake; Cao, Xuhong; Vats, Pankaj; Kunju, Lakshmi P.; Hussain, Maha; Feng, Felix Y.; Tomlins, Scott A.; Cooney, Kathleen A.; Smith, David C.; Brennan, Christine; Siddiqui, Javed; Mehra, Rohit; Chen, Yu; Rathkopf, Dana E.; Morris, Michael J.; Solomon, Stephen B.; Durack, Jeremy C.; Reuter, Victor E.; Gopalan, Anuradha; Gao, Jianjiong; Loda, Massimo; Lis, Rosina T.; Bowden, Michaela; Balk, Stephen P.; Gaviola, Glenn; Sougnez, Carrie; Gupta, Manaswi; Yu, Evan Y.; Mostaghel, Elahe A.; Cheng, Heather H.; Mulcahy, Hyojeong; True, Lawrence D.; Plymate, Stephen R.; Dvinge, Heidi; Ferraldeschi, Roberta; Flohr, Penny; Miranda, Susana; Zafeiriou, Zafeiris; Tunariu, Nina; Mateo, Joaquin; Lopez, Raquel Perez; Demichelis, Francesca; Robinson, Brian D.; Schiffman, Marc A.; Nanus, David M.; Tagawa, Scott T.; Sigaras, Alexandros; Eng, Kenneth W.; Elemento, Olivier; Sboner, Andrea; Heath, Elisabeth I.; Scher, Howard I.; Pienta, Kenneth J.; Kantoff, Philip; de Bono, Johann S.; Rubin, Mark A.; Nelson, Peter S.; Garraway, Levi A.; Sawyers, Charles L.; Chinnaiyan, Arul M.

    2015-01-01

    SUMMARY Toward development of a precision medicine framework for metastatic, castration resistant prostate cancer (mCRPC), we established a multi-institutional clinical sequencing infrastructure to conduct prospective whole exome and transcriptome sequencing of bone or soft tissue tumor biopsies from a cohort of 150 mCRPC affected individuals. Aberrations of AR, ETS genes, TP53 and PTEN were frequent (40–60% of cases), with TP53 and AR alterations enriched in mCRPC compared to primary prostate cancer. We identified novel genomic alterations in PIK3CA/B, R-spondin, BRAF/RAF1, APC, β-catenin and ZBTB16/PLZF. Aberrations of BRCA2, BRCA1 and ATM were observed at substantially higher frequencies (19.3% overall) than seen in primary prostate cancers. 89% of affected individuals harbored a clinically actionable aberration including 62.7% with aberrations in AR, 65% in other cancer-related genes, and 8% with actionable pathogenic germline alterations. This cohort study provides evidence that clinical sequencing in mCRPC is feasible and could impact treatment decisions in significant numbers of affected individuals. PMID:26000489

  4. Integrated genomic characterization of oesophageal carcinoma.

    PubMed

    2017-01-12

    Oesophageal cancers are prominent worldwide; however, there are few targeted therapies and survival rates for these cancers remain dismal. Here we performed a comprehensive molecular analysis of 164 carcinomas of the oesophagus derived from Western and Eastern populations. Beyond known histopathological and epidemiologic distinctions, molecular features differentiated oesophageal squamous cell carcinomas from oesophageal adenocarcinomas. Oesophageal squamous cell carcinomas resembled squamous carcinomas of other organs more than they did oesophageal adenocarcinomas. Our analyses identified three molecular subclasses of oesophageal squamous cell carcinomas, but none showed evidence for an aetiological role of human papillomavirus. Squamous cell carcinomas showed frequent genomic amplifications of CCND1 and SOX2 and/or TP63, whereas ERBB2, VEGFA and GATA4 and GATA6 were more commonly amplified in adenocarcinomas. Oesophageal adenocarcinomas strongly resembled the chromosomally unstable variant of gastric adenocarcinoma, suggesting that these cancers could be considered a single disease entity. However, some molecular features, including DNA hypermethylation, occurred disproportionally in oesophageal adenocarcinomas. These data provide a framework to facilitate more rational categorization of these tumours and a foundation for new therapies.

  5. Integrated translational genomics for analysis of complex traits in sorghum

    USDA-ARS's Scientific Manuscript database

    We will report on the integration of sequencing and genotype data from natural variation (by whole genome resequencing [wgs] or genotype by sequencing [gbs]), transcriptome (RNA-seq) and mutant analysis (also by wgs) with the goal of identifying genes controlling important agronomic traits and tran...

  6. Integrative genomics to dissect retinoid functions.

    PubMed

    Mendoza-Parra, Marco-Antonio; Gronemeyer, Hinrich

    2014-01-01

    Retinoids and rexinoids, like all other ligands of the nuclear receptor (NR) family, act as ligand-regulated trans-acting transcription factors that bind to cis-acting DNA regulatory elements in the promoter regions of target genes (for reviews see [12, 22, 23, 26, 36]). Ligand binding modulates the communication functions of the receptor with the intracellular environment, which essentially entails receptor-protein and receptor-DNA or receptor-chromatin interactions. In this communication network, the receptor simultaneously serves as both intracellular sensor and regulator of cell/organ functions. Receptors are "intelligent" mediators of the information encoded in the chemical structure of a nuclear receptor ligand, as they interpret this information in the context of cellular identity and cell-physiological status and convert it into a dynamic chain of receptor-protein and receptor-DNA interactions. To process input and output information, they are composed of a modular structure with several domains that have evolved to exert particular molecular recognition functions. As detailed in other chapters in this volume, the main functional domains are the DNA-binding (DBD) and ligand-binding (LBD) domains [5-7, 38, 56, 71]. The LBD serves as a dual input-output information processor. Inputs, such as ligand binding or receptor phosphorylations, induce allosteric changes in receptor surfaces that serve as docking sites for outputs, such as subunits of transcription and epigenetic machineries or enzyme complexes. The complexity of input and output signals and their interdependencies is far from being understood.

  7. Why Mitochondria Must Fuse to Maintain Their Genome Integrity

    PubMed Central

    Vidoni, Sara; Zanna, Claudia; Sarzi, Emmanuelle

    2013-01-01

    Abstract Significance: The maintenance of mitochondrial genome integrity is a major challenge for cells to sustain energy production by respiration. Recent Advances: Recently, mitochondrial membrane dynamics emerged as a key process contributing to prevent mitochondrial DNA (mtDNA) alterations. Indeed, both fundamental and clinical data suggest that disruption of mitochondrial fusion, related to mutations in the OPA1, MFN2, PINK1, and PARK2 genes, leads to the accumulation of mutations in the mitochondrial genome. Critical Issues: We discuss here the possibility that mitochondrial fusion acts as a direct mechanism to prevent the generation of altered mtDNA and to eliminate mutated deleterious genomes either by trans-complementation or by mitophagy. Future Directions: Finally, we conclude this review with a short evolutionary comparison between the mechanisms involved in mitochondrial and bacterial modes of genome distribution and plasticity, highlighting possible common conserved processes required for the maintenance of their genome integrity, which should inspire our future investigations. Antioxid. Redox Signal. 19, 379–388. PMID:23350575

  8. Integrating genomic resources of flatfish (Pleuronectiformes) to boost aquaculture production.

    PubMed

    Robledo, Diego; Hermida, Miguel; Rubiolo, Juan A; Fernández, Carlos; Blanco, Andrés; Bouza, Carmen; Martínez, Paulino

    2017-03-01

    Flatfish have high market acceptance and thus support profitable aquaculture production. The main farmed species is the turbot (Scophthalmus maximus) followed by Japanese flounder (Paralichthys olivaceus) and tongue sole (Cynoglossus semilaevis), but other species like Atlantic halibut (Hippoglossus hippoglossus), Senegalese sole (Solea senegalensis) and common sole (Solea solea) also register significant production and are very promising for farming. Important genomic resources are available for most of these species including whole genome sequencing projects, genetic maps and transcriptomes. In this work, we integrate all available genomic information of these species within a common framework, taking as reference the whole assembled genomes of turbot and tongue sole (>210× coverage). New insights related to the genetic basis of productive traits and new data useful to understand the evolutionary origin and diversification of this group were obtained. Despite a general 1:1 chromosome syntenic relationship between species, the comparison of turbot and tongue sole genomes showed extensive intrachromosomal reorganizations. The integration of available mapping information supported specific chromosome fusions along flatfish evolution and facilitated the comparison between species of previously reported genetic associations for productive traits. When comparing transcriptomic resources of the six species, a common set of ~2500 orthologues and ~150 common miRNAs was identified, and specific sets of putative missing genes were detected in flatfish transcriptomes, likely reflecting their evolutionary diversification. Copyright © 2016 Elsevier Inc. All rights reserved.

  9. DemaDb: an integrated dematiaceous fungal genomes database

    PubMed Central

    Kuan, Chee Sian; Yew, Su Mei; Chan, Chai Ling; Toh, Yue Fen; Lee, Kok Wei; Cheong, Wei-Hien; Yee, Wai-Yan; Hoh, Chee-Choong; Yap, Soon-Joo; Ng, Kee Peng

    2016-01-01

    Many species of dematiaceous fungi are associated with allergic reactions and potentially fatal diseases in humans, especially in tropical climates. Over the past 10 years, we have isolated more than 400 dematiaceous fungi from various clinical samples. In this study, DemaDb, an integrated database, was designed to support the integration and analysis of dematiaceous fungal genomes. A total of 92 072 putative genes and 6527 pathways that were identified in eight dematiaceous fungi (Bipolaris papendorfii UM 226, Daldinia eschscholtzii UM 1400, D. eschscholtzii UM 1020, Pyrenochaeta unguis-hominis UM 256, Ochroconis mirabilis UM 578, Cladosporium sphaerospermum UM 843, Herpotrichiellaceae sp. UM 238 and Pleosporales sp. UM 1110) were deposited in DemaDb. DemaDb includes functional annotations for all predicted gene models in all genomes, such as Gene Ontology, EuKaryotic Orthologous Groups, Kyoto Encyclopedia of Genes and Genomes (KEGG), Pfam and InterProScan. All predicted protein models were further functionally annotated to Carbohydrate-Active enzymes, peptidases, secondary metabolites and virulence factors. DemaDb Genome Browser enables users to browse and visualize entire genomes with annotation data including gene prediction, structure, orientation and custom feature tracks. The Pathway Browser based on the KEGG pathway database allows users to look into molecular interaction and reaction networks for all KEGG annotated genes. The availability of downloadable files containing assembly, nucleic acid, as well as protein data allows direct retrieval for downstream work. DemaDb is a useful resource for the fungal research community, especially those involved in genome-scale analysis, functional genomics, genetics and disease studies of dematiaceous fungi. Database URL: http://fungaldb.um.edu.my PMID:26980516

  10. DemaDb: an integrated dematiaceous fungal genomes database.

    PubMed

    Kuan, Chee Sian; Yew, Su Mei; Chan, Chai Ling; Toh, Yue Fen; Lee, Kok Wei; Cheong, Wei-Hien; Yee, Wai-Yan; Hoh, Chee-Choong; Yap, Soon-Joo; Ng, Kee Peng

    2016-01-01

    Many species of dematiaceous fungi are associated with allergic reactions and potentially fatal diseases in humans, especially in tropical climates. Over the past 10 years, we have isolated more than 400 dematiaceous fungi from various clinical samples. In this study, DemaDb, an integrated database, was designed to support the integration and analysis of dematiaceous fungal genomes. A total of 92 072 putative genes and 6527 pathways that were identified in eight dematiaceous fungi (Bipolaris papendorfii UM 226, Daldinia eschscholtzii UM 1400, D. eschscholtzii UM 1020, Pyrenochaeta unguis-hominis UM 256, Ochroconis mirabilis UM 578, Cladosporium sphaerospermum UM 843, Herpotrichiellaceae sp. UM 238 and Pleosporales sp. UM 1110) were deposited in DemaDb. DemaDb includes functional annotations for all predicted gene models in all genomes, such as Gene Ontology, EuKaryotic Orthologous Groups, Kyoto Encyclopedia of Genes and Genomes (KEGG), Pfam and InterProScan. All predicted protein models were further functionally annotated to Carbohydrate-Active enzymes, peptidases, secondary metabolites and virulence factors. DemaDb Genome Browser enables users to browse and visualize entire genomes with annotation data including gene prediction, structure, orientation and custom feature tracks. The Pathway Browser based on the KEGG pathway database allows users to look into molecular interaction and reaction networks for all KEGG annotated genes. The availability of downloadable files containing assembly, nucleic acid, as well as protein data allows direct retrieval for downstream work. DemaDb is a useful resource for the fungal research community, especially those involved in genome-scale analysis, functional genomics, genetics and disease studies of dematiaceous fungi. Database URL: http://fungaldb.um.edu.my.

  11. Restriction enzyme-mediated integration used to produce pathogenicity mutants of Colletotrichum graminicola.

    PubMed

    Thon, M R; Nuckles, E M; Vaillancourt, L J

    2000-12-01

    We have developed a restriction enzyme-mediated insertional mutagenesis (REMI) system for the maize pathogen Colletotrichum graminicola. In this report, we demonstrate the utility of a REMI-based mutagenesis approach to identify novel pathogenicity genes. Use of REMI increased transformation efficiency by as much as 27-fold over transformations with linearized plasmid alone. Ninety-nine transformants were examined by Southern analysis, and 51% contained simple integrations consisting of one copy of the vector integrated at a single site in the genome. All appeared to have a plasmid integration at a unique site. Sequencing across the integration sites of six transformants demonstrated that in all cases the plasmid integration occurred at the corresponding restriction enzyme-recognition site. We used an in vitro bioassay to identify two pathogenicity mutants among 660 transformants. Genomic DNA flanking the plasmid integration sites was used to identify corresponding cosmids in a wild-type genomic library. The pathogenicity of one of the mutants was restored when it was transformed with the cosmids.

  12. Megx.net: integrated database resource for marine ecological genomics

    PubMed Central

    Kottmann, Renzo; Kostadinov, Ivalyo; Duhaime, Melissa Beth; Buttigieg, Pier Luigi; Yilmaz, Pelin; Hankeln, Wolfgang; Waldmann, Jost; Glöckner, Frank Oliver

    2010-01-01

    Megx.net is a database and portal that provides integrated access to georeferenced marker genes, environment data and marine genome and metagenome projects for microbial ecological genomics. All data are stored in the Microbial Ecological Genomics DataBase (MegDB), which is subdivided to hold both sequence and habitat data and global environmental data layers. The extended system provides access to several hundreds of genomes and metagenomes from prokaryotes and phages, as well as over a million small and large subunit ribosomal RNA sequences. With the refined Genes Mapserver, all data can be interactively visualized on a world map and statistics describing environmental parameters can be calculated. Sequence entries have been curated to comply with the proposed minimal standards for genomes and metagenomes (MIGS/MIMS) of the Genomic Standards Consortium. Access to data is facilitated by Web Services. The updated megx.net portal offers microbial ecologists greatly enhanced database content, and new features and tools for data analysis, all of which are freely accessible from our webpage http://www.megx.net. PMID:19858098

  13. GEMINI: Integrative Exploration of Genetic Variation and Genome Annotations

    PubMed Central

    Paila, Umadevi; Chapman, Brad A.; Kirchner, Rory; Quinlan, Aaron R.

    2013-01-01

    Modern DNA sequencing technologies enable geneticists to rapidly identify genetic variation among many human genomes. However, isolating the minority of variants underlying disease remains an important, yet formidable challenge for medical genetics. We have developed GEMINI (GEnome MINIng), a flexible software package for exploring all forms of human genetic variation. Unlike existing tools, GEMINI integrates genetic variation with a diverse and adaptable set of genome annotations (e.g., dbSNP, ENCODE, UCSC, ClinVar, KEGG) into a unified database to facilitate interpretation and data exploration. Whereas other methods provide an inflexible set of variant filters or prioritization methods, GEMINI allows researchers to compose complex queries based on sample genotypes, inheritance patterns, and both pre-installed and custom genome annotations. GEMINI also provides methods for ad hoc queries and data exploration, a simple programming interface for custom analyses that leverage the underlying database, and both command line and graphical tools for common analyses. We demonstrate GEMINI's utility for exploring variation in personal genomes and family based genetic studies, and illustrate its ability to scale to studies involving thousands of human samples. GEMINI is designed for reproducibility and flexibility and our goal is to provide researchers with a standard framework for medical genomics. PMID:23874191

  14. GEMINI: integrative exploration of genetic variation and genome annotations.

    PubMed

    Paila, Umadevi; Chapman, Brad A; Kirchner, Rory; Quinlan, Aaron R

    2013-01-01

    Modern DNA sequencing technologies enable geneticists to rapidly identify genetic variation among many human genomes. However, isolating the minority of variants underlying disease remains an important, yet formidable challenge for medical genetics. We have developed GEMINI (GEnome MINIng), a flexible software package for exploring all forms of human genetic variation. Unlike existing tools, GEMINI integrates genetic variation with a diverse and adaptable set of genome annotations (e.g., dbSNP, ENCODE, UCSC, ClinVar, KEGG) into a unified database to facilitate interpretation and data exploration. Whereas other methods provide an inflexible set of variant filters or prioritization methods, GEMINI allows researchers to compose complex queries based on sample genotypes, inheritance patterns, and both pre-installed and custom genome annotations. GEMINI also provides methods for ad hoc queries and data exploration, a simple programming interface for custom analyses that leverage the underlying database, and both command line and graphical tools for common analyses. We demonstrate GEMINI's utility for exploring variation in personal genomes and family based genetic studies, and illustrate its ability to scale to studies involving thousands of human samples. GEMINI is designed for reproducibility and flexibility and our goal is to provide researchers with a standard framework for medical genomics.

  15. Recommendations for the integration of genomics into clinical practice.

    PubMed

    Bowdin, Sarah; Gilbert, Adel; Bedoukian, Emma; Carew, Christopher; Adam, Margaret P; Belmont, John; Bernhardt, Barbara; Biesecker, Leslie; Bjornsson, Hans T; Blitzer, Miriam; D'Alessandro, Lisa C A; Deardorff, Matthew A; Demmer, Laurie; Elliott, Alison; Feldman, Gerald L; Glass, Ian A; Herman, Gail; Hindorff, Lucia; Hisama, Fuki; Hudgins, Louanne; Innes, A Micheil; Jackson, Laird; Jarvik, Gail; Kim, Raymond; Korf, Bruce; Ledbetter, David H; Li, Mindy; Liston, Eriskay; Marshall, Christian; Medne, Livija; Meyn, M Stephen; Monfared, Nasim; Morton, Cynthia; Mulvihill, John J; Plon, Sharon E; Rehm, Heidi; Roberts, Amy; Shuman, Cheryl; Spinner, Nancy B; Stavropoulos, D James; Valverde, Kathleen; Waggoner, Darrel J; Wilkens, Alisha; Cohn, Ronald D; Krantz, Ian D

    2016-11-01

    The introduction of diagnostic clinical genome and exome sequencing (CGES) is changing the scope of practice for clinical geneticists. Many large institutions are making a significant investment in infrastructure and technology, allowing clinicians to access CGES, especially as health-care coverage begins to extend to clinically indicated genomic sequencing-based tests. Translating and realizing the comprehensive clinical benefits of genomic medicine remain a key challenge for the current and future care of patients. With the increasing application of CGES, it is necessary for geneticists and other health-care providers to understand its benefits and limitations in order to interpret the clinical relevance of genomic variants identified in the context of health and disease. New, collaborative working relationships with specialists across diverse disciplines (e.g., clinicians, laboratorians, bioinformaticians) will undoubtedly be key attributes of the future practice of clinical genetics and may serve as an example for other specialties in medicine. These new skills and relationships will also inform the development of the future model of clinical genetics training curricula. To address the evolving role of the clinical geneticist in the rapidly changing climate of genomic medicine, two Clinical Genetics Think Tank meetings were held that brought together physicians, laboratorians, scientists, genetic counselors, trainees, and patients with experience in clinical genetics, genetic diagnostics, and genetics education. This article provides recommendations that will guide the integration of genomics into clinical practice. Genet Med 18(11):1075-1084.

  16. Recommendations for the Integration of Genomics into Clinical Practice

    PubMed Central

    Bowdin, Sarah; Gilbert, Adel; Bedoukian, Emma; Carew, Christopher; Adam, Margaret P; Belmont, John; Bernhardt, Barbara; Biesecker, Leslie; Bjornsson, Hans T.; Blitzer, Miriam; D’Alessandro, Lisa C. A.; Deardorff, Matthew A.; Demmer, Laurie; Elliott, Alison; Feldman, Gerald L.; Glass, Ian A.; Herman, Gail; Hindorff, Lucia; Hisama, Fuki; Hudgins, Louanne; Innes, A. Micheil; Jackson, Laird; Jarvik, Gail; Kim, Raymond; Korf, Bruce; Ledbetter, David H.; Li, Mindy; Liston, Eriskay; Marshall, Christian; Medne, Livija; Meyn, M. Stephen; Monfared, Nasim; Morton, Cynthia; Mulvihill, John J.; Plon, Sharon E.; Rehm, Heidi; Roberts, Amy; Shuman, Cheryl; Spinner, Nancy B.; Stavropoulos, D. James; Valverde, Kathleen; Waggoner, Darrel J.; Wilkens, Alisha; Cohn, Ronald D.; Krantz, Ian D.

    2017-01-01

    The introduction of diagnostic clinical genome and exome sequencing (CGES) is changing the scope of practice for clinical geneticists. Many large institutions are making a significant investment in infrastructure and technology, allowing clinicians to access CGES especially as health care coverage begins to extend to clinically indicated genomic sequencing-based tests. Translating and realizing the comprehensive clinical benefits of genomic medicine remains a key challenge for the current and future care of patients. With the increasing application of CGES, it is necessary for geneticists and other health care providers to understand its benefits and limitations, in order to interpret the clinical relevance of genomic variants identified in the context of health and disease. Establishing new, collaborative working relationships with specialists across diverse disciplines (e.g., clinicians, laboratorians, bioinformaticians) will undoubtedly be key attributes of the future practice of clinical genetics and may serve as an example for other specialties in medicine. These new skills and relationships will also inform the development of the future model of clinical genetics training curricula. To address the evolving role of the clinical geneticist in the rapidly changing climate of genomic medicine, two Clinical Genetics Think Tank meetings were held which brought together physicians, laboratorians, scientists, genetic counselors, trainees and patients with experience in clinical genetics, genetic diagnostics, and genetics education. This paper provides recommendations that will guide the integration of genomics into clinical practice. PMID:27171546

  17. iCAGES: integrated CAncer GEnome Score for comprehensively prioritizing driver genes in personal cancer genomes.

    PubMed

    Dong, Chengliang; Guo, Yunfei; Yang, Hui; He, Zeyu; Liu, Xiaoming; Wang, Kai

    2016-12-22

    Cancer results from the acquisition of somatic driver mutations. Several computational tools can predict driver genes from population-scale genomic data, but tools for analyzing personal cancer genomes are underdeveloped. Here we developed iCAGES, a novel statistical framework that infers driver variants by integrating contributions from coding, non-coding, and structural variants, identifies driver genes by combining genomic information and prior biological knowledge, then generates prioritized drug treatment. Analysis on The Cancer Genome Atlas (TCGA) data showed that iCAGES predicts whether patients respond to drug treatment (P = 0.006 by Fisher's exact test) and long-term survival (P = 0.003 from Cox regression). iCAGES is available at http://icages.wglab.org.

  18. Chromosomally Integrated Human Herpesvirus 6: Models of Viral Genome Release from the Telomere and Impacts on Human Health

    PubMed Central

    Wood, Michael L.; Royle, Nicola J.

    2017-01-01

    Human herpesvirus 6A and 6B, alongside some other herpesviruses, have the striking capacity to integrate into telomeres, the terminal repeated regions of chromosomes. The chromosomally integrated forms, ciHHV-6A and ciHHV-6B, are proposed to be a state of latency and it has been shown that they can both be inherited if integration occurs in the germ line. The first step in full viral reactivation must be the release of the integrated viral genome from the telomere and here we propose various models of this release involving transcription of the viral genome, replication fork collapse, and t-circle mediated release. In this review, we also discuss the relationship between ciHHV-6 and the telomere carrying the insertion, particularly how the presence and subsequent partial or complete release of the ciHHV-6 genome may affect telomere dynamics and the risk of disease. PMID:28704957

  19. inGeno – an integrated genome and ortholog viewer for improved genome to genome comparisons

    PubMed Central

    Liang, Chunguang; Dandekar, Thomas

    2006-01-01

    Background: Systematic genome comparisons are an important tool to reveal gene functions, pathogenic features, metabolic pathways and genome evolution in the era of post-genomics. Furthermore, such comparisons provide important clues for vaccines and drug development. Existing genome comparison software often lacks accurate information on orthologs, the function of similar genes identified and genome-wide reports and lists on specific functions. All these features and further analyses are provided here in the context of a modular software tool "inGeno" written in Java with Biojava subroutines. Results: InGeno provides a user-friendly interactive visualization platform for sequence comparisons (comprehensive reciprocal protein – protein comparisons) between complete genome sequences and all associated annotations and features. The comparison data can be acquired from several different sequence analysis programs in flexible formats. Automatic dot-plot analysis includes output reduction, filtering, ortholog testing and linear regression, followed by smart clustering (local collinear blocks; LCBs) to reveal similar genome regions. Further, the system provides genome alignment and visualization editor, collinear relationships and strain-specific islands. Specific annotations and functions are parsed, recognized, clustered, logically concatenated and visualized and summarized in reports. Conclusion: As shown in this study, inGeno can be applied to study and compare in particular prokaryotic genomes against each other (gram positive and negative as well as close and more distantly related species) and has been proven to be sensitive and accurate. This modular software is user-friendly and easily accommodates new routines to meet specific user-defined requirements. PMID:17054788

  20. Mutation Detection with Next-Generation Resequencing through a Mediator Genome

    SciTech Connect

    Wurtzel, Omri; Dori-Bachash, Mally; Pietrokovski, Shmuel; Jurkevitch, Edouard; Sorek, Rotem; Ben-Jacob, Eshel

    2010-12-31

    The affordability of next-generation sequencing (NGS) is transforming the field of mutation analysis in bacteria. The genetic basis for phenotype alteration can be identified directly by sequencing the entire genome of the mutant and comparing it to the wild-type (WT) genome, thus identifying acquired mutations. A major limitation of this approach is the need for an a priori sequenced reference genome for the WT organism, as the short reads of most current NGS approaches usually prohibit de novo genome assembly. To overcome this limitation we propose a general framework that utilizes the genomes of related organisms as mediators for comparing WT and mutant bacteria. Under this framework, both mutant and WT genomes are sequenced with NGS, and the short sequencing reads are mapped to the mediator genome. Variations between the mutant and the mediator that recur in the WT are ignored, thus pinpointing the differences between the mutant and the WT. To validate this approach we sequenced the genome of Bdellovibrio bacteriovorus 109J, an obligatory bacterial predator, and its prey-independent mutant, and compared both to the mediator species Bdellovibrio bacteriovorus HD100. Although the mutant and the mediator sequences differed in more than 28,000 nucleotide positions, our approach enabled pinpointing the single causative mutation. Experimental validation in 53 additional mutants further established the implicated gene. Our approach extends the applicability of NGS-based mutant analyses beyond the domain of available reference genomes.
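
    The filtering step described in this abstract can be read as a set difference over variant calls. The following is a minimal sketch under that reading, not the authors' pipeline: the `Variant` record, the function name, and the toy coordinates are all hypothetical, and it assumes both samples have already been mapped to the mediator genome and variant-called.

```python
from typing import NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical record for one difference relative to the mediator genome."""
    position: int  # coordinate on the mediator genome
    ref: str       # base in the mediator
    alt: str       # base observed in the sequenced sample


def mutant_specific_variants(
    mutant_vs_mediator: Set[Variant],
    wt_vs_mediator: Set[Variant],
) -> Set[Variant]:
    """Drop every mutant variant that also appears in the wild type.

    Differences shared by both samples reflect divergence between the
    strain and the mediator species, not the acquired mutation.
    """
    return mutant_vs_mediator - wt_vs_mediator


# Toy data (invented for illustration): two shared strain-level
# differences plus one variant seen only in the mutant.
wt_calls = {Variant(120, "A", "G"), Variant(4521, "C", "T")}
mutant_calls = wt_calls | {Variant(9987, "G", "A")}

candidates = mutant_specific_variants(mutant_calls, wt_calls)
print(sorted(candidates))
```

    In practice, coverage gaps and mapping errors would also need handling before such a subtraction is trustworthy; the sketch ignores both.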

  1. Precise Genome Modification via Sequence-Specific Nucleases-Mediated Gene Targeting for Crop Improvement.

    PubMed

    Sun, Yongwei; Li, Jingying; Xia, Lanqin

    2016-01-01

    Genome editing technologies enable precise modifications of DNA sequences in vivo and offer great promise for harnessing plant genes in crop improvement. The precise manipulation of plant genomes relies on the induction of DNA double-strand breaks by sequence-specific nucleases (SSNs) to initiate DNA repair reactions that are based on either non-homologous end joining (NHEJ) or homology-directed repair (HDR). While complete knock-outs and loss-of-function mutations generated by NHEJ are very valuable in defining gene functions, their applications in crop improvement are somewhat limited because many agriculturally important traits are conferred by random point mutations or indels at specific loci in either the genes' coding or promoter regions. Therefore, genome modification through SSN-mediated HDR for gene targeting (GT) that enables either gene replacement or knock-in will provide an unprecedented ability to facilitate plant breeding by allowing introduction of precise point mutations and new gene functions, or integration of foreign genes at a specific, desired "safe harbor" in a predefined manner. The emergence of three programmable SSNs, namely zinc finger nucleases, transcriptional activator-like effector nucleases, and the clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) system, has revolutionized genome modification in plants in a more controlled manner. However, while targeted mutagenesis is becoming routine in plants, the potential of GT technology has not been well realized for trait improvement in crops, mainly because NHEJ predominates in the DNA repair process in somatic cells and competes with the HDR pathway, and thus HDR-mediated GT is a relatively rare event in plants. Here, we review recent research findings mainly focusing on development and applications of precise GT in plants using the three SSN systems described above, and the potential mechanisms underlying HDR events in plant

  2. Precise Genome Modification via Sequence-Specific Nucleases-Mediated Gene Targeting for Crop Improvement

    PubMed Central

    Sun, Yongwei; Li, Jingying; Xia, Lanqin

    2016-01-01

    Genome editing technologies enable precise modifications of DNA sequences in vivo and offer great promise for harnessing plant genes in crop improvement. The precise manipulation of plant genomes relies on the induction of DNA double-strand breaks by sequence-specific nucleases (SSNs) to initiate DNA repair reactions that are based on either non-homologous end joining (NHEJ) or homology-directed repair (HDR). While complete knock-outs and loss-of-function mutations generated by NHEJ are very valuable in defining gene functions, their applications in crop improvement are somewhat limited because many agriculturally important traits are conferred by random point mutations or indels at specific loci in either the genes’ coding or promoter regions. Therefore, genome modification through SSN-mediated HDR for gene targeting (GT) that enables either gene replacement or knock-in will provide an unprecedented ability to facilitate plant breeding by allowing introduction of precise point mutations and new gene functions, or integration of foreign genes at a specific, desired “safe harbor” in a predefined manner. The emergence of three programmable SSNs, namely zinc finger nucleases, transcriptional activator-like effector nucleases, and the clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) system, has revolutionized genome modification in plants in a more controlled manner. However, while targeted mutagenesis is becoming routine in plants, the potential of GT technology has not been well realized for trait improvement in crops, mainly because NHEJ predominates in the DNA repair process in somatic cells and competes with the HDR pathway, and thus HDR-mediated GT is a relatively rare event in plants. Here, we review recent research findings mainly focusing on development and applications of precise GT in plants using the three SSN systems described above, and the potential mechanisms underlying HDR events in

  3. G protein-coupled receptors: extranuclear mediators for the non-genomic actions of steroids.

    PubMed

    Wang, Chen; Liu, Yi; Cao, Ji-Min

    2014-09-01

    Steroid hormones possess two distinct actions, a delayed genomic effect and a rapid non-genomic effect. Rapid steroid-triggered signaling is mediated by specific receptors localized most often to the plasma membrane. The nature of these receptors is of great interest and accumulated data suggest that G protein-coupled receptors (GPCRs) are appealing candidates. Increasing evidence regarding the interaction between steroids and specific membrane proteins, as well as the involvement of G protein and corresponding downstream signaling, has led to the identification of physiologically relevant GPCRs as steroid extranuclear receptors. Examples include G protein-coupled receptor 30 (GPR30) for estrogen, membrane progestin receptor for progesterone, G protein-coupled receptor family C group 6 member A (GPRC6A) and zinc transporter member 9 (ZIP9) for androgen, and trace amine associated receptor 1 (TAAR1) for thyroid hormone. These receptor-mediated biological effects have been extended to reproductive development, cardiovascular function, neuroendocrinology and cancer pathophysiology. However, although great progress has been achieved, there are still important questions that need to be answered, including the identities of GPCRs responsible for the remaining steroids (e.g., glucocorticoid), the structural basis of the interaction between steroids and GPCRs, and the integration of extranuclear and nuclear signaling into the final physiological function. Here, we review several significant developments in this field and highlight a hypothesis that attempts to explain the general interaction between steroids and GPCRs.

  4. HOWDY: an integrated database system for human genome research

    PubMed Central

    Hirakawa, Mika

    2002-01-01

    HOWDY is an integrated database system for accessing and analyzing human genomic information (http://www-alis.tokyo.jst.go.jp/HOWDY/). HOWDY stores information about relationships between genetic objects and the data extracted from a number of databases. HOWDY consists of an Internet accessible user interface that allows thorough searching of the human genomic databases using the gene symbols and their aliases. It also permits flexible editing of the sequence data. The database can be searched using simple words and the search can be restricted to a specific cytogenetic location. Linear maps displaying markers and genes on contig sequences are available, from which an object can be chosen. Any search starting point identifies all the information matching the query. HOWDY provides a convenient search environment of human genomic data for scientists unsure which database is most appropriate for their search. PMID:11752279

  5. Enhancers as information integration hubs in development: lessons from genomics

    PubMed Central

    Buecker, Christa; Wysocka, Joanna

    2016-01-01

    Transcriptional enhancers are the primary determinants of tissue-specific gene expression. Although the majority of our current knowledge of enhancer elements comes from detailed analyses of individual loci, recent progress in epigenomics has led to the development of methods for comprehensive and conservation-independent annotation of cell type-specific enhancers. Here, we discuss the advantages and limitations of different genomic approaches to enhancer mapping and summarize observations that have been afforded by the genome-wide views of enhancer landscapes, with a focus on development. We propose that enhancers serve as information integration hubs, at which instructions encoded by the genome are read in the context of a specific cellular state, signaling milieu and chromatin environment, allowing for exquisitely precise spatiotemporal control of gene expression during embryogenesis. PMID:22487374

  6. Integrated Assessment of Genomic Correlates of Protein Evolutionary Rate

    PubMed Central

    Xia, Yu; Franzosa, Eric A.; Gerstein, Mark B.

    2009-01-01

    Rates of evolution differ widely among proteins, but the causes and consequences of such differences remain under debate. With the advent of high-throughput functional genomics, it is now possible to rigorously assess the genomic correlates of protein evolutionary rate. However, dissecting the correlations among evolutionary rate and these genomic features remains a major challenge. Here, we use an integrated probabilistic modeling approach to study genomic correlates of protein evolutionary rate in Saccharomyces cerevisiae. We measure and rank degrees of association between (i) an approximate measure of protein evolutionary rate with high genome coverage, and (ii) a diverse list of protein properties (sequence, structural, functional, network, and phenotypic). We observe, among many statistically significant correlations, that slowly evolving proteins tend to be regulated by more transcription factors, deficient in predicted structural disorder, involved in characteristic biological functions (such as translation), biased in amino acid composition, and are generally more abundant, more essential, and enriched for interaction partners. Many of these results are in agreement with recent studies. In addition, we assess information contribution of different subsets of these protein properties in the task of predicting slowly evolving proteins. We employ a logistic regression model on binned data that is able to account for intercorrelation, non-linearity, and heterogeneity within features. Our model considers features both individually and in natural ensembles (“meta-features”) in order to assess joint information contribution and degree of contribution independence. Meta-features based on protein abundance and amino acid composition make strong, partially independent contributions to the task of predicting slowly evolving proteins; other meta-features make additional minor contributions. The combination of all meta-features yields predictions comparable to those

  7. Using biological networks to integrate, visualize and analyze genomics data.

    PubMed

    Charitou, Theodosia; Bryan, Kenneth; Lynn, David J

    2016-03-31

    Network biology is a rapidly developing area of biomedical research and reflects the current view that complex phenotypes, such as disease susceptibility, are not the result of single gene mutations that act in isolation but are rather due to the perturbation of a gene's network context. Understanding the topology of these molecular interaction networks and identifying the molecules that play central roles in their structure and regulation is a key to understanding complex systems. The falling cost of next-generation sequencing is now enabling researchers to routinely catalogue the molecular components of these networks at a genome-wide scale and over a large number of different conditions. In this review, we describe how to use publicly available bioinformatics tools to integrate genome-wide 'omics' data into a network of experimentally-supported molecular interactions. In addition, we describe how to visualize and analyze these networks to identify topological features of likely functional relevance, including network hubs, bottlenecks and modules. We show that network biology provides a powerful conceptual approach to integrate and find patterns in genome-wide genomic data but we also discuss the limitations and caveats of these methods, of which researchers adopting these methods must remain aware.
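
    As a small illustration of one topological feature named in this abstract, the sketch below ranks nodes of an undirected interaction network by degree and reports the best-connected ones as candidate hubs. It is an assumption-laden toy, not a method from the review: the function names and gene labels are hypothetical, degree is only the simplest proxy for "hub", and bottleneck detection (usually based on betweenness centrality) is not covered.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str]


def degree_by_node(edges: Iterable[Edge]) -> Dict[str, int]:
    """Count distinct interaction partners for each node (undirected)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(partners) for node, partners in neighbours.items()}


def top_hubs(edges: Iterable[Edge], k: int = 1) -> List[str]:
    """Return the k highest-degree nodes; ties are broken alphabetically."""
    degrees = degree_by_node(edges)
    return sorted(degrees, key=lambda node: (-degrees[node], node))[:k]


# Invented interaction list, for illustration only.
edges = [
    ("geneA", "geneB"),
    ("geneA", "geneC"),
    ("geneA", "geneD"),
    ("geneB", "geneC"),
    ("geneD", "geneE"),
]

print(degree_by_node(edges))
print(top_hubs(edges, k=1))
```

    For real networks a dedicated graph library would normally replace these helpers; the plain-dictionary version is used here only to keep the sketch self-contained.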

  8. Integration of maternal genome into the neonate genome through breast milk mRNA transcripts and reverse transcriptase.

    PubMed

    Irmak, M Kemal; Oztas, Yesim; Oztas, Emin

    2012-06-07

    Human milk samples contain microvesicles similar to retroviruses. These microvesicles contain mRNA transcripts and possess reverse transcriptase activity. They contain about 14,000 transcripts representing the milk transcriptome. Microvesicles are also enriched with proteins related to the "caveolar-mediated endocytosis signaling" pathway. It has recently been reported that microvesicles can be transferred to other cells by endocytosis and that their RNA content can be translated and be functional in the new location. A significant percentage of the mammalian genome appears to be the product of reverse transcription, containing sequences whose characteristics point to RNA as a template precursor. These are mobile elements that move by way of transposition and are called retrotransposons. We hypothesized that retrotransposons may stem from the approximately 14,000 transcripts of breast milk microvesicles, and reviewed the literature. The enhanced acceptance of maternal allografts in children who were breast-fed and tolerance to the maternal MHC antigens after breastfeeding may stem from RNAs of the breast milk microvesicles that are taken up by the breastfed infant, which thereby receives maternal genomic information. We conclude that milk microvesicles may transfer genetic signals from mother to neonate during breastfeeding. Moreover, transfer of wild-type RNA from a healthy wet-nurse to the suckling neonate through the milk microvesicles and its subsequent reverse transcription and integration into the neonate genome could result in permanent correction of the clinical manifestations in genetic diseases.

  9. Multidimensional Integrative Genomics Approaches to Dissecting Cardiovascular Disease

    PubMed Central

    Arneson, Douglas; Shu, Le; Tsai, Brandon; Barrere-Cain, Rio; Sun, Christine; Yang, Xia

    2017-01-01

    Elucidating the mechanisms of complex diseases such as cardiovascular disease (CVD) remains a significant challenge due to multidimensional alterations at molecular, cellular, tissue, and organ levels. To better understand CVD and offer insights into the underlying mechanisms and potential therapeutic strategies, data from multiple omics types (genomics, epigenomics, transcriptomics, metabolomics, proteomics, microbiomics) from both humans and model organisms have become available. However, individual omics data types capture only a fraction of the molecular mechanisms. To address this challenge, there have been numerous efforts to develop integrative genomics methods that can leverage multidimensional information from diverse data types to derive comprehensive molecular insights. In this review, we summarize recent methodological advances in multidimensional omics integration, exemplify their applications in cardiovascular research, and pinpoint challenges and future directions in this incipient field. PMID:28289683

  10. The Tousled-Like Kinases as Guardians of Genome Integrity.

    PubMed

    De Benedetti, Arrigo

    2012-01-01

    The Tousled-like kinases (TLKs) function in processes of chromatin assembly, including replication, transcription, repair, and chromosome segregation. TLKs interact specifically with (and phosphorylate) the chromatin assembly factor Asf1, a histone H3-H4 chaperone, histone H3 itself at Ser10, and also Rad9, a key protein involved in DNA repair and cell cycle signaling following DNA damage. These interactions are believed to be responsible for the action of TLKs in double-stranded break repair and radioprotection and also in the propagation of the DNA damage response. Hence, I propose that TLKs play key roles in maintenance of genome integrity in many organisms of both kingdoms. In this paper, I highlight key issues of the known roles of these proteins, particularly in the context of DNA repair (IR and UV), their possible relevance to genome integrity and cancer development, and their potential as targets for intervention in cancer management.

  11. INTEGRATE: gene fusion discovery using whole genome and transcriptome data

    PubMed Central

    Zhang, Jin; White, Nicole M.; Schmidt, Heather K.; Fulton, Robert S.; Tomlinson, Chad; Warren, Wesley C.; Wilson, Richard K.; Maher, Christopher A.

    2016-01-01

    While next-generation sequencing (NGS) has become the primary technology for discovering gene fusions, we are still faced with the challenge of ensuring that causative mutations are not missed while minimizing false positives. Currently, there are many computational tools that predict structural variations (SV) and gene fusions using whole genome (WGS) and transcriptome sequencing (RNA-seq) data separately. However, as both WGS and RNA-seq have their limitations when used independently, we hypothesize that the orthogonal validation from integrating both data types could generate a sensitive and specific approach for detecting high-confidence gene fusion predictions. Fortunately, decreasing NGS costs have resulted in a growing number of patients with both data types available. Therefore, we developed a gene fusion discovery tool, INTEGRATE, that leverages both RNA-seq and WGS data to reconstruct gene fusion junctions and genomic breakpoints by split-read mapping. To evaluate INTEGRATE, we compared it with eight additional gene fusion discovery tools using the well-characterized breast cell line HCC1395 and peripheral blood lymphocytes derived from the same patient (HCC1395BL). The predictions subsequently underwent a targeted validation leading to the discovery of 131 novel fusions in addition to the seven previously reported fusions. Overall, INTEGRATE missed only six of the 138 validated fusions and had the highest accuracy of the nine tools evaluated. Additionally, we applied INTEGRATE to 62 breast cancer patients from The Cancer Genome Atlas (TCGA) and found multiple recurrent gene fusions including a subset involving estrogen receptor. Taken together, INTEGRATE is a highly sensitive and accurate tool that is freely available for academic use. PMID:26556708
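
    The counts reported in this abstract imply a sensitivity figure that the text does not state explicitly. The short calculation below derives it from the abstract's own numbers only; the variable names are illustrative, and the result is recall on the validated set, which may not be the same metric the authors call "accuracy".

```python
# Counts taken directly from the abstract.
novel_fusions = 131
previously_reported = 7
missed_by_integrate = 6

validated = novel_fusions + previously_reported   # 138 validated fusions
detected = validated - missed_by_integrate        # fusions INTEGRATE recovered
sensitivity = detected / validated                # recall on the validated set

print(f"{detected}/{validated} validated fusions detected ({sensitivity:.1%})")
```

    This works out to 132 of 138 validated fusions, or roughly 95.7%.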

  12. INTEGRATE: gene fusion discovery using whole genome and transcriptome data.

    PubMed

    Zhang, Jin; White, Nicole M; Schmidt, Heather K; Fulton, Robert S; Tomlinson, Chad; Warren, Wesley C; Wilson, Richard K; Maher, Christopher A

    2016-01-01

    While next-generation sequencing (NGS) has become the primary technology for discovering gene fusions, we are still faced with the challenge of ensuring that causative mutations are not missed while minimizing false positives. Currently, there are many computational tools that predict structural variations (SV) and gene fusions using whole genome (WGS) and transcriptome sequencing (RNA-seq) data separately. However, as both WGS and RNA-seq have their limitations when used independently, we hypothesize that the orthogonal validation from integrating both data could generate a sensitive and specific approach for detecting high-confidence gene fusion predictions. Fortunately, decreasing NGS costs have resulted in a growing quantity of patients with both data available. Therefore, we developed a gene fusion discovery tool, INTEGRATE, that leverages both RNA-seq and WGS data to reconstruct gene fusion junctions and genomic breakpoints by split-read mapping. To evaluate INTEGRATE, we compared it with eight additional gene fusion discovery tools using the well-characterized breast cell line HCC1395 and peripheral blood lymphocytes derived from the same patient (HCC1395BL). The predictions subsequently underwent a targeted validation leading to the discovery of 131 novel fusions in addition to the seven previously reported fusions. Overall, INTEGRATE only missed six out of the 138 validated fusions and had the highest accuracy of the nine tools evaluated. Additionally, we applied INTEGRATE to 62 breast cancer patients from The Cancer Genome Atlas (TCGA) and found multiple recurrent gene fusions including a subset involving estrogen receptor. Taken together, INTEGRATE is a highly sensitive and accurate tool that is freely available for academic use.

  13. PhytoPath: an integrative resource for plant pathogen genomics

    PubMed Central

    Pedro, Helder; Maheswari, Uma; Urban, Martin; Irvine, Alistair George; Cuzick, Alayne; McDowall, Mark D.; Staines, Daniel M.; Kulesha, Eugene; Hammond-Kosack, Kim Elizabeth; Kersey, Paul Julian

    2016-01-01

    PhytoPath (www.phytopathdb.org) is a resource for genomic and phenotypic data from plant pathogen species, that integrates phenotypic data for genes from PHI-base, an expertly curated catalog of genes with experimentally verified pathogenicity, with the Ensembl tools for data visualization and analysis. The resource is focused on fungi, protists (oomycetes) and bacterial plant pathogens that have genomes that have been sequenced and annotated. Genes with associated PHI-base data can be easily identified across all plant pathogen species using a BioMart-based query tool and visualized in their genomic context on the Ensembl genome browser. The PhytoPath resource contains data for 135 genomic sequences from 87 plant pathogen species, and 1364 genes curated for their role in pathogenicity and as targets for chemical intervention. Support for community annotation of gene models is provided using the WebApollo online gene editor, and we are working with interested communities to improve reference annotation for selected species. PMID:26476449

  14. Construction of an integrated database to support genomic sequence analysis

    SciTech Connect

    Gilbert, W.; Overbeek, R.

    1994-11-01

    The central goal of this project is to develop an integrated database to support comparative analysis of genomes including DNA sequence data, protein sequence data, gene expression data and metabolism data. In developing the logic-based system GenoBase, a broader integration of available data was achieved due to assistance from collaborators. Current goals are to easily include new forms of data as they become available and to easily navigate through the ensemble of objects described within the database. This report comments on progress made in these areas.

  15. Improved transgene integration into the Chinese hamster ovary cell genome using the Cre-loxP system.

    PubMed

    Inao, Takanori; Kawabe, Yoshinori; Yamashiro, Takuro; Kameyama, Yujiro; Wang, Xue; Ito, Akira; Kamihira, Masamichi

    2015-07-01

    Genetic engineering of cellular genomes has provided useful tools for biomedical and pharmaceutical studies such as the generation of transgenic animals and producer cells of biopharmaceutical proteins. Gene integration using site-specific recombinases enables precise transgene insertion into predetermined genomic sites if the target site sequence is introduced into a specific chromosomal locus. We previously developed an accumulative site-specific gene integration system (AGIS) using Cre and mutated loxPs. The system enabled the repeated integration of multiple transgenes into a predetermined locus of a genome. In this study, we explored applicable mutated loxP pairs for AGIS to improve the integration efficiency. The integration efficiencies of 52 mutated loxP sequences, including novel sequences, were measured using an in vitro evaluation system. Among mutated loxP pairs that exhibited a high integration efficiency, the applicability of the selected pairs to AGIS was confirmed for transgene integration into the Chinese hamster ovary cell genome. The newly found mutated loxP pairs should be useful for Cre-mediated integration of transgenes and AGIS.

  16. Computational and molecular tools for scalable rAAV-mediated genome editing

    PubMed Central

    Stoimenov, Ivaylo; Ali, Muhammad Akhtar; Pandzic, Tatjana; Sjöblom, Tobias

    2015-01-01

    The rapid discovery of potential driver mutations through large-scale mutational analyses of human cancers generates a need to characterize their cellular phenotypes. Among the techniques for genome editing, recombinant adeno-associated virus (rAAV)-mediated gene targeting is suited for knock-in of single nucleotide substitutions and to a lesser degree for gene knock-outs. However, the generation of gene targeting constructs and the targeting process is time-consuming and labor-intense. To facilitate rAAV-mediated gene targeting, we developed the first software and complementary automation-friendly vector tools to generate optimized targeting constructs for editing human protein encoding genes. By computational approaches, rAAV constructs for editing ∼71% of bases in protein-coding exons were designed. Similarly, ∼81% of genes were predicted to be targetable by rAAV-mediated knock-out. A Gateway-based cloning system for facile generation of rAAV constructs suitable for robotic automation was developed and used in successful generation of targeting constructs. Together, these tools enable automated rAAV targeting construct design, generation as well as enrichment and expansion of targeted cells with desired integrations. PMID:25488813

  17. The Conjugative Relaxase TrwC Promotes Integration of Foreign DNA in the Human Genome.

    PubMed

    González-Prieto, Coral; Gabriel, Richard; Dehio, Christoph; Schmidt, Manfred; Llosa, Matxalen

    2017-06-15

    Bacterial conjugation is a mechanism of horizontal DNA transfer. The relaxase TrwC of the conjugative plasmid R388 cleaves one strand of the transferred DNA at the oriT gene, covalently attaches to it, and leads the single-stranded DNA (ssDNA) into the recipient cell. In addition, TrwC catalyzes site-specific integration of the transferred DNA into its target sequence present in the genome of the recipient bacterium. Here, we report the analysis of the efficiency and specificity of the integrase activity of TrwC in human cells, using the type IV secretion system of the human pathogen Bartonella henselae to introduce relaxase-DNA complexes. Compared to Mob relaxase from plasmid pBGR1, we found that TrwC mediated a 10-fold increase in the rate of plasmid DNA transfer to human cells and a 100-fold increase in the rate of chromosomal integration of the transferred DNA. We used linear amplification-mediated PCR and plasmid rescue to characterize the integration pattern in the human genome. DNA sequence analysis revealed mostly reconstituted oriT sequences, indicating that TrwC is active and recircularizes transferred DNA in human cells. One TrwC-mediated site-specific integration event was detected, proving that TrwC is capable of mediating site-specific integration in the human genome, albeit with very low efficiency compared to the rate of random integration. Our results suggest that TrwC may stabilize the plasmid DNA molecules in the nucleus of the human cell, probably by recircularization of the transferred DNA strand. This stabilization would increase the opportunities for integration of the DNA by the host machinery. IMPORTANCE: Different biotechnological applications, including gene therapy strategies, require permanent modification of target cells. Long-term expression is achieved either by extrachromosomal persistence or by integration of the introduced DNA. Here, we studied the utility of conjugative relaxase TrwC, a bacterial protein with site

  18. An integrated genomics approach identifies drivers of proliferation in luminal-subtype human breast cancer.

    PubMed

    Gatza, Michael L; Silva, Grace O; Parker, Joel S; Fan, Cheng; Perou, Charles M

    2014-10-01

    Elucidating the molecular drivers of human breast cancers requires a strategy that is capable of integrating multiple forms of data and an ability to interpret the functional consequences of a given genetic aberration. Here we present an integrated genomic strategy based on the use of gene expression signatures of oncogenic pathway activity (n = 52) as a framework to analyze DNA copy number alterations in combination with data from a genome-wide RNA-mediated interference screen. We identify specific DNA amplifications and essential genes within these amplicons representing key genetic drivers, including known and new regulators of oncogenesis. The genes identified include eight that are essential for cell proliferation (FGD5, METTL6, CPT1A, DTX3, MRPS23, EIF2S2, EIF6 and SLC2A10) and are uniquely amplified in patients with highly proliferative luminal breast tumors, a clinical subset of patients for which few therapeutic options are effective. This general strategy has the potential to identify therapeutic targets within amplicons through an integrated use of genomic data sets.

  19. TALEN-mediated genome engineering to generate targeted mice.

    PubMed

    Sommer, Daniel; Peters, Annika E; Baumgart, Ann-Kathrin; Beyer, Marc

    2015-02-01

    Genetic mouse models are critical for biomedical research to understand gene function and pathophysiology. In recent years, the generation of genetic mouse models has been revolutionized by the emergence of transcription activator-like effector nucleases (TALENs). TALENs are programmable, sequence-specific DNA-binding proteins fused to a non-specific endonuclease domain used as powerful tools for site-specific induction of DNA double-strand breaks. These result in disruption of the gene product of the targeted locus by mutations induced during repair by error-prone non-homologous end-joining. Alternatively, these DNA double-strand breaks can be exploited to integrate a user-defined sequence by homologous recombination if an appropriate repair plasmid is provided. In this review, we highlight the major technological improvements for genome editing in murine oocytes which have been achieved using TALENs, discuss current limitations of the technology, suggest strategies to broadly apply TALENs, and describe possible future directions to facilitate gene editing in murine oocytes.

  20. MarinegenomicsDB: an integrated genome viewer for community-based annotation of genomes.

    PubMed

    Koyanagi, Ryo; Takeuchi, Takeshi; Hisata, Kanako; Gyoja, Fuki; Shoguchi, Eiichi; Satoh, Nori; Kawashima, Takeshi

    2013-10-01

    We constructed a web-based genome annotation platform, MarinegenomicsDB, to integrate genome data from various marine organisms including the pearl oyster Pinctada fucata and the coral Acropora digitifera. This newly developed viewer application provides open access to published data and a user-friendly environment for community-based manual gene annotation. Development on a flexible framework enables easy expansion of the website on demand. To date, more than 2000 genes have been annotated using this system. In the future, the website will be expanded to host a wider variety of data, more species, and different types of genome-wide analyses. The website is available at the following URL: http://marinegenomics.oist.jp.

  1. Cre/lox-Recombinase-Mediated Cassette Exchange for Reversible Site-Specific Genomic Targeting of the Disease Vector, Aedes aegypti.

    PubMed

    Häcker, Irina; Harrell II, Robert A; Eichner, Gerrit; Pilitt, Kristina L; O'Brochta, David A; Handler, Alfred M; Schetelig, Marc F

    2017-03-07

    Site-specific genome modification (SSM) is an important tool for mosquito functional genomics and comparative gene expression studies, which contribute to a better understanding of mosquito biology and are thus a key to finding new strategies to eliminate vector-borne diseases. Moreover, it allows for the creation of advanced transgenic strains for vector control programs. SSM circumvents the drawbacks of transposon-mediated transgenesis, where random transgene integration into the host genome results in insertional mutagenesis and variable position effects. We applied the Cre/lox recombinase-mediated cassette exchange (RMCE) system to Aedes aegypti, the vector of dengue, chikungunya, and Zika viruses. In this context we created four target site lines for RMCE and evaluated their fitness costs. Cre-RMCE is functional in a two-step mechanism and with good efficiency in Ae. aegypti. The advantages of Cre-RMCE over existing site-specific modification systems for Ae. aegypti, phiC31-RMCE and CRISPR, originate in the preservation of the recombination sites, which 1) allows successive modifications and rapid expansion or adaptation of existing systems by repeated targeting of the same site; and 2) provides reversibility, thus allowing the excision of undesired sequences. Thereby, Cre-RMCE complements existing genomic modification tools, adding flexibility and versatility to vector genome targeting.

  2. Cre/lox-Recombinase-Mediated Cassette Exchange for Reversible Site-Specific Genomic Targeting of the Disease Vector, Aedes aegypti

    PubMed Central

    Häcker, Irina; Harrell II, Robert A.; Eichner, Gerrit; Pilitt, Kristina L.; O’Brochta, David A.; Handler, Alfred M.; Schetelig, Marc F.

    2017-01-01

    Site-specific genome modification (SSM) is an important tool for mosquito functional genomics and comparative gene expression studies, which contribute to a better understanding of mosquito biology and are thus a key to finding new strategies to eliminate vector-borne diseases. Moreover, it allows for the creation of advanced transgenic strains for vector control programs. SSM circumvents the drawbacks of transposon-mediated transgenesis, where random transgene integration into the host genome results in insertional mutagenesis and variable position effects. We applied the Cre/lox recombinase-mediated cassette exchange (RMCE) system to Aedes aegypti, the vector of dengue, chikungunya, and Zika viruses. In this context we created four target site lines for RMCE and evaluated their fitness costs. Cre-RMCE is functional in a two-step mechanism and with good efficiency in Ae. aegypti. The advantages of Cre-RMCE over existing site-specific modification systems for Ae. aegypti, phiC31-RMCE and CRISPR, originate in the preservation of the recombination sites, which 1) allows successive modifications and rapid expansion or adaptation of existing systems by repeated targeting of the same site; and 2) provides reversibility, thus allowing the excision of undesired sequences. Thereby, Cre-RMCE complements existing genomic modification tools, adding flexibility and versatility to vector genome targeting. PMID:28266580

  3. Integrated Genome-Based Studies of Shewanella Ecophysiology

    SciTech Connect

    Andrei L. Osterman, Ph.D.

    2012-12-17

    Integration of bioinformatics and experimental techniques was applied to mapping and characterization of the key components (pathways, enzymes, transporters, regulators) of the core metabolic machinery in Shewanella oneidensis and related species, with the main focus on metabolic and regulatory pathways involved in utilization of various carbon and energy sources. Among the main accomplishments reflected in ten joint publications with other participants of Shewanella Federation are: (i) A systems-level reconstruction of carbohydrate utilization pathways in the genus of Shewanella (19 species). This analysis yielded reconstruction of 18 sugar utilization pathways including 10 novel pathway variants and prediction of > 60 novel protein families of enzymes, transporters and regulators involved in these pathways. Selected functional predictions were verified by focused biochemical and genetic experiments. Observed growth phenotypes were consistent with bioinformatic predictions, providing strong validation of the technology; and (ii) Global genomic reconstruction of transcriptional regulons in 16 Shewanella genomes. The inferred regulatory network includes 82 transcription factors, 8 riboswitches and 6 translational attenuators. Of those, 45 regulons were inferred directly from the genome context analysis, whereas others were propagated from previously characterized regulons in other species. Selected regulatory predictions were experimentally tested. Integration of this analysis with microarray data revealed overall consistency and provided an additional layer of interactions between regulons. All the results were captured in the new database RegPrecise, which is a joint development with the LBNL team. A more detailed analysis of the individual subsystems, pathways and regulons in Shewanella spp included bioinformatics-based prediction and experimental characterization of: (i) N-Acetylglucosamine catabolic pathway; (ii) Lactate utilization machinery; (iii) Novel Nrt

  4. Integrating hospital information systems in healthcare institutions: a mediation architecture.

    PubMed

    El Azami, Ikram; Cherkaoui Malki, Mohammed Ouçamah; Tahon, Christian

    2012-10-01

    Many studies have examined the integration of information systems into healthcare institutions, leading to several standards in the healthcare domain (CORBAmed: Common Object Request Broker Architecture in Medicine; HL7: Health Level Seven International; DICOM: Digital Imaging and Communications in Medicine; and IHE: Integrating the Healthcare Enterprise). Due to the existence of a wide diversity of heterogeneous systems, three essential factors are necessary to fully integrate a system: data, functions and workflow. However, most of the previous studies have dealt with only one or two of these factors, and this makes the system integration unsatisfactory. In this paper, we propose a flexible, scalable architecture for Hospital Information Systems (HIS). Our main purpose is to provide a practical solution to ensure HIS interoperability so that healthcare institutions can communicate without being obliged to change their local information systems and without altering the tasks of the healthcare professionals. Our architecture is a mediation architecture with 3 levels: 1) a database level, 2) a middleware level and 3) a user interface level. The mediation is based on two central components: the Mediator and the Adapter. Using the XML format allows us to establish a structured, secured exchange of healthcare data. The notion of medical ontology is introduced to solve semantic conflicts and to unify the language used for the exchange. Our mediation architecture provides an effective, promising model that promotes the integration of hospital information systems that are autonomous, heterogeneous, semantically interoperable and platform-independent.
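
    The Mediator and Adapter roles named in this abstract can be illustrated with a small sketch. It is a toy under stated assumptions, not the authors' design: the class interfaces, the in-memory dictionaries standing in for local databases, and the field-name mapping standing in for a medical ontology are all hypothetical.

```python
import xml.etree.ElementTree as ET


class Adapter:
    """Wraps one local system and translates its records to a shared XML form."""

    def __init__(self, system_name, records, field_map):
        self.system_name = system_name
        self._records = records      # hypothetical stand-in for a local database
        self._field_map = field_map  # local field name -> shared vocabulary term

    def fetch_patient(self, patient_id):
        record = self._records.get(patient_id)
        if record is None:
            return None
        elem = ET.Element("patient", id=patient_id, source=self.system_name)
        for local_name, value in record.items():
            # Rename local fields to the shared vocabulary (toy "ontology").
            shared_name = self._field_map.get(local_name, local_name)
            ET.SubElement(elem, shared_name).text = str(value)
        return elem


class Mediator:
    """Sends a query to every registered adapter and merges the XML answers."""

    def __init__(self):
        self._adapters = []

    def register(self, adapter):
        self._adapters.append(adapter)

    def query_patient(self, patient_id):
        root = ET.Element("result")
        for adapter in self._adapters:
            elem = adapter.fetch_patient(patient_id)
            if elem is not None:
                root.append(elem)
        return root


# Two local systems with different field names for their own data.
lab = Adapter("lab", {"p1": {"hb": 13.2}}, {"hb": "hemoglobin"})
ward = Adapter("ward", {"p1": {"temp_c": 37.1}}, {"temp_c": "temperature"})

mediator = Mediator()
mediator.register(lab)
mediator.register(ward)

result = mediator.query_patient("p1")
print(ET.tostring(result, encoding="unicode"))
```

    The local systems keep their own field names; only the adapters know the mapping, which matches the abstract's stated goal of interoperability without changes to local information systems.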

  5. PP2A Controls Genome Integrity by Integrating Nutrient-Sensing and Metabolic Pathways with the DNA Damage Response.

    PubMed

    Ferrari, Elisa; Bruhn, Christopher; Peretti, Marta; Cassani, Corinne; Carotenuto, Walter Vincenzo; Elgendy, Mohamed; Shubassi, Ghadeer; Lucca, Chiara; Bermejo, Rodrigo; Varasi, Mario; Minucci, Saverio; Longhese, Maria Pia; Foiani, Marco

    2017-07-20

    Mec1(ATR) mediates the DNA damage response (DDR), integrating chromosomal signals and mechanical stimuli. We show that the PP2A phosphatases, ceramide-activated enzymes, couple cell metabolism with the DDR. Using genomic screens, metabolic analysis, and genetic and pharmacological studies, we found that PP2A attenuates the DDR and that three metabolic circuits influence the DDR by modulating PP2A activity. Irc21, a putative cytochrome b5 reductase that promotes the condensation reaction generating dihydroceramides (DHCs), and Ppm1, a PP2A methyltransferase, counteract the DDR by activating PP2A; conversely, the nutrient-sensing TORC1-Tap42 axis sustains DDR activation by inhibiting PP2A. Loss-of-function mutations in IRC21, PPM1, and PP2A and hyperactive tap42 alleles rescue mec1 mutants. Ceramides synergize with rapamycin, a TORC1 inhibitor, in counteracting the DDR. Hence, PP2A integrates nutrient-sensing and metabolic pathways to attenuate the Mec1(ATR) response. Our observations imply that metabolic changes affect genome integrity and may help with exploiting therapeutic options and repositioning known drugs. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.

  6. Tetrahymena functional genomics database (TetraFGD): an integrated resource for Tetrahymena functional genomics.

    PubMed

    Xiong, Jie; Lu, Yuming; Feng, Jinmei; Yuan, Dongxia; Tian, Miao; Chang, Yue; Fu, Chengjie; Wang, Guangying; Zeng, Honghui; Miao, Wei

    2013-01-01

    The ciliated protozoan Tetrahymena thermophila is a useful unicellular model organism for studies of eukaryotic cellular and molecular biology. Research on T. thermophila has contributed to a series of remarkable basic biological principles. Since the macronuclear genome was sequenced, substantial progress has been made in functional genomics research on T. thermophila, including genome-wide microarray analysis of the T. thermophila life cycle, a T. thermophila gene network analysis based on the microarray data and transcriptome analysis by deep RNA sequencing. To meet the growing demands of the Tetrahymena research community, we integrated these data to provide a public access database: Tetrahymena functional genomics database (TetraFGD). TetraFGD contains three major resources, including the RNA-Seq transcriptome, microarray and gene networks. The RNA-Seq data define gene structures and transcriptome, with special emphasis on exon-intron boundaries; the microarray data describe gene expression of 20 time points during three major stages of the T. thermophila life cycle; the gene network data identify potential gene-gene interactions of 15 049 genes. The TetraFGD provides user-friendly search functions that assist researchers in accessing gene models, transcripts, gene expression data and gene-gene relationships. In conclusion, the TetraFGD is an important functional genomic resource for researchers who focus on the Tetrahymena or other ciliates. Database URL: http://tfgd.ihb.ac.cn/

  7. Replication termination at eukaryotic chromosomes is mediated by Top2 and occurs at genomic loci containing pausing elements.

    PubMed

    Fachinetti, Daniele; Bermejo, Rodrigo; Cocito, Andrea; Minardi, Simone; Katou, Yuki; Kanoh, Yutaka; Shirahige, Katsuhiko; Azvolinsky, Anna; Zakian, Virginia A; Foiani, Marco

    2010-08-27

    Chromosome replication initiates at multiple replicons and terminates when forks converge. In E. coli, the Tus-TER complex mediates polar fork converging at the terminator region, and aberrant termination events challenge chromosome integrity and segregation. Since in eukaryotes, termination is less characterized, we used budding yeast to identify the factors assisting fork fusion at replicating chromosomes. Using genomic and mechanistic studies, we have identified and characterized 71 chromosomal termination regions (TERs). TERs contain fork pausing elements that influence fork progression and merging. The Rrm3 DNA helicase assists fork progression across TERs, counteracting the accumulation of X-shaped structures. The Top2 DNA topoisomerase associates at TERs in S phase, and G2/M facilitates fork fusion and prevents DNA breaks and genome rearrangements at TERs. We propose that in eukaryotes, replication fork barriers, Rrm3, and Top2 coordinate replication fork progression and fusion at TERs, thus counteracting abnormal genomic transitions.

  8. Visualization of RNA structure models within the Integrative Genomics Viewer.

    PubMed

    Busan, Steven; Weeks, Kevin M

    2017-07-01

    Analyses of the interrelationships between RNA structure and function are increasingly important components of genomic studies. The SHAPE-MaP strategy enables accurate RNA structure probing and realistic structure modeling of kilobase-length noncoding RNAs and mRNAs. Existing tools for visualizing RNA structure models are not suitable for efficient analysis of long, structurally heterogeneous RNAs. In addition, structure models are often advantageously interpreted in the context of other experimental data and gene annotation information, for which few tools currently exist. We have developed a module within the widely used and well supported open-source Integrative Genomics Viewer (IGV) that allows visualization of SHAPE and other chemical probing data, including raw reactivities, data-driven structural entropies, and data-constrained base-pair secondary structure models, in context with linear genomic data tracks. We illustrate the usefulness of visualizing RNA structure in the IGV by exploring structure models for a large viral RNA genome, comparing bacterial mRNA structure in cells with its structure under cell- and protein-free conditions, and comparing a noncoding RNA structure modeled using SHAPE data with a base-pairing model inferred through sequence covariation analysis. © 2017 Busan and Weeks; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  9. Preventing Replication Fork Collapse to Maintain Genome Integrity

    PubMed Central

    Cortez, David

    2015-01-01

    Billions of base pairs of DNA must be replicated trillions of times in a human lifetime. Complete and accurate replication once and only once per cell division cycle is essential to maintain genome integrity and prevent disease. Impediments to replication fork progression including difficult to replicate DNA sequences, conflicts with transcription, and DNA damage further add to the genome maintenance challenge. These obstacles frequently cause fork stalling, but only rarely cause a failure to complete replication. Robust mechanisms ensure that stalled forks remain stable and capable of either resuming DNA synthesis or being rescued by converging forks. However, when failures do happen the fork collapses leading to genome rearrangements, cell death and disease. Despite intense interest, the mechanisms to repair damaged replication forks, stabilize them, and ensure successful replication remain only partly understood. Different models of fork collapse have been proposed with varying descriptions of what happens to the DNA and replisome. Here, I will define fork collapse and describe what is known about how the replication checkpoint prevents it to maintain genome stability. PMID:25957489

  10. Site-Specific Integration of Foreign DNA into Minimal Bacterial and Human Target Sequences Mediated by a Conjugative Relaxase

    PubMed Central

    Agúndez, Leticia; González-Prieto, Coral; Machón, Cristina; Llosa, Matxalen

    2012-01-01

    Background: Bacterial conjugation is a mechanism for horizontal DNA transfer between bacteria which requires cell to cell contact, usually mediated by self-transmissible plasmids. A protein known as relaxase is responsible for the processing of DNA during bacterial conjugation. TrwC, the relaxase of conjugative plasmid R388, is also able to catalyze site-specific integration of the transferred DNA into a copy of its target, the origin of transfer (oriT), present in a recipient plasmid. This reaction confers on TrwC a high biotechnological potential as a tool for genomic engineering. Methodology/Principal Findings: We have characterized this reaction by conjugal mobilization of a suicide plasmid to a recipient cell with an oriT-containing plasmid, selecting for the cointegrates. Proteins TrwA and IHF enhanced integration frequency. TrwC could also catalyze integration when it was expressed from the recipient cell. Both Y18 and Y26 catalytic tyrosyl residues were essential to perform the reaction, while TrwC DNA helicase activity was dispensable. The target DNA could be reduced to 17 bp encompassing TrwC nicking and binding sites. Two human genomic sequences resembling the 17 bp segment were accepted as targets for TrwC-mediated site-specific integration. TrwC could also integrate the incoming DNA molecule into an oriT copy present in the recipient chromosome. Conclusions/Significance: The results support a model for TrwC-mediated site-specific integration. This reaction may allow R388 to integrate into the genome of non-permissive hosts upon conjugative transfer. Also, the ability to act on target sequences present in the human genome underscores the biotechnological potential of conjugative relaxase TrwC as a site-specific integrase for genomic modification of human cells. PMID:22292089
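
    The abstract mentions human genomic sequences "resembling" the 17 bp target but does not say how they were found. One plausible way to look for such near matches is a sliding-window mismatch scan, sketched below. The 17-nt target string is a placeholder, not the real oriT segment, and the mismatch threshold is an arbitrary assumption.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_near_matches(sequence, target, max_mismatches):
    """Return (position, window, mismatches) for windows close to the target."""
    k = len(target)
    hits = []
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        d = hamming(window, target)
        if d <= max_mismatches:
            hits.append((i, window, d))
    return hits


# Placeholder 17-nt string, NOT the real oriT sequence from the paper.
target = "ACGTTGCAAGCTTAGCA"

# Toy "genome": one exact copy and one copy carrying two substitutions.
genome = "GGG" + "ACGTTGCAAGCTTAGCA" + "TTT" + "ACGTTGCTAGCTTAGGA" + "CC"

hits = find_near_matches(genome, target, max_mismatches=2)
for position, window, mismatches in hits:
    print(position, window, mismatches)
```

    This sketch scans one strand only and ignores insertions and deletions; a real search would also check the reverse complement and would typically use an indexed aligner rather than a linear scan.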

  11. High-level Genomic Integration, Epigenetic Changes, and Expression of Sleeping Beauty Transgene

    PubMed Central

    Zhu, Jianhui; Park, Chang Won; Sjeklocha, Lucas; Kren, Betsy T.; Steer, Clifford J.

    2010-01-01

    Sleeping Beauty transposon (SB-Tn) has emerged as an important nonviral vector for integrating transgenes into mammalian genomes. We report here a novel dual fluorescent reporter cis SB-Tn system that permitted nonselective fluorescence-activated cell sorting for SB-Tn-transduced K562 erythroid cells. Using an internal ribosome entry site element, the green fluorescent protein (eGFP) was linked to the SB10 transposase gene as an indirect marker for the robust expression of SB10 transposase. Fluorescence-activated cell sorting (FACS) by eGFP resulted in significant enrichment (> 60%) of cells exhibiting SB-Tn-mediated genomic insertions and long-term expression of a DsRed transgene. The hybrid erythroid-specific promoter of DsRed transgene was verified in erythroid or megakaryocyte differentiation of K562 cells. Bisulfite-mediated genomic analyses identified different DNA methylation patterns between DsRed+ and DsRed− cell clones, suggesting a critical role in transgene expression. Moreover, although the host genomic copy of the promoter element showed no CpG methylation, the same sequence carried by the transgene was markedly hypermethylated. Additional evidence also suggested a role for histone deacetylation in the regulation of DsRed transgene. The presence of SB transgene affected the expression of neighboring host genes at distances > 45 kb. Our data suggested that fluorescent reporter cis SB-Tn system can be used to enrich mammalian cells harboring SB-mediated transgene insertions. The epigenetic modification detected on the DsRed transgene demonstrated that transgenes inserted by SB could also be selectively modified by host cellular epigenetic systems. In addition, long-range activation of host genes must now be recognized as a potential consequence of an inserted transgene cassette containing enhancer elements. PMID:20041635
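
    Bisulfite treatment converts unmethylated cytosines so that they are read as T, while methylated cytosines remain C. The sketch below shows how per-CpG methylation levels could be tallied from such reads. It assumes reads that are already aligned, ungapped, and on the same strand as the reference; the sequences are invented and the function name is hypothetical, so this is not the analysis pipeline used in the study.

```python
def cpg_methylation(reference, reads):
    """Fraction of reads retaining C at each CpG cytosine of the reference.

    Assumes every read is pre-aligned, ungapped, and as long as the reference.
    """
    cpg_positions = [i for i in range(len(reference) - 1)
                     if reference[i:i + 2] == "CG"]
    levels = {}
    for pos in cpg_positions:
        # C = protected (methylated), T = converted (unmethylated).
        calls = [read[pos] for read in reads if read[pos] in "CT"]
        if calls:
            levels[pos] = calls.count("C") / len(calls)
    return levels


# Invented example: CpG cytosines sit at positions 2 and 6 of the reference.
reference = "ATCGGACGTT"
reads = [
    "ATCGGATGTT",
    "ATCGGATGTT",
    "ATCGGACGTT",
    "ATTGGATGTT",
]

levels = cpg_methylation(reference, reads)
print(levels)
```

    Comparing such per-site fractions between clones, or between a transgene copy and the host copy of the same promoter, is the kind of contrast the abstract describes.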

  12. Integrating mass spectrometry and genomics for cyanobacterial metabolite discovery.

    PubMed

    Moss, Nathan A; Bertin, Matthew J; Kleigrewe, Karin; Leão, Tiago F; Gerwick, Lena; Gerwick, William H

    2016-03-01

    Filamentous marine cyanobacteria produce bioactive natural products with both potential therapeutic value and capacity to be harmful to human health. Genome sequencing has revealed that cyanobacteria have the capacity to produce many more secondary metabolites than have been characterized. The biosynthetic pathways that encode cyanobacterial natural products are mostly uncharacterized, and lack of cyanobacterial genetic tools has largely prevented their heterologous expression. Hence, a combination of cutting edge and traditional techniques has been required to elucidate their secondary metabolite biosynthetic pathways. Here, we review the discovery and refined biochemical understanding of the olefin synthase and fatty acid ACP reductase/aldehyde deformylating oxygenase pathways to hydrocarbons, and the curacin A, jamaicamide A, lyngbyabellin, columbamide, and a trans-acyltransferase macrolactone pathway encoding phormidolide. We integrate into this discussion the use of genomics, mass spectrometric networking, biochemical characterization, and isolation and structure elucidation techniques.

  13. Integration of competing ancillary assertions in genome assembly

    SciTech Connect

    Burks, C.; Parsons, R.J.; Engle, M.L.

    1994-12-31

    Assembly of genomic sequences and maps relies on a primary set of experimental data (e.g., the sequences of individual DNA fragments, or hybridization fingerprints of individual clone inserts), but almost always also relies on several streams of related but distinct kinds of data for completeness and accuracy of the final construction. These secondary data sets, which we term ancillary information, usually contain errors (as do the primary data sets, therefore creating the possibility of conflict between data sets), often arise from different experimental protocols and correspond to different scales of measurement, and occasionally include non-quantitative statements about the data. We present an approach for integration of ancillary assertions in the optimization of genome assembly, based on simultaneous balancing among the primary and secondary data sets, and include specific examples in the context of assembling DNA sequencing fragments to reconstruct a parent sequence.
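
    The idea of balancing primary data against ancillary assertions can be illustrated with a small weighted-cost sketch. This is not the authors' method: the `Assertion` class, the weights, the overlap scores, and the exhaustive search are all hypothetical choices made only to show how a weighted secondary constraint can change which layout scores best.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Callable, Dict, Sequence, Tuple


@dataclass(frozen=True)
class Assertion:
    """Hypothetical ancillary assertion: a weight plus a penalty function."""
    name: str
    weight: float
    penalty: Callable[[Sequence[str]], float]  # 0.0 when the layout satisfies it


def primary_cost(order: Sequence[str], overlap: Dict[Tuple[str, str], int]) -> float:
    """Primary-data cost: negative sum of overlap scores of adjacent fragments."""
    return -sum(overlap.get((a, b), 0) for a, b in zip(order, order[1:]))


def total_cost(order, overlap, assertions) -> float:
    """Primary cost plus the weighted penalties of all ancillary assertions."""
    return primary_cost(order, overlap) + sum(
        a.weight * a.penalty(order) for a in assertions
    )


def best_layout(fragments, overlap, assertions):
    """Exhaustive search over orderings; only sensible for a few fragments."""
    return min(permutations(fragments), key=lambda o: total_cost(o, overlap, assertions))


# Made-up overlap scores between ordered fragment pairs (illustrative only).
overlap = {("A", "B"): 10, ("B", "C"): 10, ("A", "C"): 12, ("C", "B"): 9}

# A made-up map-derived assertion: fragment B lies before fragment C.
b_before_c = Assertion(
    name="B before C",
    weight=5.0,
    penalty=lambda order: 0.0 if order.index("B") < order.index("C") else 1.0,
)

primary_only = best_layout("ABC", overlap, [])
balanced = best_layout("ABC", overlap, [b_before_c])
```

    With these invented numbers the primary data alone favour the order A, C, B, while adding the weighted assertion tips the balance to A, B, C. A real assembler would replace the exhaustive search with a heuristic optimizer.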

  14. Integrating mass spectrometry and genomics for cyanobacterial metabolite discovery

    PubMed Central

    Bertin, Matthew J.; Kleigrewe, Karin; Leão, Tiago F.; Gerwick, Lena

    2016-01-01

    Filamentous marine cyanobacteria produce bioactive natural products with both potential therapeutic value and capacity to be harmful to human health. Genome sequencing has revealed that cyanobacteria have the capacity to produce many more secondary metabolites than have been characterized. The biosynthetic pathways that encode cyanobacterial natural products are mostly uncharacterized, and lack of cyanobacterial genetic tools has largely prevented their heterologous expression. Hence, a combination of cutting edge and traditional techniques has been required to elucidate their secondary metabolite biosynthetic pathways. Here, we review the discovery and refined biochemical understanding of the olefin synthase and fatty acid ACP reductase/aldehyde deformylating oxygenase pathways to hydrocarbons, and the curacin A, jamaicamide A, lyngbyabellin, columbamide, and a trans-acyltransferase macrolactone pathway encoding phormidolide. We integrate into this discussion the use of genomics, mass spectrometric networking, biochemical characterization, and isolation and structure elucidation techniques. PMID:26578313

  15. On the road with WRAP53β: guardian of Cajal bodies and genome integrity.

    PubMed

    Henriksson, Sofia; Farnebo, Marianne

    2015-01-01

    The WRAP53 gene encodes both an antisense transcript (WRAP53α) that stabilizes the tumor suppressor p53 and a protein (WRAP53β) involved in maintenance of Cajal bodies, telomere elongation and DNA repair. WRAP53β is one of many proteins containing WD40 domains, known to mediate a variety of cellular processes. These proteins lack enzymatic activity, acting instead as platforms for the assembly of large complexes of proteins and RNAs thus facilitating their interactions. WRAP53β mediates site-specific interactions between Cajal body factors and DNA repair proteins. Moreover, dysfunction of this protein has been linked to premature aging, cancer and neurodegeneration. Here we summarize the current state of knowledge concerning the multifaceted roles of WRAP53β in intracellular trafficking, formation of the Cajal body, DNA repair and maintenance of genomic integrity and discuss potential crosstalk between these processes.

  16. GDR (Genome Database for Rosaceae): integrated web resources for Rosaceae genomics and genetics research

    PubMed Central

    Jung, Sook; Jesudurai, Christopher; Staton, Margaret; Du, Zhidian; Ficklin, Stephen; Cho, Ilhyung; Abbott, Albert; Tomkins, Jeffrey; Main, Dorrie

    2004-01-01

    Background Peach is being developed as a model organism for Rosaceae, an economically important family that includes fruits and ornamental plants such as apple, pear, strawberry, cherry, almond and rose. The genomics and genetics data of peach can play a significant role in the gene discovery and the genetic understanding of related species. The effective utilization of these peach resources, however, requires the development of an integrated and centralized database with associated analysis tools. Description The Genome Database for Rosaceae (GDR) is a curated and integrated web-based relational database. GDR contains comprehensive data of the genetically anchored peach physical map, an annotated peach EST database, Rosaceae maps and markers and all publicly available Rosaceae sequences. Annotations of ESTs include contig assembly, putative function, simple sequence repeats, and anchored position to the peach physical map where applicable. Our integrated map viewer provides graphical interface to the genetic, transcriptome and physical mapping information. ESTs, BACs and markers can be queried by various categories and the search result sites are linked to the integrated map viewer or to the WebFPC physical map sites. In addition to browsing and querying the database, users can compare their sequences with the annotated GDR sequences via a dedicated sequence similarity server running either the BLAST or FASTA algorithm. To demonstrate the utility of the integrated and fully annotated database and analysis tools, we describe a case study where we anchored Rosaceae sequences to the peach physical and genetic map by sequence similarity. Conclusions The GDR has been initiated to meet the major deficiency in Rosaceae genomics and genetics research, namely a centralized web database and bioinformatics tools for data storage, analysis and exchange. GDR can be accessed at . PMID:15357877

  17. Integration Preferences of Wildtype AAV-2 for Consensus Rep-Binding Sites at Numerous Loci in the Human Genome

    PubMed Central

    Hüser, Daniela; Gogol-Döring, Andreas; Lutter, Timo; Weger, Stefan; Winter, Kerstin; Hammer, Eva-Maria; Cathomen, Toni; Reinert, Knut; Heilbronn, Regine

    2010-01-01

    Adeno-associated virus type 2 (AAV) is known to establish latency by preferential integration in human chromosome 19q13.42. The AAV non-structural protein Rep appears to target a site called AAVS1 by simultaneously binding to Rep-binding sites (RBS) present on the AAV genome and within AAVS1. In the absence of Rep, as is the case with AAV vectors, chromosomal integration is rare and random. For a genome-wide survey of wildtype AAV integration a linker-selection-mediated (LSM)-PCR strategy was designed to retrieve AAV-chromosomal junctions. DNA sequence determination revealed wildtype AAV integration sites scattered over the entire human genome. The bioinformatic analysis of these integration sites compared to those of rep-deficient AAV vectors revealed a highly significant overrepresentation of integration events near consensus RBS. Integration hotspots included AAVS1 with 10% of total events. Novel hotspots near consensus RBS were identified on chromosome 5p13.3 denoted AAVS2 and on chromosome 3p24.3 denoted AAVS3. AAVS2 displayed seven independent junctions clustered within only 14 bp of a consensus RBS which proved to bind Rep in vitro similar to the RBS in AAVS3. Expression of Rep in the presence of rep-deficient AAV vectors shifted targeting preferences from random integration back to the neighbourhood of consensus RBS at hotspots and numerous additional sites in the human genome. In summary, targeted AAV integration is not as specific for AAVS1 as previously assumed. Rather, Rep targets AAV to integrate into open chromatin regions in the reach of various consensus RBS homologues in the human genome. PMID:20628575

  18. Integration preferences of wildtype AAV-2 for consensus rep-binding sites at numerous loci in the human genome.

    PubMed

    Hüser, Daniela; Gogol-Döring, Andreas; Lutter, Timo; Weger, Stefan; Winter, Kerstin; Hammer, Eva-Maria; Cathomen, Toni; Reinert, Knut; Heilbronn, Regine

    2010-07-08

    Adeno-associated virus type 2 (AAV) is known to establish latency by preferential integration in human chromosome 19q13.42. The AAV non-structural protein Rep appears to target a site called AAVS1 by simultaneously binding to Rep-binding sites (RBS) present on the AAV genome and within AAVS1. In the absence of Rep, as is the case with AAV vectors, chromosomal integration is rare and random. For a genome-wide survey of wildtype AAV integration a linker-selection-mediated (LSM)-PCR strategy was designed to retrieve AAV-chromosomal junctions. DNA sequence determination revealed wildtype AAV integration sites scattered over the entire human genome. The bioinformatic analysis of these integration sites compared to those of rep-deficient AAV vectors revealed a highly significant overrepresentation of integration events near consensus RBS. Integration hotspots included AAVS1 with 10% of total events. Novel hotspots near consensus RBS were identified on chromosome 5p13.3 denoted AAVS2 and on chromosome 3p24.3 denoted AAVS3. AAVS2 displayed seven independent junctions clustered within only 14 bp of a consensus RBS which proved to bind Rep in vitro similar to the RBS in AAVS3. Expression of Rep in the presence of rep-deficient AAV vectors shifted targeting preferences from random integration back to the neighbourhood of consensus RBS at hotspots and numerous additional sites in the human genome. In summary, targeted AAV integration is not as specific for AAVS1 as previously assumed. Rather, Rep targets AAV to integrate into open chromatin regions in the reach of various consensus RBS homologues in the human genome.

  19. Integrative gene transfer in the truffle Tuber borchii by Agrobacterium tumefaciens-mediated transformation.

    PubMed

    Brenna, Andrea; Montanini, Barbara; Muggiano, Eleonora; Proietto, Marco; Filetici, Patrizia; Ottonello, Simone; Ballario, Paola

    2014-01-01

    Agrobacterium tumefaciens-mediated transformation is a powerful tool for reverse genetics and functional genomic analysis in a wide variety of plants and fungi. Tuber spp. are ecologically important and gastronomically prized fungi ("truffles") with a cryptic life cycle, a subterranean habitat and a symbiotic, but also facultative saprophytic lifestyle. The genome of a representative member of this group of fungi has recently been sequenced. However, because of their poor genetic tractability, including transformation, truffles have so far eluded in-depth functional genomic investigations. Here we report that A. tumefaciens can infect Tuber borchii mycelia, thereby conveying its transfer DNA with the production of stably integrated transformants. We constructed two new binary plasmids (pABr1 and pABr3) and tested them as improved transformation vectors using the green fluorescent protein as reporter gene and hygromycin phosphotransferase as selection marker. Transformants were stable for at least 12 months of in vitro culture propagation and, as revealed by TAIL-PCR analysis, integration sites appear to be heterogeneous, with a preference for repeat element-containing genome sites.

  20. Survey of Nursing Integration of Genomics Into Nursing Practice

    PubMed Central

    Calzone, Kathleen A.; Jenkins, Jean; Yates, Jan; Cusack, Georgie; Wallen, Gwenyth R.; Liewehr, David J.; Steinberg, Seth M.; McBride, Colleen

    2012-01-01

    Purpose Translating clinically valid genomic discoveries into practice hinges not only on technologic advances, but also on nurses, the largest global contingent of health providers, acquiring requisite competencies to apply these discoveries in clinical care. The study aim was to assess practicing nurse attitudes, practices, receptivity, confidence, and competency of integrating genomics into nursing practice. Design A convenience sample of practicing nurses was recruited to complete an online survey that assessed domains from Rogers’ Diffusion of Innovations Theory and used family history utilization as the basis for competency assessment. Methods Results were tabulated and analyzed using descriptive statistical techniques. Findings Two-hundred-thirty-nine licensed registered nurses, 22 to 72 years of age, with a median of 20 years in practice, responded, for an overall response rate of 28%. Most were White (83%), female (92%), and held baccalaureate degrees (56%). Seventy-one percent considered genetics to be very important to nursing practice; however, 81% rated their understanding of the genetics of common diseases as poor or fair. Per-question response rates varied widely. Instrument assessment indicated that modifications were necessary to decrease respondent burden. Conclusions Respondents’ perceived genomic competency was inadequate, family history was not routinely utilized in care delivery, and the extent of family history varied widely. However, most nurses indicated interest in pursuing continuing genomic education. Clinical Relevance Findings from this study can lead to the development of targeted education that will facilitate optimal workforce preparation for the ongoing influx of genetics and genomics information, technologies, and targeted therapies into the healthcare arena. This pilot study provides a foundation on which to build the next step, which includes a national nursing workforce study. PMID:23205780

  1. An integrative computational approach for prioritization of genomic variants

    SciTech Connect

    Dubchak, Inna; Balasubramanian, Sandhya; Wang, Sheng; Meydan, Cem; Sulakhe, Dinanath; Poliakov, Alexander; Börnigen, Daniela; Xie, Bingqing; Taylor, Andrew; Ma, Jianzhu; Paciorkowski, Alex R.; Mirzaa, Ghayda M.; Dave, Paul; Agam, Gady; Xu, Jinbo; Al-Gazali, Lihadh; Mason, Christopher E.; Ross, M. Elizabeth; Maltsev, Natalia; Gilliam, T. Conrad; Huang, Qingyang

    2014-12-15

    An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. This study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest.

  2. An Integrative Computational Approach for Prioritization of Genomic Variants

    PubMed Central

    Wang, Sheng; Meydan, Cem; Sulakhe, Dinanath; Poliakov, Alexander; Börnigen, Daniela; Xie, Bingqing; Taylor, Andrew; Ma, Jianzhu; Paciorkowski, Alex R.; Mirzaa, Ghayda M.; Dave, Paul; Agam, Gady; Xu, Jinbo; Al-Gazali, Lihadh; Mason, Christopher E.; Ross, M. Elizabeth; Maltsev, Natalia; Gilliam, T. Conrad

    2014-01-01

    An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. The study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest. PMID:25506935

  3. An integrative computational approach for prioritization of genomic variants

    DOE PAGES

    Dubchak, Inna; Balasubramanian, Sandhya; Wang, Sheng; ...

    2014-12-15

    An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. This study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest.

  4. Integrated Genome-Based Studies of Shewanella Ecophysiology

    SciTech Connect

    Margrethe H. Serres

    2012-06-29

    Shewanella oneidensis MR-1 is a motile, facultative γ-Proteobacterium with remarkable respiratory versatility; it can utilize a range of organic and inorganic compounds as terminal electron acceptors for anaerobic metabolism. The ability to effectively reduce nitrate, S0, polyvalent metals and radionuclides has established MR-1 as an important model dissimilatory metal-reducing microorganism for genome-based investigations of biogeochemical transformation of metals and radionuclides that are of concern to the U.S. Department of Energy (DOE) sites nationwide. Metal-reducing bacteria such as Shewanella also have a highly developed capacity for extracellular transfer of respiratory electrons to solid phase Fe and Mn oxides as well as directly to anode surfaces in microbial fuel cells. More broadly, Shewanellae are recognized free-living microorganisms and members of microbial communities involved in the decomposition of organic matter and the cycling of elements in aquatic and sedimentary systems. To function and compete in environments that are subject to spatial and temporal environmental change, Shewanella must be able to sense and respond to such changes and therefore requires relatively robust sensing and regulation systems. The overall goal of this project is to apply the tools of genomics, leveraging the availability of genome sequence for 18 additional strains of Shewanella, to better understand the ecophysiology and speciation of respiratory-versatile members of this important genus. To understand these systems we propose to use genome-based approaches to investigate Shewanella as a system of integrated networks; first describing key cellular subsystems - those involved in signal transduction, regulation, and metabolism - then building towards understanding the function of whole cells and, eventually, cells within populations. As a general approach, this project will employ complementary "top-down" - bioinformatics-based genome functional predictions, high

  5. Integrative Genomics Identifies Gene Signature Associated with Melanoma Ulceration

    PubMed Central

    Toth, Reka; Vizkeleti, Laura; Hernandez-Vargas, Hector; Lazar, Viktoria; Emri, Gabriella; Szatmari, Istvan; Herceg, Zdenko; Adany, Roza; Balazs, Margit

    2013-01-01

    Background Despite the extensive research approaches applied to characterise malignant melanoma, no specific molecular markers are available that are clearly related to the progression of this disease. In this study, our aims were to define a gene expression signature associated with the clinical outcome of melanoma patients and to provide an integrative interpretation of the gene expression, copy number alteration, and promoter methylation patterns that contribute to clinically relevant molecular functional alterations. Methods Gene expression profiles were determined using the Affymetrix U133 Plus2.0 array. The NimbleGen Human CGH Whole-Genome Tiling array was used to define copy number alterations (CNAs), and the Illumina GoldenGate Methylation platform was applied to characterise the methylation patterns of overlapping genes. Results We identified two subclasses of primary melanoma: one representing patients with better prognoses and the other being characteristic of patients with unfavourable outcomes. We assigned 1,080 genes as being significantly correlated with ulceration; 987 genes were downregulated and significantly enriched in the p53, Nf-kappaB, and WNT/beta-catenin pathways. Through integrated genome analysis, we defined 150 downregulated genes whose expression correlated with copy number losses in ulcerated samples. These genes were significantly enriched on chromosome 6q and 10q, which contained a total of 36 genes. Ten of these genes were downregulated and involved in cell-cell and cell-matrix adhesion or apoptosis. The expression and methylation patterns of additional genes exhibited an inverse correlation, suggesting that transcriptional silencing of these genes is driven by epigenetic events. Conclusion Using an integrative genomic approach, we were able to identify functionally relevant molecular hotspots characterised by copy number losses and promoter hypermethylation in distinct molecular subtypes of melanoma that contribute to specific transcriptomic silencing

  6. CRISPR/Cas9-mediated genome modification in the mollusc, Crepidula fornicata.

    PubMed

    Perry, Kimberly J; Henry, Jonathan Q

    2015-02-01

    The discovery and application of the CRISPR/Cas9 genome editing method has greatly enhanced the ease with which transgenic manipulation can occur. We applied this technology to the mollusc, Crepidula fornicata, and have successfully created transgenic embryos expressing mCherry fused to endogenous β-catenin. Specific integration of the fluorescent reporter was achieved by homologous recombination with a β-catenin-specific donor DNA containing the mCherry coding sequence. This fluorescent gene knock-in strategy permits in vivo observations of β-catenin expression during embryonic development and represents the first demonstration of CRISPR/Cas9-mediated transgenesis in the Lophotrochozoa superphylum. The CRISPR/Cas9 method is a powerful and economical tool for genome modification and presents an option for analysis of gene expression in not only major model systems, but also in those more diverse species that may not have been amenable to the classic methods of transgenesis. This approach will allow one to generate transgenic lines of snails for future studies.

  7. Integrated Genomic Map from Uropathogenic Escherichia coli J96

    PubMed Central

    Melkerson-Watson, Lyla J.; Rode, Christopher K.; Zhang, Lixin; Foxman, Betsy; Bloch, Craig A.

    2000-01-01

    Escherichia coli J96 is a uropathogen having both broad similarities to and striking differences from nonpathogenic, laboratory E. coli K-12. Strain J96 contains three large (>100-kb) unique genomic segments integrated on the chromosome; two are recognized as pathogenicity islands containing urovirulence genes. Additionally, the strain possesses a fourth smaller accessory segment of 28 kb and two deletions relative to strain K-12. We report an integrated physical and genetic map of the 5,120-kb J96 genome. The chromosome contains 26 NotI, 13 BlnI, and 7 I-CeuI macrorestriction sites. Macrorestriction mapping was rapidly accomplished by a novel transposon-based procedure: analysis of modified minitransposon insertions served to align the overlapping macrorestriction fragments generated by three different enzymes (each sharing a common cleavage site within the insert), thus integrating the three different digestion patterns and ordering the fragments. The resulting map, generated from a total of 54 mini-Tn10 insertions, was supplemented with auxanography and Southern analysis to indicate the positions of insertionally disrupted aminosynthetic genes and cloned virulence genes, respectively. Thus, it contains not only physical, macrorestriction landmarks but also the loci for eight housekeeping genes shared with strain K-12 and eight acknowledged urovirulence genes; the latter confirmed clustering of virulence genes at the large unique accessory chromosomal segments. The 115-kb J96 plasmid was resolved by pulsed-field gel electrophoresis in NotI digests. However, because the plasmid lacks restriction sites for the enzymes BlnI and I-CeuI, it was visualized in BlnI and I-CeuI digests only of derivatives carrying plasmid inserts artificially introducing these sites. Owing to an I-SceI site on the transposon, the plasmid could also be visualized and sized from plasmid insertion mutants after digestion with this enzyme. The insertional strains generated in construction of

  8. Integrative Genomic Characterization and a Genomic Staging System for Gastrointestinal Stromal Tumors

    PubMed Central

    Ylipää, Antti; Hunt, Kelly K.; Yang, Jilong; Lazar, Alexander J. F.; Torres, Keila E.; Lev, Dina Chelouche; Nykter, Matti; Pollock, Raphael E.; Trent, Jonathan; Zhang, Wei

    2010-01-01

    Gastrointestinal stromal tumors (GISTs) were historically grouped with leiomyosarcomas (LMSs) based on their morphological similarities, but recently they have been unequivocally established as a distinct type of sarcoma based on the molecular features and response to imatinib treatment. To gain further insight into the genomic differences between GISTs and LMSs, we mapped gene copy number aberrations (CNAs) in 42 GISTs and 30 LMSs and integrated them with gene expression profiles. Our studies revealed distinct patterns of CNAs between GISTs and LMSs. Losses in chromosomes 1p, 14q, 15q, and 22q were significantly more frequent in GISTs than in LMSs (P < 0.001), whereas losses in chromosomes 10 and 16 as well as gains in 1q, 14q, and 15q (P < 0.001) were more common in LMSs. By integrating CNAs with gene expression data and clinical information, we found several clinically relevant CNAs that were prognostic of survival in patients with GIST. Furthermore, GISTs were categorized into four groups according to an accumulating pattern of genetic alterations. Many key cellular pathways were differently expressed in the four groups and the patients had increasingly worse prognosis as the extent of genomic alterations increased. These findings lead us to propose a new tumor-progression genetic staging system termed Genomic Instability Stage (GIS) to complement the current prognostic predictive system based on tumor size, mitotic index (MI), and KIT mutation. PMID:20818650

  9. Genome-wide analyses of LINE–LINE-mediated nonallelic homologous recombination

    PubMed Central

    Startek, Michał; Szafranski, Przemyslaw; Gambin, Tomasz; Campbell, Ian M.; Hixson, Patricia; Shaw, Chad A.; Stankiewicz, Paweł; Gambin, Anna

    2015-01-01

    Nonallelic homologous recombination (NAHR), occurring between low-copy repeats (LCRs) >10 kb in size and sharing >97% DNA sequence identity, is responsible for the majority of recurrent genomic rearrangements in the human genome. Recent studies have shown that transposable elements (TEs) can also mediate recurrent deletions and translocations, indicating the features of substrates that mediate NAHR may be significantly less stringent than previously believed. Using >4 kb length and >95% sequence identity criteria, we analyzed the genome-wide distribution of long interspersed element (LINE) retrotransposons and their potential to mediate NAHR. We identified 17 005 directly oriented LINE pairs located <10 Mbp from each other as potential NAHR substrates, placing 82.8% of the human genome at risk of LINE–LINE-mediated instability. Cross-referencing these regions with CNVs in the Baylor College of Medicine clinical chromosomal microarray database of 36 285 patients, we identified 516 CNVs potentially mediated by LINEs. Using long-range PCR of five different genomic regions in a total of 44 patients, we confirmed that the CNV breakpoints in each patient map within the LINE elements. To additionally assess the scale of the LINE–LINE/NAHR phenomenon in the human genome, we tested DNA samples from six healthy individuals on a custom aCGH microarray targeting LINE elements predicted to mediate CNVs and identified 25 LINE–LINE rearrangements. Our data indicate that LINE–LINE-mediated NAHR is widespread and under-recognized, and is an important mechanism of structural rearrangement contributing to human genomic variability. PMID:25613453
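
    The pair-selection criteria quoted in this abstract (>4 kb length, >95% identity, direct orientation, <10 Mbp apart) can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: the `Line` record, the `identity` callback, and the choice to measure distance as the gap between the two elements are assumptions made here.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Iterable, List, Tuple

MIN_LENGTH = 4_000          # bp, ">4 kb length" criterion from the abstract
MIN_IDENTITY = 0.95         # ">95% sequence identity" criterion
MAX_DISTANCE = 10_000_000   # bp, "<10 Mbp from each other" criterion


@dataclass(frozen=True)
class Line:
    """Hypothetical annotation record for one LINE element."""
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"
    name: str


def candidate_pairs(
    elements: Iterable[Line],
    identity: Callable[[Line, Line], float],
) -> List[Tuple[Line, Line]]:
    """Return directly oriented, nearby, similar LINE pairs.

    `identity` is a caller-supplied function (an assumption of this sketch)
    that returns the fractional sequence identity of two elements.
    """
    long_enough = [e for e in elements if e.end - e.start > MIN_LENGTH]
    ordered = sorted(long_enough, key=lambda e: (e.chrom, e.start))
    pairs = []
    for a, b in combinations(ordered, 2):
        if a.chrom != b.chrom or a.strand != b.strand:
            continue  # different chromosome, or not directly oriented
        if b.start - a.end >= MAX_DISTANCE:
            continue  # too far apart (distance taken as the gap, an assumption)
        if identity(a, b) > MIN_IDENTITY:
            pairs.append((a, b))
    return pairs


# Made-up coordinates, for illustration only.
elements = [
    Line("chr1", 1_000, 7_000, "+", "L1a"),
    Line("chr1", 2_000_000, 2_006_100, "+", "L1b"),
    Line("chr1", 3_000_000, 3_006_000, "-", "L1c"),    # opposite orientation
    Line("chr1", 4_000_000, 4_002_000, "+", "L1e"),    # shorter than 4 kb
    Line("chr1", 50_000_000, 50_006_000, "+", "L1d"),  # more than 10 Mbp away
    Line("chr2", 1_000, 7_000, "+", "L1f"),            # different chromosome
]

pairs = candidate_pairs(elements, identity=lambda a, b: 0.97)
pair_names = [(a.name, b.name) for a, b in pairs]
```

    Comparing all pairs is quadratic in the number of elements; a genome-scale run would more plausibly sweep a sorted list with a sliding 10 Mbp window.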

  10. The Proteins API: accessing key integrated protein and genome information.

    PubMed

    Nightingale, Andrew; Antunes, Ricardo; Alpi, Emanuele; Bursteinas, Borisas; Gonzales, Leonardo; Liu, Wudong; Luo, Jie; Qi, Guoying; Turner, Edd; Martin, Maria

    2017-04-05

    The Proteins API provides searching and programmatic access to protein and associated genomics data such as curated protein sequence positional annotations from UniProtKB, as well as mapped variation and proteomics data from large scale data sources (LSS). Using the coordinates service, researchers are able to retrieve the genomic sequence coordinates for proteins in UniProtKB. These coordinates, together with the LSS genomics and proteomics data for UniProt proteins, are programmatically available only through this service. A Swagger UI has been implemented to provide documentation and an interface that lets users with little or no programming experience 'talk' to the services, quickly and easily formulate queries, and obtain dynamically generated source code for popular programming languages, such as Java, Perl, Python and Ruby. Search results are returned as standard JSON, XML or GFF data objects. The Proteins API is a scalable, reliable, fast, easy-to-use set of RESTful services that provides a broad protein information resource, allowing users to ask questions based upon their field of expertise and to gain an integrated overview of the protein annotations available, aiding their understanding of proteins in biological processes. The Proteins API is available at (http://www.ebi.ac.uk/proteins/api/doc).
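
    A request to the coordinates service described above might look like the following minimal sketch. The base URL is derived from the documentation address in the abstract (with `https` assumed); the `/coordinates/{accession}` path, the helper names `coordinates_url` and `fetch_coordinates`, and the shape of the returned JSON are assumptions that should be checked against the Swagger documentation before use.

```python
import json
import urllib.parse
import urllib.request

# Base derived from the documentation URL in the abstract; https is assumed.
BASE_URL = "https://www.ebi.ac.uk/proteins/api"


def coordinates_url(accession: str) -> str:
    """Build the (assumed) coordinates endpoint URL for a UniProtKB accession."""
    return f"{BASE_URL}/coordinates/{urllib.parse.quote(accession, safe='')}"


def fetch_coordinates(accession: str, timeout: float = 30.0):
    """Request genomic coordinates as JSON; the response layout is not assumed here."""
    request = urllib.request.Request(
        coordinates_url(accession),
        headers={"Accept": "application/json"},  # JSON is one of the stated formats
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

    Calling `fetch_coordinates("P05067")` would issue one HTTP GET; the accession is only an example value. The abstract also lists XML and GFF among the supported result formats.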

  11. The Proteins API: accessing key integrated protein and genome information

    PubMed Central

    Antunes, Ricardo; Alpi, Emanuele; Gonzales, Leonardo; Liu, Wudong; Luo, Jie; Qi, Guoying; Turner, Edd

    2017-01-01

    The Proteins API provides searching and programmatic access to protein and associated genomics data such as curated protein sequence positional annotations from UniProtKB, as well as mapped variation and proteomics data from large scale data sources (LSS). Using the coordinates service, researchers are able to retrieve the genomic sequence coordinates for proteins in UniProtKB. These coordinates, together with the LSS genomics and proteomics data for UniProt proteins, are programmatically available only through this service. A Swagger UI has been implemented to provide documentation and an interface that lets users with little or no programming experience ‘talk’ to the services, quickly and easily formulate queries, and obtain dynamically generated source code for popular programming languages, such as Java, Perl, Python and Ruby. Search results are returned as standard JSON, XML or GFF data objects. The Proteins API is a scalable, reliable, fast, easy-to-use set of RESTful services that provides a broad protein information resource, allowing users to ask questions based upon their field of expertise and to gain an integrated overview of the protein annotations available, aiding their understanding of proteins in biological processes. The Proteins API is available at (http://www.ebi.ac.uk/proteins/api/doc). PMID:28383659

  12. The Npl3 hnRNP prevents R-loop-mediated transcription-replication conflicts and genome instability.

    PubMed

    Santos-Pereira, José M; Herrero, Ana B; García-Rubio, María L; Marín, Antonio; Moreno, Sergio; Aguilera, Andrés

    2013-11-15

    Transcription is a major obstacle for replication fork (RF) progression and a cause of genome instability. Part of this instability is mediated by cotranscriptional R loops, which are believed to increase by suboptimal assembly of the nascent messenger ribonucleoprotein particle (mRNP). However, no clear evidence exists that heterogeneous nuclear RNPs (hnRNPs), the basic mRNP components, prevent R-loop stabilization. Here we show that yeast Npl3, the most abundant RNA-binding hnRNP, prevents R-loop-mediated genome instability. npl3Δ cells show transcription-dependent and R-loop-dependent hyperrecombination and genome-wide replication obstacles as determined by accumulation of the Rrm3 helicase. Such obstacles preferentially occur at long and highly expressed genes, to which Npl3 is preferentially bound in wild-type cells, and are reduced by RNase H1 overexpression. The resulting replication stress confers hypersensitivity to double-strand break-inducing agents. Therefore, our work demonstrates that mRNP factors are critical for genome integrity and opens the option of using them as therapeutic targets in anti-cancer treatment.

  13. The Npl3 hnRNP prevents R-loop-mediated transcription–replication conflicts and genome instability

    PubMed Central

    Santos-Pereira, José M.; Herrero, Ana B.; García-Rubio, María L.; Marín, Antonio; Moreno, Sergio; Aguilera, Andrés

    2013-01-01

    Transcription is a major obstacle for replication fork (RF) progression and a cause of genome instability. Part of this instability is mediated by cotranscriptional R loops, which are believed to increase by suboptimal assembly of the nascent messenger ribonucleoprotein particle (mRNP). However, no clear evidence exists that heterogeneous nuclear RNPs (hnRNPs), the basic mRNP components, prevent R-loop stabilization. Here we show that yeast Npl3, the most abundant RNA-binding hnRNP, prevents R-loop-mediated genome instability. npl3Δ cells show transcription-dependent and R-loop-dependent hyperrecombination and genome-wide replication obstacles as determined by accumulation of the Rrm3 helicase. Such obstacles preferentially occur at long and highly expressed genes, to which Npl3 is preferentially bound in wild-type cells, and are reduced by RNase H1 overexpression. The resulting replication stress confers hypersensitivity to double-strand break-inducing agents. Therefore, our work demonstrates that mRNP factors are critical for genome integrity and opens the option of using them as therapeutic targets in anti-cancer treatment. PMID:24240235

  14. Differences in Vector Genome Processing and Illegitimate Integration of Non-Integrating Lentiviral Vectors

    PubMed Central

    Shaw, Aaron M.; Joseph, Guiandre L.; Jasti, Aparna C.; Sastry-Dent, Lakshmi; Witting, Scott; Cornetta, Kenneth

    2016-01-01

    A variety of mutations in lentiviral vector expression systems have been shown to generate a non-integrating phenotype. We studied a novel 12 base-pair U3-LTR integrase attachment site deletion (U3-LTR att site) mutant and found similar physical titers to the previously reported integrase catalytic core mutant IN/D116N. Both mutations led to a greater than two log reduction in vector integration; with IN/D116N providing lower illegitimate integration frequency, while the U3-LTR att site mutant provided a higher level of transgene expression. The improved expression of the U3-LTR att site mutant could not be explained solely based on an observed modest increase in integration frequency. In evaluating processing, we noted significant differences in unintegrated vector forms, with the U3-LTR att site mutant leading to a predominance of 1-LTR circles. The mutations also differed in the manner of illegitimate integration. The U3-LTR att site mutant vector demonstrated integrase-mediated integration at the intact U5-LTR att site and non-integrase mediated integration at the mutated U3-LTR att site. Finally, we combined a variety of mutations and modifications and assessed transgene expression and integration frequency to show that combining modifications can improve the potential clinical utility of non-integrating lentiviral vectors. PMID:27682478

  15. The integrated microbial genomes (IMG) system in 2007: data content and analysis tool extensions

    SciTech Connect

    Markowitz, Victor M.; Szeto, Ernest; Palaniappan, Krishna; Grechkin, Yuri; Chu, Ken; Chen, I-Min A.; Dubchak, Inna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2007-08-01

    The Integrated Microbial Genomes (IMG) system is a data management, analysis and annotation platform for all publicly available genomes. IMG contains both draft and complete JGI microbial genomes integrated with all other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and annotating genomes, genes and functions, individually or in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through quarterly releases. IMG is provided by the DOE-Joint Genome Institute (JGI) and is available from http://img.jgi.doe.gov.

  16. Bacteriophage WO Can Mediate Horizontal Gene Transfer in Endosymbiotic Wolbachia Genomes

    PubMed Central

    Wang, Guan H.; Sun, Bao F.; Xiong, Tuan L.; Wang, Yan K.; Murfin, Kristen E.; Xiao, Jin H.; Huang, Da W.

    2016-01-01

    Phage-mediated horizontal gene transfer (HGT) is common in free-living bacteria, and many transferred genes can play a significant role in their new bacterial hosts. However, there are few reports concerning phage-mediated HGT in endosymbionts (obligate intracellular bacteria within animal or plant hosts), such as Wolbachia. The Wolbachia-infecting temperate phage WO can actively shift among Wolbachia genomes and has the potential to mediate HGT between Wolbachia strains. In the present study, we extend previous findings by validating that the phage WO can mediate transfer of non-phage genes. To do so, we utilized bioinformatic, phylogenetic, and molecular analyses based on all sequenced Wolbachia and phage WO genomes. Our results show that the phage WO can mediate HGT between Wolbachia strains, regardless of whether the transferred genes originate from Wolbachia or other unrelated bacteria. PMID:27965627

  17. Integrated cytogenetics and genomics analysis of transposable elements in the Nile tilapia, Oreochromis niloticus.

    PubMed

    Valente, Guilherme; Kocher, Thomas; Eickbush, Thomas; Simões, Rafael P; Martins, Cesar

    2016-06-01

    Integration of cytogenetics and genomics has become essential to a better view of the architecture and function of genomes. Although advances in genomic sequencing have contributed to the study of genes and genomes, the repetitive DNA fraction of the genome is still enigmatic and poorly understood. Among repeated DNAs, transposable elements (TEs) are major components of eukaryotic chromatin, and their investigation has been hindered even after the availability of whole sequenced genomes. The cytogenetic mapping of TEs in chromosomes has proved to be of high value to integrate information from the micro level of nucleotide sequence to a cytological view of chromosomes. Different TEs have been cytogenetically mapped in cichlids; however, neither details about their genomic arrangement nor appropriate copy numbers are well defined by these approaches. The current study integrates the distribution of TEs in the Nile tilapia (Oreochromis niloticus) genome based on cytogenetic and genomics/bioinformatics approaches. The results showed that some elements are not randomly distributed and that some are genomically dependent on each other. Moreover, we found extensive overlap between genomics and cytogenetics data, and found that tandem duplication may be the major mechanism responsible for the genomic dynamics of the TEs analyzed here. This paper provides insights into the genomic organization of TEs under an integrated view based on cytogenetics and genomics.

  18. CRISPR-Cas-mediated targeted genome editing in human cells.

    PubMed

    Yang, Luhan; Mali, Prashant; Kim-Kiselak, Caroline; Church, George

    2014-01-01

    The clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) systems have evolved as an adaptive surveillance and defense mechanism in bacteria and archaea that uses short RNAs to direct degradation of foreign genetic elements. Here, we present our protocol for utilizing the S. pyogenes type II bacterial CRISPR system to achieve sequence-specific genome alterations in human cells. In principle, any genomic sequence of the form N₁₉NGG can be targeted with the generation of custom guide RNA (gRNA) which functions to direct the Cas9 protein to genomic targets and induce DNA cleavage. Here, we describe our methods for designing and generating gRNA expression constructs either singly or in a multiplexed manner, as well as optimized protocols for the delivery of Cas9-gRNA components into human cells. Genomic alterations at the target site are then introduced either through nonhomologous end joining (NHEJ) or through homologous recombination (HR) in the presence of an appropriate donor sequence. This RNA-guided editing tool offers greater ease of customization and synthesis in comparison to existing sequence-specific endonucleases and promises to become a highly versatile and multiplexable human genome engineering platform.
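
    The targeting rule quoted above, any genomic sequence of the form N₁₉NGG, can be illustrated with a short scan for candidate sites. This is a minimal sketch rather than the authors' design protocol: the function name `find_targets` is hypothetical, only the forward strand is searched, and no off-target or specificity filtering is attempted.

```python
import re
from typing import List, Tuple

# N19 + N + GG = 22 nt. The lookahead lets overlapping sites all be reported.
TARGET_PATTERN = re.compile(r"(?=([ACGT]{20}GG))")


def find_targets(sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based start, 22-nt site) for every N19NGG match on the forward strand."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in TARGET_PATTERN.finditer(seq)]
```

    To cover the opposite strand as well, the same function can be run on the reverse complement of the input, with the reported positions mapped back to forward-strand coordinates.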

  19. TALEN-mediated genome editing: prospects and perspectives

    SciTech Connect

    Wright, DA; Li, T; Yang, B; Spalding, MH

    2014-08-15

    Genome editing is the practice of making predetermined and precise changes to a genome by controlling the location of DNA DSBs (double-strand breaks) and manipulating the cell's repair mechanisms. This technology results from harnessing natural processes that have taken decades and multiple lines of inquiry to understand. Through many false starts and iterative technology advances, the goal of genome editing is just now falling under the control of human hands as a routine and broadly applicable method. The present review attempts to define the technique and capture the discovery process while following its evolution from meganucleases and zinc finger nucleases to the current state of the art: TALEN (transcription-activator-like effector nuclease) technology. We also discuss factors that influence success, technical challenges, and future prospects of this quickly evolving area of study and application.

  20. TALEN-mediated genome editing: prospects and perspectives.

    PubMed

    Wright, David A; Li, Ting; Yang, Bing; Spalding, Martin H

    2014-08-15

    Genome editing is the practice of making predetermined and precise changes to a genome by controlling the location of DNA DSBs (double-strand breaks) and manipulating the cell's repair mechanisms. This technology results from harnessing natural processes that have taken decades and multiple lines of inquiry to understand. Through many false starts and iterative technology advances, the goal of genome editing is just now falling under the control of human hands as a routine and broadly applicable method. The present review attempts to define the technique and capture the discovery process while following its evolution from meganucleases and zinc finger nucleases to the current state of the art: TALEN (transcription-activator-like effector nuclease) technology. We also discuss factors that influence success, technical challenges and future prospects of this quickly evolving area of study and application.

  1. University of California San Francisco (UCSF-2): Integrative Genomic Approaches in Neuroblastoma (NBL) | Office of Cancer Genomics

    Cancer.gov

    The CTD2 Center at University of California San Francisco (UCSF-2) used an integrative genomics approach to reveal unidentified mRNA splicing patterns in neuroblastoma. Read the abstract Experimental Approaches Read the detailed Experimental Approaches

  2. Integrated genome-wide analysis of genomic changes and gene regulation in human adrenocortical tissue samples.

    PubMed

    Gara, Sudheer Kumar; Wang, Yonghong; Patel, Dhaval; Liu-Chittenden, Yi; Jain, Meenu; Boufraqech, Myriem; Zhang, Lisa; Meltzer, Paul S; Kebebew, Electron

    2015-10-30

    To gain insight into the pathogenesis of adrenocortical carcinoma (ACC) and whether there is progression from normal-to-adenoma-to-carcinoma, we performed genome-wide gene expression, gene methylation, microRNA expression and comparative genomic hybridization (CGH) analysis in human adrenocortical tissue (normal, adrenocortical adenomas and ACC) samples. A pairwise comparison of normal, adrenocortical adenomas and ACC gene expression profiles with more than four-fold expression differences and an adjusted P-value < 0.05 revealed no major differences in normal versus adrenocortical adenoma, whereas 808 and 1085 genes were dysregulated in ACC versus adrenocortical adenoma and ACC versus normal, respectively. The majority of the dysregulated genes in ACC were downregulated. By integrating the CGH, gene methylation and expression profiles of potential miRNAs with the gene expression of dysregulated genes, we found that there are more alterations in ACC versus normal than in ACC versus adrenocortical adenoma. Importantly, we identified several novel molecular pathways that are associated with dysregulated genes and further experimentally validated that oncostatin m signaling induces caspase 3 dependent apoptosis and suppresses cell proliferation. Finally, we propose that there is a higher number of genomic changes from normal-to-adenoma-to-carcinoma and identified oncostatin m signaling as a plausible druggable pathway for therapeutics.

  3. IMG 4 version of the integrated microbial genomes comparative analysis system.

    PubMed

    Markowitz, Victor M; Chen, I-Min A; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N; Kyrpides, Nikos C

    2014-01-01

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG's data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG's annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  4. IMG 4 version of the integrated microbial genomes comparative analysis system

    PubMed Central

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2014-01-01

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu). PMID:24165883

  5. IMG 4 version of the integrated microbial genomes comparative analysis system

    SciTech Connect

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2013-10-27

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Finally, different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  6. An integrative characterization of recurrent molecular aberrations in glioblastoma genomes.

    PubMed

    Sintupisut, Nardnisa; Liu, Pei-Ling; Yeang, Chen-Hsiang

    2013-10-01

    Glioblastoma multiforme (GBM) is the most common and malignant primary brain tumor in adults. Decades of investigations and the recent effort of the Cancer Genome Atlas (TCGA) project have mapped many molecular alterations in GBM cells. Alterations on DNAs may dysregulate gene expressions and drive malignancy of tumors. It is thus important to uncover causal and statistical dependency between 'effector' molecular aberrations and 'target' gene expressions in GBMs. A rich collection of prior studies attempted to combine copy number variation (CNV) and mRNA expression data. However, systematic methods to integrate multiple types of cancer genomic data (gene mutations, single nucleotide polymorphisms, CNVs, DNA methylations, mRNA and microRNA expressions and clinical information) are relatively scarce. We proposed an algorithm to build 'association modules' linking effector molecular aberrations and target gene expressions and applied the module-finding algorithm to the integrated TCGA GBM data sets. The inferred association modules were validated by six tests using external information and datasets of central nervous system tumors: (i) indication of prognostic effects among patients; (ii) coherence of target gene expressions; (iii) retention of effector-target associations in external data sets; (iv) recurrence of effector molecular aberrations in GBM; (v) functional enrichment of target genes; and (vi) co-citations between effectors and targets. Modules associated with well-known molecular aberrations of GBM (such as chromosome 7 amplifications, chromosome 10 deletions, EGFR and NF1 mutations) passed the majority of the validation tests. Furthermore, several modules associated with less well-reported molecular aberrations (such as chromosome 11 CNVs, CD40, PLXNB1 and GSTM1 methylations, and mir-21 expressions) were also validated by external information. In particular, modules constituting trans-acting effects with chromosome 11 CNVs and cis-acting effects with chromosome

  7. Bilayer-thickness-mediated interactions between integral membrane proteins.

    PubMed

    Kahraman, Osman; Koch, Peter D; Klug, William S; Haselwandter, Christoph A

    2016-04-01

    Hydrophobic thickness mismatch between integral membrane proteins and the surrounding lipid bilayer can produce lipid bilayer thickness deformations. Experiment and theory have shown that protein-induced lipid bilayer thickness deformations can yield energetically favorable bilayer-mediated interactions between integral membrane proteins, and large-scale organization of integral membrane proteins into protein clusters in cell membranes. Within the continuum elasticity theory of membranes, the energy cost of protein-induced bilayer thickness deformations can be captured by considering compression and expansion of the bilayer hydrophobic core, membrane tension, and bilayer bending, resulting in biharmonic equilibrium equations describing the shape of lipid bilayers for a given set of bilayer-protein boundary conditions. Here we develop a combined analytic and numerical methodology for the solution of the equilibrium elastic equations associated with protein-induced lipid bilayer deformations. Our methodology allows accurate prediction of thickness-mediated protein interactions for arbitrary protein symmetries at arbitrary protein separations and relative orientations. We provide exact analytic solutions for cylindrical integral membrane proteins with constant and varying hydrophobic thickness, and develop perturbative analytic solutions for noncylindrical protein shapes. We complement these analytic solutions, and assess their accuracy, by developing both finite element and finite difference numerical solution schemes. We provide error estimates of our numerical solution schemes and systematically assess their convergence properties. Taken together, the work presented here puts into place an analytic and numerical framework which allows calculation of bilayer-mediated elastic interactions between integral membrane proteins for the complicated protein shapes suggested by structural biology and at the small protein separations most relevant for the crowded membrane

  8. GDR (Genome Database for Rosaceae): integrated web-database for Rosaceae genomics and genetics data

    PubMed Central

    Jung, Sook; Staton, Margaret; Lee, Taein; Blenda, Anna; Svancara, Randall; Abbott, Albert; Main, Dorrie

    2008-01-01

    The Genome Database for Rosaceae (GDR) is a central repository of curated and integrated genetics and genomics data of Rosaceae, an economically important family which includes apple, cherry, peach, pear, raspberry, rose and strawberry. GDR contains annotated databases of all publicly available Rosaceae ESTs, the genetically anchored peach physical map, Rosaceae genetic maps and comprehensively annotated markers and traits. The ESTs are assembled to produce unigene sets of each genus and the entire Rosaceae. Other annotations include putative function, microsatellites, open reading frames, single nucleotide polymorphisms, gene ontology terms and anchored map position where applicable. Most of the published Rosaceae genetic maps can be viewed and compared through CMap, the comparative map viewer. The peach physical map can be viewed using WebFPC/WebChrom, and also through our integrated GDR map viewer, which serves as a portal to the combined genetic, transcriptome and physical mapping information. ESTs, BACs, markers and traits can be queried by various categories and the search result sites are linked to the mapping visualization tools. GDR also provides online analysis tools such as a batch BLAST/FASTA server for the GDR datasets, a sequence assembly server and microsatellite and primer detection tools. GDR is available at http://www.rosaceae.org. PMID:17932055

  9. GDR (Genome Database for Rosaceae): integrated web-database for Rosaceae genomics and genetics data.

    PubMed

    Jung, Sook; Staton, Margaret; Lee, Taein; Blenda, Anna; Svancara, Randall; Abbott, Albert; Main, Dorrie

    2008-01-01

    The Genome Database for Rosaceae (GDR) is a central repository of curated and integrated genetics and genomics data of Rosaceae, an economically important family which includes apple, cherry, peach, pear, raspberry, rose and strawberry. GDR contains annotated databases of all publicly available Rosaceae ESTs, the genetically anchored peach physical map, Rosaceae genetic maps and comprehensively annotated markers and traits. The ESTs are assembled to produce unigene sets of each genus and the entire Rosaceae. Other annotations include putative function, microsatellites, open reading frames, single nucleotide polymorphisms, gene ontology terms and anchored map position where applicable. Most of the published Rosaceae genetic maps can be viewed and compared through CMap, the comparative map viewer. The peach physical map can be viewed using WebFPC/WebChrom, and also through our integrated GDR map viewer, which serves as a portal to the combined genetic, transcriptome and physical mapping information. ESTs, BACs, markers and traits can be queried by various categories and the search result sites are linked to the mapping visualization tools. GDR also provides online analysis tools such as a batch BLAST/FASTA server for the GDR datasets, a sequence assembly server and microsatellite and primer detection tools. GDR is available at http://www.rosaceae.org.

  10. Inverted genomic segments and complex triplication rearrangements are mediated by inverted repeats in the human genome

    USDA-ARS's Scientific Manuscript database

    We identified complex genomic rearrangements consisting of intermixed duplications and triplications of genomic segments at the MECP2 and PLP1 loci. These complex rearrangements were characterized by a triplicated segment embedded within a duplication in 11 unrelated subjects. Notably, only two brea...

  11. Integration-defective lentiviral vector mediates efficient gene editing through homology-directed repair in human embryonic stem cells.

    PubMed

    Wang, Yebo; Wang, Yingjia; Chang, Tammy; Huang, He; Yee, Jiing-Kuan

    2016-11-28

    Human embryonic stem cells (hESCs) are used as platforms for disease study, drug screening and cell-based therapy. To facilitate these applications, it is frequently necessary to genetically manipulate the hESC genome. Gene editing with engineered nucleases enables site-specific genetic modification of the human genome through homology-directed repair (HDR). However, the frequency of HDR remains low in hESCs. We combined efficient expression of engineered nucleases and integration-defective lentiviral vector (IDLV) transduction for donor template delivery to mediate HDR in hESC line WA09. This strategy led to highly efficient HDR with more than 80% of the selected WA09 clones harboring the transgene inserted at the targeted genomic locus. However, certain portions of the HDR clones contained the concatemeric IDLV genomic structure at the target site, probably resulting from recombination of the IDLV genomic input before HDR with the target. We found that the integrase protein of IDLV mediated the highly efficient HDR through the recruitment of a cellular protein, LEDGF/p75. This study demonstrates that IDLV-mediated HDR is a powerful and broadly applicable technology to carry out site-specific gene modification in hESCs.

  12. Integration-defective lentiviral vector mediates efficient gene editing through homology-directed repair in human embryonic stem cells

    PubMed Central

    Wang, Yebo; Wang, Yingjia; Chang, Tammy

    2017-01-01

    Human embryonic stem cells (hESCs) are used as platforms for disease study, drug screening and cell-based therapy. To facilitate these applications, it is frequently necessary to genetically manipulate the hESC genome. Gene editing with engineered nucleases enables site-specific genetic modification of the human genome through homology-directed repair (HDR). However, the frequency of HDR remains low in hESCs. We combined efficient expression of engineered nucleases and integration-defective lentiviral vector (IDLV) transduction for donor template delivery to mediate HDR in hESC line WA09. This strategy led to highly efficient HDR with more than 80% of the selected WA09 clones harboring the transgene inserted at the targeted genomic locus. However, certain portions of the HDR clones contained the concatemeric IDLV genomic structure at the target site, probably resulting from recombination of the IDLV genomic input before HDR with the target. We found that the integrase protein of IDLV mediated the highly efficient HDR through the recruitment of a cellular protein, LEDGF/p75. This study demonstrates that IDLV-mediated HDR is a powerful and broadly applicable technology to carry out site-specific gene modification in hESCs. PMID:27899664

  13. Oryzabase. An integrated biological and genome information database for rice.

    PubMed

    Kurata, Nori; Yamazaki, Yukiko

    2006-01-01

    The aim of Oryzabase is to create a comprehensive view of rice (Oryza sativa) as a model monocot plant by integrating biological data with molecular genomic information (http://www.shigen.nig.ac.jp/rice/oryzabase/top/top.jsp). The database contains information about rice development and anatomy, rice mutants, and genetic resources, especially for wild varieties of rice. The anatomical description of rice development is unique and is the first known representation for rice. Developmental and anatomical descriptions include in situ gene expression data serving as stage and tissue markers. The systematic presentation of a large number of rice mutant and mutant trait genes is indispensable, as is description of research in wild strains, core collections, and their detailed characterization. Several genetic, physical, and expression maps with full genome and cDNA sequences are also combined with biological data in Oryzabase. These datasets, when pooled together, could provide a useful tool for gaining greater knowledge about the life cycle of rice, the relationship between phenotype and gene function, and rice genetic diversity. For exchanging community information, Oryzabase publishes the Rice Genetics Newsletter organized by the Rice Genetics Cooperative and provides a mailing service, rice-e-net/rice-net.

  14. CISA: Contig Integrator for Sequence Assembly of Bacterial Genomes

    PubMed Central

    Lin, Shin-Hung; Liao, Yu-Chieh

    2013-01-01

    A plethora of algorithmic assemblers have been proposed for the de novo assembly of genomes; however, no individual assembler guarantees the optimal assembly for diverse species. Optimizing various parameters in an assembler is often performed in order to generate the best possible assembly. However, few efforts have been pursued to take advantage of multiple assemblies to yield an assembly of high accuracy. In this study, we employ various state-of-the-art assemblers to generate different sets of contigs for bacterial genomes. A tool, named CISA, has been developed to integrate the assemblies into a hybrid set of contigs, resulting in assemblies of superior contiguity and accuracy, compared with the assemblies generated by the state-of-the-art assemblers and the hybrid assemblies merged by existing tools. This tool is implemented in Python and requires MUMmer and BLAST+ to be installed on the local machine. The source code of CISA and examples of its use are available at http://sb.nhri.org.tw/CISA/. PMID:23556006
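
    The abstract above compares assemblies by contiguity without naming a metric. A common contiguity statistic is N50: the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the assembly size. The sketch below is illustrative only and is not taken from CISA; the function name and the contig lengths are hypothetical.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not contig_lengths:
        raise ValueError("need at least one contig")
    # Walk the contigs from longest to shortest.
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the
        # assembly size defines N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths for two assemblies of the same genome.
assembly_a = [500, 400, 300, 200, 100]
assembly_b = [900, 400, 200]
print(n50(assembly_a), n50(assembly_b))  # 400 900
```

    Both toy assemblies total 1,500 bp, so the higher N50 of the second reflects fewer, longer contigs rather than more sequence.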

  15. CISA: contig integrator for sequence assembly of bacterial genomes.

    PubMed

    Lin, Shin-Hung; Liao, Yu-Chieh

    2013-01-01

    A plethora of algorithmic assemblers have been proposed for the de novo assembly of genomes; however, no individual assembler guarantees the optimal assembly for diverse species. Optimizing various parameters in an assembler is often performed in order to generate the best possible assembly. However, few efforts have been pursued to take advantage of multiple assemblies to yield an assembly of high accuracy. In this study, we employ various state-of-the-art assemblers to generate different sets of contigs for bacterial genomes. A tool, named CISA, has been developed to integrate the assemblies into a hybrid set of contigs, resulting in assemblies of superior contiguity and accuracy, compared with the assemblies generated by the state-of-the-art assemblers and the hybrid assemblies merged by existing tools. This tool is implemented in Python and requires MUMmer and BLAST+ to be installed on the local machine. The source code of CISA and examples of its use are available at http://sb.nhri.org.tw/CISA/.

  16. An affinity-based genome walking method to find transgene integration loci in transgenic genome.

    PubMed

    Thirulogachandar, V; Pandey, Prachi; Vaishnavi, C S; Reddy, Malireddy K

    2011-09-15

    Identifying a good transgenic event from the pool of putative transgenics is crucial for further characterization. In transgenic plants, the transgene can integrate in either single or multiple locations by disrupting the endogenes and/or in heterochromatin regions causing the positional effect. Apart from this, to prevent the unauthorized use of transgenic plants, the signature of transgene integration for every commercial transgenic event needs to be characterized. Here we show an affinity-based genome walking method, named locus-finding (LF) PCR (polymerase chain reaction), to determine the transgene flanking sequences of rice plants transformed by Agrobacterium tumefaciens. LF PCR includes a primary PCR by a degenerate primer and transfer DNA (T-DNA)-specific primer, a nested PCR, and a method of enriching the desired amplicons by using a biotin-tagged primer that is complementary to the T-DNA. This enrichment technique separates the single strands of desired amplicons from the off-target amplicons, reducing the template complexity by several orders of magnitude. We analyzed eight transgenic rice plants and found the transgene integration loci in three different chromosomes. The characteristic illegitimate recombination of the Agrobacterium sp. was also observed from the sequenced integration loci. We believe that the LF PCR should be an indispensable technique in transgenic analysis.

  17. Integrative pathway genomics of lung function and airflow obstruction

    PubMed Central

    Gharib, Sina A.; Loth, Daan W.; Soler Artigas, María; Birkland, Timothy P.; Wilk, Jemma B.; Wain, Louise V.; Brody, Jennifer A.; Obeidat, Ma'en; Hancock, Dana B.; Tang, Wenbo; Rawal, Rajesh; Boezen, H. Marike; Imboden, Medea; Huffman, Jennifer E.; Lahousse, Lies; Alves, Alexessander C.; Manichaikul, Ani; Hui, Jennie; Morrison, Alanna C.; Ramasamy, Adaikalavan; Smith, Albert Vernon; Gudnason, Vilmundur; Surakka, Ida; Vitart, Veronique; Evans, David M.; Strachan, David P.; Deary, Ian J.; Hofman, Albert; Gläser, Sven; Wilson, James F.; North, Kari E.; Zhao, Jing Hua; Heckbert, Susan R.; Jarvis, Deborah L.; Probst-Hensch, Nicole; Schulz, Holger; Barr, R. Graham; Jarvelin, Marjo-Riitta; O'Connor, George T.; Kähönen, Mika; Cassano, Patricia A.; Hysi, Pirro G.; Dupuis, Josée; Hayward, Caroline; Psaty, Bruce M.; Hall, Ian P.; Parks, William C.; Tobin, Martin D.; London, Stephanie J.

    2015-01-01

    Chronic respiratory disorders are important contributors to the global burden of disease. Genome-wide association studies (GWASs) of lung function measures have identified several trait-associated loci, but explain only a modest portion of the phenotypic variability. We postulated that integrating pathway-based methods with GWASs of pulmonary function and airflow obstruction would identify a broader repertoire of genes and processes influencing these traits. We performed two independent GWASs of lung function and applied gene set enrichment analysis to one of the studies and validated the results using the second GWAS. We identified 131 significantly enriched gene sets associated with lung function and clustered them into larger biological modules involved in diverse processes including development, immunity, cell signaling, proliferation and arachidonic acid. We found that enrichment of gene sets was not driven by GWAS-significant variants or loci, but instead by those with less stringent association P-values. Next, we applied pathway enrichment analysis to a meta-analyzed GWAS of airflow obstruction. We identified several biologic modules that functionally overlapped with those associated with pulmonary function. However, differences were also noted, including enrichment of extracellular matrix (ECM) processes specifically in the airflow obstruction study. Network analysis of the ECM module implicated a candidate gene, matrix metalloproteinase 10 (MMP10), as a putative disease target. We used a knockout mouse model to functionally validate MMP10's role in influencing the lung's susceptibility to cigarette smoke-induced emphysema. By integrating pathway analysis with population-based genomics, we unraveled biologic processes underlying pulmonary function traits and identified a candidate gene for obstructive lung disease. PMID:26395457
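
    To illustrate what testing a gene set for enrichment can look like, the sketch below computes a one-sided hypergeometric over-representation p-value. This is a simpler relative of the rank-based gene set enrichment analysis named in the abstract, not the method used in the study; the function name and all counts are hypothetical.

```python
from math import comb


def overrepresentation_p(population, annotated, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `population` genes, of which `annotated` belong to the gene set."""
    upper = min(annotated, selected)
    total = comb(population, selected)
    # Sum the hypergeometric probabilities of seeing `overlap` or more
    # gene-set members among the selected genes.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 genes tested, a 100-gene pathway,
# 500 genes passing a relaxed association threshold, 10 of them in the pathway.
p_value = overrepresentation_p(20000, 100, 500, 10)
print(f"{p_value:.3g}")
```

    Under random selection about 2.5 pathway genes would be expected among the 500, so an overlap of 10 yields a small p-value. A real analysis would also correct for testing many gene sets.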

  18. Integrated genomic and epigenomic analysis of breast cancer brain metastasis.

    PubMed

    Salhia, Bodour; Kiefer, Jeff; Ross, Julianna T D; Metapally, Raghu; Martinez, Rae Anne; Johnson, Kyle N; DiPerna, Danielle M; Paquette, Kimberly M; Jung, Sungwon; Nasser, Sara; Wallstrom, Garrick; Tembe, Waibhav; Baker, Angela; Carpten, John; Resau, Jim; Ryken, Timothy; Sibenaller, Zita; Petricoin, Emanuel F; Liotta, Lance A; Ramanathan, Ramesh K; Berens, Michael E; Tran, Nhan L

    2014-01-01

    The brain is a common site of metastatic disease in patients with breast cancer, which has few therapeutic options and dismal outcomes. The purpose of our study was to identify common and rare events that underlie breast cancer brain metastasis. We performed deep genomic profiling, which integrated gene copy number, gene expression and DNA methylation datasets on a collection of breast brain metastases. We identified frequent large chromosomal gains in 1q, 5p, 8q, 11q, and 20q and frequent broad-level deletions involving 8p, 17p, 21p and Xq. Frequently amplified and overexpressed genes included ATAD2, BRAF, DERL1, DNMTRB and NEK2A. The ATM, CRYAB and HSPB2 genes were commonly deleted and underexpressed. Knowledge mining revealed enrichment in cell cycle and G2/M transition pathways, which contained AURKA, AURKB and FOXM1. Using the PAM50 breast cancer intrinsic classifier, Luminal B, Her2+/ER negative, and basal-like tumors were identified as the most commonly represented breast cancer subtypes in our brain metastasis cohort. While overall methylation levels were increased in breast cancer brain metastasis, basal-like brain metastases were associated with significantly lower levels of methylation. Integrating DNA methylation data with gene expression revealed defects in cell migration and adhesion due to hypermethylation and downregulation of PENK, EDN3, and ITGAM. Hypomethylation and upregulation of KRT8 likely affects adhesion and permeability. Genomic and epigenomic profiling of breast brain metastasis has provided insight into the somatic events underlying this disease, which have potential in forming the basis of future therapeutic strategies.

  19. Integrated Genomic and Epigenomic Analysis of Breast Cancer Brain Metastasis

    PubMed Central

    Salhia, Bodour; Kiefer, Jeff; Ross, Julianna T. D.; Metapally, Raghu; Martinez, Rae Anne; Johnson, Kyle N.; DiPerna, Danielle M.; Paquette, Kimberly M.; Jung, Sungwon; Nasser, Sara; Wallstrom, Garrick; Tembe, Waibhav; Baker, Angela; Carpten, John; Resau, Jim; Ryken, Timothy; Sibenaller, Zita; Petricoin, Emanuel F.; Liotta, Lance A.; Ramanathan, Ramesh K.; Berens, Michael E.; Tran, Nhan L.

    2014-01-01

    The brain is a common site of metastatic disease in patients with breast cancer, which has few therapeutic options and dismal outcomes. The purpose of our study was to identify common and rare events that underlie breast cancer brain metastasis. We performed deep genomic profiling, which integrated gene copy number, gene expression and DNA methylation datasets on a collection of breast brain metastases. We identified frequent large chromosomal gains in 1q, 5p, 8q, 11q, and 20q and frequent broad-level deletions involving 8p, 17p, 21p and Xq. Frequently amplified and overexpressed genes included ATAD2, BRAF, DERL1, DNMTRB and NEK2A. The ATM, CRYAB and HSPB2 genes were commonly deleted and underexpressed. Knowledge mining revealed enrichment in cell cycle and G2/M transition pathways, which contained AURKA, AURKB and FOXM1. Using the PAM50 breast cancer intrinsic classifier, Luminal B, Her2+/ER negative, and basal-like tumors were identified as the most commonly represented breast cancer subtypes in our brain metastasis cohort. While overall methylation levels were increased in breast cancer brain metastasis, basal-like brain metastases were associated with significantly lower levels of methylation. Integrating DNA methylation data with gene expression revealed defects in cell migration and adhesion due to hypermethylation and downregulation of PENK, EDN3, and ITGAM. Hypomethylation and upregulation of KRT8 likely affects adhesion and permeability. Genomic and epigenomic profiling of breast brain metastasis has provided insight into the somatic events underlying this disease, which have potential in forming the basis of future therapeutic strategies. PMID:24489661

  20. Integrative pathway genomics of lung function and airflow obstruction.

    PubMed

    Gharib, Sina A; Loth, Daan W; Soler Artigas, María; Birkland, Timothy P; Wilk, Jemma B; Wain, Louise V; Brody, Jennifer A; Obeidat, Ma'en; Hancock, Dana B; Tang, Wenbo; Rawal, Rajesh; Boezen, H Marike; Imboden, Medea; Huffman, Jennifer E; Lahousse, Lies; Alves, Alexessander C; Manichaikul, Ani; Hui, Jennie; Morrison, Alanna C; Ramasamy, Adaikalavan; Smith, Albert Vernon; Gudnason, Vilmundur; Surakka, Ida; Vitart, Veronique; Evans, David M; Strachan, David P; Deary, Ian J; Hofman, Albert; Gläser, Sven; Wilson, James F; North, Kari E; Zhao, Jing Hua; Heckbert, Susan R; Jarvis, Deborah L; Probst-Hensch, Nicole; Schulz, Holger; Barr, R Graham; Jarvelin, Marjo-Riitta; O'Connor, George T; Kähönen, Mika; Cassano, Patricia A; Hysi, Pirro G; Dupuis, Josée; Hayward, Caroline; Psaty, Bruce M; Hall, Ian P; Parks, William C; Tobin, Martin D; London, Stephanie J

    2015-12-01

    Chronic respiratory disorders are important contributors to the global burden of disease. Genome-wide association studies (GWASs) of lung function measures have identified several trait-associated loci, but explain only a modest portion of the phenotypic variability. We postulated that integrating pathway-based methods with GWASs of pulmonary function and airflow obstruction would identify a broader repertoire of genes and processes influencing these traits. We performed two independent GWASs of lung function and applied gene set enrichment analysis to one of the studies and validated the results using the second GWAS. We identified 131 significantly enriched gene sets associated with lung function and clustered them into larger biological modules involved in diverse processes including development, immunity, cell signaling, proliferation and arachidonic acid. We found that enrichment of gene sets was not driven by GWAS-significant variants or loci, but instead by those with less stringent association P-values. Next, we applied pathway enrichment analysis to a meta-analyzed GWAS of airflow obstruction. We identified several biologic modules that functionally overlapped with those associated with pulmonary function. However, differences were also noted, including enrichment of extracellular matrix (ECM) processes specifically in the airflow obstruction study. Network analysis of the ECM module implicated a candidate gene, matrix metalloproteinase 10 (MMP10), as a putative disease target. We used a knockout mouse model to functionally validate MMP10's role in influencing the lung's susceptibility to cigarette smoke-induced emphysema. By integrating pathway analysis with population-based genomics, we unraveled biologic processes underlying pulmonary function traits and identified a candidate gene for obstructive lung disease.

  1. CRISPR/Cas9-mediated genome editing in plants.

    PubMed

    Liu, Xuejun; Xie, Chuanxiao; Si, Huaijun; Yang, Jinxiao

    2017-05-15

    The increasing burden of the world's population on agriculture necessitates the development of more robust crops. As the amount of information from sequenced crop genomes increases, technology can be used to investigate the function of genes in detail and to design improved crops at the molecular level. Recently, an RNA-programmed genome-editing system composed of a clustered regularly interspaced short palindromic repeats (CRISPR)-encoded guide RNA and the nuclease Cas9 has provided a powerful platform to achieve these goals. By combining versatile tools to study and modify plants at different molecular levels, the CRISPR/Cas9 system is paving the way towards a new horizon for basic research and crop development. In this review, the accomplishments, problems and improvements of this technology in plants, including target sequence cleavage, knock-in/gene replacement, transcriptional regulation, epigenetic modification, off-target effects, delivery system and potential applications, will be highlighted. Copyright © 2017 Elsevier Inc. All rights reserved.

  2. Examination of host genome for the presence of integrated fragments of Solenopsis invicta virus 1

    USDA-ARS's Scientific Manuscript database

    A series of oligonucleotide primer pairs covering the entire genome of Solenopsis invicta virus 1 (SINV-1) were used to probe the Solenopsis invicta genome for integrated fragments of the viral genome. All of the oligonucleotide primer sets yielded amplicons of anticipated size from cDNA created f...

  3. Molecular Assemblies, Genes and Genomics Integrated Efficiently (MAGGIE)

    SciTech Connect

    Baliga, Nitin S

    2011-05-26

    applied to the manually curated training set. Applying this method to the data representing around a quarter of the fraction space for water soluble proteins in D. vulgaris, we obtained 854 reliable pairwise interactions. Further, we have developed algorithms to analyze and assign significance to protein interaction data from bait pull-down experiments and integrate these data with other systems biology data through associative biclustering in a parallel computing environment. We will 'fill-in' missing information in these interaction data using a 'Transitive Closure' algorithm and subsequently use 'Between Commonality Decomposition' algorithm to discover complexes within these large graphs of protein interactions. To characterize the metabolic activities of proteins and their complexes we are developing algorithms to deconvolute pure mass spectra, estimate chemical formula for m/z values, and fit isotopic fine structure to metabolomics data. We have discovered that, in comparison to isotopic pattern fitting methods, restricting the chemical formula by these two dimensions actually facilitates unique solutions for chemical formula generators. To understand how microbial functions are regulated we have developed complementary algorithms for reconstructing gene regulatory networks (GRNs). Whereas the network inference algorithms cMonkey and Inferelator developed enable de novo reconstruction of predictive models for GRNs from diverse systems biology data, the RegPrecise and RegPredict framework developed uses evolutionary comparisons of genomes from closely related organisms to reconstruct conserved regulons. We have integrated the two complementary algorithms to rapidly generate comprehensive models for gene regulation of understudied organisms. Our preliminary analyses of these reconstructed GRNs have revealed novel regulatory mechanisms and cis-regulatory motifs, as well as others that are conserved across species.
Finally, we are supporting scientific efforts in ENIGMA
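
    The record above mentions filling in missing interactions with a 'Transitive Closure' algorithm but gives no details. The sketch below shows the generic idea on an undirected interaction graph: two proteins are linked in the closure if a chain of observed interactions connects them. It is an assumption-laden illustration, not the MAGGIE implementation; the protein identifiers are hypothetical.

```python
from collections import defaultdict, deque


def transitive_closure(pairs):
    """Return every pair (a, b), a < b, connected by a chain of interactions."""
    neighbours = defaultdict(set)
    for a, b in pairs:
        # Interactions are treated as undirected (an assumption).
        neighbours[a].add(b)
        neighbours[b].add(a)
    closure = set()
    for start in list(neighbours):
        # Breadth-first search collects everything reachable from `start`.
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        closure.update((start, other) for other in seen if start < other)
    return closure


# Hypothetical pull-down results: P1-P2 and P2-P3 observed, P1-P3 missing.
observed = [("P1", "P2"), ("P2", "P3"), ("P4", "P5")]
inferred = transitive_closure(observed) - {tuple(sorted(p)) for p in observed}
print(sorted(inferred))  # [('P1', 'P3')]
```

    In practice an inferred link such as P1-P3 would be a candidate to weight or validate rather than a confirmed interaction.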

  4. Directed genomic integration, gene replacement, and integrative gene expression in Streptococcus thermophilus.

    PubMed Central

    Mollet, B; Knol, J; Poolman, B; Marciset, O; Delley, M

    1993-01-01

    Several pGEM5- and pUC19-derived plasmids containing a selectable erythromycin resistance marker were integrated into the chromosome of Streptococcus thermophilus at the loci of the lactose-metabolizing genes. Integration occurred via homologous recombination and resulted in cointegrates between plasmid and genome, flanked by the homologous DNA used for integration. Selective pressure on the plasmid-located erythromycin resistance gene resulted in multiple amplifications of the integrated plasmid. Release of this selective pressure, however, gave way to homologous resolution of the cointegrate structures. By integration and subsequent resolution, we were able to replace the chromosomal lacZ gene with a modified copy carrying an in vitro-generated deletion. In the same way, we integrated a promoterless chloramphenicol acetyltransferase (cat) gene between the chromosomal lacS and lacZ genes of the lactose operon. The inserted cat gene became a functional part of the operon and was expressed and regulated accordingly. Selective pressure on the essential lacS and lacZ genes under normal growth conditions in milk ensures the maintenance and expression of the integrated gene. As there are only minimal repeated DNA sequences (an NdeI site) flanking the inserted cat gene, it was stably maintained even in the absence of lactose, i.e., when grown on sucrose or glucose. The methodology represents a stable system in which to express and regulate foreign genes in S. thermophilus, which could qualify in the future for an application with food. PMID:8331064

  5. CRISPR/Cas9-mediated genome editing of Epstein-Barr virus in human cells.

    PubMed

    Yuen, Kit-San; Chan, Chi-Ping; Wong, Nok-Hei Mickey; Ho, Chau-Ha; Ho, Ting-Hin; Lei, Ting; Deng, Wen; Tsao, Sai Wah; Chen, Honglin; Kok, Kin-Hang; Jin, Dong-Yan

    2015-03-01

    The CRISPR (clustered regularly interspaced short palindromic repeats)/Cas9 (CRISPR-associated 9) system is a highly efficient and powerful tool for RNA-guided editing of the cellular genome. Whether CRISPR/Cas9 can also cleave the genome of DNA viruses such as Epstein-Barr virus (EBV), which undergo episomal replication in human cells, remains to be established. Here, we reported on CRISPR/Cas9-mediated editing of the EBV genome in human cells. Two guide RNAs (gRNAs) were used to direct a targeted deletion of 558 bp in the promoter region of BART (BamHI A rightward transcript) which encodes viral microRNAs (miRNAs). Targeted editing was achieved in several human epithelial cell lines latently infected with EBV, including nasopharyngeal carcinoma C666-1 cells. CRISPR/Cas9-mediated editing of the EBV genome was efficient. A recombinant virus with the desired deletion was obtained after puromycin selection of cells expressing Cas9 and gRNAs. No off-target cleavage was found by deep sequencing. The loss of BART miRNA expression and activity was verified, supporting the BART promoter as the major promoter of BART RNA. Although CRISPR/Cas9-mediated editing of the multicopy episome of EBV in infected HEK293 cells was mostly incomplete, viruses could be recovered and introduced into other cells at low m.o.i. Recombinant viruses with an edited genome could be further isolated through single-cell sorting. Finally, a DsRed selectable marker was successfully introduced into the EBV genome during the course of CRISPR/Cas9-mediated editing. Taken together, our work provided not only the first genetic evidence that the BART promoter drives the expression of the BART transcript, but also a new and efficient method for targeted editing of EBV genome in human cells.
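
    The dual-guide deletion described above can be pictured as two cuts followed by rejoining of the flanking ends. The toy sketch below models only that bookkeeping on a string; the sequence, cut coordinates, and function name are hypothetical, and real repair outcomes (indels at the junction, incomplete editing of a multicopy episome) are not modelled.

```python
def delete_between_cuts(sequence, cut_left, cut_right):
    """Return `sequence` with the segment between two cut positions removed.

    Positions are 0-based offsets of the break points, as if each guide RNA
    directed a blunt double-strand break and the outer ends were rejoined.
    """
    if not 0 <= cut_left <= cut_right <= len(sequence):
        raise ValueError("cut positions must satisfy 0 <= left <= right <= len")
    return sequence[:cut_left] + sequence[cut_right:]


# Toy template: 100 bp flank, a 558 bp target region, 100 bp flank.
template = "A" * 100 + "C" * 558 + "G" * 100
edited = delete_between_cuts(template, 100, 658)
print(len(template) - len(edited))  # 558
```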

  6. High-Content Genome-Wide RNAi Screen Reveals CCR3 as a Key Mediator of Neuronal Cell Death.

    PubMed

    Zhang, Jianmin; Wang, Huaishan; Sherbini, Omar; Ling-Lin Pai, Emily; Kang, Sung-Ung; Kwon, Ji-Sun; Yang, Jia; He, Wei; Wang, Hong; Eacker, Stephen M; Chi, Zhikai; Mao, Xiaobo; Xu, Jinchong; Jiang, Haisong; Andrabi, Shaida A; Dawson, Ted M; Dawson, Valina L

    2016-01-01

    Neuronal loss caused by ischemic injury, trauma, or disease can lead to devastating consequences for the individual. With the goal of limiting neuronal loss, a number of cell death pathways have been studied, but there may be additional contributors to neuronal death that are yet unknown. To identify previously unknown cell death mediators, we performed a high-content genome-wide screening of short, interfering RNA (siRNA) with an siRNA library in murine neural stem cells after exposure to N-methyl-N-nitroso-N'-nitroguanidine (MNNG), which leads to DNA damage and cell death. Eighty genes were identified as key mediators for cell death. Among them, 14 are known cell death mediators and 66 have not previously been linked to cell death pathways. Using an integrated approach with functional and bioinformatics analysis, we provide possible molecular networks, interconnected pathways, and/or protein complexes that may participate in cell death. Of the 66 genes, we selected CCR3 for further evaluation and found that CCR3 is a mediator of neuronal injury. CCR3 inhibition or deletion protects murine cortical cultures from oxygen-glucose deprivation-induced cell death, and CCR3 deletion in mice provides protection from ischemia in vivo. Taken together, our findings suggest that CCR3 is a previously unknown mediator of cell death. Future identification of the neural cell death network in which CCR3 participates will enhance our understanding of the molecular mechanisms of neural cell death.

  7. High-Content Genome-Wide RNAi Screen Reveals CCR3 as a Key Mediator of Neuronal Cell Death

    PubMed Central

    Wang, Huaishan; Sherbini, Omar; Ling-lin Pai, Emily; Kwon, Ji-Sun; He, Wei; Wang, Hong; Chi, Zhikai; Xu, Jinchong; Jiang, Haisong; Andrabi, Shaida A.

    2016-01-01

    Neuronal loss caused by ischemic injury, trauma, or disease can lead to devastating consequences for the individual. With the goal of limiting neuronal loss, a number of cell death pathways have been studied, but there may be additional contributors to neuronal death that are yet unknown. To identify previously unknown cell death mediators, we performed a high-content genome-wide screening of short, interfering RNA (siRNA) with an siRNA library in murine neural stem cells after exposure to N-methyl-N-nitroso-N′-nitroguanidine (MNNG), which leads to DNA damage and cell death. Eighty genes were identified as key mediators for cell death. Among them, 14 are known cell death mediators and 66 have not previously been linked to cell death pathways. Using an integrated approach with functional and bioinformatics analysis, we provide possible molecular networks, interconnected pathways, and/or protein complexes that may participate in cell death. Of the 66 genes, we selected CCR3 for further evaluation and found that CCR3 is a mediator of neuronal injury. CCR3 inhibition or deletion protects murine cortical cultures from oxygen-glucose deprivation–induced cell death, and CCR3 deletion in mice provides protection from ischemia in vivo. Taken together, our findings suggest that CCR3 is a previously unknown mediator of cell death. Future identification of the neural cell death network in which CCR3 participates will enhance our understanding of the molecular mechanisms of neural cell death. PMID:27822494

  8. Integrated genomic and molecular characterization of cervical cancer.

    PubMed

    2017-03-16

    Cervical cancer remains one of the leading causes of cancer-related deaths worldwide. Here we report the extensive molecular characterization of 228 primary cervical cancers, one of the largest comprehensive genomic studies of cervical cancer to date. We observed notable APOBEC mutagenesis patterns and identified SHKBP1, ERBB3, CASP8, HLA-A and TGFBR2 as novel significantly mutated genes in cervical cancer. We also discovered amplifications in immune targets CD274 (also known as PD-L1) and PDCD1LG2 (also known as PD-L2), and the BCAR4 long non-coding RNA, which has been associated with response to lapatinib. Integration of human papilloma virus (HPV) was observed in all HPV18-related samples and 76% of HPV16-related samples, and was associated with structural aberrations and increased target-gene expression. We identified a unique set of endometrial-like cervical cancers, comprised predominantly of HPV-negative tumours with relatively high frequencies of KRAS, ARID1A and PTEN mutations. Integrative clustering of 178 samples identified keratin-low squamous, keratin-high squamous and adenocarcinoma-rich subgroups. These molecular analyses reveal new potential therapeutic targets for cervical cancers.

  9. Integrated Genomic Biomarkers to Identify Aggressive Disease in African Americans with Prostate Cancer

    DTIC Science & Technology

    2016-09-01

    AWARD NUMBER: W81XWH-15-1-0395. TITLE: Integrated Genomic Biomarkers to Identify Aggressive Disease in African Americans with Prostate Cancer. ...2015 - 31 Aug 2016.

  10. Accessing integrated genomic data using GenoBase: A tutorial, Part 1

    SciTech Connect

    Overbeek, R.; Price, M.

    1993-01-01

    GenoBase integrates genomic information from many existing databases, offering convenient access to the curated data. This document is the first part of a two-part tutorial on how to use GenoBase for accessing integrated genomic data.

  11. Accessing integrated genomic data using GenoBase: A tutorial, Part 1

    SciTech Connect

    Overbeek, R.; Price, M.

    1993-01-01

    GenoBase integrates genomic information from many existing databases, offering convenient access to the curated data. This document is the first part of a two-part tutorial on how to use GenoBase for accessing integrated genomic data.

  12. The integrated web service and genome database for agricultural plants with biotechnology information.

    PubMed

    Kim, Changkug; Park, Dongsuk; Seol, Youngjoo; Hahn, Jangho

    2011-01-01

    The National Agricultural Biotechnology Information Center (NABIC) constructed an agricultural biology-based infrastructure and developed a Web based relational database for agricultural plants with biotechnology information. The NABIC has concentrated on functional genomics of major agricultural plants, building an integrated biotechnology database for agro-biotech information that focuses on genomics of major agricultural resources. This genome database provides annotated genome information from 1,039,823 records mapped to rice, Arabidopsis, and Chinese cabbage.

  13. Genomic RNA folding mediates assembly of human parechovirus.

    PubMed

    Shakeel, Shabih; Dykeman, Eric C; White, Simon J; Ora, Ari; Cockburn, Joseph J B; Butcher, Sarah J; Stockley, Peter G; Twarock, Reidun

    2017-12-01

    Assembly of the major viral pathogens of the Picornaviridae family is poorly understood. Human parechovirus 1 is an example of such viruses that contains 60 short regions of ordered RNA density making identical contacts with the protein shell. We show here via a combination of RNA-based systematic evolution of ligands by exponential enrichment, bioinformatics analysis and reverse genetics that these RNA segments are bound to the coat proteins in a sequence-specific manner. Disruption of either the RNA coat protein recognition motif or its contact amino acid residues is deleterious for viral assembly. The data are consistent with RNA packaging signals playing essential roles in virion assembly. Their binding sites on the coat proteins are evolutionarily conserved across the Parechovirus genus, suggesting that they represent potential broad-spectrum anti-viral targets. The mechanism underlying packaging of genomic RNA into viral particles is not well understood for human parechoviruses. Here the authors identify short RNA motifs in the parechovirus genome that bind capsid proteins, providing approximately 60 specific interactions for virion assembly.

  14. Integration of molecular functions at the ecosystemic level: breakthroughs and future goals of environmental genomics and post-genomics

    PubMed Central

    Vandenkoornhuyse, Philippe; Dufresne, Alexis; Quaiser, Achim; Gouesbet, Gwenola; Binet, Françoise; Francez, André-Jean; Mahé, Stéphane; Bormans, Myriam; Lagadeuc, Yvan; Couée, Ivan

    2010-01-01

    Environmental genomics and genome-wide expression approaches deal with large-scale sequence-based information obtained from environmental samples, at organismal, population or community levels. To date, environmental genomics, transcriptomics and proteomics are arguably the most powerful approaches to discover completely novel ecological functions and to link organismal capabilities, organism–environment interactions, functional diversity, ecosystem processes, evolution and Earth history. Thus, environmental genomics is not merely a toolbox of new technologies but also a source of novel ecological concepts and hypotheses. By removing previous dichotomies between ecophysiology, population ecology, community ecology and ecosystem functioning, environmental genomics enables the integration of sequence-based information into higher ecological and evolutionary levels. However, environmental genomics, along with transcriptomics and proteomics, must involve pluridisciplinary research, such as new developments in bioinformatics, in order to integrate high-throughput molecular biology techniques into ecology. In this review, the validity of environmental genomics and post-genomics for studying ecosystem functioning is discussed in terms of major advances and expectations, as well as in terms of potential hurdles and limitations. Novel avenues for improving the use of these approaches to test theory-driven ecological hypotheses are also explored. PMID:20426792

  15. Causes and Consequences of Genetic Background Effects Illuminated by Integrative Genomic Analysis

    PubMed Central

    Chandler, Christopher H.; Chari, Sudarshan; Dworkin, Ian

    2014-01-01

    The phenotypic consequences of individual mutations are modulated by the wild-type genetic background in which they occur. Although such background dependence is widely observed, we do not know whether general patterns across species and traits exist or about the mechanisms underlying it. We also lack knowledge on how mutations interact with genetic background to influence gene expression and how this in turn mediates mutant phenotypes. Furthermore, how genetic background influences patterns of epistasis remains unclear. To investigate the genetic basis and genomic consequences of genetic background dependence of the scalloped(E3) allele on the Drosophila melanogaster wing, we generated multiple novel genome-level datasets from a mapping-by-introgression experiment and a tagged RNA gene expression dataset. In addition we used whole genome resequencing of the parental lines, two commonly used laboratory strains, to predict polymorphic transcription factor binding sites for SD. We integrated these data with previously published genomic datasets from expression microarrays and a modifier mutation screen. By searching for genes showing a congruent signal across multiple datasets, we were able to identify a robust set of candidate loci contributing to the background-dependent effects of mutations in sd. We also show that the majority of background-dependent modifiers previously reported are caused by higher-order epistasis, not quantitative noncomplementation. These findings provide a useful foundation for more detailed investigations of genetic background dependence in this system, and this approach is likely to prove useful in exploring the genetic basis of other traits as well. PMID:24504186

  16. Causes and consequences of genetic background effects illuminated by integrative genomic analysis.

    PubMed

    Chandler, Christopher H; Chari, Sudarshan; Tack, David; Dworkin, Ian

    2014-04-01

    The phenotypic consequences of individual mutations are modulated by the wild-type genetic background in which they occur. Although such background dependence is widely observed, we do not know whether general patterns across species and traits exist or about the mechanisms underlying it. We also lack knowledge on how mutations interact with genetic background to influence gene expression and how this in turn mediates mutant phenotypes. Furthermore, how genetic background influences patterns of epistasis remains unclear. To investigate the genetic basis and genomic consequences of genetic background dependence of the scalloped(E3) allele on the Drosophila melanogaster wing, we generated multiple novel genome-level datasets from a mapping-by-introgression experiment and a tagged RNA gene expression dataset. In addition we used whole genome resequencing of the parental lines, two commonly used laboratory strains, to predict polymorphic transcription factor binding sites for SD. We integrated these data with previously published genomic datasets from expression microarrays and a modifier mutation screen. By searching for genes showing a congruent signal across multiple datasets, we were able to identify a robust set of candidate loci contributing to the background-dependent effects of mutations in sd. We also show that the majority of background-dependent modifiers previously reported are caused by higher-order epistasis, not quantitative noncomplementation. These findings provide a useful foundation for more detailed investigations of genetic background dependence in this system, and this approach is likely to prove useful in exploring the genetic basis of other traits as well.

  17. Genomic rearrangements by LINE-1 insertion-mediated deletion in the human and chimpanzee lineages

    PubMed Central

    Han, Kyudong; Sen, Shurjo K.; Wang, Jianxin; Callinan, Pauline A.; Lee, Jungnam; Cordaux, Richard; Liang, Ping; Batzer, Mark A.

    2005-01-01

    Long INterspersed Elements (LINE-1s or L1s) are abundant non-LTR retrotransposons in mammalian genomes that are capable of insertional mutagenesis. They have been associated with target site deletions upon insertion in cell culture studies of retrotransposition. Here, we report 50 deletion events in the human and chimpanzee genomes directly linked to the insertion of L1 elements, resulting in the loss of ∼18 kb of sequence from the human genome and ∼15 kb from the chimpanzee genome. Our data suggest that during the primate radiation, L1 insertions may have deleted up to 7.5 Mb of target genomic sequences. While the results of our in vivo analysis differ from those of previous cell culture assays of L1 insertion-mediated deletions in terms of the size and rate of sequence deletion, evolutionary factors can reconcile the differences. We report a pattern of genomic deletion sizes similar to those created during the retrotransposition of Alu elements. Our study provides support for the existence of different mechanisms for small and large L1-mediated deletions, and we present a model for the correlation of L1 element size and the corresponding deletion size. In addition, we show that internal rearrangements can modify L1 structure during retrotransposition events associated with large deletions. PMID:16034026

  18. Integrative genomic profiling reveals conserved genetic mechanisms for tumorigenesis in common entities of non-Hodgkin's lymphoma.

    PubMed

    Green, Michael R; Aya-Bonilla, Carlos; Gandhi, Maher K; Lea, Rod A; Wellwood, Jeremy; Wood, Peter; Marlton, Paula; Griffiths, Lyn R

    2011-05-01

    Recent developments in genomic technologies have resulted in increased understanding of pathogenic mechanisms and emphasized the importance of central survival pathways. Here, we use a novel bioinformatic based integrative genomic profiling approach to elucidate conserved mechanisms of lymphomagenesis in the three commonest non-Hodgkin's lymphoma (NHL) entities: diffuse large B-cell lymphoma, follicular lymphoma, and B-cell chronic lymphocytic leukemia. By integrating genome-wide DNA copy number analysis and transcriptome profiling of tumor cohorts, we identified genetic lesions present in each entity and highlighted their likely target genes. This revealed a significant enrichment of components of both the apoptosis pathway and the mitogen activated protein kinase pathway, including amplification of the MAP3K12 locus in all three entities, within the set of genes targeted by genetic alterations in these diseases. Furthermore, amplification of 12p13.33 was identified in all three entities and found to target the FOXM1 oncogene. Amplification of FOXM1 was subsequently found to be associated with an increased MYC oncogenic signaling signature, and siRNA-mediated knock-down of FOXM1 resulted in decreased MYC expression and induced G2 arrest. Together, these findings underscore genetic alteration of the MAPK and apoptosis pathways, and genetic amplification of FOXM1 as conserved mechanisms of lymphomagenesis in common NHL entities. Integrative genomic profiling identifies common central survival mechanisms and highlights them as attractive targets for directed therapy. © 2011 Wiley-Liss, Inc.

  19. A Genome-wide siRNA Screen Reveals Diverse Cellular Processes and Pathways that Mediate Genome Stability

    PubMed Central

    Paulsen, Renee D.; Soni, Deena V.; Wollman, Roy; Hahn, Angela T.; Yee, Muh-Ching; Guan, Anna; Hesley, Jayne A.; Miller, Steven C.; Cromwell, Evan F.; Solow-Cordero, David E.; Meyer, Tobias; Cimprich, Karlene A.

    2009-01-01

    Signaling pathways that respond to DNA damage are essential for the maintenance of genome stability and are linked to many diseases, including cancer. Here, a genome-wide siRNA screen was employed to identify novel genes involved in genome stabilization by monitoring phosphorylation of the histone variant H2AX, an early mark of DNA damage. We identified hundreds of genes whose down-regulation led to elevated levels of H2AX phosphorylation (γH2AX) and revealed new links to cellular complexes and to genes with unclassified functions. We demonstrate a widespread role for mRNA processing factors in preventing DNA damage, which in some cases is caused by aberrant RNA-DNA structures. Furthermore, we connect increased γH2AX levels to the neurological disorder, Charcot-Marie-Tooth (CMT) syndrome, and we find a role for several CMT proteins in the DNA damage response. These data indicate that preservation of genome stability is mediated by a larger network of biological processes than previously appreciated. PMID:19647519

  20. High copy and stable expression of the xylanase XynHB in Saccharomyces cerevisiae by rDNA-mediated integration.

    PubMed

    Fang, Cheng; Wang, Qinhong; Selvaraj, Jonathan Nimal; Zhou, Yuling; Ma, Lixin; Zhang, Guimin; Ma, Yanhe

    2017-08-18

    Xylanase is a widely used additive in the baking industry for enhancing dough and bread quality. Several xylanases used in the baking industry have been expressed in different systems, but expressing them in an antibiotic-free vector system is essential for safety. In the present study, an alternative rDNA-mediated technology was developed to increase the copy number of a target gene by integrating it into the Saccharomyces cerevisiae genome. A xylanase-encoding gene, xynHB, from Bacillus sp. was cloned into pHBM367H and integrated into the S. cerevisiae genome through rDNA-mediated recombination. Exogenous XynHB expressed by recombinant S. cerevisiae strain A13 exhibited higher degradation activity towards xylan than other transformants. Real-time PCR analysis of the A13 genome revealed the presence of 13.64 copies of the xynHB gene. Although no antibiotics were used, the genetic stability and the xylanase activity of xynHB remained stable for up to 1,011 generations of cultivation. S. cerevisiae strain A13 expressing xylanase reduced the required kneading time and increased the height and diameter of the dough, and it would be safe and effective in the baking industry because it carries no antibiotic-resistance risk. This rDNA-mediated technology, which does not use antibiotics, provides a way to clone other food-related industrial enzymes for applications.

  1. The Integrated Microbial Genomes (IMG) System: An Expanding Comparative Analysis Resource

    SciTech Connect

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Grechkin, Yuri; Ratner, Anna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2009-09-13

    The integrated microbial genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG contains both draft and complete microbial genomes integrated with other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through regular releases. Several companion IMG systems have been set up in order to serve domain specific needs, such as expert review of genome annotations. IMG is available at .

  2. Assessing the integration of genomic medicine in genetic counseling training programs.

    PubMed

    Profato, Jessica; Gordon, Erynn S; Dixon, Shannan; Kwan, Andrea

    2014-08-01

    Medical genetics has entered a period of transition from genetics to genomics. Genetic counselors (GCs) may take on roles in the clinical implementation of genomics. This study explores the perspectives of program directors (PDs) on including genomic medicine in GC training programs, as well as the status of this integration. Study methods included an online survey, an optional one-on-one telephone interview, and an optional curricula content analysis. The majority of respondents (15/16) reported that it is important to include genomic medicine in program curricula. Most topics of genomic medicine are either "currently taught" or "under development" in all participating programs. Interview data from five PDs and one faculty member supported the survey data. Integrating genomics in training programs is challenging, and it is essential to develop genomics resources for curricula.

  3. The 3M complex maintains microtubule and genome integrity

    PubMed Central

    Yan, Jun; Yan, Feng; Li, Zhijun; Sinnott, Becky; Cappell, Kathryn M.; Yu, Yanbao; Mo, Jinyao; Duncan, Joseph A.; Chen, Xian; Cormier-Daire, Valerie; Whitehurst, Angelique W.; Xiong, Yue

    2014-01-01

    CUL7, OBSL1, and CCDC8 genes are mutated in a mutually exclusive manner in 3M and other growth retardation syndromes. The mechanism underlying the function of the three 3M genes in development is not known. We found that OBSL1 and CCDC8 form a complex with CUL7 and regulate the level and centrosomal localization of CUL7, respectively. CUL7 depletion results in altered microtubule dynamics, prometaphase arrest, tetraploidy and mitotic cell death. These defects are recaptured in CUL7 mutated 3M cells and can be rescued by wild-type, but not 3M patients-derived CUL7 mutants. Depletion of either OBSL1 or CCDC8 results in similar defects and sensitizes cells to microtubule damage as loss of CUL7 function. Microtubule damage reduces the level of CCDC8 that is required for the centrosomal localization of CUL7. We propose that CUL7, OBSL1, and CCDC8 proteins form a 3M complex that functions in maintaining microtubule and genome integrity and normal development. PMID:24793695

  4. CRISPR/Cas9-mediated genome engineering of CHO cell factories: Application and perspectives.

    PubMed

    Lee, Jae Seong; Grav, Lise Marie; Lewis, Nathan E; Faustrup Kildegaard, Helene

    2015-07-01

    Chinese hamster ovary (CHO) cells are the most widely used production host for therapeutic proteins. With the recent emergence of CHO genome sequences, CHO cell line engineering has taken on a new aspect through targeted genome editing. The bacterial clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) system enables rapid, easy and efficient engineering of mammalian genomes. It has a wide range of applications from modification of individual genes to genome-wide screening or regulation of genes. Facile genome editing using CRISPR/Cas9 empowers researchers in the CHO community to elucidate the mechanistic basis behind high level production of proteins and product quality attributes of interest. In this review, we describe the basis of CRISPR/Cas9-mediated genome editing and its application for development of next generation CHO cell factories while highlighting both future perspectives and challenges. As one of the main drivers for the CHO systems biology era, genome engineering with CRISPR/Cas9 will pave the way for rational design of CHO cell factories. Copyright © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  5. Phenotype-Genotype Integrator (PheGenI): synthesizing genome-wide association study (GWAS) data with existing genomic resources.

    PubMed

    Ramos, Erin M; Hoffman, Douglas; Junkins, Heather A; Maglott, Donna; Phan, Lon; Sherry, Stephen T; Feolo, Mike; Hindorff, Lucia A

    2014-01-01

    Rapidly accumulating data from genome-wide association studies (GWASs) and other large-scale studies are most useful when synthesized with existing databases. To address this opportunity, we developed the Phenotype-Genotype Integrator (PheGenI), a user-friendly web interface that integrates various National Center for Biotechnology Information (NCBI) genomic databases with association data from the National Human Genome Research Institute GWAS Catalog and supports downloads of search results. Here, we describe the rationale for and development of this resource. Integrating over 66,000 association records with extensive single nucleotide polymorphism (SNP), gene, and expression quantitative trait loci data already available from the NCBI, PheGenI enables deeper investigation and interrogation of SNPs associated with a wide range of traits, facilitating the examination of the relationships between genetic variation and human diseases.

  6. Application of oocyte cryopreservation technology in TALEN-mediated mouse genome editing.

    PubMed

    Nakagawa, Yoshiko; Sakuma, Tetsushi; Nakagata, Naomi; Yamasaki, Sho; Takeda, Naoki; Ohmuraya, Masaki; Yamamoto, Takashi

    2014-01-01

    Reproductive engineering techniques, such as in vitro fertilization (IVF) and cryopreservation of embryos or spermatozoa, are essential for preservation, reproduction, and transportation of genetically engineered mice. However, it has not yet been elucidated whether these techniques can be applied for the generation of genome-edited mice using engineered nucleases such as transcription activator-like effector nucleases (TALENs). Here, we demonstrate the usefulness of frozen oocytes fertilized in vitro using frozen sperm for TALEN-mediated genome editing in mice. We examined side-by-side comparisons concerning sperm (fresh vs. frozen), fertilization method (mating vs. IVF), and fertilized oocytes (fresh vs. frozen) for the source of oocytes used for TALEN injection; we found that fertilized oocytes created under all tested conditions were applicable for TALEN-mediated mutagenesis. In addition, we investigated whether the ages in weeks of parental female mice can affect the efficiency of gene modification, by comparing 5-week-old and 8-12-week-old mice as the source of oocytes used for TALEN injection. The genome editing efficiency of an endogenous gene was consistently 95-100% when either 5-week-old or 8-12-week-old mice were used with or without freezing the oocytes. Thus, our report describes the availability of freeze-thawed oocytes and oocytes from female mice at various weeks of age for TALEN-mediated genome editing, thus boosting the convenience of such innovative gene targeting strategies.

  7. Human Ageing Genomic Resources: Integrated databases and tools for the biology and genetics of ageing

    PubMed Central

    Tacutu, Robi; Craig, Thomas; Budovsky, Arie; Wuttke, Daniel; Lehmann, Gilad; Taranukha, Dmitri; Costa, Joana; Fraifeld, Vadim E.; de Magalhães, João Pedro

    2013-01-01

    The Human Ageing Genomic Resources (HAGR, http://genomics.senescence.info) is a freely available online collection of research databases and tools for the biology and genetics of ageing. HAGR features now several databases with high-quality manually curated data: (i) GenAge, a database of genes associated with ageing in humans and model organisms; (ii) AnAge, an extensive collection of longevity records and complementary traits for >4000 vertebrate species; and (iii) GenDR, a newly incorporated database, containing both gene mutations that interfere with dietary restriction-mediated lifespan extension and consistent gene expression changes induced by dietary restriction. Since its creation about 10 years ago, major efforts have been undertaken to maintain the quality of data in HAGR, while further continuing to develop, improve and extend it. This article briefly describes the content of HAGR and details the major updates since its previous publications, in terms of both structure and content. The completely redesigned interface, more intuitive and more integrative of HAGR resources, is also presented. Altogether, we hope that through its improvements, the current version of HAGR will continue to provide users with the most comprehensive and accessible resources available today in the field of biogerontology. PMID:23193293

  8. Human Ageing Genomic Resources: integrated databases and tools for the biology and genetics of ageing.

    PubMed

    Tacutu, Robi; Craig, Thomas; Budovsky, Arie; Wuttke, Daniel; Lehmann, Gilad; Taranukha, Dmitri; Costa, Joana; Fraifeld, Vadim E; de Magalhães, João Pedro

    2013-01-01

    The Human Ageing Genomic Resources (HAGR, http://genomics.senescence.info) is a freely available online collection of research databases and tools for the biology and genetics of ageing. HAGR features now several databases with high-quality manually curated data: (i) GenAge, a database of genes associated with ageing in humans and model organisms; (ii) AnAge, an extensive collection of longevity records and complementary traits for >4000 vertebrate species; and (iii) GenDR, a newly incorporated database, containing both gene mutations that interfere with dietary restriction-mediated lifespan extension and consistent gene expression changes induced by dietary restriction. Since its creation about 10 years ago, major efforts have been undertaken to maintain the quality of data in HAGR, while further continuing to develop, improve and extend it. This article briefly describes the content of HAGR and details the major updates since its previous publications, in terms of both structure and content. The completely redesigned interface, more intuitive and more integrative of HAGR resources, is also presented. Altogether, we hope that through its improvements, the current version of HAGR will continue to provide users with the most comprehensive and accessible resources available today in the field of biogerontology.

  9. Integrative genomic analysis implicates limited peripheral adipose storage capacity in the pathogenesis of human insulin resistance.

    PubMed

    Lotta, Luca A; Gulati, Pawan; Day, Felix R; Payne, Felicity; Ongen, Halit; van de Bunt, Martijn; Gaulton, Kyle J; Eicher, John D; Sharp, Stephen J; Luan, Jian'an; De Lucia Rolfe, Emanuella; Stewart, Isobel D; Wheeler, Eleanor; Willems, Sara M; Adams, Claire; Yaghootkar, Hanieh; Forouhi, Nita G; Khaw, Kay-Tee; Johnson, Andrew D; Semple, Robert K; Frayling, Timothy; Perry, John R B; Dermitzakis, Emmanouil; McCarthy, Mark I; Barroso, Inês; Wareham, Nicholas J; Savage, David B; Langenberg, Claudia; O'Rahilly, Stephen; Scott, Robert A

    2017-01-01

    Insulin resistance is a key mediator of obesity-related cardiometabolic disease, yet the mechanisms underlying this link remain obscure. Using an integrative genomic approach, we identify 53 genomic regions associated with insulin resistance phenotypes (higher fasting insulin levels adjusted for BMI, lower HDL cholesterol levels and higher triglyceride levels) and provide evidence that their link with higher cardiometabolic risk is underpinned by an association with lower adipose mass in peripheral compartments. Using these 53 loci, we show a polygenic contribution to familial partial lipodystrophy type 1, a severe form of insulin resistance, and highlight shared molecular mechanisms in common/mild and rare/severe insulin resistance. Population-level genetic analyses combined with experiments in cellular models implicate CCDC92, DNAH10 and L3MBTL3 as previously unrecognized molecules influencing adipocyte differentiation. Our findings support the notion that limited storage capacity of peripheral adipose tissue is an important etiological component in insulin-resistant cardiometabolic disease and highlight genes and mechanisms underpinning this link.

  10. Cas9-mediated genome editing in the methanogenic archaeon Methanosarcina acetivorans.

    PubMed

    Nayak, Dipti D; Metcalf, William W

    2017-03-14

    Although Cas9-mediated genome editing has proven to be a powerful genetic tool in eukaryotes, its application in Bacteria has been limited because of inefficient targeting or repair; and its application to Archaea has yet to be reported. Here we describe the development of a Cas9-mediated genome-editing tool that allows facile genetic manipulation of the slow-growing methanogenic archaeon Methanosarcina acetivorans. Introduction of both insertions and deletions by homology-directed repair was remarkably efficient and precise, occurring at a frequency of approximately 20% relative to the transformation efficiency, with the desired mutation being found in essentially all transformants examined. Off-target activity was not observed. We also observed that multiple single-guide RNAs could be expressed in the same transcript, reducing the size of mutagenic plasmids and simultaneously simplifying their design. Cas9-mediated genome editing reduces the time needed to construct mutants by more than half (3 vs. 8 wk) and allows simultaneous construction of double mutants with high efficiency, exponentially decreasing the time needed for complex strain constructions. Furthermore, coexpression of the nonhomologous end-joining (NHEJ) machinery from the closely related archaeon, Methanocella paludicola, allowed efficient Cas9-mediated genome editing without the need for a repair template. The NHEJ-dependent mutations included deletions ranging from 75 to 2.7 kb in length, most of which appear to have occurred at regions of naturally occurring microhomology. The combination of homology-directed repair-dependent and NHEJ-dependent genome-editing tools comprises a powerful genetic system that enables facile insertion and deletion of genes, rational modification of gene expression, and testing of gene essentiality.

  11. Integrating genetics and genomics into nursing curricula: you can do it too!

    PubMed

    Daack-Hirsch, Sandra; Jackson, Barbara; Belchez, Chito A; Elder, Betty; Hurley, Roxanne; Kerr, Peg; Nissen, Mary Kay

    2013-12-01

    Rapid advances in knowledge and technology related to genomics cross health care disciplines and touch almost every aspect of patient care. The ability to sequence a genome holds the promise that health care can be personalized. Health care professionals are faced with a gap in the ability to use the rapidly expanding technology and knowledge related to genomics in practice. Yet, nurses are key to bridging the gap between genomic discoveries and the human experience of illness. This article presents a case study documenting the experience of five nursing schools/colleges of nursing as they work to integrate genetics and genomics into their curricula.

  12. Divergence of RNA polymerase α subunits in angiosperm plastid genomes is mediated by genomic rearrangement

    PubMed Central

    Blazier, J. Chris; Ruhlman, Tracey A.; Weng, Mao-Lun; Rehman, Sumaiyah K.; Sabir, Jamal S. M.; Jansen, Robert K.

    2016-01-01

    Genes for the plastid-encoded RNA polymerase (PEP) persist in the plastid genomes of all photosynthetic angiosperms. However, three unrelated lineages (Annonaceae, Passifloraceae and Geraniaceae) have been identified with unusually divergent open reading frames (ORFs) in the conserved region of rpoA, the gene encoding the PEP α subunit. We used sequence-based approaches to evaluate whether these genes retain function. Both gene sequences and complete plastid genome sequences were assembled and analyzed from each of the three angiosperm families. Multiple lines of evidence indicated that the rpoA sequences are likely functional despite retaining as low as 30% nucleotide sequence identity with rpoA genes from outgroups in the same angiosperm order. The ratio of non-synonymous to synonymous substitutions indicated that these genes are under purifying selection, and bioinformatic prediction of conserved domains indicated that functional domains are preserved. One of the lineages (Pelargonium, Geraniaceae) contains species with multiple rpoA-like ORFs that show evidence of ongoing inter-paralog gene conversion. The plastid genomes containing these divergent rpoA genes have experienced extensive structural rearrangement, including large expansions of the inverted repeat. We propose that illegitimate recombination, not positive selection, has driven the divergence of rpoA. PMID:27087667

  13. Divergence of RNA polymerase α subunits in angiosperm plastid genomes is mediated by genomic rearrangement.

    PubMed

    Blazier, J Chris; Ruhlman, Tracey A; Weng, Mao-Lun; Rehman, Sumaiyah K; Sabir, Jamal S M; Jansen, Robert K

    2016-04-18

    Genes for the plastid-encoded RNA polymerase (PEP) persist in the plastid genomes of all photosynthetic angiosperms. However, three unrelated lineages (Annonaceae, Passifloraceae and Geraniaceae) have been identified with unusually divergent open reading frames (ORFs) in the conserved region of rpoA, the gene encoding the PEP α subunit. We used sequence-based approaches to evaluate whether these genes retain function. Both gene sequences and complete plastid genome sequences were assembled and analyzed from each of the three angiosperm families. Multiple lines of evidence indicated that the rpoA sequences are likely functional despite retaining as low as 30% nucleotide sequence identity with rpoA genes from outgroups in the same angiosperm order. The ratio of non-synonymous to synonymous substitutions indicated that these genes are under purifying selection, and bioinformatic prediction of conserved domains indicated that functional domains are preserved. One of the lineages (Pelargonium, Geraniaceae) contains species with multiple rpoA-like ORFs that show evidence of ongoing inter-paralog gene conversion. The plastid genomes containing these divergent rpoA genes have experienced extensive structural rearrangement, including large expansions of the inverted repeat. We propose that illegitimate recombination, not positive selection, has driven the divergence of rpoA.

  14. Goldmine integrates information placing genomic ranges into meaningful biological contexts

    PubMed Central

    Bhasin, Jeffrey M.; Ting, Angela H.

    2016-01-01

    Bioinformatic analysis often produces large sets of genomic ranges that can be difficult to interpret in the absence of genomic context. Goldmine annotates genomic ranges from any source with gene model and feature contexts to facilitate global descriptions and candidate loci discovery. We demonstrate the value of genomic context by using Goldmine to elucidate context dynamics in transcription factor binding and to reveal differentially methylated regions (DMRs) with context-specific functional correlations. The open source R package and documentation for Goldmine are available at http://jeffbhasin.github.io/goldmine. PMID:27257071

  15. Stakeholder engagement: a key component of integrating genomic information into electronic health records

    PubMed Central

    Hartzler, Andrea; McCarty, Catherine A.; Rasmussen, Luke V.; Williams, Marc S.; Brilliant, Murray; Bowton, Erica A.; Clayton, Ellen Wright; Faucett, William A.; Ferryman, Kadija; Field, Julie R.; Fullerton, Stephanie M.; Horowitz, Carol R.; Koenig, Barbara A.; McCormick, Jennifer B.; Ralston, James D.; Sanderson, Saskia C.; Smith, Maureen E.; Trinidad, Susan Brown

    2014-01-01

    Integrating genomic information into clinical care and the electronic health record can facilitate personalized medicine through genetically guided clinical decision support. Stakeholder involvement is critical to the success of these implementation efforts. Prior work on implementation of clinical information systems provides broad guidance to inform effective engagement strategies. We add to this evidence-based recommendations that are specific to issues at the intersection of genomics and the electronic health record. We describe stakeholder engagement strategies employed by the Electronic Medical Records and Genomics Network, a national consortium of US research institutions funded by the National Human Genome Research Institute to develop, disseminate, and apply approaches that combine genomic and electronic health record data. Through select examples drawn from sites of the Electronic Medical Records and Genomics Network, we illustrate a continuum of engagement strategies to inform genomic integration into commercial and homegrown electronic health records across a range of health-care settings. We frame engagement as activities to consult, involve, and partner with key stakeholder groups throughout specific phases of health information technology implementation. Our aim is to provide insights into engagement strategies to guide genomic integration based on our unique network experiences and lessons learned within the broader context of implementation research in biomedical informatics. On the basis of our collective experience, we describe key stakeholder practices, challenges, and considerations for successful genomic integration to support personalized medicine. PMID:24030437

  16. SIRT7 promotes genome integrity and modulates non-homologous end joining DNA repair.

    PubMed

    Vazquez, Berta N; Thackray, Joshua K; Simonet, Nicolas G; Kane-Goldsmith, Noriko; Martinez-Redondo, Paloma; Nguyen, Trang; Bunting, Samuel; Vaquero, Alejandro; Tischfield, Jay A; Serrano, Lourdes

    2016-07-15

    Sirtuins, a family of protein deacetylases, promote cellular homeostasis by mediating communication between cells and environment. The enzymatic activity of the mammalian sirtuin SIRT7 targets acetylated lysine in the N-terminal tail of histone H3 (H3K18Ac), thus modulating chromatin structure and transcriptional competency. SIRT7 deletion is associated with reduced lifespan in mice through unknown mechanisms. Here, we show that SirT7-knockout mice suffer from partial embryonic lethality and a progeroid-like phenotype. Consistently, SIRT7-deficient cells display increased replication stress and impaired DNA repair. SIRT7 is recruited in a PARP1-dependent manner to sites of DNA damage, where it modulates H3K18Ac levels. H3K18Ac in turn affects recruitment of the damage response factor 53BP1 to DNA double-strand breaks (DSBs), thereby influencing the efficiency of non-homologous end joining (NHEJ). These results reveal a direct role for SIRT7 in DSB repair and establish a functional link between SIRT7-mediated H3K18 deacetylation and the maintenance of genome integrity. © 2016 The Authors. Published under the terms of the CC BY 4.0 license.

  17. Characterization of Equine Infectious Anemia Virus Integration in the Horse Genome.

    PubMed

    Liu, Qiang; Wang, Xue-Feng; Ma, Jian; He, Xi-Jun; Wang, Xiao-Jun; Zhou, Jian-Hua

    2015-06-19

    Human immunodeficiency virus (HIV)-1 has a unique integration profile in the human genome relative to murine and avian retroviruses. Equine infectious anemia virus (EIAV) is another well-studied lentivirus that can also be used as a promising retro-transfection vector, but its integration into its native host has not been characterized. In this study, we mapped 477 integration sites of the EIAV strain EIAVFDDV13 in fetal equine dermal (FED) cells during in vitro infection. Published integration sites of EIAV and HIV-1 in the human genome were also analyzed as references. Our results demonstrated that EIAVFDDV13 tended to integrate into genes and AT-rich regions, and it avoided integrating into transcription start sites (TSS), which is consistent with EIAV and HIV-1 integration in the human genome. Notably, the integration of EIAVFDDV13 favored long interspersed elements (LINEs) and DNA transposons in the horse genome, whereas the integration of HIV-1 favored short interspersed elements (SINEs) in the human genome. The chromosomal environment near LINEs or DNA transposons potentially influences viral transcription and may be related to the unique EIAV latency states in equids. The data on EIAV integration in its natural host will facilitate studies on lentiviral infection and lentivirus-based therapeutic vectors.

  18. Brassica database (BRAD) version 2.0: integrating and mining Brassicaceae species genomic resources.

    PubMed

    Wang, Xiaobo; Wu, Jian; Liang, Jianli; Cheng, Feng; Wang, Xiaowu

    2015-01-01

    The Brassica database (BRAD) was built initially to help users apply Brassica rapa and Arabidopsis thaliana genomic data efficiently to their research. However, many Brassicaceae genomes have been sequenced and released after its construction. These genomes are rich resources for comparative genomics, gene annotation and functional evolutionary studies of Brassica crops. Therefore, we have updated BRAD to version 2.0 (V2.0). In BRAD V2.0, 11 more Brassicaceae genomes have been integrated into the database, namely those of Arabidopsis lyrata, Aethionema arabicum, Brassica oleracea, Brassica napus, Camelina sativa, Capsella rubella, Leavenworthia alabamica, Sisymbrium irio and three extremophiles Schrenkiella parvula, Thellungiella halophila and Thellungiella salsuginea. BRAD V2.0 provides plots of syntenic genomic fragments between pairs of Brassicaceae species, from the level of chromosomes to genomic blocks. The Generic Synteny Browser (GBrowse_syn), a module of the Genome Browser (GBrowse), is used to show syntenic relationships between multiple genomes. Search functions for retrieving syntenic and non-syntenic orthologs, as well as their annotation and sequences are also provided. Furthermore, genome and annotation information have been imported into GBrowse so that all functional elements can be visualized in one frame. We plan to continually update BRAD by integrating more Brassicaceae genomes into the database. Database URL: http://brassicadb.org/brad/. © The Author(s) 2015. Published by Oxford University Press.

  19. Brassica database (BRAD) version 2.0: integrating and mining Brassicaceae species genomic resources

    PubMed Central

    Wang, Xiaobo; Wu, Jian; Liang, Jianli; Cheng, Feng; Wang, Xiaowu

    2015-01-01

    The Brassica database (BRAD) was built initially to help users apply Brassica rapa and Arabidopsis thaliana genomic data efficiently to their research. However, many Brassicaceae genomes have been sequenced and released after its construction. These genomes are rich resources for comparative genomics, gene annotation and functional evolutionary studies of Brassica crops. Therefore, we have updated BRAD to version 2.0 (V2.0). In BRAD V2.0, 11 more Brassicaceae genomes have been integrated into the database, namely those of Arabidopsis lyrata, Aethionema arabicum, Brassica oleracea, Brassica napus, Camelina sativa, Capsella rubella, Leavenworthia alabamica, Sisymbrium irio and three extremophiles Schrenkiella parvula, Thellungiella halophila and Thellungiella salsuginea. BRAD V2.0 provides plots of syntenic genomic fragments between pairs of Brassicaceae species, from the level of chromosomes to genomic blocks. The Generic Synteny Browser (GBrowse_syn), a module of the Genome Browser (GBrowse), is used to show syntenic relationships between multiple genomes. Search functions for retrieving syntenic and non-syntenic orthologs, as well as their annotation and sequences are also provided. Furthermore, genome and annotation information have been imported into GBrowse so that all functional elements can be visualized in one frame. We plan to continually update BRAD by integrating more Brassicaceae genomes into the database. Database URL: http://brassicadb.org/brad/ PMID:26589635

  20. Sinbase: an integrated database to study genomics, genetics and comparative genomics in Sesamum indicum.

    PubMed

    Wang, Linhai; Yu, Jingyin; Li, Donghua; Zhang, Xiurong

    2015-01-01

    Sesame (Sesamum indicum L.) is an ancient and important oilseed crop grown widely in tropical and subtropical areas. It belongs to the gigantic order Lamiales, which includes many well-known or economically important species, such as olive (Olea europaea), leonurus (Leonurus japonicus) and lavender (Lavandula spica), many of which have important pharmacological properties. Despite their importance, genetic and genomic analyses on these species have been insufficient due to a lack of reference genome information. The now available S. indicum genome will provide an unprecedented opportunity for studying both S. indicum genetic traits and comparative genomics. To deliver S. indicum genomic information to the worldwide research community, we designed Sinbase, a web-based database with comprehensive sesame genomic, genetic and comparative genomic information. Sinbase includes sequences of assembled sesame pseudomolecular chromosomes, protein-coding genes (27,148), transposable elements (372,167) and non-coding RNAs (1,748). In particular, Sinbase provides unique and valuable information on colinear regions with various plant genomes, including Arabidopsis thaliana, Glycine max, Vitis vinifera and Solanum lycopersicum. Sinbase also provides a useful search function and data mining tools, including a keyword search and local BLAST service. Sinbase will be updated regularly with new features, improvements to genome annotation and new genomic sequences, and is freely accessible at http://ocri-genomics.org/Sinbase/.

  1. Comprehensive, Integrative Genomic Analysis of Diffuse Lower-Grade Gliomas

    PubMed Central

    2015-01-01

    BACKGROUND Diffuse low-grade and intermediate-grade gliomas (which together make up the lower-grade gliomas, World Health Organization grades II and III) have highly variable clinical behavior that is not adequately predicted on the basis of histologic class. Some are indolent; others quickly progress to glioblastoma. The uncertainty is compounded by interobserver variability in histologic diagnosis. Mutations in IDH, TP53, and ATRX and codeletion of chromosome arms 1p and 19q (1p/19q codeletion) have been implicated as clinically relevant markers of lower-grade gliomas. METHODS We performed genomewide analyses of 293 lower-grade gliomas from adults, incorporating exome sequence, DNA copy number, DNA methylation, messenger RNA expression, microRNA expression, and targeted protein expression. These data were integrated and tested for correlation with clinical outcomes. RESULTS Unsupervised clustering of mutations and data from RNA, DNA-copy-number, and DNA-methylation platforms uncovered concordant classification of three robust, nonoverlapping, prognostically significant subtypes of lower-grade glioma that were captured more accurately by IDH, 1p/19q, and TP53 status than by histologic class. Patients who had lower-grade gliomas with an IDH mutation and 1p/19q codeletion had the most favorable clinical outcomes. Their gliomas harbored mutations in CIC, FUBP1, NOTCH1, and the TERT promoter. Nearly all lower-grade gliomas with IDH mutations and no 1p/19q codeletion had mutations in TP53 (94%) and ATRX inactivation (86%). The large majority of lower-grade gliomas without an IDH mutation had genomic aberrations and clinical behavior strikingly similar to those found in primary glioblastoma. CONCLUSIONS The integration of genomewide data from multiple platforms delineated three molecular classes of lower-grade gliomas that were more concordant with IDH, 1p/19q, and TP53 status than with histologic class. Lower-grade gliomas with an IDH mutation either had 1p/19q

  2. Comprehensive, Integrative Genomic Analysis of Diffuse Lower-Grade Gliomas.

    PubMed

    Brat, Daniel J; Verhaak, Roel G W; Aldape, Kenneth D; Yung, W K Alfred; Salama, Sofie R; Cooper, Lee A D; Rheinbay, Esther; Miller, C Ryan; Vitucci, Mark; Morozova, Olena; Robertson, A Gordon; Noushmehr, Houtan; Laird, Peter W; Cherniack, Andrew D; Akbani, Rehan; Huse, Jason T; Ciriello, Giovanni; Poisson, Laila M; Barnholtz-Sloan, Jill S; Berger, Mitchel S; Brennan, Cameron; Colen, Rivka R; Colman, Howard; Flanders, Adam E; Giannini, Caterina; Grifford, Mia; Iavarone, Antonio; Jain, Rajan; Joseph, Isaac; Kim, Jaegil; Kasaian, Katayoon; Mikkelsen, Tom; Murray, Bradley A; O'Neill, Brian Patrick; Pachter, Lior; Parsons, Donald W; Sougnez, Carrie; Sulman, Erik P; Vandenberg, Scott R; Van Meir, Erwin G; von Deimling, Andreas; Zhang, Hailei; Crain, Daniel; Lau, Kevin; Mallery, David; Morris, Scott; Paulauskis, Joseph; Penny, Robert; Shelton, Troy; Sherman, Mark; Yena, Peggy; Black, Aaron; Bowen, Jay; Dicostanzo, Katie; Gastier-Foster, Julie; Leraas, Kristen M; Lichtenberg, Tara M; Pierson, Christopher R; Ramirez, Nilsa C; Taylor, Cynthia; Weaver, Stephanie; Wise, Lisa; Zmuda, Erik; Davidsen, Tanja; Demchok, John A; Eley, Greg; Ferguson, Martin L; Hutter, Carolyn M; Mills Shaw, Kenna R; Ozenberger, Bradley A; Sheth, Margi; Sofia, Heidi J; Tarnuzzer, Roy; Wang, Zhining; Yang, Liming; Zenklusen, Jean Claude; Ayala, Brenda; Baboud, Julien; Chudamani, Sudha; Jensen, Mark A; Liu, Jia; Pihl, Todd; Raman, Rohini; Wan, Yunhu; Wu, Ye; Ally, Adrian; Auman, J Todd; Balasundaram, Miruna; Balu, Saianand; Baylin, Stephen B; Beroukhim, Rameen; Bootwalla, Moiz S; Bowlby, Reanne; Bristow, Christopher A; Brooks, Denise; Butterfield, Yaron; Carlsen, Rebecca; Carter, Scott; Chin, Lynda; Chu, Andy; Chuah, Eric; Cibulskis, Kristian; Clarke, Amanda; Coetzee, Simon G; Dhalla, Noreen; Fennell, Tim; Fisher, Sheila; Gabriel, Stacey; Getz, Gad; Gibbs, Richard; Guin, Ranabir; Hadjipanayis, Angela; Hayes, D Neil; Hinoue, Toshinori; Hoadley, Katherine; Holt, Robert A; Hoyle, Alan P; Jefferys, 
Stuart R; Jones, Steven; Jones, Corbin D; Kucherlapati, Raju; Lai, Phillip H; Lander, Eric; Lee, Semin; Lichtenstein, Lee; Ma, Yussanne; Maglinte, Dennis T; Mahadeshwar, Harshad S; Marra, Marco A; Mayo, Michael; Meng, Shaowu; Meyerson, Matthew L; Mieczkowski, Piotr A; Moore, Richard A; Mose, Lisle E; Mungall, Andrew J; Pantazi, Angeliki; Parfenov, Michael; Park, Peter J; Parker, Joel S; Perou, Charles M; Protopopov, Alexei; Ren, Xiaojia; Roach, Jeffrey; Sabedot, Thaís S; Schein, Jacqueline; Schumacher, Steven E; Seidman, Jonathan G; Seth, Sahil; Shen, Hui; Simons, Janae V; Sipahimalani, Payal; Soloway, Matthew G; Song, Xingzhi; Sun, Huandong; Tabak, Barbara; Tam, Angela; Tan, Donghui; Tang, Jiabin; Thiessen, Nina; Triche, Timothy; Van Den Berg, David J; Veluvolu, Umadevi; Waring, Scot; Weisenberger, Daniel J; Wilkerson, Matthew D; Wong, Tina; Wu, Junyuan; Xi, Liu; Xu, Andrew W; Yang, Lixing; Zack, Travis I; Zhang, Jianhua; Aksoy, B Arman; Arachchi, Harindra; Benz, Chris; Bernard, Brady; Carlin, Daniel; Cho, Juok; DiCara, Daniel; Frazer, Scott; Fuller, Gregory N; Gao, JianJiong; Gehlenborg, Nils; Haussler, David; Heiman, David I; Iype, Lisa; Jacobsen, Anders; Ju, Zhenlin; Katzman, Sol; Kim, Hoon; Knijnenburg, Theo; Kreisberg, Richard Bailey; Lawrence, Michael S; Lee, William; Leinonen, Kalle; Lin, Pei; Ling, Shiyun; Liu, Wenbin; Liu, Yingchun; Liu, Yuexin; Lu, Yiling; Mills, Gordon; Ng, Sam; Noble, Michael S; Paull, Evan; Rao, Arvind; Reynolds, Sheila; Saksena, Gordon; Sanborn, Zack; Sander, Chris; Schultz, Nikolaus; Senbabaoglu, Yasin; Shen, Ronglai; Shmulevich, Ilya; Sinha, Rileen; Stuart, Josh; Sumer, S Onur; Sun, Yichao; Tasman, Natalie; Taylor, Barry S; Voet, Doug; Weinhold, Nils; Weinstein, John N; Yang, Da; Yoshihara, Kosuke; Zheng, Siyuan; Zhang, Wei; Zou, Lihua; Abel, Ty; Sadeghi, Sara; Cohen, Mark L; Eschbacher, Jenny; Hattab, Eyas M; Raghunathan, Aditya; Schniederjan, Matthew J; Aziz, Dina; Barnett, Gene; Barrett, Wendi; Bigner, Darell D; Boice, Lori; 
Brewer, Cathy; Calatozzolo, Chiara; Campos, Benito; Carlotti, Carlos Gilberto; Chan, Timothy A; Cuppini, Lucia; Curley, Erin; Cuzzubbo, Stefania; Devine, Karen; DiMeco, Francesco; Duell, Rebecca; Elder, J Bradley; Fehrenbach, Ashley; Finocchiaro, Gaetano; Friedman, William; Fulop, Jordonna; Gardner, Johanna; Hermes, Beth; Herold-Mende, Christel; Jungk, Christine; Kendler, Ady; Lehman, Norman L; Lipp, Eric; Liu, Ouida; Mandt, Randy; McGraw, Mary; Mclendon, Roger; McPherson, Christopher; Neder, Luciano; Nguyen, Phuong; Noss, Ardene; Nunziata, Raffaele; Ostrom, Quinn T; Palmer, Cheryl; Perin, Alessandro; Pollo, Bianca; Potapov, Alexander; Potapova, Olga; Rathmell, W Kimryn; Rotin, Daniil; Scarpace, Lisa; Schilero, Cathy; Senecal, Kelly; Shimmel, Kristen; Shurkhay, Vsevolod; Sifri, Suzanne; Singh, Rosy; Sloan, Andrew E; Smolenski, Kathy; Staugaitis, Susan M; Steele, Ruth; Thorne, Leigh; Tirapelli, Daniela P C; Unterberg, Andreas; Vallurupalli, Mahitha; Wang, Yun; Warnick, Ronald; Williams, Felicia; Wolinsky, Yingli; Bell, Sue; Rosenberg, Mara; Stewart, Chip; Huang, Franklin; Grimsby, Jonna L; Radenbaugh, Amie J; Zhang, Jianan

    2015-06-25

    Diffuse low-grade and intermediate-grade gliomas (which together make up the lower-grade gliomas, World Health Organization grades II and III) have highly variable clinical behavior that is not adequately predicted on the basis of histologic class. Some are indolent; others quickly progress to glioblastoma. The uncertainty is compounded by interobserver variability in histologic diagnosis. Mutations in IDH, TP53, and ATRX and codeletion of chromosome arms 1p and 19q (1p/19q codeletion) have been implicated as clinically relevant markers of lower-grade gliomas. We performed genomewide analyses of 293 lower-grade gliomas from adults, incorporating exome sequence, DNA copy number, DNA methylation, messenger RNA expression, microRNA expression, and targeted protein expression. These data were integrated and tested for correlation with clinical outcomes. Unsupervised clustering of mutations and data from RNA, DNA-copy-number, and DNA-methylation platforms uncovered concordant classification of three robust, nonoverlapping, prognostically significant subtypes of lower-grade glioma that were captured more accurately by IDH, 1p/19q, and TP53 status than by histologic class. Patients who had lower-grade gliomas with an IDH mutation and 1p/19q codeletion had the most favorable clinical outcomes. Their gliomas harbored mutations in CIC, FUBP1, NOTCH1, and the TERT promoter. Nearly all lower-grade gliomas with IDH mutations and no 1p/19q codeletion had mutations in TP53 (94%) and ATRX inactivation (86%). The large majority of lower-grade gliomas without an IDH mutation had genomic aberrations and clinical behavior strikingly similar to those found in primary glioblastoma. The integration of genomewide data from multiple platforms delineated three molecular classes of lower-grade gliomas that were more concordant with IDH, 1p/19q, and TP53 status than with histologic class. Lower-grade gliomas with an IDH mutation either had 1p/19q codeletion or carried a TP53 mutation. Most

  3. A high utility integrated map of the pig genome

    USDA-ARS's Scientific Manuscript database

    Background: The domestic pig is being increasingly exploited as a system for modeling human disease. It also has substantial economic importance for meat-based protein production. Physical clone maps have underpinned large-scale genomic sequencing and enabled focused cloning efforts for many genome...

  4. Community standards for genomic resources, genetic conservation, and data integration

    Treesearch

    Jill Wegrzyn; Meg Staton; Emily Grau; Richard Cronn; C. Dana Nelson

    2017-01-01

    Genetics and genomics are increasingly important in forestry management and conservation. Next generation sequencing can increase analytical power, but still relies on building on the structure of previously acquired data. Data standards and data sharing allow the community to maximize the analytical power of high throughput genomics data. The landscape of incomplete...

  5. Hepatitis C virus genomic RNA dimerization is mediated via a kissing complex intermediate

    PubMed Central

    Shetty, Sumangala; Kim, Seungtaek; Shimakami, Tetsuro; Lemon, Stanley M.; Mihailescu, Mihaela-Rita

    2010-01-01

    With over 200 million people infected with hepatitis C virus (HCV) worldwide, there is a need for more effective and better-tolerated therapeutic strategies. The HCV genome is a positive-sense, single-stranded RNA encoding a large polyprotein cleaved at multiple sites to produce at least ten proteins, among them an error-prone RNA polymerase that confers a high mutation rate. Despite considerable overall sequence diversity, in the 3′-untranslated region of the HCV genomic RNA there is a 98-nucleotide (nt) sequence named X RNA, the first 55 nt of which (X55 RNA) are 100% conserved among all HCV strains. The X55 region has been suggested to be responsible for in vitro dimerization of the genomic RNA in the presence of the viral core protein, although the mechanism by which this occurs is unknown. In this study, we analyzed the X55 region and characterized the mechanism by which it mediates HCV genomic RNA dimerization. Similar to a mechanism proposed previously for the human immunodeficiency virus 1 (HIV-1) genome, we show that dimerization of the HCV genome involves formation of a kissing complex intermediate, which is converted to a more stable extended duplex conformation in the presence of the core protein. Mutations in the dimer linkage sequence loop sequence that prevent RNA dimerization in vitro significantly reduced but did not completely ablate the ability of HCV RNA to replicate or produce infectious virus in transfected cells. PMID:20360391

  6. Genome-wide computational analysis of potential long noncoding RNA mediated DNA:DNA:RNA triplexes in the human genome.

    PubMed

    Jalali, Saakshi; Singh, Amrita; Maiti, Souvik; Scaria, Vinod

    2017-09-02

    Only a handful of long noncoding RNAs have been functionally characterized. They are known to modulate regulation by interacting with other biomolecules in the cell: DNA, RNA and protein. Though there have been detailed investigations of lncRNA-miRNA and lncRNA-protein interactions, the interaction of lncRNAs with DNA has not been studied extensively. In the present study, we explore whether lncRNAs could modulate genomic regulation by interacting with DNA through the formation of highly stable DNA:DNA:RNA triplexes. We computationally screened 23,898 lncRNA transcripts, as annotated by GENCODE, across the human genome for potential triplex forming sequence stretches (PTS). The PTS frequencies were compared across 5'UTR, CDS, 3'UTR, introns, promoter and 1000 bases downstream of the transcription termination sites. These regions were annotated by mapping to experimental regulatory regions, classes of repeat regions and transcription factors. We validated a few putative triplex-mediated interactions, in which the lncRNA-gene pair interaction occurs via a pyrimidine triplex motif, using biophysical methods. We identified 2,004,034 PTS sites, which were enriched in promoter and intronic regions across the human genome. Additional analysis of the association of PTS with core promoter elements revealed a systematic paucity of PTS in all regulatory regions, except TF binding sites. A total of 25 transcription factors were found to be associated with PTS. Using an interaction network, we showed that a subset of the triplex forming lncRNAs have a positive association with gene promoters. We also demonstrated an in vitro interaction of one lncRNA candidate with its predicted gene target promoter regions. Our analysis shows that PTS are enriched in gene promoters and largely associated with simple repeats. The current study suggests a major role of a subset of lncRNAs in mediating chromatin organization modulation through CTCF and NSRF proteins.
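
    A computational screen of the kind described here can be pictured as a scan for uninterrupted pyrimidine stretches in a transcript. The abstract does not state the criteria the authors used, so the sketch below is only an illustration. The function name pyrimidine_runs, the default minimum length of 15 nt, and the requirement that a run be uninterrupted are all assumptions.

```python
import re
from typing import List, Tuple


def pyrimidine_runs(seq: str, min_len: int = 15) -> List[Tuple[int, int]]:
    """Return (start, end) spans of uninterrupted pyrimidine runs.

    Coordinates are 0-based and half-open. C, T and U count as pyrimidines.
    min_len is a hypothetical threshold, not one taken from the study.
    """
    pattern = re.compile(r"[CTU]{%d,}" % min_len)
    return [(m.start(), m.end()) for m in pattern.finditer(seq.upper())]


# Toy sequence with a 9-nt pyrimidine run starting at position 3.
print(pyrimidine_runs("AAGCTCTCTCTTGA", min_len=5))
```

    Actual triplex prediction tools also allow a limited number of interruptions and check the pairing rules against the DNA target. This sketch does neither.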

  7. An integrated computational pipeline and database to support whole-genome sequence annotation.

    PubMed

    Mungall, C J; Misra, S; Berman, B P; Carlson, J; Frise, E; Harris, N; Marshall, B; Shu, S; Kaminker, J S; Prochnik, S E; Smith, C D; Smith, E; Tupy, J L; Wiel, C; Rubin, G M; Lewis, S E

    2002-01-01

    We describe here our experience in annotating the Drosophila melanogaster genome sequence, in the course of which we developed several new open-source software tools and a database schema to support large-scale genome annotation. We have developed these into an integrated and reusable software system for whole-genome annotation. The key contributions to overall annotation quality are the marshalling of high-quality sequences for alignments and the design of a system with an adaptable and expandable flexible architecture.

  8. The three-dimensional genome organization of Drosophila melanogaster through data integration.

    PubMed

    Li, Qingjiao; Tjong, Harianto; Li, Xiao; Gong, Ke; Zhou, Xianghong Jasmine; Chiolo, Irene; Alber, Frank

    2017-07-31

    Genome structures are dynamic and non-randomly organized in the nucleus of higher eukaryotes. To maximize the accuracy and coverage of three-dimensional genome structural models, it is important to integrate all available sources of experimental information about a genome's organization. It remains a major challenge to integrate such data from various complementary experimental methods. Here, we present an approach for data integration to determine a population of complete three-dimensional genome structures that are statistically consistent with data from both genome-wide chromosome conformation capture (Hi-C) and lamina-DamID experiments. Our structures resolve the genome at the resolution of topological domains, and reproduce simultaneously both sets of experimental data. Importantly, this data deconvolution framework allows for structural heterogeneity between cells, and hence accounts for the expected plasticity of genome structures. As a case study we choose Drosophila melanogaster embryonic cells, for which both data types are available. Our three-dimensional genome structures have strong predictive power for structural features not directly visible in the initial data sets, and reproduce experimental hallmarks of the D. melanogaster genome organization from independent and our own imaging experiments. Also they reveal a number of new insights about genome organization and its functional relevance, including the preferred locations of heterochromatic satellites of different chromosomes, and observations about homologous pairing that cannot be directly observed in the original Hi-C or lamina-DamID data. Our approach allows systematic integration of Hi-C and lamina-DamID data for complete three-dimensional genome structure calculation, while also explicitly considering genome structural variability.

  9. Collaboration of MLLT1/ENL, Polycomb and ATM for transcription and genome integrity.

    PubMed

    Ui, Ayako; Yasui, Akira

    2016-04-25

    Polycomb group (PcG) proteins repress, whereas Trithorax group (TrxG) proteins activate, transcription for tissue development and cellular proliferation, and misregulation of these factors is often associated with cancer. ENL (MLLT1) and AF9 (MLLT3) are fusion partners of Mixed Lineage Leukemia (MLL), TrxG proteins, and are factors in Super Elongation Complex (SEC). SEC controls transcriptional elongation to release RNA polymerase II, paused around the transcription start site. In MLL rearranged leukemia, several components of SEC have been found as MLL-fusion partners and the control of transcriptional elongation is misregulated, leading to tumorigenesis in MLL-SEC fused leukemia. It has been suggested that unexpected collaboration of ENL/AF9-MLL and PcG is involved in tumorigenesis in leukemia. Recently, we found that the collaboration of ENL/AF9 and PcG led to a novel mechanism of transcriptional switch from elongation to repression under ATM-signaling for genome integrity. Activated ATM phosphorylates ENL/AF9 in SEC, and the phosphorylated ENL/AF9 binds BMI1 and RING1B, a heterodimeric E3-ubiquitin-ligase complex in Polycomb Repressive complex 1 (PRC1), and recruits PRC1 at transcriptional elongation sites to rapidly repress transcription. The ENL/AF9 in SEC- and PcG-mediated transcriptional repression promotes DSB repair near transcription sites. The implication of this is that the collaboration of ENL/AF9 in SEC and PcG ensures a rapid response of transcriptional switching from elongation to repression to neighboring genotoxic stresses for DSB repair. Therefore, these results suggested that the collaboration of ENL/AF9 and PcG in transcriptional control is required to maintain genome integrity and may be linked to the MLL-ENL/AF9 leukemia.

  10. Collaboration of MLLT1/ENL, Polycomb and ATM for transcription and genome integrity

    PubMed Central

    Ui, Ayako; Yasui, Akira

    2016-01-01

    SUMMARY Polycomb group (PcG) proteins repress, whereas Trithorax group (TrxG) proteins activate, transcription for tissue development and cellular proliferation, and misregulation of these factors is often associated with cancer. ENL (MLLT1) and AF9 (MLLT3) are fusion partners of Mixed Lineage Leukemia (MLL), TrxG proteins, and are factors in Super Elongation Complex (SEC). SEC controls transcriptional elongation to release RNA polymerase II, paused around the transcription start site. In MLL rearranged leukemia, several components of SEC have been found as MLL-fusion partners and the control of transcriptional elongation is misregulated, leading to tumorigenesis in MLL-SEC fused leukemia. It has been suggested that unexpected collaboration of ENL/AF9-MLL and PcG is involved in tumorigenesis in leukemia. Recently, we found that the collaboration of ENL/AF9 and PcG led to a novel mechanism of transcriptional switch from elongation to repression under ATM-signaling for genome integrity. Activated ATM phosphorylates ENL/AF9 in SEC, and the phosphorylated ENL/AF9 binds BMI1 and RING1B, a heterodimeric E3-ubiquitin-ligase complex in Polycomb Repressive complex 1 (PRC1), and recruits PRC1 at transcriptional elongation sites to rapidly repress transcription. The ENL/AF9 in SEC- and PcG-mediated transcriptional repression promotes DSB repair near transcription sites. The implication of this is that the collaboration of ENL/AF9 in SEC and PcG ensures a rapid response of transcriptional switching from elongation to repression to neighboring genotoxic stresses for DSB repair. Therefore, these results suggested that the collaboration of ENL/AF9 and PcG in transcriptional control is required to maintain genome integrity and may be linked to the MLL-ENL/AF9 leukemia. PMID:27310306

  11. Mifepristone increases gamma-retroviral infection efficiency by enhancing integration of virus into the genome of infected cells

    PubMed Central

    Solodushko, Victor; Fouty, Brian

    2010-01-01

    Gamma-retroviruses are commonly used to deliver genes to cells. Previously we demonstrated that the synthetic anti-glucocorticoid and anti-progestin agent, mifepristone, increased gamma-retroviral infection efficiency in different target cells, independent of viral titer. In this paper, we examine how this occurs. We studied the effect of mifepristone on different steps of viral infection (viral entry, viral survival, viral DNA synthesis and retrovirus integration into the host genome) in three distinct retroviral backbones using different virus recognition receptors. We also tested the potential role of glucocorticoid and progesterone receptors in mediating mifepristone’s ability to increase gamma-retroviral infectivity. We show that mifepristone increases gamma-retroviral infection efficiency by facilitating viral integration into the host genome and that this effect appears to be due to mifepristone’s anti-glucocorticoid, but not its anti-progestin, activity. These results suggest that inhibition of the glucocorticoid receptor enhances retroviral integration into the host genome and indicate that cells may have a natural protection against retroviral infection that may be reduced by glucocorticoid receptor antagonists. PMID:20485384

  12. The genomic landscape underlying phenotypic integrity in the face of gene flow in crows.

    PubMed

    Poelstra, J W; Vijay, N; Bossu, C M; Lantz, H; Ryll, B; Müller, I; Baglione, V; Unneberg, P; Wikelski, M; Grabherr, M G; Wolf, J B W

    2014-06-20

    The importance, extent, and mode of interspecific gene flow for the evolution of species has long been debated. Characterization of genomic differentiation in a classic example of hybridization between all-black carrion crows and gray-coated hooded crows identified genome-wide introgression extending far beyond the morphological hybrid zone. Gene expression divergence was concentrated in pigmentation genes expressed in gray versus black feather follicles. Only a small number of narrow genomic islands exhibited resistance to gene flow. One prominent genomic region (<2 megabases) harbored 81 of all 82 fixed differences (of 8.4 million single-nucleotide polymorphisms in total) linking genes involved in pigmentation and in visual perception, a genomic signal reflecting color-mediated prezygotic isolation. Thus, localized genomic selection can cause marked heterogeneity in introgression landscapes while maintaining phenotypic divergence. Copyright © 2014, American Association for the Advancement of Science.

  13. Integrated consensus map of cultivated peanut and wild relatives reveals structures of the A and B genomes of Arachis and divergence of the legume genomes.

    PubMed

    Shirasawa, Kenta; Bertioli, David J; Varshney, Rajeev K; Moretzsohn, Marcio C; Leal-Bertioli, Soraya C M; Thudi, Mahendar; Pandey, Manish K; Rami, Jean-Francois; Foncéka, Daniel; Gowda, Makanahally V C; Qin, Hongde; Guo, Baozhu; Hong, Yanbin; Liang, Xuanqiang; Hirakawa, Hideki; Tabata, Satoshi; Isobe, Sachiko

    2013-04-01

    The complex, tetraploid genome structure of peanut (Arachis hypogaea) has obstructed advances in genetics and genomics in the species. The aim of this study is to understand the genome structure of Arachis by developing a high-density integrated consensus map. Three recombinant inbred line populations derived from crosses between the A genome diploid species, Arachis duranensis and Arachis stenosperma; the B genome diploid species, Arachis ipaënsis and Arachis magna; and between the AB genome tetraploids, A. hypogaea and an artificial amphidiploid (A. ipaënsis × A. duranensis)(4×), were used to construct genetic linkage maps: 10 linkage groups (LGs) of 544 cM with 597 loci for the A genome; 10 LGs of 461 cM with 798 loci for the B genome; and 20 LGs of 1442 cM with 1469 loci for the AB genome. The resultant maps plus 13 published maps were integrated into a consensus map covering 2651 cM with 3693 marker loci which was anchored to 20 consensus LGs corresponding to the A and B genomes. The comparative genomics with genome sequences of Cajanus cajan, Glycine max, Lotus japonicus, and Medicago truncatula revealed that the Arachis genome has segmented synteny relationship to the other legumes. The comparative maps in legumes, integrated tetraploid consensus maps, and genome-specific diploid maps will increase the genetic and genomic understanding of Arachis and should facilitate molecular breeding.
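
    The map lengths and locus counts quoted above imply an average marker density for each map. The following is a rough sketch only, not a calculation taken from the paper: it divides total map length by locus count, which ignores how loci are distributed within linkage groups, and the names `maps` and `mean_spacing_cm` are illustrative.

```python
# Map length (cM) and number of loci, as quoted in the abstract above.
maps = {
    "A genome": (544, 597),
    "B genome": (461, 798),
    "AB genome": (1442, 1469),
    "consensus": (2651, 3693),
}


def mean_spacing_cm(length_cm, n_loci):
    """Rough average map distance per locus: total length / locus count."""
    return length_cm / n_loci


# Hypothetical summary table; not a figure reported by the authors.
spacing = {name: mean_spacing_cm(*values) for name, values in maps.items()}

for name, value in spacing.items():
    print(f"{name}: {value:.2f} cM per locus")
```

    Under this simplification the consensus map averages well under 1 cM per locus, which is consistent with the abstract's description of it as high-density.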

  14. Integrated Consensus Map of Cultivated Peanut and Wild Relatives Reveals Structures of the A and B Genomes of Arachis and Divergence of the Legume Genomes

    PubMed Central

    Shirasawa, Kenta; Bertioli, David J.; Varshney, Rajeev K.; Moretzsohn, Marcio C.; Leal-Bertioli, Soraya C. M.; Thudi, Mahendar; Pandey, Manish K.; Rami, Jean-Francois; Foncéka, Daniel; Gowda, Makanahally V. C.; Qin, Hongde; Guo, Baozhu; Hong, Yanbin; Liang, Xuanqiang; Hirakawa, Hideki; Tabata, Satoshi; Isobe, Sachiko

    2013-01-01

    The complex, tetraploid genome structure of peanut (Arachis hypogaea) has obstructed advances in genetics and genomics in the species. The aim of this study is to understand the genome structure of Arachis by developing a high-density integrated consensus map. Three recombinant inbred line populations derived from crosses between the A genome diploid species, Arachis duranensis and Arachis stenosperma; the B genome diploid species, Arachis ipaënsis and Arachis magna; and between the AB genome tetraploids, A. hypogaea and an artificial amphidiploid (A. ipaënsis × A. duranensis)4×, were used to construct genetic linkage maps: 10 linkage groups (LGs) of 544 cM with 597 loci for the A genome; 10 LGs of 461 cM with 798 loci for the B genome; and 20 LGs of 1442 cM with 1469 loci for the AB genome. The resultant maps plus 13 published maps were integrated into a consensus map covering 2651 cM with 3693 marker loci which was anchored to 20 consensus LGs corresponding to the A and B genomes. The comparative genomics with genome sequences of Cajanus cajan, Glycine max, Lotus japonicus, and Medicago truncatula revealed that the Arachis genome has segmented synteny relationship to the other legumes. The comparative maps in legumes, integrated tetraploid consensus maps, and genome-specific diploid maps will increase the genetic and genomic understanding of Arachis and should facilitate molecular breeding. PMID:23315685

  15. CRISPR/Cas9-Mediated Genome Editing of Mouse Small Intestinal Organoids.

    PubMed

    Schwank, Gerald; Clevers, Hans

    2016-01-01

    The CRISPR/Cas9 system is an RNA-guided genome-editing tool that has been recently developed based on the bacterial CRISPR-Cas immune defense system. Due to its versatility and simplicity, it rapidly became the method of choice for genome editing in various biological systems, including mammalian cells. Here we describe a protocol for CRISPR/Cas9-mediated genome editing in murine small intestinal organoids, a culture system in which somatic stem cells are maintained by self-renewal, while giving rise to all major cell types of the intestinal epithelium. This protocol allows the study of gene function in intestinal epithelial homeostasis and pathophysiology and can be extended to epithelial organoids derived from other internal mouse and human organs.

  16. A second generation integrated map of the rainbow trout (Oncorhynchus mykiss) genome: analysis of synteny with model fish genomes

    USDA-ARS's Scientific Manuscript database

    In this paper we generated DNA fingerprints and end sequences from bacterial artificial chromosomes (BACs) from two new libraries to improve the first generation integrated physical and genetic map of the rainbow trout (Oncorhynchus mykiss) genome. The current version of the physical map is compose...

  17. viruSITE—integrated database for viral genomics

    PubMed Central

    Stano, Matej; Beke, Gabor; Klucar, Lubos

    2016-01-01

    Viruses are the most abundant biological entities and the reservoir of most of the genetic diversity in the Earth's biosphere. Viral genomes are very diverse, generally short in length and compared to other organisms carry only few genes. viruSITE is a novel database which brings together high-value information compiled from various resources. viruSITE covers the whole universe of viruses and focuses on viral genomes, genes and proteins. The database contains information on virus taxonomy, host range, genome features, sequential relatedness as well as the properties and functions of viral genes and proteins. All entries in the database are linked to numerous information resources. The above-mentioned features make viruSITE a comprehensive knowledge hub in the field of viral genomics. The web interface of the database was designed so as to offer an easy-to-navigate, intuitive and user-friendly environment. It provides sophisticated text searching and a taxonomy-based browsing system. viruSITE also allows for an alternative approach based on sequence search. A proprietary genome browser generates a graphical representation of viral genomes. In addition to retrieving and visualising data, users can perform comparative genomics analyses using a variety of tools. Database URL: http://www.virusite.org/ PMID:28025349

  18. Multiplexed Targeted Genome Engineering Using a Universal Nuclease-Assisted Vector Integration System.

    PubMed

    Brown, Alexander; Woods, Wendy S; Perez-Pinera, Pablo

    2016-07-15

    Engineered nucleases are capable of efficiently modifying complex genomes through introduction of targeted double-strand breaks. However, mammalian genome engineering remains limited by low efficiency of heterologous DNA integration at target sites, which is typically performed through homologous recombination, a complex, ineffective and costly process. In this study, we developed a multiplexable and universal nuclease-assisted vector integration system for rapid generation of gene knock outs using selection that does not require customized targeting vectors, thereby minimizing the cost and time frame needed for gene editing. Importantly, this system is capable of remodeling native mammalian genomes through integration of DNA, up to 50 kb, enabling rapid generation and screening of multigene knockouts from a single transfection. These results support that nuclease assisted vector integration is a robust tool for genome-scale gene editing that will facilitate diverse applications in synthetic biology and gene therapy.

  19. Genetic and statistical study of HIV integration in the human genome

    NASA Astrophysics Data System (ADS)

    Sequeira, Inês J.; Gonçalves, Juliana; Moreira, Elsa; Mexia, João T.; Rueff, José; Brás, Aldina

    2013-10-01

    Integration of human immunodeficiency virus (HIV) DNA into the human genome is essential for HIV-induced disease. The human genome is organized into chromosomes, within which chromosomal fragile sites can be defined. Our aim is to help clarify the integration-site preferences of HIV1 and HIV2 for fragile or non-fragile regions. Here we apply statistical techniques, namely non-parametric tests and analysis of variance, to two data sets of HIV1 and HIV2 integrations in the human genome. The results show that integrations occur with significantly greater intensity in the non-fragile regions of the human genome, and that HIV1 in particular is the main contributor to this result. This study could have implications for human disease.

  20. GenomeVISTA—an integrated software package for whole-genome alignment and visualization

    PubMed Central

    Poliakov, Alexandre; Foong, Justin; Brudno, Michael; Dubchak, Inna

    2014-01-01

    Summary: With the ubiquitous generation of complete genome assemblies for a variety of species, efficient tools for whole-genome alignment along with user-friendly visualization are critically important. Our VISTA family of tools for comparative genomics, based on algorithms for pairwise and multiple alignments of genomic sequences and whole-genome assemblies, has become one of the standard techniques for comparative analysis. Most of the VISTA programs have been implemented as Web-accessible servers and are extensively used by the biomedical community. In this manuscript, we introduce GenomeVISTA: a novel implementation that incorporates most features of the VISTA family—fast and accurate alignment, visualization capabilities, GUI and analytical tools within a stand-alone software package. GenomeVISTA thus provides flexibility and security for users who need to conduct whole-genome comparisons on their own computers. Availability and implementation: Implemented in Perl, C/C++ and Java, the source code is freely available for download at the VISTA Web site: http://genome.lbl.gov/vista/ Contact: avpoliakov@lbl.gov or ildubchak@lbl.gov Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24860159

  1. GenomeVISTA--an integrated software package for whole-genome alignment and visualization.

    PubMed

    Poliakov, Alexandre; Foong, Justin; Brudno, Michael; Dubchak, Inna

    2014-09-15

    With the ubiquitous generation of complete genome assemblies for a variety of species, efficient tools for whole-genome alignment along with user-friendly visualization are critically important. Our VISTA family of tools for comparative genomics, based on algorithms for pairwise and multiple alignments of genomic sequences and whole-genome assemblies, has become one of the standard techniques for comparative analysis. Most of the VISTA programs have been implemented as Web-accessible servers and are extensively used by the biomedical community. In this manuscript, we introduce GenomeVISTA: a novel implementation that incorporates most features of the VISTA family--fast and accurate alignment, visualization capabilities, GUI and analytical tools within a stand-alone software package. GenomeVISTA thus provides flexibility and security for users who need to conduct whole-genome comparisons on their own computers. Implemented in Perl, C/C++ and Java, the source code is freely available for download at the VISTA Web site: http://genome.lbl.gov/vista/. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  2. Brassica ASTRA: an integrated database for Brassica genomic research.

    PubMed

    Love, Christopher G; Robinson, Andrew J; Lim, Geraldine A C; Hopkins, Clare J; Batley, Jacqueline; Barker, Gary; Spangenberg, German C; Edwards, David

    2005-01-01

    Brassica ASTRA is a public database for genomic information on Brassica species. The database incorporates expressed sequences with Swiss-Prot and GenBank comparative sequence annotation as well as secondary Gene Ontology (GO) annotation derived from the comparison with Arabidopsis TAIR GO annotations. Simple sequence repeat molecular markers are identified within resident sequences and mapped onto the closely related Arabidopsis genome sequence. Bacterial artificial chromosome (BAC) end sequences derived from the Multinational Brassica Genome Project are also mapped onto the Arabidopsis genome sequence enabling users to identify candidate Brassica BACs corresponding to syntenic regions of Arabidopsis. This information is maintained in a MySQL database with a web interface providing the primary means of interrogation. The database is accessible at http://hornbill.cspp.latrobe.edu.au.

  3. INTEGRATED GENOME-BASED STUDIES OF SHEWANELLA ECOPHYSIOLOGY

    SciTech Connect

    NEALSON, KENNETH H.

    2013-10-15

    products of dissimilatory iron reduction. Geochim. Cosmochim. Acta. 74:574-583. 10. Karpinets, T.V., A.Y. Obraztsova, Y. Wang, D.D. Schmoyer, G.H. Kora, B.H. Park, M.H. Serres, M.F. Romine, M.L. Land, T.B. Kothe, J.K. Fredrickson, K.H. Nealson, and E.C. Uberbacher. 2010. Conserved synteny at the protein family level reveals genes underlying Shewanella species' cold tolerance and predicts their novel phenotypes. Funct. Integr. Genomics 10:97-110. (DOI 10.1007/s10143-009-0142-y) 11. Bretschger, O., A.C.M. Cheung, F. Mansfeld, and K.H. Nealson. 2010. Comparative microbial fuel cell evaluations of Shewanella spp. Electroanalysis 22:883-894. 12. McLean, J.S., G. Wanger, Y.A. Gorby, M. Wainstein, J. McQuaid, Shun'ichi Ishii, O. Bretschger, H. Beyanal, K.H. Nealson. 2010. Quantification of electron transfer rates to a solid phase electron acceptor through the stages of biofilm formation from single cells to multicellular communities. Env. Sci. Technol. 44:2721-2717. 13. El-Naggar, M., G. Wanger, K.M. Leung, T.D. Yuzvinsky, G. Southam, J. Yang, W.M. Lau, K.H. Nealson, and Y.A. Gorby. 2010. Electrical Transport Along Bacterial Nanowires from Shewanella oneidensis MR-1. Proc. Nat. Acad. Sci. USA 107:18127-18131. 14. Biffinger, J.C., L.A. Fitzgerald, R. Ray, B.J. Little, S.E. Lizewski, E.R. Petersen, B.R. Ringeisen, W.C. Sanders, P.E. Sheehan, J.J. Pietron, J.W. Baldwin, L.J. Nadeau, G.R. Johnson, M. Ribbens, S.E. Finkel, K.H. Nealson. 2010. The utility of Shewanella japonica for microbial fuel cells. Bioresource Technol. 102:290-297. 15. Rodionov, D., C. Yang, X. Li, I. Rodionova, Y. Wang, A.Y. Obraztsova, O.P. Zagnitko, R. Overbeek, M.F. Romine, S. Reed, J.K. Fredrickson, K.H. Nealson, A.L. Osterman. 2010. Genomic encyclopedia of sugar utilization pathways in the Shewanella genus. BMC Genomics 11:494. 16. Kan, J., L. Hsu, A.C.M. Cheung, M. Pirbazari, and K.H. Nealson. 2011. Current production by bacterial communities in microbial fuel cells enriched from wastewater sludge

  4. VESPA: Software to Facilitate Genomic Annotation of Prokaryotic Organisms Through Integration of Proteomic and Transcriptomic Data

    SciTech Connect

    Peterson, Elena S.; McCue, Lee Ann; Rutledge, Alexandra C.; Jensen, Jeffrey L.; Walker, Julia; Kobold, Mark A.; Webb, Samantha R.; Payne, Samuel H.; Ansong, Charles; Adkins, Joshua N.; Cannon, William R.; Webb-Robertson, Bobbie-Jo M.

    2012-04-25

    Visual Exploration and Statistics to Promote Annotation (VESPA) is an interactive visual analysis software tool that facilitates the discovery of structural mis-annotations in prokaryotic genomes. VESPA integrates high-throughput peptide-centric proteomics data and oligo-centric or RNA-Seq transcriptomics data into a genomic context. The data may be interrogated via visual analysis across multiple levels of genomic resolution, linked searches, exports and interaction with BLAST to rapidly identify location of interest within the genome and evaluate potential mis-annotations.

  5. Integration with the human genome of peptide sequences obtained by high-throughput mass spectrometry

    PubMed Central

    Desiere, Frank; Deutsch, Eric W; Nesvizhskii, Alexey I; Mallick, Parag; King, Nichole L; Eng, Jimmy K; Aderem, Alan; Boyle, Rose; Brunner, Erich; Donohoe, Samuel; Fausto, Nelson; Hafen, Ernst; Hood, Lee; Katze, Michael G; Kennedy, Kathleen A; Kregenow, Floyd; Lee, Hookeun; Lin, Biaoyang; Martin, Dan; Ranish, Jeffrey A; Rawlings, David J; Samelson, Lawrence E; Shiio, Yuzuru; Watts, Julian D; Wollscheid, Bernd; Wright, Michael E; Yan, Wei; Yang, Lihong; Yi, Eugene C; Zhang, Hui; Aebersold, Ruedi

    2005-01-01

    A crucial aim upon the completion of the human genome is the verification and functional annotation of all predicted genes and their protein products. Here we describe the mapping of peptides derived from accurate interpretations of protein tandem mass spectrometry (MS) data to eukaryotic genomes and the generation of an expandable resource for integration of data from many diverse proteomics experiments. Furthermore, we demonstrate that peptide identifications obtained from high-throughput proteomics can be integrated on a large scale with the human genome. This resource could serve as an expandable repository for MS-derived proteome information. PMID:15642101

  6. Approaches to integrating germline and tumor genomic data in cancer research

    PubMed Central

    Feigelson, Heather Spencer; Goddard, Katrina A.B.; Hollombe, Celine; Tingle, Sharna R.; Gillanders, Elizabeth M.; Mechanic, Leah E.; Nelson, Stefanie A.

    2014-01-01

    Cancer is characterized by a diversity of genetic and epigenetic alterations occurring in both the germline and somatic (tumor) genomes. Hundreds of germline variants associated with cancer risk have been identified, and large amounts of data identifying mutations in the tumor genome that participate in tumorigenesis have been generated. Increasingly, these two genomes are being explored jointly to better understand how cancer risk alleles contribute to carcinogenesis and whether they influence development of specific tumor types or mutation profiles. To understand how data from germline risk studies and tumor genome profiling are being integrated, we reviewed 160 articles describing research that incorporated data from both genomes, published between January 2009 and December 2012, and summarized the current state of the field. We identified three principal types of research questions being addressed using these data: (i) use of tumor data to determine the putative function of germline risk variants; (ii) identification and analysis of relationships between host genetic background and particular tumor mutations or types; and (iii) use of tumor molecular profiling data to reduce genetic heterogeneity or refine phenotypes for germline association studies. We also found descriptive studies that compared germline and tumor genomic variation in a gene or gene family, and papers describing research methods, data sources, or analytical tools. We identified a large set of tools and data resources that can be used to analyze and integrate data from both genomes. Finally, we discuss opportunities and challenges for cancer research that integrates germline and tumor genomics data. PMID:25115441

  7. Accuracy and efficiency define Bxb1 integrase as the best of fifteen candidate serine recombinases for the integration of DNA into the human genome

    PubMed Central

    2013-01-01

    Background: Phage-encoded serine integrases, such as φC31 integrase, are widely used for genome engineering. Fifteen such integrases have been described but their utility for genome engineering has not been compared in uniform assays. Results: We have compared fifteen serine integrases for their utility for DNA manipulations in mammalian cells after first demonstrating that all were functional in E. coli. Chromosomal recombination reporters were used to show that seven integrases were active on chromosomally integrated DNA in human fibroblasts and mouse embryonic stem cells. Five of the remaining eight enzymes were active on extra-chromosomal substrates, thereby demonstrating that the ability to mediate extra-chromosomal recombination is no guide to the ability to mediate site-specific recombination on integrated DNA. All the integrases that were active on integrated DNA also promoted DNA integration reactions that were not mediated through conservative site-specific recombination, or damaged the recombination sites, but the extent of these aberrant reactions varied over at least an order of magnitude. Bxb1 integrase yielded approximately two-fold more recombinants and displayed about two-fold less damage to the recombination sites than the next best recombinase, φC31 integrase. Conclusions: We conclude that the Bxb1 and φC31 integrases are the reagents of choice for genome engineering in vertebrate cells and that DNA damage repair is a major limitation upon the utility of this class of site-specific recombinase. PMID:24139482

  8. Mapping the telomere integrated genome of human herpesvirus 6A and 6B.

    PubMed

    Arbuckle, Jesse H; Pantry, Shara N; Medveczky, Maria M; Prichett, Joshua; Loomis, Kristin S; Ablashi, Dharam; Medveczky, Peter G

    2013-07-20

    Human herpesvirus 6B (HHV-6B) is the causative agent of roseola infantum. HHV-6A and 6B can reactivate in immunosuppressed individuals and are linked with severe inflammatory response, organ rejection and central nervous system diseases. About 0.85% of the US and UK population carries an integrated HHV-6 genome in all nucleated cells through germline transmission. We have previously reported that the HHV-6A genome integrated in telomeres of patients suffering from neurological dysfunction and also in telomeres of tissue culture cells. We now report that HHV-6B also integrates in telomeres during latency. Detailed mapping of the integrated viral genomes demonstrates that a single HHV-6 genome integrates and telomere repeats join the left end of the integrated viral genome. When HEK-293 cells carrying integrated HHV-6A were exposed to the histone deacetylase inhibitor Trichostatin A, circularization and/or formation of concatamers were detected and this assay could be used to distinguish between lytic replication and latency.

  9. INDIGO - INtegrated data warehouse of microbial genomes with examples from the red sea extremophiles.

    PubMed

    Alam, Intikhab; Antunes, André; Kamau, Allan Anthony; Ba Alawi, Wail; Kalkatawi, Manal; Stingl, Ulrich; Bajic, Vladimir B

    2013-01-01

    The next generation sequencing technologies substantially increased the throughput of microbial genome sequencing. To functionally annotate newly sequenced microbial genomes, a variety of experimental and computational methods are used. Integration of information from different sources is a powerful approach to enhance such annotation. Functional analysis of microbial genomes, necessary for downstream experiments, crucially depends on this annotation but it is hampered by the current lack of suitable information integration and exploration systems for microbial genomes. We developed a data warehouse system (INDIGO) that enables the integration of annotations for exploration and analysis of newly sequenced microbial genomes. INDIGO offers an opportunity to construct complex queries and combine annotations from multiple sources starting from genomic sequence to protein domain, gene ontology and pathway levels. This data warehouse is aimed at being populated with information from genomes of pure cultures and uncultured single cells of Red Sea bacteria and Archaea. Currently, INDIGO contains information from Salinisphaera shabanensis, Haloplasma contractile, and Halorhabdus tiamatea - extremophiles isolated from deep-sea anoxic brine lakes of the Red Sea. We provide examples of utilizing the system to gain new insights into specific aspects on the unique lifestyle and adaptations of these organisms to extreme environments. We developed a data warehouse system, INDIGO, which enables comprehensive integration of information from various resources to be used for annotation, exploration and analysis of microbial genomes. It will be regularly updated and extended with new genomes. It is aimed to serve as a resource dedicated to the Red Sea microbes. In addition, through INDIGO, we provide our Automatic Annotation of Microbial Genomes (AAMG) pipeline. The INDIGO web server is freely available at http://www.cbrc.kaust.edu.sa/indigo.

  10. Integrated consensus map of cultivated peanut and wild relatives reveals structures of the A and B genomes of Arachis and divergence of the legume genomes

    USDA-ARS's Scientific Manuscript database

    The complex, tetraploid genome structure of peanut (Arachis hypogaea) has obstructed advances in genetics and genomics in the species. The aim of this study is to understand the genome structure of Arachis by developing a high-density integrated consensus map. Three recombinant inbred line populatio...

  11. Integration of Disease Specific Clinical and Genomics Datasets using I2B2 Framework.

    PubMed

    2015-01-01

    The availability of a patient's genomic profile along with the clinical profile for providing individualized care and treatment is paving the road for a new era of personalized medicine, and is an important area of focus in current biomedical research. One of the prominent and globally implemented solutions for clinical and genomics data integration in biomedical research is an NIH-funded NCBC initiative, Informatics for Integrating Biology and the Bedside (I2B2). This paper presents the development of a pilot prototype for integrating patients' clinical and genomics datasets using the open-source and scalable I2B2 framework. It focuses on disease-specific clinical data and genomic variants which, when combined, can be used for informed decision making in clinical practice by healthcare professionals and for further investigations by biomedical researchers. The research was carried out using a case study of King Abdullah International Medical Research Center (KAIMRC) in collaboration with King Fahad National Guard Hospital (KFNGH).

  12. Ku-mediated coupling of DNA cleavage and repair during programmed genome rearrangements in the ciliate Paramecium tetraurelia.

    PubMed

    Marmignon, Antoine; Bischerour, Julien; Silve, Aude; Fojcik, Clémentine; Dubois, Emeline; Arnaiz, Olivier; Kapusta, Aurélie; Malinsky, Sophie; Bétermier, Mireille

    2014-08-01

    During somatic differentiation, physiological DNA double-strand breaks (DSB) can drive programmed genome rearrangements (PGR), during which DSB repair pathways are mobilized to safeguard genome integrity. Because of their unique nuclear dimorphism, ciliates are powerful unicellular eukaryotic models to study the mechanisms involved in PGR. At each sexual cycle, the germline nucleus is transmitted to the progeny, but the somatic nucleus, essential for gene expression, is destroyed and a new somatic nucleus differentiates from a copy of the germline nucleus. In Paramecium tetraurelia, the development of the somatic nucleus involves massive PGR, including the precise elimination of at least 45,000 germline sequences (Internal Eliminated Sequences, IES). IES excision proceeds through a cut-and-close mechanism: a domesticated transposase, PiggyMac, is essential for DNA cleavage, and DSB repair at excision sites involves the Ligase IV, a specific component of the non-homologous end-joining (NHEJ) pathway. At the genome-wide level, a huge number of programmed DSBs must be repaired during this process to allow the assembly of functional somatic chromosomes. To understand how DNA cleavage and DSB repair are coordinated during PGR, we have focused on Ku, the earliest actor of NHEJ-mediated repair. Two Ku70 and three Ku80 paralogs are encoded in the genome of P. tetraurelia: Ku70a and Ku80c are produced during sexual processes and localize specifically in the developing new somatic nucleus. Using RNA interference, we show that the development-specific Ku70/Ku80c heterodimer is essential for the recovery of a functional somatic nucleus. Strikingly, at the molecular level, PiggyMac-dependent DNA cleavage is abolished at IES boundaries in cells depleted for Ku80c, resulting in IES retention in the somatic genome. PiggyMac and Ku70a/Ku80c co-purify as a complex when overproduced in a heterologous system. We conclude that Ku has been integrated in the Paramecium DNA cleavage

  13. Ku-Mediated Coupling of DNA Cleavage and Repair during Programmed Genome Rearrangements in the Ciliate Paramecium tetraurelia

    PubMed Central

    Marmignon, Antoine; Bischerour, Julien; Silve, Aude; Fojcik, Clémentine; Dubois, Emeline; Arnaiz, Olivier; Kapusta, Aurélie; Malinsky, Sophie; Bétermier, Mireille

    2014-01-01

    During somatic differentiation, physiological DNA double-strand breaks (DSB) can drive programmed genome rearrangements (PGR), during which DSB repair pathways are mobilized to safeguard genome integrity. Because of their unique nuclear dimorphism, ciliates are powerful unicellular eukaryotic models to study the mechanisms involved in PGR. At each sexual cycle, the germline nucleus is transmitted to the progeny, but the somatic nucleus, essential for gene expression, is destroyed and a new somatic nucleus differentiates from a copy of the germline nucleus. In Paramecium tetraurelia, the development of the somatic nucleus involves massive PGR, including the precise elimination of at least 45,000 germline sequences (Internal Eliminated Sequences, IES). IES excision proceeds through a cut-and-close mechanism: a domesticated transposase, PiggyMac, is essential for DNA cleavage, and DSB repair at excision sites involves the Ligase IV, a specific component of the non-homologous end-joining (NHEJ) pathway. At the genome-wide level, a huge number of programmed DSBs must be repaired during this process to allow the assembly of functional somatic chromosomes. To understand how DNA cleavage and DSB repair are coordinated during PGR, we have focused on Ku, the earliest actor of NHEJ-mediated repair. Two Ku70 and three Ku80 paralogs are encoded in the genome of P. tetraurelia: Ku70a and Ku80c are produced during sexual processes and localize specifically in the developing new somatic nucleus. Using RNA interference, we show that the development-specific Ku70/Ku80c heterodimer is essential for the recovery of a functional somatic nucleus. Strikingly, at the molecular level, PiggyMac-dependent DNA cleavage is abolished at IES boundaries in cells depleted for Ku80c, resulting in IES retention in the somatic genome. PiggyMac and Ku70a/Ku80c co-purify as a complex when overproduced in a heterologous system. We conclude that Ku has been integrated in the Paramecium DNA cleavage

  14. Modeling the integration of bacterial rRNA fragments into the human cancer genome.

    PubMed

    Sieber, Karsten B; Gajer, Pawel; Dunning Hotopp, Julie C

    2016-03-21

    Cancer is a disease driven by the accumulation of genomic alterations, including the integration of exogenous DNA into the human somatic genome. We previously identified in silico evidence of DNA fragments from a Pseudomonas-like bacterium integrating into the 5'-UTR of four proto-oncogenes in stomach cancer sequencing data. The functional and biological consequences of these bacterial DNA integrations remain unknown. Modeling of these integrations suggests that the previously identified sequences cover most of the sequence flanking the junction between the bacterial and human DNA. Further examination of these reads reveals that these integrations are rich in guanine nucleotides and the integrated bacterial DNA may have complex transcript secondary structures. The models presented here lay the foundation for future experiments to test if bacterial DNA integrations alter the transcription of the human genes.

  15. An Integrated Encyclopedia of DNA Elements in the Human Genome

    PubMed Central

    2012-01-01

    Summary: The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure, and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall the project provides new insights into the organization and regulation of our genes and genome, and an expansive resource of functional annotations for biomedical research. PMID:22955616

  16. An integrated encyclopedia of DNA elements in the human genome.

    PubMed

    2012-09-06

    The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

  17. An Integrated Review of Emoticons in Computer-Mediated Communication

    PubMed Central

    Aldunate, Nerea; González-Ibáñez, Roberto

    2017-01-01

    Facial expressions constitute a rich source of non-verbal cues in face-to-face communication. They provide interlocutors with resources to express and interpret verbal messages, which may affect their cognitive and emotional processing. Contrarily, computer-mediated communication (CMC), particularly text-based communication, is limited to the use of symbols to convey a message, where facial expressions cannot be transmitted naturally. In this scenario, people use emoticons as paralinguistic cues to convey emotional meaning. Research has shown that emoticons contribute to a greater social presence as a result of the enrichment of text-based communication channels. Additionally, emoticons constitute a valuable resource for language comprehension by providing expressivity to text messages. The latter findings have been supported by studies in neuroscience showing that particular brain regions involved in emotional processing are also activated when people are exposed to emoticons. To reach an integrated understanding of the influence of emoticons in human communication on both socio-cognitive and neural levels, we review the literature on emoticons in three different areas. First, we present relevant literature on emoticons in CMC. Second, we study the influence of emoticons in language comprehension. Finally, we show the incipient research in neuroscience on this topic. This mini review reveals that, while there are plenty of studies on the influence of emoticons in communication from a social psychology perspective, little is known about the neurocognitive basis of the effects of emoticons on communication dynamics. PMID:28111564

  18. An Integrated Review of Emoticons in Computer-Mediated Communication.

    PubMed

    Aldunate, Nerea; González-Ibáñez, Roberto

    2016-01-01

    Facial expressions constitute a rich source of non-verbal cues in face-to-face communication. They provide interlocutors with resources to express and interpret verbal messages, which may affect their cognitive and emotional processing. Contrarily, computer-mediated communication (CMC), particularly text-based communication, is limited to the use of symbols to convey a message, where facial expressions cannot be transmitted naturally. In this scenario, people use emoticons as paralinguistic cues to convey emotional meaning. Research has shown that emoticons contribute to a greater social presence as a result of the enrichment of text-based communication channels. Additionally, emoticons constitute a valuable resource for language comprehension by providing expressivity to text messages. The latter findings have been supported by studies in neuroscience showing that particular brain regions involved in emotional processing are also activated when people are exposed to emoticons. To reach an integrated understanding of the influence of emoticons in human communication on both socio-cognitive and neural levels, we review the literature on emoticons in three different areas. First, we present relevant literature on emoticons in CMC. Second, we study the influence of emoticons in language comprehension. Finally, we show the incipient research in neuroscience on this topic. This mini review reveals that, while there are plenty of studies on the influence of emoticons in communication from a social psychology perspective, little is known about the neurocognitive basis of the effects of emoticons on communication dynamics.

  19. Integrating Genomes, Brain and Behavior in the Study of Songbirds

    PubMed Central

    Clayton, David F.; Balakrishnan, Christopher N.; London, Sarah E.

    2010-01-01

    Songbirds share some essential traits but are extraordinarily diverse, allowing comparative analyses aimed at identifying specific genotype–phenotype associations. This diversity encompasses traits like vocal communication and complex social behaviors that are of great interest to humans, but that are not well represented in other accessible research organisms. Many songbirds are readily observable in nature and thus afford unique insight into the links between environment and organism. The distinctive organization of the songbird brain will facilitate analysis of genomic links to brain and behavior. Access to the zebra finch genome sequence will, therefore, prompt new questions and provide the ability to answer those questions. PMID:19788884

  20. CTCF-Mediated Human 3D Genome Architecture Reveals Chromatin Topology for Transcription

    PubMed Central

    Tang, Zhonghui; Luo, Oscar Junhong; Li, Xingwang; Zheng, Meizhen; Zhu, Jacqueline Jufen; Szalaj, Przemyslaw; Trzaskoma, Pawel; Magalska, Adriana; Wlodarczyk, Jakub; Ruszczycki, Blazej; Michalski, Paul; Piecuch, Emaly; Wang, Ping; Wang, Danjuan; Tian, Simon Zhongyuan; Penrad-Mobayed, May; Sachs, Laurent M.; Ruan, Xiaoan; Wei, Chia-Lin; Liu, Edison T.; Wilczynski, Grzegorz M.; Plewczynski, Dariusz; Li, Guoliang; Ruan, Yijun

    2015-01-01

    Summary: Spatial genome organization and its effect on transcription remain a fundamental question. We applied an advanced ChIA-PET strategy to comprehensively map higher-order chromosome folding and specific chromatin interactions mediated by CTCF and RNAPII with haplotype specificity and nucleotide resolution in different human cell lineages. We find that CTCF/cohesin-mediated interaction anchors serve as structural foci for spatial organization of constitutive genes concordant with CTCF-motif orientation, whereas RNAPII interacts within these structures by selectively drawing cell-type-specific genes towards CTCF-foci for coordinated transcription. Furthermore, we show that haplotype-variants and allelic-interactions have differential effects on chromosome configuration influencing gene expression and may provide mechanistic insights into functions associated with disease susceptibility. 3D-genome simulation suggests a model of chromatin folding around chromosomal axes, where CTCF is involved in defining the interface between condensed and open compartments for structural regulation. Our 3D-genome strategy thus provides unique insights into the topological mechanism of human variations and diseases. PMID:26686651

  1. Genomic regions responsible for amenability to Agrobacterium-mediated transformation in barley

    PubMed Central

    Hisano, Hiroshi; Sato, Kazuhiro

    2016-01-01

    Different plant cultivars of the same genus and species can exhibit vastly different genetic transformation efficiencies. However, the genetic factors underlying these differences in transformation rate remain largely unknown. In barley, ‘Golden Promise’ is one of a few cultivars reliable for Agrobacterium-mediated transformation. By contrast, cultivar ‘Haruna Nijo’ is recalcitrant to genetic transformation. We identified genomic regions of barley important for successful transformation with Agrobacterium, utilizing the ‘Haruna Nijo’ × ‘Golden Promise’ F2 generation and genotyping by 124 genome-wide SNP markers. We observed significant segregation distortions of these markers from the expected 1:2:1 ratio toward the ‘Golden Promise’-type in regions of chromosomes 2H and 3H, indicating that the alleles of ‘Golden Promise’ in these regions might contribute to transformation efficiency. The same regions, which we termed Transformation Amenability (TFA) regions, were also conserved in transgenic F2 plants generated from a ‘Morex’ × ‘Golden Promise’ cross. The genomic regions identified herein likely include necessary factors for Agrobacterium-mediated transformation in barley. The potential to introduce these loci into any haplotype of barley opens the door to increasing the efficiency of transformation for target alleles into any haplotype of barley by the TFA-based methods proposed in this report. PMID:27874056
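
    The segregation-distortion test described in this abstract can be illustrated with a chi-square goodness-of-fit check against the expected 1:2:1 F2 ratio. The sketch below is only an illustration under stated assumptions: the function name and the genotype counts are hypothetical and are not taken from the study, and the authors' actual statistical procedure is not described in the abstract. With three genotype classes there are 2 degrees of freedom, for which the chi-square tail probability is exactly exp(-x/2).

```python
import math


def chi_square_121(n_parent_a, n_het, n_parent_b):
    """Chi-square goodness-of-fit test of F2 genotype counts against 1:2:1.

    Returns (statistic, p_value). Hypothetical helper for illustration only.
    """
    observed = (n_parent_a, n_het, n_parent_b)
    total = sum(observed)
    # Expected counts under Mendelian 1:2:1 segregation.
    expected = (total * 0.25, total * 0.50, total * 0.25)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 2 degrees of freedom the chi-square survival function is exp(-x/2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical marker: 10 'Haruna Nijo'-type, 50 heterozygous, 40 'Golden Promise'-type.
stat, p = chi_square_121(10, 50, 40)
print(f"chi2 = {stat:.2f}, p = {p:.2e}")
```

    A small p-value at a marker, together with an excess of one parental genotype, is the kind of signal the abstract refers to as distortion toward the 'Golden Promise' type.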

  2. Genome-wide signatures of male-mediated migration shaping the Indian gene pool.

    PubMed

    ArunKumar, GaneshPrasad; Tatarinova, Tatiana V; Duty, Jeff; Rollo, Debra; Syama, Adhikarla; Arun, Varatharajan Santhakumari; Kavitha, Valampuri John; Triska, Petr; Greenspan, Bennett; Wells, R Spencer; Pitchappan, Ramasamy

    2015-09-01

    Multiple questions relating to contributions of cultural and demographical factors in the process of human geographical dispersal remain largely unanswered. India, a land of early human settlement and the resulting diversity, is a good place to look for some of the answers. In this study, we explored the genetic structure of India using a diverse panel of 78 males genotyped using the GenoChip. Their genome-wide single-nucleotide polymorphism (SNP) diversity was examined in the context of various covariates that influence the Indian gene pool. Admixture analysis of genome-wide SNP data showed a high proportion of the Southwest Asian component in all of the Indian samples. Hierarchical clustering based on admixture proportions revealed seven distinct clusters correlating to geographical and linguistic affiliations. Convex hull overlay of Y-chromosomal haplogroups on the genome-wide SNP principal component analysis brought out distinct non-overlapping polygons of F*-M89, H*-M69, L1-M27, O2a-M95 and O3a3c1-M117, suggesting a male-mediated migration and expansion of the Indian gene pool. Lack of similar correlation with mitochondrial DNA clades indicated a shared genetic ancestry of females. We suggest that ancient male-mediated migratory events and settlement in various regional niches led to the present day scenario and peopling of India.

  3. Integrated genome-based studies of Shewanella ecophysiology

    SciTech Connect

    Segre Daniel; Beg Qasim

    2012-02-14

    This project was a component of the Shewanella Federation and, as such, contributed to the overall goal of applying the genomic tools to better understand eco-physiology and speciation of respiratory-versatile members of Shewanella genus. Our role at Boston University was to perform bioreactor and high throughput gene expression microarrays, and combine dynamic flux balance modeling with experimentally obtained transcriptional and gene expression datasets from different growth conditions. In the first part of the project, we designed the S. oneidensis microarray probes for Affymetrix Inc. (based in California), then we identified the pathways of carbon utilization in the metal-reducing marine bacterium Shewanella oneidensis MR-1, using our newly designed high-density oligonucleotide Affymetrix microarray on Shewanella cells grown with various carbon sources. Next, using a combination of experimental and computational approaches, we built algorithms and methods to integrate the transcriptional and metabolic regulatory networks of S. oneidensis. Specifically, we combined mRNA microarray and metabolite measurements with statistical inference and dynamic flux balance analysis (dFBA) to study the transcriptional response of S. oneidensis MR-1 as it passes through exponential, stationary, and transition phases. By measuring time-dependent mRNA expression levels during batch growth of S. oneidensis MR-1 under two radically different nutrient compositions (minimal lactate and nutritionally rich LB medium), we obtain detailed snapshots of the regulatory strategies used by this bacterium to cope with gradually changing nutrient availability. In addition to traditional clustering, which provides a first indication of major regulatory trends and transcription factor activities, we developed and implemented a new computational approach for Dynamic Detection of Transcriptional Triggers (D2T2). This new method allows us to infer a putative topology of transcriptional dependencies

  4. ADP-ribosylation is involved in the integration of foreign DNA into the mammalian cell genome.

    PubMed Central

    Farzaneh, F; Panayotou, G N; Bowler, L D; Hardas, B D; Broom, T; Walther, C; Shall, S

    1988-01-01

    The most commonly used DNA transfection method, which employs the calcium phosphate co-precipitation of the donor DNA, involves several discrete steps (1,2). These include the uptake of the donor DNA by the recipient cells, the transport of the DNA to the nucleus, transient expression prior to integration into the host cell genome, concatenation and integration of the transfected DNA into the host cell genome and finally the stable expression of the integrated genes (2,3). Both the concatenation and the integration of the donor DNA into the host genome involve the formation and ligation of DNA strand-breaks. In the present study we demonstrate that the nuclear enzyme, adenosine diphosphoribosyl transferase (ADPRT, E.C. 2.4.2.30), which is dependent on the presence of DNA strand breaks for its activity (4,5) and necessary for the efficient ligation of DNA strand-breaks in eukaryotic cells (4,6), is required for the integration of donor DNA into the host genome. However, ADPRT activity does not influence the uptake of DNA into the cell, its episomal maintenance or replication, nor its expression either before or after integration into the host genome. These observations strongly suggest the involvement of ADPRT activity in eukaryotic DNA recombination events. PMID:3144706

  5. Genomic landscape of human, bat, and ex vivo DNA transposon integrations.

    PubMed

    Campos-Sánchez, Rebeca; Kapusta, Aurélie; Feschotte, Cédric; Chiaromonte, Francesca; Makova, Kateryna D

    2014-07-01

    The integration and fixation preferences of DNA transposons, one of the major classes of eukaryotic transposable elements, have never been evaluated comprehensively on a genome-wide scale. Here, we present a detailed study of the distribution of DNA transposons in the human and bat genomes. We studied three groups of DNA transposons that integrated at different evolutionary times: 1) ancient (>40 My) and currently inactive human elements, 2) younger (<40 My) bat elements, and 3) ex vivo integrations of piggyBat and Sleeping Beauty elements in HeLa cells. Although the distribution of ex vivo elements reflected integration preferences, the distribution of human and (to a lesser extent) bat elements was also affected by selection. We used regression techniques (linear, negative binomial, and logistic regression models with multiple predictors) applied to 20-kb and 1-Mb windows to investigate how the genomic landscape in the vicinity of DNA transposons contributes to their integration and fixation. Our models indicate that genomic landscape explains 16-79% of variability in DNA transposon genome-wide distribution. Importantly, we not only confirmed previously identified predictors (e.g., DNA conformation and recombination hotspots) but also identified several novel predictors (e.g., signatures of double-strand breaks and telomere hexamer). Ex vivo integrations showed a bias toward actively transcribed regions. Older DNA transposons were located in genomic regions scarce in most conserved elements-likely reflecting purifying selection. Our study highlights how DNA transposons are integral to the evolution of bat and human genomes, and has implications for the development of DNA transposon assays for gene therapy and mutagenesis applications.
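
    The window-based regressions mentioned in this abstract need, as a first step, a count of integration sites per genomic window. The sketch below shows only that binning step for fixed, non-overlapping windows. It is an assumption-laden illustration: the function name, the 0-based coordinates, and the example positions are hypothetical, and the abstract does not state how the authors actually built their windows.

```python
from collections import Counter


def count_per_window(positions, window_size=20_000):
    """Count integration sites per fixed-size, non-overlapping window.

    positions: iterable of 0-based coordinates on a single chromosome.
    Returns a dict mapping window index to the number of sites in it.
    """
    counts = Counter(pos // window_size for pos in positions)
    return dict(counts)


# Hypothetical integration coordinates on one chromosome.
sites = [100, 19_999, 20_000, 45_000]
print(count_per_window(sites))
```

    The resulting per-window counts could then serve as the response variable in a count model such as the negative binomial regression the abstract names, with landscape features of each window as predictors.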

  6. MAR-mediated integration of plasmid vectors for in vivo gene transfer and regulation.

    PubMed

    Puttini, Stefania; van Zwieten, Ruthger W; Saugy, Damien; Lekka, Małgorzata; Hogger, Florence; Ley, Deborah; Kulik, Andrzej J; Mermod, Nicolas

    2013-12-02

    The in vivo transfer of naked plasmid DNA into organs such as muscles is commonly used to assess the expression of prophylactic or therapeutic genes in animal disease models. In this study, we devised vectors allowing a tight regulation of transgene expression in mice from such non-viral vectors using a doxycycline-controlled network of activator and repressor proteins. Using these vectors, we demonstrate a proper physiological response as a consequence of the induced expression of two therapeutically relevant proteins, namely erythropoietin and utrophin. Kinetic studies showed that the induction of transgene expression was only transient, unless epigenetic regulatory elements termed Matrix Attachment Regions, or MAR, were inserted upstream of the regulated promoters. Using episomal plasmid rescue and quantitative PCR assays, we observed that similar amounts of plasmids remained in muscles after electrotransfer with or without MAR elements, but that a significant portion had integrated into the muscle fiber chromosomes. Interestingly, the MAR elements were found to promote plasmid genomic integration but to oppose silencing effects in vivo, thereby mediating long-term expression. This study thus elucidates some of the determinants of transient or sustained expression from the use of non-viral regulated vectors in vivo.

  7. MAR-mediated integration of plasmid vectors for in vivo gene transfer and regulation

    PubMed Central

    2013-01-01

    Background: The in vivo transfer of naked plasmid DNA into organs such as muscles is commonly used to assess the expression of prophylactic or therapeutic genes in animal disease models. Results: In this study, we devised vectors allowing a tight regulation of transgene expression in mice from such non-viral vectors using a doxycycline-controlled network of activator and repressor proteins. Using these vectors, we demonstrate a proper physiological response as a consequence of the induced expression of two therapeutically relevant proteins, namely erythropoietin and utrophin. Kinetic studies showed that the induction of transgene expression was only transient, unless epigenetic regulatory elements termed Matrix Attachment Regions, or MAR, were inserted upstream of the regulated promoters. Using episomal plasmid rescue and quantitative PCR assays, we observed that similar amounts of plasmids remained in muscles after electrotransfer with or without MAR elements, but that a significant portion had integrated into the muscle fiber chromosomes. Interestingly, the MAR elements were found to promote plasmid genomic integration but to oppose silencing effects in vivo, thereby mediating long-term expression. Conclusions: This study thus elucidates some of the determinants of transient or sustained expression from the use of non-viral regulated vectors in vivo. PMID:24295286

  8. Integrated genomics of Mucorales reveals novel therapeutic targets

    USDA-ARS's Scientific Manuscript database

    Mucormycosis is a life-threatening infection caused by Mucorales fungi. We sequenced 30 fungal genomes and performed transcriptomics with three representative Rhizopus and Mucor strains with human airway epithelial cells during fungal invasion to reveal key host and fungal determinants contributing ...

  9. An Integrated Genetic and Cytogenetic Map of the Cucumber Genome

    USDA-ARS's Scientific Manuscript database

    The Cucurbitaceae includes important crops such as cucumber, melon, watermelon, squash, and pumpkin. However, few genetic and genomic resources are available for plant improvement. Some cucurbit species such as cucumber have a narrow genetic base, which impedes construction of saturated molecular li...

  10. Integrated genome-based studies of Shewanella Ecophysiology

    SciTech Connect

    Tiedje, James M.; Konstantinidis, Kostas; Worden, Mark

    2014-01-08

    The aim of the work reported is to study Shewanella population genomics, and to understand the evolution, ecophysiology, and speciation of Shewanella. The tasks supporting this aim are: to study genetic and ecophysiological bases defining the core and diversification of Shewanella species; to determine gene content patterns along redox gradients; and to investigate the evolutionary processes, patterns and mechanisms of Shewanella.

  11. Integrated genomic approaches to enhance genetic resistance in chickens

    USDA-ARS's Scientific Manuscript database

    The chicken has led the way amongst agricultural animal species in infectious disease control and, in particular, selection for genetic resistance. The generation of the chicken genome sequence and the availability of other empowering tools and resources greatly enhance the ability to select for enh...

  12. CRISPR-mediated Genome Editing Restores Dystrophin Expression and Function in mdx Mice.

    PubMed

    Xu, Li; Park, Ki Ho; Zhao, Lixia; Xu, Jing; El Refaey, Mona; Gao, Yandi; Zhu, Hua; Ma, Jianjie; Han, Renzhi

    2016-03-01

    Duchenne muscular dystrophy (DMD) is a degenerative muscle disease caused by genetic mutations that lead to the disruption of dystrophin in muscle fibers. There is no curative treatment for this devastating disease. Clustered regularly interspaced short palindromic repeat/Cas9 (CRISPR/Cas9) has emerged as a powerful tool for genetic manipulation and potential therapy. Here we demonstrate that CRISPR-mediated genome editing efficiently excised a 23-kb genomic region on the X-chromosome covering the mutant exon 23 in a mouse model of DMD, and restored dystrophin expression and the dystrophin-glycoprotein complex at the sarcolemma of skeletal muscles in live mdx mice. Electroporation-mediated transfection of the Cas9/gRNA constructs in the skeletal muscles of mdx mice normalized the calcium sparks in response to osmotic shock. Adenovirus-mediated transduction of Cas9/gRNA greatly reduced the Evans blue dye uptake of skeletal muscles at rest and after downhill treadmill running. This study provides proof evidence for permanent gene correction in DMD.

  13. CRISPR-mediated Genome Editing Restores Dystrophin Expression and Function in mdx Mice

    PubMed Central

    Xu, Li; Park, Ki Ho; Zhao, Lixia; Xu, Jing; El Refaey, Mona; Gao, Yandi; Zhu, Hua; Ma, Jianjie; Han, Renzhi

    2016-01-01

    Duchenne muscular dystrophy (DMD) is a degenerative muscle disease caused by genetic mutations that lead to the disruption of dystrophin in muscle fibers. There is no curative treatment for this devastating disease. Clustered regularly interspaced short palindromic repeat/Cas9 (CRISPR/Cas9) has emerged as a powerful tool for genetic manipulation and potential therapy. Here we demonstrate that CRISPR-mediated genome editing efficiently excised a 23-kb genomic region on the X-chromosome covering the mutant exon 23 in a mouse model of DMD, and restored dystrophin expression and the dystrophin-glycoprotein complex at the sarcolemma of skeletal muscles in live mdx mice. Electroporation-mediated transfection of the Cas9/gRNA constructs in the skeletal muscles of mdx mice normalized the calcium sparks in response to osmotic shock. Adenovirus-mediated transduction of Cas9/gRNA greatly reduced the Evans blue dye uptake of skeletal muscles at rest and after downhill treadmill running. This study provides proof evidence for permanent gene correction in DMD. PMID:26449883

  14. FEATnotator: A tool for integrated annotation of sequence features and variation, facilitating interpretation in genomics experiments.

    PubMed

    Podicheti, Ram; Mockaitis, Keithanne

    2015-06-01

    As approaches are sought for more efficient and democratized uses of non-model and expanded model genomics references, ease of integration of genomic feature datasets is especially desirable in multidisciplinary research communities. Valuable conclusions are often missed or slowed when researchers refer experimental results to a single reference sequence that lacks integrated pan-genomic and multi-experiment data in accessible formats. Association of genomic positional information, such as results from an expansive variety of next-generation sequencing experiments, with annotated reference features such as genes or predicted protein binding sites, provides the context essential for conclusions and ongoing research. When the experimental system includes polymorphic genomic inputs, rapid calculation of gene structural and protein translational effects of sequence variation from the reference can be invaluable. Here we present FEATnotator, a lightweight, fast and easy to use open source software program that integrates and reports overlap and proximity in genomic information from any user-defined datasets including those from next generation sequencing applications. We illustrate use of the tool by summarizing whole genome sequence variation of a widely used natural isolate of Arabidopsis thaliana in the context of gene models of the reference accession. Previous discovery of a protein coding deletion influencing root development is replicated rapidly. Appropriate even in investigations of a single gene or genic regions such as QTL, comprehensive reports provided by FEATnotator better prepare researchers for interpretation of their experimental results. The tool is available for download at http://featnotator.sourceforge.net.
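
    The "overlap and proximity" reporting described in this abstract can be pictured with a small interval check. The sketch below is not FEATnotator's implementation or interface: the function name, the 1-based inclusive coordinates, and the example gene interval are all hypothetical, chosen only to show how a variant position might be related to an annotated feature.

```python
def relate_to_feature(position, feat_start, feat_end):
    """Relate a variant position to a feature interval.

    Coordinates are assumed to be 1-based and inclusive on the same sequence.
    Returns (relation, distance), where relation is 'overlap', 'before'
    or 'after' and distance is 0 for an overlap.
    """
    if feat_start <= position <= feat_end:
        return ("overlap", 0)
    if position < feat_start:
        return ("before", feat_start - position)
    return ("after", position - feat_end)


# Hypothetical gene model spanning positions 1000..2000.
for pos in (1500, 900, 2300):
    print(pos, relate_to_feature(pos, 1000, 2000))
```

    A real annotation tool would also need strand awareness and an index over many features; this sketch leaves both out for brevity.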

  15. The Mouse Genome Database: integration of and access to knowledge about the laboratory mouse.

    PubMed

    Blake, Judith A; Bult, Carol J; Eppig, Janan T; Kadin, James A; Richardson, Joel E

    2014-01-01

    The Mouse Genome Database (MGD) (http://www.informatics.jax.org) is the community model organism database resource for the laboratory mouse, a premier animal model for the study of genetic and genomic systems relevant to human biology and disease. MGD maintains a comprehensive catalog of genes, functional RNAs and other genome features as well as heritable phenotypes and quantitative trait loci. The genome feature catalog is generated by the integration of computational and manual genome annotations generated by NCBI, Ensembl and Vega/HAVANA. MGD curates and maintains the comprehensive listing of functional annotations for mouse genes using the Gene Ontology, and MGD curates and integrates comprehensive phenotype annotations including associations of mouse models with human diseases. Recent improvements include integration of the latest mouse genome build (GRCm38), improved access to comparative and functional annotations for mouse genes with expanded representation of comparative vertebrate genomes and new loads of phenotype data from high-throughput phenotyping projects. All MGD resources are freely available to the research community.

  16. The Mouse Genome Database: integration of and access to knowledge about the laboratory mouse

    PubMed Central

    Blake, Judith A.; Bult, Carol J.; Eppig, Janan T.; Kadin, James A.; Richardson, Joel E.

    2014-01-01

    The Mouse Genome Database (MGD) (http://www.informatics.jax.org) is the community model organism database resource for the laboratory mouse, a premier animal model for the study of genetic and genomic systems relevant to human biology and disease. MGD maintains a comprehensive catalog of genes, functional RNAs and other genome features as well as heritable phenotypes and quantitative trait loci. The genome feature catalog is generated by the integration of computational and manual genome annotations generated by NCBI, Ensembl and Vega/HAVANA. MGD curates and maintains the comprehensive listing of functional annotations for mouse genes using the Gene Ontology, and MGD curates and integrates comprehensive phenotype annotations including associations of mouse models with human diseases. Recent improvements include integration of the latest mouse genome build (GRCm38), improved access to comparative and functional annotations for mouse genes with expanded representation of comparative vertebrate genomes and new loads of phenotype data from high-throughput phenotyping projects. All MGD resources are freely available to the research community. PMID:24285300

  17. Exclusion of small terminase mediated DNA threading models for genome packaging in bacteriophage T4

    PubMed Central

    Gao, Song; Zhang, Liang; Rao, Venigalla B.

    2016-01-01

    Tailed bacteriophages and herpes viruses use powerful molecular machines to package their genomes. The packaging machine consists of three components: portal, motor (large terminase; TerL) and regulator (small terminase; TerS). Portal, a dodecamer, and motor, a pentamer, form two concentric rings at the special five-fold vertex of the icosahedral capsid. Powered by ATPase, the motor ratchets DNA into the capsid through the portal channel. TerS is essential for packaging, particularly for genome recognition, but its mechanism is unknown and controversial. Structures of gear-shaped TerS rings inspired models that invoke DNA threading through the central channel. Here, we report that mutations of basic residues that line phage T4 TerS (gp16) channel do not disrupt DNA binding. Even deletion of the entire channel helix retained DNA binding and produced progeny phage in vivo. On the other hand, large oligomers of TerS (11-mers/12-mers), but not small oligomers (trimers to hexamers), bind DNA. These results suggest that TerS oligomerization creates a large outer surface, which, but not the interior of the channel, is critical for function, probably to wrap viral genome around the ring during packaging initiation. Hence, models involving TerS-mediated DNA threading may be excluded as an essential mechanism for viral genome packaging. PMID:26984529

  18. Exclusion of small terminase mediated DNA threading models for genome packaging in bacteriophage T4.

    PubMed

    Gao, Song; Zhang, Liang; Rao, Venigalla B

    2016-05-19

    Tailed bacteriophages and herpes viruses use powerful molecular machines to package their genomes. The packaging machine consists of three components: portal, motor (large terminase; TerL) and regulator (small terminase; TerS). Portal, a dodecamer, and motor, a pentamer, form two concentric rings at the special five-fold vertex of the icosahedral capsid. Powered by ATPase, the motor ratchets DNA into the capsid through the portal channel. TerS is essential for packaging, particularly for genome recognition, but its mechanism is unknown and controversial. Structures of gear-shaped TerS rings inspired models that invoke DNA threading through the central channel. Here, we report that mutations of basic residues that line phage T4 TerS (gp16) channel do not disrupt DNA binding. Even deletion of the entire channel helix retained DNA binding and produced progeny phage in vivo. On the other hand, large oligomers of TerS (11-mers/12-mers), but not small oligomers (trimers to hexamers), bind DNA. These results suggest that TerS oligomerization creates a large outer surface, which, but not the interior of the channel, is critical for function, probably to wrap viral genome around the ring during packaging initiation. Hence, models involving TerS-mediated DNA threading may be excluded as an essential mechanism for viral genome packaging. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. CRISPR/Cas9-Mediated Genome Editing in Soybean Hairy Roots.

    PubMed

    Cai, Yupeng; Chen, Li; Liu, Xiujie; Sun, Shi; Wu, Cunxiang; Jiang, Bingjun; Han, Tianfu; Hou, Wensheng

    2015-01-01

    As a new technology for gene editing, the CRISPR (clustered regularly interspaced short palindromic repeat)/Cas (CRISPR-associated) system has been rapidly and widely used for genome engineering in various organisms. In the present study, we successfully applied the type II CRISPR/Cas9 system to generate and estimate genome editing in the desired target genes in soybean (Glycine max (L.) Merrill.). The single-guide RNA (sgRNA) and Cas9 cassettes were assembled on one vector to improve transformation efficiency, and we designed an sgRNA that targeted a transgene (bar) and six sgRNAs that targeted different sites of two endogenous soybean genes (GmFEI2 and GmSHR). The targeted DNA mutations were detected in soybean hairy roots. The results demonstrated that this customized CRISPR/Cas9 system shared the same efficiency for both endogenous and exogenous genes in soybean hairy roots. We also performed experiments to detect the potential of the CRISPR/Cas9 system to simultaneously edit two endogenous soybean genes using only one customized sgRNA. Overall, generating and detecting the CRISPR/Cas9-mediated genome modifications in target genes of soybean hairy roots could rapidly assess the efficiency of each target locus. The target sites with higher efficiencies can be used for regular soybean transformation. Furthermore, this method provides a powerful tool for root-specific functional genomics studies in soybean.

  20. Nucleoporin NUP153 guards genome integrity by promoting nuclear import of 53BP1

    PubMed Central

    Moudry, P; Lukas, C; Macurek, L; Neumann, B; Heriche, J-K; Pepperkok, R; Ellenberg, J; Hodny, Z; Lukas, J; Bartek, J

    2012-01-01

    53BP1 is a mediator of DNA damage response (DDR) and a tumor suppressor whose accumulation on damaged chromatin promotes DNA repair and enhances DDR signaling. Using foci formation of 53BP1 as a readout in two human cell lines, we performed an siRNA-based functional high-content microscopy screen for modulators of cellular response to ionizing radiation (IR). Here, we provide the complete results of this screen as an information resource, and validate and functionally characterize one of the identified 'hits': a nuclear pore component NUP153 as a novel factor specifically required for 53BP1 nuclear import. Using a range of cell and molecular biology approaches including live-cell imaging, we show that knockdown of NUP153 prevents 53BP1, but not several other DDR factors, from entering the nuclei in the newly forming daughter cells. This translates into decreased IR-induced 53BP1 focus formation, delayed DNA repair and impaired cell survival after IR. In addition, NUP153 depletion exacerbates DNA damage caused by replication stress. Finally, we show that the C-terminal part of NUP153 is required for effective 53BP1 nuclear import, and that 53BP1 is imported to the nucleus through the NUP153–importin-β interplay. Our data define the structure–function relationships within this emerging 53BP1-NUP153/importin-β pathway and implicate this mechanism in the maintenance of genome integrity. PMID:22075984

  1. An integrated functional genomics approach identifies the regulatory network directed by brachyury (T) in chordoma.

    PubMed

    Nelson, Andrew C; Pillay, Nischalan; Henderson, Stephen; Presneau, Nadège; Tirabosco, Roberto; Halai, Dina; Berisha, Fitim; Flicek, Paul; Stemple, Derek L; Stern, Claudio D; Wardle, Fiona C; Flanagan, Adrienne M

    2012-11-01

    Chordoma is a rare malignant tumour of bone, the molecular marker of which is the expression of the transcription factor, brachyury. Having recently demonstrated that silencing brachyury induces growth arrest in a chordoma cell line, we now seek to identify its downstream target genes. Here we use an integrated functional genomics approach involving shRNA-mediated brachyury knockdown, gene expression microarray, ChIP-seq experiments, and bioinformatics analysis to achieve this goal. We confirm that the T-box binding motif of human brachyury is identical to that found in mouse, Xenopus, and zebrafish development, and that brachyury acts primarily as an activator of transcription. Using human chordoma samples for validation purposes, we show that brachyury binds 99 direct targets and indirectly influences the expression of 64 other genes, thereby acting as a master regulator of an elaborate oncogenic transcriptional network encompassing diverse signalling pathways including components of the cell cycle, and extracellular matrix components. Given the wide repertoire of its active binding and the relative specific localization of brachyury to the tumour cells, we propose that an RNA interference-based gene therapy approach is a plausible therapeutic avenue worthy of investigation.

  2. PHRF1 promotes genome integrity by modulating non-homologous end-joining.

    PubMed

    Chang, C-F; Chu, P-C; Wu, P-Y; Yu, M-Y; Lee, J-Y; Tsai, M-D; Chang, M-S

    2015-04-09

    Methylated histone readers are critical for chromatin dynamics, transcription, and DNA repair. Human PHRF1 contains a plant homeodomain (PHD) that recognizes methylated histones and a RING domain, which ubiquitinates substrates. A recent study reveals that PHRF1 is a tumor suppressor that promotes TGF-β cytostatic signaling through TGIF ubiquitination. Also, PHRF1 is a putative phosphorylation substrate of ataxia telangiectasia-mutated/ataxia telangiectasia and Rad3-related kinases; however, the role of PHRF1 in DNA damage response is unclear. Here we report a novel function of PHRF1 in modulating non-homologous end-joining (NHEJ). PHRF1 quickly localizes to DNA damage lesions upon genotoxic insults. Ablation of PHRF1 decreases the efficiency of plasmid-based end-joining, whereas PHRF1 overexpression leads to an elevated NHEJ in H1299 reporter cells. Immunoprecipitation and peptide pull-down assays verify that PHRF1 constitutively binds to di- and trimethylated histone H3 lysine 36 (H3K36) (H3K36me2 and H3K36me3) via its PHD domain. Substitution of S915DT917E to ADAE in PHRF1 decreases its affinity for NBS1. Both PHD domain and SDTE motif are required for its NHEJ-promoting activity. Furthermore, PHRF1 mediates PARP1 polyubiquitination for proteasomal degradation. These results suggest that PHRF1 may combine with H3K36 methylation and NBS1 to promote NHEJ and stabilize genomic integrity upon DNA damage insults.

  3. A Regulatory Role for RUNX1, RUNX3 in the Maintenance of Genomic Integrity.

    PubMed

    Krishnan, Vaidehi; Ito, Yoshiaki

    2017-01-01

    All human cells are constantly attacked by endogenous and exogenous agents that damage the integrity of their genomes. Yet, the ensuing damage is mostly fixed and very rarely gives rise to genomic defects that promote cancer formation. This is due to the co-ordinated functioning of DNA repair proteins and checkpoint mechanisms that accurately detect and repair DNA damage to ensure genomic fitness. According to accumulating evidence, the RUNX family of transcription factors participate in the maintenance of genomic stability through transcriptional and non-transcriptional mechanisms. RUNX1 and RUNX3 maintain genomic integrity in a transcriptional manner by regulating the transactivation of apoptotic genes following DNA damage via complex formation with p53. RUNX1 and RUNX3 also maintain genomic integrity in a non-transcriptional manner during interstrand crosslink repair by promoting the recruitment of FANCD2 to sites of DNA damage. Since RUNX genes are frequently aberrant in human cancer, here, we argue that one of the major modes by which RUNX inactivation promotes neoplastic transformation is through the loss of genomic integrity. In particular, there exists strong evidence that leukemic RUNX1-fusions such as RUNX1-ETO disrupt genomic integrity and induce a "mutator" phenotype during the early stages of leukemogenesis. Consistent with increased DNA damage accumulation induced by RUNX1-ETO, PARP inhibition has been shown to be an effective synthetic-lethal therapeutic approach against RUNX1-ETO expressing leukemias. Here, in this chapter we will examine current evidence suggesting that the tumor suppressor potential of RUNX proteins can be at least partly attributed to their ability to ensure high-fidelity DNA repair and thus prevent mutational accumulation during cancer progression.

  4. A physical map of the highly heterozygous Populus genome: integration with the genome sequence and genetic map

    SciTech Connect

    Kelleher, Colin; CHIU, Dr. R.; Shin, Dr. H.; Krywinski, Martin; Fjell, Chris; Wilkin, Jennifer; Yin, Tongming; Difazio, Stephen P.

    2007-01-01

    As part of a larger project to sequence the Populus genome and generate genomic resources for this emerging model tree, we constructed a physical map of the Populus genome, representing one of the few such maps of an undomesticated, highly heterozygous plant species. The physical map, consisting of 2802 contigs, was constructed from fingerprinted bacterial artificial chromosome (BAC) clones. The map represents approximately 9.4-fold coverage of the Populus genome, which has been estimated from the genome sequence assembly to be 485 ± 10 Mb in size. BAC ends were sequenced to assist long-range assembly of whole-genome shotgun sequence scaffolds and to anchor the physical map to the genome sequence. Simple sequence repeat-based markers were derived from the end sequences and used to initiate integration of the BAC and genetic maps. A total of 2411 physical map contigs, representing 97% of all clones assigned to contigs, were aligned to the sequence assembly (JGI Populus trichocarpa, version 1.0). These alignments represent a total coverage of 384 Mb (79%) of the entire poplar sequence assembly and 295 Mb (96%) of linkage group sequence assemblies. A striking result of the physical map contig alignments to the sequence assembly was the co-localization of multiple contigs across numerous regions of the 19 linkage groups. Targeted sequencing of BAC clones and genetic analysis in a small number of representative regions showed that these co-aligning contigs represent distinct haplotypes in the heterozygous individual sequenced, and revealed the nature of these haplotype sequence differences.

  5. Unusual RNA plant virus integration in the soybean genome leads to the production of small RNAs.

    PubMed

    da Fonseca, Guilherme Cordenonsi; de Oliveira, Luiz Felipe Valter; de Morais, Guilherme Loss; Abdelnor, Ricardo Vilela; Nepomuceno, Alexandre Lima; Waterhouse, Peter M; Farinelli, Laurent; Margis, Rogerio

    2016-05-01

    Horizontal gene transfer (HGT) is known to be a major force in genome evolution. The acquisition of genes from viruses by eukaryotic genomes is a well-studied example of HGT, including rare cases of non-retroviral RNA virus integration. The present study describes the integration of cucumber mosaic virus RNA-1 into the soybean genome. After an initial metatranscriptomic analysis of small RNAs derived from soybean, the de novo assembly resulted in a 3029-nt contig homologous to RNA-1. The integration of this sequence in the soybean genome was confirmed by DNA deep sequencing. The locus where the integration occurred harbors the full RNA-1 sequence followed by the partial sequence of an endogenous mRNA and another sequence of RNA-1 as an inverted repeat, allowing the formation of a hairpin structure. This region recombined into a retrotransposon located inside an exon of a soybean gene. The nucleotide similarity of the integrated sequence compared to other Cucumber mosaic virus sequences indicates that the integration event occurred recently. We described a rare event of non-retroviral RNA virus integration in soybean that leads to the production of a double-stranded RNA in a similar fashion to virus resistance RNAi plants. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  6. Integration sites of Epstein-Barr virus genome on chromosomes of human lymphoblastoid cell lines

    SciTech Connect

    Wuu, K.D.; Chen, Y.J.; Wang-Wuu, S.

    1994-09-01

    Epstein-Barr virus (EBV) is the pathogen of infectious mononucleosis. The viral genome is present in more than 95% of the African cases of Burkitt lymphoma and it is usually maintained in episomal form in the tumor cells. Viral integration has been described only for Namalwa, which is a Burkitt lymphoma cell line lacking episomes. In order to examine the role of EBV in the immortalization of human B lymphocytes, we investigated whether the EBV integration into the human genome is essential. If the integration does occur, we would like to know whether the integration is randomly distributed or whether the viral DNA integrates preferentially at certain sites. Fourteen in vitro immortalized human lymphoblastoid cell lines (LCLs) were examined by fluorescence in situ hybridization (FISH) with a biotinylated EBV BamHI w DNA fragment as probe. The episomal form of EBV DNA was found in all cells of these cell lines, while only about 65% of the cells have the integrated viral DNA. This might suggest that integration is not a pre-requisite for cell immortalization. Although all chromosomes, except Y, have been found with integrated viral genome, chromosomes 1 and 5 are the most frequent EBV DNA carriers (p<0.05). Nine chromosome bands, namely, 1p31, 1q31, 2q32, 3q13, 3q26, 5q14, 6q24, 7q31 and 12q21, are preferential targets for EBV integration (p<0.001). Eighty percent of the total 938 EBV hybridization signals were found to be at G-band-positive area. This suggests that the mechanism of EBV integration might be different from that of the retroviruses, which specifically integrate to G-band-negative areas. Thus, we conclude that the integration of EBV to host genome is non-random and it may have something to do with the structure of chromosome and DNA sequences.

  7. Gene Transfer Efficiency and Genome-Wide Integration Profiling of Sleeping Beauty, Tol2, and PiggyBac Transposons in Human Primary T Cells

    PubMed Central

    Huang, Xin; Guo, Hongfeng; Tammana, Syam; Jung, Yong-Chul; Mellgren, Emil; Bassi, Preetinder; Cao, Qing; Tu, Zheng Jin; Kim, Yeong C; Ekker, Stephen C; Wu, Xiaolin; Wang, San Ming; Zhou, Xianzheng

    2010-01-01

    In this study, we compared the genomic integration efficiencies and transposition site preferences of Sleeping Beauty (SB or SB11), Tol2, and piggyBac (PB) transposon systems in primary T cells derived from peripheral blood lymphocytes (PBL) and umbilical cord blood (UCB). We found that PB demonstrated the highest efficiency of stable gene transfer in PBL-derived T cells, whereas SB11 and Tol2 mediated intermediate and lowest efficiencies, respectively. Southern hybridization analysis demonstrated that PB generated the highest number of integrants when compared to SB and Tol2 in both PBL and UCB T cells. Tol2 and PB appeared more likely to promote clonal expansion than SB, which may be in part due to the dysregulated expression of cancer-related genes near the insertion sites. Genome-wide integration analysis demonstrated that SB, Tol2, and PB integrations occurred in all the chromosomes without preference. Additionally, Tol2 and PB integration sites were mainly localized near transcriptional start sites (TSSs), CpG islands and DNaseI hypersensitive sites, whereas SB integrations were randomly distributed. These results suggest that SB may be a preferential choice of the delivery vector in T cells due to its random integration site preference and relatively high efficiency, and support continuing development of SB-mediated T-cell phase I trials. PMID:20606646

  8. Genome integrity and disease prevention in the nervous system.

    PubMed

    McKinnon, Peter J

    2017-06-15

    Multiple DNA repair pathways maintain genome stability and ensure that DNA remains essentially unchanged over the life of a cell. Various human diseases occur if DNA repair is compromised, and most of these impact the nervous system, in some cases exclusively. However, it is often unclear what specific endogenous damage underpins disease pathology. Generally, the types of causative DNA damage are associated with replication, transcription, or oxidative metabolism; other direct sources of endogenous lesions may arise from aberrant topoisomerase activity or ribonucleotide incorporation into DNA. This review focuses on the etiology of DNA damage in the nervous system and the genome stability pathways that prevent human neurologic disease. © 2017 McKinnon; Published by Cold Spring Harbor Laboratory Press.

  9. SOP for pathway inference in Integrated Microbial Genomes (IMG).

    PubMed

    Anderson, Iain; Chen, Amy; Markowitz, Victor; Kyrpides, Nikos; Ivanova, Natalia

    2011-12-31

    One of the most important aspects of genomic analysis is the prediction of which pathways, both metabolic and non-metabolic, are present in an organism. In IMG, this is carried out by the assignment of IMG terms, which are organized into IMG pathways. Based on manual and automatic assignment of IMG terms, the presence or absence of IMG pathways is automatically inferred. The three categories of pathway assertion are asserted (likely present), not asserted (likely absent), and unknown. In the unknown category, at least one term necessary for the pathway is missing, but an ortholog in another organism has the corresponding term assigned to it. Automatic pathway inference is an important initial step in genome analysis.
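
    The three-way assertion described in this abstract can be illustrated with a minimal sketch. This is not IMG's actual implementation: the function name, the set-based inputs and the rule for handling several missing terms (every missing term must have ortholog evidence for the result to be "unknown") are assumptions made here for illustration, and the example term labels are placeholders.

```python
from typing import Set


def infer_pathway(required_terms: Set[str],
                  assigned_terms: Set[str],
                  ortholog_terms: Set[str]) -> str:
    """Classify one pathway for one genome (hypothetical helper).

    required_terms: terms the pathway needs (assumed to be a flat set).
    assigned_terms: terms assigned to genes in the genome of interest.
    ortholog_terms: terms assigned to orthologs in other organisms.
    """
    missing = required_terms - assigned_terms
    if not missing:
        # Every required term is present: likely present.
        return "asserted"
    if missing <= ortholog_terms:
        # Assumption: all missing terms are assigned to an ortholog elsewhere.
        return "unknown"
    # At least one missing term has no ortholog evidence: likely absent.
    return "not asserted"


if __name__ == "__main__":
    pathway = {"term_A", "term_B", "term_C"}
    print(infer_pathway(pathway, {"term_A", "term_B", "term_C"}, set()))
    print(infer_pathway(pathway, {"term_A", "term_B"}, {"term_C"}))
    print(infer_pathway(pathway, {"term_A"}, {"term_C"}))
```

    The sketch treats a pathway as a flat set of required terms; pathways with alternative or optional steps would need a richer representation.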

  10. Childhood Acute Lymphoblastic Leukemia: Integrating Genomics into Therapy

    PubMed Central

    Tasian, Sarah K; Loh, Mignon L; Hunger, Stephen P

    2015-01-01

    Acute lymphoblastic leukemia (ALL), the most common malignancy of childhood, is a genetically complex entity that remains a major cause of childhood cancer-related mortality. Major advances in genomic and epigenomic profiling during the past decade have appreciably enhanced knowledge of the biology of de novo and relapsed ALL and have facilitated more precise risk stratification of patients. These achievements have also provided critical insights regarding potentially targetable lesions for development of new therapeutic approaches in the era of precision medicine. This review delineates the current genetic landscape of childhood ALL with emphasis upon patient outcomes with contemporary treatment regimens, as well as therapeutic implications of newly identified genomic alterations in specific subsets of ALL. PMID:26194091

  11. Integrated metabolomics and phytochemical genomics approaches for studies on rice.

    PubMed

    Okazaki, Yozo; Saito, Kazuki

    2016-01-01

    Metabolomics is widely employed to monitor the cellular metabolic state and assess the quality of plant-derived foodstuffs because it can be used to manage datasets that include a wide range of metabolites in their analytical samples. In this review, we discuss metabolomics research on rice in order to elucidate the overall regulation of the metabolism as it is related to the growth and mechanisms of adaptation to genetic modifications and environmental stresses such as fungal infections, submergence, and oxidative stress. We also focus on phytochemical genomics studies based on a combination of metabolomics and quantitative trait locus (QTL) mapping techniques. In addition to starch, rice produces many metabolites that also serve as nutrients for human consumers. The outcomes of recent phytochemical genomics studies of diverse natural rice resources suggest there is potential for using further effective breeding strategies to improve the quality of ingredients in rice grains.

  12. Noncoding RNAs in DNA Repair and Genome Integrity

    PubMed Central

    Wan, Guohui; Liu, Yunhua; Han, Cecil; Zhang, Xinna

    2014-01-01

    Abstract Significance: The well-studied sequences in the human genome are those of protein-coding genes, which account for only 1%–2% of the total genome. However, with the advent of high-throughput transcriptome sequencing technology, we now know that about 90% of our genome is extensively transcribed and that the vast majority of them are transcribed into noncoding RNAs (ncRNAs). It is of great interest and importance to decipher the functions of these ncRNAs in humans. Recent Advances: In the last decade, it has become apparent that ncRNAs play a crucial role in regulating gene expression in normal development, in stress responses to internal and environmental stimuli, and in human diseases. Critical Issues: In addition to those constitutively expressed structural RNA, such as ribosomal and transfer RNAs, regulatory ncRNAs can be classified as microRNAs (miRNAs), Piwi-interacting RNAs (piRNAs), small interfering RNAs (siRNAs), small nucleolar RNAs (snoRNAs), and long noncoding RNAs (lncRNAs). However, little is known about the biological features and functional roles of these ncRNAs in DNA repair and genome instability, although a number of miRNAs and lncRNAs are regulated in the DNA damage response. Future Directions: A major goal of modern biology is to identify and characterize the full profile of ncRNAs with regard to normal physiological functions and roles in human disorders. Clinically relevant ncRNAs will also be evaluated and targeted in therapeutic applications. Antioxid. Redox Signal. 20, 655–677. PMID:23879367

  13. Integrated genomic analyses of de novo pathways underlying atypical meningiomas

    PubMed Central

    Harmancı, Akdes Serin; Youngblood, Mark W.; Clark, Victoria E.; Coşkun, Süleyman; Henegariu, Octavian; Duran, Daniel; Erson-Omay, E. Zeynep; Kaulen, Leon D.; Lee, Tong Ihn; Abraham, Brian J.; Simon, Matthias; Krischek, Boris; Timmer, Marco; Goldbrunner, Roland; Omay, S. Bülent; Baranoski, Jacob; Baran, Burçin; Carrión-Grant, Geneive; Bai, Hanwen; Mishra-Gorur, Ketu; Schramm, Johannes; Moliterno, Jennifer; Vortmeyer, Alexander O.; Bilgüvar, Kaya; Yasuno, Katsuhito; Young, Richard A.; Günel, Murat

    2017-01-01

    Meningiomas are mostly benign brain tumours, with a potential for becoming atypical or malignant. On the basis of comprehensive genomic, transcriptomic and epigenomic analyses, we compared benign meningiomas to atypical ones. Here, we show that the majority of primary (de novo) atypical meningiomas display loss of NF2, which co-occurs either with genomic instability or recurrent SMARCB1 mutations. These tumours harbour increased H3K27me3 signal and a hypermethylated phenotype, mainly occupying the polycomb repressive complex 2 (PRC2) binding sites in human embryonic stem cells, thereby phenocopying a more primitive cellular state. Consistent with this observation, atypical meningiomas exhibit upregulation of EZH2, the catalytic subunit of the PRC2 complex, as well as the E2F2 and FOXM1 transcriptional networks. Importantly, these primary atypical meningiomas do not harbour TERT promoter mutations, which have been reported in atypical tumours that progressed from benign ones. Our results establish the genomic landscape of primary atypical meningiomas and potential therapeutic targets. PMID:28195122

  14. Genome maintenance and transcription integrity in aging and disease

    PubMed Central

    Wolters, Stefanie; Schumacher, Björn

    2013-01-01

    DNA damage contributes to cancer development and aging. Congenital syndromes that affect DNA repair processes are characterized by cancer susceptibility, developmental defects, and accelerated aging (Schumacher et al., 2008). DNA damage interferes with DNA metabolism by blocking replication and transcription. DNA polymerase blockage leads to replication arrest and can give rise to genome instability. Transcription, on the other hand, is an essential process for utilizing the information encoded in the genome. DNA damage that interferes with transcription can lead to apoptosis and cellular senescence. Both processes are powerful tumor suppressors (Bartek and Lukas, 2007). Cellular response mechanisms to stalled RNA polymerase II complexes have only recently started to be uncovered. Transcription-coupled DNA damage responses might thus play important roles for the adjustments to DNA damage accumulation in the aging organism (Garinis et al., 2009). Here we review human disorders that are caused by defects in genome stability to explore the role of DNA damage in aging and disease. We discuss how the nucleotide excision repair system functions at the interface of transcription and repair and conclude with concepts of how therapeutic targeting of transcription might be utilized in the treatment of cancer. PMID:23443494

  15. Patterns of genomic integration of nuclear chloroplast DNA fragments in plant species.

    PubMed

    Yoshida, Takanori; Furihata, Hazuka Y; Kawabe, Akira

    2014-01-01

    The transfer of organelle DNA fragments to the nuclear genome is frequently observed in eukaryotes. These transfers are thought to play an important role in gene and genome evolution of eukaryotes. In plants, such transfers occur from plastid to nuclear [nuclear plastid DNAs (NUPTs)] and mitochondrial to nuclear (nuclear mitochondrial DNAs) genomes. The amount and genomic organization of organelle DNA fragments have been studied in model plant species, such as Arabidopsis thaliana and rice. At present, publicly available genomic data can be used to conduct such studies in non-model plants. In this study, we analysed the amount and genomic organization of NUPTs in 17 plant species for which genome sequences are available. The amount and distribution of NUPTs varied among the species. We also estimated the distribution of NUPTs according to the time of integration (relative age) by conducting sequence similarity analysis between NUPTs and the plastid genome. The age distributions suggested that the present genomic constitutions of NUPTs could be explained by the combination of the rapidly eliminated deleterious parts and few but constantly existing less deleterious parts.

  16. GMATA: An Integrated Software Package for Genome-Scale SSR Mining, Marker Development and Viewing.

    PubMed

    Wang, Xuewen; Wang, Le

    2016-01-01

    Simple sequence repeats (SSRs), also referred to as microsatellites, are highly variable tandem DNAs that are widely used as genetic markers. The increasing availability of whole-genome and transcript sequences provides information resources for SSR marker development. However, efficient software is required to identify and display SSR information along with other gene features at a genome scale. We developed a novel software package, Genome-wide Microsatellite Analyzing Tool Package (GMATA), integrating SSR mining, statistical analysis and plotting, marker design, polymorphism screening and marker transferability, and enabling simultaneous display of SSR markers with other genome features. GMATA applies novel strategies for SSR analysis and primer design in large genomes, which allows GMATA to perform faster calculation and provide more accurate results than existing tools. Our package is also capable of processing DNA sequences of any size on a standard computer. GMATA is user friendly, requires only mouse clicks or typed inputs on the command line, and is executable on multiple computing platforms. We demonstrated the application of GMATA in plant genomes and revealed a novel distribution pattern of SSRs in 15 grass genomes. The most abundant motifs are dimer GA/TC, the A/T monomer and the GCG/CGC trimer, rather than the rich G/C content in DNA sequence. We also revealed that SSR count is linear to the chromosome length in fully assembled grass genomes. GMATA represents a powerful application tool that facilitates genomic sequence analyses. GMATA is freely available at http://sourceforge.net/projects/gmata/?source=navbar.
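
    As a rough illustration of what SSR mining involves, the following sketch finds perfect tandem repeats with a regular expression. It is not GMATA's algorithm, which the abstract does not describe; the function name `find_ssrs` and the default thresholds (motifs of 1 to 3 nt, at least 5 repeats) are assumptions chosen for the example.

```python
import re
from typing import List, Tuple


def find_ssrs(seq: str,
              min_motif: int = 1,
              max_motif: int = 3,
              min_repeats: int = 5) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for each perfect SSR found.

    Hypothetical helper: a motif of min_motif..max_motif bases followed
    by at least (min_repeats - 1) further exact copies of itself.
    """
    # The lazy quantifier tries the shortest motif first, so a run of
    # A's is reported as motif "A" rather than "AA" or "AAA".
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        repeats = len(m.group(0)) // len(motif)
        hits.append((m.start(), motif, repeats))
    return hits


if __name__ == "__main__":
    example = "ACGT" + "GA" * 5 + "TT"
    for start, motif, repeats in find_ssrs(example):
        print(start, motif, repeats)
```

    A regex scan of this kind handles only perfect repeats on one strand; imperfect or compound SSRs, and genome-scale performance, would need a different approach.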

  17. ssODN-mediated knock-in with CRISPR-Cas for large genomic regions in zygotes

    PubMed Central

    Yoshimi, Kazuto; Kunihiro, Yayoi; Kaneko, Takehito; Nagahora, Hitoshi; Voigt, Birger; Mashimo, Tomoji

    2016-01-01

    The CRISPR-Cas system is a powerful tool for generating genetically modified animals; however, targeted knock-in (KI) via homologous recombination remains difficult in zygotes. Here we show efficient gene KI in rats by combining CRISPR-Cas with single-stranded oligodeoxynucleotides (ssODNs). First, a 1-kb ssODN co-injected with guide RNA (gRNA) and Cas9 messenger RNA produces GFP-KI at the rat Thy1 locus. Then, two gRNAs with two 80-bp ssODNs direct efficient integration of a 5.5-kb CAG-GFP vector into the Rosa26 locus via ssODN-mediated end joining. This protocol also achieves KI of a 200-kb BAC containing the human SIRPA locus, concomitantly knocking out the rat Sirpa gene. Finally, three gRNAs and two ssODNs replace 58-kb of the rat Cyp2d cluster with a 6.2-kb human CYP2D6 gene. These ssODN-mediated KI protocols can be applied to any target site with any donor vector without the need to construct homology arms, thus simplifying genome engineering in living organisms. PMID:26786405

  18. T lymphocyte-mediated cytotoxicity against autologous EBV-genome-bearing B cells.

    PubMed

    Tsoukas, C D; Fox, R I; Slovin, S F; Carson, D A; Pellegrino, M; Fong, S; Pasquali, J L; Ferrone, S; Kung, P; Vaughan, J H

    1981-05-01

    We have stimulated human peripheral blood lymphocytes in vitro with autologous EBV-infected or noninfected B cells. A cytotoxic response was obtained only when virally infected cells were used. The activity of the effector cells was restricted by the major histocompatibility complex and was directed against EBV-genome-bearing targets. The highest cytolytic response was obtained when lymphocytes of individuals previously exposed to the virus (EBV-VCA positive) were used. Lymphocytes of noninfected donors (EBV-VCA negative) gave a low response; the relative frequency of their effector cells was at least 4-fold lower. Lymphocytes of newborns did not respond. The cytotoxic activity was mediated by T lymphocytes of the cytotoxic/suppressor subset, as determined by cytofluorographic analysis and antibody plus complement-mediated lysis, using monoclonal antibodies to human lymphocyte surface antigen.

  19. HIV-1 Integrates Widely throughout the Genome of the Human Blood Fluke Schistosoma mansoni

    PubMed Central

    Mann, Victoria H.; Dubrovsky, Larisa; Yan, Hong-bin; Huckvale, Thomas; Protasio, Anna V.; Pushkarsky, Tatiana; Iordanskiy, Sergey; Bukrinsky, Michael I.

    2016-01-01

    Schistosomiasis is the most important helminthic disease of humanity in terms of morbidity and mortality. Facile manipulation of schistosomes using lentiviruses would enable advances in functional genomics in these and related neglected tropical disease pathogens including tapeworms, and including their non-dividing cells. Such approaches have hitherto been unavailable. Blood stream forms of the human blood fluke, Schistosoma mansoni, the causative agent of hepatointestinal schistosomiasis, were infected with the human HIV-1 isolate NL4-3 pseudotyped with vesicular stomatitis virus glycoprotein. The appearance of strong stop and positive strand cDNAs indicated that virions fused to schistosome cells, the nucleocapsid was internalized and the RNA genome was reverse transcribed. Anchored PCR analysis, sequencing of HIV-1-specific anchored Illumina libraries and Whole Genome Sequencing (WGS) of schistosomes confirmed chromosomal integration; >8,000 integrations were mapped, distributed throughout the eight pairs of chromosomes including the sex chromosomes. The rate of integrations in the genome exceeded five per 1,000 kb and HIV-1 integrated into protein-encoding loci and elsewhere with integration bias dissimilar to that of human T cells. We estimated ~2,100 integrations per schistosomulum based on WGS, i.e. about two or three events per cell, comparable to integration rates in human cells. Accomplishment in schistosomes of post-entry processes essential for HIV-1 replication, including integrase-catalyzed integration, was remarkable given the phylogenetic distance between schistosomes and primates, the natural hosts of the genus Lentivirus. These enigmatic findings revealed that HIV-1 was active within cells of S. mansoni, and provided the first demonstration that HIV-1 can integrate into the genome of an invertebrate. PMID:27764257

  20. HIV-1 Integrates Widely throughout the Genome of the Human Blood Fluke Schistosoma mansoni.

    PubMed

    Suttiprapa, Sutas; Rinaldi, Gabriel; Tsai, Isheng J; Mann, Victoria H; Dubrovsky, Larisa; Yan, Hong-Bin; Holroyd, Nancy; Huckvale, Thomas; Durrant, Caroline; Protasio, Anna V; Pushkarsky, Tatiana; Iordanskiy, Sergey; Berriman, Matthew; Bukrinsky, Michael I; Brindley, Paul J

    2016-10-01

    Schistosomiasis is the most important helminthic disease of humanity in terms of morbidity and mortality. Facile manipulation of schistosomes using lentiviruses would enable advances in functional genomics in these and related neglected tropical disease pathogens including tapeworms, and including their non-dividing cells. Such approaches have hitherto been unavailable. Blood stream forms of the human blood fluke, Schistosoma mansoni, the causative agent of hepatointestinal schistosomiasis, were infected with the human HIV-1 isolate NL4-3 pseudotyped with vesicular stomatitis virus glycoprotein. The appearance of strong stop and positive strand cDNAs indicated that virions fused to schistosome cells, the nucleocapsid was internalized and the RNA genome was reverse transcribed. Anchored PCR analysis, sequencing of HIV-1-specific anchored Illumina libraries and Whole Genome Sequencing (WGS) of schistosomes confirmed chromosomal integration; >8,000 integrations were mapped, distributed throughout the eight pairs of chromosomes including the sex chromosomes. The rate of integrations in the genome exceeded five per 1,000 kb and HIV-1 integrated into protein-encoding loci and elsewhere with integration bias dissimilar to that of human T cells. We estimated ~2,100 integrations per schistosomulum based on WGS, i.e. about two or three events per cell, comparable to integration rates in human cells. Accomplishment in schistosomes of post-entry processes essential for HIV-1 replication, including integrase-catalyzed integration, was remarkable given the phylogenetic distance between schistosomes and primates, the natural hosts of the genus Lentivirus. These enigmatic findings revealed that HIV-1 was active within cells of S. mansoni, and provided the first demonstration that HIV-1 can integrate into the genome of an invertebrate.

  1. Human Genome-Wide Expression Analysis Reorients the Study of Inflammatory Mediators and Biomechanics in Osteoarthritis

    PubMed Central

    Sandy, John D.; Chan, Deva D.; Trevino, Robert L.; Wimmer, Markus A.; Plaas, Anna

    2015-01-01

    A major objective of this article is to examine the research implications of recently available genome-wide expression profiles of cartilage from human osteoarthritis (OA) joints. We propose that when viewed in the light of extensive earlier work this novel data provides a unique opportunity to reorient the design of experimental systems toward clinical relevance. Specifically, in the area of cartilage explant biology this will require a fresh evaluation of existing paradigms, so as to optimize the choices of tissue source, cytokine/growth factor/nutrient addition, and biomechanical environment for discovery. Within this context, we firstly discuss the literature on the nature and role of potential catabolic mediators in OA pathology, including data from human OA cartilage, animal models of OA and ex vivo studies. Secondly, due to the number and breadth of studies on IL-1β in this area, a major focus of the article is a critical analysis of the design and interpretation of cartilage studies where IL-1β has been used as a model cytokine. Thirdly, the article provides a data-driven perspective (including genome-wide analysis of clinical samples, studies on mutant mice, and clinical trials), which concludes that IL-1β should be replaced by soluble mediators such as IL-17 or TGF-β1, which are much more likely to mimic the disease in OA model systems. We also discuss the evidence that changes in early OA can be attributed to the activity of such soluble mediators, whereas late-stage disease results more from a chronic biomechanical effect on the matrix and cells of the remaining cartilage and on other local mediator-secreting cells. Lastly, an updated protocol for in vitro studies with cartilage explants and chondrocytes (including the use of specific gene expression arrays) is provided to motivate more disease-relevant studies on the interplay of cytokines/growth factors and biomechanics on cellular behavior. PMID:26521740

  2. Genomic characterization of viral integration sites in HPV-related cancers.

    PubMed

    Bodelon, Clara; Untereiner, Michael E; Machiela, Mitchell J; Vinokurova, Svetlana; Wentzensen, Nicolas

    2016-11-01

    Persistent infection with carcinogenic human papillomaviruses (HPV) causes the majority of anogenital cancers and a subset of head and neck cancers. The HPV genome is frequently found integrated into the host genome of invasive cancers. The mechanisms by which integration may promote disease progression are not well understood. Thoroughly characterizing integration events can provide insights into HPV carcinogenesis. Individual studies have reported a limited number of integration sites in cell lines and human samples. We performed a systematic review of published integration sites in HPV-related cancers and conducted a pooled analysis to formally test for integration hotspots and genomic features enriched in integration events using data from the Encyclopedia of DNA Elements (ENCODE). Over 1,500 integration sites were reported in the literature, of which 90.8% (N = 1,407) were in human tissues. We found 10 cytobands enriched for integration events, three previously reported ones (3q28, 8q24.21 and 13q22.1) and seven additional ones (2q22.3, 3p14.2, 8q24.22, 14q24.1, 17p11.1, 17q23.1 and 17q23.2). Cervical infections with HPV18 were more likely to have breakpoints in 8q24.21 (p = 7.68 × 10^-4) than those with HPV16. Overall, integration sites were more likely to be in gene regions than expected by chance (p = 6.93 × 10^-9). They were also significantly closer to CpG regions, fragile sites, transcriptionally active regions and enhancers. Few integration events occurred within 50 kb of known cervical cancer driver genes. This suggests that HPV integrates in accessible regions of the genome, preferentially genes and enhancers, which may affect the expression of target genes.

  3. An integrative and applicable phylogenetic footprinting framework for cis-regulatory motifs identification in prokaryotic genomes.

    PubMed

    Liu, Bingqiang; Zhang, Hanyuan; Zhou, Chuan; Li, Guojun; Fennell, Anne; Wang, Guanghui; Kang, Yu; Liu, Qi; Ma, Qin

    2016-08-09

    Phylogenetic footprinting is an important computational technique for identifying cis-regulatory motifs in orthologous regulatory regions from multiple genomes, as motifs tend to evolve more slowly than their surrounding non-functional sequences. Its application, however, faces several difficulties in optimizing the selection of orthologous data and reducing the false positives in motif prediction. Here we present an integrative phylogenetic footprinting framework for accurate motif predictions in prokaryotic genomes (MP³). The framework includes a new orthologous data preparation procedure, an additional promoter scoring and pruning method and an integration of six existing motif finding algorithms as basic motif search engines. Specifically, we collected orthologous genes from available prokaryotic genomes and built the orthologous regulatory regions based on sequence similarity of promoter regions. This procedure made full use of the large-scale genomic data and taxonomy information and filtered out the promoters with limited contribution to produce a high quality orthologous promoter set. The promoter scoring and pruning is implemented through motif voting by a set of complementary predicting tools that mine as many motif candidates as possible and simultaneously eliminate the effect of random noise. We have applied the framework to the Escherichia coli K12 genome and evaluated the prediction performance through comparison with seven existing programs. This evaluation was systematically carried out at the nucleotide and binding site level, and the results showed that MP³ consistently outperformed other popular motif finding tools. We have integrated MP³ into our motif identification and analysis server DMINDA, allowing users to efficiently identify and analyze motifs in 2,072 completely sequenced prokaryotic genomes. The performance evaluation indicated that MP³ is effective for predicting regulatory motifs in prokaryotic genomes. Its application may enhance

  4. ArkMAP: integrating genomic maps across species and data sources.

    PubMed

    Paterson, Trevor; Law, Andy

    2013-08-13

    The visualisation of genetic and genomic maps aligned within and between species and across data sources can be used to inform studies of genome evolution, assist genome assembly projects and aid gene discovery and identification. Whilst annotation, integration and exploration of assembled genome sequences is well supported, there are fewer tools available which can display genetic maps for less well-characterized species, and integrate these maps with annotated reference genomes to support cross species comparisons. We have developed a desktop application to draw and align genetic and genomic maps, retrieved from remote data sources or loaded as local files. Maps can be retrieved from our public map database ArkDB or from any Ensembl data source (i.e. Ensembl and Ensembl Genomes). By using the JEnsembl API, maps can be drawn for any release version of any of the thousands of species present in Ensembl data sources, allowing not only inter-specific comparisons, but also comparisons between different versions/revisions of assembled genomes. Maps can be aligned by relating identical or synonymous markers across maps, or through the gene homology/orthology relationship data stored in the Ensembl Compara databases, allowing ready visualization of regions of conserved synteny between species. The map drawing canvas is highly configurable, supports interactive exploration of maps, markers and relationships and allows export of publication quality graphics. ArkMAP allows users to draw and interactively explore gene and variation maps for any version of any annotated genome curated in the Ensembl data sources, and to integrate local mapping data. The maps and inter-map relationships drawn are highly configurable and ArkMAP may be used to produce publication quality graphics. ArkMAP is freely available as an auto-updating Java 'Web Start' application, or as a standalone archived application.

  5. The age and genomic integrity of neurons after cortical stroke in humans.

    PubMed

    Huttner, Hagen B; Bergmann, Olaf; Salehpour, Mehran; Rácz, Attila; Tatarishvili, Jemal; Lindgren, Emma; Csonka, Tamás; Csiba, László; Hortobágyi, Tibor; Méhes, Gábor; Englund, Elisabet; Solnestam, Beata Werne; Zdunek, Sofia; Scharenberg, Christian; Ström, Lena; Ståhl, Patrik; Sigurgeirsson, Benjamin; Dahl, Andreas; Schwab, Stefan; Possnert, Göran; Bernard, Samuel; Kokaia, Zaal; Lindvall, Olle; Lundeberg, Joakim; Frisén, Jonas

    2014-06-01

    It has been unclear whether ischemic stroke induces neurogenesis or neuronal DNA rearrangements in the human neocortex. Using immunohistochemistry; transcriptome, genome and ploidy analyses; and determination of nuclear bomb test-derived ¹⁴C concentration in neuronal DNA, we found neither to be the case. A large proportion of cortical neurons displayed DNA fragmentation and DNA repair a short time after stroke, whereas neurons at chronic stages after stroke showed DNA integrity, demonstrating the relevance of an intact genome for survival.

  6. Integrating Genomic Resources with Electronic Health Records using the HL7 Infobutton Standard

    PubMed Central

    Overby, Casey Lynnette; Del Fiol, Guilherme; Rubinstein, Wendy S.; Maglott, Donna R.; Nelson, Tristan H.; Milosavljevic, Aleksandar; Martin, Christa L.; Goehringer, Scott R.; Freimuth, Robert R.; Williams, Marc S.

    2016-01-01

    Background: The Clinical Genome Resource (ClinGen) Electronic Health Record (EHR) Workgroup aims to integrate ClinGen resources with EHRs. A promising option to enable this integration is through the Health Level Seven (HL7) Infobutton Standard. EHR systems that are certified according to the US Meaningful Use program provide HL7-compliant infobutton capabilities, which can be leveraged to support clinical decision-making in genomics. Objectives: To integrate genomic knowledge resources using the HL7 Infobutton standard. Two tactics to achieve this objective were: (1) creating an HL7-compliant search interface for ClinGen, and (2) proposing guidance for genomic resources on achieving HL7 Infobutton standard accessibility and compliance. Methods: We built a search interface utilizing OpenInfobutton, an open source reference implementation of the HL7 Infobutton standard. ClinGen resources were assessed for readiness towards HL7 compliance. Finally, based upon our experiences we provide recommendations for publishers seeking to achieve HL7 compliance. Results: Eight genomic resources and two sub-resources were integrated with the ClinGen search engine via OpenInfobutton and the HL7 Infobutton standard. The resources we assessed have varying levels of readiness towards HL7 compliance. Furthermore, we found that adoption of standard terminologies used by EHR systems is the main gap to achieving compliance. Conclusion: Genomic resources can be integrated with EHR systems via the HL7 Infobutton standard using OpenInfobutton. Full compliance of genomic resources with the Infobutton standard would further enhance interoperability with EHR systems. PMID:27579472

  7. ITEP: An integrated toolkit for exploration of microbial pan-genomes

    PubMed Central

    2014-01-01

    Background: Comparative genomics is a powerful approach for studying variation in physiological traits as well as the evolution and ecology of microorganisms. Recent technological advances have enabled sequencing large numbers of related genomes in a single project, requiring computational tools for their integrated analysis. In particular, accurate annotations and identification of gene presence and absence are critical for understanding and modeling the cellular physiology of newly sequenced genomes. Although many tools are available to compare the gene contents of related genomes, new tools are necessary to enable close examination and curation of protein families from large numbers of closely related organisms, to integrate curation with the analysis of gain and loss, and to generate metabolic networks linking the annotations to observed phenotypes. Results: We have developed ITEP, an Integrated Toolkit for Exploration of microbial Pan-genomes, to curate protein families, compute similarities to externally-defined domains, analyze gene gain and loss, and generate draft metabolic networks from one or more curated reference network reconstructions in groups of related microbial species among which the combination of core and variable genes constitutes their "pan-genomes". The ITEP toolkit consists of: (1) a series of modular command-line scripts for identification, comparison, curation, and analysis of protein families and their distribution across many genomes; (2) a set of Python libraries for programmatic access to the same data; and (3) pre-packaged scripts to perform common analysis workflows on a collection of genomes. ITEP's capabilities include de novo protein family prediction, ortholog detection, analysis of functional domains, identification of core and variable genes and gene regions, sequence alignments and tree generation, annotation curation, and the integration of cross-genome analysis and metabolic networks for study of metabolic network

  8. An integrated computational pipeline and database to support whole-genome sequence annotation

    PubMed Central

    Mungall, CJ; Misra, S; Berman, BP; Carlson, J; Frise, E; Harris, N; Marshall, B; Shu, S; Kaminker, JS; Prochnik, SE; Smith, CD; Smith, E; Tupy, JL; Wiel, C; Rubin, GM; Lewis, SE

    2002-01-01

    We describe here our experience in annotating the Drosophila melanogaster genome sequence, in the course of which we developed several new open-source software tools and a database schema to support large-scale genome annotation. We have developed these into an integrated and reusable software system for whole-genome annotation. The key contributions to overall annotation quality are the marshalling of high-quality sequences for alignments and the design of a system with an adaptable and expandable flexible architecture. PMID:12537570

  9. Figure 4 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Cancer.gov

    Gene-list view of genomic data. The gene-list view allows users to compare data across a set of loci. The data in this figure includes copy number, mutation, and clinical data from 202 glioblastoma samples from TCGA. Adapted from Figure 7; Thorvaldsdottir H et al. 2012

  10. Figure 2 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Cancer.gov

    Grouping and sorting genomic data in IGV. The IGV user interface displaying 202 glioblastoma samples from TCGA. Samples are grouped by tumor subtype (second annotation column) and data type (first annotation column) and sorted by copy number of the EGFR locus (middle column). Adapted from Figure 1; Robinson et al. 2011

  11. Figure 5 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Cancer.gov

    Split-Screen View. The split-screen view is useful for exploring relationships of genomic features that are independent of chromosomal location. Color is used here to indicate mate pairs that map to different chromosomes, chromosomes 1 and 6, suggesting a translocation event. Adapted from Figure 8; Thorvaldsdottir H et al. 2012

  12. Integration of genomic medicine into pathology residency training: the Stanford open curriculum.

    PubMed

    Schrijver, Iris; Natkunam, Yasodha; Galli, Stephen; Boyd, Scott D

    2013-03-01

    Next-generation sequencing methods provide an opportunity for molecular pathology laboratories to perform genomic testing that is far more comprehensive than single-gene analyses. Genome-based test results are expected to develop into an integral component of diagnostic clinical medicine and to provide the basis for individually tailored health care. To achieve these goals, rigorous interpretation of high-quality data must be informed by the medical history and the phenotype of the patient. The discipline of pathology is well positioned to implement genome-based testing and to interpret its results, but new knowledge and skills must be included in the training of pathologists to develop expertise in this area. Pathology residents should be trained in emerging technologies to integrate genomic test results appropriately with more traditional testing, to accelerate clinical studies using genomic data, and to help develop appropriate standards of data quality and evidence-based interpretation of these test results. We have created a genomic pathology curriculum as a first step in helping pathology residents build a foundation for the understanding of genomic medicine and its implications for clinical practice. This curriculum is freely accessible online.

  13. Multiplex genomic walking: Integration of the wet lab and computer lab into a single prototyping environment

    SciTech Connect

    Gillevet, P.M.

    1993-12-31

    The authors are presently sequencing the entire genome of Mycoplasma capricolum, one of the smallest free-living organisms, by a Multiplex Genomic Walking strategy. This technique involves the repetitive hybridization of sequencing membranes with oligonucleotide probes to acquire sequence data in discrete steps along the genome. The technique allows one to walk a genome in a directed manner, eliminating the problems associated with random shotgun assembly. Furthermore, the repetitive stripping and hybridization process is relatively simple to reproduce and has the potential to be easily automated. The Genetic Data Environment (GDE), an X Windows based Graphic User Interface, has allowed the seamless integration of a core multiple sequence editor with pre-existing external sequence analysis programs and internally developed programs into a single prototypic environment. This system has facilitated linkage of the 9 Harvard Genome Lab's internal database and automated data control systems into one Graphic User Interface which can handle the archiving and analysis of both random fluorescent sequencing data and genomic walking data from the Mycoplasma project. Finally, it has facilitated the integration of the genomic sequence data into a PROLOG database environment for the comparative analysis of Mycoplasma capricolum and other organisms.

  14. Gene context analysis in the Integrated Microbial Genomes (IMG) data management system.

    PubMed

    Mavromatis, Konstantinos; Chu, Ken; Ivanova, Natalia; Hooper, Sean D; Markowitz, Victor M; Kyrpides, Nikos C

    2009-11-24

    Computational methods for determining the function of genes in newly sequenced genomes have traditionally been based on sequence similarity to genes whose function has been identified experimentally. Function prediction methods can be extended using gene context analysis approaches such as examining the conservation of chromosomal gene clusters, gene fusion events and co-occurrence profiles across genomes. Context analysis is based on the observation that functionally related genes often have similar gene context, and it relies on the identification of such events across a phylogenetically diverse collection of genomes. We have used the data management system of the Integrated Microbial Genomes (IMG) as the framework to implement and explore the power of gene context analysis methods because it provides one of the largest available genome integrations. Visualization and search tools to facilitate gene context analysis have been developed and applied across all publicly available archaeal and bacterial genomes in IMG. These computations are now maintained as part of IMG's regular genome content update cycle. IMG is available at: http://img.jgi.doe.gov.
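
    One of the context signals named in the abstract above, co-occurrence profiles across genomes, can be illustrated with a minimal sketch. This is not IMG's implementation: the gene family names, the genome identifiers, the Jaccard similarity measure and the 0.8 threshold are all assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical presence/absence data: gene family -> set of genomes containing it.
profiles = {
    "famA": {"g1", "g2", "g3", "g5"},
    "famB": {"g1", "g2", "g3", "g5"},
    "famC": {"g2", "g4"},
    "famD": {"g1", "g3", "g4", "g5"},
}

def jaccard(a, b):
    """Jaccard similarity of two genome sets (1.0 means identical occurrence)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cooccurring_pairs(profiles, threshold=0.8):
    """Return family pairs whose occurrence profiles overlap at least `threshold`."""
    hits = []
    for (f1, p1), (f2, p2) in combinations(sorted(profiles.items()), 2):
        score = jaccard(p1, p2)
        if score >= threshold:
            hits.append((f1, f2, score))
    return hits

# Families that appear and disappear together are candidates for a shared function.
print(cooccurring_pairs(profiles))
```

    In this toy data only famA and famB share an identical profile, so they are the only pair reported. A real analysis would also need to account for phylogenetic relatedness among the genomes, which this sketch ignores.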

  15. Databases and information integration for the Medicago truncatula genome and transcriptome.

    PubMed

    Cannon, Steven B; Crow, John A; Heuer, Michael L; Wang, Xiaohong; Cannon, Ethalinda K S; Dwan, Christopher; Lamblin, Anne-Francoise; Vasdewani, Jayprakash; Mudge, Joann; Cook, Andrew; Gish, John; Cheung, Foo; Kenton, Steve; Kunau, Timothy M; Brown, Douglas; May, Gregory D; Kim, Dongjin; Cook, Douglas R; Roe, Bruce A; Town, Chris D; Young, Nevin D; Retzel, Ernest F

    2005-05-01

    An international consortium is sequencing the euchromatic genespace of Medicago truncatula. Extensive bioinformatic and database resources support the marker-anchored bacterial artificial chromosome (BAC) sequencing strategy. Existing physical and genetic maps and deep BAC-end sequencing help to guide the sequencing effort, while EST databases provide essential resources for genome annotation as well as transcriptome characterization and microarray design. Finished BAC sequences are joined into overlapping sequence assemblies and undergo an automated annotation process that integrates ab initio predictions with EST, protein, and other recognizable features. Because of the sequencing project's international and collaborative nature, data production, storage, and visualization tools are broadly distributed. This paper describes databases and Web resources for the project, which provide support for physical and genetic maps, genome sequence assembly, gene prediction, and integration of EST data. A central project Web site at medicago.org/genome provides access to genome viewers and other resources project-wide, including an Ensembl implementation at medicago.org, physical map and marker resources at mtgenome.ucdavis.edu, and genome viewers at the University of Oklahoma (www.genome.ou.edu), the Institute for Genomic Research (www.tigr.org), and Munich Information for Protein Sequences Center (mips.gsf.de).

  16. Gene context analysis in the Integrated Microbial Genomes (IMG) data management system

    SciTech Connect

    Mavromatis, Konstantinos; Chu, Ken; Ivanova, Natalia; Hooper, Sean D.; Markowitz, Victor M.; Kyrpides, Nikos C.

    2009-05-01

    Computational methods for determining the function of genes in newly sequenced genomes have traditionally been based on sequence similarity to genes whose function has been identified experimentally. Function prediction methods can be extended using gene context analysis approaches such as examining the conservation of chromosomal gene clusters, gene fusion events and co-occurrence profiles across genomes. Context analysis is based on the observation that functionally related genes often have similar gene context, and it relies on the identification of such events across a statistically significant and phylogenetically diverse collection of genomes. We have used the data management system of the Integrated Microbial Genomes (IMG) as the framework to implement and explore the power of gene context analysis methods because it provides one of the largest available genome integrations. Visualization and search tools to facilitate and explore gene context analysis have been developed and applied across all publicly available archaeal and bacterial genomes in IMG. These computations are now maintained as part of IMG's regular genome content update cycle. IMG is available at: http://img.jgi.doe.gov.

  17. VISPA: a computational pipeline for the identification and analysis of genomic vector integration sites.

    PubMed

    Calabria, Andrea; Leo, Simone; Benedicenti, Fabrizio; Cesana, Daniela; Spinozzi, Giulio; Orsini, Massimilano; Merella, Stefania; Stupka, Elia; Zanetti, Gianluigi; Montini, Eugenio

    2014-01-01

    The analysis of the genomic distribution of viral vector integration sites is a key step in hematopoietic stem cell-based gene therapy applications, allowing assessment of both the safety and the efficacy of the treatment and the study of basic aspects of hematopoiesis and stem cell biology. Identifying vector integration sites requires ad-hoc bioinformatics tools with stringent requirements in terms of computational efficiency, flexibility, and usability. We developed VISPA (Vector Integration Site Parallel Analysis), a pipeline for automated integration site identification and annotation based on a distributed environment with a simple Galaxy web interface. VISPA was successfully used for the bioinformatics analysis of the follow-up of two lentiviral vector-based hematopoietic stem-cell gene therapy clinical trials. Our pipeline provides a reliable and efficient tool to assess the safety and efficacy of integrating vectors in clinical settings.

  18. Exploration of Genomic, Proteomic, and Histopathological Image Data Integration Methods for Clinical Prediction

    PubMed Central

    Poruthoor, A.; Phan, J.H.; Kothari, S.; Wang, May D.

    2016-01-01

    The emergence of large multi-platform and multi-scale data repositories in biomedicine has enabled the exploration of data integration for holistic decision making. In this research, we investigate multi-modal genomic, proteomic, and histopathological image data integration for prediction of ovarian cancer clinical endpoints in The Cancer Genome Atlas (TCGA). Specifically, we study two data integration techniques, simple data concatenation and ensemble classification, to determine whether they can improve prediction of ovarian cancer grade or patient survival. Results indicate that integration via ensemble classification is more effective than simple data concatenation. We also highlight several key factors impacting data integration outcome such as predictability of endpoint, class prevalence, and unbalanced representation of features from different data modalities.
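
    The two integration strategies compared in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the nearest-centroid classifier, the feature values, the labels and all function names are hypothetical, and the toy data are constructed so that one unscaled modality dominates the concatenated vector.

```python
from collections import Counter

def centroid_fit(X, y):
    """Compute one mean feature vector per class label."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, v in enumerate(row):
            acc[i] += v
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def centroid_predict(model, row):
    """Assign the label of the closest centroid (squared Euclidean distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(sorted(model), key=lambda lab: dist(model[lab]))

def predict_concatenated(modalities, y, query):
    """Simple concatenation: join all modality features into one vector."""
    X = [sum(rows, []) for rows in zip(*modalities)]
    return centroid_predict(centroid_fit(X, y), sum(query, []))

def predict_ensemble(modalities, y, query):
    """Ensemble classification: one classifier per modality, then majority vote."""
    votes = [centroid_predict(centroid_fit(X, y), q)
             for X, q in zip(modalities, query)]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical toy data: four samples, three modalities, two grade labels.
genomic = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
proteomic = [[1.0], [1.2], [3.0], [3.2]]
image = [[100.0], [300.0], [120.0], [310.0]]  # large scale, weakly informative
labels = ["low", "low", "high", "high"]
modalities = [genomic, proteomic, image]
query = [[0.85, 0.85], [3.1], [110.0]]

print(predict_concatenated(modalities, labels, query))
print(predict_ensemble(modalities, labels, query))
```

    Here the image feature swamps the distance computed on the concatenated vector, while the per-modality vote is not affected by feature scale. The sketch only illustrates the kind of imbalance between modalities that the abstract mentions; it does not reproduce the study's results.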

  19. Comparison of 432 Pseudomonas strains through integration of genomic, functional, metabolic and expression data

    PubMed Central

    Koehorst, Jasper J.; van Dam, Jesse C. J.; van Heck, Ruben G. A.; Saccenti, Edoardo; dos Santos, Vitor A. P. Martins; Suarez-Diez, Maria; Schaap, Peter J.

    2016-01-01

    Pseudomonas is a highly versatile genus containing species that can be harmful to humans and plants, while others are widely used for bioengineering and bioremediation. We analysed 432 sequenced Pseudomonas strains by integrating results from a large scale functional comparison using protein domains with data from six metabolic models, nearly a thousand transcriptome measurements and four large scale transposon mutagenesis experiments. Through heterogeneous data integration we linked gene essentiality, persistence and expression variability. The pan-genome of Pseudomonas is closed, indicating a limited role of horizontal gene transfer in the evolutionary history of this genus. A large fraction of essential genes are highly persistent; still, non-essential genes represent a considerable fraction of the core-genome. Our results emphasize the power of integrating large scale comparative functional genomics with heterogeneous data for exploring bacterial diversity and versatility. PMID:27922098
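
    The relationship between persistence, the core-genome and essentiality described above can be made concrete with a small sketch. The definitions here are assumptions, since the abstract does not give them: persistence is taken as the fraction of strains carrying a gene family, and the core-genome as the families present in every strain. Strain names, family names and the essential set are hypothetical.

```python
def persistence(presence):
    """Fraction of strains in which each gene family occurs."""
    n = len(presence)
    families = set().union(*presence.values())
    return {f: sum(f in genes for genes in presence.values()) / n
            for f in sorted(families)}

def core_genome(presence, min_fraction=1.0):
    """Families whose persistence reaches `min_fraction` (1.0 = all strains)."""
    return {f for f, p in persistence(presence).items() if p >= min_fraction}

# Hypothetical gene family content of four strains.
strains = {
    "s1": {"fam1", "fam2", "fam3"},
    "s2": {"fam1", "fam2"},
    "s3": {"fam1", "fam2", "fam3", "fam4"},
    "s4": {"fam1", "fam2", "fam4"},
}
essential = {"fam1"}  # hypothetical result of a mutagenesis screen

core = core_genome(strains)
print(sorted(core))              # families found in every strain
print(sorted(core - essential))  # core families that are not essential
```

    In this toy example fam2 is in the core-genome without being essential, which mirrors the kind of observation the abstract reports for the real data.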

  20. RNA Interference Is Responsible for Reduction of Transgene Expression after Sleeping Beauty Transposase Mediated Somatic Integration

    PubMed Central

    Rauschhuber, Christina; Ehrhardt, Anja

    2012-01-01

    Background: Integrating non-viral vectors based on transposable elements are widely used for genetically engineering mammalian cells in functional genomics and therapeutic gene transfer. For the Sleeping Beauty (SB) transposase system, it was demonstrated that convergent transcription driven by the SB transposase inverted repeats (IRs) occurs in eukaryotic cells after somatic integration. This could lead to the formation of double-stranded RNAs, potentially presenting targets for the RNA interference (RNAi) machinery and subsequently resulting in silencing of the transgene. Therefore, we aimed to investigate transgene expression upon transposition under RNAi knockdown conditions. Principal Findings: To establish RNAi knockdown cell lines, we took advantage of the P19 protein, which is derived from the tomato bushy stunt virus. P19 binds and inhibits 21-nucleotide-long small interfering RNAs and was shown to sufficiently suppress RNAi. We found that transgene expression upon SB-mediated transposition was enhanced, resulting in a 3.2-fold increase in the number of colony forming units (CFU) after transposition. In contrast, when the transgene cassette was insulated from the influence of chromosomal position effects by the chicken-derived cHS4 insulating sequences, or when the Frog Prince transposon system, which displays only negligible transcriptional activity, was applied, similar numbers of CFUs were obtained. Conclusion: In summary, we provide evidence for the first time that, after somatic integration, transposon-derived transgene expression is regulated by the endogenous RNAi machinery. In the future, this finding will help to further improve the molecular design of the SB transposase vector system. PMID:22570690

  1. Munich Information Center for Protein Sequences Plant Genome Resources. A Framework for Integrative and Comparative Analyses

    PubMed Central

    Schoof, Heiko; Spannagl, Manuel; Yang, Li; Ernst, Rebecca; Gundlach, Heidrun; Haase, Dirk; Haberer, Georg; Mayer, Klaus F.X.

    2005-01-01

    With several plant genomes sequenced, the power of comparative genome analysis can now be applied. However, genome-scale cross-species analyses are limited by the effort for data integration. To develop an integrated cross-species plant genome resource, we maintain comprehensive databases for model plant genomes, including Arabidopsis (Arabidopsis thaliana), maize (Zea mays), Medicago truncatula, and rice (Oryza sativa). Integration of data and resources is emphasized, both in house as well as with external partners and databases. Manual curation and state-of-the-art bioinformatic analysis are combined to achieve quality data. Easy access to the data is provided through Web interfaces and visualization tools, bulk downloads, and Web services for application-level access. This allows a consistent view of the model plant genomes for comparative and evolutionary studies, the transfer of knowledge between species, and the integration with functional genomics data. PMID:16010004

  2. From integrative genomics to systems genetics in the rat to link genotypes to phenotypes

    PubMed Central

    Moreno-Moral, Aida

    2016-01-01

    Complementary to traditional gene mapping approaches used to identify the hereditary components of complex diseases, integrative genomics and systems genetics have emerged as powerful strategies to decipher the key genetic drivers of molecular pathways that underlie disease. Broadly speaking, integrative genomics aims to link cellular-level traits (such as mRNA expression) to the genome to identify their genetic determinants. With the characterization of several cellular-level traits within the same system, the integrative genomics approach evolved into a more comprehensive study design, called systems genetics, which aims to unravel the complex biological networks and pathways involved in disease, and in turn map their genetic control points. The first fully integrated systems genetics study was carried out in rats, and the results, which revealed conserved trans-acting genetic regulation of a pro-inflammatory network relevant to type 1 diabetes, were translated to humans. Many studies using different organisms subsequently stemmed from this example. The aim of this Review is to describe the most recent advances in the fields of integrative genomics and systems genetics applied in the rat, with a focus on studies of complex diseases ranging from inflammatory to cardiometabolic disorders. We aim to provide the genetics community with a comprehensive insight into how the systems genetics approach came to life, starting from the first integrative genomics strategies [such as expression quantitative trait loci (eQTLs) mapping] and concluding with the most sophisticated gene network-based analyses in multiple systems and disease states. Although not limited to studies that have been directly translated to humans, we will focus particularly on the successful investigations in the rat that have led to primary discoveries of genes and pathways relevant to human disease. PMID:27736746

  3. High-throughput genomic mapping of vector integration sites in gene therapy studies.

    PubMed

    Beard, Brian C; Adair, Jennifer E; Trobridge, Grant D; Kiem, Hans-Peter

    2014-01-01

    Gene therapy has enormous potential to treat a variety of infectious and genetic diseases. To date hundreds of patients worldwide have received hematopoietic cell products that have been gene-modified with retrovirus vectors carrying therapeutic transgenes, and many patients have been cured or demonstrated disease stabilization as a result (Adair et al., Sci Transl Med 4:133ra57, 2012; Biffi et al., Science 341:1233158, 2013; Aiuti et al., Science 341:1233151, 2013; Fischer et al., Gene 525:170-173, 2013). Unfortunately, for some patients the provirus integration dysregulated the expression of nearby genes leading to clonal outgrowth and, in some cases, cancer. Thus, the unwanted side effect of insertional mutagenesis has become a major concern for retrovirus gene therapy. The careful study of retrovirus integration sites (RIS) and the contribution of individual gene-modified clones to hematopoietic repopulating cells is of crucial importance for all gene therapy studies. Supporting this, the US Food and Drug Administration (FDA) has mandated the careful monitoring of RIS in all clinical trials of gene therapy. An invaluable method was developed: linear amplification mediated-polymerase chain reaction (LAM-PCR) capable of analyzing in vitro and complex in vivo samples, capturing valuable genomic information directly flanking the site of provirus integration. Linking this method and similar methods to high-throughput sequencing has now made possible an unprecedented understanding of the integration profile of various retrovirus vectors, and allows for sensitive monitoring of their safety. It also allows for a detailed comparison of improved safety-enhanced gene therapy vectors. An important readout of safety is the relative contribution of individual gene-modified repopulating clones. One limitation of LAM-PCR is that the ability to capture the relative contribution of individual clones is compromised because of the initial linear PCR common to all current methods

  4. NDRG1 links p53 with proliferation-mediated centrosome homeostasis and genome stability.

    PubMed

    Croessmann, Sarah; Wong, Hong Yuen; Zabransky, Daniel J; Chu, David; Mendonca, Janet; Sharma, Anup; Mohseni, Morassa; Rosen, D Marc; Scharpf, Robert B; Cidado, Justin; Cochran, Rory L; Parsons, Heather A; Dalton, W Brian; Erlanger, Bracha; Button, Berry; Cravero, Karen; Kyker-Snowman, Kelly; Beaver, Julia A; Kachhap, Sushant; Hurley, Paula J; Lauring, Josh; Park, Ben Ho

    2015-09-15

    The tumor protein 53 (TP53) tumor suppressor gene is the most frequently somatically altered gene in human cancers. Here we show expression of N-Myc down-regulated gene 1 (NDRG1) is induced by p53 during physiologic low proliferative states, and mediates centrosome homeostasis, thus maintaining genome stability. When placed in physiologic low-proliferating conditions, human TP53 null cells fail to increase expression of NDRG1 compared with isogenic wild-type controls and TP53 R248W knockin cells. Overexpression and RNA interference studies demonstrate that NDRG1 regulates centrosome number and amplification. Mechanistically, NDRG1 physically associates with γ-tubulin, a key component of the centrosome, with reduced association in p53 null cells. Strikingly, TP53 homozygous loss was mutually exclusive of NDRG1 overexpression in over 96% of human cancers, supporting the broad applicability of these results. Our study elucidates a mechanism of how TP53 loss leads to abnormal centrosome numbers and genomic instability mediated by NDRG1.

  5. RecQ Helicases: Conserved Guardians of Genomic Integrity.

    PubMed

    Larsen, Nicolai Balle; Hickson, Ian D

    2013-01-01

    The RecQ family of DNA helicases is highly conserved throughout evolution, and is important for the maintenance of genome stability. In humans, five RecQ family members have been identified: BLM, WRN, RECQ4, RECQ1 and RECQ5. Defects in three of these give rise to Bloom's syndrome (BLM), Werner's syndrome (WRN) and Rothmund-Thomson/RAPADILINO/Baller-Gerold (RECQ4) syndromes. These syndromes are characterised by cancer predisposition and/or premature ageing. In this review, we focus on the roles of BLM and its S. cerevisiae homologue, Sgs1, in genome maintenance. BLM/Sgs1 has been shown to play a critical role in homologous recombination at multiple steps, including end-resection, displacement loop formation, branch migration and double Holliday junction dissolution. In addition, recent evidence has revealed a role for BLM/Sgs1 in the stabilisation and repair of replication forks damaged during a perturbed S-phase. Finally, BLM also plays a role in the suppression and/or resolution of ultra-fine anaphase DNA bridges that form between sister-chromatids during mitosis.

  6. Integrative genome-wide analysis reveals a robust genomic glioblastoma signature associated with copy number driving changes in gene expression.

    PubMed

    de Tayrac, Marie; Etcheverry, Amandine; Aubry, Marc; Saïkali, Stephan; Hamlat, Abderrahmane; Quillien, Veronique; Le Treut, André; Galibert, Marie-Dominique; Mosser, Jean

    2009-01-01

    Glioblastoma multiforme shows multiple chromosomal aberrations, the impact of which on gene expression remains unclear. To investigate this relationship and to identify putative initiating genomic events, we integrated a paired copy number and gene expression survey in glioblastoma using whole human genome arrays. Loci of recurrent copy number alterations were combined with gene expression profiles obtained on the same tumor samples. We identified a set of 406 "cis-acting DNA targeted genes" corresponding to genomic aberrations with direct copy-number-driving changes in gene expression, defined as genes with either significantly concordant or correlated changes in DNA copy number and expression. Functional annotation revealed that these genes participate in key processes of cancer cell biology, providing insights into the genetic mechanisms driving glioblastoma. The robustness of the gene selection was validated on an external microarray data set including 81 glioblastomas and 23 non-neoplastic brain samples. The integration of array CGH and gene expression data highlights a robust cis-acting DNA targeted genes signature that may be critical for glioblastoma progression, with two tumor suppressor genes PCDH9 and STARD13 that could be involved in tumor invasiveness and resistance to etoposide.
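
    The idea of selecting genes whose expression tracks their DNA copy number across tumor samples can be sketched as follows. This is a minimal illustration that uses the Pearson correlation coefficient with an arbitrary cut-off; the helper names and the 0.7 threshold are assumptions, and the abstract does not specify the statistical test or significance criteria the authors applied.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def is_copy_number_driven(copy_number, expression, threshold=0.7):
    """Flag a gene whose expression correlates with copy number across samples.

    The threshold is a hypothetical cut-off chosen for illustration only.
    """
    return pearson(copy_number, expression) >= threshold
```

    A gene would be passed in as two per-sample lists measured on the same tumors, for example `is_copy_number_driven(copy_number_values, expression_values)`.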

  7. Fluorescent reporters for markerless genomic integration in Staphylococcus aureus

    PubMed Central

    de Jong, Nienke W. M.; van der Horst, Thijs; van Strijp, Jos A. G.; Nijland, Reindert

    2017-01-01

    We present integration vectors for Staphylococcus aureus encoding the fluorescent reporters mAmetrine, CFP, sGFP, YFP, mCherry and mKate. The expression is driven either from the sarA-P1 promoter or from any other promoter of choice. The reporter can be inserted markerless in the chromosome of a wide range of S. aureus strains. The integration site chosen does not disrupt any open reading frame, provides good expression, and has no detectable effect on the strain's physiology. As an intermediate construct, we present a set of replicating plasmids containing the same fluorescent reporters. In these reporter plasmids, too, the sarA-P1 promoter can be replaced by any other promoter of interest for expression studies. Cassettes from the replicating plasmids can be readily swapped with the integration vector. With these constructs it becomes possible to monitor reporters of separate fluorescent wavelengths simultaneously. PMID:28266573

  8. Site-specific in situ amplification of the integrated polyomavirus genome: a case for a context-specific over-replication model of gene amplification.

    PubMed

    Syu, L J; Fluck, M M

    1997-08-08

    The fate of the genome of the polyoma (Py) tumor virus following integration in the chromosomes of transformed rat FR3T3 cells was re-examined. The viral sequences were integrated at a single transformant-specific chromosomal site in each of 22 transformants tested. In situ amplification of the viral sequences was observed in 24 of 34 transformants analyzed. Large T antigen, the unique viral function involved in initiating DNA replication from the viral origin, was essential for the amplification process. There was an absolute requirement for a reiteration of viral sequences and the extent of the reiteration affected the degree of amplification. The reiteration may be important for homologous recombination-mediated resolution of in situ amplified sequences. Among 11 transformants harboring a 1 to 2 kb repeat, the degree of amplification was transformant-specific and varied over a wide range. At the high end of the spectrum, the genome copy number increased 1300-fold at steady state, while at the low end, amplification was below twofold. Some aspect of the host chromatin at the site of integration that affected viral gene expression also directly or indirectly modulated the amplification. Use of high-resolution electrophoresis for the analysis of the integrated amplified sequences revealed a recurring novel pattern, consisting of a ladder with numerous bands separated by a constant distance approximately the size of the Py genome. We suggest that this pattern was generated by conversion of the amplified viral genomes to head-to-tail linear arrays with cell-to-cell variations in the number of genome repeats at single, transformant-specific, chromosomal sites. In light of the known "out of schedule" firing of the Py origin, we propose an "onion skin" structure intermediate and present a homologous recombination model for the conversion from onion skins to linear arrays. The relevance of the in situ amplification of the Py genome to cellular gene amplification is

  9. VESPA: software to facilitate genomic annotation of prokaryotic organisms through integration of proteomic and transcriptomic data

    PubMed Central

    2012-01-01

    Background: The procedural aspects of genome sequencing and assembly have become relatively inexpensive, yet the full, accurate structural annotation of these genomes remains a challenge. Next-generation sequencing transcriptomics (RNA-Seq), global microarrays, and tandem mass spectrometry (MS/MS)-based proteomics have demonstrated immense value to genome curators as individual sources of information; however, integrating these data types to validate and improve structural annotation remains a major challenge. Current visual and statistical analytic tools are focused on a single data type, or existing software tools are retrofitted to analyze new data forms. We present Visual Exploration and Statistics to Promote Annotation (VESPA), a new interactive visual analysis software tool focused on assisting scientists with the annotation of prokaryotic genomes through the integration of proteomics and transcriptomics data with current genome location coordinates. Results: VESPA is a desktop Java™ application that integrates high-throughput proteomics data (peptide-centric) and transcriptomics (probe or RNA-Seq) data into a genomic context, all of which can be visualized at three levels of genomic resolution. Data is interrogated via searches linked to the genome visualizations to find regions with high likelihood of mis-annotation. Search results are linked to exports for further validation outside of VESPA, or potential coding regions can be analyzed concurrently with the software through interaction with BLAST. VESPA is demonstrated on two use cases (Yersinia pestis Pestoides F and Synechococcus sp. PCC 7002) to show the rapid manner in which mis-annotations can be found and explored in VESPA using either proteomics data alone, or in combination with transcriptomic data. Conclusions: VESPA is an interactive visual analytics tool that integrates high-throughput data into a genomic context to facilitate the discovery of structural mis-annotations in prokaryotic

  10. Integrated genome-wide chromatin occupancy and expression analyses identify key myeloid pro-differentiation transcription factors repressed by Myb.

    PubMed

    Zhao, Liang; Glazov, Evgeny A; Pattabiraman, Diwakar R; Al-Owaidi, Faisal; Zhang, Ping; Brown, Matthew A; Leo, Paul J; Gonda, Thomas J

    2011-06-01

    To gain insight into the mechanisms by which the Myb transcription factor controls normal hematopoiesis and, particularly, how it contributes to leukemogenesis, we mapped the genome-wide occupancy of Myb by chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-Seq) in ERMYB myeloid progenitor cells. By integrating the genome occupancy data with whole genome expression profiling data, we identified a Myb-regulated transcriptional program. Gene signatures for leukemia stem cells, normal hematopoietic stem/progenitor cells and myeloid development were overrepresented in 2368 Myb-regulated genes. Of these, Myb bound directly near or within 793 genes. Myb directly activates some genes known to be critical in maintaining hematopoietic stem cells, such as Gfi1 and Cited2. Importantly, we also show that, despite being usually considered as a transactivator, Myb also functions to repress approximately half of its direct targets, including several key regulators of myeloid differentiation, such as Sfpi1 (also known as Pu.1), Runx1, Junb and Cebpb. Furthermore, our results demonstrate that interaction with p300, an established coactivator for Myb, is unexpectedly required for Myb-mediated transcriptional repression. We propose that the repression of the above-mentioned key pro-differentiation factors may contribute essentially to Myb's ability to suppress differentiation and promote self-renewal, thus maintaining progenitor cells in an undifferentiated state and promoting leukemic transformation.

  11. Retrovirus Integration Database (RID): a public database for retroviral insertion sites into host genomes.

    PubMed

    Shao, Wei; Shan, Jigui; Kearney, Mary F; Wu, Xiaolin; Maldarelli, Frank; Mellors, John W; Luke, Brian; Coffin, John M; Hughes, Stephen H

    2016-07-04

    The NCI Retrovirus Integration Database is a MySQL-based relational database created for storing and retrieving comprehensive information about retroviral integration sites, primarily, but not exclusively, HIV-1. The database is accessible to the public for submission or extraction of data originating from experiments aimed at collecting information related to retroviral integration sites including: the site of integration into the host genome, the virus family and subtype, the origin of the sample, gene exons/introns associated with integration, and proviral orientation. Information about the references from which the data were collected is also stored in the database. Tools are built into the website that can be used to map the integration sites to the UCSC genome browser, to plot the integration site patterns on a chromosome, and to display provirus LTRs in their inserted genome sequence. The website is robust, user friendly, and allows users to query the database and analyze the data dynamically. Available at https://rid.ncifcrf.gov or http://home.ncifcrf.gov/hivdrp/resources.htm

  12. Homologous recombination maintenance of genome integrity during DNA damage tolerance

    PubMed Central

    Prado, Félix

    2014-01-01

    The DNA strand exchange protein Rad51 provides a safe mechanism for the repair of DNA breaks using the information of a homologous DNA template. Homologous recombination (HR) also plays a key role in the response to DNA damage that impairs the advance of the replication forks by providing mechanisms to circumvent the lesion and fill in the tracks of single-stranded DNA that are generated during the process of lesion bypass. These activities postpone repair of the blocking lesion to ensure that DNA replication is completed in a timely manner. Experimental evidence generated over the last few years indicates that HR participates in this DNA damage tolerance response together with additional error-free (template switch) and error-prone (translesion synthesis) mechanisms through intricate connections, which are presented here. The choice between repair and tolerance, and the mechanism of tolerance, is critical to avoid increased mutagenesis and/or genome rearrangements, which are both hallmarks of cancer. PMID:27308329

  13. CASFISH: CRISPR/Cas9-mediated in situ labeling of genomic loci in fixed cells.

    PubMed

    Deng, Wulan; Shi, Xinghua; Tjian, Robert; Lionnet, Timothée; Singer, Robert H

    2015-09-22

    Direct visualization of genomic loci in the 3D nucleus is important for understanding the spatial organization of the genome and its association with gene expression. Various DNA FISH methods have been developed in the past decades, all involving denaturing dsDNA and hybridizing fluorescent nucleic acid probes. Here we report a novel approach that uses in vitro constituted nuclease-deficient clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) complexes as probes to label sequence-specific genomic loci fluorescently without global DNA denaturation (Cas9-mediated fluorescence in situ hybridization, CASFISH). Using fluorescently labeled nuclease-deficient Cas9 (dCas9) protein assembled with various single-guide RNA (sgRNA), we demonstrated rapid and robust labeling of repetitive DNA elements in pericentromere, centromere, G-rich telomere, and coding gene loci. Assembling dCas9 with an array of sgRNAs tiling arbitrary target loci, we were able to visualize nonrepetitive genomic sequences. The dCas9/sgRNA binary complex is stable and binds its target DNA with high affinity, allowing sequential or simultaneous probing of multiple targets. CASFISH assays using differently colored dCas9/sgRNA complexes allow multicolor labeling of target loci in cells. In addition, the CASFISH assay is remarkably rapid under optimal conditions and is applicable for detection in primary tissue sections. This rapid, robust, less disruptive, and cost-effective technology adds a valuable tool for basic research and genetic diagnosis.

  14. CASFISH: CRISPR/Cas9-mediated in situ labeling of genomic loci in fixed cells

    PubMed Central

    Deng, Wulan; Shi, Xinghua; Tjian, Robert; Lionnet, Timothée; Singer, Robert H.

    2015-01-01

    Direct visualization of genomic loci in the 3D nucleus is important for understanding the spatial organization of the genome and its association with gene expression. Various DNA FISH methods have been developed in the past decades, all involving denaturing dsDNA and hybridizing fluorescent nucleic acid probes. Here we report a novel approach that uses in vitro constituted nuclease-deficient clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) complexes as probes to label sequence-specific genomic loci fluorescently without global DNA denaturation (Cas9-mediated fluorescence in situ hybridization, CASFISH). Using fluorescently labeled nuclease-deficient Cas9 (dCas9) protein assembled with various single-guide RNA (sgRNA), we demonstrated rapid and robust labeling of repetitive DNA elements in pericentromere, centromere, G-rich telomere, and coding gene loci. Assembling dCas9 with an array of sgRNAs tiling arbitrary target loci, we were able to visualize nonrepetitive genomic sequences. The dCas9/sgRNA binary complex is stable and binds its target DNA with high affinity, allowing sequential or simultaneous probing of multiple targets. CASFISH assays using differently colored dCas9/sgRNA complexes allow multicolor labeling of target loci in cells. In addition, the CASFISH assay is remarkably rapid under optimal conditions and is applicable for detection in primary tissue sections. This rapid, robust, less disruptive, and cost-effective technology adds a valuable tool for basic research and genetic diagnosis. PMID:26324940

  15. A second generation integrated map of the rainbow trout (Oncorhynchus mykiss) genome: analysis of conserved synteny with model fish genomes.

    PubMed

    Palti, Yniv; Genet, Carine; Gao, Guangtu; Hu, Yuqin; You, Frank M; Boussaha, Mekki; Rexroad, Caird E; Luo, Ming-Cheng

    2012-06-01

    DNA fingerprints and end sequences from bacterial artificial chromosomes (BACs) from two new libraries were generated to improve the first generation integrated physical and genetic map of the rainbow trout (Oncorhynchus mykiss) genome. The current version of the physical map is composed of 167,989 clones of which 158,670 are assembled into contigs and 9,319 are singletons. The number of contigs was reduced from 4,173 to 3,220. End sequencing of clones from the new libraries generated a total of 11,958 high quality sequence reads. The end sequences were used to develop 238 new microsatellites of which 42 were added to the genetic map. Conserved synteny between the rainbow trout genome and model fish genomes was analyzed using 188,443 BAC end sequence (BES) reads. The fractions of BES reads with significant BLASTN hits against the zebrafish, medaka, and stickleback genomes were 8.8%, 9.7%, and 10.5%, respectively, while the fractions of significant BLASTX hits against the zebrafish, medaka, and stickleback protein databases were 6.2%, 5.8%, and 5.5%, respectively. The overall number of unique regions of conserved synteny identified through grouping of the rainbow trout BES into fingerprinting contigs was 2,259, 2,229, and 2,203 for stickleback, medaka, and zebrafish, respectively. These numbers are approximately three to five times greater than those we have previously identified using BAC paired ends. Clustering of the conserved synteny analysis results by linkage groups as derived from the integrated physical and genetic map revealed that despite the low sequence homology, large blocks of macrosynteny are conserved between chromosome arms of rainbow trout and the model fish species.

  16. Plant Genome DataBase Japan (PGDBj): A Portal Website for the Integration of Plant Genome-Related Databases

    PubMed Central

    Asamizu, Erika; Ichihara, Hisako; Nakaya, Akihiro; Nakamura, Yasukazu; Hirakawa, Hideki; Ishii, Takahiro; Tamura, Takuro; Fukami-Kobayashi, Kaoru; Nakajima, Yukari; Tabata, Satoshi

    2014-01-01

    The Plant Genome DataBase Japan (PGDBj, http://pgdbj.jp/?ln=en) is a portal website that aims to integrate plant genome-related information from databases (DBs) and the literature. The PGDBj is comprised of three component DBs and a cross-search engine, which provides a seamless search over the contents of the DBs. The three DBs are as follows. (i) The Ortholog DB, providing gene cluster information based on the amino acid sequence similarity. Over 500,000 amino acid sequences of 20 Viridiplantae species were subjected to reciprocal BLAST searches and clustered. Sequences from plant genome DBs (e.g. TAIR10 and RAP-DB) were also included in the cluster with a direct link to the original DB. (ii) The Plant Resource DB, integrating the SABRE DB, which provides cDNA and genome sequence resources accumulated and maintained in the RIKEN BioResource Center and National BioResource Projects. (iii) The DNA Marker DB, providing manually or automatically curated information of DNA markers, quantitative trait loci and related linkage maps, from the literature and external DBs. As the PGDBj targets various plant species, including model plants, algae, and crops important as food, fodder and biofuel, researchers in the field of basic biology as well as a wide range of agronomic fields are encouraged to perform searches using DNA sequences, gene names, traits and phenotypes of interest. The PGDBj will return the search results from the component DBs and various types of linked external DBs. PMID:24363285

  17. Genome-wide RNAi screen reveals ALK1 mediates LDL uptake and transcytosis in endothelial cells

    PubMed Central

    Kraehling, Jan R.; Chidlow, John H.; Rajagopal, Chitra; Sugiyama, Michael G.; Fowler, Joseph W.; Lee, Monica Y.; Zhang, Xinbo; Ramírez, Cristina M.; Park, Eon Joo; Tao, Bo; Chen, Keyang; Kuruvilla, Leena; Larrivée, Bruno; Folta-Stogniew, Ewa; Ola, Roxana; Rotllan, Noemi; Zhou, Wenping; Nagle, Michael W.; Herz, Joachim; Williams, Kevin Jon; Eichmann, Anne; Lee, Warren L.; Fernández-Hernando, Carlos; Sessa, William C.

    2016-01-01

    In humans and animals lacking functional LDL receptor (LDLR), LDL from plasma still readily traverses the endothelium. To identify the pathways of LDL uptake, a genome-wide RNAi screen was performed in endothelial cells and cross-referenced with GWAS-data sets. Here we show that the activin-like kinase 1 (ALK1) mediates LDL uptake into endothelial cells. ALK1 binds LDL with lower affinity than LDLR and saturates only at hypercholesterolemic concentrations. ALK1 mediates uptake of LDL into endothelial cells via an unusual endocytic pathway that diverts the ligand from lysosomal degradation and promotes LDL transcytosis. The endothelium-specific genetic ablation of Alk1 in Ldlr-KO animals leads to less LDL uptake into the aortic endothelium, showing its physiological role in endothelial lipoprotein metabolism. In summary, identification of pathways mediating LDLR-independent uptake of LDL may provide unique opportunities to block the initiation of LDL accumulation in the vessel wall or augment hepatic LDLR-dependent clearance of LDL. PMID:27869117

  18. Prolonged Integration Site Selection of a Lentiviral Vector in the Genome of Human Keratinocytes

    PubMed Central

    Qian, Wei; Wang, Yong; Li, Rui-fu; Zhou, Xin; Liu, Jing; Peng, Dai-zhi

    2017-01-01

    Background: Lentiviral vectors have been successfully used for human skin cell gene transfer studies. Defining the selection of integration sites for retroviral vectors in the host genome is crucial in risk assessment analysis of gene therapy. However, the genome-wide distribution of lentiviral integration sites in human keratinocytes, especially after prolonged growth, is poorly understood. Material/Methods: In this study, 874 unique lentiviral vector integration sites in human HaCaT keratinocytes after long-term culture were identified and analyzed with the online tool GTSG-QuickMap and SPSS software. Results: The data indicated that lentiviral vectors showed integration site preferences for genes and gene-rich regions. Conclusions: This study will likely assist in determining the relative risks of the lentiviral vector system and in the design of a safe lentiviral vector system in the gene therapy of skin diseases. PMID:28255155

  19. Processing speed impairment in schizophrenia is mediated by white matter integrity

    PubMed Central

    Karbasforoushan, Haleh; Duffy, Brittney; Blackford, Jennifer Urbano; Woodward, Neil D.

    2017-01-01

    Background: Processing speed predicts functional outcome and is a potential endophenotype for schizophrenia. Establishing the neural basis of processing speed impairment may inform the treatment and etiology of schizophrenia. Neuroimaging investigations in healthy subjects have linked processing speed to brain anatomical connectivity. However, the relationship between processing speed impairment and white matter integrity in schizophrenia is unclear. Methods: Individuals with schizophrenia and healthy subjects underwent diffusion tensor imaging (DTI) and completed a brief neuropsychological assessment that included measures of processing speed, verbal learning, working memory, and executive functioning. Group differences in white matter integrity, inferred from fractional anisotropy (FA), were examined throughout the brain and the hypothesis that processing speed impairment in schizophrenia is mediated by diminished white matter integrity was tested. Results: White matter integrity of the corpus callosum, cingulum, superior and inferior frontal gyri, and precuneus was reduced in schizophrenia. Average FA in these regions mediated group differences in processing speed, but not other cognitive domains. Diminished white matter integrity in schizophrenia was accounted for, in large part, by individual differences in processing speed. Conclusions: Cognitive impairment in schizophrenia was mediated by reduced white matter integrity. This relationship was strongest for processing speed, as deficits in working memory, verbal learning, and executive functioning were not mediated by white matter integrity. Larger sample sizes may be required to detect more subtle mediation effects in these domains. Interventions that preserve white matter integrity or ameliorate white matter disruption may enhance processing speed and functional outcome in schizophrenia. PMID:25066842
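
    The mediation test described in this abstract (group, then FA, then processing speed) can be illustrated with a product-of-coefficients calculation. The following is only a minimal sketch on synthetic numbers, not the authors' analysis: the function name mediation, the variable names, and every simulated value are assumptions made for illustration, and significance testing (for example by bootstrap) is left out.

```python
import random


def _cov(u, v):
    """Population covariance of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


def mediation(x, m, y):
    """Product-of-coefficients mediation for X -> M -> Y via least squares."""
    vx, vm = _cov(x, x), _cov(m, m)
    cxm, cxy, cmy = _cov(x, m), _cov(x, y), _cov(m, y)
    a = cxm / vx                             # effect of X on the mediator M
    total = cxy / vx                         # total effect of X on Y
    den = vx * vm - cxm ** 2
    b = (cmy * vx - cxy * cxm) / den         # effect of M on Y, holding X fixed
    direct = (cxy * vm - cmy * cxm) / den    # effect of X on Y, holding M fixed
    return {"a": a, "b": b, "indirect": a * b, "direct": direct, "total": total}


# Synthetic illustration only (all values are made up):
# group 0 = control, 1 = patient; FA is lower in patients; speed depends on FA.
rng = random.Random(0)
group = [0] * 100 + [1] * 100
fa = [0.50 - 0.05 * g + rng.gauss(0, 0.02) for g in group]
speed = [100 * f + rng.gauss(0, 1.0) for f in fa]

result = mediation(group, fa, speed)
print(result)
```

    In this toy setup the group difference in speed is carried almost entirely by the indirect path (a times b), which is the pattern the abstract reports for processing speed; the total effect always equals the direct effect plus the indirect effect.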

  20. Processing speed impairment in schizophrenia is mediated by white matter integrity.

    PubMed

    Karbasforoushan, H; Duffy, B; Blackford, J U; Woodward, N D

    2015-01-01

    Processing speed predicts functional outcome and is a potential endophenotype for schizophrenia. Establishing the neural basis of processing speed impairment may inform the treatment and etiology of schizophrenia. Neuroimaging investigations in healthy subjects have linked processing speed to brain anatomical connectivity. However, the relationship between processing speed impairment and white matter (WM) integrity in schizophrenia is unclear. Individuals with schizophrenia and healthy subjects underwent diffusion tensor imaging (DTI) and completed a brief neuropsychological assessment that included measures of processing speed, verbal learning, working memory and executive functioning. Group differences in WM integrity, inferred from fractional anisotropy (FA), were examined throughout the brain and the hypothesis that processing speed impairment in schizophrenia is mediated by diminished WM integrity was tested. WM integrity of the corpus callosum, cingulum, superior and inferior frontal gyri, and precuneus was reduced in schizophrenia. Average FA in these regions mediated group differences in processing speed but not in other cognitive domains. Diminished WM integrity in schizophrenia was accounted for, in large part, by individual differences in processing speed. Cognitive impairment in schizophrenia was mediated by reduced WM integrity. This relationship was strongest for processing speed because deficits in working memory, verbal learning and executive functioning were not mediated by WM integrity. Larger sample sizes may be required to detect more subtle mediation effects in these domains. Interventions that preserve WM integrity or ameliorate WM disruption may enhance processing speed and functional outcome in schizophrenia.

  1. p53 isoform Δ133p53 promotes efficiency of induced pluripotent stem cells and ensures genomic integrity during reprogramming

    PubMed Central

    Gong, Lu; Pan, Xiao; Chen, Haide; Rao, Lingjun; Zeng, Yelin; Hang, Honghui; Peng, Jinrong; Xiao, Lei; Chen, Jun

    2016-01-01

    Human induced pluripotent stem (iPS) cells have great potential in regenerative medicine, but this depends on the integrity of their genomes. iPS cells have been found to contain a large number of de novo genetic alterations due to DNA damage response during reprogramming. Thus, maintaining the genetic stability of iPS cells is an important goal in iPS cell technology. DNA damage response can trigger tumor suppressor p53 activation, which ensures genome integrity of reprogramming cells by inducing apoptosis and senescence. p53 isoform Δ133p53 is a p53 target gene and functions not only to antagonize p53-mediated apoptosis, but also to promote DNA double-strand break (DSB) repair. Here we report that Δ133p53 is induced in reprogramming. Knockdown of Δ133p53 results in a 2-fold decrease in reprogramming efficiency and a 4-fold increase in chromosomal aberrations, whereas overexpression of Δ133p53 with 4 Yamanaka factors shows a 4-fold increase in reprogramming efficiency and a 2-fold decrease in chromosomal aberrations, compared to those in iPS cells induced only with 4 Yamanaka factors. Overexpression of Δ133p53 can inhibit cell apoptosis and promote DNA DSB repair foci formation during reprogramming. Our finding demonstrates that the overexpression of Δ133p53 not only enhances reprogramming efficiency, but also results in better genetic quality in iPS cells. PMID:27874035

  2. Genomic integration of the full-length dystrophin coding sequence in Duchenne muscular dystrophy induced pluripotent stem cells.

    PubMed

    Farruggio, Alfonso P; Bhakta, Mital S; du Bois, Haley; Ma, Julia; P Calos, Michele

    2017-04-01

    Plasmid vectors that express the full-length human dystrophin coding sequence in human cells were developed. Dystrophin, the protein mutated in Duchenne muscular dystrophy, is extraordinarily large, providing challenges for cloning and plasmid production in Escherichia coli. The authors expressed dystrophin from the strong, widely expressed CAG promoter, along with co-transcribed luciferase and mCherry marker genes useful for tracking plasmid expression. Introns were added at the 3' and 5' ends of the dystrophin sequence to prevent translation in E. coli, resulting in improved plasmid yield. Stability and yield were further improved by employing a lower-copy number plasmid origin of replication. The dystrophin plasmids also carried an attB site recognized by phage phiC31 integrase, enabling the plasmids to be integrated into the human genome at preferred locations by phiC31 integrase. The authors demonstrated single-copy integration of plasmid DNA into the genome and production of human dystrophin in the human 293 cell line, as well as in induced pluripotent stem cells derived from a patient with Duchenne muscular dystrophy. Plasmid-mediated dystrophin expression was also demonstrated in mouse muscle. The dystrophin expression plasmids described here will be useful in cell and gene therapy studies aimed at ameliorating Duchenne muscular dystrophy.

  3. An integrated CRISPR Bombyx mori genome editing system with improved efficiency and expanded target sites.

    PubMed

    Ma, Sanyuan; Liu, Yue; Liu, Yuanyuan; Chang, Jiasong; Zhang, Tong; Wang, Xiaogang; Shi, Run; Lu, Wei; Xia, Xiaojuan; Zhao, Ping; Xia, Qingyou

    2017-02-09

    Genome editing has opened unprecedented opportunities for targeted genomic engineering of a wide variety of organisms, ranging from microbes, plants and animals to human embryos. The successive establishment and rapid application of genome editing tools have significantly accelerated Bombyx mori (B. mori) research in recent years. However, the only CRISPR system available in B. mori was the commonly used SpCas9, which recognizes only target sites containing an NGG PAM sequence. In the present study, we first improved the efficiency of our previously established SpCas9 system by 3.5-fold. The improved efficiency was also observed at several loci in both BmNs cells and B. mori embryos. Then, to expand the target sites, we showed that two newly discovered CRISPR systems, SaCas9 and AsCpf1, could also induce highly efficient site-specific genome editing in BmNs cells, and constructed an integrated CRISPR system. Genome-wide analysis of targetable sites was further conducted and showed that the integrated system covers 69,144,399 sites in the B. mori genome, with one site found every 6.5 bp on average. The efficiency and resolution of this CRISPR platform will probably accelerate both fundamental research and applied studies in B. mori, and perhaps other insects.
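
    A genome-wide count of targetable sites of the kind reported here can be approximated by scanning both strands for PAM motifs. The sketch below is an illustration under stated assumptions, not the authors' pipeline: the abstract names only the NGG PAM of SpCas9, so the SaCas9 (NNGRRT) and AsCpf1 (TTTN) motifs are taken from the general CRISPR literature, the function names are hypothetical, and the code counts PAM occurrences only, without checking that a full protospacer fits next to each PAM.

```python
import re

# IUPAC codes needed for the PAM motifs used below.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "R": "[AG]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Assumed PAM motifs: only NGG for SpCas9 is stated in the abstract.
PAMS = {"SpCas9": "NGG", "SaCas9": "NNGRRT", "AsCpf1": "TTTN"}


def revcomp(seq):
    """Reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_pam_sites(seq, pam):
    """Count PAM occurrences on both strands, including overlapping ones."""
    # A zero-width lookahead lets finditer report overlapping matches.
    pattern = re.compile("(?=" + "".join(IUPAC[base] for base in pam) + ")")
    seq = seq.upper()
    forward = sum(1 for _ in pattern.finditer(seq))
    reverse = sum(1 for _ in pattern.finditer(revcomp(seq)))
    return forward + reverse


if __name__ == "__main__":
    toy = "AGGTCCTTTAGGGAATT"  # made-up sequence, for illustration only
    for nuclease, pam in PAMS.items():
        print(nuclease, pam, count_pam_sites(toy, pam))
```

    As a consistency check on the reported figures, 69,144,399 sites at one site per 6.5 bp corresponds to roughly 449 million bp of sequence (69,144,399 × 6.5 ≈ 449,438,594).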

  4. Integration of complete chloroplast genome sequences with small amplicon datasets improves phylogenetic resolution in Acacia.

    PubMed

    Williams, Anna V; Miller, Joseph T; Small, Ian; Nevill, Paul G; Boykin, Laura M

    2016-03-01

    Combining whole genome data with previously obtained amplicon sequences has the potential to increase the resolution of phylogenetic analyses, particularly at low taxonomic levels or where recent divergence, rapid speciation or slow genome evolution has resulted in limited sequence variation. However, the integration of these types of data for large scale phylogenetic studies has rarely been investigated. Here we conduct a phylogenetic analysis of the whole chloroplast genome and two nuclear ribosomal loci for 65 Acacia species from across the most recent Acacia phylogeny. We then combine these data with previously generated amplicon sequences (four chloroplast loci and two nuclear ribosomal loci) for 508 Acacia species. We use several phylogenetic methods, including maximum likelihood bootstrapping (with and without constraint) and ExaBayes, in order to determine the success of combining a dataset of 4000 bp with one of 189,000 bp. The results of our study indicate that the inclusion of whole genome data gave a far better resolved and well supported representation of the phylogenetic relationships within Acacia than using only amplicon sequences, with the greatest support observed when using a whole genome phylogeny as a constraint on the amplicon sequences. Our study therefore provides methods for optimal integration of genomic and amplicon sequences.

  5. An integrated approach for analyzing clinical genomic variant data from next-generation sequencing.

    PubMed

    Crowgey, Erin L; Stabley, Deborah L; Chen, Chuming; Huang, Hongzhan; Robbins, Katherine M; Polson, Shawn W; Sol-Church, Katia; Wu, Cathy H

    2015-04-01

    Next-generation sequencing (NGS) technologies provide the potential for developing high-throughput and low-cost platforms for clinical diagnostics. A limiting factor to clinical applications of genomic NGS is downstream bioinformatics analysis for data interpretation. We have developed an integrated approach for end-to-end clinical NGS data analysis from variant detection to functional profiling. Robust bioinformatics pipelines were implemented for genome alignment, single nucleotide polymorphism (SNP), small insertion/deletion (InDel), and copy number variation (CNV) detection of whole exome sequencing (WES) data from the Illumina platform. Quality-control metrics were analyzed at each step of the pipeline by use of a validated training dataset to ensure data integrity for clinical applications. We annotate the variants with data regarding the disease population and variant impact. Custom algorithms were developed to filter variants based on criteria, such as quality of variant, inheritance pattern, and impact of variant on protein function. The developed clinical variant pipeline links the identified rare variants to Integrated Genome Viewer for visualization in a genomic context and to the Protein Information Resource's iProXpress for rich protein and disease information. With the application of our system of annotations, prioritizations, inheritance filters, and functional profiling and analysis, we have created a unique methodology for downstream variant filtering that empowers clinicians and researchers to interpret more effectively the relevance of genomic alterations within a rare genetic disease.
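
    The filtering step described above (quality of variant, inheritance pattern, and impact on protein function) can be pictured as a chain of simple predicates. The sketch below is a hypothetical illustration and not the authors' custom algorithms: the Variant fields, the quality threshold, the impact labels, the gene names, and the autosomal recessive trio rule are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str          # hypothetical gene label
    quality: float     # variant call quality score
    impact: str        # predicted impact label, e.g. "HIGH", "MODERATE", "LOW"
    proband_gt: str    # genotypes written VCF-style: "0/0", "0/1", "1/1"
    mother_gt: str
    father_gt: str


# Assumed set of impact labels treated as potentially damaging.
DAMAGING = {"HIGH", "MODERATE"}


def fits_recessive(v):
    """Autosomal recessive pattern: homozygous proband, carrier parents."""
    return v.proband_gt == "1/1" and v.mother_gt == "0/1" and v.father_gt == "0/1"


def filter_variants(variants, min_quality=30.0):
    """Keep variants that pass quality, impact, and inheritance filters."""
    return [
        v for v in variants
        if v.quality >= min_quality and v.impact in DAMAGING and fits_recessive(v)
    ]


# Made-up records, for illustration only.
calls = [
    Variant("GENE_A", 55.0, "HIGH", "1/1", "0/1", "0/1"),      # passes all filters
    Variant("GENE_B", 12.0, "HIGH", "1/1", "0/1", "0/1"),      # low quality
    Variant("GENE_C", 60.0, "LOW", "1/1", "0/1", "0/1"),       # low impact
    Variant("GENE_D", 70.0, "MODERATE", "0/1", "0/1", "0/0"),  # wrong inheritance
]
kept = filter_variants(calls)
print([v.gene for v in kept])
```

    Keeping each criterion as its own predicate makes it straightforward to swap in a different inheritance model or threshold without touching the rest of the filter.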

  6. Dynamic Interplay between Nucleoid Segregation and Genome Integrity in Chlamydomonas Chloroplasts.

    PubMed

    Odahara, Masaki; Kobayashi, Yusuke; Shikanai, Toshiharu; Nishimura, Yoshiki

    2016-12-01

    The chloroplast (cp) genome is organized as nucleoids that are dispersed throughout the cp stroma. Previously, a cp homolog of bacterial recombinase RecA (cpRECA) was shown to be involved in the maintenance of cp genome integrity by repairing damaged chloroplast DNA and by suppressing aberrant recombination between short dispersed repeats in the moss Physcomitrella patens. Here, overexpression and knockdown analysis of cpRECA in the green alga Chlamydomonas reinhardtii revealed that cpRECA was involved in cp nucleoid dynamics as well as having a role in maintaining cp genome integrity. Overexpression of cpRECA tagged with yellow fluorescent protein or hemagglutinin resulted in the formation of giant filamentous structures that colocalized exclusively to chloroplast DNA, and cpRECA localized to cp nucleoids in a heterogeneous manner. Knockdown of cpRECA led to a significant reduction in cp nucleoid number that was accompanied by nucleoid enlargement. This phenotype resembled those of gyrase inhibitor-treated cells and monokaryotic chloroplast mutant cells and suggested that cpRECA was involved in organizing cp nucleoid dynamics. The cp genome also was destabilized by induced recombination between short dispersed repeats in cpRECA-knockdown cells and gyrase inhibitor-treated cells. Taken together, these results suggest that cpRECA and gyrase are both involved in nucleoid dynamics and the maintenance of genome integrity and that the mechanisms underlying these processes may be intimately related in C. reinhardtii cps. © 2016 American Society of Plant Biologists. All Rights Reserved.

  7. A Pilot Bridging Data Integration and Analytics: BioMediator and R

    PubMed Central

    Jeng, S.; Wang, K.; Barbero, J.; Brinkley, J; Tarczy-Hornoch, P.

    2005-01-01

    Biological research today involves aggregating and analyzing large amounts of data from disparate sources. Tools such as the University of Washington’s BioMediator system integrate heterogeneous data. Analytic packages such as the R environment have a rich set of tools to analyze biomedical research data. Our pilot project bridged data integration and analytics in a general way by successfully incorporating the BioMediator system into the R platform for specific analyses on neurophysiologic research data. PMID:16779282

  8. PGSB/MIPS PlantsDB Database Framework for the Integration and Analysis of Plant Genome Data.

    PubMed

    Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai; Gundlach, Heidrun; Mayer, Klaus F X

    2017-01-01

    Plant Genome and Systems Biology (PGSB), formerly Munich Institute for Protein Sequences (MIPS) PlantsDB, is a database framework for the integration and analysis of plant genome data, developed and maintained for more than a decade now. Major components of that framework are genome databases and analysis resources focusing on individual (reference) genomes providing flexible and intuitive access to data. Another main focus is the integration of genomes from both model and crop plants to form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny). Data exchange and integrated search functionality with/over many plant genome databases is provided within the transPLANT project.

  9. Integrative genomics--a basic and essential tool for the development of molecular medicine.

    PubMed

    Ostrowski, Jerzy

    2008-01-01

    Understanding the molecular mechanisms of disease requires the introduction of molecular diagnostics into medical practice. Current medicine employs only elements of molecular diagnostics, and usually on the scale of single genes. Medicine in the post-genomic era will utilize thousands of molecular markers associated with disease that are provided by high-throughput sequencing and functional genomic, proteomic and metabolomic studies. Such a spectrum of techniques will link clinical medicine based on molecularly oriented diagnostics with the prediction and prevention of disease. To achieve this task, large-scale and genome-wide biological and medical data must be combined with biostatistical analyses and bioinformatic modeling of biological systems. The collecting, cataloging and comparison of data from molecular studies and the subsequent development of conclusions create the fundamentals of systems biology. This highly complex analytical process reflects a new scientific paradigm called integrative genomics.

  10. Quantitative and qualitative proteome characteristics extracted from in-depth integrated genomics and proteomics analysis.

    PubMed

    Low, Teck Yew; van Heesch, Sebastiaan; van den Toorn, Henk; Giansanti, Piero; Cristobal, Alba; Toonen, Pim; Schafer, Sebastian; Hübner, Norbert; van Breukelen, Bas; Mohammed, Shabaz; Cuppen, Edwin; Heck, Albert J R; Guryev, Victor

    2013-12-12

    Quantitative and qualitative protein characteristics are regulated at genomic, transcriptomic, and posttranscriptional levels. Here, we integrated in-depth transcriptome and proteome analyses of liver tissues from two rat strains to unravel the interactions within and between these layers. We obtained peptide evidence for 26,463 rat liver proteins. We validated 1,195 gene predictions, 83 splice events, 126 proteins with nonsynonymous variants, and 20 isoforms with nonsynonymous RNA editing. Quantitative RNA sequencing and proteomics data correlate highly between strains but poorly among each other, indicating extensive nongenetic regulation. Our multilevel analysis identified a genomic variant in the promoter of the most differentially expressed gene Cyp17a1, a previously reported top hit in genome-wide association studies for human hypertension, as a potential contributor to the hypertension phenotype in SHR rats. These results demonstrate the power of and need for integrative analysis for understanding genetic control of molecular dynamics and phenotypic diversity in a system-wide manner.

  11. Improved bacteriophage genome data is necessary for integrating viral and bacterial ecology.

    PubMed

    Bibby, Kyle

    2014-02-01

    The recent rise in "omics"-enabled approaches has led to improved understanding in many areas of microbial ecology. However, despite the importance of viruses in a broad microbial ecology context, viral ecology remains largely unintegrated into high-throughput microbial ecology studies. A fundamental hindrance to the integration of viral ecology into omics-enabled microbial ecology studies is the lack of suitable reference bacteriophage genomes in reference databases; currently, only 0.001% of bacteriophage diversity is represented in genome sequence databases. This commentary serves to highlight this issue and to promote bacteriophage genome sequencing as a valuable scientific undertaking to both better understand bacteriophage diversity and move towards a more holistic view of microbial ecology.

  12. Integrating Genomic Data Sets for Knowledge Discovery: An Informed Approach to Management of Captive Endangered Species

    PubMed Central

    Irizarry, Kristopher J. L.; Bryant, Doug; Kalish, Jordan; Eng, Curtis; Schmidt, Peggy L.; Barrett, Gini; Barr, Margaret C.

    2016-01-01

    Many endangered captive populations exhibit reduced genetic diversity resulting in health issues that impact reproductive fitness and quality of life. Numerous cost-effective genomic sequencing and genotyping technologies provide an unparalleled opportunity for incorporating genomics knowledge in management of endangered species. Genomic data, such as sequence data, transcriptome data, and genotyping data, provide critical information about a captive population that, when leveraged correctly, can be utilized to maximize population genetic variation while simultaneously reducing unintended introduction or propagation of undesirable phenotypes. Current approaches aimed at managing endangered captive populations utilize species survival plans (SSPs) that rely upon mean kinship estimates to maximize genetic diversity while simultaneously avoiding artificial selection in the breeding program. However, as genomic resources increase for each endangered species, the potential knowledge available for management also increases. Unlike model organisms in which considerable scientific resources are used to experimentally validate genotype-phenotype relationships, endangered species typically lack the necessary sample sizes and economic resources required for such studies. Even so, in the absence of experimentally verified genetic discoveries, genomic data still provide value. In fact, bioinformatics and comparative genomics approaches offer mechanisms for translating these raw genomics data sets into integrated knowledge that enables an informed approach to endangered species management. PMID:27376076

  13. Integrating Genomic Data Sets for Knowledge Discovery: An Informed Approach to Management of Captive Endangered Species.

    PubMed

    Irizarry, Kristopher J L; Bryant, Doug; Kalish, Jordan; Eng, Curtis; Schmidt, Peggy L; Barrett, Gini; Barr, Margaret C

    2016-01-01

    Many endangered captive populations exhibit reduced genetic diversity resulting in health issues that impact reproductive fitness and quality of life. Numerous cost-effective genomic sequencing and genotyping technologies provide an unparalleled opportunity for incorporating genomics knowledge in management of endangered species. Genomic data, such as sequence data, transcriptome data, and genotyping data, provide critical information about a captive population that, when leveraged correctly, can be utilized to maximize population genetic variation while simultaneously reducing unintended introduction or propagation of undesirable phenotypes. Current approaches aimed at managing endangered captive populations utilize species survival plans (SSPs) that rely upon mean kinship estimates to maximize genetic diversity while simultaneously avoiding artificial selection in the breeding program. However, as genomic resources increase for each endangered species, the potential knowledge available for management also increases. Unlike model organisms in which considerable scientific resources are used to experimentally validate genotype-phenotype relationships, endangered species typically lack the necessary sample sizes and economic resources required for such studies. Even so, in the absence of experimentally verified genetic discoveries, genomic data still provide value. In fact, bioinformatics and comparative genomics approaches offer mechanisms for translating these raw genomics data sets into integrated knowledge that enables an informed approach to endangered species management.

  14. Decoding the genome with an integrative analysis tool: combinatorial CRM Decoder.

    PubMed

    Kang, Keunsoo; Kim, Joomyeong; Chung, Jae Hoon; Lee, Daeyoup

    2011-09-01

    The identification of genome-wide cis-regulatory modules (CRMs) and characterization of their associated epigenetic features are fundamental steps toward the understanding of gene regulatory networks. Although integrative analysis of available genome-wide information can provide new biological insights, the lack of novel methodologies has become a major bottleneck. Here, we present a comprehensive analysis tool called combinatorial CRM decoder (CCD), which utilizes the publicly available information to identify and characterize genome-wide CRMs in a species of interest. CCD first defines a set of the epigenetic features which is significantly associated with a set of known CRMs as a code called 'trace code', and subsequently uses the trace code to pinpoint putative CRMs throughout the genome. Using 61 genome-wide data sets obtained from 17 independent mouse studies, CCD successfully catalogued ∼12 600 CRMs (five distinct classes) including polycomb repressive complex 2 target sites as well as imprinting control regions. Interestingly, we discovered that ∼4% of the identified CRMs belong to at least two different classes named 'multi-functional CRM', suggesting their functional importance for regulating spatiotemporal gene expression. From these examples, we show that CCD can be applied to any potential genome-wide datasets and therefore will shed light on unveiling genome-wide CRMs in various species.

  15. Barriers and potential solutions for Critical Zone data integration between environmental genomics and the geosciences

    NASA Astrophysics Data System (ADS)

    Aronson, E. L.; Meyer, F.; Packman, A. I.; Mayorga, E.

    2015-12-01

    The Earth's permeable near-surface layer from bedrock to canopy is referred to as the Critical Zone (CZ). Integration of bio- and geoscience data is critical for understanding physical, biological and chemical interactions in the CZ. Genomic and meta-genomic scientists study organisms both in laboratory settings and in the environment, in order to understand the interactions of organisms with the environment. Geoscientists are using environmental data to describe and model dynamics of physical and chemical properties. Yet, there is no agreed-upon method for integrating genomic and environmental data to address interactions of living and non-living components of the CZ. There are standards for data interchange being developed in the geosciences and genomics sciences, via standards organizations such as the Open Geospatial Consortium (OGC), as well as by research communities in biogeochemistry, hydrology, climatology, and other fields. These are in parallel to, but typically not in coordination with, the standards the Genomics Standards Consortium (GSC) is developing for genomics. In addition, efforts are being made to allow for intercompatibility of these CZ data with data generated by NEON, Inc. The interoperability of these types of data is limited with current software and cyberinfrastructure. A group of CZ geoscientists, environmental genomic scientists and cyberinfrastructure scientists is coming together to develop a set of common data collection and integration methods and sets of common standards. The data generated by this effort across multiple CZ sites (including the US CZ Observatories, or CZOs) around the world, along with NEON facility data, will be used to test EarthCube (an NSF initiative to develop cyberinfrastructure for the geosciences) cyberinfrastructure, with the goal of bridging this gap in standards and interoperability. Potential solutions to these issues of interoperability will be presented, and a way forward will be described.

  16. Rice TOGO Browser: A platform to retrieve integrated information on rice functional and applied genomics.

    PubMed

    Nagamura, Yoshiaki; Antonio, Baltazar A; Sato, Yutaka; Miyao, Akio; Namiki, Nobukazu; Yonemaru, Jun-ichi; Minami, Hiroshi; Kamatsuki, Kaori; Shimura, Kan; Shimizu, Yuji; Hirochika, Hirohiko

    2011-02-01

    The Rice TOGO Browser is an online public resource designed to facilitate integration and visualization of mapping data of bacterial artificial chromosome (BAC)/P1-derived artificial chromosome (PAC) clones, genes, restriction fragment length polymorphism (RFLP)/simple sequence repeat (SSR) markers and phenotype data represented as quantitative trait loci (QTLs) onto the genome sequence, and to provide a platform for more efficient utilization of genome information from the point of view of applied genomics as well as functional genomics. Three search options, namely keyword search, region search and trait search, generate various types of data in a user-friendly interface with three distinct viewers, a chromosome viewer, an integrated map viewer and a sequence viewer, thereby providing the opportunity to view the position of genes and/or QTLs at the chromosomal level and to retrieve any sequence information in a user-defined genome region. Furthermore, the gene list, marker list and genome sequence in a specified region delineated by RFLP/SSR markers and any sequences designed as primers can be viewed and downloaded to support forward genetics approaches. An additional feature of this database is the graphical viewer for BLAST search to reveal information not only for regions with significant sequence similarity but also for regions adjacent to those with similarity but with no hits between sequences. An easy to use and intuitive user interface can help a wide range of users in retrieving integrated mapping information including agronomically important traits on the rice genome sequence. The database can be accessed at http://agri-trait.dna.affrc.go.jp/.

  17. Integrating functional genomics to accelerate mechanistic personalized medicine

    PubMed Central

    Tyner, Jeffrey W.

    2017-01-01

    The advent of deep sequencing technologies has resulted in the deciphering of tremendous amounts of genetic information. These data have led to major discoveries, and many anecdotes now exist of individual patients whose clinical outcomes have benefited from novel, genetically guided therapeutic strategies. However, the majority of genetic events in cancer are currently undrugged, leading to a biological gap between understanding of tumor genetic etiology and translation to improved clinical approaches. Functional screening has made tremendous strides in recent years with the development of new experimental approaches to studying ex vivo and in vivo drug sensitivity. Numerous discoveries and anecdotes also exist for translation of functional screening into novel clinical strategies; however, the current clinical application of functional screening remains largely confined to small clinical trials at specific academic centers. The intersection between genomic and functional approaches represents an ideal modality to accelerate our understanding of drug sensitivities as they relate to specific genetic events and further understand the full mechanisms underlying drug sensitivity patterns. PMID:28299357

  18. Integrating functional genomics to accelerate mechanistic personalized medicine.

    PubMed

    Tyner, Jeffrey W

    2017-03-01

    The advent of deep sequencing technologies has resulted in the deciphering of tremendous amounts of genetic information. These data have led to major discoveries, and many anecdotes now exist of individual patients whose clinical outcomes have benefited from novel, genetically guided therapeutic strategies. However, the majority of genetic events in cancer are currently undrugged, leading to a biological gap between understanding of tumor genetic etiology and translation to improved clinical approaches. Functional screening has made tremendous strides in recent years with the development of new experimental approaches to studying ex vivo and in vivo drug sensitivity. Numerous discoveries and anecdotes also exist for translation of functional screening into novel clinical strategies; however, the current clinical application of functional screening remains largely confined to small clinical trials at specific academic centers. The intersection between genomic and functional approaches represents an ideal modality to accelerate our understanding of drug sensitivities as they relate to specific genetic events and further understand the full mechanisms underlying drug sensitivity patterns.

  19. Dissecting the brown adipogenic regulatory network using integrative genomics

    PubMed Central

    Pradhan, Rachana N.; Bues, Johannes J.; Gardeux, Vincent; Schwalie, Petra C.; Alpern, Daniel; Chen, Wanze; Russeil, Julie; Raghav, Sunil K.; Deplancke, Bart

    2017-01-01

    Brown adipocytes regulate energy expenditure via mitochondrial uncoupling, which makes them attractive therapeutic targets to tackle obesity. However, the regulatory mechanisms underlying brown adipogenesis are still poorly understood. To address this, we profiled the transcriptome and chromatin state during mouse brown fat cell differentiation, revealing extensive gene expression changes and chromatin remodeling, especially during the first day post-differentiation. To identify putatively causal regulators, we performed transcription factor binding site overrepresentation analyses in active chromatin regions and prioritized factors based on their expression correlation with the bona-fide brown adipogenic marker Ucp1 across multiple mouse and human datasets. Using loss-of-function assays, we evaluated both the phenotypic effect as well as the transcriptomic impact of several putative regulators on the differentiation process, uncovering ZFP467, HOXA4 and Nuclear Factor I A (NFIA) as novel transcriptional regulators. Of these, NFIA emerged as the regulator yielding the strongest molecular and cellular phenotypes. To examine its regulatory function, we profiled the genomic localization of NFIA, identifying it as a key early regulator of terminal brown fat cell differentiation. PMID:28181539

  20. Integrated and translational genomics for analysis of complex traits in crops

    USDA-ARS's Scientific Manuscript database

    We report here on integration of sequencing and genotype data from natural variation (by whole genome resequencing [wgs] or genotype by sequencing [gbs]), transcriptome (RNA-seq) and mutant analysis (also by wgs) with the goal of translating gems from these resources into useable DNA markers in the ...

  1. Filling the knowledge gap: Integrating quantitative genetics and genomics in graduate education and outreach

    USDA-ARS's Scientific Manuscript database

    The genomics revolution provides vital tools to address global food security. Yet to be incorporated into livestock breeding, molecular techniques need to be integrated into a quantitative genetics framework. Within the U.S., with shrinking faculty numbers with the requisite skills, the capacity to ...

  2. Supporting community annotation and user collaboration in the integrated microbial genomes (IMG) system

    SciTech Connect

    Chen, I-Min A.; Markowitz, Victor M.; Palaniappan, Krishna; Szeto, Ernest; Chu, Ken; Huang, Jinghua; Ratner, Anna; Pillay, Manoj; Hadjithomas, Michalis; Huntemann, Marcel; Mikhailova, Natalia; Ovchinnikova, Galina; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2016-04-26

    Background: The exponential growth of genomic data from next generation technologies renders traditional manual expert curation effort unsustainable. Many genomic systems have included community annotation tools to address the problem. Most of these systems adopted a "Wiki-based" approach to take advantage of existing wiki technologies, but encountered obstacles in issues such as usability, authorship recognition, information reliability and incentive for community participation. Results: Here, we present a different approach, relying on tightly integrated method rather than "Wiki-based" method, to support community annotation and user collaboration in the Integrated Microbial Genomes (IMG) system. The IMG approach allows users to use existing IMG data warehouse and analysis tools to add gene, pathway and biosynthetic cluster annotations, to analyze/reorganize contigs, genes and functions using workspace datasets, and to share private user annotations and workspace datasets with collaborators. We show that the annotation effort using IMG can be part of the research process to overcome the user incentive and authorship recognition problems thus fostering collaboration among domain experts. The usability and reliability issues are addressed by the integration of curated information and analysis tools in IMG, together with DOE Joint Genome Institute (JGI) expert review. Conclusion: By incorporating annotation operations into IMG, we provide an integrated environment for users to perform deeper and extended data analysis and annotation in a single system that can lead to publications and community knowledge sharing as shown in the case studies.

  3. Supporting community annotation and user collaboration in the integrated microbial genomes (IMG) system

    DOE PAGES

    Chen, I-Min A.; Markowitz, Victor M.; Palaniappan, Krishna; ...

    2016-04-26

    Background: The exponential growth of genomic data from next generation technologies renders traditional manual expert curation effort unsustainable. Many genomic systems have included community annotation tools to address the problem. Most of these systems adopted a "Wiki-based" approach to take advantage of existing wiki technologies, but encountered obstacles in issues such as usability, authorship recognition, information reliability and incentive for community participation. Results: Here, we present a different approach, relying on tightly integrated method rather than "Wiki-based" method, to support community annotation and user collaboration in the Integrated Microbial Genomes (IMG) system. The IMG approach allows users to use existing IMG data warehouse and analysis tools to add gene, pathway and biosynthetic cluster annotations, to analyze/reorganize contigs, genes and functions using workspace datasets, and to share private user annotations and workspace datasets with collaborators. We show that the annotation effort using IMG can be part of the research process to overcome the user incentive and authorship recognition problems thus fostering collaboration among domain experts. The usability and reliability issues are addressed by the integration of curated information and analysis tools in IMG, together with DOE Joint Genome Institute (JGI) expert review. Conclusion: By incorporating annotation operations into IMG, we provide an integrated environment for users to perform deeper and extended data analysis and annotation in a single system that can lead to publications and community knowledge sharing as shown in the case studies.

  4. Enhancing the specificity of recombinase-mediated genome engineering through dimer interface redesign.

    PubMed

    Gaj, Thomas; Sirk, Shannon J; Tingle, Ryan D; Mercer, Andrew C; Wallen, Mark C; Barbas, Carlos F

    2014-04-02

    Despite recent advances in genome engineering made possible by the emergence of site-specific endonucleases, there remains a need for tools capable of specifically delivering genetic payloads into the human genome. Hybrid recombinases based on activated catalytic domains derived from the resolvase/invertase family of serine recombinases fused to Cys2-His2 zinc-finger or TAL effector DNA-binding domains are a class of reagents capable of achieving this. The utility of these enzymes, however, has been constrained by their low overall targeting specificity, largely due to the formation of side-product homodimers capable of inducing off-target modifications. Here, we combine rational design and directed evolution to re-engineer the serine recombinase dimerization interface and generate a recombinase architecture that reduces formation of these undesirable homodimers by >500-fold. We show that these enhanced recombinases demonstrate substantially improved targeting specificity in mammalian cells and achieve rates of site-specific integration similar to those previously reported for site-specific nucleases. Additionally, we show that enhanced recombinases exhibit low toxicity and promote the delivery of the human coagulation factor IX and α-galactosidase genes into endogenous genomic loci with high specificity. These results provide a general means for improving hybrid recombinase specificity by protein engineering and illustrate the potential of these enzymes for basic research and therapeutic applications.

  5. [Prolonging the vase life of carnation "Mabel" through integrating repeated ACC oxidase genes into its genome].

    PubMed

    Yu, Yi-Xun; Bao, Man-Zhu

    2004-10-01

    Carnation (Dianthus caryophyllus L.) is one of the most important cut flowers. The carnation cultivar "Mabel" was transformed, via Agrobacterium tumefaciens, with a direct repeat of the gene for ACC oxidase, the key enzyme in ethylene synthesis, driven by the CaMV35S promoter. The hygromycin phosphotransferase (HPT) gene was used as the selection marker. Leaf explants were pre-cultured on shoot-inducing medium for 2 d, then immersed in Agrobacterium suspension for 8-12 min. Co-cultivation was carried out on the medium (MS + BA 1.0 mg/L + NAA 0.3 mg/L + acetosyringone 100 micromol/L, pH 5.8-6.0) for 3 d. Transformants were then obtained by transferring explants to selection medium supplemented with 5 mg/L hygromycin (Hyg) and 400 mg/L cefotaxime (Cef). Southern blotting showed that a foreign gene was integrated into the carnation genome, and 3 transgenic lines (T257, T299 and T273) were obtained. Addition of acetosyringone and the duration of co-culture were the main factors that influenced transformation frequency. After being transplanted to soil, the transgenic plants grew normally in the greenhouse. Ethylene production by cut flowers of transgenic line T257 was 95% lower than that of the control, and that of line T299 was 90% lower, while that of line T273 was not significantly different from the control. The vase life of transgenic line T257 was 5 d longer than that of the control line at 25 degrees C.

  6. GeneHancer: genome-wide integration of enhancers and target genes in GeneCards

    PubMed Central

    Rappaport, Noa; Hadar, Rotem; Plaschkes, Inbar; Iny Stein, Tsippi; Rosen, Naomi; Kohn, Asher; Twik, Michal; Safran, Marilyn

    2017-01-01

    Abstract A major challenge in understanding gene regulation is the unequivocal identification of enhancer elements and uncovering their connections to genes. We present GeneHancer, a novel database of human enhancers and their inferred target genes, in the framework of GeneCards. First, we integrated a total of 434 000 reported enhancers from four different genome-wide databases: the Encyclopedia of DNA Elements (ENCODE), the Ensembl regulatory build, the functional annotation of the mammalian genome (FANTOM) project and the VISTA Enhancer Browser. Employing an integration algorithm that aims to remove redundancy, GeneHancer portrays 285 000 integrated candidate enhancers (covering 12.4% of the genome), 94 000 of which are derived from more than one source, and each assigned an annotation-derived confidence score. GeneHancer subsequently links enhancers to genes, using: tissue co-expression correlation between genes and enhancer RNAs, as well as enhancer-targeted transcription factor genes; expression quantitative trait loci for variants within enhancers; and capture Hi-C, a promoter-specific genome conformation assay. The individual scores based on each of these four methods, along with gene–enhancer genomic distances, form the basis for GeneHancer’s combinatorial likelihood-based scores for enhancer–gene pairing. Finally, we define ‘elite’ enhancer–gene relations reflecting both a high-likelihood enhancer definition and a strong enhancer–gene association. GeneHancer predictions are fully integrated in the widely used GeneCards Suite, whereby candidate enhancers and their annotations are displayed on every relevant GeneCard. This assists in the mapping of non-coding variants to enhancers, and via the linked genes, forms a basis for variant–phenotype interpretation of whole-genome sequences in health and disease. Database URL: http://www.genecards.org/ PMID:28605766

  7. Multiplex CRISPR/Cas9-based genome engineering enhanced by Drosha-mediated sgRNA-shRNA structure.

    PubMed

    Yan, Qiang; Xu, Kun; Xing, Jiani; Zhang, Tingting; Wang, Xin; Wei, Zehui; Ren, Chonghua; Liu, Zhongtian; Shao, Simin; Zhang, Zhiying

    2016-12-12

    The clustered regularly interspaced short palindromic repeats (CRISPR) system has recently been developed into a powerful genome-editing technology, as it requires only two key components (Cas9 protein and sgRNA) to function and further enables multiplex genome targeting and homology-directed repair (HDR)-based precise genome editing in a wide variety of organisms. Here, we report a novel strategy that uses the Drosha-mediated sgRNA-shRNA structure to direct Cas9 for multiplex genome targeting and precise genome editing. In the multiplex genome targeting assay, we achieved more than 9% simultaneous mutation efficiency at 3 genomic loci among the puromycin-selected cell clones. By introducing an shRNA against the DNA ligase IV gene (LIG4) into the sgRNA-shRNA construct, the efficiency of HDR-based precise genome editing was improved by more than 2-fold. Our work provides a useful tool for multiplex and precise genome modification in mammalian cells.

  8. Multiplex CRISPR/Cas9-based genome engineering enhanced by Drosha-mediated sgRNA-shRNA structure

    PubMed Central

    Yan, Qiang; Xu, Kun; Xing, Jiani; Zhang, Tingting; Wang, Xin; Wei, Zehui; Ren, Chonghua; Liu, Zhongtian; Shao, Simin; Zhang, Zhiying

    2016-01-01

    The clustered regularly interspaced short palindromic repeats (CRISPR) system has recently been developed into a powerful genome-editing technology, as it requires only two key components (Cas9 protein and sgRNA) to function and further enables multiplex genome targeting and homology-directed repair (HDR)-based precise genome editing in a wide variety of organisms. Here, we report a novel strategy that uses the Drosha-mediated sgRNA-shRNA structure to direct Cas9 for multiplex genome targeting and precise genome editing. In the multiplex genome targeting assay, we achieved more than 9% simultaneous mutation efficiency at 3 genomic loci among the puromycin-selected cell clones. By introducing an shRNA against the DNA ligase IV gene (LIG4) into the sgRNA-shRNA construct, the efficiency of HDR-based precise genome editing was improved by more than 2-fold. Our work provides a useful tool for multiplex and precise genome modification in mammalian cells. PMID:27941919

  9. Various applications of TALEN- and CRISPR/Cas9-mediated homologous recombination to modify the Drosophila genome.

    PubMed

    Yu, Zhongsheng; Chen, Hanqing; Liu, Jiyong; Zhang, Hongtao; Yan, Yan; Zhu, Nannan; Guo, Yawen; Yang, Bo; Chang, Yan; Dai, Fei; Liang, Xuehong; Chen, Yixu; Shen, Yan; Deng, Wu-Min; Chen, Jianming; Zhang, Bo; Li, Changqing; Jiao, Renjie

    2014-04-15

    Modifying the genomes of many organisms is becoming as easy as manipulating DNA in test tubes, which is made possible by two recently developed techniques based on either the customizable DNA binding protein, TALEN, or the CRISPR/Cas9 system. Here, we describe a series of efficient applications derived from these two technologies, in combination with various homologous donor DNA plasmids, to manipulate the Drosophila genome: (1) to precisely generate genomic deletions; (2) to make genomic replacement of a DNA fragment at single nucleotide resolution; and (3) to generate precise insertions to tag target proteins for tracing their endogenous expressions. For more convenient genomic manipulations, we established an easy-to-screen platform by knocking in a white marker through homologous recombination. Further, we provided a strategy to remove the unwanted duplications generated during the "ends-in" recombination process. Our results also indicate that TALEN and CRISPR/Cas9 had comparable efficiency in mediating genomic modifications through HDR (homology-directed repair); either TALEN or the CRISPR/Cas9 system could efficiently mediate in vivo replacement of DNA fragments of up to 5 kb in Drosophila, providing an ideal genetic tool for functional annotations of the Drosophila genome.

  10. Comprehensive and Integrative Genomic Characterization of Hepatocellular Carcinoma.

    PubMed

    2017-06-15

    Liver cancer has the second highest worldwide cancer mortality rate and has limited therapeutic options. We analyzed 363 hepatocellular carcinoma (HCC) cases by whole-exome sequencing and DNA copy number analyses, and we analyzed 196 HCC cases by DNA methylation, RNA, miRNA, and proteomic expression also. DNA sequencing and mutation analysis identified significantly mutated genes, including LZTR1, EEF1A1, SF3B1, and SMARCA4. Significant alterations by mutation or downregulation by hypermethylation in genes likely to result in HCC metabolic reprogramming (ALB, APOB, and CPS1) were observed. Integrative molecular HCC subtyping incorporating unsupervised clustering of five data platforms identified three subtypes, one of which was associated with poorer prognosis in three HCC cohorts. Integrated analyses enabled development of a p53 target gene expression signature correlating with poor survival. Potential therapeutic targets for which inhibitors exist include WNT signaling, MDM4, MET, VEGFA, MCL1, IDH1, TERT, and immune checkpoint proteins CTLA-4, PD-1, and PD-L1. Copyright © 2017 Elsevier Inc. All rights reserved.

  11. Integrative exploration of genomic profiles for triple negative breast cancer identifies potential drug targets

    PubMed Central

    Wang, Xiaosheng; Guda, Chittibabu

    2016-01-01

    Abstract Background: Triple negative breast cancer (TNBC) is high-risk due to its rapid drug resistance and recurrence, metastasis, and lack of targeted therapy. So far, no molecularly targeted therapeutic agents have been clinically approved for TNBC. It is imperative that we discover new targets for TNBC therapy. Objectives: A large volume of cancer genomics data are emerging and advancing breast cancer research. We may integrate different types of TNBC genomic data to discover molecular targets for TNBC therapy. Data sources: We used publicly available TNBC tumor tissue genomic data in the Cancer Genome Atlas database in this study. Methods: We integratively explored genomic profiles (gene expression, copy number, methylation, microRNA [miRNA], and gene mutation) in TNBC and identified hyperactivated genes that have higher expression, more copy numbers, lower methylation level, or are targets of miRNAs with lower expression in TNBC than in normal samples. We ranked the hyperactivated genes into different levels based on all the genomic evidence and performed functional analyses of the sets of genes identified. More importantly, we proposed potential molecular targets for TNBC therapy based on the hyperactivated genes. Results: Some of the genes we identified such as FGFR2, MAPK13, TP53, SRC family, MUC family, and BCL2 family have been suggested to be potential targets for TNBC treatment. Others such as CSF1R, EPHB3, TRIB1, and LAD1 could be promising new targets for TNBC treatment. By utilizing this integrative analysis of genomic profiles for TNBC, we hypothesized that some of the targeted treatment strategies for TNBC currently in development are more likely to be promising, such as poly (ADP-ribose) polymerase inhibitors, while the others are more likely to be discouraging, such as angiogenesis inhibitors. Limitations: The findings in this study need to be experimentally validated in the future. Conclusion: This is a systematic study that combined 5

  12. Efficient CRISPR/Cas9-Mediated Genome Editing in Mice by Zygote Electroporation of Nuclease

    PubMed Central

    Qin, Wenning; Dion, Stephanie L.; Kutny, Peter M.; Zhang, Yingfan; Cheng, Albert W.; Jillette, Nathaniel L.; Malhotra, Ankit; Geurts, Aron M.; Chen, Yi-Guang; Wang, Haoyi

    2015-01-01

    The clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein (Cas) system is an adaptive immune system in bacteria and archaea that has recently been exploited for genome engineering. Mutant mice can be generated in one step through direct delivery of the CRISPR/Cas9 components into a mouse zygote. Although the technology is robust, delivery remains a bottleneck, as it involves manual injection of the components into the pronuclei or the cytoplasm of mouse zygotes, which is technically demanding and inherently low throughput. To overcome this limitation, we employed electroporation as a means to deliver the CRISPR/Cas9 components, including Cas9 messenger RNA, single-guide RNA, and donor oligonucleotide, into mouse zygotes and recovered live mice with targeted nonhomologous end joining and homology-directed repair mutations with high efficiency. Our results demonstrate that mice carrying CRISPR/Cas9-mediated targeted mutations can be obtained with high efficiency by zygote electroporation. PMID:25819794

  13. Molecular Characterization of Pediatric Restrictive Cardiomyopathy from Integrative Genomics

    PubMed Central

    Rindler, Tara N.; Hinton, Robert B.; Salomonis, Nathan; Ware, Stephanie M.

    2017-01-01

    Pediatric restrictive cardiomyopathy (RCM) is a genetically heterogeneous heart disease with limited therapeutic options. RCM cases are largely idiopathic; however, even within families with a known genetic cause for cardiomyopathy, there is striking variability in disease severity. Although accumulating evidence implicates both gene expression and alternative splicing in development of dilated cardiomyopathy (DCM), there have been no detailed molecular characterizations of underlying pathways dysregulated in RCM. RNA-Seq on a cohort of pediatric RCM patients compared to other forms of adult cardiomyopathy and controls identified transcriptional differences highly common to the cardiomyopathies, as well as those unique to RCM. Transcripts selectively induced in RCM include many known and novel G-protein coupled receptors linked to calcium handling and contractile regulation. In-depth comparisons of alternative splicing revealed splicing events shared among cardiomyopathy subtypes, as well as those linked solely to RCM. Genes identified with altered alternative splicing implicate RBM20, a DCM splicing factor, as a potential mediator of alternative splicing in RCM. We present the first comprehensive report on molecular pathways dysregulated in pediatric RCM including unique/shared pathways identified compared to other cardiomyopathy subtypes and demonstrate that disruption of alternative splicing patterns in pediatric RCM occurs in the inverse direction as DCM. PMID:28098235

  14. Molecular Characterization of Pediatric Restrictive Cardiomyopathy from Integrative Genomics.

    PubMed

    Rindler, Tara N; Hinton, Robert B; Salomonis, Nathan; Ware, Stephanie M

    2017-01-18

    Pediatric restrictive cardiomyopathy (RCM) is a genetically heterogeneous heart disease with limited therapeutic options. RCM cases are largely idiopathic; however, even within families with a known genetic cause for cardiomyopathy, there is striking variability in disease severity. Although accumulating evidence implicates both gene expression and alternative splicing in development of dilated cardiomyopathy (DCM), there have been no detailed molecular characterizations of underlying pathways dysregulated in RCM. RNA-Seq on a cohort of pediatric RCM patients compared to other forms of adult cardiomyopathy and controls identified transcriptional differences highly common to the cardiomyopathies, as well as those unique to RCM. Transcripts selectively induced in RCM include many known and novel G-protein coupled receptors linked to calcium handling and contractile regulation. In-depth comparisons of alternative splicing revealed splicing events shared among cardiomyopathy subtypes, as well as those linked solely to RCM. Genes identified with altered alternative splicing implicate RBM20, a DCM splicing factor, as a potential mediator of alternative splicing in RCM. We present the first comprehensive report on molecular pathways dysregulated in pediatric RCM including unique/shared pathways identified compared to other cardiomyopathy subtypes and demonstrate that disruption of alternative splicing patterns in pediatric RCM occurs in the inverse direction as DCM.

  15. The REST remodeling complex protects genomic integrity during embryonic neurogenesis

    PubMed Central

    Nechiporuk, Tamilla; McGann, James; Mullendorff, Karin; Hsieh, Jenny; Wurst, Wolfgang; Floss, Thomas; Mandel, Gail

    2016-01-01

    The timely transition from neural progenitor to post-mitotic neuron requires down-regulation and loss of the neuronal transcriptional repressor, REST. Here, we have used mice containing a gene trap in the Rest gene, eliminating transcription from all coding exons, to remove REST prematurely from neural progenitors. We find that catastrophic DNA damage occurs during S-phase of the cell cycle, with long-term consequences including abnormal chromosome separation, apoptosis, and smaller brains. Persistent effects are evident by latent appearance of proneural glioblastoma in adult mice deleted additionally for the tumor suppressor p53 protein (p53). A previous line of mice deleted for REST in progenitors by conventional gene targeting does not exhibit these phenotypes, likely due to a remaining C-terminal peptide that still binds chromatin and recruits co-repressors. Our results suggest that REST-mediated chromatin remodeling is required in neural progenitors for proper S-phase dynamics, as part of its well-established role in repressing neuronal genes until terminal differentiation. DOI: http://dx.doi.org/10.7554/eLife.09584.001 PMID:26745185

  16. GMATA: An Integrated Software Package for Genome-Scale SSR Mining, Marker Development and Viewing

    PubMed Central

    Wang, Xuewen; Wang, Le

    2016-01-01

    Simple sequence repeats (SSRs), also referred to as microsatellites, are highly variable tandem DNAs that are widely used as genetic markers. The increasing availability of whole-genome and transcript sequences provides information resources for SSR marker development. However, efficient software is required to identify and display SSR information along with other gene features at a genome scale. We developed a novel software package, Genome-wide Microsatellite Analyzing Tool Package (GMATA), integrating SSR mining, statistical analysis and plotting, marker design, polymorphism screening and marker transferability, and enabling simultaneous display of SSR markers with other genome features. GMATA applies novel strategies for SSR analysis and primer design in large genomes, which allow GMATA to perform faster calculations and provide more accurate results than existing tools. Our package is also capable of processing DNA sequences of any size on a standard computer. GMATA is user friendly, requires only mouse clicks or typed inputs on the command line, and is executable on multiple computing platforms. We demonstrated the application of GMATA in plant genomes and revealed a novel distribution pattern of SSRs in 15 grass genomes. The most abundant motifs are the dimer GA/TC, the A/T monomer and the GCG/CGC trimer, rather than the rich G/C content in DNA sequence. We also revealed that SSR count is linearly related to chromosome length in fully assembled grass genomes. GMATA represents a powerful application tool that facilitates genomic sequence analyses. GMATA is freely available at http://sourceforge.net/projects/gmata/?source=navbar. PMID:27679641
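
    The abstract above does not describe how GMATA locates repeats internally, so the following is only a minimal sketch of SSR mining in general, assuming perfect (uninterrupted) tandem repeats and a regular-expression scan. The function name find_ssrs and its parameters (motif length 1-3, at least 5 repeats) are hypothetical choices for illustration, not GMATA options.

```python
import re


def find_ssrs(seq, min_motif=1, max_motif=3, min_repeats=5):
    """Return (start, motif, repeat_count) for each perfect tandem repeat.

    Illustrative sketch only; not the GMATA algorithm.
    """
    seq = seq.upper()
    # The lazy quantifier tries the shortest motif first, so a run of A's is
    # reported with motif "A" rather than "AA" or "AAA".
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


# Toy sequence containing a GA dimer, an A monomer and a GCG trimer repeat.
example = "TT" + "GA" * 6 + "CC" + "A" * 8 + "GCG" * 5 + "T"
ssrs = find_ssrs(example)
```

    Each hit is a 0-based start position, the repeated motif, and the number of copies. A genome-scale tool would also need to handle imperfect repeats, both strands, and sequences too large to hold in memory, none of which this sketch attempts.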

  17. GAPIT Version 2: An Enhanced Integrated Tool for Genomic Association and Prediction.

    PubMed

    Tang, You; Liu, Xiaolei; Wang, Jiabo; Li, Meng; Wang, Qishan; Tian, Feng; Su, Zhongbin; Pan, Yuchun; Liu, Di; Lipka, Alexander E; Buckler, Edward S; Zhang, Zhiwu

    2016-07-01

    Most human diseases and agriculturally important traits are complex. Dissecting their genetic architecture requires continued development of innovative and powerful statistical methods. Corresponding advances in computing tools are critical to efficiently use these statistical innovations and to enhance and accelerate biomedical and agricultural research and applications. The genome association and prediction integrated tool (GAPIT) was first released in 2012 and became widely used for genome-wide association studies (GWAS) and genomic prediction. The GAPIT implemented computationally efficient statistical methods, including the compressed mixed linear model (CMLM) and genomic prediction by using genomic best linear unbiased prediction (gBLUP). New state-of-the-art statistical methods have now been implemented in a new, enhanced version of GAPIT. These methods include factored spectrally transformed linear mixed models (FaST-LMM), enriched CMLM (ECMLM), FaST-LMM-Select, and settlement of mixed linear models under progressively exclusive relationship (SUPER). The genomic prediction methods implemented in this new release of the GAPIT include gBLUP based on CMLM, ECMLM, and SUPER. Additionally, the GAPIT was updated to improve its existing output display features and to add new data display and evaluation functions, including new graphing options and capabilities, phenotype simulation, power analysis, and cross-validation. These enhancements make the GAPIT a valuable resource for determining appropriate experimental designs and performing GWAS and genomic prediction. The enhanced R-based GAPIT software package uses state-of-the-art methods to conduct GWAS and genomic prediction. The GAPIT also provides new functions for developing experimental designs and creating publication-ready tabular summaries and graphs to improve the efficiency and application of genomic research.

  18. Genome-wide scans between two honeybee populations reveal putative signatures of human-mediated selection.

    PubMed

    Parejo, M; Wragg, D; Henriques, D; Vignal, A; Neuditschko, M

    2017-09-04

    Human-mediated selection has left signatures in the genomes of many domesticated animals, including the European dark honeybee, Apis mellifera mellifera, which has been selected by apiculturists for centuries. Using whole-genome sequence information, we investigated selection signatures in spatially separated honeybee subpopulations (Switzerland, n = 39 and France, n = 17). Three different test statistics were calculated in windows of 2 kb (fixation index, cross-population extended haplotype homozygosity and cross-population composite likelihood ratio) and combined into a recently developed composite selection score. Applying a stringent false discovery rate of 0.01, we identified six significant selective sweeps distributed across five chromosomes covering eight genes. These genes are associated with multiple molecular and biological functions, including regulation of transcription, receptor binding and signal transduction. Of particular interest is a selection signature on chromosome 1, which corresponds to the WNT4 gene, the family of which is conserved across the animal kingdom with a variety of functions. In Drosophila melanogaster, WNT4 alleles have been associated with differential wing, cross vein and abdominal phenotypes. Defining phenotypic characteristics of different Apis mellifera ssp., which are typically used as selection criteria, include colour and wing venation pattern. This signal is therefore likely to be a good candidate for human-mediated selection arising from different applied breeding practices in the two managed populations. © 2017 The Authors. Animal Genetics published by John Wiley & Sons Ltd on behalf of Stichting International Foundation for Animal Genetics.
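
    The abstract reports a false discovery rate threshold of 0.01 but does not name the procedure used. As a sketch under that caveat, the widely used Benjamini-Hochberg step-up procedure is shown below applied to per-window p-values; the function name and the toy p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, fdr=0.01):
    """Return sorted indices of tests declared significant at the given FDR.

    Benjamini-Hochberg step-up procedure; assumed here for illustration,
    since the abstract does not state which FDR method was applied.
    """
    m = len(p_values)
    # Indices of the tests ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank k with p_(k) <= k * fdr / m.
        if p_values[i] <= rank * fdr / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Toy p-values for five genomic windows (made up for the example).
window_p = [0.0001, 0.5, 0.004, 0.03, 0.0009]
significant = benjamini_hochberg(window_p, fdr=0.01)
```

    Every window ranked at or below the largest passing rank is retained, which is what distinguishes the step-up rule from simply thresholding each p-value on its own.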

  19. Identifying candidate driver genes by integrative ovarian cancer genomics data

    NASA Astrophysics Data System (ADS)

    Lu, Xinguo; Lu, Jibo

    2017-08-01

    Integrative analysis of molecular mechanics underlying cancer can distinguish interactions that cannot be revealed from a single kind of data, supporting the appropriate diagnosis and treatment of cancer patients. Tumor samples exhibit heterogeneity in omics data, such as somatic mutations, copy number variations (CNVs), gene expression profiles and so on. In this paper we combined gene co-expression modules and mutation modulators separately in tumor patients to obtain the candidate driver genes for resistant and sensitive tumors from the heterogeneous data. The modulators in the final list are well known in biological processes associated with ovarian cancer, such as CCL17, CACTIN, CCL16, CCL22, APOB, KDF1, CCL11, HNF1B, LRG1, MED1 and so on, and can help to facilitate the discovery of biomarkers, molecular diagnostics, and drug discovery.

  20. Integrative Exploratory Analysis of Two or More Genomic Datasets.

    PubMed

    Meng, Chen; Culhane, Aedin

    2016-01-01

    Exploratory analysis is an essential step in the analysis of high-throughput data. Multivariate approaches such as correspondence analysis (CA), principal component analysis, and multidimensional scaling are widely used in the exploratory analysis of a single dataset. Modern biological studies often assay multiple types of biological molecules (e.g., mRNA, protein, phosphoproteins) on the same set of biological samples, thereby creating multiple different types of omics data or multiassay data. Integrative exploratory analysis of these multiple omics data is required to leverage the potential of multiple omics studies. In this chapter, we describe the application of co-inertia analysis (CIA; for analyzing two datasets) and multiple co-inertia analysis (MCIA; for three or more datasets) to address this problem. These methods are powerful yet simple multivariate approaches that represent samples using a lower number of variables, allowing easier identification of the correlated structure in and between multiple high-dimensional datasets. Graphical representations can be employed for this purpose. In addition, the methods simultaneously project samples and variables (genes, proteins) onto the same lower dimensional space, so the most variant variables from each dataset can be selected and associated with samples, which can be further used to facilitate biological interpretation and pathway analysis. We applied CIA to explore the concordance between mRNA and protein expression in a panel of 60 tumor cell lines from the National Cancer Institute. In the same 60 cell lines, we used MCIA to perform a cross-platform comparison of mRNA gene expression profiles obtained on four different microarray platforms. Last, as an example of integrative analysis of multiassay or multi-omics data, we analyzed transcriptomic, proteomic, and phosphoproteomic data from induced pluripotent (iPS) and embryonic stem (ES) cell lines.
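
    The idea of measuring concordance between two datasets collected on the same samples can be illustrated with the RV coefficient, a global similarity measure commonly reported alongside co-inertia analysis. The sketch below is a simplified, pure-Python illustration under stated assumptions: rows are samples, columns are variables, no row or column weights are applied, and the toy matrices are invented for the example. It is not the chapter's own implementation, and it omits the ordination step that CIA performs.

```python
def center_columns(matrix):
    """Subtract each column mean so every variable has mean zero."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[v - mu for v, mu in zip(row, means)] for row in matrix]


def gram(matrix):
    """Sample-by-sample inner-product matrix (X X^T)."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in matrix]
            for r1 in matrix]


def trace_product(a, b):
    """trace(A B) for two square matrices of the same size."""
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))


def rv_coefficient(x, y):
    """RV coefficient between two datasets measured on the same samples.

    Values near 1 indicate that the two datasets arrange the samples
    in a similar way; values near 0 indicate little shared structure.
    """
    gx = gram(center_columns(x))
    gy = gram(center_columns(y))
    denom = (trace_product(gx, gx) * trace_product(gy, gy)) ** 0.5
    return trace_product(gx, gy) / denom


# Toy data (hypothetical): three samples, two variables per dataset.
toy_mrna = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
toy_protein = [[2.0 * v for v in row] for row in toy_mrna]  # rescaled copy
```

    Because the second toy matrix is just a rescaled copy of the first, `rv_coefficient(toy_mrna, toy_protein)` comes out at 1 up to floating-point error; real mRNA and protein profiles would typically give a lower value.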

  1. Cerebral White Matter Integrity Mediates Adult Age Differences in Cognitive Performance

    ERIC Educational Resources Information Center

    Madden, David J.; Spaniol, Julia; Costello, Matthew C.; Bucur, Barbara; White, Leonard E.; Cabeza, Roberto; Davis, Simon W.; Dennis, Nancy A.; Provenzale, James M.; Huettel, Scott A.

    2009-01-01

    Previous research has established that age-related decline occurs in measures of cerebral white matter integrity, but the role of this decline in age-related cognitive changes is not clear. To conclude that white matter integrity has a mediating (causal) contribution, it is necessary to demonstrate that statistical control of the white…

  2. Genome3D: A viewer-model framework for integrating and visualizing multi-scale epigenomic information within a three-dimensional genome

    PubMed Central

    2010-01-01

    Background: New technologies are enabling the measurement of many types of genomic and epigenomic information at scales ranging from the atomic to nuclear. Much of this new data is increasingly structural in nature, and is often difficult to coordinate with other data sets. There is a legitimate need for integrating and visualizing these disparate data sets to reveal structural relationships not apparent when looking at these data in isolation. Results: We have applied object-oriented technology to develop a downloadable visualization tool, Genome3D, for integrating and displaying epigenomic data within a prescribed three-dimensional physical model of the human genome. In order to integrate and visualize large volumes of data, novel statistical and mathematical approaches have been developed to reduce the size of the data. To our knowledge, this is the first such tool developed that can visualize the human genome in three dimensions. We describe here the major features of Genome3D and discuss our multi-scale data framework using a representative basic physical model. We then demonstrate many of the issues and benefits of multi-resolution data integration. Conclusions: Genome3D is a software visualization tool that explores a wide range of structural genomic and epigenetic data. Data from various sources of differing scales can be integrated within a hierarchical framework that is easily adapted to new developments concerning the structure of the physical genome. In addition, our tool has a simple annotation mechanism to incorporate non-structural information. Genome3D is unique in its ability to manipulate large amounts of multi-resolution data from diverse sources to uncover complex and new structural relationships within the genome. PMID:20813045

  3. Generating genetically modified mice using CRISPR/Cas-mediated genome engineering.

    PubMed

    Yang, Hui; Wang, Haoyi; Jaenisch, Rudolf

    2014-08-01

    Mice with specific gene modifications are valuable tools for studying development and disease. Traditional gene targeting in mice using embryonic stem (ES) cells, although suitable for generating sophisticated genetic modifications in endogenous genes, is complex and time-consuming. We have recently described CRISPR/Cas-mediated genome engineering for the generation of mice carrying mutations in multiple genes, endogenous reporters, conditional alleles or defined deletions. Here we provide a detailed protocol for embryo manipulation by piezo-driven injection of nucleic acids into the cytoplasm to create gene-modified mice. Beginning with target design, the generation of gene-modified mice can be achieved in as little as 4 weeks. We also describe the application of the CRISPR/Cas technology for the simultaneous editing of multiple genes (five genes or more) after a single transfection of ES cells. The principles described in this protocol have already been applied in rats and primates, and they are applicable to sophisticated genome engineering in species in which ES cells are not available.

  4. A Bayesian integrative approach for multi-platform genomic data: A kidney cancer case study.

    PubMed

    Chekouo, Thierry; Stingo, Francesco C; Doecke, James D; Do, Kim-Anh

    2017-06-01

    Integration of genomic data from multiple platforms has the capability to increase precision, accuracy, and statistical power in the identification of prognostic biomarkers. A fundamental problem faced in many multi-platform studies is unbalanced sample sizes due to the inability to obtain measurements from all the platforms for all the patients in the study. We have developed a novel Bayesian approach that integrates multi-regression models to identify a small set of biomarkers that can accurately predict time-to-event outcomes. This method fully exploits the amount of available information across platforms and does not exclude any of the subjects from the analysis. Through simulations, we demonstrate the utility of our method and compare its performance to that of methods that do not borrow information across regression models. Motivated by The Cancer Genome Atlas kidney renal cell carcinoma dataset, our methodology provides novel insights missed by non-integrative models. © 2016, The International Biometric Society.

  5. Integration of banana streak badnavirus into the Musa genome: molecular and cytogenetic evidence.

    PubMed

    Harper, G; Osuji, J O; Heslop-Harrison, J S; Hull, R

    1999-03-15

    Breeding and tissue culture of certain cultivars of bananas (Musa) have led to high levels of banana streak badnavirus (BSV) infection in progeny from symptomless parents. BSV DNA hybridized to genomic DNA of one such parent, Obino l'Ewai, suggesting integration of viral sequences. Sequencing of clones of Obino l'Ewai genomic DNA revealed an interface between BSV and Musa sequences and a complex BSV integrant. In situ hybridization revealed two different BSV sequence locations in Obino l'Ewai chromosomes and a complex arrangement of BSV and Musa sequences was shown by probing stretched DNA fibers. This is the first report of integrated sequences that possibly lead to a plant pararetrovirus episomal infection by a mechanism differing markedly from animal retroviral systems.

  6. Annotation of the zebrafish genome through an integrated transcriptomic and proteomic analysis.

    PubMed

    Kelkar, Dhanashree S; Provost, Elayne; Chaerkady, Raghothama; Muthusamy, Babylakshmi; Manda, Srikanth S; Subbannayya, Tejaswini; Selvan, Lakshmi Dhevi N; Wang, Chieh-Huei; Datta, Keshava K; Woo, Sunghee; Dwivedi, Sutopa B; Renuse, Santosh; Getnet, Derese; Huang, Tai-Chung; Kim, Min-Sik; Pinto, Sneha M; Mitchell, Christopher J; Madugundu, Anil K; Kumar, Praveen; Sharma, Jyoti; Advani, Jayshree; Dey, Gourav; Balakrishnan, Lavanya; Syed, Nazia; Nanjappa, Vishalakshi; Subbannayya, Yashwanth; Goel, Renu; Prasad, T S Keshava; Bafna, Vineet; Sirdeshmukh, Ravi; Gowda, Harsha; Wang, Charles; Leach, Steven D; Pandey, Akhilesh

    2014-11-01

    Accurate annotation of protein-coding genes is one of the primary tasks upon the completion of whole genome sequencing of any organism. In this study, we used an integrated transcriptomic and proteomic strategy to validate and improve the existing zebrafish genome annotation. We undertook high-resolution mass-spectrometry-based proteomic profiling of 10 adult organs, whole adult fish body, and two developmental stages of zebrafish (SAT line), in addition to transcriptomic profiling of six organs. More than 7,000 proteins were identified from proteomic analyses, and ∼ 69,000 high-confidence transcripts were assembled from the RNA sequencing data. Approximately 15% of the transcripts mapped to intergenic regions, the majority of which are likely long non-coding RNAs. These high-quality transcriptomic and proteomic data were used to manually reannotate the zebrafish genome. We report the identification of 157 novel protein-coding genes. In addition, our data led to modification of existing gene structures including novel exons, changes in exon coordinates, changes in frame of translation, translation in annotated UTRs, and joining of genes. Finally, we discovered four instances of genome assembly errors that were supported by both proteomic and transcriptomic data. Our study shows how an integrative analysis of the transcriptome and the proteome can extend our understanding of even well-annotated genomes.

  7. TIGER: Toolbox for integrating genome-scale metabolic models, expression data, and transcriptional regulatory networks

    PubMed Central

    2011-01-01

    Background: Several methods have been developed for analyzing genome-scale models of metabolism and transcriptional regulation. Many of these methods, such as Flux Balance Analysis, use constrained optimization to predict relationships between metabolic flux and the genes that encode and regulate enzyme activity. Recently, mixed integer programming has been used to encode these gene-protein-reaction (GPR) relationships into a single optimization problem, but these techniques are often of limited generality and lack a tool for automating the conversion of rules to a coupled regulatory/metabolic model. Results: We present TIGER, a Toolbox for Integrating Genome-scale Metabolism, Expression, and Regulation. TIGER converts a series of generalized, Boolean or multilevel rules into a set of mixed integer inequalities. The package also includes implementations of existing algorithms to integrate high-throughput expression data with genome-scale models of metabolism and transcriptional regulation. We demonstrate how TIGER automates the coupling of a genome-scale metabolic model with GPR logic and models of transcriptional regulation, thereby serving as a platform for algorithm development and large-scale metabolic analysis. Additionally, we demonstrate how TIGER's algorithms can be used to identify inconsistencies and improve existing models of transcriptional regulation with examples from the reconstructed transcriptional regulatory network of Saccharomyces cerevisiae. Conclusion: The TIGER package provides a consistent platform for algorithm development and extending existing genome-scale metabolic models with regulatory networks and high-throughput data. PMID:21943338
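
    To make the phrase "converts Boolean rules into mixed integer inequalities" concrete, the sketch below uses the standard textbook linearization of AND and OR over binary variables. This is an assumption made for illustration and not necessarily the encoding TIGER itself generates; the function names are hypothetical. The code enumerates every binary input and confirms that the inequalities leave exactly one feasible output, namely the Boolean result.

```python
from itertools import product


def and_constraints(x, y, z):
    """Inequalities that force z == (x AND y) for binary x, y, z."""
    return z <= x and z <= y and z >= x + y - 1


def or_constraints(x, y, z):
    """Inequalities that force z == (x OR y) for binary x, y, z."""
    return z >= x and z >= y and z <= x + y


def feasible_outputs(constraints, x, y):
    """All binary z values that satisfy the inequalities for given x, y."""
    return [z for z in (0, 1) if constraints(x, y, z)]


def check_encoding():
    """Verify both encodings against Python's bitwise operators."""
    for x, y in product((0, 1), repeat=2):
        assert feasible_outputs(and_constraints, x, y) == [x & y]
        assert feasible_outputs(or_constraints, x, y) == [x | y]
    return True
```

    In a GPR setting, a rule such as "the reaction is active if gene A AND gene B are expressed" would contribute the three AND inequalities to the optimization problem, with the binary output gating the flux bounds of the reaction.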

  8. Tc1-like Transposase Thm3 of Silver Carp (Hypophthalmichthys molitrix) Can Mediate Gene Transposition in the Genome of Blunt Snout Bream (Megalobrama amblycephala).

    PubMed

    Guo, Xiu-Ming; Zhang, Qian-Qian; Sun, Yi-Wen; Jiang, Xia-Yun; Zou, Shu-Ming

    2015-10-04

    Tc1-like transposons consist of an inverted repeat sequence flanking a transposase gene that exhibits similarity to the mobile DNA element, Tc1, of the nematode, Caenorhabditis elegans. They are widely distributed within vertebrate genomes including teleost fish; however, few active Tc1-like transposases have been discovered. In this study, 17 Tc1-like transposon sequences were isolated from 10 freshwater fish species belonging to the families Cyprinidae, Adrianichthyidae, Cichlidae, and Salmonidae. We conducted phylogenetic analyses of these sequences using previously isolated Tc1-like transposases and report that 16 of these elements comprise a new subfamily of Tc1-like transposons. In particular, we show that one transposon, Thm3 from silver carp (Hypophthalmichthys molitrix; Cyprinidae), can encode a 335-aa transposase with apparently intact domains, containing three to five copies in its genome. We then coinjected donor plasmids harboring 367 bp of the left end and 230 bp of the right end of the nonautonomous silver carp Thm1 cis-element along with capped Thm3 transposase RNA into the embryos of blunt snout bream (Megalobrama amblycephala; one- to two-cell embryos). This experiment revealed that the average integration rate could reach 50.6% in adult fish. Within the blunt snout bream genome, the TA dinucleotide direct repeat, which is the signature of the Tc1-like family of transposons, was created adjacent to both ends of Thm1 at the integration sites. Our results indicate that the silver carp Thm3 transposase can mediate gene insertion by transposition within the blunt snout bream genome, and that this occurs with a TA position preference.

  9. Production of α1,3-galactosyltransferase targeted pigs using transcription activator-like effector nuclease-mediated genome editing technology.

    PubMed

    Kang, Jung-Taek; Kwon, Dae-Kee; Park, A-Rum; Lee, Eun-Jin; Yun, Yun-Jin; Ji, Dal-Young; Lee, Kiho; Park, Kwang-Wook

    2016-03-01

    Recent developments in genome editing technology using meganucleases demonstrate an efficient method of producing gene edited pigs. In this study, we examined the effectiveness of the transcription activator-like effector nuclease (TALEN) system in generating specific mutations on the pig genome. Specific TALEN was designed to induce a double-strand break on exon 9 of the porcine α1,3-galactosyltransferase (GGTA1) gene as it is the main cause of hyperacute rejection after xenotransplantation. Human decay-accelerating factor (hDAF) gene, which can produce a complement inhibitor to protect cells from complement attack after xenotransplantation, was also integrated into the genome simultaneously. Plasmids coding for the TALEN pair and hDAF gene were transfected into porcine cells by electroporation to disrupt the porcine GGTA1 gene and express hDAF. The transfected cells were then sorted using a biotin-labeled IB4 lectin attached to magnetic beads to obtain GGTA1 deficient cells. As a result, we established GGTA1 knockout (KO) cell lines with biallelic modification (35.0%) and GGTA1 KO cell lines expressing hDAF (13.0%). When these cells were used for somatic cell nuclear transfer, we successfully obtained live GGTA1 KO pigs expressing hDAF. Our results demonstrate that TALEN-mediated genome editing is efficient and can be successfully used to generate gene edited pigs.

  10. New Insights into the Classification and Integration Specificity of Streptococcus Integrative Conjugative Elements through Extensive Genome Exploration.

    PubMed

    Ambroset, Chloé; Coluzzi, Charles; Guédon, Gérard; Devignes, Marie-Dominique; Loux, Valentin; Lacroix, Thomas; Payot, Sophie; Leblond-Bourget, Nathalie

    2015-01-01

    Recent genome analyses suggest that integrative and conjugative elements (ICEs) are widespread in bacterial genomes and therefore play an essential role in horizontal transfer. However, only a few of these elements are precisely characterized and correctly delineated within sequenced bacterial genomes. Even though previous analysis showed the presence of ICEs in some species of Streptococci, the global prevalence and diversity of ICEs was not analyzed in this genus. In this study, we searched for ICEs in the completely sequenced genomes of 124 strains belonging to 27 streptococcal species. These exhaustive analyses revealed 105 putative ICEs and 26 slightly decayed elements whose limits were assessed and whose insertion site was identified. These ICEs were grouped in seven distinct unrelated or distantly related families, according to their conjugation modules. Integration of these streptococcal ICEs is catalyzed either by a site-specific tyrosine integrase, a low-specificity tyrosine integrase, a site-specific single serine integrase, a triplet of site-specific serine integrases or a DDE transposase. Analysis of their integration site led to the detection of 18 target-genes for streptococcal ICE insertion including eight that had not been identified previously (ftsK, guaA, lysS, mutT, rpmG, rpsI, traG, and ebfC). It also suggests that all specificities have evolved to minimize the impact of the insertion on the host. This overall analysis of streptococcal ICEs emphasizes their prevalence and diversity and demonstrates that exchanges or acquisitions of conjugation and recombination modules are frequent.

  11. Distinct SUMO Ligases Cooperate with Esc2 and Slx5 to Suppress Duplication-Mediated Genome Rearrangements

    PubMed Central

    Albuquerque, Claudio P.; Wang, Guoliang; Lee, Nancy S.; Kolodner, Richard D.; Putnam, Christopher D.; Zhou, Huilin

    2013-01-01

    Suppression of duplication-mediated gross chromosomal rearrangements (GCRs) is essential to maintain genome integrity in eukaryotes. Here we report that SUMO ligase Mms21 has a strong role in suppressing GCRs in Saccharomyces cerevisiae, while Siz1 and Siz2 have weaker and partially redundant roles. Understanding the functions of these enzymes has been hampered by a paucity of knowledge of their substrate specificity in vivo. Using a new quantitative SUMO-proteomics technology, we found that Siz1 and Siz2 redundantly control the abundances of most sumoylated substrates, while Mms21 more specifically regulates sumoylation of RNA polymerase-I and the SMC-family proteins. Interestingly, Esc2, a SUMO-like domain-containing protein, specifically promotes the accumulation of sumoylated Mms21-specific substrates and functions with Mms21 to suppress GCRs. On the other hand, the Slx5-Slx8 complex, a SUMO-targeted ubiquitin ligase, suppresses the accumulation of sumoylated Mms21-specific substrates. Thus, distinct SUMO ligases work in concert with Esc2 and Slx5-Slx8 to control substrate specificity and sumoylation homeostasis to prevent GCRs. PMID:23935535

  12. Distinct SUMO ligases cooperate with Esc2 and Slx5 to suppress duplication-mediated genome rearrangements.

    PubMed

    Albuquerque, Claudio P; Wang, Guoliang; Lee, Nancy S; Kolodner, Richard D; Putnam, Christopher D; Zhou, Huilin

    2013-01-01

    Suppression of duplication-mediated gross chromosomal rearrangements (GCRs) is essential to maintain genome integrity in eukaryotes. Here we report that SUMO ligase Mms21 has a strong role in suppressing GCRs in Saccharomyces cerevisiae, while Siz1 and Siz2 have weaker and partially redundant roles. Understanding the functions of these enzymes has been hampered by a paucity of knowledge of their substrate specificity in vivo. Using a new quantitative SUMO-proteomics technology, we found that Siz1 and Siz2 redundantly control the abundances of most sumoylated substrates, while Mms21 more specifically regulates sumoylation of RNA polymerase-I and the SMC-family proteins. Interestingly, Esc2, a SUMO-like domain-containing protein, specifically promotes the accumulation of sumoylated Mms21-specific substrates and functions with Mms21 to suppress GCRs. On the other hand, the Slx5-Slx8 complex, a SUMO-targeted ubiquitin ligase, suppresses the accumulation of sumoylated Mms21-specific substrates. Thus, distinct SUMO ligases work in concert with Esc2 and Slx5-Slx8 to control substrate specificity and sumoylation homeostasis to prevent GCRs.

  13. Exploring breast carcinogenesis through integrative genomics and epigenomics analyses.

    PubMed

    Minning, Chin; Mokhtar, Norfilza Mohd; Abdullah, Norlia; Muhammad, Rohaizak; Emran, Nor Aina; Ali, Siti Aishah M D; Harun, Roslan; Jamal, Rahman

    2014-11-01

    There have been many DNA methylation studies on breast cancer which showed various methylation patterns involving tumour suppressor genes and oncogenes, but only a few of those studies link the methylation data with gene expression. More data are required, especially from the Asian region, to analyse how the epigenome data correlate with the transcriptome. DNA methylation profiling was carried out on 76 fresh frozen primary breast tumour tissues and 25 adjacent non-cancerous breast tissues using the Illumina Infinium® HumanMethylation27 BeadChip. Validation of methylation results was performed on 7 genes using either MS-MLPA or MS-qPCR. Gene expression profiling was done on 15 breast tumours and 5 adjacent non-cancerous breast tissues using the Affymetrix GeneChip® Human Gene 1.0 ST array. The overlapping genes between DNA methylation and gene expression datasets were further mapped to the KEGG database to identify the molecular pathways that linked these genes together. Supervised hierarchical cluster analysis revealed 1,389 hypermethylated CpG sites and 22 hypomethylated CpG sites in cancer compared to the normal samples. Gene expression microarray analysis using a fold-change of at least 1.5 and a false discovery rate (FDR) at p<0.05 identified 404 upregulated and 463 downregulated genes in cancer samples. Integration of both datasets identified 51 genes with hypermethylation with low expression (negative association) and 13 genes with hypermethylation with high expression (positive association). Most of the overlapping genes belong to the focal adhesion and extracellular matrix-receptor interaction pathways that play important roles in breast carcinogenesis. The present study displayed the value of using multiple datasets in the same set of tissues and how the integrative analysis can create a list of well-focused genes as well as show the correlation between epigenetic changes and gene expression. These gene signatures can help us understand the epigenetic
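
    The integration step described above, overlapping hypermethylated genes with expression calls and splitting them into negative and positive associations, can be sketched as follows. This is a minimal illustration under stated assumptions: fold change is taken as a tumour/normal ratio, the 1.5 threshold is applied symmetrically (down means a ratio of at most 1/1.5), the FDR filter is omitted, and the gene names and values are toy data invented for the example. It is not the authors' pipeline.

```python
def classify_expression(fold_change, threshold=1.5):
    """Label a tumour/normal expression ratio as up, down or unchanged."""
    if fold_change >= threshold:
        return "up"
    if fold_change <= 1.0 / threshold:
        return "down"
    return "unchanged"


def integrate(hypermethylated, fold_changes, threshold=1.5):
    """Split hypermethylated genes by the direction of expression change.

    Returns (negative, positive): hypermethylated genes with low
    expression and hypermethylated genes with high expression.
    Genes without expression data, or without a change, are skipped.
    """
    negative, positive = [], []
    for gene in sorted(hypermethylated):
        if gene not in fold_changes:
            continue
        status = classify_expression(fold_changes[gene], threshold)
        if status == "down":
            negative.append(gene)
        elif status == "up":
            positive.append(gene)
    return negative, positive


# Toy inputs (hypothetical gene names and ratios).
toy_hypermethylated = {"geneA", "geneB", "geneC"}
toy_fold_changes = {"geneA": 0.4, "geneB": 2.0, "geneC": 1.1, "geneD": 3.0}
toy_negative, toy_positive = integrate(toy_hypermethylated, toy_fold_changes)
```

    With these toy inputs, geneA falls into the negative-association list and geneB into the positive-association list; geneC is unchanged and geneD is not hypermethylated, so both are left out.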

  14. Integration of Brassica A genome genetic linkage map between Brassica napus and B. rapa.

    PubMed

    Suwabe, Keita; Morgan, Colin; Bancroft, Ian

    2008-03-01

    An integrated linkage map between B. napus and B. rapa was constructed based on a total of 44 common markers comprising 41 SSR (33 BRMS, 6 Saskatoon, and 2 BBSRC) and 3 SNP/indel markers. Between 3 and 7 common markers were mapped onto each of the linkage groups A1 to A10. The position and order of most common markers revealed a high level of colinearity between species, although two small regions on A4, A5, and A10 revealed apparent local inversions between them. These results indicate that the A genome of Brassica has retained a high degree of colinearity between species, despite each species having evolved independently after the integration of the A and C genomes in the amphidiploid state. Our results provide a genetic integration of the Brassica A genome between B. napus and B. rapa. As the analysis employed sequence-based molecular markers, the information will accelerate the exploitation of the B. rapa genome sequence for the improvement of oilseed rape.

  15. Integrative physiology, functional genomics and the phenotype gap: a guide for comparative physiologists.

    PubMed

    Dow, Julian A T

    2007-05-01

    Classical, curiosity-led comparative physiology finds itself at a crossroads. Major funding for classical physiology is becoming harder to find, as grant agencies focus on more molecular approaches or on science with more immediate strategic value to their respective countries. In turn, this shift in funding places Zoology and Animal Science departments under enormous stress: student numbers are buoyant, but how can research funding be maintained at high levels? Our research group has argued for the redefinition of integrative physiology as the investigation of gene function in an organotypic context in the intact animal. Implicit in this definition is the use of transgenics and reverse genetics to manipulate gene function in a cell-specific manner; this in turn implies the use of a genetically tractable 'model organism'. The significance of this definition is that it aligns integrative physiology with functional genomics. Again, functional genomics draws heavily on reverse genetics to elucidate the function of novel genes. The phenotype gap (the mismatch between what a genetic model organism's genome encodes and the reasons that it has historically been studied) emphasises the need to attract and empower functional biologists: can all 13,500 genes in Drosophila really be explained in terms of developmental biology? So, by embracing the integrative physiology manifesto, comparative physiologists can not only accelerate their own research, but their functional skills can make them indispensable in the post-genomic endeavour.

  16. GnpIS: an information system to integrate genetic and genomic data from plants and fungi

    PubMed Central

    Steinbach, Delphine; Alaux, Michael; Amselem, Joelle; Choisne, Nathalie; Durand, Sophie; Flores, Raphaël; Keliet, Aminah-Olivia; Kimmel, Erik; Lapalu, Nicolas; Luyten, Isabelle; Michotey, Célia; Mohellibi, Nacer; Pommier, Cyril; Reboux, Sébastien; Valdenaire, Dorothée; Verdelet, Daphné; Quesneville, Hadi

    2013-01-01

    Data integration is a key challenge for modern bioinformatics. It aims to provide biologists with tools to explore relevant data produced by different studies. Large-scale international projects can generate lots of heterogeneous and unrelated data. The challenge is to integrate this information with other publicly available data. Nucleotide sequencing throughput has been improved with new technologies; this increases the need for powerful information systems able to store, manage and explore data. GnpIS is a multispecies integrative information system dedicated to plant and fungi pests. It bridges genetic and genomic data, allowing researchers access to both genetic information (e.g. genetic maps, quantitative trait loci, markers, single nucleotide polymorphisms, germplasms and genotypes) and genomic data (e.g. genomic sequences, physical maps, genome annotation and expression data) for species of agronomical interest. GnpIS is used by both large international projects and plant science departments at the French National Institute for Agricultural Research. Here, we illustrate its use. Database URL: http://urgi.versailles.inra.fr/gnpis PMID:23959375

  17. GnpIS: an information system to integrate genetic and genomic data from plants and fungi.

    PubMed

    Steinbach, Delphine; Alaux, Michael; Amselem, Joelle; Choisne, Nathalie; Durand, Sophie; Flores, Raphaël; Keliet, Aminah-Olivia; Kimmel, Erik; Lapalu, Nicolas; Luyten, Isabelle; Michotey, Célia; Mohellibi, Nacer; Pommier, Cyril; Reboux, Sébastien; Valdenaire, Dorothée; Verdelet, Daphné; Quesneville, Hadi

    2013-01-01

    Data integration is a key challenge for modern bioinformatics. It aims to provide biologists with tools to explore relevant data produced by different studies. Large-scale international projects can generate lots of heterogeneous and unrelated data. The challenge is to integrate this information with other publicly available data. Nucleotide sequencing throughput has been improved with new technologies; this increases the need for powerful information systems able to store, manage and explore data. GnpIS is a multispecies integrative information system dedicated to plant and fungi pests. It bridges genetic and genomic data, allowing researchers access to both genetic information (e.g. genetic maps, quantitative trait loci, markers, single nucleotide polymorphisms, germplasms and genotypes) and genomic data (e.g. genomic sequences, physical maps, genome annotation and expression data) for species of agronomical interest. GnpIS is used by both large international projects and plant science departments at the French National Institute for Agricultural Research. Here, we illustrate its use. Database URL: http://urgi.versailles.inra.fr/gnpis.

  18. Increasing the Efficiency of CRISPR/Cas9-mediated Precise Genome Editing of HSV-1 Virus in Human Cells

    PubMed Central

    Lin, Chaolong; Li, Huanhuan; Hao, Mengru; Xiong, Dan; Luo, Yong; Huang, Chenghao; Yuan, Quan; Zhang, Jun; Xia, Ningshao

    2016-01-01

    Genetically modified HSV-1 viruses serve as promising vectors for tumour therapy and vaccine development. The CRISPR/Cas9 system is one of the most powerful tools for precise gene editing of the genomes of organisms. However, whether the CRISPR/Cas9 system can precisely and efficiently make gene replacements in the genome of HSV-1 remains essentially unknown. Here, we reported CRISPR/Cas9-mediated editing of the HSV-1 genome in human cells, including the knockout and replacement of large genes. In established cells stably expressing CRISPR/Cas9, gRNA in coordination with Cas9 could direct a precise cleavage within a pre-defined target region, and foreign genes were successfully used to replace the target gene seamlessly by HDR-mediated gene replacement. Introducing the NHEJ inhibitor SCR7 to the CRISPR/Cas9 system greatly facilitated HDR-mediated gene replacement in the HSV-1 genome. We provided the first genetic evidence that two copies of the ICP0 gene in different locations on the same HSV-1 genome could be simultaneously modified with high efficiency and with no off-target modifications. We also developed a revolutionized isolation platform for desired recombinant viruses using single-cell sorting. Together, our work provides a significantly improved method for targeted editing of DNA viruses, which will facilitate the development of anti-cancer oncolytic viruses and vaccines. PMID:27713537

  19. Increasing the Efficiency of CRISPR/Cas9-mediated Precise Genome Editing of HSV-1 Virus in Human Cells.

    PubMed

    Lin, Chaolong; Li, Huanhuan; Hao, Mengru; Xiong, Dan; Luo, Yong; Huang, Chenghao; Yuan, Quan; Zhang, Jun; Xia, Ningshao

    2016-10-07

    Genetically modified HSV-1 viruses serve as promising vectors for tumour therapy and vaccine development. The CRISPR/Cas9 system is one of the most powerful tools for precise gene editing of the genomes of organisms. However, whether the CRISPR/Cas9 system can precisely and efficiently make gene replacements in the genome of HSV-1 remains essentially unknown. Here, we reported CRISPR/Cas9-mediated editing of the HSV-1 genome in human cells, including the knockout and replacement of large genes. In established cells stably expressing CRISPR/Cas9, gRNA in coordination with Cas9 could direct a precise cleavage within a pre-defined target region, and foreign genes were successfully used to replace the target gene seamlessly by HDR-mediated gene replacement. Introducing the NHEJ inhibitor SCR7 to the CRISPR/Cas9 system greatly facilitated HDR-mediated gene replacement in the HSV-1 genome. We provided the first genetic evidence that two copies of the ICP0 gene in different locations on the same HSV-1 genome could be simultaneously modified with high efficiency and with no off-target modifications. We also developed a revolutionized isolation platform for desired recombinant viruses using single-cell sorting. Together, our work provides a significantly improved method for targeted editing of DNA viruses, which will facilitate the development of anti-cancer oncolytic viruses and vaccines.

  20. Integration and Querying of Genomic and Proteomic Semantic Annotations for Biomedical Knowledge Extraction.

    PubMed

    Masseroli, Marco; Canakoglu, Arif; Ceri, Stefano

    2016-01-01

    Understanding complex biological phenomena involves answering complex biomedical questions on multiple biomolecular information simultaneously, which are expressed through multiple genomic and proteomic semantic annotations scattered in many distributed and heterogeneous data sources; such heterogeneity and dispersion hamper the biologists' ability of asking global queries and performing global evaluations. To overcome this problem, we developed a software architecture to create and maintain a Genomic and Proteomic Knowledge Base (GPKB), which integrates several of the most relevant sources of such dispersed information (including Entrez Gene, UniProt, IntAct, Expasy Enzyme, GO, GOA, BioCyc, KEGG, Reactome, and OMIM). Our solution is general, as it uses a flexible, modular, and multilevel global data schema based on abstraction and generalization of integrated data features, and a set of automatic procedures for easing data integration and maintenance, also when the integrated data sources evolve in data content, structure, and number. These procedures also assure consistency, quality, and provenance tracking of all integrated data, and perform the semantic closure of the hierarchical relationships of the integrated biomedical ontologies. At http://www.bioinformatics.deib.polimi.it/GPKB/, a Web interface allows graphical easy composition of queries, although complex, on the knowledge base, supporting also semantic query expansion and comprehensive explorative search of the integrated data to better sustain biomedical knowledge extraction.

  1. IMG/M: integrated genome and metagenome comparative data analysis system.

    PubMed

    Chen, I-Min A; Markowitz, Victor M; Chu, Ken; Palaniappan, Krishna; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Andersen, Evan; Huntemann, Marcel; Varghese, Neha; Hadjithomas, Michalis; Tennessen, Kristin; Nielsen, Torben; Ivanova, Natalia N; Kyrpides, Nikos C

    2017-01-04

    The Integrated Microbial Genomes with Microbiome Samples (IMG/M: https://img.jgi.doe.gov/m/) system contains annotated DNA and RNA sequence data of (i) archaeal, bacterial, eukaryotic and viral genomes from cultured organisms, (ii) single cell genomes (SCG) and genomes from metagenomes (GFM) from uncultured archaea, bacteria and viruses and (iii) metagenomes from environmental, host associated and engineered microbiome samples. Sequence data are generated by DOE's Joint Genome Institute (JGI), submitted by individual scientists, or collected from public sequence data archives. Structural and functional annotation is carried out by JGI's genome and metagenome annotation pipelines. A variety of analytical and visualization tools provide support for examining and comparing IMG/M's datasets. IMG/M allows open access interactive analysis of publicly available datasets, while manual curation, submission and access to private datasets and computationally intensive workspace-based analysis require login/password access to its expert review (ER) companion system (IMG/M ER: https://img.jgi.doe.gov/mer/). Since the last report published in the 2014 NAR Database Issue, IMG/M's dataset content has tripled in terms of number of datasets and overall protein coding genes, while its analysis tools have been extended to cope with the rapid growth in the number and size of datasets handled by the system. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  2. IMG/M: integrated genome and metagenome comparative data analysis system

    PubMed Central

    Chen, I-Min A.; Markowitz, Victor M.; Chu, Ken; Palaniappan, Krishna; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Andersen, Evan; Huntemann, Marcel; Varghese, Neha; Hadjithomas, Michalis; Tennessen, Kristin; Nielsen, Torben; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2017-01-01

    The Integrated Microbial Genomes with Microbiome Samples (IMG/M: https://img.jgi.doe.gov/m/) system contains annotated DNA and RNA sequence data of (i) archaeal, bacterial, eukaryotic and viral genomes from cultured organisms, (ii) single cell genomes (SCG) and genomes from metagenomes (GFM) from uncultured archaea, bacteria and viruses and (iii) metagenomes from environmental, host associated and engineered microbiome samples. Sequence data are generated by DOE's Joint Genome Institute (JGI), submitted by individual scientists, or collected from public sequence data archives. Structural and functional annotation is carried out by JGI's genome and metagenome annotation pipelines. A variety of analytical and visualization tools provide support for examining and comparing IMG/M's datasets. IMG/M allows open access interactive analysis of publicly available datasets, while manual curation, submission and access to private datasets and computationally intensive workspace-based analysis require login/password access to its expert review (ER) companion system (IMG/M ER: https://img.jgi.doe.gov/mer/). Since the last report published in the 2014 NAR Database Issue, IMG/M's dataset content has tripled in terms of number of datasets and overall protein coding genes, while its analysis tools have been extended to cope with the rapid growth in the number and size of datasets handled by the system. PMID:27738135

  3. Ensembl Plants: Integrating Tools for Visualizing, Mining, and Analyzing Plant Genomics Data.

    PubMed

    Bolser, Dan; Staines, Daniel M; Pritchard, Emily; Kersey, Paul

    2016-01-01

    Ensembl Plants (http://plants.ensembl.org) is an integrative resource presenting genome-scale information for a growing number of sequenced plant species (currently 33). Data provided includes genome sequence, gene models, functional annotation, and polymorphic loci. Various additional information is provided for variation data, including population structure, individual genotypes, linkage, and phenotype data. In each release, comparative analyses are performed on whole genome and protein sequences, and genome alignments and gene trees are made available that show the implied evolutionary history of each gene family. Access to the data is provided through a genome browser incorporating many specialist interfaces for different data types, and through a variety of additional methods for programmatic access and data mining. These access routes are consistent with those offered through the Ensembl interface for the genomes of non-plant species, including those of plant pathogens, pests, and pollinators. Ensembl Plants is updated 4-5 times a year and is developed in collaboration with our international partners in the Gramene (http://www.gramene.org) and transPLANT projects (http://www.transplantdb.org).

  4. Integrated and sequence-ordered BAC- and YAC-based physical maps for the rat genome.

    PubMed

    Krzywinski, Martin; Wallis, John; Gösele, Claudia; Bosdet, Ian; Chiu, Readman; Graves, Tina; Hummel, Oliver; Layman, Dan; Mathewson, Carrie; Wye, Natasja; Zhu, Baoli; Albracht, Derek; Asano, Jennifer; Barber, Sarah; Brown-John, Mabel; Chan, Susanna; Chand, Steve; Cloutier, Alison; Davito, Jonathon; Fjell, Chris; Gaige, Tony; Ganten, Detlev; Girn, Noreen; Guggenheimer, Kurtis; Himmelbauer, Heinz; Kreitler, Thomas; Leach, Stephen; Lee, Darlene; Lehrach, Hans; Mayo, Michael; Mead, Kelly; Olson, Teika; Pandoh, Pawan; Prabhu, Anna-Liisa; Shin, Heesun; Tänzer, Simone; Thompson, Jason; Tsai, Miranda; Walker, Jason; Yang, George; Sekhon, Mandeep; Hillier, LaDeana; Zimdahl, Heike; Marziali, Andre; Osoegawa, Kazutoyo; Zhao, Shaying; Siddiqui, Asim; de Jong, Pieter J; Warren, Wes; Mardis, Elaine; McPherson, John D; Wilson, Richard; Hübner, Norbert; Jones, Steven; Marra, Marco; Schein, Jacqueline

    2004-04-01

    As part of the effort to sequence the genome of Rattus norvegicus, we constructed a physical map comprised of fingerprinted bacterial artificial chromosome (BAC) clones from the CHORI-230 BAC library. These BAC clones provide approximately 13-fold redundant coverage of the genome and have been assembled into 376 fingerprint contigs. A yeast artificial chromosome (YAC) map was also constructed and aligned with the BAC map via fingerprinted BAC and P1 artificial chromosome clones (PACs) sharing interspersed repetitive sequence markers with the YAC-based physical map. We have annotated 95% of the fingerprint map clones in contigs with coordinates on the version 3.1 rat genome sequence assembly, using BAC-end sequences and in silico mapping methods. These coordinates have allowed anchoring 358 of the 376 fingerprint map contigs onto the sequence assembly. Of these, 324 contigs are anchored to rat genome sequences localized to chromosomes, and 34 contigs are anchored to unlocalized portions of the rat sequence assembly. The remaining 18 contigs, containing 54 clones, still require placement. The fingerprint map is a high-resolution integrative data resource that provides genome-ordered associations among BAC, YAC, and PAC clones and the assembled sequence of the rat genome.
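
    The contig counts reported in this abstract can be cross-checked with a few lines of arithmetic. The following is only a sketch: the variable names are illustrative and do not come from the paper, and the numbers are copied from the abstract text above.

```python
# Contig counts as stated in the abstract (variable names are illustrative).
total_contigs = 376      # fingerprint contigs in the BAC map
anchored = 358           # contigs anchored onto the v3.1 sequence assembly
on_chromosomes = 324     # anchored to chromosome-localized sequence
unlocalized = 34         # anchored to unlocalized portions of the assembly
unplaced = 18            # contigs still requiring placement

# The two anchored categories should sum to the anchored total,
# and the unplaced contigs should account for the remainder.
assert on_chromosomes + unlocalized == anchored
assert total_contigs - anchored == unplaced

# Fraction of contigs anchored; this is distinct from the 95% of map clones
# that the abstract reports as annotated with assembly coordinates.
anchored_fraction = anchored / total_contigs
print(f"Anchored contigs: {anchored_fraction:.1%}")
```

    The figures are internally consistent: the chromosome-localized and unlocalized contigs add up to the anchored total, and the remainder matches the 18 contigs still awaiting placement.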

  5. Integrated cytogenetic BAC map of the genome of the gray, short-tailed opossum, Monodelphis domestica.

    PubMed

    Duke, S E; Samollow, P B; Mauceli, E; Lindblad-Toh, K; Breen, M

    2007-01-01

    The generation of high-quality genome assemblies for numerous species is advancing at a rapid pace. As the number of genome assemblies increases, so does our ability to investigate genome relationships and their contributions to unraveling complex biological, evolutionary, and biomedical processes. A key process in the generation of a genome assembly is to determine and verify the precise physical location and order of the large sequence blocks (scaffolds) that result from the assembly. For organisms of relatively recent common ancestry this process may be achieved largely through comparative sequence alignment. However, as the evolutionary distance between species lengthens, the use of comparative sequence alignment becomes increasingly less reliable. Simultaneous cytogenetic mapping, using multicolor fluorescence in-situ hybridization (FISH) analysis, offers an alternative means to define the cytogenetic location and relative order of DNA sequences, thereby anchoring the genome sequence to the karyotype. In this article we report the molecular cytogenetic locations of 415 bacterial artificial chromosome (BAC) clones that served to anchor sequence scaffolds of the gray, short-tailed opossum (Monodelphis domestica) to its karyotype, which enabled accurate integration of these regions into the genome assembly.

  6. IMG/M: integrated genome and metagenome comparative data analysis system

    DOE PAGES

    Chen, I-Min A.; Markowitz, Victor M.; Chu, Ken; ...

    2016-10-13

    The Integrated Microbial Genomes with Microbiome Samples (IMG/M: https://img.jgi.doe.gov/m/) system contains annotated DNA and RNA sequence data of (i) archaeal, bacterial, eukaryotic and viral genomes from cultured organisms, (ii) single cell genomes (SCG) and genomes from metagenomes (GFM) from uncultured archaea, bacteria and viruses and (iii) metagenomes from environmental, host associated and engineered microbiome samples. Sequence data are generated by DOE's Joint Genome Institute (JGI), submitted by individual scientists, or collected from public sequence data archives. Structural and functional annotation is carried out by JGI's genome and metagenome annotation pipelines. A variety of analytical and visualization tools provide support for examining and comparing IMG/M's datasets. IMG/M allows open access interactive analysis of publicly available datasets, while manual curation, submission and access to private datasets and computationally intensive workspace-based analysis require login/password access to its expert review (ER) companion system (IMG/M ER: https://img.jgi.doe.gov/mer/). Since the last report published in the 2014 NAR Database Issue, IMG/M's dataset content has tripled in terms of number of datasets and overall protein coding genes, while its analysis tools have been extended to cope with the rapid growth in the number and size of datasets handled by the system.

  7. IMG/M: integrated genome and metagenome comparative data analysis system

    SciTech Connect

    Chen, I-Min A.; Markowitz, Victor M.; Chu, Ken; Palaniappan, Krishna; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Andersen, Evan; Huntemann, Marcel; Varghese, Neha; Hadjithomas, Michalis; Tennessen, Kristin; Nielsen, Torben; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2016-10-13

    The Integrated Microbial Genomes with Microbiome Samples (IMG/M: https://img.jgi.doe.gov/m/) system contains annotated DNA and RNA sequence data of (i) archaeal, bacterial, eukaryotic and viral genomes from cultured organisms, (ii) single cell genomes (SCG) and genomes from metagenomes (GFM) from uncultured archaea, bacteria and viruses and (iii) metagenomes from environmental, host associated and engineered microbiome samples. Sequence data are generated by DOE's Joint Genome Institute (JGI), submitted by individual scientists, or collected from public sequence data archives. Structural and functional annotation is carried out by JGI's genome and metagenome annotation pipelines. A variety of analytical and visualization tools provide support for examining and comparing IMG/M's datasets. IMG/M allows open access interactive analysis of publicly available datasets, while manual curation, submission and access to private datasets and computationally intensive workspace-based analysis require login/password access to its expert review (ER) companion system (IMG/M ER: https://img.jgi.doe.gov/mer/). Since the last report published in the 2014 NAR Database Issue, IMG/M's dataset content has tripled in terms of number of datasets and overall protein coding genes, while its analysis tools have been extended to cope with the rapid growth in the number and size of datasets handled by the system.

  8. A Genome-Wide Analysis of Promoter-Mediated Phenotypic Noise in Escherichia coli

    PubMed Central

    Silander, Olin K.; Nikolic, Nela; Zaslaver, Alon; Bren, Anat; Kikoin, Ilya; Alon, Uri; Ackermann, Martin

    2012-01-01

    Gene expression is subject to random perturbations that lead to fluctuations in the rate of protein production. As a consequence, for any given protein, genetically identical organisms living in a constant environment will contain different amounts of that particular protein, resulting in different phenotypes. This phenomenon is known as “phenotypic noise.” In bacterial systems, previous studies have shown that, for specific genes, both transcriptional and translational processes affect phenotypic noise. Here, we focus on how the promoter regions of genes affect noise and ask whether levels of promoter-mediated noise are correlated with genes' functional attributes, using data for over 60% of all promoters in Escherichia coli. We find that essential genes and genes with a high degree of evolutionary conservation have promoters that confer low levels of noise. We also find that the level of noise cannot be attributed to the evolutionary time that different genes have spent in the genome of E. coli. In contrast to previous results in eukaryotes, we find no association between promoter-mediated noise and gene expression plasticity. These results are consistent with the hypothesis that, in bacteria, natural selection can act to reduce gene expression noise and that some of this noise is controlled through the sequence of the promoter region alone. PMID:22275871

  9. A genome-wide analysis of promoter-mediated phenotypic noise in Escherichia coli.

    PubMed

    Silander, Olin K; Nikolic, Nela; Zaslaver, Alon; Bren, Anat; Kikoin, Ilya; Alon, Uri; Ackermann, Martin

    2012-01-01

    Gene expression is subject to random perturbations that lead to fluctuations in the rate of protein production. As a consequence, for any given protein, genetically identical organisms living in a constant environment will contain different amounts of that particular protein, resulting in different phenotypes. This phenomenon is known as "phenotypic noise." In bacterial systems, previous studies have shown that, for specific genes, both transcriptional and translational processes affect phenotypic noise. Here, we focus on how the promoter regions of genes affect noise and ask whether levels of promoter-mediated noise are correlated with genes' functional attributes, using data for over 60% of all promoters in Escherichia coli. We find that essential genes and genes with a high degree of evolutionary conservation have promoters that confer low levels of noise. We also find that the level of noise cannot be attributed to the evolutionary time that different genes have spent in the genome of E. coli. In contrast to previous results in eukaryotes, we find no association between promoter-mediated noise and gene expression plasticity. These results are consistent with the hypothesis that, in bacteria, natural selection can act to reduce gene expression noise and that some of this noise is controlled through the sequence of the promoter region alone.

  10. Repression of telomerase gene promoter requires human-specific genomic context and is mediated by multiple HDAC1-containing corepressor complexes.

    PubMed

    Cheng, De; Zhao, Yuanjun; Wang, Shuwen; Zhang, Fan; Russo, Mariano; McMahon, Steven B; Zhu, Jiyue

    2017-03-01

    The human telomerase reverse transcriptase (hTERT) gene is repressed in most somatic cells, whereas the expression of the mouse mTert gene is widely detected. To understand the mechanisms of this human-specific repression, we constructed bacterial artificial chromosome (BAC) reporters using human and mouse genomic DNAs encompassing the TERT genes and neighboring loci. Upon chromosomal integration, the hTERT, but not the mTert, reporter was stringently repressed in telomerase-negative human cells in a histone deacetylase (HDAC)-dependent manner, replicating the expression of their respective endogenous genes. In chimeric BACs, the mTert promoter became strongly repressed in the human genomic context, but the hTERT promoter was highly active in the mouse genomic context. Furthermore, an unrelated herpes simplex virus-thymidine kinase (HSV-TK) promoter was strongly repressed in the human, but not in the mouse, genomic context. These results demonstrated that the repression of hTERT gene was dictated by distal elements and its chromatin environment. This repression depended on class I HDACs and involved multiple corepressor complexes, including HDAC1/2-containing Sin3B, nucleosome remodeling and histone deacetylase (NuRD), and corepressor of RE1 silencing transcription factor (CoREST) complexes. Together, our data indicate that the lack of telomerase expression in most human somatic cells results from its repressive genomic environment, providing new insight into the mechanism of long-recognized differential telomerase regulation in mammalian species. © FASEB.

  11. Zygote-mediated generation of genome-modified mice using Streptococcus thermophilus 1-derived CRISPR/Cas system.

    PubMed

    Fujii, Wataru; Kakuta, Shigeru; Yoshioka, Shin; Kyuwa, Shigeru; Sugiura, Koji; Naito, Kunihiko

    2016-08-26

    Mammalian zygote-mediated genome-engineering by CRISPR/Cas is currently used for the generation of genome-modified animals. Here we report that a Streptococcus thermophilus-1 derived orthologous CRISPR/Cas system, which recognizes the 5'-NNAGAA sequence as a protospacer adjacent motif (PAM), is useful in mouse zygotes and is applicable for generating knockout mice (87.5%) and targeted knock-in mice (45.5%). The induced mutation could be inherited in the next generation. This novel CRISPR/Cas can expand the feasibility of the zygote-mediated generation of genome-modified animals that require an exact mutation design. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
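
    To make the PAM requirement concrete, the sketch below scans a DNA string for the 5'-NNAGAA motif named in this abstract. It is a minimal illustration, not the authors' target-design method: the function name `find_pam_sites` and the example sequence are hypothetical, only the given strand is searched (no reverse complement), and no protospacer is extracted or scored.

```python
import re

# Lookahead pattern so that overlapping NNAGAA matches are all reported.
# N is restricted to A/C/G/T here, which is an assumption of this sketch.
PAM_PATTERN = re.compile(r"(?=([ACGT]{2}AGAA))")


def find_pam_sites(sequence):
    """Return (start_index, motif) for each NNAGAA match on the given strand."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in PAM_PATTERN.finditer(seq)]


# Made-up example sequence, used only to exercise the function.
example = "TTGCAGAACCGTAGAAGG"
for start, motif in find_pam_sites(example):
    print(start, motif)
```

    A real design workflow would also need to search the opposite strand and check the adjacent protospacer for uniqueness in the genome; those steps are outside the scope of this sketch.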

  12. Salt stress in Desulfovibrio vulgaris Hildenborough: an integrated genomics approach.

    PubMed

    Mukhopadhyay, Aindrila; He, Zhili; Alm, Eric J; Arkin, Adam P; Baidoo, Edward E; Borglin, Sharon C; Chen, Wenqiong; Hazen, Terry C; He, Qiang; Holman, Hoi-Ying; Huang, Katherine; Huang, Rick; Joyner, Dominique C; Katz, Natalie; Keller, Martin; Oeller, Paul; Redding, Alyssa; Sun, Jun; Wall, Judy; Wei, Jing; Yang, Zamin; Yen, Huei-Che; Zhou, Jizhong; Keasling, Jay D

    2006-06-01

    The ability of Desulfovibrio vulgaris Hildenborough to reduce, and therefore contain, toxic and radioactive metal waste has made all factors that affect the physiology of this organism of great interest. Increased salinity is an important and frequent fluctuation faced by D. vulgaris in its natural habitat. In liquid culture, exposure to excess salt resulted in striking elongation of D. vulgaris cells. Using data from transcriptomics, proteomics, metabolite assays, phospholipid fatty acid profiling, and electron microscopy, we used a systems approach to explore the effects of excess NaCl on D. vulgaris. In this study we demonstrated that import of osmoprotectants, such as glycine betaine and ectoine, is the primary mechanism used by D. vulgaris to counter hyperionic stress. Several efflux systems were also highly up-regulated, as was the ATP synthesis pathway. Increases in the levels of both RNA and DNA helicases suggested that salt stress affected the stability of nucleic acid base pairing. An overall increase in the level of branched fatty acids indicated that there were changes in cell wall fluidity. The immediate response to salt stress included up-regulation of chemotaxis genes, although flagellar biosynthesis was down-regulated. Other down-regulated systems included lactate uptake permeases and ABC transport systems. The results of an extensive NaCl stress analysis were compared with microarray data from a KCl stress analysis, and unlike many other bacteria, D. vulgaris responded similarly to the two stresses. Integration of data from multiple methods allowed us to develop a conceptual model for the salt stress response in D. vulgaris that can be compared to those in other microorganisms.

  13. Salt Stress in Desulfovibrio vulgaris Hildenborough: an Integrated Genomics Approach

    PubMed Central

    Mukhopadhyay, Aindrila; He, Zhili; Alm, Eric J.; Arkin, Adam P.; Baidoo, Edward E.; Borglin, Sharon C.; Chen, Wenqiong; Hazen, Terry C.; He, Qiang; Holman, Hoi-Ying; Huang, Katherine; Huang, Rick; Joyner, Dominique C.; Katz, Natalie; Keller, Martin; Oeller, Paul; Redding, Alyssa; Sun, Jun; Wall, Judy; Wei, Jing; Yang, Zamin; Yen, Huei-Che; Zhou, Jizhong; Keasling, Jay D.

    2006-01-01

    The ability of Desulfovibrio vulgaris Hildenborough to reduce, and therefore contain, toxic and radioactive metal waste has made all factors that affect the physiology of this organism of great interest. Increased salinity is an important and frequent fluctuation faced by D. vulgaris in its natural habitat. In liquid culture, exposure to excess salt resulted in striking elongation of D. vulgaris cells. Using data from transcriptomics, proteomics, metabolite assays, phospholipid fatty acid profiling, and electron microscopy, we used a systems approach to explore the effects of excess NaCl on D. vulgaris. In this study we demonstrated that import of osmoprotectants, such as glycine betaine and ectoine, is the primary mechanism used by D. vulgaris to counter hyperionic stress. Several efflux systems were also highly up-regulated, as was the ATP synthesis pathway. Increases in the levels of both RNA and DNA helicases suggested that salt stress affected the stability of nucleic acid base pairing. An overall increase in the level of branched fatty acids indicated that there were changes in cell wall fluidity. The immediate response to salt stress included up-regulation of chemotaxis genes, although flagellar biosynthesis was down-regulated. Other down-regulated systems included lactate uptake permeases and ABC transport systems. The results of an extensive NaCl stress analysis were compared with microarray data from a KCl stress analysis, and unlike many other bacteria, D. vulgaris responded similarly to the two stresses. Integration of data from multiple methods allowed us to develop a conceptual model for the salt stress response in D. vulgaris that can be compared to those in other microorganisms. PMID:16707698

  14. Personalised Medicine Possible With Real-Time Integration of Genomic and Clinical Data To Inform Clinical Decision-Making.

    PubMed

    Martin-Sanchez, Fernando; Turner, Maureen; Johnstone, Alice; Heffer, Leon; Rafael, Naomi; Bakker, Tim; Thorne, Natalie; Macciocca, Ivan; Gaff, Clara

    2015-01-01

    Despite widespread use of genomic sequencing in research, there are gaps in our understanding of the performance and provision of genomic sequencing in clinical practice. The Melbourne Genomics Health Alliance (the Alliance), has been established to determine the feasibility, performance and impact of using genomic sequencing as a diagnostic tool. The Alliance has partnered with BioGrid Australia to enable the linkage of genomic sequencing, clinical treatment and outcome data for this project. This integrated dataset of genetic, clinical and patient sourced information will be used by the Alliance to evaluate the potential diagnostic value of genomic sequencing in routine clinical practice. This project will allow the Alliance to provide recommendations to facilitate the integration of genomic sequencing into clinical practice to enable personalised disease treatment.

  15. Polymerase Θ is a key driver of genome evolution and of CRISPR/Cas9-mediated mutagenesis

    PubMed Central

    van Schendel, Robin; Roerink, Sophie F.; Portegijs, Vincent; van den Heuvel, Sander; Tijsterman, Marcel

    2015-01-01

    Cells are protected from toxic DNA double-stranded breaks (DSBs) by a number of DNA repair mechanisms, including some that are intrinsically error prone, thus resulting in mutations. To what extent these mechanisms contribute to evolutionary diversification remains unknown. Here, we demonstrate that the A-family polymerase theta (POLQ) is a major driver of inheritable genomic alterations in Caenorhabditis elegans. Unlike somatic cells, which use non-homologous end joining (NHEJ) to repair DNA transposon-induced DSBs, germ cells use polymerase theta-mediated end joining, a conceptually simple repair mechanism requiring only one nucleotide as a template for repair. Also CRISPR/Cas9-induced genomic changes are exclusively generated through polymerase theta-mediated end joining, refuting a previously assumed requirement for NHEJ in their formation. Finally, through whole-genome sequencing of propagated populations, we show that only POLQ-proficient animals accumulate genomic scars that are abundantly present in genomes of wild C. elegans, pointing towards POLQ as a major driver of genome diversification. PMID:26077599

  16. Identification of metastasis-associated genes in colorectal cancer through an integrated genomic and transcriptomic analysis

    PubMed Central

    Peng, Sihua

    2013-01-01

    Objective: Identification of colorectal cancer (CRC) metastasis genes is one of the most important issues in CRC research. For the purpose of mining CRC metastasis-associated genes, an integrated analysis of microarray data is presented, combined with evidence acquired from comparative genomic hybridization (CGH) data. Methods: Gene expression profile data of CRC samples were obtained from the Gene Expression Omnibus (GEO) website. The 15 important chromosomal aberration sites detected by CGH technology were used for integrated genomic and transcriptomic analysis. Significance Analysis of Microarrays (SAM) was used to detect significantly differentially expressed genes across the whole genome. The overlapping genes were selected in their corresponding chromosomal aberration regions and analyzed using the Database for Annotation, Visualization and Integrated Discovery (DAVID). Finally, the SVM-T-RFE gene selection algorithm was applied to identify metastasis-associated genes in CRC. Results: A minimum gene set was obtained with the minimum number [14] of genes and the highest classification accuracy (100%) in both PRI and META datasets. A fraction of the selected genes are associated with CRC or its metastasis. Conclusions: Our results demonstrated that integration analysis is an effective strategy for mining cancer-associated genes. PMID:24385689

  17. Integrated genome-wide analysis of expression quantitative trait loci aids interpretation of genomic association studies.

    PubMed

    Joehanes, Roby; Zhang, Xiaoling; Huan, Tianxiao; Yao, Chen; Ying, Sai-Xia; Nguyen, Quang Tri; Demirkale, Cumhur Yusuf; Feolo, Michael L; Sharopova, Nataliya R; Sturcke, Anne; Schäffer, Alejandro A; Heard-Costa, Nancy; Chen, Han; Liu, Po-Ching; Wang, Richard; Woodhouse, Kimberly A; Tanriverdi, Kahraman; Freedman, Jane E; Raghavachari, Nalini; Dupuis, Josée; Johnson, Andrew D; O'Donnell, Christopher J; Levy, Daniel; Munson, Peter J

    2017-01-25

    Identification of single nucleotide polymorphisms (SNPs) associated with gene expression levels, known as expression quantitative trait loci (eQTLs), may improve understanding of the functional role of phenotype-associated SNPs in genome-wide association studies (GWAS). The small sample sizes of some previous eQTL studies have limited their statistical power. We conducted an eQTL investigation of microarray-based gene and exon expression levels in whole blood in a cohort of 5257 individuals, exceeding the single cohort size of previous studies by more than a factor of 2. We detected over 19,000 independent lead cis-eQTLs and over 6000 independent lead trans-eQTLs, targeting over 10,000 gene targets (eGenes), with a false discovery rate (FDR) < 5%. Of previously published significant GWAS SNPs, 48% are identified to be significant eQTLs in our study. Some trans-eQTLs point toward novel mechanistic explanations for the association of the SNP with the GWAS-related phenotype. We also identify 59 distinct blocks or clusters of trans-eQTLs, each targeting the expression of sets of six to 229 distinct trans-eGenes. Ten of these sets of target genes are significantly enriched for microRNA targets (FDR < 5%). Many of these clusters are associated in GWAS with multiple phenotypes. These findings provide insights into the molecular regulatory patterns involved in human physiology and pathophysiology. We illustrate the value of our eQTL database in the context of a recent GWAS meta-analysis of coronary artery disease and provide a list of targeted eGenes for 21 of 58 GWAS loci.

  18. Genomic Analysis of Sleeping Beauty Transposon Integration in Human Somatic Cells

    PubMed Central

    Turchiano, Giandomenico; Latella, Maria Carmela; Gogol-Döring, Andreas; Cattoglio, Claudia; Mavilio, Fulvio; Izsvák, Zsuzsanna; Ivics, Zoltán; Recchia, Alessandra

    2014-01-01

    The Sleeping Beauty (SB) transposon is a non-viral integrating vector system with proven efficacy for gene transfer and functional genomics. However, integration efficiency is negatively affected by the length of the transposon. To optimize the SB transposon machinery, the inverted repeats and the transposase gene underwent several modifications, resulting in the generation of the hyperactive SB100X transposase and of the high-capacity “sandwich” (SA) transposon. In this study, we report a side-by-side comparison of the SA and the widely used T2 arrangement of transposon vectors carrying increasing DNA cargoes, up to 18 kb. Clonal analysis of SA integrants in human epithelial cells and in immortalized keratinocytes demonstrates stability and integrity of the transposon independently from the cargo size and copy number-dependent expression of the cargo cassette. A genome-wide analysis of unambiguously mapped SA integrations in keratinocytes showed an almost random distribution, with an overrepresentation in repetitive elements (satellite, LINE and small RNAs) compared to a library representing insertions of the first-generation transposon vector and to gammaretroviral and lentiviral libraries. The SA transposon/SB100X integrating system therefore shows important features as a system for delivering large gene constructs for gene therapy applications. PMID:25390293

  19. Genomic analysis of Sleeping Beauty transposon integration in human somatic cells.

    PubMed

    Turchiano, Giandomenico; Latella, Maria Carmela; Gogol-Döring, Andreas; Cattoglio, Claudia; Mavilio, Fulvio; Izsvák, Zsuzsanna; Ivics, Zoltán; Recchia, Alessandra

    2014-01-01

    The Sleeping Beauty (SB) transposon is a non-viral integrating vector system with proven efficacy for gene transfer and functional genomics. However, integration efficiency is negatively affected by the length of the transposon. To optimize the SB transposon machinery, the inverted repeats and the transposase gene underwent several modifications, resulting in the generation of the hyperactive SB100X transposase and of the high-capacity "sandwich" (SA) transposon. In this study, we report a side-by-side comparison of the SA and the widely used T2 arrangement of transposon vectors carrying increasing DNA cargoes, up to 18 kb. Clonal analysis of SA integrants in human epithelial cells and in immortalized keratinocytes demonstrates stability and integrity of the transposon independently from the cargo size and copy number-dependent expression of the cargo cassette. A genome-wide analysis of unambiguously mapped SA integrations in keratinocytes showed an almost random distribution, with an overrepresentation in repetitive elements (satellite, LINE and small RNAs) compared to a library representing insertions of the first-generation transposon vector and to gammaretroviral and lentiviral libraries. The SA transposon/SB100X integrating system therefore shows important features as a system for delivering large gene constructs for gene therapy applications.

  20. Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project.

    PubMed

    Gerstein, Mark B; Lu, Zhi John; Van Nostrand, Eric L; Cheng, Chao; Arshinoff, Bradley I; Liu, Tao; Yip, Kevin Y; Robilotto, Rebecca; Rechtsteiner, Andreas; Ikegami, Kohta; Alves, Pedro; Chateigner, Aurelien; Perry, Marc; Morris, Mitzi; Auerbach, Raymond K; Feng, Xin; Leng, Jing; Vielle, Anne; Niu, Wei; Rhrissorrakrai, Kahn; Agarwal, Ashish; Alexander, Roger P; Barber, Galt; Brdlik, Cathleen M; Brennan, Jennifer; Brouillet, Jeremy Jean; Carr, Adrian; Cheung, Ming-Sin; Clawson, Hiram; Contrino, Sergio; Dannenberg, Luke O; Dernburg, Abby F; Desai, Arshad; Dick, Lindsay; Dosé, Andréa C; Du, Jiang; Egelhofer, Thea; Ercan, Sevinc; Euskirchen, Ghia; Ewing, Brent; Feingold, Elise A; Gassmann, Reto; Good, Peter J; Green, Phil; Gullier, Francois; Gutwein, Michelle; Guyer, Mark S; Habegger, Lukas; Han, Ting; Henikoff, Jorja G; Henz, Stefan R; Hinrichs, Angie; Holster, Heather; Hyman, Tony; Iniguez, A Leo; Janette, Judith; Jensen, Morten; Kato, Masaomi; Kent, W James; Kephart, Ellen; Khivansara, Vishal; Khurana, Ekta; Kim, John K; Kolasinska-Zwierz, Paulina; Lai, Eric C; Latorre, Isabel; Leahey, Amber; Lewis, Suzanna; Lloyd, Paul; Lochovsky, Lucas; Lowdon, Rebecca F; Lubling, Yaniv; Lyne, Rachel; MacCoss, Michael; Mackowiak, Sebastian D; Mangone, Marco; McKay, Sheldon; Mecenas, Desirea; Merrihew, Gennifer; Miller, David M; Muroyama, Andrew; Murray, John I; Ooi, Siew-Loon; Pham, Hoang; Phippen, Taryn; Preston, Elicia A; Rajewsky, Nikolaus; Rätsch, Gunnar; Rosenbaum, Heidi; Rozowsky, Joel; Rutherford, Kim; Ruzanov, Peter; Sarov, Mihail; Sasidharan, Rajkumar; Sboner, Andrea; Scheid, Paul; Segal, Eran; Shin, Hyunjin; Shou, Chong; Slack, Frank J; Slightam, Cindie; Smith, Richard; Spencer, William C; Stinson, E O; Taing, Scott; Takasaki, Teruaki; Vafeados, Dionne; Voronina, Ksenia; Wang, Guilin; Washington, Nicole L; Whittle, Christina M; Wu, Beijing; Yan, Koon-Kiu; Zeller, Georg; Zha, Zheng; Zhong, Mei; Zhou, Xingliang; Ahringer, Julie; Strome, Susan; Gunsalus, Kristin C; Micklem, Gos; Liu, X Shirley; Reinke, Valerie; Kim, Stuart K; Hillier, LaDeana W; Henikoff, Steven; Piano, Fabio; Snyder, Michael; Stein, Lincoln; Lieb, Jason D; Waterston, Robert H

    2010-12-24

    We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor-binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor-binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.

  1. The human specialized DNA polymerases and non-B DNA: vital relationships to preserve genome integrity.

    PubMed

    Boyer, Anne-Sophie; Grgurevic, Srdana; Cazaux, Christophe; Hoffmann, Jean-Sébastien

    2013-11-29

    In addition to the canonical right-handed double helix, the DNA molecule can adopt several other non-B DNA structures. Readily formed in the genome at specific repetitive DNA sequences, these secondary conformations present a distinctive challenge for the progression of DNA replication forks. By impeding normal DNA synthesis, cruciforms, hairpins, H DNA, Z DNA and G4 DNA considerably affect genome stability and in some instances play a causal role in disease development. Along with previously discovered dedicated DNA helicases, the specialized DNA polymerases are emerging as major actors that perform DNA synthesis through these distorted impediments. In this new role, they facilitate DNA synthesis at replication stalling sites formed by non-B DNA structures and thereby help complete DNA replication, a process crucial for preserving genome integrity and concluding normal cell division. This review summarizes the evidence describing the function of specialized DNA polymerases in replicating DNA through non-B DNA structures.

  2. Integrating genomics, proteomics and bioinformatics in translational studies of molecular medicine.

    PubMed

    Ostrowski, Jerzy; Wyrwicz, Lucjan S

    2009-09-01

    Understanding the molecular mechanisms of disease requires the introduction of molecular diagnostics into medical practice. Current medicine employs only elements of molecular diagnostics, which are usually applied on the scale of single genes. Medicine in the postgenomic era will utilize thousands of disease-associated molecular markers provided by high-throughput sequencing and functional genomic, proteomic and metabolomic studies. Such a spectrum of techniques will link clinical medicine based on molecularly oriented diagnostics with the prediction and prevention of disease. To achieve this task, large-scale and genome-wide biological and medical data must be combined with biostatistical and bioinformatic analyses to model biological systems. Collecting, cataloging and comparing data from molecular studies, and the subsequent development of conclusions, creates the fundamentals of systems biology. This highly complex analytical process reflects a new scientific paradigm known as integrative genomics.

  3. Integrative Analysis of the Caenorhabditis elegans Genome by the modENCODE Project

    PubMed Central

    Gerstein, Mark B.; Lu, Zhi John; Van Nostrand, Eric L.; Cheng, Chao; Arshinoff, Bradley I.; Liu, Tao; Yip, Kevin Y.; Robilotto, Rebecca; Rechtsteiner, Andreas; Ikegami, Kohta; Alves, Pedro; Chateigner, Aurelien; Perry, Marc; Morris, Mitzi; Auerbach, Raymond K.; Feng, Xin; Leng, Jing; Vielle, Anne; Niu, Wei; Rhrissorrakrai, Kahn; Agarwal, Ashish; Alexander, Roger P.; Barber, Galt; Brdlik, Cathleen M.; Brennan, Jennifer; Brouillet, Jeremy Jean; Carr, Adrian; Cheung, Ming-Sin; Clawson, Hiram; Contrino, Sergio; Dannenberg, Luke O.; Dernburg, Abby F.; Desai, Arshad; Dick, Lindsay; Dosé, Andréa C.; Du, Jiang; Egelhofer, Thea; Ercan, Sevinc; Euskirchen, Ghia; Ewing, Brent; Feingold, Elise A.; Gassmann, Reto; Good, Peter J.; Green, Phil; Gullier, Francois; Gutwein, Michelle; Guyer, Mark S.; Habegger, Lukas; Han, Ting; Henikoff, Jorja G.; Henz, Stefan R.; Hinrichs, Angie; Holster, Heather; Hyman, Tony; Iniguez, A. Leo; Janette, Judith; Jensen, Morten; Kato, Masaomi; Kent, W. James; Kephart, Ellen; Khivansara, Vishal; Khurana, Ekta; Kim, John K.; Kolasinska-Zwierz, Paulina; Lai, Eric C.; Latorre, Isabel; Leahey, Amber; Lewis, Suzanna; Lloyd, Paul; Lochovsky, Lucas; Lowdon, Rebecca F.; Lubling, Yaniv; Lyne, Rachel; MacCoss, Michael; Mackowiak, Sebastian D.; Mangone, Marco; McKay, Sheldon; Mecenas, Desirea; Merrihew, Gennifer; Miller, David M.; Muroyama, Andrew; Murray, John I.; Ooi, Siew-Loon; Pham, Hoang; Phippen, Taryn; Preston, Elicia A.; Rajewsky, Nikolaus; Rätsch, Gunnar; Rosenbaum, Heidi; Rozowsky, Joel; Rutherford, Kim; Ruzanov, Peter; Sarov, Mihail; Sasidharan, Rajkumar; Sboner, Andrea; Scheid, Paul; Segal, Eran; Shin, Hyunjin; Shou, Chong; Slack, Frank J.; Slightam, Cindie; Smith, Richard; Spencer, William C.; Stinson, E. O.; Taing, Scott; Takasaki, Teruaki; Vafeados, Dionne; Voronina, Ksenia; Wang, Guilin; Washington, Nicole L.; Whittle, Christina M.; Wu, Beijing; Yan, Koon-Kiu; Zeller, Georg; Zha, Zheng; Zhong, Mei; Zhou, Xingliang; Ahringer, Julie; Strome, Susan; Gunsalus, Kristin C.; Micklem, Gos; Liu, X. Shirley; Reinke, Valerie; Kim, Stuart K.; Hillier, LaDeana W.; Henikoff, Steven; Piano, Fabio; Snyder, Michael; Stein, Lincoln; Lieb, Jason D.; Waterston, Robert H.

    2011-01-01

    We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor–binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor–binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome. PMID:21177976

  4. More powerful genetic association testing via a new statistical framework for integrative genomics

    PubMed Central

    Zhao, Sihai D.; Cai, T. Tony; Li, Hongzhe

    2015-01-01

    Integrative genomics offers a promising approach to more powerful genetic association studies. The hope is that combining outcome and genotype data with other types of genomic information can lead to more powerful SNP detection. We present a new association test based on a statistical model that explicitly assumes that genetic variations affect the outcome through perturbing gene expression levels. It is shown analytically that the proposed approach can have more power to detect SNPs that are associated with the outcome through transcriptional regulation, compared to tests using the outcome and genotype data alone, and simulations show that our method is relatively robust to misspecification. We also provide a strategy for applying our approach to high-dimensional genomic data. We use this strategy to identify a potentially new association between a SNP and a yeast cell’s response to the natural product tomatidine, which standard association analysis did not detect. PMID:24975802

  5. More powerful genetic association testing via a new statistical framework for integrative genomics.

    PubMed

    Zhao, Sihai D; Cai, T Tony; Li, Hongzhe

    2014-12-01

    Integrative genomics offers a promising approach to more powerful genetic association studies. The hope is that combining outcome and genotype data with other types of genomic information can lead to more powerful SNP detection. We present a new association test based on a statistical model that explicitly assumes that genetic variations affect the outcome through perturbing gene expression levels. It is shown analytically that the proposed approach can have more power to detect SNPs that are associated with the outcome through transcriptional regulation, compared to tests using the outcome and genotype data alone, and simulations show that our method is relatively robust to misspecification. We also provide a strategy for applying our approach to high-dimensional genomic data. We use this strategy to identify a potentially new association between a SNP and a yeast cell's response to the natural product tomatidine, which standard association analysis did not detect.

  6. Integrative genomic characterization of oral squamous cell carcinoma identifies frequent somatic drivers

    PubMed Central

    Pickering, Curtis R.; Zhang, Jiexin; Yoo, Suk Young; Bengtsson, Linnea; Moorthy, Shhyam; Neskey, David M.; Zhao, Mei; Alves, Marcus V Ortega; Chang, Kyle; Drummond, Jennifer; Cortez, Elsa; Xie, Tong-xin; Zhang, Di; Chung, Woonbok; Issa, Jean-Pierre J.; Zweidler-McKay, Patrick A.; Wu, Xifeng; El-Naggar, Adel K.; Weinstein, John N.; Wang, Jing; Muzny, Donna M.; Gibbs, Richard A.; Wheeler, David A.; Myers, Jeffrey N.; Frederick, Mitchell J.

    2013-01-01

    The survival of patients with oral squamous cell carcinoma (OSCC) has not changed significantly in several decades, leading clinicians and investigators to search for promising molecular targets. To this end, we performed comprehensive genomic analysis of gene expression, copy number, methylation and point mutations in OSCC. Integrated analysis revealed more somatic events than previously reported, identifying four major driver pathways (mitogenic signaling, Notch, cell cycle, TP53) and two additional key genes (FAT1, CASP8). The Notch pathway was defective in 66% of patients, and in follow-up studies of mechanism, functional NOTCH1 signaling inhibited proliferation of OSCC cell lines. Frequent mutation of CASP8 defines a new molecular subtype of OSCC with few copy number changes. Although genomic alterations are dominated by loss of tumor suppressor genes, 80% of patients harbored at least one genomic alteration in a targetable gene, suggesting that novel approaches to treatment may be possible for this debilitating disease. PMID:23619168

  7. Data integration for plant genomics--exemplars from the