Science.gov

Sample records for genomic integration mediated

  1. Exogenous gene integration mediated by genome editing technologies in zebrafish.

    PubMed

    Morita, Hitoshi; Taimatsu, Kiyohito; Yanagi, Kanoko; Kawahara, Atsuo

    2017-03-08

    Genome editing technologies, such as transcription activator-like effector nuclease (TALEN) and the clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein (Cas) systems, can induce DNA double-strand breaks (DSBs) at the targeted genomic locus, leading to frameshift-mediated gene disruption in the process of DSB repair. Recently, nuclease-induced DSBs and their subsequent repair have been applied to integrate exogenous genes into the targeted genomic locus in various model organisms. In addition to a conventional knock-in technology mediated by homology-directed repair (HDR), novel knock-in technologies using refined donor vectors have also been developed with the genome editing technologies based on other DSB repair mechanisms, including non-homologous end joining (NHEJ) and microhomology-mediated end joining (MMEJ). These improved knock-in technologies should therefore make it possible to modify the genomes of model organisms freely.

  2. Enhanced CRISPR/Cas9-mediated biallelic genome targeting with dual surrogate reporter-integrated donors.

    PubMed

    Wu, Yun; Xu, Kun; Ren, Chonghua; Li, Xinyi; Lv, Huijiao; Han, Furong; Wei, Zehui; Wang, Xin; Zhang, Zhiying

    2017-02-18

    The clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) system has recently emerged as a simple, yet powerful genome engineering tool, which has been widely used for genome modification in various organisms and cell types. However, screening biallelic genome-modified cells is often time-consuming and technically challenging. In this study, we incorporated two different surrogate reporter cassettes into paired donor plasmids, which were used as both the surrogate reporters and the knock-in donors. By applying our dual surrogate reporter-integrated donor system, we demonstrate high frequency of CRISPR/Cas9-mediated biallelic genome integration in both human HEK293T and porcine PK15 cells (34.09% and 18.18%, respectively). Our work provides a powerful genetic tool for assisting the selection and enrichment of cells with targeted biallelic genome modification.

  3. Altering genomic integrity: heavy metal exposure promotes transposable element-mediated damage

    PubMed Central

    Morales, Maria E.; Servant, Geraldine; Ade, Catherine; Roy-Enge, Astrid M.

    2015-01-01

    Maintenance of genomic integrity is critical for cellular homeostasis and survival. The active transposable elements (TEs), composed primarily of three mobile element lineages (LINE-1, Alu, and SVA), comprise approximately 30% of the mass of the human genome. For the past two decades, studies have shown that TEs significantly contribute to genetic instability and that TE-caused damage is associated with genetic diseases and cancer. Different environmental exposures, including several heavy metals, influence how TEs interact with their host genome, increasing their negative impact. This mini-review provides some basic knowledge on TEs, their contribution to disease and an overview of the current knowledge on how heavy metals influence TE-mediated damage. PMID:25774044

  4. miR146a-mediated targeting of FANCM during inflammation compromises genome integrity

    PubMed Central

    Kim, Hyun Hee; Lee, Hyun-Seo; Jun, Semo; Cha, Jeong-Heon; Kee, Younghoon; You, Ho Jin; Lee, Jung-Hee

    2016-01-01

    Inflammation is a potent inducer of tumorigenesis. Increased DNA damage or loss of genome integrity is thought to be one of the mechanisms linking inflammation and cancer development. It has been suggested that NF-κB-induced microRNA-146 (miR146a) may be a mediator of the inflammatory response. Based on our initial observation that miR146a overexpression strongly increases DNA damage, we investigated its potential role as a modulator of DNA repair. Here, we demonstrate that FANCM, a component in the Fanconi Anemia pathway, is a novel target of miR146a. miR146a suppressed FANCM expression by directly binding to the 3′ untranslated region of the gene. miR146a-induced downregulation of FANCM was associated with inhibition of FANCD2 monoubiquitination, reduced DNA homologous recombination repair and checkpoint response, failed recovery from replication stress, and increased cellular sensitivity to cisplatin. These phenotypes were recapitulated when miR146a expression was induced by overexpressing the NF-κB subunit p65/RelA or Helicobacter pylori infection in a human gastric cell line; the phenotypes were effectively reversed with an anti-miR146a antagomir. These results suggest that undesired inflammation events caused by a pathogen or over-induction of miR146a can impair genome integrity via suppression of FANCM. PMID:27351285

  5. Site-specific gene integration in rice genome mediated by the FLP-FRT recombination system.

    PubMed

    Nandy, Soumen; Srivastava, Vibha

    2011-08-01

    Plant transformation based on random integration of foreign DNA often generates complex integration structures. Precision in the integration process is necessary to ensure the formation of full-length, single-copy integration. Site-specific recombination systems are versatile tools for precise genomic manipulations such as DNA excision, inversion or integration. The yeast FLP-FRT recombination system has been widely used for DNA excision in higher plants. Here, we report the use of FLP-FRT system for efficient targeting of foreign gene into the engineered genomic site in rice. The transgene vector containing a pair of directly oriented FRT sites was introduced by particle bombardment into the cells containing the target locus. FLP activity generated by the co-bombarded FLP gene efficiently separated the transgene construct from the vector-backbone and integrated the backbone-free construct into the target site. Strong FLP activity, derived from the enhanced FLP protein, FLPe, was important for the successful site-specific integration (SSI). The majority of the transgenic events contained a precise integration and expressed the transgene. Interestingly, each transgenic event lacked the co-bombarded FLPe gene, suggesting reversion of the integration structure in the presence of the constitutive FLPe expression. Progeny of the precise transgenic lines inherited the stable SSI locus and expressed the transgene. This work demonstrates the application of FLP-FRT system for site-specific gene integration in plants using rice as a model.

  6. In vivo genome editing via CRISPR/Cas9 mediated homology-independent targeted integration

    PubMed Central

    Suzuki, Keiichiro; Tsunekawa, Yuji; Hernandez-Benitez, Reyna; Wu, Jun; Zhu, Jie; Kim, Euiseok J.; Hatanaka, Fumiyuki; Yamamoto, Mako; Araoka, Toshikazu; Li, Zhe; Kurita, Masakazu; Hishida, Tomoaki; Li, Mo; Aizawa, Emi; Guo, Shicheng; Chen, Song; Goebl, April; Soligalla, Rupa Devi; Qu, Jing; Jiang, Tingshuai; Fu, Xin; Jafari, Maryam; Esteban, Concepcion Rodriguez; Berggren, W. Travis; Lajara, Jeronimo; Nuñez-Delicado, Estrella; Guillen, Pedro; Campistol, Josep M.; Matsuzaki, Fumio; Liu, Guang-Hui; Magistretti, Pierre; Zhang, Kun; Callaway, Edward M.; Zhang, Kang; Belmonte, Juan Carlos Izpisua

    2017-01-01

    Targeted genome editing via engineered nucleases is an exciting area of biomedical research and holds potential for clinical applications. Despite rapid advances in the field, in vivo targeted transgene integration is still infeasible because current tools are inefficient, especially for non-dividing cells, which compose most adult tissues. This poses a barrier for uncovering fundamental biological principles and developing treatments for a broad range of genetic disorders. Based on clustered regularly interspaced short palindromic repeat/Cas9 (CRISPR/Cas9) technology, here we devise a homology-independent targeted integration (HITI) strategy, which allows for robust DNA knock-in in both dividing and non-dividing cells in vitro and, more importantly, in vivo (for example, in neurons of postnatal mammals). As a proof of concept of its therapeutic potential, we demonstrate the efficacy of HITI in improving visual function using a rat model of the retinal degeneration condition retinitis pigmentosa. The HITI method presented here establishes new avenues for basic research and targeted gene therapies. PMID:27851729

  7. Analyses of germline, chromosomally integrated human herpesvirus 6A and B genomes indicate emergent infection and new inflammatory mediators.

    PubMed

    Tweedy, J; Spyrou, M A; Hubacek, P; Kuhl, U; Lassner, D; Gompels, U A

    2015-02-01

    Human herpesvirus-6A (HHV-6A) is rarer than HHV-6B in many infant populations. However, they are similarly prevalent as germline, chromosomally integrated genomes (ciHHV-6A/B). This integrated form affects 0.1-1 % of the human population, where potentially virus gene expression could be in every cell, although virus relationships and health effects are not clear. In a Czech/German patient cohort ciHHV-6A was more common and diverse than ciHHV-6B. Quantitative PCR, nucleotide sequencing and telomeric integration site amplification characterized ciHHV-6 in 44 German myocarditis/cardiomyopathy and Czech malignancy/inflammatory disease (MI) patients plus donors. Comparisons were made to sequences from global virus reference strains, and blood DNA from childhood-infections from Zambia (HHV-6A mainly) and Japan (HHV-6B). The MI cohort were 86 % (18/21) ciHHV-6A, the cardiac cohort 65 % (13/20) ciHHV-6B, suggesting different disease links. Reactivation was supported by findings of 1) recombination between ciHHV-6A and HHV-6B genes in 20 % (4/21) of the MI cohort; 2) expression in a patient subset, of early/late transcripts from the inflammatory mediator genes chemokine receptor U51 and chemokine U83, both identical to ciHHV-6A DNA sequences; and 3) superinfection shown by deep sequencing identifying minor virus-variants only in ciHHV-6A, which expressed transcripts, indicating virus infection reactivates latent ciHHV-6A. Half the MI cohort had more than two copies per cell, median 5.2, indicative of reactivation. Remarkably, the integrated genomes encoded the secreted-active form of virus chemokines, rare in virus from childhood-infections. This shows integrated virus genomes can contribute new human genes with links to inflammatory pathology and supports ciHHV-6A reactivation as a source for emergent infection.
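    The cohort proportions quoted in this abstract can be recomputed from the counts it gives. The following is a minimal sketch; the helper name `percent` and the dictionary labels are illustrative and do not come from the paper.

```python
# Counts are copied from the abstract; `percent` and the labels are illustrative names.
def percent(count: int, total: int) -> float:
    """Return count/total expressed as a percentage."""
    return 100.0 * count / total

cohorts = {
    "MI cohort, ciHHV-6A": (18, 21),
    "cardiac cohort, ciHHV-6B": (13, 20),
    "MI cohort, ciHHV-6A/HHV-6B recombination": (4, 21),
}

for label, (count, total) in cohorts.items():
    print(f"{label}: {count}/{total} = {percent(count, total):.1f} %")
```

    Note that 4/21 evaluates to about 19 %, which the abstract reports as 20 %.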

  8. DNA-PK-mediated phosphorylation of EZH2 regulates the DNA damage-induced apoptosis to maintain T-cell genomic integrity

    PubMed Central

    Wang, Y; Sun, H; Wang, J; Wang, H; Meng, L; Xu, C; Jin, M; Wang, B; Zhang, Y; Zhang, Y; Zhu, T

    2016-01-01

    EZH2 is a histone methyltransferase whose functions in stem cells and tumor cells are well established. Accumulating evidence shows that EZH2 has critical roles in T cells and could be a promising therapeutic target for several immune diseases. To further reveal the novel functions of EZH2 in human T cells, protein co-immunoprecipitation combined with mass spectrometry was conducted and several previously unknown EZH2-interacting proteins were identified. Of them, we focused on a DNA damage responsive protein, Ku80, because of the limited knowledge regarding EZH2 in the DNA damage response. Then, we demonstrated that instead of being methylated by EZH2, Ku80 bridges the interaction between the DNA-dependent protein kinase (DNA-PK) complex and EZH2, thus facilitating EZH2 phosphorylation. Moreover, EZH2 histone methyltransferase activity was enhanced when Ku80 was knocked down or DNA-PK activity was inhibited, suggesting that DNA-PK-mediated EZH2 phosphorylation impairs EZH2 histone methyltransferase activity. On the other hand, EZH2 inhibition increased the DNA damage level at the late phase of T-cell activation, suggesting that EZH2 is involved in genomic integrity maintenance. In conclusion, our study is the first to demonstrate that EZH2 is phosphorylated by the DNA damage responsive complex DNA-PK and regulates DNA damage-mediated T-cell apoptosis, which reveals a novel functional crosstalk between epigenetic regulation and genomic integrity. PMID:27468692

  9. Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine

    PubMed Central

    Elsik, Christine G.; Tayal, Aditi; Diesh, Colin M.; Unni, Deepak R.; Emery, Marianne L.; Nguyen, Hung N.; Hagen, Darren E.

    2016-01-01

    We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. PMID:26578564

  10. Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine.

    PubMed

    Elsik, Christine G; Tayal, Aditi; Diesh, Colin M; Unni, Deepak R; Emery, Marianne L; Nguyen, Hung N; Hagen, Darren E

    2016-01-04

    We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search.

  11. Genome-wide analysis of HPV integration in human cancers reveals recurrent, focal genomic instability

    PubMed Central

    Akagi, Keiko; Li, Jingfeng; Broutian, Tatevik R.; Padilla-Nash, Hesed; Xiao, Weihong; Jiang, Bo; Rocco, James W.; Teknos, Theodoros N.; Kumar, Bhavna; Wangsa, Danny; He, Dandan; Ried, Thomas; Symer, David E.; Gillison, Maura L.

    2014-01-01

    Genomic instability is a hallmark of human cancers, including the 5% caused by human papillomavirus (HPV). Here we report a striking association between HPV integration and adjacent host genomic structural variation in human cancer cell lines and primary tumors. Whole-genome sequencing revealed HPV integrants flanking and bridging extensive host genomic amplifications and rearrangements, including deletions, inversions, and chromosomal translocations. We present a model of “looping” by which HPV integrant-mediated DNA replication and recombination may result in viral–host DNA concatemers, frequently disrupting genes involved in oncogenesis and amplifying HPV oncogenes E6 and E7. Our high-resolution results shed new light on a catastrophic process, distinct from chromothripsis and other mutational processes, by which HPV directly promotes genomic instability. PMID:24201445

  12. A New Era of Genome Integration-Simply Cut and Paste!

    PubMed

    Liu, Zihe; Liang, Youyun; Ang, Ee Lui; Zhao, Huimin

    2017-01-24

    Genome integration is a powerful tool in both basic and applied biological research. However, traditional genome integration, which is typically mediated by homologous recombination, has been constrained by low efficiencies and limited host range. In recent years, the emergence of homing endonucleases and programmable nucleases has greatly enhanced integration efficiencies and allowed alternative integration mechanisms such as nonhomologous end joining and microhomology-mediated end joining, enabling integration in hosts deficient in homologous recombination. In this review, we will highlight recent advances and breakthroughs in genome integration methods made possible by programmable nucleases, and their new applications in synthetic biology and metabolic engineering.

  13. Reverse transcriptase: mediator of genomic plasticity.

    PubMed

    Brosius, J; Tiedge, H

    1995-01-01

    Reverse transcription has been an important mediator of genomic change. This influence dates back more than three billion years, when the RNA genome was converted into the DNA genome. While the current cellular role(s) of reverse transcriptase are not yet completely understood, it has become clear over the last few years that this enzyme is still responsible for generating significant genomic change and that its activities are one of the driving forces of evolution. Reverse transcriptase generates, for example, extra gene copies (retrogenes), using as a template mature messenger RNAs. Such retrogenes do not always end up as nonfunctional pseudogenes but form, after reinsertion into the genome, new unions with resident promoter elements that may alter the gene's temporal and/or spatial expression levels. More frequently, reverse transcriptase produces copies of nonmessenger RNAs, such as small nuclear or cytoplasmic RNAs. Extremely high copy numbers can be generated by this process. The resulting reinserted DNA copies are therefore referred to as short interspersed repetitive elements (SINEs). SINEs have long been considered selfish DNA, littering the genome via exponential propagation but not contributing to the host's fitness. Many SINEs, however, can give rise to novel genes encoding small RNAs, and are the migrant carriers of numerous control elements and sequence motifs that can equip resident genes with novel regulatory elements [Brosius J. and Gould S.J., Proc Natl Acad Sci USA 89, 10706-10710, 1992]. Retrosequences, such as SINEs and portions of retroelements (e.g., long terminal repeats, LTRs), are capable of donating sequence motifs for nucleosome positioning, DNA methylation, transcriptional enhancers and silencers, poly(A) addition sequences, determinants of RNA stability or transport, splice sites, and even amino acid codons for incorporation into open reading frames as novel protein domains. Retroposition can therefore be considered as a major

  14. Bovine Genome Database: integrated tools for genome annotation and discovery.

    PubMed

    Childers, Christopher P; Reese, Justin T; Sundaram, Jaideep P; Vile, Donald C; Dickens, C Michael; Childs, Kevin L; Salih, Hanni; Bennett, Anna K; Hagen, Darren E; Adelson, David L; Elsik, Christine G

    2011-01-01

    The Bovine Genome Database (BGD; http://BovineGenome.org) strives to improve annotation of the bovine genome and to integrate the genome sequence with other genomics data. BGD includes GBrowse genome browsers, the Apollo Annotation Editor, a quantitative trait loci (QTL) viewer, BLAST databases and gene pages. Genome browsers, available for both scaffold and chromosome coordinate systems, display the bovine Official Gene Set (OGS), RefSeq and Ensembl gene models, non-coding RNA, repeats, pseudogenes, single-nucleotide polymorphism, markers, QTL and alignments to complementary DNAs, ESTs and protein homologs. The Bovine QTL viewer is connected to the BGD Chromosome GBrowse, allowing for the identification of candidate genes underlying QTL. The Apollo Annotation Editor connects directly to the BGD Chado database to provide researchers with remote access to gene evidence in a graphical interface that allows editing and creating new gene models. Researchers may upload their annotations to the BGD server for review and integration into the subsequent release of the OGS. Gene pages display information for individual OGS gene models, including gene structure, transcript variants, functional descriptions, gene symbols, Gene Ontology terms, annotator comments and links to National Center for Biotechnology Information and Ensembl. Each gene page is linked to a wiki page to allow input from the research community.

  15. Protecting genome integrity during CRISPR immune adaptation.

    PubMed

    Wright, Addison V; Doudna, Jennifer A

    2016-10-01

    Bacterial CRISPR-Cas systems include genomic arrays of short repeats flanking foreign DNA sequences and provide adaptive immunity against viruses. Integration of foreign DNA must occur specifically to avoid damaging the genome or the CRISPR array, but surprisingly promiscuous activity occurs in vitro. Here we reconstituted full-site DNA integration and show that the Streptococcus pyogenes type II-A Cas1-Cas2 integrase maintains specificity in part through limitations on the second integration step. At non-CRISPR sites, integration stalls at the half-site intermediate, thereby enabling reaction reversal. S. pyogenes Cas1-Cas2 is highly specific for the leader-proximal repeat and recognizes the repeat's palindromic ends, thus fitting a model of independent recognition by distal Cas1 active sites. These findings suggest that DNA-insertion sites are less common than suggested by previous work, thereby preventing toxicity during CRISPR immune adaptation and maintaining host genome integrity.

  16. Integrated genome browser: visual analytics platform for genomics

    PubMed Central

    Norris, David C.; Loraine, Ann E.

    2016-01-01

    Motivation: Genome browsers that support fast navigation through vast datasets and provide interactive visual analytics functions can help scientists achieve deeper insight into biological systems. Toward this end, we developed Integrated Genome Browser (IGB), a highly configurable, interactive and fast open source desktop genome browser. Results: Here we describe multiple updates to IGB, including all-new capabilities to display and interact with data from high-throughput sequencing experiments. To demonstrate, we describe example visualizations and analyses of datasets from RNA-Seq, ChIP-Seq and bisulfite sequencing experiments. Understanding results from genome-scale experiments requires viewing the data in the context of reference genome annotations and other related datasets. To facilitate this, we enhanced IGB’s ability to consume data from diverse sources, including Galaxy, Distributed Annotation and IGB-specific Quickload servers. To support future visualization needs as new genome-scale assays enter wide use, we transformed the IGB codebase into a modular, extensible platform for developers to create and deploy all-new visualizations of genomic data. Availability and implementation: IGB is open source and is freely available from http://bioviz.org/igb. Contact: aloraine@uncc.edu PMID:27153568

  17. Transcription as a Threat to Genome Integrity.

    PubMed

    Gaillard, Hélène; Aguilera, Andrés

    2016-06-02

    Genomes undergo different types of sporadic alterations, including DNA damage, point mutations, and genome rearrangements, that constitute the basis for evolution. However, these changes may occur at high levels as a result of cell pathology and trigger genome instability, a hallmark of cancer and a number of genetic diseases. In the last two decades, evidence has accumulated that transcription constitutes an important natural source of DNA metabolic errors that can compromise the integrity of the genome. Transcription can create the conditions for high levels of mutations and recombination by its ability to open the DNA structure and remodel chromatin, making it more accessible to DNA insulting agents, and by its ability to become a barrier to DNA replication. Here we review the molecular basis of such events from a mechanistic perspective with particular emphasis on the role of transcription as a genome instability determinant.

  18. Methods of Genomic Competency Integration in Practice

    PubMed Central

    Jenkins, Jean; Calzone, Kathleen A.; Caskey, Sarah; Culp, Stacey; Weiner, Marsha; Badzek, Laurie

    2015-01-01

    Purpose: Genomics is increasingly relevant to health care, necessitating support for nurses to incorporate genomic competencies into practice. The primary aim of this project was to develop, implement, and evaluate a year-long genomic education intervention that trained, supported, and supervised institutional administrator and educator champion dyads to increase nursing capacity to integrate genomics through assessments of program satisfaction and institutional achieved outcomes. Design: Longitudinal study of 23 Magnet Recognition Program® Hospitals (21 intervention, 2 controls) participating in a 1-year new competency integration effort aimed at increasing genomic nursing competency and overcoming barriers to genomics integration in practice. Methods: Champion dyads underwent genomic training consisting of one in-person kick-off training meeting followed by monthly education webinars. Champion dyads designed institution-specific action plans detailing objectives, methods or strategies used to engage and educate nursing staff, timeline for implementation, and outcomes achieved. Action plans focused on a minimum of seven genomic priority areas: champion dyad personal development; practice assessment; policy content assessment; staff knowledge needs assessment; staff development; plans for integration; and anticipated obstacles and challenges. Action plans were updated quarterly, outlining progress made as well as inclusion of new methods or strategies. Progress was validated through virtual site visits with the champion dyads and chief nursing officers. Descriptive data were collected on all strategies or methods utilized, and timeline for achievement. Descriptive data were analyzed using content analysis. Findings: The complexity of the competency content and the uniqueness of social systems and infrastructure resulted in a significant variation of champion dyad interventions. Conclusions: Nursing champions can facilitate change in genomic nursing capacity through

  19. Integrating Mediators and Moderators in Research Design

    ERIC Educational Resources Information Center

    MacKinnon, David P.

    2011-01-01

    The purpose of this article is to describe mediating variables and moderating variables and provide reasons for integrating them in outcome studies. Separate sections describe examples of moderating and mediating variables and the simplest statistical model for investigating each variable. The strengths and limitations of incorporating mediating…

  20. PGWD: Integrating Personal Genome for Warfarin Dosing.

    PubMed

    Pan, Yidan; Cheng, Ronghai; Li, Zhoufang; Zhao, Yujun; He, Jiankui

    2016-03-01

    Warfarin is a drug normally used in the prevention of thrombosis and the formation of blood clots. The dosage of warfarin is strongly affected by genetic variants of CYP2C9 and VKORC1 genes. Current technologies for detecting the variants of these genes are mainly based on real-time PCR. In recent years, due to the rapidly dropping cost of whole genome sequencing and genotyping, more and more people get their whole genome sequenced or genotyped. However, current software for warfarin dosing prediction is based on low-throughput genetic information from either real-time PCR or melting curve methods. There is no bioinformatics tool available that can take the high-throughput genome sequencing data as input and determine the accurate dosage of warfarin. Here, we present PGWD, a web tool that analyzes personal genome sequencing data and integrates with clinical information for warfarin dosing.

  1. Integrative Bayesian network analysis of genomic data.

    PubMed

    Ni, Yang; Stingo, Francesco C; Baladandayuthapani, Veerabhadran

    2014-01-01

    Rapid development of genome-wide profiling technologies has made it possible to conduct integrative analysis on genomic data from multiple platforms. In this study, we develop a novel integrative Bayesian network approach to investigate the relationships between genetic and epigenetic alterations as well as how these mutations affect a patient's clinical outcome. We take a Bayesian network approach that admits a convenient decomposition of the joint distribution into local distributions. Exploiting the prior biological knowledge about regulatory mechanisms, we model each local distribution as linear regressions. This allows us to analyze multi-platform genome-wide data in a computationally efficient manner. We illustrate the performance of our approach through simulation studies. Our methods are motivated by and applied to a multi-platform glioblastoma dataset, from which we reveal several biologically relevant relationships that have been validated in the literature as well as new genes that could potentially be novel biomarkers for cancer progression.
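    The decomposition referred to in this abstract is the standard Bayesian network factorization of a joint distribution over the parents of each node. A hedged sketch follows; the symbols $x_j$, $\mathrm{pa}(j)$, $\beta_{jk}$ and $\sigma_j^2$ are notation assumed here, and the Gaussian error term is an assumption about the form of the linear regressions, not a detail stated in the abstract.

```latex
p(x_1, \dots, x_p) = \prod_{j=1}^{p} p\bigl(x_j \mid x_{\mathrm{pa}(j)}\bigr),
\qquad
x_j = \sum_{k \in \mathrm{pa}(j)} \beta_{jk}\, x_k + \varepsilon_j,
\quad
\varepsilon_j \sim \mathcal{N}\bigl(0, \sigma_j^2\bigr).
```

    Under this form each local distribution can be fitted as a separate regression of a node on its parents, which is one way to read the abstract's claim of computational efficiency.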

  2. Viral sequences integrated into plant genomes.

    PubMed

    Harper, Glyn; Hull, Roger; Lockhart, Ben; Olszewski, Neil

    2002-01-01

    Sequences of various DNA plant viruses have been found integrated into the host genome. There are two forms of integrant, those that can form episomal viral infections and those that cannot. Integrants of three pararetroviruses, Banana streak virus (BSV), Tobacco vein clearing virus (TVCV), and Petunia vein clearing virus (PVCV), can generate episomal infections in certain hybrid plant hosts in response to stress. In the case of BSV and TVCV, one of the parents contains the integrant but it has not been seen to be activated in that parent; the other parent does not contain the integrant. The number of integrant loci is low for BSV and PVCV and high in TVCV. The structure of the integrants is complex, and it is thought that episomal virus is released by recombination and/or reverse transcription. Geminiviral and pararetroviral sequences are found in plant genomes although not so far associated with a virus disease. It appears that integration of viral sequences is widespread in the plant kingdom and has been occurring for a long period of time.

  3. An Integrated System for Precise Genome Modification in Escherichia coli

    PubMed Central

    Tas, Huseyin; Nguyen, Cac T.; Patel, Ravish; Kim, Neil H.; Kuhlman, Thomas E.

    2015-01-01

    We describe an optimized system for the easy, effective, and precise modification of the Escherichia coli genome. Genome changes are introduced first through the integration of a 1.3 kbp Landing Pad consisting of a gene conferring resistance to tetracycline (tetA) or the ability to metabolize the sugar galactose (galK). The Landing Pad is then excised as a result of double-strand breaks by the homing endonuclease I-SceI, and replaced with DNA fragments bearing the desired change via λ-Red mediated homologous recombination. Repair of the double strand breaks and counterselection against the Landing Pad (using NiCl2 for tetA or 2-deoxy-galactose for galK) allows the isolation of modified bacteria without the use of additional antibiotic selection. We demonstrate the power of this method to make a variety of genome modifications: the exact integration, without any extraneous sequence, of the lac operon (~6.5 kbp) to any desired location in the genome and without the integration of antibiotic markers; the scarless deletion of ribosomal rrn operons (~6 kbp) through either intrachromosomal or oligonucleotide recombination; and the in situ fusion of native genes to fluorescent reporter genes without additional perturbation. PMID:26332675
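    The two-step logic described in this abstract (integrate a Landing Pad, then replace it with the desired fragment) can be illustrated with a toy string model. This is only a sketch under strong assumptions: the genome is a plain string, the Landing Pad is a placeholder token, and neither I-SceI cleavage nor λ-Red recombination is modelled. All names are hypothetical and do not come from the paper.

```python
# Toy model: the genome is a plain string and the Landing Pad is a placeholder token.
LANDING_PAD = "<LP>"  # hypothetical stand-in for the 1.3 kbp tetA/galK cassette

def integrate_landing_pad(genome: str, position: int) -> str:
    """Step 1: insert the Landing Pad at the chosen position."""
    if not 0 <= position <= len(genome):
        raise ValueError("position outside genome")
    return genome[:position] + LANDING_PAD + genome[position:]

def replace_landing_pad(genome: str, fragment: str) -> str:
    """Step 2: swap the Landing Pad for the desired fragment, leaving no extra sequence."""
    if genome.count(LANDING_PAD) != 1:
        raise ValueError("expected exactly one Landing Pad")
    return genome.replace(LANDING_PAD, fragment)

genome = "AAACCC"
with_pad = integrate_landing_pad(genome, 3)
edited = replace_landing_pad(with_pad, "GGG")
print(with_pad)
print(edited)
```

    In this model the edited genome contains only the original sequence plus the new fragment, which mirrors the "without any extraneous sequence" property the abstract describes.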

  4. Integrating sequence, evolution and functional genomics in regulatory genomics

    PubMed Central

    Vingron, Martin; Brazma, Alvis; Coulson, Richard; van Helden, Jacques; Manke, Thomas; Palin, Kimmo; Sand, Olivier; Ukkonen, Esko

    2009-01-01

    With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome. PMID:19226437

  5. Integrative Genomics and Computational Systems Medicine

    SciTech Connect

    McDermott, Jason E.; Huang, Yufei; Zhang, Bing; Xu, Hua; Zhao, Zhongming

    2014-01-01

    The exponential growth in generation of large amounts of genomic data from biological samples has driven the emerging field of systems medicine. This field is promising because it improves our understanding of disease processes at the systems level. However, the field is still in its young stage. There exists a great need for novel computational methods and approaches to effectively utilize and integrate various omics data.

  6. Genomic integrity and the ageing brain.

    PubMed

    Chow, Hei-man; Herrup, Karl

    2015-11-01

    DNA damage is correlated with and may drive the ageing process. Neurons in the brain are postmitotic and are excluded from many forms of DNA repair; therefore, neurons are vulnerable to various neurodegenerative diseases. The challenges facing the field are to understand how and when neuronal DNA damage accumulates, how this loss of genomic integrity might serve as a 'time keeper' of nerve cell ageing and why this process manifests itself as different diseases in different individuals.

  7. Enhancing cancer clonality analysis with integrative genomics

    PubMed Central

    2015-01-01

    Introduction: It is understood that cancer is a clonal disease initiated by a single cell, and that metastasis, which is the spread of cancer from the primary site, is also initiated by a single cell. The seemingly natural capability of cancer to adapt dynamically in a Darwinian manner is a primary reason for therapeutic failures. Survival advantages may be induced by cancer therapies and also occur as a result of inherent cell and microenvironmental factors. The selected "more fit" clones outmatch their competition and then become dominant in the tumor via propagation of progeny. This clonal expansion leads to relapse, therapeutic resistance and eventually death. The goal of this study is to develop and demonstrate a more detailed clonality approach by utilizing integrative genomics. Methods: Patient tumor samples were profiled by Whole Exome Sequencing (WES) and RNA-seq on an Illumina HiSeq 2500 and methylation profiling was performed on the Illumina Infinium 450K array. STAR and the Haplotype Caller were used for RNA-seq processing. Custom approaches were used for the integration of the multi-omic datasets. Results: Reported are major enhancements to CloneViz, which now provides capabilities enabling a formal tumor multi-dimensional clonality analysis by integrating: i) DNA mutations, ii) RNA expressed mutations, and iii) DNA methylation data. RNA and DNA methylation integration were not previously possible with CloneViz (previous version) or any other clonality method to date. This new approach, named iCloneViz (integrated CloneViz), employs visualization and quantitative methods, revealing an integrative genomic mutational dissection and traceability (DNA, RNA, epigenetics) through the different layers of molecular structures. Conclusion: The iCloneViz approach can be used for analysis of clonal evolution and mutational dynamics of multi-omic data sets. Revealing tumor clonal complexity in an integrative and quantitative manner facilitates improved mutational

  8. Inverted genomic segments and complex triplication rearrangements are mediated by inverted repeats in the human genome

    PubMed Central

    Carvalho, Claudia M. B.; Ramocki, Melissa B.; Pehlivan, Davut; Franco, Luis M.; Gonzaga-Jauregui, Claudia; Fang, Ping; McCall, Alanna; Pivnick, Eniko Karman; Hines-Dowell, Stacy; Seaver, Laurie; Friehling, Linda; Lee, Sansan; Smith, Rosemarie; del Gaudio, Daniela; Withers, Marjorie; Liu, Pengfei; Cheung, Sau Wai; Belmont, John W.; Zoghbi, Huda Y.; Hastings, P. J.; Lupski, James R.

    2011-01-01

    We identified complex genomic rearrangements consisting of intermixed duplications and triplications of genomic segments at both the MECP2 and PLP1 loci. These complex rearrangements were characterized by a triplicated segment embedded within a duplication in 12 unrelated subjects. Interestingly, only two novel breakpoint junctions were generated during each rearrangement formation. Remarkably, all the complex rearrangement products share the common genomic organization duplication-inverted triplication-duplication (DUP-TRP/INV-DUP) wherein the triplicated segment is inverted and located between directly oriented duplicated genomic segments. We provide evidence that the DUP-TRP/INV-DUP structures are mediated by inverted repeats that can be separated by over 300 kb; a genomic architecture that apparently leads to susceptibility to such complex rearrangements. A similar inverted repeat mediated mechanism may underlie structural variation in many other regions of the human genome. We propose a mechanism that involves both homology driven, via inverted repeats, and microhomologous/nonhomologous events. PMID:21964572

  9. Integrating Computer-Mediated Communication Strategy Instruction

    ERIC Educational Resources Information Center

    McNeil, Levi

    2016-01-01

    Communication strategies (CSs) play important roles in resolving problematic second language interaction and facilitating language learning. While studies in face-to-face contexts demonstrate the benefits of communication strategy instruction (CSI), there have been few attempts to integrate computer-mediated communication and CSI. The study…

  10. RNA-Mediated Epigenetic Programming of Genome Rearrangements

    PubMed Central

    Nowacki, Mariusz; Shetty, Keerthi; Landweber, Laura F.

    2012-01-01

    RNA, normally thought of as a conduit in gene expression, has a novel mode of action in ciliated protozoa. Maternal RNA templates provide both an organizing guide for DNA rearrangements and a template that can transport somatic mutations to the next generation. This opportunity for RNA-mediated genome rearrangement and DNA repair is profound in the ciliate Oxytricha, which deletes 95% of its germline genome during development in a process that severely fragments its chromosomes and then sorts and reorders the hundreds of thousands of pieces remaining. Oxytricha’s somatic nuclear genome is therefore an epigenome formed through RNA templates and signals arising from the previous generation. Furthermore, this mechanism of RNA-mediated epigenetic inheritance can function across multiple generations, and the discovery of maternal template RNA molecules has revealed new biological roles for RNA and has hinted at the power of RNA molecules to sculpt genomic information in cells. PMID:21801022

  11. Multidimensional Genome-wide Analyses Show Accurate FVIII Integration by ZFN in Primary Human Cells

    PubMed Central

    Sivalingam, Jaichandran; Kenanov, Dimitar; Han, Hao; Nirmal, Ajit Johnson; Ng, Wai Har; Lee, Sze Sing; Masilamani, Jeyakumar; Phan, Toan Thang; Maurer-Stroh, Sebastian; Kon, Oi Lian

    2016-01-01

    Costly coagulation factor VIII (FVIII) replacement therapy is a barrier to optimal clinical management of hemophilia A. Therapy using FVIII-secreting autologous primary cells is potentially efficacious and more affordable. Zinc finger nucleases (ZFN) mediate transgene integration into the AAVS1 locus but comprehensive evaluation of off-target genome effects is currently lacking. In light of serious adverse effects in clinical trials which employed genome-integrating viral vectors, this study evaluated potential genotoxicity of ZFN-mediated transgenesis using different techniques. We employed deep sequencing of predicted off-target sites, copy number analysis, whole-genome sequencing, and RNA-seq in primary human umbilical cord-lining epithelial cells (CLECs) with AAVS1 ZFN-mediated FVIII transgene integration. We combined molecular features to enhance the accuracy and activity of ZFN-mediated transgenesis. Our data showed a low frequency of ZFN-associated indels, no detectable off-target transgene integrations or chromosomal rearrangements. ZFN-modified CLECs had very few dysregulated transcripts and no evidence of activated oncogenic pathways. We also showed AAVS1 ZFN activity and durable FVIII transgene secretion in primary human dermal fibroblasts, bone marrow- and adipose tissue-derived stromal cells. Our study suggests that, with close attention to the molecular design of genome-modifying constructs, AAVS1 ZFN-mediated FVIII integration in several primary human cell types may be safe and efficacious. PMID:26689265

  12. Population Genomics of Infectious and Integrated Wolbachia pipientis Genomes in Drosophila ananassae

    PubMed Central

    Choi, Jae Young; Bubnell, Jaclyn E.; Aquadro, Charles F.

    2015-01-01

    Coevolution between Drosophila and its endosymbiont Wolbachia pipientis has many intriguing aspects. For example, Drosophila ananassae hosts two forms of W. pipientis genomes: One being the infectious bacterial genome and the other integrated into the host nuclear genome. Here, we characterize the infectious and integrated genomes of W. pipientis infecting D. ananassae (wAna), by genome sequencing 15 strains of D. ananassae that have either the infectious or integrated wAna genomes. Results indicate evolutionarily stable maternal transmission for the infectious wAna genome suggesting a relatively long-term coevolution with its host. In contrast, the integrated wAna genome showed pseudogene-like characteristics accumulating many variants that are predicted to have deleterious effects if present in an infectious bacterial genome. Phylogenomic analysis of sequence variation together with genotyping by polymerase chain reaction of large structural variations indicated several wAna variants among the eight infectious wAna genomes. In contrast, only a single wAna variant was found among the seven integrated wAna genomes examined in lines from Africa, south Asia, and south Pacific islands suggesting that the integration occurred once from a single infectious wAna genome and then spread geographically. Further analysis revealed that for all D. ananassae we examined with the integrated wAna genomes, the majority of the integrated wAna genomic regions is represented in at least two copies suggesting a double integration or single integration followed by an integrated genome duplication. The possible evolutionary mechanism underlying the widespread geographical presence of the duplicate integration of the wAna genome is an intriguing question remaining to be answered. PMID:26254486

  13. An integrated 3-Dimensional Genome Modeling Engine for data-driven simulation of spatial genome organization.

    PubMed

    Szałaj, Przemysław; Tang, Zhonghui; Michalski, Paul; Pietal, Michal J; Luo, Oscar J; Sadowski, Michał; Li, Xingwang; Radew, Kamen; Ruan, Yijun; Plewczynski, Dariusz

    2016-12-01

    ChIA-PET is a high-throughput mapping technology that reveals long-range chromatin interactions and provides insights into the basic principles of spatial genome organization and gene regulation mediated by specific protein factors. Recently, we showed that a single ChIA-PET experiment provides information at all genomic scales of interest, from the high-resolution locations of binding sites and enriched chromatin interactions mediated by specific protein factors, to the low resolution of nonenriched interactions that reflect topological neighborhoods of higher-order chromosome folding. This multilevel nature of ChIA-PET data offers an opportunity to use multiscale 3D models to study structural-functional relationships at multiple length scales, but doing so requires a structural modeling platform. Here, we report the development of 3D-GNOME (3-Dimensional Genome Modeling Engine), a complete computational pipeline for 3D simulation using ChIA-PET data. 3D-GNOME consists of three integrated components: a graph-distance-based heat map normalization tool, a 3D modeling platform, and an interactive 3D visualization tool. Using ChIA-PET and Hi-C data derived from human B-lymphocytes, we demonstrate the effectiveness of 3D-GNOME in building 3D genome models at multiple levels, including the entire genome, individual chromosomes, and specific segments at megabase (Mb) and kilobase (kb) resolutions of single average and ensemble structures. Further incorporation of CTCF-motif orientation and high-resolution looping patterns in 3D simulation provided additional reliability of potential biologically plausible topological structures.

  14. Integrated Genomic Characterization of Endometrial Carcinoma

    PubMed Central

    2013-01-01

    Summary We performed an integrated genomic, transcriptomic, and proteomic characterization of 373 endometrial carcinomas using array- and sequencing-based technologies. Uterine serous tumors and ~25% of high-grade endometrioid tumors have extensive copy number alterations, few DNA methylation changes, low ER/PR levels, and frequent TP53 mutations. Most endometrioid tumors have few copy number alterations or TP53 mutations but frequent mutations in PTEN, CTNNB1, PIK3CA, ARID1A, KRAS and novel mutations in the SWI/SNF gene ARID5B. A subset of endometrioid tumors we identified had a dramatically increased transversion mutation frequency, and newly identified hotspot mutations in POLE. Our results classified endometrial cancers into four categories: POLE ultramutated, microsatellite instability hypermutated, copy number low, and copy number high. Uterine serous carcinomas share genomic features with ovarian serous and basal-like breast carcinomas. We demonstrated that the genomic features of endometrial carcinomas permit a reclassification that may impact post-surgical adjuvant treatment for women with aggressive tumors. PMID:23636398

  15. Site-specific recombination in the chicken genome using Flipase recombinase-mediated cassette exchange.

    PubMed

    Lee, Hong Jo; Lee, Hyung Chul; Kim, Young Min; Hwang, Young Sun; Park, Young Hyun; Park, Tae Sub; Han, Jae Yong

    2016-02-01

    Targeted genome recombination has been applied in diverse research fields and has a wide range of possible applications. In particular, the discovery of specific loci in the genome that support robust and ubiquitous expression of integrated genes and the development of genome-editing technology have facilitated rapid advances in various scientific areas. In this study, we produced transgenic (TG) chickens that can induce recombinase-mediated gene cassette exchange (RMCE), one of the site-specific recombination technologies, and confirmed RMCE in TG chicken-derived cells. As a result, we established TG chicken lines that have Flipase (Flp) recognition target (FRT) pairs in the chicken genome, mediated by piggyBac transposition. The transgene integration patterns were diverse in each TG chicken line, and the integration diversity resulted in diverse levels of expression of exogenous genes in each tissue of the TG chickens. In addition, the replaced gene cassette was expressed successfully and maintained by RMCE in the FRT predominant loci of TG chicken-derived cells. These results indicate that targeted genome recombination technology with RMCE could be adaptable to TG chicken models and that the technology would be applicable to specific gene regulation by cis-element insertion and customized expression of functional proteins at predicted levels without epigenetic influence.

  16. Genome integrity, stem cells and hyaluronan

    PubMed Central

    Darzynkiewicz, Zbigniew; Balazs, Endre A.

    2012-01-01

    Faithful preservation of genome integrity is the critical mission of stem cells as well as of germ cells. Reviewed are the following mechanisms involved in protecting DNA in these cells: (a) the efflux machinery that can pump out a variety of genotoxins in an ATP-dependent manner; (b) the mechanisms maintaining minimal metabolic activity, which reduces generation of reactive oxidants, by-products of aerobic respiration; (c) the role of the hypoxic niche of stem cells in providing a gradient of variable oxygen tension; (d) (e) the presence of hyaluronan (HA) and HA receptors on stem cells and in the niche; (f) the role of HA in protecting DNA from oxidative damage; (g) the specific function of HA in protecting DNA in stem cells; (h) the interactions of HA with sperm cells and oocytes that also may shield their DNA from oxidative damage, and (e) mechanisms by which HA exerts its anti-oxidant activity. While HA has a multitude of functions, its anti-oxidant capabilities are often overlooked but may be of significance in preserving the integrity of the stem and germ cell genome. PMID:22383371

  17. Site-Specific Integration of Exogenous Genes Using Genome Editing Technologies in Zebrafish.

    PubMed

    Kawahara, Atsuo; Hisano, Yu; Ota, Satoshi; Taimatsu, Kiyohito

    2016-05-13

    The zebrafish (Danio rerio) is an ideal vertebrate model to investigate the developmental molecular mechanism of organogenesis and regeneration. Recent innovations in genome editing technologies, such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR associated protein 9 (Cas9) system, have allowed researchers to generate diverse genomic modifications in whole animals and in cultured cells. The CRISPR/Cas9 and TALEN techniques frequently induce DNA double-strand breaks (DSBs) at the targeted gene, resulting in frameshift-mediated gene disruption. As a useful application of genome editing technology, several groups have recently reported efficient site-specific integration of exogenous genes into targeted genomic loci. In this review, we provide an overview of TALEN- and CRISPR/Cas9-mediated site-specific integration of exogenous genes in zebrafish.

  18. Site-Specific Integration of Exogenous Genes Using Genome Editing Technologies in Zebrafish

    PubMed Central

    Kawahara, Atsuo; Hisano, Yu; Ota, Satoshi; Taimatsu, Kiyohito

    2016-01-01

    The zebrafish (Danio rerio) is an ideal vertebrate model to investigate the developmental molecular mechanism of organogenesis and regeneration. Recent innovations in genome editing technologies, such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR associated protein 9 (Cas9) system, have allowed researchers to generate diverse genomic modifications in whole animals and in cultured cells. The CRISPR/Cas9 and TALEN techniques frequently induce DNA double-strand breaks (DSBs) at the targeted gene, resulting in frameshift-mediated gene disruption. As a useful application of genome editing technology, several groups have recently reported efficient site-specific integration of exogenous genes into targeted genomic loci. In this review, we provide an overview of TALEN- and CRISPR/Cas9-mediated site-specific integration of exogenous genes in zebrafish. PMID:27187373

  19. MycoCosm, an Integrated Fungal Genomics Resource

    SciTech Connect

    Shabalov, Igor; Grigoriev, Igor

    2012-03-16

    MycoCosm is a web-based interactive fungal genomics resource, which was first released in March 2010, in response to an urgent call from the fungal community for integration of all fungal genomes and analytical tools in one place (Pan-fungal data resources meeting, Feb 21-22, 2010, Alexandria, VA). MycoCosm integrates genomics data and analysis tools to navigate through over 100 fungal genomes sequenced at JGI and elsewhere. This resource allows users to explore fungal genomes in the context of both genome-centric analysis and comparative genomics, and promotes user community participation in data submission, annotation and analysis. MycoCosm has over 4500 unique visitors/month or 35000+ visitors/year as well as hundreds of registered users contributing their data and expertise to this resource. Its scalable architecture allows significant expansion of the data expected from JGI Fungal Genomics Program, its users, and integration with external resources used by fungal community.

  20. LINE-1 Retrotransposons: Mediators of Somatic Variation in Neuronal Genomes?

    PubMed Central

    Singer, Tatjana; McConnell, Michael J.; Marchetto, Maria C.N.; Coufal, Nicole G.; Gage, Fred H.

    2010-01-01

    LINE-1 (L1) elements are retrotransposons that insert extra copies of themselves throughout the genome using a “copy and paste” mechanism. L1s have contributed ~20% to total human genome content and are able to influence chromosome integrity and gene expression upon reinsertion. Recent studies show that L1 elements are active and “jumping” during neuronal differentiation. New somatic L1 insertions may generate “genomic plasticity” in neurons by causing variation in genomic DNA sequences and by altering the transcriptome of individual cells. Thus, L1-induced variation may affect neuronal plasticity and behavior. Here, we discuss potential consequences of L1-induced neuronal diversity and propose that a mechanism generating diversity in the brain could broaden the spectrum of behavioral phenotypes that can originate from any single genome. PMID:20471112

  1. Comparative Genome Analysis in the Integrated Microbial Genomes (IMG) System

    SciTech Connect

    Kyrpides, Nikos C.; Markowitz, Victor M.

    2006-03-01

    Comparative genome analysis is critical for the effective exploration of a rapidly growing number of complete and draft sequences for microbial genomes. The Integrated Microbial Genomes (IMG) system (img.jgi.doe.gov) has been developed as a community resource that provides support for comparative analysis of microbial genomes in an integrated context. IMG allows users to navigate the multidimensional microbial genome data space and focus their analysis on a subset of genes, genomes, and functions of interest. IMG provides graphical viewers, summaries and occurrence profile tools for comparing genes, pathways and functions (terms) across specific genomes. Genes can be further examined using gene neighborhoods and compared with sequence alignment tools.

  2. Nuclear pore complexes in the maintenance of genome integrity.

    PubMed

    Bukata, Lucas; Parker, Stephanie L; D'Angelo, Maximiliano A

    2013-06-01

    Maintaining genome integrity is crucial for successful organismal propagation and for cell and tissue homeostasis. Several processes contribute to safeguarding the genomic information of cells. These include accurate replication of genetic information, detection and repair of DNA damage, efficient segregation of chromosomes, protection of chromosome ends, and proper organization of genome architecture. Interestingly, recent evidence shows that nuclear pore complexes, the channels connecting the nucleus with the cytoplasm, play important roles in these processes suggesting that these multiprotein platforms are key regulators of genome integrity.

  3. MAR-Mediated transgene integration into permissive chromatin and increased expression by recombination pathway engineering.

    PubMed

    Kostyrko, Kaja; Neuenschwander, Samuel; Junier, Thomas; Regamey, Alexandre; Iseli, Christian; Schmid-Siegert, Emanuel; Bosshard, Sandra; Majocchi, Stefano; Le Fourn, Valérie; Girod, Pierre-Alain; Xenarios, Ioannis; Mermod, Nicolas

    2017-02-01

    Untargeted plasmid integration into mammalian cell genomes remains a poorly understood and inefficient process. The formation of plasmid concatemers and their genomic integration has been ascribed either to non-homologous end-joining (NHEJ) or homologous recombination (HR) DNA repair pathways. However, a direct involvement of these pathways has remained unclear. Here, we show that the silencing of many HR factors enhanced plasmid concatemer formation and stable expression of the gene of interest in Chinese hamster ovary (CHO) cells, while the inhibition of NHEJ had no effect. However, genomic integration was decreased by the silencing of specific HR components, such as Rad51, and DNA synthesis-dependent microhomology-mediated end-joining (SD-MMEJ) activities. Genome-wide analysis of the integration loci and junction sequences validated the prevalent use of the SD-MMEJ pathway for transgene integration close to cellular genes, an effect shared with matrix attachment region (MAR) DNA elements that stimulate plasmid integration and expression. Overall, we conclude that SD-MMEJ is the main mechanism driving the illegitimate genomic integration of foreign DNA in CHO cells, and we provide a recombination engineering approach that increases transgene integration and recombinant protein expression in these cells. Biotechnol. Bioeng. 2017;114: 384-396. © 2016 The Authors. Biotechnology and Bioengineering published by Wiley Periodicals, Inc.

  4. T-DNA integration in plants results from polymerase-θ-mediated DNA repair.

    PubMed

    van Kregten, Maartje; de Pater, Sylvia; Romeijn, Ron; van Schendel, Robin; Hooykaas, Paul J J; Tijsterman, Marcel

    2016-10-31

    Agrobacterium tumefaciens is a pathogenic bacterium, which transforms plants by transferring a discrete segment of its DNA, the T-DNA, to plant cells. The T-DNA then integrates into the plant genome. T-DNA biotechnology is widely exploited in the genetic engineering of model plants and crops. However, the molecular mechanism underlying T-DNA integration remains unknown. Here we demonstrate that in Arabidopsis thaliana T-DNA integration critically depends on polymerase theta (Pol θ). We find that TEBICHI/POLQ mutant plants (which have mutated Pol θ), although susceptible to Agrobacterium infection, are resistant to T-DNA integration. Characterization of >10,000 T-DNA-plant genome junctions reveals a distinct signature of Pol θ action and also indicates that 3' end capture at genomic breaks is the prevalent mechanism of T-DNA integration. The primer-template switching ability of Pol θ can explain the molecular patchwork known as filler DNA that is frequently observed at sites of integration. T-DNA integration signatures in other plant species closely resemble those of Arabidopsis, suggesting that Pol-θ-mediated integration is evolutionarily conserved. Thus, Pol θ provides the mechanism for T-DNA random integration into the plant genome, demonstrating a potential to disrupt random integration so as to improve the quality and biosafety of plant transgenesis.

  5. CRISPR mediated somatic cell genome engineering in the chicken.

    PubMed

    Véron, Nadège; Qu, Zhengdong; Kipen, Phoebe A S; Hirst, Claire E; Marcelle, Christophe

    2015-11-01

    Gene-targeted knockout technologies are invaluable tools for understanding the functions of genes in vivo. The CRISPR/Cas9 system of RNA-guided genome editing is revolutionizing genetics research in a wide spectrum of organisms. Here, we combined CRISPR with in vivo electroporation in the chicken embryo to efficiently target the transcription factor PAX7 in tissues of the developing embryo. This approach generated mosaic genetic mutations within a wild-type cellular background. This series of proof-of-principle experiments indicates that in vivo CRISPR-mediated cell genome engineering is an effective method to achieve gene loss-of-function in the tissues of the chicken embryo and that it completes the growing genetic toolbox to study the molecular mechanisms regulating development in this important animal model.

  6. Integration of new alternative reference strain genome sequences into the Saccharomyces genome database.

    PubMed

    Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C; Dalusag, Kyla; Demeter, Janos; Engel, Stacia; Hellerstedt, Sage T; Karra, Kalpana; Hitz, Benjamin C; Nash, Robert S; Paskov, Kelley; Sheppard, Travis; Skrzypek, Marek; Weng, Shuai; Wong, Edith; Michael Cherry, J

    2016-01-01

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. To provide a wider scope of genetic and phenotypic variation in yeast, the genome sequences and their corresponding annotations from 11 alternative S. cerevisiae reference strains have been integrated into SGD. Genomic and protein sequence information for genes from these strains are now available on the Sequence and Protein tab of the corresponding Locus Summary pages. We illustrate how these genome sequences can be utilized to aid our understanding of strain-specific functional and phenotypic differences. Database URL: www.yeastgenome.org.

  7. Hotspots of MLV integration in the hematopoietic tumor genome

    PubMed Central

    Tsuruyama, T; Hiratsuka, T; Yamada, N

    2017-01-01

    Extensive research has been performed regarding the integration sites of murine leukemia retrovirus (MLV) for the identification of proto-oncogenes. To date, the overlap of mutations within specific oligonucleotides across different tumor genomes has been regarded as a rare event; however, a recent study of MLV integration into the oncogene Zfp521 suggested the existence of a hotspot oligonucleotide for MLV integration. In the current review, we discuss the hotspots of MLV integration into several genes: c-Myc, Stat5a and N-myc, as well as ZFP521, as examined in tumor genomes. From this, MLV integration convergence within specific oligonucleotides is not necessarily a rare event. This short review aims to promote re-consideration of MLV integration within the tumor genome, which involves both well-known and potentially newly identified and novel mechanisms and specifications. PMID:27721401

  8. Integrated Genome-Based Studies of Shewanella Ecophysiology

    SciTech Connect

    Zhou, Jizhong; He, Zhili

    2014-04-08

    As a part of the Shewanella Federation project, we have used integrated genomic, proteomic and computational technologies to study various aspects of energy metabolism of two Shewanella strains from a systems-level perspective.

  9. Integrated proteomic and genomic analysis of colorectal cancer

    Cancer.gov

    Investigators who analyzed 95 human colorectal tumor samples have determined how gene alterations identified in previous analyses of the same samples are expressed at the protein level. The integration of proteomic and genomic data, or proteogenomics, pro

  10. Precise in-frame integration of exogenous DNA mediated by CRISPR/Cas9 system in zebrafish.

    PubMed

    Hisano, Yu; Sakuma, Tetsushi; Nakade, Shota; Ohga, Rie; Ota, Satoshi; Okamoto, Hitoshi; Yamamoto, Takashi; Kawahara, Atsuo

    2015-03-05

    The CRISPR/Cas9 system provides a powerful tool for genome editing in various model organisms, including zebrafish. The establishment of targeted gene-disrupted zebrafish (knockouts) is readily achieved by CRISPR/Cas9-mediated genome modification. Recently, exogenous DNA integration into the zebrafish genome via homology-independent DNA repair was reported, but this integration contained various mutations at the junctions of genomic and integrated DNA. Thus, precise genome modification into targeted genomic loci remains to be achieved. Here, we describe efficient, precise CRISPR/Cas9-mediated integration using a donor vector harbouring short homologous sequences (10-40 bp) flanking the genomic target locus. We succeeded in integrating with high efficiency an exogenous mCherry or eGFP gene into targeted genes (tyrosinase and krtt1c19e) in frame. We found the precise in-frame integration of exogenous DNA without backbone vector sequences when Cas9 cleavage sites were introduced at both sides of the left homology arm, the eGFP sequence and the right homology arm. Furthermore, we confirmed that this precise genome modification was heritable. This simple method enables precise targeted gene knock-in in zebrafish.

  11. WheatGenome.info: an integrated database and portal for wheat genome information.

    PubMed

    Lai, Kaitao; Berkman, Paul J; Lorenc, Michal Tadeusz; Duran, Chris; Smits, Lars; Manoli, Sahana; Stiller, Jiri; Edwards, David

    2012-02-01

    Bread wheat (Triticum aestivum) is one of the most important crop plants, globally providing staple food for a large proportion of the human population. However, improvement of this crop has been limited due to its large and complex genome. Advances in genomics are supporting wheat crop improvement. We provide a variety of web-based systems hosting wheat genome and genomic data to support wheat research and crop improvement. WheatGenome.info is an integrated database resource which includes multiple web-based applications. These include a GBrowse2-based wheat genome viewer with BLAST search portal, TAGdb for searching wheat second-generation genome sequence data, wheat autoSNPdb, links to wheat genetic maps using CMap and CMap3D, and a wheat genome Wiki to allow interaction between diverse wheat genome sequencing activities. This system includes links to a variety of wheat genome resources hosted at other research organizations. This integrated database aims to accelerate wheat genome research and is freely accessible via the web interface at http://www.wheatgenome.info/.

  12. Principles and methods of integrative genomic analyses in cancer.

    PubMed

    Kristensen, Vessela N; Lingjærde, Ole Christian; Russnes, Hege G; Vollan, Hans Kristian M; Frigessi, Arnoldo; Børresen-Dale, Anne-Lise

    2014-05-01

    Combined analyses of molecular data, such as DNA copy-number alteration, mRNA and protein expression, point to biological functions and molecular pathways being deregulated in multiple cancers. Genomic, metabolomic and clinical data from various solid cancers and model systems are emerging and can be used to identify novel patient subgroups for tailored therapy and monitoring. The integrative genomics methodologies that are used to interpret these data require expertise in different disciplines, such as biology, medicine, mathematics, statistics and bioinformatics, and they can seem daunting. The objectives, methods and computational tools of integrative genomics that are available to date are reviewed here, as is their implementation in cancer research.

  13. [Investigation on the integrative course of genetics and genomics].

    PubMed

    Liu, Zhi-Xiang; Xu, Gang-Biao; Zeng, Chao-Zhen; Wang, Ai-Yun; Wu, Ruo-Yan

    2011-07-01

    Genomics is an important subdiscipline of genetics, and it forms a complete research system based on novel theories and techniques. Incorporating genomics into the undergraduate curriculum responds to the needs of the development of genetics. The teaching of genomics has significant advantages for developing scientific thinking, bioethical awareness, and professional interest in undergraduate students. The integration of genomics into genetics is in accordance with the principles of subject development and education. Related textbooks for undergraduate education are currently available in China, and it is feasible to set up an integrative genetics and genomics course by modifying the teaching content of the genetics course, selecting appropriate teaching approaches, and making optimal use of computer-assisted instruction.

  14. Genomic and oncogenic preference of HBV integration in hepatocellular carcinoma

    PubMed Central

    Zhao, Ling-Hao; Liu, Xiao; Yan, He-Xin; Li, Wei-Yang; Zeng, Xi; Yang, Yuan; Zhao, Jie; Liu, Shi-Ping; Zhuang, Xue-Han; Lin, Chuan; Qin, Chen-Jie; Zhao, Yi; Pan, Ze-Ya; Huang, Gang; Liu, Hui; Zhang, Jin; Wang, Ruo-Yu; Yang, Yun; Wen, Wen; Lv, Gui-Shuai; Zhang, Hui-Lu; Wu, Han; Huang, Shuai; Wang, Ming-Da; Tang, Liang; Cao, Hong-Zhi; Wang, Ling; Lee, Tin-Lap; Jiang, Hui; Tan, Ye-Xiong; Yuan, Sheng-Xian; Hou, Guo-Jun; Tao, Qi-Fei; Xu, Qin-Guo; Zhang, Xiu-Qing; Wu, Meng-Chao; Xu, Xun; Wang, Jun; Yang, Huan-Ming; Zhou, Wei-Ping; Wang, Hong-Yang

    2016-01-01

    Hepatitis B virus (HBV) can integrate into the human genome, contributing to genomic instability and hepatocarcinogenesis. Here by conducting high-throughput viral integration detection and RNA sequencing, we identify 4,225 HBV integration events in tumour and adjacent non-tumour samples from 426 patients with HCC. We show that HBV is prone to integrate into rare fragile sites and functional genomic regions including CpG islands. We observe a distinct pattern in the preferential sites of HBV integration between tumour and non-tumour tissues. HBV insertional sites are significantly enriched in the proximity of telomeres in tumours. Recurrent HBV target genes are identified with few that overlap. The overall HBV integration frequency is much higher in tumour genomes of males than in females, with a significant enrichment of integration into chromosome 17. Furthermore, a cirrhosis-dependent HBV integration pattern is observed, affecting distinct targeted genes. Our data suggest that HBV integration has a high potential to drive oncogenic transformation. PMID:27703150

  15. Integrated Microbial Genomes (IMG) System from the DOE Joint Genome Institute (JGI)

    DOE Data Explorer

    The integrated microbial genomes (IMG) system is a data management, analysis and annotation platform for all publicly available genomes. IMG contains both draft and complete JGI microbial genomes integrated with all other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and annotating genomes, genes and functions, individually or in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through quarterly releases. IMG is provided by the DOE-Joint Genome Institute (JGI) and is available from http://img.jgi.doe.gov. [Abstract from The integrated microbial genomes (IMG) system in 2007: data content and analysis tool extensions; Victor M. Markowitz, Ernest Szeto, Krishna Palaniappan, Yuri Grechkin, Ken Chu, I-Min A. Chen, Inna Dubchak, Iain Anderson, Athanasios Lykidis, Konstantinos Mavromatis, Natalia N. Ivanova and Nikos C. Kyrpides; Nucleic Acids Research, 2008, Vol. 36 (Database Issue).] See also the companion system, Integrated Microbial Genomes with Microbiome Samples.

  16. Target-specific variants of Flp recombinase mediate genome engineering reactions in mammalian cells.

    PubMed

    Shah, Riddhi; Li, Feng; Voziyanova, Eugenia; Voziyanov, Yuri

    2015-09-01

    Genome engineering relies on DNA-modifying enzymes that are able to locate a DNA sequence of interest and initiate a desired genome rearrangement. Currently, the field predominantly utilizes site-specific DNA nucleases that depend on the host DNA repair machinery to complete a genome modification task. We show here that genome engineering approaches that employ target-specific variants of the self-sufficient, versatile site-specific DNA recombinase Flp can be developed into promising alternatives. We demonstrate that the Flp variant evolved to recombine an FRT-like sequence, FL-IL10A, which is located upstream of the human interleukin-10 gene, can target this sequence in the model setting of Chinese hamster ovary and human embryonic kidney 293 cells. This target-specific Flp variant is able to perform the integration reaction and, when paired with another recombinase, the dual recombinase-mediated cassette exchange reaction. The efficiency of the integration reaction in human cells can be enhanced by 'humanizing' the Flp variant gene and by adding the nuclear localization sequence to the recombinase.

  17. An Integrative Breakage Model of genome architecture, reshuffling and evolution: The Integrative Breakage Model of genome evolution, a novel multidisciplinary hypothesis for the study of genome plasticity.

    PubMed

    Farré, Marta; Robinson, Terence J; Ruiz-Herrera, Aurora

    2015-05-01

    Our understanding of genomic reorganization, the mechanics of genomic transmission to offspring during germ line formation, and how these structural changes contribute to the speciation process and genetic disease is far from complete. Earlier attempts to understand the mechanism(s) and constraints that govern genome remodeling suffered from being too narrowly focused, and failed to provide a unified and encompassing view of how genomes are organized and regulated inside cells. Here, we propose a new multidisciplinary Integrative Breakage Model for the study of genome evolution. The analysis of the high-level structural organization of genomes (nucleome), together with the functional constraints that accompany genome reshuffling, provides insights into the origin and plasticity of genome organization that may assist with the detection and isolation of therapeutic targets for the treatment of complex human disorders.

  18. Nuclease-mediated genome editing: At the front-line of functional genomics technology.

    PubMed

    Sakuma, Tetsushi; Woltjen, Knut

    2014-01-01

    Genome editing with engineered endonucleases is rapidly becoming a staple method in developmental biology studies. Engineered nucleases permit random or designed genomic modification at precise loci through the stimulation of endogenous double-strand break repair. Homology-directed repair following targeted DNA damage is mediated by co-introduction of a custom repair template, allowing the derivation of knock-out and knock-in alleles in animal models previously refractory to classic gene targeting procedures. Currently there are three main types of customizable site-specific nucleases delineated by the source mechanism of DNA binding that guides nuclease activity to a genomic target: zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeats (CRISPR). Among these genome engineering tools, characteristics such as the ease of design and construction, mechanism of inducing DNA damage, and DNA sequence specificity all differ, making their application complementary. By understanding the advantages and disadvantages of each method, one may make the best choice for their particular purpose.

  19. Orchidstra: an integrated orchid functional genomics database.

    PubMed

    Su, Chun-lin; Chao, Ya-Ting; Yen, Shao-Hua; Chen, Chun-Yi; Chen, Wan-Chieh; Chang, Yao-Chien Alex; Shih, Ming-Che

    2013-02-01

    A specialized orchid database, named Orchidstra (URL: http://orchidstra.abrc.sinica.edu.tw), has been constructed to collect, annotate and share genomic information for orchid functional genomics studies. The Orchidaceae is a large family of Angiosperms that exhibits extraordinary biodiversity in terms of both the number of species and their distribution worldwide. Orchids exhibit many unique biological features; however, investigation of these traits is currently constrained due to the limited availability of genomic information. Transcriptome information for five orchid species and one commercial hybrid has been included in the Orchidstra database. Altogether, these comprise >380,000 non-redundant orchid transcript sequences, of which >110,000 are protein-coding genes. Sequences from the transcriptome shotgun assembly (TSA) were obtained either from output reads from next-generation sequencing technologies assembled into contigs, or from conventional cDNA library approaches. An annotation pipeline using Gene Ontology, KEGG and Pfam was built to assign gene descriptions and functional annotation to protein-coding genes. Deep sequencing of small RNA was also performed for Phalaenopsis aphrodite to search for microRNAs (miRNAs), extending the information archived for this species to miRNA annotation, precursors and putative target genes. The P. aphrodite transcriptome information was further used to design probes for an oligonucleotide microarray, and expression profiling analysis was carried out. The intensities of hybridized probes derived from microarray assays of various tissues were incorporated into the database as part of the functional evidence. In the future, the content of the Orchidstra database will be expanded with transcriptome data and genomic information from more orchid species.

  20. An integrated approach to structural genomics.

    PubMed

    Heinemann, U; Frevert, J; Hofmann, K; Illing, G; Maurer, C; Oschkinat, H; Saenger, W

    2000-01-01

    Structural genomics aims at determining a set of protein structures that will represent all domain folds present in the biosphere. These structures can be used as the basis for the homology modelling of the majority of all remaining protein domains or, indeed, proteins. Structural genomics therefore promises to provide a comprehensive structural description of the protein universe. To achieve this, a broad scientific effort is required. The Berlin-based "Protein Structure Factory" (PSF) plans to contribute to this effort by setting up a local infrastructure for the low-cost, high-throughput analysis of soluble human proteins. In close collaboration with the German Human Genome Project (DHGP) protein-coding genes will be expressed in Escherichia coli or yeast. Affinity-tagged proteins will be purified semi-automatically for biophysical characterization and structure analysis by X-ray diffraction methods and NMR spectroscopy. In all steps of the structure analysis process, possibilities for automation, parallelization and standardization will be explored. Major new facilities that are created for the PSF include a robotic station for large-scale protein crystallization, an NMR center and an experimental station for protein crystallography at the synchrotron storage ring BESSY II in Berlin.

  1. Overview of the Integrated Genomic Data system (IGD)

    SciTech Connect

    Hagstrom, R.; Overbeek, R.; Price, M.; Micheals, G.S.; Taylor, R.

    1992-12-01

    In previous work, we developed a database system to support analysis of the E. coli genome. That system provided a pidgin-English query facility, rudimentary pattern-matching capabilities, and the ability to rapidly extract answers to a wide variety of questions about the organization of the E. coli genome. To enable the comparative analysis of the genomes from different species, we have designed and implemented a new prototype database system, called the Integrated Genomic Database (IGD). IGD extends our earlier effort by incorporating a set of curator's tools that facilitate the incorporation of physical and genetic data, together with the results of genome organization analysis, into a common database system. Additional tools for extracting, manipulating, and analyzing data are planned.

  2. Overview of the Integrated Genomic Data system (IGD)

    SciTech Connect

    Hagstrom, R.; Overbeek, R.; Price, M.; Micheals, G.S.; Taylor, R. (Div. of Computer Resources and Technology)

    1992-01-01

    In previous work, we developed a database system to support analysis of the E. coli genome. That system provided a pidgin-English query facility, rudimentary pattern-matching capabilities, and the ability to rapidly extract answers to a wide variety of questions about the organization of the E. coli genome. To enable the comparative analysis of the genomes from different species, we have designed and implemented a new prototype database system, called the Integrated Genomic Database (IGD). IGD extends our earlier effort by incorporating a set of curator's tools that facilitate the incorporation of physical and genetic data, together with the results of genome organization analysis, into a common database system. Additional tools for extracting, manipulating, and analyzing data are planned.

  3. CRISPR-Cas9-Mediated Genome Editing in Leishmania donovani

    PubMed Central

    Zhang, Wen-Wei

    2015-01-01

    ABSTRACT The prokaryotic CRISPR (clustered regularly interspaced short palindromic repeat)-Cas9, an RNA-guided endonuclease, has been shown to mediate efficient genome editing in a wide variety of organisms. In the present study, the CRISPR-Cas9 system has been adapted to Leishmania donovani, a protozoan parasite that causes fatal human visceral leishmaniasis. We introduced the Cas9 nuclease into L. donovani and generated guide RNA (gRNA) expression vectors by using the L. donovani rRNA promoter and the hepatitis delta virus (HDV) ribozyme. We demonstrate that L. donovani mainly used homology-directed repair (HDR) and microhomology-mediated end joining (MMEJ) to repair the Cas9 nuclease-created double-strand DNA break (DSB). The nonhomologous end-joining (NHEJ) pathway appears to be absent in L. donovani. With this CRISPR-Cas9 system, it was possible to generate knockouts without selection by insertion of an oligonucleotide donor with stop codons and 25-nucleotide homology arms into the Cas9 cleavage site. Likewise, we disrupted and precisely tagged endogenous genes by inserting a bleomycin drug selection marker and GFP gene into the Cas9 cleavage site. With the use of Hammerhead and HDV ribozymes, a double-gRNA expression vector that further improved gene-targeting efficiency was developed, and it was used to make precise deletion of the 3-kb miltefosine transporter gene (LdMT). In addition, this study identified a novel single point mutation caused by CRISPR-Cas9 in LdMT (M381T) that led to miltefosine resistance, a concern for the only available oral antileishmanial drug. Together, these results demonstrate that the CRISPR-Cas9 system represents an effective genome engineering tool for L. donovani. PMID:26199327
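    The knockout donor described above (stop codons flanked by 25-nucleotide homology arms at the Cas9 cleavage site) can be sketched in a few lines of Python. This is only an illustration of the layout, not the authors' design procedure: the function name, the stop-codon cassette `TAATAGTGA`, and the demo target sequence are all hypothetical, and strand orientation, reading frame, and PAM disruption are ignored.

```python
# Hypothetical cassette: three consecutive stop codons (TAA, TAG, TGA).
# The abstract does not give the actual inserted sequence.
STOP_CASSETTE = "TAATAGTGA"


def build_stop_donor(target: str, cut_index: int,
                     arm_len: int = 25,
                     insert: str = STOP_CASSETTE) -> str:
    """Return left arm + insert + right arm around a cut position.

    `cut_index` is the 0-based position in `target` at which the
    double-strand break is assumed to fall (bases before it form the
    left arm, bases from it onward form the right arm).
    """
    if cut_index < arm_len or cut_index + arm_len > len(target):
        raise ValueError("target too short for the requested arm length")
    left_arm = target[cut_index - arm_len:cut_index]
    right_arm = target[cut_index:cut_index + arm_len]
    return left_arm + insert + right_arm


# Made-up 60-nt target with the cut assumed at position 30.
demo_target = "ACGT" * 15
donor = build_stop_donor(demo_target, cut_index=30)
print(len(donor), donor)
```

    The arm length is a parameter so the same sketch covers other short-arm designs; only the 25-nucleotide value comes from the abstract.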

  4. Identifying potential cancer driver genes by genomic data integration

    NASA Astrophysics Data System (ADS)

    Chen, Yong; Hao, Jingjing; Jiang, Wei; He, Tong; Zhang, Xuegong; Jiang, Tao; Jiang, Rui

    2013-12-01

    Cancer is a genomic disease associated with a plethora of gene mutations resulting in a loss of control over vital cellular functions. Among these mutated genes, driver genes are defined as being causally linked to oncogenesis, while passenger genes are thought to be irrelevant for cancer development. With increasing numbers of large-scale genomic datasets available, integrating these genomic data to identify driver genes from aberration regions of cancer genomes becomes an important goal of cancer genome analysis and investigations into mechanisms responsible for cancer development. A computational method, MAXDRIVER, is proposed here to identify potential driver genes on the basis of copy number aberration (CNA) regions of cancer genomes, by integrating publicly available human genomic data. MAXDRIVER employs several optimization strategies to construct a heterogeneous network, by means of combining a fused gene functional similarity network, gene-disease associations and a disease phenotypic similarity network. MAXDRIVER was validated to effectively recall known associations among genes and cancers. Previously identified as well as novel driver genes were detected by scanning CNAs of breast cancer, melanoma and liver carcinoma. Three predicted driver genes (CDKN2A, AKT1, RNF139) were found common in these three cancers by comparative analysis.

  5. Transcriptionally active genome regions are preferred targets for retrovirus integration.

    PubMed Central

    Scherdin, U; Rhodes, K; Breindl, M

    1990-01-01

    We have analyzed the transcriptional activity of cellular target sequences for Moloney murine leukemia virus integration in mouse fibroblasts. At least five of the nine random, unselected integration target sequences studied showed direct evidence for transcriptional activity by hybridization to nuclear run-on transcripts prepared from uninfected cells. At least four of the sequences contained multiple recognition sites for several restriction enzymes that cut preferentially in CpG-rich islands, indicating integration into 5' or 3' ends or flanking regions of genes. Assuming that only a minor fraction (less than 20%) of the genome is transcribed in mammalian cells, we calculated the probability that this association of retroviral integration sites with transcribed sequences is due to chance to be very low (1.6 x 10^-2). Thus, our results strongly suggest that transcriptionally active genome regions are preferred targets for retrovirus integration. PMID:2296087
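    The chance calculation in this abstract can be illustrated with a simple binomial model. The sketch below assumes, which the abstract does not state, that each of the nine integration sites falls in transcribed DNA independently with probability 0.20 (the quoted upper bound on the transcribed fraction). Whether the authors used this model is a guess; the code only shows what such a model yields for exactly five, and for five or more, transcribed sites out of nine.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_tail(k: int, n: int, p: float) -> float:
    """Probability of k or more successes in n independent trials."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


N_SITES = 9            # integration target sequences studied
N_TRANSCRIBED = 5      # sites with evidence of transcription ("at least five")
P_TRANSCRIBED = 0.20   # assumed upper bound on the transcribed genome fraction

p_exact = binom_pmf(N_TRANSCRIBED, N_SITES, P_TRANSCRIBED)
p_at_least = binom_tail(N_TRANSCRIBED, N_SITES, P_TRANSCRIBED)

print(f"P(exactly 5 of 9)  = {p_exact:.4f}")
print(f"P(at least 5 of 9) = {p_at_least:.4f}")
```

    Under these assumptions both values are on the order of 10^-2, the same order of magnitude as the probability reported in the abstract.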

  6. Integrated Genomic Characterization of Papillary Thyroid Carcinoma

    PubMed Central

    Agrawal, Nishant; Akbani, Rehan; Aksoy, B. Arman; Ally, Adrian; Arachchi, Harindra; Asa, Sylvia L.; Auman, J. Todd; Balasundaram, Miruna; Balu, Saianand; Baylin, Stephen B.; Behera, Madhusmita; Bernard, Brady; Beroukhim, Rameen; Bishop, Justin A.; Black, Aaron D.; Bodenheimer, Tom; Boice, Lori; Bootwalla, Moiz S.; Bowen, Jay; Bowlby, Reanne; Bristow, Christopher A.; Brookens, Robin; Brooks, Denise; Bryant, Robert; Buda, Elizabeth; Butterfield, Yaron S.N.; Carling, Tobias; Carlsen, Rebecca; Carter, Scott L.; Carty, Sally E.; Chan, Timothy A.; Chen, Amy Y.; Cherniack, Andrew D.; Cheung, Dorothy; Chin, Lynda; Cho, Juok; Chu, Andy; Chuah, Eric; Cibulskis, Kristian; Ciriello, Giovanni; Clarke, Amanda; Clayman, Gary L.; Cope, Leslie; Copland, John; Covington, Kyle; Danilova, Ludmila; Davidsen, Tanja; Demchok, John A.; DiCara, Daniel; Dhalla, Noreen; Dhir, Rajiv; Dookran, Sheliann S.; Dresdner, Gideon; Eldridge, Jonathan; Eley, Greg; El-Naggar, Adel K.; Eng, Stephanie; Fagin, James A.; Fennell, Timothy; Ferris, Robert L.; Fisher, Sheila; Frazer, Scott; Frick, Jessica; Gabriel, Stacey B.; Ganly, Ian; Gao, Jianjiong; Garraway, Levi A.; Gastier-Foster, Julie M.; Getz, Gad; Gehlenborg, Nils; Ghossein, Ronald; Gibbs, Richard A.; Giordano, Thomas J.; Gomez-Hernandez, Karen; Grimsby, Jonna; Gross, Benjamin; Guin, Ranabir; Hadjipanayis, Angela; Harper, Hollie A.; Hayes, D. Neil; Heiman, David I.; Herman, James G.; Hoadley, Katherine A.; Hofree, Matan; Holt, Robert A.; Hoyle, Alan P.; Huang, Franklin W.; Huang, Mei; Hutter, Carolyn M.; Ideker, Trey; Iype, Lisa; Jacobsen, Anders; Jefferys, Stuart R.; Jones, Corbin D.; Jones, Steven J.M.; Kasaian, Katayoon; Kebebew, Electron; Khuri, Fadlo R.; Kim, Jaegil; Kramer, Roger; Kreisberg, Richard; Kucherlapati, Raju; Kwiatkowski, David J.; Ladanyi, Marc; Lai, Phillip H.; Laird, Peter W.; Lander, Eric; Lawrence, Michael S.; Lee, Darlene; Lee, Eunjung; Lee, Semin; Lee, William; Leraas, Kristen M.; Lichtenberg, Tara M.; Lichtenstein, Lee; Lin, Pei; Ling, Shiyun; Liu, Jinze; Liu, Wenbin; Liu, Yingchun; LiVolsi, Virginia A.; Lu, Yiling; Ma, Yussanne; Mahadeshwar, Harshad S.; Marra, Marco A.; Mayo, Michael; McFadden, David G.; Meng, Shaowu; Meyerson, Matthew; Mieczkowski, Piotr A.; Miller, Michael; Mills, Gordon; Moore, Richard A.; Mose, Lisle E.; Mungall, Andrew J.; Murray, Bradley A.; Nikiforov, Yuri E.; Noble, Michael S.; Ojesina, Akinyemi I.; Owonikoko, Taofeek K.; Ozenberger, Bradley A.; Pantazi, Angeliki; Parfenov, Michael; Park, Peter J.; Parker, Joel S.; Paull, Evan O.; Pedamallu, Chandra Sekhar; Perou, Charles M.; Prins, Jan F.; Protopopov, Alexei; Ramalingam, Suresh S.; Ramirez, Nilsa C.; Ramirez, Ricardo; Raphael, Benjamin J.; Rathmell, W. Kimryn; Ren, Xiaojia; Reynolds, Sheila M.; Rheinbay, Esther; Ringel, Matthew D.; Rivera, Michael; Roach, Jeffrey; Robertson, A. Gordon; Rosenberg, Mara W.; Rosenthall, Matthew; Sadeghi, Sara; Saksena, Gordon; Sander, Chris; Santoso, Netty; Schein, Jacqueline E.; Schultz, Nikolaus; Schumacher, Steven E.; Seethala, Raja R.; Seidman, Jonathan; Senbabaoglu, Yasin; Seth, Sahil; Sharpe, Samantha; Mills Shaw, Kenna R.; Shen, John P.; Shen, Ronglai; Sherman, Steven; Sheth, Margi; Shi, Yan; Shmulevich, Ilya; Sica, Gabriel L.; Simons, Janae V.; Sipahimalani, Payal; Smallridge, Robert C.; Sofia, Heidi J.; Soloway, Matthew G.; Song, Xingzhi; Sougnez, Carrie; Stewart, Chip; Stojanov, Petar; Stuart, Joshua M.; Tabak, Barbara; Tam, Angela; Tan, Donghui; Tang, Jiabin; Tarnuzzer, Roy; Taylor, Barry S.; Thiessen, Nina; Thorne, Leigh; Thorsson, Vésteinn; Tuttle, R. Michael; Umbricht, Christopher B.; Van Den Berg, David J.; Vandin, Fabio; Veluvolu, Umadevi; Verhaak, Roel G.W.; Vinco, Michelle; Voet, Doug; Walter, Vonn; Wang, Zhining; Waring, Scot; Weinberger, Paul M.; Weinstein, John N.; Weisenberger, Daniel J.; Wheeler, David; Wilkerson, Matthew D.; Wilson, Jocelyn; Williams, Michelle; Winer, Daniel A.; Wise, Lisa; Wu, Junyuan; Xi, Liu; Xu, Andrew W.; Yang, Liming; Yang, Lixing; Zack, Travis I.; Zeiger, Martha A.; Zeng, Dong; Zenklusen, Jean Claude; Zhao, Ni; Zhang, Hailei; Zhang, Jianhua; Zhang, Jiashan (Julia); Zhang, Wei; Zmuda, Erik; Zou, Lihua

    2014-01-01

    Summary Papillary thyroid carcinoma (PTC) is the most common type of thyroid cancer. Here, we describe the genomic landscape of 496 PTCs. We observed a low frequency of somatic alterations (relative to other carcinomas) and extended the set of known PTC driver alterations to include EIF1AX, PPM1D and CHEK2 and diverse gene fusions. These discoveries reduced the fraction of PTC cases with unknown oncogenic driver from 25% to 3.5%. Combined analyses of genomic variants, gene expression, and methylation demonstrated that different driver groups lead to different pathologies with distinct signaling and differentiation characteristics. Similarly, we identified distinct molecular subgroups of BRAF-mutant tumors and multidimensional analyses highlighted a potential involvement of oncomiRs in less-differentiated subgroups. Our results propose a reclassification of thyroid cancers into molecular subtypes that better reflect their underlying signaling and differentiation properties, which has the potential to improve their pathological classification and better inform the management of the disease. PMID:25417114

  7. Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace

    PubMed Central

    Thorvaldsdottir, Helga; Liefeld, Ted; Ocana, Marco; Borges-Rivera, Diego; Pochet, Nathalie; Robinson, James T.; Demchak, Barry; Hull, Tim; Ben-Artzi, Gil; Blankenberg, Daniel; Barber, Galt P.; Lee, Brian T.; Kuhn, Robert M.; Nekrutenko, Anton; Segal, Eran; Ideker, Trey; Reich, Michael; Regev, Aviv; Chang, Howard Y.; Mesirov, Jill P.

    2015-01-01

    Integrative analysis of multiple data types to address complex biomedical questions requires the use of multiple software tools in concert and remains an enormous challenge for most of the biomedical research community. Here we introduce GenomeSpace (http://www.genomespace.org), a cloud-based, cooperative community resource. Seeded as a collaboration of six of the most popular genomics analysis tools, GenomeSpace now supports the streamlined interaction of 20 bioinformatics tools and data resources. To help non-programming users leverage GenomeSpace in integrative analysis, it offers a growing set of ‘recipes’, short workflows involving a few tools and steps to guide investigators through high utility analysis tasks. PMID:26780094

  8. A first generation integrated map of the rainbow trout genome

    PubMed Central

    2011-01-01

    Background Rainbow trout (Oncorhynchus mykiss) are the most widely cultivated cold freshwater fish in the world and an important model species for many research areas. Coupling great interest in this species as a research model with the need for genetic improvement of aquaculture production efficiency traits justifies the continued development of genomics research resources. Many quantitative trait loci (QTL) have been identified for production and life-history traits in rainbow trout. An integrated physical and genetic map is needed to facilitate fine mapping of QTL and the selection of positional candidate genes for incorporation in marker-assisted selection (MAS) programs for improving rainbow trout aquaculture production. Results The first generation integrated map of the rainbow trout genome is composed of 238 BAC contigs anchored to chromosomes of the genetic map. It covers more than 10% of the genome across segments from all 29 chromosomes. Anchoring of 203 contigs to chromosomes of the National Center for Cool and Cold Water Aquaculture (NCCCWA) genetic map was achieved through mapping of 288 genetic markers derived from BAC end sequences (BES), screening of the BAC library with previously mapped markers and matching of SNPs with BES reads. In addition, 35 contigs were anchored to linkage groups of the INRA (French National Institute of Agricultural Research) genetic map through markers that were not informative for linkage analysis in the NCCCWA mapping panel. The ratio of physical to genetic linkage distances varied substantially among chromosomes and BAC contigs with an average of 3,033 Kb/cM. Conclusions The integrated map described here provides a framework for a robust composite genome map for rainbow trout. This resource is needed for genomic analyses in this research model and economically important species and will facilitate comparative genome mapping with other salmonids and with model fish species. This resource will also facilitate efforts to

  9. Integrated translational genomics for analysis of complex traits in sorghum

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We will report on the integration of sequencing and genotype data from natural variation (by whole genome resequencing [wgs] or genotype by sequencing [gbs]), transcriptome (RNA-seq) and mutant analysis (also by wgs) with the goal of identifying genes controlling important agronomic traits and tran...

  10. Integrated genomic characterization of oesophageal carcinoma.

    PubMed

    2017-01-12

    Oesophageal cancers are prominent worldwide; however, there are few targeted therapies and survival rates for these cancers remain dismal. Here we performed a comprehensive molecular analysis of 164 carcinomas of the oesophagus derived from Western and Eastern populations. Beyond known histopathological and epidemiologic distinctions, molecular features differentiated oesophageal squamous cell carcinomas from oesophageal adenocarcinomas. Oesophageal squamous cell carcinomas resembled squamous carcinomas of other organs more than they did oesophageal adenocarcinomas. Our analyses identified three molecular subclasses of oesophageal squamous cell carcinomas, but none showed evidence for an aetiological role of human papillomavirus. Squamous cell carcinomas showed frequent genomic amplifications of CCND1 and SOX2 and/or TP63, whereas ERBB2, VEGFA and GATA4 and GATA6 were more commonly amplified in adenocarcinomas. Oesophageal adenocarcinomas strongly resembled the chromosomally unstable variant of gastric adenocarcinoma, suggesting that these cancers could be considered a single disease entity. However, some molecular features, including DNA hypermethylation, occurred disproportionally in oesophageal adenocarcinomas. These data provide a framework to facilitate more rational categorization of these tumours and a foundation for new therapies.

  11. Integrated Genomic Analyses of Ovarian Carcinoma

    PubMed Central

    2011-01-01

    Summary The Cancer Genome Atlas (TCGA) project has analyzed mRNA expression, miRNA expression, promoter methylation, and DNA copy number in 489 high-grade serous ovarian adenocarcinomas (HGS-OvCa) and the DNA sequences of exons from coding genes in 316 of these tumors. These results show that HGS-OvCa is characterized by TP53 mutations in almost all tumors (96%); low prevalence but statistically recurrent somatic mutations in 9 additional genes including NF1, BRCA1, BRCA2, RB1, and CDK12; 113 significant focal DNA copy number aberrations; and promoter methylation events involving 168 genes. Analyses delineated four ovarian cancer transcriptional subtypes, three miRNA subtypes, four promoter methylation subtypes, a transcriptional signature associated with survival duration and shed new light on the impact on survival of tumors with BRCA1/2 and CCNE1 aberrations. Pathway analyses suggested that homologous recombination is defective in about half of tumors, and that Notch and FOXM1 signaling are involved in serous ovarian cancer pathophysiology. PMID:21720365

  12. Integrative clinical genomics of advanced prostate cancer

    PubMed Central

    Robinson, Dan; Van Allen, Eliezer M.; Wu, Yi-Mi; Schultz, Nikolaus; Lonigro, Robert J.; Mosquera, Juan-Miguel; Montgomery, Bruce; Taplin, Mary-Ellen; Pritchard, Colin C.; Attard, Gerhardt; Beltran, Himisha; Abida, Wassim M.; Bradley, Robert K.; Vinson, Jake; Cao, Xuhong; Vats, Pankaj; Kunju, Lakshmi P.; Hussain, Maha; Feng, Felix Y.; Tomlins, Scott A.; Cooney, Kathleen A.; Smith, David C.; Brennan, Christine; Siddiqui, Javed; Mehra, Rohit; Chen, Yu; Rathkopf, Dana E.; Morris, Michael J.; Solomon, Stephen B.; Durack, Jeremy C.; Reuter, Victor E.; Gopalan, Anuradha; Gao, Jianjiong; Loda, Massimo; Lis, Rosina T.; Bowden, Michaela; Balk, Stephen P.; Gaviola, Glenn; Sougnez, Carrie; Gupta, Manaswi; Yu, Evan Y.; Mostaghel, Elahe A.; Cheng, Heather H.; Mulcahy, Hyojeong; True, Lawrence D.; Plymate, Stephen R.; Dvinge, Heidi; Ferraldeschi, Roberta; Flohr, Penny; Miranda, Susana; Zafeiriou, Zafeiris; Tunariu, Nina; Mateo, Joaquin; Lopez, Raquel Perez; Demichelis, Francesca; Robinson, Brian D.; Schiffman, Marc A.; Nanus, David M.; Tagawa, Scott T.; Sigaras, Alexandros; Eng, Kenneth W.; Elemento, Olivier; Sboner, Andrea; Heath, Elisabeth I.; Scher, Howard I.; Pienta, Kenneth J.; Kantoff, Philip; de Bono, Johann S.; Rubin, Mark A.; Nelson, Peter S.; Garraway, Levi A.; Sawyers, Charles L.; Chinnaiyan, Arul M.

    2015-01-01

    SUMMARY Toward development of a precision medicine framework for metastatic, castration resistant prostate cancer (mCRPC), we established a multi-institutional clinical sequencing infrastructure to conduct prospective whole exome and transcriptome sequencing of bone or soft tissue tumor biopsies from a cohort of 150 mCRPC affected individuals. Aberrations of AR, ETS genes, TP53 and PTEN were frequent (40–60% of cases), with TP53 and AR alterations enriched in mCRPC compared to primary prostate cancer. We identified novel genomic alterations in PIK3CA/B, R-spondin, BRAF/RAF1, APC, β-catenin and ZBTB16/PLZF. Aberrations of BRCA2, BRCA1 and ATM were observed at substantially higher frequencies (19.3% overall) than seen in primary prostate cancers. 89% of affected individuals harbored a clinically actionable aberration including 62.7% with aberrations in AR, 65% in other cancer-related genes, and 8% with actionable pathogenic germline alterations. This cohort study provides evidence that clinical sequencing in mCRPC is feasible and could impact treatment decisions in significant numbers of affected individuals. PMID:26000489

  13. Mutation Detection with Next-Generation Resequencing through a Mediator Genome

    SciTech Connect

    Wurtzel, Omri; Dori-Bachash, Mally; Pietrokovski, Shmuel; Jurkevitch, Edouard; Sorek, Rotem; Ben-Jacob, Eshel

    2010-12-31

    The affordability of next generation sequencing (NGS) is transforming the field of mutation analysis in bacteria. The genetic basis for phenotype alteration can be identified directly by sequencing the entire genome of the mutant and comparing it to the wild-type (WT) genome, thus identifying acquired mutations. A major limitation of this approach is the need for an a priori sequenced reference genome for the WT organism, as the short reads of most current NGS approaches usually prohibit de novo genome assembly. To overcome this limitation we propose a general framework that utilizes the genomes of related organisms as mediators for comparing WT and mutant bacteria. Under this framework, both mutant and WT genomes are sequenced with NGS, and the short sequencing reads are mapped to the mediator genome. Variations between the mutant and the mediator that recur in the WT are ignored, thus pinpointing the differences between the mutant and the WT. To validate this approach we sequenced the genome of Bdellovibrio bacteriovorus 109J, an obligatory bacterial predator, and its prey-independent mutant, and compared both to the mediator species Bdellovibrio bacteriovorus HD100. Although the mutant and the mediator sequences differed in more than 28,000 nucleotide positions, our approach enabled pinpointing the single causative mutation. Experimental validation in 53 additional mutants further established the implicated gene. Our approach extends the applicability of NGS-based mutant analyses beyond the domain of available reference genomes.
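
    The filtering step described in this abstract (discard every mutant-versus-mediator difference that also appears in the WT) amounts to a set difference. The following is a minimal sketch of that idea only, not the authors' pipeline: the function name, the tuple layout for a variant, and all positions and bases are hypothetical, and real read mapping and variant calling are assumed to have happened upstream.

```python
# Variants are modeled as (position, reference_base, alternate_base) tuples,
# all expressed relative to the mediator genome. Every value here is made up
# for illustration; it is not data from the study.

def mutant_specific_variants(mutant_vs_mediator, wt_vs_mediator):
    """Return variants present in the mutant calls but absent from the WT calls.

    Differences shared by mutant and WT reflect divergence from the mediator,
    not the acquired mutation, so they are dropped.
    """
    return sorted(set(mutant_vs_mediator) - set(wt_vs_mediator))


# Hypothetical variant calls against the mediator genome.
wt_calls = [(120, "A", "G"), (4510, "C", "T"), (9032, "G", "A")]
mutant_calls = [(120, "A", "G"), (4510, "C", "T"), (7777, "T", "C"), (9032, "G", "A")]

candidates = mutant_specific_variants(mutant_calls, wt_calls)
print(candidates)
```

    In this toy input only the call at position 7777 survives the filter, because the other three differences are shared with the WT.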

  14. Integrative genomics to dissect retinoid functions.

    PubMed

    Mendoza-Parra, Marco-Antonio; Gronemeyer, Hinrich

    2014-01-01

    Retinoids and rexinoids, as all other ligands of the nuclear receptor (NR) family, act as ligand-regulated trans-acting transcription factors that bind to cis-acting DNA regulatory elements in the promoter regions of target genes (for reviews see [12, 22, 23, 26, 36]). Ligand binding modulates the communication functions of the receptor with the intracellular environment, which essentially entails receptor-protein and receptor-DNA or receptor-chromatin interactions. In this communication network, the receptor simultaneously serves as both intracellular sensor and regulator of cell/organ functions. Receptors are "intelligent" mediators of the information encoded in the chemical structure of a nuclear receptor ligand, as they interpret this information in the context of cellular identity and cell-physiological status and convert it into a dynamic chain of receptor-protein and receptor-DNA interactions. To process input and output information, they are composed of a modular structure with several domains that have evolved to exert particular molecular recognition functions. As detailed in other chapters in this volume, the main functional domains are the DNA-binding domain (DBD) and the ligand-binding domain (LBD) [5-7, 38, 56, 71]. The LBD serves as a dual input-output information processor. Inputs, such as ligand binding or receptor phosphorylation, induce allosteric changes in receptor surfaces that serve as docking sites for outputs, such as subunits of transcription and epigenetic machineries or enzyme complexes. The complexity of input and output signals and their interdependencies is far from being understood.

  15. Precise Genome Modification via Sequence-Specific Nucleases-Mediated Gene Targeting for Crop Improvement.

    PubMed

    Sun, Yongwei; Li, Jingying; Xia, Lanqin

    2016-01-01

    Genome editing technologies enable precise modifications of DNA sequences in vivo and offer a great promise for harnessing plant genes in crop improvement. The precise manipulation of plant genomes relies on the induction of DNA double-strand breaks by sequence-specific nucleases (SSNs) to initiate DNA repair reactions that are based on either non-homologous end joining (NHEJ) or homology-directed repair (HDR). While complete knock-outs and loss-of-function mutations generated by NHEJ are very valuable in defining gene functions, their applications in crop improvement are somewhat limited because many agriculturally important traits are conferred by random point mutations or indels at specific loci in either the genes' encoding or promoter regions. Therefore, genome modification through SSN-mediated HDR for gene targeting (GT) that enables either gene replacement or knock-in will provide an unprecedented ability to facilitate plant breeding by allowing introduction of precise point mutations and new gene functions, or integration of foreign genes at specific and desired "safe" harbor in a predefined manner. The emergence of three programmable SSNs, namely zinc finger nucleases, transcriptional activator-like effector nucleases, and the clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) systems, has revolutionized genome modification in plants in a more controlled manner. However, while targeted mutagenesis is becoming routine in plants, the potential of GT technology has not been well realized for trait improvement in crops, mainly due to the fact that NHEJ predominates in the DNA repair process in somatic cells and competes with the HDR pathway, and thus HDR-mediated GT is a relatively rare event in plants. Here, we review recent research findings mainly focusing on development and applications of precise GT in plants using the three SSN systems described above, and the potential mechanisms underlying HDR events in plant

  16. Precise Genome Modification via Sequence-Specific Nucleases-Mediated Gene Targeting for Crop Improvement

    PubMed Central

    Sun, Yongwei; Li, Jingying; Xia, Lanqin

    2016-01-01

    Genome editing technologies enable precise modifications of DNA sequences in vivo and offer a great promise for harnessing plant genes in crop improvement. The precise manipulation of plant genomes relies on the induction of DNA double-strand breaks by sequence-specific nucleases (SSNs) to initiate DNA repair reactions that are based on either non-homologous end joining (NHEJ) or homology-directed repair (HDR). While complete knock-outs and loss-of-function mutations generated by NHEJ are very valuable in defining gene functions, their applications in crop improvement are somewhat limited because many agriculturally important traits are conferred by random point mutations or indels at specific loci in either the genes’ encoding or promoter regions. Therefore, genome modification through SSN-mediated HDR for gene targeting (GT) that enables either gene replacement or knock-in will provide an unprecedented ability to facilitate plant breeding by allowing introduction of precise point mutations and new gene functions, or integration of foreign genes at specific and desired “safe” harbor in a predefined manner. The emergence of three programmable SSNs, namely zinc finger nucleases, transcriptional activator-like effector nucleases, and the clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) systems, has revolutionized genome modification in plants in a more controlled manner. However, while targeted mutagenesis is becoming routine in plants, the potential of GT technology has not been well realized for trait improvement in crops, mainly due to the fact that NHEJ predominates in the DNA repair process in somatic cells and competes with the HDR pathway, and thus HDR-mediated GT is a relatively rare event in plants. Here, we review recent research findings mainly focusing on development and applications of precise GT in plants using the three SSN systems described above, and the potential mechanisms underlying HDR events in

  17. G protein-coupled receptors: extranuclear mediators for the non-genomic actions of steroids.

    PubMed

    Wang, Chen; Liu, Yi; Cao, Ji-Min

    2014-09-01

    Steroid hormones possess two distinct actions, a delayed genomic effect and a rapid non-genomic effect. Rapid steroid-triggered signaling is mediated by specific receptors localized most often to the plasma membrane. The nature of these receptors is of great interest and accumulated data suggest that G protein-coupled receptors (GPCRs) are appealing candidates. Increasing evidence regarding the interaction between steroids and specific membrane proteins, as well as the involvement of G protein and corresponding downstream signaling, has led to identification of physiologically relevant GPCRs as steroid extranuclear receptors. Examples include G protein-coupled receptor 30 (GPR30) for estrogen, membrane progestin receptor for progesterone, G protein-coupled receptor family C group 6 member A (GPRC6A) and zinc transporter member 9 (ZIP9) for androgen, and trace amine associated receptor 1 (TAAR1) for thyroid hormone. These receptor-mediated biological effects have been extended to reproductive development, cardiovascular function, neuroendocrinology and cancer pathophysiology. However, although great progress has been achieved, there are still important questions that need to be answered, including the identities of GPCRs responsible for the remaining steroids (e.g., glucocorticoid), the structural basis of the interaction between steroids and GPCRs, and the integration of extranuclear and nuclear signaling to the final physiological function. Here, we reviewed several significant developments in this field and highlighted a hypothesis that attempts to explain the general interaction between steroids and GPCRs.

  18. Integrating genomic resources of flatfish (Pleuronectiformes) to boost aquaculture production.

    PubMed

    Robledo, Diego; Hermida, Miguel; Rubiolo, Juan A; Fernández, Carlos; Blanco, Andrés; Bouza, Carmen; Martínez, Paulino

    2017-03-01

    Flatfish have a high market acceptance, thus representing a profitable aquaculture production. The main farmed species is the turbot (Scophthalmus maximus) followed by Japanese flounder (Paralichthys olivaceus) and tongue sole (Cynoglossus semilaevis), but other species like Atlantic halibut (Hippoglossus hippoglossus), Senegalese sole (Solea senegalensis) and common sole (Solea solea) also register an important production and are very promising for farming. Important genomic resources are available for most of these species including whole genome sequencing projects, genetic maps and transcriptomes. In this work, we integrate all available genomic information of these species within a common framework, taking as reference the whole assembled genomes of turbot and tongue sole (>210× coverage). New insights related to the genetic basis of productive traits and new data useful to understand the evolutionary origin and diversification of this group were obtained. Despite a general 1:1 chromosome syntenic relationship between species, the comparison of turbot and tongue sole genomes showed huge intrachromosomal reorganizations. The integration of available mapping information supported specific chromosome fusions along flatfish evolution and facilitated the comparison between species of previously reported genetic associations for productive traits. When comparing transcriptomic resources of the six species, a common set of ~2500 orthologues and ~150 common miRNAs were identified, and specific sets of putative missing genes were detected in flatfish transcriptomes, likely reflecting their evolutionary diversification.

  19. Why Mitochondria Must Fuse to Maintain Their Genome Integrity

    PubMed Central

    Vidoni, Sara; Zanna, Claudia; Sarzi, Emmanuelle

    2013-01-01

    Abstract Significance: The maintenance of mitochondrial genome integrity is a major challenge for cells to sustain energy production by respiration. Recent Advances: Recently, mitochondrial membrane dynamics emerged as a key process contributing to prevent mitochondrial DNA (mtDNA) alterations. Indeed, both fundamental and clinical data suggest that disruption of mitochondrial fusion, related to mutations in the OPA1, MFN2, PINK1, and PARK2 genes, leads to the accumulation of mutations in the mitochondrial genome. Critical Issues: We discuss here the possibility that mitochondrial fusion acts as a direct mechanism to prevent the generation of altered mtDNA and to eliminate mutated deleterious genomes either by trans-complementation or by mitophagy. Future Directions: Finally, we conclude this review with a short evolutionary comparison between the mechanisms involved in mitochondrial and bacterial modes of genome distribution and plasticity, highlighting possible common conserved processes required for the maintenance of their genome integrity, which should inspire our future investigations. Antioxid. Redox Signal. 19, 379–388. PMID:23350575

  20. DemaDb: an integrated dematiaceous fungal genomes database.

    PubMed

    Kuan, Chee Sian; Yew, Su Mei; Chan, Chai Ling; Toh, Yue Fen; Lee, Kok Wei; Cheong, Wei-Hien; Yee, Wai-Yan; Hoh, Chee-Choong; Yap, Soon-Joo; Ng, Kee Peng

    2016-01-01

    Many species of dematiaceous fungi are associated with allergic reactions and potentially fatal diseases in humans, especially in tropical climates. Over the past 10 years, we have isolated more than 400 dematiaceous fungi from various clinical samples. In this study, DemaDb, an integrated database, was designed to support the integration and analysis of dematiaceous fungal genomes. A total of 92 072 putative genes and 6527 pathways that were identified in eight dematiaceous fungi (Bipolaris papendorfii UM 226, Daldinia eschscholtzii UM 1400, D. eschscholtzii UM 1020, Pyrenochaeta unguis-hominis UM 256, Ochroconis mirabilis UM 578, Cladosporium sphaerospermum UM 843, Herpotrichiellaceae sp. UM 238 and Pleosporales sp. UM 1110) were deposited in DemaDb. DemaDb includes functional annotations for all predicted gene models in all genomes, such as Gene Ontology, EuKaryotic Orthologous Groups, Kyoto Encyclopedia of Genes and Genomes (KEGG), Pfam and InterProScan. All predicted protein models were further functionally annotated to Carbohydrate-Active enzymes, peptidases, secondary metabolites and virulence factors. DemaDb Genome Browser enables users to browse and visualize entire genomes with annotation data including gene prediction, structure, orientation and custom feature tracks. The Pathway Browser based on the KEGG pathway database allows users to look into molecular interaction and reaction networks for all KEGG annotated genes. The availability of downloadable files containing assembly, nucleic acid, as well as protein data allows direct retrieval for further downstream work. DemaDb is a useful resource for the fungal research community, especially those involved in genome-scale analysis, functional genomics, genetics and disease studies of dematiaceous fungi. Database URL: http://fungaldb.um.edu.my.

  1. DemaDb: an integrated dematiaceous fungal genomes database

    PubMed Central

    Kuan, Chee Sian; Yew, Su Mei; Chan, Chai Ling; Toh, Yue Fen; Lee, Kok Wei; Cheong, Wei-Hien; Yee, Wai-Yan; Hoh, Chee-Choong; Yap, Soon-Joo; Ng, Kee Peng

    2016-01-01

    Many species of dematiaceous fungi are associated with allergic reactions and potentially fatal diseases in humans, especially in tropical climates. Over the past 10 years, we have isolated more than 400 dematiaceous fungi from various clinical samples. In this study, DemaDb, an integrated database, was designed to support the integration and analysis of dematiaceous fungal genomes. A total of 92 072 putative genes and 6527 pathways that were identified in eight dematiaceous fungi (Bipolaris papendorfii UM 226, Daldinia eschscholtzii UM 1400, D. eschscholtzii UM 1020, Pyrenochaeta unguis-hominis UM 256, Ochroconis mirabilis UM 578, Cladosporium sphaerospermum UM 843, Herpotrichiellaceae sp. UM 238 and Pleosporales sp. UM 1110) were deposited in DemaDb. DemaDb includes functional annotations for all predicted gene models in all genomes, such as Gene Ontology, EuKaryotic Orthologous Groups, Kyoto Encyclopedia of Genes and Genomes (KEGG), Pfam and InterProScan. All predicted protein models were further functionally annotated to Carbohydrate-Active enzymes, peptidases, secondary metabolites and virulence factors. DemaDb Genome Browser enables users to browse and visualize entire genomes with annotation data including gene prediction, structure, orientation and custom feature tracks. The Pathway Browser based on the KEGG pathway database allows users to look into molecular interaction and reaction networks for all KEGG annotated genes. The availability of downloadable files containing assembly, nucleic acid, as well as protein data allows direct retrieval for further downstream work. DemaDb is a useful resource for the fungal research community, especially those involved in genome-scale analysis, functional genomics, genetics and disease studies of dematiaceous fungi. Database URL: http://fungaldb.um.edu.my PMID:26980516

  2. GEMINI: integrative exploration of genetic variation and genome annotations.

    PubMed

    Paila, Umadevi; Chapman, Brad A; Kirchner, Rory; Quinlan, Aaron R

    2013-01-01

    Modern DNA sequencing technologies enable geneticists to rapidly identify genetic variation among many human genomes. However, isolating the minority of variants underlying disease remains an important, yet formidable challenge for medical genetics. We have developed GEMINI (GEnome MINIng), a flexible software package for exploring all forms of human genetic variation. Unlike existing tools, GEMINI integrates genetic variation with a diverse and adaptable set of genome annotations (e.g., dbSNP, ENCODE, UCSC, ClinVar, KEGG) into a unified database to facilitate interpretation and data exploration. Whereas other methods provide an inflexible set of variant filters or prioritization methods, GEMINI allows researchers to compose complex queries based on sample genotypes, inheritance patterns, and both pre-installed and custom genome annotations. GEMINI also provides methods for ad hoc queries and data exploration, a simple programming interface for custom analyses that leverage the underlying database, and both command line and graphical tools for common analyses. We demonstrate GEMINI's utility for exploring variation in personal genomes and family based genetic studies, and illustrate its ability to scale to studies involving thousands of human samples. GEMINI is designed for reproducibility and flexibility and our goal is to provide researchers with a standard framework for medical genomics.

  3. GEMINI: Integrative Exploration of Genetic Variation and Genome Annotations

    PubMed Central

    Paila, Umadevi; Chapman, Brad A.; Kirchner, Rory; Quinlan, Aaron R.

    2013-01-01

    Modern DNA sequencing technologies enable geneticists to rapidly identify genetic variation among many human genomes. However, isolating the minority of variants underlying disease remains an important, yet formidable challenge for medical genetics. We have developed GEMINI (GEnome MINIng), a flexible software package for exploring all forms of human genetic variation. Unlike existing tools, GEMINI integrates genetic variation with a diverse and adaptable set of genome annotations (e.g., dbSNP, ENCODE, UCSC, ClinVar, KEGG) into a unified database to facilitate interpretation and data exploration. Whereas other methods provide an inflexible set of variant filters or prioritization methods, GEMINI allows researchers to compose complex queries based on sample genotypes, inheritance patterns, and both pre-installed and custom genome annotations. GEMINI also provides methods for ad hoc queries and data exploration, a simple programming interface for custom analyses that leverage the underlying database, and both command line and graphical tools for common analyses. We demonstrate GEMINI's utility for exploring variation in personal genomes and family based genetic studies, and illustrate its ability to scale to studies involving thousands of human samples. GEMINI is designed for reproducibility and flexibility and our goal is to provide researchers with a standard framework for medical genomics. PMID:23874191

  4. Megx.net: integrated database resource for marine ecological genomics

    PubMed Central

    Kottmann, Renzo; Kostadinov, Ivalyo; Duhaime, Melissa Beth; Buttigieg, Pier Luigi; Yilmaz, Pelin; Hankeln, Wolfgang; Waldmann, Jost; Glöckner, Frank Oliver

    2010-01-01

    Megx.net is a database and portal that provides integrated access to georeferenced marker genes, environment data and marine genome and metagenome projects for microbial ecological genomics. All data are stored in the Microbial Ecological Genomics DataBase (MegDB), which is subdivided to hold both sequence and habitat data and global environmental data layers. The extended system provides access to several hundreds of genomes and metagenomes from prokaryotes and phages, as well as over a million small and large subunit ribosomal RNA sequences. With the refined Genes Mapserver, all data can be interactively visualized on a world map and statistics describing environmental parameters can be calculated. Sequence entries have been curated to comply with the proposed minimal standards for genomes and metagenomes (MIGS/MIMS) of the Genomic Standards Consortium. Access to data is facilitated by Web Services. The updated megx.net portal offers microbial ecologists greatly enhanced database content, and new features and tools for data analysis, all of which are freely accessible from our webpage http://www.megx.net. PMID:19858098

  5. iCAGES: integrated CAncer GEnome Score for comprehensively prioritizing driver genes in personal cancer genomes.

    PubMed

    Dong, Chengliang; Guo, Yunfei; Yang, Hui; He, Zeyu; Liu, Xiaoming; Wang, Kai

    2016-12-22

    Cancer results from the acquisition of somatic driver mutations. Several computational tools can predict driver genes from population-scale genomic data, but tools for analyzing personal cancer genomes are underdeveloped. Here we developed iCAGES, a novel statistical framework that infers driver variants by integrating contributions from coding, non-coding, and structural variants, identifies driver genes by combining genomic information and prior biological knowledge, then generates prioritized drug treatment. Analysis on The Cancer Genome Atlas (TCGA) data showed that iCAGES predicts whether patients respond to drug treatment (P = 0.006 by Fisher's exact test) and long-term survival (P = 0.003 from Cox regression). iCAGES is available at http://icages.wglab.org.

  6. inGeno – an integrated genome and ortholog viewer for improved genome to genome comparisons

    PubMed Central

    Liang, Chunguang; Dandekar, Thomas

    2006-01-01

    Background: Systematic genome comparisons are an important tool to reveal gene functions, pathogenic features, metabolic pathways and genome evolution in the era of post-genomics. Furthermore, such comparisons provide important clues for vaccines and drug development. Existing genome comparison software often lacks accurate information on orthologs, the function of similar genes identified and genome-wide reports and lists on specific functions. All these features and further analyses are provided here in the context of a modular software tool "inGeno" written in Java with Biojava subroutines. Results: InGeno provides a user-friendly interactive visualization platform for sequence comparisons (comprehensive reciprocal protein – protein comparisons) between complete genome sequences and all associated annotations and features. The comparison data can be acquired from several different sequence analysis programs in flexible formats. Automatic dot-plot analysis includes output reduction, filtering, ortholog testing and linear regression, followed by smart clustering (local collinear blocks; LCBs) to reveal similar genome regions. Further, the system provides genome alignment and visualization editor, collinear relationships and strain-specific islands. Specific annotations and functions are parsed, recognized, clustered, logically concatenated and visualized and summarized in reports. Conclusion: As shown in this study, inGeno can be applied to study and compare in particular prokaryotic genomes against each other (gram positive and negative as well as close and more distantly related species) and has been proven to be sensitive and accurate. This modular software is user-friendly and easily accommodates new routines to meet specific user-defined requirements. PMID:17054788
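
    The clustering of dot-plot hits into local collinear blocks can be illustrated with a deliberately simplified grouping rule. This is a sketch under stated assumptions and not inGeno's algorithm (which is written in Java and is not specified in this abstract): ortholog pairs are given as hypothetical (gene index in genome A, gene index in genome B) tuples, only forward-oriented runs are handled, and the `max_gap` parameter is an invented tolerance.

```python
def collinear_blocks(pairs, max_gap=2):
    """Group (index_in_A, index_in_B) pairs into forward collinear runs.

    A pair extends the current block when both indices advance by at least 1
    and by no more than max_gap relative to the previous pair; otherwise a new
    block is started. Inverted (reverse-strand) runs are not detected.
    """
    blocks = []
    current = []
    for a, b in sorted(pairs):
        if current:
            prev_a, prev_b = current[-1]
            if 0 < a - prev_a <= max_gap and 0 < b - prev_b <= max_gap:
                current.append((a, b))
                continue
            blocks.append(current)  # close the run that could not be extended
        current = [(a, b)]
    if current:
        blocks.append(current)
    return blocks


# Hypothetical ortholog hits between two genomes.
hits = [(1, 10), (2, 11), (3, 12), (7, 40), (8, 41), (12, 5)]
blocks = collinear_blocks(hits)
for block in blocks:
    print(block)
```

    With these toy hits the rule yields three blocks: a run of three pairs, a run of two pairs, and one isolated pair that a real tool would probably filter out as noise.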

  7. Enhancers as information integration hubs in development: lessons from genomics

    PubMed Central

    Buecker, Christa; Wysocka, Joanna

    2016-01-01

    Transcriptional enhancers are the primary determinants of tissue-specific gene expression. Although the majority of our current knowledge of enhancer elements comes from detailed analyses of individual loci, recent progress in epigenomics has led to the development of methods for comprehensive and conservation-independent annotation of cell type-specific enhancers. Here, we discuss the advantages and limitations of different genomic approaches to enhancer mapping and summarize observations that have been afforded by the genome-wide views of enhancer landscapes, with a focus on development. We propose that enhancers serve as information integration hubs, at which instructions encoded by the genome are read in the context of a specific cellular state, signaling milieu and chromatin environment, allowing for exquisitely precise spatiotemporal control of gene expression during embryogenesis. PMID:22487374

  8. Integration of maternal genome into the neonate genome through breast milk mRNA transcripts and reverse transcriptase.

    PubMed

    Irmak, M Kemal; Oztas, Yesim; Oztas, Emin

    2012-06-07

    Human milk samples contain microvesicles similar to the retroviruses. These microvesicles contain mRNA transcripts and possess reverse transcriptase activity. They contain about 14,000 transcripts representing the milk transcriptome. Microvesicles are also enriched with proteins related to "caveolar-mediated endocytosis signaling" pathway. It has recently been reported that microvesicles could be transferred to other cells by endocytosis and their RNA content can be translated and be functional in their new location. A significant percentage of the mammalian genome appears to be the product of reverse transcription, containing sequences whose characteristics point to RNA as a template precursor. These are mobile elements that move by way of transposition and are called retrotransposons. We thought that retrotransposons may stem from the roughly 14,000 transcripts of breast milk microvesicles, and reviewed the literature. The enhanced acceptance of maternal allografts in children who were breast-fed and tolerance to the maternal MHC antigens after breastfeeding may stem from RNAs of the breast milk microvesicles that can be taken up by the breastfed infant and receiving maternal genomic information. We conclude that milk microvesicles may transfer genetic signals from mother to neonate during breastfeeding. Moreover, transfer of wild type RNA from a healthy wet-nurse to the suckling neonate through the milk microvesicles and its subsequent reverse transcription and integration into the neonate genome could result in permanent correction of the clinical manifestations in genetic diseases.

  9. Computational and molecular tools for scalable rAAV-mediated genome editing

    PubMed Central

    Stoimenov, Ivaylo; Ali, Muhammad Akhtar; Pandzic, Tatjana; Sjöblom, Tobias

    2015-01-01

    The rapid discovery of potential driver mutations through large-scale mutational analyses of human cancers generates a need to characterize their cellular phenotypes. Among the techniques for genome editing, recombinant adeno-associated virus (rAAV)-mediated gene targeting is suited for knock-in of single nucleotide substitutions and to a lesser degree for gene knock-outs. However, the generation of gene targeting constructs and the targeting process are time-consuming and labor-intensive. To facilitate rAAV-mediated gene targeting, we developed the first software and complementary automation-friendly vector tools to generate optimized targeting constructs for editing human protein encoding genes. By computational approaches, rAAV constructs for editing ∼71% of bases in protein-coding exons were designed. Similarly, ∼81% of genes were predicted to be targetable by rAAV-mediated knock-out. A Gateway-based cloning system for facile generation of rAAV constructs suitable for robotic automation was developed and used in successful generation of targeting constructs. Together, these tools enable automated rAAV targeting construct design, generation as well as enrichment and expansion of targeted cells with desired integrations. PMID:25488813

  10. Using biological networks to integrate, visualize and analyze genomics data.

    PubMed

    Charitou, Theodosia; Bryan, Kenneth; Lynn, David J

    2016-03-31

    Network biology is a rapidly developing area of biomedical research and reflects the current view that complex phenotypes, such as disease susceptibility, are not the result of single gene mutations that act in isolation but are rather due to the perturbation of a gene's network context. Understanding the topology of these molecular interaction networks and identifying the molecules that play central roles in their structure and regulation is a key to understanding complex systems. The falling cost of next-generation sequencing is now enabling researchers to routinely catalogue the molecular components of these networks at a genome-wide scale and over a large number of different conditions. In this review, we describe how to use publicly available bioinformatics tools to integrate genome-wide 'omics' data into a network of experimentally-supported molecular interactions. In addition, we describe how to visualize and analyze these networks to identify topological features of likely functional relevance, including network hubs, bottlenecks and modules. We show that network biology provides a powerful conceptual approach to integrate and find patterns in genome-wide genomic data but we also discuss the limitations and caveats of these methods, of which researchers adopting these methods must remain aware.
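
    As a rough illustration of one topological feature named in this abstract, a network hub can be approximated as the node with the most interaction partners (highest degree). The sketch below uses only the Python standard library; the gene names and edges are hypothetical, and it is not one of the tools the review describes. Bottlenecks would additionally need a betweenness measure, which is not computed here.

```python
from collections import defaultdict


def degree_table(edges):
    """Map each node of an undirected interaction network to its degree."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions in this simple sketch
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(partners) for node, partners in neighbours.items()}


def hubs(edges, top_n=1):
    """Return the top_n (node, degree) pairs, highest degree first."""
    degrees = degree_table(edges)
    ranked = sorted(degrees.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical interaction edges between made-up genes.
toy_edges = [
    ("geneA", "geneB"),
    ("geneA", "geneC"),
    ("geneA", "geneD"),
    ("geneB", "geneC"),
    ("geneD", "geneE"),
]

top_hub = hubs(toy_edges)
print(top_hub)
```

    Ties are broken alphabetically so the ranking is deterministic; on real interaction data a degree cutoff or percentile is a more common way to call hubs than a fixed count.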

  11. Multidimensional Integrative Genomics Approaches to Dissecting Cardiovascular Disease

    PubMed Central

    Arneson, Douglas; Shu, Le; Tsai, Brandon; Barrere-Cain, Rio; Sun, Christine; Yang, Xia

    2017-01-01

    Elucidating the mechanisms of complex diseases such as cardiovascular disease (CVD) remains a significant challenge due to multidimensional alterations at molecular, cellular, tissue, and organ levels. To better understand CVD and offer insights into the underlying mechanisms and potential therapeutic strategies, data from multiple omics types (genomics, epigenomics, transcriptomics, metabolomics, proteomics, microbiomics) from both humans and model organisms have become available. However, individual omics data types capture only a fraction of the molecular mechanisms. To address this challenge, there have been numerous efforts to develop integrative genomics methods that can leverage multidimensional information from diverse data types to derive comprehensive molecular insights. In this review, we summarize recent methodological advances in multidimensional omics integration, exemplify their applications in cardiovascular research, and pinpoint challenges and future directions in this incipient field. PMID:28289683

  12. The Tousled-Like Kinases as Guardians of Genome Integrity.

    PubMed

    De Benedetti, Arrigo

    2012-01-01

    The Tousled-like kinases (TLKs) function in processes of chromatin assembly, including replication, transcription, repair, and chromosome segregation. TLKs interact specifically with (and phosphorylate) the chromatin assembly factor Asf1, a histone H3-H4 chaperone, histone H3 itself at Ser10, and also Rad9, a key protein involved in DNA repair and cell cycle signaling following DNA damage. These interactions are believed to be responsible for the action of TLKs in double-stranded break repair and radioprotection and also in the propagation of the DNA damage response. Hence, I propose that TLKs play key roles in maintenance of genome integrity in many organisms of both kingdoms. In this paper, I highlight key issues of the known roles of these proteins, particularly in the context of DNA repair (IR and UV), their possible relevance to genome integrity and cancer development, and their potential as targets for intervention in cancer management.

  13. Cre/lox-Recombinase-Mediated Cassette Exchange for Reversible Site-Specific Genomic Targeting of the Disease Vector, Aedes aegypti

    PubMed Central

    Häcker, Irina; Harrell II, Robert A.; Eichner, Gerrit; Pilitt, Kristina L.; O’Brochta, David A.; Handler, Alfred M.; Schetelig, Marc F.

    2017-01-01

    Site-specific genome modification (SSM) is an important tool for mosquito functional genomics and comparative gene expression studies, which contribute to a better understanding of mosquito biology and are thus a key to finding new strategies to eliminate vector-borne diseases. Moreover, it allows for the creation of advanced transgenic strains for vector control programs. SSM circumvents the drawbacks of transposon-mediated transgenesis, where random transgene integration into the host genome results in insertional mutagenesis and variable position effects. We applied the Cre/lox recombinase-mediated cassette exchange (RMCE) system to Aedes aegypti, the vector of dengue, chikungunya, and Zika viruses. In this context we created four target site lines for RMCE and evaluated their fitness costs. Cre-RMCE is functional in a two-step mechanism and with good efficiency in Ae. aegypti. The advantages of Cre-RMCE over existing site-specific modification systems for Ae. aegypti, phiC31-RMCE and CRISPR, originate in the preservation of the recombination sites, which 1) allows successive modifications and rapid expansion or adaptation of existing systems by repeated targeting of the same site; and 2) provides reversibility, thus allowing the excision of undesired sequences. Thereby, Cre-RMCE complements existing genomic modification tools, adding flexibility and versatility to vector genome targeting. PMID:28266580

  14. Cre/lox-Recombinase-Mediated Cassette Exchange for Reversible Site-Specific Genomic Targeting of the Disease Vector, Aedes aegypti.

    PubMed

    Häcker, Irina; Harrell II, Robert A; Eichner, Gerrit; Pilitt, Kristina L; O'Brochta, David A; Handler, Alfred M; Schetelig, Marc F

    2017-03-07

    Site-specific genome modification (SSM) is an important tool for mosquito functional genomics and comparative gene expression studies, which contribute to a better understanding of mosquito biology and are thus a key to finding new strategies to eliminate vector-borne diseases. Moreover, it allows for the creation of advanced transgenic strains for vector control programs. SSM circumvents the drawbacks of transposon-mediated transgenesis, where random transgene integration into the host genome results in insertional mutagenesis and variable position effects. We applied the Cre/lox recombinase-mediated cassette exchange (RMCE) system to Aedes aegypti, the vector of dengue, chikungunya, and Zika viruses. In this context we created four target site lines for RMCE and evaluated their fitness costs. Cre-RMCE is functional in a two-step mechanism and with good efficiency in Ae. aegypti. The advantages of Cre-RMCE over existing site-specific modification systems for Ae. aegypti, phiC31-RMCE and CRISPR, originate in the preservation of the recombination sites, which 1) allows successive modifications and rapid expansion or adaptation of existing systems by repeated targeting of the same site; and 2) provides reversibility, thus allowing the excision of undesired sequences. Thereby, Cre-RMCE complements existing genomic modification tools, adding flexibility and versatility to vector genome targeting.

  15. TALEN-mediated genome engineering to generate targeted mice.

    PubMed

    Sommer, Daniel; Peters, Annika E; Baumgart, Ann-Kathrin; Beyer, Marc

    2015-02-01

    Genetic mouse models are critical for biomedical research to understand gene function and pathophysiology. In recent years, the generation of genetic mouse models has been revolutionized by the emergence of transcription activator-like effector nucleases (TALENs). TALENs are programmable, sequence-specific DNA-binding proteins fused to a non-specific endonuclease domain and are used as powerful tools for site-specific induction of DNA double-strand breaks. These breaks result in disruption of the gene product of the targeted locus by mutations induced during repair by error-prone non-homologous end-joining. Alternatively, these DNA double-strand breaks can be exploited to integrate a user-defined sequence by homologous recombination if an appropriate repair plasmid is provided. In this review, we highlight the major technological improvements for genome editing in murine oocytes which have been achieved using TALENs, discuss current limitations of the technology, suggest strategies to broadly apply TALENs, and describe possible future directions to facilitate gene editing in murine oocytes.

  16. PhytoPath: an integrative resource for plant pathogen genomics

    PubMed Central

    Pedro, Helder; Maheswari, Uma; Urban, Martin; Irvine, Alistair George; Cuzick, Alayne; McDowall, Mark D.; Staines, Daniel M.; Kulesha, Eugene; Hammond-Kosack, Kim Elizabeth; Kersey, Paul Julian

    2016-01-01

    PhytoPath (www.phytopathdb.org) is a resource for genomic and phenotypic data from plant pathogen species, that integrates phenotypic data for genes from PHI-base, an expertly curated catalog of genes with experimentally verified pathogenicity, with the Ensembl tools for data visualization and analysis. The resource is focused on fungi, protists (oomycetes) and bacterial plant pathogens that have genomes that have been sequenced and annotated. Genes with associated PHI-base data can be easily identified across all plant pathogen species using a BioMart-based query tool and visualized in their genomic context on the Ensembl genome browser. The PhytoPath resource contains data for 135 genomic sequences from 87 plant pathogen species, and 1364 genes curated for their role in pathogenicity and as targets for chemical intervention. Support for community annotation of gene models is provided using the WebApollo online gene editor, and we are working with interested communities to improve reference annotation for selected species. PMID:26476449

  17. INTEGRATE: gene fusion discovery using whole genome and transcriptome data

    PubMed Central

    Zhang, Jin; White, Nicole M.; Schmidt, Heather K.; Fulton, Robert S.; Tomlinson, Chad; Warren, Wesley C.; Wilson, Richard K.; Maher, Christopher A.

    2016-01-01

    While next-generation sequencing (NGS) has become the primary technology for discovering gene fusions, we are still faced with the challenge of ensuring that causative mutations are not missed while minimizing false positives. Currently, there are many computational tools that predict structural variations (SV) and gene fusions using whole genome (WGS) and transcriptome sequencing (RNA-seq) data separately. However, as both WGS and RNA-seq have their limitations when used independently, we hypothesize that the orthogonal validation from integrating both data could generate a sensitive and specific approach for detecting high-confidence gene fusion predictions. Fortunately, decreasing NGS costs have resulted in a growing quantity of patients with both data available. Therefore, we developed a gene fusion discovery tool, INTEGRATE, that leverages both RNA-seq and WGS data to reconstruct gene fusion junctions and genomic breakpoints by split-read mapping. To evaluate INTEGRATE, we compared it with eight additional gene fusion discovery tools using the well-characterized breast cell line HCC1395 and peripheral blood lymphocytes derived from the same patient (HCC1395BL). The predictions subsequently underwent a targeted validation leading to the discovery of 131 novel fusions in addition to the seven previously reported fusions. Overall, INTEGRATE only missed six out of the 138 validated fusions and had the highest accuracy of the nine tools evaluated. Additionally, we applied INTEGRATE to 62 breast cancer patients from The Cancer Genome Atlas (TCGA) and found multiple recurrent gene fusions including a subset involving estrogen receptor. Taken together, INTEGRATE is a highly sensitive and accurate tool that is freely available for academic use. PMID:26556708
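
    The counts reported in this abstract can be checked with a short worked calculation. This is a sketch rather than the authors' evaluation code: the variable names are hypothetical, and sensitivity is taken here to mean the fraction of the validated fusions that INTEGRATE recovered.

```python
# Counts as stated in the abstract.
novel_fusions = 131        # newly validated fusions
known_fusions = 7          # previously reported fusions
missed_by_integrate = 6    # validated fusions INTEGRATE did not report

validated_total = novel_fusions + known_fusions
detected = validated_total - missed_by_integrate

# Assumption: sensitivity = detected / validated (the validated set is the truth set).
sensitivity = detected / validated_total

print(f"validated fusions: {validated_total}")
print(f"detected by INTEGRATE: {detected}")
print(f"sensitivity: {sensitivity:.1%}")
```

    The two counts sum to the 138 validated fusions quoted in the abstract, so the stated figures are internally consistent.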

  18. INTEGRATE: gene fusion discovery using whole genome and transcriptome data.

    PubMed

    Zhang, Jin; White, Nicole M; Schmidt, Heather K; Fulton, Robert S; Tomlinson, Chad; Warren, Wesley C; Wilson, Richard K; Maher, Christopher A

    2016-01-01

    While next-generation sequencing (NGS) has become the primary technology for discovering gene fusions, we are still faced with the challenge of ensuring that causative mutations are not missed while minimizing false positives. Currently, there are many computational tools that predict structural variations (SV) and gene fusions using whole genome (WGS) and transcriptome sequencing (RNA-seq) data separately. However, as both WGS and RNA-seq have their limitations when used independently, we hypothesize that the orthogonal validation from integrating both data could generate a sensitive and specific approach for detecting high-confidence gene fusion predictions. Fortunately, decreasing NGS costs have resulted in a growing quantity of patients with both data available. Therefore, we developed a gene fusion discovery tool, INTEGRATE, that leverages both RNA-seq and WGS data to reconstruct gene fusion junctions and genomic breakpoints by split-read mapping. To evaluate INTEGRATE, we compared it with eight additional gene fusion discovery tools using the well-characterized breast cell line HCC1395 and peripheral blood lymphocytes derived from the same patient (HCC1395BL). The predictions subsequently underwent a targeted validation leading to the discovery of 131 novel fusions in addition to the seven previously reported fusions. Overall, INTEGRATE only missed six out of the 138 validated fusions and had the highest accuracy of the nine tools evaluated. Additionally, we applied INTEGRATE to 62 breast cancer patients from The Cancer Genome Atlas (TCGA) and found multiple recurrent gene fusions including a subset involving estrogen receptor. Taken together, INTEGRATE is a highly sensitive and accurate tool that is freely available for academic use.

  19. Construction of an integrated database to support genomic sequence analysis

    SciTech Connect

    Gilbert, W.; Overbeek, R.

    1994-11-01

    The central goal of this project is to develop an integrated database to support comparative analysis of genomes including DNA sequence data, protein sequence data, gene expression data and metabolism data. In developing the logic-based system GenoBase, a broader integration of available data was achieved due to assistance from collaborators. Current goals are to easily include new forms of data as they become available and to easily navigate through the ensemble of objects described within the database. This report comments on progress made in these areas.

  20. Improved transgene integration into the Chinese hamster ovary cell genome using the Cre-loxP system.

    PubMed

    Inao, Takanori; Kawabe, Yoshinori; Yamashiro, Takuro; Kameyama, Yujiro; Wang, Xue; Ito, Akira; Kamihira, Masamichi

    2015-07-01

    Genetic engineering of cellular genomes has provided useful tools for biomedical and pharmaceutical studies such as the generation of transgenic animals and producer cells of biopharmaceutical proteins. Gene integration using site-specific recombinases enables precise transgene insertion into predetermined genomic sites if the target site sequence is introduced into a specific chromosomal locus. We previously developed an accumulative site-specific gene integration system (AGIS) using Cre and mutated loxPs. The system enabled the repeated integration of multiple transgenes into a predetermined locus of a genome. In this study, we explored applicable mutated loxP pairs for AGIS to improve the integration efficiency. The integration efficiencies of 52 mutated loxP sequences, including novel sequences, were measured using an in vitro evaluation system. Among mutated loxP pairs that exhibited a high integration efficiency, the applicability of the selected pairs to AGIS was confirmed for transgene integration into the Chinese hamster ovary cell genome. The newly found mutated loxP pairs should be useful for Cre-mediated integration of transgenes and AGIS.

  1. Integrating hospital information systems in healthcare institutions: a mediation architecture.

    PubMed

    El Azami, Ikram; Cherkaoui Malki, Mohammed Ouçamah; Tahon, Christian

    2012-10-01

    Many studies have examined the integration of information systems into healthcare institutions, leading to several standards in the healthcare domain (CORBAmed: Common Object Request Broker Architecture in Medicine; HL7: Health Level Seven International; DICOM: Digital Imaging and Communications in Medicine; and IHE: Integrating the Healthcare Enterprise). Due to the existence of a wide diversity of heterogeneous systems, three essential factors are necessary to fully integrate a system: data, functions and workflow. However, most of the previous studies have dealt with only one or two of these factors, which makes the system integration unsatisfactory. In this paper, we propose a flexible, scalable architecture for Hospital Information Systems (HIS). Our main purpose is to provide a practical solution to ensure HIS interoperability so that healthcare institutions can communicate without being obliged to change their local information systems and without altering the tasks of the healthcare professionals. Our architecture is a mediation architecture with 3 levels: 1) a database level, 2) a middleware level and 3) a user interface level. The mediation is based on two central components: the Mediator and the Adapter. Using the XML format allows us to establish a structured, secured exchange of healthcare data. The notion of medical ontology is introduced to solve semantic conflicts and to unify the language used for the exchange. Our mediation architecture provides an effective, promising model that promotes the integration of hospital information systems that are autonomous, heterogeneous, semantically interoperable and platform-independent.

  2. Replication termination at eukaryotic chromosomes is mediated by Top2 and occurs at genomic loci containing pausing elements.

    PubMed

    Fachinetti, Daniele; Bermejo, Rodrigo; Cocito, Andrea; Minardi, Simone; Katou, Yuki; Kanoh, Yutaka; Shirahige, Katsuhiko; Azvolinsky, Anna; Zakian, Virginia A; Foiani, Marco

    2010-08-27

    Chromosome replication initiates at multiple replicons and terminates when forks converge. In E. coli, the Tus-TER complex mediates polar fork converging at the terminator region, and aberrant termination events challenge chromosome integrity and segregation. Since in eukaryotes, termination is less characterized, we used budding yeast to identify the factors assisting fork fusion at replicating chromosomes. Using genomic and mechanistic studies, we have identified and characterized 71 chromosomal termination regions (TERs). TERs contain fork pausing elements that influence fork progression and merging. The Rrm3 DNA helicase assists fork progression across TERs, counteracting the accumulation of X-shaped structures. The Top2 DNA topoisomerase associates at TERs in S phase, and G2/M facilitates fork fusion and prevents DNA breaks and genome rearrangements at TERs. We propose that in eukaryotes, replication fork barriers, Rrm3, and Top2 coordinate replication fork progression and fusion at TERs, thus counteracting abnormal genomic transitions.

  3. An integrated genomics approach identifies drivers of proliferation in luminal-subtype human breast cancer.

    PubMed

    Gatza, Michael L; Silva, Grace O; Parker, Joel S; Fan, Cheng; Perou, Charles M

    2014-10-01

    Elucidating the molecular drivers of human breast cancers requires a strategy that is capable of integrating multiple forms of data and an ability to interpret the functional consequences of a given genetic aberration. Here we present an integrated genomic strategy based on the use of gene expression signatures of oncogenic pathway activity (n = 52) as a framework to analyze DNA copy number alterations in combination with data from a genome-wide RNA-mediated interference screen. We identify specific DNA amplifications and essential genes within these amplicons representing key genetic drivers, including known and new regulators of oncogenesis. The genes identified include eight that are essential for cell proliferation (FGD5, METTL6, CPT1A, DTX3, MRPS23, EIF2S2, EIF6 and SLC2A10) and are uniquely amplified in patients with highly proliferative luminal breast tumors, a clinical subset of patients for which few therapeutic options are effective. This general strategy has the potential to identify therapeutic targets within amplicons through an integrated use of genomic data sets.

  4. MarinegenomicsDB: an integrated genome viewer for community-based annotation of genomes.

    PubMed

    Koyanagi, Ryo; Takeuchi, Takeshi; Hisata, Kanako; Gyoja, Fuki; Shoguchi, Eiichi; Satoh, Nori; Kawashima, Takeshi

    2013-10-01

    We constructed a web-based genome annotation platform, MarinegenomicsDB, to integrate genome data from various marine organisms including the pearl oyster Pinctada fucata and the coral Acropora digitifera. This newly developed viewer application provides open access to published data and a user-friendly environment for community-based manual gene annotation. Development on a flexible framework enables easy expansion of the website on demand. To date, more than 2000 genes have been annotated using this system. In the future, the website will be expanded to host a wider variety of data, more species, and different types of genome-wide analyses. The website is available at the following URL: http://marinegenomics.oist.jp.

  5. Integrated Genome-Based Studies of Shewanella Ecophysiology

    SciTech Connect

    Andrei L. Osterman, Ph.D.

    2012-12-17

    Integration of bioinformatics and experimental techniques was applied to mapping and characterization of the key components (pathways, enzymes, transporters, regulators) of the core metabolic machinery in Shewanella oneidensis and related species, with the main focus on metabolic and regulatory pathways involved in utilization of various carbon and energy sources. Among the main accomplishments reflected in ten joint publications with other participants of Shewanella Federation are: (i) A systems-level reconstruction of carbohydrate utilization pathways in the genus of Shewanella (19 species). This analysis yielded reconstruction of 18 sugar utilization pathways including 10 novel pathway variants and prediction of > 60 novel protein families of enzymes, transporters and regulators involved in these pathways. Selected functional predictions were verified by focused biochemical and genetic experiments. Observed growth phenotypes were consistent with bioinformatic predictions, providing strong validation of the technology; and (ii) Global genomic reconstruction of transcriptional regulons in 16 Shewanella genomes. The inferred regulatory network includes 82 transcription factors, 8 riboswitches and 6 translational attenuators. Of those, 45 regulons were inferred directly from the genome context analysis, whereas others were propagated from previously characterized regulons in other species. Selected regulatory predictions were experimentally tested. Integration of this analysis with microarray data revealed overall consistency and provided an additional layer of interactions between regulons. All the results were captured in the new database RegPrecise, which is a joint development with the LBNL team. A more detailed analysis of the individual subsystems, pathways and regulons in Shewanella spp included bioinformatics-based prediction and experimental characterization of: (i) N-Acetylglucosamine catabolic pathway; (ii) Lactate utilization machinery; (iii) Novel Nrt

  6. BK Polyomavirus Genomic Integration and Large T Antigen Expression: Evolving Paradigms in Human Oncogenesis.

    PubMed

    Kenan, D J; Mieczkowski, P A; Latulippe, E; Côté, I; Singh, H K; Nickeleit, V

    2016-12-31

    Human polyomaviruses are ubiquitous, with primary infections that typically occur during childhood and subsequent latency that may last a lifetime. Polyomavirus-mediated disease has been described in immunocompromised patients; its relationship to oncogenesis is poorly understood. We present deep sequencing data from a high-grade BK virus-associated tumor expressing large T antigen. The carcinoma arose in a kidney allograft 6 years after transplantation. We identified a novel genotype 1a BK polyomavirus, called Chapel Hill BK polyomavirus 2 (CH-2), that was integrated into the BRE gene in chromosome 2 of tumor cells. At the chromosomal integration site, viral break points were found, disrupting late BK gene sequences encoding capsid proteins VP1 and VP2/3. Immunohistochemistry and in situ hybridization studies demonstrated that the integrated BK virus was replication incompetent. We propose that the BK virus CH-2 was integrated into the human genome as a concatemer, resulting in alterations of feedback loops and overexpression of large T antigen. Collectively, these findings support the emerging understanding that viral integration is a nearly ubiquitous feature in polyomavirus-associated malignancy and that unregulated large T antigen expression drives a proliferative state that is conducive to oncogenesis. Based on the current observations, we present an updated model of polyomavirus-mediated oncogenesis.

  7. Integrative gene transfer in the truffle Tuber borchii by Agrobacterium tumefaciens-mediated transformation.

    PubMed

    Brenna, Andrea; Montanini, Barbara; Muggiano, Eleonora; Proietto, Marco; Filetici, Patrizia; Ottonello, Simone; Ballario, Paola

    2014-01-01

    Agrobacterium tumefaciens-mediated transformation is a powerful tool for reverse genetics and functional genomic analysis in a wide variety of plants and fungi. Tuber spp. are ecologically important and gastronomically prized fungi ("truffles") with a cryptic life cycle, a subterranean habitat and a symbiotic, but also facultative saprophytic lifestyle. The genome of a representative member of this group of fungi has recently been sequenced. However, because of their poor genetic tractability, including transformation, truffles have so far eluded in-depth functional genomic investigations. Here we report that A. tumefaciens can infect Tuber borchii mycelia, thereby conveying its transfer DNA with the production of stably integrated transformants. We constructed two new binary plasmids (pABr1 and pABr3) and tested them as improved transformation vectors using the green fluorescent protein as reporter gene and hygromycin phosphotransferase as selection marker. Transformants were stable for at least 12 months of in vitro culture propagation and, as revealed by TAIL-PCR analysis, integration sites appear to be heterogeneous, with a preference for repeat element-containing genome sites.

  8. CRISPR/Cas9-mediated genome modification in the mollusc, Crepidula fornicata.

    PubMed

    Perry, Kimberly J; Henry, Jonathan Q

    2015-02-01

    The discovery and application of the CRISPR/Cas9 genome editing method has greatly enhanced the ease with which transgenic manipulation can occur. We applied this technology to the mollusc, Crepidula fornicata, and have successfully created transgenic embryos expressing mCherry fused to endogenous β-catenin. Specific integration of the fluorescent reporter was achieved by homologous recombination with a β-catenin-specific donor DNA containing the mCherry coding sequence. This fluorescent gene knock-in strategy permits in vivo observations of β-catenin expression during embryonic development and represents the first demonstration of CRISPR/Cas9-mediated transgenesis in the Lophotrochozoa superphylum. The CRISPR/Cas9 method is a powerful and economical tool for genome modification and presents an option for analysis of gene expression in not only major model systems, but also in those more diverse species that may not have been amenable to the classic methods of transgenesis. This approach will allow one to generate transgenic lines of snails for future studies.

  9. On the road with WRAP53β: guardian of Cajal bodies and genome integrity.

    PubMed

    Henriksson, Sofia; Farnebo, Marianne

    2015-01-01

    The WRAP53 gene encodes both an antisense transcript (WRAP53α) that stabilizes the tumor suppressor p53 and a protein (WRAP53β) involved in maintenance of Cajal bodies, telomere elongation and DNA repair. WRAP53β is one of many proteins containing WD40 domains, known to mediate a variety of cellular processes. These proteins lack enzymatic activity, acting instead as platforms for the assembly of large complexes of proteins and RNAs thus facilitating their interactions. WRAP53β mediates site-specific interactions between Cajal body factors and DNA repair proteins. Moreover, dysfunction of this protein has been linked to premature aging, cancer and neurodegeneration. Here we summarize the current state of knowledge concerning the multifaceted roles of WRAP53β in intracellular trafficking, formation of the Cajal body, DNA repair and maintenance of genomic integrity and discuss potential crosstalk between these processes.

  10. Integration of competing ancillary assertions in genome assembly

    SciTech Connect

    Burks, C.; Parsons, R.J.; Engle, M.L.

    1994-12-31

    Assembly of genomic sequences and maps relies on a primary set of experimental data (e.g., the sequences of individual DNA fragments, or hybridization fingerprints of individual clone inserts), but almost always also relies on several streams of related but distinct kinds of data for completeness and accuracy of the final construction. These secondary data sets, which we term ancillary information, usually contain errors (as do the primary data sets, therefore creating the possibility of conflict between data sets), often arise from different experimental protocols and correspond to different scales of measurement, and occasionally include non-quantitative statements about the data. We present an approach for integration of ancillary assertions in the optimization of genome assembly, based on simultaneous balancing among the primary and secondary data sets, and include specific examples in the context of assembling DNA sequencing fragments to reconstruct a parent sequence.

  11. Integrating mass spectrometry and genomics for cyanobacterial metabolite discovery.

    PubMed

    Moss, Nathan A; Bertin, Matthew J; Kleigrewe, Karin; Leão, Tiago F; Gerwick, Lena; Gerwick, William H

    2016-03-01

    Filamentous marine cyanobacteria produce bioactive natural products with both potential therapeutic value and capacity to be harmful to human health. Genome sequencing has revealed that cyanobacteria have the capacity to produce many more secondary metabolites than have been characterized. The biosynthetic pathways that encode cyanobacterial natural products are mostly uncharacterized, and lack of cyanobacterial genetic tools has largely prevented their heterologous expression. Hence, a combination of cutting edge and traditional techniques has been required to elucidate their secondary metabolite biosynthetic pathways. Here, we review the discovery and refined biochemical understanding of the olefin synthase and fatty acid ACP reductase/aldehyde deformylating oxygenase pathways to hydrocarbons, and the curacin A, jamaicamide A, lyngbyabellin, columbamide, and a trans-acyltransferase macrolactone pathway encoding phormidolide. We integrate into this discussion the use of genomics, mass spectrometric networking, biochemical characterization, and isolation and structure elucidation techniques.

  12. Integrating mass spectrometry and genomics for cyanobacterial metabolite discovery

    PubMed Central

    Bertin, Matthew J.; Kleigrewe, Karin; Leão, Tiago F.; Gerwick, Lena

    2016-01-01

    Filamentous marine cyanobacteria produce bioactive natural products with both potential therapeutic value and capacity to be harmful to human health. Genome sequencing has revealed that cyanobacteria have the capacity to produce many more secondary metabolites than have been characterized. The biosynthetic pathways that encode cyanobacterial natural products are mostly uncharacterized, and lack of cyanobacterial genetic tools has largely prevented their heterologous expression. Hence, a combination of cutting edge and traditional techniques has been required to elucidate their secondary metabolite biosynthetic pathways. Here, we review the discovery and refined biochemical understanding of the olefin synthase and fatty acid ACP reductase/aldehyde deformylating oxygenase pathways to hydrocarbons, and the curacin A, jamaicamide A, lyngbyabellin, columbamide, and a trans-acyltransferase macrolactone pathway encoding phormidolide. We integrate into this discussion the use of genomics, mass spectrometric networking, biochemical characterization, and isolation and structure elucidation techniques. PMID:26578313

  13. Integration Preferences of Wildtype AAV-2 for Consensus Rep-Binding Sites at Numerous Loci in the Human Genome

    PubMed Central

    Hüser, Daniela; Gogol-Döring, Andreas; Lutter, Timo; Weger, Stefan; Winter, Kerstin; Hammer, Eva-Maria; Cathomen, Toni; Reinert, Knut; Heilbronn, Regine

    2010-01-01

    Adeno-associated virus type 2 (AAV) is known to establish latency by preferential integration in human chromosome 19q13.42. The AAV non-structural protein Rep appears to target a site called AAVS1 by simultaneously binding to Rep-binding sites (RBS) present on the AAV genome and within AAVS1. In the absence of Rep, as is the case with AAV vectors, chromosomal integration is rare and random. For a genome-wide survey of wildtype AAV integration, a linker-selection-mediated (LSM)-PCR strategy was designed to retrieve AAV-chromosomal junctions. DNA sequence determination revealed wildtype AAV integration sites scattered over the entire human genome. The bioinformatic analysis of these integration sites compared to those of rep-deficient AAV vectors revealed a highly significant overrepresentation of integration events near consensus RBS. Integration hotspots included AAVS1 with 10% of total events. Novel hotspots near consensus RBS were identified on chromosome 5p13.3, denoted AAVS2, and on chromosome 3p24.3, denoted AAVS3. AAVS2 displayed seven independent junctions clustered within only 14 bp of a consensus RBS which proved to bind Rep in vitro similar to the RBS in AAVS3. Expression of Rep in the presence of rep-deficient AAV vectors shifted targeting preferences from random integration back to the neighbourhood of consensus RBS at hotspots and numerous additional sites in the human genome. In summary, targeted AAV integration is not as specific for AAVS1 as previously assumed. Rather, Rep targets AAV to integrate into open chromatin regions within reach of various consensus RBS homologues in the human genome. PMID:20628575

  14. GDR (Genome Database for Rosaceae): integrated web resources for Rosaceae genomics and genetics research

    PubMed Central

    Jung, Sook; Jesudurai, Christopher; Staton, Margaret; Du, Zhidian; Ficklin, Stephen; Cho, Ilhyung; Abbott, Albert; Tomkins, Jeffrey; Main, Dorrie

    2004-01-01

    Background: Peach is being developed as a model organism for Rosaceae, an economically important family that includes fruits and ornamental plants such as apple, pear, strawberry, cherry, almond and rose. The genomics and genetics data of peach can play a significant role in the gene discovery and the genetic understanding of related species. The effective utilization of these peach resources, however, requires the development of an integrated and centralized database with associated analysis tools. Description: The Genome Database for Rosaceae (GDR) is a curated and integrated web-based relational database. GDR contains comprehensive data of the genetically anchored peach physical map, an annotated peach EST database, Rosaceae maps and markers and all publicly available Rosaceae sequences. Annotations of ESTs include contig assembly, putative function, simple sequence repeats, and anchored position to the peach physical map where applicable. Our integrated map viewer provides graphical interface to the genetic, transcriptome and physical mapping information. ESTs, BACs and markers can be queried by various categories and the search result sites are linked to the integrated map viewer or to the WebFPC physical map sites. In addition to browsing and querying the database, users can compare their sequences with the annotated GDR sequences via a dedicated sequence similarity server running either the BLAST or FASTA algorithm. To demonstrate the utility of the integrated and fully annotated database and analysis tools, we describe a case study where we anchored Rosaceae sequences to the peach physical and genetic map by sequence similarity. Conclusions: The GDR has been initiated to meet the major deficiency in Rosaceae genomics and genetics research, namely a centralized web database and bioinformatics tools for data storage, analysis and exchange. GDR can be accessed at . PMID:15357877

  15. The Npl3 hnRNP prevents R-loop-mediated transcription-replication conflicts and genome instability.

    PubMed

    Santos-Pereira, José M; Herrero, Ana B; García-Rubio, María L; Marín, Antonio; Moreno, Sergio; Aguilera, Andrés

    2013-11-15

    Transcription is a major obstacle for replication fork (RF) progression and a cause of genome instability. Part of this instability is mediated by cotranscriptional R loops, which are believed to increase by suboptimal assembly of the nascent messenger ribonucleoprotein particle (mRNP). However, no clear evidence exists that heterogeneous nuclear RNPs (hnRNPs), the basic mRNP components, prevent R-loop stabilization. Here we show that yeast Npl3, the most abundant RNA-binding hnRNP, prevents R-loop-mediated genome instability. npl3Δ cells show transcription-dependent and R-loop-dependent hyperrecombination and genome-wide replication obstacles as determined by accumulation of the Rrm3 helicase. Such obstacles preferentially occur at long and highly expressed genes, to which Npl3 is preferentially bound in wild-type cells, and are reduced by RNase H1 overexpression. The resulting replication stress confers hypersensitivity to double-strand break-inducing agents. Therefore, our work demonstrates that mRNP factors are critical for genome integrity and opens the option of using them as therapeutic targets in anti-cancer treatment.

  16. Integrated Genome-Based Studies of Shewanella Ecophysiology

    SciTech Connect

    Margrethe H. Serres

    2012-06-29

    Shewanella oneidensis MR-1 is a motile, facultative γ-Proteobacterium with remarkable respiratory versatility; it can utilize a range of organic and inorganic compounds as terminal electron acceptors for anaerobic metabolism. The ability to effectively reduce nitrate, S0, polyvalent metals and radionuclides has established MR-1 as an important model dissimilatory metal-reducing microorganism for genome-based investigations of biogeochemical transformation of metals and radionuclides that are of concern to the U.S. Department of Energy (DOE) sites nationwide. Metal-reducing bacteria such as Shewanella also have a highly developed capacity for extracellular transfer of respiratory electrons to solid phase Fe and Mn oxides as well as directly to anode surfaces in microbial fuel cells. More broadly, Shewanellae are recognized free-living microorganisms and members of microbial communities involved in the decomposition of organic matter and the cycling of elements in aquatic and sedimentary systems. To function and compete in environments that are subject to spatial and temporal environmental change, Shewanella must be able to sense and respond to such changes and therefore require relatively robust sensing and regulation systems. The overall goal of this project is to apply the tools of genomics, leveraging the availability of genome sequence for 18 additional strains of Shewanella, to better understand the ecophysiology and speciation of respiratory-versatile members of this important genus. To understand these systems we propose to use genome-based approaches to investigate Shewanella as a system of integrated networks; first describing key cellular subsystems - those involved in signal transduction, regulation, and metabolism - then building towards understanding the function of whole cells and, eventually, cells within populations. As a general approach, this project will employ complementary "top-down" - bioinformatics-based genome functional predictions, high

  17. An integrative computational approach for prioritization of genomic variants

    SciTech Connect

    Dubchak, Inna; Balasubramanian, Sandhya; Wang, Sheng; Meydan, Cem; Sulakhe, Dinanath; Poliakov, Alexander; Börnigen, Daniela; Xie, Bingqing; Taylor, Andrew; Ma, Jianzhu; Paciorkowski, Alex R.; Mirzaa, Ghayda M.; Dave, Paul; Agam, Gady; Xu, Jinbo; Al-Gazali, Lihadh; Mason, Christopher E.; Ross, M. Elizabeth; Maltsev, Natalia; Gilliam, T. Conrad; Huang, Qingyang

    2014-12-15

    An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. The study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest.

  18. An integrative computational approach for prioritization of genomic variants

    DOE PAGES

    Dubchak, Inna; Balasubramanian, Sandhya; Wang, Sheng; ...

    2014-12-15

    An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. The study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest.

  19. An Integrative Computational Approach for Prioritization of Genomic Variants

    PubMed Central

    Wang, Sheng; Meydan, Cem; Sulakhe, Dinanath; Poliakov, Alexander; Börnigen, Daniela; Xie, Bingqing; Taylor, Andrew; Ma, Jianzhu; Paciorkowski, Alex R.; Mirzaa, Ghayda M.; Dave, Paul; Agam, Gady; Xu, Jinbo; Al-Gazali, Lihadh; Mason, Christopher E.; Ross, M. Elizabeth; Maltsev, Natalia; Gilliam, T. Conrad

    2014-01-01

    An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. The study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest. PMID:25506935

  20. Integrated Genomic Map from Uropathogenic Escherichia coli J96

    PubMed Central

    Melkerson-Watson, Lyla J.; Rode, Christopher K.; Zhang, Lixin; Foxman, Betsy; Bloch, Craig A.

    2000-01-01

    Escherichia coli J96 is a uropathogen having both broad similarities to and striking differences from nonpathogenic, laboratory E. coli K-12. Strain J96 contains three large (>100-kb) unique genomic segments integrated on the chromosome; two are recognized as pathogenicity islands containing urovirulence genes. Additionally, the strain possesses a fourth smaller accessory segment of 28 kb and two deletions relative to strain K-12. We report an integrated physical and genetic map of the 5,120-kb J96 genome. The chromosome contains 26 NotI, 13 BlnI, and 7 I-CeuI macrorestriction sites. Macrorestriction mapping was rapidly accomplished by a novel transposon-based procedure: analysis of modified minitransposon insertions served to align the overlapping macrorestriction fragments generated by three different enzymes (each sharing a common cleavage site within the insert), thus integrating the three different digestion patterns and ordering the fragments. The resulting map, generated from a total of 54 mini-Tn10 insertions, was supplemented with auxanography and Southern analysis to indicate the positions of insertionally disrupted aminosynthetic genes and cloned virulence genes, respectively. Thus, it contains not only physical, macrorestriction landmarks but also the loci for eight housekeeping genes shared with strain K-12 and eight acknowledged urovirulence genes; the latter confirmed clustering of virulence genes at the large unique accessory chromosomal segments. The 115-kb J96 plasmid was resolved by pulsed-field gel electrophoresis in NotI digests. However, because the plasmid lacks restriction sites for the enzymes BlnI and I-CeuI, it was visualized in BlnI and I-CeuI digests only of derivatives carrying plasmid inserts artificially introducing these sites. Owing to an I-SceI site on the transposon, the plasmid could also be visualized and sized from plasmid insertion mutants after digestion with this enzyme. 
The insertional strains generated in construction of

  1. Efficient homologous recombination-mediated genome engineering in zebrafish using TALE nucleases.

    PubMed

    Shin, Jimann; Chen, Jiakun; Solnica-Krezel, Lilianna

    2014-10-01

    Custom-designed nucleases afford a powerful reverse genetic tool for direct gene disruption and genome modification in vivo. Among various applications of the nucleases, homologous recombination (HR)-mediated genome editing is particularly useful for inserting heterologous DNA fragments, such as GFP, into a specific genomic locus in a sequence-specific fashion. However, precise HR-mediated genome editing is still technically challenging in zebrafish. Here, we establish a GFP reporter system for measuring the frequency of HR events in live zebrafish embryos. By co-injecting a TALE nuclease and GFP reporter targeting constructs with homology arms of different size, we defined the length of homology arms that increases the recombination efficiency. In addition, we found that the configuration of the targeting construct can be a crucial parameter in determining the efficiency of HR-mediated genome engineering. Implementing these modifications improved the efficiency of zebrafish knock-in generation, with over 10% of the injected F0 animals transmitting gene-targeting events through their germline. We generated two HR-mediated insertion alleles of sox2 and gfap loci that express either superfolder GFP (sfGFP) or tandem dimeric Tomato (tdTomato) in a spatiotemporal pattern that mirrors the endogenous loci. This efficient strategy provides new opportunities not only to monitor expression of endogenous genes and proteins and follow specific cell types in vivo, but it also paves the way for other sophisticated genetic manipulations of the zebrafish genome.

  2. Bacteriophage WO Can Mediate Horizontal Gene Transfer in Endosymbiotic Wolbachia Genomes

    PubMed Central

    Wang, Guan H.; Sun, Bao F.; Xiong, Tuan L.; Wang, Yan K.; Murfin, Kristen E.; Xiao, Jin H.; Huang, Da W.

    2016-01-01

    Phage-mediated horizontal gene transfer (HGT) is common in free-living bacteria, and many transferred genes can play a significant role in their new bacterial hosts. However, there are few reports concerning phage-mediated HGT in endosymbionts (obligate intracellular bacteria within animal or plant hosts), such as Wolbachia. The Wolbachia-infecting temperate phage WO can actively shift among Wolbachia genomes and has the potential to mediate HGT between Wolbachia strains. In the present study, we extend previous findings by validating that the phage WO can mediate transfer of non-phage genes. To do so, we utilized bioinformatic, phylogenetic, and molecular analyses based on all sequenced Wolbachia and phage WO genomes. Our results show that the phage WO can mediate HGT between Wolbachia strains, regardless of whether the transferred genes originate from Wolbachia or other unrelated bacteria. PMID:27965627

  3. Integrative Genomic Characterization and a Genomic Staging System for Gastrointestinal Stromal Tumors

    PubMed Central

    Ylipää, Antti; Hunt, Kelly K.; Yang, Jilong; Lazar, Alexander J. F.; Torres, Keila E.; Lev, Dina Chelouche; Nykter, Matti; Pollock, Raphael E.; Trent, Jonathan; Zhang, Wei

    2010-01-01

    Gastrointestinal stromal tumors (GISTs) were historically grouped with leiomyosarcomas (LMSs) based on their morphological similarities, but recently they have been unequivocally established as a distinct type of sarcoma based on the molecular features and response to imatinib treatment. To gain further insight into the genomic differences between GISTs and LMSs, we mapped gene copy number aberrations (CNAs) in 42 GISTs and 30 LMSs and integrated them with gene expression profiles. Our studies revealed distinct patterns of CNAs between GISTs and LMSs. Losses in chromosomes 1p, 14q, 15q, and 22q were significantly more frequent in GISTs than in LMSs (P < 0.001), whereas losses in chromosomes 10 and 16 as well as gains in 1q, 14q, and 15q (P < 0.001) were more common in LMSs. By integrating CNAs with gene expression data and clinical information, we found several clinically relevant CNAs that were prognostic of survival in patients with GIST. Furthermore, GISTs were categorized into four groups according to an accumulating pattern of genetic alterations. Many key cellular pathways were differently expressed in the four groups and the patients had increasingly worse prognosis as the extent of genomic alterations increased. These findings lead us to propose a new tumor-progression genetic staging system termed Genomic Instability Stage (GIS) to complement the current prognostic predictive system based on tumor size, mitotic index (MI), and KIT mutation. PMID:20818650

  4. The Proteins API: accessing key integrated protein and genome information.

    PubMed

    Nightingale, Andrew; Antunes, Ricardo; Alpi, Emanuele; Bursteinas, Borisas; Gonzales, Leonardo; Liu, Wudong; Luo, Jie; Qi, Guoying; Turner, Edd; Martin, Maria

    2017-04-05

    The Proteins API provides searching and programmatic access to protein and associated genomics data, such as curated protein sequence positional annotations from UniProtKB, as well as mapped variation and proteomics data from large scale data sources (LSS). Using the coordinates service, researchers are able to retrieve the genomic sequence coordinates for proteins in UniProtKB. These coordinates, and the LSS genomics and proteomics data for UniProt proteins, are programmatically available only through this service. A Swagger UI has been implemented to provide documentation and an interface that lets users with little or no programming experience 'talk' to the services, quickly and easily formulate queries, and obtain dynamically generated source code for popular programming languages, such as Java, Perl, Python and Ruby. Search results are returned as standard JSON, XML or GFF data objects. The Proteins API is a scalable, reliable, fast, easy-to-use set of RESTful services that provides a broad protein information resource, allowing users to ask questions based upon their field of expertise and to gain an integrated overview of the protein annotations available for understanding proteins in biological processes. The Proteins API is available at (http://www.ebi.ac.uk/proteins/api/doc).
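
    As a rough illustration of the programmatic access described above, the following Python sketch prepares (without sending) a request to the coordinates service. The base URL, the `/coordinates/{accession}` path and the media types are assumptions inferred from the documentation address and the formats named in the abstract, and `build_coordinates_request` is a hypothetical helper, not part of the API; the Swagger documentation should be checked before relying on any of them.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumed base URL, inferred from the documentation address in the abstract.
BASE_URL = "https://www.ebi.ac.uk/proteins/api"

# Assumed media types for the three result formats the abstract names.
ACCEPT = {
    "json": "application/json",
    "xml": "application/xml",
    "gff": "text/x-gff",
}


def build_coordinates_request(accession: str, fmt: str = "json") -> Request:
    """Prepare (but do not send) a GET request for a protein's genomic coordinates.

    Hypothetical helper: the endpoint path is an assumption, not taken
    from the abstract.
    """
    if fmt not in ACCEPT:
        raise ValueError(f"unsupported format: {fmt!r}")
    # Percent-encode the accession so it is safe to place in the URL path.
    url = f"{BASE_URL}/coordinates/{quote(accession, safe='')}"
    return Request(url, headers={"Accept": ACCEPT[fmt]})


if __name__ == "__main__":
    req = build_coordinates_request("P05067", "json")
    print(req.full_url, req.get_header("Accept"))
```

    Passing the prepared request to `urllib.request.urlopen` would perform the actual query; building the request separately keeps the format selection easy to inspect without network access.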

  5. Differences in Vector Genome Processing and Illegitimate Integration of Non-Integrating Lentiviral Vectors

    PubMed Central

    Shaw, Aaron M.; Joseph, Guiandre L.; Jasti, Aparna C.; Sastry-Dent, Lakshmi; Witting, Scott; Cornetta, Kenneth

    2016-01-01

    A variety of mutations in lentiviral vector expression systems have been shown to generate a non-integrating phenotype. We studied a novel 12 base-pair U3-LTR integrase attachment site deletion (U3-LTR att site) mutant and found similar physical titers to the previously reported integrase catalytic core mutant IN/D116N. Both mutations led to a greater than two log reduction in vector integration; with IN/D116N providing lower illegitimate integration frequency, while the U3-LTR att site mutant provided a higher level of transgene expression. The improved expression of the U3-LTR att site mutant could not be explained solely based on an observed modest increase in integration frequency. In evaluating processing, we noted significant differences in unintegrated vector forms, with the U3-LTR att site mutant leading to a predominance of 1-LTR circles. The mutations also differed in the manner of illegitimate integration. The U3-LTR att site mutant vector demonstrated integrase-mediated integration at the intact U5-LTR att site and non-integrase mediated integration at the mutated U3-LTR att site. Finally, we combined a variety of mutations and modifications and assessed transgene expression and integration frequency to show that combining modifications can improve the potential clinical utility of non-integrating lentiviral vectors. PMID:27682478

  6. TALEN-mediated genome editing: prospects and perspectives.

    PubMed

    Wright, David A; Li, Ting; Yang, Bing; Spalding, Martin H

    2014-08-15

    Genome editing is the practice of making predetermined and precise changes to a genome by controlling the location of DNA DSBs (double-strand breaks) and manipulating the cell's repair mechanisms. This technology results from harnessing natural processes that have taken decades and multiple lines of inquiry to understand. Through many false starts and iterative technology advances, the goal of genome editing is just now falling under the control of human hands as a routine and broadly applicable method. The present review attempts to define the technique and capture the discovery process while following its evolution from meganucleases and zinc finger nucleases to the current state of the art: TALEN (transcription-activator-like effector nuclease) technology. We also discuss factors that influence success, technical challenges and future prospects of this quickly evolving area of study and application.

  7. TALEN-mediated genome editing: prospects and perspectives

    SciTech Connect

    Wright, DA; Li, T; Yang, B; Spalding, MH

    2014-08-15

    Genome editing is the practice of making predetermined and precise changes to a genome by controlling the location of DNA DSBs (double-strand breaks) and manipulating the cell's repair mechanisms. This technology results from harnessing natural processes that have taken decades and multiple lines of inquiry to understand. Through many false starts and iterative technology advances, the goal of genome editing is just now falling under the control of human hands as a routine and broadly applicable method. The present review attempts to define the technique and capture the discovery process while following its evolution from meganucleases and zinc finger nucleases to the current state of the art: TALEN (transcription-activator-like effector nuclease) technology. We also discuss factors that influence success, technical challenges, and future prospects of this quickly evolving area of study and application.

  8. CRISPR/Cas9 mediated genome engineering in Drosophila.

    PubMed

    Bassett, Andrew; Liu, Ji-Long

    2014-09-01

    Genome engineering has revolutionised genetic analysis in many organisms. Here we describe a simple and efficient technique to generate and detect novel mutations in desired target genes in Drosophila melanogaster. We target double strand breaks to specific sites within the genome by injecting mRNA encoding the Cas9 endonuclease and in vitro transcribed synthetic guide RNA into Drosophila embryos. The small insertion and deletion mutations that result from inefficient non-homologous end joining at this site are detected by high resolution melt analysis of whole flies and individual wings, allowing stable lines to be made within 1 month.

  9. The integrated microbial genomes (IMG) system in 2007: data content and analysis tool extensions

    SciTech Connect

    Markowitz, Victor M.; Szeto, Ernest; Palaniappan, Krishna; Grechkin, Yuri; Chu, Ken; Chen, I-Min A.; Dubchak, Inna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2007-08-01

    The Integrated Microbial Genomes (IMG) system is a data management, analysis and annotation platform for all publicly available genomes. IMG contains both draft and complete JGI microbial genomes integrated with all other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and annotating genomes, genes and functions, individually or in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through quarterly releases. IMG is provided by the DOE-Joint Genome Institute (JGI) and is available from http://img.jgi.doe.gov.

  10. Integrated cytogenetics and genomics analysis of transposable elements in the Nile tilapia, Oreochromis niloticus.

    PubMed

    Valente, Guilherme; Kocher, Thomas; Eickbush, Thomas; Simões, Rafael P; Martins, Cesar

    2016-06-01

    Integration of cytogenetics and genomics has become essential to a better view of the architecture and function of genomes. Although advances in genomic sequencing have contributed to the study of genes and genomes, the repetitive DNA fraction of the genome is still enigmatic and poorly understood. Among repeated DNAs, transposable elements (TEs) are major components of eukaryotic chromatin, and their investigation has been hindered even after the availability of whole sequenced genomes. The cytogenetic mapping of TEs in chromosomes has proved to be of high value for integrating information from the micro level of nucleotide sequence with a cytological view of chromosomes. Different TEs have been cytogenetically mapped in cichlids; however, neither details about their genomic arrangement nor appropriate copy numbers are well defined by these approaches. The current study integrates TE distribution in the Nile tilapia (Oreochromis niloticus) genome using cytogenetic and genomic/bioinformatic approaches. The results showed that some elements are not randomly distributed and that some are genomically dependent on each other. Moreover, we found extensive overlap between genomics and cytogenetics data, and that tandem duplication may be the major mechanism responsible for the genomic dynamics of the TEs analyzed here. This paper provides insights into the genomic organization of TEs under an integrated view based on cytogenetics and genomics.

  11. Bilayer-thickness-mediated interactions between integral membrane proteins.

    PubMed

    Kahraman, Osman; Koch, Peter D; Klug, William S; Haselwandter, Christoph A

    2016-04-01

    Hydrophobic thickness mismatch between integral membrane proteins and the surrounding lipid bilayer can produce lipid bilayer thickness deformations. Experiment and theory have shown that protein-induced lipid bilayer thickness deformations can yield energetically favorable bilayer-mediated interactions between integral membrane proteins, and large-scale organization of integral membrane proteins into protein clusters in cell membranes. Within the continuum elasticity theory of membranes, the energy cost of protein-induced bilayer thickness deformations can be captured by considering compression and expansion of the bilayer hydrophobic core, membrane tension, and bilayer bending, resulting in biharmonic equilibrium equations describing the shape of lipid bilayers for a given set of bilayer-protein boundary conditions. Here we develop a combined analytic and numerical methodology for the solution of the equilibrium elastic equations associated with protein-induced lipid bilayer deformations. Our methodology allows accurate prediction of thickness-mediated protein interactions for arbitrary protein symmetries at arbitrary protein separations and relative orientations. We provide exact analytic solutions for cylindrical integral membrane proteins with constant and varying hydrophobic thickness, and develop perturbative analytic solutions for noncylindrical protein shapes. We complement these analytic solutions, and assess their accuracy, by developing both finite element and finite difference numerical solution schemes. We provide error estimates of our numerical solution schemes and systematically assess their convergence properties. Taken together, the work presented here puts into place an analytic and numerical framework which allows calculation of bilayer-mediated elastic interactions between integral membrane proteins for the complicated protein shapes suggested by structural biology and at the small protein separations most relevant for the crowded membrane
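
    For orientation, the elastic energy that the abstract describes in words is commonly written in the continuum-elasticity literature in the form sketched below. This is an assumed standard form, not a formula quoted from the abstract: the symbols u (leaflet thickness deformation field), K_b (bending rigidity), K_t (thickness deformation modulus), a (one-half the unperturbed hydrophobic thickness) and \tau (membrane tension) are introduced here for illustration only.

```latex
% Assumed standard thickness-deformation energy (symbols are not from the abstract)
G = \frac{1}{2} \int dx\, dy \left\{ K_b \left( \nabla^2 u \right)^2
    + K_t \left( \frac{u}{a} \right)^2
    + \tau \left[ \frac{2u}{a} + \left( \nabla u \right)^2 \right] \right\}

% Setting the first variation of G with respect to u to zero gives the
% fourth-order (biharmonic-type) equilibrium equation
K_b \nabla^4 u - \tau \nabla^2 u + \frac{K_t}{a^2}\, u + \frac{\tau}{a} = 0
```

    Under these assumptions, the three terms in the integrand correspond to the bilayer bending, hydrophobic-core compression and expansion, and membrane tension contributions named in the abstract, and the bilayer-protein boundary conditions fix u and its gradient at each protein boundary.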

  12. Inverted genomic segments and complex triplication rearrangements are mediated by inverted repeats in the human genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We identified complex genomic rearrangements consisting of intermixed duplications and triplications of genomic segments at the MECP2 and PLP1 loci. These complex rearrangements were characterized by a triplicated segment embedded within a duplication in 11 unrelated subjects. Notably, only two brea...

  13. Integration-defective lentiviral vector mediates efficient gene editing through homology-directed repair in human embryonic stem cells.

    PubMed

    Wang, Yebo; Wang, Yingjia; Chang, Tammy; Huang, He; Yee, Jiing-Kuan

    2016-11-28

    Human embryonic stem cells (hESCs) are used as platforms for disease study, drug screening and cell-based therapy. To facilitate these applications, it is frequently necessary to genetically manipulate the hESC genome. Gene editing with engineered nucleases enables site-specific genetic modification of the human genome through homology-directed repair (HDR). However, the frequency of HDR remains low in hESCs. We combined efficient expression of engineered nucleases and integration-defective lentiviral vector (IDLV) transduction for donor template delivery to mediate HDR in hESC line WA09. This strategy led to highly efficient HDR, with more than 80% of the selected WA09 clones harboring the transgene inserted at the targeted genomic locus. However, certain portions of the HDR clones contained the concatemeric IDLV genomic structure at the target site, probably resulting from recombination of the IDLV genomic input before HDR with the target. We found that the integrase protein of IDLV mediated the highly efficient HDR through the recruitment of a cellular protein, LEDGF/p75. This study demonstrates that IDLV-mediated HDR is a powerful and broadly applicable technology to carry out site-specific gene modification in hESCs.

  14. University of California San Francisco (UCSF-2): Integrative Genomic Approaches in Neuroblastoma (NBL) | Office of Cancer Genomics

    Cancer.gov

    The CTD2 Center at University of California San Francisco (UCSF-2) used an integrative genomics approach to reveal unidentified mRNA splicing patterns in neuroblastoma.

  15. IMG 4 version of the integrated microbial genomes comparative analysis system

    SciTech Connect

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2013-10-27

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Finally, different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  16. IMG 4 version of the integrated microbial genomes comparative analysis system

    PubMed Central

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2014-01-01

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu). PMID:24165883

  17. IMG 4 version of the integrated microbial genomes comparative analysis system.

    PubMed

    Markowitz, Victor M; Chen, I-Min A; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N; Kyrpides, Nikos C

    2014-01-01

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG's data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG's annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  18. Integrated genome-wide analysis of genomic changes and gene regulation in human adrenocortical tissue samples.

    PubMed

    Gara, Sudheer Kumar; Wang, Yonghong; Patel, Dhaval; Liu-Chittenden, Yi; Jain, Meenu; Boufraqech, Myriem; Zhang, Lisa; Meltzer, Paul S; Kebebew, Electron

    2015-10-30

    To gain insight into the pathogenesis of adrenocortical carcinoma (ACC) and whether there is progression from normal-to-adenoma-to-carcinoma, we performed genome-wide gene expression, gene methylation, microRNA expression and comparative genomic hybridization (CGH) analysis in human adrenocortical tissue (normal, adrenocortical adenomas and ACC) samples. A pairwise comparison of normal, adrenocortical adenomas and ACC gene expression profiles with more than four-fold expression differences and an adjusted P-value < 0.05 revealed no major differences in normal versus adrenocortical adenoma whereas there are 808 and 1085, respectively, dysregulated genes between ACC versus adrenocortical adenoma and ACC versus normal. The majority of the dysregulated genes in ACC were downregulated. By integrating the CGH, gene methylation and expression profiles of potential miRNAs with the gene expression of dysregulated genes, we found that there are more alterations in ACC versus normal compared to ACC versus adrenocortical adenoma. Importantly, we identified several novel molecular pathways that are associated with dysregulated genes and further experimentally validated that oncostatin M signaling induces caspase 3-dependent apoptosis and suppresses cell proliferation. Finally, we propose that there is a higher number of genomic changes from normal-to-adenoma-to-carcinoma and identified oncostatin M signaling as a plausible druggable pathway for therapeutics.

  19. Potential pitfalls of CRISPR/Cas9-mediated genome editing.

    PubMed

    Peng, Rongxue; Lin, Guigao; Li, Jinming

    2016-04-01

    Recently, a novel technique named the clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein (Cas)9 system has been rapidly developed. This genome editing tool has improved our ability tremendously with respect to exploring the pathogenesis of diseases and correcting disease mutations, as well as phenotypes. With a short guide RNA, Cas9 can be precisely directed to target sites, and functions as an endonuclease to efficiently produce breaks in DNA double strands. Over the past 30 years, CRISPR has evolved from the 'curious sequences of unknown biological function' into a promising genome editing tool. As a result of the incessant development in the CRISPR/Cas9 system, Cas9 co-expressed with custom guide RNAs has been successfully used in a variety of cells and organisms. This genome editing technology can also be applied to synthetic biology, functional genomic screening, transcriptional modulation and gene therapy. However, although CRISPR/Cas9 has a broad range of action in science, there are several aspects that affect its efficiency and specificity, including Cas9 activity, target site selection and short guide RNA design, delivery methods, off-target effects and the incidence of homology-directed repair. In the present review, we highlight the factors that affect the utilization of CRISPR/Cas9, as well as possible strategies for handling any problems. Addressing these issues will allow us to take better advantage of this technique. In addition, we also review the history and rapid development of the CRISPR/Cas system from the time of its initial discovery in 2012.

  20. An integrative characterization of recurrent molecular aberrations in glioblastoma genomes.

    PubMed

    Sintupisut, Nardnisa; Liu, Pei-Ling; Yeang, Chen-Hsiang

    2013-10-01

    Glioblastoma multiforme (GBM) is the most common and malignant primary brain tumor in adults. Decades of investigations and the recent effort of the Cancer Genome Atlas (TCGA) project have mapped many molecular alterations in GBM cells. Alterations on DNAs may dysregulate gene expressions and drive malignancy of tumors. It is thus important to uncover causal and statistical dependency between 'effector' molecular aberrations and 'target' gene expressions in GBMs. A rich collection of prior studies attempted to combine copy number variation (CNV) and mRNA expression data. However, systematic methods to integrate multiple types of cancer genomic data (gene mutations, single nucleotide polymorphisms, CNVs, DNA methylations, mRNA and microRNA expressions and clinical information) are relatively scarce. We proposed an algorithm to build 'association modules' linking effector molecular aberrations and target gene expressions and applied the module-finding algorithm to the integrated TCGA GBM data sets. The inferred association modules were validated by six tests using external information and datasets of central nervous system tumors: (i) indication of prognostic effects among patients; (ii) coherence of target gene expressions; (iii) retention of effector-target associations in external data sets; (iv) recurrence of effector molecular aberrations in GBM; (v) functional enrichment of target genes; and (vi) co-citations between effectors and targets. Modules associated with well-known molecular aberrations of GBM, such as chromosome 7 amplifications, chromosome 10 deletions, EGFR and NF1 mutations, passed the majority of the validation tests. Furthermore, several modules associated with less well-reported molecular aberrations, such as chromosome 11 CNVs, CD40, PLXNB1 and GSTM1 methylations, and mir-21 expressions, were also validated by external information. In particular, modules constituting trans-acting effects with chromosome 11 CNVs and cis-acting effects with chromosome

  1. GDR (Genome Database for Rosaceae): integrated web-database for Rosaceae genomics and genetics data.

    PubMed

    Jung, Sook; Staton, Margaret; Lee, Taein; Blenda, Anna; Svancara, Randall; Abbott, Albert; Main, Dorrie

    2008-01-01

    The Genome Database for Rosaceae (GDR) is a central repository of curated and integrated genetics and genomics data of Rosaceae, an economically important family which includes apple, cherry, peach, pear, raspberry, rose and strawberry. GDR contains annotated databases of all publicly available Rosaceae ESTs, the genetically anchored peach physical map, Rosaceae genetic maps and comprehensively annotated markers and traits. The ESTs are assembled to produce unigene sets of each genus and the entire Rosaceae. Other annotations include putative function, microsatellites, open reading frames, single nucleotide polymorphisms, gene ontology terms and anchored map position where applicable. Most of the published Rosaceae genetic maps can be viewed and compared through CMap, the comparative map viewer. The peach physical map can be viewed using WebFPC/WebChrom, and also through our integrated GDR map viewer, which serves as a portal to the combined genetic, transcriptome and physical mapping information. ESTs, BACs, markers and traits can be queried by various categories and the search result sites are linked to the mapping visualization tools. GDR also provides online analysis tools such as a batch BLAST/FASTA server for the GDR datasets, a sequence assembly server and microsatellite and primer detection tools. GDR is available at http://www.rosaceae.org.

  2. GDR (Genome Database for Rosaceae): integrated web-database for Rosaceae genomics and genetics data

    PubMed Central

    Jung, Sook; Staton, Margaret; Lee, Taein; Blenda, Anna; Svancara, Randall; Abbott, Albert; Main, Dorrie

    2008-01-01

    The Genome Database for Rosaceae (GDR) is a central repository of curated and integrated genetics and genomics data of Rosaceae, an economically important family which includes apple, cherry, peach, pear, raspberry, rose and strawberry. GDR contains annotated databases of all publicly available Rosaceae ESTs, the genetically anchored peach physical map, Rosaceae genetic maps and comprehensively annotated markers and traits. The ESTs are assembled to produce unigene sets of each genus and the entire Rosaceae. Other annotations include putative function, microsatellites, open reading frames, single nucleotide polymorphisms, gene ontology terms and anchored map position where applicable. Most of the published Rosaceae genetic maps can be viewed and compared through CMap, the comparative map viewer. The peach physical map can be viewed using WebFPC/WebChrom, and also through our integrated GDR map viewer, which serves as a portal to the combined genetic, transcriptome and physical mapping information. ESTs, BACs, markers and traits can be queried by various categories and the search result sites are linked to the mapping visualization tools. GDR also provides online analysis tools such as a batch BLAST/FASTA server for the GDR datasets, a sequence assembly server and microsatellite and primer detection tools. GDR is available at http://www.rosaceae.org. PMID:17932055

  3. CRISPR/Cas9-mediated genome editing of Epstein-Barr virus in human cells.

    PubMed

    Yuen, Kit-San; Chan, Chi-Ping; Wong, Nok-Hei Mickey; Ho, Chau-Ha; Ho, Ting-Hin; Lei, Ting; Deng, Wen; Tsao, Sai Wah; Chen, Honglin; Kok, Kin-Hang; Jin, Dong-Yan

    2015-03-01

    The CRISPR (clustered regularly interspaced short palindromic repeats)/Cas9 (CRISPR-associated 9) system is a highly efficient and powerful tool for RNA-guided editing of the cellular genome. Whether CRISPR/Cas9 can also cleave the genome of DNA viruses such as Epstein-Barr virus (EBV), which undergo episomal replication in human cells, remains to be established. Here, we reported on CRISPR/Cas9-mediated editing of the EBV genome in human cells. Two guide RNAs (gRNAs) were used to direct a targeted deletion of 558 bp in the promoter region of BART (BamHI A rightward transcript) which encodes viral microRNAs (miRNAs). Targeted editing was achieved in several human epithelial cell lines latently infected with EBV, including nasopharyngeal carcinoma C666-1 cells. CRISPR/Cas9-mediated editing of the EBV genome was efficient. A recombinant virus with the desired deletion was obtained after puromycin selection of cells expressing Cas9 and gRNAs. No off-target cleavage was found by deep sequencing. The loss of BART miRNA expression and activity was verified, supporting the BART promoter as the major promoter of BART RNA. Although CRISPR/Cas9-mediated editing of the multicopy episome of EBV in infected HEK293 cells was mostly incomplete, viruses could be recovered and introduced into other cells at low m.o.i. Recombinant viruses with an edited genome could be further isolated through single-cell sorting. Finally, a DsRed selectable marker was successfully introduced into the EBV genome during the course of CRISPR/Cas9-mediated editing. Taken together, our work provided not only the first genetic evidence that the BART promoter drives the expression of the BART transcript, but also a new and efficient method for targeted editing of EBV genome in human cells.

  4. High-Content Genome-Wide RNAi Screen Reveals CCR3 as a Key Mediator of Neuronal Cell Death

    PubMed Central

    Wang, Huaishan; Sherbini, Omar; Ling-lin Pai, Emily; Kwon, Ji-Sun; He, Wei; Wang, Hong; Chi, Zhikai; Xu, Jinchong; Jiang, Haisong; Andrabi, Shaida A.

    2016-01-01

    Neuronal loss caused by ischemic injury, trauma, or disease can lead to devastating consequences for the individual. With the goal of limiting neuronal loss, a number of cell death pathways have been studied, but there may be additional contributors to neuronal death that are yet unknown. To identify previously unknown cell death mediators, we performed a high-content genome-wide screening of short, interfering RNA (siRNA) with an siRNA library in murine neural stem cells after exposure to N-methyl-N-nitroso-N′-nitroguanidine (MNNG), which leads to DNA damage and cell death. Eighty genes were identified as key mediators for cell death. Among them, 14 are known cell death mediators and 66 have not previously been linked to cell death pathways. Using an integrated approach with functional and bioinformatics analysis, we provide possible molecular networks, interconnected pathways, and/or protein complexes that may participate in cell death. Of the 66 genes, we selected CCR3 for further evaluation and found that CCR3 is a mediator of neuronal injury. CCR3 inhibition or deletion protects murine cortical cultures from oxygen-glucose deprivation–induced cell death, and CCR3 deletion in mice provides protection from ischemia in vivo. Taken together, our findings suggest that CCR3 is a previously unknown mediator of cell death. Future identification of the neural cell death network in which CCR3 participates will enhance our understanding of the molecular mechanisms of neural cell death. PMID:27822494

  5. High-Content Genome-Wide RNAi Screen Reveals CCR3 as a Key Mediator of Neuronal Cell Death.

    PubMed

    Zhang, Jianmin; Wang, Huaishan; Sherbini, Omar; Ling-Lin Pai, Emily; Kang, Sung-Ung; Kwon, Ji-Sun; Yang, Jia; He, Wei; Wang, Hong; Eacker, Stephen M; Chi, Zhikai; Mao, Xiaobo; Xu, Jinchong; Jiang, Haisong; Andrabi, Shaida A; Dawson, Ted M; Dawson, Valina L

    2016-01-01

    Neuronal loss caused by ischemic injury, trauma, or disease can lead to devastating consequences for the individual. With the goal of limiting neuronal loss, a number of cell death pathways have been studied, but there may be additional contributors to neuronal death that are yet unknown. To identify previously unknown cell death mediators, we performed a high-content genome-wide screening of short, interfering RNA (siRNA) with an siRNA library in murine neural stem cells after exposure to N-methyl-N-nitroso-N'-nitroguanidine (MNNG), which leads to DNA damage and cell death. Eighty genes were identified as key mediators for cell death. Among them, 14 are known cell death mediators and 66 have not previously been linked to cell death pathways. Using an integrated approach with functional and bioinformatics analysis, we provide possible molecular networks, interconnected pathways, and/or protein complexes that may participate in cell death. Of the 66 genes, we selected CCR3 for further evaluation and found that CCR3 is a mediator of neuronal injury. CCR3 inhibition or deletion protects murine cortical cultures from oxygen-glucose deprivation-induced cell death, and CCR3 deletion in mice provides protection from ischemia in vivo. Taken together, our findings suggest that CCR3 is a previously unknown mediator of cell death. Future identification of the neural cell death network in which CCR3 participates will enhance our understanding of the molecular mechanisms of neural cell death.

  6. Oryzabase. An integrated biological and genome information database for rice.

    PubMed

    Kurata, Nori; Yamazaki, Yukiko

    2006-01-01

    The aim of Oryzabase is to create a comprehensive view of rice (Oryza sativa) as a model monocot plant by integrating biological data with molecular genomic information (http://www.shigen.nig.ac.jp/rice/oryzabase/top/top.jsp). The database contains information about rice development and anatomy, rice mutants, and genetic resources, especially for wild varieties of rice. The anatomical description of rice development is unique and is the first known representation for rice. Developmental and anatomical descriptions include in situ gene expression data serving as stage and tissue markers. The systematic presentation of a large number of rice mutant and mutant trait genes is indispensable, as is description of research in wild strains, core collections, and their detailed characterization. Several genetic, physical, and expression maps with full genome and cDNA sequences are also combined with biological data in Oryzabase. These datasets, when pooled together, could provide a useful tool for gaining greater knowledge about the life cycle of rice, the relationship between phenotype and gene function, and rice genetic diversity. For exchanging community information, Oryzabase publishes the Rice Genetics Newsletter organized by the Rice Genetics Cooperative and provides a mailing service, rice-e-net/rice-net.

  7. An affinity-based genome walking method to find transgene integration loci in transgenic genome.

    PubMed

    Thirulogachandar, V; Pandey, Prachi; Vaishnavi, C S; Reddy, Malireddy K

    2011-09-15

    Identifying a good transgenic event from the pool of putative transgenics is crucial for further characterization. In transgenic plants, the transgene can integrate in either single or multiple locations by disrupting the endogenes and/or in heterochromatin regions causing the positional effect. Apart from this, to protect against the unauthorized use of transgenic plants, the signature of transgene integration for every commercial transgenic event needs to be characterized. Here we show an affinity-based genome walking method, named locus-finding (LF) PCR (polymerase chain reaction), to determine the transgene flanking sequences of rice plants transformed by Agrobacterium tumefaciens. LF PCR includes a primary PCR by a degenerate primer and transfer DNA (T-DNA)-specific primer, a nested PCR, and a method of enriching the desired amplicons by using a biotin-tagged primer that is complementary to the T-DNA. This enrichment technique separates the single strands of desired amplicons from the off-target amplicons, reducing the template complexity by several orders of magnitude. We analyzed eight transgenic rice plants and found the transgene integration loci in three different chromosomes. The characteristic illegitimate recombination of the Agrobacterium sp. was also observed from the sequenced integration loci. We believe that the LF PCR should be an indispensable technique in transgenic analysis.

  8. Integrated genomic and epigenomic analysis of breast cancer brain metastasis.

    PubMed

    Salhia, Bodour; Kiefer, Jeff; Ross, Julianna T D; Metapally, Raghu; Martinez, Rae Anne; Johnson, Kyle N; DiPerna, Danielle M; Paquette, Kimberly M; Jung, Sungwon; Nasser, Sara; Wallstrom, Garrick; Tembe, Waibhav; Baker, Angela; Carpten, John; Resau, Jim; Ryken, Timothy; Sibenaller, Zita; Petricoin, Emanuel F; Liotta, Lance A; Ramanathan, Ramesh K; Berens, Michael E; Tran, Nhan L

    2014-01-01

    The brain is a common site of metastatic disease in patients with breast cancer, which has few therapeutic options and dismal outcomes. The purpose of our study was to identify common and rare events that underlie breast cancer brain metastasis. We performed deep genomic profiling, which integrated gene copy number, gene expression and DNA methylation datasets on a collection of breast brain metastases. We identified frequent large chromosomal gains in 1q, 5p, 8q, 11q, and 20q and frequent broad-level deletions involving 8p, 17p, 21p and Xq. Frequently amplified and overexpressed genes included ATAD2, BRAF, DERL1, DNMTRB and NEK2A. The ATM, CRYAB and HSPB2 genes were commonly deleted and underexpressed. Knowledge mining revealed enrichment in cell cycle and G2/M transition pathways, which contained AURKA, AURKB and FOXM1. Using the PAM50 breast cancer intrinsic classifier, Luminal B, Her2+/ER negative, and basal-like tumors were identified as the most commonly represented breast cancer subtypes in our brain metastasis cohort. While overall methylation levels were increased in breast cancer brain metastasis, basal-like brain metastases were associated with significantly lower levels of methylation. Integrating DNA methylation data with gene expression revealed defects in cell migration and adhesion due to hypermethylation and downregulation of PENK, EDN3, and ITGAM. Hypomethylation and upregulation of KRT8 likely affects adhesion and permeability. Genomic and epigenomic profiling of breast brain metastasis has provided insight into the somatic events underlying this disease, which have potential in forming the basis of future therapeutic strategies.

  9. Integrative pathway genomics of lung function and airflow obstruction.

    PubMed

    Gharib, Sina A; Loth, Daan W; Soler Artigas, María; Birkland, Timothy P; Wilk, Jemma B; Wain, Louise V; Brody, Jennifer A; Obeidat, Ma'en; Hancock, Dana B; Tang, Wenbo; Rawal, Rajesh; Boezen, H Marike; Imboden, Medea; Huffman, Jennifer E; Lahousse, Lies; Alves, Alexessander C; Manichaikul, Ani; Hui, Jennie; Morrison, Alanna C; Ramasamy, Adaikalavan; Smith, Albert Vernon; Gudnason, Vilmundur; Surakka, Ida; Vitart, Veronique; Evans, David M; Strachan, David P; Deary, Ian J; Hofman, Albert; Gläser, Sven; Wilson, James F; North, Kari E; Zhao, Jing Hua; Heckbert, Susan R; Jarvis, Deborah L; Probst-Hensch, Nicole; Schulz, Holger; Barr, R Graham; Jarvelin, Marjo-Riitta; O'Connor, George T; Kähönen, Mika; Cassano, Patricia A; Hysi, Pirro G; Dupuis, Josée; Hayward, Caroline; Psaty, Bruce M; Hall, Ian P; Parks, William C; Tobin, Martin D; London, Stephanie J

    2015-12-01

    Chronic respiratory disorders are important contributors to the global burden of disease. Genome-wide association studies (GWASs) of lung function measures have identified several trait-associated loci, but explain only a modest portion of the phenotypic variability. We postulated that integrating pathway-based methods with GWASs of pulmonary function and airflow obstruction would identify a broader repertoire of genes and processes influencing these traits. We performed two independent GWASs of lung function and applied gene set enrichment analysis to one of the studies and validated the results using the second GWAS. We identified 131 significantly enriched gene sets associated with lung function and clustered them into larger biological modules involved in diverse processes including development, immunity, cell signaling, proliferation and arachidonic acid. We found that enrichment of gene sets was not driven by GWAS-significant variants or loci, but instead by those with less stringent association P-values. Next, we applied pathway enrichment analysis to a meta-analyzed GWAS of airflow obstruction. We identified several biologic modules that functionally overlapped with those associated with pulmonary function. However, differences were also noted, including enrichment of extracellular matrix (ECM) processes specifically in the airflow obstruction study. Network analysis of the ECM module implicated a candidate gene, matrix metalloproteinase 10 (MMP10), as a putative disease target. We used a knockout mouse model to functionally validate MMP10's role in influencing lung's susceptibility to cigarette smoke-induced emphysema. By integrating pathway analysis with population-based genomics, we unraveled biologic processes underlying pulmonary function traits and identified a candidate gene for obstructive lung disease.

  10. Integrative pathway genomics of lung function and airflow obstruction

    PubMed Central

    Gharib, Sina A.; Loth, Daan W.; Soler Artigas, María; Birkland, Timothy P.; Wilk, Jemma B.; Wain, Louise V.; Brody, Jennifer A.; Obeidat, Ma'en; Hancock, Dana B.; Tang, Wenbo; Rawal, Rajesh; Boezen, H. Marike; Imboden, Medea; Huffman, Jennifer E.; Lahousse, Lies; Alves, Alexessander C.; Manichaikul, Ani; Hui, Jennie; Morrison, Alanna C.; Ramasamy, Adaikalavan; Smith, Albert Vernon; Gudnason, Vilmundur; Surakka, Ida; Vitart, Veronique; Evans, David M.; Strachan, David P.; Deary, Ian J.; Hofman, Albert; Gläser, Sven; Wilson, James F.; North, Kari E.; Zhao, Jing Hua; Heckbert, Susan R.; Jarvis, Deborah L.; Probst-Hensch, Nicole; Schulz, Holger; Barr, R. Graham; Jarvelin, Marjo-Riitta; O'Connor, George T.; Kähönen, Mika; Cassano, Patricia A.; Hysi, Pirro G.; Dupuis, Josée; Hayward, Caroline; Psaty, Bruce M.; Hall, Ian P.; Parks, William C.; Tobin, Martin D.; London, Stephanie J.

    2015-01-01

    Chronic respiratory disorders are important contributors to the global burden of disease. Genome-wide association studies (GWASs) of lung function measures have identified several trait-associated loci, but explain only a modest portion of the phenotypic variability. We postulated that integrating pathway-based methods with GWASs of pulmonary function and airflow obstruction would identify a broader repertoire of genes and processes influencing these traits. We performed two independent GWASs of lung function and applied gene set enrichment analysis to one of the studies and validated the results using the second GWAS. We identified 131 significantly enriched gene sets associated with lung function and clustered them into larger biological modules involved in diverse processes including development, immunity, cell signaling, proliferation and arachidonic acid. We found that enrichment of gene sets was not driven by GWAS-significant variants or loci, but instead by those with less stringent association P-values. Next, we applied pathway enrichment analysis to a meta-analyzed GWAS of airflow obstruction. We identified several biologic modules that functionally overlapped with those associated with pulmonary function. However, differences were also noted, including enrichment of extracellular matrix (ECM) processes specifically in the airflow obstruction study. Network analysis of the ECM module implicated a candidate gene, matrix metalloproteinase 10 (MMP10), as a putative disease target. We used a knockout mouse model to functionally validate MMP10's role in influencing lung's susceptibility to cigarette smoke-induced emphysema. By integrating pathway analysis with population-based genomics, we unraveled biologic processes underlying pulmonary function traits and identified a candidate gene for obstructive lung disease. PMID:26395457

  11. Examination of host genome for the presence of integrated fragments of Solenopsis invicta virus 1

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A series of oligonucleotide primer pairs covering the entire genome of Solenopsis invicta virus 1 (SINV-1) were used to probe the Solenopsis invicta genome for integrated fragments of the viral genome. All of the oligonucleotide primer sets yielded amplicons of anticipated size from cDNA created f...

  12. Molecular Assemblies, Genes and Genomics Integrated Efficiently (MAGGIE)

    SciTech Connect

    Baliga, Nitin S

    2011-05-26

    applied to the manually curated training set. Applying this method to the data representing around a quarter of the fraction space for water soluble proteins in D. vulgaris, we obtained 854 reliable pairwise interactions. Further, we have developed algorithms to analyze and assign significance to protein interaction data from bait pull-down experiments and integrate these data with other systems biology data through associative biclustering in a parallel computing environment. We will 'fill-in' missing information in these interaction data using a 'Transitive Closure' algorithm and subsequently use 'Between Commonality Decomposition' algorithm to discover complexes within these large graphs of protein interactions. To characterize the metabolic activities of proteins and their complexes we are developing algorithms to deconvolute pure mass spectra, estimate chemical formula for m/z values, and fit isotopic fine structure to metabolomics data. We have discovered that, in comparison to isotopic pattern fitting methods, restricting the chemical formula by these two dimensions actually facilitates unique solutions for chemical formula generators. To understand how microbial functions are regulated we have developed complementary algorithms for reconstructing gene regulatory networks (GRNs). Whereas the network inference algorithms cMonkey and Inferelator developed enable de novo reconstruction of predictive models for GRNs from diverse systems biology data, the RegPrecise and RegPredict framework developed uses evolutionary comparisons of genomes from closely related organisms to reconstruct conserved regulons. We have integrated the two complementary algorithms to rapidly generate comprehensive models for gene regulation of understudied organisms. Our preliminary analyses of these reconstructed GRNs have revealed novel regulatory mechanisms and cis-regulatory motifs, as well as others that are conserved across species. Finally, we are supporting scientific efforts in ENIGMA

  13. Genomic RNA folding mediates assembly of human parechovirus.

    PubMed

    Shakeel, Shabih; Dykeman, Eric C; White, Simon J; Ora, Ari; Cockburn, Joseph J B; Butcher, Sarah J; Stockley, Peter G; Twarock, Reidun

    2017-12-01

    Assembly of the major viral pathogens of the Picornaviridae family is poorly understood. Human parechovirus 1 is an example of such viruses that contains 60 short regions of ordered RNA density making identical contacts with the protein shell. We show here via a combination of RNA-based systematic evolution of ligands by exponential enrichment, bioinformatics analysis and reverse genetics that these RNA segments are bound to the coat proteins in a sequence-specific manner. Disruption of either the RNA coat protein recognition motif or its contact amino acid residues is deleterious for viral assembly. The data are consistent with RNA packaging signals playing essential roles in virion assembly. Their binding sites on the coat proteins are evolutionarily conserved across the Parechovirus genus, suggesting that they represent potential broad-spectrum anti-viral targets. The mechanism underlying packaging of genomic RNA into viral particles is not well understood for human parechoviruses. Here the authors identify short RNA motifs in the parechovirus genome that bind capsid proteins, providing approximately 60 specific interactions for virion assembly.

  14. Integrated genomic and molecular characterization of cervical cancer.

    PubMed

    2017-03-16

    Cervical cancer remains one of the leading causes of cancer-related deaths worldwide. Here we report the extensive molecular characterization of 228 primary cervical cancers, one of the largest comprehensive genomic studies of cervical cancer to date. We observed notable APOBEC mutagenesis patterns and identified SHKBP1, ERBB3, CASP8, HLA-A and TGFBR2 as novel significantly mutated genes in cervical cancer. We also discovered amplifications in immune targets CD274 (also known as PD-L1) and PDCD1LG2 (also known as PD-L2), and the BCAR4 long non-coding RNA, which has been associated with response to lapatinib. Integration of human papilloma virus (HPV) was observed in all HPV18-related samples and 76% of HPV16-related samples, and was associated with structural aberrations and increased target-gene expression. We identified a unique set of endometrial-like cervical cancers, comprised predominantly of HPV-negative tumours with relatively high frequencies of KRAS, ARID1A and PTEN mutations. Integrative clustering of 178 samples identified keratin-low squamous, keratin-high squamous and adenocarcinoma-rich subgroups. These molecular analyses reveal new potential therapeutic targets for cervical cancers.

  15. The integrated web service and genome database for agricultural plants with biotechnology information.

    PubMed

    Kim, Changkug; Park, Dongsuk; Seol, Youngjoo; Hahn, Jangho

    2011-01-01

    The National Agricultural Biotechnology Information Center (NABIC) constructed an agricultural biology-based infrastructure and developed a Web based relational database for agricultural plants with biotechnology information. The NABIC has concentrated on functional genomics of major agricultural plants, building an integrated biotechnology database for agro-biotech information that focuses on genomics of major agricultural resources. This genome database provides annotated genome information from 1,039,823 records mapped to rice, Arabidopsis, and Chinese cabbage.

  16. Accessing integrated genomic data using GenoBase: A tutorial, Part 1

    SciTech Connect

    Overbeek, R.; Price, M.

    1993-01-01

    GenoBase integrates genomic information from many existing databases, offering convenient access to the curated data. This document is the first part of a two-part tutorial on how to use GenoBase for accessing integrated genomic data.

  17. Causes and consequences of genetic background effects illuminated by integrative genomic analysis.

    PubMed

    Chandler, Christopher H; Chari, Sudarshan; Tack, David; Dworkin, Ian

    2014-04-01

    The phenotypic consequences of individual mutations are modulated by the wild-type genetic background in which they occur. Although such background dependence is widely observed, we do not know whether general patterns across species and traits exist or about the mechanisms underlying it. We also lack knowledge on how mutations interact with genetic background to influence gene expression and how this in turn mediates mutant phenotypes. Furthermore, how genetic background influences patterns of epistasis remains unclear. To investigate the genetic basis and genomic consequences of genetic background dependence of the scalloped(E3) allele on the Drosophila melanogaster wing, we generated multiple novel genome-level datasets from a mapping-by-introgression experiment and a tagged RNA gene expression dataset. In addition we used whole genome resequencing of the parental lines (two commonly used laboratory strains) to predict polymorphic transcription factor binding sites for SD. We integrated these data with previously published genomic datasets from expression microarrays and a modifier mutation screen. By searching for genes showing a congruent signal across multiple datasets, we were able to identify a robust set of candidate loci contributing to the background-dependent effects of mutations in sd. We also show that the majority of background-dependent modifiers previously reported are caused by higher-order epistasis, not quantitative noncomplementation. These findings provide a useful foundation for more detailed investigations of genetic background dependence in this system, and this approach is likely to prove useful in exploring the genetic basis of other traits as well.

  18. CRISPR/Cas9-mediated genome engineering of CHO cell factories: Application and perspectives.

    PubMed

    Lee, Jae Seong; Grav, Lise Marie; Lewis, Nathan E; Faustrup Kildegaard, Helene

    2015-07-01

    Chinese hamster ovary (CHO) cells are the most widely used production host for therapeutic proteins. With the recent emergence of CHO genome sequences, CHO cell line engineering has taken on a new aspect through targeted genome editing. The bacterial clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) system enables rapid, easy and efficient engineering of mammalian genomes. It has a wide range of applications from modification of individual genes to genome-wide screening or regulation of genes. Facile genome editing using CRISPR/Cas9 empowers researchers in the CHO community to elucidate the mechanistic basis behind high level production of proteins and product quality attributes of interest. In this review, we describe the basis of CRISPR/Cas9-mediated genome editing and its application for development of next generation CHO cell factories while highlighting both future perspectives and challenges. As one of the main drivers for the CHO systems biology era, genome engineering with CRISPR/Cas9 will pave the way for rational design of CHO cell factories.

  19. Application of oocyte cryopreservation technology in TALEN-mediated mouse genome editing.

    PubMed

    Nakagawa, Yoshiko; Sakuma, Tetsushi; Nakagata, Naomi; Yamasaki, Sho; Takeda, Naoki; Ohmuraya, Masaki; Yamamoto, Takashi

    2014-01-01

    Reproductive engineering techniques, such as in vitro fertilization (IVF) and cryopreservation of embryos or spermatozoa, are essential for preservation, reproduction, and transportation of genetically engineered mice. However, it has not yet been elucidated whether these techniques can be applied for the generation of genome-edited mice using engineered nucleases such as transcription activator-like effector nucleases (TALENs). Here, we demonstrate the usefulness of frozen oocytes fertilized in vitro using frozen sperm for TALEN-mediated genome editing in mice. We examined side-by-side comparisons concerning sperm (fresh vs. frozen), fertilization method (mating vs. IVF), and fertilized oocytes (fresh vs. frozen) for the source of oocytes used for TALEN injection; we found that fertilized oocytes created under all tested conditions were applicable for TALEN-mediated mutagenesis. In addition, we investigated whether the ages in weeks of parental female mice can affect the efficiency of gene modification, by comparing 5-week-old and 8-12-week-old mice as the source of oocytes used for TALEN injection. The genome editing efficiency of an endogenous gene was consistently 95-100% when either 5-week-old or 8-12-week-old mice were used with or without freezing the oocytes. Thus, our report describes the availability of freeze-thawed oocytes and oocytes from female mice at various weeks of age for TALEN-mediated genome editing, thus boosting the convenience of such innovative gene targeting strategies.

  20. Cas9-mediated genome editing in the methanogenic archaeon Methanosarcina acetivorans.

    PubMed

    Nayak, Dipti D; Metcalf, William W

    2017-03-14

    Although Cas9-mediated genome editing has proven to be a powerful genetic tool in eukaryotes, its application in Bacteria has been limited because of inefficient targeting or repair; and its application to Archaea has yet to be reported. Here we describe the development of a Cas9-mediated genome-editing tool that allows facile genetic manipulation of the slow-growing methanogenic archaeon Methanosarcina acetivorans. Introduction of both insertions and deletions by homology-directed repair was remarkably efficient and precise, occurring at a frequency of approximately 20% relative to the transformation efficiency, with the desired mutation being found in essentially all transformants examined. Off-target activity was not observed. We also observed that multiple single-guide RNAs could be expressed in the same transcript, reducing the size of mutagenic plasmids and simultaneously simplifying their design. Cas9-mediated genome editing reduces the time needed to construct mutants by more than half (3 vs. 8 wk) and allows simultaneous construction of double mutants with high efficiency, exponentially decreasing the time needed for complex strain constructions. Furthermore, coexpression of the nonhomologous end-joining (NHEJ) machinery from the closely related archaeon, Methanocella paludicola, allowed efficient Cas9-mediated genome editing without the need for a repair template. The NHEJ-dependent mutations included deletions ranging from 75 to 2.7 kb in length, most of which appear to have occurred at regions of naturally occurring microhomology. The combination of homology-directed repair-dependent and NHEJ-dependent genome-editing tools comprises a powerful genetic system that enables facile insertion and deletion of genes, rational modification of gene expression, and testing of gene essentiality.

  1. The Integrated Microbial Genomes (IMG) System: An Expanding Comparative Analysis Resource

    SciTech Connect

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Grechkin, Yuri; Ratner, Anna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2009-09-13

    The integrated microbial genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG contains both draft and complete microbial genomes integrated with other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through regular releases. Several companion IMG systems have been set up in order to serve domain specific needs, such as expert review of genome annotations. IMG is available at .

  2. Assessing the integration of genomic medicine in genetic counseling training programs.

    PubMed

    Profato, Jessica; Gordon, Erynn S; Dixon, Shannan; Kwan, Andrea

    2014-08-01

    Medical genetics has entered a period of transition from genetics to genomics. Genetic counselors (GCs) may take on roles in the clinical implementation of genomics. This study explores the perspectives of program directors (PDs) on including genomic medicine in GC training programs, as well as the status of this integration. Study methods included an online survey, an optional one-on-one telephone interview, and an optional curricula content analysis. The majority of respondents (15/16) reported that it is important to include genomic medicine in program curricula. Most topics of genomic medicine are either "currently taught" or "under development" in all participating programs. Interview data from five PDs and one faculty member supported the survey data. Integrating genomics in training programs is challenging, and it is essential to develop genomics resources for curricula.

  3. Divergence of RNA polymerase α subunits in angiosperm plastid genomes is mediated by genomic rearrangement.

    PubMed

    Blazier, J Chris; Ruhlman, Tracey A; Weng, Mao-Lun; Rehman, Sumaiyah K; Sabir, Jamal S M; Jansen, Robert K

    2016-04-18

    Genes for the plastid-encoded RNA polymerase (PEP) persist in the plastid genomes of all photosynthetic angiosperms. However, three unrelated lineages (Annonaceae, Passifloraceae and Geraniaceae) have been identified with unusually divergent open reading frames (ORFs) in the conserved region of rpoA, the gene encoding the PEP α subunit. We used sequence-based approaches to evaluate whether these genes retain function. Both gene sequences and complete plastid genome sequences were assembled and analyzed from each of the three angiosperm families. Multiple lines of evidence indicated that the rpoA sequences are likely functional despite retaining as low as 30% nucleotide sequence identity with rpoA genes from outgroups in the same angiosperm order. The ratio of non-synonymous to synonymous substitutions indicated that these genes are under purifying selection, and bioinformatic prediction of conserved domains indicated that functional domains are preserved. One of the lineages (Pelargonium, Geraniaceae) contains species with multiple rpoA-like ORFs that show evidence of ongoing inter-paralog gene conversion. The plastid genomes containing these divergent rpoA genes have experienced extensive structural rearrangement, including large expansions of the inverted repeat. We propose that illegitimate recombination, not positive selection, has driven the divergence of rpoA.

  4. Divergence of RNA polymerase α subunits in angiosperm plastid genomes is mediated by genomic rearrangement

    PubMed Central

    Blazier, J. Chris; Ruhlman, Tracey A.; Weng, Mao-Lun; Rehman, Sumaiyah K.; Sabir, Jamal S. M.; Jansen, Robert K.

    2016-01-01

    Genes for the plastid-encoded RNA polymerase (PEP) persist in the plastid genomes of all photosynthetic angiosperms. However, three unrelated lineages (Annonaceae, Passifloraceae and Geraniaceae) have been identified with unusually divergent open reading frames (ORFs) in the conserved region of rpoA, the gene encoding the PEP α subunit. We used sequence-based approaches to evaluate whether these genes retain function. Both gene sequences and complete plastid genome sequences were assembled and analyzed from each of the three angiosperm families. Multiple lines of evidence indicated that the rpoA sequences are likely functional despite retaining as low as 30% nucleotide sequence identity with rpoA genes from outgroups in the same angiosperm order. The ratio of non-synonymous to synonymous substitutions indicated that these genes are under purifying selection, and bioinformatic prediction of conserved domains indicated that functional domains are preserved. One of the lineages (Pelargonium, Geraniaceae) contains species with multiple rpoA-like ORFs that show evidence of ongoing inter-paralog gene conversion. The plastid genomes containing these divergent rpoA genes have experienced extensive structural rearrangement, including large expansions of the inverted repeat. We propose that illegitimate recombination, not positive selection, has driven the divergence of rpoA. PMID:27087667

  5. The 3M complex maintains microtubule and genome integrity

    PubMed Central

    Yan, Jun; Yan, Feng; Li, Zhijun; Sinnott, Becky; Cappell, Kathryn M.; Yu, Yanbao; Mo, Jinyao; Duncan, Joseph A.; Chen, Xian; Cormier-Daire, Valerie; Whitehurst, Angelique W.; Xiong, Yue

    2014-01-01

    SUMMARY CUL7, OBSL1, and CCDC8 genes are mutated in a mutually exclusive manner in 3M and other growth retardation syndromes. The mechanism underlying the function of the three 3M genes in development is not known. We found that OBSL1 and CCDC8 form a complex with CUL7 and regulate the level and centrosomal localization of CUL7, respectively. CUL7 depletion results in altered microtubule dynamics, prometaphase arrest, tetraploidy and mitotic cell death. These defects are recaptured in CUL7 mutated 3M cells and can be rescued by wild-type, but not 3M patients-derived CUL7 mutants. Depletion of either OBSL1 or CCDC8 results in similar defects and sensitizes cells to microtubule damage as loss of CUL7 function. Microtubule damage reduces the level of CCDC8 that is required for the centrosomal localization of CUL7. We propose that CUL7, OBSL1, and CCDC8 proteins form a 3M complex that functions in maintaining microtubule and genome integrity and normal development. PMID:24793695

  6. Phenotype-Genotype Integrator (PheGenI): synthesizing genome-wide association study (GWAS) data with existing genomic resources.

    PubMed

    Ramos, Erin M; Hoffman, Douglas; Junkins, Heather A; Maglott, Donna; Phan, Lon; Sherry, Stephen T; Feolo, Mike; Hindorff, Lucia A

    2014-01-01

    Rapidly accumulating data from genome-wide association studies (GWASs) and other large-scale studies are most useful when synthesized with existing databases. To address this opportunity, we developed the Phenotype-Genotype Integrator (PheGenI), a user-friendly web interface that integrates various National Center for Biotechnology Information (NCBI) genomic databases with association data from the National Human Genome Research Institute GWAS Catalog and supports downloads of search results. Here, we describe the rationale for and development of this resource. Integrating over 66,000 association records with extensive single nucleotide polymorphism (SNP), gene, and expression quantitative trait loci data already available from the NCBI, PheGenI enables deeper investigation and interrogation of SNPs associated with a wide range of traits, facilitating the examination of the relationships between genetic variation and human diseases.

  7. Integrative genomic analysis implicates limited peripheral adipose storage capacity in the pathogenesis of human insulin resistance.

    PubMed

    Lotta, Luca A; Gulati, Pawan; Day, Felix R; Payne, Felicity; Ongen, Halit; van de Bunt, Martijn; Gaulton, Kyle J; Eicher, John D; Sharp, Stephen J; Luan, Jian'an; De Lucia Rolfe, Emanuella; Stewart, Isobel D; Wheeler, Eleanor; Willems, Sara M; Adams, Claire; Yaghootkar, Hanieh; Forouhi, Nita G; Khaw, Kay-Tee; Johnson, Andrew D; Semple, Robert K; Frayling, Timothy; Perry, John R B; Dermitzakis, Emmanouil; McCarthy, Mark I; Barroso, Inês; Wareham, Nicholas J; Savage, David B; Langenberg, Claudia; O'Rahilly, Stephen; Scott, Robert A

    2017-01-01

    Insulin resistance is a key mediator of obesity-related cardiometabolic disease, yet the mechanisms underlying this link remain obscure. Using an integrative genomic approach, we identify 53 genomic regions associated with insulin resistance phenotypes (higher fasting insulin levels adjusted for BMI, lower HDL cholesterol levels and higher triglyceride levels) and provide evidence that their link with higher cardiometabolic risk is underpinned by an association with lower adipose mass in peripheral compartments. Using these 53 loci, we show a polygenic contribution to familial partial lipodystrophy type 1, a severe form of insulin resistance, and highlight shared molecular mechanisms in common/mild and rare/severe insulin resistance. Population-level genetic analyses combined with experiments in cellular models implicate CCDC92, DNAH10 and L3MBTL3 as previously unrecognized molecules influencing adipocyte differentiation. Our findings support the notion that limited storage capacity of peripheral adipose tissue is an important etiological component in insulin-resistant cardiometabolic disease and highlight genes and mechanisms underpinning this link.

  8. Integrating genetics and genomics into nursing curricula: you can do it too!

    PubMed

    Daack-Hirsch, Sandra; Jackson, Barbara; Belchez, Chito A; Elder, Betty; Hurley, Roxanne; Kerr, Peg; Nissen, Mary Kay

    2013-12-01

    Rapid advances in knowledge and technology related to genomics cross health care disciplines and touch almost every aspect of patient care. The ability to sequence a genome holds the promise that health care can be personalized. Health care professionals are faced with a gap in the ability to use the rapidly expanding technology and knowledge related to genomics in practice. Yet, nurses are key to bridging the gap between genomic discoveries and the human experience of illness. This article presents a case study documenting the experience of five nursing schools/colleges of nursing as they work to integrate genetics and genomics into their curricula.

  9. Stakeholder engagement: a key component of integrating genomic information into electronic health records

    PubMed Central

    Hartzler, Andrea; McCarty, Catherine A.; Rasmussen, Luke V.; Williams, Marc S.; Brilliant, Murray; Bowton, Erica A.; Clayton, Ellen Wright; Faucett, William A.; Ferryman, Kadija; Field, Julie R.; Fullerton, Stephanie M.; Horowitz, Carol R.; Koenig, Barbara A.; McCormick, Jennifer B.; Ralston, James D.; Sanderson, Saskia C.; Smith, Maureen E.; Trinidad, Susan Brown

    2014-01-01

    Integrating genomic information into clinical care and the electronic health record can facilitate personalized medicine through genetically guided clinical decision support. Stakeholder involvement is critical to the success of these implementation efforts. Prior work on implementation of clinical information systems provides broad guidance to inform effective engagement strategies. We add to this evidence-based recommendations that are specific to issues at the intersection of genomics and the electronic health record. We describe stakeholder engagement strategies employed by the Electronic Medical Records and Genomics Network, a national consortium of US research institutions funded by the National Human Genome Research Institute to develop, disseminate, and apply approaches that combine genomic and electronic health record data. Through select examples drawn from sites of the Electronic Medical Records and Genomics Network, we illustrate a continuum of engagement strategies to inform genomic integration into commercial and homegrown electronic health records across a range of health-care settings. We frame engagement as activities to consult, involve, and partner with key stakeholder groups throughout specific phases of health information technology implementation. Our aim is to provide insights into engagement strategies to guide genomic integration based on our unique network experiences and lessons learned within the broader context of implementation research in biomedical informatics. On the basis of our collective experience, we describe key stakeholder practices, challenges, and considerations for successful genomic integration to support personalized medicine. PMID:24030437

  10. CRISPR/Cas9-Mediated Genome Editing of Mouse Small Intestinal Organoids.

    PubMed

    Schwank, Gerald; Clevers, Hans

    2016-01-01

    The CRISPR/Cas9 system is an RNA-guided genome-editing tool that has been recently developed based on the bacterial CRISPR-Cas immune defense system. Due to its versatility and simplicity, it rapidly became the method of choice for genome editing in various biological systems, including mammalian cells. Here we describe a protocol for CRISPR/Cas9-mediated genome editing in murine small intestinal organoids, a culture system in which somatic stem cells are maintained by self-renewal, while giving rise to all major cell types of the intestinal epithelium. This protocol allows the study of gene function in intestinal epithelial homeostasis and pathophysiology and can be extended to epithelial organoids derived from other internal mouse and human organs.

  11. Characterization of Equine Infectious Anemia Virus Integration in the Horse Genome.

    PubMed

    Liu, Qiang; Wang, Xue-Feng; Ma, Jian; He, Xi-Jun; Wang, Xiao-Jun; Zhou, Jian-Hua

    2015-06-19

    Human immunodeficiency virus (HIV)-1 has a unique integration profile in the human genome relative to murine and avian retroviruses. Equine infectious anemia virus (EIAV) is another well-studied lentivirus that can also be used as a promising retro-transfection vector, but its integration into its native host has not been characterized. In this study, we mapped 477 integration sites of the EIAV strain EIAVFDDV13 in fetal equine dermal (FED) cells during in vitro infection. Published integration sites of EIAV and HIV-1 in the human genome were also analyzed as references. Our results demonstrated that EIAVFDDV13 tended to integrate into genes and AT-rich regions, and it avoided integrating into transcription start sites (TSS), which is consistent with EIAV and HIV-1 integration in the human genome. Notably, the integration of EIAVFDDV13 favored long interspersed elements (LINEs) and DNA transposons in the horse genome, whereas the integration of HIV-1 favored short interspersed elements (SINEs) in the human genome. The chromosomal environment near LINEs or DNA transposons potentially influences viral transcription and may be related to the unique EIAV latency states in equids. The data on EIAV integration in its natural host will facilitate studies on lentiviral infection and lentivirus-based therapeutic vectors.

  12. Sinbase: an integrated database to study genomics, genetics and comparative genomics in Sesamum indicum.

    PubMed

    Wang, Linhai; Yu, Jingyin; Li, Donghua; Zhang, Xiurong

    2015-01-01

    Sesame (Sesamum indicum L.) is an ancient and important oilseed crop grown widely in tropical and subtropical areas. It belongs to the gigantic order Lamiales, which includes many well-known or economically important species, such as olive (Olea europaea), leonurus (Leonurus japonicus) and lavender (Lavandula spica), many of which have important pharmacological properties. Despite their importance, genetic and genomic analyses on these species have been insufficient due to a lack of reference genome information. The now available S. indicum genome will provide an unprecedented opportunity for studying both S. indicum genetic traits and comparative genomics. To deliver S. indicum genomic information to the worldwide research community, we designed Sinbase, a web-based database with comprehensive sesame genomic, genetic and comparative genomic information. Sinbase includes sequences of assembled sesame pseudomolecular chromosomes, protein-coding genes (27,148), transposable elements (372,167) and non-coding RNAs (1,748). In particular, Sinbase provides unique and valuable information on colinear regions with various plant genomes, including Arabidopsis thaliana, Glycine max, Vitis vinifera and Solanum lycopersicum. Sinbase also provides a useful search function and data mining tools, including a keyword search and local BLAST service. Sinbase will be updated regularly with new features, improvements to genome annotation and new genomic sequences, and is freely accessible at http://ocri-genomics.org/Sinbase/.

  13. A high utility integrated map of the pig genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: The domestic pig is being increasingly exploited as a system for modeling human disease. It also has substantial economic importance for meat-based protein production. Physical clone maps have underpinned large-scale genomic sequencing and enabled focused cloning efforts for many genome...

  14. Comprehensive, Integrative Genomic Analysis of Diffuse Lower-Grade Gliomas

    PubMed Central

    2015-01-01

    BACKGROUND Diffuse low-grade and intermediate-grade gliomas (which together make up the lower-grade gliomas, World Health Organization grades II and III) have highly variable clinical behavior that is not adequately predicted on the basis of histologic class. Some are indolent; others quickly progress to glioblastoma. The uncertainty is compounded by interobserver variability in histologic diagnosis. Mutations in IDH, TP53, and ATRX and codeletion of chromosome arms 1p and 19q (1p/19q codeletion) have been implicated as clinically relevant markers of lower-grade gliomas. METHODS We performed genomewide analyses of 293 lower-grade gliomas from adults, incorporating exome sequence, DNA copy number, DNA methylation, messenger RNA expression, microRNA expression, and targeted protein expression. These data were integrated and tested for correlation with clinical outcomes. RESULTS Unsupervised clustering of mutations and data from RNA, DNA-copy-number, and DNA-methylation platforms uncovered concordant classification of three robust, nonoverlapping, prognostically significant subtypes of lower-grade glioma that were captured more accurately by IDH, 1p/19q, and TP53 status than by histologic class. Patients who had lower-grade gliomas with an IDH mutation and 1p/19q codeletion had the most favorable clinical outcomes. Their gliomas harbored mutations in CIC, FUBP1, NOTCH1, and the TERT promoter. Nearly all lower-grade gliomas with IDH mutations and no 1p/19q codeletion had mutations in TP53 (94%) and ATRX inactivation (86%). The large majority of lower-grade gliomas without an IDH mutation had genomic aberrations and clinical behavior strikingly similar to those found in primary glioblastoma. CONCLUSIONS The integration of genomewide data from multiple platforms delineated three molecular classes of lower-grade gliomas that were more concordant with IDH, 1p/19q, and TP53 status than with histologic class. Lower-grade gliomas with an IDH mutation either had 1p/19q

  15. Collaboration of MLLT1/ENL, Polycomb and ATM for transcription and genome integrity.

    PubMed

    Ui, Ayako; Yasui, Akira

    2016-04-25

    Polycomb group (PcG) proteins repress, whereas Trithorax group (TrxG) proteins activate, transcription for tissue development and cellular proliferation, and misregulation of these factors is often associated with cancer. ENL (MLLT1) and AF9 (MLLT3) are fusion partners of Mixed Lineage Leukemia (MLL), TrxG proteins, and are factors in Super Elongation Complex (SEC). SEC controls transcriptional elongation to release RNA polymerase II, paused around transcription start site. In MLL rearranged leukemia, several components of SEC have been found as MLL-fusion partners and the control of transcriptional elongation is misregulated leading to tumorigenesis in MLL-SEC fused Leukemia. It has been suggested that unexpected collaboration of ENL/AF9-MLL and PcG is involved in tumorigenesis in leukemia. Recently, we found that the collaboration of ENL/AF9 and PcG led to a novel mechanism of transcriptional switch from elongation to repression under ATM-signaling for genome integrity. Activated ATM phosphorylates ENL/AF9 in SEC, and the phosphorylated ENL/AF9 binds BMI1 and RING1B, a heterodimeric E3-ubiquitin-ligase complex in Polycomb Repressive complex 1 (PRC1), and recruits PRC1 at transcriptional elongation sites to rapidly repress transcription. The ENL/AF9 in SEC- and PcG-mediated transcriptional repression promotes DSB repair near transcription sites. The implication of this is that the collaboration of ENL/AF9 in SEC and PcG ensures a rapid response of transcriptional switching from elongation to repression to neighboring genotoxic stresses for DSB repair. Therefore, these results suggested that the collaboration of ENL/AF9 and PcG in transcriptional control is required to maintain genome integrity and may be linked to the MLL-ENL/AF9 leukemia.

  16. Integrated Consensus Map of Cultivated Peanut and Wild Relatives Reveals Structures of the A and B Genomes of Arachis and Divergence of the Legume Genomes

    PubMed Central

    Shirasawa, Kenta; Bertioli, David J.; Varshney, Rajeev K.; Moretzsohn, Marcio C.; Leal-Bertioli, Soraya C. M.; Thudi, Mahendar; Pandey, Manish K.; Rami, Jean-Francois; Foncéka, Daniel; Gowda, Makanahally V. C.; Qin, Hongde; Guo, Baozhu; Hong, Yanbin; Liang, Xuanqiang; Hirakawa, Hideki; Tabata, Satoshi; Isobe, Sachiko

    2013-01-01

    The complex, tetraploid genome structure of peanut (Arachis hypogaea) has obstructed advances in genetics and genomics in the species. The aim of this study is to understand the genome structure of Arachis by developing a high-density integrated consensus map. Three recombinant inbred line populations derived from crosses between the A genome diploid species, Arachis duranensis and Arachis stenosperma; the B genome diploid species, Arachis ipaënsis and Arachis magna; and between the AB genome tetraploids, A. hypogaea and an artificial amphidiploid (A. ipaënsis × A. duranensis)4×, were used to construct genetic linkage maps: 10 linkage groups (LGs) of 544 cM with 597 loci for the A genome; 10 LGs of 461 cM with 798 loci for the B genome; and 20 LGs of 1442 cM with 1469 loci for the AB genome. The resultant maps plus 13 published maps were integrated into a consensus map covering 2651 cM with 3693 marker loci which was anchored to 20 consensus LGs corresponding to the A and B genomes. The comparative genomics with genome sequences of Cajanus cajan, Glycine max, Lotus japonicus, and Medicago truncatula revealed that the Arachis genome has segmented synteny relationship to the other legumes. The comparative maps in legumes, integrated tetraploid consensus maps, and genome-specific diploid maps will increase the genetic and genomic understanding of Arachis and should facilitate molecular breeding. PMID:23315685

  17. Integrated consensus map of cultivated peanut and wild relatives reveals structures of the A and B genomes of Arachis and divergence of the legume genomes.

    PubMed

    Shirasawa, Kenta; Bertioli, David J; Varshney, Rajeev K; Moretzsohn, Marcio C; Leal-Bertioli, Soraya C M; Thudi, Mahendar; Pandey, Manish K; Rami, Jean-Francois; Foncéka, Daniel; Gowda, Makanahally V C; Qin, Hongde; Guo, Baozhu; Hong, Yanbin; Liang, Xuanqiang; Hirakawa, Hideki; Tabata, Satoshi; Isobe, Sachiko

    2013-04-01

    The complex, tetraploid genome structure of peanut (Arachis hypogaea) has obstructed advances in genetics and genomics in the species. The aim of this study is to understand the genome structure of Arachis by developing a high-density integrated consensus map. Three recombinant inbred line populations derived from crosses between the A genome diploid species, Arachis duranensis and Arachis stenosperma; the B genome diploid species, Arachis ipaënsis and Arachis magna; and between the AB genome tetraploids, A. hypogaea and an artificial amphidiploid (A. ipaënsis × A. duranensis)(4×), were used to construct genetic linkage maps: 10 linkage groups (LGs) of 544 cM with 597 loci for the A genome; 10 LGs of 461 cM with 798 loci for the B genome; and 20 LGs of 1442 cM with 1469 loci for the AB genome. The resultant maps plus 13 published maps were integrated into a consensus map covering 2651 cM with 3693 marker loci which was anchored to 20 consensus LGs corresponding to the A and B genomes. The comparative genomics with genome sequences of Cajanus cajan, Glycine max, Lotus japonicus, and Medicago truncatula revealed that the Arachis genome has segmented synteny relationship to the other legumes. The comparative maps in legumes, integrated tetraploid consensus maps, and genome-specific diploid maps will increase the genetic and genomic understanding of Arachis and should facilitate molecular breeding.
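    The map statistics quoted in this abstract can be turned into a rough marker density by dividing each map length by its locus count. The short Python sketch below does only that arithmetic; the function name and the dictionary layout are illustrative choices, not anything taken from the paper, and the figure is a crude average that ignores how loci are distributed across linkage groups.

```python
# Map length (cM) and number of loci, as quoted in the abstract.
MAPS = {
    "A genome": (544, 597),
    "B genome": (461, 798),
    "AB genome": (1442, 1469),
    "consensus": (2651, 3693),
}


def mean_spacing_cm(length_cm, n_loci):
    """Crude average map distance per locus (cM per locus)."""
    return length_cm / n_loci


# Rounded to two decimals for readability.
spacing = {name: round(mean_spacing_cm(*stats), 2) for name, stats in MAPS.items()}

for name, value in spacing.items():
    print(f"{name}: {value} cM per locus")
```

    On these numbers the consensus map averages about 0.72 cM per locus.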

  18. A second generation integrated map of the rainbow trout (Oncorhynchus mykiss) genome: analysis of synteny with model fish genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In this paper we generated DNA fingerprints and end sequences from bacterial artificial chromosomes (BACs) from two new libraries to improve the first generation integrated physical and genetic map of the rainbow trout (Oncorhynchus mykiss) genome. The current version of the physical map is compose...

  19. viruSITE—integrated database for viral genomics

    PubMed Central

    Stano, Matej; Beke, Gabor; Klucar, Lubos

    2016-01-01

    Viruses are the most abundant biological entities and the reservoir of most of the genetic diversity in the Earth's biosphere. Viral genomes are very diverse, generally short in length and, compared with other organisms, carry only a few genes. viruSITE is a novel database which brings together high-value information compiled from various resources. viruSITE covers the whole universe of viruses and focuses on viral genomes, genes and proteins. The database contains information on virus taxonomy, host range, genome features, sequential relatedness as well as the properties and functions of viral genes and proteins. All entries in the database are linked to numerous information resources. The above-mentioned features make viruSITE a comprehensive knowledge hub in the field of viral genomics. The web interface of the database was designed so as to offer an easy-to-navigate, intuitive and user-friendly environment. It provides sophisticated text searching and a taxonomy-based browsing system. viruSITE also allows for an alternative approach based on sequence search. A proprietary genome browser generates a graphical representation of viral genomes. In addition to retrieving and visualising data, users can perform comparative genomics analyses using a variety of tools. Database URL: http://www.virusite.org/ PMID:28025349

  20. Genetic and statistical study of HIV integration in the human genome

    NASA Astrophysics Data System (ADS)

    Sequeira, Inês J.; Gonçalves, Juliana; Moreira, Elsa; Mexia, João T.; Rueff, José; Brás, Aldina

    2013-10-01

    Integration of human immunodeficiency virus (HIV) DNA into the human genome is essential for HIV-induced disease. The human genome is organized into chromosomes, and within these we can define chromosomal fragile sites. Our aim is to help clarify the integration-site preferences of HIV1 and HIV2 for fragile or non-fragile regions. Here we apply statistical techniques, namely non-parametric tests and analysis of variance, to two data sets of HIV1 and HIV2 integrations in the human genome. The results show that integrations occur significantly more often in the non-fragile regions of the human genome and that HIV1 in particular makes the major contribution to this effect. This study could have implications for human disease.

  1. GenomeVISTA—an integrated software package for whole-genome alignment and visualization

    PubMed Central

    Poliakov, Alexandre; Foong, Justin; Brudno, Michael; Dubchak, Inna

    2014-01-01

    Summary: With the ubiquitous generation of complete genome assemblies for a variety of species, efficient tools for whole-genome alignment along with user-friendly visualization are critically important. Our VISTA family of tools for comparative genomics, based on algorithms for pairwise and multiple alignments of genomic sequences and whole-genome assemblies, has become one of the standard techniques for comparative analysis. Most of the VISTA programs have been implemented as Web-accessible servers and are extensively used by the biomedical community. In this manuscript, we introduce GenomeVISTA: a novel implementation that incorporates most features of the VISTA family—fast and accurate alignment, visualization capabilities, GUI and analytical tools within a stand-alone software package. GenomeVISTA thus provides flexibility and security for users who need to conduct whole-genome comparisons on their own computers. Availability and implementation: Implemented in Perl, C/C++ and Java, the source code is freely available for download at the VISTA Web site: http://genome.lbl.gov/vista/ Contact: avpoliakov@lbl.gov or ildubchak@lbl.gov Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24860159

  2. Ku-mediated coupling of DNA cleavage and repair during programmed genome rearrangements in the ciliate Paramecium tetraurelia.

    PubMed

    Marmignon, Antoine; Bischerour, Julien; Silve, Aude; Fojcik, Clémentine; Dubois, Emeline; Arnaiz, Olivier; Kapusta, Aurélie; Malinsky, Sophie; Bétermier, Mireille

    2014-08-01

    During somatic differentiation, physiological DNA double-strand breaks (DSB) can drive programmed genome rearrangements (PGR), during which DSB repair pathways are mobilized to safeguard genome integrity. Because of their unique nuclear dimorphism, ciliates are powerful unicellular eukaryotic models to study the mechanisms involved in PGR. At each sexual cycle, the germline nucleus is transmitted to the progeny, but the somatic nucleus, essential for gene expression, is destroyed and a new somatic nucleus differentiates from a copy of the germline nucleus. In Paramecium tetraurelia, the development of the somatic nucleus involves massive PGR, including the precise elimination of at least 45,000 germline sequences (Internal Eliminated Sequences, IES). IES excision proceeds through a cut-and-close mechanism: a domesticated transposase, PiggyMac, is essential for DNA cleavage, and DSB repair at excision sites involves the Ligase IV, a specific component of the non-homologous end-joining (NHEJ) pathway. At the genome-wide level, a huge number of programmed DSBs must be repaired during this process to allow the assembly of functional somatic chromosomes. To understand how DNA cleavage and DSB repair are coordinated during PGR, we have focused on Ku, the earliest actor of NHEJ-mediated repair. Two Ku70 and three Ku80 paralogs are encoded in the genome of P. tetraurelia: Ku70a and Ku80c are produced during sexual processes and localize specifically in the developing new somatic nucleus. Using RNA interference, we show that the development-specific Ku70/Ku80c heterodimer is essential for the recovery of a functional somatic nucleus. Strikingly, at the molecular level, PiggyMac-dependent DNA cleavage is abolished at IES boundaries in cells depleted for Ku80c, resulting in IES retention in the somatic genome. PiggyMac and Ku70a/Ku80c co-purify as a complex when overproduced in a heterologous system. We conclude that Ku has been integrated in the Paramecium DNA cleavage

  3. Ku-Mediated Coupling of DNA Cleavage and Repair during Programmed Genome Rearrangements in the Ciliate Paramecium tetraurelia

    PubMed Central

    Marmignon, Antoine; Bischerour, Julien; Silve, Aude; Fojcik, Clémentine; Dubois, Emeline; Arnaiz, Olivier; Kapusta, Aurélie; Malinsky, Sophie; Bétermier, Mireille

    2014-01-01

    During somatic differentiation, physiological DNA double-strand breaks (DSB) can drive programmed genome rearrangements (PGR), during which DSB repair pathways are mobilized to safeguard genome integrity. Because of their unique nuclear dimorphism, ciliates are powerful unicellular eukaryotic models to study the mechanisms involved in PGR. At each sexual cycle, the germline nucleus is transmitted to the progeny, but the somatic nucleus, essential for gene expression, is destroyed and a new somatic nucleus differentiates from a copy of the germline nucleus. In Paramecium tetraurelia, the development of the somatic nucleus involves massive PGR, including the precise elimination of at least 45,000 germline sequences (Internal Eliminated Sequences, IES). IES excision proceeds through a cut-and-close mechanism: a domesticated transposase, PiggyMac, is essential for DNA cleavage, and DSB repair at excision sites involves the Ligase IV, a specific component of the non-homologous end-joining (NHEJ) pathway. At the genome-wide level, a huge number of programmed DSBs must be repaired during this process to allow the assembly of functional somatic chromosomes. To understand how DNA cleavage and DSB repair are coordinated during PGR, we have focused on Ku, the earliest actor of NHEJ-mediated repair. Two Ku70 and three Ku80 paralogs are encoded in the genome of P. tetraurelia: Ku70a and Ku80c are produced during sexual processes and localize specifically in the developing new somatic nucleus. Using RNA interference, we show that the development-specific Ku70/Ku80c heterodimer is essential for the recovery of a functional somatic nucleus. Strikingly, at the molecular level, PiggyMac-dependent DNA cleavage is abolished at IES boundaries in cells depleted for Ku80c, resulting in IES retention in the somatic genome. PiggyMac and Ku70a/Ku80c co-purify as a complex when overproduced in a heterologous system. We conclude that Ku has been integrated in the Paramecium DNA cleavage

  4. INTEGRATED GENOME-BASED STUDIES OF SHEWANELLA ECOPHYSIOLOGY

    SciTech Connect

    NEALSON, KENNETH H.

    2013-10-15

    products of dissimilatory iron reduction. Geochim. Cosmochim. Acta. 74:574-583. 10. Karpinets, T.V., A.Y. Obraztsova, Y. Wang, D.D. Schmoyer, G.H. Kora, B.H. Park, M.H. Serres, M.F. Romine, M.L. Land, T.B. Kothe, J.K. Fredrickson, K.H. Nealson, and E.C. Uberbacher. 2010. Conserved synteny at the protein family level reveals genes underlying Shewanella species' cold tolerance and predicts their novel phenotypes. Funct. Integr. Genomics 10:97-110. (DOI 10.1007/s10143-009-0142-y) 11. Bretschger, O., A.C.M. Cheung, F. Mansfeld, and K.H. Nealson. 2010. Comparative microbial fuel cell evaluations of Shewanella spp. Electroanalysis 22:883-894. 12. McLean, J.S., G. Wanger, Y.A. Gorby, M. Wainstein, J. McQuaid, Shun'ichi Ishii, O. Bretschger, H. Beyanal, K.H. Nealson. 2010. Quantification of electron transfer rates to a solid phase electron acceptor through the stages of biofilm formation from single cells to multicellular communities. Env. Sci. Technol. 44:2721-2717. 13. El-Naggar, M., G. Wanger, K.M. Leung, T.D. Yuzvinsky, G. Southam, J. Yang, W.M. Lau, K.H. Nealson, and Y.A. Gorby. 2010. Electrical Transport Along Bacterial Nanowires from Shewanella oneidensis MR-1. Proc. Nat. Acad. Sci. USA 107:18127-18131. 14. Biffinger, J.C., L.A. Fitzgerald, R. Ray, B.J. Little, S.E. Lizewski, E.R. Petersen, B.R. Ringeisen, W.C. Sanders, P.E. Sheehan, J.J. Pietron, J.W. Baldwin, L.J. Nadeau, G.R. Johnson, M. Ribbens, S.E. Finkel, K.H. Nealson. 2010. The utility of Shewanella japonica for microbial fuel cells. Bioresource Technol. 102:290-297. 15. Rodionov, D., C. Yang, X. Li, I. Rodionova, Y. Wang, A.Y. Obraztsova, O.P. Zagnitko, R. Overbeek, M.F. Romine, S. Reed, J.K. Fredrickson, K.H. Nealson, A.L. Osterman. 2010. Genomic encyclopedia of sugar utilization pathways in the Shewanella genus. BMC Genomics 2010, 11:494. 16. Kan, J., L. Hsu, A.C.M. Cheung, M. Pirbazari, and K.H. Nealson. 2011. Current production by bacterial communities in microbial fuel cells enriched from wastewater sludge

  5. Genomic regions responsible for amenability to Agrobacterium-mediated transformation in barley

    PubMed Central

    Hisano, Hiroshi; Sato, Kazuhiro

    2016-01-01

    Different plant cultivars of the same genus and species can exhibit vastly different genetic transformation efficiencies. However, the genetic factors underlying these differences in transformation rate remain largely unknown. In barley, ‘Golden Promise’ is one of a few cultivars reliable for Agrobacterium-mediated transformation. By contrast, cultivar ‘Haruna Nijo’ is recalcitrant to genetic transformation. We identified genomic regions of barley important for successful transformation with Agrobacterium, utilizing the ‘Haruna Nijo’ × ‘Golden Promise’ F2 generation and genotyping by 124 genome-wide SNP markers. We observed significant segregation distortions of these markers from the expected 1:2:1 ratio toward the ‘Golden Promise’-type in regions of chromosomes 2H and 3H, indicating that the alleles of ‘Golden Promise’ in these regions might contribute to transformation efficiency. The same regions, which we termed Transformation Amenability (TFA) regions, were also conserved in transgenic F2 plants generated from a ‘Morex’ × ‘Golden Promise’ cross. The genomic regions identified herein likely include necessary factors for Agrobacterium-mediated transformation in barley. The potential to introduce these loci into any barley haplotype opens the door to increasing the efficiency of transforming target alleles by the TFA-based methods proposed in this report. PMID:27874056
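    Segregation distortion of an F2 marker from the expected 1:2:1 ratio, as described in this abstract, is commonly assessed with a chi-square goodness-of-fit test. The Python sketch below shows one way such a test could be computed; it is not the authors' analysis, the function name is hypothetical, and the genotype counts in the usage lines are invented purely for illustration.

```python
import math


def chi_square_1_2_1(n_aa, n_ab, n_bb):
    """Goodness-of-fit test of F2 genotype counts against a 1:2:1 ratio.

    Returns (statistic, p_value). With three genotype classes the test has
    2 degrees of freedom, for which the chi-square survival function is
    exactly exp(-x / 2).
    """
    observed = [n_aa, n_ab, n_bb]
    total = sum(observed)
    expected = [total * 0.25, total * 0.50, total * 0.25]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical counts for one marker (homozygous parent A, heterozygous,
# homozygous parent B); these are NOT data from the study.
stat_skewed, p_skewed = chi_square_1_2_1(10, 40, 50)
stat_even, p_even = chi_square_1_2_1(25, 50, 25)
print(stat_skewed, p_skewed)
print(stat_even, p_even)
```

    A small p-value for a marker suggests its genotype frequencies deviate from Mendelian expectation; when many markers are tested genome-wide, a multiple-testing correction would normally be applied as well.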

  6. Genome-wide signatures of male-mediated migration shaping the Indian gene pool.

    PubMed

    ArunKumar, GaneshPrasad; Tatarinova, Tatiana V; Duty, Jeff; Rollo, Debra; Syama, Adhikarla; Arun, Varatharajan Santhakumari; Kavitha, Valampuri John; Triska, Petr; Greenspan, Bennett; Wells, R Spencer; Pitchappan, Ramasamy

    2015-09-01

    Multiple questions relating to contributions of cultural and demographical factors in the process of human geographical dispersal remain largely unanswered. India, a land of early human settlement and the resulting diversity is a good place to look for some of the answers. In this study, we explored the genetic structure of India using a diverse panel of 78 males genotyped using the GenoChip. Their genome-wide single-nucleotide polymorphism (SNP) diversity was examined in the context of various covariates that influence Indian gene pool. Admixture analysis of genome-wide SNP data showed high proportion of the Southwest Asian component in all of the Indian samples. Hierarchical clustering based on admixture proportions revealed seven distinct clusters correlating to geographical and linguistic affiliations. Convex hull overlay of Y-chromosomal haplogroups on the genome-wide SNP principal component analysis brought out distinct non-overlapping polygons of F*-M89, H*-M69, L1-M27, O2a-M95 and O3a3c1-M117, suggesting a male-mediated migration and expansion of the Indian gene pool. Lack of similar correlation with mitochondrial DNA clades indicated a shared genetic ancestry of females. We suggest that ancient male-mediated migratory events and settlement in various regional niches led to the present day scenario and peopling of India.

  7. CTCF-Mediated Human 3D Genome Architecture Reveals Chromatin Topology for Transcription

    PubMed Central

    Tang, Zhonghui; Luo, Oscar Junhong; Li, Xingwang; Zheng, Meizhen; Zhu, Jacqueline Jufen; Szalaj, Przemyslaw; Trzaskoma, Pawel; Magalska, Adriana; Wlodarczyk, Jakub; Ruszczycki, Blazej; Michalski, Paul; Piecuch, Emaly; Wang, Ping; Wang, Danjuan; Tian, Simon Zhongyuan; Penrad-Mobayed, May; Sachs, Laurent M.; Ruan, Xiaoan; Wei, Chia-Lin; Liu, Edison T.; Wilczynski, Grzegorz M.; Plewczynski, Dariusz; Li, Guoliang; Ruan, Yijun

    2015-01-01

    Summary: Spatial genome organization and its effect on transcription remain a fundamental question. We applied an advanced ChIA-PET strategy to comprehensively map higher-order chromosome folding and specific chromatin interactions mediated by CTCF and RNAPII with haplotype specificity and nucleotide resolution in different human cell lineages. We find that CTCF/cohesin-mediated interaction anchors serve as structural foci for spatial organization of constitutive genes concordant with CTCF-motif orientation, whereas RNAPII interacts within these structures by selectively drawing cell-type-specific genes towards CTCF foci for coordinated transcription. Furthermore, we show that haplotype variants and allelic interactions have differential effects on chromosome configuration influencing gene expression and may provide mechanistic insights into functions associated with disease susceptibility. 3D-genome simulation suggests a model of chromatin folding around chromosomal axes, where CTCF is involved in defining the interface between condensed and open compartments for structural regulation. Our 3D-genome strategy thus provides unique insights into the topological mechanism of human variations and diseases. PMID:26686651

  8. VESPA: Software to Facilitate Genomic Annotation of Prokaryotic Organisms Through Integration of Proteomic and Transcriptomic Data

    SciTech Connect

    Peterson, Elena S.; McCue, Lee Ann; Rutledge, Alexandra C.; Jensen, Jeffrey L.; Walker, Julia; Kobold, Mark A.; Webb, Samantha R.; Payne, Samuel H.; Ansong, Charles; Adkins, Joshua N.; Cannon, William R.; Webb-Robertson, Bobbie-Jo M.

    2012-04-25

    Visual Exploration and Statistics to Promote Annotation (VESPA) is an interactive visual analysis software tool that facilitates the discovery of structural mis-annotations in prokaryotic genomes. VESPA integrates high-throughput peptide-centric proteomics data and oligo-centric or RNA-Seq transcriptomics data into a genomic context. The data may be interrogated via visual analysis across multiple levels of genomic resolution, linked searches, exports and interaction with BLAST to rapidly identify locations of interest within the genome and evaluate potential mis-annotations.

  9. An Integrated Review of Emoticons in Computer-Mediated Communication.

    PubMed

    Aldunate, Nerea; González-Ibáñez, Roberto

    2016-01-01

    Facial expressions constitute a rich source of non-verbal cues in face-to-face communication. They provide interlocutors with resources to express and interpret verbal messages, which may affect their cognitive and emotional processing. Contrarily, computer-mediated communication (CMC), particularly text-based communication, is limited to the use of symbols to convey a message, where facial expressions cannot be transmitted naturally. In this scenario, people use emoticons as paralinguistic cues to convey emotional meaning. Research has shown that emoticons contribute to a greater social presence as a result of the enrichment of text-based communication channels. Additionally, emoticons constitute a valuable resource for language comprehension by providing expressivity to text messages. The latter findings have been supported by studies in neuroscience showing that particular brain regions involved in emotional processing are also activated when people are exposed to emoticons. To reach an integrated understanding of the influence of emoticons in human communication on both socio-cognitive and neural levels, we review the literature on emoticons in three different areas. First, we present relevant literature on emoticons in CMC. Second, we study the influence of emoticons in language comprehension. Finally, we show the incipient research in neuroscience on this topic. This mini review reveals that, while there are plenty of studies on the influence of emoticons in communication from a social psychology perspective, little is known about the neurocognitive basis of the effects of emoticons on communication dynamics.

  10. An Integrated Review of Emoticons in Computer-Mediated Communication

    PubMed Central

    Aldunate, Nerea; González-Ibáñez, Roberto

    2017-01-01

    Facial expressions constitute a rich source of non-verbal cues in face-to-face communication. They provide interlocutors with resources to express and interpret verbal messages, which may affect their cognitive and emotional processing. Contrarily, computer-mediated communication (CMC), particularly text-based communication, is limited to the use of symbols to convey a message, where facial expressions cannot be transmitted naturally. In this scenario, people use emoticons as paralinguistic cues to convey emotional meaning. Research has shown that emoticons contribute to a greater social presence as a result of the enrichment of text-based communication channels. Additionally, emoticons constitute a valuable resource for language comprehension by providing expressivity to text messages. The latter findings have been supported by studies in neuroscience showing that particular brain regions involved in emotional processing are also activated when people are exposed to emoticons. To reach an integrated understanding of the influence of emoticons in human communication on both socio-cognitive and neural levels, we review the literature on emoticons in three different areas. First, we present relevant literature on emoticons in CMC. Second, we study the influence of emoticons in language comprehension. Finally, we show the incipient research in neuroscience on this topic. This mini review reveals that, while there are plenty of studies on the influence of emoticons in communication from a social psychology perspective, little is known about the neurocognitive basis of the effects of emoticons on communication dynamics. PMID:28111564

  11. CRISPR-mediated Genome Editing Restores Dystrophin Expression and Function in mdx Mice.

    PubMed

    Xu, Li; Park, Ki Ho; Zhao, Lixia; Xu, Jing; El Refaey, Mona; Gao, Yandi; Zhu, Hua; Ma, Jianjie; Han, Renzhi

    2016-03-01

    Duchenne muscular dystrophy (DMD) is a degenerative muscle disease caused by genetic mutations that lead to the disruption of dystrophin in muscle fibers. There is no curative treatment for this devastating disease. Clustered regularly interspaced short palindromic repeat/Cas9 (CRISPR/Cas9) has emerged as a powerful tool for genetic manipulation and potential therapy. Here we demonstrate that CRISPR-mediated genome editing efficiently excised a 23-kb genomic region on the X-chromosome covering the mutant exon 23 in a mouse model of DMD, and restored dystrophin expression and the dystrophin-glycoprotein complex at the sarcolemma of skeletal muscles in live mdx mice. Electroporation-mediated transfection of the Cas9/gRNA constructs in the skeletal muscles of mdx mice normalized the calcium sparks in response to osmotic shock. Adenovirus-mediated transduction of Cas9/gRNA greatly reduced the Evans blue dye uptake of skeletal muscles at rest and after downhill treadmill running. This study provides evidence for permanent gene correction in DMD.

  12. Mapping the telomere integrated genome of human herpesvirus 6A and 6B.

    PubMed

    Arbuckle, Jesse H; Pantry, Shara N; Medveczky, Maria M; Prichett, Joshua; Loomis, Kristin S; Ablashi, Dharam; Medveczky, Peter G

    2013-07-20

    Human herpesvirus 6B (HHV-6B) is the causative agent of roseola infantum. HHV-6A and 6B can reactivate in immunosuppressed individuals and are linked with severe inflammatory response, organ rejection and central nervous system diseases. About 0.85% of the US and UK population carries an integrated HHV-6 genome in all nucleated cells through germline transmission. We have previously reported that the HHV-6A genome integrated in telomeres of patients suffering from neurological dysfunction and also in telomeres of tissue culture cells. We now report that HHV-6B also integrates in telomeres during latency. Detailed mapping of the integrated viral genomes demonstrates that a single HHV-6 genome integrates and telomere repeats join the left end of the integrated viral genome. When HEK-293 cells carrying integrated HHV-6A were exposed to the histone deacetylase inhibitor Trichostatin A, circularization and/or formation of concatamers were detected and this assay could be used to distinguish between lytic replication and latency.

  13. Integrated consensus map of cultivated peanut and wild relatives reveals structures of the A and B genomes of Arachis and divergence of the legume genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complex, tetraploid genome structure of peanut (Arachis hypogaea) has obstructed advances in genetics and genomics in the species. The aim of this study is to understand the genome structure of Arachis by developing a high-density integrated consensus map. Three recombinant inbred line populatio...

  14. Exclusion of small terminase mediated DNA threading models for genome packaging in bacteriophage T4

    PubMed Central

    Gao, Song; Zhang, Liang; Rao, Venigalla B.

    2016-01-01

    Tailed bacteriophages and herpes viruses use powerful molecular machines to package their genomes. The packaging machine consists of three components: portal, motor (large terminase; TerL) and regulator (small terminase; TerS). Portal, a dodecamer, and motor, a pentamer, form two concentric rings at the special five-fold vertex of the icosahedral capsid. Powered by ATPase, the motor ratchets DNA into the capsid through the portal channel. TerS is essential for packaging, particularly for genome recognition, but its mechanism is unknown and controversial. Structures of gear-shaped TerS rings inspired models that invoke DNA threading through the central channel. Here, we report that mutations of basic residues that line phage T4 TerS (gp16) channel do not disrupt DNA binding. Even deletion of the entire channel helix retained DNA binding and produced progeny phage in vivo. On the other hand, large oligomers of TerS (11-mers/12-mers), but not small oligomers (trimers to hexamers), bind DNA. These results suggest that TerS oligomerization creates a large outer surface, which, but not the interior of the channel, is critical for function, probably to wrap viral genome around the ring during packaging initiation. Hence, models involving TerS-mediated DNA threading may be excluded as an essential mechanism for viral genome packaging. PMID:26984529

  15. CRISPR/Cas9-Mediated Genome Editing in Soybean Hairy Roots.

    PubMed

    Cai, Yupeng; Chen, Li; Liu, Xiujie; Sun, Shi; Wu, Cunxiang; Jiang, Bingjun; Han, Tianfu; Hou, Wensheng

    2015-01-01

    As a new technology for gene editing, the CRISPR (clustered regularly interspaced short palindromic repeat)/Cas (CRISPR-associated) system has been rapidly and widely used for genome engineering in various organisms. In the present study, we successfully applied the type II CRISPR/Cas9 system to generate and estimate genome editing in the desired target genes in soybean (Glycine max (L.) Merrill.). The single-guide RNA (sgRNA) and Cas9 cassettes were assembled on one vector to improve transformation efficiency, and we designed an sgRNA that targeted a transgene (bar) and six sgRNAs that targeted different sites of two endogenous soybean genes (GmFEI2 and GmSHR). The targeted DNA mutations were detected in soybean hairy roots. The results demonstrated that this customized CRISPR/Cas9 system shared the same efficiency for both endogenous and exogenous genes in soybean hairy roots. We also performed experiments to detect the potential of the CRISPR/Cas9 system to simultaneously edit two endogenous soybean genes using only one customized sgRNA. Overall, generating and detecting the CRISPR/Cas9-mediated genome modifications in target genes of soybean hairy roots could rapidly assess the efficiency of each target locus. The target sites with higher efficiencies can be used for regular soybean transformation. Furthermore, this method provides a powerful tool for root-specific functional genomics studies in soybean.

  16. Efficient RNA/Cas9-mediated genome editing in Xenopus tropicalis.

    PubMed

    Guo, Xiaogang; Zhang, Tiejun; Hu, Zheng; Zhang, Yanqi; Shi, Zhaoying; Wang, Qinhu; Cui, Yan; Wang, Fengqin; Zhao, Hui; Chen, Yonglong

    2014-02-01

    For the emerging amphibian genetic model Xenopus tropicalis, targeted gene disruption is dependent on zinc-finger nucleases (ZFNs) or transcription activator-like effector nucleases (TALENs), which require either complex design and selection or laborious construction. Thus, easy and efficient genome editing tools are still highly desirable for this species. Here, we report that RNA-guided Cas9 nuclease resulted in precise targeted gene disruption in all ten X. tropicalis genes that we analyzed, with efficiencies above 45% and readily up to 100%. Systematic point mutation analyses in two loci revealed that perfect matches between the spacer and the protospacer sequences proximal to the protospacer adjacent motif (PAM) were essential for Cas9 to cleave the target sites in the X. tropicalis genome. Further study showed that the Cas9 system could serve as an efficient tool for multiplexed genome engineering in Xenopus embryos. Analysis of the disruption of two genes, ptf1a/p48 and tyrosinase, indicated that Cas9-mediated gene targeting can facilitate direct phenotypic assessment in X. tropicalis embryos. Finally, five founder frogs from targeting of either elastase-T1, elastase-T2 or tyrosinase showed highly efficient transmission of targeted mutations into F1 embryos. Together, our data demonstrate that the Cas9 system is an easy, efficient and reliable tool for multiplex genome editing in X. tropicalis.
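    The observation that PAM-proximal spacer/protospacer matches are essential for cleavage can be illustrated with a simple sequence check. The Python sketch below is a toy rule, not the authors' method or a real off-target predictor: it assumes an NGG PAM lying 3' of the protospacer, and the seed length of 12 nt, the function names and the example sequences are all assumptions made for illustration (the abstract does not state how many PAM-proximal bases were required).

```python
def pam_proximal_mismatches(spacer, protospacer, seed_len=12):
    """Count mismatches in the seed_len bases nearest the PAM (the 3' end)."""
    if len(spacer) != len(protospacer):
        raise ValueError("spacer and protospacer must have equal length")
    seed_spacer = spacer.upper()[-seed_len:]
    seed_target = protospacer.upper()[-seed_len:]
    return sum(a != b for a, b in zip(seed_spacer, seed_target))


def passes_seed_rule(spacer, protospacer, pam, seed_len=12):
    """True if the PAM is NGG and the PAM-proximal seed matches perfectly."""
    has_ngg_pam = len(pam) == 3 and pam.upper()[1:] == "GG"
    return has_ngg_pam and pam_proximal_mismatches(spacer, protospacer, seed_len) == 0


# Invented 20-nt sequences, written 5'->3' with the PAM following the 3' end.
spacer = "GACGTTACCGGATTCAGTCA"
distal_mismatch = "TACGTTACCGGATTCAGTCA"    # differs at the PAM-distal end
proximal_mismatch = "GACGTTACCGGATTCAGTCT"  # differs next to the PAM

print(passes_seed_rule(spacer, spacer, "TGG"))
print(passes_seed_rule(spacer, distal_mismatch, "TGG"))
print(passes_seed_rule(spacer, proximal_mismatch, "TGG"))
```

    Under this toy rule a single PAM-distal mismatch is tolerated while a mismatch adjacent to the PAM is not, mirroring the qualitative finding reported above.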

  17. An Integrated Encyclopedia of DNA Elements in the Human Genome

    PubMed Central

    2012-01-01

    Summary: The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure, and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and an expansive resource of functional annotations for biomedical research. PMID:22955616

  18. An integrated encyclopedia of DNA elements in the human genome.

    PubMed

    2012-09-06

    The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

  19. Genomic landscape of human, bat, and ex vivo DNA transposon integrations.

    PubMed

    Campos-Sánchez, Rebeca; Kapusta, Aurélie; Feschotte, Cédric; Chiaromonte, Francesca; Makova, Kateryna D

    2014-07-01

    The integration and fixation preferences of DNA transposons, one of the major classes of eukaryotic transposable elements, have never been evaluated comprehensively on a genome-wide scale. Here, we present a detailed study of the distribution of DNA transposons in the human and bat genomes. We studied three groups of DNA transposons that integrated at different evolutionary times: 1) ancient (>40 My) and currently inactive human elements, 2) younger (<40 My) bat elements, and 3) ex vivo integrations of piggyBat and Sleeping Beauty elements in HeLa cells. Although the distribution of ex vivo elements reflected integration preferences, the distribution of human and (to a lesser extent) bat elements was also affected by selection. We used regression techniques (linear, negative binomial, and logistic regression models with multiple predictors) applied to 20-kb and 1-Mb windows to investigate how the genomic landscape in the vicinity of DNA transposons contributes to their integration and fixation. Our models indicate that genomic landscape explains 16-79% of variability in DNA transposon genome-wide distribution. Importantly, we not only confirmed previously identified predictors (e.g., DNA conformation and recombination hotspots) but also identified several novel predictors (e.g., signatures of double-strand breaks and telomere hexamer). Ex vivo integrations showed a bias toward actively transcribed regions. Older DNA transposons were located in genomic regions scarce in most conserved elements, likely reflecting purifying selection. Our study highlights how DNA transposons are integral to the evolution of bat and human genomes, and has implications for the development of DNA transposon assays for gene therapy and mutagenesis applications.

  20. Integrated genome-based studies of Shewanella ecophysiology

    SciTech Connect

    Segre Daniel; Beg Qasim

    2012-02-14

    This project was a component of the Shewanella Federation and, as such, contributed to the overall goal of applying genomic tools to better understand the eco-physiology and speciation of respiratory-versatile members of the Shewanella genus. Our role at Boston University was to perform bioreactor and high-throughput gene expression microarray experiments, and to combine dynamic flux balance modeling with experimentally obtained transcriptional and gene expression datasets from different growth conditions. In the first part of the project, we designed the S. oneidensis microarray probes for Affymetrix Inc. (based in California), then we identified the pathways of carbon utilization in the metal-reducing marine bacterium Shewanella oneidensis MR-1, using our newly designed high-density oligonucleotide Affymetrix microarray on Shewanella cells grown with various carbon sources. Next, using a combination of experimental and computational approaches, we built algorithms and methods to integrate the transcriptional and metabolic regulatory networks of S. oneidensis. Specifically, we combined mRNA microarray and metabolite measurements with statistical inference and dynamic flux balance analysis (dFBA) to study the transcriptional response of S. oneidensis MR-1 as it passes through exponential, stationary, and transition phases. By measuring time-dependent mRNA expression levels during batch growth of S. oneidensis MR-1 under two radically different nutrient compositions (minimal lactate and nutritionally rich LB medium), we obtain detailed snapshots of the regulatory strategies used by this bacterium to cope with gradually changing nutrient availability. In addition to traditional clustering, which provides a first indication of major regulatory trends and transcription factor activities, we developed and implemented a new computational approach for Dynamic Detection of Transcriptional Triggers (D2T2). This new method allows us to infer a putative topology of transcriptional dependencies

  1. ssODN-mediated knock-in with CRISPR-Cas for large genomic regions in zygotes

    PubMed Central

    Yoshimi, Kazuto; Kunihiro, Yayoi; Kaneko, Takehito; Nagahora, Hitoshi; Voigt, Birger; Mashimo, Tomoji

    2016-01-01

    The CRISPR-Cas system is a powerful tool for generating genetically modified animals; however, targeted knock-in (KI) via homologous recombination remains difficult in zygotes. Here we show efficient gene KI in rats by combining CRISPR-Cas with single-stranded oligodeoxynucleotides (ssODNs). First, a 1-kb ssODN co-injected with guide RNA (gRNA) and Cas9 messenger RNA produces GFP-KI at the rat Thy1 locus. Then, two gRNAs with two 80-bp ssODNs direct efficient integration of a 5.5-kb CAG-GFP vector into the Rosa26 locus via ssODN-mediated end joining. This protocol also achieves KI of a 200-kb BAC containing the human SIRPA locus, concomitantly knocking out the rat Sirpa gene. Finally, three gRNAs and two ssODNs replace 58 kb of the rat Cyp2d cluster with a 6.2-kb human CYP2D6 gene. These ssODN-mediated KI protocols can be applied to any target site with any donor vector without the need to construct homology arms, thus simplifying genome engineering in living organisms. PMID:26786405

  2. Integrated genomic approaches to enhance genetic resistance in chickens

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The chicken has led the way amongst agricultural animal species in infectious disease control and, in particular, selection for genetic resistance. The generation of the chicken genome sequence and the availability of other empowering tools and resources greatly enhance the ability to select for enh...

  3. INTEGRATED GENOME-BASED STUDIES OF SHEWANELLA ECOPHYSIOLOGY

    SciTech Connect

    TIEDJE, JAMES M; KONSTANTINIDIS, KOSTAS; WORDEN, MARK

    2014-01-08

    The aim of the work reported is to study Shewanella population genomics, and to understand the evolution, ecophysiology, and speciation of Shewanella. The tasks supporting this aim are: to study genetic and ecophysiological bases defining the core and diversification of Shewanella species; to determine gene content patterns along redox gradients; and to investigate the evolutionary processes, patterns and mechanisms of Shewanella.

  4. Integrated genomics of Mucorales reveals novel therapeutic targets

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Mucormycosis is a life-threatening infection caused by Mucorales fungi. We sequenced 30 fungal genomes and performed transcriptomics with three representative Rhizopus and Mucor strains with human airway epithelial cells during fungal invasion to reveal key host and fungal determinants contributing ...

  5. An Integrated Genetic and Cytogenetic Map of the Cucumber Genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The Cucurbitaceae includes important crops such as cucumber, melon, watermelon, squash, and pumpkin. However, few genetic and genomic resources are available for plant improvement. Some cucurbit species such as cucumber have a narrow genetic base, which impedes construction of saturated molecular li...

  6. FEATnotator: A tool for integrated annotation of sequence features and variation, facilitating interpretation in genomics experiments.

    PubMed

    Podicheti, Ram; Mockaitis, Keithanne

    2015-06-01

    As approaches are sought for more efficient and democratized uses of non-model and expanded model genomics references, ease of integration of genomic feature datasets is especially desirable in multidisciplinary research communities. Valuable conclusions are often missed or slowed when researchers refer experimental results to a single reference sequence that lacks integrated pan-genomic and multi-experiment data in accessible formats. Association of genomic positional information, such as results from an expansive variety of next-generation sequencing experiments, with annotated reference features such as genes or predicted protein binding sites, provides the context essential for conclusions and ongoing research. When the experimental system includes polymorphic genomic inputs, rapid calculation of gene structural and protein translational effects of sequence variation from the reference can be invaluable. Here we present FEATnotator, a lightweight, fast and easy to use open source software program that integrates and reports overlap and proximity in genomic information from any user-defined datasets including those from next generation sequencing applications. We illustrate use of the tool by summarizing whole genome sequence variation of a widely used natural isolate of Arabidopsis thaliana in the context of gene models of the reference accession. Previous discovery of a protein coding deletion influencing root development is replicated rapidly. Appropriate even in investigations of a single gene or genic regions such as QTL, comprehensive reports provided by FEATnotator better prepare researchers for interpretation of their experimental results. The tool is available for download at http://featnotator.sourceforge.net.

  7. An integrated functional genomics approach identifies the regulatory network directed by brachyury (T) in chordoma.

    PubMed

    Nelson, Andrew C; Pillay, Nischalan; Henderson, Stephen; Presneau, Nadège; Tirabosco, Roberto; Halai, Dina; Berisha, Fitim; Flicek, Paul; Stemple, Derek L; Stern, Claudio D; Wardle, Fiona C; Flanagan, Adrienne M

    2012-11-01

    Chordoma is a rare malignant tumour of bone, the molecular marker of which is the expression of the transcription factor, brachyury. Having recently demonstrated that silencing brachyury induces growth arrest in a chordoma cell line, we now seek to identify its downstream target genes. Here we use an integrated functional genomics approach involving shRNA-mediated brachyury knockdown, gene expression microarray, ChIP-seq experiments, and bioinformatics analysis to achieve this goal. We confirm that the T-box binding motif of human brachyury is identical to that found in mouse, Xenopus, and zebrafish development, and that brachyury acts primarily as an activator of transcription. Using human chordoma samples for validation purposes, we show that brachyury binds 99 direct targets and indirectly influences the expression of 64 other genes, thereby acting as a master regulator of an elaborate oncogenic transcriptional network encompassing diverse signalling pathways including components of the cell cycle, and extracellular matrix components. Given the wide repertoire of its active binding and the relative specific localization of brachyury to the tumour cells, we propose that an RNA interference-based gene therapy approach is a plausible therapeutic avenue worthy of investigation.

  8. A physical map of the highly heterozygous Populus genome: integration with the genome sequence and genetic map

    SciTech Connect

    Kelleher, Colin; CHIU, Dr. R.; Shin, Dr. H.; Krywinski, Martin; Fjell, Chris; Wilkin, Jennifer; Yin, Tongming; Difazio, Stephen P.

    2007-01-01

    As part of a larger project to sequence the Populus genome and generate genomic resources for this emerging model tree, we constructed a physical map of the Populus genome, representing one of the few such maps of an undomesticated, highly heterozygous plant species. The physical map, consisting of 2802 contigs, was constructed from fingerprinted bacterial artificial chromosome (BAC) clones. The map represents approximately 9.4-fold coverage of the Populus genome, which has been estimated from the genome sequence assembly to be 485 ± 10 Mb in size. BAC ends were sequenced to assist long-range assembly of whole-genome shotgun sequence scaffolds and to anchor the physical map to the genome sequence. Simple sequence repeat-based markers were derived from the end sequences and used to initiate integration of the BAC and genetic maps. A total of 2411 physical map contigs, representing 97% of all clones assigned to contigs, were aligned to the sequence assembly (JGI Populus trichocarpa, version 1.0). These alignments represent a total coverage of 384 Mb (79%) of the entire poplar sequence assembly and 295 Mb (96%) of linkage group sequence assemblies. A striking result of the physical map contig alignments to the sequence assembly was the co-localization of multiple contigs across numerous regions of the 19 linkage groups. Targeted sequencing of BAC clones and genetic analysis in a small number of representative regions showed that these co-aligning contigs represent distinct haplotypes in the heterozygous individual sequenced, and revealed the nature of these haplotype sequence differences.

  9. Integration sites of Epstein-Barr virus genome on chromosomes of human lymphoblastoid cell lines

    SciTech Connect

    Wuu, K.D.; Chen, Y.J.; Wang-Wuu, S.

    1994-09-01

    Epstein-Barr virus (EBV) is the pathogen of infectious mononucleosis. The viral genome is present in more than 95% of the African cases of Burkitt lymphoma and it is usually maintained in episomal form in the tumor cells. Viral integration has been described only for Namalwa, which is a Burkitt lymphoma cell line lacking episomes. In order to examine the role of EBV in the immortalization of human B lymphocytes, we investigated whether EBV integration into the human genome is essential. If integration does occur, we would like to know whether it is randomly distributed or whether the viral DNA integrates preferentially at certain sites. Fourteen in vitro immortalized human lymphoblastoid cell lines (LCLs) were examined by fluorescence in situ hybridization (FISH) with a biotinylated EBV BamHI W DNA fragment as probe. The episomal form of EBV DNA was found in all cells of these cell lines, while only about 65% of the cells have the integrated viral DNA. This might suggest that integration is not a prerequisite for cell immortalization. Although all chromosomes, except Y, have been found with integrated viral genome, chromosomes 1 and 5 are the most frequent EBV DNA carriers (p<0.05). Nine chromosome bands, namely, 1p31, 1q31, 2q32, 3q13, 3q26, 5q14, 6q24, 7q31 and 12q21, are preferential targets for EBV integration (p<0.001). Eighty percent of the total 938 EBV hybridization signals were found to be in G-band-positive areas. This suggests that the mechanism of EBV integration might be different from that of the retroviruses, which specifically integrate into G-band-negative areas. Thus, we conclude that the integration of EBV into the host genome is non-random and may be related to chromosome structure and DNA sequence.

  10. GMATA: An Integrated Software Package for Genome-Scale SSR Mining, Marker Development and Viewing.

    PubMed

    Wang, Xuewen; Wang, Le

    2016-01-01

    Simple sequence repeats (SSRs), also referred to as microsatellites, are highly variable tandem DNAs that are widely used as genetic markers. The increasing availability of whole-genome and transcript sequences provides information resources for SSR marker development. However, efficient software is required to identify and display SSR information along with other gene features at a genome scale. We developed a novel software package, the Genome-wide Microsatellite Analyzing Tool Package (GMATA), which integrates SSR mining, statistical analysis and plotting, marker design, polymorphism screening and marker transferability, and enables simultaneous display of SSR markers with other genome features. GMATA applies novel strategies for SSR analysis and primer design in large genomes, which allows GMATA to perform faster calculation and provide more accurate results than existing tools. Our package is also capable of processing DNA sequences of any size on a standard computer. GMATA is user friendly, requires only mouse clicks or typed inputs on the command line, and is executable on multiple computing platforms. We demonstrated the application of GMATA in plant genomes and revealed a novel distribution pattern of SSRs in 15 grass genomes. The most abundant motifs are dimer GA/TC, the A/T monomer and the GCG/CGC trimer, rather than the rich G/C content in DNA sequence. We also revealed that SSR count is linearly related to chromosome length in fully assembled grass genomes. GMATA represents a powerful application tool that facilitates genomic sequence analyses. GMATA is freely available at http://sourceforge.net/projects/gmata/?source=navbar.
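
    The idea of SSR mining described in this record can be illustrated with a minimal Python sketch. This is not GMATA's algorithm, which the abstract does not describe; it is a plain regular-expression scan for perfect tandem repeats. The function name find_ssrs and the thresholds (motifs of 1 to 6 bases, at least 5 copies) are assumptions chosen for illustration only.

```python
import re

def find_ssrs(seq, min_motif=1, max_motif=6, min_repeats=5):
    """Return (start, motif, copies) for perfect tandem repeats in seq.

    Hypothetical helper for illustration; thresholds are assumed defaults,
    not values taken from GMATA.
    """
    seq = seq.upper()
    # A motif of min_motif..max_motif bases (shortest tried first, so the
    # minimal repeat unit is reported) followed by at least
    # min_repeats - 1 further copies of itself via the \1 backreference.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits

# Toy sequence: six copies of GA, then seven A's.
example = "TT" + "GA" * 6 + "CC" + "A" * 7 + "GT"
for start, motif, copies in find_ssrs(example):
    print(start, motif, copies)
```

    Counting the hits per chromosome with such a scan and plotting them against chromosome length is one simple way to look for the kind of linear relationship the abstract reports.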

  11. Childhood Acute Lymphoblastic Leukemia: Integrating Genomics into Therapy

    PubMed Central

    Tasian, Sarah K; Loh, Mignon L; Hunger, Stephen P

    2015-01-01

    Acute lymphoblastic leukemia (ALL), the most common malignancy of childhood, is a genetically complex entity that remains a major cause of childhood cancer-related mortality. Major advances in genomic and epigenomic profiling during the past decade have appreciably enhanced knowledge of the biology of de novo and relapsed ALL and have facilitated more precise risk stratification of patients. These achievements have also provided critical insights regarding potentially targetable lesions for development of new therapeutic approaches in the era of precision medicine. This review delineates the current genetic landscape of childhood ALL with emphasis upon patient outcomes with contemporary treatment regimens, as well as therapeutic implications of newly identified genomic alterations in specific subsets of ALL. PMID:26194091

  12. SOP for pathway inference in Integrated Microbial Genomes (IMG).

    PubMed

    Anderson, Iain; Chen, Amy; Markowitz, Victor; Kyrpides, Nikos; Ivanova, Natalia

    2011-12-31

    One of the most important aspects of genomic analysis is the prediction of which pathways, both metabolic and non-metabolic, are present in an organism. In IMG, this is carried out by the assignment of IMG terms, which are organized into IMG pathways. Based on manual and automatic assignment of IMG terms, the presence or absence of IMG pathways is automatically inferred. The three categories of pathway assertion are asserted (likely present), not asserted (likely absent), and unknown. In the unknown category, at least one term necessary for the pathway is missing, but an ortholog in another organism has the corresponding term assigned to it. Automatic pathway inference is an important initial step in genome analysis.
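
    The three assertion categories in this record can be sketched as a small decision rule in Python. This is an illustrative reading of the abstract, not the IMG implementation: the function name classify_pathway and the set-based inputs are hypothetical, and the sketch assumes a pathway is "unknown" only when every missing term is assigned to an ortholog in another organism.

```python
def classify_pathway(required_terms, genome_terms, ortholog_terms):
    """Classify one pathway for one genome (hypothetical sketch).

    required_terms: terms the pathway needs
    genome_terms:   terms assigned to genes in this genome
    ortholog_terms: terms assigned to orthologs of this genome's genes
                    in other organisms
    """
    missing = set(required_terms) - set(genome_terms)
    if not missing:
        return "asserted"        # likely present: every required term found
    if missing <= set(ortholog_terms):
        return "unknown"         # missing terms are assigned to orthologs elsewhere
    return "not asserted"        # likely absent

# Toy example with made-up term labels.
pathway = {"term_A", "term_B", "term_C"}
print(classify_pathway(pathway, {"term_A", "term_B", "term_C"}, set()))
print(classify_pathway(pathway, {"term_A", "term_B"}, {"term_C"}))
print(classify_pathway(pathway, {"term_A"}, {"term_C"}))
```

    A real pathway definition may include alternative or optional terms, which this sketch deliberately ignores.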

  13. Integrated metabolomics and phytochemical genomics approaches for studies on rice.

    PubMed

    Okazaki, Yozo; Saito, Kazuki

    2016-01-01

    Metabolomics is widely employed to monitor the cellular metabolic state and assess the quality of plant-derived foodstuffs because it can be used to manage datasets that include a wide range of metabolites in their analytical samples. In this review, we discuss metabolomics research on rice in order to elucidate the overall regulation of the metabolism as it is related to the growth and mechanisms of adaptation to genetic modifications and environmental stresses such as fungal infections, submergence, and oxidative stress. We also focus on phytochemical genomics studies based on a combination of metabolomics and quantitative trait locus (QTL) mapping techniques. In addition to starch, rice produces many metabolites that also serve as nutrients for human consumers. The outcomes of recent phytochemical genomics studies of diverse natural rice resources suggest there is potential for using further effective breeding strategies to improve the quality of ingredients in rice grains.

  14. NDRG1 links p53 with proliferation-mediated centrosome homeostasis and genome stability.

    PubMed

    Croessmann, Sarah; Wong, Hong Yuen; Zabransky, Daniel J; Chu, David; Mendonca, Janet; Sharma, Anup; Mohseni, Morassa; Rosen, D Marc; Scharpf, Robert B; Cidado, Justin; Cochran, Rory L; Parsons, Heather A; Dalton, W Brian; Erlanger, Bracha; Button, Berry; Cravero, Karen; Kyker-Snowman, Kelly; Beaver, Julia A; Kachhap, Sushant; Hurley, Paula J; Lauring, Josh; Park, Ben Ho

    2015-09-15

    The tumor protein 53 (TP53) tumor suppressor gene is the most frequently somatically altered gene in human cancers. Here we show expression of N-Myc down-regulated gene 1 (NDRG1) is induced by p53 during physiologic low proliferative states, and mediates centrosome homeostasis, thus maintaining genome stability. When placed in physiologic low-proliferating conditions, human TP53 null cells fail to increase expression of NDRG1 compared with isogenic wild-type controls and TP53 R248W knockin cells. Overexpression and RNA interference studies demonstrate that NDRG1 regulates centrosome number and amplification. Mechanistically, NDRG1 physically associates with γ-tubulin, a key component of the centrosome, with reduced association in p53 null cells. Strikingly, TP53 homozygous loss was mutually exclusive of NDRG1 overexpression in over 96% of human cancers, supporting the broad applicability of these results. Our study elucidates a mechanism of how TP53 loss leads to abnormal centrosome numbers and genomic instability mediated by NDRG1.

  15. Integrated genomic analyses of de novo pathways underlying atypical meningiomas

    PubMed Central

    Harmancı, Akdes Serin; Youngblood, Mark W.; Clark, Victoria E.; Coşkun, Süleyman; Henegariu, Octavian; Duran, Daniel; Erson-Omay, E. Zeynep; Kaulen, Leon D.; Lee, Tong Ihn; Abraham, Brian J.; Simon, Matthias; Krischek, Boris; Timmer, Marco; Goldbrunner, Roland; Omay, S. Bülent; Baranoski, Jacob; Baran, Burçin; Carrión-Grant, Geneive; Bai, Hanwen; Mishra-Gorur, Ketu; Schramm, Johannes; Moliterno, Jennifer; Vortmeyer, Alexander O.; Bilgüvar, Kaya; Yasuno, Katsuhito; Young, Richard A.; Günel, Murat

    2017-01-01

    Meningiomas are mostly benign brain tumours, with a potential for becoming atypical or malignant. On the basis of comprehensive genomic, transcriptomic and epigenomic analyses, we compared benign meningiomas to atypical ones. Here, we show that the majority of primary (de novo) atypical meningiomas display loss of NF2, which co-occurs either with genomic instability or recurrent SMARCB1 mutations. These tumours harbour increased H3K27me3 signal and a hypermethylated phenotype, mainly occupying the polycomb repressive complex 2 (PRC2) binding sites in human embryonic stem cells, thereby phenocopying a more primitive cellular state. Consistent with this observation, atypical meningiomas exhibit upregulation of EZH2, the catalytic subunit of the PRC2 complex, as well as the E2F2 and FOXM1 transcriptional networks. Importantly, these primary atypical meningiomas do not harbour TERT promoter mutations, which have been reported in atypical tumours that progressed from benign ones. Our results establish the genomic landscape of primary atypical meningiomas and potential therapeutic targets. PMID:28195122

  16. RNA Interference Is Responsible for Reduction of Transgene Expression after Sleeping Beauty Transposase Mediated Somatic Integration

    PubMed Central

    Rauschhuber, Christina; Ehrhardt, Anja

    2012-01-01

    Background: Integrating non-viral vectors based on transposable elements are widely used for genetically engineering mammalian cells in functional genomics and therapeutic gene transfer. For the Sleeping Beauty (SB) transposase system it was demonstrated that convergent transcription driven by the SB transposase inverted repeats (IRs) in eukaryotic cells occurs after somatic integration. This could lead to formation of double-stranded RNAs potentially presenting targets for the RNA interference (RNAi) machinery and subsequently resulting in silencing of the transgene. Therefore, we aimed at investigating transgene expression upon transposition under RNA interference knockdown conditions. Principal Findings: To establish RNAi knockdown cell lines we took advantage of the P19 protein, which is derived from the tomato bushy stunt virus. P19 binds and inhibits 21-nucleotide-long small interfering RNAs and was shown to sufficiently suppress RNAi. We found that transgene expression upon SB mediated transposition was enhanced, resulting in a 3.2-fold increased amount of colony forming units (CFU) after transposition. In contrast, if the transgene cassette is insulated from the influence of chromosomal position effects by the chicken-derived cHS4 insulating sequences or when applying the Frog Prince transposon system, which displays only negligible transcriptional activity, similar numbers of CFUs were obtained. Conclusion: In summary, we provide evidence for the first time that after somatic integration transposon derived transgene expression is regulated by the endogenous RNAi machinery. In the future this finding will help to further improve the molecular design of the SB transposase vector system. PMID:22570690

  17. HIV-1 Integrates Widely throughout the Genome of the Human Blood Fluke Schistosoma mansoni

    PubMed Central

    Mann, Victoria H.; Dubrovsky, Larisa; Yan, Hong-bin; Huckvale, Thomas; Protasio, Anna V.; Pushkarsky, Tatiana; Iordanskiy, Sergey; Bukrinsky, Michael I.

    2016-01-01

    Schistosomiasis is the most important helminthic disease of humanity in terms of morbidity and mortality. Facile manipulation of schistosomes using lentiviruses would enable advances in functional genomics in these and related neglected tropical disease pathogens including tapeworms, and including their non-dividing cells. Such approaches have hitherto been unavailable. Blood stream forms of the human blood fluke, Schistosoma mansoni, the causative agent of hepatointestinal schistosomiasis, were infected with the human HIV-1 isolate NL4-3 pseudotyped with vesicular stomatitis virus glycoprotein. The appearance of strong stop and positive strand cDNAs indicated that virions fused to schistosome cells, the nucleocapsid was internalized and the RNA genome was reverse transcribed. Anchored PCR analysis, sequencing of HIV-1-specific anchored Illumina libraries and Whole Genome Sequencing (WGS) of schistosomes confirmed chromosomal integration; >8,000 integrations were mapped, distributed throughout the eight pairs of chromosomes including the sex chromosomes. The rate of integrations in the genome exceeded five per 1,000 kb and HIV-1 integrated into protein-encoding loci and elsewhere with integration bias dissimilar to that of human T cells. We estimated ~2,100 integrations per schistosomulum based on WGS, i.e. about two or three events per cell, comparable to integration rates in human cells. Accomplishment in schistosomes of post-entry processes essential for HIV-1 replication, including integrase-catalyzed integration, was remarkable given the phylogenetic distance between schistosomes and primates, the natural hosts of the genus Lentivirus. These enigmatic findings revealed that HIV-1 was active within cells of S. mansoni, and provided the first demonstration that HIV-1 can integrate into the genome of an invertebrate. PMID:27764257

  18. HIV-1 Integrates Widely throughout the Genome of the Human Blood Fluke Schistosoma mansoni.

    PubMed

    Suttiprapa, Sutas; Rinaldi, Gabriel; Tsai, Isheng J; Mann, Victoria H; Dubrovsky, Larisa; Yan, Hong-Bin; Holroyd, Nancy; Huckvale, Thomas; Durrant, Caroline; Protasio, Anna V; Pushkarsky, Tatiana; Iordanskiy, Sergey; Berriman, Matthew; Bukrinsky, Michael I; Brindley, Paul J

    2016-10-01

    Schistosomiasis is the most important helminthic disease of humanity in terms of morbidity and mortality. Facile manipulation of schistosomes using lentiviruses would enable advances in functional genomics in these and related neglected tropical disease pathogens including tapeworms, and including their non-dividing cells. Such approaches have hitherto been unavailable. Blood stream forms of the human blood fluke, Schistosoma mansoni, the causative agent of hepatointestinal schistosomiasis, were infected with the human HIV-1 isolate NL4-3 pseudotyped with vesicular stomatitis virus glycoprotein. The appearance of strong stop and positive strand cDNAs indicated that virions fused to schistosome cells, the nucleocapsid was internalized and the RNA genome was reverse transcribed. Anchored PCR analysis, sequencing of HIV-1-specific anchored Illumina libraries and Whole Genome Sequencing (WGS) of schistosomes confirmed chromosomal integration; >8,000 integrations were mapped, distributed throughout the eight pairs of chromosomes including the sex chromosomes. The rate of integrations in the genome exceeded five per 1,000 kb and HIV-1 integrated into protein-encoding loci and elsewhere with integration bias dissimilar to that of human T cells. We estimated ~2,100 integrations per schistosomulum based on WGS, i.e. about two or three events per cell, comparable to integration rates in human cells. Accomplishment in schistosomes of post-entry processes essential for HIV-1 replication, including integrase-catalyzed integration, was remarkable given the phylogenetic distance between schistosomes and primates, the natural hosts of the genus Lentivirus. These enigmatic findings revealed that HIV-1 was active within cells of S. mansoni, and provided the first demonstration that HIV-1 can integrate into the genome of an invertebrate.

  19. Genomic characterization of viral integration sites in HPV-related cancers.

    PubMed

    Bodelon, Clara; Untereiner, Michael E; Machiela, Mitchell J; Vinokurova, Svetlana; Wentzensen, Nicolas

    2016-11-01

    Persistent infection with carcinogenic human papillomaviruses (HPV) causes the majority of anogenital cancers and a subset of head and neck cancers. The HPV genome is frequently found integrated into the host genome of invasive cancers. The mechanisms of how it may promote disease progression are not well understood. Thoroughly characterizing integration events can provide insights into HPV carcinogenesis. Individual studies have reported a limited number of integration sites in cell lines and human samples. We performed a systematic review of published integration sites in HPV-related cancers and conducted a pooled analysis to formally test for integration hotspots and genomic features enriched in integration events using data from the Encyclopedia of DNA Elements (ENCODE). Over 1,500 integration sites were reported in the literature, of which 90.8% (N = 1,407) were in human tissues. We found 10 cytobands enriched for integration events, three previously reported ones (3q28, 8q24.21 and 13q22.1) and seven additional ones (2q22.3, 3p14.2, 8q24.22, 14q24.1, 17p11.1, 17q23.1 and 17q23.2). Cervical infections with HPV18 were more likely to have breakpoints in 8q24.21 (p = 7.68 × 10^-4) than those with HPV16. Overall, integration sites were more likely to be in gene regions than expected by chance (p = 6.93 × 10^-9). They were also significantly closer to CpG regions, fragile sites, transcriptionally active regions and enhancers. Few integration events occurred within 50 Kb of known cervical cancer driver genes. This suggests that HPV integrates in accessible regions of the genome, preferentially genes and enhancers, which may affect the expression of target genes.

  20. Integrating Genomic Resources with Electronic Health Records using the HL7 Infobutton Standard

    PubMed Central

    Overby, Casey Lynnette; Del Fiol, Guilherme; Rubinstein, Wendy S.; Maglott, Donna R.; Nelson, Tristan H.; Milosavljevic, Aleksandar; Martin, Christa L.; Goehringer, Scott R.; Freimuth, Robert R.; Williams, Marc S.

    2016-01-01

    Background: The Clinical Genome Resource (ClinGen) Electronic Health Record (EHR) Workgroup aims to integrate ClinGen resources with EHRs. A promising option to enable this integration is through the Health Level Seven (HL7) Infobutton Standard. EHR systems that are certified according to the US Meaningful Use program provide HL7-compliant infobutton capabilities, which can be leveraged to support clinical decision-making in genomics. Objectives: To integrate genomic knowledge resources using the HL7 infobutton standard. Two tactics to achieve this objective were: (1) creating an HL7-compliant search interface for ClinGen, and (2) proposing guidance for genomic resources on achieving HL7 Infobutton standard accessibility and compliance. Methods: We built a search interface utilizing OpenInfobutton, an open source reference implementation of the HL7 Infobutton standard. ClinGen resources were assessed for readiness towards HL7 compliance. Finally, based upon our experiences we provide recommendations for publishers seeking to achieve HL7 compliance. Results: Eight genomic resources and two sub-resources were integrated with the ClinGen search engine via OpenInfobutton and the HL7 infobutton standard. Resources we assessed have varying levels of readiness towards HL7-compliance. Furthermore, we found that adoption of standard terminologies used by EHR systems is the main gap to achieve compliance. Conclusion: Genomic resources can be integrated with EHR systems via the HL7 Infobutton standard using OpenInfobutton. Full compliance of genomic resources with the Infobutton standard would further enhance interoperability with EHR systems. PMID:27579472
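For readers unfamiliar with the standard, the sketch below shows roughly how an EHR might assemble a URL-based Infobutton request. The parameter names follow the HL7 URL-based implementation guide; the base URL, the helper name `build_infobutton_url`, and the example concept code are hypothetical and are not taken from the ClinGen or OpenInfobutton deployments described in the abstract.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; a real deployment publishes its own base URL.
BASE_URL = "https://example.org/infobutton/search"

def build_infobutton_url(code, code_system_oid, display_name, task="PROBLISTREV"):
    """Assemble a URL-based Infobutton knowledge request."""
    params = {
        "mainSearchCriteria.v.c": code,              # concept code
        "mainSearchCriteria.v.cs": code_system_oid,  # terminology OID
        "mainSearchCriteria.v.dn": display_name,     # human-readable label
        "taskContext.c.c": task,                     # EHR task in progress
        "informationRecipient": "PROV",              # request made by a provider
    }
    return BASE_URL + "?" + urlencode(params)

# Illustrative concept coded in SNOMED CT (OID 2.16.840.1.113883.6.96).
url = build_infobutton_url("190905008", "2.16.840.1.113883.6.96", "Cystic fibrosis")
print(url)
```

The request carries a coded concept from a standard terminology, which is why the abstract identifies terminology adoption as the main gap for genomic resources.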

  1. Multiplex genomic walking: Integration of the wet lab and computer lab into a single prototyping environment

    SciTech Connect

    Gillevet, P.M.

    1993-12-31

    The authors are presently sequencing the entire genome of Mycoplasma capricolum, one of the smallest of free living organisms, by a Multiplex Genomic Walking strategy. This technique involves the repetitive hybridization of sequencing membranes with oligonucleotide probes to acquire sequence data in discrete steps along the genome. The technique allows one to walk a genome in a directed manner, eliminating the problems associated with random shotgun assembly. Furthermore, the repetitive stripping and hybridization process is relatively simple to reproduce and has the potential to be easily automated. The Genetic Data Environment (GDE), an X Windows based Graphic User Interface, has allowed the seamless integration of a core multiple sequence editor with pre-existing external sequence analysis programs and internally developed programs into a single prototypic environment. This system has facilitated linkage of the Harvard Genome Lab's internal database and automated data control systems into one Graphic User Interface which can handle the archiving and analysis of both random fluorescent sequencing data and genomic walking data from the Mycoplasma project. Finally, it has facilitated the integration of the Genomic sequence data into a PROLOG database environment for the comparative analysis of Mycoplasma capricolum and other organisms.

  2. Gene context analysis in the Integrated Microbial Genomes (IMG) data management system.

    PubMed

    Mavromatis, Konstantinos; Chu, Ken; Ivanova, Natalia; Hooper, Sean D; Markowitz, Victor M; Kyrpides, Nikos C

    2009-11-24

    Computational methods for determining the function of genes in newly sequenced genomes have been traditionally based on sequence similarity to genes whose function has been identified experimentally. Function prediction methods can be extended using gene context analysis approaches such as examining the conservation of chromosomal gene clusters, gene fusion events and co-occurrence profiles across genomes. Context analysis is based on the observation that functionally related genes often have similar gene context and relies on the identification of such events across a phylogenetically diverse collection of genomes. We have used the data management system of the Integrated Microbial Genomes (IMG) as the framework to implement and explore the power of gene context analysis methods because it provides one of the largest available genome integrations. Visualization and search tools to facilitate gene context analysis have been developed and applied across all publicly available archaeal and bacterial genomes in IMG. These computations are now maintained as part of IMG's regular genome content update cycle. IMG is available at: http://img.jgi.doe.gov.
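One of the context signals named above, co-occurrence profiles across genomes, can be sketched in a few lines. This is a minimal illustration that assumes each gene family is represented by the set of genomes in which it occurs. The Jaccard score, the family names, and the genome labels are all placeholders and do not describe how IMG implements the analysis.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets of genome identifiers."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Placeholder occurrence profiles: gene family -> genomes that carry it.
profiles = {
    "famA": {"g1", "g2", "g3", "g5"},
    "famB": {"g1", "g2", "g3"},
    "famC": {"g4", "g6"},
}

def rank_partners(query, profiles):
    """Rank other families by how similar their profiles are to the query."""
    scores = {
        name: jaccard(profiles[query], genomes)
        for name, genomes in profiles.items()
        if name != query
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

print(rank_partners("famA", profiles))
```

Families that are gained and lost together across many genomes score highly, which is the basis for inferring a functional link. The signal is only meaningful when the genome collection is phylogenetically diverse, as the abstract notes.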

  3. Databases and information integration for the Medicago truncatula genome and transcriptome.

    PubMed

    Cannon, Steven B; Crow, John A; Heuer, Michael L; Wang, Xiaohong; Cannon, Ethalinda K S; Dwan, Christopher; Lamblin, Anne-Francoise; Vasdewani, Jayprakash; Mudge, Joann; Cook, Andrew; Gish, John; Cheung, Foo; Kenton, Steve; Kunau, Timothy M; Brown, Douglas; May, Gregory D; Kim, Dongjin; Cook, Douglas R; Roe, Bruce A; Town, Chris D; Young, Nevin D; Retzel, Ernest F

    2005-05-01

    An international consortium is sequencing the euchromatic genespace of Medicago truncatula. Extensive bioinformatic and database resources support the marker-anchored bacterial artificial chromosome (BAC) sequencing strategy. Existing physical and genetic maps and deep BAC-end sequencing help to guide the sequencing effort, while EST databases provide essential resources for genome annotation as well as transcriptome characterization and microarray design. Finished BAC sequences are joined into overlapping sequence assemblies and undergo an automated annotation process that integrates ab initio predictions with EST, protein, and other recognizable features. Because of the sequencing project's international and collaborative nature, data production, storage, and visualization tools are broadly distributed. This paper describes databases and Web resources for the project, which provide support for physical and genetic maps, genome sequence assembly, gene prediction, and integration of EST data. A central project Web site at medicago.org/genome provides access to genome viewers and other resources project-wide, including an Ensembl implementation at medicago.org, physical map and marker resources at mtgenome.ucdavis.edu, and genome viewers at the University of Oklahoma (www.genome.ou.edu), the Institute for Genomic Research (www.tigr.org), and Munich Information for Protein Sequences Center (mips.gsf.de).

  4. Integration of genomic medicine into pathology residency training: the Stanford open curriculum.

    PubMed

    Schrijver, Iris; Natkunam, Yasodha; Galli, Stephen; Boyd, Scott D

    2013-03-01

    Next-generation sequencing methods provide an opportunity for molecular pathology laboratories to perform genomic testing that is far more comprehensive than single-gene analyses. Genome-based test results are expected to develop into an integral component of diagnostic clinical medicine and to provide the basis for individually tailored health care. To achieve these goals, rigorous interpretation of high-quality data must be informed by the medical history and the phenotype of the patient. The discipline of pathology is well positioned to implement genome-based testing and to interpret its results, but new knowledge and skills must be included in the training of pathologists to develop expertise in this area. Pathology residents should be trained in emerging technologies to integrate genomic test results appropriately with more traditional testing, to accelerate clinical studies using genomic data, and to help develop appropriate standards of data quality and evidence-based interpretation of these test results. We have created a genomic pathology curriculum as a first step in helping pathology residents build a foundation for the understanding of genomic medicine and its implications for clinical practice. This curriculum is freely accessible online.

  5. Gene context analysis in the Integrated Microbial Genomes (IMG) data management system

    SciTech Connect

    Mavromatis, Konstantinos; Chu, Ken; Ivanova, Natalia; Hooper, Sean D.; Markowitz, Victor M.; Kyrpides, Nikos C.

    2009-05-01

    Computational methods for determining the function of genes in newly sequenced genomes have been traditionally based on sequence similarity to genes whose function has been identified experimentally. Function prediction methods can be extended using gene context analysis approaches such as examining the conservation of chromosomal gene clusters, gene fusion events and co-occurrence profiles across genomes. Context analysis is based on the observation that functionally related genes often have similar gene context and relies on the identification of such events across a statistically significant and phylogenetically diverse collection of genomes. We have used the data management system of the Integrated Microbial Genomes (IMG) as the framework to implement and explore the power of gene context analysis methods because it provides one of the largest available genome integrations. Visualization and search tools to facilitate and explore gene context analysis have been developed and applied across all publicly available archaeal and bacterial genomes in IMG. These computations are now maintained as part of IMG's regular genome content update cycle. IMG is available at: http://img.jgi.doe.gov.

  6. Figure 5 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Cancer.gov

    Split-Screen View. The split-screen view is useful for exploring relationships of genomic features that are independent of chromosomal location. Color is used here to indicate mate pairs that map to different chromosomes, chromosomes 1 and 6, suggesting a translocation event. Adapted from Figure 8; Thorvaldsdottir H et al. 2012

  7. Figure 2 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Cancer.gov

    Grouping and sorting genomic data in IGV. The IGV user interface displaying 202 glioblastoma samples from TCGA. Samples are grouped by tumor subtype (second annotation column) and data type (first annotation column) and sorted by copy number of the EGFR locus (middle column). Adapted from Figure 1; Robinson et al. 2011

  8. Figure 4 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Cancer.gov

    Gene-list view of genomic data. The gene-list view allows users to compare data across a set of loci. The data in this figure includes copy number, mutation, and clinical data from 202 glioblastoma samples from TCGA. Adapted from Figure 7; Thorvaldsdottir H et al. 2012

  9. Comparison of 432 Pseudomonas strains through integration of genomic, functional, metabolic and expression data

    PubMed Central

    Koehorst, Jasper J.; van Dam, Jesse C. J.; van Heck, Ruben G. A.; Saccenti, Edoardo; dos Santos, Vitor A. P. Martins; Suarez-Diez, Maria; Schaap, Peter J.

    2016-01-01

    Pseudomonas is a highly versatile genus containing species that can be harmful to humans and plants while others are widely used for bioengineering and bioremediation. We analysed 432 sequenced Pseudomonas strains by integrating results from a large scale functional comparison using protein domains with data from six metabolic models, nearly a thousand transcriptome measurements and four large scale transposon mutagenesis experiments. Through heterogeneous data integration we linked gene essentiality, persistence and expression variability. The pan-genome of Pseudomonas is closed, indicating a limited role of horizontal gene transfer in the evolutionary history of this genus. A large fraction of essential genes are highly persistent; nevertheless, non-essential genes represent a considerable fraction of the core-genome. Our results emphasize the power of integrating large scale comparative functional genomics with heterogeneous data for exploring bacterial diversity and versatility. PMID:27922098

  10. Exploration of Genomic, Proteomic, and Histopathological Image Data Integration Methods for Clinical Prediction

    PubMed Central

    Poruthoor, A.; Phan, J.H.; Kothari, S.; Wang, May D.

    2016-01-01

    The emergence of large multi-platform and multi-scale data repositories in biomedicine has enabled the exploration of data integration for holistic decision making. In this research, we investigate multi-modal genomic, proteomic, and histopathological image data integration for prediction of ovarian cancer clinical endpoints in The Cancer Genome Atlas (TCGA). Specifically, we study two data integration techniques, simple data concatenation and ensemble classification, to determine whether they can improve prediction of ovarian cancer grade or patient survival. Results indicate that integration via ensemble classification is more effective than simple data concatenation. We also highlight several key factors impacting data integration outcome such as predictability of endpoint, class prevalence, and unbalanced representation of features from different data modalities.
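The two integration techniques compared above can be contrasted with a toy sketch. It uses a nearest-centroid classifier purely for brevity; the abstract does not say which classifiers the authors used, and the feature values, labels, and function names below are invented for illustration.

```python
from statistics import mean

def fit_centroids(rows, labels):
    """Return {label: centroid} for a nearest-centroid classifier."""
    centroids = {}
    for lab in set(labels):
        members = [r for r, l in zip(rows, labels) if l == lab]
        centroids[lab] = [mean(col) for col in zip(*members)]
    return centroids

def predict(centroids, row):
    """Return the label whose centroid is closest to the row."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda lab: dist2(centroids[lab]))

def concat_predict(train_mods, labels, test_mods):
    """Simple concatenation: join all modalities, then fit one model."""
    train = [sum(parts, []) for parts in zip(*train_mods)]
    return predict(fit_centroids(train, labels), sum(test_mods, []))

def ensemble_predict(train_mods, labels, test_mods):
    """Ensemble: fit one model per modality, then take a majority vote."""
    votes = [predict(fit_centroids(tr, labels), te)
             for tr, te in zip(train_mods, test_mods)]
    return max(sorted(set(votes)), key=votes.count)

# Invented toy data: three modalities, four training samples each.
genomic = [[0.1, 0.2], [0.0, 0.3], [0.9, 0.8], [1.0, 0.7]]
proteomic = [[5.0], [4.8], [1.0], [1.2]]
image = [[0.2], [0.1], [0.8], [0.9]]
labels = ["low", "low", "high", "high"]
sample = [[0.95, 0.75], [1.1], [0.85]]  # one row per modality

print(concat_predict([genomic, proteomic, image], labels, sample))
print(ensemble_predict([genomic, proteomic, image], labels, sample))
```

In the concatenated model, a modality with many or large-valued features can dominate the distance calculation, whereas the vote gives each modality equal weight. This relates to the unbalanced feature representation that the abstract lists as a factor affecting integration outcome.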

  11. CASFISH: CRISPR/Cas9-mediated in situ labeling of genomic loci in fixed cells

    PubMed Central

    Deng, Wulan; Shi, Xinghua; Tjian, Robert; Lionnet, Timothée; Singer, Robert H.

    2015-01-01

    Direct visualization of genomic loci in the 3D nucleus is important for understanding the spatial organization of the genome and its association with gene expression. Various DNA FISH methods have been developed in the past decades, all involving denaturing dsDNA and hybridizing fluorescent nucleic acid probes. Here we report a novel approach that uses in vitro constituted nuclease-deficient clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) complexes as probes to label sequence-specific genomic loci fluorescently without global DNA denaturation (Cas9-mediated fluorescence in situ hybridization, CASFISH). Using fluorescently labeled nuclease-deficient Cas9 (dCas9) protein assembled with various single-guide RNA (sgRNA), we demonstrated rapid and robust labeling of repetitive DNA elements in pericentromere, centromere, G-rich telomere, and coding gene loci. Assembling dCas9 with an array of sgRNAs tiling arbitrary target loci, we were able to visualize nonrepetitive genomic sequences. The dCas9/sgRNA binary complex is stable and binds its target DNA with high affinity, allowing sequential or simultaneous probing of multiple targets. CASFISH assays using differently colored dCas9/sgRNA complexes allow multicolor labeling of target loci in cells. In addition, the CASFISH assay is remarkably rapid under optimal conditions and is applicable for detection in primary tissue sections. This rapid, robust, less disruptive, and cost-effective technology adds a valuable tool for basic research and genetic diagnosis. PMID:26324940

  12. High-throughput genomic mapping of vector integration sites in gene therapy studies.

    PubMed

    Beard, Brian C; Adair, Jennifer E; Trobridge, Grant D; Kiem, Hans-Peter

    2014-01-01

    Gene therapy has enormous potential to treat a variety of infectious and genetic diseases. To date hundreds of patients worldwide have received hematopoietic cell products that have been gene-modified with retrovirus vectors carrying therapeutic transgenes, and many patients have been cured or demonstrated disease stabilization as a result (Adair et al., Sci Transl Med 4:133ra57, 2012; Biffi et al., Science 341:1233158, 2013; Aiuti et al., Science 341:1233151, 2013; Fischer et al., Gene 525:170-173, 2013). Unfortunately, for some patients the provirus integration dysregulated the expression of nearby genes leading to clonal outgrowth and, in some cases, cancer. Thus, the unwanted side effect of insertional mutagenesis has become a major concern for retrovirus gene therapy. The careful study of retrovirus integration sites (RIS) and the contribution of individual gene-modified clones to hematopoietic repopulating cells is of crucial importance for all gene therapy studies. Supporting this, the US Food and Drug Administration (FDA) has mandated the careful monitoring of RIS in all clinical trials of gene therapy. An invaluable method was developed: linear amplification mediated-polymerase chain reaction (LAM-PCR) capable of analyzing in vitro and complex in vivo samples, capturing valuable genomic information directly flanking the site of provirus integration. Linking this method and similar methods to high-throughput sequencing has now made possible an unprecedented understanding of the integration profile of various retrovirus vectors, and allows for sensitive monitoring of their safety. It also allows for a detailed comparison of improved safety-enhanced gene therapy vectors. An important readout of safety is the relative contribution of individual gene-modified repopulating clones. One limitation of LAM-PCR is that the ability to capture the relative contribution of individual clones is compromised because of the initial linear PCR common to all current methods

  13. From integrative genomics to systems genetics in the rat to link genotypes to phenotypes

    PubMed Central

    Moreno-Moral, Aida

    2016-01-01

    Complementary to traditional gene mapping approaches used to identify the hereditary components of complex diseases, integrative genomics and systems genetics have emerged as powerful strategies to decipher the key genetic drivers of molecular pathways that underlie disease. Broadly speaking, integrative genomics aims to link cellular-level traits (such as mRNA expression) to the genome to identify their genetic determinants. With the characterization of several cellular-level traits within the same system, the integrative genomics approach evolved into a more comprehensive study design, called systems genetics, which aims to unravel the complex biological networks and pathways involved in disease, and in turn map their genetic control points. The first fully integrated systems genetics study was carried out in rats, and the results, which revealed conserved trans-acting genetic regulation of a pro-inflammatory network relevant to type 1 diabetes, were translated to humans. Many studies using different organisms subsequently stemmed from this example. The aim of this Review is to describe the most recent advances in the fields of integrative genomics and systems genetics applied in the rat, with a focus on studies of complex diseases ranging from inflammatory to cardiometabolic disorders. We aim to provide the genetics community with a comprehensive insight into how the systems genetics approach came to life, starting from the first integrative genomics strategies [such as expression quantitative trait loci (eQTLs) mapping] and concluding with the most sophisticated gene network-based analyses in multiple systems and disease states. Although not limited to studies that have been directly translated to humans, we will focus particularly on the successful investigations in the rat that have led to primary discoveries of genes and pathways relevant to human disease. PMID:27736746

  14. Genome-wide RNAi screen reveals ALK1 mediates LDL uptake and transcytosis in endothelial cells

    PubMed Central

    Kraehling, Jan R.; Chidlow, John H.; Rajagopal, Chitra; Sugiyama, Michael G.; Fowler, Joseph W.; Lee, Monica Y.; Zhang, Xinbo; Ramírez, Cristina M.; Park, Eon Joo; Tao, Bo; Chen, Keyang; Kuruvilla, Leena; Larrivée, Bruno; Folta-Stogniew, Ewa; Ola, Roxana; Rotllan, Noemi; Zhou, Wenping; Nagle, Michael W.; Herz, Joachim; Williams, Kevin Jon; Eichmann, Anne; Lee, Warren L.; Fernández-Hernando, Carlos; Sessa, William C.

    2016-01-01

    In humans and animals lacking functional LDL receptor (LDLR), LDL from plasma still readily traverses the endothelium. To identify the pathways of LDL uptake, a genome-wide RNAi screen was performed in endothelial cells and cross-referenced with GWAS-data sets. Here we show that the activin-like kinase 1 (ALK1) mediates LDL uptake into endothelial cells. ALK1 binds LDL with lower affinity than LDLR and saturates only at hypercholesterolemic concentrations. ALK1 mediates uptake of LDL into endothelial cells via an unusual endocytic pathway that diverts the ligand from lysosomal degradation and promotes LDL transcytosis. The endothelium-specific genetic ablation of Alk1 in Ldlr-KO animals leads to less LDL uptake into the aortic endothelium, showing its physiological role in endothelial lipoprotein metabolism. In summary, identification of pathways mediating LDLR-independent uptake of LDL may provide unique opportunities to block the initiation of LDL accumulation in the vessel wall or augment hepatic LDLR-dependent clearance of LDL. PMID:27869117

  15. RecQ Helicases: Conserved Guardians of Genomic Integrity.

    PubMed

    Larsen, Nicolai Balle; Hickson, Ian D

    2013-01-01

    The RecQ family of DNA helicases is highly conserved throughout evolution, and is important for the maintenance of genome stability. In humans, five RecQ family members have been identified: BLM, WRN, RECQ4, RECQ1 and RECQ5. Defects in three of these give rise to Bloom's syndrome (BLM), Werner's syndrome (WRN) and Rothmund-Thomson/RAPADILINO/Baller-Gerold (RECQ4) syndromes. These syndromes are characterised by cancer predisposition and/or premature ageing. In this review, we focus on the roles of BLM and its S. cerevisiae homologue, Sgs1, in genome maintenance. BLM/Sgs1 has been shown to play a critical role in homologous recombination at multiple steps, including end-resection, displacement loop formation, branch migration and double Holliday junction dissolution. In addition, recent evidence has revealed a role for BLM/Sgs1 in the stabilisation and repair of replication forks damaged during a perturbed S-phase. Finally, BLM also plays a role in the suppression and/or resolution of ultra-fine anaphase DNA bridges that form between sister-chromatids during mitosis.

  16. Integrated genome-wide chromatin occupancy and expression analyses identify key myeloid pro-differentiation transcription factors repressed by Myb.

    PubMed

    Zhao, Liang; Glazov, Evgeny A; Pattabiraman, Diwakar R; Al-Owaidi, Faisal; Zhang, Ping; Brown, Matthew A; Leo, Paul J; Gonda, Thomas J

    2011-06-01

    To gain insight into the mechanisms by which the Myb transcription factor controls normal hematopoiesis and, particularly, how it contributes to leukemogenesis, we mapped the genome-wide occupancy of Myb by chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-Seq) in ERMYB myeloid progenitor cells. By integrating the genome occupancy data with whole genome expression profiling data, we identified a Myb-regulated transcriptional program. Gene signatures for leukemia stem cells, normal hematopoietic stem/progenitor cells and myeloid development were overrepresented in 2368 Myb regulated genes. Of these, Myb bound directly near or within 793 genes. Myb directly activates some genes known to be critical in maintaining hematopoietic stem cells, such as Gfi1 and Cited2. Importantly, we also show that, despite being usually considered as a transactivator, Myb also functions to repress approximately half of its direct targets, including several key regulators of myeloid differentiation, such as Sfpi1 (also known as Pu.1), Runx1, Junb and Cebpb. Furthermore, our results demonstrate that interaction with p300, an established coactivator for Myb, is unexpectedly required for Myb-mediated transcriptional repression. We propose that the repression of the above-mentioned key pro-differentiation factors may contribute essentially to Myb's ability to suppress differentiation and promote self-renewal, thus maintaining progenitor cells in an undifferentiated state and promoting leukemic transformation.
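The integration step described above, combining occupancy data with expression changes to separate directly activated from directly repressed targets, amounts to a set intersection followed by a sign check. The sketch below assumes a positive log2 fold change means the gene goes up when Myb is active. The gene names and directions come from the abstract, but the numeric values, the unbound placeholder `GeneX`, and the function name are invented for illustration.

```python
def classify_direct_targets(bound_genes, expression_change):
    """Split regulated genes that are also bound into activated and repressed.

    bound_genes: genes with a ChIP-Seq peak near or within them.
    expression_change: {gene: log2 fold change} for regulated genes only.
    """
    activated, repressed = set(), set()
    for gene, log2fc in expression_change.items():
        if gene not in bound_genes:
            continue  # regulated but not bound: an indirect target
        (activated if log2fc > 0 else repressed).add(gene)
    return activated, repressed

# Illustrative inputs; the fold-change magnitudes are made up.
bound = {"Gfi1", "Cited2", "Sfpi1", "Runx1", "Junb", "Cebpb"}
changes = {
    "Gfi1": 1.4, "Cited2": 0.9,
    "Sfpi1": -1.1, "Runx1": -0.8, "Junb": -1.6, "Cebpb": -1.2,
    "GeneX": 2.0,  # hypothetical regulated gene with no Myb peak
}

activated, repressed = classify_direct_targets(bound, changes)
print(sorted(activated), sorted(repressed))
```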

  17. Fluorescent reporters for markerless genomic integration in Staphylococcus aureus

    PubMed Central

    de Jong, Nienke W. M.; van der Horst, Thijs; van Strijp, Jos A. G.; Nijland, Reindert

    2017-01-01

    We present integration vectors for Staphylococcus aureus encoding the fluorescent reporters mAmetrine, CFP, sGFP, YFP, mCherry and mKate. The expression is driven either from the sarA-P1 promoter or from any other promoter of choice. The reporter can be inserted markerless in the chromosome of a wide range of S. aureus strains. The integration site chosen does not disrupt any open reading frame, provides good expression, and has no detectable effect on the strain's physiology. As an intermediate construct, we present a set of replicating plasmids containing the same fluorescent reporters. In these reporter plasmids, too, the sarA-P1 promoter can be replaced by any other promoter of interest for expression studies. Cassettes from the replicating plasmids can be readily swapped with the integration vector. With these constructs it becomes possible to monitor reporters of separate fluorescent wavelengths simultaneously. PMID:28266573

  18. Homologous recombination maintenance of genome integrity during DNA damage tolerance

    PubMed Central

    Prado, Félix

    2014-01-01

    The DNA strand exchange protein Rad51 provides a safe mechanism for the repair of DNA breaks using the information of a homologous DNA template. Homologous recombination (HR) also plays a key role in the response to DNA damage that impairs the advance of the replication forks by providing mechanisms to circumvent the lesion and fill in the tracks of single-stranded DNA that are generated during the process of lesion bypass. These activities postpone repair of the blocking lesion to ensure that DNA replication is completed in a timely manner. Experimental evidence generated over the last few years indicates that HR participates in this DNA damage tolerance response together with additional error-free (template switch) and error-prone (translesion synthesis) mechanisms through intricate connections, which are presented here. The choice between repair and tolerance, and the mechanism of tolerance, is critical to avoid increased mutagenesis and/or genome rearrangements, which are both hallmarks of cancer. PMID:27308329

  19. Plant Genome DataBase Japan (PGDBj): A Portal Website for the Integration of Plant Genome-Related Databases

    PubMed Central

    Asamizu, Erika; Ichihara, Hisako; Nakaya, Akihiro; Nakamura, Yasukazu; Hirakawa, Hideki; Ishii, Takahiro; Tamura, Takuro; Fukami-Kobayashi, Kaoru; Nakajima, Yukari; Tabata, Satoshi

    2014-01-01

    The Plant Genome DataBase Japan (PGDBj, http://pgdbj.jp/?ln=en) is a portal website that aims to integrate plant genome-related information from databases (DBs) and the literature. The PGDBj is comprised of three component DBs and a cross-search engine, which provides a seamless search over the contents of the DBs. The three DBs are as follows. (i) The Ortholog DB, providing gene cluster information based on the amino acid sequence similarity. Over 500,000 amino acid sequences of 20 Viridiplantae species were subjected to reciprocal BLAST searches and clustered. Sequences from plant genome DBs (e.g. TAIR10 and RAP-DB) were also included in the cluster with a direct link to the original DB. (ii) The Plant Resource DB, integrating the SABRE DB, which provides cDNA and genome sequence resources accumulated and maintained in the RIKEN BioResource Center and National BioResource Projects. (iii) The DNA Marker DB, providing manually or automatically curated information of DNA markers, quantitative trait loci and related linkage maps, from the literature and external DBs. As the PGDBj targets various plant species, including model plants, algae, and crops important as food, fodder and biofuel, researchers in the field of basic biology as well as a wide range of agronomic fields are encouraged to perform searches using DNA sequences, gene names, traits and phenotypes of interest. The PGDBj will return the search results from the component DBs and various types of linked external DBs. PMID:24363285

  20. Multiplex CRISPR/Cas9-based genome engineering enhanced by Drosha-mediated sgRNA-shRNA structure.

    PubMed

    Yan, Qiang; Xu, Kun; Xing, Jiani; Zhang, Tingting; Wang, Xin; Wei, Zehui; Ren, Chonghua; Liu, Zhongtian; Shao, Simin; Zhang, Zhiying

    2016-12-12

    The clustered regularly interspaced short palindromic repeats (CRISPR) system has recently been developed into a powerful genome-editing technology, as it requires only two key components (Cas9 protein and sgRNA) to function and further enables multiplex genome targeting and homology-directed repair (HDR) based precise genome editing in a wide variety of organisms. Here, we report a novel strategy that uses the Drosha-mediated sgRNA-shRNA structure to direct Cas9 for multiplex genome targeting and precise genome editing. In the multiplex genome targeting assay, we achieved more than 9% simultaneous mutation efficiency for 3 genomic loci among the puromycin-selected cell clones. By introducing the shRNA against the DNA ligase IV gene (LIG4) into the sgRNA-shRNA construct, the HDR-based precise genome editing efficiency was improved more than 2-fold. Our work provides a useful tool for multiplex and precise genome modification in mammalian cells.

  1. Multiplex CRISPR/Cas9-based genome engineering enhanced by Drosha-mediated sgRNA-shRNA structure

    PubMed Central

    Yan, Qiang; Xu, Kun; Xing, Jiani; Zhang, Tingting; Wang, Xin; Wei, Zehui; Ren, Chonghua; Liu, Zhongtian; Shao, Simin; Zhang, Zhiying

    2016-01-01

    The clustered regularly interspaced short palindromic repeats (CRISPR) system has recently been developed into a powerful genome-editing technology, as it requires only two key components (Cas9 protein and sgRNA) to function and further enables multiplex genome targeting and homology-directed repair (HDR) based precise genome editing in a wide variety of organisms. Here, we report a novel strategy that uses the Drosha-mediated sgRNA-shRNA structure to direct Cas9 for multiplex genome targeting and precise genome editing. In the multiplex genome targeting assay, we achieved more than 9% simultaneous mutation efficiency for 3 genomic loci among the puromycin-selected cell clones. By introducing the shRNA against the DNA ligase IV gene (LIG4) into the sgRNA-shRNA construct, the HDR-based precise genome editing efficiency was improved more than 2-fold. Our work provides a useful tool for multiplex and precise genome modification in mammalian cells. PMID:27941919

  2. Various applications of TALEN- and CRISPR/Cas9-mediated homologous recombination to modify the Drosophila genome.

    PubMed

    Yu, Zhongsheng; Chen, Hanqing; Liu, Jiyong; Zhang, Hongtao; Yan, Yan; Zhu, Nannan; Guo, Yawen; Yang, Bo; Chang, Yan; Dai, Fei; Liang, Xuehong; Chen, Yixu; Shen, Yan; Deng, Wu-Min; Chen, Jianming; Zhang, Bo; Li, Changqing; Jiao, Renjie

    2014-04-15

    Modifying the genomes of many organisms is becoming as easy as manipulating DNA in test tubes, which is made possible by two recently developed techniques based on either the customizable DNA binding protein, TALEN, or the CRISPR/Cas9 system. Here, we describe a series of efficient applications derived from these two technologies, in combination with various homologous donor DNA plasmids, to manipulate the Drosophila genome: (1) to precisely generate genomic deletions; (2) to make genomic replacement of a DNA fragment at single nucleotide resolution; and (3) to generate precise insertions to tag target proteins for tracing their endogenous expressions. For more convenient genomic manipulations, we established an easy-to-screen platform by knocking in a white marker through homologous recombination. Further, we provided a strategy to remove the unwanted duplications generated during the "ends-in" recombination process. Our results also indicate that TALEN and CRISPR/Cas9 had comparable efficiency in mediating genomic modifications through HDR (homology-directed repair); either TALEN or the CRISPR/Cas9 system could efficiently mediate in vivo replacement of DNA fragments of up to 5 kb in Drosophila, providing an ideal genetic tool for functional annotations of the Drosophila genome.

  3. p53 isoform Δ133p53 promotes efficiency of induced pluripotent stem cells and ensures genomic integrity during reprogramming

    PubMed Central

    Gong, Lu; Pan, Xiao; Chen, Haide; Rao, Lingjun; Zeng, Yelin; Hang, Honghui; Peng, Jinrong; Xiao, Lei; Chen, Jun

    2016-01-01

    Human induced pluripotent stem (iPS) cells have great potential in regenerative medicine, but this depends on the integrity of their genomes. iPS cells have been found to contain a large number of de novo genetic alterations due to the DNA damage response during reprogramming. Thus, maintaining the genetic stability of iPS cells is an important goal in iPS cell technology. The DNA damage response can trigger activation of the tumor suppressor p53, which ensures genome integrity of reprogramming cells by inducing apoptosis and senescence. The p53 isoform Δ133p53 is a p53 target gene and functions not only to antagonize p53-mediated apoptosis, but also to promote DNA double-strand break (DSB) repair. Here we report that Δ133p53 is induced in reprogramming. Knockdown of Δ133p53 results in a 2-fold decrease in reprogramming efficiency and a 4-fold increase in chromosomal aberrations, whereas overexpression of Δ133p53 with the 4 Yamanaka factors shows a 4-fold increase in reprogramming efficiency and a 2-fold decrease in chromosomal aberrations, compared to those in iPS cells induced only with the 4 Yamanaka factors. Overexpression of Δ133p53 can inhibit cell apoptosis and promote DNA DSB repair foci formation during reprogramming. Our findings demonstrate that the overexpression of Δ133p53 not only enhances reprogramming efficiency, but also results in better genetic quality in iPS cells. PMID:27874035

  4. Genomic integration of the full-length dystrophin coding sequence in Duchenne muscular dystrophy induced pluripotent stem cells.

    PubMed

    Farruggio, Alfonso P; Bhakta, Mital S; du Bois, Haley; Ma, Julia; Calos, Michele P

    2017-04-01

    Plasmid vectors that express the full-length human dystrophin coding sequence in human cells were developed. Dystrophin, the protein mutated in Duchenne muscular dystrophy, is extraordinarily large, providing challenges for cloning and plasmid production in Escherichia coli. The authors expressed dystrophin from the strong, widely expressed CAG promoter, along with co-transcribed luciferase and mCherry marker genes useful for tracking plasmid expression. Introns were added at the 3' and 5' ends of the dystrophin sequence to prevent translation in E. coli, resulting in improved plasmid yield. Stability and yield were further improved by employing a lower-copy number plasmid origin of replication. The dystrophin plasmids also carried an attB site recognized by phage phiC31 integrase, enabling the plasmids to be integrated into the human genome at preferred locations by phiC31 integrase. The authors demonstrated single-copy integration of plasmid DNA into the genome and production of human dystrophin in the human 293 cell line, as well as in induced pluripotent stem cells derived from a patient with Duchenne muscular dystrophy. Plasmid-mediated dystrophin expression was also demonstrated in mouse muscle. The dystrophin expression plasmids described here will be useful in cell and gene therapy studies aimed at ameliorating Duchenne muscular dystrophy.

  5. Enhancing the specificity of recombinase-mediated genome engineering through dimer interface redesign.

    PubMed

    Gaj, Thomas; Sirk, Shannon J; Tingle, Ryan D; Mercer, Andrew C; Wallen, Mark C; Barbas, Carlos F

    2014-04-02

    Despite recent advances in genome engineering made possible by the emergence of site-specific endonucleases, there remains a need for tools capable of specifically delivering genetic payloads into the human genome. Hybrid recombinases based on activated catalytic domains derived from the resolvase/invertase family of serine recombinases fused to Cys2-His2 zinc-finger or TAL effector DNA-binding domains are a class of reagents capable of achieving this. The utility of these enzymes, however, has been constrained by their low overall targeting specificity, largely due to the formation of side-product homodimers capable of inducing off-target modifications. Here, we combine rational design and directed evolution to re-engineer the serine recombinase dimerization interface and generate a recombinase architecture that reduces formation of these undesirable homodimers by >500-fold. We show that these enhanced recombinases demonstrate substantially improved targeting specificity in mammalian cells and achieve rates of site-specific integration similar to those previously reported for site-specific nucleases. Additionally, we show that enhanced recombinases exhibit low toxicity and promote the delivery of the human coagulation factor IX and α-galactosidase genes into endogenous genomic loci with high specificity. These results provide a general means for improving hybrid recombinase specificity by protein engineering and illustrate the potential of these enzymes for basic research and therapeutic applications.

  6. Integration of complete chloroplast genome sequences with small amplicon datasets improves phylogenetic resolution in Acacia.

    PubMed

    Williams, Anna V; Miller, Joseph T; Small, Ian; Nevill, Paul G; Boykin, Laura M

    2016-03-01

    Combining whole genome data with previously obtained amplicon sequences has the potential to increase the resolution of phylogenetic analyses, particularly at low taxonomic levels or where recent divergence, rapid speciation or slow genome evolution has resulted in limited sequence variation. However, the integration of these types of data for large scale phylogenetic studies has rarely been investigated. Here we conduct a phylogenetic analysis of the whole chloroplast genome and two nuclear ribosomal loci for 65 Acacia species from across the most recent Acacia phylogeny. We then combine these data with previously generated amplicon sequences (four chloroplast loci and two nuclear ribosomal loci) for 508 Acacia species. We use several phylogenetic methods, including maximum likelihood bootstrapping (with and without constraint) and ExaBayes, in order to determine the success of combining a dataset of 4000 bp with one of 189,000 bp. The results of our study indicate that the inclusion of whole genome data gave a far better resolved and well supported representation of the phylogenetic relationships within Acacia than using only amplicon sequences, with the greatest support observed when using a whole genome phylogeny as a constraint on the amplicon sequences. Our study therefore provides methods for optimal integration of genomic and amplicon sequences.

  7. An integrated CRISPR Bombyx mori genome editing system with improved efficiency and expanded target sites.

    PubMed

    Ma, Sanyuan; Liu, Yue; Liu, Yuanyuan; Chang, Jiasong; Zhang, Tong; Wang, Xiaogang; Shi, Run; Lu, Wei; Xia, Xiaojuan; Zhao, Ping; Xia, Qingyou

    2017-02-09

    Genome editing has opened unprecedented opportunities for targeted genomic engineering of a wide variety of organisms, ranging from microbes, plants, and animals to even human embryos. The serial establishment and rapid application of genome editing tools have significantly accelerated Bombyx mori (B. mori) research in recent years. However, the only CRISPR system available in B. mori was the commonly used SpCas9, which recognizes only target sites containing an NGG PAM sequence. In the present study, we first improved the efficiency of our previously established SpCas9 system by 3.5-fold. The improved efficiency was also observed at several loci in both BmNs cells and B. mori embryos. Then, to expand the target sites, we showed that two newly discovered CRISPR systems, SaCas9 and AsCpf1, could also induce highly efficient site-specific genome editing in BmNs cells, and we constructed an integrated CRISPR system. Genome-wide analysis of targetable sites was further conducted and showed that the integrated system covers 69,144,399 sites in the B. mori genome, and one site could be found in every 6.5 bp. The efficiency and resolution of this CRISPR platform will probably accelerate both fundamental research and applied studies in B. mori, and perhaps other insects.
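    The targetable-site count described in this abstract can be illustrated with a minimal sketch that scans both strands of a sequence for PAM motifs. Only the NGG motif for SpCas9 comes from the abstract; the SaCas9 (NNGRRT) and AsCpf1 (TTTN) motifs, the function names, and the counting rule (every overlapping PAM occurrence on either strand) are assumptions for illustration and may differ from the authors' actual analysis.

```python
import re

# IUPAC codes used by the PAM motifs below (N = any base, R = purine).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "R": "[AG]"}

# NGG for SpCas9 is stated in the abstract; the other two motifs are
# assumptions taken from commonly cited PAM definitions.
PAMS = {"SpCas9": "NGG", "SaCas9": "NNGRRT", "AsCpf1": "TTTN"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def pam_positions(seq, pam):
    """0-based start positions of all (overlapping) PAM matches on one strand."""
    pattern = "".join(IUPAC[base] for base in pam)
    # A zero-width lookahead lets finditer report overlapping matches.
    return [m.start() for m in re.finditer("(?=" + pattern + ")", seq)]


def count_pam_sites(seq):
    """Count PAM matches per nuclease on both strands of seq."""
    seq = seq.upper()
    rc = revcomp(seq)
    return {
        name: len(pam_positions(seq, pam)) + len(pam_positions(rc, pam))
        for name, pam in PAMS.items()
    }


# Consistency check on the reported figures: 69,144,399 sites at one site
# per 6.5 bp implies a genome of roughly 449 Mb.
implied_genome_bp = 69_144_399 * 6.5
```

    The sketch counts PAM occurrences only; it does not check that a full-length protospacer fits next to each PAM or that the target is unique in the genome, both of which a real design tool would need.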

  8. Prolonged Integration Site Selection of a Lentiviral Vector in the Genome of Human Keratinocytes

    PubMed Central

    Qian, Wei; Wang, Yong; Li, Rui-fu; Zhou, Xin; Liu, Jing; Peng, Dai-zhi

    2017-01-01

    Background: Lentiviral vectors have been successfully used for human skin cell gene transfer studies. Defining the selection of integration sites for retroviral vectors in the host genome is crucial in risk assessment analysis of gene therapy. However, the genome-wide distribution of lentiviral integration sites in human keratinocytes, especially after prolonged growth, is poorly understood. Material/Methods: In this study, 874 unique lentiviral vector integration sites in human HaCaT keratinocytes after long-term culture were identified and analyzed with the online tool GTSG-QuickMap and SPSS software. Results: The data indicated that lentiviral vectors showed integration site preferences for genes and gene-rich regions. Conclusions: This study will likely assist in determining the relative risks of the lentiviral vector system and in the design of a safe lentiviral vector system in the gene therapy of skin diseases. PMID:28255155

  9. An integrated approach for analyzing clinical genomic variant data from next-generation sequencing.

    PubMed

    Crowgey, Erin L; Stabley, Deborah L; Chen, Chuming; Huang, Hongzhan; Robbins, Katherine M; Polson, Shawn W; Sol-Church, Katia; Wu, Cathy H

    2015-04-01

    Next-generation sequencing (NGS) technologies provide the potential for developing high-throughput and low-cost platforms for clinical diagnostics. A limiting factor to clinical applications of genomic NGS is downstream bioinformatics analysis for data interpretation. We have developed an integrated approach for end-to-end clinical NGS data analysis from variant detection to functional profiling. Robust bioinformatics pipelines were implemented for genome alignment, single nucleotide polymorphism (SNP), small insertion/deletion (InDel), and copy number variation (CNV) detection of whole exome sequencing (WES) data from the Illumina platform. Quality-control metrics were analyzed at each step of the pipeline by use of a validated training dataset to ensure data integrity for clinical applications. We annotate the variants with data regarding the disease population and variant impact. Custom algorithms were developed to filter variants based on criteria, such as quality of variant, inheritance pattern, and impact of variant on protein function. The developed clinical variant pipeline links the identified rare variants to Integrated Genome Viewer for visualization in a genomic context and to the Protein Information Resource's iProXpress for rich protein and disease information. With the application of our system of annotations, prioritizations, inheritance filters, and functional profiling and analysis, we have created a unique methodology for downstream variant filtering that empowers clinicians and researchers to interpret more effectively the relevance of genomic alterations within a rare genetic disease.
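    The filtering criteria named in this abstract (variant quality, inheritance pattern, and impact on protein function) can be sketched as a simple trio filter. This is only an illustration under stated assumptions: the `Variant` record, the genotype encoding, the impact labels, the quality threshold of 30, and the two inheritance rules are all hypothetical and are not taken from the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    qual: float       # caller-reported quality score (hypothetical scale)
    impact: str       # hypothetical labels: "HIGH", "MODERATE", "LOW"
    proband_gt: str   # genotype as "0/0", "0/1", or "1/1"
    mother_gt: str
    father_gt: str


def fits_recessive(v):
    """Affected child homozygous, both parents heterozygous carriers."""
    return v.proband_gt == "1/1" and v.mother_gt == "0/1" and v.father_gt == "0/1"


def fits_de_novo(v):
    """Child heterozygous, variant absent from both parents."""
    return v.proband_gt == "0/1" and v.mother_gt == "0/0" and v.father_gt == "0/0"


INHERITANCE = {"recessive": fits_recessive, "de_novo": fits_de_novo}


def filter_variants(variants, min_qual=30.0, impacts=("HIGH", "MODERATE"),
                    pattern="recessive"):
    """Keep variants that pass the quality, impact, and inheritance filters."""
    check = INHERITANCE[pattern]
    return [
        v for v in variants
        if v.qual >= min_qual and v.impact in impacts and check(v)
    ]
```

    Keeping each criterion as a separate predicate makes it easy to add further inheritance models or to report which filter removed a given variant.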

  10. Dynamic Interplay between Nucleoid Segregation and Genome Integrity in Chlamydomonas Chloroplasts

    PubMed Central

    Odahara, Masaki; Kobayashi, Yusuke; Shikanai, Toshiharu; Nishimura, Yoshiki

    2016-01-01

    The chloroplast (cp) genome is organized as nucleoids that are dispersed throughout the cp stroma. Previously, a cp homolog of bacterial recombinase RecA (cpRECA) was shown to be involved in the maintenance of cp genome integrity by repairing damaged chloroplast DNA and by suppressing aberrant recombination between short dispersed repeats in the moss Physcomitrella patens. Here, overexpression and knockdown analysis of cpRECA in the green alga Chlamydomonas reinhardtii revealed that cpRECA was involved in cp nucleoid dynamics as well as having a role in maintaining cp genome integrity. Overexpression of cpRECA tagged with yellow fluorescent protein or hemagglutinin resulted in the formation of giant filamentous structures that colocalized exclusively to chloroplast DNA, and cpRECA localized to cp nucleoids in a heterogeneous manner. Knockdown of cpRECA led to a significant reduction in cp nucleoid number that was accompanied by nucleoid enlargement. This phenotype resembled those of gyrase inhibitor-treated cells and monokaryotic chloroplast mutant cells and suggested that cpRECA was involved in organizing cp nucleoid dynamics. The cp genome also was destabilized by induced recombination between short dispersed repeats in cpRECA-knockdown cells and gyrase inhibitor-treated cells. Taken together, these results suggest that cpRECA and gyrase are both involved in nucleoid dynamics and the maintenance of genome integrity and that the mechanisms underlying these processes may be intimately related in C. reinhardtii chloroplasts. PMID:27756821

  11. An Integrated Approach for Analyzing Clinical Genomic Variant Data from Next-Generation Sequencing

    PubMed Central

    Crowgey, Erin L.; Stabley, Deborah L.; Chen, Chuming; Huang, Hongzhan; Robbins, Katherine M.; Polson, Shawn W.; Sol-Church, Katia; Wu, Cathy H.

    2015-01-01

    Next-generation sequencing (NGS) technologies provide the potential for developing high-throughput and low-cost platforms for clinical diagnostics. A limiting factor to clinical applications of genomic NGS is downstream bioinformatics analysis for data interpretation. We have developed an integrated approach for end-to-end clinical NGS data analysis from variant detection to functional profiling. Robust bioinformatics pipelines were implemented for genome alignment, single nucleotide polymorphism (SNP), small insertion/deletion (InDel), and copy number variation (CNV) detection of whole exome sequencing (WES) data from the Illumina platform. Quality-control metrics were analyzed at each step of the pipeline by use of a validated training dataset to ensure data integrity for clinical applications. We annotate the variants with data regarding the disease population and variant impact. Custom algorithms were developed to filter variants based on criteria, such as quality of variant, inheritance pattern, and impact of variant on protein function. The developed clinical variant pipeline links the identified rare variants to Integrated Genome Viewer for visualization in a genomic context and to the Protein Information Resource’s iProXpress for rich protein and disease information. With the application of our system of annotations, prioritizations, inheritance filters, and functional profiling and analysis, we have created a unique methodology for downstream variant filtering that empowers clinicians and researchers to interpret more effectively the relevance of genomic alterations within a rare genetic disease. PMID:25649353

  12. PGSB/MIPS PlantsDB Database Framework for the Integration and Analysis of Plant Genome Data.

    PubMed

    Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai; Gundlach, Heidrun; Mayer, Klaus F X

    2017-01-01

    Plant Genome and Systems Biology (PGSB), formerly Munich Institute for Protein Sequences (MIPS) PlantsDB, is a database framework for the integration and analysis of plant genome data, developed and maintained for more than a decade now. Major components of that framework are genome databases and analysis resources focusing on individual (reference) genomes providing flexible and intuitive access to data. Another main focus is the integration of genomes from both model and crop plants to form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny). Data exchange and integrated search functionality across many plant genome databases are provided within the transPLANT project.

  13. Integrative genomics--a basic and essential tool for the development of molecular medicine.

    PubMed

    Ostrowski, Jerzy

    2008-01-01

    Understanding the molecular mechanisms of disease requires the introduction of molecular diagnostics into medical practice. Current medicine employs only elements of molecular diagnostics, and usually on the scale of single genes. Medicine in the post-genomic era will utilize thousands of molecular markers associated with disease that are provided by high-throughput sequencing and functional genomic, proteomic and metabolomic studies. Such a spectrum of techniques will link clinical medicine based on molecularly oriented diagnostics with the prediction and prevention of disease. To achieve this task, large-scale and genome-wide biological and medical data must be combined with biostatistical analyses and bioinformatic modeling of biological systems. The collecting, cataloging and comparison of data from molecular studies and the subsequent development of conclusions create the fundamentals of systems biology. This highly complex analytical process reflects a new scientific paradigm called integrative genomics.

  14. Integrating Genomic Data Sets for Knowledge Discovery: An Informed Approach to Management of Captive Endangered Species

    PubMed Central

    Irizarry, Kristopher J. L.; Bryant, Doug; Kalish, Jordan; Eng, Curtis; Schmidt, Peggy L.; Barrett, Gini; Barr, Margaret C.

    2016-01-01

    Many endangered captive populations exhibit reduced genetic diversity resulting in health issues that impact reproductive fitness and quality of life. Numerous cost-effective genomic sequencing and genotyping technologies provide unparalleled opportunity for incorporating genomics knowledge in management of endangered species. Genomic data, such as sequence data, transcriptome data, and genotyping data, provide critical information about a captive population that, when leveraged correctly, can be utilized to maximize population genetic variation while simultaneously reducing unintended introduction or propagation of undesirable phenotypes. Current approaches aimed at managing endangered captive populations utilize species survival plans (SSPs) that rely upon mean kinship estimates to maximize genetic diversity while simultaneously avoiding artificial selection in the breeding program. However, as genomic resources increase for each endangered species, the potential knowledge available for management also increases. Unlike model organisms in which considerable scientific resources are used to experimentally validate genotype-phenotype relationships, endangered species typically lack the necessary sample sizes and economic resources required for such studies. Even so, in the absence of experimentally verified genetic discoveries, genomics data still provides value. In fact, bioinformatics and comparative genomics approaches offer mechanisms for translating these raw genomics data sets into integrated knowledge that enables an informed approach to endangered species management. PMID:27376076

  15. Integrating Genomic Data Sets for Knowledge Discovery: An Informed Approach to Management of Captive Endangered Species.

    PubMed

    Irizarry, Kristopher J L; Bryant, Doug; Kalish, Jordan; Eng, Curtis; Schmidt, Peggy L; Barrett, Gini; Barr, Margaret C

    2016-01-01

    Many endangered captive populations exhibit reduced genetic diversity resulting in health issues that impact reproductive fitness and quality of life. Numerous cost-effective genomic sequencing and genotyping technologies provide unparalleled opportunity for incorporating genomics knowledge in management of endangered species. Genomic data, such as sequence data, transcriptome data, and genotyping data, provide critical information about a captive population that, when leveraged correctly, can be utilized to maximize population genetic variation while simultaneously reducing unintended introduction or propagation of undesirable phenotypes. Current approaches aimed at managing endangered captive populations utilize species survival plans (SSPs) that rely upon mean kinship estimates to maximize genetic diversity while simultaneously avoiding artificial selection in the breeding program. However, as genomic resources increase for each endangered species, the potential knowledge available for management also increases. Unlike model organisms in which considerable scientific resources are used to experimentally validate genotype-phenotype relationships, endangered species typically lack the necessary sample sizes and economic resources required for such studies. Even so, in the absence of experimentally verified genetic discoveries, genomics data still provides value. In fact, bioinformatics and comparative genomics approaches offer mechanisms for translating these raw genomics data sets into integrated knowledge that enables an informed approach to endangered species management.

  16. Decoding the genome with an integrative analysis tool: combinatorial CRM Decoder.

    PubMed

    Kang, Keunsoo; Kim, Joomyeong; Chung, Jae Hoon; Lee, Daeyoup

    2011-09-01

    The identification of genome-wide cis-regulatory modules (CRMs) and characterization of their associated epigenetic features are fundamental steps toward the understanding of gene regulatory networks. Although integrative analysis of available genome-wide information can provide new biological insights, the lack of novel methodologies has become a major bottleneck. Here, we present a comprehensive analysis tool called combinatorial CRM decoder (CCD), which utilizes the publicly available information to identify and characterize genome-wide CRMs in a species of interest. CCD first defines a set of the epigenetic features which is significantly associated with a set of known CRMs as a code called 'trace code', and subsequently uses the trace code to pinpoint putative CRMs throughout the genome. Using 61 genome-wide data sets obtained from 17 independent mouse studies, CCD successfully catalogued ∼12 600 CRMs (five distinct classes) including polycomb repressive complex 2 target sites as well as imprinting control regions. Interestingly, we discovered that ∼4% of the identified CRMs belong to at least two different classes named 'multi-functional CRM', suggesting their functional importance for regulating spatiotemporal gene expression. From these examples, we show that CCD can be applied to any potential genome-wide datasets and therefore will shed light on unveiling genome-wide CRMs in various species.
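    The two-step idea in this abstract (derive a "trace code" of epigenetic features associated with known CRMs, then use it to find putative CRMs elsewhere) can be sketched as follows. This is not the CCD implementation: the bin representation, the hypergeometric enrichment test, the 0.05 cutoff, the rule that a putative CRM must carry every trace-code feature, and all function names are assumptions chosen to make a small runnable example.

```python
from math import comb


def enrichment_p(n_bins, n_with_feature, n_known, n_known_with_feature):
    """Hypergeometric upper tail: P(overlap >= observed) under random placement."""
    upper = min(n_known, n_with_feature)
    total = comb(n_bins, n_known)
    tail = sum(
        comb(n_with_feature, i) * comb(n_bins - n_with_feature, n_known - i)
        for i in range(n_known_with_feature, upper + 1)
    )
    return tail / total


def learn_trace_code(bins, known, alpha=0.05):
    """Features significantly enriched in the bins that hold known CRMs.

    bins  -- list of sets; bins[i] holds the features present in genomic bin i
    known -- set of bin indices that overlap known CRMs
    """
    code = set()
    for feature in set().union(*bins):
        with_f = {i for i, marks in enumerate(bins) if feature in marks}
        p = enrichment_p(len(bins), len(with_f), len(known), len(with_f & known))
        if p < alpha:
            code.add(feature)
    return frozenset(code)


def scan_for_crms(bins, known, code):
    """Indices of bins outside the known set that carry every trace-code feature."""
    if not code:
        return []  # an empty code would otherwise match every bin
    return [i for i, marks in enumerate(bins) if i not in known and code <= marks]
```

    A call such as `scan_for_crms(bins, known, learn_trace_code(bins, known))` returns candidate bins; a real analysis would also need multiple-testing correction and a principled way to bin the genome.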

  17. Improved bacteriophage genome data is necessary for integrating viral and bacterial ecology.

    PubMed

    Bibby, Kyle

    2014-02-01

    The recent rise in "omics"-enabled approaches has led to improved understanding in many areas of microbial ecology. However, despite the important role that viruses play in the broader microbial ecology context, viral ecology remains largely unintegrated into high-throughput microbial ecology studies. A fundamental hindrance to the integration of viral ecology into omics-enabled microbial ecology studies is the lack of suitable reference bacteriophage genomes in reference databases; currently, only 0.001% of bacteriophage diversity is represented in genome sequence databases. This commentary serves to highlight this issue and to promote bacteriophage genome sequencing as a valuable scientific undertaking to both better understand bacteriophage diversity and move towards a more holistic view of microbial ecology.

  18. Quantitative and qualitative proteome characteristics extracted from in-depth integrated genomics and proteomics analysis.

    PubMed

    Low, Teck Yew; van Heesch, Sebastiaan; van den Toorn, Henk; Giansanti, Piero; Cristobal, Alba; Toonen, Pim; Schafer, Sebastian; Hübner, Norbert; van Breukelen, Bas; Mohammed, Shabaz; Cuppen, Edwin; Heck, Albert J R; Guryev, Victor

    2013-12-12

    Quantitative and qualitative protein characteristics are regulated at genomic, transcriptomic, and posttranscriptional levels. Here, we integrated in-depth transcriptome and proteome analyses of liver tissues from two rat strains to unravel the interactions within and between these layers. We obtained peptide evidence for 26,463 rat liver proteins. We validated 1,195 gene predictions, 83 splice events, 126 proteins with nonsynonymous variants, and 20 isoforms with nonsynonymous RNA editing. Quantitative RNA sequencing and proteomics data correlate highly between strains but poorly among each other, indicating extensive nongenetic regulation. Our multilevel analysis identified a genomic variant in the promoter of the most differentially expressed gene Cyp17a1, a previously reported top hit in genome-wide association studies for human hypertension, as a potential contributor to the hypertension phenotype in SHR rats. These results demonstrate the power of and need for integrative analysis for understanding genetic control of molecular dynamics and phenotypic diversity in a system-wide manner.

  19. Rice TOGO Browser: A platform to retrieve integrated information on rice functional and applied genomics.

    PubMed

    Nagamura, Yoshiaki; Antonio, Baltazar A; Sato, Yutaka; Miyao, Akio; Namiki, Nobukazu; Yonemaru, Jun-ichi; Minami, Hiroshi; Kamatsuki, Kaori; Shimura, Kan; Shimizu, Yuji; Hirochika, Hirohiko

    2011-02-01

    The Rice TOGO Browser is an online public resource designed to facilitate integration and visualization of mapping data of bacterial artificial chromosome (BAC)/P1-derived artificial chromosome (PAC) clones, genes, restriction fragment length polymorphism (RFLP)/simple sequence repeat (SSR) markers and phenotype data represented as quantitative trait loci (QTLs) onto the genome sequence, and to provide a platform for more efficient utilization of genome information from the point of view of applied genomics as well as functional genomics. Three search options, namely keyword search, region search and trait search, generate various types of data in a user-friendly interface with three distinct viewers, a chromosome viewer, an integrated map viewer and a sequence viewer, thereby providing the opportunity to view the position of genes and/or QTLs at the chromosomal level and to retrieve any sequence information in a user-defined genome region. Furthermore, the gene list, marker list and genome sequence in a specified region delineated by RFLP/SSR markers and any sequences designed as primers can be viewed and downloaded to support forward genetics approaches. An additional feature of this database is the graphical viewer for BLAST search to reveal information not only for regions with significant sequence similarity but also for regions adjacent to those with similarity but with no hits between sequences. An easy to use and intuitive user interface can help a wide range of users in retrieving integrated mapping information including agronomically important traits on the rice genome sequence. The database can be accessed at http://agri-trait.dna.affrc.go.jp/.

  20. Barriers and potential solutions for Critical Zone data integration between environmental genomics and the geosciences

    NASA Astrophysics Data System (ADS)

    Aronson, E. L.; Meyer, F.; Packman, A. I.; Mayorga, E.

    2015-12-01

    The Earth's permeable near-surface layer from bedrock to canopy is referred to as the Critical Zone (CZ). Integration of bio- and geoscience data is critical for understanding physical, biological and chemical interactions in the CZ. Genomic and meta-genomic scientists study organisms both in laboratory settings and in the environment, in order to understand the interactions of organisms with the environment. Geoscientists are using environmental data to describe and model dynamics of physical and chemical properties. Yet, there is no agreed-upon method for integrating genomic and environmental data to address interactions of living and non-living components of the CZ. There are standards for data interchange being developed in the geosciences and genomics sciences, via standards organizations such as the Open Geospatial Consortium (OGC), as well as by research communities in biogeochemistry, hydrology, climatology, and other fields. These are in parallel to, but typically not in coordination with, the standards the Genomics Standards Consortium (GSC) is developing for genomics. In addition, efforts are being made to allow for intercompatibility of these CZ data with data generated by NEON, Inc. The interoperability of these types of data is limited with current software and cyberinfrastructure. A group of CZ geoscientists, environmental genomic scientists and cyberinfrastructure scientists are coming together to develop a set of common data collection and integration methods and sets of common standards. The data generated by this effort across multiple CZ sites (including the US CZ Observatories, or CZOs) around the world, along with NEON facility data, will be used to test EarthCube (an NSF initiative to develop cyberinfrastructure for the geosciences) cyberinfrastructure, with the goal of bridging this gap in standards and interoperability. Potential solutions to these issues of interoperability will be presented, and a way forward will be described.

  1. [Prolonging the vase life of carnation "Mabel" through integrating repeated ACC oxidase genes into its genome].

    PubMed

    Yu, Yi-Xun; Bao, Man-Zhu

    2004-10-01

    Carnation (Dianthus caryophyllus L.) is one of the most important cut flowers. The carnation cultivar "Mabel" was transformed, via Agrobacterium tumefaciens, with a direct-repeat gene of ACC oxidase, the key enzyme in ethylene synthesis, driven by the CaMV35S promoter. The hygromycin phosphotransferase (HPT) gene was used as the selection marker. Leaf explants were pre-cultured on shoot-inducing medium for 2 d, then immersed in Agrobacterium suspension for 8-12 min. Co-cultivation was carried out on the medium (MS + BA 1.0 mg/L + NAA 0.3 mg/L + acetosyringone 100 μmol/L, pH 5.8-6.0) for 3 d. Transformants were then obtained by transferring explants to selection medium supplemented with 5 mg/L hygromycin (Hyg) and 400 mg/L cefotaxime (Cef). Southern blotting showed that the foreign gene was integrated into the carnation genome, and 3 transgenic lines (T257, T299 and T273) were obtained. Addition of acetosyringone and the duration of co-culture were the main factors that influenced transformation frequency. After being transplanted to soil, the transgenic plants grew normally in the greenhouse. Ethylene production of cut flowers of the transgenic T257 line was 95% lower than that of the control, and that of the T299 line was 90% lower, while that of the T273 line was not significantly different from that of the control. The vase life of the transgenic T257 line was 5 d longer than that of the control line at 25 °C.

  2. Filling the knowledge gap: Integrating quantitative genetics and genomics in graduate education and outreach

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genomics revolution provides vital tools to address global food security. Yet to be incorporated into livestock breeding, molecular techniques need to be integrated into a quantitative genetics framework. Within the U.S., with shrinking faculty numbers with the requisite skills, the capacity to ...

  3. Integrated and translational genomics for analysis of complex traits in crops

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We report here on integration of sequencing and genotype data from natural variation (by whole genome resequencing [wgs] or genotype by sequencing [gbs]), transcriptome (RNA-seq) and mutant analysis (also by wgs) with the goal of translating gems from these resources into useable DNA markers in the ...

  4. Integrating functional genomics to accelerate mechanistic personalized medicine

    PubMed Central

    Tyner, Jeffrey W.

    2017-01-01

    The advent of deep sequencing technologies has resulted in the deciphering of tremendous amounts of genetic information. These data have led to major discoveries, and many anecdotes now exist of individual patients whose clinical outcomes have benefited from novel, genetically guided therapeutic strategies. However, the majority of genetic events in cancer are currently undrugged, leading to a biological gap between understanding of tumor genetic etiology and translation to improved clinical approaches. Functional screening has made tremendous strides in recent years with the development of new experimental approaches to studying ex vivo and in vivo drug sensitivity. Numerous discoveries and anecdotes also exist for translation of functional screening into novel clinical strategies; however, the current clinical application of functional screening remains largely confined to small clinical trials at specific academic centers. The intersection between genomic and functional approaches represents an ideal modality to accelerate our understanding of drug sensitivities as they relate to specific genetic events and further understand the full mechanisms underlying drug sensitivity patterns. PMID:28299357

  5. Dissecting the brown adipogenic regulatory network using integrative genomics

    PubMed Central

    Pradhan, Rachana N.; Bues, Johannes J.; Gardeux, Vincent; Schwalie, Petra C.; Alpern, Daniel; Chen, Wanze; Russeil, Julie; Raghav, Sunil K.; Deplancke, Bart

    2017-01-01

    Brown adipocytes regulate energy expenditure via mitochondrial uncoupling, which makes them attractive therapeutic targets to tackle obesity. However, the regulatory mechanisms underlying brown adipogenesis are still poorly understood. To address this, we profiled the transcriptome and chromatin state during mouse brown fat cell differentiation, revealing extensive gene expression changes and chromatin remodeling, especially during the first day post-differentiation. To identify putatively causal regulators, we performed transcription factor binding site overrepresentation analyses in active chromatin regions and prioritized factors based on their expression correlation with the bona-fide brown adipogenic marker Ucp1 across multiple mouse and human datasets. Using loss-of-function assays, we evaluated both the phenotypic effect as well as the transcriptomic impact of several putative regulators on the differentiation process, uncovering ZFP467, HOXA4 and Nuclear Factor I A (NFIA) as novel transcriptional regulators. Of these, NFIA emerged as the regulator yielding the strongest molecular and cellular phenotypes. To examine its regulatory function, we profiled the genomic localization of NFIA, identifying it as a key early regulator of terminal brown fat cell differentiation. PMID:28181539
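
    The prioritization step described above (ranking candidate factors by how well their expression tracks a marker gene) can be illustrated with a small sketch. This is not the authors' pipeline: the use of Pearson correlation, the function names, and the gene labels and values in the usage note are all assumptions chosen for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def rank_by_marker_correlation(expression, marker):
    """Rank genes by correlation with the marker profile, highest first.

    `expression` maps a gene label to its expression values across samples;
    `marker` is the label of the reference gene (hypothetical input layout).
    """
    marker_profile = expression[marker]
    scores = {
        gene: pearson(profile, marker_profile)
        for gene, profile in expression.items()
        if gene != marker
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

    With made-up profiles such as {"MARKER": [1, 2, 3, 4], "TF_A": [2, 4, 6, 8], "TF_B": [4, 3, 2, 1]}, the call rank_by_marker_correlation(profiles, "MARKER") lists TF_A ahead of TF_B. A real analysis would also need to combine evidence across datasets, which this sketch does not attempt.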

  6. Integrating functional genomics to accelerate mechanistic personalized medicine.

    PubMed

    Tyner, Jeffrey W

    2017-03-01

    The advent of deep sequencing technologies has resulted in the deciphering of tremendous amounts of genetic information. These data have led to major discoveries, and many anecdotes now exist of individual patients whose clinical outcomes have benefited from novel, genetically guided therapeutic strategies. However, the majority of genetic events in cancer are currently undrugged, leading to a biological gap between understanding of tumor genetic etiology and translation to improved clinical approaches. Functional screening has made tremendous strides in recent years with the development of new experimental approaches to studying ex vivo and in vivo drug sensitivity. Numerous discoveries and anecdotes also exist for translation of functional screening into novel clinical strategies; however, the current clinical application of functional screening remains largely confined to small clinical trials at specific academic centers. The intersection between genomic and functional approaches represents an ideal modality to accelerate our understanding of drug sensitivities as they relate to specific genetic events and further understand the full mechanisms underlying drug sensitivity patterns.

  7. Production of α1,3-galactosyltransferase targeted pigs using transcription activator-like effector nuclease-mediated genome editing technology.

    PubMed

    Kang, Jung-Taek; Kwon, Dae-Kee; Park, A-Rum; Lee, Eun-Jin; Yun, Yun-Jin; Ji, Dal-Young; Lee, Kiho; Park, Kwang-Wook

    2016-03-01

    Recent developments in genome editing technology using meganucleases demonstrate an efficient method of producing gene edited pigs. In this study, we examined the effectiveness of the transcription activator-like effector nuclease (TALEN) system in generating specific mutations on the pig genome. Specific TALEN was designed to induce a double-strand break on exon 9 of the porcine α1,3-galactosyltransferase (GGTA1) gene as it is the main cause of hyperacute rejection after xenotransplantation. Human decay-accelerating factor (hDAF) gene, which can produce a complement inhibitor to protect cells from complement attack after xenotransplantation, was also integrated into the genome simultaneously. Plasmids coding for the TALEN pair and hDAF gene were transfected into porcine cells by electroporation to disrupt the porcine GGTA1 gene and express hDAF. The transfected cells were then sorted using a biotin-labeled IB4 lectin attached to magnetic beads to obtain GGTA1 deficient cells. As a result, we established GGTA1 knockout (KO) cell lines with biallelic modification (35.0%) and GGTA1 KO cell lines expressing hDAF (13.0%). When these cells were used for somatic cell nuclear transfer, we successfully obtained live GGTA1 KO pigs expressing hDAF. Our results demonstrate that TALEN-mediated genome editing is efficient and can be successfully used to generate gene edited pigs.

  8. Tc1-like Transposase Thm3 of Silver Carp (Hypophthalmichthys molitrix) Can Mediate Gene Transposition in the Genome of Blunt Snout Bream (Megalobrama amblycephala).

    PubMed

    Guo, Xiu-Ming; Zhang, Qian-Qian; Sun, Yi-Wen; Jiang, Xia-Yun; Zou, Shu-Ming

    2015-10-04

    Tc1-like transposons consist of an inverted repeat sequence flanking a transposase gene that exhibits similarity to the mobile DNA element, Tc1, of the nematode, Caenorhabditis elegans. They are widely distributed within vertebrate genomes including teleost fish; however, few active Tc1-like transposases have been discovered. In this study, 17 Tc1-like transposon sequences were isolated from 10 freshwater fish species belonging to the families Cyprinidae, Adrianichthyidae, Cichlidae, and Salmonidae. We conducted phylogenetic analyses of these sequences using previously isolated Tc1-like transposases and report that 16 of these elements comprise a new subfamily of Tc1-like transposons. In particular, we show that one transposon, Thm3 from silver carp (Hypophthalmichthys molitrix; Cyprinidae), can encode a 335-aa transposase with apparently intact domains and is present in three to five copies in the silver carp genome. We then coinjected donor plasmids harboring 367 bp of the left end and 230 bp of the right end of the nonautonomous silver carp Thm1 cis-element along with capped Thm3 transposase RNA into the embryos of blunt snout bream (Megalobrama amblycephala; one- to two-cell embryos). This experiment revealed that the average integration rate could reach 50.6% in adult fish. Within the blunt snout bream genome, the TA dinucleotide direct repeat, which is the signature of the Tc1-like family of transposons, was created adjacent to both ends of Thm1 at the integration sites. Our results indicate that the silver carp Thm3 transposase can mediate gene insertion by transposition within the blunt snout bream genome, and that this occurs with a TA position preference.

  9. Increasing the Efficiency of CRISPR/Cas9-mediated Precise Genome Editing of HSV-1 Virus in Human Cells

    PubMed Central

    Lin, Chaolong; Li, Huanhuan; Hao, Mengru; Xiong, Dan; Luo, Yong; Huang, Chenghao; Yuan, Quan; Zhang, Jun; Xia, Ningshao

    2016-01-01

    Genetically modified HSV-1 viruses serve as promising vectors for tumour therapy and vaccine development. The CRISPR/Cas9 system is one of the most powerful tools for precise gene editing of the genomes of organisms. However, whether the CRISPR/Cas9 system can precisely and efficiently make gene replacements in the genome of HSV-1 remains essentially unknown. Here, we reported CRISPR/Cas9-mediated editing of the HSV-1 genome in human cells, including the knockout and replacement of large genes. In established cells stably expressing CRISPR/Cas9, gRNA in coordination with Cas9 could direct a precise cleavage within a pre-defined target region, and foreign genes were successfully used to replace the target gene seamlessly by HDR-mediated gene replacement. Introducing the NHEJ inhibitor SCR7 to the CRISPR/Cas9 system greatly facilitated HDR-mediated gene replacement in the HSV-1 genome. We provided the first genetic evidence that two copies of the ICP0 gene in different locations on the same HSV-1 genome could be simultaneously modified with high efficiency and with no off-target modifications. We also developed a revolutionized isolation platform for desired recombinant viruses using single-cell sorting. Together, our work provides a significantly improved method for targeted editing of DNA viruses, which will facilitate the development of anti-cancer oncolytic viruses and vaccines. PMID:27713537

  10. Increasing the Efficiency of CRISPR/Cas9-mediated Precise Genome Editing of HSV-1 Virus in Human Cells.

    PubMed

    Lin, Chaolong; Li, Huanhuan; Hao, Mengru; Xiong, Dan; Luo, Yong; Huang, Chenghao; Yuan, Quan; Zhang, Jun; Xia, Ningshao

    2016-10-07

    Genetically modified HSV-1 viruses serve as promising vectors for tumour therapy and vaccine development. The CRISPR/Cas9 system is one of the most powerful tools for precise gene editing of the genomes of organisms. However, whether the CRISPR/Cas9 system can precisely and efficiently make gene replacements in the genome of HSV-1 remains essentially unknown. Here, we reported CRISPR/Cas9-mediated editing of the HSV-1 genome in human cells, including the knockout and replacement of large genes. In established cells stably expressing CRISPR/Cas9, gRNA in coordination with Cas9 could direct a precise cleavage within a pre-defined target region, and foreign genes were successfully used to replace the target gene seamlessly by HDR-mediated gene replacement. Introducing the NHEJ inhibitor SCR7 to the CRISPR/Cas9 system greatly facilitated HDR-mediated gene replacement in the HSV-1 genome. We provided the first genetic evidence that two copies of the ICP0 gene in different locations on the same HSV-1 genome could be simultaneously modified with high efficiency and with no off-target modifications. We also developed a revolutionized isolation platform for desired recombinant viruses using single-cell sorting. Together, our work provides a significantly improved method for targeted editing of DNA viruses, which will facilitate the development of anti-cancer oncolytic viruses and vaccines.

  11. Cerebral White Matter Integrity Mediates Adult Age Differences in Cognitive Performance

    ERIC Educational Resources Information Center

    Madden, David J.; Spaniol, Julia; Costello, Matthew C.; Bucur, Barbara; White, Leonard E.; Cabeza, Roberto; Davis, Simon W.; Dennis, Nancy A.; Provenzale, James M.; Huettel, Scott A.

    2009-01-01

    Previous research has established that age-related decline occurs in measures of cerebral white matter integrity, but the role of this decline in age-related cognitive changes is not clear. To conclude that white matter integrity has a mediating (causal) contribution, it is necessary to demonstrate that statistical control of the white…

  12. A Genome-Wide Analysis of Promoter-Mediated Phenotypic Noise in Escherichia coli

    PubMed Central

    Silander, Olin K.; Nikolic, Nela; Zaslaver, Alon; Bren, Anat; Kikoin, Ilya; Alon, Uri; Ackermann, Martin

    2012-01-01

    Gene expression is subject to random perturbations that lead to fluctuations in the rate of protein production. As a consequence, for any given protein, genetically identical organisms living in a constant environment will contain different amounts of that particular protein, resulting in different phenotypes. This phenomenon is known as “phenotypic noise.” In bacterial systems, previous studies have shown that, for specific genes, both transcriptional and translational processes affect phenotypic noise. Here, we focus on how the promoter regions of genes affect noise and ask whether levels of promoter-mediated noise are correlated with genes' functional attributes, using data for over 60% of all promoters in Escherichia coli. We find that essential genes and genes with a high degree of evolutionary conservation have promoters that confer low levels of noise. We also find that the level of noise cannot be attributed to the evolutionary time that different genes have spent in the genome of E. coli. In contrast to previous results in eukaryotes, we find no association between promoter-mediated noise and gene expression plasticity. These results are consistent with the hypothesis that, in bacteria, natural selection can act to reduce gene expression noise and that some of this noise is controlled through the sequence of the promoter region alone. PMID:22275871

  13. A genome-wide analysis of promoter-mediated phenotypic noise in Escherichia coli.

    PubMed

    Silander, Olin K; Nikolic, Nela; Zaslaver, Alon; Bren, Anat; Kikoin, Ilya; Alon, Uri; Ackermann, Martin

    2012-01-01

    Gene expression is subject to random perturbations that lead to fluctuations in the rate of protein production. As a consequence, for any given protein, genetically identical organisms living in a constant environment will contain different amounts of that particular protein, resulting in different phenotypes. This phenomenon is known as "phenotypic noise." In bacterial systems, previous studies have shown that, for specific genes, both transcriptional and translational processes affect phenotypic noise. Here, we focus on how the promoter regions of genes affect noise and ask whether levels of promoter-mediated noise are correlated with genes' functional attributes, using data for over 60% of all promoters in Escherichia coli. We find that essential genes and genes with a high degree of evolutionary conservation have promoters that confer low levels of noise. We also find that the level of noise cannot be attributed to the evolutionary time that different genes have spent in the genome of E. coli. In contrast to previous results in eukaryotes, we find no association between promoter-mediated noise and gene expression plasticity. These results are consistent with the hypothesis that, in bacteria, natural selection can act to reduce gene expression noise and that some of this noise is controlled through the sequence of the promoter region alone.

  14. Integrative exploration of genomic profiles for triple negative breast cancer identifies potential drug targets

    PubMed Central

    Wang, Xiaosheng; Guda, Chittibabu

    2016-01-01

    Background: Triple negative breast cancer (TNBC) is high-risk due to its rapid drug resistance and recurrence, metastasis, and lack of targeted therapy. So far, no molecularly targeted therapeutic agents have been clinically approved for TNBC. It is imperative that we discover new targets for TNBC therapy. Objectives: A large volume of cancer genomics data are emerging and advancing breast cancer research. We may integrate different types of TNBC genomic data to discover molecular targets for TNBC therapy. Data sources: We used publicly available TNBC tumor tissue genomic data in the Cancer Genome Atlas database in this study. Methods: We integratively explored genomic profiles (gene expression, copy number, methylation, microRNA [miRNA], and gene mutation) in TNBC and identified hyperactivated genes that have higher expression, more copy numbers, lower methylation level, or are targets of miRNAs with lower expression in TNBC than in normal samples. We ranked the hyperactivated genes into different levels based on all the genomic evidence and performed functional analyses of the sets of genes identified. More importantly, we proposed potential molecular targets for TNBC therapy based on the hyperactivated genes. Results: Some of the genes we identified such as FGFR2, MAPK13, TP53, SRC family, MUC family, and BCL2 family have been suggested to be potential targets for TNBC treatment. Others such as CSF1R, EPHB3, TRIB1, and LAD1 could be promising new targets for TNBC treatment. By utilizing this integrative analysis of genomic profiles for TNBC, we hypothesized that some of the targeted treatment strategies for TNBC currently in development are more likely to be promising, such as poly (ADP-ribose) polymerase inhibitors, while the others are more likely to be discouraging, such as angiogenesis inhibitors. Limitations: The findings in this study need to be experimentally validated in the future. Conclusion: This is a systematic study that combined 5

  15. GMATA: An Integrated Software Package for Genome-Scale SSR Mining, Marker Development and Viewing

    PubMed Central

    Wang, Xuewen; Wang, Le

    2016-01-01

    Simple sequence repeats (SSRs), also referred to as microsatellites, are highly variable tandem DNAs that are widely used as genetic markers. The increasing availability of whole-genome and transcript sequences provides information resources for SSR marker development. However, efficient software is required to identify and display SSR information along with other gene features at a genome scale. We developed a novel software package, Genome-wide Microsatellite Analyzing Tool Package (GMATA), which integrates SSR mining, statistical analysis and plotting, marker design, polymorphism screening and marker transferability, and enables simultaneous display of SSR markers with other genome features. GMATA applies novel strategies for SSR analysis and primer design in large genomes, which allow GMATA to perform faster calculations and provide more accurate results than existing tools. Our package is also capable of processing DNA sequences of any size on a standard computer. GMATA is user friendly, requires only mouse clicks or typed inputs on the command line, and is executable on multiple computing platforms. We demonstrated the application of GMATA in plant genomes and revealed a novel distribution pattern of SSRs in 15 grass genomes. The most abundant motifs are the dimer GA/TC, the A/T monomer and the GCG/CGC trimer, rather than the rich G/C content in DNA sequence. We also revealed that SSR count is linearly related to chromosome length in fully assembled grass genomes. GMATA represents a powerful application tool that facilitates genomic sequence analyses. GMATA is freely available at http://sourceforge.net/projects/gmata/?source=navbar. PMID:27679641
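
    For readers unfamiliar with SSR mining, the basic idea can be sketched with a regular expression that looks for a short motif repeated in tandem. This is not GMATA's algorithm, which the abstract does not describe; the function name, the default thresholds and the restriction to perfect repeats are assumptions made for illustration.

```python
import re


def find_ssrs(seq, min_motif=1, max_motif=3, min_repeats=5):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    Illustrative only: motif lengths and the repeat threshold are arbitrary
    defaults, and imperfect or compound repeats are not handled.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_motif, max_motif + 1):
        # One k-base motif followed by at least (min_repeats - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA"),
            # since the shorter motif length already reports that run.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)
```

    For example, find_ssrs("CCGAGAGAGAGATTT") reports a single GA repeat of five units starting at position 2 (0-based). Genome-scale tools additionally need streaming input and careful handling of overlapping motifs, which are outside the scope of this sketch.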

  16. GAPIT Version 2: An Enhanced Integrated Tool for Genomic Association and Prediction.

    PubMed

    Tang, You; Liu, Xiaolei; Wang, Jiabo; Li, Meng; Wang, Qishan; Tian, Feng; Su, Zhongbin; Pan, Yuchun; Liu, Di; Lipka, Alexander E; Buckler, Edward S; Zhang, Zhiwu

    2016-07-01

    Most human diseases and agriculturally important traits are complex. Dissecting their genetic architecture requires continued development of innovative and powerful statistical methods. Corresponding advances in computing tools are critical to efficiently use these statistical innovations and to enhance and accelerate biomedical and agricultural research and applications. The genome association and prediction integrated tool (GAPIT) was first released in 2012 and became widely used for genome-wide association studies (GWAS) and genomic prediction. The GAPIT implemented computationally efficient statistical methods, including the compressed mixed linear model (CMLM) and genomic prediction by using genomic best linear unbiased prediction (gBLUP). New state-of-the-art statistical methods have now been implemented in a new, enhanced version of GAPIT. These methods include factored spectrally transformed linear mixed models (FaST-LMM), enriched CMLM (ECMLM), FaST-LMM-Select, and settlement of mixed linear models under progressively exclusive relationship (SUPER). The genomic prediction methods implemented in this new release of the GAPIT include gBLUP based on CMLM, ECMLM, and SUPER. Additionally, the GAPIT was updated to improve its existing output display features and to add new data display and evaluation functions, including new graphing options and capabilities, phenotype simulation, power analysis, and cross-validation. These enhancements make the GAPIT a valuable resource for determining appropriate experimental designs and performing GWAS and genomic prediction. The enhanced R-based GAPIT software package uses state-of-the-art methods to conduct GWAS and genomic prediction. The GAPIT also provides new functions for developing experimental designs and creating publication-ready tabular summaries and graphs to improve the efficiency and application of genomic research.

  17. The REST remodeling complex protects genomic integrity during embryonic neurogenesis

    PubMed Central

    Nechiporuk, Tamilla; McGann, James; Mullendorff, Karin; Hsieh, Jenny; Wurst, Wolfgang; Floss, Thomas; Mandel, Gail

    2016-01-01

    The timely transition from neural progenitor to post-mitotic neuron requires down-regulation and loss of the neuronal transcriptional repressor, REST. Here, we have used mice containing a gene trap in the Rest gene, eliminating transcription from all coding exons, to remove REST prematurely from neural progenitors. We find that catastrophic DNA damage occurs during S-phase of the cell cycle, with long-term consequences including abnormal chromosome separation, apoptosis, and smaller brains. Persistent effects are evident by latent appearance of proneural glioblastoma in adult mice deleted additionally for the tumor suppressor p53 protein (p53). A previous line of mice deleted for REST in progenitors by conventional gene targeting does not exhibit these phenotypes, likely due to a remaining C-terminal peptide that still binds chromatin and recruits co-repressors. Our results suggest that REST-mediated chromatin remodeling is required in neural progenitors for proper S-phase dynamics, as part of its well-established role in repressing neuronal genes until terminal differentiation. DOI: http://dx.doi.org/10.7554/eLife.09584.001 PMID:26745185

  18. Molecular Characterization of Pediatric Restrictive Cardiomyopathy from Integrative Genomics.

    PubMed

    Rindler, Tara N; Hinton, Robert B; Salomonis, Nathan; Ware, Stephanie M

    2017-01-18

    Pediatric restrictive cardiomyopathy (RCM) is a genetically heterogeneous heart disease with limited therapeutic options. RCM cases are largely idiopathic; however, even within families with a known genetic cause for cardiomyopathy, there is striking variability in disease severity. Although accumulating evidence implicates both gene expression and alternative splicing in development of dilated cardiomyopathy (DCM), there have been no detailed molecular characterizations of underlying pathways dysregulated in RCM. RNA-Seq on a cohort of pediatric RCM patients compared to other forms of adult cardiomyopathy and controls identified transcriptional differences highly common to the cardiomyopathies, as well as those unique to RCM. Transcripts selectively induced in RCM include many known and novel G-protein coupled receptors linked to calcium handling and contractile regulation. In-depth comparisons of alternative splicing revealed splicing events shared among cardiomyopathy subtypes, as well as those linked solely to RCM. Genes identified with altered alternative splicing implicate RBM20, a DCM splicing factor, as a potential mediator of alternative splicing in RCM. We present the first comprehensive report on molecular pathways dysregulated in pediatric RCM including unique/shared pathways identified compared to other cardiomyopathy subtypes and demonstrate that disruption of alternative splicing patterns in pediatric RCM occurs in the inverse direction as DCM.

  19. Molecular Characterization of Pediatric Restrictive Cardiomyopathy from Integrative Genomics

    PubMed Central

    Rindler, Tara N.; Hinton, Robert B.; Salomonis, Nathan; Ware, Stephanie M.

    2017-01-01

    Pediatric restrictive cardiomyopathy (RCM) is a genetically heterogeneous heart disease with limited therapeutic options. RCM cases are largely idiopathic; however, even within families with a known genetic cause for cardiomyopathy, there is striking variability in disease severity. Although accumulating evidence implicates both gene expression and alternative splicing in development of dilated cardiomyopathy (DCM), there have been no detailed molecular characterizations of underlying pathways dysregulated in RCM. RNA-Seq on a cohort of pediatric RCM patients compared to other forms of adult cardiomyopathy and controls identified transcriptional differences highly common to the cardiomyopathies, as well as those unique to RCM. Transcripts selectively induced in RCM include many known and novel G-protein coupled receptors linked to calcium handling and contractile regulation. In-depth comparisons of alternative splicing revealed splicing events shared among cardiomyopathy subtypes, as well as those linked solely to RCM. Genes identified with altered alternative splicing implicate RBM20, a DCM splicing factor, as a potential mediator of alternative splicing in RCM. We present the first comprehensive report on molecular pathways dysregulated in pediatric RCM including unique/shared pathways identified compared to other cardiomyopathy subtypes and demonstrate that disruption of alternative splicing patterns in pediatric RCM occurs in the inverse direction as DCM. PMID:28098235

  20. Genome3D: A viewer-model framework for integrating and visualizing multi-scale epigenomic information within a three-dimensional genome

    PubMed Central

    2010-01-01

    Background: New technologies are enabling the measurement of many types of genomic and epigenomic information at scales ranging from the atomic to the nuclear. Much of this new data is increasingly structural in nature, and is often difficult to coordinate with other data sets. There is a legitimate need for integrating and visualizing these disparate data sets to reveal structural relationships not apparent when looking at these data in isolation. Results: We have applied object-oriented technology to develop a downloadable visualization tool, Genome3D, for integrating and displaying epigenomic data within a prescribed three-dimensional physical model of the human genome. In order to integrate and visualize large volumes of data, novel statistical and mathematical approaches have been developed to reduce the size of the data. To our knowledge, this is the first such tool developed that can visualize the human genome in three dimensions. We describe here the major features of Genome3D and discuss our multi-scale data framework using a representative basic physical model. We then demonstrate many of the issues and benefits of multi-resolution data integration. Conclusions: Genome3D is a software visualization tool that explores a wide range of structural genomic and epigenetic data. Data from various sources of differing scales can be integrated within a hierarchical framework that is easily adapted to new developments concerning the structure of the physical genome. In addition, our tool has a simple annotation mechanism to incorporate non-structural information. Genome3D is unique in its ability to manipulate large amounts of multi-resolution data from diverse sources to uncover complex and new structural relationships within the genome. PMID:20813045

  1. TIGER: Toolbox for integrating genome-scale metabolic models, expression data, and transcriptional regulatory networks

    PubMed Central

    2011-01-01

    Background: Several methods have been developed for analyzing genome-scale models of metabolism and transcriptional regulation. Many of these methods, such as Flux Balance Analysis, use constrained optimization to predict relationships between metabolic flux and the genes that encode and regulate enzyme activity. Recently, mixed integer programming has been used to encode these gene-protein-reaction (GPR) relationships into a single optimization problem, but these techniques are often of limited generality and lack a tool for automating the conversion of rules to a coupled regulatory/metabolic model. Results: We present TIGER, a Toolbox for Integrating Genome-scale Metabolism, Expression, and Regulation. TIGER converts a series of generalized, Boolean or multilevel rules into a set of mixed integer inequalities. The package also includes implementations of existing algorithms to integrate high-throughput expression data with genome-scale models of metabolism and transcriptional regulation. We demonstrate how TIGER automates the coupling of a genome-scale metabolic model with GPR logic and models of transcriptional regulation, thereby serving as a platform for algorithm development and large-scale metabolic analysis. Additionally, we demonstrate how TIGER's algorithms can be used to identify inconsistencies and improve existing models of transcriptional regulation with examples from the reconstructed transcriptional regulatory network of Saccharomyces cerevisiae. Conclusion: The TIGER package provides a consistent platform for algorithm development and extending existing genome-scale metabolic models with regulatory networks and high-throughput data. PMID:21943338
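
    The abstract does not list the inequalities TIGER generates. As a minimal sketch of the general idea, assuming binary (0/1) variables and the textbook linearization of AND and OR, the check below enumerates every input combination and confirms that the inequalities admit exactly the Boolean result. The function names are hypothetical and are not part of the TIGER package.

```python
from itertools import product


def and_constraints(x, y, z):
    """Linear inequalities that force z = x AND y for binary x, y, z."""
    return z <= x and z <= y and z >= x + y - 1


def or_constraints(x, y, z):
    """Linear inequalities that force z = x OR y for binary x, y, z."""
    return z >= x and z >= y and z <= x + y


def feasible_outputs(constraints, x, y):
    """Values of z in {0, 1} that satisfy the inequalities for given inputs."""
    return [z for z in (0, 1) if constraints(x, y, z)]


def encoding_matches_truth_table():
    """True if each encoding leaves exactly one feasible z, equal to the Boolean value."""
    return all(
        feasible_outputs(and_constraints, x, y) == [x & y]
        and feasible_outputs(or_constraints, x, y) == [x | y]
        for x, y in product((0, 1), repeat=2)
    )
```

    In a gene-protein-reaction setting, x and y could stand for the presence of two genes and z for the activity of an enzyme complex (AND) or of isozymes (OR); a mixed integer solver would then receive the inequalities rather than the Boolean rule itself.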

  2. Annotation of the zebrafish genome through an integrated transcriptomic and proteomic analysis.

    PubMed

    Kelkar, Dhanashree S; Provost, Elayne; Chaerkady, Raghothama; Muthusamy, Babylakshmi; Manda, Srikanth S; Subbannayya, Tejaswini; Selvan, Lakshmi Dhevi N; Wang, Chieh-Huei; Datta, Keshava K; Woo, Sunghee; Dwivedi, Sutopa B; Renuse, Santosh; Getnet, Derese; Huang, Tai-Chung; Kim, Min-Sik; Pinto, Sneha M; Mitchell, Christopher J; Madugundu, Anil K; Kumar, Praveen; Sharma, Jyoti; Advani, Jayshree; Dey, Gourav; Balakrishnan, Lavanya; Syed, Nazia; Nanjappa, Vishalakshi; Subbannayya, Yashwanth; Goel, Renu; Prasad, T S Keshava; Bafna, Vineet; Sirdeshmukh, Ravi; Gowda, Harsha; Wang, Charles; Leach, Steven D; Pandey, Akhilesh

    2014-11-01

    Accurate annotation of protein-coding genes is one of the primary tasks upon the completion of whole genome sequencing of any organism. In this study, we used an integrated transcriptomic and proteomic strategy to validate and improve the existing zebrafish genome annotation. We undertook high-resolution mass-spectrometry-based proteomic profiling of 10 adult organs, whole adult fish body, and two developmental stages of zebrafish (SAT line), in addition to transcriptomic profiling of six organs. More than 7,000 proteins were identified from proteomic analyses, and ∼ 69,000 high-confidence transcripts were assembled from the RNA sequencing data. Approximately 15% of the transcripts mapped to intergenic regions, the majority of which are likely long non-coding RNAs. These high-quality transcriptomic and proteomic data were used to manually reannotate the zebrafish genome. We report the identification of 157 novel protein-coding genes. In addition, our data led to modification of existing gene structures including novel exons, changes in exon coordinates, changes in frame of translation, translation in annotated UTRs, and joining of genes. Finally, we discovered four instances of genome assembly errors that were supported by both proteomic and transcriptomic data. Our study shows how an integrative analysis of the transcriptome and the proteome can extend our understanding of even well-annotated genomes.

  3. Integration of banana streak badnavirus into the Musa genome: molecular and cytogenetic evidence.

    PubMed

    Harper, G; Osuji, J O; Heslop-Harrison, J S; Hull, R

    1999-03-15

    Breeding and tissue culture of certain cultivars of bananas (Musa) have led to high levels of banana streak badnavirus (BSV) infection in progeny from symptomless parents. BSV DNA hybridized to genomic DNA of one such parent, Obino l'Ewai, suggesting integration of viral sequences. Sequencing of clones of Obino l'Ewai genomic DNA revealed an interface between BSV and Musa sequences and a complex BSV integrant. In situ hybridization revealed two different BSV sequence locations in Obino l'Ewai chromosomes and a complex arrangement of BSV and Musa sequences was shown by probing stretched DNA fibers. This is the first report of integrated sequences that possibly lead to a plant pararetrovirus episomal infection by a mechanism differing markedly from animal retroviral systems.

  4. New Insights into the Classification and Integration Specificity of Streptococcus Integrative Conjugative Elements through Extensive Genome Exploration.

    PubMed

    Ambroset, Chloé; Coluzzi, Charles; Guédon, Gérard; Devignes, Marie-Dominique; Loux, Valentin; Lacroix, Thomas; Payot, Sophie; Leblond-Bourget, Nathalie

    2015-01-01

    Recent genome analyses suggest that integrative and conjugative elements (ICEs) are widespread in bacterial genomes and therefore play an essential role in horizontal transfer. However, only a few of these elements are precisely characterized and correctly delineated within sequenced bacterial genomes. Even though previous analysis showed the presence of ICEs in some species of Streptococci, the global prevalence and diversity of ICEs was not analyzed in this genus. In this study, we searched for ICEs in the completely sequenced genomes of 124 strains belonging to 27 streptococcal species. These exhaustive analyses revealed 105 putative ICEs and 26 slightly decayed elements whose limits were assessed and whose insertion site was identified. These ICEs were grouped in seven distinct unrelated or distantly related families, according to their conjugation modules. Integration of these streptococcal ICEs is catalyzed either by a site-specific tyrosine integrase, a low-specificity tyrosine integrase, a site-specific single serine integrase, a triplet of site-specific serine integrases or a DDE transposase. Analysis of their integration site led to the detection of 18 target-genes for streptococcal ICE insertion including eight that had not been identified previously (ftsK, guaA, lysS, mutT, rpmG, rpsI, traG, and ebfC). It also suggests that all specificities have evolved to minimize the impact of the insertion on the host. This overall analysis of streptococcal ICEs emphasizes their prevalence and diversity and demonstrates that exchanges or acquisitions of conjugation and recombination modules are frequent.

  5. GnpIS: an information system to integrate genetic and genomic data from plants and fungi

    PubMed Central

    Steinbach, Delphine; Alaux, Michael; Amselem, Joelle; Choisne, Nathalie; Durand, Sophie; Flores, Raphaël; Keliet, Aminah-Olivia; Kimmel, Erik; Lapalu, Nicolas; Luyten, Isabelle; Michotey, Célia; Mohellibi, Nacer; Pommier, Cyril; Reboux, Sébastien; Valdenaire, Dorothée; Verdelet, Daphné; Quesneville, Hadi

    2013-01-01

    Data integration is a key challenge for modern bioinformatics. It aims to provide biologists with tools to explore relevant data produced by different studies. Large-scale international projects can generate lots of heterogeneous and unrelated data. The challenge is to integrate this information with other publicly available data. Nucleotide sequencing throughput has been improved with new technologies; this increases the need for powerful information systems able to store, manage and explore data. GnpIS is a multispecies integrative information system dedicated to plant and fungi pests. It bridges genetic and genomic data, allowing researchers access to both genetic information (e.g. genetic maps, quantitative trait loci, markers, single nucleotide polymorphisms, germplasms and genotypes) and genomic data (e.g. genomic sequences, physical maps, genome annotation and expression data) for species of agronomical interest. GnpIS is used by both large international projects and plant science departments at the French National Institute for Agricultural Research. Here, we illustrate its use. Database URL: http://urgi.versailles.inra.fr/gnpis PMID:23959375

  6. Integration of Brassica A genome genetic linkage map between Brassica napus and B. rapa.

    PubMed

    Suwabe, Keita; Morgan, Colin; Bancroft, Ian

    2008-03-01

    An integrated linkage map between B. napus and B. rapa was constructed based on a total of 44 common markers comprising 41 SSR (33 BRMS, 6 Saskatoon, and 2 BBSRC) and 3 SNP/indel markers. Between 3 and 7 common markers were mapped onto each of the linkage groups A1 to A10. The position and order of most common markers revealed a high level of colinearity between species, although two small regions on A4, A5, and A10 revealed apparent local inversions between them. These results indicate that the A genome of Brassica has retained a high degree of colinearity between species, despite each species having evolved independently after the integration of the A and C genomes in the amphidiploid state. Our results provide a genetic integration of the Brassica A genome between B. napus and B. rapa. As the analysis employed sequence-based molecular markers, the information will accelerate the exploitation of the B. rapa genome sequence for the improvement of oilseed rape.

  7. Integrative physiology, functional genomics and the phenotype gap: a guide for comparative physiologists.

    PubMed

    Dow, Julian A T

    2007-05-01

    Classical, curiosity-led comparative physiology finds itself at a crossroads. Major funding for classical physiology is becoming harder to find, as grant agencies focus on more molecular approaches or on science with more immediate strategic value to their respective countries. In turn, this shift in funding places Zoology and Animal Science departments under enormous stress: student numbers are buoyant, but how can research funding be maintained at high levels? Our research group has argued for the redefinition of integrative physiology as the investigation of gene function in an organotypic context in the intact animal. Implicit in this definition is the use of transgenics and reverse genetics to manipulate gene function in a cell-specific manner; this in turn implies the use of a genetically tractable 'model organism'. The significance of this definition is that it aligns integrative physiology with functional genomics. Again, functional genomics draws heavily on reverse genetics to elucidate the function of novel genes. The phenotype gap (the mismatch between what a genetic model organism's genome encodes and the reasons that it has historically been studied) emphasises the need to attract and empower functional biologists: can all 13,500 genes in Drosophila really be explained in terms of developmental biology? So, by embracing the integrative physiology manifesto, comparative physiologists can not only accelerate their own research, but their functional skills can make them indispensable in the post-genomic endeavour.

  8. GnpIS: an information system to integrate genetic and genomic data from plants and fungi.

    PubMed

    Steinbach, Delphine; Alaux, Michael; Amselem, Joelle; Choisne, Nathalie; Durand, Sophie; Flores, Raphaël; Keliet, Aminah-Olivia; Kimmel, Erik; Lapalu, Nicolas; Luyten, Isabelle; Michotey, Célia; Mohellibi, Nacer; Pommier, Cyril; Reboux, Sébastien; Valdenaire, Dorothée; Verdelet, Daphné; Quesneville, Hadi

    2013-01-01

    Data integration is a key challenge for modern bioinformatics. It aims to provide biologists with tools to explore relevant data produced by different studies. Large-scale international projects can generate lots of heterogeneous and unrelated data. The challenge is to integrate this information with other publicly available data. Nucleotide sequencing throughput has been improved with new technologies; this increases the need for powerful information systems able to store, manage and explore data. GnpIS is a multispecies integrative information system dedicated to plant and fungi pests. It bridges genetic and genomic data, allowing researchers access to both genetic information (e.g. genetic maps, quantitative trait loci, markers, single nucleotide polymorphisms, germplasms and genotypes) and genomic data (e.g. genomic sequences, physical maps, genome annotation and expression data) for species of agronomical interest. GnpIS is used by both large international projects and plant science departments at the French National Institute for Agricultural Research. Here, we illustrate its use. Database URL: http://urgi.versailles.inra.fr/gnpis.

  9. IMG/M: integrated genome and metagenome comparative data analysis system.

    PubMed

    Chen, I-Min A; Markowitz, Victor M; Chu, Ken; Palaniappan, Krishna; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Andersen, Evan; Huntemann, Marcel; Varghese, Neha; Hadjithomas, Michalis; Tennessen, Kristin; Nielsen, Torben; Ivanova, Natalia N; Kyrpides, Nikos C

    2017-01-04

    The Integrated Microbial Genomes with Microbiome Samples (IMG/M: https://img.jgi.doe.gov/m/) system contains annotated DNA and RNA sequence data of (i) archaeal, bacterial, eukaryotic and viral genomes from cultured organisms, (ii) single cell genomes (SCG) and genomes from metagenomes (GFM) from uncultured archaea, bacteria and viruses and (iii) metagenomes from environmental, host associated and engineered microbiome samples. Sequence data are generated by DOE's Joint Genome Institute (JGI), submitted by individual scientists, or collected from public sequence data archives. Structural and functional annotation is carried out by JGI's genome and metagenome annotation pipelines. A variety of analytical and visualization tools provide support for examining and comparing IMG/M's datasets. IMG/M allows open access interactive analysis of publicly available datasets, while manual curation, submission and access to private datasets and computationally intensive workspace-based analysis require login/password access to its expert review (ER) companion system (IMG/M ER: https://img.jgi.doe.gov/mer/). Since the last report published in the 2014 NAR Database Issue, IMG/M's dataset content has tripled in terms of number of datasets and overall protein coding genes, while its analysis tools have been extended to cope with the rapid growth in the number and size of datasets handled by the system.

  10. Integrated and sequence-ordered BAC- and YAC-based physical maps for the rat genome.

    PubMed

    Krzywinski, Martin; Wallis, John; Gösele, Claudia; Bosdet, Ian; Chiu, Readman; Graves, Tina; Hummel, Oliver; Layman, Dan; Mathewson, Carrie; Wye, Natasja; Zhu, Baoli; Albracht, Derek; Asano, Jennifer; Barber, Sarah; Brown-John, Mabel; Chan, Susanna; Chand, Steve; Cloutier, Alison; Davito, Jonathon; Fjell, Chris; Gaige, Tony; Ganten, Detlev; Girn, Noreen; Guggenheimer, Kurtis; Himmelbauer, Heinz; Kreitler, Thomas; Leach, Stephen; Lee, Darlene; Lehrach, Hans; Mayo, Michael; Mead, Kelly; Olson, Teika; Pandoh, Pawan; Prabhu, Anna-Liisa; Shin, Heesun; Tänzer, Simone; Thompson, Jason; Tsai, Miranda; Walker, Jason; Yang, George; Sekhon, Mandeep; Hillier, LaDeana; Zimdahl, Heike; Marziali, Andre; Osoegawa, Kazutoyo; Zhao, Shaying; Siddiqui, Asim; de Jong, Pieter J; Warren, Wes; Mardis, Elaine; McPherson, John D; Wilson, Richard; Hübner, Norbert; Jones, Steven; Marra, Marco; Schein, Jacqueline

    2004-04-01

    As part of the effort to sequence the genome of Rattus norvegicus, we constructed a physical map comprised of fingerprinted bacterial artificial chromosome (BAC) clones from the CHORI-230 BAC library. These BAC clones provide approximately 13-fold redundant coverage of the genome and have been assembled into 376 fingerprint contigs. A yeast artificial chromosome (YAC) map was also constructed and aligned with the BAC map via fingerprinted BAC and P1 artificial chromosome clones (PACs) sharing interspersed repetitive sequence markers with the YAC-based physical map. We have annotated 95% of the fingerprint map clones in contigs with coordinates on the version 3.1 rat genome sequence assembly, using BAC-end sequences and in silico mapping methods. These coordinates have allowed anchoring 358 of the 376 fingerprint map contigs onto the sequence assembly. Of these, 324 contigs are anchored to rat genome sequences localized to chromosomes, and 34 contigs are anchored to unlocalized portions of the rat sequence assembly. The remaining 18 contigs, containing 54 clones, still require placement. The fingerprint map is a high-resolution integrative data resource that provides genome-ordered associations among BAC, YAC, and PAC clones and the assembled sequence of the rat genome.

  11. Ensembl Plants: Integrating Tools for Visualizing, Mining, and Analyzing Plant Genomics Data.

    PubMed

    Bolser, Dan; Staines, Daniel M; Pritchard, Emily; Kersey, Paul

    2016-01-01

    Ensembl Plants (http://plants.ensembl.org) is an integrative resource presenting genome-scale information for a growing number of sequenced plant species (currently 33). Data provided includes genome sequence, gene models, functional annotation, and polymorphic loci. Additional information is provided for variation data, including population structure, individual genotypes, linkage, and phenotype data. In each release, comparative analyses are performed on whole genome and protein sequences, and genome alignments and gene trees are made available that show the implied evolutionary history of each gene family. Access to the data is provided through a genome browser incorporating many specialist interfaces for different data types, and through a variety of additional methods for programmatic access and data mining. These access routes are consistent with those offered through the Ensembl interface for the genomes of non-plant species, including those of plant pathogens, pests, and pollinators. Ensembl Plants is updated 4-5 times a year and is developed in collaboration with our international partners in the Gramene (http://www.gramene.org) and transPLANT (http://www.transplantdb.org) projects.

  12. Integrated cytogenetic BAC map of the genome of the gray, short-tailed opossum, Monodelphis domestica.

    PubMed

    Duke, S E; Samollow, P B; Mauceli, E; Lindblad-Toh, K; Breen, M

    2007-01-01

    The generation of high-quality genome assemblies for numerous species is advancing at a rapid pace. As the number of genome assemblies increases, so does our ability to investigate genome relationships and their contributions to unraveling complex biological, evolutionary, and biomedical processes. A key process in the generation of a genome assembly is to determine and verify the precise physical location and order of the large sequence blocks (scaffolds) that result from the assembly. For organisms of relatively recent common ancestry this process may be achieved largely through comparative sequence alignment. However, as the evolutionary distance between species lengthens, the use of comparative sequence alignment becomes increasingly less reliable. Simultaneous cytogenetic mapping, using multicolor fluorescence in-situ hybridization (FISH) analysis, offers an alternative means to define the cytogenetic location and relative order of DNA sequences, thereby anchoring the genome sequence to the karyotype. In this article we report the molecular cytogenetic locations of 415 bacterial artificial chromosome (BAC) clones that served to anchor sequence scaffolds of the gray, short-tailed opossum (Monodelphis domestica) to its karyotype, which enabled accurate integration of these regions into the genome assembly.

  13. IMG/M: integrated genome and metagenome comparative data analysis system

    PubMed Central

    Chen, I-Min A.; Markowitz, Victor M.; Chu, Ken; Palaniappan, Krishna; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Andersen, Evan; Huntemann, Marcel; Varghese, Neha; Hadjithomas, Michalis; Tennessen, Kristin; Nielsen, Torben; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2017-01-01

    The Integrated Microbial Genomes with Microbiome Samples (IMG/M: https://img.jgi.doe.gov/m/) system contains annotated DNA and RNA sequence data of (i) archaeal, bacterial, eukaryotic and viral genomes from cultured organisms, (ii) single cell genomes (SCG) and genomes from metagenomes (GFM) from uncultured archaea, bacteria and viruses and (iii) metagenomes from environmental, host associated and engineered microbiome samples. Sequence data are generated by DOE's Joint Genome Institute (JGI), submitted by individual scientists, or collected from public sequence data archives. Structural and functional annotation is carried out by JGI's genome and metagenome annotation pipelines. A variety of analytical and visualization tools provide support for examining and comparing IMG/M's datasets. IMG/M allows open access interactive analysis of publicly available datasets, while manual curation, submission and access to private datasets and computationally intensive workspace-based analysis require login/password access to its expert review (ER) companion system (IMG/M ER: https://img.jgi.doe.gov/mer/). Since the last report published in the 2014 NAR Database Issue, IMG/M's dataset content has tripled in terms of number of datasets and overall protein coding genes, while its analysis tools have been extended to cope with the rapid growth in the number and size of datasets handled by the system. PMID:27738135

  14. Exploring breast carcinogenesis through integrative genomics and epigenomics analyses.

    PubMed

    Minning, Chin; Mokhtar, Norfilza Mohd; Abdullah, Norlia; Muhammad, Rohaizak; Emran, Nor Aina; Ali, Siti Aishah M D; Harun, Roslan; Jamal, Rahman

    2014-11-01

    There have been many DNA methylation studies on breast cancer which showed various methylation patterns involving tumour suppressor genes and oncogenes, but only a few of those studies link the methylation data with gene expression. More data are required, especially from the Asian region, to analyse how the epigenome data correlate with the transcriptome. DNA methylation profiling was carried out on 76 fresh frozen primary breast tumour tissues and 25 adjacent non-cancerous breast tissues using the Illumina Infinium® HumanMethylation27 BeadChip. Validation of methylation results was performed on 7 genes using either MS-MLPA or MS-qPCR. Gene expression profiling was done on 15 breast tumours and 5 adjacent non-cancerous breast tissues using the Affymetrix GeneChip® Human Gene 1.0 ST array. The overlapping genes between DNA methylation and gene expression datasets were further mapped to the KEGG database to identify the molecular pathways that linked these genes together. Supervised hierarchical cluster analysis revealed 1,389 hypermethylated CpG sites and 22 hypomethylated CpG sites in cancer compared to the normal samples. Gene expression microarray analysis using a fold-change of at least 1.5 and a false discovery rate (FDR) at p<0.05 identified 404 upregulated and 463 downregulated genes in cancer samples. Integration of both datasets identified 51 genes with hypermethylation and low expression (negative association) and 13 genes with hypermethylation and high expression (positive association). Most of the overlapping genes belong to the focal adhesion and extracellular matrix-receptor interaction pathways that play important roles in breast carcinogenesis. The present study displayed the value of using multiple datasets in the same set of tissues and how the integrative analysis can create a list of well-focused genes as well as show the correlation between epigenetic changes and gene expression. These gene signatures can help us understand the epigenetic
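
    The integration step described in this abstract (overlapping hypermethylated genes with up- or downregulated genes) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function name, the gene labels, and the representation of expression as a tumour/normal ratio are all assumptions; only the 1.5-fold threshold comes from the abstract.

```python
def integrate_methylation_expression(hypermethylated, fold_change, threshold=1.5):
    """Split hypermethylated genes by the direction of their expression change.

    hypermethylated: set of gene names with hypermethylated CpG sites (hypothetical input)
    fold_change: dict mapping gene name -> tumour/normal expression ratio (assumed format)
    threshold: minimum fold-change in either direction (1.5 as stated in the abstract)
    """
    negative, positive = [], []
    for gene in sorted(hypermethylated):
        ratio = fold_change.get(gene)
        if ratio is None:
            continue  # gene absent from the expression dataset
        if ratio >= threshold:
            positive.append(gene)   # hypermethylated and upregulated
        elif ratio <= 1.0 / threshold:
            negative.append(gene)   # hypermethylated and downregulated
    return negative, positive


# Hypothetical toy data, for illustration only.
example_hyper = {"GENE_A", "GENE_B", "GENE_C"}
example_fc = {"GENE_A": 0.4, "GENE_B": 2.0, "GENE_C": 1.1, "GENE_D": 3.0}
example_negative, example_positive = integrate_methylation_expression(example_hyper, example_fc)
```

    In this toy input, GENE_C is dropped because its change is below the threshold and GENE_D is dropped because it is not hypermethylated.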

  15. Integration and Querying of Genomic and Proteomic Semantic Annotations for Biomedical Knowledge Extraction.

    PubMed

    Masseroli, Marco; Canakoglu, Arif; Ceri, Stefano

    2016-01-01

    Understanding complex biological phenomena involves answering complex biomedical questions on multiple types of biomolecular information simultaneously. This information is expressed through multiple genomic and proteomic semantic annotations scattered across many distributed and heterogeneous data sources; such heterogeneity and dispersion hamper biologists' ability to ask global queries and perform global evaluations. To overcome this problem, we developed a software architecture to create and maintain a Genomic and Proteomic Knowledge Base (GPKB), which integrates several of the most relevant sources of such dispersed information (including Entrez Gene, UniProt, IntAct, Expasy Enzyme, GO, GOA, BioCyc, KEGG, Reactome, and OMIM). Our solution is general, as it uses a flexible, modular, and multilevel global data schema based on abstraction and generalization of integrated data features, and a set of automatic procedures for easing data integration and maintenance, even when the integrated data sources evolve in data content, structure, and number. These procedures also assure consistency, quality, and provenance tracking of all integrated data, and perform the semantic closure of the hierarchical relationships of the integrated biomedical ontologies. At http://www.bioinformatics.deib.polimi.it/GPKB/, a Web interface allows easy graphical composition of even complex queries on the knowledge base, and also supports semantic query expansion and comprehensive explorative search of the integrated data to better sustain biomedical knowledge extraction.

  16. Personalised Medicine Possible With Real-Time Integration of Genomic and Clinical Data To Inform Clinical Decision-Making.

    PubMed

    Martin-Sanchez, Fernando; Turner, Maureen; Johnstone, Alice; Heffer, Leon; Rafael, Naomi; Bakker, Tim; Thorne, Natalie; Macciocca, Ivan; Gaff, Clara

    2015-01-01

    Despite widespread use of genomic sequencing in research, there are gaps in our understanding of the performance and provision of genomic sequencing in clinical practice. The Melbourne Genomics Health Alliance (the Alliance) has been established to determine the feasibility, performance and impact of using genomic sequencing as a diagnostic tool. The Alliance has partnered with BioGrid Australia to enable the linkage of genomic sequencing, clinical treatment and outcome data for this project. This integrated dataset of genetic, clinical and patient-sourced information will be used by the Alliance to evaluate the potential diagnostic value of genomic sequencing in routine clinical practice. This project will allow the Alliance to provide recommendations to facilitate the integration of genomic sequencing into clinical practice to enable personalised disease treatment.

  17. Microhomology-mediated end-joining-dependent integration of donor DNA in cells and animals using TALENs and CRISPR/Cas9.

    PubMed

    Nakade, Shota; Tsubota, Takuya; Sakane, Yuto; Kume, Satoshi; Sakamoto, Naoaki; Obara, Masanobu; Daimon, Takaaki; Sezutsu, Hideki; Yamamoto, Takashi; Sakuma, Tetsushi; Suzuki, Ken-ichi T

    2014-11-20

    Genome engineering using programmable nucleases enables homologous recombination (HR)-mediated gene knock-in. However, the labour required to construct targeting vectors containing homology arms and the difficulty of inducing HR in some cell types and organisms represent technical hurdles for the application of HR-mediated knock-in technology. Here, we introduce an alternative strategy for gene knock-in using transcription activator-like effector nucleases (TALENs) and clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated 9 (Cas9) mediated by microhomology-mediated end-joining, termed the PITCh (Precise Integration into Target Chromosome) system. TALEN-mediated PITCh, termed TAL-PITCh, enables efficient integration of exogenous donor DNA in human cells and animals, including silkworms and frogs. We further demonstrate that CRISPR/Cas9-mediated PITCh, termed CRIS-PITCh, can be applied in human cells without carrying the plasmid backbone sequence. Thus, our PITCh-ing strategies will be useful for a variety of applications, not only in cultured cells, but also in various organisms, including invertebrates and vertebrates.

  18. Salt stress in Desulfovibrio vulgaris Hildenborough: an integrated genomics approach.

    PubMed

    Mukhopadhyay, Aindrila; He, Zhili; Alm, Eric J; Arkin, Adam P; Baidoo, Edward E; Borglin, Sharon C; Chen, Wenqiong; Hazen, Terry C; He, Qiang; Holman, Hoi-Ying; Huang, Katherine; Huang, Rick; Joyner, Dominique C; Katz, Natalie; Keller, Martin; Oeller, Paul; Redding, Alyssa; Sun, Jun; Wall, Judy; Wei, Jing; Yang, Zamin; Yen, Huei-Che; Zhou, Jizhong; Keasling, Jay D

    2006-06-01

    The ability of Desulfovibrio vulgaris Hildenborough to reduce, and therefore contain, toxic and radioactive metal waste has made all factors that affect the physiology of this organism of great interest. Increased salinity is an important and frequent fluctuation faced by D. vulgaris in its natural habitat. In liquid culture, exposure to excess salt resulted in striking elongation of D. vulgaris cells. Using data from transcriptomics, proteomics, metabolite assays, phospholipid fatty acid profiling, and electron microscopy, we used a systems approach to explore the effects of excess NaCl on D. vulgaris. In this study we demonstrated that import of osmoprotectants, such as glycine betaine and ectoine, is the primary mechanism used by D. vulgaris to counter hyperionic stress. Several efflux systems were also highly up-regulated, as was the ATP synthesis pathway. Increases in the levels of both RNA and DNA helicases suggested that salt stress affected the stability of nucleic acid base pairing. An overall increase in the level of branched fatty acids indicated that there were changes in cell wall fluidity. The immediate response to salt stress included up-regulation of chemotaxis genes, although flagellar biosynthesis was down-regulated. Other down-regulated systems included lactate uptake permeases and ABC transport systems. The results of an extensive NaCl stress analysis were compared with microarray data from a KCl stress analysis, and unlike many other bacteria, D. vulgaris responded similarly to the two stresses. Integration of data from multiple methods allowed us to develop a conceptual model for the salt stress response in D. vulgaris that can be compared to those in other microorganisms.

  19. Detecting DNA double-stranded breaks in mammalian genomes by linear amplification-mediated high-throughput genome-wide translocation sequencing.

    PubMed

    Hu, Jiazhi; Meyers, Robin M; Dong, Junchao; Panchakshari, Rohit A; Alt, Frederick W; Frock, Richard L

    2016-05-01

    Unbiased, high-throughput assays for detecting and quantifying DNA double-stranded breaks (DSBs) across the genome in mammalian cells will facilitate basic studies of the mechanisms that generate and repair endogenous DSBs. They will also enable more applied studies, such as those to evaluate the on- and off-target activities of engineered nucleases. Here we describe a linear amplification-mediated high-throughput genome-wide translocation sequencing (LAM-HTGTS) method for the detection of genome-wide 'prey' DSBs via their translocation in cultured mammalian cells to a fixed 'bait' DSB. Bait-prey junctions are cloned directly from isolated genomic DNA using LAM-PCR and unidirectionally ligated to bridge adapters; subsequent PCR steps amplify the single-stranded DNA junction library in preparation for Illumina MiSeq paired-end sequencing. A custom bioinformatics pipeline identifies prey sequences that contribute to junctions and maps them across the genome. LAM-HTGTS differs from related approaches because it detects a wide range of broken end structures with nucleotide-level resolution. Familiarity with nucleic acid methods and next-generation sequencing analysis is necessary for library generation and data interpretation. LAM-HTGTS assays are sensitive, reproducible, relatively inexpensive, scalable and straightforward to implement with a turnaround time of <1 week.

  20. Neuroscience Data Integration through Mediation: An (F)BIRN Case Study

    PubMed Central

    Ashish, Naveen; Ambite, José Luis; Muslea, Maria; Turner, Jessica A.

    2010-01-01

    We describe an application of the BIRN mediator to the integration of neuroscience experimental data sources. The BIRN mediator is a general purpose solution to the problem of providing integrated, semantically-consistent access to biomedical data from multiple, distributed, heterogeneous data sources. The system follows the mediation approach, where the data remains at the sources, providers maintain control of the data, and the integration system retrieves data from the sources in real-time in response to client queries. Our aim with this paper is to illustrate how domain-specific data integration applications can be developed quickly and in a principled way by using our general mediation technology. We describe in detail the integration of two leading, but radically different, experimental neuroscience sources, namely, the human imaging database, a relational database, and the eXtensible neuroimaging archive toolkit, an XML web services system. We discuss the steps, sources of complexity, effort, and time required to build such applications, as well as outline directions of ongoing and future research on biomedical data integration. PMID:21228907
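
    The mediation approach described in this abstract (data stay at the sources and are fetched and mapped to a common schema at query time) can be sketched as below. This is a generic illustration under stated assumptions, not the BIRN mediator's actual interface: the class, the field names, and the two in-memory stand-ins for a relational source and an XML web-service source are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


class Mediator:
    """Toy mediator: queries every registered source on demand and
    renames source-specific fields to a shared (global) schema."""

    def __init__(self) -> None:
        self._sources: List[Callable[[str], Iterable[Record]]] = []

    def register(self, fetch: Callable[[str], Iterable[Record]],
                 mapping: Dict[str, str]) -> None:
        # mapping: source field name -> global schema field name
        def wrapped(subject_id: str) -> Iterable[Record]:
            for row in fetch(subject_id):
                yield {target: row[source] for source, target in mapping.items()}
        self._sources.append(wrapped)

    def query(self, subject_id: str) -> List[Record]:
        # Nothing is cached or copied ahead of time; sources are hit per query.
        results: List[Record] = []
        for source in self._sources:
            results.extend(source(subject_id))
        return results


# Hypothetical stand-in for a relational imaging database.
def relational_source(subject_id: str) -> List[Record]:
    rows = [{"subj": "s1", "scan_type": "fMRI"}, {"subj": "s2", "scan_type": "fMRI"}]
    return [r for r in rows if r["subj"] == subject_id]


# Hypothetical stand-in for an XML web-service archive.
def xml_source(subject_id: str) -> List[Record]:
    docs = [{"subjectID": "s1", "modality": "MRI"}]
    return [d for d in docs if d["subjectID"] == subject_id]


mediator = Mediator()
mediator.register(relational_source, {"subj": "subject", "scan_type": "modality"})
mediator.register(xml_source, {"subjectID": "subject", "modality": "modality"})
```

    Calling mediator.query("s1") returns one record per source, both expressed in the shared field names; each provider keeps control of its own data because only the fetch functions touch the sources.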

  1. Mediation analysis demonstrates that trans-eQTLs are often explained by cis-mediation: a genome-wide analysis among 1,800 South Asians.

    PubMed

    Pierce, Brandon L; Tong, Lin; Chen, Lin S; Rahaman, Ronald; Argos, Maria; Jasmine, Farzana; Roy, Shantanu; Paul-Brutus, Rachelle; Westra, Harm-Jan; Franke, Lude; Esko, Tonu; Zaman, Rakibuz; Islam, Tariqul; Rahman, Mahfuzar; Baron, John A; Kibriya, Muhammad G; Ahsan, Habibul

    2014-12-01

    A large fraction of human genes are regulated by genetic variation near the transcribed sequence (cis-eQTL, expression quantitative trait locus), and many cis-eQTLs have implications for human disease. Less is known regarding the effects of genetic variation on expression of distant genes (trans-eQTLs) and their biological mechanisms. In this work, we use genome-wide data on SNPs and array-based expression measures from mononuclear cells obtained from a population-based cohort of 1,799 Bangladeshi individuals to characterize cis- and trans-eQTLs and determine if observed trans-eQTL associations are mediated by expression of transcripts in cis with the SNPs showing trans-association, using Sobel tests of mediation. We observed 434 independent trans-eQTL associations at a false-discovery rate of 0.05, and 189 of these trans-eQTLs were also cis-eQTLs (enrichment P<0.0001). Among these 189 trans-eQTL associations, 39 were significantly attenuated after adjusting for a cis-mediator based on Sobel P<10⁻⁵. We attempted to replicate 21 of these mediation signals in two European cohorts, and while only 7 trans-eQTL associations were present in one or both cohorts, 6 showed evidence of cis-mediation. Analyses of simulated data show that complete mediation will be observed as partial mediation in the presence of mediator measurement error or imperfect LD between measured and causal variants. Our data demonstrate that trans-associations can become significantly stronger or switch directions after adjusting for a potential mediator. Using simulated data, we demonstrate that this phenomenon is expected in the presence of strong cis-trans confounding and when the measured cis-transcript is correlated with the true (unmeasured) mediator. In conclusion, by applying mediation analysis to eQTL data, we show that a substantial fraction of observed trans-eQTL associations can be explained by cis-mediation. Future studies should focus on understanding the mechanisms underlying
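
    For readers unfamiliar with the Sobel test named in this abstract, a minimal sketch of the standard first-order form follows. It assumes that a is the regression effect of the SNP on the cis-transcript and b is the effect of the cis-transcript on the trans-transcript after adjusting for the SNP, each with its standard error; the function name and the example numbers are hypothetical, and the abstract does not state which variant of the test the authors used.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """Return (z, two-sided p) for the indirect (mediated) effect a*b.

    Uses the first-order Sobel standard error:
        se_ab = sqrt(b^2 * se_a^2 + a^2 * se_b^2)
    """
    se_ab = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = (a * b) / se_ab
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypothetical coefficients, for illustration only.
z_example, p_example = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
```

    A small p-value here indicates that the indirect path through the mediator is unlikely to be zero, which is the sense in which a trans-association is described as mediated by a cis-transcript.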

  2. Brd4 is required for E2-mediated transcriptional activation but not genome partitioning of all papillomaviruses.

    PubMed

    McPhillips, M G; Oliveira, J G; Spindler, J E; Mitra, R; McBride, A A

    2006-10-01

    Bromodomain protein 4 (Brd4) has been identified as the cellular binding target through which the E2 protein of bovine papillomavirus type 1 links the viral genome to mitotic chromosomes. This tethering ensures retention and efficient partitioning of genomes to daughter cells following cell division. E2 is also a regulator of viral gene expression and a replication factor, in association with the viral E1 protein. In this study, we show that E2 proteins from a wide range of papillomaviruses interact with Brd4, albeit with variations in efficiency. Moreover, disruption of the E2-Brd4 interaction abrogates the transactivation function of E2, indicating that Brd4 is required for E2-mediated transactivation of all papillomaviruses. However, the interaction of E2 and Brd4 is not required for genome partitioning of all papillomaviruses since a number of papillomavirus E2 proteins associate with mitotic chromosomes independently of Brd4 binding. Furthermore, mutations in E2 that disrupt the interaction with Brd4 do not affect the ability of these E2s to associate with chromosomes. Thus, while all papillomaviruses attach their genomes to cellular chromosomes to facilitate genome segregation, they target different cellular binding partners. In summary, the E2 proteins from many papillomaviruses, including the clinically important alpha genus human papillomaviruses, interact with Brd4 to mediate transcriptional activation function but not all depend on this interaction to efficiently associate with mitotic chromosomes.

  3. Loss of p53-mediated cell-cycle arrest, senescence and apoptosis promotes genomic instability and premature aging.

    PubMed

    Li, Tongyuan; Liu, Xiangyu; Jiang, Le; Manfredi, James; Zha, Shan; Gu, Wei

    2016-03-15

    Although p53-mediated cell cycle arrest, senescence and apoptosis are well accepted as major tumor suppression mechanisms, the loss of these functions does not directly lead to tumorigenesis, suggesting that the precise roles of these canonical activities of p53 need to be redefined. Here, we report that the cells derived from the mutant mice expressing p53^3KR, an acetylation-defective mutant that fails to induce cell-cycle arrest, senescence and apoptosis, exhibit high levels of aneuploidy upon DNA damage. Moreover, the embryonic lethality caused by the deficiency of XRCC4, a key DNA double strand break repair factor, can be fully rescued in the p53^3KR/3KR background. Notably, despite high levels of genomic instability, p53^3KR/3KR XRCC4^-/- mice, unlike p53^-/- XRCC4^-/- mice, do not succumb to pro-B-cell lymphomas. Nevertheless, p53^3KR/3KR XRCC4^-/- mice display aging-like phenotypes including testicular atrophy, kyphosis, and premature death. Further analyses demonstrate that SLC7A11 is downregulated and that p53-mediated ferroptosis is significantly induced in spleens and testes of p53^3KR/3KR XRCC4^-/- mice. These results demonstrate that the direct role of p53-mediated cell cycle arrest, senescence and apoptosis is to control genomic stability in vivo. Our study not only validates the importance of ferroptosis in p53-mediated tumor suppression in vivo but also reveals that the combination of genomic instability and activation of ferroptosis may promote aging-associated phenotypes.

  4. More powerful genetic association testing via a new statistical framework for integrative genomics

    PubMed Central

    Zhao, Sihai D.; Cai, T. Tony; Li, Hongzhe

    2015-01-01

    Integrative genomics offers a promising approach to more powerful genetic association studies. The hope is that combining outcome and genotype data with other types of genomic information can lead to more powerful SNP detection. We present a new association test based on a statistical model that explicitly assumes that genetic variations affect the outcome through perturbing gene expression levels. It is shown analytically that the proposed approach can have more power to detect SNPs that are associated with the outcome through transcriptional regulation, compared to tests using the outcome and genotype data alone, and simulations show that our method is relatively robust to misspecification. We also provide a strategy for applying our approach to high-dimensional genomic data. We use this strategy to identify a potentially new association between a SNP and a yeast cell’s response to the natural product tomatidine, which standard association analysis did not detect. PMID:24975802

  5. More powerful genetic association testing via a new statistical framework for integrative genomics.

    PubMed

    Zhao, Sihai D; Cai, T Tony; Li, Hongzhe

    2014-12-01

    Integrative genomics offers a promising approach to more powerful genetic association studies. The hope is that combining outcome and genotype data with other types of genomic information can lead to more powerful SNP detection. We present a new association test based on a statistical model that explicitly assumes that genetic variations affect the outcome through perturbing gene expression levels. It is shown analytically that the proposed approach can have more power to detect SNPs that are associated with the outcome through transcriptional regulation, compared to tests using the outcome and genotype data alone, and simulations show that our method is relatively robust to misspecification. We also provide a strategy for applying our approach to high-dimensional genomic data. We use this strategy to identify a potentially new association between a SNP and a yeast cell's response to the natural product tomatidine, which standard association analysis did not detect.

  6. Integrating genomics, proteomics and bioinformatics in translational studies of molecular medicine.

    PubMed

    Ostrowski, Jerzy; Wyrwicz, Lucjan S

    2009-09-01

    Understanding the molecular mechanisms of disease requires the introduction of molecular diagnostics into medical practice. Current medicine employs only elements of molecular diagnostics, which are usually applied on the scale of single genes. Medicine in the postgenomic era will utilize thousands of disease-associated molecular markers provided by high-throughput sequencing and functional genomic, proteomic and metabolomic studies. Such a spectrum of techniques will link clinical medicine based on molecularly oriented diagnostics with the prediction and prevention of disease. To achieve this task, large-scale and genome-wide biological and medical data must be combined with biostatistical and bioinformatic analyses to model biological systems. Collecting, cataloging and comparing data from molecular studies, and the subsequent development of conclusions, creates the fundamentals of systems biology. This highly complex analytical process reflects a new scientific paradigm known as integrative genomics.

  7. The human specialized DNA polymerases and non-B DNA: vital relationships to preserve genome integrity.

    PubMed

    Boyer, Anne-Sophie; Grgurevic, Srdana; Cazaux, Christophe; Hoffmann, Jean-Sébastien

    2013-11-29

    In addition to the canonical right-handed double helix, DNA molecule can adopt several other non-B DNA structures. Readily formed in the genome at specific DNA repetitive sequences, these secondary conformations present a distinctive challenge for progression of DNA replication forks. Impeding normal DNA synthesis, cruciforms, hairpins, H DNA, Z DNA and G4 DNA considerably impact the genome stability and in some instances play a causal role in disease development. Along with previously discovered dedicated DNA helicases, the specialized DNA polymerases emerge as major actors performing DNA synthesis through these distorted impediments. In their new role, they are facilitating DNA synthesis on replication stalling sites formed by non-B DNA structures and thereby helping the completion of DNA replication, a process otherwise crucial for preserving genome integrity and concluding normal cell division. This review summarizes the evidence gathered describing the function of specialized DNA polymerases in replicating DNA through non-B DNA structures.

  8. Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project.

    PubMed

    Gerstein, Mark B; Lu, Zhi John; Van Nostrand, Eric L; Cheng, Chao; Arshinoff, Bradley I; Liu, Tao; Yip, Kevin Y; Robilotto, Rebecca; Rechtsteiner, Andreas; Ikegami, Kohta; Alves, Pedro; Chateigner, Aurelien; Perry, Marc; Morris, Mitzi; Auerbach, Raymond K; Feng, Xin; Leng, Jing; Vielle, Anne; Niu, Wei; Rhrissorrakrai, Kahn; Agarwal, Ashish; Alexander, Roger P; Barber, Galt; Brdlik, Cathleen M; Brennan, Jennifer; Brouillet, Jeremy Jean; Carr, Adrian; Cheung, Ming-Sin; Clawson, Hiram; Contrino, Sergio; Dannenberg, Luke O; Dernburg, Abby F; Desai, Arshad; Dick, Lindsay; Dosé, Andréa C; Du, Jiang; Egelhofer, Thea; Ercan, Sevinc; Euskirchen, Ghia; Ewing, Brent; Feingold, Elise A; Gassmann, Reto; Good, Peter J; Green, Phil; Gullier, Francois; Gutwein, Michelle; Guyer, Mark S; Habegger, Lukas; Han, Ting; Henikoff, Jorja G; Henz, Stefan R; Hinrichs, Angie; Holster, Heather; Hyman, Tony; Iniguez, A Leo; Janette, Judith; Jensen, Morten; Kato, Masaomi; Kent, W James; Kephart, Ellen; Khivansara, Vishal; Khurana, Ekta; Kim, John K; Kolasinska-Zwierz, Paulina; Lai, Eric C; Latorre, Isabel; Leahey, Amber; Lewis, Suzanna; Lloyd, Paul; Lochovsky, Lucas; Lowdon, Rebecca F; Lubling, Yaniv; Lyne, Rachel; MacCoss, Michael; Mackowiak, Sebastian D; Mangone, Marco; McKay, Sheldon; Mecenas, Desirea; Merrihew, Gennifer; Miller, David M; Muroyama, Andrew; Murray, John I; Ooi, Siew-Loon; Pham, Hoang; Phippen, Taryn; Preston, Elicia A; Rajewsky, Nikolaus; Rätsch, Gunnar; Rosenbaum, Heidi; Rozowsky, Joel; Rutherford, Kim; Ruzanov, Peter; Sarov, Mihail; Sasidharan, Rajkumar; Sboner, Andrea; Scheid, Paul; Segal, Eran; Shin, Hyunjin; Shou, Chong; Slack, Frank J; Slightam, Cindie; Smith, Richard; Spencer, William C; Stinson, E O; Taing, Scott; Takasaki, Teruaki; Vafeados, Dionne; Voronina, Ksenia; Wang, Guilin; Washington, Nicole L; Whittle, Christina M; Wu, Beijing; Yan, Koon-Kiu; Zeller, Georg; Zha, Zheng; Zhong, Mei; Zhou, Xingliang; Ahringer, Julie; Strome, Susan; Gunsalus, Kristin C; 
    Micklem, Gos; Liu, X Shirley; Reinke, Valerie; Kim, Stuart K; Hillier, LaDeana W; Henikoff, Steven; Piano, Fabio; Snyder, Michael; Stein, Lincoln; Lieb, Jason D; Waterston, Robert H

    2010-12-24

    We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor-binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor-binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.

  9. Genomic Analysis of Sleeping Beauty Transposon Integration in Human Somatic Cells

    PubMed Central

    Turchiano, Giandomenico; Latella, Maria Carmela; Gogol-Döring, Andreas; Cattoglio, Claudia; Mavilio, Fulvio; Izsvák, Zsuzsanna; Ivics, Zoltán; Recchia, Alessandra

    2014-01-01

    The Sleeping Beauty (SB) transposon is a non-viral integrating vector system with proven efficacy for gene transfer and functional genomics. However, integration efficiency is negatively affected by the length of the transposon. To optimize the SB transposon machinery, the inverted repeats and the transposase gene underwent several modifications, resulting in the generation of the hyperactive SB100X transposase and of the high-capacity “sandwich” (SA) transposon. In this study, we report a side-by-side comparison of the SA and the widely used T2 arrangement of transposon vectors carrying increasing DNA cargoes, up to 18 kb. Clonal analysis of SA integrants in human epithelial cells and in immortalized keratinocytes demonstrates stability and integrity of the transposon independently from the cargo size and copy number-dependent expression of the cargo cassette. A genome-wide analysis of unambiguously mapped SA integrations in keratinocytes showed an almost random distribution, with an overrepresentation in repetitive elements (satellite, LINE and small RNAs) compared to a library representing insertions of the first-generation transposon vector and to gammaretroviral and lentiviral libraries. The SA transposon/SB100X integrating system therefore shows important features as a system for delivering large gene constructs for gene therapy applications. PMID:25390293

  10. Epiviz: a view inside the design of an integrated visual analysis software for genomics

    PubMed Central

    2015-01-01

    Background: Computational and visual data analysis for genomics has traditionally involved a combination of tools and resources, of which the most ubiquitous consist of genome browsers, focused mainly on integrative visualization of large numbers of big datasets, and computational environments, focused on data modeling of a small number of moderately sized datasets. Workflows that involve the integration and exploration of multiple heterogeneous data sources, small and large, public and user specific have been poorly addressed by these tools. In our previous work, we introduced Epiviz, which bridges the gap between the two types of tools, simplifying these workflows. Results: In this paper we expand on the design decisions behind Epiviz, and introduce a series of new advanced features that further support the type of interactive exploratory workflow we have targeted. We discuss three ways in which Epiviz advances the field of genomic data analysis: 1) it brings code to interactive visualizations at various different levels; 2) takes the first steps in the direction of collaborative data analysis by incorporating user plugins from source control providers, as well as by allowing analysis states to be shared among the scientific community; 3) combines established analysis features that have never before been available simultaneously in a genome browser. In our discussion section, we present security implications of the current design, as well as a series of limitations and future research steps. Conclusions: Since many of the design choices of Epiviz are novel in genomics data analysis, this paper serves both as a document of our own approaches with lessons learned, as well as a start point for future efforts in the same direction for the genomics community. PMID:26328750

  11. AACR precision medicine series: Highlights of the integrating clinical genomics and cancer therapy meeting.

    PubMed

    Maggi, Elaine; Montagna, Cristina

    2015-12-01

    The American Association for Cancer Research (AACR) Precision Medicine Series "Integrating Clinical Genomics and Cancer Therapy" took place June 13-16, 2015 in Salt Lake City, Utah. The conference was co-chaired by Charles L. Sawyers from Memorial Sloan Kettering Cancer Center in New York, Elaine R. Mardis from Washington University School of Medicine in St. Louis, and Arul M. Chinnaiyan from University of Michigan in Ann Arbor. About 500 clinicians, basic science investigators, bioinformaticians, and postdoctoral fellows joined together to discuss the current state of Clinical Genomics and the advances and challenges of integrating Next Generation Sequencing (NGS) technologies into clinical practice. The plenary sessions and panel discussions covered current platforms and sequencing approaches adopted for NGS assays of cancer genome at several national and international institutions, different approaches used to map and classify targetable sequence variants, and how information acquired with the sequencing of the cancer genome is used to guide treatment options. While challenges still exist from a technological perspective, it emerged that there exists considerable need for the development of tools to aid the identification of the therapy most suitable based on the mutational profile of the somatic cancer genome. The process to match patients to ongoing clinical trials is still complex. In addition, the need for centralized data repositories, preferably linked to well annotated clinical records, that aid sharing of sequencing information is central to begin understanding the contribution of variants of unknown significance to tumor etiology and response to therapy. Here we summarize the highlights of this stimulating four-day conference with a major emphasis on the open problems that the clinical genomics community is currently facing and the tools most needed for advancing this field.

  12. Use of whole-genome sequencing data to analyze 23S rRNA-mediated azithromycin resistance.

    PubMed

    Johnson, Steven R; Grad, Yonatan; Abrams, A Jeanine; Pettus, Kevin; Trees, David L

    2017-02-01

    The whole-genome sequences of 24 isolates of Neisseria gonorrhoeae with elevated minimum inhibitory concentrations (MICs) to azithromycin (≥2.0 µg/mL) were analyzed against a modified sequence derived from the whole-genome sequence of N. gonorrhoeae FA1090 to determine, by signal ratio, the number of mutant copies of the 23S rRNA gene and the copy number effect on 50S ribosome-mediated azithromycin resistance. Isolates that were predicted to contain four mutated copies were accurately identified compared with the results of direct sequencing. Fewer than four mutated copies gave less accurate results but were consistent with elevated MICs.
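
    The abstract above says mutant copy number was determined "by signal ratio" but gives no formula. The sketch below shows one plausible reading, assuming the ratio is the fraction of sequencing reads carrying the resistance allele at a 23S rRNA position and that the genome carries four copies of the gene. The function name, the read counts, and the rounding rule are illustrative assumptions, not the authors' published method.

```python
def estimate_mutant_copies(mutant_reads, total_reads, n_copies=4):
    """Estimate how many 23S rRNA gene copies carry a mutation.

    Assumes reads from all gene copies map to a single reference copy, so the
    mutant-read fraction approximates (mutated copies / total copies).
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= mutant_reads <= total_reads:
        raise ValueError("mutant_reads must lie between 0 and total_reads")
    ratio = mutant_reads / total_reads
    # Round to the nearest whole number of copies (hypothetical decision rule).
    return round(ratio * n_copies)


# Hypothetical read counts for two isolates.
print(estimate_mutant_copies(98, 100))   # nearly all reads mutant
print(estimate_mutant_copies(52, 100))   # about half of reads mutant
```

    Intermediate ratios sit closer to the rounding boundaries than ratios near 1, which is one possible reason why, as the abstract reports, fewer than four mutated copies gave less accurate results.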

  13. Integrated analysis of copy number variation and genome-wide expression profiling in colorectal cancer tissues.

    PubMed

    Ali Hassan, Nur Zarina; Mokhtar, Norfilza Mohd; Kok Sin, Teow; Mohamed Rose, Isa; Sagap, Ismail; Harun, Roslan; Jamal, Rahman

    2014-01-01

    Integrative analyses of multiple genomic datasets for selected samples can provide better insight into the overall data and can enhance our knowledge of cancer. The objective of this study was to elucidate the association between copy number variation (CNV) and gene expression in colorectal cancer (CRC) samples and their corresponding non-cancerous tissues. Sixty-four paired CRC samples from the same patients were subjected to CNV profiling using the Illumina HumanOmni1-Quad assay, and validation was performed using multiplex ligation probe amplification method. Genome-wide expression profiling was performed on 15 paired samples from the same group of patients using the Affymetrix Human Gene 1.0 ST array. Significant genes obtained from both array results were then overlapped. To identify molecular pathways, the data were mapped to the KEGG database. Whole genome CNV analysis that compared primary tumor and non-cancerous epithelium revealed gains in 1638 genes and losses in 36 genes. Significant gains were mostly found in chromosome 20 at position 20q12 with a frequency of 45.31% in tumor samples. Examples of genes that were associated at this cytoband were PTPRT, EMILIN3 and CHD6. The highest number of losses was detected at chromosome 8, position 8p23.2 with 17.19% occurrence in all tumor samples. Among the genes found at this cytoband were CSMD1 and DLC1. Genome-wide expression profiling showed 709 genes to be up-regulated and 699 genes to be down-regulated in CRC compared to non-cancerous samples. Integration of these two datasets identified 56 overlapping genes, which were located in chromosomes 8, 20 and 22. MLPA confirmed that the CRC samples had the highest gains in chromosome 20 compared to the reference samples. Interpretation of the CNV data in the context of the transcriptome via integrative analyses may provide more in-depth knowledge of the genomic landscape of CRC.
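
    The overlap step and the reported frequencies in the abstract above can be illustrated with a short sketch. The counts 29 and 11 are not stated in the abstract; they are inferred here because, out of 64 samples, they reproduce the reported 45.31% and 17.19%. The CNV genes are those named in the abstract, while the expression gene list (including `GENE_X`) is hypothetical, so the resulting overlap is illustrative only and not a study result.

```python
def sample_frequency(n_with_event, n_samples):
    """Percentage of samples that show a given copy number event."""
    return 100.0 * n_with_event / n_samples


def overlap_genes(cnv_genes, expression_genes):
    """Genes present in both the CNV list and the differential-expression list."""
    return sorted(set(cnv_genes) & set(expression_genes))


# Inferred counts (assumption): 29/64 and 11/64 match the reported percentages.
gain_freq = round(sample_frequency(29, 64), 2)
loss_freq = round(sample_frequency(11, 64), 2)

# Genes named in the abstract for cytobands 20q12 and 8p23.2.
cnv_genes = ["PTPRT", "EMILIN3", "CHD6", "CSMD1", "DLC1"]
# Hypothetical differential-expression hits, for illustration only.
expression_genes = ["CHD6", "DLC1", "GENE_X"]

shared = overlap_genes(cnv_genes, expression_genes)
print(gain_freq, loss_freq, shared)
```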

  14. Function-driven discovery of disease genes in zebrafish using an integrated genomics big data resource

    PubMed Central

    Shim, Hongseok; Kim, Ji Hyun; Kim, Chan Yeong; Hwang, Sohyun; Kim, Hyojin; Yang, Sunmo; Lee, Ji Eun; Lee, Insuk

    2016-01-01

    Whole exome sequencing (WES) accelerates disease gene discovery using rare genetic variants, but further statistical and functional evidence is required to avoid false-discovery. To complement variant-driven disease gene discovery, here we present function-driven disease gene discovery in zebrafish (Danio rerio), a promising human disease model owing to its high anatomical and genomic similarity to humans. To facilitate zebrafish-based function-driven disease gene discovery, we developed a genome-scale co-functional network of zebrafish genes, DanioNet (www.inetbio.org/danionet), which was constructed by Bayesian integration of genomics big data. Rigorous statistical assessment confirmed the high prediction capacity of DanioNet for a wide variety of human diseases. To demonstrate the feasibility of the function-driven disease gene discovery using DanioNet, we predicted genes for ciliopathies and performed experimental validation for eight candidate genes. We also validated the existence of heterozygous rare variants in the candidate genes of individuals with ciliopathies yet not in controls derived from the UK10K consortium, suggesting that these variants are potentially involved in enhancing the risk of ciliopathies. These results showed that an integrated genomics big data for a model animal of diseases can expand our opportunity for harnessing WES data in disease gene discovery. PMID:27903883

  15. Drosophila Sld5 is essential for normal cell cycle progression and maintenance of genomic integrity

    SciTech Connect

    Gouge, Catherine A.; Christensen, Tim W.

    2010-09-10

    Research highlights: Drosophila Sld5 interacts with Psf1, Psf2, and Mcm10. Haploinsufficiency of Sld5 leads to M-phase delay and genomic instability. Sld5 is also required for normal S phase progression. Abstract: Essential for the normal functioning of a cell is the maintenance of genomic integrity. Failure in this process is often catastrophic for the organism, leading to cell death or mis-proliferation. Central to genomic integrity is the faithful replication of DNA during S phase. The GINS complex has recently come to light as a critical player in DNA replication through stabilization of MCM2-7 and Cdc45 as a member of the CMG complex which is likely responsible for the processivity of helicase activity during S phase. The GINS complex is made up of 4 members in a 1:1:1:1 ratio: Psf1, Psf2, Psf3, and Sld5. Here we present the first analysis of the function of the Sld5 subunit in a multicellular organism. We show that Drosophila Sld5 interacts with Psf1, Psf2, and Mcm10 and that mutations in Sld5 lead to M and S phase delays with chromosomes exhibiting hallmarks of genomic instability.

  16. Genome puzzle master (GPM): an integrated pipeline for building and editing pseudomolecules from fragmented sequences

    PubMed Central

    Zhang, Jianwei; Kudrna, Dave; Mu, Ting; Li, Weiming; Copetti, Dario; Yu, Yeisoo; Goicoechea, Jose Luis; Lei, Yang; Wing, Rod A.

    2016-01-01

    Motivation: Next generation sequencing technologies have revolutionized our ability to rapidly and affordably generate vast quantities of sequence data. Once generated, raw sequences are assembled into contigs or scaffolds. However, these assemblies are mostly fragmented and inaccurate at the whole genome scale, largely due to the inability to integrate additional informative datasets (e.g. physical, optical and genetic maps). To address this problem, we developed a semi-automated software tool—Genome Puzzle Master (GPM)—that enables the integration of additional genomic signposts to edit and build ‘new-gen-assemblies’ that result in high-quality ‘annotation-ready’ pseudomolecules. Results: With GPM, loaded datasets can be connected to each other via their logical relationships which accomplishes tasks to ‘group,’ ‘merge,’ ‘order and orient’ sequences in a draft assembly. Manual editing can also be performed with a user-friendly graphical interface. Final pseudomolecules reflect a user’s total data package and are available for long-term project management. GPM is a web-based pipeline and an important part of a Laboratory Information Management System (LIMS) which can be easily deployed on local servers for any genome research laboratory. Availability and Implementation: The GPM (with LIMS) package is available at https://github.com/Jianwei-Zhang/LIMS Contacts: jzhang@mail.hzau.edu.cn or rwing@mail.arizona.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27318200

  17. Integrated Syntenic and Phylogenomic Analyses Reveal an Ancient Genome Duplication in Monocots

    PubMed Central

    Jiao, Yuannian; Li, Jingping; Tang, Haibao; Paterson, Andrew H.

    2014-01-01

    Unraveling widespread polyploidy events throughout plant evolution is a necessity for inferring the impacts of whole-genome duplication (WGD) on speciation and functional innovations, and for guiding identification of true orthologs in divergent taxa. Here, we employed integrated syntenic and phylogenomic analyses to reveal an ancient WGD that shaped the genomes of all commelinid monocots, including grasses, bromeliads, bananas (Musa acuminata), ginger, palms, and other plants of fundamental, agricultural, and/or horticultural interest. First, comprehensive phylogenomic analyses revealed 1421 putative gene families that retained ancient duplication shared by Musa (Zingiberales) and grass (Poales) genomes, indicating an ancient WGD in monocots. Intergenomic synteny blocks of Musa and Oryza were investigated, and 30 blocks were shown to be duplicated before Musa-Oryza divergence an estimated 120 to 150 million years ago. Synteny comparisons of four monocot (rice [Oryza sativa], sorghum [Sorghum bicolor], banana, and oil palm [Elaeis guineensis]) and two eudicot (grape [Vitis vinifera] and sacred lotus [Nelumbo nucifera]) genomes also support this additional WGD in monocots, herein called Tau (τ). Integrating synteny and phylogenomic comparisons achieves better resolution of ancient polyploidy events than either approach individually, a principle that is exemplified in the disambiguation of a WGD series of rho (ρ)-sigma (σ)-tau (τ) in the grass lineages that echoes the alpha (α)-beta (β)-gamma (γ) series previously revealed in the Arabidopsis thaliana lineage. PMID:25082857

  18. A comprehensive whole-genome integrated cytogenetic map for the alpaca (Lama pacos).

    PubMed

    Avila, Felipe; Baily, Malorie P; Perelman, Polina; Das, Pranab J; Pontius, Joan; Chowdhary, Renuka; Owens, Elaine; Johnson, Warren E; Merriwether, David A; Raudsepp, Terje

    2014-01-01

    Genome analysis of the alpaca (Lama pacos, LPA) has progressed slowly compared to other domestic species. Here, we report the development of the first comprehensive whole-genome integrated cytogenetic map for the alpaca using fluorescence in situ hybridization (FISH) and CHORI-246 BAC library clones. The map is comprised of 230 linearly ordered markers distributed among all 36 alpaca autosomes and the sex chromosomes. For the first time, markers were assigned to LPA14, 21, 22, 28, and 36. Additionally, 86 genes from 15 alpaca chromosomes were mapped in the dromedary camel (Camelus dromedarius, CDR), demonstrating exceptional synteny and linkage conservation between the 2 camelid genomes. Cytogenetic mapping of 191 protein-coding genes improved and refined the known Zoo-FISH homologies between camelids and humans: we discovered new homologous synteny blocks (HSBs) corresponding to HSA1-LPA/CDR11, HSA4-LPA/CDR31 and HSA7-LPA/CDR36, and revised the location of breakpoints for others. Overall, gene mapping was in good agreement with the Zoo-FISH and revealed remarkable evolutionary conservation of gene order within many human-camelid HSBs. Most importantly, 91 FISH-mapped markers effectively integrated the alpaca whole-genome sequence and the radiation hybrid maps with physical chromosomes, thus facilitating the improvement of the sequence assembly and the discovery of genes of biological importance.

  19. Integration of the full-length HPV16 genome in cervical cancer and Caski and Siha cell lines and the possible ways of HPV integration.

    PubMed

    Xu, Feng; Cao, Meng; Shi, Qinfeng; Chen, Hongwei; Wang, Yili; Li, Xu

    2015-04-01

    Integration of high-risk human papillomavirus (HPV) into the host genome is a key event for cervical carcinogenesis. Different methods have been used to explore the physical states of the HPV genome to reveal the mechanisms for malignant transformation of the infected cells. Consensus has been reached that, although variable portions of the HPV genome are deleted in the integrated HPV sequences, common disruption of the viral E2 gene has been demonstrated in different studies. Head-to-tail concatemers of the full-length HPV16 genome are another typical integration pattern of HPV16, typically found in Caski cell lines, but their prevalence in cervical cancer has never been tested. Here, by introducing a modified PCR, we identified these head-to-tail concatemers of full-length HPV genomes in HPV16 single-positive advanced cervical cancer. Our results show that more than half of the cases contain these integrated head-to-tail concatemers of full-length HPV16 genomes. Further studies in two cervical cell lines, Caski cells and Siha cells, revealed a correlation between the prevalence of the spliced variants of integrated HPV16 sequences and the full-length transcription of the integrated head-to-tail concatemers of the full-length HPV16 genome. Based on these results, we propose that HPV16 integrates into host cells by two mechanisms: one is shared with other DNA viruses and causes integration of head-to-tail concatemers of the viral genome; the other is related to a reverse transcription process, in which the integrated HPV sequence is generated by reverse transcription of the viral mRNA.

  20. Sendai virus, an RNA virus with no risk of genomic integration, delivers CRISPR/Cas9 for efficient gene editing.

    PubMed

    Park, Arnold; Hong, Patrick; Won, Sohui T; Thibault, Patricia A; Vigant, Frederic; Oguntuyo, Kasopefoluwa Y; Taft, Justin D; Lee, Benhur

    2016-01-01

    The advent of RNA-guided endonuclease (RGEN)-mediated gene editing, specifically via CRISPR/Cas9, has spurred intensive efforts to improve the efficiency of both RGEN delivery and targeted mutagenesis. The major viral vectors in use for delivery of Cas9 and its associated guide RNA, lentiviral and adeno-associated viral systems, have the potential for undesired random integration into the host genome. Here, we repurpose Sendai virus, an RNA virus with no viral DNA phase and that replicates solely in the cytoplasm, as a delivery system for efficient Cas9-mediated gene editing. The high efficiency of Sendai virus infection resulted in high rates of on-target mutagenesis in cell lines (75-98% at various endogenous and transgenic loci) and primary human monocytes (88% at the ccr5 locus) in the absence of any selection. In conjunction with extensive former work on Sendai virus as a promising gene therapy vector that can infect a wide range of cell types including hematopoietic stem cells, this proof-of-concept study opens the door to using Sendai virus as well as other related paramyxoviruses as versatile and efficient tools for gene editing.

  1. Sendai virus, an RNA virus with no risk of genomic integration, delivers CRISPR/Cas9 for efficient gene editing

    PubMed Central

    Park, Arnold; Hong, Patrick; Won, Sohui T; Thibault, Patricia A; Vigant, Frederic; Oguntuyo, Kasopefoluwa Y; Taft, Justin D; Lee, Benhur

    2016-01-01

    The advent of RNA-guided endonuclease (RGEN)-mediated gene editing, specifically via CRISPR/Cas9, has spurred intensive efforts to improve the efficiency of both RGEN delivery and targeted mutagenesis. The major viral vectors in use for delivery of Cas9 and its associated guide RNA, lentiviral and adeno-associated viral systems, have the potential for undesired random integration into the host genome. Here, we repurpose Sendai virus, an RNA virus with no viral DNA phase and that replicates solely in the cytoplasm, as a delivery system for efficient Cas9-mediated gene editing. The high efficiency of Sendai virus infection resulted in high rates of on-target mutagenesis in cell lines (75–98% at various endogenous and transgenic loci) and primary human monocytes (88% at the ccr5 locus) in the absence of any selection. In conjunction with extensive former work on Sendai virus as a promising gene therapy vector that can infect a wide range of cell types including hematopoietic stem cells, this proof-of-concept study opens the door to using Sendai virus as well as other related paramyxoviruses as versatile and efficient tools for gene editing. PMID:27606350

  2. MDI-GPU: accelerating integrative modelling for genomic-scale data using GP-GPU computing.

    PubMed

    Mason, Samuel A; Sayyid, Faiz; Kirk, Paul D W; Starr, Colin; Wild, David L

    2016-03-01

    The integration of multi-dimensional datasets remains a key challenge in systems biology and genomic medicine. Modern high-throughput technologies generate a broad array of different data types, providing distinct--but often complementary--information. However, the large amount of data adds burden to any inference task. Flexible Bayesian methods may reduce the necessity for strong modelling assumptions, but can also increase the computational burden. We present an improved implementation of a Bayesian correlated clustering algorithm, that permits integrated clustering to be routinely performed across multiple datasets, each with tens of thousands of items. By exploiting GPU based computation, we are able to improve runtime performance of the algorithm by almost four orders of magnitude. This permits analysis across genomic-scale data sets, greatly expanding the range of applications over those originally possible. MDI is available here: http://www2.warwick.ac.uk/fac/sci/systemsbiology/research/software/.

  3. How might flukes and tapeworms maintain genome integrity without a canonical piRNA pathway?

    PubMed

    Skinner, Danielle E; Rinaldi, Gabriel; Koziol, Uriel; Brehm, Klaus; Brindley, Paul J

    2014-03-01

    Surveillance by RNA interference is central to controlling the mobilization of transposable elements (TEs). In stem cells, Piwi argonaute (Ago) proteins and associated proteins repress mobilization of TEs to maintain genome integrity. This defense mechanism targeting TEs is termed the Piwi-interacting RNA (piRNA) pathway. In this opinion article, we draw attention to the situation that the genomes of cestodes and trematodes have lost the piwi and vasa genes that are hallmark characters of the germline multipotency program. This absence of Piwi-like Agos and Vasa helicases prompts the question: how does the germline of these flatworms withstand mobilization of TEs? Here, we present an interpretation of mechanisms likely to defend the germline integrity of parasitic flatworms.

  4. Solutions for data integration in functional genomics: a critical assessment and case study.

    PubMed

    Smedley, Damian; Swertz, Morris A; Wolstencroft, Katy; Proctor, Glenn; Zouberakis, Michael; Bard, Jonathan; Hancock, John M; Schofield, Paul

    2008-11-01

    The torrent of data emerging from the application of new technologies to functional genomics and systems biology can no longer be contained within the traditional modes of data sharing and publication, with the consequence that data is being deposited in, distributed across and disseminated through an increasing number of databases. The resulting fragmentation poses serious problems for the model organism community, which increasingly relies on data mining and computational approaches that require gathering of data from a range of sources. In the light of these problems, the European Commission has funded a coordination action, CASIMIR (coordination and sustainability of international mouse informatics resources), with a remit to assess the technical and social aspects of database interoperability that currently prevent the full realization of the potential of data integration in mouse functional genomics. In this article, we assess the current problems with interoperability, with particular reference to mouse functional genomics, and critically review the technologies that can be deployed to overcome them. We describe a typical use-case where an investigator wishes to gather data on variation, genomic context and metabolic pathway involvement for genes discovered in a genome-wide screen. We go on to develop an automated approach involving an in silico experimental workflow tool, Taverna, using web services, BioMart and MOLGENIS technologies for data retrieval. Finally, we focus on the current impediments to adopting such an approach in a wider context, and strategies to overcome them.

  5. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models

    DOE PAGES

    King, Zachary A.; Lu, Justin; Drager, Andreas; ...

    2015-10-17

    Genome-scale metabolic models are mathematically structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data.

  6. MEGANTE: a web-based system for integrated plant genome annotation.

    PubMed

    Numa, Hisataka; Itoh, Takeshi

    2014-01-01

    The recent advancement of high-throughput genome sequencing technologies has resulted in a considerable increase in demands for large-scale genome annotation. While annotation is a crucial step for downstream data analyses and experimental studies, this process requires substantial expertise and knowledge of bioinformatics. Here we present MEGANTE, a web-based annotation system that makes plant genome annotation easy for researchers unfamiliar with bioinformatics. Without any complicated configuration, users can perform genomic sequence annotations simply by uploading a sequence and selecting the species to query. MEGANTE automatically runs several analysis programs and integrates the results to select the appropriate consensus exon-intron structures and to predict open reading frames (ORFs) at each locus. Functional annotation, including a similarity search against known proteins and a functional domain search, are also performed for the predicted ORFs. The resultant annotation information is visualized with a widely used genome browser, GBrowse. For ease of analysis, the results can be downloaded in Microsoft Excel format. All of the query sequences and annotation results are stored on the server side so that users can access their own data from virtually anywhere on the web. The current release of MEGANTE targets 24 plant species from the Brassicaceae, Fabaceae, Musaceae, Poaceae, Salicaceae, Solanaceae, Rosaceae and Vitaceae families, and it allows users to submit a sequence up to 10 Mb in length and to save up to 100 sequences with the annotation information on the server. The MEGANTE web service is available at https://megante.dna.affrc.go.jp/.
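    The ORF prediction step mentioned in this abstract can be illustrated with a minimal sketch. It is not MEGANTE's method, which integrates several analysis programs and exon-intron structures; it only scans the forward strand of an already spliced sequence for ATG-to-stop stretches. The function name find_orfs, the min_codons parameter and the example sequence are hypothetical.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) half-open coordinates of forward-strand ORFs.

    An ORF here runs from the first ATG in a frame to the next in-frame
    stop codon (inclusive). Hypothetical helper for illustration only.
    """
    seq = seq.upper()
    orfs: List[Tuple[int, int]] = []
    for frame in range(3):
        start = None  # index of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Length counts the stop codon as well.
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


# Made-up example sequence with one complete ORF in frame 2.
example = "CCATGAAATGGTAACC"
orfs = find_orfs(example)
```

    A real annotation pipeline would also scan the reverse complement and work on predicted transcripts rather than raw genomic sequence.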

  7. RegPredict: an integrated system for regulon inference in prokaryotes by comparative genomics approach

    SciTech Connect

    Novichkov, Pavel S.; Rodionov, Dmitry A.; Stavrovskaya, Elena D.; Novichkova, Elena S.; Kazakov, Alexey E.; Gelfand, Mikhail S.; Arkin, Adam P.; Mironov, Andrey A.; Dubchak, Inna

    2010-05-26

    The RegPredict web server is designed to provide tools for reconstruction and analysis of microbial regulons using a comparative genomics approach. The server allows the user to rapidly generate reference sets of regulons and regulatory motif profiles in a group of prokaryotic genomes. The new concept of a cluster of co-regulated orthologous operons allows the user to distribute the analysis of large regulons and to perform the comparative analysis of multiple clusters independently. Two major workflows currently implemented in RegPredict are: (i) regulon reconstruction for a known regulatory motif and (ii) ab initio inference of a novel regulon using several scenarios for the generation of starting gene sets. RegPredict provides a comprehensive collection of manually curated positional weight matrices of regulatory motifs. It is based on genomic sequences, ortholog and operon predictions from MicrobesOnline. An interactive web interface of RegPredict integrates and presents diverse genomic and functional information about the candidate regulon members from several web resources. RegPredict is freely accessible at http://regpredict.lbl.gov.
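    As an illustration of how a positional weight matrix can be used to search for a known regulatory motif, the following is a minimal sketch. It is not RegPredict's implementation; it assumes a uniform background of 0.25 per base, a pseudocount of 1, and log-odds scoring. The names build_pwm and best_hit and the example sites are hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple


def build_pwm(sites: List[str], pseudocount: float = 1.0,
              background: float = 0.25) -> List[Dict[str, float]]:
    """Build a log-odds weight matrix from aligned, equal-length sites."""
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(width):
        column = [s[i] for s in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total)
                            / background)
            for base in "ACGT"
        })
    return pwm


def best_hit(pwm: List[Dict[str, float]],
             sequence: str) -> Optional[Tuple[int, float]]:
    """Return (start, score) of the best-scoring window, or None if the
    sequence is shorter than the motif. Expects only A, C, G, T."""
    width = len(pwm)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col[base] for col, base in zip(pwm, window))
        if best is None or score > best[1]:
            best = (start, score)
    return best


# Made-up training sites and target sequence, for illustration only.
pwm = build_pwm(["TTGACA", "TTGACA", "TTGACT", "TTGCCA"])
hit = best_hit(pwm, "GGGTTGACAGGG")
```

    In practice a score threshold would be chosen and both strands of each upstream region would be scanned.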

  8. Integrative Approaches for Studying Mitochondrial and Nuclear Genome Co-evolution in Oxidative Phosphorylation

    PubMed Central

    Sunnucks, Paul; Morales, Hernán E.; Lamb, Annika M.; Pavlova, Alexandra; Greening, Chris

    2017-01-01

    In animals, interactions among gene products of mitochondrial and nuclear genomes (mitonuclear interactions) are of profound fitness, evolutionary, and ecological significance. Most fundamentally, the oxidative phosphorylation (OXPHOS) complexes responsible for cellular bioenergetics are formed by the direct interactions of 13 mitochondrial-encoded and ∼80 nuclear-encoded protein subunits in most animals. It is expected that organisms will develop genomic architecture that facilitates co-adaptation of these mitonuclear interactions and enhances biochemical efficiency of OXPHOS complexes. In this perspective, we present principles and approaches to understanding the co-evolution of these interactions, with a novel focus on how genomic architecture might facilitate it. We advocate that recent interdisciplinary advances assist in the consolidation of links between genotype and phenotype. For example, advances in genomics allow us to unravel signatures of selection in mitochondrial and nuclear OXPHOS genes at population-relevant scales, while newly published complete atomic-resolution structures of the OXPHOS machinery enable more robust predictions of how these genes interact epistatically and co-evolutionarily. We use three case studies to show how integrative approaches have improved the understanding of mitonuclear interactions in OXPHOS, namely those driving high-altitude adaptation in bar-headed geese, allopatric population divergence in Tigriopus californicus copepods, and the genome architecture of nuclear genes coding for mitochondrial functions in the eastern yellow robin. PMID:28316610

  9. Prediction of clinical phenotypes in invasive breast carcinomas from the integration of radiomics and genomics data.

    PubMed

    Guo, Wentian; Li, Hui; Zhu, Yitan; Lan, Li; Yang, Shengjie; Drukker, Karen; Morris, Elizabeth; Burnside, Elizabeth; Whitman, Gary; Giger, Maryellen L; Ji, Yuan

    2015-10-01

    Genomic and radiomic imaging profiles of invasive breast carcinomas from The Cancer Genome Atlas and The Cancer Imaging Archive were integrated and a comprehensive analysis was conducted to predict clinical outcomes using the radiogenomic features. Variable selection via LASSO and logistic regression were used to select the most-predictive radiogenomic features for the clinical phenotypes, including pathological stage, lymph node metastasis, and status of estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2). Cross-validation with receiver operating characteristic (ROC) analysis was performed and the area under the ROC curve (AUC) was employed as the prediction metric. Higher AUCs were obtained in the prediction of pathological stage, ER, and PR status than for lymph node metastasis and HER2 status. Overall, the prediction performances by genomics alone, radiomics alone, and combined radiogenomics features showed statistically significant correlations with clinical outcomes; however, improvement on the prediction performance by combining genomics and radiomics data was not found to be statistically significant, most likely due to the small sample size of 91 cancer cases with 38 radiomic features and 144 genomic features.
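    The prediction metric named in this abstract, the area under the ROC curve, can be computed directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below is a minimal illustration under that definition, not the authors' analysis code; the function name roc_auc and the toy labels and scores are hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy data (made up): 1 = receptor-positive, 0 = receptor-negative.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(toy_labels, toy_scores)
```

    With cross-validation, the scores passed in would be the held-out predictions from each fold, so that the AUC reflects performance on cases the model did not see during fitting.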

  10. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models

    PubMed Central

    King, Zachary A.; Lu, Justin; Dräger, Andreas; Miller, Philip; Federowicz, Stephen; Lerman, Joshua A.; Ebrahim, Ali; Palsson, Bernhard O.; Lewis, Nathan E.

    2016-01-01

    Genome-scale metabolic models are mathematically-structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data. PMID:26476456

  11. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models

    SciTech Connect

    King, Zachary A.; Lu, Justin; Drager, Andreas; Miller, Philip; Federowicz, Stephen; Lerman, Joshua A.; Ebrahim, Ali; Palsson, Bernhard O.; Lewis, Nathan E.

    2015-10-17

    Genome-scale metabolic models are mathematically structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data.

  12. Integrated genomics and molecular breeding approaches for dissecting the complex quantitative traits in crop plants.

    PubMed

    Kujur, Alice; Saxena, Maneesha S; Bajaj, Deepak; Laxmi; Parida, Swarup K

    2013-12-01

    Enormous population growth, climate change and global warming are now considered major threats to agriculture and the world's food security. To improve the productivity and sustainability of agriculture, the development of high-yielding and durable abiotic and biotic stress-tolerant cultivars and/or climate-resilient crops is essential. Hence, understanding the molecular mechanisms and dissecting complex quantitative yield and stress tolerance traits is the prime objective of current agricultural biotechnology research. In recent years, tremendous progress has been made in plant genomics and molecular breeding research pertaining to conventional and next-generation whole genome, transcriptome and epigenome sequencing efforts, generation of huge genomic, transcriptomic and epigenomic resources and development of modern genomics-assisted breeding approaches in diverse crop genotypes with contrasting yield and abiotic stress tolerance traits. Unfortunately, the detailed molecular mechanisms and gene regulatory networks controlling such complex quantitative traits are not yet well understood in crop plants. Therefore, we propose an integrated strategy involving the enormous and diverse traditional and modern -omics (structural, functional, comparative and epigenomics) approaches/resources available and genomics-assisted breeding methods, which agricultural biotechnologists can adopt/utilize to dissect and decode the molecular and gene regulatory networks involved in the complex quantitative yield and stress tolerance traits in crop plants. This would provide clues and much needed inputs for rapid selection of novel functionally relevant molecular tags regulating such complex traits to expedite traditional and modern marker-assisted genetic enhancement studies in target crop species for developing high-yielding stress-tolerant varieties.

  13. Prediction of clinical phenotypes in invasive breast carcinomas from the integration of radiomics and genomics data

    PubMed Central

    Guo, Wentian; Li, Hui; Zhu, Yitan; Lan, Li; Yang, Shengjie; Drukker, Karen; Morris, Elizabeth; Burnside, Elizabeth; Whitman, Gary; Giger, Maryellen L.; Ji, Yuan; TCGA Breast Phenotype Research Group

    2015-01-01

    Genomic and radiomic imaging profiles of invasive breast carcinomas from The Cancer Genome Atlas and The Cancer Imaging Archive were integrated and a comprehensive analysis was conducted to predict clinical outcomes using the radiogenomic features. Variable selection via LASSO and logistic regression were used to select the most-predictive radiogenomic features for the clinical phenotypes, including pathological stage, lymph node metastasis, and status of estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2). Cross-validation with receiver operating characteristic (ROC) analysis was performed and the area under the ROC curve (AUC) was employed as the prediction metric. Higher AUCs were obtained in the prediction of pathological stage, ER, and PR status than for lymph node metastasis and HER2 status. Overall, the prediction performances by genomics alone, radiomics alone, and combined radiogenomics features showed statistically significant correlations with clinical outcomes; however, improvement on the prediction performance by combining genomics and radiomics data was not found to be statistically significant, most likely due to the small sample size of 91 cancer cases with 38 radiomic features and 144 genomic features. PMID:26835491

  14. Genomic Access to Monarch Migration Using TALEN and CRISPR/Cas9-Mediated Targeted Mutagenesis.

    PubMed

    Markert, Matthew J; Zhang, Ying; Enuameh, Metewo S; Reppert, Steven M; Wolfe, Scot A; Merlin, Christine

    2016-04-07

    The eastern North American monarch butterfly, Danaus plexippus, is an emerging model system to study the neural, molecular, and genetic basis of animal long-distance migration and animal clockwork mechanisms. While genomic studies have provided new insight into migration-associated and circadian clock genes, the general lack of simple and versatile reverse-genetic methods has limited in vivo functional analysis of candidate genes in this species. Here, we report the establishment of highly efficient and heritable gene mutagenesis methods in the monarch butterfly using transcriptional activator-like effector nucleases (TALENs) and CRISPR-associated RNA-guided nuclease Cas9 (CRISPR/Cas9). Using two clock gene loci, cryptochrome 2 and clock (clk), as candidates, we show that both TALENs and CRISPR/Cas9 generate high-frequency nonhomologous end-joining (NHEJ)-mediated mutations at targeted sites (up to 100%), and that injecting fewer than 100 eggs is sufficient to recover mutant progeny and generate monarch knockout lines in about 3 months. Our study also genetically defines monarch CLK as an essential component of the transcriptional activation complex of the circadian clock. The methods presented should not only greatly accelerate functional analyses of many aspects of monarch biology, but are also anticipated to facilitate the development of these tools in other nontraditional insect species as well as the development of homology-directed knock-ins.

  15. Interplay between arginine methylation and ubiquitylation regulates KLF4-mediated genome stability and carcinogenesis

    PubMed Central

    Hu, Dong; Gur, Mert; Zhou, Zhuan; Gamper, Armin; Hung, Mien-Chie; Fujita, Naoya; Lan, Li; Bahar, Ivet; Wan, Yong

    2015-01-01

    KLF4 is an important regulator of cell-fate decision, including DNA damage response and apoptosis. We identify a novel interplay between protein modifications in regulating KLF4 function. Here we show that arginine methylation of KLF4 by PRMT5 inhibits KLF4 ubiquitylation by VHL and thereby reduces KLF4 turnover, resulting in the elevation of KLF4 protein levels concomitant with increased transcription of KLF4-dependent p21 and reduced expression of KLF4-repressed Bax. Structure-based modelling and simulations provide insight into the molecular mechanisms of KLF4 recognition and catalysis by PRMT5. Following genotoxic stress, disruption of PRMT5-mediated KLF4 methylation leads to abrogation of KLF4 accumulation, which, in turn, attenuates cell cycle arrest. Mutating KLF4 methylation sites suppresses breast tumour initiation and progression, and immunohistochemical stain shows increased levels of both KLF4 and PRMT5 in breast cancer tissues. Taken together, our results point to a critical role for aberrant KLF4 regulation by PRMT5 in genome stability and breast carcinogenesis. PMID:26420673

  16. CRISPR/Cas9-Mediated Genome Editing as a Therapeutic Approach for Leber Congenital Amaurosis 10.

    PubMed

    Ruan, Guo-Xiang; Barry, Elizabeth; Yu, Dan; Lukason, Michael; Cheng, Seng H; Scaria, Abraham

    2017-02-01

    As the most common subtype of Leber congenital amaurosis (LCA), LCA10 is a severe retinal dystrophy caused by mutations in the CEP290 gene. The most frequent mutation found in patients with LCA10 is a deep intronic mutation in CEP290 that generates a cryptic splice donor site. The large size of the CEP290 gene prevents its use in adeno-associated virus (AAV)-mediated gene augmentation therapy. Here, we show that targeted genomic deletion using the clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 system represents a promising therapeutic approach for the treatment of patients with LCA10 bearing the CEP290 splice mutation. We generated a cellular model of LCA10 by introducing the CEP290 splice mutation into 293FT cells and we showed that guide RNA pairs coupled with SpCas9 were highly efficient at removing the intronic splice mutation and restoring the expression of wild-type CEP290. In addition, we demonstrated that a dual AAV system could effectively delete an intronic fragment of the Cep290 gene in the mouse retina. To minimize the immune response to prolonged expression of SpCas9, we developed a self-limiting CRISPR/Cas9 system that minimizes the duration of SpCas9 expression. These results support further studies to determine the therapeutic potential of CRISPR/Cas9-based strategies for the treatment of patients with LCA10.

  17. FLP-mediated site-specific recombination for genome modification in turfgrass.

    PubMed

    Hu, Qian; Nelson, Kimberly; Luo, Hong

    2006-11-01

    To develop molecular strategies for gene containment in genetically modified (GM) turfgrass, we have studied the feasibility of using the FLP/FRT site-specific DNA recombination system from yeast for controlled genome modification in turfgrass. Suspension cell cultures of creeping bentgrass (Agrostis stolonifera L.) and Kentucky bluegrass (Poa pratensis) were co-transformed with a FLP recombinase expression vector and a recombination-reporter test plasmid containing the beta-glucuronidase (gusA) gene, which was separated from the maize ubiquitin (ubi) promoter by an FRT-flanked blocking DNA sequence to prevent its transcription. GUS activity was observed in co-transformed cells, in which molecular analyses indicated that FLP-mediated excision of the blocking sequence had brought into proximity the upstream promoter and the downstream reporter gene, resulting in GUS expression. Functional evaluation of the FLP/FRT system using transgenic creeping bentgrass stably expressing FLP recombinase confirmed the observation in suspension cell culture. Our results indicate that the FLP/FRT system is a useful tool for genetic manipulation of turfgrass, pointing to the great potential of exploiting the system to develop molecular strategies for transgene containment in perennials.

  18. Genomic Access to Monarch Migration Using TALEN and CRISPR/Cas9-Mediated Targeted Mutagenesis

    PubMed Central

    Markert, Matthew J.; Zhang, Ying; Enuameh, Metewo S.; Reppert, Steven M.; Wolfe, Scot A.; Merlin, Christine

    2016-01-01

    The eastern North American monarch butterfly, Danaus plexippus, is an emerging model system to study the neural, molecular, and genetic basis of animal long-distance migration and animal clockwork mechanisms. While genomic studies have provided new insight into migration-associated and circadian clock genes, the general lack of simple and versatile reverse-genetic methods has limited in vivo functional analysis of candidate genes in this species. Here, we report the establishment of highly efficient and heritable gene mutagenesis methods in the monarch butterfly using transcriptional activator-like effector nucleases (TALENs) and CRISPR-associated RNA-guided nuclease Cas9 (CRISPR/Cas9). Using two clock gene loci, cryptochrome 2 and clock (clk), as candidates, we show that both TALENs and CRISPR/Cas9 generate high-frequency nonhomologous end-joining (NHEJ)-mediated mutations at targeted sites (up to 100%), and that injecting fewer than 100 eggs is sufficient to recover mutant progeny and generate monarch knockout lines in about 3 months. Our study also genetically defines monarch CLK as an essential component of the transcriptional activation complex of the circadian clock. The methods presented should not only greatly accelerate functional analyses of many aspects of monarch biology, but are also anticipated to facilitate the development of these tools in other nontraditional insect species as well as the development of homology-directed knock-ins. PMID:26837953

  19. The RNAPII-CTD Maintains Genome Integrity through Inhibition of Retrotransposon Gene Expression and Transposition

    PubMed Central

    Aristizabal, Maria J.; Negri, Gian Luca; Kobor, Michael S.

    2015-01-01

    RNA polymerase II (RNAPII) contains a unique C-terminal domain that is composed of heptapeptide repeats and which plays important regulatory roles during gene expression. RNAPII is responsible for the transcription of most protein-coding genes, a subset of non-coding genes, and retrotransposons. Retrotransposon transcription is the first step in their multiplication cycle, given that the RNA intermediate is required for the synthesis of cDNA, the material that is ultimately incorporated into a new genomic location. Retrotransposition can have grave consequences to genome integrity, as integration events can change the gene expression landscape or lead to alteration or loss of genetic information. Given that RNAPII transcribes retrotransposons, we sought to investigate if the RNAPII-CTD played a role in the regulation of retrotransposon gene expression. Importantly, we found that the RNAPII-CTD functioned to maintain genome integrity through inhibition of retrotransposon gene expression, as reducing CTD length significantly increased expression and transposition rates of Ty1 elements. Mechanistically, the increased Ty1 mRNA levels in the rpb1-CTD11 mutant were partly due to Cdk8-dependent alterations to the RNAPII-CTD phosphorylation status. In addition, Cdk8 alone contributed to Ty1 gene expression regulation by altering the occupancy of the gene-specific transcription factor Ste12. Loss of STE12 and TEC1 suppressed growth phenotypes of the RNAPII-CTD truncation mutant. Collectively, our results implicate Ste12 and Tec1 as general and important contributors to the Cdk8, RNAPII-CTD regulatory circuitry as it relates to the maintenance of genome integrity. PMID:26496706

  20. Efficient generation of knock-in transgenic zebrafish carrying reporter/driver genes by CRISPR/Cas9-mediated genome engineering.

    PubMed

    Kimura, Yukiko; Hisano, Yu; Kawahara, Atsuo; Higashijima, Shin-ichi

    2014-10-08

    The type II bacterial CRISPR/Cas9 system is rapidly becoming popular for genome engineering due to its simplicity, flexibility, and high efficiency. Recently, targeted knock-in of a long DNA fragment via homology-independent DNA repair has been achieved in zebrafish using the CRISPR/Cas9 system. This raised the possibility that knock-in transgenic zebrafish could be efficiently generated using CRISPR/Cas9. However, how widely this method can be applied for the targeted integration of foreign genes into endogenous genomic loci is unclear. Here, we report efficient generation of knock-in transgenic zebrafish that have cell-type-specific Gal4 or reporter gene expression. A donor plasmid containing a heat-shock promoter was co-injected with a short guide RNA (sgRNA) targeted for genome digestion, a sgRNA targeted for donor plasmid digestion, and Cas9 mRNA. We have succeeded in establishing stable knock-in transgenic fish with several different constructs for 4 genetic loci at a frequency exceeding 25%. Due to its simplicity, design flexibility, and high efficiency, we propose that CRISPR/Cas9-mediated knock-in will become a standard method for the generation of transgenic zebrafish.

  1. HSF-1-mediated cytoskeletal integrity determines thermotolerance and life span.

    PubMed

    Baird, Nathan A; Douglas, Peter M; Simic, Milos S; Grant, Ana R; Moresco, James J; Wolff, Suzanne C; Yates, John R; Manning, Gerard; Dillin, Andrew

    2014-10-17

    The conserved heat shock transcription factor-1 (HSF-1) is essential to cellular stress resistance and life-span determination. The canonical function of HSF-1 is to regulate a network of genes encoding molecular chaperones that protect proteins from damage caused by extrinsic environmental stress or intrinsic age-related deterioration. In Caenorhabditis elegans, we engineered a modified HSF-1 strain that increased stress resistance and longevity without enhanced chaperone induction. This health assurance acted through the regulation of the calcium-binding protein PAT-10. Loss of pat-10 caused a collapse of the actin cytoskeleton, stress resistance, and life span. Furthermore, overexpression of pat-10 increased actin filament stability, thermotolerance, and longevity, indicating that in addition to chaperone regulation, HSF-1 has a prominent role in cytoskeletal integrity, ensuring cellular function during stress and aging.

  2. HSF-1 mediated cytoskeletal integrity determines thermotolerance and lifespan

    PubMed Central

    Baird, Nathan A.; Douglas, Peter M.; Simic, Milos S.; Grant, Ana R.; Moresco, James J.; Wolff, Suzanne C.; Yates, John R.; Manning, Gerard; Dillin, Andrew

    2015-01-01

    The conserved transcription factor HSF-1 is essential to cellular stress resistance and organismal lifespan determination. The canonical function of HSF-1 is to regulate a network of molecular chaperones that maintain protein homeostasis during extrinsic environmental stresses or intrinsic age-related deterioration. In the metazoan C. elegans, we engineered a modified HSF-1 strain that increases stress resistance and longevity without enhancing chaperone induction. This HSF-1-dependent health assurance acts through the regulation of pat-10. Upon heat stress, pat-10 upregulation maintains a functional actin cytoskeleton and endocytic network. Loss of pat-10 causes a collapse of organismal health and failure of stress resistance. Furthermore, overexpression of pat-10 is sufficient to increase both thermotolerance and longevity by mechanisms that affect actin stability. Our findings indicate that in addition to chaperone induction, HSF-1 plays a prominent role in cytoskeletal integrity to ensure proper cellular function during times of stress and aging. PMID:25324391

  3. Anti-infectious drug repurposing using an integrated chemical genomics and structural systems biology approach.

    PubMed

    Ng, Clara; Hauptman, Ruth; Zhang, Yinliang; Bourne, Philip E; Xie, Lei

    2014-01-01

    The emergence of multi-drug and extensive drug resistance of microbes to antibiotics poses a great threat to human health. Although drug repurposing is a promising solution for accelerating the drug development process, its application to anti-infectious drug discovery is limited by the scope of existing phenotype-, ligand-, or target-based methods. In this paper we introduce a new computational strategy to determine the genome-wide molecular targets of bioactive compounds in both human and bacterial genomes. Our method is based on the use of a novel algorithm, ligand Enrichment of Network Topological Similarity (ligENTS), to map the chemical universe to its global pharmacological space. ligENTS outperforms the state-of-the-art algorithms in identifying novel drug-target relationships. Furthermore, we integrate ligENTS with our structural systems biology platform to identify drug repurposing opportunities via target similarity profiling. Using this integrated strategy, we have identified novel P. falciparum targets of drug-like active compounds from the Malaria Box, and suggest that a number of approved drugs may be active against malaria. This study demonstrates the potential of an integrative chemical genomics and structural systems biology approach to drug repurposing.

  4. A series of conditional shuttle vectors for targeted genomic integration in budding yeast

    PubMed Central

    Chou, Chia-Ching; Patel, Michael T.; Gartenberg, Marc R.

    2015-01-01

    The capacity of Saccharomyces cerevisiae to repair exposed DNA ends by homologous recombination has long been used by experimentalists to assemble plasmids from DNA fragments in vivo. While this approach works well for engineering extrachromosomal vectors, it is not well suited to the generation, recovery and reuse of integrative vectors. Here, we describe the creation of a series of conditional centromeric shuttle vectors, termed pXR vectors, that can be used for both plasmid assembly in vivo and targeted genomic integration. The defining feature of pXR vectors is that the DNA segment bearing the centromere and origin of replication, termed CEN/ARS, is flanked by a pair of loxP sites. Passaging the vectors through bacteria that express Cre recombinase reduces the loxP-CEN/ARS-loxP module to a single loxP site, thereby eliminating the ability to replicate autonomously in yeast. Each vector also contains a selectable marker gene, as well as a fragment of the HO locus, which permits targeted integration at a neutral genomic site. The pXR vectors provide a convenient and robust method to assemble DNAs for targeted genomic modifications. PMID:25736914
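
    The Cre-mediated reduction of a loxP-CEN/ARS-loxP module to a single loxP site, as described above, can be illustrated with a minimal string-level sketch. It assumes two directly repeated loxP sites on the same strand and uses the commonly cited 34-bp wild-type loxP sequence, which the abstract itself does not give. The function name cre_excise and the toy flanking and "CEN/ARS" sequences are hypothetical placeholders, not pXR vector sequences.

```python
# Commonly cited 34-bp wild-type loxP site (assumption: not stated in the abstract).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(seq: str, site: str = LOXP) -> str:
    """Model Cre excision between the first two directly repeated sites.

    Hypothetical helper: everything between the two sites, plus one copy of
    the site, is removed, so a single site remains. Sequences with fewer than
    two sites are returned unchanged. Inverted sites are not modelled.
    """
    first = seq.find(site)
    if first == -1:
        return seq
    second = seq.find(site, first + len(site))
    if second == -1:
        return seq
    # Keep everything up to and including the first site, then resume
    # immediately after the second site.
    return seq[: first + len(site)] + seq[second + len(site):]


# Toy construct: left flank + loxP + placeholder CEN/ARS + loxP + right flank.
toy_cen_ars = "CCCAAATTTGGG"  # placeholder only, not a real CEN/ARS sequence
toy_vector = "GGG" + LOXP + toy_cen_ars + LOXP + "TTT"
excised = cre_excise(toy_vector)
```

    In this toy model the excised product keeps one loxP site and loses the placeholder CEN/ARS segment, which mirrors why the passaged vector can no longer replicate autonomously in yeast.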

  5. CTDB: An Integrated Chickpea Transcriptome Database for Functional and Applied Genomics

    PubMed Central

    Patel, Ravi K.; Garg, Rohini; Jain, Mukesh

    2015-01-01

    Chickpea is an important grain legume used as a rich source of protein in the human diet. The narrow genetic diversity and limited availability of genomic resources are the major constraints in implementing breeding strategies and biotechnological interventions for genetic enhancement of chickpea. We developed an integrated Chickpea Transcriptome Database (CTDB), which provides a comprehensive web interface for visualization and easy retrieval of transcriptome data in chickpea. The database features many tools for similarity search, functional annotation (putative function, PFAM domain and gene ontology) search and comparative gene expression analysis. The current release of CTDB (v2.0) hosts transcriptome datasets with high quality functional annotation from cultivated (desi and kabuli types) and wild chickpea. A catalog of transcription factor families and their expression profiles in chickpea are available in the database. The gene expression data have been integrated to study the expression profiles of chickpea transcripts in major tissues/organs and various stages of flower development. The utilities, such as similarity search, ortholog identification and comparative gene expression have also been implemented in the database to facilitate comparative genomic studies among different legumes and Arabidopsis. Furthermore, the CTDB represents a resource for the discovery of functional molecular markers (microsatellites and single nucleotide polymorphisms) between different chickpea types. We anticipate that integrated information content of this database will accelerate the functional and applied genomic research for improvement of chickpea. The CTDB web service is freely available at http://nipgr.res.in/ctdb.html. PMID:26322998

  6. A series of conditional shuttle vectors for targeted genomic integration in budding yeast.

    PubMed

    Chou, Chia-Ching; Patel, Michael T; Gartenberg, Marc R

    2015-05-01

    The capacity of Saccharomyces cerevisiae to repair exposed DNA ends by homologous recombination has long been used by experimentalists to assemble plasmids from DNA fragments in vivo. While this approach works well for engineering extrachromosomal vectors, it is not well suited to the generation, recovery and reuse of integrative vectors. Here, we describe the creation of a series of conditional centromeric shuttle vectors, termed pXR vectors, that can be used for both plasmid assembly in vivo and targeted genomic integration. The defining feature of pXR vectors is that the DNA segment bearing the centromere and origin of replication, termed CEN/ARS, is flanked by a pair of loxP sites. Passaging the vectors through bacteria that express Cre recombinase reduces the loxP-CEN/ARS-loxP module to a single loxP site, thereby eliminating the ability to replicate autonomously in yeast. Each vector also contains a selectable marker gene, as well as a fragment of the HO locus, which permits targeted integration at a neutral genomic site. The pXR vectors provide a convenient and robust method to assemble DNAs for targeted genomic modifications.

  7. Tet3 and DNA replication mediate demethylation of both the maternal and paternal genomes in mouse zygotes.

    PubMed

    Shen, Li; Inoue, Azusa; He, Jin; Liu, Yuting; Lu, Falong; Zhang, Yi

    2014-10-02

    With the exception of imprinted genes and certain repeats, DNA methylation is globally erased during preimplantation development. Recent studies have suggested that Tet3-mediated oxidation of 5-methylcytosine (5mC) and DNA replication-dependent dilution both contribute to global paternal DNA demethylation, but demethylation of the maternal genome occurs via replication. Here we present genome-scale DNA methylation maps for both the paternal and maternal genomes of Tet3-depleted and/or DNA replication-inhibited zygotes. In both genomes, we found that inhibition of DNA replication blocks DNA demethylation independently from Tet3 function and that Tet3 facilitates DNA demethylation largely by coupling with DNA replication. For both genomes, our data indicate that replication-dependent dilution is the major contributor to demethylation, but Tet3 plays an important role, particularly at certain loci. Our study thus defines the respective functions of Tet3 and DNA replication in paternal DNA demethylation and reveals an unexpected contribution of Tet3 to demethylation of the maternal genome.

  8. Endonuclease mediated genome editing in drug discovery and development: promises and challenges.

    PubMed

    Prabhu, Vidya; Xu, Han

    Site-specific genome editing has been gradually employed in the drug discovery and development process over the past few decades. The recent development of CRISPR technology has significantly accelerated the incorporation of genome editing into the bench-to-bedside process. In this review, we summarize examples of applications of genome editing in the drug discovery and development process. We also discuss current hurdles of genome editing and their solutions.

  9. Construction of an Ortholog Database Using the Semantic Web Technology for Integrative Analysis of Genomic Data

    PubMed Central

    Chiba, Hirokazu; Nishide, Hiroyo; Uchiyama, Ikuo

    2015-01-01

    Recently, various types of biological data, including genomic sequences, have been rapidly accumulating. To discover biological knowledge from such growing heterogeneous data, a flexible framework for data integration is necessary. Ortholog information is a central resource for interlinking corresponding genes among different organisms, and the Semantic Web provides a key technology for the flexible integration of heterogeneous data. We have constructed an ortholog database using the Semantic Web technology, aiming at the integration of numerous genomic data and various types of biological information. To formalize the structure of the ortholog information in the Semantic Web, we have constructed the Ortholog Ontology (OrthO). While the OrthO is a compact ontology for general use, it is designed to be extended to the description of database-specific concepts. On the basis of OrthO, we described the ortholog information from our Microbial Genome Database for Comparative Analysis (MBGD) in the form of Resource Description Framework (RDF) and made it available through the SPARQL endpoint, which accepts arbitrary queries specified by users. In this framework based on the OrthO, the biological data of different organisms can be integrated using the ortholog information as a hub. Besides, the ortholog information from different data sources can be compared with each other using the OrthO as a shared ontology. Here we show some examples demonstrating that the ortholog information described in RDF can be used to link various biological data such as taxonomy information and Gene Ontology. Thus, the ortholog database using the Semantic Web technology can contribute to biological knowledge discovery through integrative data analysis. PMID:25875762

  10. Construction of an ortholog database using the semantic web technology for integrative analysis of genomic data.

    PubMed

    Chiba, Hirokazu; Nishide, Hiroyo; Uchiyama, Ikuo

    2015-01-01

    Recently, various types of biological data, including genomic sequences, have been rapidly accumulating. To discover biological knowledge from such growing heterogeneous data, a flexible framework for data integration is necessary. Ortholog information is a central resource for interlinking corresponding genes among different organisms, and the Semantic Web provides a key technology for the flexible integration of heterogeneous data. We have constructed an ortholog database using the Semantic Web technology, aiming at the integration of numerous genomic data and various types of biological information. To formalize the structure of the ortholog information in the Semantic Web, we have constructed the Ortholog Ontology (OrthO). While the OrthO is a compact ontology for general use, it is designed to be extended to the description of database-specific concepts. On the basis of OrthO, we described the ortholog information from our Microbial Genome Database for Comparative Analysis (MBGD) in the form of Resource Description Framework (RDF) and made it available through the SPARQL endpoint, which accepts arbitrary queries specified by users. In this framework based on the OrthO, the biological data of different organisms can be integrated using the ortholog information as a hub. Besides, the ortholog information from different data sources can be compared with each other using the OrthO as a shared ontology. Here we show some examples demonstrating that the ortholog information described in RDF can be used to link various biological data such as taxonomy information and Gene Ontology. Thus, the ortholog database using the Semantic Web technology can contribute to biological knowledge discovery through integrative data analysis.

  11. Integrated physical, genetic and genome map of chickpea (Cicer arietinum L.).

    PubMed

    Varshney, Rajeev K; Mir, Reyazul Rouf; Bhatia, Sabhyata; Thudi, Mahendar; Hu, Yuqin; Azam, Sarwar; Zhang, Yong; Jaganathan, Deepa; You, Frank M; Gao, Jinliang; Riera-Lizarazu, Oscar; Luo, Ming-Cheng

    2014-03-01

    A physical map of chickpea was developed for the reference chickpea genotype (ICC 4958) using bacterial artificial chromosome (BAC) libraries targeting 71,094 clones (~12× coverage). High information content fingerprinting (HICF) of these clones gave high-quality fingerprinting data for 67,483 clones, and 1,174 contigs comprising 46,112 clones and 3,256 singletons were defined. In brief, 574 Mb genome size was assembled in 1,174 contigs with an average of 0.49 Mb per contig and 3,256 singletons represent 407 Mb genome. The physical map was linked with two genetic maps with the help of 245 BAC-end sequence (BES)-derived simple sequence repeat (SSR) markers. This allowed locating some of the BACs in the vicinity of some important quantitative trait loci (QTLs) for drought tolerance and resistance to Fusarium wilt and Ascochyta blight. In addition, fingerprinted contig (FPC) assembly was also integrated with the draft genome sequence of chickpea. As a result, ~965 BACs including 163 minimum tiling path (MTP) clones could be mapped on eight pseudo-molecules of chickpea forming 491 hypothetical contigs representing 54,013,992 bp (~54 Mb) of the draft genome. Comprehensive analysis of markers in abiotic and biotic stress tolerance QTL regions led to identification of 654, 306 and 23 genes in drought tolerance "QTL-hotspot" region, Ascochyta blight resistance QTL region and Fusarium wilt resistance QTL region, respectively. Integrated physical, genetic and genome map should provide a foundation for cloning and isolation of QTLs/genes for molecular dissection of traits as well as markers for molecular breeding for chickpea improvement.
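
    As a quick consistency check on the summary figures quoted above, the short sketch below recomputes the mean contig size and the megabase conversion from the numbers given in the abstract. The variable names are illustrative, and 1 Mb is taken as 1,000,000 bp, which is an assumption about the authors' convention.

```python
# Figures taken directly from the abstract.
assembled_mb = 574            # Mb assembled into contigs
n_contigs = 1174              # number of contigs
draft_mapped_bp = 54_013_992  # bp of draft genome covered by hypothetical contigs

# Mean contig size in Mb (the abstract reports 0.49 Mb per contig).
mean_contig_mb = assembled_mb / n_contigs

# Convert the mapped draft-genome span to Mb, assuming 1 Mb = 1,000,000 bp
# (the abstract reports ~54 Mb).
draft_mapped_mb = draft_mapped_bp / 1_000_000

print(f"mean contig size: {mean_contig_mb:.2f} Mb")
print(f"mapped draft genome: {draft_mapped_mb:.1f} Mb")
```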

  12. NAHR-mediated copy-number variants in a clinical population: Mechanistic insights into both genomic disorders and Mendelizing traits

    PubMed Central

    Dittwald, Piotr; Gambin, Tomasz; Szafranski, Przemyslaw; Li, Jian; Amato, Stephen; Divon, Michael Y.; Rodríguez Rojas, Lisa Ximena; Elton, Lindsay E.; Scott, Daryl A.; Schaaf, Christian P.; Torres-Martinez, Wilfredo; Stevens, Abby K.; Rosenfeld, Jill A.; Agadi, Satish; Francis, David; Kang, Sung-Hae L.; Breman, Amy; Lalani, Seema R.; Bacino, Carlos A.; Bi, Weimin; Milosavljevic, Aleksandar; Beaudet, Arthur L.; Patel, Ankita; Shaw, Chad A.; Lupski, James R.; Gambin, Anna; Cheung, Sau Wai; Stankiewicz, Pawel

    2013-01-01

    We delineated and analyzed directly oriented paralogous low-copy repeats (DP-LCRs) in the most recent version of the human haploid reference genome. The computationally defined DP-LCRs were cross-referenced with our chromosomal microarray analysis (CMA) database of 25,144 patients subjected to genome-wide assays. This computationally guided approach to the empirically derived large data set allowed us to investigate genomic rearrangement relative frequencies and identify new loci for recurrent nonallelic homologous recombination (NAHR)-mediated copy-number variants (CNVs). The most commonly observed recurrent CNVs were NPHP1 duplications (233), CHRNA7 duplications (175), and 22q11.21 deletions (DiGeorge/velocardiofacial syndrome, 166). In the ∼25% of CMA cases for which parental studies were available, we identified 190 de novo recurrent CNVs. In this group, the most frequently observed events were deletions of 22q11.21 (48), 16p11.2 (autism, 34), and 7q11.23 (Williams-Beuren syndrome, 11). Several features of DP-LCRs, including length, distance between NAHR substrate elements, DNA sequence identity (fraction matching), GC content, and concentration of the homologous recombination (HR) hot spot motif 5′-CCNCCNTNNCCNC-3′, correlate with the frequencies of the recurrent CNVs events. Four novel adjacent DP-LCR-flanked and NAHR-prone regions, involving 2q12.2q13, were elucidated in association with novel genomic disorders. Our study quantitates genome architectural features responsible for NAHR-mediated genomic instability and further elucidates the role of NAHR in human disease. PMID:23657883
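
    The "concentration of the HR hot spot motif" mentioned above can be made concrete with a small sketch. The code below is not the authors' pipeline. It only shows one plausible way to count occurrences of 5'-CCNCCNTNNCCNC-3' on both strands of a sequence, treating N as any base. The function names and the toy sequence are hypothetical.

```python
import re

# Degenerate motif quoted in the abstract (5'->3'); N means any base.
HOTSPOT_MOTIF = "CCNCCNTNNCCNC"

# Minimal IUPAC-to-regex table covering only the symbols used here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def motif_to_regex(motif: str) -> str:
    """Translate a degenerate motif into a regular-expression string."""
    return "".join(IUPAC[base] for base in motif)


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string (N stays N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def count_motif(seq: str, motif: str = HOTSPOT_MOTIF) -> int:
    """Count matches on both strands; lookahead allows overlapping hits."""
    seq = seq.upper()
    forward = re.compile("(?=" + motif_to_regex(motif) + ")")
    reverse = re.compile("(?=" + motif_to_regex(revcomp(motif)) + ")")
    return len(forward.findall(seq)) + len(reverse.findall(seq))


def motif_density_per_kb(seq: str, motif: str = HOTSPOT_MOTIF) -> float:
    """Matches per 1,000 bp of sequence (0.0 for an empty sequence)."""
    return 1000.0 * count_motif(seq, motif) / len(seq) if seq else 0.0


# Toy sequence containing one forward-strand instance of the motif.
toy_seq = "AA" + "CCACCGTAACCTC" + "GG"
hits = count_motif(toy_seq)
```

    Scanning both strands matters because an LCR annotated on one strand may carry the motif on the other. The motif is not its own reverse complement, so a single site is not counted twice.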

  13. NAHR-mediated copy-number variants in a clinical population: mechanistic insights into both genomic disorders and Mendelizing traits.

    PubMed

    Dittwald, Piotr; Gambin, Tomasz; Szafranski, Przemyslaw; Li, Jian; Amato, Stephen; Divon, Michael Y; Rodríguez Rojas, Lisa Ximena; Elton, Lindsay E; Scott, Daryl A; Schaaf, Christian P; Torres-Martinez, Wilfredo; Stevens, Abby K; Rosenfeld, Jill A; Agadi, Satish; Francis, David; Kang, Sung-Hae L; Breman, Amy; Lalani, Seema R; Bacino, Carlos A; Bi, Weimin; Milosavljevic, Aleksandar; Beaudet, Arthur L; Patel, Ankita; Shaw, Chad A; Lupski, James R; Gambin, Anna; Cheung, Sau Wai; Stankiewicz, Pawel

    2013-09-01

    We delineated and analyzed directly oriented paralogous low-copy repeats (DP-LCRs) in the most recent version of the human haploid reference genome. The computationally defined DP-LCRs were cross-referenced with our chromosomal microarray analysis (CMA) database of 25,144 patients subjected to genome-wide assays. This computationally guided approach to the empirically derived large data set allowed us to investigate genomic rearrangement relative frequencies and identify new loci for recurrent nonallelic homologous recombination (NAHR)-mediated copy-number variants (CNVs). The most commonly observed recurrent CNVs were NPHP1 duplications (233), CHRNA7 duplications (175), and 22q11.21 deletions (DiGeorge/velocardiofacial syndrome, 166). In the ∼25% of CMA cases for which parental studies were available, we identified 190 de novo recurrent CNVs. In this group, the most frequently observed events were deletions of 22q11.21 (48), 16p11.2 (autism, 34), and 7q11.23 (Williams-Beuren syndrome, 11). Several features of DP-LCRs, including length, distance between NAHR substrate elements, DNA sequence identity (fraction matching), GC content, and concentration of the homologous recombination (HR) hot spot motif 5'-CCNCCNTNNCCNC-3', correlate with the frequencies of the recurrent CNVs events. Four novel adjacent DP-LCR-flanked and NAHR-prone regions, involving 2q12.2q13, were elucidated in association with novel genomic disorders. Our study quantitates genome architectural features responsible for NAHR-mediated genomic instability and further elucidates the role of NAHR in human disease.

  14. Maintaining Pedagogical Integrity of a Computer Mediated Course Delivery in Social Foundations

    ERIC Educational Resources Information Center

    Stewart, Shelley; Cobb-Roberts, Deirdre; Shircliffe, Barbara J.

    2013-01-01

    Transforming a face to face course to a computer mediated format in social foundations (interdisciplinary field in education), while maintaining pedagogical integrity, involves strategic collaboration between instructional technologists and content area experts. This type of planned partnership requires open dialogue and a mutual respect for prior…

  15. Genome-wide variant analysis of simplex autism families with an integrative clinical-bioinformatics pipeline

    PubMed Central

    Jiménez-Barrón, Laura T.; O'Rawe, Jason A.; Wu, Yiyang; Yoon, Margaret; Fang, Han; Iossifov, Ivan; Lyon, Gholson J.

    2015-01-01

    Autism spectrum disorders (ASDs) are a group of developmental disabilities that affect social interaction and communication and are characterized by repetitive behaviors. There is now a large body of evidence that suggests a complex role of genetics in ASDs, in which many different loci are involved. Although many current population-scale genomic studies have been demonstrably fruitful, these studies generally focus on analyzing a limited part of the genome or use a limited set of bioinformatics tools. These limitations preclude the analysis of genome-wide perturbations that may contribute to the development and severity of ASD-related phenotypes. To overcome these limitations, we have developed and utilized an integrative clinical and bioinformatics pipeline for generating a more complete and reliable set of genomic variants for downstream analyses. Our study focuses on the analysis of three simplex autism families consisting of one affected child, unaffected parents, and one unaffected sibling. All members were clinically evaluated and widely phenotyped. Genotyping arrays and whole-genome sequencing were performed on each member, and the resulting sequencing data were analyzed using a variety of available bioinformatics tools. We searched for rare variants of putative functional impact that were found to be segregating according to de novo, autosomal recessive, X-linked, mitochondrial, and compound heterozygote transmission models. The resulting candidate variants included three small heterozygous copy-number variations (CNVs), a rare heterozygous de novo nonsense mutation in MYBBP1A located within exon 1, and a novel de novo missense variant in LAMB3. Our work demonstrates how more comprehensive analyses that include rich clinical data and whole-genome sequencing data can generate reliable results for use in downstream investigations. PMID:27148569

  16. Non-canonical integration events in Pichia pastoris encountered during standard transformation analysed with genome sequencing

    PubMed Central

    Schwarzhans, Jan-Philipp; Wibberg, Daniel; Winkler, Anika; Luttermann, Tobias; Kalinowski, Jörn; Friehs, Karl

    2016-01-01

    The non-conventional yeast Pichia pastoris is a popular host for recombinant protein production in scientific research and industry. Typically, the expression cassette is integrated into the genome via homologous recombination. Due to unknown integration events, a large clonal variability is often encountered consisting of clones with different productivities as well as aberrant morphological or growth characteristics. In this study, we analysed several clones with abnormal colony morphology and discovered unpredicted integration events via whole genome sequencing. These include (i) the relocation of the locus targeted for replacement to another chromosome (ii) co-integration of DNA from the E. coli plasmid host and (iii) the disruption of untargeted genes affecting colony morphology. Most of these events have not been reported so far in literature and present challenges for genetic engineering approaches in this yeast. Especially, the presence and independent activity of E. coli DNA elements in P. pastoris is of concern. In our study, we provide a deeper insight into these events and their potential origins. Steps preventing or reducing the risk for these phenomena are proposed and will help scientists working on genetic engineering of P. pastoris or similar non-conventional yeast to better understand and control clonal variability. PMID:27958335

  17. Integration of HIV in the Human Genome: Which Sites Are Preferential? A Genetic and Statistical Assessment

    PubMed Central

    Gonçalves, Juliana; Moreira, Elsa; Sequeira, Inês J.; Rodrigues, António S.; Rueff, José; Brás, Aldina

    2016-01-01

    Chromosomal fragile sites (FSs) are loci where gaps and breaks may occur and are preferential integration targets for some viruses, for example, Hepatitis B, Epstein-Barr virus, HPV16, HPV18, and MLV vectors. However, the integration of the human immunodeficiency virus (HIV) in Giemsa bands and in FSs is not yet completely clear. This study aimed to assess the integration preferences of HIV in FSs and in Giemsa bands using an in silico study. HIV integration positions from Jurkat cells were used and two nonparametric tests were applied to compare HIV integration in dark versus light bands and in FS versus non-FS (NFSs). The results show that light bands are preferential targets for integration of HIV-1 in Jurkat cells and also that it integrates with equal intensity in FSs and in NFSs. The data indicates that HIV displays different preferences for FSs compared to other viruses. The aim was to develop and apply an approach to predict the conditions and constraints of HIV insertion in the human genome which seems to adequately complement empirical data. PMID:27294106

  18. ZikaVR: An Integrated Zika Virus Resource for Genomics, Proteomics, Phylogenetic and Therapeutic Analysis

    PubMed Central

    Gupta, Amit Kumar; Kaur, Karambir; Rajput, Akanksha; Dhanda, Sandeep Kumar; Sehgal, Manika; Khan, Md. Shoaib; Monga, Isha; Dar, Showkat Ahmad; Singh, Sandeep; Nagpal, Gandharva; Usmani, Salman Sadullah; Thakur, Anamika; Kaur, Gazaldeep; Sharma, Shivangi; Bhardwaj, Aman; Qureshi, Abid; Raghava, Gajendra Pal Singh; Kumar, Manoj

    2016-01-01

    Current Zika virus (ZIKV) outbreaks, which have spread in several areas of Africa, Southeast Asia, and the Pacific islands, have been declared a global health emergency by the World Health Organization (WHO). ZIKV causes Zika fever and illness ranging from severe autoimmune to neurological complications in humans. To facilitate research on this virus, we have developed an integrative multi-omics platform, ZikaVR (http://bioinfo.imtech.res.in/manojk/zikavr/), dedicated to ZIKV genomic, proteomic and therapeutic knowledge. It comprises whole genome sequences and their respective functional information regarding proteins, genes, and structural content. Additionally, it delivers sophisticated analyses such as whole-genome alignments, conservation and variation, CpG islands, codon context, usage bias and phylogenetic inferences at the whole genome and proteome level with a user-friendly visual environment. Further, glycosylation sites and molecular diagnostic primers were also analyzed. Most importantly, we also proposed potential therapeutically imperative constituents, namely vaccine epitopes, siRNAs, miRNAs, sgRNAs and repurposing drug candidates. PMID:27633273

  19. ZikaVR: An Integrated Zika Virus Resource for Genomics, Proteomics, Phylogenetic and Therapeutic Analysis.

    PubMed

    Gupta, Amit Kumar; Kaur, Karambir; Rajput, Akanksha; Dhanda, Sandeep Kumar; Sehgal, Manika; Khan, Md Shoaib; Monga, Isha; Dar, Showkat Ahmad; Singh, Sandeep; Nagpal, Gandharva; Usmani, Salman Sadullah; Thakur, Anamika; Kaur, Gazaldeep; Sharma, Shivangi; Bhardwaj, Aman; Qureshi, Abid; Raghava, Gajendra Pal Singh; Kumar, Manoj

    2016-09-16

    Current Zika virus (ZIKV) outbreaks, which have spread in several areas of Africa, Southeast Asia, and the Pacific islands, have been declared a global health emergency by the World Health Organization (WHO). ZIKV causes Zika fever and illness ranging from severe autoimmune to neurological complications in humans. To facilitate research on this virus, we have developed an integrative multi-omics platform, ZikaVR (http://bioinfo.imtech.res.in/manojk/zikavr/), dedicated to ZIKV genomic, proteomic and therapeutic knowledge. It comprises whole genome sequences and their respective functional information regarding proteins, genes, and structural content. Additionally, it delivers sophisticated analyses such as whole-genome alignments, conservation and variation, CpG islands, codon context, usage bias and phylogenetic inferences at the whole genome and proteome level with a user-friendly visual environment. Further, glycosylation sites and molecular diagnostic primers were also analyzed. Most importantly, we also proposed potential therapeutically imperative constituents, namely vaccine epitopes, siRNAs, miRNAs, sgRNAs and repurposing drug candidates.

  20. MicrobesOnline: an integrated portal for comparative and functional genomics

    SciTech Connect

    Dehal, Paramvir S.; Joachimiak, Marcin P.; Price, Morgan N.; Bates, John T.; Baumohl, Jason K.; Chivian, Dylan; Friedland, Greg D.; Huang, Katherine H.; Keller, Keith; Novichkov, Pavel S.; Dubchak, Inna L.; Alm, Eric J.; Arkin, Adam P.

    2009-09-17

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.

  1. MicrobesOnline: an integrated portal for comparative and functional genomics

    SciTech Connect

    Dehal, Paramvir; Joachimiak, Marcin; Price, Morgan; Bates, John; Baumohl, Jason; Chivian, Dylan; Friedland, Greg; Huang, Kathleen; Keller, Keith; Novichkov, Pavel; Dubchak, Inna; Alm, Eric; Arkin, Adam

    2011-07-14

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.

  2. LegumeIP: an integrative database for comparative genomics and transcriptomics of model legumes.

    PubMed

    Li, Jun; Dai, Xinbin; Liu, Tingsong; Zhao, Patrick Xuechun

    2012-01-01

    Legumes play a vital role in maintaining the nitrogen cycle of the biosphere. They conduct symbiotic nitrogen fixation through endosymbiotic relationships with bacteria in root nodules. However, this and other characteristics of legumes, including mycorrhization, compound leaf development and profuse secondary metabolism, are absent in the typical model plant Arabidopsis thaliana. We present LegumeIP (http://plantgrn.noble.org/LegumeIP/), an integrative database for comparative genomics and transcriptomics of model legumes, for studying gene function and genome evolution in legumes. LegumeIP compiles gene and gene family information, syntenic and phylogenetic context and tissue-specific transcriptomic profiles. The database holds the genomic sequences of three model legumes, Medicago truncatula, Glycine max and Lotus japonicus plus two reference plant species, A. thaliana and Populus trichocarpa, with annotations based on UniProt, InterProScan, Gene Ontology and the Kyoto Encyclopedia of Genes and Genomes databases. LegumeIP also contains large-scale microarray and RNA-Seq-based gene expression data. Our new database is capable of systematic synteny analysis across M. truncatula, G. max, L. japonicus and A. thaliana, as well as construction and phylogenetic analysis of gene families across the five hosted species. Finally, LegumeIP provides comprehensive search and visualization tools that enable flexible queries based on gene annotation, gene family, synteny and relative gene expression.

  3. MIPS Arabidopsis thaliana Database (MAtDB): an integrated biological knowledge resource for plant genomics.

    PubMed

    Schoof, Heiko; Ernst, Rebecca; Nazarov, Vladimir; Pfeifer, Lukas; Mewes, Hans-Werner; Mayer, Klaus F X

    2004-01-01

    Arabidopsis thaliana is the most widely studied model plant. Functional genomics is intensively underway in many laboratories worldwide. Beyond the basic annotation of the primary sequence data, the annotated genetic elements of Arabidopsis must be linked to diverse biological data and higher order information such as metabolic or regulatory pathways. The MIPS Arabidopsis thaliana database MAtDB aims to provide a comprehensive resource for Arabidopsis as a genome model that serves as a primary reference for research in plants and is suitable for transfer of knowledge to other plants, especially crops. The genome sequence as a common backbone serves as a scaffold for the integration of data, while, in a complementary effort, these data are enhanced through the application of state-of-the-art bioinformatics tools. This information is visualized on a genome-wide and a gene-by-gene basis with access both for web users and applications. This report updates the information given in a previous report and provides an outlook on further developments. The MAtDB web interface can be accessed at http://mips.gsf.de/proj/thal/db.

  4. Gross Deletions Involving IGHM, BTK, or Artemis: A Model for Genomic Lesions Mediated by Transposable Elements

    PubMed Central

    van Zelm, Menno C.; Geertsema, Corinne; Nieuwenhuis, Nicole; de Ridder, Dick; Conley, Mary Ellen; Schiff, Claudine; Tezcan, Ilhan; Bernatowska, Ewa; Hartwig, Nico G.; Sanders, Elisabeth A.M.; Litzman, Jiri; Kondratenko, Irina; van Dongen, Jacques J.M.; van der Burg, Mirjam

    2008-01-01

    Most genetic disruptions underlying human disease are microlesions, whereas gross lesions are rare with gross deletions being most frequently found (6%). Similar observations have been made in primary immunodeficiency genes, such as BTK, but for unknown reasons the IGHM and DCLRE1C (Artemis) gene defects frequently represent gross deletions (∼60%). We characterized the gross deletion breakpoints in IGHM-, BTK-, and Artemis-deficient patients. The IGHM deletion breakpoints did not show involvement of recombination signal sequences or immunoglobulin switch regions. Instead, five IGHM, eight BTK, and five unique Artemis breakpoints were located in or near sequences derived from transposable elements (TE). The breakpoints of four out of five disrupted Artemis alleles were located in highly homologous regions, similar to Ig subclass deficiencies and Vh deletion polymorphisms. Nevertheless, these observations suggest a role for TEs in mediating gross deletions. The identified gross deletion breakpoints were mostly located in TE subclasses that were specifically overrepresented in the involved gene as compared to the average in the human genome. This concerned both long (LINE1) and short (Alu, MIR) interspersed elements, as well as LTR retrotransposons (ERV). Furthermore, a high total TE content (>40%) was associated with an increased frequency of gross deletions. Both findings were further investigated and confirmed in a total set of 20 genes disrupted in human disease. Thus, to our knowledge for the first time, we provide evidence that a high TE content, irrespective of the type of element, results in the increased incidence of gross deletions as gene disruption underlying human disease. PMID:18252213

  5. Functional genomic analysis of cotton genes with agrobacterium-mediated virus-induced gene silencing.

    PubMed

    Gao, Xiquan; Shan, Libo

    2013-01-01

    Cotton (Gossypium spp.) is one of the most agronomically important crops worldwide for its unique textile fiber production and serving as food and feed stock. Molecular breeding and genetic engineering of useful genes into cotton have emerged as advanced approaches to improve cotton yield, fiber quality, and resistance to various stresses. However, the understanding of gene functions and regulations in cotton is largely hindered by the limited molecular and biochemical tools. Here, we describe the method of an Agrobacterium infiltration-based virus-induced gene silencing (VIGS) assay to transiently silence endogenous genes in cotton at 2-week-old seedling stage. The genes of interest could be readily silenced with a consistently high efficiency. To monitor gene silencing efficiency, we have cloned cotton GrCla1 from G. raimondii, a homolog gene of Arabidopsis Cloroplastos alterados 1 (AtCla1) involved in chloroplast development, and inserted into a tobacco rattle virus (TRV) binary vector pYL156. Silencing of GrCla1 results in albino phenotype on the newly emerging leaves, serving as a visual marker for silencing efficiency. To further explore the possibility of using VIGS assay to reveal the essential genes mediating disease resistance to Verticillium dahliae, a fungal pathogen causing severe Verticillium wilt in cotton, we developed a seedling infection assay to inoculate cotton seedlings when the genes of interest are silenced by VIGS. The method we describe here could be further explored for functional genomic analysis of cotton genes involved in development and various biotic and abiotic stresses.

  6. Advances in the integration of transcriptional regulatory information into genome-scale metabolic models.

    PubMed

    Vivek-Ananth, R P; Samal, Areejit

    2016-09-01

    A major goal of systems biology is to build predictive computational models of cellular metabolism. Availability of complete genome sequences and wealth of legacy biochemical information has led to the reconstruction of genome-scale metabolic networks in the last 15 years for several organisms across the three domains of life. Due to paucity of information on kinetic parameters associated with metabolic reactions, the constraint-based modelling approach, flux balance analysis (FBA), has proved to be a vital alternative to investigate the capabilities of reconstructed metabolic networks. In parallel, advent of high-throughput technologies has led to the generation of massive amounts of omics data on transcriptional regulation comprising mRNA transcript levels and genome-wide binding profile of transcriptional regulators. A frontier area in metabolic systems biology has been the development of methods to integrate the available transcriptional regulatory information into constraint-based models of reconstructed metabolic networks in order to increase the predictive capabilities of computational models and understand the regulation of cellular metabolism. Here, we review the existing methods to integrate transcriptional regulatory information into constraint-based models of metabolic networks.

  7. Host genome integration and giant virus-induced reactivation of the virophage mavirus.

    PubMed

    Fischer, Matthias G; Hackl, Thomas

    2016-12-07

    Endogenous viral elements are increasingly found in eukaryotic genomes, yet little is known about their origins, dynamics, or function. Here we provide a compelling example of a DNA virus that readily integrates into a eukaryotic genome where it acts as an inducible antiviral defence system. We found that the virophage mavirus, a parasite of the giant Cafeteria roenbergensis virus (CroV), integrates at multiple sites within the nuclear genome of the marine protozoan Cafeteria roenbergensis. The endogenous mavirus is structurally and genetically similar to eukaryotic DNA transposons and endogenous viruses of the Maverick/Polinton family. Provirophage genes are not constitutively expressed, but are specifically activated by superinfection with CroV, which induces the production of infectious mavirus particles. Virophages can inhibit the replication of mimivirus-like giant viruses and an anti-viral protective effect of provirophages on their hosts has been hypothesized. We find that provirophage-carrying cells are not directly protected from CroV; however, lysis of these cells releases infectious mavirus particles that are then able to suppress CroV replication and enhance host survival during subsequent rounds of infection. The microbial host-parasite interaction described here involves an altruistic aspect and suggests that giant-virus-induced activation of provirophages might be ecologically relevant in natural protist populations.

  8. The Spindle Assembly Checkpoint Safeguards Genomic Integrity of Skeletal Muscle Satellite Cells.

    PubMed

    Kollu, Swapna; Abou-Khalil, Rana; Shen, Carl; Brack, Andrew S

    2015-06-09

    To ensure accurate genomic segregation, cells evolved the spindle assembly checkpoint (SAC), whose role in adult stem cells remains unknown. Inducible perturbation of a SAC kinase, Mps1, and its downstream effector, Mad2, in skeletal muscle stem cells shows the SAC to be critical for normal muscle growth, repair, and self-renewal of the stem cell pool. SAC-deficient muscle stem cells arrest in G1 phase of the cell cycle with elevated aneuploidy, resisting differentiation even under inductive conditions. p21(CIP1) is responsible for these SAC-deficient phenotypes. Despite aneuploidy's correlation with aging, we find that aged proliferating muscle stem cells display robust SAC activity without elevated aneuploidy. Thus, muscle stem cells have a two-step mechanism to safeguard their genomic integrity. The SAC prevents chromosome missegregation and, if it fails, p21(CIP1)-dependent G1 arrest limits cellular propagation and tissue integration. These mechanisms ensure that muscle stem cells with compromised genomes do not contribute to tissue homeostasis.

  9. DNA damage responses by human ELG1 in S phase are important to maintain genomic integrity.

    PubMed

    Sikdar, Nilabja; Banerjee, Soma; Lee, Kyoo-young; Wincovitch, Stephen; Pak, Evgenia; Nakanishi, Koji; Jasin, Maria; Dutra, Amalia; Myung, Kyungjae

    2009-10-01

    Genomic integrity depends on DNA replication, recombination and repair, particularly in S phase. We demonstrate that a human homologue of yeast Elg1 plays an important role in S phase to preserve genomic stability. The level of ELG1 is induced during recovery from a variety of DNA damage. In response to DNA damage, ELG1 forms distinct foci at stalled DNA replication forks that are different from DNA double strand break foci. Targeted gene knockdown of ELG1 resulted in spontaneous foci formation of gamma-H2AX, 53BP1 and phosphorylated-ATM that mark chromosomal breaks. Abnormal chromosomes including fusions, inversions and hypersensitivity to DNA damaging agents were also observed in cells expressing low level of ELG1 by targeted gene knockdown. Knockdown of ELG1 by siRNA reduced homologous recombination frequency in the I-SceI induced double strand break-dependent assay. In contrast, spontaneous homologous recombination frequency and sister chromatid exchange rate were upregulated when ELG1 was silenced by shRNA. Taken together, we propose that ELG1 would be a new member of proteins involved in maintenance of genomic integrity.

  10. Identifying master regulators of cancer and their downstream targets by integrating genomic and epigenomic features.

    PubMed

    Gevaert, Olivier; Plevritis, Sylvia

    2013-01-01

    Vast amounts of molecular data characterizing the genome, epigenome and transcriptome are becoming available for a variety of cancers. The current challenge is to integrate these diverse layers of molecular biology information to create a more comprehensive view of key biological processes underlying cancer. We developed a biocomputational algorithm that integrates copy number, DNA methylation, and gene expression data to study master regulators of cancer and identify their targets. Our algorithm starts by generating a list of candidate driver genes based on the rationale that genes that are driven by multiple genomic events in a subset of samples are unlikely to be randomly deregulated. We then select the master regulators from the candidate driver and identify their targets by inferring the underlying regulatory network of gene expression. We applied our biocomputational algorithm to identify master regulators and their targets in glioblastoma multiforme (GBM) and serous ovarian cancer. Our results suggest that the expression of candidate drivers is more likely to be influenced by copy number variations than DNA methylation. Next, we selected the master regulators and identified their downstream targets using module networks analysis. As a proof-of-concept, we show that the GBM and ovarian cancer module networks recapitulate known processes in these cancers. In addition, we identify master regulators that have not been previously reported and suggest their likely role. In summary, focusing on genes whose expression can be explained by their genomic and epigenomic aberrations is a promising strategy to identify master regulators of cancer.

  11. Towards an expansive hybrid psychology: integrating theories of the mediated mind.

    PubMed

    Brinkmann, Svend

    2011-03-01

    This article develops an integrative theory of the mind by examining how the mind, understood as a set of skills and dispositions, depends upon four sources of mediators. Harré's hybrid psychology is taken as a meta-theoretical starting point, but is expanded significantly by including the four sources of mediators that are the brain, the body, social practices and technological artefacts. It is argued that the mind is normative in the sense that mental processes do not simply happen, but can be done more or less well, and thus are subject to normative appraisal. The expanded hybrid psychology is meant to assist in integrating theoretical perspectives and research interests that are often thought of as incompatible, among them neuroscience, phenomenology of the body, social practice theory and technology studies. A main point of the article is that these perspectives each are necessary for an integrative approach to the human mind.

  12. An integrated approach to reconstructing genome-scale transcriptional regulatory networks

    DOE PAGES

    Imam, Saheed; Noguera, Daniel R.; Donohue, Timothy J.; ...

    2015-02-27

    Transcriptional regulatory networks (TRNs) program cells to dynamically alter their gene expression in response to changing internal or environmental conditions. In this study, we develop a novel workflow for generating large-scale TRN models that integrates comparative genomics data, global gene expression analyses, and intrinsic properties of transcription factors (TFs). An assessment of this workflow using benchmark datasets for the well-studied γ-proteobacterium Escherichia coli showed that it outperforms expression-based inference approaches, having a significantly larger area under the precision-recall curve. Further analysis indicated that this integrated workflow captures different aspects of the E. coli TRN than expression-based approaches, potentially making them highly complementary. We leveraged this new workflow and observations to build a large-scale TRN model for the α-Proteobacterium Rhodobacter sphaeroides that comprises 120 gene clusters, 1211 genes (including 93 TFs), 1858 predicted protein-DNA interactions and 76 DNA binding motifs. We found that ~67% of the predicted gene clusters in this TRN are enriched for functions ranging from photosynthesis or central carbon metabolism to environmental stress responses. We also found that members of many of the predicted gene clusters were consistent with prior knowledge in R. sphaeroides and/or other bacteria. Experimental validation of predictions from this R. sphaeroides TRN model showed that high precision and recall was also obtained for TFs involved in photosynthesis (PpsR), carbon metabolism (RSP_0489) and iron homeostasis (RSP_3341). In addition, this integrative approach enabled generation of TRNs with increased information content relative to R. sphaeroides TRN models built via other approaches. We also show how this approach can be used to simultaneously produce TRN models for each related organism used in the comparative genomics analysis. Our results highlight the advantages of

  13. An integrated approach to reconstructing genome-scale transcriptional regulatory networks

    SciTech Connect

    Imam, Saheed; Noguera, Daniel R.; Donohue, Timothy J.; Leslie, Christina

    2015-02-27

    Transcriptional regulatory networks (TRNs) program cells to dynamically alter their gene expression in response to changing internal or environmental conditions. In this study, we develop a novel workflow for generating large-scale TRN models that integrates comparative genomics data, global gene expression analyses, and intrinsic properties of transcription factors (TFs). An assessment of this workflow using benchmark datasets for the well-studied γ-proteobacterium Escherichia coli showed that it outperforms expression-based inference approaches, having a significantly larger area under the precision-recall curve. Further analysis indicated that this integrated workflow captures different aspects of the E. coli TRN than expression-based approaches, potentially making them highly complementary. We leveraged this new workflow and observations to build a large-scale TRN model for the α-Proteobacterium Rhodobacter sphaeroides that comprises 120 gene clusters, 1211 genes (including 93 TFs), 1858 predicted protein-DNA interactions and 76 DNA binding motifs. We found that ~67% of the predicted gene clusters in this TRN are enriched for functions ranging from photosynthesis or central carbon metabolism to environmental stress responses. We also found that members of many of the predicted gene clusters were consistent with prior knowledge in R. sphaeroides and/or other bacteria. Experimental validation of predictions from this R. sphaeroides TRN model showed that high precision and recall was also obtained for TFs involved in photosynthesis (PpsR), carbon metabolism (RSP_0489) and iron homeostasis (RSP_3341). In addition, this integrative approach enabled generation of TRNs with increased information content relative to R. sphaeroides TRN models built via other approaches. We also show how this approach can be used to simultaneously produce TRN models for each related organism used in the comparative genomics analysis. Our results highlight the advantages of integrating
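
    The benchmark comparison in this abstract is summarized by the area under the precision-recall curve. As a rough illustration only (not the authors' code), the sketch below computes average precision, one common estimate of that area, for a ranked list of predicted TF-gene edges. The function name, scores and gold-standard labels are hypothetical and chosen purely for demonstration.

```python
def average_precision(scores, labels):
    """Average precision of a ranked prediction list.

    scores: confidence for each predicted edge (higher = more confident)
    labels: 1 if the edge is in the gold standard, otherwise 0
    Assumes no tied scores and at least one positive label.
    """
    # Rank predictions from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision measured at the rank of each true edge.
            precision_sum += hits / rank
    return precision_sum / hits


# Hypothetical predictions: five candidate edges, three of them true.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 0, 1, 1, 0]
ap = average_precision(scores, labels)
print(round(ap, 4))
```

    A method that ranks every true edge ahead of every false one scores 1.0 under this measure, so a larger value indicates a better ordering of predictions against the benchmark.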

  14. Genome-wide occupancy profile of mediator and the Srb8-11 module reveals interactions with coding regions.

    PubMed

    Zhu, Xuefeng; Wirén, Marianna; Sinha, Indranil; Rasmussen, Nina N; Linder, Tomas; Holmberg, Steen; Ekwall, Karl; Gustafsson, Claes M

    2006-04-21

    Mediator exists in a free form containing the Med12, Med13, CDK8, and CycC subunits (the Srb8-11 module) and a smaller form, which lacks these four subunits and associates with RNA polymerase II (Pol II), forming a holoenzyme. We use chromatin immunoprecipitation (ChIP) and DNA microarrays to investigate genome-wide localization of Mediator and the Srb8-11 module in fission yeast. Mediator and the Srb8-11 module display similar binding patterns, and interactions with promoters and upstream activating sequences correlate with increased transcription activity. Unexpectedly, Mediator also interacts with the downstream coding region of many genes. These interactions display a negative bias for positions closer to the 5' ends of open reading frames (ORFs) and appear functionally important, because downregulation of transcription in a temperature-sensitive med17 mutant strain correlates with increased Mediator occupancy in the coding region. We propose that Mediator coordinates transcription initiation with transcriptional events in the coding region of eukaryotic genes.

  15. A case of bilateral human herpes virus 6 panuveitis with genomic viral DNA integration

    PubMed Central

    2014-01-01

    Background: We report a rare case of bilateral panuveitis from human herpes virus 6 (HHV-6) with genomic viral DNA integration in an immunocompromised man. Findings: A 59-year-old man with history of multiple myeloma presented with altered mental status, bilateral eye redness, and blurry vision. Examination revealed bilateral diffuse keratic precipitates, 4+ anterior chamber cell, hypopyon, vitritis, and intraretinal hemorrhages. Intraocular fluid testing by polymerase chain reaction (PCR) was positive for HHV-6. The patient was successfully treated with intravitreal foscarnet and intravenous ganciclovir and foscarnet. Despite clinical improvement, his serum HHV-6 levels remained high, and it was concluded that he had HHV-6 chromosomal integration. Conclusions: HHV-6 should be considered in the differential for infectious uveitis in immunocompromised hosts who may otherwise have a negative work-up. HHV-6 DNA integration may lead to difficulties in disease diagnosis and determining disease resolution. PMID:24995045

  16. The Fanconi Anemia Pathway Protects Genome Integrity from R-loops.

    PubMed

    García-Rubio, María L; Pérez-Calero, Carmen; Barroso, Sonia I; Tumini, Emanuela; Herrera-Moyano, Emilia; Rosado, Iván V; Aguilera, Andrés

    2015-11-01

    Co-transcriptional RNA-DNA hybrids (R loops) cause genome instability. To prevent harmful R loop accumulation, cells have evolved specific eukaryotic factors, one being the BRCA2 double-strand break repair protein. As BRCA2 also protects stalled replication forks and is the FANCD1 member of the Fanconi Anemia (FA) pathway, we investigated the FA role in R loop-dependent genome instability. Using human and murine cells defective in FANCD2 or FANCA and primary bone marrow cells from FANCD2 deficient mice, we show that the FA pathway removes R loops, and that many DNA breaks accumulated in FA cells are R loop-dependent. Importantly, FANCD2 foci in untreated and MMC-treated cells are largely R loop dependent, suggesting that the FA functions at R loop-containing sites. We conclude that co-transcriptional R loops and R loop-mediated DNA damage greatly contribute to genome instability and that one major function of the FA pathway is to protect cells from R loops.

  17. The Fanconi Anemia Pathway Protects Genome Integrity from R-loops

    PubMed Central

    García-Rubio, María L.; Pérez-Calero, Carmen; Barroso, Sonia I.; Tumini, Emanuela; Herrera-Moyano, Emilia; Rosado, Iván V.; Aguilera, Andrés

    2015-01-01

    Co-transcriptional RNA-DNA hybrids (R loops) cause genome instability. To prevent harmful R loop accumulation, cells have evolved specific eukaryotic factors, one being the BRCA2 double-strand break repair protein. As BRCA2 also protects stalled replication forks and is the FANCD1 member of the Fanconi Anemia (FA) pathway, we investigated the FA role in R loop-dependent genome instability. Using human and murine cells defective in FANCD2 or FANCA and primary bone marrow cells from FANCD2 deficient mice, we show that the FA pathway removes R loops, and that many DNA breaks accumulated in FA cells are R loop-dependent. Importantly, FANCD2 foci in untreated and MMC-treated cells are largely R loop dependent, suggesting that the FA functions at R loop-containing sites. We conclude that co-transcriptional R loops and R loop-mediated DNA damage greatly contribute to genome instability and that one major function of the FA pathway is to protect cells from R loops. PMID:26584049

  18. Npl3, a new link between RNA-binding proteins and the maintenance of genome integrity

    PubMed Central

    Santos-Pereira, José M; Herrero, Ana B; Moreno, Sergio; Aguilera, Andrés

    2014-01-01

    The mRNA is co-transcriptionally bound by a number of RNA-binding proteins (RBPs) that contribute to its processing and formation of an export-competent messenger ribonucleoprotein particle (mRNP). In the last few years, increasing evidence suggests that RBPs play a key role in preventing transcription-associated genome instability. Part of this instability is mediated by the accumulation of co-transcriptional R loops, which may impair replication fork (RF) progression due to collisions between transcription and replication machineries. In addition, some RBPs have been implicated in DNA repair and/or the DNA damage response (DDR). Recently, the Npl3 protein, one of the most abundant heterogeneous nuclear ribonucleoproteins (hnRNPs) in yeast, has been shown to prevent transcription-associated genome instability and accumulation of RF obstacles, partially associated with R-loop formation. Interestingly, Npl3 seems to have additional functions in DNA repair, and npl3∆ mutants are highly sensitive to genotoxic agents, such as the antitumor drug trabectedin. Here we discuss the role of Npl3 in particular, and RBPs in general, in the connection of transcription with replication and genome instability, and its effect on the DDR. PMID:24694687

  19. Cerebral White Matter Integrity Mediates Adult Age Differences in Cognitive Performance

    PubMed Central

    Madden, David J.; Spaniol, Julia; Costello, Matthew C.; Bucur, Barbara; White, Leonard E.; Cabeza, Roberto; Davis, Simon W.; Dennis, Nancy A.; Provenzale, James M.; Huettel, Scott A.

    2009-01-01

    Previous research has established that age-related decline occurs in measures of cerebral white matter integrity, but the role of this decline in age-related cognitive changes is not clear. To conclude that white matter integrity has a mediating (causal) contribution, it is necessary to demonstrate that statistical control of the white matter-cognition relation reduces the magnitude of age-cognition relation. In this research, we tested the mediating role of white matter integrity, in the context of a task switching paradigm involving word categorization. Participants were 20 healthy, community-dwelling older adults (60–85 years), and 20 younger adults (18–27 years). From diffusion tensor imaging (DTI) tractography, we obtained fractional anisotropy (FA) as an index of white matter integrity in the genu and splenium of the corpus callosum and the superior longitudinal fasciculus (SLF). Mean FA values exhibited age-related decline consistent with a decrease in white matter integrity. From a model of reaction time distributions, we obtained independent estimates of the decisional and nondecisional (perceptual-motor) components of task performance. Age-related decline was evident in both components. Critically, age differences in task performance were mediated by FA in two regions: the central portion of the genu, and splenium-parietal fibers in the right hemisphere. This relation held only for the decisional component and was not evident in the nondecisional component. This result is the first demonstration that the integrity of specific white matter tracts is a mediator of age-related changes in cognitive performance. PMID:18564054
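
    The mediation logic stated in this abstract (statistically controlling for the mediator should shrink the age-cognition relation) can be illustrated with a small regression sketch. This is not the authors' analysis: the ages, FA values and reaction times below are invented, and reaction time is constructed to depend on age only through FA, so the direct effect is zero by design.

```python
def mean(xs):
    return sum(xs) / len(xs)


def cov_sum(xs, ys):
    """Sum of cross-products of deviations from the means."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))


def slope(x, y):
    """OLS slope of y on x (intercept included)."""
    return cov_sum(x, y) / cov_sum(x, x)


def slopes_two(x1, x2, y):
    """OLS slopes of y on x1 and x2 (intercept included)."""
    s11, s22, s12 = cov_sum(x1, x1), cov_sum(x2, x2), cov_sum(x1, x2)
    s1y, s2y = cov_sum(x1, y), cov_sum(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


# Hypothetical data (not from the study).
age = [20, 25, 30, 60, 70, 80]
fa = [0.57, 0.54, 0.54, 0.47, 0.47, 0.44]   # mediator: white matter FA
rt = [900 - 1000 * f for f in fa]           # outcome: reaction time (ms)

total = slope(age, rt)                # age -> RT, ignoring FA
a = slope(age, fa)                    # age -> FA
direct, b = slopes_two(age, fa, rt)   # age -> RT and FA -> RT, jointly
indirect = a * b                      # effect of age carried through FA

print(round(total, 3), round(direct, 3), round(indirect, 3))
```

    In ordinary least squares the total effect equals the direct effect plus the indirect effect, so a direct effect near zero alongside a non-zero total effect is the pattern consistent with mediation.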

  20. BiologicalNetworks 2.0 - an integrative view of genome biology data

    PubMed Central

    2010-01-01

    Background: A significant problem in the study of mechanisms of an organism's development is the elucidation of interrelated factors which are making an impact on the different levels of the organism, such as genes, biological molecules, cells, and cell systems. Numerous sources of heterogeneous data which exist for these subsystems are still not integrated sufficiently enough to give researchers a straightforward opportunity to analyze them together in the same frame of study. Systematic application of data integration methods is also hampered by a multitude of such factors as the orthogonal nature of the integrated data and naming problems. Results: Here we report on a new version of BiologicalNetworks, a research environment for the integral visualization and analysis of heterogeneous biological data. BiologicalNetworks can be queried for properties of thousands of different types of biological entities (genes/proteins, promoters, COGs, pathways, binding sites, and other) and their relations (interactions, co-expression, co-citations, and other). The system includes the build-pathways infrastructure for molecular interactions/relations and module discovery in high-throughput experiments. Also implemented in BiologicalNetworks are the Integrated Genome Viewer and Comparative Genomics Browser applications, which allow for the search and analysis of gene regulatory regions and their conservation in multiple species in conjunction with molecular pathways/networks, experimental data and functional annotations. Conclusions: The new release of BiologicalNetworks together with its back-end database introduces extensive functionality for a more efficient integrated multi-level analysis of microarray, sequence, regulatory, and other data. BiologicalNetworks is freely available at http://www.biologicalnetworks.org. PMID:21190573

  1. An integrated map of structural variation in 2,504 human genomes.

    PubMed

    Sudmant, Peter H; Rausch, Tobias; Gardner, Eugene J; Handsaker, Robert E; Abyzov, Alexej; Huddleston, John; Zhang, Yan; Ye, Kai; Jun, Goo; Hsi-Yang Fritz, Markus; Konkel, Miriam K; Malhotra, Ankit; Stütz, Adrian M; Shi, Xinghua; Paolo Casale, Francesco; Chen, Jieming; Hormozdiari, Fereydoun; Dayama, Gargi; Chen, Ken; Malig, Maika; Chaisson, Mark J P; Walter, Klaudia; Meiers, Sascha; Kashin, Seva; Garrison, Erik; Auton, Adam; Lam, Hugo Y K; Jasmine Mu, Xinmeng; Alkan, Can; Antaki, Danny; Bae, Taejeong; Cerveira, Eliza; Chines, Peter; Chong, Zechen; Clarke, Laura; Dal, Elif; Ding, Li; Emery, Sarah; Fan, Xian; Gujral, Madhusudan; Kahveci, Fatma; Kidd, Jeffrey M; Kong, Yu; Lameijer, Eric-Wubbo; McCarthy, Shane; Flicek, Paul; Gibbs, Richard A; Marth, Gabor; Mason, Christopher E; Menelaou, Androniki; Muzny, Donna M; Nelson, Bradley J; Noor, Amina; Parrish, Nicholas F; Pendleton, Matthew; Quitadamo, Andrew; Raeder, Benjamin; Schadt, Eric E; Romanovitch, Mallory; Schlattl, Andreas; Sebra, Robert; Shabalin, Andrey A; Untergasser, Andreas; Walker, Jerilyn A; Wang, Min; Yu, Fuli; Zhang, Chengsheng; Zhang, Jing; Zheng-Bradley, Xiangqun; Zhou, Wanding; Zichner, Thomas; Sebat, Jonathan; Batzer, Mark A; McCarroll, Steven A; Mills, Ryan E; Gerstein, Mark B; Bashir, Ali; Stegle, Oliver; Devine, Scott E; Lee, Charles; Eichler, Evan E; Korbel, Jan O

    2015-10-01

    Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.

  2. The GDB Human Genome Data Base: a source of integrated genetic mapping and disease data.

    PubMed Central

    Brandt, K A

    1993-01-01

    The GDB Human Genome Data Base refers collectively to GDB and OMIM, Online Mendelian Inheritance in Man. GDB and OMIM are linked databases that provide an international repository for information generated by the Human Genome Initiative. GDB contains human gene mapping data, while OMIM offers the text of Dr. Victor A. McKusick's catalog of genetic disease and phenotype descriptions. These databases, updated and edited continuously, integrate bibliographic and full-text information with several types of mapping data. They are accessible through a flexible interface and are available through SprintNet and the Internet to the scientific community without cost. This paper provides an overview of the context, development, structure, content, and use of these databases. PMID:8374584

  3. An integrated map of structural variation in 2,504 human genomes

    PubMed Central

    Jun, Goo; Fritz, Markus Hsi-Yang; Konkel, Miriam K.; Malhotra, Ankit; Stütz, Adrian M.; Shi, Xinghua; Casale, Francesco Paolo; Chen, Jieming; Hormozdiari, Fereydoun; Dayama, Gargi; Chen, Ken; Malig, Maika; Chaisson, Mark J.P.; Walter, Klaudia; Meiers, Sascha; Kashin, Seva; Garrison, Erik; Auton, Adam; Lam, Hugo Y. K.; Mu, Xinmeng Jasmine; Alkan, Can; Antaki, Danny; Bae, Taejeong; Cerveira, Eliza; Chines, Peter; Chong, Zechen; Clarke, Laura; Dal, Elif; Ding, Li; Emery, Sarah; Fan, Xian; Gujral, Madhusudan; Kahveci, Fatma; Kidd, Jeffrey M.; Kong, Yu; Lameijer, Eric-Wubbo; McCarthy, Shane; Flicek, Paul; Gibbs, Richard A.; Marth, Gabor; Mason, Christopher E.; Menelaou, Androniki; Muzny, Donna M.; Nelson, Bradley J.; Noor, Amina; Parrish, Nicholas F.; Pendleton, Matthew; Quitadamo, Andrew; Raeder, Benjamin; Schadt, Eric E.; Romanovitch, Mallory; Schlattl, Andreas; Sebra, Robert; Shabalin, Andrey A.; Untergasser, Andreas; Walker, Jerilyn A.; Wang, Min; Yu, Fuli; Zhang, Chengsheng; Zhang, Jing; Zheng-Bradley, Xiangqun; Zhou, Wanding; Zichner, Thomas; Sebat, Jonathan; Batzer, Mark A.; McCarroll, Steven A.; Mills, Ryan E.; Gerstein, Mark B.; Bashir, Ali; Stegle, Oliver; Devine, Scott E.; Lee, Charles; Eichler, Evan E.; Korbel, Jan O.

    2015-01-01

    Summary: Structural variants (SVs) are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight SV classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype-blocks in 26 human populations. Analyzing this set, we identify numerous gene-intersecting SVs exhibiting population stratification and describe naturally occurring homozygous gene knockouts suggesting the dispensability of a variety of human genes. We demonstrate that SVs are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of SV complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex SVs with multiple breakpoints likely formed through individual mutational events. Our catalog will enhance future studies into SV demography, functional impact and disease association. PMID:26432246

  4. Genome Integration and Excision by a New Streptomyces Bacteriophage, ϕJoe

    PubMed Central

    Haley, Joshua A.; Stark, W. Marshall

    2016-01-01

    ABSTRACT: Bacteriophages are the source of many valuable tools for molecular biology and genetic manipulation. In Streptomyces, most DNA cloning vectors are based on serine integrase site-specific DNA recombination systems derived from phage. Because of their efficiency and simplicity, serine integrases are also used for diverse synthetic biology applications. Here, we present the genome of a new Streptomyces phage, ϕJoe, and investigate the conditions for integration and excision of the ϕJoe genome. ϕJoe belongs to the largest Streptomyces phage cluster (R4-like) and encodes a serine integrase. The attB site from Streptomyces venezuelae was used efficiently by an integrating plasmid, pCMF92, constructed using the ϕJoe int-attP locus. The attB site for ϕJoe integrase was occupied in several Streptomyces genomes, including that of S. coelicolor, by a mobile element that varies in gene content and size between host species. Serine integrases require a phage-encoded recombination directionality factor (RDF) to activate the excision reaction. The ϕJoe RDF was identified, and its function was confirmed in vivo. Both the integrase and RDF were active in in vitro recombination assays. The ϕJoe site-specific recombination system is likely to be an important addition to the synthetic biology and genome engineering toolbox. IMPORTANCE: Streptomyces spp. are prolific producers of secondary metabolites, including many clinically useful antibiotics. Bacteriophage-derived integrases are important tools for genetic engineering, as they enable integration of heterologous DNA into the Streptomyces chromosome with ease and high efficiency. Recently, researchers have been applying phage integrases for a variety of applications in synthetic biology, including rapid assembly of novel combinations of genes, biosensors, and biocomputing. An important requirement for optimal experimental design and predictability when using integrases, however, is the need for multiple enzymes with

  5. Genome Integration and Excision by a New Streptomyces Bacteriophage, ϕJoe.

    PubMed

    Fogg, Paul C M; Haley, Joshua A; Stark, W Marshall; Smith, Margaret C M

    2017-03-01

    Bacteriophages are the source of many valuable tools for molecular biology and genetic manipulation. In Streptomyces, most DNA cloning vectors are based on serine integrase site-specific DNA recombination systems derived from phage. Because of their efficiency and simplicity, serine integrases are also used for diverse synthetic biology applications. Here, we present the genome of a new Streptomyces phage, ϕJoe, and investigate the conditions for integration and excision of the ϕJoe genome. ϕJoe belongs to the largest Streptomyces phage cluster (R4-like) and encodes a serine integrase. The attB site from Streptomyces venezuelae was used efficiently by an integrating plasmid, pCMF92, constructed using the ϕJoe int-attP locus. The attB site for ϕJoe integrase was occupied in several Streptomyces genomes, including that of S. coelicolor, by a mobile element that varies in gene content and size between host species. Serine integrases require a phage-encoded recombination directionality factor (RDF) to activate the excision reaction. The ϕJoe RDF was identified, and its function was confirmed in vivo. Both the integrase and RDF were active in in vitro recombination assays. The ϕJoe site-specific recombination system is likely to be an important addition to the synthetic biology and genome engineering toolbox. IMPORTANCE: Streptomyces spp. are prolific producers of secondary metabolites, including many clinically useful antibiotics. Bacteriophage-derived integrases are important tools for genetic engineering, as they enable integration of heterologous DNA into the Streptomyces chromosome with ease and high efficiency. Recently, researchers have been applying phage integrases for a variety of applications in synthetic biology, including rapid assembly of novel combinations of genes, biosensors, and biocomputing. An important requirement for optimal experimental design and predictability when using integrases, however, is the need for multiple enzymes with different

  6. Integration of Multiple Genomic and Phenotype Data to Infer Novel miRNA-Disease Associations

    PubMed Central

    Zhou, Meng; Cheng, Liang; Yang, Haixiu; Wang, Jing; Sun, Jie; Wang, Zhenzhen

    2016-01-01

    MicroRNAs (miRNAs) play an important role in the development and progression of human diseases. The identification of disease-associated miRNAs will be helpful for understanding the molecular mechanisms of diseases at the post-transcriptional level. Based on different types of genomic data sources, computational methods for miRNA-disease association prediction have been proposed. However, any individual source of genomic data tends to be incomplete and noisy; therefore, the integration of various types of genomic data for inferring reliable miRNA-disease associations is urgently needed. In this study, we present a computational framework, CHNmiRD, for identifying miRNA-disease associations by integrating multiple genomic and phenotype data, including protein-protein interaction data, gene ontology data, experimentally verified miRNA-target relationships, disease phenotype information and known miRNA-disease connections. The performance of CHNmiRD was evaluated against experimentally verified miRNA-disease associations, achieving an area under the ROC curve (AUC) of 0.834 in 5-fold cross-validation. In particular, CHNmiRD displayed excellent performance for diseases without any known related miRNAs. The results of case studies for three human diseases (glioblastoma, myocardial infarction and type 1 diabetes) showed that all of the top 10 ranked miRNAs having no known associations with these three diseases in existing miRNA-disease databases were directly or indirectly confirmed by our latest literature mining. All these results demonstrated the reliability and efficiency of CHNmiRD, and it is anticipated that CHNmiRD will serve as a powerful bioinformatics method for mining novel disease-related miRNAs and providing a new perspective into molecular mechanisms underlying human diseases at the post-transcriptional level. CHNmiRD is freely available at http://www.bio-bigdata.com/CHNmiRD. PMID:26849207
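
    The evaluation metric named in this abstract, area under the ROC curve under 5-fold cross-validation, can be sketched in a few lines. This is a generic illustration, not the CHNmiRD implementation. The function names `auc` and `k_fold_indices` are hypothetical, and the scores in the usage lines are toy values chosen only to exercise the code.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (label 1)
    is scored above a randomly chosen negative (label 0); ties count half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_items, k=5):
    """Split indices 0..n_items-1 into k disjoint held-out folds."""
    return [list(range(i, n_items, k)) for i in range(k)]


# Toy usage: a perfect ranking and a chance-level ranking.
perfect = auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
chance = auc([0.9, 0.1, 0.8, 0.3], [1, 1, 0, 0])
folds = k_fold_indices(10, k=5)
```

    In a cross-validated evaluation, each fold of known associations would be hidden in turn, the predictor would score the hidden items using the remaining data, and `auc` would be computed on those held-out scores.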

  7. Integration of Multiple Genomic and Phenotype Data to Infer Novel miRNA-Disease Associations.

    PubMed

    Shi, Hongbo; Zhang, Guangde; Zhou, Meng; Cheng, Liang; Yang, Haixiu; Wang, Jing; Sun, Jie; Wang, Zhenzhen

    2016-01-01

    MicroRNAs (miRNAs) play an important role in the development and progression of human diseases. The identification of disease-associated miRNAs will be helpful for understanding the molecular mechanisms of diseases at the post-transcriptional level. Based on different types of genomic data sources, computational methods for miRNA-disease association prediction have been proposed. However, any individual source of genomic data tends to be incomplete and noisy; therefore, the integration of various types of genomic data for inferring reliable miRNA-disease associations is urgently needed. In this study, we present a computational framework, CHNmiRD, for identifying miRNA-disease associations by integrating multiple genomic and phenotype data, including protein-protein interaction data, gene ontology data, experimentally verified miRNA-target relationships, disease phenotype information and known miRNA-disease connections. The performance of CHNmiRD was evaluated against experimentally verified miRNA-disease associations, achieving an area under the ROC curve (AUC) of 0.834 in 5-fold cross-validation. In particular, CHNmiRD displayed excellent performance for diseases without any known related miRNAs. The results of case studies for three human diseases (glioblastoma, myocardial infarction and type 1 diabetes) showed that all of the top 10 ranked miRNAs having no known associations with these three diseases in existing miRNA-disease databases were directly or indirectly confirmed by our latest literature mining. All these results demonstrated the reliability and efficiency of CHNmiRD, and it is anticipated that CHNmiRD will serve as a powerful bioinformatics method for mining novel disease-related miRNAs and providing a new perspective into molecular mechanisms underlying human diseases at the post-transcriptional level. CHNmiRD is freely available at http://www.bio-bigdata.com/CHNmiRD.

  8. Rice Annotation Project Database (RAP-DB): an integrative and interactive database for rice genomics.

    PubMed

    Sakai, Hiroaki; Lee, Sung Shin; Tanaka, Tsuyoshi; Numa, Hisataka; Kim, Jungsok; Kawahara, Yoshihiro; Wakimoto, Hironobu; Yang, Ching-chia; Iwamoto, Masao; Abe, Takashi; Yamada, Yuko; Muto, Akira; Inokuchi, Hachiro; Ikemura, Toshimichi; Matsumoto, Takashi; Sasaki, Takuji; Itoh, Takeshi

    2013-02-01

    The Rice Annotation Project Database (RAP-DB, http://rapdb.dna.affrc.go.jp/) has been providing a comprehensive set of gene annotations for the genome sequence of rice, Oryza sativa (japonica group) cv. Nipponbare. Since the first release in 2005, RAP-DB has been updated several times along with the genome assembly updates. Here, we present our newest RAP-DB based on the latest genome assembly, Os-Nipponbare-Reference-IRGSP-1.0 (IRGSP-1.0), which was released in 2011. We detected 37,869 loci by mapping transcript and protein sequences of 150 monocot species. To provide plant researchers with highly reliable and up-to-date rice gene annotations, we have been incorporating literature-based manually curated data, and 1,626 loci currently incorporate literature-based annotation data, including commonly used gene names or gene symbols. Transcriptional activities are shown at the nucleotide level by mapping RNA-Seq reads derived from 27 samples. We also mapped the Illumina reads of a leading Japanese japonica cultivar, Koshihikari, and a Chinese indica cultivar, Guangluai-4, to the genome and show alignments together with the single nucleotide polymorphisms (SNPs) and gene functional annotations through a newly developed browser, Short-Read Assembly Browser (S-RAB). We have developed two satellite databases, Plant Gene Family Database (PGFD) and Integrative Database of Cereal Gene Phylogeny (IDCGP), which display gene family and homologous gene relationships among diverse plant species. RAP-DB and the satellite databases offer simple and user-friendly web interfaces, enabling plant and genome researchers to access the data easily and facilitating a broad range of plant research topics.

  9. Method to assemble and integrate biochemical pathways into the chloroplast genome of Chlamydomonas reinhardtii.

    PubMed

    Noor-Mohammadi, Samaneh; Pourmir, Azadeh; Johannes, Tyler W

    2012-11-01

    Recombinant protein expression in the chloroplasts of green algae has recently become more routine; however, the heterologous expression of multiple proteins or complete biosynthetic pathways remains a significant challenge. Here, we show that a modified DNA Assembler approach can be used to rapidly assemble multiple-gene biosynthetic pathways in yeast and then integrate these assembled pathways at a site-specific location in the chloroplast genome of the microalgal species Chlamydomonas reinhardtii. As a proof of concept, this method was used to successfully integrate and functionally express up to three reporter proteins (AphA6, AadA, and GFP) in the chloroplast of C. reinhardtii. An analysis of the relative gene expression of the engineered strains showed significant differences in the mRNA expression levels of the reporter genes and thus highlights the importance of proper promoter/untranslated region selection when constructing a target pathway. This new method represents a useful genetic tool in the construction and integration of complex biochemical pathways into the chloroplast genome of microalgae and should aid current efforts to engineer algae for biofuels production and other desirable natural products.

  10. Perspectives on clinical informatics: integrating large-scale clinical, genomic, and health information for clinical care.

    PubMed

    Choi, In Young; Kim, Tae-Min; Kim, Myung Shin; Mun, Seong K; Chung, Yeun-Jun

    2013-12-01

    The advances in electronic medical records (EMRs) and bioinformatics (BI) represent two significant trends in healthcare. The widespread adoption of EMR systems and the completion of the Human Genome Project developed the technologies for data acquisition, analysis, and visualization in two different domains. The massive amount of data from both clinical and biology domains is expected to provide personalized, preventive, and predictive healthcare services in the near future. The integrated use of EMR and BI data needs to consider four key informatics areas: data modeling, analytics, standardization, and privacy. Bioclinical data warehouses integrating heterogeneous patient-related clinical or omics data should be considered. The representative standardization effort by the Clinical Bioinformatics Ontology (CBO) aims to provide uniquely identified concepts to include molecular pathology terminologies. Since individual genome data are easily used to predict current and future health status, different safeguards to ensure confidentiality should be considered. In this paper, we focused on the informatics aspects of integrating the EMR community and BI community by identifying opportunities, challenges, and approaches to provide the best possible care service for our patients and the population.

  11. Perspectives on Clinical Informatics: Integrating Large-Scale Clinical, Genomic, and Health Information for Clinical Care

    PubMed Central

    Choi, In Young; Kim, Tae-Min; Kim, Myung Shin; Mun, Seong K.

    2013-01-01

    The advances in electronic medical records (EMRs) and bioinformatics (BI) represent two significant trends in healthcare. The widespread adoption of EMR systems and the completion of the Human Genome Project developed the technologies for data acquisition, analysis, and visualization in two different domains. The massive amount of data from both clinical and biology domains is expected to provide personalized, preventive, and predictive healthcare services in the near future. The integrated use of EMR and BI data needs to consider four key informatics areas: data modeling, analytics, standardization, and privacy. Bioclinical data warehouses integrating heterogeneous patient-related clinical or omics data should be considered. The representative standardization effort by the Clinical Bioinformatics Ontology (CBO) aims to provide uniquely identified concepts to include molecular pathology terminologies. Since individual genome data are easily used to predict current and future health status, different safeguards to ensure confidentiality should be considered. In this paper, we focused on the informatics aspects of integrating the EMR community and BI community by identifying opportunities, challenges, and approaches to provide the best possible care service for our patients and the population. PMID:24465229

  12. Integration of the Rat Recombination and EST Maps in the Rat Genomic Sequence and Comparative Mapping Analysis With the Mouse Genome

    PubMed Central

    Wilder, Steven P.; Bihoreau, Marie-Thérèse; Argoud, Karène; Watanabe, Takeshi K.; Lathrop, Mark; Gauguier, Dominique

    2004-01-01

    Inbred strains of the laboratory rat are widely used for identifying genetic regions involved in the control of complex quantitative phenotypes of biomedical importance. The draft genomic sequence of the rat now provides essential information for annotating rat quantitative trait locus (QTL) maps. Following the survey of unique rat microsatellite (11,585 including 1648 new markers) and EST (10,067) markers currently available, we have incorporated a selection of 7952 rat EST sequences in an improved version of the integrated linkage-radiation hybrid map of the rat containing 2058 microsatellite markers which provided over 10,000 potential anchor points between rat QTL and the genomic sequence of the rat. A total of 996 genetic positions were resolved (avg. spacing 1.77 cM) in a single large intercross and anchored in the rat genomic sequence (avg. spacing 1.62 Mb). Comparative genome maps between rat and mouse were constructed by successful computational alignment of 6108 mapped rat ESTs in the mouse genome. The integration of rat linkage maps in the draft genomic sequence of the rat and that of other species represents an essential step for translating rat QTL intervals into human chromosomal targets. PMID:15060020

  13. Functional visualization and disruption of targeted genes using CRISPR/Cas9-mediated eGFP reporter integration in zebrafish

    PubMed Central

    Ota, Satoshi; Taimatsu, Kiyohito; Yanagi, Kanoko; Namiki, Tomohiro; Ohga, Rie; Higashijima, Shin-ichi; Kawahara, Atsuo

    2016-01-01

    The CRISPR/Cas9 complex, which is composed of a guide RNA (gRNA) and the Cas9 nuclease, is useful for carrying out genome modifications in various organisms. Recently, the CRISPR/Cas9-mediated locus-specific integration of a reporter, which contains the Mbait sequence targeted using Mbait-gRNA, the hsp70 promoter and the eGFP gene, has allowed the visualization of the target gene expression. However, it has not been ascertained whether the reporter integrations at both targeted alleles cause loss-of-function phenotypes in zebrafish. In this study, we have inserted the Mbait-hs-eGFP reporter into the pax2a gene because the disruption of pax2a causes the loss of the midbrain-hindbrain boundary (MHB) in zebrafish. In the heterozygous Tg[pax2a-hs:eGFP] embryos, MHB formed normally and the eGFP expression recapitulated the endogenous pax2a expression, including the MHB. We observed the loss of the MHB in homozygous Tg[pax2a-hs:eGFP] embryos. Furthermore, we succeeded in integrating the Mbait-hs-eGFP reporter into an uncharacterized gene epdr1. The eGFP expression in heterozygous Tg[epdr1-hs:eGFP] embryos overlapped the epdr1 expression, whereas the distribution of eGFP-positive cells was disorganized in the MHB of homozygous Tg[epdr1-hs:eGFP] embryos. We propose that the locus-specific integration of the Mbait-hs-eGFP reporter is a powerful method to investigate both gene expression profiles and loss-of-function phenotypes. PMID:27725766

  14. MicroScope in 2017: an expanding and evolving integrated resource for community expertise of microbial genomes

    PubMed Central

    Vallenet, David; Calteau, Alexandra; Cruveiller, Stéphane; Gachet, Mathieu; Lajus, Aurélie; Josso, Adrien; Mercier, Jonathan; Renaux, Alexandre; Rollin, Johan; Rouy, Zoe; Roche, David; Scarpelli, Claude; Médigue, Claudine

    2017-01-01

    The annotation of genomes from NGS platforms needs to be automated and fully integrated. However, maintaining consistency and accuracy in genome annotation is a challenging problem because millions of protein database entries are not assigned reliable functions. This shortcoming limits the knowledge that can be extracted from genomes and metabolic models. Launched in 2005, the MicroScope platform (http://www.genoscope.cns.fr/agc/microscope) is an integrative resource that supports systematic and efficient revision of microbial genome annotation, data management and comparative analysis. Effective comparative analysis requires a consistent and complete view of biological data, and therefore, support for reviewing the quality of functional annotation is critical. MicroScope allows users to analyze microbial (meta)genomes together with post-genomic experiment results if any (i.e. transcriptomics, re-sequencing of evolved strains, mutant collections, phenotype data). It combines tools and graphical interfaces to analyze genomes and to perform the expert curation of gene functions in a comparative context. Starting with a short overview of the MicroScope system, this paper focuses on some major improvements of the Web interface, mainly for the submission of genomic data and on original tools and pipelines that have been developed and integrated in the platform: computation of pan-genomes and prediction of biosynthetic gene clusters. Today the resource contains data for more than 6000 microbial genomes, and among the 2700 personal accounts (65% of which are now from foreign countries), 14% of the users are performing expert annotations, on at least a weekly basis, contributing to improve the quality of microbial genome annotations. PMID:27899624

  15. MicroScope in 2017: an expanding and evolving integrated resource for community expertise of microbial genomes.

    PubMed

    Vallenet, David; Calteau, Alexandra; Cruveiller, Stéphane; Gachet, Mathieu; Lajus, Aurélie; Josso, Adrien; Mercier, Jonathan; Renaux, Alexandre; Rollin, Johan; Rouy, Zoe; Roche, David; Scarpelli, Claude; Médigue, Claudine

    2017-01-04

    The annotation of genomes from NGS platforms needs to be automated and fully integrated. However, maintaining consistency and accuracy in genome annotation is a challenging problem because millions of protein database entries are not assigned reliable functions. This shortcoming limits the knowledge that can be extracted from genomes and metabolic models. Launched in 2005, the MicroScope platform (http://www.genoscope.cns.fr/agc/microscope) is an integrative resource that supports systematic and efficient revision of microbial genome annotation, data management and comparative analysis. Effective comparative analysis requires a consistent and complete view of biological data, and therefore, support for reviewing the quality of functional annotation is critical. MicroScope allows users to analyze microbial (meta)genomes together with post-genomic experiment results if any (i.e. transcriptomics, re-sequencing of evolved strains, mutant collections, phenotype data). It combines tools and graphical interfaces to analyze genomes and to perform the expert curation of gene functions in a comparative context. Starting with a short overview of the MicroScope system, this paper focuses on some major improvements of the Web interface, mainly for the submission of genomic data and on original tools and pipelines that have been developed and integrated in the platform: computation of pan-genomes and prediction of biosynthetic gene clusters. Today the resource contains data for more than 6000 microbial genomes, and among the 2700 personal accounts (65% of which are now from foreign countries), 14% of the users are performing expert annotations, on at least a weekly basis, contributing to improve the quality of microbial genome annotations.

  16. Integrative proteomics, genomics, and translational immunology approaches reveal mutated forms of Proteolipid Protein 1 (PLP1) and mutant-specific immune response in multiple sclerosis.

    PubMed

    Qendro, Veneta; Bugos, Grace A; Lundgren, Debbie H; Glynn, John; Han, May H; Han, David K

    2017-03-01

    In order to gain mechanistic insights into multiple sclerosis (MS) pathogenesis, we utilized a multi-dimensional approach to test the hypothesis that mutations in myelin proteins lead to immune activation and central nervous system autoimmunity in MS. Mass spectrometry-based proteomic analysis of human MS brain lesions revealed seven unique mutations of PLP1, a key myelin protein that is known to be destroyed in MS. Surprisingly, in-depth analysis of two MS patients at the genomic DNA and mRNA levels confirmed mutated PLP1 in RNA, but not in the genomic DNA. Quantification of wild type and mutant PLP RNA levels by qPCR further validated the presence of mutant PLP RNA in the MS patients. To seek evidence linking mutations in abundant myelin proteins and immune-mediated destruction of myelin, specific immune response against mutant PLP1 in MS patients was examined. Thus, we have designed paired, wild type and mutant peptide microarrays, and examined antibody response to multiple mutated PLP1 in sera from MS patients. Consistent with the idea of different patients exhibiting unique mutation profiles, we found that 13 out of 20 MS patients showed antibody responses against specific but not against all the mutant-PLP1 peptides. Interestingly, we found mutant PLP-directed antibody response against specific mutant peptides in the sera of pre-MS controls. The results from integrative proteomic, genomic, and immune analyses reveal a possible mechanism of mutation-driven pathogenesis in human MS. The study also highlights the need for integrative genomic and proteomic analyses for uncovering pathogenic mechanisms of human diseases.

  17. A statistical framework to predict functional non-coding regions in the human genome through integrated analysis of annotation data.

    PubMed

    Lu, Qiongshi; Hu, Yiming; Sun, Jiehuan; Cheng, Yuwei; Cheung, Kei-Hoi; Zhao, Hongyu

    2015-05-27

    Identifying functional regions in the human genome is a major goal in human genetics. Great efforts have been made to functionally annotate the human genome either through computational predictions, such as genomic conservation, or high-throughput experiments, such as the ENCODE project. These efforts have resulted in a rich collection of functional annotation data of diverse types that need to be jointly analyzed for integrated interpretation and annotation. Here we present GenoCanyon, a whole-genome annotation method that performs unsupervised statistical learning using 22 computational and experimental annotations, thereby inferring the functional potential of each position in the human genome. With GenoCanyon, we are able to predict many of the known functional regions. Its ability to predict functional regions, together with its generalizable statistical framework, makes GenoCanyon a unique and powerful tool for whole-genome annotation. The GenoCanyon web server is available at http://genocanyon.med.yale.edu.

  18. Structural basis for DNase activity of a conserved protein implicated in CRISPR-mediated genome defense.

    PubMed

    Wiedenheft, Blake; Zhou, Kaihong; Jinek, Martin; Coyle, Scott M; Ma, Wendy; Doudna, Jennifer A

    2009-06-10

    Acquired immunity in prokaryotes is achieved by integrating short fragments of foreign nucleic acids into clustered regularly interspaced short palindromic repeats (CRISPRs). This nucleic acid-based immune system is mediated by a variable cassette of up to 45 protein families that represent distinct immune system subtypes. CRISPR-associated gene 1 (cas1) encodes the only universally conserved protein component of CRISPR immune systems, yet its function is unknown. Here we show that the Cas1 protein is a metal-dependent DNA-specific endonuclease that produces double-stranded DNA fragments of approximately 80 base pairs in length. The 2.2 Å crystal structure of the Cas1 protein reveals a distinct fold and a conserved divalent metal ion-binding site. Mutation of metal ion-binding residues, chelation of metal ions, or metal-ion substitution inhibits Cas1-catalyzed DNA degradation. These results provide a foundation for understanding how Cas1 contributes to CRISPR function, perhaps as part of the machinery for processing foreign nucleic acids.

  19. The GLOBE 3D Genome Platform - towards a novel system-biological paper tool to integrate the huge complexity of genome organization and function.

    PubMed

    Knoch, Tobias A; Lesnussa, Michael; Kepper, Nick; Eussen, Hubert B; Grosveld, Frank G

    2009-01-01

    Genomes are tremendous co-evolutionary holistic systems for molecular storage, processing and fabrication of information. Their system-biological complexity remains, however, still largely mysterious, despite immense sequencing achievements and huge advances in the understanding of the general sequential, three-dimensional and regulatory organization. Here, we present the GLOBE 3D Genome Platform, a completely novel grid-based virtual "paper" tool and in fact the first system-biological genome browser integrating the holistic complexity of genomes in a single, easily comprehensible platform. Based on a detailed study of biophysical and IT requirements, every architectural level from sequence to morphology of one or several genomes can be approached in a real and in a symbolic representation simultaneously and navigated by continuous scale-free zooming within a unique three-dimensional OpenGL and grid-driven environment. In principle, an unlimited number of multi-dimensional data sets can be visualized, customized in terms of arrangement, shape, colour, texture, etc., as well as accessed and annotated individually or in groups using internal or external databases/facilities. Any information can be searched and correlated by importing or calculating simple relations in real time using grid resources. A general correlation and application platform for more complex correlative analysis and a front-end for system-biological simulations, both again using the huge capabilities of grid infrastructures, are currently under development. Hence, the GLOBE 3D Genome Platform is an example of a grid-based approach towards a virtual desktop for genomic work combining the three fundamental distributed resources: i) visual data representation, ii) data access and management, and iii) data analysis and creation. Thus, the GLOBE 3D Genome Platform is the novel system-biology oriented information system urgently needed to access, present, annotate, and to simulate the holistic genome

  20. An Integrative Genomic Island Affects the Adaptations of the Piezophilic Hyperthermophilic Archaeon Pyrococcus yayanosii to High Temperature and High Hydrostatic Pressure

    PubMed Central

    Li, Zhen; Li, Xuegong; Xiao, Xiang; Xu, Jun

    2016-01-01

    Deep-sea hydrothermal vent environments are characterized by high hydrostatic pressure and sharp temperature and chemical gradients. Horizontal gene transfer is thought to play an important role in the microbial adaptation to such an extreme environment. In this study, a 21.4-kb DNA fragment was identified as a genomic island, designated PYG1, in the genomic sequence of the piezophilic hyperthermophile Pyrococcus yayanosii. According to the sequence alignment and functional annotation, the genes in PYG1 could tentatively be divided into five modules, with functions related to mobility, DNA repair, metabolic processes and the toxin-antitoxin system. Integrase can mediate the site-specific integration and excision of PYG1 in the chromosome of P. yayanosii A1. Gene replacement of PYG1 with a SimR cassette was successful. The growth of the mutant strain ΔPYG1 was compared with its parent strain P. yayanosii A2 under various stress conditions, including different pH, salinity, temperature, and hydrostatic pressure. The ΔPYG1 mutant strain showed reduced growth when grown at 100°C, while the biomass of ΔPYG1 increased significantly when cultured at 80 MPa. Differential expression of the genes in module III of PYG1 was observed under different temperature and pressure conditions. This study demonstrates the first example of an archaeal integrative genomic island that could affect the adaptation of the hyperthermophilic piezophile P. yayanosii to high temperature and high hydrostatic pressure. PMID:27965650

  1. Contrasting growth phenology of native and invasive forest shrubs mediated by genome size.

    PubMed

    Fridley, Jason D; Craddock, Alaä

    2015-08-01

    Examination of the significance of genome size to plant invasions has been largely restricted to its association with growth rate. We investigated the novel hypothesis that genome size is related to forest invasions through its association with growth phenology, as a result of the ability of large-genome species to grow more effectively through cell expansion at cool temperatures. We monitored the spring leaf phenology of 54 species of eastern USA deciduous forests, including native and invasive shrubs of six common genera. We used new measurements of genome size to evaluate its association with spring budbreak, cell size, summer leaf production rate, and photosynthetic capacity. In a phylogenetic hierarchical model that differentiated native and invasive species as a function of summer growth rate and spring budbreak timing, species with smaller genomes exhibited both faster growth and delayed budbreak compared with those with larger nuclear DNA content. Growth rate, but not budbreak timing, was associated with whether a species was native or invasive. Our results support genome size as a broad indicator of the growth behavior of woody species. Surprisingly, invaders of deciduous forests show the same small-genome tendencies of invaders of more open habitats, supporting genome size as a robust indicator of invasiveness.

  2. Pancreatic cancer modeling using retrograde viral vector delivery and in vivo CRISPR/Cas9-mediated somatic genome editing

    PubMed Central

    Chiou, Shin-Heng; Winters, Ian P.; Wang, Jing; Naranjo, Santiago; Dudgeon, Crissy; Tamburini, Fiona B.; Brady, Jennifer J.; Yang, Dian; Grüner, Barbara M.; Chuang, Chen-Hua; Caswell, Deborah R.; Zeng, Hong; Chu, Pauline; Kim, Grace E.; Carpizo, Darren R.; Kim, Seung K.; Winslow, Monte M.

    2015-01-01

    Pancreatic ductal adenocarcinoma (PDAC) is a genomically diverse, prevalent, and almost invariably fatal malignancy. Although conventional genetically engineered mouse models of human PDAC have been instrumental in understanding pancreatic cancer development, these models are much too labor-intensive, expensive, and slow to perform the extensive molecular analyses needed to adequately understand this disease. Here we demonstrate that retrograde pancreatic ductal injection of either adenoviral-Cre or lentiviral-Cre vectors allows titratable initiation of pancreatic neoplasias that progress into invasive and metastatic PDAC. To enable in vivo CRISPR/Cas9-mediated gene inactivation in the pancreas, we generated a Cre-regulated Cas9 allele and lentiviral vectors that express Cre and a single-guide RNA. CRISPR-mediated targeting of Lkb1 in combination with oncogenic Kras expression led to selection for inactivating genomic alterations, absence of Lkb1 protein, and rapid tumor growth that phenocopied Cre-mediated genetic deletion of Lkb1. This method will transform our ability to rapidly interrogate gene function during the development of this recalcitrant cancer. PMID:26178787

  3. Pancreatic cancer modeling using retrograde viral vector delivery and in vivo CRISPR/Cas9-mediated somatic genome editing.

    PubMed

    Chiou, Shin-Heng; Winters, Ian P; Wang, Jing; Naranjo, Santiago; Dudgeon, Crissy; Tamburini, Fiona B; Brady, Jennifer J; Yang, Dian; Grüner, Barbara M; Chuang, Chen-Hua; Caswell, Deborah R; Zeng, Hong; Chu, Pauline; Kim, Grace E; Carpizo, Darren R; Kim, Seung K; Winslow, Monte M

    2015-07-15

    Pancreatic ductal adenocarcinoma (PDAC) is a genomically diverse, prevalent, and almost invariably fatal malignancy. Although conventional genetically engineered mouse models of human PDAC have been instrumental in understanding pancreatic cancer development, these models are much too labor-intensive, expensive, and slow to perform the extensive molecular analyses needed to adequately understand this disease. Here we demonstrate that retrograde pancreatic ductal injection of either adenoviral-Cre or lentiviral-Cre vectors allows titratable initiation of pancreatic neoplasias that progress into invasive and metastatic PDAC. To enable in vivo CRISPR/Cas9-mediated gene inactivation in the pancreas, we generated a Cre-regulated Cas9 allele and lentiviral vectors that express Cre and a single-guide RNA. CRISPR-mediated targeting of Lkb1 in combination with oncogenic Kras expression led to selection for inactivating genomic alterations, absence of Lkb1 protein, and rapid tumor growth that phenocopied Cre-mediated genetic deletion of Lkb1. This method will transform our ability to rapidly interrogate gene function during the development of this recalcitrant cancer.

  4. Integration of medical applications: the 'mediator service' of the SynEx platform.

    PubMed

    Xu, Y; Sauquet, D; Zapletal, E; Lemaitre, D; Degoulet, P

    2000-09-01

    Interoperability is a key issue and a long-term domain of research for distributed healthcare information systems. The SynEx European project provides an open and standard integration platform for both new and legacy medical applications. It aims to provide seamless access to hospital information services, patient records, and medical knowledge, hiding the distribution aspects and the heterogeneity of the underlying systems. In this study, we describe the SynEx 'mediator service', a software engineering component that is used to facilitate the development of mediators between any pair of SynEx components and to manage the corresponding interchange messages. Both a C++ library and a Java package of a generic mediator model are provided, with several ready-to-use specialisations for well-defined uses. The use of XML technology as a powerful data interchange format and as an efficient data structure converter is proposed and discussed.

  5. Epigenetic regulation of condensin-mediated genome organization during the cell cycle and upon DNA damage through histone H3 lysine 56 acetylation.

    PubMed

    Tanaka, Atsunari; Tanizawa, Hideki; Sriswasdi, Sira; Iwasaki, Osamu; Chatterjee, Atreyi G; Speicher, David W; Levin, Henry L; Noguchi, Eishi; Noma, Ken-Ichi

    2012-11-30

    Complex genome organizations participate in various nuclear processes including transcription, DNA replication, and repair. However, the mechanisms that generate and regulate these functional genome structures remain largely unknown. Here, we describe how the Ku heterodimer complex, which functions in nonhomologous end joining, mediates clustering of long terminal repeat retrotransposons at centromeres in fission yeast. We demonstrate that the CENP-B subunit, Abp1, functions as a recruiter of the Ku complex, which in turn loads the genome-organizing machinery condensin to retrotransposons. Intriguingly, histone H3 lysine 56 (H3K56) acetylation, which functions in DNA replication and repair, interferes with Ku localization at retrotransposons without disrupting Abp1 localization and, as a consequence, dissociates condensin from retrotransposons. This dissociation releases condensin-mediated genomic associations during S phase and upon DNA damage. ATR (ATM- and Rad3-related) kinase mediates the DNA damage response of condensin-mediated genome organization. Our study describes a function of H3K56 acetylation that neutralizes condensin-mediated genome organization.

  6. MicroRNA-mediated immune modulation as a therapeutic strategy in host-implant integration.

    PubMed

    Ong, Siew-Min; Biswas, Subhra K; Wong, Siew-Cheng

    2015-07-01

    The concept of implanting an artificial device into the human body was once the preserve of science fiction, yet this approach is now often used to replace lost or damaged biological structures in human patients. However, assimilation of medical devices into host tissues is a complex process, and successful implant integration into patients is far from certain. The body's immediate response to a foreign object is an immune-mediated reaction; hence, there has been extensive research into biomaterials that can reduce or even ablate anti-implant immune responses. There have also been attempts to embed or coat anti-inflammatory drugs and pro-regulatory molecules onto medical devices with the aim of preventing implant rejection by the host. In this review, we summarize the key immune mediators of medical implant reaction, and we evaluate the potential of microRNAs to regulate these processes to promote wound healing and prolong host-implant integration.

  7. HERV-Mediated Genomic Rearrangement of EYA1 in an Individual With Branchio-oto-renal Syndrome

    PubMed Central

    Sanchez-Valle, Amarilis; Wang, Xueqing; Potocki, Lorraine; Xia, Zhilian; Kang, Sung-Hae L.; Carlin, Mary E.; Michel, Donnice; Williams, Patricia; Cabrera-Meza, Gerardo; Brundage, Ellen K.; Eifert, Anna L.; Stankiewicz, Pawel; Cheung, Sau Wai; Lalani, Seema R.

    2013-01-01

    Branchio-oto-renal syndrome is characterized by branchial defects, hearing loss, preauricular pits, and renal anomalies. Mutations in EYA1 are the most common cause of branchio-oto-renal and branchio-otic syndromes. Large chromosomal aberrations of 8q13, including complex rearrangements, occur in about 20% of these individuals. However, submicroscopic deletions and the molecular characterization of genomic rearrangements involving the EYA1 gene have rarely been reported. Using array comparative genomic hybridization, we identified non-recurrent genomic deletions including the EYA1 gene in three patients with branchio-oto-renal syndrome, short stature, and developmental delay. One of these deletions was mediated by two human endogenous retroviral sequence blocks, analogous to the AZFa microdeletion on Yq11, which is responsible for male infertility. This report describes the expanded phenotype of individuals with contiguous gene deletions involving the EYA1 gene and provides a molecular description of the genomic rearrangements involving this gene in branchio-oto-renal syndrome. PMID:20979191

  8. Integrative functional genomics identifies regulatory mechanisms at coronary artery disease loci

    PubMed Central

    Miller, Clint L.; Pjanic, Milos; Wang, Ting; Nguyen, Trieu; Cohain, Ariella; Lee, Jonathan D.; Perisic, Ljubica; Hedin, Ulf; Kundu, Ramendra K.; Majmudar, Deshna; Kim, Juyong B.; Wang, Oliver; Betsholtz, Christer; Ruusalepp, Arno; Franzén, Oscar; Assimes, Themistocles L.; Montgomery, Stephen B.; Schadt, Eric E.; Björkegren, Johan L.M.; Quertermous, Thomas

    2016-01-01

    Coronary artery disease (CAD) is the leading cause of mortality and morbidity, driven by both genetic and environmental risk factors. Meta-analyses of genome-wide association studies have identified >150 loci associated with CAD and myocardial infarction susceptibility in humans. A majority of these variants reside in non-coding regions and are co-inherited with hundreds of candidate regulatory variants, presenting a challenge to elucidate their functions. Herein, we use integrative genomic, epigenomic and transcriptomic profiling of perturbed human coronary artery smooth muscle cells and tissues to begin to identify causal regulatory variation and mechanisms responsible for CAD associations. Using these genome-wide maps, we prioritize 64 candidate variants and perform allele-specific binding and expression analyses at seven top candidate loci: 9p21.3, SMAD3, PDGFD, IL6R, BMP1, CCDC97/TGFB1 and LMOD1. We validate our findings in expression quantitative trait loci cohorts, which together reveal new links between CAD associations and regulatory function in the appropriate disease context. PMID:27386823

  9. An integrated map of genetic variation from 1,092 human genomes

    PubMed Central

    2012-01-01

    Through characterising the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help understand the genetic contribution to disease. We describe the genomes of 1,092 individuals from 14 populations, constructed using a combination of low-coverage whole-genome and exome sequencing. By developing methodologies to integrate information across multiple algorithms and diverse data sources we provide a validated haplotype map of 38 million SNPs, 1.4 million indels and over 14 thousand larger deletions. We show that individuals from different populations carry different profiles of rare and common variants and that low-frequency variants show substantial geographic differentiation, which is further increased by the action of purifying selection. We show that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways and that each individual harbours hundreds of rare non-coding variants at conserved sites, such as transcription-factor-motif disrupting changes. This resource, which captures up to 98% of accessible SNPs at a frequency of 1% in populations of medical genetics focus, enables analysis of common and low-frequency variants in individuals from diverse, including admixed, populations. PMID:23128226

  10. Visible integration of the adenosine deaminase (ADA) gene into the recipient genome after gene therapy.

    PubMed

    Egashira, M; Ariga, T; Kawamura, N; Miyoshi, O; Niikawa, N; Sakiyama, Y

    1998-01-23

    Gene therapy for patients with adenosine deaminase (ADA) deficiency has become practical in the 1990s, and the exogenous gene has been reported to survive for several years in the recipient genome. To evaluate the integration efficiency of the ADA gene (ADA) into peripheral blood lymphocytes (PBL) of a patient with ADA deficiency who is receiving gene therapy, we performed two-color interphase fluorescence in situ hybridization (FISH) analysis by using digoxigenin-labeled ADA-cDNA and the biotin-labeled lambda-genomic ADA clone as probes. After each of 9 sequential series of gene therapy, interphase nuclei of 100 mononuclear cells from the patient were analyzed, and those of a LASN-producing cell line were used as a control. FISH signals were detected with rhodamine and FITC for the cDNA and the genomic DNA, respectively. The number of PBL giving a transgene signal grew after the sequential gene therapies, and the proportion of signal-positive cells reached about 10%. Our results indicate that the two-color FISH system can be used as a potential aid to monitor the efficiency of the ADA gene therapy.

  11. An integrated map of genetic variation from 1,092 human genomes.

    PubMed

    Abecasis, Goncalo R; Auton, Adam; Brooks, Lisa D; DePristo, Mark A; Durbin, Richard M; Handsaker, Robert E; Kang, Hyun Min; Marth, Gabor T; McVean, Gil A

    2012-11-01

    By characterizing the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help to understand the genetic contribution to disease. Here we describe the genomes of 1,092 individuals from 14 populations, constructed using a combination of low-coverage whole-genome and exome sequencing. By developing methods to integrate information across several algorithms and diverse data sources, we provide a validated haplotype map of 38 million single nucleotide polymorphisms, 1.4 million short insertions and deletions, and more than 14,000 larger deletions. We show that individuals from different populations carry different profiles of rare and common variants, and that low-frequency variants show substantial geographic differentiation, which is further increased by the action of purifying selection. We show that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways, and that each individual contains hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites. This resource, which captures up to 98% of accessible single nucleotide polymorphisms at a frequency of 1% in related populations, enables analysis of common and low-frequency variants in individuals from diverse, including admixed, populations.

  12. Silicon-on-insulator sensors using integrated resonance-enhanced defect-mediated photodetectors.

    PubMed

    Fard, Sahba Talebi; Murray, Kyle; Caverley, Michael; Donzella, Valentina; Flueckiger, Jonas; Grist, Samantha M; Huante-Ceron, Edgar; Schmidt, Shon A; Kwok, Ezra; Jaeger, Nicolas A F; Knights, Andrew P; Chrostowski, Lukas

    2014-11-17

    A resonance-enhanced, defect-mediated, ring resonator photodetector has been implemented as a single unit biosensor on a silicon-on-insulator platform, providing a cost effective means of integrating ring resonator sensors with photodetectors for lab-on-chip applications. This method overcomes the challenge of integrating hybrid photodetectors on the chip. The demonstrated responsivity of the photodetector-sensor was 90 mA/W. Devices were characterized using refractive index modified solutions and showed sensitivities of 30 nm/RIU.
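
    The two figures of merit reported above can be turned into back-of-the-envelope estimates. The sketch below assumes both quantities scale linearly, which is only a first-order approximation for a resonant device. The helper names, the refractive index change, and the optical power are hypothetical illustration values, not measurements from the paper.

```python
def resonance_shift_nm(sensitivity_nm_per_riu, delta_n_riu):
    """Resonance wavelength shift (nm) for a bulk refractive index change (RIU)."""
    return sensitivity_nm_per_riu * delta_n_riu


def photocurrent_ma(responsivity_ma_per_w, optical_power_w):
    """Photocurrent (mA) generated by a given optical power (W)."""
    return responsivity_ma_per_w * optical_power_w


SENSITIVITY_NM_PER_RIU = 30.0  # reported sensitivity
RESPONSIVITY_MA_PER_W = 90.0   # reported responsivity

# Hypothetical inputs: an index change of 1e-3 RIU and 1 mW of optical power.
shift = resonance_shift_nm(SENSITIVITY_NM_PER_RIU, 1e-3)
current = photocurrent_ma(RESPONSIVITY_MA_PER_W, 1e-3)
```

    Under these assumptions, an index change of 0.001 RIU corresponds to a shift of about 0.03 nm, and 1 mW of optical power to a photocurrent of about 0.09 mA.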

  13. Microbial modulators of soil carbon storage: integrating genomic and metabolic knowledge for global prediction.

    PubMed

    Trivedi, Pankaj; Anderson, Ian C; Singh, Brajesh K

    2013-12-01

    Soil organic carbon performs a number of functions in ecosystems and it is clear that microbial communities play important roles in land-atmosphere carbon (C) exchange and soil C storage. In this review, we discuss microbial modulators of soil C storage, 'omics'-based approaches to characterize microbial system interactions impacting terrestrial C sequestration, and how data related to microbial composition and activities can be incorporated into mechanistic and predictive models. We argue that although making direct linkage of genomes to global phenomena is a significant challenge, many connections at intermediate scales are viable with integrated application of new systems biology approaches and powerful analytical and modelling techniques. This integration could enhance our capability to develop and evaluate microbial strategies for capturing and sequestering atmospheric CO2.

  14. Integrative genomic mining for enzyme function to enable engineering of a non-natural biosynthetic pathway

    PubMed Central

    Mak, Wai Shun; Tran, Stephen; Marcheschi, Ryan; Bertolani, Steve; Thompson, James; Baker, David; Liao, James C.; Siegel, Justin B.

    2015-01-01

    The ability to biosynthetically produce chemicals beyond what is commonly found in Nature requires the discovery of novel enzyme function. Here we utilize two approaches to discover enzymes that enable specific production of longer-chain (C5–C8) alcohols from sugar. The first approach combines bioinformatics and molecular modelling to mine sequence databases, resulting in a diverse panel of enzymes capable of catalysing the targeted reaction. The median catalytic efficiency of the computationally selected enzymes is 75-fold greater than a panel of naively selected homologues. This integrative genomic mining approach establishes a unique avenue for enzyme function discovery in the rapidly expanding sequence databases. The second approach uses computational enzyme design to reprogramme specificity. Both approaches result in enzymes with >100-fold increase in specificity for the targeted reaction. When enzymes from either approach are integrated in vivo, longer-chain alcohol production increases over 10-fold and represents >95% of the total alcohol products. PMID:26598135

  15. metabolicMine: an integrated genomics, genetics and proteomics data warehouse for common metabolic disease research.

    PubMed

    Lyne, Mike; Smith, Richard N; Lyne, Rachel; Aleksic, Jelena; Hu, Fengyuan; Kalderimis, Alex; Stepan, Radek; Micklem, Gos

    2013-01-01

    Common metabolic and endocrine diseases such as diabetes affect millions of people worldwide and have a major health impact, frequently leading to complications and mortality. In a search for better prevention and treatment, there is ongoing research into the underlying molecular and genetic bases of these complex human diseases, as well as into the links with risk factors such as obesity. Although an increasing number of relevant genomic and proteomic data sets have become available, the quantity and diversity of the data make their efficient exploitation challenging. Here, we present metabolicMine, a data warehouse with a specific focus on the genomics, genetics and proteomics of common metabolic diseases. Developed in collaboration with leading UK metabolic disease groups, metabolicMine integrates data sets from a range of experiments and model organisms alongside tools for exploring them. The current version brings together information covering genes, proteins, orthologues, interactions, gene expression, pathways, ontologies, diseases, genome-wide association studies and single nucleotide polymorphisms. Although the emphasis is on human data, key data sets from mouse and rat are included. These are complemented by interoperation with the RatMine rat genomics database, with a corresponding mouse version under development by the Mouse Genome Informatics (MGI) group. The web interface contains a number of features including keyword search, a library of Search Forms, the QueryBuilder and list analysis tools. This provides researchers with many different ways to analyse, view and flexibly export data. Programming interfaces and automatic code generation in several languages are supported, and many of the features of the web interface are available through web services. The combination of diverse data sets integrated with analysis tools and a powerful query system makes metabolicMine a valuable research resource. The web interface makes it accessible to first

  16. metabolicMine: an integrated genomics, genetics and proteomics data warehouse for common metabolic disease research

    PubMed Central

    Lyne, Mike; Smith, Richard N; Lyne, Rachel; Aleksic, Jelena; Hu, Fengyuan; Kalderimis, Alex; Stepan, Radek; Micklem, Gos

    2013-01-01

    Common metabolic and endocrine diseases such as diabetes affect millions of people worldwide and have a major health impact, frequently leading to complications and mortality. In a search for better prevention and treatment, there is ongoing research into the underlying molecular and genetic bases of these complex human diseases, as well as into the links with risk factors such as obesity. Although an increasing number of relevant genomic and proteomic data sets have become available, the quantity and diversity of the data make their efficient exploitation challenging. Here, we present metabolicMine, a data warehouse with a specific focus on the genomics, genetics and proteomics of common metabolic diseases. Developed in collaboration with leading UK metabolic disease groups, metabolicMine integrates data sets from a range of experiments and model organisms alongside tools for exploring them. The current version brings together information covering genes, proteins, orthologues, interactions, gene expression, pathways, ontologies, diseases, genome-wide association studies and single nucleotide polymorphisms. Although the emphasis is on human data, key data sets from mouse and rat are included. These are complemented by interoperation with the RatMine rat genomics database, with a corresponding mouse version under development by the Mouse Genome Informatics (MGI) group. The web interface contains a number of features including keyword search, a library of Search Forms, the QueryBuilder and list analysis tools. This provides researchers with many different ways to analyse, view and flexibly export data. Programming interfaces and automatic code generation in several languages are supported, and many of the features of the web interface are available through web services. The combination of diverse data sets integrated with analysis tools and a powerful query system makes metabolicMine a valuable research resource. The web interface makes it accessible to first

  17. Purdue Ionomics Information Management System. An Integrated Functional Genomics Platform

    PubMed Central

    Baxter, Ivan; Ouzzani, Mourad; Orcun, Seza; Kennedy, Brad; Jandhyala, Shrinivas S.; Salt, David E.

    2007-01-01

    The advent of high-throughput phenotyping technologies has created a deluge of information that is difficult to deal with without the appropriate data management tools. These data management tools should integrate defined workflow controls for genomic-scale data acquisition and validation, data storage and retrieval, and data analysis, indexed around the genomic information of the organism of interest. To maximize the impact of these large datasets, it is critical that they are rapidly disseminated to the broader research community, allowing open access for data mining and discovery. We describe here a system that incorporates such functionalities developed around the Purdue University high-throughput ionomics phenotyping platform. The Purdue Ionomics Information Management System (PiiMS) provides integrated workflow control, data storage, and analysis to facilitate high-throughput data acquisition, along with integrated tools for data search, retrieval, and visualization for hypothesis development. PiiMS is deployed as a World Wide Web-enabled system, allowing for integration of distributed workflow processes and open access to raw data for analysis by numerous laboratories. PiiMS currently contains data on shoot concentrations of P, Ca, K, Mg, Cu, Fe, Zn, Mn, Co, Ni, B, Se, Mo, Na, As, and Cd in over 60,000 shoot tissue samples of Arabidopsis (Arabidopsis thaliana), including ethyl methanesulfonate, fast-neutron and defined T-DNA mutants, and natural accession and populations of recombinant inbred lines from over 800 separate experiments, representing over 1,000,000 fully quantitative elemental concentrations. PiiMS is accessible at www.purdue.edu/dp/ionomics. PMID:17189337

  18. Bridging the Gap from Bench to Bedside--An Informatics Infrastructure for Integrating Clinical, Genomics and Environmental Data (ICGED).

    PubMed

    2015-01-01

    The abundance of heterogeneous biomedical data from a variety of sources demands the development of strategies to address data integration and management issues, so that the data can be used effectively in clinical practices and biomedical research. This research presents an Informatics Infrastructure for Integrating Clinical, Genomics and Environmental Data (ICGED) and provides a roadmap that envisions utilizing the clinical and biomedical resources in our case study. This work describes a data integration approach, proposed by ICGED, with a two-fold purpose: personalized medicine and biomedical data storage and sharing platform. It describes our experiences integrating disease specific clinical and genomics datasets with Data Integration and Analysis Tools (DIAT)--using Informatics for Integrating Biology and the Bedside, and discusses work in progress and future work for extending DIAT, and the development of Risk Assessment and Prediction Tools, Clinical Decision Support Systems and a Bioinformatics Data Warehouse.

  19. Circulating nucleic acids damage DNA of healthy cells by integrating into their genomes.

    PubMed

    Mittra, Indraneel; Khare, Naveen Kumar; Raghuram, Gorantla Venkata; Chaubal, Rohan; Khambatti, Fatema; Gupta, Deepika; Gaikwad, Ashwini; Prasannan, Preeti; Singh, Akshita; Iyer, Aishwarya; Singh, Ankita; Upadhyay, Pawan; Nair, Naveen Kumar; Mishra, Pradyumna Kumar; Dutt, Amit

    2015-03-01

    Whether nucleic acids that circulate in blood have any patho-physiological functions in the host has not been explored. We report here that far from being inert molecules, circulating nucleic acids have significant biological activities of their own that are deleterious to healthy cells of the body. Fragmented DNA and chromatin (DNAfs and Cfs) isolated from blood of cancer patients and healthy volunteers are readily taken up by a variety of cells in culture to be localized in their nuclei within a few minutes. The intra-nuclear DNAfs and Cfs associate themselves with host cell chromosomes to evoke a cellular DNA-damage-repair-response (DDR) followed by their incorporation into the host cell genomes. Whole genome sequencing detected the presence of tens of thousands of human sequence reads in the recipient mouse cells. Genomic incorporation of DNAfs and Cfs leads to dsDNA breaks and activation of apoptotic pathways in the treated cells. When injected intravenously into Balb/C mice, DNAfs and Cfs undergo genomic integration into cells of their vital organs resulting in activation of DDR and apoptotic proteins in the recipient cells. Cfs have significantly greater activity than DNAfs with respect to all parameters examined, while both DNAfs and Cfs isolated from cancer patients are more active than those from normal volunteers. All the pathological actions of DNAfs and Cfs described above can be abrogated by concurrent treatment with DNase I and/or anti-histone antibody complexed nanoparticles both in vitro and in vivo. Taken together, our results suggest that circulating DNAfs and Cfs are physiological, continuously arising, endogenous DNA damaging agents with implications for ageing and a multitude of human pathologies including initiation of cancer.

  20. Shade avoidance 6 encodes an Arabidopsis flap endonuclease required for maintenance of genome integrity and development

    PubMed Central

    Zhang, Yijuan; Wen, Chunhong; Liu, Songbai; Zheng, Li; Shen, Binghui; Tao, Yi

    2016-01-01

    Flap endonuclease-1 (FEN1) belongs to the Rad2 family of structure-specific nucleases. It is required for several DNA metabolic pathways, including DNA replication and DNA damage repair. Here, we have identified a shade avoidance mutant, sav6, which reduces the mRNA splicing efficiency of SAV6. We have demonstrated that SAV6 is an FEN1 homologue that shows double-flap endonuclease and gap-dependent endonuclease activity, but lacks exonuclease activity. sav6 mutants are hypersensitive to DNA damage induced by ultraviolet (UV)-C radiation and reagents that induce double-stranded DNA breaks, but exhibit normal responses to chemicals that block DNA replication. Signalling components that respond to DNA damage are constitutively activated in sav6 mutants. These data indicate that SAV6 is required for DNA damage repair and the maintenance of genome integrity. Mutant sav6 plants also show reduced root apical meristem (RAM) size and defective quiescent centre (QC) development. The expression of SMR7, a cell cycle regulatory gene, and ERF115 and PSK5, regulators of QC division, is increased in sav6 mutants. Their constitutive induction is likely due to the elevated DNA damage responses in sav6 and may lead to defects in the development of the RAM and QC. Therefore, SAV6 assures proper root development through maintenance of genome integrity. PMID:26721386

  1. Merkel Cell Polyomavirus Large T Antigen Disrupts Host Genomic Integrity and Inhibits Cellular Proliferation

    PubMed Central

    Li, Jing; Wang, Xin; Diaz, Jason; Tsang, Sabrina H.; Buck, Christopher B.

    2013-01-01

    Clonal integration of Merkel cell polyomavirus (MCV) DNA into the host genome has been observed in at least 80% of Merkel cell carcinoma (MCC). The integrated viral genome typically carries mutations that truncate the C-terminal DNA binding and helicase domains of the MCV large T antigen (LT), suggesting a selective pressure to remove this MCV LT region during tumor development. In this study, we show that MCV infection leads to the activation of host DNA damage responses (DDR). This activity was mapped to the C-terminal helicase-containing region of the MCV LT. The MCV LT-activated DNA damage kinases, in turn, led to enhanced p53 phosphorylation, upregulation of p53 downstream target genes, and cell cycle arrest. Compared to the N-terminal MCV LT fragment that is usually preserved in mutants isolated from MCC tumors, full-length MCV LT shows a decreased potential to support cellular proliferation, focus formation, and anchorage-independent cell growth. These apparently antitumorigenic effects can be reversed by a dominant-negative p53 inhibitor. Our results demonstrate that MCV LT-induced DDR activates p53 pathway, leading to the inhibition of cellular proliferation. This study reveals a key difference between MCV LT and simian vacuolating virus 40 LT, which activates a DDR but inhibits p53 function. This study also explains, in part, why truncation mutations that remove the MCV LT C-terminal region are necessary for the oncogenic progression of MCV-associated cancers. PMID:23760247

  2. Flowering and genome integrity control by a nuclear matrix protein in Arabidopsis.

    PubMed

    Xu, Yifeng; Gan, Eng-Seng; He, Yuehui; Ito, Toshiro

    2013-01-01

    The matrix attachment regions (MARs) binding proteins could finely orchestrate temporal and spatial gene expression during development. In Arabidopsis, transposable elements (TEs) and TE-like repeat sequences are transcriptionally repressed or attenuated by the coordination of many key players including DNA methyltransferases, histone deacetylases, histone methyltransferases and the siRNA pathway, which help to protect genomic integrity and control multiple developmental processes such as flowering. We have recently reported that an AT-hook nuclear matrix binding protein, TRANSPOSABLE ELEMENT SILENCING VIA AT-HOOK (TEK), participates in a histone deacetylation (HDAC) complex to silence TEs and genes containing a TE-like sequence, including AtMu1, FWA and FLOWERING LOCUS C (FLC) in Ler background. We have shown that TEK knockdown causes increased histone acetylation, reduced H3K9me2 and moderate reduction of DNA methylation in the target loci, leading to the de-repression of FLC and FWA, as well as TE reactivation. Here we discuss the role of TEK as a putative MAR binding protein which functions in the maintenance of genome integrity and in flowering control by silencing TEs and repeat-containing genes.

  3. An integrated framework for discovery and genotyping of genomic variants from high-throughput sequencing experiments.

    PubMed

    Duitama, Jorge; Quintero, Juan Camilo; Cruz, Daniel Felipe; Quintero, Constanza; Hubmann, Georg; Foulquié-Moreno, Maria R; Verstrepen, Kevin J; Thevelein, Johan M; Tohme, Joe

    2014-04-01

    Recent advances in high-throughput sequencing (HTS) technologies and computing capacity have produced unprecedented amounts of genomic data that have unraveled the genetics of phenotypic variability in several species. However, operating and integrating current software tools for data analysis still require important investments in highly skilled personnel. Developing accurate, efficient and user-friendly software packages for HTS data analysis will lead to a more rapid discovery of genomic elements relevant to medical, agricultural and industrial applications. We therefore developed Next-Generation Sequencing Eclipse Plug-in (NGSEP), a new software tool for integrated, efficient and user-friendly detection of single nucleotide variants (SNVs), indels and copy number variants (CNVs). NGSEP includes modules for read alignment, sorting, merging, functional annotation of variants, filtering and quality statistics. Analysis of sequencing experiments in yeast, rice and human samples shows that NGSEP has superior accuracy and efficiency, compared with currently available packages for variants detection. We also show that only a comprehensive and accurate identification of repeat regions and CNVs allows researchers to properly separate SNVs from differences between copies of repeat elements. We expect that NGSEP will become a strong support tool to empower the analysis of sequencing data in a wide range of research projects on different species.

  4. Evolutionary time-scale of the begomoviruses: evidence from integrated sequences in the Nicotiana genome.

    PubMed

    Lefeuvre, Pierre; Harkins, Gordon W; Lett, Jean-Michel; Briddon, Rob W; Chase, Mark W; Moury, Benoit; Martin, Darren P

    2011-01-01

    Despite having single-stranded DNA genomes that are replicated by host DNA polymerases, viruses in the family Geminiviridae are apparently evolving as rapidly as some RNA viruses. The observed substitution rates of geminiviruses in the genera Begomovirus and Mastrevirus are so high that the entire family could conceivably have originated less than a million years ago (MYA). However, the existence of geminivirus-related DNA (GRD) integrated within the genomes of various Nicotiana species suggests that the geminiviruses probably originated >10 MYA. Some have even suggested that a distinct New-World (NW) lineage of begomoviruses may have arisen following the separation by continental drift of African and American proto-begomoviruses ∼110 MYA. We evaluate these various geminivirus origin hypotheses using Bayesian coalescent-based approaches to date firstly the Nicotiana GRD integration events, and then the divergence of the NW and Old-World (OW) begomoviruses. Besides rejecting the possibility of a <2 MYA OW-NW begomovirus split, we could also discount that it may have occurred concomitantly with the breakup of Gondwanaland 110 MYA. Although we could only confidently narrow the date of the split down to between 2 and 80 MYA, the most plausible (and best supported) date for the split is between 20 and 30 MYA--a time when global cooling ended the dispersal of temperate species between Asia and North America via the Beringian land bridge.

  5. Systematic characterization of deubiquitylating enzymes for roles in maintaining genome integrity

    PubMed Central

    Nishi, Ryotaro; Wijnhoven, Paul; le Sage, Carlos; Tjeertes, Jorrit; Galanty, Yaron; Forment, Josep V.; Clague, Michael J.; Urbé, Sylvie; Jackson, Stephen P

    2014-01-01

    DNA double-strand breaks (DSBs) are perhaps the most toxic of all DNA lesions, with defects in the DNA damage response to DSBs being associated with various human diseases. Although it is known that DSB repair pathways are tightly regulated by ubiquitylation, we do not yet have a comprehensive understanding of how deubiquitylating enzymes (DUBs) function in DSB responses. Here, by carrying out a multi-dimensional screening strategy for human DUBs, we identify several with hitherto unknown links to DSB repair, the G2/M DNA-damage checkpoint and genome-integrity maintenance. Phylogenetic analyses reveal functional clustering within certain DUB subgroups, suggesting evolutionarily conserved functions and/or related modes of action. Furthermore, we establish that the DUB UCHL5 regulates DSB resection and repair by homologous recombination through protecting its interactor, NFRKB, from degradation. Collectively, our findings extend the list of DUBs promoting the maintenance of genome integrity, and highlight their potential as therapeutic targets for cancer. PMID:25194926

  6. Evolutionary aspects of plastid proteins involved in transcription: the transcription of a tiny genome is mediated by a complicated machinery.

    PubMed

    Yagi, Yusuke; Shiina, Takashi

    2012-01-01

    Chloroplasts in land plants have a small genome consisting of only 100 genes encoding partial sets of proteins for photosynthesis, transcription and translation. Although it has been thought that chloroplast transcription is mediated by a basically cyanobacterium-derived system, due to the endosymbiotic origin of plastids, recent studies suggest the existence of a hybrid transcription machinery containing non-bacterial proteins that have been newly acquired during plant evolution. Here, we highlight chloroplast-specific non-bacterial transcription mechanisms by which land plant chloroplasts have gained novel functions.

  7. Complete genome sequence of the Sporosarcina psychrophila DSM 6497, a psychrophilic Bacillus strain that mediates the calcium carbonate precipitation.

    PubMed

    Yan, Wenkai; Xiao, Xiang; Zhang, Yu

    2016-05-20

    Sporosarcina psychrophila DSM 6497 is a Gram-positive, spore-forming psychrophilic bacterial strain, widely distributed in terrestrial and aquatic environments. Here we report its complete genome sequence, comprising one circular chromosome of 4,674,191 bp with a GC content of 40.3%. Genes encoding urease are predicted in the genome, which provide insight into the microbiologically mediated urea hydrolysis process. This urea hydrolysis can further lead to an increase of carbonate anion and alkalinity in the environment, which promotes the microbiologically induced carbonate precipitation with various applications, such as the bioremediation of calcium-rich wastewater and bio-reservation of architectural patrimony.

  8. Multiple proviral integration events after virological synapse-mediated HIV-1 spread

    SciTech Connect

    Russell, Rebecca A.; Martin, Nicola; Mitar, Ivonne; Jones, Emma; Sattentau, Quentin J.

    2013-08-15

    HIV-1 can move directly between T cells via virological synapses (VS). Although aspects of the molecular and cellular mechanisms underlying this mode of spread have been elucidated, the outcomes for infection of the target cell remain incompletely understood. We set out to determine whether HIV-1 transfer via VS results in productive, high-multiplicity HIV-1 infection. We found that HIV-1 cell-to-cell spread resulted in nuclear import of multiple proviruses into target cells as seen by fluorescence in-situ hybridization. Proviral integration into the target cell genome was significantly higher than that seen in a cell-free infection system, and consequent de novo viral DNA and RNA production in the target cell detected by quantitative PCR increased over time. Our data show efficient proviral integration across VS, implying the probability of multiple integration events in target cells that drive productive T cell infection. - Highlights: • Cell-to-cell HIV-1 infection delivers multiple vRNA copies to the target cell. • Cell-to-cell infection results in productive infection of the target cell. • Cell-to-cell transmission is more efficient than cell-free HIV-1 infection. • Suggests a mechanism for recombination in cells infected with multiple viral genomes.

  9. Genome-wide conserved non-coding microsatellite (CNMS) marker-based integrative genetical genomics for quantitative dissection of seed weight in chickpea.

    PubMed

    Bajaj, Deepak; Saxena, Maneesha S; Kujur, Alice; Das, Shouvik; Badoni, Saurabh; Tripathi, Shailesh; Upadhyaya, Hari D; Gowda, C L L; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K; Parida, Swarup K

    2015-03-01

    Phylogenetic footprinting identified 666 genome-wide paralogous and orthologous CNMS (conserved non-coding microsatellite) markers from 5'-untranslated and regulatory regions (URRs) of 603 protein-coding chickpea genes. The (CT)n and (GA)n CNMS carrying CTRMCAMV35S and GAGA8BKN3 regulatory elements, respectively, are abundant in the chickpea genome. The mapped genic CNMS markers with robust amplification efficiencies (94.7%) detected higher intraspecific polymorphic potential (37.6%) among genotypes, implying their immense utility in chickpea breeding and genetic analyses. Seventeen differentially expressed CNMS marker-associated genes showing strong preferential and seed tissue/developmental stage-specific expression in contrasting genotypes were selected to narrow down the gene targets underlying seed weight quantitative trait loci (QTLs)/eQTLs (expression QTLs) through integrative genetical genomics. The integration of transcript profiling with seed weight QTL/eQTL mapping, molecular haplotyping, and association analyses identified potential molecular tags (GAGA8BKN3 and RAV1AAT regulatory elements and alleles/haplotypes) in the LOB-domain-containing protein- and KANADI protein-encoding transcription factor genes controlling the cis-regulated expression for seed weight in the chickpea. This emphasizes the potential of CNMS marker-based integrative genetical genomics for the quantitative genetic dissection of complex seed weight in chickpea.

  10. OncDRS: An integrative clinical and genomic data platform for enabling translational research and precision medicine

    PubMed Central

    Orechia, John; Pathak, Ameet; Shi, Yunling; Nawani, Aniket; Belozerov, Andrey; Fontes, Caitlin; Lakhiani, Camille; Jawale, Chetan; Patel, Chetansharan; Quinn, Daniel; Botvinnik, Dmitry; Mei, Eddie; Cotter, Elizabeth; Byleckie, James; Ullman-Cullere, Mollie; Chhetri, Padam; Chalasani, Poornima; Karnam, Purushotham; Beaudoin, Ronald; Sahu, Sandeep; Belozerova, Yelena; Mathew, Jomol P.

    2015-01-01

    We live in the genomic era of medicine, where a patient's genomic/molecular data is becoming increasingly important for disease diagnosis, identification of targeted therapy, and risk assessment for adverse reactions. However, decoding genomic test results and integrating them with clinical data for retrospective studies and cohort identification for prospective clinical trials is still a challenging task. In order to overcome these barriers, we developed an overarching enterprise informatics framework for translational research and personalized medicine called Synergistic Patient and Research Knowledge Systems (SPARKS) and a suite of tools called Oncology Data Retrieval Systems (OncDRS). OncDRS enables seamless data integration, secure and self-navigated query and extraction of clinical and genomic data from heterogeneous sources. Within a year of release, the system has facilitated more than 1500 research queries and has delivered data for more than 50 research studies. PMID:27054074

  11. CRISPR/Cas9-mediated genome engineering and the promise of designer flies on demand.

    PubMed

    Gratz, Scott J; Wildonger, Jill; Harrison, Melissa M; O'Connor-Giles, Kate M

    2013-01-01

    The CRISPR/Cas9 system has attracted significant attention for its potential to transform genome engineering. We and others have recently shown that the RNA-guided Cas9 nuclease can be employed to engineer the Drosophila genome, and that these modifications are efficiently transmitted through the germline. A single targeting RNA can guide Cas9 to a specific genomic sequence where it induces double-strand breaks that, when imperfectly repaired, yield mutations. We have also demonstrated that 2 targeting RNAs can be used to generate large defined deletions and that Cas9 can catalyze gene replacement by homologous recombination. Zinc-finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs) have shown similar promise in Drosophila. However, the ease of producing targeting RNAs over the generation of unique sequence-directed nucleases to guide site-specific modifications makes the CRISPR/Cas9 system an appealingly accessible method for genome editing. From the initial planning stages, engineered flies can be obtained within a month. Here we highlight the variety of genome modifications facilitated by the CRISPR/Cas9 system along with key considerations for starting your own CRISPR genome engineering project.

  12. Mycobacterium tuberculosis EsxO (Rv2346c) promotes bacillary survival by inducing oxidative stress mediated genomic instability in macrophages.

    PubMed

    Mohanty, Soumitra; Dal Molin, Michael; Ganguli, Geetanjali; Padhi, Avinash; Jena, Prajna; Selchow, Petra; Sengupta, Srabasti; Meuli, Michael; Sander, Peter; Sonawane, Avinash

    2016-01-01

    Mycobacterium tuberculosis (Mtb) survives inside the macrophages by modulating the host immune responses in its favor. The 6-kDa early secretory antigenic target (ESAT-6; esxA) of Mtb is known as a potent virulence and T-cell antigenic determinant. At least 23 such ESAT-6 family proteins are encoded in the genome of Mtb; however, the function of many of them is still unknown. We herein report that ectopic expression of Mtb Rv2346c (esxO), a member of ESAT-6 family proteins, in non-pathogenic Mycobacterium smegmatis strain (MsmRv2346c) aids host cell invasion and intracellular bacillary persistence. Further mechanistic studies revealed that MsmRv2346c infection abated macrophage immunity by inducing host cell death and genomic instability as evident from the appearance of several DNA damage markers. We further report that the induction of genomic instability in infected cells was due to an increase in the host's oxidative stress responses. MsmRv2346c infection was also found to induce autophagy and modulate the immune function of macrophages. In contrast, blockade of Rv2346c-induced oxidative stress by treatment with the ROS inhibitor N-acetyl-L-cysteine prevented the host cell death, autophagy induction and genomic instability in infected macrophages. Conversely, MtbΔRv2346c mutant did not show any difference in intracellular survival and oxidative stress responses. We envision that Mtb ESAT-6 family protein Rv2346c dampens antibacterial effector functions, namely by inducing oxidative stress-mediated genomic instability in infected macrophages, while loss of Rv2346c gene function may be compensated by other redundant ESAT-6 family proteins. Thus EsxO plays an important role in mycobacterial pathogenesis in the context of innate immunity.

  13. Maintenance of Genome Integrity: How Mammalian Cells Orchestrate Genome Duplication by Coordinating Replicative and Specialized DNA Polymerases

    PubMed Central

    Barnes, Ryan; Eckert, Kristin

    2017-01-01

    Precise duplication of the human genome is challenging due to both its size and sequence complexity. DNA polymerase errors made during replication, repair or recombination are central to creating mutations that drive cancer and aging. Here, we address the regulation of human DNA polymerases, specifically how human cells orchestrate DNA polymerases in the face of stress to complete replication and maintain genome stability. DNA polymerases of the B-family are uniquely adept at accurate genome replication, but there are numerous situations in which one or more additional DNA polymerases are required to complete genome replication. Polymerases of the Y-family have been extensively studied in the bypass of DNA lesions; however, recent research has revealed that these polymerases play important roles in normal human physiology. Replication stress is widely cited as contributing to genome instability, and is caused by conditions leading to slowed or stalled DNA replication. Common Fragile Sites epitomize “difficult to replicate” genome regions that are particularly vulnerable to replication stress, and are associated with DNA breakage and structural variation. In this review, we summarize the roles of both the replicative and Y-family polymerases in human cells, and focus on how these activities are regulated during normal and perturbed genome replication. PMID:28067843

  14. Maintenance of Genome Integrity: How Mammalian Cells Orchestrate Genome Duplication by Coordinating Replicative and Specialized DNA Polymerases.

    PubMed

    Barnes, Ryan; Eckert, Kristin

    2017-01-06

    Precise duplication of the human genome is challenging due to both its size and sequence complexity. DNA polymerase errors made during replication, repair or recombination are central to creating mutations that drive cancer and aging. Here, we address the regulation of human DNA polymerases, specifically how human cells orchestrate DNA polymerases in the face of stress to complete replication and maintain genome stability. DNA polymerases of the B-family are uniquely adept at accurate genome replication, but there are numerous situations in which one or more additional DNA polymerases are required to complete genome replication. Polymerases of the Y-family have been extensively studied in the bypass of DNA lesions; however, recent research has revealed that these polymerases play important roles in normal human physiology. Replication stress is widely cited as contributing to genome instability, and is caused by conditions leading to slowed or stalled DNA replication. Common Fragile Sites epitomize "difficult to replicate" genome regions that are particularly vulnerable to replication stress, and are associated with DNA breakage and structural variation. In this review, we summarize the roles of both the replicative and Y-family polymerases in human cells, and focus on how these activities are regulated during normal and perturbed genome replication.

  15. An Integrated Metabolomic and Genomic Mining Workflow To Uncover the Biosynthetic Potential of Bacteria.

    PubMed

    Maansson, Maria; Vynne, Nikolaj G; Klitgaard, Andreas; Nybo, Jane L; Melchiorsen, Jette; Nguyen, Don D; Sanchez, Laura M; Ziemert, Nadine; Dorrestein, Pieter C; Andersen, Mikael R; Gram, Lone

    2016-01-01

    Microorganisms are a rich source of bioactives; however, chemical identification is a major bottleneck. Strategies that can prioritize the most prolific microbial strains and novel compounds are of great interest. Here, we present an integrated approach to evaluate the biosynthetic richness in bacteria and mine the associated chemical diversity. Thirteen strains closely related to Pseudoalteromonas luteoviolacea isolated from all over the Earth were analyzed using an untargeted metabolomics strategy, and metabolomic profiles were correlated with whole-genome sequences of the strains. We found considerable diversity: only 2% of the chemical features and 7% of the biosynthetic genes were common to all strains, while 30% of all features and 24% of the genes were unique to single strains. The list of chemical features was reduced to 50 discriminating features using a genetic algorithm and support vector machines. Features were dereplicated by tandem mass spectrometry (MS/MS) networking to identify molecular families of the same biosynthetic origin, and the associated pathways were probed using comparative genomics. Most of the discriminating features were related to antibacterial compounds, including the thiomarinols that were reported from P. luteoviolacea here for the first time. By comparative genomics, we identified the biosynthetic cluster responsible for the production of the antibiotic indolmycin, which could not be predicted with standard methods. In conclusion, we present an efficient, integrative strategy for elucidating the chemical richness of a given set of bacteria and link the chemistry to biosynthetic genes. IMPORTANCE We here combine chemical analysis and genomics to probe for new bioactive secondary metabolites based on their pattern of distribution within bacterial species. We demonstrate the usefulness of this combined approach in a group of marine Gram-negative bacteria closely related to Pseudoalteromonas luteoviolacea, which is a species known

  16. An Integrated Metabolomic and Genomic Mining Workflow To Uncover the Biosynthetic Potential of Bacteria

    PubMed Central

    Maansson, Maria; Vynne, Nikolaj G.; Klitgaard, Andreas; Nybo, Jane L.; Melchiorsen, Jette; Nguyen, Don D.; Sanchez, Laura M.; Ziemert, Nadine; Dorrestein, Pieter C.

    2016-01-01

    ABSTRACT Microorganisms are a rich source of bioactives; however, chemical identification is a major bottleneck. Strategies that can prioritize the most prolific microbial strains and novel compounds are of great interest. Here, we present an integrated approach to evaluate the biosynthetic richness in bacteria and mine the associated chemical diversity. Thirteen strains closely related to Pseudoalteromonas luteoviolacea isolated from all over the Earth were analyzed using an untargeted metabolomics strategy, and metabolomic profiles were correlated with whole-genome sequences of the strains. We found considerable diversity: only 2% of the chemical features and 7% of the biosynthetic genes were common to all strains, while 30% of all features and 24% of the genes were unique to single strains. The list of chemical features was reduced to 50 discriminating features using a genetic algorithm and support vector machines. Features were dereplicated by tandem mass spectrometry (MS/MS) networking to identify molecular families of the same biosynthetic origin, and the associated pathways were probed using comparative genomics. Most of the discriminating features were related to antibacterial compounds, including the thiomarinols that were reported from P. luteoviolacea here for the first time. By comparative genomics, we identified the biosynthetic cluster responsible for the production of the antibiotic indolmycin, which could not be predicted with standard methods. In conclusion, we present an efficient, integrative strategy for elucidating the chemical richness of a given set of bacteria and link the chemistry to biosynthetic genes. IMPORTANCE We here combine chemical analysis and genomics to probe for new bioactive secondary metabolites based on their pattern of distribution within bacterial species. We demonstrate the usefulness of this combined approach in a group of marine Gram-negative bacteria closely related to Pseudoalteromonas luteoviolacea, which is a

  17. Inhibition of CRISPR/Cas9-Mediated Genome Engineering by a Type I Interferon-Induced Reduction in Guide RNA Expression.

    PubMed

    Machitani, Mitsuhiro; Sakurai, Fuminori; Wakabayashi, Keisaku; Nakatani, Kosuke; Takayama, Kazuo; Tachibana, Masashi; Mizuguchi, Hiroyuki

    2017-01-01

    Clustered regularly interspaced short palindromic repeat (CRISPR)/Cas9-mediated genome engineering technology is a powerful tool for generation of cells and animals with engineered mutations in their genomes. In order to introduce the CRISPR/Cas9 system into target cells, nonviral and viral vectors are often used; however, such vectors trigger innate immune responses associated with production of type I interferons (IFNs). We have recently demonstrated that type I IFNs inhibit short-hairpin RNA-mediated gene silencing, which led us to hypothesize that type I IFNs may also inhibit CRISPR/Cas9-mediated genome mutagenesis. Here we investigated this hypothesis. A single-strand annealing assay using a reporter plasmid demonstrated that CRISPR/Cas9-mediated cleavage efficiencies of the target double-stranded DNA were significantly reduced by IFNα. A mismatch recognition nuclease-dependent genotyping assay also demonstrated that IFNα reduced insertion or deletion (indel) mutation levels by approximately half. Treatment with IFNα did not alter Cas9 protein expression levels, whereas the copy numbers of guide RNA (gRNA) were significantly reduced by IFNα stimulation. These results indicate that type I IFNs significantly reduce gRNA expression levels following introduction of the CRISPR/Cas9 system in the cells, leading to a reduction in the efficiencies of CRISPR/Cas9-mediated genome mutagenesis. Our findings provide important clues for the achievement of efficient genome engineering using the CRISPR/Cas9 system.

  18. Highly fluorescent GFPm2+-based genome integration-proficient promoter probe vector to study Mycobacterium tuberculosis promoters in infected macrophages.

    PubMed

    Roy, Sougata; Narayana, Yeddula; Balaji, Kithiganahalli Narayanaswamy; Ajitkumar, Parthasarathi

    2012-01-01

    Studying the activity of cloned promoters in slow-growing Mycobacterium tuberculosis during long-term growth in vitro or inside macrophages requires a genome integration-proficient promoter probe vector that can be stably maintained even without antibiotics and that carries a substrate-independent, easily scorable, and highly sensitive reporter gene. To meet this requirement, we constructed pAKMN2, which contains the mycobacterial codon-optimized gfpm(2+) gene, coding for GFPm(2+), which has the highest fluorescence reported to date; the mycobacteriophage L5 attP-int sequence for genome integration; and a multiple cloning site. pAKMN2 showed stable integration and expression of GFPm(2+) from the M. tuberculosis and M. smegmatis genomes. Expression of GFPm(2+), driven by the cloned minimal promoters of the M. tuberculosis cell division gene ftsZ (MtftsZ), could be detected in the M. tuberculosis/pAKMN2-promoter integrants growing at exponential phase in defined medium in vitro and inside macrophages. Stable expression from the genome-integrated format even without antibiotic, and high sensitivity of detection by flow cytometry and fluorescence imaging in spite of single-copy integration, make pAKMN2 useful for the study of cloned promoters of any mycobacterial species under long-term in vitro growth or stress conditions, or inside macrophages.

  19. One-step high-efficiency CRISPR/Cas9-mediated genome editing in Streptomyces.

    PubMed

    Huang, He; Zheng, Guosong; Jiang, Weihong; Hu, Haifeng; Lu, Yinhua

    2015-04-01

    The RNA-guided DNA editing technology CRISPR (clustered regularly interspaced short palindromic repeats)/Cas9 has been used to introduce double-stranded breaks into genomes and to direct subsequent site-specific insertions/deletions or the replacement of genetic material in bacteria, such as Escherichia coli, Streptococcus pneumoniae, and Lactobacillus reuteri. In this study, we established a high-efficiency CRISPR/Cas9 genome editing plasmid, pKCcas9dO, for use in Streptomyces genetic manipulation, which comprises a target-specific guide RNA, a codon-optimized cas9, and two homology-directed repair templates. By delivering pKCcas9dO series editing plasmids into the model strain Streptomyces coelicolor M145 through one-step intergeneric transfer, we achieved genome editing at different levels with high efficiencies of 60%-100%, including single gene deletion, such as actII-orf4, redD, and glnR, and single large-size gene cluster deletion, such as the antibiotic biosynthetic clusters of actinorhodin (ACT) (21.3 kb), undecylprodigiosin (RED) (31.6 kb), and Ca(2+)-dependent antibiotic (82.8 kb). Furthermore, we also achieved simultaneous deletions of actII-orf4 and redD, and of the ACT and RED biosynthetic gene clusters, with high efficiencies of 54% and 45%, respectively. Finally, we applied this system to introduce nucleotide point mutations into the rpsL gene, which conferred streptomycin resistance on the mutants. Notably, using this system, the time required for one round of genome modification is reduced by one-third or one-half compared with conventional methods. These results clearly indicate that the established CRISPR/Cas9 genome editing system substantially improves genome editing efficiency compared with the currently existing methods in Streptomyces, and it has promise for application to genome modification in other Actinomyces species.

  20. CRISPR/Cas9 mediated genome editing in ES cells and its application for chimeric analysis in mice

    PubMed Central

    Oji, Asami; Noda, Taichi; Fujihara, Yoshitaka; Miyata, Haruhiko; Kim, Yeon Joo; Muto, Masanaga; Nozawa, Kaori; Matsumura, Takafumi; Isotani, Ayako; Ikawa, Masahito

    2016-01-01

    Targeted gene disrupted mice can be efficiently generated by expressing a single guide RNA (sgRNA)/CAS9 complex in the zygote. However, the limited success of complicated genome editing, such as large deletions, point mutations, and knockins, remains to be improved. Further, the mosaicism in founder generations complicates the genotypic and phenotypic analyses in these animals. Here we show that large deletions with two sgRNAs as well as dsDNA-mediated point mutations are efficient in mouse embryonic stem cells (ESCs). The dsDNA-mediated gene knockins are also feasible in ESCs. Finally, we generated chimeric mice with biallelic mutant ESCs for a lethal gene, Dnajb13, and analyzed their phenotypes. Not only was the lethal phenotype of hydrocephalus suppressed, but we also found that Dnajb13 is required for sperm cilia formation. The combination of biallelic genome editing in ESCs and subsequent chimeric analysis provides a useful tool for rapid gene function analysis in the whole organism. PMID:27530713

  1. Existence of variant strains Fowlpox virus integrated with Reticuloendotheliosis virus in its genome in field isolates in Tanzania.

    PubMed

    Mzula, Alexanda; Masola, Selemani N; Kasanga, Christopher J; Wambura, Philemon N

    2014-06-01

    Fowlpox virus (FPV) is one example of a poultry virus that undergoes recombination with Reticuloendotheliosis virus (REV). Concern has been raised about, and it is well established that, the pathogenicity of FPV is augmented upon integration of the full intact REV. In this study, we therefore aimed to assess the integration of REV into the FPV genome of field isolates obtained from samples collected in different regions of Tanzania. DNA was extracted from 85 samples (scabs), and FPV-specific PCR was performed by amplifying the highly conserved P4b gene. FPV-REV recombination was evaluated in the samples positively identified by FPV-specific PCR by amplifying the env gene and the REV long terminal repeat (5' LTR). A 578-bp PCR product was amplified from 43 samples. We report for the first time in Tanzania the existence of variant strains of FPV integrated with REV in its genome: 65% of the FPV isolates identified had full intact REV integration, 21% had partial FPV-REV env gene integration, and 5% had partial 5' LTR integration. Although FPV-REV-integrated strains prevailed, FPV-REV-free isolates (9%) also existed. Because full intact REV integration is connected with increased pathogenicity of FPV, its existence in the FPV genome of most field isolates could have played a role in the increased endemic, sporadic, and recurring outbreaks in selected areas of Tanzania.

  2. Viral RNA switch mediates the dynamic control of flavivirus replicase recruitment by genome cyclization

    PubMed Central

    Liu, Zhong-Yu; Li, Xiao-Feng; Jiang, Tao; Deng, Yong-Qiang; Ye, Qing; Zhao, Hui; Yu, Jiu-Yang; Qin, Cheng-Feng

    2016-01-01

    Viral replicase recruitment and long-range RNA interactions are essential for RNA virus replication, yet the mechanism of their interplay remains elusive. Flaviviruses include numerous important human pathogens, e.g., dengue virus (DENV) and Zika virus (ZIKV). Here, we revealed a highly conserved, conformation-tunable cis-acting element named 5′-UAR-flanking stem (UFS) in the flavivirus genomic 5′ terminus. We demonstrated that the UFS was critical for efficient NS5 recruitment and viral RNA synthesis in different flaviviruses. Interestingly, stabilization of the DENV UFS impaired both genome cyclization and vRNA replication. Moreover, the UFS unwound in response to genome cyclization, leading to the decreased affinity of NS5 for the viral 5′ end. Thus, we propose that the UFS is switched by genome cyclization to regulate dynamic RdRp binding for vRNA replication. This study demonstrates that the UFS enables communication between flavivirus genome cyclization and RdRp recruitment, highlighting the presence of switch-like mechanisms among RNA viruses. DOI: http://dx.doi.org/10.7554/eLife.17636.001 PMID:27692070

  3. CRISPR/Cpf1-mediated DNA-free plant genome editing

    PubMed Central

    Kim, Hyeran; Kim, Sang-Tae; Ryu, Jahee; Kang, Beum-Chang; Kim, Jin-Soo; Kim, Sang-Gyu

    2017-01-01

    Cpf1, a type V CRISPR effector, recognizes a thymidine-rich protospacer-adjacent motif and induces cohesive double-stranded breaks at the target site guided by a single CRISPR RNA (crRNA). Here we show that Cpf1 can be used as a tool for DNA-free editing of plant genomes. We describe the delivery of recombinant Cpf1 proteins with in vitro transcribed or chemically synthesized target-specific crRNAs into protoplasts isolated from soybean and wild tobacco. Designed crRNAs are unique and do not have similar sequences (≤3 mismatches) in the entire soybean reference genome. Targeted deep sequencing analyses show that mutations are successfully induced in FAD2 paralogues in soybean and AOC in wild tobacco. Unlike SpCas9, Cpf1 mainly induces various nucleotide deletions at target sites. No significant mutations are detected at potential off-target sites in the soybean genome. These results demonstrate that Cpf1–crRNA complex is an effective DNA-free genome-editing tool for plant genome editing. PMID:28205546
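    The uniqueness criterion mentioned above (no similar sequence with ≤3 mismatches elsewhere in the reference genome) can be illustrated with a naive sliding-window scan. This is only a minimal sketch under stated assumptions: it is not the tool the authors used, the function names (`hamming`, `reverse_complement`, `near_matches`) are hypothetical, PAM requirements are ignored, and a real genome-scale search would need an indexed or otherwise optimized method.

```python
# Naive mismatch-tolerant search for a guide-like target sequence.
# Illustrative only: names are hypothetical, no PAM filtering is applied,
# and the scan is O(genome_length * target_length).

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def near_matches(genome: str, target: str, max_mismatches: int = 3):
    """List (position, strand, mismatches) for every window of the genome
    within max_mismatches of the target on either strand."""
    k = len(target)
    queries = (("+", target), ("-", reverse_complement(target)))
    hits = []
    for i in range(len(genome) - k + 1):
        window = genome[i:i + k]
        for strand, query in queries:
            d = hamming(window, query)
            if d <= max_mismatches:
                hits.append((i, strand, d))
    return hits


if __name__ == "__main__":
    toy_genome = "GGGAACCGGGG"   # made-up sequence, not real data
    toy_target = "AACCG"         # made-up target, not a real crRNA
    print(near_matches(toy_genome, toy_target, max_mismatches=2))
```

    Under this sketch, a target would count as unique when the only hit returned is its own intended site; any additional hit marks a potential off-target location.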

  4. Fitness Cost Implications of PhiC31-Mediated Site-Specific Integrations in Target-Site Strains of the Mexican Fruit Fly, Anastrepha ludens (Diptera: Tephritidae)

    PubMed Central

    Meza, José S.; Díaz-Fleischer, Francisco; Sánchez-Velásquez, Lázaro R.; Zepeda-Cisneros, Cristina Silvia; Handler, Alfred M.; Schetelig, Marc F.

    2014-01-01

    Site-specific recombination technologies are powerful new tools for the manipulation of genomic DNA in insects that can improve transgenesis strategies such as targeting transgene insertions, allowing transgene cassette exchange and DNA mobilization for transgene stabilization. However, understanding the fitness cost implications of these manipulations for transgenic strain applications is critical. In this study independent piggyBac-mediated attP target-sites marked with DsRed were created in several genomic positions in the Mexican fruit fly, Anastrepha ludens. Two of these strains, one having an autosomal (attP_F7) and the other a Y-linked (attP_2-M6y) integration, exhibited fitness parameters (dynamic demography and sexual competitiveness) similar to wild type flies. These strains were thus selected for targeted insertion using, for the first time in mexfly, the phiC31-integrase recombination system to insert an additional EGFP-marked transgene to determine its effect on host strain fitness. Fitness tests showed that the integration event in the int_2-M6y recombinant strain had no significant effect, while the int_F7 recombinant strain exhibited significantly lower fitness relative to the original attP_F7 target-site host strain. These results indicate that while targeted transgene integrations can be achieved without an additional fitness cost, at some genomic positions insertion of additional DNA into a previously integrated transgene can have a significant negative effect. Thus, for targeted transgene insertions fitness costs must be evaluated both previous to and subsequent to new site-specific insertions in the target-site strain. PMID:25303238

  5. Fitness cost implications of PhiC31-mediated site-specific integrations in target-site strains of the Mexican fruit fly, Anastrepha ludens (Diptera: Tephritidae).

    PubMed

    Meza, José S; Díaz-Fleischer, Francisco; Sánchez-Velásquez, Lázaro R; Zepeda-Cisneros, Cristina Silvia; Handler, Alfred M; Schetelig, Marc F

    2014-01-01

    Site-specific recombination technologies are powerful new tools for the manipulation of genomic DNA in insects that can improve transgenesis strategies such as targeting transgene insertions, allowing transgene cassette exchange and DNA mobilization for transgene stabilization. However, understanding the fitness cost implications of these manipulations for transgenic strain applications is critical. In this study independent piggyBac-mediated attP target-sites marked with DsRed were created in several genomic positions in the Mexican fruit fly, Anastrepha ludens. Two of these strains, one having an autosomal (attP_F7) and the other a Y-linked (attP_2-M6y) integration, exhibited fitness parameters (dynamic demography and sexual competitiveness) similar to wild type flies. These strains were thus selected for targeted insertion using, for the first time in mexfly, the phiC31-integrase recombination system to insert an additional EGFP-marked transgene to determine its effect on host strain fitness. Fitness tests showed that the integration event in the int_2-M6y recombinant strain had no significant effect, while the int_F7 recombinant strain exhibited significantly lower fitness relative to the original attP_F7 target-site host strain. These results indicate that while targeted transgene integrations can be achieved without an additional fitness cost, at some genomic positions insertion of additional DNA into a previously integrated transgene can have a significant negative effect. Thus, for targeted transgene insertions fitness costs must be evaluated both previous to and subsequent to new site-specific insertions in the target-site strain.

  6. Integration of PacBio RS into Massive Parallel Sequencing and Data Analysis Pipelining at the UC Davis Genome Center

    PubMed Central

    Vanessa, Rashbrook; O'Geen, Henriette; Nguyen, Oanh; Ashtari, Siranoosh; Fan, Xiaohong; Kim, Ryan

    2013-01-01

    Whole genome sequencing and genomic biology have been widely adopted in many fields of biology as next-generation sequencing (NGS) technology has rapidly improved in quality, read length, and throughput, making whole genome sequencing and association studies possible in a very cost-effective manner. Continued improvement and development of sample preparation protocols and data analysis tools have been significant in helping to extend genome sequencing technology to genomes that were previously difficult to sequence. The recent arrival of the Pacific Biosciences RS (PacBio) has furthered this opportunity by providing options for single-molecule, long-read sequencing in real time and kinetic analysis (methylation). PacBio has been employed successfully for sequencing low-complexity genomic regions such as extremely high GC content, long repeats, rearrangements, gene fusions, etc. In this poster we present the optimization of PacBio sample preparation, which was fine-tuned to meet the unique challenges of sequencing through “difficult-to-sequence” templates. We discuss the integration of PacBio into a wet lab equipped with other NGS platforms and the data pipelining workflow, including cloud computing and robotic sample preparation, at the Genome Center. The UC Davis Genome Center currently operates NGS technology platforms including HiSeq, MiSeq, and PacBio, and has genotyping capacity using Illumina Infinium and GoldenGate technology. The UC Davis Genome Center and Bioinformatics Program provide up-to-date genome technology and informatics support tailored to specific biological goals, meeting the needs of more than 80 faculty members within the Genome Center and more than 200 campus and off-campus researchers.

  7. CRISPR/Cas9-Mediated Genome Editing of Epigenetic Factors for Cancer Therapy.

    PubMed

    Yao, Shaohua; He, Zhiyao; Chen, Chong

    2015-07-01

    Advances in engineered recombinant nucleases have provided facile and reliable methods for genome editing. Especially with the development of the CRISPR (clustered regularly interspaced short palindromic repeats)/Cas9 (CRISPR-associated protein-9 nuclease) system and the discovery of various versions of Cas9 proteins and delivery carriers, it is now practicable to introduce desired mutations into the genome, to correct disease-related mutations, and to activate or suppress genes of interest. Epigenetic regulators are often disturbed in cancer cells and are essential for the transformation of normal to cancerous cells. Tumor-related epigenetic alterations or epigenetic factor mutations play a major part during the various steps of carcinogenesis and affect a variety of cancer-related genes and a wide range of cancerous phenotypes. Therefore, epigenetic regulatory enzymes might be candidate targets for cancer therapy. In this review, we discuss prospects of CRISPR/Cas9-based genome editing in targeting epigenetics for cancer gene therapy.

  8. Integrated Transcriptomic-Proteomic Analysis Using a Proteogenomic Workflow Refines Rat Genome Annotation.

    PubMed

    Kumar, Dhirendra; Yadav, Amit Kumar; Jia, Xinying; Mulvenna, Jason; Dash, Debasis

    2016-01-01

    Proteogenomic re-annotation and mRNA splicing information can lead to the discovery of various protein forms for eukaryotic model organisms like rat. However, detection of novel proteoforms using mass spectrometry proteomics data remains a formidable challenge. We developed EuGenoSuite, an open source multiple algorithmic proteomic search tool and utilized it in our in-house integrated transcriptomic-proteomic pipeline to facilitate automated proteogenomic analysis. Using four proteogenomic pipelines (integrated transcriptomic-proteomic, Peppy, Enosi, and ProteoAnnotator) on publicly available RNA-sequence and MS proteomics data, we discovered 363 novel peptides in rat brain microglia representing novel proteoforms for 249 gene loci in the rat genome. These novel peptides aided in the discovery of novel exons, translation of annotated untranslated regions, pseudogenes, and splice variants for various loci; many of which have known disease associations, including neurological disorders like schizophrenia, amyotrophic lateral sclerosis, etc. Novel isoforms were also discovered for genes implicated in cardiovascular diseases and breast cancer for which rats are considered model organisms. Our integrative multi-omics data analysis not only enables the discovery of new proteoforms but also generates an improved reference for human disease studies in the rat model.

  9. Integrated molecular analysis reveals complex interactions between genomic and epigenomic alterations in esophageal adenocarcinomas

    PubMed Central

    Peng, DunFa; Guo, Yan; Chen, Heidi; Zhao, Shilin; Washington, Kay; Hu, TianLing; Shyr, Yu; El-Rifai, Wael

    2017-01-01

    The incidence of esophageal adenocarcinoma (EAC) is rapidly rising in the United States and Western countries. In this study, we carried out an integrative molecular analysis to identify interactions between genomic and epigenomic alterations in regulating gene expression networks in EAC. We detected significant alterations in DNA copy numbers (CN), gene expression levels, and DNA methylation profiles. The integrative analysis demonstrated that altered expression of 1,755 genes was associated with changes in CN or methylation. We found that expression alterations in 84 genes were associated with changes in both CN and methylation. These data suggest a strong interaction between genetic and epigenetic events to modulate gene expression in EAC. Of note, bioinformatics analysis detected a prominent K-RAS signature and predicted activation of several important transcription factor networks, including β-catenin, MYB, TWIST1, SOX7, GATA3 and GATA6. Notably, we detected hypomethylation and overexpression of several pro-inflammatory genes such as COX2, IL8 and IL23R, suggesting an important role of epigenetic regulation of these genes in the inflammatory cascade associated with EAC. In summary, this integrative analysis demonstrates a complex interaction between genetic and epigenetic mechanisms providing several novel insights for our understanding of molecular events in EAC. PMID:28102292

  10. Concurrent triplication and uniparental isodisomy: evidence for microhomology-mediated break-induced replication model for genomic rearrangements.

    PubMed

    Sahoo, Trilochan; Wang, Jia-Chi; Elnaggar, Mohamed M; Sanchez-Lara, Pedro; Ross, Leslie P; Mahon, Loretta W; Hafezi, Katayoun; Deming, Abigail; Hinman, Lynne; Bruno, Yovana; Bartley, James A; Liehr, Thomas; Anguiano, Arturo; Jones, Marilyn

    2015-01-01

    Whole-genome oligonucleotide single-nucleotide polymorphism (oligo-SNP) arrays enable simultaneous interrogation of copy number variations (CNVs), copy neutral regions of homozygosity (ROH) and uniparental disomy (UPD). Structural variation in the human genome contributes significantly to genetic variation, and often has deleterious effects leading to disease causation. Co-occurrence of CNV and regions of allelic homozygosity in tandem involving the same chromosomal arm is extremely rare. Replication-based mechanisms such as microhomology-mediated break-induced replication (MMBIR) are recent models predicted to induce structural rearrangements and gene dosage aberrations; however, supportive evidence in humans for one-ended DNA break repair coupled with MMBIR giving rise to interstitial copy number gains and distal loss of heterozygosity has not been documented. We report on the identification and characterization of two cases with interstitial triplication followed by uniparental isodisomy (isoUPD) for the remainder of the chromosomal arm. Case 1 has a triplication at 9q21.11-q21.33 and segmental paternal isoUPD for 9q21.33-qter, and presented with citrullinemia with a homozygous mutation in the argininosuccinate synthetase gene (ASS1 at 9q34.1). Case 2 has a triplication at 22q12.1-q12.2 and segmental maternal isoUPD for 22q12.2-qter, and presented with hearing loss, mild dysmorphic features and bilateral iris coloboma. Interstitial triplication coupled with distal segmental isoUPD is a novel finding that provides human evidence for one-ended DNA break and replication-mediated repair. Both copy number gains and isoUPD may contribute to the phenotype. Significantly, these cases represent the first detailed genomic analysis that provides support for an MMBIR mechanism inducing copy number gains and segmental isoUPD in tandem.

  11. DNA-PKcs, ATM, and ATR Interplay Maintains Genome Integrity during Neurogenesis.

    PubMed

    Enriquez-Rios, Vanessa; Dumitrache, Lavinia C; Downing, Susanna M; Li, Yang; Brown, Eric J; Russell, Helen R; McKinnon, Peter J

    2017-01-25

    The DNA damage response (DDR) orchestrates a network of cellular processes that integrates cell-cycle control and DNA repair or apoptosis, which serves to maintain genome stability. DNA-PKcs (the catalytic subunit of the DNA-dependent kinase, encoded by PRKDC), ATM (ataxia telangiectasia, mutated), and ATR (ATM and Rad3-related) are related PI3K-like protein kinases and central regulators of the DDR. Defects in these kinases have been linked to neurodegenerative or neurodevelopmental syndromes. In all cases, the key neuroprotective function of these kinases is uncertain. It also remains unclear how interactions between the three DNA damage-responsive kinases coordinate genome stability, particularly in a physiological context. Here, we used a genetic approach to identify the neural function of DNA-PKcs and the interplay between ATM and ATR during neurogenesis. We found that DNA-PKcs loss in the mouse sensitized neuronal progenitors to apoptosis after ionizing radiation because of excessive DNA damage. DNA-PKcs was also required to prevent endogenous DNA damage accumulation throughout the adult brain. In contrast, ATR coordinated the DDR during neurogenesis to direct apoptosis in cycling neural progenitors, whereas ATM regulated apoptosis in both proliferative and noncycling cells. We also found that ATR controls a DNA damage-induced G2/M checkpoint in cortical progenitors, independent of ATM and DNA-PKcs. These nonoverlapping roles were further confirmed via sustained murine embryonic or cortical development after all three kinases were simultaneously inactivated. Thus, our results illustrate how DNA-PKcs, ATM, and ATR have unique and essential roles during the DDR, collectively ensuring comprehensive genome maintenance in the nervous system.

  12. Interaction with PALB2 Is Essential for Maintenance of Genomic Integrity by BRCA2

    PubMed Central

    Hartford, Suzanne A.; Chittela, Rajanikant; Ding, Xia; Martin, Betty; Burkett, Sandra; Haines, Diana C.; Southon, Eileen; Tessarollo, Lino; Sharan, Shyam K.

    2016-01-01

    Human breast cancer susceptibility gene, BRCA2, encodes a 3418-amino acid protein that is essential for maintaining genomic integrity. Among the proteins that physically interact with BRCA2, Partner and Localizer of BRCA2 (PALB2), which binds to the N-terminal region of BRCA2, is vital for its function by facilitating its subnuclear localization. A functional redundancy has been reported between this N-terminal PALB2-binding domain and the C-terminal DNA-binding domain of BRCA2, which undermines the relevance of the interaction between these two proteins. Here, we describe a genetic approach to examine the functional significance of the interaction between BRCA2 and PALB2 by generating a knock-in mouse model of Brca2 carrying a single amino acid change (Gly25Arg, Brca2G25R) that disrupts this interaction. In addition, we have combined Brca2G25R homozygosity as well as hemizygosity with Palb2 and Trp53 heterozygosity to generate an array of genotypically and phenotypically distinct mouse models. Our findings reveal defects in body size, fertility, meiotic progression, and genome stability, as well as increased tumor susceptibility in these mice. The severity of the phenotype increased with a decrease in the interaction between BRCA2 and PALB2, highlighting the significance of this interaction. In addition, our findings also demonstrate that hypomorphic mutations such as Brca2G25R have the potential to be more detrimental than the functionally null alleles by increasing genomic instability to a level that induces tumorigenesis, rather than apoptosis. PMID:27490902

  13. Plant Clonal Integration Mediates the Horizontal Redistribution of Soil Resources, Benefiting Neighboring Plants

    PubMed Central

    Ye, Xue-Hua; Zhang, Ya-Lin; Liu, Zhi-Lan; Gao, Shu-Qin; Song, Yao-Bin; Liu, Feng-Hong; Dong, Ming

    2016-01-01

    Resources such as water taken up by plants can be released into soils through hydraulic redistribution and can also be translocated by clonal integration within a plant clonal network. We hypothesized that the resources from one (donor) microsite could be translocated within a clonal network, released into different (recipient) microsites and subsequently used by neighbor plants in the recipient microsite. To test these hypotheses, we conducted two experiments in which connected and disconnected ramet pairs of Potentilla anserina were grown under both homogeneous and heterogeneous water regimes, with seedlings of Artemisia ordosica as neighbors. The isotopes [15N] and deuterium were used to trace the translocation of nitrogen and water, respectively, within the clonal network. The water and nitrogen taken up by P. anserina ramets in the donor microsite were translocated into the connected ramets in the recipient microsites. Most notably, portions of the translocated water and nitrogen were released into the recipient microsite and were used by the neighboring A. ordosica, which increased growth of the neighboring A. ordosica significantly. Therefore, our hypotheses were supported, and plant clonal integration mediated the horizontal hydraulic redistribution of resources, thus benefiting neighboring plants. Such a plant clonal integration-mediated resource redistribution in horizontal space may have substantial effects on the interspecific relations and composition of the community and consequently on ecosystem processes. PMID:26904051

  14. VaProS: a database-integration approach for protein/genome information retrieval.

    PubMed

    Gojobori, Takashi; Ikeo, Kazuho; Katayama, Yukie; Kawabata, Takeshi; Kinjo, Akira R; Kinoshita, Kengo; Kwon, Yeondae; Migita, Ohsuke; Mizutani, Hisashi; Muraoka, Masafumi; Nagata, Koji; Omori, Satoshi; Sugawara, Hideaki; Yamada, Daichi; Yura, Kei

    2016-12-01

    Life science research now relies heavily on all sorts of databases for genome sequences, transcription, protein three-dimensional (3D) structures, protein-protein interactions, phenotypes and so forth. The knowledge accumulated by all the omics research is so vast that a computer-aided search of data is now a prerequisite for starting a new study. In addition, a combinatory search throughout these databases has a chance to extract new ideas and new hypotheses that can be examined by wet-lab experiments. By virtually integrating the related databases on the Internet, we have built a new web application that helps life science researchers retrieve experts' knowledge stored in the databases and build new hypotheses about their research target. This web application, named VaProS, emphasizes the interconnection between the functional information of genome sequences and protein 3D structures, such as the structural effect of a gene mutation. In this manuscript, we present the notion of VaProS, the databases and tools that can be accessed without any knowledge of database locations and data formats, and the power of search, exemplified by a quest for the molecular mechanisms of lysosomal storage disease. VaProS can be freely accessed at http://p4d-info.nig.ac.jp/vapros/.

  15. Integrative computational approach for genome-based study of microbial lipid-degrading enzymes.

    PubMed

    Vorapreeda, Tayvich; Thammarongtham, Chinae; Laoteng, Kobkul

    2016-07-01

    Lipid-degrading or lipolytic enzymes have gained enormous attention in academic and industrial sectors. Several efforts are underway to discover new lipase enzymes from a variety of microorganisms with particular catalytic properties to be used for extensive applications. In addition, various tools and strategies have been implemented to unravel the functional relevance of the versatile lipid-degrading enzymes for special purposes. This review highlights the study of microbial lipid-degrading enzymes through an integrative computational approach. The identification of putative lipase genes from microbial genomes and metagenomic libraries using homology-based mining is discussed, with an emphasis on sequence analysis of conserved motifs and enzyme topology. Molecular modelling of three-dimensional structure on the basis of sequence similarity is shown to be a potential approach for exploring the structural and functional relationships of candidate lipase enzymes. The perspectives on a discriminative framework of cutting-edge tools and technologies, including bioinformatics, computational biology, functional genomics and functional proteomics, intended to facilitate rapid progress in understanding lipolysis mechanism and to discover novel lipid-degrading enzymes of microorganisms are discussed.

  16. Clinical Implementation of Integrated Genomic Profiling in Patients with Advanced Cancers.

    PubMed

    Borad, Mitesh J; Egan, Jan B; Condjella, Rachel M; Liang, Winnie S; Fonseca, Rafael; Ritacca, Nicole R; McCullough, Ann E; Barrett, Michael T; Hunt, Katherine S; Champion, Mia D; Patel, Maitray D; Young, Scott W; Silva, Alvin C; Ho, Thai H; Halfdanarson, Thorvardur R; McWilliams, Robert R; Lazaridis, Konstantinos N; Ramanathan, Ramesh K; Baker, Angela; Aldrich, Jessica; Kurdoglu, Ahmet; Izatt, Tyler; Christoforides, Alexis; Cherni, Irene; Nasser, Sara; Reiman, Rebecca; Cuyugan, Lori; McDonald, Jacquelyn; Adkins, Jonathan; Mastrian, Stephen D; Valdez, Riccardo; Jaroszewski, Dawn E; Von Hoff, Daniel D; Craig, David W; Stewart, A Keith; Carpten, John D; Bryce, Alan H

    2016-12-01

    DNA focused panel sequencing has been rapidly adopted to assess therapeutic targets in advanced/refractory cancer. Integrated Genomic Profiling (IGP) utilising DNA/RNA with tumour/normal comparisons in a Clinical Laboratory Improvement Amendments (CLIA) compliant setting enables a single assay to provide: therapeutic target prioritisation, novel target discovery/application and comprehensive germline assessment. A prospective study in 35 advanced/refractory cancer patients was conducted using CLIA-compliant IGP. Feasibility was assessed by estimating time to results (TTR), prioritising/assigning putative therapeutic targets, assessing drug access, ascertaining germline alterations, and assessing patient preferences/perspectives on data use/reporting. Therapeutic targets were identified using biointelligence/pathway analyses and interpreted by a Genomic Tumour Board. Seventy-five percent of cases harboured 1-3 therapeutically targetable mutations/case (median 79 mutations of potential functional significance/case). Median time to CLIA-validated results was 116 days with CLIA-validation of targets achieved in 21/22 patients. IGP directed treatment was instituted in 13 patients utilising on/off label FDA approved drugs (n = 9), clinical trials (n = 3) and single patient IND (n = 1). Preliminary clinical efficacy was noted in five patients (two partial response, three stable disease). Although barriers to broader application exist, including the need for wider availability of therapies, IGP in a CLIA-framework is feasible and valuable in selection/prioritisation of anti-cancer therapeutic targets.

  17. Integrated Genomic and Network-Based Analyses of Complex Diseases and Human Disease Network.

    PubMed

    Al-Harazi, Olfat; Al Insaif, Sadiq; Al-Ajlan, Monirah A; Kaya, Namik; Dzimiri, Nduna; Colak, Dilek

    2016-06-20

    A disease phenotype generally reflects various pathobiological processes that interact in a complex network. The highly interconnected nature of the human protein interaction network (interactome) indicates that, at the molecular level, it is difficult to consider diseases as being independent of one another. Recently, genome-wide molecular measurements, data mining and bioinformatics approaches have provided the means to explore human diseases from a molecular basis. The exploration of diseases and a system of disease relationships based on the integration of genome-wide molecular data with the human interactome could offer a powerful perspective for understanding the molecular architecture of diseases. Recently, subnetwork markers have proven to be more robust and reliable than individual biomarker genes selected based on gene expression profiles alone, and achieve higher accuracy in disease classification. We have applied one of these methodologies to idiopathic dilated cardiomyopathy (IDCM) data that we have generated using a microarray and identified significant subnetworks associated with the disease. In this paper, we review the recent endeavours in this direction, and summarize the existing methodologies and computational tools for network-based analysis of complex diseases and molecular relationships among apparently different disorders and human disease network. We also discuss the future research trends and topics of this promising field.

  18. Requirement of DDX39 DEAD box RNA helicase for genome integrity and telomere protection.

    PubMed

    Yoo, Hyun Hee; Chung, In Kwon

    2011-08-01

    Human chromosome ends associate with shelterin, a six-protein complex that protects telomeric DNA from being recognized as sites of DNA damage. The shelterin subunit TRF2 has been implicated in the protection of chromosome ends by facilitating their organization into the protective capping structure and by associating with several accessory proteins involved in various DNA transactions. Here we describe the characterization of DDX39 DEAD-box RNA helicase as a novel TRF2-interacting protein. DDX39 directly interacts with the telomeric repeat binding factor homology domain of TRF2 via the FXLXP motif (where X is any amino acid). DDX39 is also found in association with catalytically competent telomerase in cell lysates through an interaction with hTERT but has no effect on telomerase activity. Whereas overexpression of DDX39 in telomerase-positive human cancer cells led to progressive telomere elongation, depletion of endogenous DDX39 by small hairpin RNA (shRNA) resulted in telomere shortening. Furthermore, depletion of DDX39 induced DNA-damage response foci at internal genomic sites as well as at telomeres, as evidenced by telomere dysfunction-induced foci. Some of the metaphase chromosomes showed no telomeric signal at chromatid ends, suggesting an aberrant telomere structure. Our findings suggest that DDX39, in addition to its role in mRNA splicing and nuclear export, is required for global genome integrity as well as telomere protection and represents a new pathway for telomere maintenance by modulating telomere length homeostasis.

  19. Reframed Genome-Scale Metabolic Model to Facilitate Genetic Design and Integration with Expression Data.

    PubMed

    Gu, Deqing; Jian, Xingxing; Zhang, Cheng; Hua, Qiang

    2016-06-08

    Genome-scale metabolic network models (GEMs) have played important roles in the design of genetically engineered strains and helped biologists to decipher metabolism. However, due to the complex gene-reaction relationships that exist in model systems, most algorithms have limited capabilities with respect to directly predicting accurate genetic design for metabolic engineering. In particular, methods that predict reaction knockout strategies leading to overproduction are often impractical in terms of gene manipulations. Recently, we proposed a method named LTM (logical transformation of model) to simplify the gene-reaction associations by introducing intermediate pseudo reactions, which makes it possible to generate genetic designs. Here, we propose an alternative method to relieve researchers from deciphering complex gene-reaction relationships by adding pseudo gene controlling reactions. In comparison to LTM, this new method introduces fewer pseudo reactions and generates a much smaller model system, named gModel. We showed that gModel allows two seldom reported applications: identification of minimal genomes and design of minimal cell factories within a modified OptKnock framework. In addition, gModel could be used to integrate expression data directly and improve the performance of the E-Fmin method for predicting fluxes. In conclusion, the model transformation procedure will facilitate genetic research based on GEMs, extending their applications.

  20. Inferring drug-disease associations from integration of chemical, genomic and phenotype data using network propagation

    PubMed Central

    2013-01-01

    Background: Over the last few years, knowledge about drugs, disease phenotypes and proteins has accumulated rapidly, and a growing number of researchers have turned to inferring drug-disease associations by computational methods. Developing an integrated approach for systematically discovering drug-disease associations from these data is an important issue. Methods: We combine three different networks (drug, genomic and disease phenotype) and assign weights to the edges from available experimental data and knowledge. Given a specific disease, we use our network propagation approach to infer drug-disease associations. Results: We use prostate cancer and colorectal cancer as test data, and the manually curated drug-disease associations from the Comparative Toxicogenomics Database as our benchmark. The ranked results show that our proposed method obtains higher specificity and sensitivity and clearly outperforms previous methods. Our results also show that our method performs better with off-target information than with primary drug targets alone on both test data sets. Conclusions: We clearly demonstrate the feasibility and benefits of using network-based analyses of chemical, genomic and phenotype data to reveal drug-disease associations. The potential associations inferred by our method provide new perspectives for toxicogenomics and drug repositioning evaluation. PMID:24565337

  1. An integrated genomic approach to the assessment and treatment of acute myeloid leukemia.

    PubMed

    Godley, Lucy A; Cunningham, John; Dolan, M Eileen; Huang, R Stephanie; Gurbuxani, Sandeep; McNerney, Megan E; Larson, Richard A; Leong, Hoyee; Lussier, Yves; Onel, Kenan; Odenike, Olatoyosi; Stock, Wendy; White, Kevin P; Le Beau, Michelle M

    2011-04-01

    Traditionally, new scientific advances have been applied quickly to the leukemias based on the ease with which relatively pure samples of malignant cells can be obtained. Currently, our arsenal of approaches used to characterize an individual's acute myeloid leukemia (AML) combines hematopathologic evaluation, flow cytometry, cytogenetic analysis, and molecular studies focused on a few key genes. The advent of high-throughput methods capable of full-genome evaluation presents new options for a revolutionary change in the way we diagnose, characterize, and treat AML. Next-generation DNA sequencing techniques allow full sequencing of a cancer genome or transcriptome, with the hope that this will be affordable for routine clinical care within the decade. Microarray-based testing will define gene and miRNA expression, DNA methylation patterns, chromosomal imbalances, and predisposition to disease and chemosensitivity. The vision for the future entails an integrated and automated approach to these analyses, bringing the possibility of formulating an individualized treatment plan within days of a patient's initial presentation. With these expectations comes the hope that such an approach will lead to decreased toxicities and prolonged survival for patients.

  2. Genome-wide exonic small interference RNA-mediated gene silencing regulates sexual reproduction in the homothallic fungus Fusarium graminearum

    PubMed Central

    Park, Ae Ran; Lim, Jae Yun; Shin, Chanseok

    2017-01-01

    Various ascomycete fungi possess sex-specific molecular mechanisms, such as repeat-induced point mutations, meiotic silencing by unpaired DNA, and unusual adenosine-to-inosine RNA editing, for genome defense or gene regulation. Using a combined analysis of functional genetics and deep sequencing of small noncoding RNA (sRNA), mRNA, and the degradome, we found that the sex-specifically induced exonic small interference RNA (ex-siRNA)-mediated RNA interference (RNAi) mechanism has an important role in fine-tuning the transcriptome during ascospore formation in the head blight fungus Fusarium graminearum. Approximately one-third of the total sRNAs were produced from the gene region, and sRNAs with an antisense direction or 5′-U were involved in post-transcriptional gene regulation by reducing the stability of the corresponding gene transcripts. Although both Dicers and Argonautes partially share their functions, the sex-specific RNAi pathway is primarily mediated by FgDicer1 and FgAgo2, while the constitutively expressed RNAi components FgDicer2 and FgAgo1 are responsible for hairpin-induced RNAi. Based on our results, we concluded that F. graminearum primarily utilizes ex-siRNA-mediated RNAi for ascosporogenesis but not for genome defenses and other developmental stages. Each fungal species appears to have evolved RNAi-based gene regulation for specific developmental stages or stress responses. This study provides new insights into the regulatory role of sRNAs in fungi and other lower eukaryotes. PMID:28146558

  3. Antibody-mediated rejection, T cell-mediated rejection, and the injury-repair response: new insights from the Genome Canada studies of kidney transplant biopsies.

    PubMed

    Halloran, Philip F; Reeve, Jeff P; Pereira, Andre B; Hidalgo, Luis G; Famulski, Konrad S

    2014-02-01

    Prospective studies of unselected indication biopsies from kidney transplants, combining conventional assessment with molecular analysis, have created a new understanding of transplant disease states and their outcomes. A large-scale Genome Canada grant permitted us to use conventional and molecular phenotypes to create a new disease classification. T cell-mediated rejection (TCMR), characterized histologically or molecularly, has little effect on outcomes. Antibody-mediated rejection (ABMR) manifests as microcirculation lesions and transcript changes reflecting endothelial injury, interferon-γ effects, and natural killer cells. ABMR is frequently C4d negative and has been greatly underestimated by conventional criteria. Indeed, ABMR, triggered in some cases by non-adherence, is the major disease causing failure. Progressive dysfunction is usually attributable to specific diseases, and pure calcineurin inhibitor toxicity rarely explains failure. The importance of ABMR argues against immunosuppressive drug minimization and stands as a barrier to tolerance induction. Microarrays also defined the transcripts induced by acute kidney injury (AKI), which correlate with reduced function, whereas histologic changes of acute tubular injury do not. AKI transcripts are induced in kidneys with late dysfunction, and are better predictors of failure than fibrosis and inflammation. Thus progression reflects ongoing parenchymal injury, usually from identifiable diseases such as ABMR, not destructive fibrosis.

  4. Expression of Active Subunit of Nitrogenase via Integration into Plant Organelle Genome

    PubMed Central

    Groat, Jeanna; Staub, Jeffrey M.; Stephens, Michael

    2016-01-01

    Nitrogen availability is crucial for crop yield, with nitrogen fertilizer accounting for a large percentage of farmers’ expenses. However, an untimely or excessive application of fertilizer can increase risks of negative environmental effects. These factors, along with the environmental and energy costs of synthesizing nitrogen fertilizer, led us to seek out novel biotechnology-driven approaches to supply nitrogen to plants. The strategy we focused on involves transgenic expression of nitrogenase, a bacterial multi-subunit enzyme that can capture atmospheric nitrogen. Here we report expression of the active Fe subunit of nitrogenase via integration into the tobacco plastid genome of bacterial gene sequences modified for expression in plastids. Our study suggests that it will be possible to engineer plants that are able to produce their own nitrogen fertilizer by expressing nitrogenase genes in plant plastids. PMID:27529475

  5. Integration of Genomic Data Enables Selective Discovery of Breast Cancer Drivers

    PubMed Central