Science.gov

Sample records for genomic integration mediated

  1. Altering genomic integrity: heavy metal exposure promotes transposable element-mediated damage

    PubMed Central

    Morales, Maria E.; Servant, Geraldine; Ade, Catherine; Roy-Engel, Astrid M.

    2015-01-01

    Maintenance of genomic integrity is critical for cellular homeostasis and survival. The active transposable elements (TEs), composed primarily of three mobile element lineages (LINE-1, Alu, and SVA), comprise approximately 30% of the mass of the human genome. For the past two decades, studies have shown that TEs contribute significantly to genetic instability and that TE-caused damage is associated with genetic diseases and cancer. Different environmental exposures, including several heavy metals, influence how TEs interact with their host genome, increasing their negative impact. This mini-review provides some basic knowledge on TEs, their contribution to disease, and an overview of the current knowledge on how heavy metals influence TE-mediated damage. PMID:25774044

  2. iGWAS: Integrative Genome-Wide Association Studies of Genetic and Genomic Data for Disease Susceptibility Using Mediation Analysis.

    PubMed

    Huang, Yen-Tsung; Liang, Liming; Moffatt, Miriam F; Cookson, William O C M; Lin, Xihong

    2015-07-01

    Genome-wide association studies (GWAS) have been a standard practice in identifying single nucleotide polymorphisms (SNPs) for disease susceptibility. We propose a new approach, termed integrative GWAS (iGWAS) that exploits the information of gene expressions to investigate the mechanisms of the association of SNPs with a disease phenotype, and to incorporate the family-based design for genetic association studies. Specifically, the relations among SNPs, gene expression, and disease are modeled within the mediation analysis framework, which allows us to disentangle the genetic effect on a disease phenotype into two parts: an effect mediated through a gene expression (mediation effect, ME) and an effect through other biological mechanisms or environment-mediated mechanisms (alternative effect, AE). We develop omnibus tests for the ME and AE that are robust to underlying true disease models. Numerical studies show that the iGWAS approach is able to facilitate discovering genetic association mechanisms, and outperforms the SNP-only method for testing genetic associations. We conduct a family-based iGWAS of childhood asthma that integrates genetic and genomic data. The iGWAS approach identifies six novel susceptibility genes (MANEA, MRPL53, LYCAT, ST8SIA4, NDFIP1, and PTCH1) using the omnibus test with false discovery rate less than 1%, whereas no gene using SNP-only analyses survives with the same cut-off. The iGWAS analyses further characterize that genetic effects of these genes are mostly mediated through their gene expressions. In summary, the iGWAS approach provides a new analytic framework to investigate the mechanism of genetic etiology, and identifies novel susceptibility genes of childhood asthma that were biologically meaningful. PMID:25997986
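
    The split of a genetic effect into a mediation effect (ME) and an alternative effect (AE) described in this abstract can be illustrated with a minimal sketch. It assumes a continuous phenotype and ordinary least-squares fits (the product-of-coefficients form of mediation analysis), which is a simplification and not the omnibus test or family-based design of the paper. All function names and the toy data are hypothetical.

```python
from statistics import fmean

def cov(x, y):
    """Population covariance of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def slope(x, y):
    """OLS slope of y on x (model with intercept)."""
    return cov(x, y) / cov(x, x)

def two_predictor_slopes(x1, x2, y):
    """OLS slopes of y on x1 and x2 (model with intercept), via normal equations."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

def decompose(snp, expr, pheno):
    """Split the SNP effect on the phenotype into mediated and alternative parts."""
    a = slope(snp, expr)                                  # SNP -> expression
    direct, b = two_predictor_slopes(snp, expr, pheno)    # SNP and expression -> phenotype
    return {
        "mediated": a * b,             # effect routed through gene expression (ME)
        "alternative": direct,         # effect through other mechanisms (AE)
        "total": slope(snp, pheno),    # marginal SNP effect
    }

# Hypothetical toy data: genotype dosage, expression level, continuous phenotype.
snp = [0, 0, 1, 1, 2, 2, 0, 1, 2, 1]
expr = [0.1, 0.4, 1.2, 0.9, 2.1, 1.8, 0.2, 1.1, 2.3, 0.8]
pheno = [0.3, 0.5, 1.9, 1.4, 3.2, 2.9, 0.2, 1.6, 3.4, 1.5]

effects = decompose(snp, expr, pheno)
print(effects)
```

    In this linear setting the total effect equals the sum of the mediated and alternative effects exactly, which is why the two parts can be read as a decomposition. For a binary outcome such as asthma status, that identity generally holds only approximately.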

  3. iGWAS: Integrative Genome-Wide Association Studies of Genetic and Genomic Data for Disease Susceptibility Using Mediation Analysis

    PubMed Central

    Huang, Yen-Tsung; Liang, Liming; Moffatt, Miriam F.; Cookson, William O. C. M.; Lin, Xihong

    2015-01-01

    Genome-wide association studies (GWAS) have been a standard practice in identifying single nucleotide polymorphisms (SNPs) for disease susceptibility. We propose a new approach, termed integrative GWAS (iGWAS) that exploits the information of gene expressions to investigate the mechanisms of the association of SNPs with a disease phenotype, and to incorporate the family-based design for genetic association studies. Specifically, the relations among SNPs, gene expression, and disease are modeled within the mediation analysis framework, which allows us to disentangle the genetic effect on a disease phenotype into two parts: an effect mediated through a gene expression (mediation effect, ME) and an effect through other biological mechanisms or environment-mediated mechanisms (alternative effect, AE). We develop omnibus tests for the ME and AE that are robust to underlying true disease models. Numerical studies show that the iGWAS approach is able to facilitate discovering genetic association mechanisms, and outperforms the SNP-only method for testing genetic associations. We conduct a family-based iGWAS of childhood asthma that integrates genetic and genomic data. The iGWAS approach identifies six novel susceptibility genes (MANEA, MRPL53, LYCAT, ST8SIA4, NDFIP1, and PTCH1) using the omnibus test with false discovery rate less than 1%, whereas no gene using SNP-only analyses survives with the same cut-off. The iGWAS analyses further characterize that genetic effects of these genes are mostly mediated through their gene expressions. In summary, the iGWAS approach provides a new analytic framework to investigate the mechanism of genetic etiology, and identifies novel susceptibility genes of childhood asthma that were biologically meaningful. PMID:25997986

  4. An Integrated Genomic Strategy Delineates Candidate Mediator Genes Regulating Grain Size and Weight in Rice

    PubMed Central

    Malik, Naveen; Dwivedi, Nidhi; Singh, Ashok K.; Parida, Swarup K.; Agarwal, Pinky; Thakur, Jitendra K.; Tyagi, Akhilesh K.

    2016-01-01

    The present study deployed a Mediator (MED) genes-mediated integrated genomic strategy for understanding the complex genetic architecture of grain size/weight quantitative trait in rice. The targeted multiplex amplicon resequencing of 55 MED genes annotated from whole rice genome in 384 accessions discovered 3971 SNPs, which were structurally and functionally annotated in diverse coding and non-coding sequence-components of genes. Association analysis, using the genotyping information of 3971 SNPs in a structured population of 384 accessions (with 50–100 kb linkage disequilibrium decay), detected 10 MED gene-derived SNPs significantly associated (46% combined phenotypic variation explained) with grain length, width and weight in rice. Of these, one strong grain weight-associated non-synonymous SNP (G/A)-carrying OsMED4_2 gene was validated successfully in low- and high-grain weight parental accessions and homozygous individuals of a rice mapping population. The seed-specific expression, including differential up/down-regulation of three grain size/weight-associated MED genes (including OsMED4_2) in six low and high-grain weight rice accessions was evident. Altogether, combinatorial genomic approach involving haplotype-based association analysis delineated diverse functionally relevant natural SNP-allelic variants in 10 MED genes, including three potential novel SNP haplotypes in an OsMED4_2 gene governing grain size/weight differentiation in rice. These molecular tags have potential to accelerate genomics-assisted crop improvement in rice. PMID:27000976

  5. Integrative modeling of multi-platform genomic data under the framework of mediation analysis.

    PubMed

    Huang, Yen-Tsung

    2015-01-15

    Given the availability of genomic data, there have been emerging interests in integrating multi-platform data. Here, we propose to model genetics (single nucleotide polymorphism (SNP)), epigenetics (DNA methylation), and gene expression data as a biological process to delineate phenotypic traits under the framework of causal mediation modeling. We propose a regression model for the joint effect of SNPs, methylation, gene expression, and their nonlinear interactions on the outcome and develop a variance component score test for any arbitrary set of regression coefficients. The test statistic under the null follows a mixture of chi-square distributions, which can be approximated using a characteristic function inversion method or a perturbation procedure. We construct tests for candidate models determined by different combinations of SNPs, DNA methylation, gene expression, and interactions and further propose an omnibus test to accommodate different models. We then study three path-specific effects: the direct effect of SNPs on the outcome, the effect mediated through expression, and the effect through methylation. We characterize correspondences between the three path-specific effects and coefficients in the regression model, which are influenced by causal relations among SNPs, DNA methylation, and gene expression. We illustrate the utility of our method in two genomic studies and numerical simulation studies. PMID:25316269
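
    The null distribution mentioned in this abstract, a mixture of chi-square distributions, can be made concrete with a small sketch. It estimates a p-value by plain Monte Carlo sampling, used here as a stand-in: it is neither the characteristic function inversion nor the perturbation procedure of the paper. The function name, the weights, and the observed statistic are hypothetical.

```python
import random

def mixture_chi2_pvalue(q_obs, weights, n_draws=20000, seed=0):
    """Estimate P(Q >= q_obs) where Q = sum_i w_i * Z_i^2 and Z_i ~ N(0, 1).

    Each Z_i^2 is a chi-square variable with 1 degree of freedom, so Q is a
    weighted mixture of chi-square(1) variables.
    """
    rng = random.Random(seed)   # fixed seed for reproducible estimates
    exceed = 0
    for _ in range(n_draws):
        q = sum(w * rng.gauss(0.0, 1.0) ** 2 for w in weights)
        if q >= q_obs:
            exceed += 1
    return exceed / n_draws

# Hypothetical weights (for example, eigenvalues of a kernel matrix) and statistic.
p_mixture = mixture_chi2_pvalue(q_obs=6.0, weights=[2.0, 1.0, 0.5])
print(p_mixture)
```

    With a single weight of 1 the mixture reduces to an ordinary chi-square distribution with 1 degree of freedom, which gives a simple sanity check: a statistic of 3.841 should yield a p-value near 0.05.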

  6. Integrative modeling of multi-platform genomic data under the framework of mediation analysis

    PubMed Central

    Huang, Yen-Tsung

    2014-01-01

    Given the availability of genomic data, there have been emerging interests in integrating multi-platform data. Here, we propose to model genetics (single nucleotide polymorphism (SNP)), epigenetics (DNA methylation), and gene expression data as a biological process to delineate phenotypic traits under the framework of causal mediation modeling. We propose a regression model for the joint effect of SNPs, methylation, gene expression, and their nonlinear interactions on the outcome and develop a variance component score test for any arbitrary set of regression coefficients. The test statistic under the null follows a mixture of chi-square distributions, which can be approximated using a characteristic function inversion method or a perturbation procedure. We construct tests for candidate models determined by different combinations of SNPs, DNA methylation, gene expression, and interactions and further propose an omnibus test to accommodate different models. We then study three path-specific effects: the direct effect of SNPs on the outcome, the effect mediated through expression, and the effect through methylation. We characterize correspondences between the three path-specific effects and coefficients in the regression model, which are influenced by causal relations among SNPs, DNA methylation, and gene expression. We illustrate the utility of our method in two genomic studies and numerical simulation studies. PMID:25316269

  7. Integrated Genomics Identifies Convergence of Ankylosing Spondylitis with Global Immune Mediated Disease Pathways

    PubMed Central

    Uddin, Mohammed; Codner, Dianne; Mahmud Hasan, S M; Scherer, Stephen W; O’Rielly, Darren D; Rahman, Proton

    2015-01-01

    Ankylosing spondylitis (AS) is a highly heritable complex inflammatory arthritis. Although a handful of non-HLA risk loci have been identified, capturing the unexplained genetic contribution to AS pathogenesis remains a challenge attributed to additive, pleiotropic and epistatic interactions at the molecular level. Here, we developed multiple integrated genomic approaches to quantify molecular convergence of non-HLA loci with global immune mediated diseases. We show that non-HLA genes are significantly sensitive to deleterious mutation accumulation in the general population compared with tolerant genes. Human developmental proteomics (prenatal to adult) analysis revealed that proteins encoded by non-HLA AS risk loci are 2-fold more expressed in adult hematopoietic cells. Enrichment analysis revealed that AS risk genes overlap with a significant number of immune related pathways (p < 0.0001 to 9.8 × 10^-12). Protein-protein interaction analysis revealed that non-shared AS risk genes are highly clustered seeds that significantly converge (empirical; p < 0.01 to 1.6 × 10^-4) into networks of global immune mediated disease risk loci. We have also provided initial evidence for the involvement of STAT2/3 in AS pathogenesis. Collectively, these findings highlight molecular insight on non-HLA AS risk loci that are not exclusively connected with overlapping immune mediated diseases; rather, they are a component of common pathophysiological pathways with other immune mediated diseases. This information will be pivotal to fully explain AS pathogenesis and identify new therapeutic targets. PMID:25980808

  8. Zbtb1 Safeguards Genome Integrity and Prevents p53-Mediated Apoptosis in Proliferating Lymphoid Progenitors.

    PubMed

    Cao, Xin; Lu, Ying; Zhang, Xianyu; Kovalovsky, Damian

    2016-08-15

    Expression of the transcription factor Zbtb1 is required for normal lymphoid development. We report in the present study that Zbtb1 maintains genome integrity in immune progenitors, without which cells undergo increased DNA damage and p53-mediated apoptosis during replication and differentiation. Increased DNA damage in Zbtb1-mutant (ScanT) progenitors was due to increased sensitivity to replication stress, which was a consequence of inefficient activation of the S-phase checkpoint response. Increased p53-mediated apoptosis affected not only lymphoid but also myeloid development in competitive bone marrow chimeras, and prevention of apoptosis by transgenic Bcl2 expression and p53 deficiency rescued lymphoid as well as myeloid development from Zbtb1-mutant progenitors. Interestingly, however, protection from apoptosis rescued only the early stages of T cell development, and thymocytes remained arrested at the double-negative 3 developmental stage, indicating a strict requirement of Zbtb1 at later T cell developmental stages. Collectively, these results indicate that Zbtb1 prevents DNA damage in replicating immune progenitors, allowing the generation of B cells, T cells, and myeloid cells. PMID:27402700

  9. Integrative genomic analysis identifies epigenetic marks that mediate genetic risk for epithelial ovarian cancer

    PubMed Central

    2014-01-01

    Background: Both genetic and epigenetic factors influence the development and progression of epithelial ovarian cancer (EOC). However, there is an incomplete understanding of the interrelationship between these factors and the extent to which they interact to impact disease risk. In the present study, we aimed to gain insight into this relationship by identifying DNA methylation marks that are candidate mediators of ovarian cancer genetic risk. Methods: We used 214 cases and 214 age-matched controls from the Mayo Clinic Ovarian Cancer Study. Pretreatment, blood-derived DNA was profiled for genome-wide methylation (Illumina Infinium HumanMethylation27 BeadArray) and single nucleotide polymorphisms (SNPs, Illumina Infinium HD Human610-Quad BeadArray). The Causal Inference Test (CIT) was implemented to distinguish CpG sites that mediate genetic risk from those that are consequential or independently acted on by genotype. Results: Controlling for the estimated distribution of immune cells and other key covariates, our initial epigenome-wide association analysis revealed 1,993 CpGs that were significantly differentially methylated between cases and controls (FDR, q < 0.05). The relationship between methylation and case-control status for these 1,993 CpGs was found to be highly consistent with the results of a previously published, independent study that consisted of peripheral blood DNA methylation signatures in 131 pretreatment cases and 274 controls. Implementation of the CIT revealed 17 CpG/SNP pairs, comprising 13 unique CpGs and 17 unique SNPs, which represent potential methylation-mediated relationships between genotype and EOC risk. Of these 13 CpGs, several are associated with immune related genes and genes that have been previously shown to exhibit altered expression in the context of cancer. Conclusions: These findings provide additional insight into EOC etiology and may serve as novel biomarkers for EOC susceptibility. PMID:24479488

  10. An Integrated Genomic Analysis of Aryl Hydrocarbon Receptor-Mediated Inhibition of B-Cell Differentiation

    PubMed Central

    De Abrew, K. Nadira; Kaminski, Norbert E.; Thomas, Russell S.

    2010-01-01

    The aryl hydrocarbon receptor (AHR) agonist 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) alters differentiation of B cells and suppresses antibody production. A combination of whole-genome, microarray-based chromatin immunoprecipitation (ChIP-on-chip), and time course gene expression microarray analysis was performed on the mouse B-cell line CH12.LX following exposure to lipopolysaccharide (LPS) or LPS and TCDD to identify the primary and downstream transcriptional elements of B-cell differentiation that are altered by the AHR. ChIP-on-chip analysis identified 1893 regions with a significant increase in AHR binding with TCDD treatment. Transcription factor binding site analysis on the ChIP-on-chip data showed enrichment in AHR response elements. Other transcription factors showed significant coenrichment with AHR response elements. When ChIP-on-chip regions were compared with gene expression changes at the early time points, 78 genes were identified as potential direct targets of the AHR. AHR binding and expression changes were confirmed for a subset of genes in primary mouse B cells. Network analysis examining connections between the 78 potential AHR target genes and three transcription factors known to regulate B-cell differentiation indicated multiple paths for potential regulation by the AHR. Enrichment analysis on the differentially expressed genes at each time point evaluated the downstream impact of AHR-regulated gene expression changes on B-cell–related processes. AHR-mediated impairment of B-cell differentiation occurred at multiple nodes of the B-cell differentiation network and potentially through multiple mechanisms including direct cis-acting effects on key regulators of B-cell differentiation, indirect regulation of B-cell differentiation–related pathways, and transcriptional coregulation of target genes by AHR and other transcription factors. PMID:20819909

  11. Exogenous gene can be integrated into Nosema bombycis genome by mediating with a non-transposon vector.

    PubMed

    Guo, Rui; Cao, Guangli; Lu, Yahong; Xue, Renyu; Kumar, Dhiraj; Hu, Xiaolong; Gong, Chengliang

    2016-08-01

    Nosema bombycis, a microsporidium, is a pathogen of pebrine disease of silkworms, and its genomic DNA sequence has been determined. Thus far, research on gene functions of microsporidia, including N. bombycis, cannot be performed with gain/loss-of-function approaches. In the present study, we aimed to construct transgenic N. bombycis. Hemocytes of the infected silkworm were transfected in vivo with the non-transposon vector pIZT/V5-His, and the blood, in which hemocytes with green fluorescence could be observed, was added to cultured BmN cells. Furthermore, normal BmN cells were infected with germinated N. bombycis, and the infected cells were transfected with pIZT/V5-His. Continuous fluorescence observations revealed N. bombycis with green fluorescence in some N. bombycis-infected cells, and the genome extracted from purified N. bombycis spores was used as a template. PCR amplification was carried out with a pair of primers specific for the green fluorescent protein (GFP) gene, and a specific product representing the gfp gene could be amplified. Detection of the GFP protein by Western blotting also demonstrated that the gfp gene was perfectly inserted into the genome of N. bombycis. These results illustrate that an exogenous gene can be integrated into the N. bombycis genome by mediating with a non-transposon vector. Our research not only offers a strategy for research on gene function of N. bombycis but also provides an important reference for constructing genetically modified microsporidia utilized for biocontrol of pests. PMID:27083186

  12. Expression and genomic integration of transgenes after Agrobacterium-mediated transformation of mature barley embryos.

    PubMed

    Uçarlı, C; Tufan, F; Gürel, F

    2015-01-01

    Mature embryos in tissue cultures are advantageous because of their abundance and rapid germination, which reduces genomic instability problems. In this study, 2-day-old isolated mature barley embryos were infected with 2 hypervirulent Agrobacterium strains (AGL1 and EHA105), followed by a 3-day period of co-cultivation in the presence of the amino acid L-cysteine. Chimeric expression of the β-glucuronidase gene (gusA) directed by a viral promoter of strawberry vein banding virus was observed in coleoptile epidermal cells and seminal roots in 5-day-old germinated seedlings. In addition to varying infectivity patterns in different strains, there was a higher ratio of transient β-glucuronidase expression in developing coleoptiles than in embryonic roots, indicating the high competency of shoot apical meristem cells in the mature embryo. A total of 548 explants were transformed and 156 plants developed to maturity on G418 media after 18-25 days. We detected transgenes in 74% of the screened plant leaves by polymerase chain reaction, and 49% of these expressed the neomycin phosphotransferase II gene following AGL1 transformation. Ten randomly selected T0 transformants were analyzed using thermal asymmetric interlaced polymerase chain reaction, and 24 fragments ranging between 200 and 600 base pairs were sequenced. Three of the sequences flanked with transferred-DNA showed high similarity to coding regions of the barley genome, including alpha tubulin5, homeobox 1, and mitochondrial 16S genes. We observed 70-200-base pair filler sequences only in the coding regions of barley in this study. PMID:25730049

  13. XerD-mediated FtsK-independent integration of TLCϕ into the Vibrio cholerae genome.

    PubMed

    Midonet, Caroline; Das, Bhabatosh; Paly, Evelyne; Barre, Francois-Xavier

    2014-11-25

    As in most bacteria, topological problems arising from the circularity of the two Vibrio cholerae chromosomes, chrI and chrII, are resolved by the addition of a crossover at a specific site of each chromosome, dif, by two tyrosine recombinases, XerC and XerD. The reaction is under the control of a cell division protein, FtsK, which activates the formation of a Holliday Junction (HJ) intermediate by XerD catalysis that is resolved into product by XerC catalysis. Many plasmids and phages exploit Xer recombination for dimer resolution and for integration, respectively. In all cases so far described, they rely on an alternative recombination pathway in which XerC catalyzes the formation of a HJ independently of FtsK. This is notably the case for CTXϕ, the cholera toxin phage. Here, we show that in contrast, integration of TLCϕ, a toxin-linked cryptic satellite phage that is almost always found integrated at the chrI dif site before CTXϕ, depends on the formation of a HJ by XerD catalysis, which is then resolved by XerC catalysis. The reaction nevertheless escapes the normal cellular control exerted by FtsK on XerD. In addition, we show that the same reaction promotes the excision of TLCϕ, along with any CTXϕ copy present between dif and its left attachment site, providing a plausible mechanism for how chrI CTXϕ copies can be eliminated, as occurred in the second wave of the current cholera pandemic. PMID:25385643

  14. DNA-PK-mediated phosphorylation of EZH2 regulates the DNA damage-induced apoptosis to maintain T-cell genomic integrity.

    PubMed

    Wang, Y; Sun, H; Wang, J; Wang, H; Meng, L; Xu, C; Jin, M; Wang, B; Zhang, Y; Zhang, Y; Zhu, T

    2016-01-01

    EZH2 is a histone methyltransferase whose functions in stem cells and tumor cells are well established. Accumulating evidence shows that EZH2 has critical roles in T cells and could be a promising therapeutic target for several immune diseases. To further reveal the novel functions of EZH2 in human T cells, protein co-immunoprecipitation combined with mass spectrometry was conducted and several previously unknown EZH2-interacting proteins were identified. Of them, we focused on a DNA damage responsive protein, Ku80, because of the limited knowledge regarding EZH2 in the DNA damage response. Then, we demonstrated that instead of being methylated by EZH2, Ku80 bridges the interaction between the DNA-dependent protein kinase (DNA-PK) complex and EZH2, thus facilitating EZH2 phosphorylation. Moreover, EZH2 histone methyltransferase activity was enhanced when Ku80 was knocked down or DNA-PK activity was inhibited, suggesting that DNA-PK-mediated EZH2 phosphorylation impairs EZH2 histone methyltransferase activity. On the other hand, EZH2 inhibition increased the DNA damage level at the late phase of T-cell activation, suggesting that EZH2 is involved in genomic integrity maintenance. In conclusion, our study is the first to demonstrate that EZH2 is phosphorylated by the DNA damage responsive complex DNA-PK and regulates DNA damage-mediated T-cell apoptosis, which reveals a novel functional crosstalk between epigenetic regulation and genomic integrity. PMID:27468692

  15. DNA-PK-mediated phosphorylation of EZH2 regulates the DNA damage-induced apoptosis to maintain T-cell genomic integrity

    PubMed Central

    Wang, Y; Sun, H; Wang, J; Wang, H; Meng, L; Xu, C; Jin, M; Wang, B; Zhang, Y; Zhang, Y; Zhu, T

    2016-01-01

    EZH2 is a histone methyltransferase whose functions in stem cells and tumor cells are well established. Accumulating evidence shows that EZH2 has critical roles in T cells and could be a promising therapeutic target for several immune diseases. To further reveal the novel functions of EZH2 in human T cells, protein co-immunoprecipitation combined with mass spectrometry was conducted and several previously unknown EZH2-interacting proteins were identified. Of them, we focused on a DNA damage responsive protein, Ku80, because of the limited knowledge regarding EZH2 in the DNA damage response. Then, we demonstrated that instead of being methylated by EZH2, Ku80 bridges the interaction between the DNA-dependent protein kinase (DNA-PK) complex and EZH2, thus facilitating EZH2 phosphorylation. Moreover, EZH2 histone methyltransferase activity was enhanced when Ku80 was knocked down or DNA-PK activity was inhibited, suggesting that DNA-PK-mediated EZH2 phosphorylation impairs EZH2 histone methyltransferase activity. On the other hand, EZH2 inhibition increased the DNA damage level at the late phase of T-cell activation, suggesting that EZH2 is involved in genomic integrity maintenance. In conclusion, our study is the first to demonstrate that EZH2 is phosphorylated by the DNA damage responsive complex DNA-PK and regulates DNA damage-mediated T-cell apoptosis, which reveals a novel functional crosstalk between epigenetic regulation and genomic integrity. PMID:27468692

  16. Integration of transcriptomic and genomic data suggests candidate mechanisms for APOE4-mediated pathogenic action in Alzheimer’s disease

    PubMed Central

    Caberlotto, Laura; Marchetti, Luca; Lauria, Mario; Scotti, Marco; Parolo, Silvia

    2016-01-01

    Among the genetic factors known to increase the risk of late onset Alzheimer’s disease (AD), the presence of the apolipoprotein E4 (APOE4) allele has been recognized as the one with the strongest effect. However, despite decades of research, the pathogenic role of APOE4 in Alzheimer’s disease has not yet been clearly elucidated. In order to investigate the pathogenic action of APOE4, we applied a systems biology approach to the analysis of transcriptomic and genomic data of APOE44 vs. APOE33 allele carriers affected by Alzheimer’s disease. Network analysis combined with a novel technique for biomarker computation allowed the identification of an alteration in aging-associated processes such as inflammation, oxidative stress and metabolic pathways, indicating that APOE4 possibly accelerates pathological processes physiologically induced by aging. Subsequent integration with genomic data indicates that the Notch pathway could be the nodal molecular mechanism altered in APOE44 allele carriers with Alzheimer’s disease. Interestingly, PSEN1 and APP, genes whose mutations are known to be linked to early onset Alzheimer’s disease, are closely linked to this pathway. In conclusion, the role of APOE4 in inflammation and oxidation through the Notch signaling pathway could be crucial in elucidating the risk factors of Alzheimer’s disease. PMID:27585646

  17. Integration of transcriptomic and genomic data suggests candidate mechanisms for APOE4-mediated pathogenic action in Alzheimer's disease.

    PubMed

    Caberlotto, Laura; Marchetti, Luca; Lauria, Mario; Scotti, Marco; Parolo, Silvia

    2016-01-01

    Among the genetic factors known to increase the risk of late onset Alzheimer's disease (AD), the presence of the apolipoprotein E4 (APOE4) allele has been recognized as the one with the strongest effect. However, despite decades of research, the pathogenic role of APOE4 in Alzheimer's disease has not yet been clearly elucidated. In order to investigate the pathogenic action of APOE4, we applied a systems biology approach to the analysis of transcriptomic and genomic data of APOE44 vs. APOE33 allele carriers affected by Alzheimer's disease. Network analysis combined with a novel technique for biomarker computation allowed the identification of an alteration in aging-associated processes such as inflammation, oxidative stress and metabolic pathways, indicating that APOE4 possibly accelerates pathological processes physiologically induced by aging. Subsequent integration with genomic data indicates that the Notch pathway could be the nodal molecular mechanism altered in APOE44 allele carriers with Alzheimer's disease. Interestingly, PSEN1 and APP, genes whose mutations are known to be linked to early onset Alzheimer's disease, are closely linked to this pathway. In conclusion, the role of APOE4 in inflammation and oxidation through the Notch signaling pathway could be crucial in elucidating the risk factors of Alzheimer's disease. PMID:27585646

  18. CUL9 mediates the functions of the 3M complex and ubiquitylates survivin to maintain genome integrity

    PubMed Central

    Li, Zhijun; Pei, Xin-Hai; Yan, Jun; Yan, Feng; Cappell, Kathryn M.; Whitehurst, Angelique W.; Xiong, Yue

    2014-01-01

    Summary: The Cullin 9 (CUL9) gene encodes a putative E3 ligase that localizes in the cytoplasm. Cul9 null mice develop spontaneous tumors in multiple organs; however, neither the cellular nor the molecular mechanisms of CUL9 in tumor suppression are currently known. We show here that deletion of Cul9 leads to abnormal nuclear morphology, increased DNA damage and aneuploidy. CUL9 knockdown rescues the microtubule and mitosis defects in cells depleted for CUL7 or OBSL1, two genes that are mutated in a mutually exclusive manner in 3M growth retardation syndrome and function in microtubule dynamics. CUL9 promotes the ubiquitylation and degradation of survivin and is inhibited by CUL7. Depletion of CUL7 decreases survivin level and overexpression of survivin rescues the defects caused by CUL7 depletion. We propose a 3M–CUL9-survivin pathway in maintaining microtubule and genome integrity, normal development and tumor suppression. PMID:24793696

  19. Integrative Genomics Implicates EGFR as a Downstream Mediator in NKX2-1 Amplified Non-Small Cell Lung Cancer

    PubMed Central

    Clarke, Nicole; Biscocho, Jewison; Kwei, Kevin A.; Davidson, Jean M.; Sridhar, Sushmita; Gong, Xue; Pollack, Jonathan R.

    2015-01-01

    NKX2-1, encoding a homeobox transcription factor, is amplified in approximately 15% of non-small cell lung cancers (NSCLC), where it is thought to drive cancer cell proliferation and survival. However, its mechanism of action remains largely unknown. To identify relevant downstream transcriptional targets, here we carried out a combined NKX2-1 transcriptome (NKX2-1 knockdown followed by RNAseq) and cistrome (NKX2-1 binding sites by ChIPseq) analysis in four NKX2-1-amplified human NSCLC cell lines. While NKX2-1 regulated genes differed among the four cell lines assayed, cell proliferation emerged as a common theme. Moreover, in 3 of the 4 cell lines, epidermal growth factor receptor (EGFR) was among the top NKX2-1 upregulated targets, which we confirmed at the protein level by western blot. Interestingly, EGFR knockdown led to upregulation of NKX2-1, suggesting a negative feedback loop. Consistent with this finding, combined knockdown of NKX2-1 and EGFR in NCI-H1819 lung cancer cells reduced cell proliferation (as well as MAP-kinase and PI3-kinase signaling) more than knockdown of either alone. Likewise, NKX2-1 knockdown enhanced the growth-inhibitory effect of the EGFR-inhibitor erlotinib. Taken together, our findings implicate EGFR as a downstream effector of NKX2-1 in NKX2-1 amplified NSCLC, with possible clinical implications, and provide a rich dataset for investigating additional mediators of NKX2-1 driven oncogenesis. PMID:26556242

  20. Yeast Oligo-mediated Genome Engineering (YOGE)

    PubMed Central

    DiCarlo, JE; Conley, AJ; Penttilä, M; Jäntti, J; Wang, HH; Church, GM

    2014-01-01

    High-frequency oligonucleotide-directed recombination engineering (recombineering) has enabled rapid modification of several prokaryotic genomes to date. Here, we present a method for oligonucleotide-mediated recombineering in the model eukaryote and industrial production host S. cerevisiae, which we call Yeast Oligo-mediated Genome Engineering (YOGE). Through a combination of overexpression and knockouts of relevant genes and optimization of transformation and oligonucleotide designs, we achieve high gene modification frequencies at levels that only require screening of dozens of cells. We demonstrate the robustness of our approach in three divergent yeast strains, including those involved in industrial production of bio-based chemicals. Furthermore, YOGE can be iteratively executed via cycling to generate genomic libraries up to 10^5 individuals at each round for diversity generation. YOGE cycling alone, or in combination with phenotypic selections or endonuclease-based negative genotypic selections, can be used to easily generate modified alleles in yeast populations with high frequencies. PMID:24160921

  1. Yeast oligo-mediated genome engineering (YOGE).

    PubMed

    DiCarlo, James E; Conley, Andrew J; Penttilä, Merja; Jäntti, Jussi; Wang, Harris H; Church, George M

    2013-12-20

    High-frequency oligonucleotide-directed recombination engineering (recombineering) has enabled rapid modification of several prokaryotic genomes to date. Here, we present a method for oligonucleotide-mediated recombineering in the model eukaryote and industrial production host Saccharomyces cerevisiae, which we call yeast oligo-mediated genome engineering (YOGE). Through a combination of overexpression and knockouts of relevant genes and optimization of transformation and oligonucleotide designs, we achieve high gene-modification frequencies at levels that only require screening of dozens of cells. We demonstrate the robustness of our approach in three divergent yeast strains, including those involved in industrial production of biobased chemicals. Furthermore, YOGE can be iteratively executed via cycling to generate genomic libraries up to 10^5 individuals at each round for diversity generation. YOGE cycling alone or in combination with phenotypic selections or endonuclease-based negative genotypic selections can be used to generate modified alleles easily in yeast populations with high frequencies. PMID:24160921

  2. TAL effector-mediated genome visualization (TGV).

    PubMed

    Miyanari, Yusuke

    2014-09-01

    The three-dimensional remodeling of chromatin within the nucleus is being recognized as a determinant for genome regulation. Recent technological advances in live imaging of chromosome loci have begun to explore the biological roles of the movement of the chromatin within the nucleus. To facilitate better understanding of the functional relevance and mechanisms regulating genome architecture, we applied transcription activator-like effector (TALE) technology to visualize endogenous repetitive genomic sequences in mouse cells. The application, called TAL effector-mediated genome visualization (TGV), allows us to label specific repetitive sequences and trace nuclear remodeling in living cells. Using this system, parental origin of chromosomes was specifically traced by distinction of single-nucleotide polymorphisms (SNPs). This review will present our approaches to monitor nuclear dynamics of target sequences and highlight key properties and potential uses of TGV. PMID:24704356

  3. Triplex-mediated genome targeting and editing.

    PubMed

    Reza, Faisal; Glazer, Peter M

    2014-01-01

    Genome targeting and editing in vitro and in vivo can be achieved through an interplay of exogenously introduced molecules and the induction of endogenous recombination machinery. The former includes a repertoire of sequence-specific binding molecules for targeted induction and appropriation of this machinery, such as by triplex-forming oligonucleotides (TFOs) or triplex-forming peptide nucleic acids (PNAs) and recombinagenic donor DNA, respectively. This versatile targeting and editing via recombination approach facilitates high-fidelity and low-off-target genome mutagenesis, repair, expression, and regulation. Herein, we describe the current state-of-the-art in triplex-mediated genome targeting and editing with a perspective towards potential translational and therapeutic applications. We detail several materials and methods for the design, delivery, and use of triplex-forming and recombinagenic molecules for mediating and introducing specific, heritable, and safe genomic modifications. Furthermore we denote some guidelines for endogenous genome targeting and editing site identification and techniques to test targeting and editing efficiency. PMID:24557900

  4. Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine

    PubMed Central

    Elsik, Christine G.; Tayal, Aditi; Diesh, Colin M.; Unni, Deepak R.; Emery, Marianne L.; Nguyen, Hung N.; Hagen, Darren E.

    2016-01-01

    We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. PMID:26578564

  5. Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine.

    PubMed

    Elsik, Christine G; Tayal, Aditi; Diesh, Colin M; Unni, Deepak R; Emery, Marianne L; Nguyen, Hung N; Hagen, Darren E

    2016-01-01

    We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. PMID:26578564

  6. Zinc Finger Nuclease-Expressing Baculoviral Vectors Mediate Targeted Genome Integration of Reprogramming Factor Genes to Facilitate the Generation of Human Induced Pluripotent Stem Cells

    PubMed Central

    Phang, Rui-Zhe; Tay, Felix Chang; Goh, Sal-Lee; Lau, Cia-Hin; Zhu, Haibao; Tan, Wee-Kiat; Liang, Qingle; Chen, Can; Du, Shouhui; Li, Zhendong; Tay, Johan Chin-Kang; Wu, Chunxiao; Zeng, Jieming; Fan, Weimin; Toh, Han Chong

    2013-01-01

    Integrative gene transfer using retroviruses to express reprogramming factors displays high efficiency in generating induced pluripotent stem cells (iPSCs), but the value of the method is limited because of the concern over mutagenesis associated with random insertion of transgenes. Site-specific integration into a preselected locus by engineered zinc-finger nuclease (ZFN) technology provides a potential way to overcome the problem. Here, we report the successful reprogramming of human fibroblasts into a state of pluripotency by baculoviral transduction-mediated, site-specific integration of OKSM (Oct3/4, Klf4, Sox2, and c-myc) transcription factor genes into the AAVS1 locus in human chromosome 19. Two nonintegrative baculoviral vectors were used for cotransduction, one expressing ZFNs and another as a donor vector encoding the four transcription factors. iPSC colonies were obtained at a high efficiency of 12% (the mean value of eight individual experiments). All characterized iPSC clones carried the transgenic cassette only at the ZFN-specified AAVS1 locus. We further demonstrated that when the donor cassette was flanked by heterospecific loxP sequences, the reprogramming genes in iPSCs could be replaced by another transgene using a baculoviral vector-based Cre recombinase-mediated cassette exchange system, thereby producing iPSCs free of exogenous reprogramming factors. Although the use of nonintegrating methods to generate iPSCs is rapidly becoming a standard approach, methods based on site-specific integration of reprogramming factor genes as reported here hold the potential for efficient generation of genetically amenable iPSCs suitable for future gene therapy applications. PMID:24167318

  7. Integrative Genomics of Chronic Obstructive Pulmonary Disease

    PubMed Central

    Hobbs, Brian D.; Hersh, Craig P.

    2014-01-01

    Chronic obstructive pulmonary disease (COPD) is a complex disease with both environmental and genetic determinants, the most important of which is cigarette smoking. There is marked heterogeneity in the development of COPD among persons with similar cigarette smoking histories, which is likely partially explained by genetic variation. Genomic approaches such as genomewide association studies and gene expression studies have been used to discover genes and molecular pathways involved in COPD pathogenesis; however, these “first generation” omics studies have limitations. Integrative genomic studies are emerging which can combine genomic datasets to further examine the molecular underpinnings of COPD. Future research in COPD genetics will likely use network-based approaches to integrate multiple genomic data types in order to model the complex molecular interactions involved in COPD pathogenesis. This article reviews the genomic research to date and offers a vision for the future of integrative genomic research in COPD. PMID:25078622

  8. Reverse transcriptase: mediator of genomic plasticity.

    PubMed

    Brosius, J; Tiedge, H

    1995-01-01

    Reverse transcription has been an important mediator of genomic change. This influence dates back more than three billion years, when the RNA genome was converted into the DNA genome. While the current cellular role(s) of reverse transcriptase are not yet completely understood, it has become clear over the last few years that this enzyme is still responsible for generating significant genomic change and that its activities are one of the driving forces of evolution. Reverse transcriptase generates, for example, extra gene copies (retrogenes), using as a template mature messenger RNAs. Such retrogenes do not always end up as nonfunctional pseudogenes but form, after reinsertion into the genome, new unions with resident promoter elements that may alter the gene's temporal and/or spatial expression levels. More frequently, reverse transcriptase produces copies of nonmessenger RNAs, such as small nuclear or cytoplasmic RNAs. Extremely high copy numbers can be generated by this process. The resulting reinserted DNA copies are therefore referred to as short interspersed repetitive elements (SINEs). SINEs have long been considered selfish DNA, littering the genome via exponential propagation but not contributing to the host's fitness. Many SINEs, however, can give rise to novel genes encoding small RNAs, and are the migrant carriers of numerous control elements and sequence motifs that can equip resident genes with novel regulatory elements [Brosius J. and Gould S.J., Proc Natl Acad Sci USA 89, 10706-10710, 1992]. Retrosequences, such as SINEs and portions of retroelements (e.g., long terminal repeats, LTRs), are capable of donating sequence motifs for nucleosome positioning, DNA methylation, transcriptional enhancers and silencers, poly(A) addition sequences, determinants of RNA stability or transport, splice sites, and even amino acid codons for incorporation into open reading frames as novel protein domains. Retroposition can therefore be considered as a major

  9. Next-Generation Genomics: an Integrative Approach

    PubMed Central

    Hawkins, R. David; Hon, Gary C.; Ren, Bing

    2011-01-01

    Integrating results from diverse experiments is an essential process in our effort to understand the logic of complex systems, such as development, homeostasis and responses to the environment. With the advent of high-throughput methods - including genome-wide association studies (GWAS), ChIP-Seq, and RNA-Seq, etc., - acquisition of genome-scale data has never been easier. Epigenetics, transcriptomics, proteomics and genomics each provide an insightful, and yet single-dimensional, view of genome function; integrative analysis promises a unified, global view. However, the large amount of information and diverse technology platforms pose multiple challenges for data access and processing. This Review discusses emerging issues and strategies related to data integration in the era of next-generation genomics. PMID:20531367

  10. Integrated genome browser: visual analytics platform for genomics

    PubMed Central

    Norris, David C.; Loraine, Ann E.

    2016-01-01

    Motivation: Genome browsers that support fast navigation through vast datasets and provide interactive visual analytics functions can help scientists achieve deeper insight into biological systems. Toward this end, we developed Integrated Genome Browser (IGB), a highly configurable, interactive and fast open source desktop genome browser. Results: Here we describe multiple updates to IGB, including all-new capabilities to display and interact with data from high-throughput sequencing experiments. To demonstrate, we describe example visualizations and analyses of datasets from RNA-Seq, ChIP-Seq and bisulfite sequencing experiments. Understanding results from genome-scale experiments requires viewing the data in the context of reference genome annotations and other related datasets. To facilitate this, we enhanced IGB’s ability to consume data from diverse sources, including Galaxy, Distributed Annotation and IGB-specific Quickload servers. To support future visualization needs as new genome-scale assays enter wide use, we transformed the IGB codebase into a modular, extensible platform for developers to create and deploy all-new visualizations of genomic data. Availability and implementation: IGB is open source and is freely available from http://bioviz.org/igb. Contact: aloraine@uncc.edu PMID:27153568

  11. Transcription as a Threat to Genome Integrity.

    PubMed

    Gaillard, Hélène; Aguilera, Andrés

    2016-06-01

    Genomes undergo different types of sporadic alterations, including DNA damage, point mutations, and genome rearrangements, that constitute the basis for evolution. However, these changes may occur at high levels as a result of cell pathology and trigger genome instability, a hallmark of cancer and a number of genetic diseases. In the last two decades, evidence has accumulated that transcription constitutes an important natural source of DNA metabolic errors that can compromise the integrity of the genome. Transcription can create the conditions for high levels of mutations and recombination by its ability to open the DNA structure and remodel chromatin, making it more accessible to DNA insulting agents, and by its ability to become a barrier to DNA replication. Here we review the molecular basis of such events from a mechanistic perspective with particular emphasis on the role of transcription as a genome instability determinant. PMID:27023844

  12. Integrating Mediators and Moderators in Research Design

    ERIC Educational Resources Information Center

    MacKinnon, David P.

    2011-01-01

    The purpose of this article is to describe mediating variables and moderating variables and provide reasons for integrating them in outcome studies. Separate sections describe examples of moderating and mediating variables and the simplest statistical model for investigating each variable. The strengths and limitations of incorporating mediating…

  13. Methods of Genomic Competency Integration in Practice

    PubMed Central

    Jenkins, Jean; Calzone, Kathleen A.; Caskey, Sarah; Culp, Stacey; Weiner, Marsha; Badzek, Laurie

    2015-01-01

    Purpose: Genomics is increasingly relevant to health care, necessitating support for nurses to incorporate genomic competencies into practice. The primary aim of this project was to develop, implement, and evaluate a year-long genomic education intervention that trained, supported, and supervised institutional administrator and educator champion dyads to increase nursing capacity to integrate genomics through assessments of program satisfaction and institutional achieved outcomes. Design: Longitudinal study of 23 Magnet Recognition Program® Hospitals (21 intervention, 2 controls) participating in a 1-year new competency integration effort aimed at increasing genomic nursing competency and overcoming barriers to genomics integration in practice. Methods: Champion dyads underwent genomic training consisting of one in-person kick-off training meeting followed by monthly education webinars. Champion dyads designed institution-specific action plans detailing objectives, methods or strategies used to engage and educate nursing staff, timeline for implementation, and outcomes achieved. Action plans focused on a minimum of seven genomic priority areas: champion dyad personal development; practice assessment; policy content assessment; staff knowledge needs assessment; staff development; plans for integration; and anticipated obstacles and challenges. Action plans were updated quarterly, outlining progress made as well as inclusion of new methods or strategies. Progress was validated through virtual site visits with the champion dyads and chief nursing officers. Descriptive data were collected on all strategies or methods utilized, and timeline for achievement. Descriptive data were analyzed using content analysis. Findings: The complexity of the competency content and the uniqueness of social systems and infrastructure resulted in a significant variation of champion dyad interventions. Conclusions: Nursing champions can facilitate change in genomic nursing capacity through

  14. An Integrated System for Precise Genome Modification in Escherichia coli.

    PubMed

    Tas, Huseyin; Nguyen, Cac T; Patel, Ravish; Kim, Neil H; Kuhlman, Thomas E

    2015-01-01

    We describe an optimized system for the easy, effective, and precise modification of the Escherichia coli genome. Genome changes are introduced first through the integration of a 1.3 kbp Landing Pad consisting of a gene conferring resistance to tetracycline (tetA) or the ability to metabolize the sugar galactose (galK). The Landing Pad is then excised as a result of double-strand breaks by the homing endonuclease I-SceI, and replaced with DNA fragments bearing the desired change via λ-Red mediated homologous recombination. Repair of the double strand breaks and counterselection against the Landing Pad (using NiCl2 for tetA or 2-deoxy-galactose for galK) allows the isolation of modified bacteria without the use of additional antibiotic selection. We demonstrate the power of this method to make a variety of genome modifications: the exact integration, without any extraneous sequence, of the lac operon (~6.5 kbp) to any desired location in the genome and without the integration of antibiotic markers; the scarless deletion of ribosomal rrn operons (~6 kbp) through either intrachromosomal or oligonucleotide recombination; and the in situ fusion of native genes to fluorescent reporter genes without additional perturbation. PMID:26332675

  15. An Integrated System for Precise Genome Modification in Escherichia coli

    PubMed Central

    Tas, Huseyin; Nguyen, Cac T.; Patel, Ravish; Kim, Neil H.; Kuhlman, Thomas E.

    2015-01-01

    We describe an optimized system for the easy, effective, and precise modification of the Escherichia coli genome. Genome changes are introduced first through the integration of a 1.3 kbp Landing Pad consisting of a gene conferring resistance to tetracycline (tetA) or the ability to metabolize the sugar galactose (galK). The Landing Pad is then excised as a result of double-strand breaks by the homing endonuclease I-SceI, and replaced with DNA fragments bearing the desired change via λ-Red mediated homologous recombination. Repair of the double strand breaks and counterselection against the Landing Pad (using NiCl2 for tetA or 2-deoxy-galactose for galK) allows the isolation of modified bacteria without the use of additional antibiotic selection. We demonstrate the power of this method to make a variety of genome modifications: the exact integration, without any extraneous sequence, of the lac operon (~6.5 kbp) to any desired location in the genome and without the integration of antibiotic markers; the scarless deletion of ribosomal rrn operons (~6 kbp) through either intrachromosomal or oligonucleotide recombination; and the in situ fusion of native genes to fluorescent reporter genes without additional perturbation. PMID:26332675

  16. An Integrated Approach to Predictive Genomic Analytics

    SciTech Connect

    McDermott, Jason E.; Sanfilippo, Antonio P.; Taylor, Ronald C.; Baddeley, Robert L.; Riensche, Roderick M.; Jensen, Russell S.

    2010-08-02

    A variety of methods and algorithms have recently been employed in the analysis of gene expression data, including reverse-engineering and knowledge-based pathway modeling, semantic gene similarity, network analysis and clustering. These methods and algorithms address different subparts of the same overall challenge and need to be applied in combination to address predictive genomic analysis as a whole. In this paper, we present an integrated approach to predictive genomic analysis that achieves this objective and describe an application of the approach to the study of neuroprotection in stroke.

  17. Integrating sequence, evolution and functional genomics in regulatory genomics

    PubMed Central

    Vingron, Martin; Brazma, Alvis; Coulson, Richard; van Helden, Jacques; Manke, Thomas; Palin, Kimmo; Sand, Olivier; Ukkonen, Esko

    2009-01-01

    With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome. PMID:19226437

  18. Integrative Genomics and Computational Systems Medicine

    SciTech Connect

    McDermott, Jason E.; Huang, Yufei; Zhang, Bing; Xu, Hua; Zhao, Zhongming

    2014-01-01

    The exponential growth in generation of large amounts of genomic data from biological samples has driven the emerging field of systems medicine. This field is promising because it improves our understanding of disease processes at the systems level. However, the field is still in its young stage. There exists a great need for novel computational methods and approaches to effectively utilize and integrate various omics data.

  19. RNA-Mediated Epigenetic Programming of Genome Rearrangements

    PubMed Central

    Nowacki, Mariusz; Shetty, Keerthi; Landweber, Laura F.

    2012-01-01

    RNA, normally thought of as a conduit in gene expression, has a novel mode of action in ciliated protozoa. Maternal RNA templates provide both an organizing guide for DNA rearrangements and a template that can transport somatic mutations to the next generation. This opportunity for RNA-mediated genome rearrangement and DNA repair is profound in the ciliate Oxytricha, which deletes 95% of its germline genome during development in a process that severely fragments its chromosomes and then sorts and reorders the hundreds of thousands of pieces remaining. Oxytricha’s somatic nuclear genome is therefore an epigenome formed through RNA templates and signals arising from the previous generation. Furthermore, this mechanism of RNA-mediated epigenetic inheritance can function across multiple generations, and the discovery of maternal template RNA molecules has revealed new biological roles for RNA and has hinted at the power of RNA molecules to sculpt genomic information in cells. PMID:21801022

  20. Genomic, Proteomic, and Metabolomic Data Integration Strategies

    PubMed Central

    Wanichthanarak, Kwanjeera; Fahrmann, Johannes F; Grapov, Dmitry

    2015-01-01

    Robust interpretation of experimental results measuring discrete biological domains remains a significant challenge in the face of complex biochemical regulation processes such as organismal versus tissue versus cellular metabolism, epigenetics, and protein post-translational modification. Integration of analyses carried out across multiple measurement or omic platforms is an emerging approach to help address these challenges. This review focuses on select methods and tools for the integration of metabolomic with genomic and proteomic data using a variety of approaches including biochemical pathway-, ontology-, network-, and empirical-correlation-based methods. PMID:26396492

  1. Integrating Computer-Mediated Communication Strategy Instruction

    ERIC Educational Resources Information Center

    McNeil, Levi

    2016-01-01

    Communication strategies (CSs) play important roles in resolving problematic second language interaction and facilitating language learning. While studies in face-to-face contexts demonstrate the benefits of communication strategy instruction (CSI), there have been few attempts to integrate computer-mediated communication and CSI. The study…

  2. Adeno-Associated Virus Type 2 Wild-Type and Vector-Mediated Genomic Integration Profiles of Human Diploid Fibroblasts Analyzed by Third-Generation PacBio DNA Sequencing

    PubMed Central

    Hüser, Daniela; Gogol-Döring, Andreas; Chen, Wei

    2014-01-01

    ABSTRACT: Genome-wide analysis of adeno-associated virus (AAV) type 2 integration in HeLa cells has shown that wild-type AAV integrates at numerous genomic sites, including AAVS1 on chromosome 19q13.42. Multiple GAGY/C repeats, resembling consensus AAV Rep-binding sites, are preferred, whereas rep-deficient AAV vectors (rAAV) regularly show a random integration profile. This is the first study to analyze wild-type AAV integration in diploid human fibroblasts. Applying high-throughput third-generation PacBio-based DNA sequencing, integration profiles of wild-type AAV and rAAV are compared side by side. Bioinformatic analysis reveals that both wild-type AAV and rAAV prefer open chromatin regions. Although genomic features of AAV integration largely reproduce previous findings, the pattern of integration hot spots differs from that described in HeLa cells before. DNase-Seq data for human fibroblasts and for HeLa cells reveal variant chromatin accessibility at preferred AAV integration hot spots that correlates with variant hot spot preferences. DNase-Seq patterns of these sites in human tissues, including liver, muscle, heart, brain, skin, and embryonic stem cells further underline variant chromatin accessibility. In summary, AAV integration is dependent on cell-type-specific, variant chromatin accessibility leading to random integration profiles for rAAV, whereas wild-type AAV integration sites cluster near GAGY/C repeats. IMPORTANCE: Adeno-associated virus type 2 (AAV) is assumed to establish latency by chromosomal integration of its DNA. This is the first genome-wide analysis of wild-type AAV2 integration in diploid human cells and the first to compare wild-type to recombinant AAV vector integration side by side under identical experimental conditions. Major determinants of wild-type AAV integration represent open chromatin regions with accessible consensus AAV Rep-binding sites. The variant chromatin accessibility of different human tissues or cell types will

  3. Enhancing cancer clonality analysis with integrative genomics

    PubMed Central

    2015-01-01

    Introduction: It is understood that cancer is a clonal disease initiated by a single cell, and that metastasis, which is the spread of cancer from the primary site, is also initiated by a single cell. The seemingly natural capability of cancer to adapt dynamically in a Darwinian manner is a primary reason for therapeutic failures. Survival advantages may be induced by cancer therapies and also occur as a result of inherent cell and microenvironmental factors. The selected "more fit" clones outmatch their competition and then become dominant in the tumor via propagation of progeny. This clonal expansion leads to relapse, therapeutic resistance and eventually death. The goal of this study is to develop and demonstrate a more detailed clonality approach by utilizing integrative genomics. Methods: Patient tumor samples were profiled by Whole Exome Sequencing (WES) and RNA-seq on an Illumina HiSeq 2500 and methylation profiling was performed on the Illumina Infinium 450K array. STAR and the Haplotype Caller were used for RNA-seq processing. Custom approaches were used for the integration of the multi-omic datasets. Results: Reported are major enhancements to CloneViz, which now provides capabilities enabling a formal tumor multi-dimensional clonality analysis by integrating: i) DNA mutations, ii) RNA expressed mutations, and iii) DNA methylation data. RNA and DNA methylation integration were not previously possible, by CloneViz (previous version) or any other clonality method to date. This new approach, named iCloneViz (integrated CloneViz), employs visualization and quantitative methods, revealing an integrative genomic mutational dissection and traceability (DNA, RNA, epigenetics) through the different layers of molecular structures. Conclusion: The iCloneViz approach can be used for analysis of clonal evolution and mutational dynamics of multi-omic data sets. Revealing tumor clonal complexity in an integrative and quantitative manner facilitates improved mutational

  4. Multidimensional Genome-wide Analyses Show Accurate FVIII Integration by ZFN in Primary Human Cells

    PubMed Central

    Sivalingam, Jaichandran; Kenanov, Dimitar; Han, Hao; Nirmal, Ajit Johnson; Ng, Wai Har; Lee, Sze Sing; Masilamani, Jeyakumar; Phan, Toan Thang; Maurer-Stroh, Sebastian; Kon, Oi Lian

    2016-01-01

    Costly coagulation factor VIII (FVIII) replacement therapy is a barrier to optimal clinical management of hemophilia A. Therapy using FVIII-secreting autologous primary cells is potentially efficacious and more affordable. Zinc finger nucleases (ZFN) mediate transgene integration into the AAVS1 locus but comprehensive evaluation of off-target genome effects is currently lacking. In light of serious adverse effects in clinical trials which employed genome-integrating viral vectors, this study evaluated potential genotoxicity of ZFN-mediated transgenesis using different techniques. We employed deep sequencing of predicted off-target sites, copy number analysis, whole-genome sequencing, and RNA-seq in primary human umbilical cord-lining epithelial cells (CLECs) with AAVS1 ZFN-mediated FVIII transgene integration. We combined molecular features to enhance the accuracy and activity of ZFN-mediated transgenesis. Our data showed a low frequency of ZFN-associated indels, no detectable off-target transgene integrations or chromosomal rearrangements. ZFN-modified CLECs had very few dysregulated transcripts and no evidence of activated oncogenic pathways. We also showed AAVS1 ZFN activity and durable FVIII transgene secretion in primary human dermal fibroblasts, bone marrow- and adipose tissue-derived stromal cells. Our study suggests that, with close attention to the molecular design of genome-modifying constructs, AAVS1 ZFN-mediated FVIII integration in several primary human cell types may be safe and efficacious. PMID:26689265

  5. Site-specific recombination in the chicken genome using Flipase recombinase-mediated cassette exchange.

    PubMed

    Lee, Hong Jo; Lee, Hyung Chul; Kim, Young Min; Hwang, Young Sun; Park, Young Hyun; Park, Tae Sub; Han, Jae Yong

    2016-02-01

    Targeted genome recombination has been applied in diverse research fields and has a wide range of possible applications. In particular, the discovery of specific loci in the genome that support robust and ubiquitous expression of integrated genes and the development of genome-editing technology have facilitated rapid advances in various scientific areas. In this study, we produced transgenic (TG) chickens that can induce recombinase-mediated gene cassette exchange (RMCE), one of the site-specific recombination technologies, and confirmed RMCE in TG chicken-derived cells. As a result, we established TG chicken lines that have Flipase (Flp) recognition target (FRT) pairs in the chicken genome, mediated by piggyBac transposition. The transgene integration patterns were diverse in each TG chicken line, and the integration diversity resulted in diverse levels of expression of exogenous genes in each tissue of the TG chickens. In addition, the replaced gene cassette was expressed successfully and maintained by RMCE in the FRT predominant loci of TG chicken-derived cells. These results indicate that targeted genome recombination technology with RMCE could be adaptable to TG chicken models and that the technology would be applicable to specific gene regulation by cis-element insertion and customized expression of functional proteins at predicted levels without epigenetic influence. PMID:26443821

  6. Domain-mediated protein interaction prediction: From genome to network.

    PubMed

    Reimand, Jüri; Hui, Shirley; Jain, Shobhit; Law, Brian; Bader, Gary D

    2012-08-14

    Protein-protein interactions (PPIs), involved in many biological processes such as cellular signaling, are ultimately encoded in the genome. Solving the problem of predicting protein interactions from the genome sequence will lead to increased understanding of complex networks, evolution and human disease. We can learn the relationship between genomes and networks by focusing on an easily approachable subset of high-resolution protein interactions that are mediated by peptide recognition modules (PRMs) such as PDZ, WW and SH3 domains. This review focuses on computational prediction and analysis of PRM-mediated networks and discusses sequence- and structure-based interaction predictors, techniques and datasets for identifying physiologically relevant PPIs, and interpreting high-resolution interaction networks in the context of evolution and human disease. PMID:22561014

  7. Population Genomics of Infectious and Integrated Wolbachia pipientis Genomes in Drosophila ananassae

    PubMed Central

    Choi, Jae Young; Bubnell, Jaclyn E.; Aquadro, Charles F.

    2015-01-01

    Coevolution between Drosophila and its endosymbiont Wolbachia pipientis has many intriguing aspects. For example, Drosophila ananassae hosts two forms of W. pipientis genomes: one being the infectious bacterial genome and the other integrated into the host nuclear genome. Here, we characterize the infectious and integrated genomes of W. pipientis infecting D. ananassae (wAna), by genome sequencing 15 strains of D. ananassae that have either the infectious or integrated wAna genomes. Results indicate evolutionarily stable maternal transmission for the infectious wAna genome, suggesting a relatively long-term coevolution with its host. In contrast, the integrated wAna genome showed pseudogene-like characteristics, accumulating many variants that are predicted to have deleterious effects if present in an infectious bacterial genome. Phylogenomic analysis of sequence variation together with genotyping by polymerase chain reaction of large structural variations indicated several wAna variants among the eight infectious wAna genomes. In contrast, only a single wAna variant was found among the seven integrated wAna genomes examined in lines from Africa, south Asia, and south Pacific islands, suggesting that the integration occurred once from a single infectious wAna genome and then spread geographically. Further analysis revealed that for all D. ananassae we examined with the integrated wAna genomes, the majority of the integrated wAna genomic regions is represented in at least two copies, suggesting a double integration or a single integration followed by an integrated genome duplication. The possible evolutionary mechanism underlying the widespread geographical presence of the duplicate integration of the wAna genome is an intriguing question remaining to be answered. PMID:26254486

  8. Transposon-mediated Genome Manipulations in Vertebrates

    PubMed Central

    Ivics, Zoltán; Li, Meng Amy; Mátés, Lajos; Boeke, Jef D.; Bradley, Allan; Izsvák, Zsuzsanna

    2010-01-01

    Transposable elements are segments of DNA with the unique ability to move about in the genome. This inherent feature can be exploited to harness these elements as gene vectors for diverse genome manipulations. Transposon-based genetic strategies have been established in vertebrate species over the last decade, and current progress in this field indicates that transposable elements will serve as indispensable tools in the genetic toolkit of vertebrate models. In particular, transposons can be applied as vectors for somatic and germline transgenesis, and as insertional mutagens in both loss-of-function and gain-of-function forward mutagenesis screens. The major advantage of using transposons as genetic tools is that they facilitate analysis of gene function in an easy, controlled and scalable manner. Transposon-based technologies are beginning to be exploited to link sequence information to gene functions in vertebrate models. In this article, we provide an overview of transposon-based methods used in vertebrate model organisms, and highlight the most important considerations concerning genetic applications of the transposon systems. PMID:19478801

  9. Integrated Genomic Characterization of Endometrial Carcinoma

    PubMed Central

    2013-01-01

    Summary We performed an integrated genomic, transcriptomic, and proteomic characterization of 373 endometrial carcinomas using array- and sequencing-based technologies. Uterine serous tumors and ~25% of high-grade endometrioid tumors have extensive copy number alterations, few DNA methylation changes, low ER/PR levels, and frequent TP53 mutations. Most endometrioid tumors have few copy number alterations or TP53 mutations but frequent mutations in PTEN, CTNNB1, PIK3CA, ARID1A, KRAS and novel mutations in the SWI/SNF gene ARID5B. A subset of endometrioid tumors we identified had a dramatically increased transversion mutation frequency, and newly identified hotspot mutations in POLE. Our results classified endometrial cancers into four categories: POLE ultramutated, microsatellite instability hypermutated, copy number low, and copy number high. Uterine serous carcinomas share genomic features with ovarian serous and basal-like breast carcinomas. We demonstrated that the genomic features of endometrial carcinomas permit a reclassification that may impact post-surgical adjuvant treatment for women with aggressive tumors. PMID:23636398

  10. Site-Specific Integration of Exogenous Genes Using Genome Editing Technologies in Zebrafish.

    PubMed

    Kawahara, Atsuo; Hisano, Yu; Ota, Satoshi; Taimatsu, Kiyohito

    2016-01-01

    The zebrafish (Danio rerio) is an ideal vertebrate model to investigate the developmental molecular mechanism of organogenesis and regeneration. Recent innovations in genome editing technologies, such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR associated protein 9 (Cas9) system, have allowed researchers to generate diverse genomic modifications in whole animals and in cultured cells. The CRISPR/Cas9 and TALEN techniques frequently induce DNA double-strand breaks (DSBs) at the targeted gene, resulting in frameshift-mediated gene disruption. As a useful application of genome editing technology, several groups have recently reported efficient site-specific integration of exogenous genes into targeted genomic loci. In this review, we provide an overview of TALEN- and CRISPR/Cas9-mediated site-specific integration of exogenous genes in zebrafish. PMID:27187373

  11. Site-Specific Integration of Exogenous Genes Using Genome Editing Technologies in Zebrafish

    PubMed Central

    Kawahara, Atsuo; Hisano, Yu; Ota, Satoshi; Taimatsu, Kiyohito

    2016-01-01

    The zebrafish (Danio rerio) is an ideal vertebrate model to investigate the developmental molecular mechanism of organogenesis and regeneration. Recent innovations in genome editing technologies, such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR associated protein 9 (Cas9) system, have allowed researchers to generate diverse genomic modifications in whole animals and in cultured cells. The CRISPR/Cas9 and TALEN techniques frequently induce DNA double-strand breaks (DSBs) at the targeted gene, resulting in frameshift-mediated gene disruption. As a useful application of genome editing technology, several groups have recently reported efficient site-specific integration of exogenous genes into targeted genomic loci. In this review, we provide an overview of TALEN- and CRISPR/Cas9-mediated site-specific integration of exogenous genes in zebrafish. PMID:27187373

  12. LINE-1 Retrotransposons: Mediators of Somatic Variation in Neuronal Genomes?

    PubMed Central

    Singer, Tatjana; McConnell, Michael J.; Marchetto, Maria C.N.; Coufal, Nicole G.; Gage, Fred H.

    2010-01-01

    LINE-1 (L1) elements are retrotransposons that insert extra copies of themselves throughout the genome using a “copy and paste” mechanism. L1s have contributed ~20% to total human genome content and are able to influence chromosome integrity and gene expression upon reinsertion. Recent studies show that L1 elements are active and “jumping” during neuronal differentiation. New somatic L1 insertions may generate “genomic plasticity” in neurons by causing variation in genomic DNA sequences and by altering the transcriptome of individual cells. Thus, L1-induced variation may affect neuronal plasticity and behavior. Here, we discuss potential consequences of L1-induced neuronal diversity and propose that a mechanism generating diversity in the brain could broaden the spectrum of behavioral phenotypes that can originate from any single genome. PMID:20471112

  13. Genome integrity, stem cells and hyaluronan

    PubMed Central

    Darzynkiewicz, Zbigniew; Balazs, Endre A.

    2012-01-01

    Faithful preservation of genome integrity is the critical mission of stem cells as well as of germ cells. Reviewed are the following mechanisms involved in protecting DNA in these cells: (a) the efflux machinery that can pump out a variety of genotoxins in an ATP-dependent manner; (b) the mechanisms maintaining minimal metabolic activity, which reduces generation of reactive oxidants, by-products of aerobic respiration; (c) the role of the hypoxic niche of stem cells in providing a gradient of variable oxygen tension; (d) the presence of hyaluronan (HA) and HA receptors on stem cells and in the niche; (e) the role of HA in protecting DNA from oxidative damage; (f) the specific function of HA in protecting DNA in stem cells; (g) the interactions of HA with sperm cells and oocytes that also may shield their DNA from oxidative damage; and (h) mechanisms by which HA exerts its anti-oxidant activity. While HA has a multitude of functions, its anti-oxidant capabilities are often overlooked but may be of significance in preserving the integrity of the stem and germ cell genome. PMID:22383371

  14. MycoCosm, an Integrated Fungal Genomics Resource

    SciTech Connect

    Shabalov, Igor; Grigoriev, Igor

    2012-03-16

    MycoCosm is a web-based interactive fungal genomics resource, which was first released in March 2010, in response to an urgent call from the fungal community for integration of all fungal genomes and analytical tools in one place (Pan-fungal data resources meeting, Feb 21-22, 2010, Alexandria, VA). MycoCosm integrates genomics data and analysis tools to navigate through over 100 fungal genomes sequenced at JGI and elsewhere. This resource allows users to explore fungal genomes in the context of both genome-centric analysis and comparative genomics, and promotes user community participation in data submission, annotation and analysis. MycoCosm has over 4500 unique visitors/month or 35000+ visitors/year as well as hundreds of registered users contributing their data and expertise to this resource. Its scalable architecture allows significant expansion of the data expected from JGI Fungal Genomics Program, its users, and integration with external resources used by fungal community.

  15. Comparative Genome Analysis in the Integrated Microbial Genomes (IMG) System

    SciTech Connect

    Kyrpides, Nikos C.; Markowitz, Victor M.

    2006-03-01

    Comparative genome analysis is critical for the effective exploration of a rapidly growing number of complete and draft sequences for microbial genomes. The Integrated Microbial Genomes (IMG) system (img.jgi.doe.gov) has been developed as a community resource that provides support for comparative analysis of microbial genomes in an integrated context. IMG allows users to navigate the multidimensional microbial genome data space and focus their analysis on a subset of genes, genomes, and functions of interest. IMG provides graphical viewers, summaries and occurrence profile tools for comparing genes, pathways and functions (terms) across specific genomes. Genes can be further examined using gene neighborhoods and compared with sequence alignment tools.

  16. Efficient strategies for TALEN-mediated genome editing in mammalian cell lines.

    PubMed

    Valton, Julien; Cabaniols, Jean-Pierre; Galetto, Romàn; Delacote, Fabien; Duhamel, Marianne; Paris, Sebastien; Blanchard, Domique Alain; Lebuhotel, Céline; Thomas, Séverine; Moriceau, Sandra; Demirdjian, Raffy; Letort, Gil; Jacquet, Adeline; Gariboldi, Annabelle; Rolland, Sandra; Daboussi, Fayza; Juillerat, Alexandre; Bertonati, Claudia; Duclert, Aymeric; Duchateau, Philippe

    2014-09-01

    TALEN is one of the most widely used tools in the field of genome editing. It enables gene integration and gene inactivation in a highly efficient and specific fashion. Although very attractive, the apparent simplicity and high success rate of TALEN could be misleading for novices in the field of gene editing. Depending on the application, specific TALEN designs, activity assessments and screening strategies need to be adopted. Here we report different methods to efficiently perform TALEN-mediated gene integration and inactivation in different mammalian cell systems including induced pluripotent stem cells and delineate experimental examples associated with these approaches. PMID:25047178

  17. LKB1 preserves genome integrity by stimulating BRCA1 expression

    PubMed Central

    Gupta, Romi; Liu, Alex Y.; Glazer, Peter M.; Wajapeyee, Narendra

    2015-01-01

    Serine/threonine kinase 11 (STK11, also known as LKB1) functions as a tumor suppressor in many human cancers. Paradoxically, however, loss of LKB1 in mouse embryonic fibroblasts results in resistance to oncogene-induced transformation. Therefore, it is unclear why loss of LKB1 leads to increased predisposition to develop a wide variety of cancers. Here, we show that LKB1 protects cells from genotoxic stress. Cells lacking LKB1 display increased sensitivity to irradiation, accumulate more DNA double-strand breaks, display defective homology-directed DNA repair (HDR) and exhibit an increased mutation rate compared with LKB1-expressing cells. Conversely, the ectopic expression of LKB1 in cells lacking LKB1 protects them against genotoxic stress-induced DNA damage and prevents the accumulation of mutations. We find that LKB1 post-transcriptionally stimulates expression of the HDR gene BRCA1 by inhibiting the cytoplasmic localization of the RNA-binding protein HU antigen R in an AMP kinase-dependent manner, and stabilizes BRCA1 mRNA. Cells lacking BRCA1, like cells lacking LKB1, display increased genomic instability, and ectopic expression of BRCA1 rescues LKB1 loss-induced sensitivity to genotoxic stress. Collectively, our results demonstrate that LKB1 is a crucial regulator of genome integrity and reveal a novel mechanism for LKB1-mediated tumor suppression with direct therapeutic implications for cancer prevention. PMID:25488815

  18. CRISPR mediated somatic cell genome engineering in the chicken.

    PubMed

    Véron, Nadège; Qu, Zhengdong; Kipen, Phoebe A S; Hirst, Claire E; Marcelle, Christophe

    2015-11-01

    Gene-targeted knockout technologies are invaluable tools for understanding the functions of genes in vivo. The CRISPR/Cas9 system of RNA-guided genome editing is revolutionizing genetics research in a wide spectrum of organisms. Here, we combined CRISPR with in vivo electroporation in the chicken embryo to efficiently target the transcription factor PAX7 in tissues of the developing embryo. This approach generated mosaic genetic mutations within a wild-type cellular background. This series of proof-of-principle experiments indicates that in vivo CRISPR-mediated cell genome engineering is an effective method to achieve gene loss-of-function in the tissues of the chicken embryo, and it completes the growing genetic toolbox to study the molecular mechanisms regulating development in this important animal model. PMID:26277216

  19. Precise in-frame integration of exogenous DNA mediated by CRISPR/Cas9 system in zebrafish.

    PubMed

    Hisano, Yu; Sakuma, Tetsushi; Nakade, Shota; Ohga, Rie; Ota, Satoshi; Okamoto, Hitoshi; Yamamoto, Takashi; Kawahara, Atsuo

    2015-01-01

    The CRISPR/Cas9 system provides a powerful tool for genome editing in various model organisms, including zebrafish. The establishment of targeted gene-disrupted zebrafish (knockouts) is readily achieved by CRISPR/Cas9-mediated genome modification. Recently, exogenous DNA integration into the zebrafish genome via homology-independent DNA repair was reported, but this integration contained various mutations at the junctions of genomic and integrated DNA. Thus, precise genome modification into targeted genomic loci remains to be achieved. Here, we describe efficient, precise CRISPR/Cas9-mediated integration using a donor vector harbouring short homologous sequences (10-40 bp) flanking the genomic target locus. We succeeded in integrating with high efficiency an exogenous mCherry or eGFP gene into targeted genes (tyrosinase and krtt1c19e) in frame. We found the precise in-frame integration of exogenous DNA without backbone vector sequences when Cas9 cleavage sites were introduced at both sides of the left homology arm, the eGFP sequence and the right homology arm. Furthermore, we confirmed that this precise genome modification was heritable. This simple method enables precise targeted gene knock-in in zebrafish. PMID:25740433

  20. Perspectives of integrative cancer genomics in next generation sequencing era.

    PubMed

    Kwon, So Mee; Cho, Hyunwoo; Choi, Ji Hye; Jee, Byul A; Jo, Yuna; Woo, Hyun Goo

    2012-06-01

    The explosive development of genomics technologies including microarrays and next generation sequencing (NGS) has provided comprehensive maps of cancer genomes, including the expression of mRNAs and microRNAs, DNA copy numbers, sequence variations, and epigenetic changes. These genome-wide profiles of genetic aberrations could reveal candidates for diagnostic and/or prognostic biomarkers as well as mechanistic insights into tumor development and progression. Recent efforts to establish a huge cancer genome compendium and integrative omics analyses, so-called "integromics", have extended our understanding of the cancer genome, showing its daunting complexity and heterogeneity. However, the challenges of the structured integration, sharing, and interpretation of big omics data still remain to be resolved. Here, we review several issues raised in cancer omics data analysis, including NGS, focusing particularly on the study design and analysis strategies. This might be helpful to understand the current trends and strategies of the rapidly evolving cancer genomics research. PMID:23105932

  1. Integration of new alternative reference strain genome sequences into the Saccharomyces genome database.

    PubMed

    Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C; Dalusag, Kyla; Demeter, Janos; Engel, Stacia; Hellerstedt, Sage T; Karra, Kalpana; Hitz, Benjamin C; Nash, Robert S; Paskov, Kelley; Sheppard, Travis; Skrzypek, Marek; Weng, Shuai; Wong, Edith; Michael Cherry, J

    2016-01-01

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. To provide a wider scope of genetic and phenotypic variation in yeast, the genome sequences and their corresponding annotations from 11 alternative S. cerevisiae reference strains have been integrated into SGD. Genomic and protein sequence information for genes from these strains are now available on the Sequence and Protein tab of the corresponding Locus Summary pages. We illustrate how these genome sequences can be utilized to aid our understanding of strain-specific functional and phenotypic differences. Database URL: www.yeastgenome.org. PMID:27252399

  2. Integration of new alternative reference strain genome sequences into the Saccharomyces genome database

    PubMed Central

    Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C.; Dalusag, Kyla; Demeter, Janos; Engel, Stacia; Hellerstedt, Sage T.; Karra, Kalpana; Hitz, Benjamin C.; Nash, Robert S.; Paskov, Kelley; Sheppard, Travis; Skrzypek, Marek; Weng, Shuai; Wong, Edith; Michael Cherry, J.

    2016-01-01

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. To provide a wider scope of genetic and phenotypic variation in yeast, the genome sequences and their corresponding annotations from 11 alternative S. cerevisiae reference strains have been integrated into SGD. Genomic and protein sequence information for genes from these strains are now available on the Sequence and Protein tab of the corresponding Locus Summary pages. We illustrate how these genome sequences can be utilized to aid our understanding of strain-specific functional and phenotypic differences. Database URL: www.yeastgenome.org PMID:27252399

  3. Report from the First Snake Genomics and Integrative Biology Meeting

    PubMed Central

    Castoe, Todd A.; Braun, Edward L.; Bronikowski, Anne M.; Cox, Christian L.; Rabosky, Alison R. Davis; Jason de Koning, A.P.; Dobry, Jason; Fujita, Matthew K.; Giorgianni, Matt W.; Hargreaves, Adam; Henkel, Christiaan V.; Mackessy, Stephen P.; O’Meally, Denis; Rokyta, Darin R.; Secor, Stephen M.; Streicher, Jeffrey W.; Wray, Kenneth P.; Yokoyama, Ken D.; Pollock, David D.

    2012-01-01

    This report summarizes the proceedings of the 1st Snake Genomics and Integrative Biology Meeting held in Vail, CO USA, 5-8 October 2011. The meeting had over twenty registered participants, and was conducted as a single session of presentations. Goals of the meeting included coordination of genomic data collection and fostering collaborative interactions among researchers using snakes as model systems. PMID:23451292

  4. Methods for integration site distribution analyses in animal cell genomes

    PubMed Central

    Ciuffi, Angela; Ronen, Keshet; Brady, Troy; Malani, Nirav; Wang, Gary; Berry, Charles C.; Bushman, Frederic D.

    2014-01-01

    The question of where retroviral DNA becomes integrated in chromosomes is important for (i) understanding the mechanisms of viral growth, (ii) devising new anti-retroviral therapies, (iii) understanding how genomes evolve, and (iv) developing safer methods for gene therapy. With the completion of genome sequences for many organisms, it has become possible to study integration targeting by cloning and sequencing large numbers of host–virus DNA junctions, then mapping the host DNA segments back onto the genomic sequence. This allows statistical analysis of the distribution of integration sites relative to the myriad types of genomic features that are also being mapped onto the sequence scaffold. Here we present methods for recovering and analyzing integration site sequences. PMID:19038346
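
    The statistical step described in this record can be illustrated with a minimal sketch. It assumes integration sites are given as integer positions on a single sequence, that genomic features are non-overlapping half-open intervals, and that a uniform random control is adequate; none of this comes from the record itself, which does not specify its statistical procedure. The function names and the permutation scheme are hypothetical.

```python
import bisect
import random


def in_any_interval(pos, starts, ends):
    """True if pos lies in one of the sorted, non-overlapping [start, end) intervals."""
    i = bisect.bisect_right(starts, pos) - 1
    return i >= 0 and pos < ends[i]


def fraction_in_features(sites, intervals):
    """Fraction of integration sites that fall inside any feature interval."""
    intervals = sorted(intervals)
    starts = [s for s, _ in intervals]
    ends = [e for _, e in intervals]
    hits = sum(in_any_interval(p, starts, ends) for p in sites)
    return hits / len(sites)


def enrichment_p_value(sites, intervals, genome_length, n_perm=1000, seed=0):
    """One-sided empirical p-value for enrichment of sites in features.

    Assumption: the control is a set of uniformly random positions of the
    same size as the observed set, redrawn n_perm times.
    """
    rng = random.Random(seed)
    observed = fraction_in_features(sites, intervals)
    at_least = 0
    for _ in range(n_perm):
        control = [rng.randrange(genome_length) for _ in sites]
        if fraction_in_features(control, intervals) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_perm + 1)
```

    A uniform control is the simplest possible choice; a control matched to the way sites were recovered would change only how the `control` list is drawn.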

  5. Integrated proteomic and genomic analysis of colorectal cancer

    Cancer.gov

    Investigators who analyzed 95 human colorectal tumor samples have determined how gene alterations identified in previous analyses of the same samples are expressed at the protein level. The integration of proteomic and genomic data, or proteogenomics, pro

  6. Integrated Genome-Based Studies of Shewanella Ecophysiology

    SciTech Connect

    Zhou, Jizhong; He, Zhili

    2014-04-08

    As a part of the Shewanella Federation project, we have used integrated genomic, proteomic and computational technologies to study various aspects of energy metabolism of two Shewanella strains from a systems-level perspective.

  7. Applied plant genomics: the secret is integration.

    PubMed

    Osterlund, Mark T; Paterson, Andrew H

    2002-04-01

    Although concerted efforts to understand selected botanical models have been made, the resulting basic knowledge varies in its applicability to other diverse species including the major crops. Recent advances in high-throughput genomics are offering new avenues through which to exploit model systems for the study of botanical diversity, providing prospects for crop improvement. In particular, whole-genome sequencing has provided opportunities for the broader application of reverse genetics, expression profiling, and molecular mapping in diverse species. PMID:11856610

  8. Integrated Microbial Genomes (IMG) System from the DOE Joint Genome Institute (JGI)

    DOE Data Explorer

    The integrated microbial genomes (IMG) system is a data management, analysis and annotation platform for all publicly available genomes. IMG contains both draft and complete JGI microbial genomes integrated with all other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and annotating genomes, genes and functions, individually or in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through quarterly releases. IMG is provided by the DOE-Joint Genome Institute (JGI) and is available from http://img.jgi.doe.gov. [Abstract from The integrated microbial genomes (IMG) system in 2007: data content and analysis tool extensions; Victor M. Markowitz, Ernest Szeto, Krishna Palaniappan, Yuri Grechkin, Ken Chu, I-Min A. Chen, Inna Dubchak, Iain Anderson, Athanasios Lykidis, Konstantinos Mavromatis, Natalia N. Ivanova and Nikos C. Kyrpides; Nucleic Acids Research, 2008, Vol. 36 (Database Issue).] See also the companion system, Integrated Microbial Genomes with Microbiome Samples.

  9. Recombination-mediated genetic engineering of large genomic DNA transgenes.

    PubMed

    Ejsmont, Radoslaw Kamil; Ahlfeld, Peter; Pozniakovsky, Andrei; Stewart, A Francis; Tomancak, Pavel; Sarov, Mihail

    2011-01-01

    Faithful gene activity reporters are a useful tool for evo-devo studies enabling selective introduction of specific loci between species and assaying the activity of large gene regulatory sequences. The use of large genomic constructs such as BACs and fosmids provides an efficient platform for exploration of gene function under endogenous regulatory control. Despite their large size they can be easily engineered using in vivo homologous recombination in Escherichia coli (recombineering). We have previously demonstrated that the efficiency and fidelity of recombineering are sufficient to allow high-throughput transgene engineering in liquid culture, and have successfully applied this approach in several model systems. Here, we present a detailed protocol for recombineering of BAC/fosmid transgenes for expression of fluorescent or affinity tagged proteins in Drosophila under endogenous in vivo regulatory control. The tag coding sequence is seamlessly recombineered into the genomic region contained in the BAC/fosmid clone, which is then integrated into the fly genome using ϕC31 recombination. This protocol can be easily adapted to other recombineering projects. PMID:22065454

  10. CRISPR-Cas9-Mediated Genome Editing in Leishmania donovani

    PubMed Central

    Zhang, Wen-Wei

    2015-01-01

    The prokaryotic CRISPR (clustered regularly interspaced short palindromic repeat)-Cas9, an RNA-guided endonuclease, has been shown to mediate efficient genome editing in a wide variety of organisms. In the present study, the CRISPR-Cas9 system has been adapted to Leishmania donovani, a protozoan parasite that causes fatal human visceral leishmaniasis. We introduced the Cas9 nuclease into L. donovani and generated guide RNA (gRNA) expression vectors by using the L. donovani rRNA promoter and the hepatitis delta virus (HDV) ribozyme. We demonstrated that L. donovani mainly used homology-directed repair (HDR) and microhomology-mediated end joining (MMEJ) to repair the Cas9 nuclease-created double-strand DNA break (DSB). The nonhomologous end-joining (NHEJ) pathway appears to be absent in L. donovani. With this CRISPR-Cas9 system, it was possible to generate knockouts without selection by insertion of an oligonucleotide donor with stop codons and 25-nucleotide homology arms into the Cas9 cleavage site. Likewise, we disrupted and precisely tagged endogenous genes by inserting a bleomycin drug selection marker and GFP gene into the Cas9 cleavage site. With the use of Hammerhead and HDV ribozymes, a double-gRNA expression vector that further improved gene-targeting efficiency was developed, and it was used to make a precise deletion of the 3-kb miltefosine transporter gene (LdMT). In addition, this study identified a novel single point mutation caused by CRISPR-Cas9 in LdMT (M381T) that led to miltefosine resistance, a concern for the only available oral antileishmanial drug. Together, these results demonstrate that the CRISPR-Cas9 system represents an effective genome engineering tool for L. donovani. PMID:26199327
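
    The oligonucleotide donor mentioned in this record (stop codons flanked by 25-nucleotide homology arms at the Cas9 cleavage site) can be sketched as simple string assembly. The payload sequence, the function names, and the choice to place a stop codon in each of the three forward reading frames are assumptions made for illustration; the record does not give the actual donor sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Hypothetical payload: "TAA" occurs at offsets 0, 4 and 8, so each of the
# three forward reading frames contains a stop codon.
STOP_PAYLOAD = "TAAATAAATAA"


def has_stop_in_all_frames(seq: str) -> bool:
    """Check that every forward reading frame of seq contains a stop codon."""
    return all(
        any(seq[i:i + 3] in STOP_CODONS for i in range(frame, len(seq) - 2, 3))
        for frame in range(3)
    )


def build_donor(genome: str, cut_index: int, arm_len: int = 25,
                payload: str = STOP_PAYLOAD) -> str:
    """Assemble left arm + payload + right arm around a cleavage site.

    The cut is assumed to fall between genome[cut_index - 1] and
    genome[cut_index]; the arms are copied directly from the flanks.
    """
    if cut_index < arm_len or cut_index + arm_len > len(genome):
        raise ValueError("not enough flanking sequence for the homology arms")
    left_arm = genome[cut_index - arm_len:cut_index]
    right_arm = genome[cut_index:cut_index + arm_len]
    return left_arm + payload + right_arm
```

    With the default 25-nt arms and the 11-nt payload above, the resulting donor is 61 nt long.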

  11. Integrated genomic characterization of IDH1-mutant glioma malignant progression

    PubMed Central

    Bai, Hanwen; Harmanci, Akdes Serin; Erson-Omay, E Zeynep; Li, Jie; Coşkun, Süleyman; Simon, Matthias; Krischek, Boris; Özduman, Koray; Omay, S Bülent; Sorensen, Eric A; Turcan, Şevin; Bakırcığlu, Mehmet; Carrión-Grant, Geneive; Murray, Phillip B; Clark, Victoria E; Ercan-Sencicek, A Gulhan; Knight, James; Sencar, Leman; Altınok, Selin; Kaulen, Leon D; Gülez, Burcu; Timmer, Marco; Schramm, Johannes; Mishra-Gorur, Ketu; Henegariu, Octavian; Moliterno, Jennifer; Louvi, Angeliki; Chan, Timothy A; Tannheimer, Stacey L; Pamir, M Necmettin; Vortmeyer, Alexander O; Bilguvar, Kaya; Yasuno, Katsuhito; Günel, Murat

    2016-01-01

    Gliomas represent approximately 30% of all central nervous system tumors and 80% of malignant brain tumors. To understand the molecular mechanisms underlying the malignant progression of low-grade gliomas with mutations in IDH1 (encoding isocitrate dehydrogenase 1), we studied paired tumor samples from 41 patients, comparing higher-grade, progressed samples to their lower-grade counterparts. Integrated genomic analyses, including whole-exome sequencing and copy number, gene expression and DNA methylation profiling, demonstrated nonlinear clonal expansion of the original tumors and identified oncogenic pathways driving progression. These include activation of the MYC and RTK-RAS-PI3K pathways and upregulation of the FOXM1- and E2F2-mediated cell cycle transitions, as well as epigenetic silencing of developmental transcription factor genes bound by Polycomb repressive complex 2 in human embryonic stem cells. Our results not only provide mechanistic insight into the genetic and epigenetic mechanisms driving glioma progression but also identify inhibition of the bromodomain and extraterminal (BET) family as a potential therapeutic approach. PMID:26618343

  12. Genome Instability Mediates the Loss of Key Traits by Acinetobacter baylyi ADP1 during Laboratory Evolution

    PubMed Central

    Renda, Brian A.; Dasgupta, Aurko; Leon, Dacia

    2014-01-01

    Acinetobacter baylyi ADP1 has the potential to be a versatile bacterial host for synthetic biology because it is naturally transformable. To examine the genetic reliability of this desirable trait and to understand the potential stability of other engineered capabilities, we propagated ADP1 for 1,000 generations of growth in rich nutrient broth and analyzed the genetic changes that evolved by whole-genome sequencing. Substantially reduced transformability and increased cellular aggregation evolved during the experiment. New insertions of IS1236 transposable elements and IS1236-mediated deletions led to these phenotypes in most cases and were common overall among the selected mutations. We also observed a 49-kb deletion of a prophage region that removed an integration site, which has been used for genome engineering, from every evolved genome. The comparatively low rates of these three classes of mutations in lineages that were propagated with reduced selection for 7,500 generations indicate that they increase ADP1 fitness under common laboratory growth conditions. Our results suggest that eliminating transposable elements and other genetic failure modes that affect key organismal traits is essential for improving the reliability of metabolic engineering and genome editing in undomesticated microbial hosts, such as Acinetobacter baylyi ADP1. PMID:25512307

  13. Amplification, Next-generation Sequencing, and Genomic DNA Mapping of Retroviral Integration Sites.

    PubMed

    Serrao, Erik; Cherepanov, Peter; Engelman, Alan N

    2016-01-01

    Retroviruses exhibit signature integration preferences on both the local and global scales. Here, we present a detailed protocol for (1) generation of diverse libraries of retroviral integration sites using ligation-mediated PCR (LM-PCR) amplification and next-generation sequencing (NGS), (2) mapping the genomic location of each virus-host junction using BEDTools, and (3) analyzing the data for statistical relevance. Genomic DNA extracted from infected cells is fragmented by digestion with restriction enzymes or by sonication. After suitable DNA end-repair, double-stranded linkers are ligated onto the DNA ends, and semi-nested PCR is conducted using primers complementary to both the long terminal repeat (LTR) end of the virus and the ligated linker DNA. The PCR primers carry sequences required for DNA clustering during NGS, negating the requirement for separate adapter ligation. Quality control (QC) is conducted to assess DNA fragment size distribution and adapter DNA incorporation prior to NGS. Sequence output files are filtered for LTR-containing reads, and the sequences defining the LTR and the linker are cropped away. Trimmed host cell sequences are mapped to a reference genome using BLAT and are filtered for minimally 97% identity to a unique point in the reference genome. Unique integration sites are scrutinized for adjacent nucleotide (nt) sequence and distribution relative to various genomic features. Using this protocol, integration site libraries of high complexity can be constructed from genomic DNA in three days. The entire protocol that encompasses exogenous viral infection of susceptible tissue culture cells to integration site analysis can therefore be conducted in approximately one to two weeks. Recent applications of this technology pertain to longitudinal analysis of integration sites from HIV-infected patients. PMID:27023428
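
    The filtering step in this protocol (at least 97% identity to a unique point in the reference genome) can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline. The `Hit` record, its field names, and the function `unique_integration_sites` are hypothetical, and the sketch assumes that alignments have already been parsed from the aligner output into per-read hits with a percent-identity value. It also assumes one reading of "unique": a read is kept only if all of its qualifying hits agree on a single locus.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple, Tuple


class Hit(NamedTuple):
    """One alignment of a trimmed host sequence to the reference (hypothetical record)."""
    read_id: str
    chrom: str
    pos: int         # coordinate of the virus-host junction
    strand: str      # "+" or "-"
    identity: float  # percent identity of the alignment


def unique_integration_sites(hits: Iterable[Hit],
                             min_identity: float = 97.0) -> List[Tuple[str, int, str]]:
    """Return sorted, de-duplicated integration sites from uniquely mapped reads."""
    qualifying = defaultdict(set)
    for hit in hits:
        # Discard alignments below the identity threshold.
        if hit.identity >= min_identity:
            qualifying[hit.read_id].add((hit.chrom, hit.pos, hit.strand))

    sites = set()
    for loci in qualifying.values():
        # Keep a read only if it points to exactly one locus in the genome.
        if len(loci) == 1:
            sites.update(loci)
    return sorted(sites)


if __name__ == "__main__":
    example = [
        Hit("r1", "chr1", 100, "+", 99.0),
        Hit("r2", "chr1", 100, "+", 98.0),   # same site as r1, collapsed
        Hit("r3", "chr2", 50, "-", 97.5),    # maps to two loci, dropped
        Hit("r3", "chr5", 70, "+", 97.2),
        Hit("r4", "chrX", 10, "-", 90.0),    # below threshold, dropped
    ]
    print(unique_integration_sites(example))
```

    Collapsing reads to a set of (chromosome, position, strand) tuples counts each integration site once, however many reads support it.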

  14. A physical map of the papaya genome with integrated genetic map and genome sequence

    PubMed Central

    2009-01-01

    Background Papaya is a major fruit crop in tropical and subtropical regions worldwide and has primitive sex chromosomes controlling sex determination in this trioecious species. The papaya genome was recently sequenced because of its agricultural importance, unique biological features, and successful application of transgenic papaya for resistance to papaya ringspot virus. As a part of the genome sequencing project, we constructed a BAC-based physical map using a high information-content fingerprinting approach to assist whole genome shotgun sequence assembly. Results The physical map consists of 963 contigs, representing 9.4× genome equivalents, and was integrated with the genetic map and genome sequence using BAC end sequences and a sequence-tagged high-density genetic map. The estimated genome coverage of the physical map is about 95.8%, while 72.4% of the genome was aligned to the genetic map. A total of 1,181 high quality overgo (overlapping oligonucleotide) probes representing conserved sequences in Arabidopsis and genetically mapped loci in Brassica were anchored on the physical map, which provides a foundation for comparative genomics in the Brassicales. The integrated genetic and physical map aligned with the genome sequence revealed recombination hotspots as well as regions suppressed for recombination across the genome, particularly on the recently evolved sex chromosomes. Suppression of recombination spread to the adjacent region of the male specific region of the Y chromosome (MSY), and recombination rates were recovered gradually and then exceeded the genome average. Recombination hotspots were observed at about 10 Mb away on both sides of the MSY, showing 7-fold increase compared with the genome wide average, demonstrating the dynamics of recombination of the sex chromosomes. Conclusion A BAC-based physical map of papaya was constructed and integrated with the genetic map and genome sequence. The integrated map facilitated the draft genome assembly

  15. Roles of DNA helicases in the maintenance of genome integrity

    PubMed Central

    Bochman, Matthew L

    2014-01-01

    Genome integrity is achieved and maintained by the sum of all of the processes in the cell that ensure the faithful duplication and repair of DNA, as well as its genetic transmission from one cell division to the next. As central players in virtually all of the DNA transactions that occur in vivo, DNA helicases (molecular motors that unwind double-stranded DNA to produce single-stranded substrates) represent a crucial enzyme family that is necessary for genomic stability. Indeed, mutations in many human helicase genes are linked to a variety of diseases with symptoms that can be generally described as genomic instability, such as predispositions to cancers. This review focuses on the roles of both DNA replication helicases and recombination/repair helicases in maintaining genome integrity and provides a brief overview of the diseases related to defects in these enzymes. PMID:27308340

  16. Overview of the Integrated Genomic Data system (IGD)

    SciTech Connect

    Hagstrom, R.; Overbeek, R.; Price, M.; Micheals, G.S.; Taylor, R.

    1992-12-01

    In previous work, we developed a database system to support analysis of the E. coli genome. That system provided a pidgin-English query facility, rudimentary pattern-matching capabilities, and the ability to rapidly extract answers to a wide variety of questions about the organization of the E. coli genome. To enable the comparative analysis of the genomes from different species, we have designed and implemented a new prototype database system, called the Integrated Genomic Database (IGD). IGD extends our earlier effort by incorporating a set of curator's tools that facilitate the incorporation of physical and genetic data, together with the results of genome organization analysis, into a common database system. Additional tools for extracting, manipulating, and analyzing data are planned.

  17. Overview of the Integrated Genomic Data system (IGD)

    SciTech Connect

    Hagstrom, R.; Overbeek, R.; Price, M.; Micheals, G.S.; Taylor, R. (Div. of Computer Resources and Technology)

    1992-01-01

    In previous work, we developed a database system to support analysis of the E. coli genome. That system provided a pidgin-English query facility, rudimentary pattern-matching capabilities, and the ability to rapidly extract answers to a wide variety of questions about the organization of the E. coli genome. To enable the comparative analysis of the genomes from different species, we have designed and implemented a new prototype database system, called the Integrated Genomic Database (IGD). IGD extends our earlier effort by incorporating a set of curator's tools that facilitate the incorporation of physical and genetic data, together with the results of genome organization analysis, into a common database system. Additional tools for extracting, manipulating, and analyzing data are planned.

  18. Orchidstra: an integrated orchid functional genomics database.

    PubMed

    Su, Chun-lin; Chao, Ya-Ting; Yen, Shao-Hua; Chen, Chun-Yi; Chen, Wan-Chieh; Chang, Yao-Chien Alex; Shih, Ming-Che

    2013-02-01

    A specialized orchid database, named Orchidstra (URL: http://orchidstra.abrc.sinica.edu.tw), has been constructed to collect, annotate and share genomic information for orchid functional genomics studies. The Orchidaceae is a large family of Angiosperms that exhibits extraordinary biodiversity in terms of both the number of species and their distribution worldwide. Orchids exhibit many unique biological features; however, investigation of these traits is currently constrained due to the limited availability of genomic information. Transcriptome information for five orchid species and one commercial hybrid has been included in the Orchidstra database. Altogether, these comprise >380,000 non-redundant orchid transcript sequences, of which >110,000 are protein-coding genes. Sequences from the transcriptome shotgun assembly (TSA) were obtained either from output reads from next-generation sequencing technologies assembled into contigs, or from conventional cDNA library approaches. An annotation pipeline using Gene Ontology, KEGG and Pfam was built to assign gene descriptions and functional annotation to protein-coding genes. Deep sequencing of small RNA was also performed for Phalaenopsis aphrodite to search for microRNAs (miRNAs), extending the information archived for this species to miRNA annotation, precursors and putative target genes. The P. aphrodite transcriptome information was further used to design probes for an oligonucleotide microarray, and expression profiling analysis was carried out. The intensities of hybridized probes derived from microarray assays of various tissues were incorporated into the database as part of the functional evidence. In the future, the content of the Orchidstra database will be expanded with transcriptome data and genomic information from more orchid species. PMID:23324169

  19. Identifying potential cancer driver genes by genomic data integration

    NASA Astrophysics Data System (ADS)

    Chen, Yong; Hao, Jingjing; Jiang, Wei; He, Tong; Zhang, Xuegong; Jiang, Tao; Jiang, Rui

    2013-12-01

    Cancer is a genomic disease associated with a plethora of gene mutations resulting in a loss of control over vital cellular functions. Among these mutated genes, driver genes are defined as being causally linked to oncogenesis, while passenger genes are thought to be irrelevant for cancer development. With increasing numbers of large-scale genomic datasets available, integrating these genomic data to identify driver genes from aberration regions of cancer genomes becomes an important goal of cancer genome analysis and investigations into mechanisms responsible for cancer development. A computational method, MAXDRIVER, is proposed here to identify potential driver genes on the basis of copy number aberration (CNA) regions of cancer genomes, by integrating publicly available human genomic data. MAXDRIVER employs several optimization strategies to construct a heterogeneous network, by means of combining a fused gene functional similarity network, gene-disease associations and a disease phenotypic similarity network. MAXDRIVER was validated to effectively recall known associations among genes and cancers. Previously identified as well as novel driver genes were detected by scanning CNAs of breast cancer, melanoma and liver carcinoma. Three predicted driver genes (CDKN2A, AKT1, RNF139) were found common in these three cancers by comparative analysis.

  20. Mutation Detection with Next-Generation Resequencing through a Mediator Genome

    SciTech Connect

    Wurtzel, Omri; Dori-Bachash, Mally; Pietrokovski, Shmuel; Jurkevitch, Edouard; Sorek, Rotem

    2010-12-20

    The affordability of next generation sequencing (NGS) is transforming the field of mutation analysis in bacteria. The genetic basis for phenotype alteration can be identified directly by sequencing the entire genome of the mutant and comparing it to the wild-type (WT) genome, thus identifying acquired mutations. A major limitation for this approach is the need for an a-priori sequenced reference genome for the WT organism, as the short reads of most current NGS approaches usually prohibit de-novo genome assembly. To overcome this limitation we propose a general framework that utilizes the genome of relative organisms as mediators for comparing WT and mutant bacteria. Under this framework, both mutant and WT genomes are sequenced with NGS, and the short sequencing reads are mapped to the mediator genome. Variations between the mutant and the mediator that recur in the WT are ignored, thus pinpointing the differences between the mutant and the WT. To validate this approach we sequenced the genome of Bdellovibrio bacteriovorus 109J, an obligatory bacterial predator, and its prey-independent mutant, and compared both to the mediator species Bdellovibrio bacteriovorus HD100. Although the mutant and the mediator sequences differed in more than 28,000 nucleotide positions, our approach enabled pinpointing the single causative mutation. Experimental validation in 53 additional mutants further established the implicated gene. Our approach extends the applicability of NGS-based mutant analyses beyond the domain of available reference genomes.
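
    The core of the mediator-genome framework described in this abstract (ignore every mutant-versus-mediator variation that recurs in the wild type) amounts to a set difference, sketched below in Python. This is an illustrative reduction only. The variant representation as `(position, reference_base, alternate_base)` tuples and the function name `mutant_specific_variants` are assumptions, and the sketch presumes that variant calling against the mediator genome has already been done for both strains.

```python
from typing import Iterable, List, Tuple

# A variant relative to the mediator genome: (position, mediator base, observed base).
Variant = Tuple[int, str, str]


def mutant_specific_variants(mutant_vs_mediator: Iterable[Variant],
                             wt_vs_mediator: Iterable[Variant]) -> List[Variant]:
    """Return variants seen in the mutant but not in the wild type.

    Differences shared by mutant and wild type reflect divergence from the
    mediator strain, not the acquired mutation, so they are discarded.
    """
    return sorted(set(mutant_vs_mediator) - set(wt_vs_mediator))


if __name__ == "__main__":
    # Toy data: three strain-level differences shared by both, one acquired mutation.
    wild_type = [(120, "A", "G"), (455, "C", "T"), (980, "G", "A")]
    mutant = wild_type + [(702, "T", "C")]
    print(mutant_specific_variants(mutant, wild_type))
```

    In the study, the mutant and the mediator differed at more than 28,000 nucleotide positions, so nearly all calls are removed by this subtraction and only the candidate causative change remains.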

  1. Integrator mediates the biogenesis of enhancer RNAs

    PubMed Central

    Lai, Fan; Gardini, Alessandro; Zhang, Anda; Shiekhattar, Ramin

    2015-01-01

    Integrator is a multi-subunit complex stably associated with the C-terminal domain (CTD) of RNA polymerase II (RNAPII) 1. Integrator is endowed with a core catalytic RNA endonuclease activity, which is required for the 3′-end processing of non-polyadenylated RNAPII-dependent uridylate-rich small nuclear RNA genes (UsnRNAs) 1. Here, we examined the requirement of Integrator in the biogenesis of transcripts derived from distal regulatory elements (enhancers) involved in tissue- and temporal-specific regulation of gene expression 2–5. Integrator is recruited to enhancers and super-enhancers in a stimulus-dependent manner. Functional depletion of Integrator subunits diminishes the signal-dependent induction of eRNAs and abrogates the stimulus-induced enhancer-promoter chromatin looping. Global nuclear run-on and RNAPII profiling reveals a role for Integrator in 3′-end cleavage of eRNA primary transcripts leading to transcriptional termination. In the absence of Integrator, eRNAs remain bound to RNAPII and their primary transcripts accumulate. Importantly, the induction of eRNAs and gene expression responsiveness require the catalytic activity of the Integrator complex. We propose a role for Integrator in biogenesis of eRNAs and enhancer function in metazoans. PMID:26308897

  2. Integrated genomic characterization of papillary thyroid carcinoma.

    PubMed

    2014-10-23

    Papillary thyroid carcinoma (PTC) is the most common type of thyroid cancer. Here, we describe the genomic landscape of 496 PTCs. We observed a low frequency of somatic alterations (relative to other carcinomas) and extended the set of known PTC driver alterations to include EIF1AX, PPM1D, and CHEK2 and diverse gene fusions. These discoveries reduced the fraction of PTC cases with unknown oncogenic driver from 25% to 3.5%. Combined analyses of genomic variants, gene expression, and methylation demonstrated that different driver groups lead to different pathologies with distinct signaling and differentiation characteristics. Similarly, we identified distinct molecular subgroups of BRAF-mutant tumors, and multidimensional analyses highlighted a potential involvement of oncomiRs in less-differentiated subgroups. Our results propose a reclassification of thyroid cancers into molecular subtypes that better reflect their underlying signaling and differentiation properties, which has the potential to improve their pathological classification and better inform the management of the disease. PMID:25417114

  3. Integrated Genomic Characterization of Papillary Thyroid Carcinoma

    PubMed Central

    Agrawal, Nishant; Akbani, Rehan; Aksoy, B. Arman; Ally, Adrian; Arachchi, Harindra; Asa, Sylvia L.; Auman, J. Todd; Balasundaram, Miruna; Balu, Saianand; Baylin, Stephen B.; Behera, Madhusmita; Bernard, Brady; Beroukhim, Rameen; Bishop, Justin A.; Black, Aaron D.; Bodenheimer, Tom; Boice, Lori; Bootwalla, Moiz S.; Bowen, Jay; Bowlby, Reanne; Bristow, Christopher A.; Brookens, Robin; Brooks, Denise; Bryant, Robert; Buda, Elizabeth; Butterfield, Yaron S.N.; Carling, Tobias; Carlsen, Rebecca; Carter, Scott L.; Carty, Sally E.; Chan, Timothy A.; Chen, Amy Y.; Cherniack, Andrew D.; Cheung, Dorothy; Chin, Lynda; Cho, Juok; Chu, Andy; Chuah, Eric; Cibulskis, Kristian; Ciriello, Giovanni; Clarke, Amanda; Clayman, Gary L.; Cope, Leslie; Copland, John; Covington, Kyle; Danilova, Ludmila; Davidsen, Tanja; Demchok, John A.; DiCara, Daniel; Dhalla, Noreen; Dhir, Rajiv; Dookran, Sheliann S.; Dresdner, Gideon; Eldridge, Jonathan; Eley, Greg; El-Naggar, Adel K.; Eng, Stephanie; Fagin, James A.; Fennell, Timothy; Ferris, Robert L.; Fisher, Sheila; Frazer, Scott; Frick, Jessica; Gabriel, Stacey B.; Ganly, Ian; Gao, Jianjiong; Garraway, Levi A.; Gastier-Foster, Julie M.; Getz, Gad; Gehlenborg, Nils; Ghossein, Ronald; Gibbs, Richard A.; Giordano, Thomas J.; Gomez-Hernandez, Karen; Grimsby, Jonna; Gross, Benjamin; Guin, Ranabir; Hadjipanayis, Angela; Harper, Hollie A.; Hayes, D. Neil; Heiman, David I.; Herman, James G.; Hoadley, Katherine A.; Hofree, Matan; Holt, Robert A.; Hoyle, Alan P.; Huang, Franklin W.; Huang, Mei; Hutter, Carolyn M.; Ideker, Trey; Iype, Lisa; Jacobsen, Anders; Jefferys, Stuart R.; Jones, Corbin D.; Jones, Steven J.M.; Kasaian, Katayoon; Kebebew, Electron; Khuri, Fadlo R.; Kim, Jaegil; Kramer, Roger; Kreisberg, Richard; Kucherlapati, Raju; Kwiatkowski, David J.; Ladanyi, Marc; Lai, Phillip H.; Laird, Peter W.; Lander, Eric; Lawrence, Michael S.; Lee, Darlene; Lee, Eunjung; Lee, Semin; Lee, William; Leraas, Kristen M.; Lichtenberg, Tara M.; Lichtenstein, Lee; Lin, Pei; Ling, Shiyun; Liu, Jinze; Liu, Wenbin; Liu, Yingchun; LiVolsi, Virginia A.; Lu, Yiling; Ma, Yussanne; Mahadeshwar, Harshad S.; Marra, Marco A.; Mayo, Michael; McFadden, David G.; Meng, Shaowu; Meyerson, Matthew; Mieczkowski, Piotr A.; Miller, Michael; Mills, Gordon; Moore, Richard A.; Mose, Lisle E.; Mungall, Andrew J.; Murray, Bradley A.; Nikiforov, Yuri E.; Noble, Michael S.; Ojesina, Akinyemi I.; Owonikoko, Taofeek K.; Ozenberger, Bradley A.; Pantazi, Angeliki; Parfenov, Michael; Park, Peter J.; Parker, Joel S.; Paull, Evan O.; Pedamallu, Chandra Sekhar; Perou, Charles M.; Prins, Jan F.; Protopopov, Alexei; Ramalingam, Suresh S.; Ramirez, Nilsa C.; Ramirez, Ricardo; Raphael, Benjamin J.; Rathmell, W. Kimryn; Ren, Xiaojia; Reynolds, Sheila M.; Rheinbay, Esther; Ringel, Matthew D.; Rivera, Michael; Roach, Jeffrey; Robertson, A. Gordon; Rosenberg, Mara W.; Rosenthall, Matthew; Sadeghi, Sara; Saksena, Gordon; Sander, Chris; Santoso, Netty; Schein, Jacqueline E.; Schultz, Nikolaus; Schumacher, Steven E.; Seethala, Raja R.; Seidman, Jonathan; Senbabaoglu, Yasin; Seth, Sahil; Sharpe, Samantha; Mills Shaw, Kenna R.; Shen, John P.; Shen, Ronglai; Sherman, Steven; Sheth, Margi; Shi, Yan; Shmulevich, Ilya; Sica, Gabriel L.; Simons, Janae V.; Sipahimalani, Payal; Smallridge, Robert C.; Sofia, Heidi J.; Soloway, Matthew G.; Song, Xingzhi; Sougnez, Carrie; Stewart, Chip; Stojanov, Petar; Stuart, Joshua M.; Tabak, Barbara; Tam, Angela; Tan, Donghui; Tang, Jiabin; Tarnuzzer, Roy; Taylor, Barry S.; Thiessen, Nina; Thorne, Leigh; Thorsson, Vésteinn; Tuttle, R. Michael; Umbricht, Christopher B.; Van Den Berg, David J.; Vandin, Fabio; Veluvolu, Umadevi; Verhaak, Roel G.W.; Vinco, Michelle; Voet, Doug; Walter, Vonn; Wang, Zhining; Waring, Scot; Weinberger, Paul M.; Weinstein, John N.; Weisenberger, Daniel J.; Wheeler, David; Wilkerson, Matthew D.; Wilson, Jocelyn; Williams, Michelle; Winer, Daniel A.; Wise, Lisa; Wu, Junyuan; Xi, Liu; Xu, Andrew W.; Yang, Liming; Yang, Lixing; Zack, Travis I.; Zeiger, Martha A.; Zeng, Dong; Zenklusen, Jean Claude; Zhao, Ni; Zhang, Hailei; Zhang, Jianhua; Zhang, Jiashan (Julia); Zhang, Wei; Zmuda, Erik; Zou, Lihua

    2014-01-01

    Summary Papillary thyroid carcinoma (PTC) is the most common type of thyroid cancer. Here, we describe the genomic landscape of 496 PTCs. We observed a low frequency of somatic alterations (relative to other carcinomas) and extended the set of known PTC driver alterations to include EIF1AX, PPM1D and CHEK2 and diverse gene fusions. These discoveries reduced the fraction of PTC cases with unknown oncogenic driver from 25% to 3.5%. Combined analyses of genomic variants, gene expression, and methylation demonstrated that different driver groups lead to different pathologies with distinct signaling and differentiation characteristics. Similarly, we identified distinct molecular subgroups of BRAF-mutant tumors and multidimensional analyses highlighted a potential involvement of oncomiRs in less-differentiated subgroups. Our results propose a reclassification of thyroid cancers into molecular subtypes that better reflect their underlying signaling and differentiation properties, which has the potential to improve their pathological classification and better inform the management of the disease. PMID:25417114

  4. Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace

    PubMed Central

    Thorvaldsdottir, Helga; Liefeld, Ted; Ocana, Marco; Borges-Rivera, Diego; Pochet, Nathalie; Robinson, James T.; Demchak, Barry; Hull, Tim; Ben-Artzi, Gil; Blankenberg, Daniel; Barber, Galt P.; Lee, Brian T.; Kuhn, Robert M.; Nekrutenko, Anton; Segal, Eran; Ideker, Trey; Reich, Michael; Regev, Aviv; Chang, Howard Y.; Mesirov, Jill P.

    2015-01-01

    Integrative analysis of multiple data types to address complex biomedical questions requires the use of multiple software tools in concert and remains an enormous challenge for most of the biomedical research community. Here we introduce GenomeSpace (http://www.genomespace.org), a cloud-based, cooperative community resource. Seeded as a collaboration of six of the most popular genomics analysis tools, GenomeSpace now supports the streamlined interaction of 20 bioinformatics tools and data resources. To help non-programming users leverage GenomeSpace in integrative analysis, it offers a growing set of ‘recipes’, short workflows involving a few tools and steps to guide investigators through high utility analysis tasks. PMID:26780094

  5. G protein-coupled receptors: extranuclear mediators for the non-genomic actions of steroids.

    PubMed

    Wang, Chen; Liu, Yi; Cao, Ji-Min

    2014-01-01

    Steroid hormones possess two distinct actions, a delayed genomic effect and a rapid non-genomic effect. Rapid steroid-triggered signaling is mediated by specific receptors localized most often to the plasma membrane. The nature of these receptors is of great interest and accumulated data suggest that G protein-coupled receptors (GPCRs) are appealing candidates. Increasing evidence regarding the interaction between steroids and specific membrane proteins, as well as the involvement of G protein and corresponding downstream signaling, has led to identification of physiologically relevant GPCRs as steroid extranuclear receptors. Examples include G protein-coupled receptor 30 (GPR30) for estrogen, membrane progestin receptor for progesterone, G protein-coupled receptor family C group 6 member A (GPRC6A) and zinc transporter member 9 (ZIP9) for androgen, and trace amine associated receptor 1 (TAAR1) for thyroid hormone. These receptor-mediated biological effects have been extended to reproductive development, cardiovascular function, neuroendocrinology and cancer pathophysiology. However, although great progress has been achieved, there are still important questions that need to be answered, including the identities of GPCRs responsible for the remaining steroids (e.g., glucocorticoid), the structural basis of steroids and GPCRs' interaction and the integration of extranuclear and nuclear signaling to the final physiological function. Here, we reviewed several significant developments in this field and highlighted a hypothesis that attempts to explain the general interaction between steroids and GPCRs. PMID:25257522

  6. Integration of cancer genomics with treatment selection: from the genome to predictive biomarkers

    PubMed Central

    Ow, Thomas J.; Sandulache, Vlad C.; Skinner, Heath D.; Myers, Jeffrey N.

    2013-01-01

    The field of cancer genomics is rapidly advancing as new technology provides detailed genetic and epigenetic profiling of human cancers. The amount of new data available describing the genetic make-up of tumors is paralleled by rapid advances in drug discovery and molecular therapy currently under investigation to treat these diseases. This review summarizes the challenges and approaches associated with the integration of genomic data into the development of new biomarkers in the management of cancer. PMID:24037788

  7. The genomic basis of vomeronasal-mediated behaviour.

    PubMed

    Ibarra-Soria, Ximena; Levitin, Maria O; Logan, Darren W

    2014-02-01

    The vomeronasal organ (VNO) is a chemosensory subsystem found in the nose of most mammals. It is principally tasked with detecting pheromones and other chemical signals that initiate innate behavioural responses. The VNO expresses subfamilies of vomeronasal receptors (VRs) in a cell-specific manner: each sensory neuron expresses just one or two receptors and silences all the other receptor genes. VR genes vary greatly in number within mammalian genomes, from no functional genes in some primates to many hundreds in rodents. They bind semiochemicals, some of which are also encoded in gene families that are coexpanded in species with correspondingly large VR repertoires. Protein and peptide cues that activate the VNO tend to be expressed in exocrine tissues in sexually dimorphic, and sometimes individually variable, patterns. Few chemical ligand-VR-behaviour relationships have been fully elucidated to date, largely due to technical difficulties in working with large, homologous gene families with high sequence identity. However, analysis of mouse lines with mutations in genes involved in ligand-VR signal transduction has revealed that the VNO mediates a range of social behaviours, including male-male and maternal aggression, sexual attraction, lordosis, and selective pregnancy termination, as well as interspecific responses such as avoidance and defensive behaviours. The unusual logic of VR expression now offers an opportunity to map the specific neural circuits that drive these behaviours. PMID:23884334

  8. Integrative analysis of genome-wide RNA interference screens.

    PubMed

    Berndt, Jason D; Biechele, Travis L; Moon, Randall T; Major, Michael B

    2009-01-01

    High-throughput genetic screens have exponentially increased the functional annotation of the genome over the past 10 years. Likewise, genome-scale efforts to map DNA methylation, chromatin state and occupancy, messenger RNA expression patterns, and disease-associated genetic polymorphisms, and proteome-wide efforts to map protein-protein interactions, have also created vast resources of data. An emerging trend involves combining multiple types of data, referred to as integrative screening. Examples include papers that report integrated data generated from large-scale RNA interference screens on the Wnt/beta-catenin pathway with either genotypic or proteomic data in colorectal cancer. These studies demonstrate the power of data integration to generate focused, validated data sets and to identify high-confidence candidate genes for follow-up experiments. We present the ongoing evolution and new strategies for the integrative screening approach with respect to understanding and treating human disease. PMID:19436058

  9. Performing integrative functional genomics analysis in GeneWeaver.org.

    PubMed

    Jay, Jeremy J; Chesler, Elissa J

    2014-01-01

    Functional genomics experiments and analyses give rise to large sets of results, each typically quantifying the relation of molecular entities including genes, gene products, polymorphisms, and other genomic features with biological characteristics or processes. There is tremendous utility and value in using these data in an integrative fashion to find convergent evidence for the role of genes in various processes, to identify functionally similar molecular entities, or to compare processes based on their genomic correlates. However, these gene-centered data are often deposited in diverse and non-interoperable stores. Therefore, integration requires biologists to implement computational algorithms and harmonization of gene identifiers both within and across species. The GeneWeaver web-based software system brings together a large data archive from diverse functional genomics data with a suite of combinatorial tools in an interactive environment. Account management features allow data and results to be shared among user-defined groups. Users can retrieve curated gene set data, upload, store, and share their own experimental results and perform integrative analyses including novel algorithmic approaches for set-set integration of genes and functions. PMID:24233775

  10. Integrated translational genomics for analysis of complex traits in sorghum

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We will report on the integration of sequencing and genotype data from natural variation (by whole genome resequencing [wgs] or genotype by sequencing [gbs]), transcriptome (RNA-seq) and mutant analysis (also by wgs) with the goal of identifying genes controlling important agronomic traits and tran...

  11. An integrative genomic and proteomic approach to chemosensitivity prediction

    PubMed Central

    Ma, Yan; Ding, Zhenyu; Qian, Yong; Wan, Ying-Wooi; Tosun, Kursad; Shi, Xianglin; Castranova, Vincent; Harner, E. James; Guo, Nancy I.

    2009-01-01

    New computational approaches are needed to integrate both protein expression and gene expression profiles, extending beyond the correlation analyses of gene and protein expression profiles in the current practices. Here, we developed an algorithm to classify cell line chemosensitivity based on integrated transcriptional and proteomic profiles. We sought to determine whether a combination of gene and protein expression profiles of untreated cells was able to enhance the performance of chemosensitivity prediction. An integrative feature selection scheme was employed to identify chemosensitivity determinants from genome-wide transcriptional profiles and 52 protein expression levels in 60 human cancer cell lines (the NCI-60). A set of 118 anti-cancer drugs whose mechanisms of action were putatively understood was evaluated. Classifiers of the complete range of drug response (sensitive, intermediate, or resistant) were generated for the evaluated anti-cancer drugs, one for each agent. The classifiers were designed to be independent of the cells' tissue origins. The classification accuracy for all 118 evaluated agents was remarkably better (P<0.001) than would be achieved by chance. Furthermore, 76 out of the 118 classifiers identified from integrated genomic and protein profiles significantly (P<0.05) improved the accuracy of protein expression-based classifiers identified previously. These results demonstrate that our integrated genomic and proteomic approach enhances the performance of chemosensitivity prediction. This study presents a new analytical framework to identify integrated gene and protein expression signatures for predicting cellular behavior and clinical outcome in general. PMID:19082483

  12. Integrative clinical genomics of advanced prostate cancer.

    PubMed

    Robinson, Dan; Van Allen, Eliezer M; Wu, Yi-Mi; Schultz, Nikolaus; Lonigro, Robert J; Mosquera, Juan-Miguel; Montgomery, Bruce; Taplin, Mary-Ellen; Pritchard, Colin C; Attard, Gerhardt; Beltran, Himisha; Abida, Wassim; Bradley, Robert K; Vinson, Jake; Cao, Xuhong; Vats, Pankaj; Kunju, Lakshmi P; Hussain, Maha; Feng, Felix Y; Tomlins, Scott A; Cooney, Kathleen A; Smith, David C; Brennan, Christine; Siddiqui, Javed; Mehra, Rohit; Chen, Yu; Rathkopf, Dana E; Morris, Michael J; Solomon, Stephen B; Durack, Jeremy C; Reuter, Victor E; Gopalan, Anuradha; Gao, Jianjiong; Loda, Massimo; Lis, Rosina T; Bowden, Michaela; Balk, Stephen P; Gaviola, Glenn; Sougnez, Carrie; Gupta, Manaswi; Yu, Evan Y; Mostaghel, Elahe A; Cheng, Heather H; Mulcahy, Hyojeong; True, Lawrence D; Plymate, Stephen R; Dvinge, Heidi; Ferraldeschi, Roberta; Flohr, Penny; Miranda, Susana; Zafeiriou, Zafeiris; Tunariu, Nina; Mateo, Joaquin; Perez-Lopez, Raquel; Demichelis, Francesca; Robinson, Brian D; Schiffman, Marc; Nanus, David M; Tagawa, Scott T; Sigaras, Alexandros; Eng, Kenneth W; Elemento, Olivier; Sboner, Andrea; Heath, Elisabeth I; Scher, Howard I; Pienta, Kenneth J; Kantoff, Philip; de Bono, Johann S; Rubin, Mark A; Nelson, Peter S; Garraway, Levi A; Sawyers, Charles L; Chinnaiyan, Arul M

    2015-05-21

    Toward development of a precision medicine framework for metastatic, castration-resistant prostate cancer (mCRPC), we established a multi-institutional clinical sequencing infrastructure to conduct prospective whole-exome and transcriptome sequencing of bone or soft tissue tumor biopsies from a cohort of 150 mCRPC affected individuals. Aberrations of AR, ETS genes, TP53, and PTEN were frequent (40%-60% of cases), with TP53 and AR alterations enriched in mCRPC compared to primary prostate cancer. We identified new genomic alterations in PIK3CA/B, R-spondin, BRAF/RAF1, APC, β-catenin, and ZBTB16/PLZF. Moreover, aberrations of BRCA2, BRCA1, and ATM were observed at substantially higher frequencies (19.3% overall) compared to those in primary prostate cancers. 89% of affected individuals harbored a clinically actionable aberration, including 62.7% with aberrations in AR, 65% in other cancer-related genes, and 8% with actionable pathogenic germline alterations. This cohort study provides clinically actionable information that could impact treatment decisions for these affected individuals. PMID:26000489

  13. Integrated genomic analyses of ovarian carcinoma.

    PubMed

    2011-06-30

    A catalogue of molecular aberrations that cause ovarian cancer is critical for developing and deploying therapies that will improve patients' lives. The Cancer Genome Atlas project has analysed messenger RNA expression, microRNA expression, promoter methylation and DNA copy number in 489 high-grade serous ovarian adenocarcinomas and the DNA sequences of exons from coding genes in 316 of these tumours. Here we report that high-grade serous ovarian cancer is characterized by TP53 mutations in almost all tumours (96%); low prevalence but statistically recurrent somatic mutations in nine further genes including NF1, BRCA1, BRCA2, RB1 and CDK12; 113 significant focal DNA copy number aberrations; and promoter methylation events involving 168 genes. Analyses delineated four ovarian cancer transcriptional subtypes, three microRNA subtypes, four promoter methylation subtypes and a transcriptional signature associated with survival duration, and shed new light on the impact that tumours with BRCA1/2 (BRCA1 or BRCA2) and CCNE1 aberrations have on survival. Pathway analyses suggested that homologous recombination is defective in about half of the tumours analysed, and that NOTCH and FOXM1 signalling are involved in serous ovarian cancer pathophysiology. PMID:21720365

  14. Integrative clinical genomics of advanced prostate cancer

    PubMed Central

    Robinson, Dan; Van Allen, Eliezer M.; Wu, Yi-Mi; Schultz, Nikolaus; Lonigro, Robert J.; Mosquera, Juan-Miguel; Montgomery, Bruce; Taplin, Mary-Ellen; Pritchard, Colin C; Attard, Gerhardt; Beltran, Himisha; Abida, Wassim M.; Bradley, Robert K.; Vinson, Jake; Cao, Xuhong; Vats, Pankaj; Kunju, Lakshmi P.; Hussain, Maha; Feng, Felix Y.; Tomlins, Scott A.; Cooney, Kathleen A.; Smith, David C.; Brennan, Christine; Siddiqui, Javed; Mehra, Rohit; Chen, Yu; Rathkopf, Dana E.; Morris, Michael J.; Solomon, Stephen B.; Durack, Jeremy C.; Reuter, Victor E.; Gopalan, Anuradha; Gao, Jianjiong; Loda, Massimo; Lis, Rosina T.; Bowden, Michaela; Balk, Stephen P.; Gaviola, Glenn; Sougnez, Carrie; Gupta, Manaswi; Yu, Evan Y.; Mostaghel, Elahe A.; Cheng, Heather H.; Mulcahy, Hyojeong; True, Lawrence D.; Plymate, Stephen R.; Dvinge, Heidi; Ferraldeschi, Roberta; Flohr, Penny; Miranda, Susana; Zafeiriou, Zafeiris; Tunariu, Nina; Mateo, Joaquin; Perez-Lopez, Raquel; Demichelis, Francesca; Robinson, Brian D.; Schiffman, Marc A.; Nanus, David M.; Tagawa, Scott T.; Sigaras, Alexandros; Eng, Kenneth W.; Elemento, Olivier; Sboner, Andrea; Heath, Elisabeth I.; Scher, Howard I.; Pienta, Kenneth J.; Kantoff, Philip; de Bono, Johann S.; Rubin, Mark A.; Nelson, Peter S.; Garraway, Levi A.; Sawyers, Charles L.; Chinnaiyan, Arul M.

    2015-01-01

    SUMMARY Toward development of a precision medicine framework for metastatic, castration resistant prostate cancer (mCRPC), we established a multi-institutional clinical sequencing infrastructure to conduct prospective whole exome and transcriptome sequencing of bone or soft tissue tumor biopsies from a cohort of 150 mCRPC affected individuals. Aberrations of AR, ETS genes, TP53 and PTEN were frequent (40–60% of cases), with TP53 and AR alterations enriched in mCRPC compared to primary prostate cancer. We identified novel genomic alterations in PIK3CA/B, R-spondin, BRAF/RAF1, APC, β-catenin and ZBTB16/PLZF. Aberrations of BRCA2, BRCA1 and ATM were observed at substantially higher frequencies (19.3% overall) than seen in primary prostate cancers. 89% of affected individuals harbored a clinically actionable aberration including 62.7% with aberrations in AR, 65% in other cancer-related genes, and 8% with actionable pathogenic germline alterations. This cohort study provides evidence that clinical sequencing in mCRPC is feasible and could impact treatment decisions in significant numbers of affected individuals. PMID:26000489

  15. Integrated Genomic Analyses of Ovarian Carcinoma

    PubMed Central

    2011-01-01

    Summary The Cancer Genome Atlas (TCGA) project has analyzed mRNA expression, miRNA expression, promoter methylation, and DNA copy number in 489 high-grade serous ovarian adenocarcinomas (HGS-OvCa) and the DNA sequences of exons from coding genes in 316 of these tumors. These results show that HGS-OvCa is characterized by TP53 mutations in almost all tumors (96%); low prevalence but statistically recurrent somatic mutations in 9 additional genes including NF1, BRCA1, BRCA2, RB1, and CDK12; 113 significant focal DNA copy number aberrations; and promoter methylation events involving 168 genes. Analyses delineated four ovarian cancer transcriptional subtypes, three miRNA subtypes, four promoter methylation subtypes, a transcriptional signature associated with survival duration and shed new light on the impact on survival of tumors with BRCA1/2 and CCNE1 aberrations. Pathway analyses suggested that homologous recombination is defective in about half of tumors, and that Notch and FOXM1 signaling are involved in serous ovarian cancer pathophysiology. PMID:21720365

  16. Integrating genomic selection into dairy cattle breeding programmes: a review.

    PubMed

    Bouquet, A; Juga, J

    2013-05-01

    Extensive genetic progress has been achieved in dairy cattle populations on many traits of economic importance because of efficient breeding programmes. Success of these programmes has relied on progeny testing of the best young males to accurately assess their genetic merit and hence their potential for breeding. Over the last few years, the integration of dense genomic information into statistical tools used to make selection decisions, commonly referred to as genomic selection, has enabled gains in predicting accuracy of breeding values for young animals without own performance. The possibility to select animals at an early stage allows defining new breeding strategies aimed at boosting genetic progress while reducing costs. The first objective of this article was to review methods used to model and optimize breeding schemes integrating genomic selection and to discuss their relative advantages and limitations. The second objective was to summarize the main results and perspectives on the use of genomic selection in practical breeding schemes, on the basis of the example of dairy cattle populations. Two main designs of breeding programmes integrating genomic selection were studied in dairy cattle. Genomic selection can be used either for pre-selecting males to be progeny tested or for selecting males to be used as active sires in the population. The first option produces moderate genetic gains without changing the structure of breeding programmes. The second option leads to large genetic gains, up to double those of conventional schemes because of a major reduction in the mean generation interval, but it requires greater changes in breeding programme structure. The literature suggests that genomic selection becomes more attractive when it is coupled with embryo transfer technologies to further increase selection intensity on the dam-to-sire pathway. The use of genomic information also offers new opportunities to improve preservation of genetic variation. 

  17. Defining nephrotic syndrome from an integrative genomics perspective.

    PubMed

    Sampson, Matthew G; Hodgin, Jeffrey B; Kretzler, Matthias

    2015-01-01

    Nephrotic syndrome (NS) is a clinical condition with a high degree of morbidity and mortality, caused by failure of the glomerular filtration barrier, resulting in massive proteinuria. Our current diagnostic, prognostic and therapeutic decisions in NS are largely based upon clinical or histological patterns such as "focal segmental glomerulosclerosis" or "steroid sensitive". Yet these descriptive classifications lack the precision to explain the physiologic origins and clinical heterogeneity observed in this syndrome. A more precise definition of NS is required to identify mechanisms of disease and capture various clinical trajectories. An integrative genomics approach to NS applies bioinformatics and computational methods to comprehensive experimental, molecular and clinical data for holistic disease definition. A unique aspect is analysis of data together to discover NS-associated molecules, pathways, and networks. Integrating multidimensional datasets from the outset highlights how molecular lesions impact the entire individual. Data sets integrated range from genetic variation to gene expression, to histologic changes, to progression of chronic kidney disease (CKD). This review will introduce the tenets of integrative genomics and suggest how it can increase our understanding of NS from molecular and pathophysiological perspectives. A diverse group of genome-scale experiments are presented that have sought to define molecular signatures of NS. Finally, the Nephrotic Syndrome Study Network (NEPTUNE) will be introduced as an international, prospective cohort study of patients with NS that utilizes an integrated systems genomics approach from the outset. A major NEPTUNE goal is to achieve comprehensive disease definition from a genomics perspective and identify shared molecular drivers of disease. PMID:24890338

  18. Computational and molecular tools for scalable rAAV-mediated genome editing.

    PubMed

    Stoimenov, Ivaylo; Ali, Muhammad Akhtar; Pandzic, Tatjana; Sjöblom, Tobias

    2015-03-11

    The rapid discovery of potential driver mutations through large-scale mutational analyses of human cancers generates a need to characterize their cellular phenotypes. Among the techniques for genome editing, recombinant adeno-associated virus (rAAV)-mediated gene targeting is suited for knock-in of single nucleotide substitutions and to a lesser degree for gene knock-outs. However, the generation of gene targeting constructs and the targeting process are time-consuming and labor-intensive. To facilitate rAAV-mediated gene targeting, we developed the first software and complementary automation-friendly vector tools to generate optimized targeting constructs for editing human protein-encoding genes. By computational approaches, rAAV constructs for editing ~71% of bases in protein-coding exons were designed. Similarly, ~81% of genes were predicted to be targetable by rAAV-mediated knock-out. A Gateway-based cloning system for facile generation of rAAV constructs suitable for robotic automation was developed and used in successful generation of targeting constructs. Together, these tools enable automated rAAV targeting construct design, generation as well as enrichment and expansion of targeted cells with desired integrations. PMID:25488813

  19. Integrated Genomic Analysis of Pancreatic Ductal Adenocarcinomas Reveals Genomic Rearrangement Events as Significant Drivers of Disease.

    PubMed

    Murphy, Stephen J; Hart, Steven N; Halling, Geoffrey C; Johnson, Sarah H; Smadbeck, James B; Drucker, Travis; Lima, Joema Felipe; Rohakhtar, Fariborz Rakhshan; Harris, Faye R; Kosari, Farhad; Subramanian, Subbaya; Petersen, Gloria M; Wiltshire, Timothy D; Kipp, Benjamin R; Truty, Mark J; McWilliams, Robert R; Couch, Fergus J; Vasmatzis, George

    2016-02-01

    Many somatic mutations have been detected in pancreatic ductal adenocarcinoma (PDAC), leading to the identification of some key drivers of disease progression, but the involvement of large genomic rearrangements has often been overlooked. In this study, we performed mate pair sequencing (MPseq) on genomic DNA from 24 PDAC tumors, including 15 laser-captured microdissected PDAC and 9 patient-derived xenografts, to identify genome-wide rearrangements. Large genomic rearrangements with intragenic breakpoints altering key regulatory genes involved in PDAC progression were detected in all tumors. SMAD4, ZNF521, and FHIT were among the most frequently hit genes. Conversely, commonly reported genes with copy number gains, including MYC and GATA6, were frequently observed in the absence of direct intragenic breakpoints, suggesting a requirement for sustaining oncogenic function during PDAC progression. Integration of data from MPseq, exome sequencing, and transcriptome analysis of primary PDAC cases identified limited overlap in genes affected by both rearrangements and point mutations. However, significant overlap was observed in major PDAC-associated signaling pathways, with all PDAC exhibiting reduced SMAD4 expression, reduced SMAD-dependent TGFβ signaling, and increased WNT and Hedgehog signaling. The frequent loss of SMAD4 and FHIT due to genomic rearrangements strongly implicates these genes as key drivers of PDAC, thus highlighting the strengths of an integrated genomic and transcriptomic approach for identifying mechanisms underlying disease initiation and progression. PMID:26676757

  20. DemaDb: an integrated dematiaceous fungal genomes database

    PubMed Central

    Kuan, Chee Sian; Yew, Su Mei; Chan, Chai Ling; Toh, Yue Fen; Lee, Kok Wei; Cheong, Wei-Hien; Yee, Wai-Yan; Hoh, Chee-Choong; Yap, Soon-Joo; Ng, Kee Peng

    2016-01-01

    Many species of dematiaceous fungi are associated with allergic reactions and potentially fatal diseases in humans, especially in tropical climates. Over the past 10 years, we have isolated more than 400 dematiaceous fungi from various clinical samples. In this study, DemaDb, an integrated database, was designed to support the integration and analysis of dematiaceous fungal genomes. A total of 92 072 putative genes and 6527 pathways that were identified in eight dematiaceous fungi (Bipolaris papendorfii UM 226, Daldinia eschscholtzii UM 1400, D. eschscholtzii UM 1020, Pyrenochaeta unguis-hominis UM 256, Ochroconis mirabilis UM 578, Cladosporium sphaerospermum UM 843, Herpotrichiellaceae sp. UM 238 and Pleosporales sp. UM 1110) were deposited in DemaDb. DemaDb includes functional annotations for all predicted gene models in all genomes, such as Gene Ontology, EuKaryotic Orthologous Groups, Kyoto Encyclopedia of Genes and Genomes (KEGG), Pfam and InterProScan. All predicted protein models were further functionally annotated to Carbohydrate-Active enzymes, peptidases, secondary metabolites and virulence factors. The DemaDb Genome Browser enables users to browse and visualize entire genomes with annotation data including gene prediction, structure, orientation and custom feature tracks. The Pathway Browser, based on the KEGG pathway database, allows users to look into molecular interaction and reaction networks for all KEGG-annotated genes. The availability of downloadable files containing assembly, nucleic acid and protein data allows direct retrieval for further downstream work. DemaDb is a useful resource for the fungal research community, especially those involved in genome-scale analysis, functional genomics, genetics and disease studies of dematiaceous fungi. Database URL: http://fungaldb.um.edu.my PMID:26980516

  1. DemaDb: an integrated dematiaceous fungal genomes database.

    PubMed

    Kuan, Chee Sian; Yew, Su Mei; Chan, Chai Ling; Toh, Yue Fen; Lee, Kok Wei; Cheong, Wei-Hien; Yee, Wai-Yan; Hoh, Chee-Choong; Yap, Soon-Joo; Ng, Kee Peng

    2016-01-01

    Many species of dematiaceous fungi are associated with allergic reactions and potentially fatal diseases in humans, especially in tropical climates. Over the past 10 years, we have isolated more than 400 dematiaceous fungi from various clinical samples. In this study, DemaDb, an integrated database, was designed to support the integration and analysis of dematiaceous fungal genomes. A total of 92 072 putative genes and 6527 pathways that were identified in eight dematiaceous fungi (Bipolaris papendorfii UM 226, Daldinia eschscholtzii UM 1400, D. eschscholtzii UM 1020, Pyrenochaeta unguis-hominis UM 256, Ochroconis mirabilis UM 578, Cladosporium sphaerospermum UM 843, Herpotrichiellaceae sp. UM 238 and Pleosporales sp. UM 1110) were deposited in DemaDb. DemaDb includes functional annotations for all predicted gene models in all genomes, such as Gene Ontology, EuKaryotic Orthologous Groups, Kyoto Encyclopedia of Genes and Genomes (KEGG), Pfam and InterProScan. All predicted protein models were further functionally annotated to Carbohydrate-Active enzymes, peptidases, secondary metabolites and virulence factors. The DemaDb Genome Browser enables users to browse and visualize entire genomes with annotation data including gene prediction, structure, orientation and custom feature tracks. The Pathway Browser, based on the KEGG pathway database, allows users to look into molecular interaction and reaction networks for all KEGG-annotated genes. The availability of downloadable files containing assembly, nucleic acid and protein data allows direct retrieval for further downstream work. DemaDb is a useful resource for the fungal research community, especially those involved in genome-scale analysis, functional genomics, genetics and disease studies of dematiaceous fungi. Database URL: http://fungaldb.um.edu.my. PMID:26980516

  2. GEMINI: Integrative Exploration of Genetic Variation and Genome Annotations

    PubMed Central

    Paila, Umadevi; Chapman, Brad A.; Kirchner, Rory; Quinlan, Aaron R.

    2013-01-01

    Modern DNA sequencing technologies enable geneticists to rapidly identify genetic variation among many human genomes. However, isolating the minority of variants underlying disease remains an important, yet formidable challenge for medical genetics. We have developed GEMINI (GEnome MINIng), a flexible software package for exploring all forms of human genetic variation. Unlike existing tools, GEMINI integrates genetic variation with a diverse and adaptable set of genome annotations (e.g., dbSNP, ENCODE, UCSC, ClinVar, KEGG) into a unified database to facilitate interpretation and data exploration. Whereas other methods provide an inflexible set of variant filters or prioritization methods, GEMINI allows researchers to compose complex queries based on sample genotypes, inheritance patterns, and both pre-installed and custom genome annotations. GEMINI also provides methods for ad hoc queries and data exploration, a simple programming interface for custom analyses that leverage the underlying database, and both command line and graphical tools for common analyses. We demonstrate GEMINI's utility for exploring variation in personal genomes and family based genetic studies, and illustrate its ability to scale to studies involving thousands of human samples. GEMINI is designed for reproducibility and flexibility and our goal is to provide researchers with a standard framework for medical genomics. PMID:23874191

  3. Megx.net: integrated database resource for marine ecological genomics.

    PubMed

    Kottmann, Renzo; Kostadinov, Ivalyo; Duhaime, Melissa Beth; Buttigieg, Pier Luigi; Yilmaz, Pelin; Hankeln, Wolfgang; Waldmann, Jost; Glöckner, Frank Oliver

    2010-01-01

    Megx.net is a database and portal that provides integrated access to georeferenced marker genes, environment data and marine genome and metagenome projects for microbial ecological genomics. All data are stored in the Microbial Ecological Genomics DataBase (MegDB), which is subdivided to hold both sequence and habitat data and global environmental data layers. The extended system provides access to several hundreds of genomes and metagenomes from prokaryotes and phages, as well as over a million small and large subunit ribosomal RNA sequences. With the refined Genes Mapserver, all data can be interactively visualized on a world map and statistics describing environmental parameters can be calculated. Sequence entries have been curated to comply with the proposed minimal standards for genomes and metagenomes (MIGS/MIMS) of the Genomic Standards Consortium. Access to data is facilitated by Web Services. The updated megx.net portal offers microbial ecologists greatly enhanced database content, and new features and tools for data analysis, all of which are freely accessible from our webpage http://www.megx.net. PMID:19858098

  4. Megx.net: integrated database resource for marine ecological genomics

    PubMed Central

    Kottmann, Renzo; Kostadinov, Ivalyo; Duhaime, Melissa Beth; Buttigieg, Pier Luigi; Yilmaz, Pelin; Hankeln, Wolfgang; Waldmann, Jost; Glöckner, Frank Oliver

    2010-01-01

    Megx.net is a database and portal that provides integrated access to georeferenced marker genes, environment data and marine genome and metagenome projects for microbial ecological genomics. All data are stored in the Microbial Ecological Genomics DataBase (MegDB), which is subdivided to hold both sequence and habitat data and global environmental data layers. The extended system provides access to several hundreds of genomes and metagenomes from prokaryotes and phages, as well as over a million small and large subunit ribosomal RNA sequences. With the refined Genes Mapserver, all data can be interactively visualized on a world map and statistics describing environmental parameters can be calculated. Sequence entries have been curated to comply with the proposed minimal standards for genomes and metagenomes (MIGS/MIMS) of the Genomic Standards Consortium. Access to data is facilitated by Web Services. The updated megx.net portal offers microbial ecologists greatly enhanced database content, and new features and tools for data analysis, all of which are freely accessible from our webpage http://www.megx.net. PMID:19858098

  5. Brown Planthopper Nudivirus DNA Integrated in Its Host Genome

    PubMed Central

    Cheng, Ruo-Lin; Xi, Yu; Lou, Yi-Han; Wang, Zhuo; Xu, Ji-Yu; Xu, Hai-Jun

    2014-01-01

    ABSTRACT The brown planthopper (BPH), Nilaparvata lugens (Hemiptera:Delphacidae), is one of the most destructive insect pests of rice crops in Asia. Nudivirus-like sequences were identified during the whole-genome sequencing of BPH. PCR examination showed that the virus sequences were present in all of the 22 BPH populations collected from East, Southeast, and South Asia. Thirty-two of the 33 nudivirus core genes were identified, including 20 homologues of baculovirus core genes. In addition, several gene clusters that were arranged collinearly with those of other nudiviruses were found in the partial virus genome. In a phylogenetic tree constructed using the supermatrix method, the original virus was grouped with other nudiviruses and was closely related to polydnavirus. Taken together, these data indicated that the virus sequences belong to a new member of the family Nudiviridae. More specifically, the virus sequences were integrated into the chromosome of its insect host during coevolution. This study is the first report of a large double-stranded circular DNA virus genome in a sap-sucking hemipteran insect. IMPORTANCE This is the first report of a large double-stranded DNA virus integrated genome in the planthopper, a plant sap-sucking hemipteran insect. It is an exciting addition to the evolutionary story of bracoviruses (polydnaviruses), nudiviruses, and baculoviruses. The results on the virus sequences integrated in the chromosomes of its insect host also represent a story of successful coevolution of an invertebrate virus and a plant sap-sucking insect. PMID:24574410

  6. Knowledge integration at the center of genomic medicine.

    PubMed

    Khoury, Muin J; Gwinn, Marta; Dotson, W David; Schully, Sheri D

    2012-07-01

    Three articles in this issue of Genetics in Medicine describe examples of "knowledge integration," involving methods for generating and synthesizing rapidly emerging information on health-related genomic technologies and engaging stakeholders around the evidence. Knowledge integration, the central process in translating genomic research, involves three closely related, iterative components: knowledge management, knowledge synthesis, and knowledge translation. Knowledge management is the ongoing process of obtaining, organizing, and displaying evolving evidence. For example, horizon scanning and "infoveillance" use emerging technologies to scan databases, registries, publications, and cyberspace for information on genomic applications. Knowledge synthesis is the process of conducting systematic reviews using a priori rules of evidence. For example, methods including meta-analysis, decision analysis, and modeling can be used to combine information from basic, clinical, and population research. Knowledge translation refers to stakeholder engagement and brokering to influence policy, guidelines and recommendations, as well as the research agenda to close knowledge gaps. The ultrarapid production of information requires adequate public and private resources for knowledge integration to support the evidence-based development of genomic medicine. PMID:22555656

  7. TALEN-mediated genome engineering to generate targeted mice.

    PubMed

    Sommer, Daniel; Peters, Annika E; Baumgart, Ann-Kathrin; Beyer, Marc

    2015-02-01

    Genetic mouse models are critical for biomedical research to understand gene function and pathophysiology. In recent years, the generation of genetic mouse models has been revolutionized by the emergence of transcription activator-like effector nucleases (TALENs). TALENs are programmable, sequence-specific DNA-binding proteins fused to a non-specific endonuclease domain and are used as powerful tools for site-specific induction of DNA double-strand breaks. These breaks result in disruption of the gene product of the targeted locus by mutations induced during repair by error-prone non-homologous end-joining. Alternatively, these DNA double-strand breaks can be exploited to integrate a user-defined sequence by homologous recombination if an appropriate repair plasmid is provided. In this review, we highlight the major technological improvements for genome editing in murine oocytes which have been achieved using TALENs, discuss current limitations of the technology, suggest strategies to broadly apply TALENs, and describe possible future directions to facilitate gene editing in murine oocytes. PMID:25596827

  8. Transgene integration and organization in cotton (Gossypium hirsutum L.) genome.

    PubMed

    Zhang, Jun; Cai, Lin; Cheng, Jiaqin; Mao, Huizhu; Fan, Xiaoping; Meng, Zhaohong; Chan, Ka Man; Zhang, Huijun; Qi, Jianfei; Ji, Lianghui; Hong, Yan

    2008-04-01

    While genetically modified upland cotton (Gossypium hirsutum L.) varieties are ranked among the most successful genetically modified organisms (GMO), there is little knowledge on transgene integration in the cotton genome, partly because of the difficulty in obtaining large numbers of transgenic plants. In this study, we analyzed 139 independently derived T0 transgenic cotton plants transformed by Agrobacterium tumefaciens strain AGL1 carrying a binary plasmid pPZP-GFP. It was found by PCR that as many as 31% of the plants had integration of vector backbone sequences. Of the 110 plants with good genomic Southern blot results, 37% had integration of a single T-DNA, 24% had two T-DNA copies and 39% had three or more copies. Multiple copies of the T-DNA existed either as repeats in complex loci or unlinked loci. Our further analysis of two T1 populations showed that segregants with a single T-DNA and no vector sequence could be obtained from T0 plants having multiple T-DNA copies and vector sequence. Out of the 57 T-DNA/T-DNA junctions cloned from complex loci, 27 had canonical T-DNA tandem repeats; the rest (30) had deletions to T-DNAs or had inclusion of vector sequences. Overlapping micro-homology was present for most of the T-DNA/T-DNA junctions (38/57). Right border (RB) ends of the T-DNA were precise while most left border (LB) ends (64%) had truncations to internal border sequences. Sequencing of collinear vector integration outside LB in 33 plants gave evidence that collinear vector sequence was determined in Agrobacterium culture. Among the 130 plants with characterized flanking sequences, 12% had the transgene integrated into coding sequences, 12% into repetitive sequences, 7% into rDNAs. Interestingly, 7% had the transgene integrated into chloroplast-derived sequences. Nucleotide sequence comparison of target sites in cotton genome before and after T-DNA integration revealed overlapping microhomology between target sites and the T-DNA (8/8), deletions to

  9. Integrating hospital information systems in healthcare institutions: a mediation architecture.

    PubMed

    El Azami, Ikram; Cherkaoui Malki, Mohammed Ouçamah; Tahon, Christian

    2012-10-01

    Many studies have examined the integration of information systems into healthcare institutions, leading to several standards in the healthcare domain (CORBAmed: Common Object Request Broker Architecture in Medicine; HL7: Health Level Seven International; DICOM: Digital Imaging and Communications in Medicine; and IHE: Integrating the Healthcare Enterprise). Due to the existence of a wide diversity of heterogeneous systems, three essential factors are necessary to fully integrate a system: data, functions and workflow. However, most of the previous studies have dealt with only one or two of these factors, and this makes the system integration unsatisfactory. In this paper, we propose a flexible, scalable architecture for Hospital Information Systems (HIS). Our main purpose is to provide a practical solution to ensure HIS interoperability so that healthcare institutions can communicate without being obliged to change their local information systems and without altering the tasks of the healthcare professionals. Our architecture is a mediation architecture with 3 levels: 1) a database level, 2) a middleware level and 3) a user interface level. The mediation is based on two central components: the Mediator and the Adapter. Using the XML format allows us to establish a structured, secured exchange of healthcare data. The notion of medical ontology is introduced to solve semantic conflicts and to unify the language used for the exchange. Our mediation architecture provides an effective, promising model that promotes the integration of hospital information systems that are autonomous, heterogeneous, semantically interoperable and platform-independent. PMID:22086739

  10. Integrated analysis of genome-wide genetic and epigenetic association data for identification of disease mechanisms.

    PubMed

    Ke, Xiayi; Cortina-Borja, Mario; Silva, Bruno Cesar; Lowe, Robert; Rakyan, Vardhman; Balding, David

    2013-11-01

    Many human diseases are multifactorial, involving multiple genetic and environmental factors impacting on one or more biological pathways. Much of the environmental effect is believed to be mediated through epigenetic changes. Although many genome-wide genetic and epigenetic association studies have been conducted for different diseases and traits, it is still far from clear to what extent the genomic loci and biological pathways identified in the genetic and epigenetic studies are shared. There is also a lack of statistical tools to assess these important aspects of disease mechanisms. In the present study, we describe a protocol for the integrated analysis of genome-wide genetic and epigenetic data based on permutation of a sum statistic for the combined effects in a locus or pathway. The method was then applied to published type 1 diabetes (T1D) genome-wide- and epigenome-wide-association studies data to identify genomic loci and biological pathways that are associated with T1D genetically and epigenetically. Through combined analysis, novel loci and pathways were also identified, which could add to our understanding of disease mechanisms of T1D as well as complex diseases in general. PMID:24071862
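
    The abstract names the technique (permutation of a sum statistic over a locus or pathway) without giving the statistic itself. The Python sketch below shows the general idea under stated assumptions: the per-marker effect is taken to be the squared difference between case and control means, the columns of `features` are hypothetical genetic and epigenetic measurements for one locus, and the add-one correction is a common convention rather than something taken from the paper.

```python
import random


def sum_statistic(features, labels):
    """Sum, over all markers in the locus, of the squared case/control mean difference."""
    cases = [row for row, y in zip(features, labels) if y == 1]
    controls = [row for row, y in zip(features, labels) if y == 0]
    total = 0.0
    for j in range(len(features[0])):
        mean_case = sum(row[j] for row in cases) / len(cases)
        mean_ctrl = sum(row[j] for row in controls) / len(controls)
        total += (mean_case - mean_ctrl) ** 2
    return total


def permutation_p_value(features, labels, n_perm=1000, seed=0):
    """Estimate how often shuffled labels give a sum statistic at least as large."""
    rng = random.Random(seed)
    observed = sum_statistic(features, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # keeps the numbers of cases and controls fixed
        if sum_statistic(features, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

    Because only the labels are shuffled, the correlation between markers within the locus is preserved in every permutation, which is the usual reason for preferring a permutation null over an analytic one for a summed statistic.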

  11. Integrative functional genomic analysis unveils the differing dysregulated metabolic processes across hepatocellular carcinoma stages.

    PubMed

    Ramesh, Vignesh; Ganesan, Kumaresan

    2016-08-15

    Hepatocellular carcinoma (HCC) is a highly heterogeneous disease and the development of targeted therapeutics is still at an early stage. 'Omics'-based genome-wide profiling of the transcriptome, miRNome and proteome is highly useful in identifying the deregulated molecular processes involved in hepatocarcinogenesis. Metabolites and metabolic processes, which are among the end products and processes of the central dogma, mediate cellular functions. In recent years, metabolomics based investigations have revealed the major deregulated metabolic processes involved in carcinogenesis. However, the integrative analysis of the holistic metabolic processes with genomics is at an early stage. Since the gene-sets are highly useful in assessing the biological processes and pathways, we made an attempt to infer the deregulated cellular metabolic processes involved in HCC by employing metabolism associated gene-set enrichment analysis. Further, the metabolic process enrichment scores were integrated with the transcriptome profiles of HCC. Integrative analysis shows three distinct metabolic deregulations: i) hepatocyte function related molecular processes involving lipid/fatty acid/bile acid synthesis, ii) inflammatory processes with cytokine, sphingolipid & chondroitin sulphate metabolism and iii) enriched nucleotide metabolic process involving purine/pyrimidine & glucose mediated catabolic process, in hepatocarcinogenesis. The three distinct metabolic processes were found to occur both in tumor and liver cancer cell line profiles. Unsupervised hierarchical clustering of the metabolic processes along with clinical sample information has identified two major clusters based on AFP (alpha-fetoprotein) and metastasis. The study reveals the three major regulatory processes involved in HCC stages. PMID:27107678

  12. Using biological networks to integrate, visualize and analyze genomics data.

    PubMed

    Charitou, Theodosia; Bryan, Kenneth; Lynn, David J

    2016-01-01

    Network biology is a rapidly developing area of biomedical research and reflects the current view that complex phenotypes, such as disease susceptibility, are not the result of single gene mutations that act in isolation but are rather due to the perturbation of a gene's network context. Understanding the topology of these molecular interaction networks and identifying the molecules that play central roles in their structure and regulation is a key to understanding complex systems. The falling cost of next-generation sequencing is now enabling researchers to routinely catalogue the molecular components of these networks at a genome-wide scale and over a large number of different conditions. In this review, we describe how to use publicly available bioinformatics tools to integrate genome-wide 'omics' data into a network of experimentally-supported molecular interactions. In addition, we describe how to visualize and analyze these networks to identify topological features of likely functional relevance, including network hubs, bottlenecks and modules. We show that network biology provides a powerful conceptual approach to integrate and find patterns in genome-wide genomic data but we also discuss the limitations and caveats of these methods, of which researchers adopting these methods must remain aware. PMID:27036106
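
    As a small illustration of the simplest topological feature mentioned above, network hubs, the following Python sketch ranks nodes of an interaction network by degree. It is not one of the tools covered in the review: the edge list and gene names are hypothetical, and degree is only one of several possible hub definitions (bottlenecks, for instance, are usually defined by betweenness, which is not computed here).

```python
from collections import defaultdict


def degree_table(edges):
    """Count distinct interaction partners per node in an undirected network."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {node: len(adj) for node, adj in neighbours.items()}


def hubs(edges, top_n=1):
    """Return the top_n nodes by degree, ties broken alphabetically."""
    degrees = degree_table(edges)
    ranked = sorted(degrees.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


# Hypothetical interactions; duplicate edges are counted once.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("C", "B")]
print(hubs(edges))  # [('A', 3)]
```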

  13. PhytoPath: an integrative resource for plant pathogen genomics.

    PubMed

    Pedro, Helder; Maheswari, Uma; Urban, Martin; Irvine, Alistair George; Cuzick, Alayne; McDowall, Mark D; Staines, Daniel M; Kulesha, Eugene; Hammond-Kosack, Kim Elizabeth; Kersey, Paul Julian

    2016-01-01

    PhytoPath (www.phytopathdb.org) is a resource for genomic and phenotypic data from plant pathogen species, that integrates phenotypic data for genes from PHI-base, an expertly curated catalog of genes with experimentally verified pathogenicity, with the Ensembl tools for data visualization and analysis. The resource is focused on fungi, protists (oomycetes) and bacterial plant pathogens that have genomes that have been sequenced and annotated. Genes with associated PHI-base data can be easily identified across all plant pathogen species using a BioMart-based query tool and visualized in their genomic context on the Ensembl genome browser. The PhytoPath resource contains data for 135 genomic sequences from 87 plant pathogen species, and 1364 genes curated for their role in pathogenicity and as targets for chemical intervention. Support for community annotation of gene models is provided using the WebApollo online gene editor, and we are working with interested communities to improve reference annotation for selected species. PMID:26476449

  14. An Integrative Method for Accurate Comparative Genome Mapping

    PubMed Central

    Swidan, Firas; Rocha, Eduardo P. C; Shmoish, Michael; Pinter, Ron Y

    2006-01-01

    We present MAGIC, an integrative and accurate method for comparative genome mapping. Our method consists of two phases: preprocessing for identifying “maximal similar segments,” and mapping for clustering and classifying these segments. MAGIC's main novelty lies in its biologically intuitive clustering approach, which aims towards both calculating reorder-free segments and identifying orthologous segments. In the process, MAGIC efficiently handles ambiguities resulting from duplications that occurred before the speciation of the considered organisms from their most recent common ancestor. We demonstrate both MAGIC's robustness and scalability: the former is asserted with respect to its initial input and with respect to its parameters' values. The latter is asserted by applying MAGIC to distantly related organisms and to large genomes. We compare MAGIC to other comparative mapping methods and provide detailed analysis of the differences between them. Our improvements allow a comprehensive study of the diversity of genetic repertoires resulting from large-scale mutations, such as indels and duplications, including explicitly transposable and phagic elements. The strength of our method is demonstrated by detailed statistics computed for each type of these large-scale mutations. MAGIC enabled us to conduct a comprehensive analysis of the different forces shaping prokaryotic genomes from different clades, and to quantify the importance of novel gene content introduced by horizontal gene transfer relative to gene duplication in bacterial genome evolution. We use these results to investigate the breakpoint distribution in several prokaryotic genomes. PMID:16933978

  15. PhytoPath: an integrative resource for plant pathogen genomics

    PubMed Central

    Pedro, Helder; Maheswari, Uma; Urban, Martin; Irvine, Alistair George; Cuzick, Alayne; McDowall, Mark D.; Staines, Daniel M.; Kulesha, Eugene; Hammond-Kosack, Kim Elizabeth; Kersey, Paul Julian

    2016-01-01

    PhytoPath (www.phytopathdb.org) is a resource for genomic and phenotypic data from plant pathogen species, that integrates phenotypic data for genes from PHI-base, an expertly curated catalog of genes with experimentally verified pathogenicity, with the Ensembl tools for data visualization and analysis. The resource is focused on fungi, protists (oomycetes) and bacterial plant pathogens that have genomes that have been sequenced and annotated. Genes with associated PHI-base data can be easily identified across all plant pathogen species using a BioMart-based query tool and visualized in their genomic context on the Ensembl genome browser. The PhytoPath resource contains data for 135 genomic sequences from 87 plant pathogen species, and 1364 genes curated for their role in pathogenicity and as targets for chemical intervention. Support for community annotation of gene models is provided using the WebApollo online gene editor, and we are working with interested communities to improve reference annotation for selected species. PMID:26476449

  16. Integrative Functional Genomics Implicates EPB41 Dysregulation in Hepatocellular Carcinoma Risk.

    PubMed

    Yang, Xinyu; Yu, Dianke; Ren, Yanli; Wei, Jinyu; Pan, Wenting; Zhou, Changchun; Zhou, Liqing; Liu, Yu; Yang, Ming

    2016-08-01

    Genome-wide association studies (GWASs) have provided many insights into cancer genetics. However, the molecular mechanisms of many susceptibility SNPs defined by GWASs in cancer heritability and in promoting cancer risk remain elusive. New research strategies, including functional evaluations, are warranted to systematically explore truly causal genetic variants. In this study, we developed an integrative functional genomics methodology to identify cancer susceptibility SNPs in transcription factor-binding sites across the whole genome. Employing integration of functional genomic data from c-Myc cistromics, 1000 Genomes, and the TRANSFAC matrix, we successfully annotated 12 SNPs present in the c-Myc cistrome with properties consistent with modulating c-Myc binding affinity in hepatocellular carcinoma (HCC). After genotyping these 12 SNPs in 1,806 HBV-related HCC case subjects and 1,708 control subjects, we identified an HCC susceptibility SNP, rs157224G>T, in Chinese populations (T allele: odds ratio = 1.64, 95% confidence interval = 1.32-2.02; p = 5.2 × 10^-6). This polymorphism leads to HCC predisposition through modifying c-Myc-mediated transcriptional regulation of EPB41, with the risk rs157224T allele showing significantly decreased gene expression. Based on cell proliferation, wound healing, and transwell assays as well as the mouse xenograft model, we identify EPB41 as an HCC susceptibility gene in vitro and in vivo. Consistent with this notion, we note that EPB41 expression is significantly decreased in HCC tissue specimens, especially in portal vein metastasis or intrahepatic metastasis, compared to normal tissues. Our results highlight the involvement of regulatory genetic variants in HCC and provide pathogenic insights of this malignancy via a genome-wide approach. PMID:27453575

  17. Genome-wide analyses of LINE–LINE-mediated nonallelic homologous recombination

    PubMed Central

    Startek, Michał; Szafranski, Przemyslaw; Gambin, Tomasz; Campbell, Ian M.; Hixson, Patricia; Shaw, Chad A.; Stankiewicz, Paweł; Gambin, Anna

    2015-01-01

    Nonallelic homologous recombination (NAHR), occurring between low-copy repeats (LCRs) >10 kb in size and sharing >97% DNA sequence identity, is responsible for the majority of recurrent genomic rearrangements in the human genome. Recent studies have shown that transposable elements (TEs) can also mediate recurrent deletions and translocations, indicating the features of substrates that mediate NAHR may be significantly less stringent than previously believed. Using >4 kb length and >95% sequence identity criteria, we analyzed the genome-wide distribution of long interspersed element (LINE) retrotransposons and their potential to mediate NAHR. We identified 17 005 directly oriented LINE pairs located <10 Mbp from each other as potential NAHR substrates, placing 82.8% of the human genome at risk of LINE–LINE-mediated instability. Cross-referencing these regions with CNVs in the Baylor College of Medicine clinical chromosomal microarray database of 36 285 patients, we identified 516 CNVs potentially mediated by LINEs. Using long-range PCR of five different genomic regions in a total of 44 patients, we confirmed that the CNV breakpoints in each patient map within the LINE elements. To additionally assess the scale of LINE–LINE/NAHR phenomenon in the human genome, we tested DNA samples from six healthy individuals on a custom aCGH microarray targeting LINE elements predicted to mediate CNVs and identified 25 LINE–LINE rearrangements. Our data indicate that LINE–LINE-mediated NAHR is widespread and under-recognized, and is an important mechanism of structural rearrangement contributing to human genomic variability. PMID:25613453
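
    The selection criteria stated in this abstract (length >4 kb, pairwise identity >95%, same orientation, <10 Mbp apart) can be expressed as a simple filter. The Python sketch below is an illustration only, not the authors' pipeline: the `Line` record, the precomputed `identity` lookup, and the choice to measure distance as the gap between the end of the first element and the start of the second are assumptions.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Line:
    name: str
    chrom: str
    start: int
    end: int
    strand: str


MIN_LENGTH = 4_000         # bp, from the abstract (>4 kb)
MIN_IDENTITY = 0.95        # from the abstract (>95%)
MAX_DISTANCE = 10_000_000  # bp, from the abstract (<10 Mbp)


def candidate_pairs(elements, identity):
    """Return name pairs passing the length, orientation, distance and identity cuts.

    `identity` maps frozenset({name_a, name_b}) to a pairwise identity in [0, 1];
    computing it (by alignment) is outside the scope of this sketch.
    """
    long_enough = sorted(
        (e for e in elements if e.end - e.start > MIN_LENGTH),
        key=lambda e: (e.chrom, e.start),
    )
    pairs = []
    for a, b in combinations(long_enough, 2):
        if a.chrom != b.chrom or a.strand != b.strand:
            continue  # only directly oriented pairs on one chromosome
        if b.start - a.end >= MAX_DISTANCE:
            continue
        if identity.get(frozenset((a.name, b.name)), 0.0) > MIN_IDENTITY:
            pairs.append((a.name, b.name))
    return pairs
```

    Checking every pair is quadratic in the number of elements; for a genome-scale input one would restrict the comparison to elements within a 10 Mbp window of each other.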

  18. INTEGRATE: gene fusion discovery using whole genome and transcriptome data

    PubMed Central

    Zhang, Jin; White, Nicole M.; Schmidt, Heather K.; Fulton, Robert S.; Tomlinson, Chad; Warren, Wesley C.; Wilson, Richard K.; Maher, Christopher A.

    2016-01-01

    While next-generation sequencing (NGS) has become the primary technology for discovering gene fusions, we are still faced with the challenge of ensuring that causative mutations are not missed while minimizing false positives. Currently, there are many computational tools that predict structural variations (SV) and gene fusions using whole genome (WGS) and transcriptome sequencing (RNA-seq) data separately. However, as both WGS and RNA-seq have their limitations when used independently, we hypothesize that the orthogonal validation from integrating both data could generate a sensitive and specific approach for detecting high-confidence gene fusion predictions. Fortunately, decreasing NGS costs have resulted in a growing quantity of patients with both data available. Therefore, we developed a gene fusion discovery tool, INTEGRATE, that leverages both RNA-seq and WGS data to reconstruct gene fusion junctions and genomic breakpoints by split-read mapping. To evaluate INTEGRATE, we compared it with eight additional gene fusion discovery tools using the well-characterized breast cell line HCC1395 and peripheral blood lymphocytes derived from the same patient (HCC1395BL). The predictions subsequently underwent a targeted validation leading to the discovery of 131 novel fusions in addition to the seven previously reported fusions. Overall, INTEGRATE only missed six out of the 138 validated fusions and had the highest accuracy of the nine tools evaluated. Additionally, we applied INTEGRATE to 62 breast cancer patients from The Cancer Genome Atlas (TCGA) and found multiple recurrent gene fusions including a subset involving estrogen receptor. Taken together, INTEGRATE is a highly sensitive and accurate tool that is freely available for academic use. PMID:26556708
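
    The counts reported above can be checked with a few lines of Python. The abstract itself reports only that six of 138 validated fusions were missed; the sensitivity figure computed here is derived from those two numbers and is not a value stated by the authors.

```python
def sensitivity(validated, missed):
    """Fraction of validated events that the tool recovered."""
    if validated <= 0 or not 0 <= missed <= validated:
        raise ValueError("need 0 <= missed <= validated and validated > 0")
    return (validated - missed) / validated


novel, previously_reported = 131, 7
validated = novel + previously_reported  # 138, matching the abstract
print(validated, round(sensitivity(validated, missed=6), 3))  # 138 0.957
```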

  19. Integrative gene transfer in the truffle Tuber borchii by Agrobacterium tumefaciens-mediated transformation.

    PubMed

    Brenna, Andrea; Montanini, Barbara; Muggiano, Eleonora; Proietto, Marco; Filetici, Patrizia; Ottonello, Simone; Ballario, Paola

    2014-01-01

    Agrobacterium tumefaciens-mediated transformation is a powerful tool for reverse genetics and functional genomic analysis in a wide variety of plants and fungi. Tuber spp. are ecologically important and gastronomically prized fungi ("truffles") with a cryptic life cycle, a subterranean habitat and a symbiotic, but also facultative saprophytic lifestyle. The genome of a representative member of this group of fungi has recently been sequenced. However, because of their poor genetic tractability, including transformation, truffles have so far eluded in-depth functional genomic investigations. Here we report that A. tumefaciens can infect Tuber borchii mycelia, thereby conveying its transfer DNA with the production of stably integrated transformants. We constructed two new binary plasmids (pABr1 and pABr3) and tested them as improved transformation vectors using the green fluorescent protein as reporter gene and hygromycin phosphotransferase as selection marker. Transformants were stable for at least 12 months of in vitro culture propagation and, as revealed by TAIL-PCR analysis, integration sites appear to be heterogeneous, with a preference for repeat element-containing genome sites. PMID:24949275

  20. Integrative gene transfer in the truffle Tuber borchii by Agrobacterium tumefaciens-mediated transformation

    PubMed Central

    2014-01-01

    Agrobacterium tumefaciens-mediated transformation is a powerful tool for reverse genetics and functional genomic analysis in a wide variety of plants and fungi. Tuber spp. are ecologically important and gastronomically prized fungi (“truffles”) with a cryptic life cycle, a subterranean habitat and a symbiotic, but also facultative saprophytic lifestyle. The genome of a representative member of this group of fungi has recently been sequenced. However, because of their poor genetic tractability, including transformation, truffles have so far eluded in-depth functional genomic investigations. Here we report that A. tumefaciens can infect Tuber borchii mycelia, thereby conveying its transfer DNA with the production of stably integrated transformants. We constructed two new binary plasmids (pABr1 and pABr3) and tested them as improved transformation vectors using the green fluorescent protein as reporter gene and hygromycin phosphotransferase as selection marker. Transformants were stable for at least 12 months of in vitro culture propagation and, as revealed by TAIL-PCR analysis, integration sites appear to be heterogeneous, with a preference for repeat element-containing genome sites. PMID:24949275

  1. Stacking multiple transgenes at a selected genomic site via repeated recombinase-mediated DNA cassette exchanges.

    PubMed

    Li, Zhongsen; Moon, Bryan P; Xing, Aiqiu; Liu, Zhan-Bin; McCardell, Richard P; Damude, Howard G; Falco, S Carl

    2010-10-01

    Recombinase-mediated DNA cassette exchange (RMCE) has been successfully used to insert transgenes at previously characterized genomic sites in plants. Following the same strategy, groups of transgenes can be stacked to the same site through multiple rounds of RMCE. A gene-silencing cassette, designed to simultaneously silence soybean (Glycine max) genes fatty acid ω-6 desaturase 2 (FAD2) and acyl-acyl carrier protein thioesterase 2 (FATB) to improve oleic acid content, was first inserted by RMCE at a precharacterized genomic site in soybean. Selected transgenic events were subsequently retransformed with the second DNA construct containing a Yarrowia lipolytica diacylglycerol acyltransferase gene (DGAT1) to increase oil content by the enhancement of triacylglycerol biosynthesis and three other genes, a Corynebacterium glutamicum dihydrodipicolinate synthetase gene (DHPS), a barley (Hordeum vulgare) high-lysine protein gene (BHL8), and a truncated soybean cysteine synthase gene (CGS), to improve the contents of the essential amino acids lysine and methionine. Molecular characterization confirmed that the second RMCE successfully stacked the four overexpression cassettes to the previously integrated FAD2-FATB gene-silencing cassette. Phenotypic analyses indicated that all the transgenes expressed expected phenotypes. PMID:20720171

  2. Evolution of simple sequence repeat-mediated phase variation in bacterial genomes.

    PubMed

    Bayliss, Christopher D; Palmer, Michael E

    2012-09-01

    Mutability as a mechanism for rapid adaptation to environmental challenge is an alluringly simple concept whose apotheosis is realized in simple sequence repeats (SSR). Bacterial genomes of several species contain SSRs with a proven role in adaptation to environmental fluctuations. SSRs are hypermutable and generate reversible mutations in localized regions of bacterial genomes, leading to phase variable ON/OFF switches in gene expression. The application of genetic, bioinformatic, and mathematical/computational modeling approaches is revolutionizing our current understanding of how genomic molecular forces and environmental factors influence SSR-mediated adaptation and led to the evolution of this mechanism of localized hypermutation in bacterial genomes. PMID:22954215

  3. Construction of an integrated database to support genomic sequence analysis

    SciTech Connect

    Gilbert, W.; Overbeek, R.

    1994-11-01

    The central goal of this project is to develop an integrated database to support comparative analysis of genomes including DNA sequence data, protein sequence data, gene expression data and metabolism data. In developing the logic-based system GenoBase, a broader integration of available data was achieved due to assistance from collaborators. Current goals are to easily include new forms of data as they become available and to easily navigate through the ensemble of objects described within the database. This report comments on progress made in these areas.

  4. The Npl3 hnRNP prevents R-loop-mediated transcription–replication conflicts and genome instability

    PubMed Central

    Santos-Pereira, José M.; Herrero, Ana B.; García-Rubio, María L.; Marín, Antonio; Moreno, Sergio; Aguilera, Andrés

    2013-01-01

    Transcription is a major obstacle for replication fork (RF) progression and a cause of genome instability. Part of this instability is mediated by cotranscriptional R loops, which are believed to increase by suboptimal assembly of the nascent messenger ribonucleoprotein particle (mRNP). However, no clear evidence exists that heterogeneous nuclear RNPs (hnRNPs), the basic mRNP components, prevent R-loop stabilization. Here we show that yeast Npl3, the most abundant RNA-binding hnRNP, prevents R-loop-mediated genome instability. npl3Δ cells show transcription-dependent and R-loop-dependent hyperrecombination and genome-wide replication obstacles as determined by accumulation of the Rrm3 helicase. Such obstacles preferentially occur at long and highly expressed genes, to which Npl3 is preferentially bound in wild-type cells, and are reduced by RNase H1 overexpression. The resulting replication stress confers hypersensitivity to double-strand break-inducing agents. Therefore, our work demonstrates that mRNP factors are critical for genome integrity and opens the option of using them as therapeutic targets in anti-cancer treatment. PMID:24240235

  5. MarinegenomicsDB: an integrated genome viewer for community-based annotation of genomes.

    PubMed

    Koyanagi, Ryo; Takeuchi, Takeshi; Hisata, Kanako; Gyoja, Fuki; Shoguchi, Eiichi; Satoh, Nori; Kawashima, Takeshi

    2013-10-01

    We constructed a web-based genome annotation platform, MarinegenomicsDB, to integrate genome data from various marine organisms including the pearl oyster Pinctada fucata and the coral Acropora digitifera. This newly developed viewer application provides open access to published data and a user-friendly environment for community-based manual gene annotation. Development on a flexible framework enables easy expansion of the website on demand. To date, more than 2000 genes have been annotated using this system. In the future, the website will be expanded to host a wider variety of data, more species, and different types of genome-wide analyses. The website is available at the following URL: http://marinegenomics.oist.jp. PMID:24125644

  6. Precision genome editing in plants via gene targeting and piggyBac-mediated marker excision

    PubMed Central

    Nishizawa-Yokoi, Ayako; Endo, Masaki; Ohtsuki, Namie; Saika, Hiroaki; Toki, Seiichi

    2015-01-01

    Precise genome engineering via homologous recombination (HR)-mediated gene targeting (GT) has become an essential tool in molecular breeding as well as in basic plant science. As HR-mediated GT is an extremely rare event, positive–negative selection has been used extensively in flowering plants to isolate cells in which GT has occurred. In order to utilize GT as a methodology for precision mutagenesis, the positive selectable marker gene should be completely eliminated from the GT locus. Here, we introduce targeted point mutations conferring resistance to herbicide into the rice acetolactate synthase (ALS) gene via GT with subsequent marker excision by piggyBac transposition. Almost all regenerated plants expressing piggyBac transposase contained exclusively targeted point mutations without concomitant re-integration of the transposon, resulting in these progeny showing a herbicide bispyribac sodium (BS)-tolerant phenotype. This approach was also applied successfully to the editing of a microRNA targeting site in the rice cleistogamy 1 gene. Therefore, our approach provides a general strategy for the targeted modification of endogenous genes in plants. PMID:25284193

  7. Integrated Genome-Based Studies of Shewanella Ecophysiology

    SciTech Connect

    Andrei L. Osterman, Ph.D.

    2012-12-17

    Integration of bioinformatics and experimental techniques was applied to mapping and characterization of the key components (pathways, enzymes, transporters, regulators) of the core metabolic machinery in Shewanella oneidensis and related species, with the main focus on metabolic and regulatory pathways involved in utilization of various carbon and energy sources. Among the main accomplishments reflected in ten joint publications with other participants of the Shewanella Federation are: (i) A systems-level reconstruction of carbohydrate utilization pathways in the genus of Shewanella (19 species). This analysis yielded reconstruction of 18 sugar utilization pathways including 10 novel pathway variants and prediction of > 60 novel protein families of enzymes, transporters and regulators involved in these pathways. Selected functional predictions were verified by focused biochemical and genetic experiments. Observed growth phenotypes were consistent with bioinformatic predictions, providing strong validation of the technology; and (ii) Global genomic reconstruction of transcriptional regulons in 16 Shewanella genomes. The inferred regulatory network includes 82 transcription factors, 8 riboswitches and 6 translational attenuators. Of those, 45 regulons were inferred directly from the genome context analysis, whereas others were propagated from previously characterized regulons in other species. Selected regulatory predictions were experimentally tested. Integration of this analysis with microarray data revealed overall consistency and provided an additional layer of interactions between regulons. All the results were captured in the new database RegPrecise, which is a joint development with the LBNL team. A more detailed analysis of the individual subsystems, pathways and regulons in Shewanella spp. included bioinformatics-based prediction and experimental characterization of: (i) N-Acetylglucosamine catabolic pathway; (ii) Lactate utilization machinery; (iii) Novel Nrt

  8. Tetrahymena functional genomics database (TetraFGD): an integrated resource for Tetrahymena functional genomics.

    PubMed

    Xiong, Jie; Lu, Yuming; Feng, Jinmei; Yuan, Dongxia; Tian, Miao; Chang, Yue; Fu, Chengjie; Wang, Guangying; Zeng, Honghui; Miao, Wei

    2013-01-01

    The ciliated protozoan Tetrahymena thermophila is a useful unicellular model organism for studies of eukaryotic cellular and molecular biology. Research on T. thermophila has contributed to a series of remarkable basic biological principles. Since the macronuclear genome was sequenced, substantial progress has been made in functional genomics research on T. thermophila, including genome-wide microarray analysis of the T. thermophila life cycle, a T. thermophila gene network analysis based on the microarray data and transcriptome analysis by deep RNA sequencing. To meet the growing demands of the Tetrahymena research community, we integrated these data to provide a public access database: Tetrahymena functional genomics database (TetraFGD). TetraFGD contains three major resources, including the RNA-Seq transcriptome, microarray and gene networks. The RNA-Seq data define gene structures and transcriptome, with special emphasis on exon-intron boundaries; the microarray data describe gene expression of 20 time points during three major stages of the T. thermophila life cycle; the gene network data identify potential gene-gene interactions of 15 049 genes. The TetraFGD provides user-friendly search functions that assist researchers in accessing gene models, transcripts, gene expression data and gene-gene relationships. In conclusion, the TetraFGD is an important functional genomic resource for researchers who focus on Tetrahymena or other ciliates. Database URL: http://tfgd.ihb.ac.cn/ PMID:23482072

  9. Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace.

    PubMed

    Qu, Kun; Garamszegi, Sara; Wu, Felix; Thorvaldsdottir, Helga; Liefeld, Ted; Ocana, Marco; Borges-Rivera, Diego; Pochet, Nathalie; Robinson, James T; Demchak, Barry; Hull, Tim; Ben-Artzi, Gil; Blankenberg, Daniel; Barber, Galt P; Lee, Brian T; Kuhn, Robert M; Nekrutenko, Anton; Segal, Eran; Ideker, Trey; Reich, Michael; Regev, Aviv; Chang, Howard Y; Mesirov, Jill P

    2016-03-01

    Complex biomedical analyses require the use of multiple software tools in concert and remain challenging for much of the biomedical research community. We introduce GenomeSpace (http://www.genomespace.org), a cloud-based, cooperative community resource that currently supports the streamlined interaction of 20 bioinformatics tools and data resources. To facilitate integrative analysis by non-programmers, it offers a growing set of 'recipes', short workflows to guide investigators through high-utility analysis tasks. PMID:26780094

  10. Comparative genomic analysis of integral membrane transport proteins in ciliates.

    PubMed

    Kumar, Ujjwal; Saier, Milton H

    2015-01-01

    Integral membrane transport proteins homologous to those found in the Transporter Classification Database (TCDB; www.tcdb.org) were identified and bioinformatically characterized by transporter class, family, and substrate specificity in three ciliates, Paramecium tetraurelia (Para), Tetrahymena thermophila (Tetra), and Ichthyophthirius multifiliis (Ich). In these three organisms, 1,326 of 39,600 proteins (3.4%), 1,017 of 24,800 proteins (4.2%), and 504 of 8,100 proteins (6.2%), respectively, were identified as integral membrane transport proteins. Thus, an inverse relationship was observed between the % transporters identified and the number of total proteins per genome reported. This surprising observation provides insight into the evolutionary process, giving rise to genome reduction following whole genome duplication (as in the case of Para) or during pathogenic association with a host organism (Ich). Of these transport proteins in Para and Tetra, about 41% were channels (more than any other type of organism studied), 31% were secondary carriers (fewer than most eukaryotes) and 26% were primary active transporters, mostly ATP-hydrolysis driven (more than most other eukaryotes). In Ich, the number of channels was selectively reduced by 66%, relative to Para and Tetra. Para has four times more inorganic anion transporters than Tetra, and Ich has nonselectively lost most of these. Tetra and Ich preferentially transport sugars and monocarboxylates while Para prefers di- and tricarboxylates. These observations serve to characterize the transport proteins of these related ciliates, providing insight into their nutrition and metabolism. PMID:25099884

  11. An integrated semiconductor device enabling non-optical genome sequencing.

    PubMed

    Rothberg, Jonathan M; Hinz, Wolfgang; Rearick, Todd M; Schultz, Jonathan; Mileski, William; Davey, Mel; Leamon, John H; Johnson, Kim; Milgrew, Mark J; Edwards, Matthew; Hoon, Jeremy; Simons, Jan F; Marran, David; Myers, Jason W; Davidson, John F; Branting, Annika; Nobile, John R; Puc, Bernard P; Light, David; Clark, Travis A; Huber, Martin; Branciforte, Jeffrey T; Stoner, Isaac B; Cawley, Simon E; Lyons, Michael; Fu, Yutao; Homer, Nils; Sedova, Marina; Miao, Xin; Reed, Brian; Sabina, Jeffrey; Feierstein, Erika; Schorn, Michelle; Alanjary, Mohammad; Dimalanta, Eileen; Dressman, Devin; Kasinskas, Rachel; Sokolsky, Tanya; Fidanza, Jacqueline A; Namsaraev, Eugeni; McKernan, Kevin J; Williams, Alan; Roth, G Thomas; Bustillo, James

    2011-07-21

    The seminal importance of DNA sequencing to the life sciences, biotechnology and medicine has driven the search for more scalable and lower-cost solutions. Here we describe a DNA sequencing technology in which scalable, low-cost semiconductor manufacturing techniques are used to make an integrated circuit able to directly perform non-optical DNA sequencing of genomes. Sequence data are obtained by directly sensing the ions produced by template-directed DNA polymerase synthesis using all-natural nucleotides on this massively parallel semiconductor-sensing device or ion chip. The ion chip contains ion-sensitive, field-effect transistor-based sensors in perfect register with 1.2 million wells, which provide confinement and allow parallel, simultaneous detection of independent sequencing reactions. Use of the most widely used technology for constructing integrated circuits, the complementary metal-oxide semiconductor (CMOS) process, allows for low-cost, large-scale production and scaling of the device to higher densities and larger array sizes. We show the performance of the system by sequencing three bacterial genomes, its robustness and scalability by producing ion chips with up to 10 times as many sensors and sequencing a human genome. PMID:21776081

  12. STINGRAY: system for integrated genomic resources and analysis

    PubMed Central

    2014-01-01

    Background The STINGRAY system has been conceived to ease the tasks of integrating, analyzing, annotating and presenting genomic and expression data from Sanger and Next Generation Sequencing (NGS) platforms. Findings STINGRAY includes: (a) a complete and integrated workflow (more than 20 bioinformatics tools) ranging from functional annotation to phylogeny; (b) a MySQL database schema, suitable for data integration and user access control; and (c) a user-friendly graphical web-based interface that makes the system intuitive, facilitating the tasks of data analysis and annotation. Conclusion STINGRAY proved to be an easy-to-use and complete system for analyzing sequencing data. While both Sanger and NGS platforms are supported, the system could be faster using Sanger data, since the large NGS datasets could potentially slow down the MySQL database usage. STINGRAY is available at http://stingray.biowebdb.org and the open source code at http://sourceforge.net/projects/stingray-biowebdb/. PMID:24606808

  13. An integrated BAC/BIBAC-based physical and genetic map of the cotton genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Integrated genome-wide genetic and physical maps are crucial to many aspects of cotton genome research. We report a genome-wide BAC/BIBAC-based physical and genetic map of the upland cotton genome using a high-resolution and high-throughput capillary-based fingerprinting method. The map was constr...

  14. Theobroma cacao: A genetically integrated physical map and genome-scale comparative synteny analysis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A comprehensive integrated genomic framework is considered a centerpiece of genomic research. In collaboration with the USDA-ARS (SHRS) and Mars Inc., the Clemson University Genomics Institute (CUGI) has developed a genetically anchored physical map of the T. cacao genome. Three BAC libraries contai...

  15. Integrating mass spectrometry and genomics for cyanobacterial metabolite discovery.

    PubMed

    Moss, Nathan A; Bertin, Matthew J; Kleigrewe, Karin; Leão, Tiago F; Gerwick, Lena; Gerwick, William H

    2016-03-01

    Filamentous marine cyanobacteria produce bioactive natural products with both potential therapeutic value and capacity to be harmful to human health. Genome sequencing has revealed that cyanobacteria have the capacity to produce many more secondary metabolites than have been characterized. The biosynthetic pathways that encode cyanobacterial natural products are mostly uncharacterized, and lack of cyanobacterial genetic tools has largely prevented their heterologous expression. Hence, a combination of cutting edge and traditional techniques has been required to elucidate their secondary metabolite biosynthetic pathways. Here, we review the discovery and refined biochemical understanding of the olefin synthase and fatty acid ACP reductase/aldehyde deformylating oxygenase pathways to hydrocarbons, and the curacin A, jamaicamide A, lyngbyabellin, columbamide, and a trans-acyltransferase macrolactone pathway encoding phormidolide. We integrate into this discussion the use of genomics, mass spectrometric networking, biochemical characterization, and isolation and structure elucidation techniques. PMID:26578313

  16. [From population genetics to population genomics of forest trees: integrated population genomics approach].

    PubMed

    Krutovskiĭ, K V

    2006-10-01

    Early works by Altukhov and his associates on pine and spruce laid the foundation for Russian population genetic studies on tree species with the use of molecular genetic markers. In recent years, these species have become especially popular as nontraditional eukaryotic models for population and evolutionary genomic research. Tree species with large, cross-pollinating native populations, high genetic and phenotypic variation, growing in diverse environments and affected by environmental changes during hundreds of years of their individual development, are an ideal model for studying the molecular genetic basis of adaptation. The great advance in this field is due to the rapid development of population genomics in the last few years. In the broad sense, population genomics is a novel, fast-developing discipline, combining traditional population genetic approaches with the genomic level of analysis. Thousands of genes with known function and sometimes known genomic localization can be simultaneously studied in many individuals. This opens new prospects for obtaining statistical estimates for a great number of genes and segregating elements. Mating system, gene exchange, reproductive population size, population disequilibrium, interaction among populations, and many other traditional problems of population genetics can be now studied using data on variation in many genes. Moreover, population genomic analysis allows one to distinguish factors that affect individual genes, alleles, or nucleotides (such as, for example, natural selection) from factors affecting the entire genome (e.g., demography). This paper presents a brief review of traditional methods of studying genetic variation in forest tree species and introduces a new, integrated population genomics approach. The main stages of the latter are: (1) selection of genes, which are tentatively involved in variation of adaptive traits, by means of a detailed examination of the regulation and the expression of

  17. TALEN-mediated genome editing: prospects and perspectives

    SciTech Connect

    Wright, DA; Li, T; Yang, B; Spalding, MH

    2014-08-15

    Genome editing is the practice of making predetermined and precise changes to a genome by controlling the location of DNA DSBs (double-strand breaks) and manipulating the cell's repair mechanisms. This technology results from harnessing natural processes that have taken decades and multiple lines of inquiry to understand. Through many false starts and iterative technology advances, the goal of genome editing is just now falling under the control of human hands as a routine and broadly applicable method. The present review attempts to define the technique and capture the discovery process while following its evolution from meganucleases and zinc finger nucleases to the current state of the art: TALEN (transcription-activator-like effector nuclease) technology. We also discuss factors that influence success, technical challenges, and future prospects of this quickly evolving area of study and application.

  18. Frequency and Spectrum of Genomic Integration of Recombinant Adeno-Associated Virus Serotype 8 Vector in Neonatal Mouse Liver

    PubMed Central

    Inagaki, Katsuya; Piao, Chuncheng; Kotchey, Nicole M.; Wu, Xiaolin; Nakai, Hiroyuki

    2008-01-01

    Neonatal injection of recombinant adeno-associated virus serotype 8 (rAAV8) vectors results in widespread transduction in multiple organs and therefore holds promise in neonatal gene therapy. On the other hand, insertional mutagenesis causing liver cancer has been implicated in rAAV-mediated neonatal gene transfer. Here, to better understand rAAV integration in neonatal livers, we investigated the frequency and spectrum of genomic integration of rAAV8 vectors in the liver following intraperitoneal injection of 2.0 × 10^11 vector genomes at birth. This dose was sufficient to transduce a majority of hepatocytes in the neonatal period. In the first approach, we injected mice with a β-galactosidase-expressing vector at birth and quantified rAAV integration events by taking advantage of liver regeneration in a chronic hepatitis animal model and following partial hepatectomy. In the second approach, we performed a new, quantitative rAAV vector genome rescue assay by which we identified rAAV integration sites and quantified integrations. As a result, we find that at least ∼0.05% of hepatocytes contained rAAV integration, while the average copy number of integrated double-stranded vector genome per cell in the liver was ∼0.2, suggesting concatemer integration. Twenty-three of 34 integrations (68%) occurred in genes, but none of them were near the mir-341 locus, the common rAAV integration site found in mouse hepatocellular carcinoma. Thus, rAAV8 vector integration occurs preferentially in genes at a frequency of 1 in approximately 10^3 hepatocytes when a majority of hepatocytes are once transduced in the neonatal period. Further studies are warranted to elucidate the relationship between vector dose and integration frequency or spectrum. PMID:18614641
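
    The summary figures in this abstract follow from simple arithmetic on the numbers it reports. The sketch below reproduces them as a sanity check. All variable names are illustrative, and the lower-bound fraction of ∼0.05% is treated as an exact value purely for the calculation.

```python
import math

# Integration sites reported in the abstract.
in_genes, total_sites = 23, 34
percent_in_genes = 100.0 * in_genes / total_sites  # about 68%

# "At least ~0.05% of hepatocytes" treated here as an exact fraction.
min_fraction_hepatocytes = 0.05 / 100
cells_per_integration = 1 / min_fraction_hepatocytes  # about 2000 cells

# Order of magnitude of the frequency, i.e. "1 in approximately 10^3".
order_of_magnitude = round(math.log10(cells_per_integration))

print(f"in genes: {percent_in_genes:.0f}%")
print(f"about 1 integration-containing cell per {cells_per_integration:.0f} hepatocytes")
print(f"order of magnitude: 10^{order_of_magnitude}")
```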

  19. Integrated Genome-Based Studies of Shewanella Ecophysiology

    SciTech Connect

    Margrethe H. Serres

    2012-06-29

    Shewanella oneidensis MR-1 is a motile, facultative γ-Proteobacterium with remarkable respiratory versatility; it can utilize a range of organic and inorganic compounds as terminal electron acceptors for anaerobic metabolism. The ability to effectively reduce nitrate, S0, polyvalent metals and radionuclides has established MR-1 as an important model dissimilatory metal-reducing microorganism for genome-based investigations of biogeochemical transformation of metals and radionuclides that are of concern to the U.S. Department of Energy (DOE) sites nationwide. Metal-reducing bacteria such as Shewanella also have a highly developed capacity for extracellular transfer of respiratory electrons to solid phase Fe and Mn oxides as well as directly to anode surfaces in microbial fuel cells. More broadly, Shewanellae are recognized free-living microorganisms and members of microbial communities involved in the decomposition of organic matter and the cycling of elements in aquatic and sedimentary systems. To function and compete in environments that are subject to spatial and temporal environmental change, Shewanella must be able to sense and respond to such changes and therefore requires relatively robust sensing and regulation systems. The overall goal of this project is to apply the tools of genomics, leveraging the availability of genome sequence for 18 additional strains of Shewanella, to better understand the ecophysiology and speciation of respiratory-versatile members of this important genus. To understand these systems we propose to use genome-based approaches to investigate Shewanella as a system of integrated networks; first describing key cellular subsystems - those involved in signal transduction, regulation, and metabolism - then building towards understanding the function of whole cells and, eventually, cells within populations. As a general approach, this project will employ complementary "top-down" - bioinformatics-based genome functional predictions, high

  20. An integrative computational approach for prioritization of genomic variants.

    PubMed

    Dubchak, Inna; Balasubramanian, Sandhya; Wang, Sheng; Meydan, Cem; Sulakhe, Dinanath; Poliakov, Alexander; Börnigen, Daniela; Xie, Bingqing; Taylor, Andrew; Ma, Jianzhu; Paciorkowski, Alex R; Mirzaa, Ghayda M; Dave, Paul; Agam, Gady; Xu, Jinbo; Al-Gazali, Lihadh; Mason, Christopher E; Ross, M Elizabeth; Maltsev, Natalia; Gilliam, T Conrad

    2014-01-01

    An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. The study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest. PMID:25506935

  1. An Integrative Computational Approach for Prioritization of Genomic Variants

    PubMed Central

    Wang, Sheng; Meydan, Cem; Sulakhe, Dinanath; Poliakov, Alexander; Börnigen, Daniela; Xie, Bingqing; Taylor, Andrew; Ma, Jianzhu; Paciorkowski, Alex R.; Mirzaa, Ghayda M.; Dave, Paul; Agam, Gady; Xu, Jinbo; Al-Gazali, Lihadh; Mason, Christopher E.; Ross, M. Elizabeth; Maltsev, Natalia; Gilliam, T. Conrad

    2014-01-01

    An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. The study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest. PMID:25506935

  2. An integrative computational approach for prioritization of genomic variants

    SciTech Connect

    Dubchak, Inna; Balasubramanian, Sandhya; Wang, Sheng; Meydan, Cem; Sulakhe, Dinanath; Poliakov, Alexander; Börnigen, Daniela; Xie, Bingqing; Taylor, Andrew; Ma, Jianzhu; Paciorkowski, Alex R.; Mirzaa, Ghayda M.; Dave, Paul; Agam, Gady; Xu, Jinbo; Al-Gazali, Lihadh; Mason, Christopher E.; Ross, M. Elizabeth; Maltsev, Natalia; Gilliam, T. Conrad; Huang, Qingyang

    2014-12-15

    An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. The study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest.

  3. An integrative computational approach for prioritization of genomic variants

    DOE PAGESBeta

    Dubchak, Inna; Balasubramanian, Sandhya; Wang, Sheng; Meydan, Cem; Sulakhe, Dinanath; Poliakov, Alexander; Börnigen, Daniela; Xie, Bingqing; Taylor, Andrew; Ma, Jianzhu; et al

    2014-12-15

    An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. The study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest.

  4. Integrative Genomics Identifies Gene Signature Associated with Melanoma Ulceration

    PubMed Central

    Toth, Reka; Vizkeleti, Laura; Hernandez-Vargas, Hector; Lazar, Viktoria; Emri, Gabriella; Szatmari, Istvan; Herceg, Zdenko; Adany, Roza; Balazs, Margit

    2013-01-01

    Background Despite the extensive research approaches applied to characterise malignant melanoma, no specific molecular markers are available that are clearly related to the progression of this disease. In this study, our aims were to define a gene expression signature associated with the clinical outcome of melanoma patients and to provide an integrative interpretation of the gene expression, copy number alteration, and promoter methylation patterns that contribute to clinically relevant molecular functional alterations. Methods Gene expression profiles were determined using the Affymetrix U133 Plus2.0 array. The NimbleGen Human CGH Whole-Genome Tiling array was used to define CNAs, and the Illumina GoldenGate Methylation platform was applied to characterise the methylation patterns of overlapping genes. Results We identified two subclasses of primary melanoma: one representing patients with better prognoses and the other being characteristic of patients with unfavourable outcomes. We assigned 1,080 genes as being significantly correlated with ulceration; 987 genes were downregulated and significantly enriched in the p53, Nf-kappaB, and WNT/beta-catenin pathways. Through integrated genome analysis, we defined 150 downregulated genes whose expression correlated with copy number losses in ulcerated samples. These genes were significantly enriched on chromosome 6q and 10q, which contained a total of 36 genes. Ten of these genes were downregulated and involved in cell-cell and cell-matrix adhesion or apoptosis. The expression and methylation patterns of additional genes exhibited an inverse correlation, suggesting that transcriptional silencing of these genes is driven by epigenetic events. Conclusion Using an integrative genomic approach, we were able to identify functionally relevant molecular hotspots characterised by copy number losses and promoter hypermethylation in distinct molecular subtypes of melanoma that contribute to specific transcriptomic silencing

  5. A New Approach to Dissect Nuclear Organization: TALE-Mediated Genome Visualization (TGV).

    PubMed

    Miyanari, Yusuke

    2016-01-01

    Spatiotemporal organization of chromatin within the nucleus has so far remained elusive. Live visualization of nuclear remodeling could be a promising approach to understand its functional relevance in genome functions and mechanisms regulating genome architecture. Recent technological advances in live imaging of chromosomes have begun to explore the biological roles of the movement of the chromatin within the nucleus. Here I describe a new technique, called TALE-mediated genome visualization (TGV), which allows us to visualize endogenous repetitive sequences including centromeric, pericentromeric, and telomeric repeats in living cells. PMID:26443216

  6. Inverted genomic segments and complex triplication rearrangements are mediated by inverted repeats in the human genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We identified complex genomic rearrangements consisting of intermixed duplications and triplications of genomic segments at the MECP2 and PLP1 loci. These complex rearrangements were characterized by a triplicated segment embedded within a duplication in 11 unrelated subjects. Notably, only two brea...

  7. Bilayer-thickness-mediated interactions between integral membrane proteins.

    PubMed

    Kahraman, Osman; Koch, Peter D; Klug, William S; Haselwandter, Christoph A

    2016-04-01

    Hydrophobic thickness mismatch between integral membrane proteins and the surrounding lipid bilayer can produce lipid bilayer thickness deformations. Experiment and theory have shown that protein-induced lipid bilayer thickness deformations can yield energetically favorable bilayer-mediated interactions between integral membrane proteins, and large-scale organization of integral membrane proteins into protein clusters in cell membranes. Within the continuum elasticity theory of membranes, the energy cost of protein-induced bilayer thickness deformations can be captured by considering compression and expansion of the bilayer hydrophobic core, membrane tension, and bilayer bending, resulting in biharmonic equilibrium equations describing the shape of lipid bilayers for a given set of bilayer-protein boundary conditions. Here we develop a combined analytic and numerical methodology for the solution of the equilibrium elastic equations associated with protein-induced lipid bilayer deformations. Our methodology allows accurate prediction of thickness-mediated protein interactions for arbitrary protein symmetries at arbitrary protein separations and relative orientations. We provide exact analytic solutions for cylindrical integral membrane proteins with constant and varying hydrophobic thickness, and develop perturbative analytic solutions for noncylindrical protein shapes. We complement these analytic solutions, and assess their accuracy, by developing both finite element and finite difference numerical solution schemes. We provide error estimates of our numerical solution schemes and systematically assess their convergence properties. Taken together, the work presented here puts into place an analytic and numerical framework which allows calculation of bilayer-mediated elastic interactions between integral membrane proteins for the complicated protein shapes suggested by structural biology and at the small protein separations most relevant for the crowded membrane

  8. Bilayer-thickness-mediated interactions between integral membrane proteins

    NASA Astrophysics Data System (ADS)

    Kahraman, Osman; Koch, Peter D.; Klug, William S.; Haselwandter, Christoph A.

    2016-04-01

    Hydrophobic thickness mismatch between integral membrane proteins and the surrounding lipid bilayer can produce lipid bilayer thickness deformations. Experiment and theory have shown that protein-induced lipid bilayer thickness deformations can yield energetically favorable bilayer-mediated interactions between integral membrane proteins, and large-scale organization of integral membrane proteins into protein clusters in cell membranes. Within the continuum elasticity theory of membranes, the energy cost of protein-induced bilayer thickness deformations can be captured by considering compression and expansion of the bilayer hydrophobic core, membrane tension, and bilayer bending, resulting in biharmonic equilibrium equations describing the shape of lipid bilayers for a given set of bilayer-protein boundary conditions. Here we develop a combined analytic and numerical methodology for the solution of the equilibrium elastic equations associated with protein-induced lipid bilayer deformations. Our methodology allows accurate prediction of thickness-mediated protein interactions for arbitrary protein symmetries at arbitrary protein separations and relative orientations. We provide exact analytic solutions for cylindrical integral membrane proteins with constant and varying hydrophobic thickness, and develop perturbative analytic solutions for noncylindrical protein shapes. We complement these analytic solutions, and assess their accuracy, by developing both finite element and finite difference numerical solution schemes. We provide error estimates of our numerical solution schemes and systematically assess their convergence properties. Taken together, the work presented here puts into place an analytic and numerical framework which allows calculation of bilayer-mediated elastic interactions between integral membrane proteins for the complicated protein shapes suggested by structural biology and at the small protein separations most relevant for the crowded membrane

  9. Potential pitfalls of CRISPR/Cas9-mediated genome editing.

    PubMed

    Peng, Rongxue; Lin, Guigao; Li, Jinming

    2016-04-01

    Recently, a novel technique named the clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein (Cas)9 system has been rapidly developed. This genome editing tool has improved our ability tremendously with respect to exploring the pathogenesis of diseases and correcting disease mutations, as well as phenotypes. With a short guide RNA, Cas9 can be precisely directed to target sites, and functions as an endonuclease to efficiently produce breaks in DNA double strands. Over the past 30 years, CRISPR has evolved from the 'curious sequences of unknown biological function' into a promising genome editing tool. As a result of the incessant development in the CRISPR/Cas9 system, Cas9 co-expressed with custom guide RNAs has been successfully used in a variety of cells and organisms. This genome editing technology can also be applied to synthetic biology, functional genomic screening, transcriptional modulation and gene therapy. However, although CRISPR/Cas9 has a broad range of action in science, there are several aspects that affect its efficiency and specificity, including Cas9 activity, target site selection and short guide RNA design, delivery methods, off-target effects and the incidence of homology-directed repair. In the present review, we highlight the factors that affect the utilization of CRISPR/Cas9, as well as possible strategies for handling any problems. Addressing these issues will allow us to take better advantage of this technique. In addition, we also review the history and rapid development of the CRISPR/Cas system from the time of its initial discovery in 2012. PMID:26535798
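
    Target site selection, one of the factors listed in this abstract, usually starts by enumerating candidate sites in a sequence. The sketch below is a minimal illustration under assumptions that are not stated in the abstract: it takes the commonly used Streptococcus pyogenes Cas9, which recognizes a 20-nucleotide protospacer followed by an NGG PAM, and it scans only the strand it is given. The function name `find_spcas9_targets` is hypothetical, and no off-target or efficiency scoring is attempted.

```python
import re


def find_spcas9_targets(seq, protospacer_len=20):
    """Return (start, protospacer, pam) for each candidate site on this strand.

    Assumes an NGG PAM immediately 3' of the protospacer (SpCas9 convention).
    """
    seq = seq.upper()
    # A lookahead lets overlapping candidates all be reported.
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Toy sequence: 20 nt of repeating ACGT followed by the PAM "AGG".
example = "ACGT" * 5 + "AGG" + "CC"
for start, protospacer, pam in find_spcas9_targets(example):
    print(start, protospacer, pam)
```

    A fuller search would also scan the reverse complement and filter candidates against the rest of the genome, which bears on the off-target effects the review discusses.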

  10. Integrated Genomic and Gene Expression Profiling Identifies Two Major Genomic Circuits in Urothelial Carcinoma

    PubMed Central

    Lindgren, David; Sjödahl, Gottfrid; Lauss, Martin; Staaf, Johan; Chebil, Gunilla; Lövgren, Kristina; Gudjonsson, Sigurdur; Liedberg, Fredrik; Patschan, Oliver; Månsson, Wiking; Fernö, Mårten; Höglund, Mattias

    2012-01-01

    Similar to other malignancies, urothelial carcinoma (UC) is characterized by specific recurrent chromosomal aberrations and gene mutations. However, the interconnection between specific genomic alterations, and how patterns of chromosomal alterations adhere to different molecular subgroups of UC, is less clear. We applied tiling resolution array CGH to 146 cases of UC and identified a number of regions harboring recurrent focal genomic amplifications and deletions. Several potential oncogenes were included in the amplified regions, including known oncogenes like E2F3, CCND1, and CCNE1, as well as new candidate genes, such as SETDB1 (1q21), and BCL2L1 (20q11). We next combined genome profiling with global gene expression, gene mutation, and protein expression data and identified two major genomic circuits operating in urothelial carcinoma. The first circuit was characterized by FGFR3 alterations, overexpression of CCND1, and 9q and CDKN2A deletions. The second circuit was defined by E2F3 amplifications and RB1 deletions, as well as gains of 5p, deletions at PTEN and 2q36, 16q, 20q, and elevated CDKN2A levels. TP53/MDM2 alterations were common for advanced tumors within the two circuits. Our data also suggest a possible RAS/RAF circuit. The tumors with worst prognosis showed a gene expression profile that indicated a keratinized phenotype. Taken together, our integrative approach revealed at least two separate networks of genomic alterations linked to the molecular diversity seen in UC, and that these circuits may reflect distinct pathways of tumor development. PMID:22685613

  11. The integrated microbial genomes (IMG) system in 2007: data content and analysis tool extensions

    SciTech Connect

    Markowitz, Victor M.; Szeto, Ernest; Palaniappan, Krishna; Grechkin, Yuri; Chu, Ken; Chen, I-Min A.; Dubchak, Inna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2007-08-01

    The Integrated Microbial Genomes (IMG) system is a data management, analysis and annotation platform for all publicly available genomes. IMG contains both draft and complete JGI microbial genomes integrated with all other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and annotating genomes, genes and functions, individually or in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through quarterly releases. IMG is provided by the DOE-Joint Genome Institute (JGI) and is available from http://img.jgi.doe.gov.

  12. Integrated cytogenetics and genomics analysis of transposable elements in the Nile tilapia, Oreochromis niloticus.

    PubMed

    Valente, Guilherme; Kocher, Thomas; Eickbush, Thomas; Simões, Rafael P; Martins, Cesar

    2016-06-01

    Integration of cytogenetics and genomics has become essential to a better view of the architecture and function of genomes. Although advances in genomic sequencing have contributed to the study of genes and genomes, the repetitive DNA fraction of the genome is still enigmatic and poorly understood. Among repeated DNAs, transposable elements (TEs) are major components of eukaryotic chromatin and their investigation has been hindered even after the availability of whole sequenced genomes. The cytogenetic mapping of TEs in chromosomes has proved to be of high value to integrate information from the micro level of nucleotide sequence to a cytological view of chromosomes. Different TEs have been cytogenetically mapped in cichlids; however, neither details about their genomic arrangement nor appropriated copy number are well defined by these approaches. The current study integrates the distribution of TEs in the genome of the Nile tilapia, Oreochromis niloticus, based on cytogenetic and genomics/bioinformatics approaches. The results showed that some elements are not randomly distributed and that some are genomic dependent on each other. Moreover, we found extensive overlap between genomics and cytogenetics data and that tandem duplication may be the major mechanism responsible for the genomic dynamics of the TEs analyzed here. This paper provides insights into the genomic organization of TEs under an integrated view based on cytogenetics and genomics. PMID:26860923

  13. IMG 4 version of the integrated microbial genomes comparative analysis system.

    PubMed

    Markowitz, Victor M; Chen, I-Min A; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N; Kyrpides, Nikos C

    2014-01-01

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG's data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG's annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu). PMID:24165883

  14. IMG 4 version of the integrated microbial genomes comparative analysis system

    SciTech Connect

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2013-10-27

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Finally, different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  15. Red-Mediated Transposition and Final Release of the Mini-F Vector of a Cloned Infectious Herpesvirus Genome

    PubMed Central

    Wussow, Felix; Fickenscher, Helmut; Tischer, B. Karsten

    2009-01-01

    Bacterial artificial chromosomes (BACs) are well-established cloning vehicles for functional genomics and for constructing targeting vectors and infectious viral DNA clones. Red-recombination-based mutagenesis techniques have enabled the manipulation of BACs in Escherichia coli without any remaining operational sequences. Here, we describe that the F-factor-derived vector sequences can be inserted into a novel position and seamlessly removed from the present location of the BAC-cloned DNA via synchronous Red-recombination in E. coli in an en passant mutagenesis-based procedure. Using this technique, the mini-F elements of a cloned infectious varicella zoster virus (VZV) genome were specifically transposed into novel positions distributed over the viral DNA to generate six different BAC variants. In comparison to the other constructs, a BAC variant with mini-F sequences directly inserted into the junction of the genomic termini resulted in highly efficient viral DNA replication-mediated spontaneous vector excision upon virus reconstitution in transfected VZV-permissive eukaryotic cells. Moreover, the derived vector-free recombinant progeny exhibited virtually indistinguishable genome properties and replication kinetics to the wild-type virus. Thus, a sequence-independent, efficient, and easy-to-apply mini-F vector transposition procedure eliminates the last hurdle to perform virtually any kind of imaginable targeted BAC modifications in E. coli. The herpesviral terminal genomic junction was identified as an optimal mini-F vector integration site for the construction of an infectious BAC, which allows the rapid generation of mutant virus without any unwanted secondary genome alterations. The novel mini-F transposition technique can be a valuable tool to optimize, repair or restructure other established BACs as well and may facilitate the development of gene therapy or vaccine vectors. PMID:19997639

  16. Integrated genome-wide analysis of genomic changes and gene regulation in human adrenocortical tissue samples

    PubMed Central

    Gara, Sudheer Kumar; Wang, Yonghong; Patel, Dhaval; Liu-Chittenden, Yi; Jain, Meenu; Boufraqech, Myriem; Zhang, Lisa; Meltzer, Paul S.; Kebebew, Electron

    2015-01-01

    To gain insight into the pathogenesis of adrenocortical carcinoma (ACC) and whether there is progression from normal-to-adenoma-to-carcinoma, we performed genome-wide gene expression, gene methylation, microRNA expression and comparative genomic hybridization (CGH) analysis in human adrenocortical tissue (normal, adrenocortical adenomas and ACC) samples. A pairwise comparison of normal, adrenocortical adenomas and ACC gene expression profiles with more than four-fold expression differences and an adjusted P-value < 0.05 revealed no major differences in normal versus adrenocortical adenoma whereas there are 808 and 1085, respectively, dysregulated genes between ACC versus adrenocortical adenoma and ACC versus normal. The majority of the dysregulated genes in ACC were downregulated. By integrating the CGH, gene methylation and expression profiles of potential miRNAs with the gene expression of dysregulated genes, we found that there are more alterations in ACC versus normal than in ACC versus adrenocortical adenoma. Importantly, we identified several novel molecular pathways that are associated with dysregulated genes and further experimentally validated that oncostatin m signaling induces caspase 3 dependent apoptosis and suppresses cell proliferation. Finally, we propose that there is a higher number of genomic changes from normal-to-adenoma-to-carcinoma and identified oncostatin m signaling as a plausible druggable pathway for therapeutics. PMID:26446994

  17. GDR (Genome Database for Rosaceae): integrated web-database for Rosaceae genomics and genetics data.

    PubMed

    Jung, Sook; Staton, Margaret; Lee, Taein; Blenda, Anna; Svancara, Randall; Abbott, Albert; Main, Dorrie

    2008-01-01

    The Genome Database for Rosaceae (GDR) is a central repository of curated and integrated genetics and genomics data of Rosaceae, an economically important family which includes apple, cherry, peach, pear, raspberry, rose and strawberry. GDR contains annotated databases of all publicly available Rosaceae ESTs, the genetically anchored peach physical map, Rosaceae genetic maps and comprehensively annotated markers and traits. The ESTs are assembled to produce unigene sets of each genus and the entire Rosaceae. Other annotations include putative function, microsatellites, open reading frames, single nucleotide polymorphisms, gene ontology terms and anchored map position where applicable. Most of the published Rosaceae genetic maps can be viewed and compared through CMap, the comparative map viewer. The peach physical map can be viewed using WebFPC/WebChrom, and also through our integrated GDR map viewer, which serves as a portal to the combined genetic, transcriptome and physical mapping information. ESTs, BACs, markers and traits can be queried by various categories and the search result sites are linked to the mapping visualization tools. GDR also provides online analysis tools such as a batch BLAST/FASTA server for the GDR datasets, a sequence assembly server and microsatellite and primer detection tools. GDR is available at http://www.rosaceae.org. PMID:17932055

  18. Cas9-Mediated Genome Engineering in Drosophila melanogaster.

    PubMed

    Housden, Benjamin E; Perrimon, Norbert

    2016-01-01

    The recent development of the CRISPR-Cas9 system for genome engineering has revolutionized our ability to modify the endogenous DNA sequence of many organisms, including Drosophila. This system allows alteration of DNA sequences in situ with single base-pair precision and is now being used for a wide variety of applications. To use the CRISPR system effectively, various design parameters must be considered, including single guide RNA target site selection and identification of successful editing events. Here, we review recent advances in CRISPR methodology in Drosophila and introduce protocols for some of the more difficult aspects of CRISPR implementation: designing and generating CRISPR reagents and detecting indel mutations by high-resolution melt analysis. PMID:27587786
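
    The target site selection step mentioned in this record can be illustrated with a minimal sketch. It is not taken from the cited protocols: the function name find_candidate_sites, the 20-nt guide length and the NGG PAM (the motif recognized by the commonly used S. pyogenes Cas9) are assumptions made for illustration, and only the given strand is scanned.

```python
import re


def find_candidate_sites(seq, guide_len=20):
    """List (start, protospacer, pam) for every guide_len-mer followed by NGG.

    Hypothetical helper: scans only the strand it is given and does no
    off-target or efficiency scoring.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    example = "ACGT" * 5 + "AGG" + "CGG"  # made-up sequence, not real data
    for start, protospacer, pam in find_candidate_sites(example):
        print(start, protospacer, pam)
```

    A practical design tool would also scan the reverse complement and rank candidates by predicted off-target matches; those steps are deliberately left out here.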

  19. Androgen receptor-mediated non-genomic regulation of prostate cancer cell proliferation

    PubMed Central

    Liao, Ross S.; Ma, Shihong; Miao, Lu; Li, Rui; Yin, Yi

    2013-01-01

    Androgen receptor (AR)-mediated signaling is necessary for prostate cancer cell proliferation and an important target for therapeutic drug development. Canonically, AR signals through a genomic or transcriptional pathway, involving the translocation of androgen-bound AR to the nucleus, its binding to cognate androgen response elements on promoter, with ensuing modulation of target gene expression, leading to cell proliferation. However, prostate cancer cells can show dose-dependent proliferation responses to androgen within minutes, without the need for genomic AR signaling. This proliferation response known as the non-genomic AR signaling is mediated by cytoplasmic AR, which facilitates the activation of kinase-signaling cascades, including the Ras-Raf-1, phosphatidyl-inositol 3-kinase (PI3K)/Akt and protein kinase C (PKC), which in turn converge on mitogen-activated protein kinase (MAPK)/extracellular signal-regulated kinase (ERK) activation, leading to cell proliferation. Further, since activated ERK may also phosphorylate AR and its coactivators, the non-genomic AR signaling may enhance AR genomic activity. Non-genomic AR signaling may occur in an ERK-independent manner, via activation of mammalian target of rapamycin (mTOR) pathway, or modulation of intracellular Ca2+ concentration through plasma membrane G protein-coupled receptors (GPCRs). These data suggest that therapeutic strategies aimed at preventing AR nuclear translocation and genomic AR signaling alone may not completely abrogate AR signaling. Thus, elucidation of mechanisms that underlie non-genomic AR signaling may identify potential mechanisms of resistance to current anti-androgens and help developing novel therapies that abolish all AR signaling in prostate cancer. PMID:26816736

  20. Examination of host genome for the presence of integrated fragments of Solenopsis invicta virus 1

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A series of oligonucleotide primer pairs covering the entire genome of Solenopsis invicta virus 1 (SINV-1) were used to probe the Solenopsis invicta genome for integrated fragments of the viral genome. All of the oligonucleotide primer sets yielded amplicons of anticipated size from cDNA created f...

  1. Integration of optical devices and nanotechnology for conducting genome research

    NASA Astrophysics Data System (ADS)

    Chung, Pei-Yu; Parag, Parekh; Zhu, Zhi; Chegini, Claudine; Schultz, Gregory; Tan, Weihong; Jiang, Peng; Batich, Christopher

    2011-06-01

    SPR-based sensing techniques use spectroscopy to transduce biomolecular binding events into variations in spectra. This label-free and real-time technique has been widely applied in biomedical research. In this study, we present a spectroscopy-based SPR system for monitoring binding between human serum albumin and a nucleic acid library. Compared with the conventional SPR technique, this novel system utilizes cost-effective nanostructured arrays and a portable UV-Vis spectrometer. These advantages enable the promising development of a portable analytical device for widespread applications. Meanwhile, the multispectral analysis used here also helps increase the sensitivity, thus transducing the binding event into an optical signal efficiently. The result demonstrates that this cost-effective and portable system could be applied in the future to select target aptamers. Moreover, we also present surface enhanced Raman spectroscopy (SERS) on the nanostructured arrays in a label-free approach. This integration of multiple spectroscopy technologies is utilized for conducting genome research efficiently.

  2. Genome and proteome annotation: organization, interpretation and integration

    PubMed Central

    Reeves, Gabrielle A.; Talavera, David; Thornton, Janet M.

    2008-01-01

    Recent years have seen a huge increase in the generation of genomic and proteomic data. This has been due to improvements in current biological methodologies, the development of new experimental techniques and the use of computers as support tools. All these raw data are useless if they cannot be properly analysed, annotated, stored and displayed. Consequently, a vast number of resources have been created to present the data to the wider community. Annotation tools and databases provide the means to disseminate these data and to comprehend their biological importance. This review examines the various aspects of annotation: type, methodology and availability. Moreover, it pays special attention to novel annotation fields, such as that of phenotypes, and highlights the recent efforts focused on integrating annotations. PMID:19019817

  3. CRISPR/Cas9-mediated genome engineering of CHO cell factories: Application and perspectives.

    PubMed

    Lee, Jae Seong; Grav, Lise Marie; Lewis, Nathan E; Faustrup Kildegaard, Helene

    2015-07-01

    Chinese hamster ovary (CHO) cells are the most widely used production host for therapeutic proteins. With the recent emergence of CHO genome sequences, CHO cell line engineering has taken on a new aspect through targeted genome editing. The bacterial clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) system enables rapid, easy and efficient engineering of mammalian genomes. It has a wide range of applications from modification of individual genes to genome-wide screening or regulation of genes. Facile genome editing using CRISPR/Cas9 empowers researchers in the CHO community to elucidate the mechanistic basis behind high level production of proteins and product quality attributes of interest. In this review, we describe the basis of CRISPR/Cas9-mediated genome editing and its application for development of next generation CHO cell factories while highlighting both future perspectives and challenges. As one of the main drivers for the CHO systems biology era, genome engineering with CRISPR/Cas9 will pave the way for rational design of CHO cell factories. PMID:26058577

  4. Integrative pathway genomics of lung function and airflow obstruction.

    PubMed

    Gharib, Sina A; Loth, Daan W; Soler Artigas, María; Birkland, Timothy P; Wilk, Jemma B; Wain, Louise V; Brody, Jennifer A; Obeidat, Ma'en; Hancock, Dana B; Tang, Wenbo; Rawal, Rajesh; Boezen, H Marike; Imboden, Medea; Huffman, Jennifer E; Lahousse, Lies; Alves, Alexessander C; Manichaikul, Ani; Hui, Jennie; Morrison, Alanna C; Ramasamy, Adaikalavan; Smith, Albert Vernon; Gudnason, Vilmundur; Surakka, Ida; Vitart, Veronique; Evans, David M; Strachan, David P; Deary, Ian J; Hofman, Albert; Gläser, Sven; Wilson, James F; North, Kari E; Zhao, Jing Hua; Heckbert, Susan R; Jarvis, Deborah L; Probst-Hensch, Nicole; Schulz, Holger; Barr, R Graham; Jarvelin, Marjo-Riitta; O'Connor, George T; Kähönen, Mika; Cassano, Patricia A; Hysi, Pirro G; Dupuis, Josée; Hayward, Caroline; Psaty, Bruce M; Hall, Ian P; Parks, William C; Tobin, Martin D; London, Stephanie J

    2015-12-01

    Chronic respiratory disorders are important contributors to the global burden of disease. Genome-wide association studies (GWASs) of lung function measures have identified several trait-associated loci, but explain only a modest portion of the phenotypic variability. We postulated that integrating pathway-based methods with GWASs of pulmonary function and airflow obstruction would identify a broader repertoire of genes and processes influencing these traits. We performed two independent GWASs of lung function and applied gene set enrichment analysis to one of the studies and validated the results using the second GWAS. We identified 131 significantly enriched gene sets associated with lung function and clustered them into larger biological modules involved in diverse processes including development, immunity, cell signaling, proliferation and arachidonic acid. We found that enrichment of gene sets was not driven by GWAS-significant variants or loci, but instead by those with less stringent association P-values. Next, we applied pathway enrichment analysis to a meta-analyzed GWAS of airflow obstruction. We identified several biologic modules that functionally overlapped with those associated with pulmonary function. However, differences were also noted, including enrichment of extracellular matrix (ECM) processes specifically in the airflow obstruction study. Network analysis of the ECM module implicated a candidate gene, matrix metalloproteinase 10 (MMP10), as a putative disease target. We used a knockout mouse model to functionally validate MMP10's role in influencing lung's susceptibility to cigarette smoke-induced emphysema. By integrating pathway analysis with population-based genomics, we unraveled biologic processes underlying pulmonary function traits and identified a candidate gene for obstructive lung disease. PMID:26395457
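
    The idea of testing whether a gene set is enriched among trait-associated genes can be sketched as follows. This is a swapped-in simplification, not the study's method: it uses a hypergeometric over-representation test rather than the rank-based gene set enrichment analysis named in the abstract, and the function name and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_in_set, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background: genes tested in total
    n_in_set:     background genes that belong to the gene set
    n_selected:   genes called trait-associated
    n_overlap:    selected genes that also belong to the gene set
    """
    total = comb(n_background, n_selected)
    upper = min(n_in_set, n_selected)
    hits = sum(
        comb(n_in_set, k) * comb(n_background - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return hits / total


if __name__ == "__main__":
    # Made-up counts: 10 genes tested, 4 in the set, 3 selected, 2 overlapping.
    print(enrichment_p_value(10, 4, 3, 2))
```

    A small p-value suggests that the overlap is larger than expected by chance. With many gene sets, the p-values would still need a multiple-testing correction.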

  5. Integrated Genomic and Epigenomic Analysis of Breast Cancer Brain Metastasis

    PubMed Central

    Salhia, Bodour; Kiefer, Jeff; Ross, Julianna T. D.; Metapally, Raghu; Martinez, Rae Anne; Johnson, Kyle N.; DiPerna, Danielle M.; Paquette, Kimberly M.; Jung, Sungwon; Nasser, Sara; Wallstrom, Garrick; Tembe, Waibhav; Baker, Angela; Carpten, John; Resau, Jim; Ryken, Timothy; Sibenaller, Zita; Petricoin, Emanuel F.; Liotta, Lance A.; Ramanathan, Ramesh K.; Berens, Michael E.; Tran, Nhan L.

    2014-01-01

    The brain is a common site of metastatic disease in patients with breast cancer, which has few therapeutic options and dismal outcomes. The purpose of our study was to identify common and rare events that underlie breast cancer brain metastasis. We performed deep genomic profiling, which integrated gene copy number, gene expression and DNA methylation datasets on a collection of breast brain metastases. We identified frequent large chromosomal gains in 1q, 5p, 8q, 11q, and 20q and frequent broad-level deletions involving 8p, 17p, 21p and Xq. Frequently amplified and overexpressed genes included ATAD2, BRAF, DERL1, DNMTRB and NEK2A. The ATM, CRYAB and HSPB2 genes were commonly deleted and underexpressed. Knowledge mining revealed enrichment in cell cycle and G2/M transition pathways, which contained AURKA, AURKB and FOXM1. Using the PAM50 breast cancer intrinsic classifier, Luminal B, Her2+/ER negative, and basal-like tumors were identified as the most commonly represented breast cancer subtypes in our brain metastasis cohort. While overall methylation levels were increased in breast cancer brain metastasis, basal-like brain metastases were associated with significantly lower levels of methylation. Integrating DNA methylation data with gene expression revealed defects in cell migration and adhesion due to hypermethylation and downregulation of PENK, EDN3, and ITGAM. Hypomethylation and upregulation of KRT8 likely affects adhesion and permeability. Genomic and epigenomic profiling of breast brain metastasis has provided insight into the somatic events underlying this disease, which have potential in forming the basis of future therapeutic strategies. PMID:24489661

  6. Application of oocyte cryopreservation technology in TALEN-mediated mouse genome editing.

    PubMed

    Nakagawa, Yoshiko; Sakuma, Tetsushi; Nakagata, Naomi; Yamasaki, Sho; Takeda, Naoki; Ohmuraya, Masaki; Yamamoto, Takashi

    2014-01-01

    Reproductive engineering techniques, such as in vitro fertilization (IVF) and cryopreservation of embryos or spermatozoa, are essential for preservation, reproduction, and transportation of genetically engineered mice. However, it has not yet been elucidated whether these techniques can be applied for the generation of genome-edited mice using engineered nucleases such as transcription activator-like effector nucleases (TALENs). Here, we demonstrate the usefulness of frozen oocytes fertilized in vitro using frozen sperm for TALEN-mediated genome editing in mice. We examined side-by-side comparisons concerning sperm (fresh vs. frozen), fertilization method (mating vs. IVF), and fertilized oocytes (fresh vs. frozen) for the source of oocytes used for TALEN injection; we found that fertilized oocytes created under all tested conditions were applicable for TALEN-mediated mutagenesis. In addition, we investigated whether the ages in weeks of parental female mice can affect the efficiency of gene modification, by comparing 5-week-old and 8-12-week-old mice as the source of oocytes used for TALEN injection. The genome editing efficiency of an endogenous gene was consistently 95-100% when either 5-week-old or 8-12-week-old mice were used with or without freezing the oocytes. Thus, our report describes the availability of freeze-thawed oocytes and oocytes from female mice at various weeks of age for TALEN-mediated genome editing, thus boosting the convenience of such innovative gene targeting strategies. PMID:25077765

  7. An Integrated Genome-Wide Systems Genetics Screen for Breast Cancer Metastasis Susceptibility Genes

    PubMed Central

    Hu, Ying; Shukla, Anjali; Ha, Ngoc-Han; Doran, Anthony; Faraji, Farhoud; Goldberger, Natalie; Lee, Maxwell P.; Keane, Thomas

    2016-01-01

    Metastasis remains the primary cause of patient morbidity and mortality in solid tumors and is due to the action of a large number of tumor-autonomous and non-autonomous factors. Here we report the results of a genome-wide integrated strategy to identify novel metastasis susceptibility candidate genes and molecular pathways in breast cancer metastasis. This analysis implicates a number of transcriptional regulators and suggests cell-mediated immunity is an important determinant. Moreover, the analysis identified novel or FDA-approved drugs as potentially useful for anti-metastatic therapy. Further explorations implementing this strategy may therefore provide a variety of information for clinical applications in the control and treatment of advanced neoplastic disease. PMID:27074153

  8. An Integrated Genome-Wide Systems Genetics Screen for Breast Cancer Metastasis Susceptibility Genes.

    PubMed

    Bai, Ling; Yang, Howard H; Hu, Ying; Shukla, Anjali; Ha, Ngoc-Han; Doran, Anthony; Faraji, Farhoud; Goldberger, Natalie; Lee, Maxwell P; Keane, Thomas; Hunter, Kent W

    2016-04-01

    Metastasis remains the primary cause of patient morbidity and mortality in solid tumors and is due to the action of a large number of tumor-autonomous and non-autonomous factors. Here we report the results of a genome-wide integrated strategy to identify novel metastasis susceptibility candidate genes and molecular pathways in breast cancer metastasis. This analysis implicates a number of transcriptional regulators and suggests cell-mediated immunity is an important determinant. Moreover, the analysis identified novel or FDA-approved drugs as potentially useful for anti-metastatic therapy. Further explorations implementing this strategy may therefore provide a variety of information for clinical applications in the control and treatment of advanced neoplastic disease. PMID:27074153

  9. Integrative bioinformatics for functional genome annotation: trawling for G protein-coupled receptors.

    PubMed

    Flower, Darren R; Attwood, Teresa K

    2004-12-01

    G protein-coupled receptors (GPCRs) are amongst the best studied and most functionally diverse types of cell-surface protein. The importance of GPCRs as mediators of cell function and organismal development underlies their involvement in key physiological roles and their prominence as targets for pharmacological therapeutics. In this review, we highlight the requirement for integrated protocols which underline the different perspectives offered by different sequence analysis methods. BLAST and FastA offer broad brush strokes. Motif-based search methods add the fine detail. Structural modelling offers another perspective which allows us to elucidate the physicochemical properties that underlie ligand binding. Together, these different views provide a more informative and a more detailed picture of GPCR structure and function. Many GPCRs remain orphan receptors with no identified ligand, yet as computer-driven functional genomics starts to elaborate their functions, a new understanding of their roles in cell and developmental biology will follow. PMID:15561589

  10. Molecular Assemblies, Genes and Genomics Integrated Efficiently (MAGGIE)

    SciTech Connect

    Baliga, Nitin S

    2011-05-26

    applied to the manually curated training set. Applying this method to the data representing around a quarter of the fraction space for water soluble proteins in D. vulgaris, we obtained 854 reliable pairwise interactions. Further, we have developed algorithms to analyze and assign significance to protein interaction data from bait pull-down experiments and integrate these data with other systems biology data through associative biclustering in a parallel computing environment. We will 'fill-in' missing information in these interaction data using a 'Transitive Closure' algorithm and subsequently use 'Between Commonality Decomposition' algorithm to discover complexes within these large graphs of protein interactions. To characterize the metabolic activities of proteins and their complexes we are developing algorithms to deconvolute pure mass spectra, estimate chemical formula for m/z values, and fit isotopic fine structure to metabolomics data. We have discovered that in comparison to isotopic pattern fitting methods restricting the chemical formula by these two dimensions actually facilitates unique solutions for chemical formula generators. To understand how microbial functions are regulated we have developed complementary algorithms for reconstructing gene regulatory networks (GRNs). Whereas the network inference algorithms cMonkey and Inferelator developed enable de novo reconstruction of predictive models for GRNs from diverse systems biology data, the RegPrecise and RegPredict framework developed uses evolutionary comparisons of genomes from closely related organisms to reconstruct conserved regulons. We have integrated the two complementary algorithms to rapidly generate comprehensive models for gene regulation of understudied organisms. Our preliminary analyses of these reconstructed GRNs have revealed novel regulatory mechanisms and cis-regulatory motifs, as well as others that are conserved across species.
Finally, we are supporting scientific efforts in ENIGMA
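
    The 'Transitive Closure' step named in this record is not specified further, so the following is only a generic sketch of what a transitive closure over an interaction graph computes. The function name and protein labels are hypothetical, and interactions are assumed to be undirected.

```python
def transitive_closure(pairs):
    """Return every pair of proteins linked by a chain of observed interactions.

    Generic graph closure over undirected edges; not the algorithm used in
    the report, whose details are not given.
    """
    neighbours = {}
    for a, b in pairs:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    closure = set()
    for start in neighbours:
        # Depth-first walk collecting everything reachable from start.
        seen = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        closure.update(frozenset((start, other)) for other in seen if other != start)
    return closure


if __name__ == "__main__":
    observed = [("A", "B"), ("B", "C"), ("D", "E")]  # made-up interactions
    for pair in sorted(tuple(sorted(p)) for p in transitive_closure(observed)):
        print(pair)
```

    Physical protein interactions are not transitive in general, so a plain closure like this one over-predicts; it only shows how unobserved pairs can be proposed from chains of observed ones.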

  11. Mediator infrastructure for information integration and semantic data integration environment for biomedical research.

    PubMed

    Grethe, Jeffrey S; Ross, Edward; Little, David; Sanders, Brian; Gupta, Amarnath; Astakhov, Vadim

    2009-01-01

    This paper presents current progress in the development of a semantic data integration environment which is part of the Biomedical Informatics Research Network (BIRN; http://www.nbirn.net) project. BIRN is sponsored by the National Center for Research Resources (NCRR), a component of the National Institutes of Health (NIH). A goal is the development of a cyberinfrastructure for biomedical research that supports advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet. Each participating institution maintains storage of its experimental or computationally derived data. A mediator-based data integration system performs semantic integration over the databases to enable researchers to perform analyses based on larger and broader datasets than would be available from any single institution's data. This paper describes the recent revision of the system architecture, implementation, and capabilities of the semantically based data integration environment for BIRN. PMID:19623485

  12. Accessing integrated genomic data using GenoBase: A tutorial, Part 1

    SciTech Connect

    Overbeek, R.; Price, M.

    1993-01-01

    GenoBase integrates genomic information from many existing databases, offering convenient access to the curated data. This document is the first part of a two-part tutorial on how to use GenoBase for accessing integrated genomic data.

  13. Divergence of RNA polymerase α subunits in angiosperm plastid genomes is mediated by genomic rearrangement.

    PubMed

    Blazier, J Chris; Ruhlman, Tracey A; Weng, Mao-Lun; Rehman, Sumaiyah K; Sabir, Jamal S M; Jansen, Robert K

    2016-01-01

    Genes for the plastid-encoded RNA polymerase (PEP) persist in the plastid genomes of all photosynthetic angiosperms. However, three unrelated lineages (Annonaceae, Passifloraceae and Geraniaceae) have been identified with unusually divergent open reading frames (ORFs) in the conserved region of rpoA, the gene encoding the PEP α subunit. We used sequence-based approaches to evaluate whether these genes retain function. Both gene sequences and complete plastid genome sequences were assembled and analyzed from each of the three angiosperm families. Multiple lines of evidence indicated that the rpoA sequences are likely functional despite retaining as low as 30% nucleotide sequence identity with rpoA genes from outgroups in the same angiosperm order. The ratio of non-synonymous to synonymous substitutions indicated that these genes are under purifying selection, and bioinformatic prediction of conserved domains indicated that functional domains are preserved. One of the lineages (Pelargonium, Geraniaceae) contains species with multiple rpoA-like ORFs that show evidence of ongoing inter-paralog gene conversion. The plastid genomes containing these divergent rpoA genes have experienced extensive structural rearrangement, including large expansions of the inverted repeat. We propose that illegitimate recombination, not positive selection, has driven the divergence of rpoA. PMID:27087667

  14. Divergence of RNA polymerase α subunits in angiosperm plastid genomes is mediated by genomic rearrangement

    PubMed Central

    Blazier, J. Chris; Ruhlman, Tracey A.; Weng, Mao-Lun; Rehman, Sumaiyah K.; Sabir, Jamal S. M.; Jansen, Robert K.

    2016-01-01

    Genes for the plastid-encoded RNA polymerase (PEP) persist in the plastid genomes of all photosynthetic angiosperms. However, three unrelated lineages (Annonaceae, Passifloraceae and Geraniaceae) have been identified with unusually divergent open reading frames (ORFs) in the conserved region of rpoA, the gene encoding the PEP α subunit. We used sequence-based approaches to evaluate whether these genes retain function. Both gene sequences and complete plastid genome sequences were assembled and analyzed from each of the three angiosperm families. Multiple lines of evidence indicated that the rpoA sequences are likely functional despite retaining as low as 30% nucleotide sequence identity with rpoA genes from outgroups in the same angiosperm order. The ratio of non-synonymous to synonymous substitutions indicated that these genes are under purifying selection, and bioinformatic prediction of conserved domains indicated that functional domains are preserved. One of the lineages (Pelargonium, Geraniaceae) contains species with multiple rpoA-like ORFs that show evidence of ongoing inter-paralog gene conversion. The plastid genomes containing these divergent rpoA genes have experienced extensive structural rearrangement, including large expansions of the inverted repeat. We propose that illegitimate recombination, not positive selection, has driven the divergence of rpoA. PMID:27087667

  15. CRISPR/Cas9-Mediated Genome Editing of Mouse Small Intestinal Organoids.

    PubMed

    Schwank, Gerald; Clevers, Hans

    2016-01-01

    The CRISPR/Cas9 system is an RNA-guided genome-editing tool that has been recently developed based on the bacterial CRISPR-Cas immune defense system. Due to its versatility and simplicity, it rapidly became the method of choice for genome editing in various biological systems, including mammalian cells. Here we describe a protocol for CRISPR/Cas9-mediated genome editing in murine small intestinal organoids, a culture system in which somatic stem cells are maintained by self-renewal, while giving rise to all major cell types of the intestinal epithelium. This protocol allows the study of gene function in intestinal epithelial homeostasis and pathophysiology and can be extended to epithelial organoids derived from other internal mouse and human organs. PMID:27246017

  16. The Integrated Microbial Genomes (IMG) System: An Expanding Comparative Analysis Resource

    SciTech Connect

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Grechkin, Yuri; Ratner, Anna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2009-09-13

    The integrated microbial genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG contains both draft and complete microbial genomes integrated with other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through regular releases. Several companion IMG systems have been set up in order to serve domain specific needs, such as expert review of genome annotations. IMG is available at .

  17. A Phenotype-Driven Dimension Reduction (PhDDR) Approach to Integrated Genomic Association Analyses

    PubMed Central

    Gao, Cuilan; Cheng, Cheng

    2013-01-01

    An immediate challenge in integrated genomic analysis involving several types of genomic factors all measured genome-wide is the ultra-high dimensionality. Screening all possible relationships among the genomic factors is an NP-hard problem; therefore in practice proper dimension reduction is necessary. In this paper we develop the Phenotype-Driven Dimension Reduction (PhDDR) approach to the analysis of gene co-expressions, and discuss its extensions to integration of other genetic factors. This approach is then illustrated by an application to gene co-expression analysis of treatment response of childhood leukemia. PMID:22255909

  18. Integrating microarray gene expression object model and clinical document architecture for cancer genomics research.

    PubMed

    Park, Yu Rang; Lee, Hye Won; Kim, Ju Han

    2005-01-01

    Systematic integration of genomic-scale expression profiles with clinical information may facilitate cancer genomics research. MAGE-OM (Microarray Gene Expression Object Model) defines standard objects for genomic but not for clinical data. HL7 CDA (Clinical Document Architecture) is a document model for clinical information, describing syntax (generic structure) but not semantics. We designed a document template in XML Schema with additional constraints for CDA to define content semantics, enabling data model-level integration of MAGE-OM and CDA for cancer genomics research. PMID:16779360

  19. Integrative Functional Genomics of Hepatitis C Virus Infection Identifies Host Dependencies in Complete Viral Replication Cycle

    PubMed Central

    Li, Qisheng; Zhang, Yong-Yuan; Chiu, Stephan; Hu, Zongyi; Lan, Keng-Hsin; Cha, Helen; Sodroski, Catherine; Zhang, Fang; Hsu, Ching-Sheng; Thomas, Emmanuel; Liang, T. Jake

    2014-01-01

    Recent functional genomics studies including genome-wide small interfering RNA (siRNA) screens demonstrated that hepatitis C virus (HCV) exploits an extensive network of host factors for productive infection and propagation. How these co-opted host functions interact with various steps of HCV replication cycle and exert pro- or antiviral effects on HCV infection remains largely undefined. Here we present an unbiased and systematic strategy to functionally interrogate HCV host dependencies uncovered from our previous infectious HCV (HCVcc) siRNA screen. Applying functional genomics approaches and various in vitro HCV model systems, including HCV pseudoparticles (HCVpp), single-cycle infectious particles (HCVsc), subgenomic replicons, and HCV cell culture systems (HCVcc), we identified and characterized novel host factors or pathways required for each individual step of the HCV replication cycle. Particularly, we uncovered multiple HCV entry factors, including E-cadherin, choline kinase α, NADPH oxidase CYBA, Rho GTPase RAC1 and SMAD family member 6. We also demonstrated that guanine nucleotide binding protein GNB2L1, E2 ubiquitin-conjugating enzyme UBE2J1, and 39 other host factors are required for HCV RNA replication, while the deubiquitinating enzyme USP11 and multiple other cellular genes are specifically involved in HCV IRES-mediated translation. Families of antiviral factors that target HCV replication or translation were also identified. In addition, various virologic assays validated that 66 host factors are involved in HCV assembly or secretion. These genes included insulin-degrading enzyme (IDE), a proviral factor, and N-Myc down regulated Gene 1 (NDRG1), an antiviral factor. Bioinformatics meta-analyses of our results integrated with literature mining of previously published HCV host factors allows the construction of an extensive roadmap of cellular networks and pathways involved in the complete HCV replication cycle. 
This comprehensive study of HCV host

  20. Goldmine integrates information placing genomic ranges into meaningful biological contexts

    PubMed Central

    Bhasin, Jeffrey M.; Ting, Angela H.

    2016-01-01

    Bioinformatic analysis often produces large sets of genomic ranges that can be difficult to interpret in the absence of genomic context. Goldmine annotates genomic ranges from any source with gene model and feature contexts to facilitate global descriptions and candidate loci discovery. We demonstrate the value of genomic context by using Goldmine to elucidate context dynamics in transcription factor binding and to reveal differentially methylated regions (DMRs) with context-specific functional correlations. The open source R package and documentation for Goldmine are available at http://jeffbhasin.github.io/goldmine. PMID:27257071
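
    The idea of placing genomic ranges into gene-model context, as described in the abstract above, can be sketched in a few lines. This is not Goldmine's API (Goldmine is an R package); it is a minimal Python illustration that assumes 1-based inclusive coordinates, a hypothetical gene tuple layout of (name, chrom, start, end, strand), and an arbitrary 1 kb upstream window as the "promoter".

```python
def annotate_ranges(ranges, genes, promoter_bp=1000):
    """Label each (chrom, start, end) range with overlapping gene contexts.

    Assumptions for illustration: coordinates are 1-based and inclusive, and
    the promoter is the promoter_bp bases immediately upstream of the gene.
    """
    results = []
    for chrom, start, end in ranges:
        contexts = []
        for name, g_chrom, g_start, g_end, strand in genes:
            if g_chrom != chrom:
                continue
            # The upstream window depends on the strand of the gene.
            if strand == "+":
                p_start, p_end = g_start - promoter_bp, g_start - 1
            else:
                p_start, p_end = g_end + 1, g_end + promoter_bp
            # Two closed intervals overlap when each starts before the other ends.
            if start <= p_end and end >= p_start:
                contexts.append((name, "promoter"))
            if start <= g_end and end >= g_start:
                contexts.append((name, "gene_body"))
        # Ranges that touch no gene or promoter are called intergenic.
        results.append(((chrom, start, end), contexts or [(None, "intergenic")]))
    return results


# Hypothetical gene model and query ranges.
genes = [("GENE1", "chr1", 5000, 8000, "+")]
ranges = [("chr1", 4500, 4600), ("chr1", 6000, 6100), ("chr1", 20000, 20100)]
annotated = annotate_ranges(ranges, genes)
```

    The nested loop is quadratic and is only meant to show the logic; for large range sets an interval tree or a sorted sweep would be the usual choice.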

  1. Goldmine integrates information placing genomic ranges into meaningful biological contexts.

    PubMed

    Bhasin, Jeffrey M; Ting, Angela H

    2016-07-01

    Bioinformatic analysis often produces large sets of genomic ranges that can be difficult to interpret in the absence of genomic context. Goldmine annotates genomic ranges from any source with gene model and feature contexts to facilitate global descriptions and candidate loci discovery. We demonstrate the value of genomic context by using Goldmine to elucidate context dynamics in transcription factor binding and to reveal differentially methylated regions (DMRs) with context-specific functional correlations. The open source R package and documentation for Goldmine are available at http://jeffbhasin.github.io/goldmine. PMID:27257071

  2. Brassica database (BRAD) version 2.0: integrating and mining Brassicaceae species genomic resources

    PubMed Central

    Wang, Xiaobo; Wu, Jian; Liang, Jianli; Cheng, Feng; Wang, Xiaowu

    2015-01-01

    The Brassica database (BRAD) was built initially to assist users apply Brassica rapa and Arabidopsis thaliana genomic data efficiently to their research. However, many Brassicaceae genomes have been sequenced and released after its construction. These genomes are rich resources for comparative genomics, gene annotation and functional evolutionary studies of Brassica crops. Therefore, we have updated BRAD to version 2.0 (V2.0). In BRAD V2.0, 11 more Brassicaceae genomes have been integrated into the database, namely those of Arabidopsis lyrata, Aethionema arabicum, Brassica oleracea, Brassica napus, Camelina sativa, Capsella rubella, Leavenworthia alabamica, Sisymbrium irio and three extremophiles Schrenkiella parvula, Thellungiella halophila and Thellungiella salsuginea. BRAD V2.0 provides plots of syntenic genomic fragments between pairs of Brassicaceae species, from the level of chromosomes to genomic blocks. The Generic Synteny Browser (GBrowse_syn), a module of the Genome Browser (GBrowse), is used to show syntenic relationships between multiple genomes. Search functions for retrieving syntenic and non-syntenic orthologs, as well as their annotation and sequences are also provided. Furthermore, genome and annotation information have been imported into GBrowse so that all functional elements can be visualized in one frame. We plan to continually update BRAD by integrating more Brassicaceae genomes into the database. Database URL: http://brassicadb.org/brad/ PMID:26589635

  3. Brassica database (BRAD) version 2.0: integrating and mining Brassicaceae species genomic resources.

    PubMed

    Wang, Xiaobo; Wu, Jian; Liang, Jianli; Cheng, Feng; Wang, Xiaowu

    2015-01-01

    The Brassica database (BRAD) was built initially to assist users apply Brassica rapa and Arabidopsis thaliana genomic data efficiently to their research. However, many Brassicaceae genomes have been sequenced and released after its construction. These genomes are rich resources for comparative genomics, gene annotation and functional evolutionary studies of Brassica crops. Therefore, we have updated BRAD to version 2.0 (V2.0). In BRAD V2.0, 11 more Brassicaceae genomes have been integrated into the database, namely those of Arabidopsis lyrata, Aethionema arabicum, Brassica oleracea, Brassica napus, Camelina sativa, Capsella rubella, Leavenworthia alabamica, Sisymbrium irio and three extremophiles Schrenkiella parvula, Thellungiella halophila and Thellungiella salsuginea. BRAD V2.0 provides plots of syntenic genomic fragments between pairs of Brassicaceae species, from the level of chromosomes to genomic blocks. The Generic Synteny Browser (GBrowse_syn), a module of the Genome Browser (GBrowse), is used to show syntenic relationships between multiple genomes. Search functions for retrieving syntenic and non-syntenic orthologs, as well as their annotation and sequences are also provided. Furthermore, genome and annotation information have been imported into GBrowse so that all functional elements can be visualized in one frame. We plan to continually update BRAD by integrating more Brassicaceae genomes into the database. Database URL: http://brassicadb.org/brad/. PMID:26589635

  4. Sinbase: an integrated database to study genomics, genetics and comparative genomics in Sesamum indicum.

    PubMed

    Wang, Linhai; Yu, Jingyin; Li, Donghua; Zhang, Xiurong

    2015-01-01

    Sesame (Sesamum indicum L.) is an ancient and important oilseed crop grown widely in tropical and subtropical areas. It belongs to the gigantic order Lamiales, which includes many well-known or economically important species, such as olive (Olea europaea), leonurus (Leonurus japonicus) and lavender (Lavandula spica), many of which have important pharmacological properties. Despite their importance, genetic and genomic analyses on these species have been insufficient due to a lack of reference genome information. The now available S. indicum genome will provide an unprecedented opportunity for studying both S. indicum genetic traits and comparative genomics. To deliver S. indicum genomic information to the worldwide research community, we designed Sinbase, a web-based database with comprehensive sesame genomic, genetic and comparative genomic information. Sinbase includes sequences of assembled sesame pseudomolecular chromosomes, protein-coding genes (27,148), transposable elements (372,167) and non-coding RNAs (1,748). In particular, Sinbase provides unique and valuable information on colinear regions with various plant genomes, including Arabidopsis thaliana, Glycine max, Vitis vinifera and Solanum lycopersicum. Sinbase also provides a useful search function and data mining tools, including a keyword search and local BLAST service. Sinbase will be updated regularly with new features, improvements to genome annotation and new genomic sequences, and is freely accessible at http://ocri-genomics.org/Sinbase/. PMID:25480115

  5. Characterization of Equine Infectious Anemia Virus Integration in the Horse Genome

    PubMed Central

    Liu, Qiang; Wang, Xue-Feng; Ma, Jian; He, Xi-Jun; Wang, Xiao-Jun; Zhou, Jian-Hua

    2015-01-01

    Human immunodeficiency virus (HIV)-1 has a unique integration profile in the human genome relative to murine and avian retroviruses. Equine infectious anemia virus (EIAV) is another well-studied lentivirus that can also be used as a promising retro-transfection vector, but its integration into its native host has not been characterized. In this study, we mapped 477 integration sites of the EIAV strain EIAVFDDV13 in fetal equine dermal (FED) cells during in vitro infection. Published integration sites of EIAV and HIV-1 in the human genome were also analyzed as references. Our results demonstrated that EIAVFDDV13 tended to integrate into genes and AT-rich regions, and it avoided integrating into transcription start sites (TSS), which is consistent with EIAV and HIV-1 integration in the human genome. Notably, the integration of EIAVFDDV13 favored long interspersed elements (LINEs) and DNA transposons in the horse genome, whereas the integration of HIV-1 favored short interspersed elements (SINEs) in the human genome. The chromosomal environment near LINEs or DNA transposons potentially influences viral transcription and may be related to the unique EIAV latency states in equids. The data on EIAV integration in its natural host will facilitate studies on lentiviral infection and lentivirus-based therapeutic vectors. PMID:26102582

  6. DNA Damage Response and Spindle Assembly Checkpoint Function throughout the Cell Cycle to Ensure Genomic Integrity

    PubMed Central

    Lawrence, Katherine S.; Chau, Thinh; Engebrecht, JoAnne

    2015-01-01

    Errors in replication or segregation lead to DNA damage, mutations, and aneuploidies. Consequently, cells monitor these events and delay progression through the cell cycle so repair precedes division. The DNA damage response (DDR), which monitors DNA integrity, and the spindle assembly checkpoint (SAC), which responds to defects in spindle attachment/tension during metaphase of mitosis and meiosis, are critical for preventing genome instability. Here we show that the DDR and SAC function together throughout the cell cycle to ensure genome integrity in C. elegans germ cells. Metaphase defects result in enrichment of SAC and DDR components to chromatin, and both SAC and DDR are required for metaphase delays. During persistent metaphase arrest following establishment of bi-oriented chromosomes, stability of the metaphase plate is compromised in the absence of DDR kinases ATR or CHK1 or SAC components, MAD1/MAD2, suggesting SAC functions in metaphase beyond its interactions with APC activator CDC20. In response to DNA damage, MAD2 and the histone variant CENPA become enriched at the nuclear periphery in a DDR-dependent manner. Further, depletion of either MAD1 or CENPA results in loss of peripherally associated damaged DNA. In contrast to a SAC-insensitive CDC20 mutant, germ cells deficient for SAC or CENPA cannot efficiently repair DNA damage, suggesting that SAC mediates DNA repair through CENPA interactions with the nuclear periphery. We also show that replication perturbations result in relocalization of MAD1/MAD2 in human cells, suggesting that the role of SAC in DNA repair is conserved. PMID:25898113

  7. Zygote-mediated generation of genome-modified mice using Streptococcus thermophilus 1-derived CRISPR/Cas system.

    PubMed

    Fujii, Wataru; Kakuta, Shigeru; Yoshioka, Shin; Kyuwa, Shigeru; Sugiura, Koji; Naito, Kunihiko

    2016-08-26

    Mammalian zygote-mediated genome-engineering by CRISPR/Cas is currently used for the generation of genome-modified animals. Here we report that a Streptococcus thermophilus-1 derived orthologous CRISPR/Cas system, which recognizes the 5'-NNAGAA sequence as a protospacer adjacent motif (PAM), is useful in mouse zygotes and is applicable for generating knockout mice (87.5%) and targeted knock-in mice (45.5%). The induced mutation could be inherited in the next generation. This novel CRISPR/Cas can expand the feasibility of the zygote-mediated generation of genome-modified animals that require an exact mutation design. PMID:27318086
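
    To make the PAM requirement in the abstract above concrete, the following sketch scans a DNA string for 5'-NNAGAA motifs and reports the bases immediately upstream as a candidate protospacer. It assumes the PAM lies directly 3' of the protospacer, uses a 20-nt protospacer length chosen for illustration, and searches the forward strand only; the function name and example sequence are hypothetical and do not come from the paper.

```python
import re

# Lookahead so that overlapping PAM candidates are all reported.
PAM_PATTERN = re.compile(r"(?=([ACGT]{2}AGAA))")


def find_targets(seq, protospacer_len=20):
    """Return (protospacer_start, protospacer, pam) for each NNAGAA site.

    Only the forward strand is scanned; sites too close to the 5' end to
    leave room for a full-length protospacer are skipped.
    """
    seq = seq.upper()
    hits = []
    for match in PAM_PATTERN.finditer(seq):
        pam_start = match.start()
        if pam_start >= protospacer_len:
            proto_start = pam_start - protospacer_len
            hits.append((proto_start, seq[proto_start:pam_start], match.group(1)))
    return hits


# Hypothetical target: a 20-nt protospacer followed by the PAM TTAGAA.
example = "ACGTACGTACGTACGTACGT" + "TTAGAA" + "CC"
targets = find_targets(example)
```

    A fuller search would also scan the reverse complement and screen candidates for off-target matches.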

  8. Genome-wide signatures of male-mediated migration shaping the Indian gene pool.

    PubMed

    ArunKumar, GaneshPrasad; Tatarinova, Tatiana V; Duty, Jeff; Rollo, Debra; Syama, Adhikarla; Arun, Varatharajan Santhakumari; Kavitha, Valampuri John; Triska, Petr; Greenspan, Bennett; Wells, R Spencer; Pitchappan, Ramasamy

    2015-09-01

    Multiple questions relating to contributions of cultural and demographical factors in the process of human geographical dispersal remain largely unanswered. India, a land of early human settlement and the resulting diversity is a good place to look for some of the answers. In this study, we explored the genetic structure of India using a diverse panel of 78 males genotyped using the GenoChip. Their genome-wide single-nucleotide polymorphism (SNP) diversity was examined in the context of various covariates that influence Indian gene pool. Admixture analysis of genome-wide SNP data showed high proportion of the Southwest Asian component in all of the Indian samples. Hierarchical clustering based on admixture proportions revealed seven distinct clusters correlating to geographical and linguistic affiliations. Convex hull overlay of Y-chromosomal haplogroups on the genome-wide SNP principal component analysis brought out distinct non-overlapping polygons of F*-M89, H*-M69, L1-M27, O2a-M95 and O3a3c1-M117, suggesting a male-mediated migration and expansion of the Indian gene pool. Lack of similar correlation with mitochondrial DNA clades indicated a shared genetic ancestry of females. We suggest that ancient male-mediated migratory events and settlement in various regional niches led to the present day scenario and peopling of India. PMID:25994871

  9. CTCF-Mediated Human 3D Genome Architecture Reveals Chromatin Topology for Transcription.

    PubMed

    Tang, Zhonghui; Luo, Oscar Junhong; Li, Xingwang; Zheng, Meizhen; Zhu, Jacqueline Jufen; Szalaj, Przemyslaw; Trzaskoma, Pawel; Magalska, Adriana; Wlodarczyk, Jakub; Ruszczycki, Blazej; Michalski, Paul; Piecuch, Emaly; Wang, Ping; Wang, Danjuan; Tian, Simon Zhongyuan; Penrad-Mobayed, May; Sachs, Laurent M; Ruan, Xiaoan; Wei, Chia-Lin; Liu, Edison T; Wilczynski, Grzegorz M; Plewczynski, Dariusz; Li, Guoliang; Ruan, Yijun

    2015-12-17

    Spatial genome organization and its effect on transcription remains a fundamental question. We applied an advanced chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) strategy to comprehensively map higher-order chromosome folding and specific chromatin interactions mediated by CCCTC-binding factor (CTCF) and RNA polymerase II (RNAPII) with haplotype specificity and nucleotide resolution in different human cell lineages. We find that CTCF/cohesin-mediated interaction anchors serve as structural foci for spatial organization of constitutive genes concordant with CTCF-motif orientation, whereas RNAPII interacts within these structures by selectively drawing cell-type-specific genes toward CTCF foci for coordinated transcription. Furthermore, we show that haplotype variants and allelic interactions have differential effects on chromosome configuration, influencing gene expression, and may provide mechanistic insights into functions associated with disease susceptibility. 3D genome simulation suggests a model of chromatin folding around chromosomal axes, where CTCF is involved in defining the interface between condensed and open compartments for structural regulation. Our 3D genome strategy thus provides unique insights in the topological mechanism of human variations and diseases. PMID:26686651

  10. A high utility integrated map of the pig genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: The domestic pig is being increasingly exploited as a system for modeling human disease. It also has substantial economic importance for meat-based protein production. Physical clone maps have underpinned large-scale genomic sequencing and enabled focused cloning efforts for many genome...

  11. Integrated and composite genome maps: the bovine example

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Combinations of genome maps representing different types of information are needed to link economically important phenotypic variation with underlying genomic variation in farmed animals. For the cow, data from two linkage populations and three radiation hybrid (RH) panels were combined to construc...

  12. CRISPR-mediated Genome Editing Restores Dystrophin Expression and Function in mdx Mice.

    PubMed

    Xu, Li; Park, Ki Ho; Zhao, Lixia; Xu, Jing; El Refaey, Mona; Gao, Yandi; Zhu, Hua; Ma, Jianjie; Han, Renzhi

    2016-03-01

    Duchenne muscular dystrophy (DMD) is a degenerative muscle disease caused by genetic mutations that lead to the disruption of dystrophin in muscle fibers. There is no curative treatment for this devastating disease. Clustered regularly interspaced short palindromic repeat/Cas9 (CRISPR/Cas9) has emerged as a powerful tool for genetic manipulation and potential therapy. Here we demonstrate that CRISPR-mediated genome editing efficiently excised a 23-kb genomic region on the X-chromosome covering the mutant exon 23 in a mouse model of DMD, and restored dystrophin expression and the dystrophin-glycoprotein complex at the sarcolemma of skeletal muscles in live mdx mice. Electroporation-mediated transfection of the Cas9/gRNA constructs in the skeletal muscles of mdx mice normalized the calcium sparks in response to osmotic shock. Adenovirus-mediated transduction of Cas9/gRNA greatly reduced the Evans blue dye uptake of skeletal muscles at rest and after downhill treadmill running. This study provides proof evidence for permanent gene correction in DMD. PMID:26449883

  13. Comprehensive, Integrative Genomic Analysis of Diffuse Lower-Grade Gliomas

    PubMed Central

    2015-01-01

    BACKGROUND Diffuse low-grade and intermediate-grade gliomas (which together make up the lower-grade gliomas, World Health Organization grades II and III) have highly variable clinical behavior that is not adequately predicted on the basis of histologic class. Some are indolent; others quickly progress to glioblastoma. The uncertainty is compounded by interobserver variability in histologic diagnosis. Mutations in IDH, TP53, and ATRX and codeletion of chromosome arms 1p and 19q (1p/19q codeletion) have been implicated as clinically relevant markers of lower-grade gliomas. METHODS We performed genomewide analyses of 293 lower-grade gliomas from adults, incorporating exome sequence, DNA copy number, DNA methylation, messenger RNA expression, microRNA expression, and targeted protein expression. These data were integrated and tested for correlation with clinical outcomes. RESULTS Unsupervised clustering of mutations and data from RNA, DNA-copy-number, and DNA-methylation platforms uncovered concordant classification of three robust, nonoverlapping, prognostically significant subtypes of lower-grade glioma that were captured more accurately by IDH, 1p/19q, and TP53 status than by histologic class. Patients who had lower-grade gliomas with an IDH mutation and 1p/19q codeletion had the most favorable clinical outcomes. Their gliomas harbored mutations in CIC, FUBP1, NOTCH1, and the TERT promoter. Nearly all lower-grade gliomas with IDH mutations and no 1p/19q codeletion had mutations in TP53 (94%) and ATRX inactivation (86%). The large majority of lower-grade gliomas without an IDH mutation had genomic aberrations and clinical behavior strikingly similar to those found in primary glioblastoma. CONCLUSIONS The integration of genomewide data from multiple platforms delineated three molecular classes of lower-grade gliomas that were more concordant with IDH, 1p/19q, and TP53 status than with histologic class. Lower-grade gliomas with an IDH mutation either had 1p/19q

  14. A second generation integrated map of the rainbow trout (Oncorhynchus mykiss) genome: analysis of synteny with model fish genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In this paper we generated DNA fingerprints and end sequences from bacterial artificial chromosomes (BACs) from two new libraries to improve the first generation integrated physical and genetic map of the rainbow trout (Oncorhynchus mykiss) genome. The current version of the physical map is compose...

  15. Exclusion of small terminase mediated DNA threading models for genome packaging in bacteriophage T4.

    PubMed

    Gao, Song; Zhang, Liang; Rao, Venigalla B

    2016-05-19

    Tailed bacteriophages and herpes viruses use powerful molecular machines to package their genomes. The packaging machine consists of three components: portal, motor (large terminase; TerL) and regulator (small terminase; TerS). Portal, a dodecamer, and motor, a pentamer, form two concentric rings at the special five-fold vertex of the icosahedral capsid. Powered by ATPase, the motor ratchets DNA into the capsid through the portal channel. TerS is essential for packaging, particularly for genome recognition, but its mechanism is unknown and controversial. Structures of gear-shaped TerS rings inspired models that invoke DNA threading through the central channel. Here, we report that mutations of basic residues that line phage T4 TerS (gp16) channel do not disrupt DNA binding. Even deletion of the entire channel helix retained DNA binding and produced progeny phage in vivo. On the other hand, large oligomers of TerS (11-mers/12-mers), but not small oligomers (trimers to hexamers), bind DNA. These results suggest that TerS oligomerization creates a large outer surface, which, but not the interior of the channel, is critical for function, probably to wrap viral genome around the ring during packaging initiation. Hence, models involving TerS-mediated DNA threading may be excluded as an essential mechanism for viral genome packaging. PMID:26984529

  16. Exclusion of small terminase mediated DNA threading models for genome packaging in bacteriophage T4

    PubMed Central

    Gao, Song; Zhang, Liang; Rao, Venigalla B.

    2016-01-01

    Tailed bacteriophages and herpes viruses use powerful molecular machines to package their genomes. The packaging machine consists of three components: portal, motor (large terminase; TerL) and regulator (small terminase; TerS). Portal, a dodecamer, and motor, a pentamer, form two concentric rings at the special five-fold vertex of the icosahedral capsid. Powered by ATPase, the motor ratchets DNA into the capsid through the portal channel. TerS is essential for packaging, particularly for genome recognition, but its mechanism is unknown and controversial. Structures of gear-shaped TerS rings inspired models that invoke DNA threading through the central channel. Here, we report that mutations of basic residues that line phage T4 TerS (gp16) channel do not disrupt DNA binding. Even deletion of the entire channel helix retained DNA binding and produced progeny phage in vivo. On the other hand, large oligomers of TerS (11-mers/12-mers), but not small oligomers (trimers to hexamers), bind DNA. These results suggest that TerS oligomerization creates a large outer surface, which, but not the interior of the channel, is critical for function, probably to wrap viral genome around the ring during packaging initiation. Hence, models involving TerS-mediated DNA threading may be excluded as an essential mechanism for viral genome packaging. PMID:26984529

  17. CRISPR/Cas9-Mediated Genome Editing in Soybean Hairy Roots.

    PubMed

    Cai, Yupeng; Chen, Li; Liu, Xiujie; Sun, Shi; Wu, Cunxiang; Jiang, Bingjun; Han, Tianfu; Hou, Wensheng

    2015-01-01

    As a new technology for gene editing, the CRISPR (clustered regularly interspaced short palindromic repeat)/Cas (CRISPR-associated) system has been rapidly and widely used for genome engineering in various organisms. In the present study, we successfully applied the type II CRISPR/Cas9 system to generate and estimate genome editing in the desired target genes in soybean (Glycine max (L.) Merrill). The single-guide RNA (sgRNA) and Cas9 cassettes were assembled on one vector to improve transformation efficiency, and we designed one sgRNA that targeted a transgene (bar) and six sgRNAs that targeted different sites of two endogenous soybean genes (GmFEI2 and GmSHR). The targeted DNA mutations were detected in soybean hairy roots. The results demonstrated that this customized CRISPR/Cas9 system shared the same efficiency for both endogenous and exogenous genes in soybean hairy roots. We also performed experiments to assess the potential of the CRISPR/Cas9 system to simultaneously edit two endogenous soybean genes using only one customized sgRNA. Overall, generating and detecting the CRISPR/Cas9-mediated genome modifications in target genes of soybean hairy roots could rapidly assess the efficiency of each target locus. The target sites with higher efficiencies can be used for regular soybean transformation. Furthermore, this method provides a powerful tool for root-specific functional genomics studies in soybean. PMID:26284791
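
    The sgRNA design step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the widely used Streptococcus pyogenes Cas9 convention (a 20-nt protospacer immediately followed by a 5'-NGG-3' PAM); the abstract does not state which PAM rule or design tool the authors used. The function name and the toy sequence are hypothetical, and only the forward strand is scanned.

```python
import re

def find_spcas9_targets(seq: str, protospacer_len: int = 20):
    """Return (start, protospacer, pam) for every forward-strand site
    where a protospacer is immediately followed by an NGG PAM.

    Assumption: SpCas9-style NGG PAM; not taken from the abstract.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

# Hypothetical 30-nt sequence, not taken from any soybean gene.
toy = "ATGCATGCATGCATGCATGCAGGTTTTTTT"
for start, protospacer, pam in find_spcas9_targets(toy):
    print(start, protospacer, pam)
```

    A real design workflow would also scan the reverse complement and screen candidates for off-target matches, which this sketch leaves out.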

  18. Genetic and statistical study of HIV integration in the human genome

    NASA Astrophysics Data System (ADS)

    Sequeira, Inês J.; Gonçalves, Juliana; Moreira, Elsa; Mexia, João T.; Rueff, José; Brás, Aldina

    2013-10-01

    Integration of the human immunodeficiency virus (HIV) DNA into the human genome is essential for HIV-induced disease. The human genome is organized into chromosomes, and within these we can define the chromosomal fragile sites. Our aim is to help clarify the integration site preferences of HIV1 and HIV2 in fragile or non-fragile regions. Here we apply statistical techniques, namely non-parametric tests and analysis of variance, to analyze two data sets of HIV1 and HIV2 integrations in the human genome. The results show that integrations occur significantly more often in the non-fragile regions of the human genome and that HIV1 in particular makes the major contribution to this result. This study could have implications for human disease.

  19. MAR-mediated integration of plasmid vectors for in vivo gene transfer and regulation

    PubMed Central

    2013-01-01

    Background: The in vivo transfer of naked plasmid DNA into organs such as muscles is commonly used to assess the expression of prophylactic or therapeutic genes in animal disease models. Results: In this study, we devised vectors allowing a tight regulation of transgene expression in mice from such non-viral vectors using a doxycycline-controlled network of activator and repressor proteins. Using these vectors, we demonstrate a proper physiological response as a consequence of the induced expression of two therapeutically relevant proteins, namely erythropoietin and utrophin. Kinetic studies showed that the induction of transgene expression was only transient, unless epigenetic regulatory elements termed Matrix Attachment Regions, or MAR, were inserted upstream of the regulated promoters. Using episomal plasmid rescue and quantitative PCR assays, we observed that similar amounts of plasmids remained in muscles after electrotransfer with or without MAR elements, but that a significant portion had integrated into the muscle fiber chromosomes. Interestingly, the MAR elements were found to promote plasmid genomic integration but to oppose silencing effects in vivo, thereby mediating long-term expression. Conclusions: This study thus elucidates some of the determinants of transient or sustained expression from the use of non-viral regulated vectors in vivo. PMID:24295286

  20. VESPA: Software to Facilitate Genomic Annotation of Prokaryotic Organisms Through Integration of Proteomic and Transcriptomic Data

    SciTech Connect

    Peterson, Elena S.; McCue, Lee Ann; Rutledge, Alexandra C.; Jensen, Jeffrey L.; Walker, Julia; Kobold, Mark A.; Webb, Samantha R.; Payne, Samuel H.; Ansong, Charles; Adkins, Joshua N.; Cannon, William R.; Webb-Robertson, Bobbie-Jo M.

    2012-04-25

    Visual Exploration and Statistics to Promote Annotation (VESPA) is an interactive visual analysis software tool that facilitates the discovery of structural mis-annotations in prokaryotic genomes. VESPA integrates high-throughput peptide-centric proteomics data and oligo-centric or RNA-Seq transcriptomics data into a genomic context. The data may be interrogated via visual analysis across multiple levels of genomic resolution, linked searches, exports and interaction with BLAST to rapidly identify locations of interest within the genome and evaluate potential mis-annotations.

  1. ssODN-mediated knock-in with CRISPR-Cas for large genomic regions in zygotes

    PubMed Central

    Yoshimi, Kazuto; Kunihiro, Yayoi; Kaneko, Takehito; Nagahora, Hitoshi; Voigt, Birger; Mashimo, Tomoji

    2016-01-01

    The CRISPR-Cas system is a powerful tool for generating genetically modified animals; however, targeted knock-in (KI) via homologous recombination remains difficult in zygotes. Here we show efficient gene KI in rats by combining CRISPR-Cas with single-stranded oligodeoxynucleotides (ssODNs). First, a 1-kb ssODN co-injected with guide RNA (gRNA) and Cas9 messenger RNA produces GFP-KI at the rat Thy1 locus. Then, two gRNAs with two 80-bp ssODNs direct efficient integration of a 5.5-kb CAG-GFP vector into the Rosa26 locus via ssODN-mediated end joining. This protocol also achieves KI of a 200-kb BAC containing the human SIRPA locus, concomitantly knocking out the rat Sirpa gene. Finally, three gRNAs and two ssODNs replace 58 kb of the rat Cyp2d cluster with a 6.2-kb human CYP2D6 gene. These ssODN-mediated KI protocols can be applied to any target site with any donor vector without the need to construct homology arms, thus simplifying genome engineering in living organisms. PMID:26786405

  2. ssODN-mediated knock-in with CRISPR-Cas for large genomic regions in zygotes.

    PubMed

    Yoshimi, Kazuto; Kunihiro, Yayoi; Kaneko, Takehito; Nagahora, Hitoshi; Voigt, Birger; Mashimo, Tomoji

    2016-01-01

    The CRISPR-Cas system is a powerful tool for generating genetically modified animals; however, targeted knock-in (KI) via homologous recombination remains difficult in zygotes. Here we show efficient gene KI in rats by combining CRISPR-Cas with single-stranded oligodeoxynucleotides (ssODNs). First, a 1-kb ssODN co-injected with guide RNA (gRNA) and Cas9 messenger RNA produces GFP-KI at the rat Thy1 locus. Then, two gRNAs with two 80-bp ssODNs direct efficient integration of a 5.5-kb CAG-GFP vector into the Rosa26 locus via ssODN-mediated end joining. This protocol also achieves KI of a 200-kb BAC containing the human SIRPA locus, concomitantly knocking out the rat Sirpa gene. Finally, three gRNAs and two ssODNs replace 58 kb of the rat Cyp2d cluster with a 6.2-kb human CYP2D6 gene. These ssODN-mediated KI protocols can be applied to any target site with any donor vector without the need to construct homology arms, thus simplifying genome engineering in living organisms. PMID:26786405

  3. Human genome-wide expression analysis reorients the study of inflammatory mediators and biomechanics in osteoarthritis.

    PubMed

    Sandy, J D; Chan, D D; Trevino, R L; Wimmer, M A; Plaas, A

    2015-11-01

    A major objective of this article is to examine the research implications of recently available genome-wide expression profiles of cartilage from human osteoarthritis (OA) joints. We propose that, when viewed in the light of extensive earlier work, this novel data provides a unique opportunity to reorient the design of experimental systems toward clinical relevance. Specifically, in the area of cartilage explant biology, this will require a fresh evaluation of existing paradigms, so as to optimize the choices of tissue source, cytokine/growth factor/nutrient addition, and biomechanical environment for discovery. Within this context, we firstly discuss the literature on the nature and role of potential catabolic mediators in OA pathology, including data from human OA cartilage, animal models of OA, and ex vivo studies. Secondly, due to the number and breadth of studies on IL-1β in this area, a major focus of the article is a critical analysis of the design and interpretation of cartilage studies where IL-1β has been used as a model cytokine. Thirdly, the article provides a data-driven perspective (including genome-wide analysis of clinical samples, studies on mutant mice, and clinical trials), which concludes that IL-1β should be replaced by soluble mediators such as IL-17 or TGF-β1, which are much more likely to mimic the disease in OA model systems. We also discuss the evidence that changes in early OA can be attributed to the activity of such soluble mediators, whereas late-stage disease results more from a chronic biomechanical effect on the matrix and cells of the remaining cartilage and on other local mediator-secreting cells. Lastly, an updated protocol for in vitro studies with cartilage explants and chondrocytes (including the use of specific gene expression arrays) is provided to motivate more disease-relevant studies on the interplay of cytokines, growth factors, and biomechanics on cellular behavior. PMID:26521740

  4. INTEGRATED GENOME-BASED STUDIES OF SHEWANELLA ECOPHYSIOLOGY

    SciTech Connect

    NEALSON, KENNETH H.

    2013-10-15

    products of dissimilatory iron reduction. Geochim. Cosmochim. Acta. 74:574-583. 10. Karpinets, T.V., A.Y. Obraztsova, Y. Wang, D.D. Schmoyer, G.H. Kora, B.H. Park, M.H. Serres, M.F. Romine, M.L. Land, T.B. Kothe, J.K. Fredrickson, K.H. Nealson, and E.C. Uberbacher. 2010. Conserved synteny at the protein family level reveals genes underlying Shewanella species' cold tolerance and predicts their novel phenotypes. Funct. Integr. Genomics 10:97-110. (DOI 10.1007/s10143-009-0142-y) 11. Bretschger, O., A.C.M. Cheung, F. Mansfeld, and K.H. Nealson. 2010. Comparative microbial fuel cell evaluations of Shewanella spp. Electroanalysis 22:883-894. 12. McLean, J.S., G. Wanger, Y.A. Gorby, M. Wainstein, J. McQuaid, Shun'ichi Ishii, O. Bretschger, H. Beyanal, K.H. Nealson. 2010. Quantification of electron transfer rates to a solid phase electron acceptor through the stages of biofilm formation from single cells to multicellular communities. Env. Sci. Technol. 44:2721-2717. 13. El-Naggar, M., G. Wanger, K.M. Leung, T.D. Yuzvinsky, G. Southam, J. Yang, W.M. Lau, K.H. Nealson, and Y.A. Gorby. 2010. Electrical Transport Along Bacterial Nanowires from Shewanella oneidensis MR-1. Proc. Nat. Acad. Sci. USA 107:18127-18131. 14. Biffinger, J.C., L.A. Fitzgerald, R. Ray, B.J. Little, S.E. Lizewski, E.R. Petersen, B.R. Ringeisen, W.C. Sanders, P.E. Sheehan, J.J. Pietron, J.W. Baldwin, L.J. Nadeau, G.R. Johnson, M. Ribbens, S.E. Finkel, K.H. Nealson. 2010. The utility of Shewanella japonica for microbial fuel cells. Bioresource Technol. 102:290-297. 15. Rodionov, D., C. Yang, X. Li, I. Rodionova, Y. Wang, A.Y. Obraztsova, O.P. Zagnitko, R. Overbeek, M.F. Romine, S. Reed, J.K. Fredrickson, K.H. Nealson, A.L. Osterman. 2010. Genomic encyclopedia of sugar utilization pathways in the Shewanella genus. BMC Genomics 2010, 11:494. 16. Kan, J., L. Hsu, A.C.M. Cheung, M. Pirbazari, and K.H. Nealson. 2011. Current production by bacterial communities in microbial fuel cells enriched from wastewater sludge

  5. Integrated consensus map of cultivated peanut and wild relatives reveals structures of the A and B genomes of Arachis and divergence of the legume genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complex, tetraploid genome structure of peanut (Arachis hypogaea) has obstructed advances in genetics and genomics in the species. The aim of this study is to understand the genome structure of Arachis by developing a high-density integrated consensus map. Three recombinant inbred line populatio...

  6. The Mediator complex: a central integrator of transcription

    PubMed Central

    Allen, Benjamin L.; Taatjes, Dylan J.

    2016-01-01

    The RNA polymerase II (pol II) enzyme transcribes all protein-coding and most non-coding RNA genes and is globally regulated by Mediator, a large, conformationally flexible protein complex with variable subunit composition (for example, a four-subunit CDK8 module can reversibly associate). These biochemical characteristics are fundamentally important for Mediator's ability to control various processes important for transcription, including organization of chromatin architecture and regulation of pol II pre-initiation, initiation, re-initiation, pausing, and elongation. Although Mediator exists in all eukaryotes, a variety of Mediator functions appear to be specific to metazoans, indicative of more diverse regulatory requirements. PMID:25693131

  7. NDRG1 links p53 with proliferation-mediated centrosome homeostasis and genome stability

    PubMed Central

    Croessmann, Sarah; Wong, Hong Yuen; Zabransky, Daniel J.; Chu, David; Mendonca, Janet; Sharma, Anup; Mohseni, Morassa; Rosen, D. Marc; Scharpf, Robert B.; Cidado, Justin; Cochran, Rory L.; Parsons, Heather A.; Dalton, W. Brian; Erlanger, Bracha; Button, Berry; Cravero, Karen; Kyker-Snowman, Kelly; Beaver, Julia A.; Kachhap, Sushant; Hurley, Paula J.; Lauring, Josh; Park, Ben Ho

    2015-01-01

    The tumor protein 53 (TP53) tumor suppressor gene is the most frequently somatically altered gene in human cancers. Here we show expression of N-Myc down-regulated gene 1 (NDRG1) is induced by p53 during physiologic low proliferative states, and mediates centrosome homeostasis, thus maintaining genome stability. When placed in physiologic low-proliferating conditions, human TP53 null cells fail to increase expression of NDRG1 compared with isogenic wild-type controls and TP53 R248W knockin cells. Overexpression and RNA interference studies demonstrate that NDRG1 regulates centrosome number and amplification. Mechanistically, NDRG1 physically associates with γ-tubulin, a key component of the centrosome, with reduced association in p53 null cells. Strikingly, TP53 homozygous loss was mutually exclusive of NDRG1 overexpression in over 96% of human cancers, supporting the broad applicability of these results. Our study elucidates a mechanism of how TP53 loss leads to abnormal centrosome numbers and genomic instability mediated by NDRG1. PMID:26324937

  8. An integrated encyclopedia of DNA elements in the human genome.

    PubMed

    2012-09-01

    The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research. PMID:22955616

  9. An Integrated Encyclopedia of DNA Elements in the Human Genome

    PubMed Central

    2012-01-01

    Summary: The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure, and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall the project provides new insights into the organization and regulation of our genes and genome, and an expansive resource of functional annotations for biomedical research. PMID:22955616

  10. Genomic and Genetic Analysis of Bordetella Bacteriophages Encoding Reverse Transcriptase-Mediated Tropism-Switching Cassettes

    PubMed Central

    Liu, Minghsun; Gingery, Mari; Doulatov, Sergei R.; Liu, Yichin; Hodes, Asher; Baker, Stephen; Davis, Paul; Simmonds, Mark; Churcher, Carol; Mungall, Karen; Quail, Michael A.; Preston, Andrew; Harvill, Eric T.; Maskell, Duncan J.; Eiserling, Frederick A.; Parkhill, Julian; Miller, Jeff F.

    2004-01-01

    Liu et al. recently described a group of related temperate bacteriophages that infect Bordetella subspecies and undergo a unique template-dependent, reverse transcriptase-mediated tropism switching phenomenon (Liu et al., Science 295: 2091-2094, 2002). Tropism switching results from the introduction of single nucleotide substitutions at defined locations in the VR1 (variable region 1) segment of the mtd (major tropism determinant) gene, which determines specificity for receptors on host bacteria. In this report, we describe the complete nucleotide sequences of the 42.5- to 42.7-kb double-stranded DNA genomes of three related phage isolates and characterize two additional regions of variability. Forty-nine coding sequences were identified. Of these coding sequences, bbp36 contained VR2 (variable region 2), which is highly dynamic and consists of a variable number of identical 19-bp repeats separated by one of three 5-bp spacers, and bpm encodes a DNA adenine methylase with unusual site specificity and a homopolymer tract that functions as a hotspot for frameshift mutations. Morphological and sequence analysis suggests that these Bordetella phage are genetic hybrids of P22 and T7 family genomes, lending further support to the idea that regions encoding protein domains, single genes, or blocks of genes are readily exchanged between bacterial and phage genomes. Bordetella bacteriophages are capable of transducing genetic markers in vitro, and by using animal models, we demonstrated that lysogenic conversion can take place in the mouse respiratory tract during infection. PMID:14973019

  11. Integrating Genomes, Brain and Behavior in the Study of Songbirds

    PubMed Central

    Clayton, David F.; Balakrishnan, Christopher N.; London, Sarah E.

    2010-01-01

    Songbirds share some essential traits but are extraordinarily diverse, allowing comparative analyses aimed at identifying specific genotype–phenotype associations. This diversity encompasses traits like vocal communication and complex social behaviors that are of great interest to humans, but that are not well represented in other accessible research organisms. Many songbirds are readily observable in nature and thus afford unique insight into the links between environment and organism. The distinctive organization of the songbird brain will facilitate analysis of genomic links to brain and behavior. Access to the zebra finch genome sequence will, therefore, prompt new questions and provide the ability to answer those questions. PMID:19788884

  12. [Integration sites and their characteristic analysis of piggyBac transposon in cattle genome].

    PubMed

    Du, Xin-Hua; Gao, Xue; Zhang, Lu-Pei; Gao, Hui-Jiang; Li, Jun-Ya; Xu, Shang-Zhong

    2013-06-01

    As a useful tool for genetic engineering, piggyBac (PB) transposons have been widely used for transgenesis and mutagenesis studies in multiple species. At present, however, few studies have examined PB transposons in cattle. To identify PB transposon integration sites and summarize their characteristics in the bovine genome, the donor plasmid PB[CMV-EGFP] and the helper plasmid pcDNA-PBase were constructed and transferred into bovine fibroblasts using the Amaxa basic nucleofector kit for primary mammalian fibroblasts. Stably transfected cell clones were obtained after selection with G-418. Genomic DNA of the transgenic cells was extracted, and the integration sites of the PB transposon were detected by genome walking. Eight integration sites were obtained in the bovine genome, although only five could be mapped, to chromosomes 1, 2, 11, and X. We found that the PB transposon inserted at "TTAA" sites and integrated into intergenic, non-regulatory regions between two genes. Analysis of the five bases flanking the "TTAA" integration sites showed that the PB 5' end tended to insert into GC-rich regions (62.5%). This study shows that PB-mediated transposition occurs in the cattle genome, and the integration site information obtained will provide a theoretical reference for bovine studies using the PB transposon. PMID:23774022
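
    The site analysis described in this abstract (locating "TTAA" target sites and measuring the GC content of the neighbouring bases) can be sketched as follows. This is only an illustration: the function names and the toy sequence are hypothetical, and reading "the five bases" as the five bases immediately 5' of each TTAA is an assumption, since the abstract does not define the window precisely.

```python
def ttaa_sites(seq: str):
    """Return 0-based start positions of every TTAA tetranucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == "TTAA"]

def gc_fraction(fragment: str) -> float:
    """Fraction of G and C bases in a fragment (0.0 for an empty string)."""
    fragment = fragment.upper()
    if not fragment:
        return 0.0
    return sum(base in "GC" for base in fragment) / len(fragment)

def upstream_gc(seq: str, site: int, flank: int = 5) -> float:
    """GC fraction of the `flank` bases immediately 5' of a TTAA site.

    Assumption: the 5' flank is the window of interest.
    """
    return gc_fraction(seq[max(0, site - flank):site])

# Hypothetical sequence, not from the bovine genome.
toy = "GCGCATTAACCGGATTAATT"
for site in ttaa_sites(toy):
    print(site, upstream_gc(toy, site))
```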

  13. An integrated functional genomics approach identifies the regulatory network directed by brachyury (T) in chordoma.

    PubMed

    Nelson, Andrew C; Pillay, Nischalan; Henderson, Stephen; Presneau, Nadège; Tirabosco, Roberto; Halai, Dina; Berisha, Fitim; Flicek, Paul; Stemple, Derek L; Stern, Claudio D; Wardle, Fiona C; Flanagan, Adrienne M

    2012-11-01

    Chordoma is a rare malignant tumour of bone, the molecular marker of which is the expression of the transcription factor, brachyury. Having recently demonstrated that silencing brachyury induces growth arrest in a chordoma cell line, we now seek to identify its downstream target genes. Here we use an integrated functional genomics approach involving shRNA-mediated brachyury knockdown, gene expression microarray, ChIP-seq experiments, and bioinformatics analysis to achieve this goal. We confirm that the T-box binding motif of human brachyury is identical to that found in mouse, Xenopus, and zebrafish development, and that brachyury acts primarily as an activator of transcription. Using human chordoma samples for validation purposes, we show that brachyury binds 99 direct targets and indirectly influences the expression of 64 other genes, thereby acting as a master regulator of an elaborate oncogenic transcriptional network encompassing diverse signalling pathways including components of the cell cycle, and extracellular matrix components. Given the wide repertoire of its active binding and the relative specific localization of brachyury to the tumour cells, we propose that an RNA interference-based gene therapy approach is a plausible therapeutic avenue worthy of investigation. PMID:22847733

  14. Integrative genomic testing of cancer survival using semiparametric linear transformation models.

    PubMed

    Huang, Yen-Tsung; Cai, Tianxi; Kim, Eunhee

    2016-07-20

    The wide availability of multi-dimensional genomic data has spurred increasing interest in integrating multi-platform genomic data. Integrative analysis of the cancer genome landscape can potentially lead to a deeper understanding of the biological processes of cancer. We integrate epigenetics (DNA methylation and microRNA expression) and gene expression data in the tumor genome to delineate the association between different aspects of the biological processes and brain tumor survival. To model the association, we employ a flexible semiparametric linear transformation model that incorporates both the main effects of these genomic measures and the possible interactions among them. We develop variance component tests to examine different coordinated effects by testing various subsets of model coefficients for the genomic markers. A Monte Carlo perturbation procedure is constructed to approximate the null distribution of the proposed test statistics. We further propose omnibus testing procedures to synthesize information from fitting various parsimonious sub-models to improve power. Simulation results suggest that our proposed testing procedures maintain proper size under the null and outperform standard score tests. We further illustrate the utility of our procedure in two genomic analyses for survival of glioblastoma multiforme patients. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26887583

  15. Site-specific T-DNA integration in Arabidopsis thaliana mediated by the combined action of CRE recombinase and ϕC31 integrase.

    PubMed

    De Paepe, Annelies; De Buck, Sylvie; Nolf, Jonah; Van Lerberge, Els; Depicker, Ann

    2013-07-01

    Random T-DNA integration into the plant host genome can be problematic for a variety of reasons, including potentially variable transgene expression as a result of different integration positions and multiple T-DNA copies, the risk of mutating the host genome and the difficulty of stacking well-defined traits. Therefore, recombination systems have been proposed to integrate the T-DNA at a pre-selected site in the host genome. Here, we demonstrate the capacity of the ϕC31 integrase (INT) for efficient targeted T-DNA integration. Moreover, we show that the iterative site-specific integration system (ISSI), which combines the activities of the CRE recombinase and INT, enables the targeting of genes to a pre-selected site with the concomitant removal of the resident selectable marker. To begin, plants expressing both the CRE and INT recombinase and containing the target attP site were constructed. These plants were supertransformed with a T-DNA vector harboring the loxP site, the attB sites, a selectable marker and an expression cassette encoding a reporter protein. Three out of the 35 transformants obtained (9%) showed transgenerational site-specific integration (SSI) of this T-DNA and removal of the resident selectable marker, as demonstrated by PCR, Southern blot and segregation analysis. In conclusion, our results show the applicability of the ISSI system for precise and targeted Agrobacterium-mediated integration, allowing the serial integration of transgenic DNA sequences in plants. PMID:23574114

  16. A physical map of the highly heterozygous Populus genome: integration with the genome sequence and genetic map

    SciTech Connect

    Kelleher, Colin; CHIU, Dr. R.; Shin, Dr. H.; Krywinski, Martin; Fjell, Chris; Wilkin, Jennifer; Yin, Tongming; Difazio, Stephen P.

    2007-01-01

    As part of a larger project to sequence the Populus genome and generate genomic resources for this emerging model tree, we constructed a physical map of the Populus genome, representing one of the few such maps of an undomesticated, highly heterozygous plant species. The physical map, consisting of 2802 contigs, was constructed from fingerprinted bacterial artificial chromosome (BAC) clones. The map represents approximately 9.4-fold coverage of the Populus genome, which has been estimated from the genome sequence assembly to be 485 ± 10 Mb in size. BAC ends were sequenced to assist long-range assembly of whole-genome shotgun sequence scaffolds and to anchor the physical map to the genome sequence. Simple sequence repeat-based markers were derived from the end sequences and used to initiate integration of the BAC and genetic maps. A total of 2411 physical map contigs, representing 97% of all clones assigned to contigs, were aligned to the sequence assembly (JGI Populus trichocarpa, version 1.0). These alignments represent a total coverage of 384 Mb (79%) of the entire poplar sequence assembly and 295 Mb (96%) of linkage group sequence assemblies. A striking result of the physical map contig alignments to the sequence assembly was the co-localization of multiple contigs across numerous regions of the 19 linkage groups. Targeted sequencing of BAC clones and genetic analysis in a small number of representative regions showed that these co-aligning contigs represent distinct haplotypes in the heterozygous individual sequenced, and revealed the nature of these haplotype sequence differences.
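
    The coverage figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 79% figure uses the estimated genome size (485 Mb) as its denominator, which the abstract implies but does not state outright; the helper name and variable names are illustrative only.

```python
def percent(part_mb: float, whole_mb: float) -> float:
    """Express part_mb as a percentage of whole_mb."""
    return 100.0 * part_mb / whole_mb

genome_mb = 485.0         # estimated genome size from the abstract (485 ± 10 Mb)
aligned_total_mb = 384.0  # contig alignments to the whole sequence assembly
aligned_lg_mb = 295.0     # contig alignments to linkage group assemblies
lg_fraction = 0.96        # stated fraction of linkage group sequence covered

# Share of the assembly covered by aligned contigs (assumed denominator: 485 Mb).
print(round(percent(aligned_total_mb, genome_mb), 1))

# Size of the linkage group assemblies implied by "295 Mb (96%)".
print(round(aligned_lg_mb / lg_fraction, 1))
```

    Under this assumption the first value rounds to the 79% given in the abstract, and the second suggests that roughly 307 Mb of sequence is assigned to linkage groups.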

  17. An Integrated Genetic and Cytogenetic Map of the Cucumber Genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The Cucurbitaceae includes important crops such as cucumber, melon, watermelon, squash, and pumpkin. However, few genetic and genomic resources are available for plant improvement. Some cucurbit species such as cucumber have a narrow genetic base, which impedes construction of saturated molecular li...

  18. Integrating genomics and plant breeding: whither the breeders?

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Plant breeding has been practiced >5,000 years as an art and >100 years as a science. Selection provides the means where populations are improved for product, such as yield or composition, or for crop protection, such as pest and stress resistance. Such activities have not required use of genomic te...

  19. Integrated genomic approaches to enhance genetic resistance in chickens

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The chicken has led the way amongst agricultural animal species in infectious disease control and, in particular, selection for genetic resistance. The generation of the chicken genome sequence and the availability of other empowering tools and resources greatly enhance the ability to select for enh...

  20. Integrating genomics into applied tropical fruit breeding programs

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The plant genetics group at the SHRS is divided into three CRIS projects. All three are in the thematic National Program (NP) 301, Plant Microbial and Insect Genetic Resources, Genomics and Genetic Improvement. A major germplasm/breeding CRIS was established in 1998 for improving and preserving orna...

  1. INTEGRATED GENOME-BASED STUDIES OF SHEWANELLA ECOPHYSIOLOGY

    SciTech Connect

    TIEDJE, JAMES M; KONSTANTINIDIS, KOSTAS; WORDEN, MARK

    2014-01-08

    The aim of the work reported is to study Shewanella population genomics, and to understand the evolution, ecophysiology, and speciation of Shewanella. The tasks supporting this aim are: to study genetic and ecophysiological bases defining the core and diversification of Shewanella species; to determine gene content patterns along redox gradients; and to investigate the evolutionary processes, patterns and mechanisms of Shewanella.

  2. Integrated genome-based studies of Shewanella ecophysiology

    SciTech Connect

    Segre Daniel; Beg Qasim

    2012-02-14

    This project was a component of the Shewanella Federation and, as such, contributed to the overall goal of applying genomic tools to better understand the eco-physiology and speciation of respiratory-versatile members of the Shewanella genus. Our role at Boston University was to perform bioreactor and high-throughput gene expression microarray experiments, and to combine dynamic flux balance modeling with experimentally obtained transcriptional and gene expression datasets from different growth conditions. In the first part of the project, we designed the S. oneidensis microarray probes for Affymetrix Inc. (based in California), then we identified the pathways of carbon utilization in the metal-reducing marine bacterium Shewanella oneidensis MR-1, using our newly designed high-density oligonucleotide Affymetrix microarray on Shewanella cells grown with various carbon sources. Next, using a combination of experimental and computational approaches, we built algorithms and methods to integrate the transcriptional and metabolic regulatory networks of S. oneidensis. Specifically, we combined mRNA microarray and metabolite measurements with statistical inference and dynamic flux balance analysis (dFBA) to study the transcriptional response of S. oneidensis MR-1 as it passes through exponential, stationary, and transition phases. By measuring time-dependent mRNA expression levels during batch growth of S. oneidensis MR-1 under two radically different nutrient compositions (minimal lactate and nutritionally rich LB medium), we obtain detailed snapshots of the regulatory strategies used by this bacterium to cope with gradually changing nutrient availability. In addition to traditional clustering, which provides a first indication of major regulatory trends and transcription factor activities, we developed and implemented a new computational approach for Dynamic Detection of Transcriptional Triggers (D2T2). This new method allows us to infer a putative topology of transcriptional dependencies

  3. CASFISH: CRISPR/Cas9-mediated in situ labeling of genomic loci in fixed cells.

    PubMed

    Deng, Wulan; Shi, Xinghua; Tjian, Robert; Lionnet, Timothée; Singer, Robert H

    2015-09-22

    Direct visualization of genomic loci in the 3D nucleus is important for understanding the spatial organization of the genome and its association with gene expression. Various DNA FISH methods have been developed in the past decades, all involving denaturing dsDNA and hybridizing fluorescent nucleic acid probes. Here we report a novel approach that uses in vitro constituted nuclease-deficient clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) complexes as probes to label sequence-specific genomic loci fluorescently without global DNA denaturation (Cas9-mediated fluorescence in situ hybridization, CASFISH). Using fluorescently labeled nuclease-deficient Cas9 (dCas9) protein assembled with various single-guide RNA (sgRNA), we demonstrated rapid and robust labeling of repetitive DNA elements in pericentromere, centromere, G-rich telomere, and coding gene loci. Assembling dCas9 with an array of sgRNAs tiling arbitrary target loci, we were able to visualize nonrepetitive genomic sequences. The dCas9/sgRNA binary complex is stable and binds its target DNA with high affinity, allowing sequential or simultaneous probing of multiple targets. CASFISH assays using differently colored dCas9/sgRNA complexes allow multicolor labeling of target loci in cells. In addition, the CASFISH assay is remarkably rapid under optimal conditions and is applicable for detection in primary tissue sections. This rapid, robust, less disruptive, and cost-effective technology adds a valuable tool for basic research and genetic diagnosis. PMID:26324940

  4. CASFISH: CRISPR/Cas9-mediated in situ labeling of genomic loci in fixed cells

    PubMed Central

    Deng, Wulan; Shi, Xinghua; Tjian, Robert; Lionnet, Timothée; Singer, Robert H.

    2015-01-01

    Direct visualization of genomic loci in the 3D nucleus is important for understanding the spatial organization of the genome and its association with gene expression. Various DNA FISH methods have been developed in the past decades, all involving denaturing dsDNA and hybridizing fluorescent nucleic acid probes. Here we report a novel approach that uses in vitro constituted nuclease-deficient clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) complexes as probes to label sequence-specific genomic loci fluorescently without global DNA denaturation (Cas9-mediated fluorescence in situ hybridization, CASFISH). Using fluorescently labeled nuclease-deficient Cas9 (dCas9) protein assembled with various single-guide RNA (sgRNA), we demonstrated rapid and robust labeling of repetitive DNA elements in pericentromere, centromere, G-rich telomere, and coding gene loci. Assembling dCas9 with an array of sgRNAs tiling arbitrary target loci, we were able to visualize nonrepetitive genomic sequences. The dCas9/sgRNA binary complex is stable and binds its target DNA with high affinity, allowing sequential or simultaneous probing of multiple targets. CASFISH assays using differently colored dCas9/sgRNA complexes allow multicolor labeling of target loci in cells. In addition, the CASFISH assay is remarkably rapid under optimal conditions and is applicable for detection in primary tissue sections. This rapid, robust, less disruptive, and cost-effective technology adds a valuable tool for basic research and genetic diagnosis. PMID:26324940

  5. Unusual RNA plant virus integration in the soybean genome leads to the production of small RNAs.

    PubMed

    da Fonseca, Guilherme Cordenonsi; de Oliveira, Luiz Felipe Valter; de Morais, Guilherme Loss; Abdelnor, Ricardo Vilela; Nepomuceno, Alexandre Lima; Waterhouse, Peter M; Farinelli, Laurent; Margis, Rogerio

    2016-05-01

    Horizontal gene transfer (HGT) is known to be a major force in genome evolution. The acquisition of genes from viruses by eukaryotic genomes is a well-studied example of HGT, including rare cases of non-retroviral RNA virus integration. The present study describes the integration of cucumber mosaic virus RNA-1 into the soybean genome. After an initial metatranscriptomic analysis of small RNAs derived from soybean, the de novo assembly resulted in a 3029-nt contig homologous to RNA-1. The integration of this sequence in the soybean genome was confirmed by DNA deep sequencing. The locus where the integration occurred harbors the full RNA-1 sequence followed by the partial sequence of an endogenous mRNA and another sequence of RNA-1 as an inverted repeat, allowing the formation of a hairpin structure. This region recombined into a retrotransposon located inside an exon of a soybean gene. The nucleotide similarity of the integrated sequence compared to other cucumber mosaic virus sequences indicates that the integration event occurred recently. We describe a rare event of non-retroviral RNA virus integration in soybean that leads to the production of a double-stranded RNA in a similar fashion to virus resistance RNAi plants. PMID:26993236

  6. HiCPlotter integrates genomic data with interaction matrices.

    PubMed

    Akdemir, Kadir Caner; Chin, Lynda

    2015-01-01

    Metazoan genomic material is folded into stable non-randomly arranged chromosomal structures that are tightly associated with transcriptional regulation and DNA replication. Various factors including regulators of pluripotency, long non-coding RNAs, or the presence of architectural proteins have been implicated in regulation and assembly of the chromatin architecture. Therefore, comprehensive visualization of this multi-faceted structure is important to unravel the connections between nuclear architecture and transcriptional regulation. Here, we present an easy-to-use open-source visualization tool, HiCPlotter, to facilitate juxtaposition of Hi-C matrices with diverse genomic assay outputs, as well as to compare interaction matrices between various conditions. HiCPlotter is available at https://github.com/kcakdemir/HiCPlotter. PMID:26392354

  7. Breaking bad: R-loops and genome integrity.

    PubMed

    Sollier, Julie; Cimprich, Karlene A

    2015-09-01

    R-loops, nucleic acid structures consisting of an RNA-DNA hybrid and displaced single-stranded (ss) DNA, are ubiquitous in organisms from bacteria to mammals. First described in bacteria where they initiate DNA replication, it now appears that R-loops regulate diverse cellular processes such as gene expression, immunoglobulin (Ig) class switching, and DNA repair. Changes in R-loop regulation induce DNA damage and genome instability, and recently it was shown that R-loops are associated with neurodegenerative disorders. We discuss recent developments in the field; in particular, the regulation and effects of R-loops in cells, their effect on genomic and epigenomic stability, and their potential contribution to the origin of diseases including cancer and neurodegenerative disorders. PMID:26045257

  8. Genome maintenance and transcription integrity in aging and disease

    PubMed Central

    Wolters, Stefanie; Schumacher, Björn

    2013-01-01

    DNA damage contributes to cancer development and aging. Congenital syndromes that affect DNA repair processes are characterized by cancer susceptibility, developmental defects, and accelerated aging (Schumacher et al., 2008). DNA damage interferes with DNA metabolism by blocking replication and transcription. DNA polymerase blockage leads to replication arrest and can give rise to genome instability. Transcription, on the other hand, is an essential process for utilizing the information encoded in the genome. DNA damage that interferes with transcription can lead to apoptosis and cellular senescence. Both processes are powerful tumor suppressors (Bartek and Lukas, 2007). Cellular response mechanisms to stalled RNA polymerase II complexes have only recently started to be uncovered. Transcription-coupled DNA damage responses might thus play important roles for the adjustments to DNA damage accumulation in the aging organism (Garinis et al., 2009). Here we review human disorders that are caused by defects in genome stability to explore the role of DNA damage in aging and disease. We discuss how the nucleotide excision repair system functions at the interface of transcription and repair and conclude with concepts of how therapeutic targeting of transcription might be utilized in the treatment of cancer. PMID:23443494

  9. Genome-wide siRNA screen for mediators of NF-κB activation

    PubMed Central

    Gewurz, Benjamin E.; Towfic, Fadi; Mar, Jessica C.; Shinners, Nicholas P.; Takasaki, Kaoru; Zhao, Bo; Cahir-McFarland, Ellen D.; Quackenbush, John; Xavier, Ramnik J.; Kieff, Elliott

    2012-01-01

    Although canonical NFκB is frequently critical for cell proliferation, survival, or differentiation, NFκB hyperactivation can cause malignant, inflammatory, or autoimmune disorders. Despite intensive study, mammalian NFκB pathway loss-of-function RNAi analyses have been limited to specific protein classes. We therefore undertook a human genome-wide siRNA screen for novel NFκB activation pathway components. Using an Epstein-Barr virus latent membrane protein (LMP1) mutant, the transcriptional effects of which are canonical NFκB-dependent, we identified 155 proteins significantly and substantially important for NFκB activation in HEK293 cells. These proteins included many kinases, phosphatases, ubiquitin ligases, and deubiquitinating enzymes not previously known to be important for NFκB activation. Relevance to other canonical NFκB pathways was extended by finding that 118 of the 155 LMP1 NFκB activation pathway components were similarly important for IL-1β–, and 79 for TNFα–mediated NFκB activation in the same cells. MAP3K8, PIM3, and six other enzymes were uniquely relevant to LMP1-mediated NFκB activation. Most novel pathway components functioned upstream of IκB kinase complex (IKK) activation. Robust siRNA knockdown effects were confirmed for all mRNAs or proteins tested. Although multiple ZC3H-family proteins negatively regulate NFκB, ZC3H13 and ZC3H18 were activation pathway components. ZC3H13 was critical for LMP1, TNFα, and IL-1β NFκB-dependent transcription, but not for IKK activation, whereas ZC3H18 was critical for IKK activation. Down-modulators of LMP1-mediated NFκB activation were also identified. These experiments identify multiple targets to inhibit or stimulate LMP1-, IL-1β–, or TNFα–mediated canonical NFκB activation. PMID:22308454

  10. An integrated computational pipeline and database to support whole-genome sequence annotation

    PubMed Central

    Mungall, CJ; Misra, S; Berman, BP; Carlson, J; Frise, E; Harris, N; Marshall, B; Shu, S; Kaminker, JS; Prochnik, SE; Smith, CD; Smith, E; Tupy, JL; Wiel, C; Rubin, GM; Lewis, SE

    2002-01-01

    We describe here our experience in annotating the Drosophila melanogaster genome sequence, in the course of which we developed several new open-source software tools and a database schema to support large-scale genome annotation. We have developed these into an integrated and reusable software system for whole-genome annotation. The key contributions to overall annotation quality are the marshalling of high-quality sequences for alignments and the design of a system with an adaptable and expandable flexible architecture. PMID:12537570

  11. Databases and information integration for the Medicago truncatula genome and transcriptome.

    PubMed

    Cannon, Steven B; Crow, John A; Heuer, Michael L; Wang, Xiaohong; Cannon, Ethalinda K S; Dwan, Christopher; Lamblin, Anne-Francoise; Vasdewani, Jayprakash; Mudge, Joann; Cook, Andrew; Gish, John; Cheung, Foo; Kenton, Steve; Kunau, Timothy M; Brown, Douglas; May, Gregory D; Kim, Dongjin; Cook, Douglas R; Roe, Bruce A; Town, Chris D; Young, Nevin D; Retzel, Ernest F

    2005-05-01

    An international consortium is sequencing the euchromatic genespace of Medicago truncatula. Extensive bioinformatic and database resources support the marker-anchored bacterial artificial chromosome (BAC) sequencing strategy. Existing physical and genetic maps and deep BAC-end sequencing help to guide the sequencing effort, while EST databases provide essential resources for genome annotation as well as transcriptome characterization and microarray design. Finished BAC sequences are joined into overlapping sequence assemblies and undergo an automated annotation process that integrates ab initio predictions with EST, protein, and other recognizable features. Because of the sequencing project's international and collaborative nature, data production, storage, and visualization tools are broadly distributed. This paper describes databases and Web resources for the project, which provide support for physical and genetic maps, genome sequence assembly, gene prediction, and integration of EST data. A central project Web site at medicago.org/genome provides access to genome viewers and other resources project-wide, including an Ensembl implementation at medicago.org, physical map and marker resources at mtgenome.ucdavis.edu, and genome viewers at the University of Oklahoma (www.genome.ou.edu), the Institute for Genomic Research (www.tigr.org), and Munich Information for Protein Sequences Center (mips.gsf.de). PMID:15888676

  12. Multiplex genomic walking: Integration of the wet lab and computer lab into a single prototyping environment

    SciTech Connect

    Gillevet, P.M.

    1993-12-31

    The authors are presently sequencing the entire genome of Mycoplasma capricolum, one of the smallest of free living organisms, by a Multiplex Genomic Walking strategy. This technique involves the repetitive hybridization of sequencing membranes with oligonucleotide probes to acquire sequence data in discrete steps along the genome. The technique allows one to walk a genome in a directed manner, eliminating the problems associated with random shotgun assembly. Furthermore, the repetitive stripping and hybridization process is relatively simple to reproduce and has the potential to be easily automated. The Genetic Data Environment (GDE), an X Windows based Graphic User Interface, has allowed the seamless integration of a core multiple sequence editor with pre-existing external sequence analysis programs and internally developed programs into a single prototypic environment. This system has facilitated linkage of the Harvard Genome Lab's internal database and automated data control systems into one Graphic User Interface which can handle the archiving and analysis of both random fluorescent sequencing data and genomic walking data from the Mycoplasma project. Finally, it has facilitated the integration of the Genomic sequence data into a PROLOG database environment for the comparative analysis of Mycoplasma capricolum and other organisms.

  13. Gene context analysis in the Integrated Microbial Genomes (IMG) data management system

    SciTech Connect

    Mavromatis, Konstantinos; Chu, Ken; Ivanova, Natalia; Hooper, Sean D.; Markowitz, Victor M.; Kyrpides, Nikos C.

    2009-05-01

    Computational methods for determining the function of genes in newly sequenced genomes have been traditionally based on sequence similarity to genes whose function has been identified experimentally. Function prediction methods can be extended using gene context analysis approaches such as examining the conservation of chromosomal gene clusters, gene fusion events and co-occurrence profiles across genomes. Context analysis is based on the observation that functionally related genes often have similar gene context, and it relies on the identification of such events across a statistically significant and phylogenetically diverse collection of genomes. We have used the data management system of the Integrated Microbial Genomes (IMG) as the framework to implement and explore the power of gene context analysis methods because it provides one of the largest available genome integrations. Visualization and search tools to facilitate and explore gene context analysis have been developed and applied across all publicly available archaeal and bacterial genomes in IMG. These computations are now maintained as part of IMG's regular genome content update cycle. IMG is available at: http://img.jgi.doe.gov.
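
    As a rough illustration of the co-occurrence profile idea mentioned in this abstract, the sketch below scores pairs of gene families by how similar their presence patterns are across genomes. It is not IMG's implementation: the Jaccard score, the 0.8 threshold, the function names and the toy family and genome identifiers are all assumptions made for the example.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of genome identifiers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cooccurring_pairs(profiles, threshold=0.8):
    """Return (family1, family2, score) for pairs with similar presence profiles.

    `profiles` maps a gene-family name to the set of genomes containing it.
    The threshold is an arbitrary illustrative cut-off, not a published value.
    """
    pairs = []
    for (f1, g1), (f2, g2) in combinations(sorted(profiles.items()), 2):
        score = jaccard(g1, g2)
        if score >= threshold:
            pairs.append((f1, f2, score))
    return pairs


# Hypothetical toy data: three gene families observed across five genomes.
profiles = {
    "famA": {"g1", "g2", "g3", "g4"},
    "famB": {"g1", "g2", "g3", "g4"},
    "famC": {"g1", "g5"},
}

for fam1, fam2, score in cooccurring_pairs(profiles):
    print(fam1, fam2, round(score, 2))
```

    Families that are always present or absent together score near 1 and become candidates for a functional link; a real analysis would also need to correct for phylogenetic relatedness among the genomes.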

  14. ITEP: An integrated toolkit for exploration of microbial pan-genomes

    PubMed Central

    2014-01-01

    Background: Comparative genomics is a powerful approach for studying variation in physiological traits as well as the evolution and ecology of microorganisms. Recent technological advances have enabled sequencing large numbers of related genomes in a single project, requiring computational tools for their integrated analysis. In particular, accurate annotations and identification of gene presence and absence are critical for understanding and modeling the cellular physiology of newly sequenced genomes. Although many tools are available to compare the gene contents of related genomes, new tools are necessary to enable close examination and curation of protein families from large numbers of closely related organisms, to integrate curation with the analysis of gain and loss, and to generate metabolic networks linking the annotations to observed phenotypes. Results: We have developed ITEP, an Integrated Toolkit for Exploration of microbial Pan-genomes, to curate protein families, compute similarities to externally-defined domains, analyze gene gain and loss, and generate draft metabolic networks from one or more curated reference network reconstructions in groups of related microbial species among which the combination of core and variable genes constitute their "pan-genomes". The ITEP toolkit consists of: (1) a series of modular command-line scripts for identification, comparison, curation, and analysis of protein families and their distribution across many genomes; (2) a set of Python libraries for programmatic access to the same data; and (3) pre-packaged scripts to perform common analysis workflows on a collection of genomes. ITEP’s capabilities include de novo protein family prediction, ortholog detection, analysis of functional domains, identification of core and variable genes and gene regions, sequence alignments and tree generation, annotation curation, and the integration of cross-genome analysis and metabolic networks for study of metabolic network

  15. Genomic characterization of viral integration sites in HPV-related cancers.

    PubMed

    Bodelon, Clara; Untereiner, Michael E; Machiela, Mitchell J; Vinokurova, Svetlana; Wentzensen, Nicolas

    2016-11-01

    Persistent infection with carcinogenic human papillomaviruses (HPV) causes the majority of anogenital cancers and a subset of head and neck cancers. The HPV genome is frequently found integrated into the host genome of invasive cancers. The mechanisms of how it may promote disease progression are not well understood. Thoroughly characterizing integration events can provide insights into HPV carcinogenesis. Individual studies have reported a limited number of integration sites in cell lines and human samples. We performed a systematic review of published integration sites in HPV-related cancers and conducted a pooled analysis to formally test for integration hotspots and genomic features enriched in integration events using data from the Encyclopedia of DNA Elements (ENCODE). Over 1,500 integration sites were reported in the literature, of which 90.8% (N = 1,407) were in human tissues. We found 10 cytobands enriched for integration events, three previously reported ones (3q28, 8q24.21 and 13q22.1) and seven additional ones (2q22.3, 3p14.2, 8q24.22, 14q24.1, 17p11.1, 17q23.1 and 17q23.2). Cervical infections with HPV18 were more likely to have breakpoints in 8q24.21 (p = 7.68 × 10^-4) than those with HPV16. Overall, integration sites were more likely to be in gene regions than expected by chance (p = 6.93 × 10^-9). They were also significantly closer to CpG regions, fragile sites, transcriptionally active regions and enhancers. Few integration events occurred within 50 Kb of known cervical cancer driver genes. This suggests that HPV integrates in accessible regions of the genome, preferentially genes and enhancers, which may affect the expression of target genes. PMID:27343048
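
    The statement that integration sites fall in gene regions more often than expected by chance can be illustrated with a one-sided exact binomial test. This is only a sketch under stated assumptions: the abstract does not say which test the authors used, and the counts and background fraction below are hypothetical numbers, not values from the study.

```python
from math import exp, lgamma, log


def log_binom_pmf(i, n, p):
    """Log of the Binomial(n, p) probability mass at i (requires 0 < p < 1)."""
    return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log(p) + (n - i) * log(1 - p))


def binomial_upper_tail(k, n, p):
    """One-sided p-value P(X >= k) for X ~ Binomial(n, p)."""
    return sum(exp(log_binom_pmf(i, n, p)) for i in range(k, n + 1))


# Hypothetical inputs: 100 integration sites, 60 of them inside gene regions,
# with gene regions assumed to cover 40% of the genome under the null model.
n_sites = 100
n_in_genes = 60
background_fraction = 0.40

p_value = binomial_upper_tail(n_in_genes, n_sites, background_fraction)
print(f"enrichment p-value: {p_value:.3g}")
```

    Working in log space with `lgamma` keeps the calculation stable for the larger site counts a pooled analysis would involve; the same tail probability is also available from standard statistics libraries.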

  16. Site-specific in situ amplification of the integrated polyomavirus genome: a case for a context-specific over-replication model of gene amplification.

    PubMed

    Syu, L J; Fluck, M M

    1997-08-01

    The fate of the genome of the polyoma (Py) tumor virus following integration in the chromosomes of transformed rat FR3T3 cells was re-examined. The viral sequences were integrated at a single transformant-specific chromosomal site in each of 22 transformants tested. In situ amplification of the viral sequences was observed in 24 of 34 transformants analyzed. Large T antigen, the unique viral function involved in initiating DNA replication from the viral origin, was essential for the amplification process. There was an absolute requirement for a reiteration of viral sequences and the extent of the reiteration affected the degree of amplification. The reiteration may be important for homologous recombination-mediated resolution of in situ amplified sequences. Among 11 transformants harboring a 1 to 2 kb repeat, the degree of amplification was transformant-specific and varied over a wide range. At the high end of the spectrum, the genome copy number increased 1300-fold at steady state, while at the low end, amplification was below twofold. Some aspect of the host chromatin at the site of integration that affected viral gene expression also directly or indirectly modulated the amplification. Use of high-resolution electrophoresis for the analysis of the integrated amplified sequences revealed a recurring novel pattern, consisting of a ladder with numerous bands separated by a constant distance approximately the size of the Py genome. We suggest that this pattern was generated by conversion of the amplified viral genomes to head to tail linear arrays with cell to cell variations in the number of genome repeats at single, transformant-specific, chromosomal sites. In light of the known "out of schedule" firing of the Py origin, we propose an "onion skin" structure intermediate and present a homologous recombination model for the conversion from onion skins to linear arrays. The relevance of the in situ amplification of the Py genome to cellular gene amplification is

  17. Figure 5 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Cancer.gov

    Split-Screen View. The split-screen view is useful for exploring relationships of genomic features that are independent of chromosomal location. Color is used here to indicate mate pairs that map to different chromosomes, chromosomes 1 and 6, suggesting a translocation event. Adapted from Figure 8; Thorvaldsdottir H et al. 2012

  18. Figure 2 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Cancer.gov

    Grouping and sorting genomic data in IGV. The IGV user interface displaying 202 glioblastoma samples from TCGA. Samples are grouped by tumor subtype (second annotation column) and data type (first annotation column) and sorted by copy number of the EGFR locus (middle column). Adapted from Figure 1; Robinson et al. 2011

  19. Figure 4 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Cancer.gov

    Gene-list view of genomic data. The gene-list view allows users to compare data across a set of loci. The data in this figure includes copy number, mutation, and clinical data from 202 glioblastoma samples from TCGA. Adapted from Figure 7; Thorvaldsdottir H et al. 2012

  20. Efficient CRISPR/Cas9-Mediated Genome Editing in Mice by Zygote Electroporation of Nuclease.

    PubMed

    Qin, Wenning; Dion, Stephanie L; Kutny, Peter M; Zhang, Yingfan; Cheng, Albert W; Jillette, Nathaniel L; Malhotra, Ankit; Geurts, Aron M; Chen, Yi-Guang; Wang, Haoyi

    2015-06-01

    The clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein (Cas) system is an adaptive immune system in bacteria and archaea that has recently been exploited for genome engineering. Mutant mice can be generated in one step through direct delivery of the CRISPR/Cas9 components into a mouse zygote. Although the technology is robust, delivery remains a bottleneck, as it involves manual injection of the components into the pronuclei or the cytoplasm of mouse zygotes, which is technically demanding and inherently low throughput. To overcome this limitation, we employed electroporation as a means to deliver the CRISPR/Cas9 components, including Cas9 messenger RNA, single-guide RNA, and donor oligonucleotide, into mouse zygotes and recovered live mice with targeted nonhomologous end joining and homology-directed repair mutations with high efficiency. Our results demonstrate that mice carrying CRISPR/Cas9-mediated targeted mutations can be obtained with high efficiency by zygote electroporation. PMID:25819794

  1. Natural bone fragmentation in the blind cave-dwelling fish, Astyanax mexicanus: candidate gene identification through integrative comparative genomics.

    PubMed

    Gross, Joshua B; Stahl, Bethany A; Powers, Amanda K; Carlson, Brian M

    2016-01-01

    Animals that colonize dark and nutrient-poor subterranean environments evolve numerous extreme phenotypes. These include dramatic changes to the craniofacial complex, many of which are under genetic control. These phenotypes can demonstrate asymmetric genetic signals wherein a QTL is detected on one side of the face but not the other. The causative gene(s) underlying QTL are difficult to identify with limited genomic resources. We approached this task by searching for candidate genes mediating fragmentation of the third suborbital bone (SO3) directly inferior to the orbit of the eye. We integrated positional genomic information using emerging Astyanax resources, and linked these intervals to homologous (syntenic) regions of the Danio rerio genome. We identified a discrete, approximately 6 Mb, conserved region wherein the gene causing SO3 fragmentation likely resides. We interrogated this interval for genes demonstrating significant differential expression using mRNA-seq analysis of cave and surface morphs across life history. We then assessed genes with known roles in craniofacial evolution and development based on GO term annotation. Finally, we screened coding sequence alterations in this region, identifying two key genes: transforming growth factor β3 (tgfb3) and bone morphogenetic protein 4 (bmp4). Of these candidates, tgfb3 is most promising as it demonstrates significant differential expression across multiple stages of development, maps close (<1 Mb) to the fragmentation critical locus, and is implicated in a variety of other animal systems (including humans) in non-syndromic clefting and malformations of the cranial sutures. Both abnormalities are analogous to the failure-to-fuse phenotype that we observe in SO3 fragmentation. This integrative approach will enable discovery of the causative genetic lesions leading to complex craniofacial features analogous to human craniofacial disorders. This work underscores the value of cave-dwelling fish as a

  2. VISPA: a computational pipeline for the identification and analysis of genomic vector integration sites.

    PubMed

    Calabria, Andrea; Leo, Simone; Benedicenti, Fabrizio; Cesana, Daniela; Spinozzi, Giulio; Orsini, Massimilano; Merella, Stefania; Stupka, Elia; Zanetti, Gianluigi; Montini, Eugenio

    2014-01-01

    The analysis of the genomic distribution of viral vector genomic integration sites is a key step in hematopoietic stem cell-based gene therapy applications, allowing assessment of both the safety and the efficacy of the treatment and the study of the basic aspects of hematopoiesis and stem cell biology. Identifying vector integration sites requires ad-hoc bioinformatics tools with stringent requirements in terms of computational efficiency, flexibility, and usability. We developed VISPA (Vector Integration Site Parallel Analysis), a pipeline for automated integration site identification and annotation based on a distributed environment with a simple Galaxy web interface. VISPA was successfully used for the bioinformatics analysis of the follow-up of two lentiviral vector-based hematopoietic stem-cell gene therapy clinical trials. Our pipeline provides a reliable and efficient tool to assess the safety and efficacy of integrating vectors in clinical settings. PMID:25342980

  3. The Plant Genome Integrative Explorer Resource: PlantGenIE.org.

    PubMed

    Sundell, David; Mannapperuma, Chanaka; Netotea, Sergiu; Delhomme, Nicolas; Lin, Yao-Cheng; Sjödin, Andreas; Van de Peer, Yves; Jansson, Stefan; Hvidsten, Torgeir R; Street, Nathaniel R

    2015-12-01

    Accessing and exploring large-scale genomics data sets remains a significant challenge to researchers without specialist bioinformatics training. We present the integrated PlantGenIE.org platform for exploration of Populus, conifer and Arabidopsis genomics data, which includes expression networks and associated visualization tools. Standard features of a model organism database are provided, including genome browsers, gene list annotation, Blast homology searches and gene information pages. Community annotation updating is supported via integration of WebApollo. We have produced an RNA-sequencing (RNA-Seq) expression atlas for Populus tremula and have integrated these data within the expression tools. An updated version of the ComPlEx resource for performing comparative plant expression analyses of gene coexpression network conservation between species has also been integrated. The PlantGenIE.org platform provides intuitive access to large-scale and genome-wide genomics data from model forest tree species, facilitating both community contributions to annotation improvement and tools supporting use of the included data resources to inform biological insight. PMID:26192091

  4. A 4103 marker integrated physical and comparative map of the horse genome

    PubMed Central

    Raudsepp, Terje; Gustafson-Seabury, Ashley; Durkin, Keith; Wagner, Michelle L.; Goh, Glenda; Seabury, Christopher M.; Brinkmeyer-Langford, Candice; Lee, Eun-Joon; Agarwala, Richa; Rice, Edward Stallknecht; Schäffer, Alejandro A.; Skow, Loren C.; Tozaki, Teruaki; Yasue, Hiroshi; Penedo, M. Cecilia T.; Lyons, Leslie A.; Khazanehdari, Kamal A.; Binns, Matthew M.; MacLeod, James N.; Distl, Ottmar; Guérin, Gérard; Leeb, Tosso; Mickelson, James R.; Chowdhary, Bhanu P.

    2008-01-01

    A comprehensive second-generation whole genome radiation hybrid (RH II), cytogenetic and comparative map of the horse genome (2n=64) has been developed using the 5000rad horse × hamster radiation hybrid panel and fluorescence in situ hybridization (FISH). The map contains 4,103 markers (3,816 RH, 1,144 FISH) assigned to all 31 pairs of autosomes and the X chromosome. The RH maps of individual chromosomes are anchored and oriented using 857 cytogenetic markers. The overall resolution of the map is one marker per 775 kilobase-pairs (kb), which represents a more than five-fold improvement over the first-generation map. The RH II incorporates 920 markers shared jointly with the two recently reported meiotic maps. Consequently the two maps were aligned with the RH II maps of individual autosomes and the X chromosome. Additionally, a comparative map of the horse genome was generated by connecting 1,904 loci on the horse map with genome sequences available for eight diverse vertebrates to highlight regions of evolutionarily conserved syntenies, linkages and chromosomal breakpoints. The integrated map thus obtained presents the most comprehensive information on the physical and comparative organization of the equine genome and will assist future assemblies of whole genome BAC fingerprint maps and the genome sequence. It will also serve as a tool to identify genes governing health, disease and performance traits in horses and assist us in understanding the evolution of the equine genome in relation to other species. PMID:18931483

  5. Tomato genomic resources database: an integrated repository of useful tomato genomic information for basic and applied research.

    PubMed

    Suresh, B Venkata; Roy, Riti; Sahu, Kamlesh; Misra, Gopal; Chattopadhyay, Debasis

    2014-01-01

    Tomato Genomic Resources Database (TGRD) allows interactive browsing of tomato genes, micro RNAs, simple sequence repeats (SSRs), important quantitative trait loci and Tomato-EXPEN 2000 genetic map altogether or separately along twelve chromosomes of tomato in a single window. The database is created using sequence of the cultivar Heinz 1706. High quality single nucleotide polymorphic (SNP) sites between the genes of Heinz 1706 and the wild tomato S. pimpinellifolium LA1589 are also included. Genes are classified into different families. 5'-upstream sequences (5'-US) of all the genes and their tissue-specific expression profiles are provided. Sequences of the microRNA loci and their putative target genes are catalogued. Genes and 5'-US show presence of SSRs and SNPs. SSRs located in the genomic, genic and 5'-US can be analysed separately for the presence of any particular motif. Primer sequences for all the SSRs and flanking sequences for all the genic SNPs have been provided. TGRD is a user-friendly web-accessible relational database and uses CMAP viewer for graphical scanning of all the features. Integration and graphical presentation of important genomic information will facilitate better and easier use of tomato genome. TGRD can be accessed as an open source repository at http://59.163.192.91/tomato2/. PMID:24466070

  6. Tomato Genomic Resources Database: An Integrated Repository of Useful Tomato Genomic Information for Basic and Applied Research

    PubMed Central

    Suresh, B. Venkata; Roy, Riti; Sahu, Kamlesh; Misra, Gopal; Chattopadhyay, Debasis

    2014-01-01

    Tomato Genomic Resources Database (TGRD) allows interactive browsing of tomato genes, micro RNAs, simple sequence repeats (SSRs), important quantitative trait loci and Tomato-EXPEN 2000 genetic map altogether or separately along twelve chromosomes of tomato in a single window. The database is created using sequence of the cultivar Heinz 1706. High quality single nucleotide polymorphic (SNP) sites between the genes of Heinz 1706 and the wild tomato S. pimpinellifolium LA1589 are also included. Genes are classified into different families. 5′-upstream sequences (5′-US) of all the genes and their tissue-specific expression profiles are provided. Sequences of the microRNA loci and their putative target genes are catalogued. Genes and 5′-US show presence of SSRs and SNPs. SSRs located in the genomic, genic and 5′-US can be analysed separately for the presence of any particular motif. Primer sequences for all the SSRs and flanking sequences for all the genic SNPs have been provided. TGRD is a user-friendly web-accessible relational database and uses CMAP viewer for graphical scanning of all the features. Integration and graphical presentation of important genomic information will facilitate better and easier use of tomato genome. TGRD can be accessed as an open source repository at http://59.163.192.91/tomato2/. PMID:24466070

  7. Integrating genetic and genomic information into effective cancer care in diverse populations

    PubMed Central

    Fashoyin-Aje, L.; Sanghavi, K.; Bjornard, K.; Bodurtha, J.

    2013-01-01

    This paper provides an overview of issues in the integration of genetic (related to hereditary DNA) and genomic (related to genes and their functions) information in cancer care for individuals and families who are part of health care systems worldwide, from low to high resourced. National and regional cancer plans have the potential to integrate genetic and genomic information with a goal of identifying and helping individuals and families with and at risk of cancer. Healthcare professionals and the public have the opportunity to increase their genetic literacy and communication about cancer family history to enhance cancer control, prevention, and tailored therapies. PMID:24001763

  8. Production of α1,3-galactosyltransferase targeted pigs using transcription activator-like effector nuclease-mediated genome editing technology.

    PubMed

    Kang, Jung-Taek; Kwon, Dae-Kee; Park, A-Rum; Lee, Eun-Jin; Yun, Yun-Jin; Ji, Dal-Young; Lee, Kiho; Park, Kwang-Wook

    2016-03-01

    Recent developments in genome editing technology using meganucleases demonstrate an efficient method of producing gene edited pigs. In this study, we examined the effectiveness of the transcription activator-like effector nuclease (TALEN) system in generating specific mutations on the pig genome. Specific TALEN was designed to induce a double-strand break on exon 9 of the porcine α1,3-galactosyltransferase (GGTA1) gene as it is the main cause of hyperacute rejection after xenotransplantation. Human decay-accelerating factor (hDAF) gene, which can produce a complement inhibitor to protect cells from complement attack after xenotransplantation, was also integrated into the genome simultaneously. Plasmids coding for the TALEN pair and hDAF gene were transfected into porcine cells by electroporation to disrupt the porcine GGTA1 gene and express hDAF. The transfected cells were then sorted using a biotin-labeled IB4 lectin attached to magnetic beads to obtain GGTA1 deficient cells. As a result, we established GGTA1 knockout (KO) cell lines with biallelic modification (35.0%) and GGTA1 KO cell lines expressing hDAF (13.0%). When these cells were used for somatic cell nuclear transfer, we successfully obtained live GGTA1 KO pigs expressing hDAF. Our results demonstrate that TALEN-mediated genome editing is efficient and can be successfully used to generate gene edited pigs. PMID:27051344

  9. Production of α1,3-galactosyltransferase targeted pigs using transcription activator-like effector nuclease-mediated genome editing technology

    PubMed Central

    Kang, Jung-Taek; Kwon, Dae-Kee; Park, A-Rum; Lee, Eun-Jin; Yun, Yun-Jin; Ji, Dal-Young; Lee, Kiho

    2016-01-01

    Recent developments in genome editing technology using meganucleases demonstrate an efficient method of producing gene edited pigs. In this study, we examined the effectiveness of the transcription activator-like effector nuclease (TALEN) system in generating specific mutations on the pig genome. Specific TALEN was designed to induce a double-strand break on exon 9 of the porcine α1,3-galactosyltransferase (GGTA1) gene as it is the main cause of hyperacute rejection after xenotransplantation. Human decay-accelerating factor (hDAF) gene, which can produce a complement inhibitor to protect cells from complement attack after xenotransplantation, was also integrated into the genome simultaneously. Plasmids coding for the TALEN pair and hDAF gene were transfected into porcine cells by electroporation to disrupt the porcine GGTA1 gene and express hDAF. The transfected cells were then sorted using a biotin-labeled IB4 lectin attached to magnetic beads to obtain GGTA1 deficient cells. As a result, we established GGTA1 knockout (KO) cell lines with biallelic modification (35.0%) and GGTA1 KO cell lines expressing hDAF (13.0%). When these cells were used for somatic cell nuclear transfer, we successfully obtained live GGTA1 KO pigs expressing hDAF. Our results demonstrate that TALEN-mediated genome editing is efficient and can be successfully used to generate gene edited pigs. PMID:27051344

  10. Tc1-like Transposase Thm3 of Silver Carp (Hypophthalmichthys molitrix) Can Mediate Gene Transposition in the Genome of Blunt Snout Bream (Megalobrama amblycephala)

    PubMed Central

    Guo, Xiu-Ming; Zhang, Qian-Qian; Sun, Yi-Wen; Jiang, Xia-Yun; Zou, Shu-Ming

    2015-01-01

    Tc1-like transposons consist of an inverted repeat sequence flanking a transposase gene that exhibits similarity to the mobile DNA element, Tc1, of the nematode, Caenorhabditis elegans. They are widely distributed within vertebrate genomes including teleost fish; however, few active Tc1-like transposases have been discovered. In this study, 17 Tc1-like transposon sequences were isolated from 10 freshwater fish species belonging to the families Cyprinidae, Adrianichthyidae, Cichlidae, and Salmonidae. We conducted phylogenetic analyses of these sequences using previously isolated Tc1-like transposases and report that 16 of these elements comprise a new subfamily of Tc1-like transposons. In particular, we show that one transposon, Thm3 from silver carp (Hypophthalmichthys molitrix; Cyprinidae), is present in three to five copies in the silver carp genome and can encode a 335-aa transposase with apparently intact domains. We then coinjected donor plasmids harboring 367 bp of the left end and 230 bp of the right end of the nonautonomous silver carp Thm1 cis-element along with capped Thm3 transposase RNA into the embryos of blunt snout bream (Megalobrama amblycephala; one- to two-cell embryos). This experiment revealed that the average integration rate could reach 50.6% in adult fish. Within the blunt snout bream genome, the TA dinucleotide direct repeat, which is the signature of Tc1-like family of transposons, was created adjacent to both ends of Thm1 at the integration sites. Our results indicate that the silver carp Thm3 transposase can mediate gene insertion by transposition within the blunt snout bream genome, and that this occurs with a TA position preference. PMID:26438298

  11. Distinct SUMO Ligases Cooperate with Esc2 and Slx5 to Suppress Duplication-Mediated Genome Rearrangements

    PubMed Central

    Albuquerque, Claudio P.; Wang, Guoliang; Lee, Nancy S.; Kolodner, Richard D.; Putnam, Christopher D.; Zhou, Huilin

    2013-01-01

    Suppression of duplication-mediated gross chromosomal rearrangements (GCRs) is essential to maintain genome integrity in eukaryotes. Here we report that SUMO ligase Mms21 has a strong role in suppressing GCRs in Saccharomyces cerevisiae, while Siz1 and Siz2 have weaker and partially redundant roles. Understanding the functions of these enzymes has been hampered by a paucity of knowledge of their substrate specificity in vivo. Using a new quantitative SUMO-proteomics technology, we found that Siz1 and Siz2 redundantly control the abundances of most sumoylated substrates, while Mms21 more specifically regulates sumoylation of RNA polymerase-I and the SMC-family proteins. Interestingly, Esc2, a SUMO-like domain-containing protein, specifically promotes the accumulation of sumoylated Mms21-specific substrates and functions with Mms21 to suppress GCRs. On the other hand, the Slx5-Slx8 complex, a SUMO-targeted ubiquitin ligase, suppresses the accumulation of sumoylated Mms21-specific substrates. Thus, distinct SUMO ligases work in concert with Esc2 and Slx5-Slx8 to control substrate specificity and sumoylation homeostasis to prevent GCRs. PMID:23935535

  12. Polymerase Θ is a key driver of genome evolution and of CRISPR/Cas9-mediated mutagenesis.

    PubMed

    van Schendel, Robin; Roerink, Sophie F; Portegijs, Vincent; van den Heuvel, Sander; Tijsterman, Marcel

    2015-01-01

    Cells are protected from toxic DNA double-stranded breaks (DSBs) by a number of DNA repair mechanisms, including some that are intrinsically error prone, thus resulting in mutations. To what extent these mechanisms contribute to evolutionary diversification remains unknown. Here, we demonstrate that the A-family polymerase theta (POLQ) is a major driver of inheritable genomic alterations in Caenorhabditis elegans. Unlike somatic cells, which use non-homologous end joining (NHEJ) to repair DNA transposon-induced DSBs, germ cells use polymerase theta-mediated end joining, a conceptually simple repair mechanism requiring only one nucleotide as a template for repair. Also CRISPR/Cas9-induced genomic changes are exclusively generated through polymerase theta-mediated end joining, refuting a previously assumed requirement for NHEJ in their formation. Finally, through whole-genome sequencing of propagated populations, we show that only POLQ-proficient animals accumulate genomic scars that are abundantly present in genomes of wild C. elegans, pointing towards POLQ as a major driver of genome diversification. PMID:26077599

  13. Plant Genome DataBase Japan (PGDBj): A Portal Website for the Integration of Plant Genome-Related Databases

    PubMed Central

    Asamizu, Erika; Ichihara, Hisako; Nakaya, Akihiro; Nakamura, Yasukazu; Hirakawa, Hideki; Ishii, Takahiro; Tamura, Takuro; Fukami-Kobayashi, Kaoru; Nakajima, Yukari; Tabata, Satoshi

    2014-01-01

    The Plant Genome DataBase Japan (PGDBj, http://pgdbj.jp/?ln=en) is a portal website that aims to integrate plant genome-related information from databases (DBs) and the literature. The PGDBj is comprised of three component DBs and a cross-search engine, which provides a seamless search over the contents of the DBs. The three DBs are as follows. (i) The Ortholog DB, providing gene cluster information based on the amino acid sequence similarity. Over 500,000 amino acid sequences of 20 Viridiplantae species were subjected to reciprocal BLAST searches and clustered. Sequences from plant genome DBs (e.g. TAIR10 and RAP-DB) were also included in the cluster with a direct link to the original DB. (ii) The Plant Resource DB, integrating the SABRE DB, which provides cDNA and genome sequence resources accumulated and maintained in the RIKEN BioResource Center and National BioResource Projects. (iii) The DNA Marker DB, providing manually or automatically curated information of DNA markers, quantitative trait loci and related linkage maps, from the literature and external DBs. As the PGDBj targets various plant species, including model plants, algae, and crops important as food, fodder and biofuel, researchers in the field of basic biology as well as a wide range of agronomic fields are encouraged to perform searches using DNA sequences, gene names, traits and phenotypes of interest. The PGDBj will return the search results from the component DBs and various types of linked external DBs. PMID:24363285

  14. Integration of physical and genetic maps in apple confirms whole-genome and segmental duplications in the apple genome

    PubMed Central

    Han, Yuepeng; Zheng, Danman; Vimolmangkang, Sornkanok; Khan, Muhammad A.; Beever, Jonathan E.; Korban, Schuyler S.

    2011-01-01

    A total of 355 simple sequence repeat (SSR) markers were developed, based on expressed sequence tag (EST) and bacterial artificial chromosome (BAC)-end sequence databases, and successfully used to construct an SSR-based genetic linkage map of the apple. The consensus linkage map spanned 1143 cM, with an average density of 2.5 cM per marker. Newly developed SSR markers along with 279 SSR markers previously published by the HiDRAS project were further used to integrate physical and genetic maps of the apple using a PCR-based BAC library screening approach. A total of 470 contigs were unambiguously anchored onto all 17 linkage groups of the apple genome, and 158 contigs contained two or more molecular markers. The genetically mapped contigs spanned ∼421 Mb in cumulative physical length, representing 60.0% of the genome. The sizes of anchored contigs ranged from 97 kb to 4.0 Mb, with an average of 995 kb. The average physical length of anchored contigs on each linkage group was ∼24.8 Mb, ranging from 17.0 Mb to 37.73 Mb. Using BAC DNA as templates, PCR screening of the BAC library amplified fragments of highly homologous sequences from homoeologous chromosomes. Upon integrating physical and genetic maps of the apple, the presence of not only homoeologous chromosome pairs, but also of multiple locus markers mapped to adjacent sites on the same chromosome was detected. These findings demonstrated the presence of both genome-wide and segmental duplications in the apple genome and provided further insights into the complex polyploid ancestral origin of the apple. PMID:21743103
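
    The figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is not part of the record; it only recombines numbers stated above, and the variable names are illustrative. The implied marker count and implied genome size are derived values, not figures reported by the authors.

```python
# Figures quoted in the abstract above (variable names are illustrative).
map_length_cm = 1143.0             # consensus linkage map length
cm_per_marker = 2.5                # average map density
linkage_groups = 17                # apple linkage groups
mean_anchored_mb_per_group = 24.8  # average anchored length per linkage group
anchored_total_mb = 421.0          # cumulative anchored length (approximate)
anchored_fraction = 0.60           # stated share of the genome

# Derived values: these are implied by the figures, not reported in the abstract.
implied_marker_count = map_length_cm / cm_per_marker
anchored_from_groups_mb = linkage_groups * mean_anchored_mb_per_group
implied_genome_size_mb = anchored_total_mb / anchored_fraction

print(f"Implied markers on the map: {implied_marker_count:.0f}")
print(f"Anchored length from per-group mean: {anchored_from_groups_mb:.1f} Mb")
print(f"Implied genome size: {implied_genome_size_mb:.0f} Mb")
```

    The per-group mean multiplied by 17 linkage groups agrees with the stated cumulative length of about 421 Mb, which is a useful sanity check on the summary statistics.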

  15. Cerebral White Matter Integrity Mediates Adult Age Differences in Cognitive Performance

    ERIC Educational Resources Information Center

    Madden, David J.; Spaniol, Julia; Costello, Matthew C.; Bucur, Barbara; White, Leonard E.; Cabeza, Roberto; Davis, Simon W.; Dennis, Nancy A.; Provenzale, James M.; Huettel, Scott A.

    2009-01-01

    Previous research has established that age-related decline occurs in measures of cerebral white matter integrity, but the role of this decline in age-related cognitive changes is not clear. To conclude that white matter integrity has a mediating (causal) contribution, it is necessary to demonstrate that statistical control of the white…

  16. Integration of complete chloroplast genome sequences with small amplicon datasets improves phylogenetic resolution in Acacia.

    PubMed

    Williams, Anna V; Miller, Joseph T; Small, Ian; Nevill, Paul G; Boykin, Laura M

    2016-03-01

    Combining whole genome data with previously obtained amplicon sequences has the potential to increase the resolution of phylogenetic analyses, particularly at low taxonomic levels or where recent divergence, rapid speciation or slow genome evolution has resulted in limited sequence variation. However, the integration of these types of data for large scale phylogenetic studies has rarely been investigated. Here we conduct a phylogenetic analysis of the whole chloroplast genome and two nuclear ribosomal loci for 65 Acacia species from across the most recent Acacia phylogeny. We then combine these data with previously generated amplicon sequences (four chloroplast loci and two nuclear ribosomal loci) for 508 Acacia species. We use several phylogenetic methods, including maximum likelihood bootstrapping (with and without constraint) and ExaBayes, in order to determine the success of combining a dataset of 4000 bp with one of 189,000 bp. The results of our study indicate that the inclusion of whole genome data gave a far better resolved and well supported representation of the phylogenetic relationships within Acacia than using only amplicon sequences, with the greatest support observed when using a whole genome phylogeny as a constraint on the amplicon sequences. Our study therefore provides methods for optimal integration of genomic and amplicon sequences. PMID:26702955

  17. Homologous recombination maintenance of genome integrity during DNA damage tolerance

    PubMed Central

    Prado, Félix

    2014-01-01

    The DNA strand exchange protein Rad51 provides a safe mechanism for the repair of DNA breaks using the information of a homologous DNA template. Homologous recombination (HR) also plays a key role in the response to DNA damage that impairs the advance of the replication forks by providing mechanisms to circumvent the lesion and fill in the tracks of single-stranded DNA that are generated during the process of lesion bypass. These activities postpone repair of the blocking lesion to ensure that DNA replication is completed in a timely manner. Experimental evidence generated over the last few years indicates that HR participates in this DNA damage tolerance response together with additional error-free (template switch) and error-prone (translesion synthesis) mechanisms through intricate connections, which are presented here. The choice between repair and tolerance, and the mechanism of tolerance, is critical to avoid increased mutagenesis and/or genome rearrangements, which are both hallmarks of cancer. PMID:27308329

  18. Predicting Peptide-Mediated Interactions on a Genome-Wide Scale

    PubMed Central

    Chen, T. Scott; Petrey, Donald; Garzon, Jose Ignacio; Honig, Barry

    2015-01-01

    We describe a method to predict protein-protein interactions (PPIs) formed between structured domains and short peptide motifs. We take an integrative approach based on consensus patterns of known motifs in databases, structures of domain-motif complexes from the PDB and various sources of non-structural evidence. We combine this set of clues using a Bayesian classifier that reports the likelihood of an interaction and obtain significantly improved prediction performance when compared to individual sources of evidence and to previously reported algorithms. Our Bayesian approach was integrated into PrePPI, a structure-based PPI prediction method that, so far, has been limited to interactions formed between two structured domains. Around 80,000 new domain-motif mediated interactions were predicted, thus enhancing PrePPI’s coverage of the human protein interactome. PMID:25938916
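
    The abstract does not spell out the form of the Bayesian classifier. One common way to combine several evidence sources is a naive Bayes update, in which each source contributes a likelihood ratio and the sources are treated as conditionally independent. The sketch below illustrates that idea only; the function name, the prior, and the likelihood ratios are hypothetical and are not taken from PrePPI.

```python
from math import prod


def posterior_probability(prior, likelihood_ratios):
    """Naive Bayes combination: posterior odds = prior odds * product of LRs.

    Assumes the evidence sources are conditionally independent, which is a
    simplifying assumption of this sketch, not a claim about the paper.
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical numbers for illustration only.
example_prior = 0.001             # assumed prior probability of an interaction
example_lrs = [20.0, 5.0, 3.0]    # assumed LRs for motif, structure, other evidence
example_posterior = posterior_probability(example_prior, example_lrs)
print(f"Posterior probability of interaction: {example_posterior:.3f}")
```

    Working in odds keeps the update multiplicative, so a likelihood ratio of 1 leaves the prior unchanged and weak evidence sources cannot dominate strong ones.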

  19. Integrative analysis of genome-wide association studies and gene expression analysis identifies pathways associated with rheumatoid arthritis

    PubMed Central

    Li, Jin; Jiang, Yongshuai; Zhang, Ruijie

    2016-01-01

    Rheumatoid arthritis (RA) is a complex, systemic autoimmune disease that is usually influenced by both genetic and environmental factors. Pathway analyses based on a single data type, such as microarray data or SNP data, have successfully revealed some biological pathways associated with RA. However, we found that pathway analysis based on a single data type provides only a limited understanding of the pathogenesis of RA. Gene-disease associations can arise at several levels, such as genotype and gene expression. Therefore, an integrative analysis method that combines multiple levels of evidence can identify pathway associations more precisely and comprehensively. In this study, we performed a pathway analysis by integrating GWAS and gene expression analysis to detect RA-related pathways. The integrative analysis identified 28 pathways associated with RA. Among these pathways, 18 were also found by both GWAS and gene expression analysis, and 7 are novel RA-related pathways, such as the B cell receptor signaling pathway, the Toll-like receptor signaling pathway, and Fc gamma R-mediated phagocytosis. Compared with pathway analyses using only one type of genomic data, we found that integrative analysis can increase the power to identify real associations and provide more stable and accurate results. We believe these results will contribute to future genetic studies of RA pathogenesis and may promote the development of new therapeutic strategies targeting these pathways. PMID:26885899

  20. Integrated polyoma genomes in inducible permissive transformed cells.

    PubMed Central

    Chartrand, P; Gusew-Chartrand, N; Bourgaux, P

    1981-01-01

    Using the approach described by Botchan, Topp, and Sambrook (Cell 9:269-287, 1976), we analyzed the organization of the integrated viral sequences in five clonal isolates from the same permissive, inducible cell line (Cyp line) transformed by the tsP155 mutant of polyoma virus. In all five clones, viral sequences were found that could be assigned to a common integration site, as they were joined to the cellular DNA in the same fashion in every instance. However, the sequences comprised between these points differed markedly from clone to clone, as if cell propagation had been accompanied by amplification or recombination or both within the viral insertion. When the clones were compared, no correlation could be found between the abundance, or the organization, of the integrated viral sequences and the amount, or the nature, of the free viral DNA molecules produced during induction. Altogether, our findings suggest that specific events, occurring during either the excision or the subsequent replication of the integrated viral sequences, are responsible for the predominant production of nondefective viral DNA molecules by permissive transformed cells, such as Cyp cells. PMID:6268808

  1. Integrative genomics--a basic and essential tool for the development of molecular medicine.

    PubMed

    Ostrowski, Jerzy

    2008-01-01

    Understanding the molecular mechanisms of disease requires the introduction of molecular diagnostics into medical practice. Current medicine employs only elements of molecular diagnostics, and usually on the scale of single genes. Medicine in the post-genomic era will utilize thousands of molecular markers associated with disease that are provided by high-throughput sequencing and functional genomic, proteomic and metabolomic studies. Such a spectrum of techniques will link clinical medicine based on molecularly oriented diagnostics with the prediction and prevention of disease. To achieve this task, large-scale and genome-wide biological and medical data must be combined with biostatistical analyses and bioinformatic modeling of biological systems. The collecting, cataloging and comparison of data from molecular studies and the subsequent development of conclusions create the fundamentals of systems biology. This highly complex analytical process reflects a new scientific paradigm called integrative genomics. PMID:19172842

  2. A multiscale statistical mechanical framework integrates biophysical and genomic data to assemble cancer networks

    PubMed Central

    Jenney, Anne; MacBeath, Gavin; Sorger, Peter K.

    2014-01-01

    Functional interpretation of genomic variation is critical to understanding human disease but it remains difficult to predict the effects of specific mutations on protein interaction networks and the phenotypes they regulate. We describe an analytical framework based on multiscale statistical mechanics that integrates genomic and biophysical data to model the human SH2-phosphoprotein network in normal and cancer cells. We apply our approach to data in The Cancer Genome Atlas (TCGA) and test model predictions experimentally. We find that mutations in phosphoproteins often create new interactions but that mutations in SH2 domains result almost exclusively in loss of interactions. Some of these mutations eliminate all interactions but many cause more selective loss, thereby rewiring specific edges in highly connected subnetworks. Moreover, idiosyncratic mutations appear to be as functionally consequential as recurrent mutations. By synthesizing genomic, structural, and biochemical data our framework represents a new approach to the interpretation of genetic variation. PMID:25362484

  3. Integrating genomic and clinical medicine: Searching for susceptibility genes in complex lung diseases

    PubMed Central

    Desai, Ankit A.; Hysi, Pirro; Garcia, Joe G. N.

    2011-01-01

    The integration of molecular, genomic, and clinical medicine in the post-genome era provides the promise of novel information on genetic variation and pathophysiologic cascades. The current challenge is to translate these discoveries rapidly into viable biomarkers that identify susceptible populations and into the development of precisely targeted therapies. In this article, we describe the application of comparative genomics, microarray platforms, genetic epidemiology, statistical genetics, and bioinformatic approaches within examples of complex pulmonary pathobiology. Our search for candidate genes, which are gene variations that drive susceptibility to and severity of enigmatic acute and chronic lung disorders, provides a logical framework to understand better the evolution of genomic medicine. The dissection of the genetic basis of complex diseases and the development of highly individualized therapies remain lofty but achievable goals. PMID:18355765

  4. Integrating Genomic Data Sets for Knowledge Discovery: An Informed Approach to Management of Captive Endangered Species.

    PubMed

    Irizarry, Kristopher J L; Bryant, Doug; Kalish, Jordan; Eng, Curtis; Schmidt, Peggy L; Barrett, Gini; Barr, Margaret C

    2016-01-01

    Many endangered captive populations exhibit reduced genetic diversity resulting in health issues that impact reproductive fitness and quality of life. Numerous cost effective genomic sequencing and genotyping technologies provide unparalleled opportunity for incorporating genomics knowledge in management of endangered species. Genomic data, such as sequence data, transcriptome data, and genotyping data, provide critical information about a captive population that, when leveraged correctly, can be utilized to maximize population genetic variation while simultaneously reducing unintended introduction or propagation of undesirable phenotypes. Current approaches aimed at managing endangered captive populations utilize species survival plans (SSPs) that rely upon mean kinship estimates to maximize genetic diversity while simultaneously avoiding artificial selection in the breeding program. However, as genomic resources increase for each endangered species, the potential knowledge available for management also increases. Unlike model organisms in which considerable scientific resources are used to experimentally validate genotype-phenotype relationships, endangered species typically lack the necessary sample sizes and economic resources required for such studies. Even so, in the absence of experimentally verified genetic discoveries, genomics data still provides value. In fact, bioinformatics and comparative genomics approaches offer mechanisms for translating these raw genomics data sets into integrated knowledge that enable an informed approach to endangered species management. PMID:27376076

  5. Integrating Genomic Data Sets for Knowledge Discovery: An Informed Approach to Management of Captive Endangered Species

    PubMed Central

    Irizarry, Kristopher J. L.; Bryant, Doug; Kalish, Jordan; Eng, Curtis; Schmidt, Peggy L.; Barrett, Gini; Barr, Margaret C.

    2016-01-01

    Many endangered captive populations exhibit reduced genetic diversity resulting in health issues that impact reproductive fitness and quality of life. Numerous cost effective genomic sequencing and genotyping technologies provide unparalleled opportunity for incorporating genomics knowledge in management of endangered species. Genomic data, such as sequence data, transcriptome data, and genotyping data, provide critical information about a captive population that, when leveraged correctly, can be utilized to maximize population genetic variation while simultaneously reducing unintended introduction or propagation of undesirable phenotypes. Current approaches aimed at managing endangered captive populations utilize species survival plans (SSPs) that rely upon mean kinship estimates to maximize genetic diversity while simultaneously avoiding artificial selection in the breeding program. However, as genomic resources increase for each endangered species, the potential knowledge available for management also increases. Unlike model organisms in which considerable scientific resources are used to experimentally validate genotype-phenotype relationships, endangered species typically lack the necessary sample sizes and economic resources required for such studies. Even so, in the absence of experimentally verified genetic discoveries, genomics data still provides value. In fact, bioinformatics and comparative genomics approaches offer mechanisms for translating these raw genomics data sets into integrated knowledge that enable an informed approach to endangered species management. PMID:27376076

  6. Integrative pathway analysis of genome-wide association studies and gene expression data in prostate cancer

    PubMed Central

    2012-01-01

    Background: Pathway analysis of large-scale omics data assists us with the examination of the cumulative effects of multiple functionally related genes, which are difficult to detect using the traditional single gene/marker analysis. So far, most of the genomic studies have been conducted in a single domain, e.g., by genome-wide association studies (GWAS) or microarray gene expression investigation. A combined analysis of disease susceptibility genes across multiple platforms at the pathway level is an urgent need because it can reveal more reliable and more biologically important information. Results: We performed an integrative pathway analysis of a GWAS dataset and a microarray gene expression dataset in prostate cancer. We obtained a comprehensive pathway annotation set from knowledge-based public resources, including KEGG pathways and the prostate cancer candidate gene set, and gene sets specifically defined based on cross-platform information. By leveraging this pathway collection, we first searched for significant pathways in the GWAS dataset using four methods, which represent two broad groups of pathway analysis approaches. The significant pathways identified by each method varied greatly, but the results were more consistent within each method group than between groups. Next, we conducted a gene set enrichment analysis of the microarray gene expression data and found 13 pathways with cross-platform evidence, including "Fc gamma R-mediated phagocytosis" (P_GWAS = 0.003, P_expr < 0.001, and P_combined = 6.18 × 10⁻⁸), "regulation of actin cytoskeleton" (P_GWAS = 0.003, P_expr = 0.009, and P_combined = 3.34 × 10⁻⁴), and "Jak-STAT signaling pathway" (P_GWAS = 0.001, P_expr = 0.084, and P_combined = 8.79 × 10⁻⁴). Conclusions: Our results provide evidence at both the genetic variation and expression levels that several key pathways might have been involved in the pathological development of prostate cancer. Our framework that employs gene expression data to facilitate
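
    The abstract reports combined P-values for each pathway but, as excerpted here, does not say how the GWAS and expression P-values were combined. Fisher's method is one standard way to combine independent P-values, and the sketch below applies it to the two pathways for which both input values are given exactly. Treating Fisher's method as the combination rule is an assumption of this sketch, and the function name is illustrative.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent P-values with Fisher's method (standard library only)."""
    k = len(p_values)
    # Fisher's statistic X = -2 * sum(ln p_i) is chi-square with 2k degrees
    # of freedom under the null; half_x holds X / 2.
    half_x = -sum(math.log(p) for p in p_values)
    # For even degrees of freedom 2k the chi-square survival function has the
    # closed form exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    return math.exp(-half_x) * sum(
        half_x ** i / math.factorial(i) for i in range(k)
    )


# (GWAS P-value, expression P-value) as quoted in the abstract above.
quoted_inputs = {
    "regulation of actin cytoskeleton": (0.003, 0.009),
    "Jak-STAT signaling pathway": (0.001, 0.084),
}

for pathway, pair in quoted_inputs.items():
    print(f"{pathway}: combined P = {fisher_combined_p(pair):.3g}")
```

    With these rounded inputs the sketch gives roughly 3.1 × 10⁻⁴ and 8.7 × 10⁻⁴, close to but not identical with the reported 3.34 × 10⁻⁴ and 8.79 × 10⁻⁴. The agreement is suggestive only, since the published inputs are rounded and the method is not confirmed by the text shown here.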

  7. Comparative Genomics Reveal Extensive Transposon-Mediated Genomic Plasticity and Diversity among Potential Effector Proteins within the Genus Coxiella

    PubMed Central

    Beare, Paul A.; Unsworth, Nathan; Andoh, Masako; Voth, Daniel E.; Omsland, Anders; Gilk, Stacey D.; Williams, Kelly P.; Sobral, Bruno W.; Kupko, John J.; Porcella, Stephen F.; Samuel, James E.; Heinzen, Robert A.

    2009-01-01

    Genetically distinct isolates of Coxiella burnetii, the cause of human Q fever, display different phenotypes with respect to in vitro infectivity/cytopathology and pathogenicity for laboratory animals. Moreover, correlations between C. burnetii genomic groups and human disease presentation (acute versus chronic) have been described, suggesting that isolates have distinct virulence characteristics. To provide a more-complete understanding of C. burnetii's genetic diversity, evolution, and pathogenic potential, we deciphered the whole-genome sequences of the K (Q154) and G (Q212) human chronic endocarditis isolates and the naturally attenuated Dugway (5J108-111) rodent isolate. Cross-genome comparisons that included the previously sequenced Nine Mile (NM) reference isolate (RSA493) revealed both novel gene content and disparate collections of pseudogenes that may contribute to isolate virulence and other phenotypes. While C. burnetii genomes are highly syntenous, recombination between abundant insertion sequence (IS) elements has resulted in genome plasticity manifested as chromosomal rearrangement of syntenic blocks and DNA insertions/deletions. The numerous IS elements, genomic rearrangements, and pseudogenes of C. burnetii isolates are consistent with genome structures of other bacterial pathogens that have recently emerged from nonpathogens with expanded niches. The observation that the attenuated Dugway isolate has the largest genome with the fewest pseudogenes and IS elements suggests that this isolate's lineage is at an earlier stage of pathoadaptation than the NM, K, and G lineages. PMID:19047403

  8. Barriers and potential solutions for Critical Zone data integration between environmental genomics and the geosciences

    NASA Astrophysics Data System (ADS)

    Aronson, E. L.; Meyer, F.; Packman, A. I.; Mayorga, E.

    2015-12-01

    The Earth's permeable near-surface layer from bedrock to canopy is referred to as the Critical Zone (CZ). Integration of bio- and geoscience data is critical for understanding physical, biological and chemical interactions in the CZ. Genomic and meta-genomic scientists study organisms both in laboratory settings and in the environment, in order to understand the interactions of organisms with the environment. Geoscientists are using environmental data to describe and model dynamics of physical and chemical properties. Yet, there is no agreed-upon method for integrating genomic and environmental data to address interactions of living and non-living components of the CZ. There are standards for data interchange being developed in the geosciences and genomics sciences, via standards organizations such as the Open Geospatial Consortium (OGC), as well as by research communities in biogeochemistry, hydrology, climatology, and other fields. These are in parallel to, but typically not in coordination with, the standards the Genomics Standards Consortium (GSC) is developing for genomics. In addition, efforts are being made to allow for intercompatibility of these CZ data with data generated by NEON, Inc. The interoperability of these types of data is limited with current software and cyberinfrastructure. A group of CZ geoscientists, environmental genomic scientists and cyberinfrastructure scientists are coming together to develop a set of common data collection and integration methods and sets of common standards. The data generated by this effort across multiple CZ sites (including the US CZ Observatories, or CZOs) around the world, along with NEON facility data, will be used to test EarthCube (an NSF initiative to develop cyberinfrastructure for the geosciences) cyberinfrastructure, with the goal of bridging this gap in standards and interoperability. Potential solutions to these issues of interoperability will be presented, and a way forward will be described.

  9. High-dimensional genomic data bias correction and data integration using MANCIE

    PubMed Central

    Zang, Chongzhi; Wang, Tao; Deng, Ke; Li, Bo; Hu, Sheng'en; Qin, Qian; Xiao, Tengfei; Zhang, Shihua; Meyer, Clifford A.; He, Housheng Hansen; Brown, Myles; Liu, Jun S.; Xie, Yang; Liu, X. Shirley

    2016-01-01

    High-dimensional genomic data analysis is challenging due to noise and biases in high-throughput experiments. We present a computational method, matrix analysis and normalization by concordant information enhancement (MANCIE), for bias correction and data integration of distinct genomic profiles on the same samples. MANCIE uses a Bayesian-supported principal component analysis-based approach to adjust the data so as to achieve better consistency between sample-wise distances in the different profiles. MANCIE can improve tissue-specific clustering in ENCODE data, prognostic prediction in Molecular Taxonomy of Breast Cancer International Consortium and The Cancer Genome Atlas data, copy number and expression agreement in Cancer Cell Line Encyclopedia data, and has broad applications in cross-platform, high-dimensional data integration. PMID:27072482
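
    The goal stated above, consistency between sample-wise distances in two profiles measured on the same samples, can be made concrete with a small measurement sketch. This is not the MANCIE algorithm itself (the abstract gives no implementation detail). It only shows one assumed way to quantify how well two profiles agree on which samples are close to each other, and all function names are hypothetical.

```python
import math
from itertools import combinations


def pairwise_distances(profile):
    """Euclidean distance for every pair of samples (rows) in a profile."""
    return [math.dist(a, b) for a, b in combinations(profile, 2)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def distance_consistency(profile_a, profile_b):
    """Correlation of sample-wise distances between two profiles.

    Both profiles must list the same samples in the same row order.
    """
    return pearson(pairwise_distances(profile_a), pairwise_distances(profile_b))


# Toy data: three samples, two features per profile (made-up values).
expression = [[1.0, 2.0], [2.0, 4.0], [10.0, 1.0]]
copy_number = [[2.0 * v for v in row] for row in expression]
print(distance_consistency(expression, copy_number))
```

    A value near 1 means the two profiles rank sample pairs by distance in nearly the same way; a correction method of the kind described would aim to raise this agreement.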

  10. High-dimensional genomic data bias correction and data integration using MANCIE.

    PubMed

    Zang, Chongzhi; Wang, Tao; Deng, Ke; Li, Bo; Hu, Sheng'en; Qin, Qian; Xiao, Tengfei; Zhang, Shihua; Meyer, Clifford A; He, Housheng Hansen; Brown, Myles; Liu, Jun S; Xie, Yang; Liu, X Shirley

    2016-01-01

    High-dimensional genomic data analysis is challenging due to noise and biases in high-throughput experiments. We present a computational method, matrix analysis and normalization by concordant information enhancement (MANCIE), for bias correction and data integration of distinct genomic profiles on the same samples. MANCIE uses a Bayesian-supported principal component analysis-based approach to adjust the data so as to achieve better consistency between sample-wise distances in the different profiles. MANCIE can improve tissue-specific clustering in ENCODE data, prognostic prediction in Molecular Taxonomy of Breast Cancer International Consortium and The Cancer Genome Atlas data, copy number and expression agreement in Cancer Cell Line Encyclopedia data, and has broad applications in cross-platform, high-dimensional data integration. PMID:27072482

  11. Ori-Finder 2, an integrated tool to predict replication origins in the archaeal genomes

    PubMed Central

    Luo, Hao; Zhang, Chun-Ting; Gao, Feng

    2014-01-01

    DNA replication is one of the most basic processes in all three domains of cellular life. With the advent of the post-genomic era, the increasing number of complete archaeal genomes has created an opportunity for exploration of the molecular mechanisms for initiating cellular DNA replication by in vivo experiments as well as in silico analysis. However, the location of replication origins (oriCs) in many sequenced archaeal genomes remains unknown. We present a web-based tool Ori-Finder 2 to predict oriCs in the archaeal genomes automatically, based on the integrated method comprising the analysis of base composition asymmetry using the Z-curve method, the distribution of origin recognition boxes identified by FIMO tool, and the occurrence of genes frequently close to oriCs. The web server is also able to analyze the unannotated genome sequences by integrating with gene prediction pipelines and BLAST software for gene identification and function annotation. The result of the predicted oriCs is displayed as an HTML table, which offers an intuitive way to browse the result in graphical and tabular form. The software presented here is accurate for the genomes with single oriC, but it does not necessarily find all the origins of replication for the genomes with multiple oriCs. Ori-Finder 2 aims to become a useful platform for the identification and analysis of oriCs in the archaeal genomes, which would provide insight into the replication mechanisms in archaea. The web server is freely available at http://tubic.tju.edu.cn/Ori-Finder2/. PMID:25309521
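
    The Z-curve method mentioned above maps a DNA sequence to three cumulative disparity curves. The sketch below uses the standard Z-curve definition; how Ori-Finder 2 scores these curves is not described in the abstract, so the function name and the handling of ambiguous bases are assumptions.

```python
def z_curve(sequence):
    """Return cumulative (x, y, z) Z-curve coordinates, one tuple per base.

    x: purines (A, G) minus pyrimidines (C, T)
    y: amino bases (A, C) minus keto bases (G, T)
    z: weak hydrogen-bond bases (A, T) minus strong ones (G, C)
    """
    steps = {
        "A": (1, 1, 1),
        "C": (-1, 1, -1),
        "G": (1, -1, -1),
        "T": (-1, -1, 1),
    }
    x = y = z = 0
    points = []
    for base in sequence.upper():
        # Assumption: ambiguous bases (e.g. N) leave the curve unchanged.
        dx, dy, dz = steps.get(base, (0, 0, 0))
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points


print(z_curve("ACGT"))  # [(1, 1, 1), (0, 2, 0), (1, 1, -1), (0, 0, 0)]
```

    Origin predictors generally inspect turning points in such disparity curves along the chromosome, since base composition asymmetry tends to switch sign near replication origins.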

  12. Microhomology-mediated end-joining-dependent integration of donor DNA in cells and animals using TALENs and CRISPR/Cas9

    PubMed Central

    Nakade, Shota; Tsubota, Takuya; Sakane, Yuto; Kume, Satoshi; Sakamoto, Naoaki; Obara, Masanobu; Daimon, Takaaki; Sezutsu, Hideki; Yamamoto, Takashi; Sakuma, Tetsushi; Suzuki, Ken-ichi T.

    2014-01-01

    Genome engineering using programmable nucleases enables homologous recombination (HR)-mediated gene knock-in. However, the labour required to construct targeting vectors containing homology arms and difficulties in inducing HR in some cell types and organisms represent technical hurdles for the application of HR-mediated knock-in technology. Here, we introduce an alternative strategy for gene knock-in using transcription activator-like effector nucleases (TALENs) and clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated 9 (Cas9) mediated by microhomology-mediated end-joining, termed the PITCh (Precise Integration into Target Chromosome) system. TALEN-mediated PITCh, termed TAL-PITCh, enables efficient integration of exogenous donor DNA in human cells and animals, including silkworms and frogs. We further demonstrate that CRISPR/Cas9-mediated PITCh, termed CRIS-PITCh, can be applied in human cells without carrying the plasmid backbone sequence. Thus, our PITCh-ing strategies will be useful for a variety of applications, not only in cultured cells, but also in various organisms, including invertebrates and vertebrates. PMID:25410609

  13. Filling the knowledge gap: Integrating quantitative genetics and genomics in graduate education and outreach

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genomics revolution provides vital tools to address global food security. Yet to be incorporated into livestock breeding, molecular techniques need to be integrated into a quantitative genetics framework. Within the U.S., with shrinking faculty numbers with the requisite skills, the capacity to ...

  14. Integrated and translational genomics for analysis of complex traits in crops

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We report here on integration of sequencing and genotype data from natural variation (by whole genome resequencing [wgs] or genotype by sequencing [gbs]), transcriptome (RNA-seq) and mutant analysis (also by wgs) with the goal of translating gems from these resources into useable DNA markers in the ...

  15. Unexpected Inheritance: Multiple Integrations of Ancient Bornavirus and Ebolavirus/Marburgvirus Sequences in Vertebrate Genomes

    PubMed Central

    Belyi, Vladimir A.; Levine, Arnold J.; Skalka, Anna Marie

    2010-01-01

    Vertebrate genomes contain numerous copies of retroviral sequences, acquired over the course of evolution. Until recently they were thought to be the only type of RNA viruses to be so represented, because integration of a DNA copy of their genome is required for their replication. In this study, an extensive sequence comparison was conducted in which 5,666 viral genes from all known non-retroviral families with single-stranded RNA genomes were matched against the germline genomes of 48 vertebrate species, to determine if such viruses could also contribute to the vertebrate genetic heritage. In 19 of the tested vertebrate species, we discovered as many as 80 high-confidence examples of genomic DNA sequences that appear to be derived, as long ago as 40 million years, from ancestral members of 4 currently circulating virus families with single strand RNA genomes. Surprisingly, almost all of the sequences are related to only two families in the Order Mononegavirales: the Bornaviruses and the Filoviruses, which cause lethal neurological disease and hemorrhagic fevers, respectively. Based on signature landmarks some, and perhaps all, of the endogenous virus-like DNA sequences appear to be LINE element-facilitated integrations derived from viral mRNAs. The integrations represent genes that encode viral nucleocapsid, RNA-dependent-RNA-polymerase, matrix and, possibly, glycoproteins. Integrations are generally limited to one or very few copies of a related viral gene per species, suggesting that once the initial germline integration was obtained (or selected), later integrations failed or provided little advantage to the host. The conservation of relatively long open reading frames for several of the endogenous sequences, the virus-like protein regions represented, and a potential correlation between their presence and a species' resistance to the diseases caused by these pathogens, are consistent with the notion that their products provide some important biological

  16. Integration of genomic information in the clinical management of HCC.

    PubMed

    Quetglas, Iris M; Moeini, Agrin; Pinyol, Roser; Llovet, Josep M

    2014-10-01

    Molecular profiling of hepatocellular carcinoma (HCC) is enabling the advancement of novel approaches to disease diagnosis and management. Accurate prognosis prediction in HCC is especially critical. Clinical staging systems for HCC that support clinical decision-making (e.g., the BCLC algorithm) might be complemented by molecular-based information in the near future. Molecular signatures derived from tumour and non-tumour samples are associated with patient recurrence and outcome. Single nucleotide polymorphisms have been linked with HCC development. Next generation sequencing studies have brought to light the genomic diversity of this disease. Genes recurrently altered in HCC and susceptible to be targeted belong to signalling pathways including telomere maintenance, cell cycle, chromatin remodelling, Wnt/beta-catenin, RAS/RAF/MAPK and PI3K/AKT/mTOR pathways. Oncogenic loops are unknown but might include some of the already discovered aberrations. Despite the intratumoral heterogeneity observed in HCC tumours, studies including large numbers of samples can identify key genetic drivers and contribute to the development of novel treatments and personalized medicine. PMID:25260311

  17. An integrative functional genomics approach for discovering biomarkers in schizophrenia

    PubMed Central

    Mamdani, Firoza; Macciardi, Fabio

    2011-01-01

    Schizophrenia (SZ) is a complex disorder resulting from both genetic and environmental causes with a worldwide lifetime prevalence of 1%; however, there are no specific, sensitive and validated biomarkers for SZ. A general unifying hypothesis has been put forward that disease-associated single nucleotide polymorphisms (SNPs) from genome-wide association studies (GWAS) are more likely to be associated with gene expression quantitative trait loci (eQTL). We will describe this hypothesis and review primary methodology with refinements for testing this paradigmatic approach in SZ. We will describe biomarker studies of SZ and testing enrichment of SNPs that are associated both with eQTLs and existing GWAS of SZ. SZ-associated SNPs that overlap with eQTLs can be placed into gene–gene expression, protein–protein and protein–DNA interaction networks. Further, those networks can be tested by reducing/silencing the gene expression levels of critical nodes. We present pilot data to support these methods of investigation such as the use of eQTLs to annotate GWASs of SZ, which could be applied to the field of biomarker discovery. Those networks that have association with SNP markers, especially cis-regulated expression, might lead to a clearer understanding of important candidate genes that predispose to disease and alter expression. This method has general application to many complex disorders. PMID:22155586
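
    Testing whether GWAS-associated SNPs are enriched among eQTL SNPs, as described above, is often framed as an overlap test. The abstract does not name the test used, so the following is a minimal sketch assuming a one-sided hypergeometric test; the function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_snps, eqtl_snps, gwas_snps, overlap):
    """P(X >= overlap) under the hypergeometric null.

    total_snps: number of SNPs tested overall
    eqtl_snps:  how many of those are eQTL SNPs
    gwas_snps:  how many are disease-associated in the GWAS
    overlap:    how many are both eQTL and disease-associated
    """
    max_overlap = min(eqtl_snps, gwas_snps)
    # Count the ways to pick the GWAS set with at least `overlap` eQTL SNPs.
    tail = sum(
        comb(eqtl_snps, k) * comb(total_snps - eqtl_snps, gwas_snps - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / comb(total_snps, gwas_snps)


# Made-up counts for illustration only.
print(hypergeom_enrichment_p(total_snps=1000, eqtl_snps=100, gwas_snps=50, overlap=12))
```

    A small p-value would indicate that disease-associated SNPs overlap eQTLs more often than expected by chance, which is the pattern the hypothesis predicts.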

  18. Integrating Sequencing Technologies in Personal Genomics: Optimal Low Cost Reconstruction of Structural Variants

    PubMed Central

    Du, Jiang; Bjornson, Robert D.; Zhang, Zhengdong D.; Kong, Yong; Snyder, Michael; Gerstein, Mark B.

    2009-01-01

    The goal of human genome re-sequencing is obtaining an accurate assembly of an individual's genome. Recently, there has been great excitement in the development of many technologies for this (e.g. medium and short read sequencing from companies such as 454 and SOLiD, and high-density oligo-arrays from Affymetrix and NimbleGen), with even more expected to appear. The costs and sensitivities of these technologies differ considerably from each other. As an important goal of personal genomics is to reduce the cost of re-sequencing to an affordable point, it is worthwhile to consider optimally integrating technologies. Here, we build a simulation toolbox that will help us optimally combine different technologies for genome re-sequencing, especially in reconstructing large structural variants (SVs). SV reconstruction is considered the most challenging step in human genome re-sequencing. (It is sometimes even harder than de novo assembly of small genomes because of the duplications and repetitive sequences in the human genome.) To this end, we formulate canonical problems that are representative of issues in reconstruction and are of small enough scale to be computationally tractable and simulatable. Using semi-realistic simulations, we show how we can combine different technologies to optimally solve the assembly at low cost. With mappability maps, our simulations efficiently handle the inhomogeneous repeat-containing structure of the human genome and the computational complexity of practical assembly algorithms. They quantitatively show how combining different read lengths is more cost-effective than using one length, how an optimal mixed sequencing strategy for reconstructing large novel SVs usually also gives accurate detection of SNPs/indels, how paired-end reads can improve reconstruction efficiency, and how adding in arrays is more efficient than just sequencing for disentangling some complex SVs. Our strategy should facilitate the sequencing of human genomes at

  19. Detecting DNA double-stranded breaks in mammalian genomes by linear amplification-mediated high-throughput genome-wide translocation sequencing.

    PubMed

    Hu, Jiazhi; Meyers, Robin M; Dong, Junchao; Panchakshari, Rohit A; Alt, Frederick W; Frock, Richard L

    2016-05-01

    Unbiased, high-throughput assays for detecting and quantifying DNA double-stranded breaks (DSBs) across the genome in mammalian cells will facilitate basic studies of the mechanisms that generate and repair endogenous DSBs. They will also enable more applied studies, such as those to evaluate the on- and off-target activities of engineered nucleases. Here we describe a linear amplification-mediated high-throughput genome-wide translocation sequencing (LAM-HTGTS) method for the detection of genome-wide 'prey' DSBs via their translocation in cultured mammalian cells to a fixed 'bait' DSB. Bait-prey junctions are cloned directly from isolated genomic DNA using LAM-PCR and unidirectionally ligated to bridge adapters; subsequent PCR steps amplify the single-stranded DNA junction library in preparation for Illumina MiSeq paired-end sequencing. A custom bioinformatics pipeline identifies prey sequences that contribute to junctions and maps them across the genome. LAM-HTGTS differs from related approaches because it detects a wide range of broken end structures with nucleotide-level resolution. Familiarity with nucleic acid methods and next-generation sequencing analysis is necessary for library generation and data interpretation. LAM-HTGTS assays are sensitive, reproducible, relatively inexpensive, scalable and straightforward to implement with a turnaround time of <1 week. PMID:27031497

  20. Hexokinase-2-mediated aerobic glycolysis is integral to cerebellar neurogenesis and pathogenesis of medulloblastoma

    PubMed Central

    2013-01-01

    Background While aerobic glycolysis is linked to unconstrained proliferation in cancer, less is known about its physiological role. Why this metabolic program that promotes tumor growth is preserved in the genome has thus been unresolved. We tested the hypothesis that aerobic glycolysis derives from developmental processes that regulate rapid proliferation. Methods We performed an integrated analysis of metabolism and gene expression in cerebellar granule neuron progenitors (CGNPs) with and without Sonic Hedgehog (Shh), their endogenous mitogen. Because our analysis highlighted Hexokinase-2 (Hk2) as a key metabolic regulator induced by Shh, we studied the effect of conditional genetic Hk2 deletion in CGNP development. We then crossed Hk2 conditional knockout mice with transgenic SmoM2 mice that develop spontaneous medulloblastoma and determined changes in SmoM2-driven tumorigenesis. Results We show that Shh and phosphoinositide 3-kinase (PI3K) signaling combine to induce an Hk2-dependent glycolytic phenotype in CGNPs. This phenotype is recapitulated in medulloblastoma, a malignant tumor of CGNP origin. Importantly, cre-mediated ablation of Hk2 abrogated aerobic glycolysis, disrupting CGNP development and Smoothened-induced tumorigenesis. Comparing tumorigenesis in medulloblastoma-prone SmoM2 mice with and without functional Hk2, we demonstrate that loss of aerobic glycolysis reduces the aggressiveness of medulloblastoma, causing tumors to grow as indolent lesions and allowing long-term survival of tumor-bearing mice. Conclusions Our investigations demonstrate that aerobic glycolysis in cancer derives from developmental mechanisms that persist in tumorigenesis. Moreover, we demonstrate in a primary tumor model the anti-cancer potential of blocking aerobic glycolysis by targeting Hk2. See commentary article: http://www.biomedcentral.com/1741-7007/11/3 PMID:24280485

  1. The REST remodeling complex protects genomic integrity during embryonic neurogenesis

    PubMed Central

    Nechiporuk, Tamilla; McGann, James; Mullendorff, Karin; Hsieh, Jenny; Wurst, Wolfgang; Floss, Thomas; Mandel, Gail

    2016-01-01

    The timely transition from neural progenitor to post-mitotic neuron requires down-regulation and loss of the neuronal transcriptional repressor, REST. Here, we have used mice containing a gene trap in the Rest gene, eliminating transcription from all coding exons, to remove REST prematurely from neural progenitors. We find that catastrophic DNA damage occurs during S-phase of the cell cycle, with long-term consequences including abnormal chromosome separation, apoptosis, and smaller brains. Persistent effects are evident by latent appearance of proneural glioblastoma in adult mice deleted additionally for the tumor suppressor p53 protein (p53). A previous line of mice deleted for REST in progenitors by conventional gene targeting does not exhibit these phenotypes, likely due to a remaining C-terminal peptide that still binds chromatin and recruits co-repressors. Our results suggest that REST-mediated chromatin remodeling is required in neural progenitors for proper S-phase dynamics, as part of its well-established role in repressing neuronal genes until terminal differentiation. DOI: http://dx.doi.org/10.7554/eLife.09584.001 PMID:26745185

  2. Malaria parasites utilize both homologous recombination and alternative end joining pathways to maintain genome integrity

    PubMed Central

    Kirkman, Laura A.; Lawrence, Elizabeth A.; Deitsch, Kirk W.

    2014-01-01

    Malaria parasites replicate asexually within their mammalian hosts as haploid cells and are subject to DNA damage from the immune response and chemotherapeutic agents that can significantly disrupt genomic integrity. Examination of the annotated genome of the parasite Plasmodium falciparum identified genes encoding core proteins required for the homologous recombination (HR) pathway for repairing DNA double-strand breaks (DSBs), but surprisingly none of the components of the canonical non-homologous end joining (C-NHEJ) pathway were identified. To better understand how malaria parasites repair DSBs and maintain genome integrity, we modified the yeast I-SceI endonuclease system to generate inducible, site-specific DSBs within the parasite’s genome. Analysis of repaired genomic DNA showed that parasites possess both a typical HR pathway resulting in gene conversion events as well as an end joining (EJ) pathway for repair of DSBs when no homologous sequence is available. The products of EJ were limited in number and identical products were observed in multiple independent experiments. The repair junctions frequently contained short insertions also found in the surrounding sequences, suggesting the possibility of a templated repair process. We propose that an alternative end-joining pathway, rather than C-NHEJ, serves as a primary method for repairing DSBs in malaria parasites. PMID:24089143

  3. Annotation of the zebrafish genome through an integrated transcriptomic and proteomic analysis.

    PubMed

    Kelkar, Dhanashree S; Provost, Elayne; Chaerkady, Raghothama; Muthusamy, Babylakshmi; Manda, Srikanth S; Subbannayya, Tejaswini; Selvan, Lakshmi Dhevi N; Wang, Chieh-Huei; Datta, Keshava K; Woo, Sunghee; Dwivedi, Sutopa B; Renuse, Santosh; Getnet, Derese; Huang, Tai-Chung; Kim, Min-Sik; Pinto, Sneha M; Mitchell, Christopher J; Madugundu, Anil K; Kumar, Praveen; Sharma, Jyoti; Advani, Jayshree; Dey, Gourav; Balakrishnan, Lavanya; Syed, Nazia; Nanjappa, Vishalakshi; Subbannayya, Yashwanth; Goel, Renu; Prasad, T S Keshava; Bafna, Vineet; Sirdeshmukh, Ravi; Gowda, Harsha; Wang, Charles; Leach, Steven D; Pandey, Akhilesh

    2014-11-01

    Accurate annotation of protein-coding genes is one of the primary tasks upon the completion of whole genome sequencing of any organism. In this study, we used an integrated transcriptomic and proteomic strategy to validate and improve the existing zebrafish genome annotation. We undertook high-resolution mass-spectrometry-based proteomic profiling of 10 adult organs, whole adult fish body, and two developmental stages of zebrafish (SAT line), in addition to transcriptomic profiling of six organs. More than 7,000 proteins were identified from proteomic analyses, and ~69,000 high-confidence transcripts were assembled from the RNA sequencing data. Approximately 15% of the transcripts mapped to intergenic regions, the majority of which are likely long non-coding RNAs. These high-quality transcriptomic and proteomic data were used to manually reannotate the zebrafish genome. We report the identification of 157 novel protein-coding genes. In addition, our data led to modification of existing gene structures including novel exons, changes in exon coordinates, changes in frame of translation, translation in annotated UTRs, and joining of genes. Finally, we discovered four instances of genome assembly errors that were supported by both proteomic and transcriptomic data. Our study shows how an integrative analysis of the transcriptome and the proteome can extend our understanding of even well-annotated genomes. PMID:25060758

  4. Neuroscience Data Integration through Mediation: An (F)BIRN Case Study

    PubMed Central

    Ashish, Naveen; Ambite, José Luis; Muslea, Maria; Turner, Jessica A.

    2010-01-01

    We describe an application of the BIRN mediator to the integration of neuroscience experimental data sources. The BIRN mediator is a general purpose solution to the problem of providing integrated, semantically-consistent access to biomedical data from multiple, distributed, heterogeneous data sources. The system follows the mediation approach, where the data remains at the sources, providers maintain control of the data, and the integration system retrieves data from the sources in real-time in response to client queries. Our aim with this paper is to illustrate how domain-specific data integration applications can be developed quickly and in a principled way by using our general mediation technology. We describe in detail the integration of two leading, but radically different, experimental neuroscience sources, namely, the human imaging database, a relational database, and the eXtensible neuroimaging archive toolkit, an XML web services system. We discuss the steps, sources of complexity, effort, and time required to build such applications, as well as outline directions of ongoing and future research on biomedical data integration. PMID:21228907
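
    The mediation approach described above (data stays at the sources and is fetched at query time) can be illustrated with a toy example. This is only a sketch of the general pattern and not the BIRN mediator's API: the class names, schema, and XML layout are all hypothetical, with an in-memory SQLite table standing in for a relational source and an XML string standing in for an XML web service.

```python
import sqlite3
import xml.etree.ElementTree as ET


class RelationalSource:
    """Stand-in for a relational imaging database (hypothetical schema)."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE scans (subject_id TEXT, modality TEXT)")
        self.conn.executemany(
            "INSERT INTO scans VALUES (?, ?)", [("s01", "fMRI"), ("s02", "MRI")]
        )

    def scans_for(self, modality):
        rows = self.conn.execute(
            "SELECT subject_id, modality FROM scans WHERE modality = ?", (modality,)
        )
        return [
            {"subject": sid, "modality": mod, "source": "relational"}
            for sid, mod in rows
        ]


class XmlSource:
    """Stand-in for an XML-based archive (hypothetical document layout)."""

    XML = (
        "<experiments>"
        "<experiment subject='s03' type='fMRI'/>"
        "<experiment subject='s04' type='PET'/>"
        "</experiments>"
    )

    def scans_for(self, modality):
        root = ET.fromstring(self.XML)
        return [
            {"subject": e.get("subject"), "modality": e.get("type"), "source": "xml"}
            for e in root.iter("experiment")
            if e.get("type") == modality
        ]


class Mediator:
    """Answers a client query by asking every source at request time."""

    def __init__(self, sources):
        self.sources = sources

    def query(self, modality):
        results = []
        for source in self.sources:
            # Each wrapper translates the query into its source's own terms
            # and returns records in one shared, unified format.
            results.extend(source.scans_for(modality))
        return results


mediator = Mediator([RelationalSource(), XmlSource()])
for record in mediator.query("fMRI"):
    print(record)
```

    Nothing is copied into a central store here: each source keeps its own data and vocabulary, and the wrappers map both onto a shared record shape only when a query arrives.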

  5. Ensembl Plants: Integrating Tools for Visualizing, Mining, and Analyzing Plant Genomics Data.

    PubMed

    Bolser, Dan; Staines, Daniel M; Pritchard, Emily; Kersey, Paul

    2016-01-01

    Ensembl Plants (http://plants.ensembl.org) is an integrative resource presenting genome-scale information for a growing number of sequenced plant species (currently 33). Data provided include genome sequence, gene models, functional annotation, and polymorphic loci. Various additional information is provided for variation data, including population structure, individual genotypes, linkage, and phenotype data. In each release, comparative analyses are performed on whole genome and protein sequences, and genome alignments and gene trees are made available that show the implied evolutionary history of each gene family. Access to the data is provided through a genome browser incorporating many specialist interfaces for different data types, and through a variety of additional methods for programmatic access and data mining. These access routes are consistent with those offered through the Ensembl interface for the genomes of non-plant species, including those of plant pathogens, pests, and pollinators. Ensembl Plants is updated 4-5 times a year and is developed in collaboration with our international partners in the Gramene (http://www.gramene.org) and transPLANT (http://www.transplantdb.org) projects. PMID:26519403

  6. The integrated model of sport confidence: a canonical correlation and mediational analysis.

    PubMed

    Koehn, Stefan; Pearce, Alan J; Morris, Tony

    2013-12-01

    The main purpose of the study was to examine crucial parts of Vealey's (2001) integrated framework hypothesizing that sport confidence is a mediating variable between sources of sport confidence (including achievement, self-regulation, and social climate) and athletes' affect in competition. The sample consisted of 386 athletes, who completed the Sources of Sport Confidence Questionnaire, Trait Sport Confidence Inventory, and Dispositional Flow Scale-2. Canonical correlation analysis revealed a confidence-achievement dimension underlying flow. Bias-corrected bootstrap confidence intervals in AMOS 20.0 were used in examining mediation effects between source domains and dispositional flow. Results showed that sport confidence partially mediated the relationship between achievement and self-regulation domains and flow, whereas no significant mediation was found for social climate. On a subscale level, full mediation models emerged for achievement and flow dimensions of challenge-skills balance, clear goals, and concentration on the task at hand. PMID:24334324

  7. Personalised Medicine Possible With Real-Time Integration of Genomic and Clinical Data To Inform Clinical Decision-Making.

    PubMed

    Martin-Sanchez, Fernando; Turner, Maureen; Johnstone, Alice; Heffer, Leon; Rafael, Naomi; Bakker, Tim; Thorne, Natalie; Macciocca, Ivan; Gaff, Clara

    2015-01-01

    Despite widespread use of genomic sequencing in research, there are gaps in our understanding of the performance and provision of genomic sequencing in clinical practice. The Melbourne Genomics Health Alliance (the Alliance) has been established to determine the feasibility, performance and impact of using genomic sequencing as a diagnostic tool. The Alliance has partnered with BioGrid Australia to enable the linkage of genomic sequencing, clinical treatment and outcome data for this project. This integrated dataset of genetic, clinical and patient-sourced information will be used by the Alliance to evaluate the potential diagnostic value of genomic sequencing in routine clinical practice. This project will allow the Alliance to provide recommendations to facilitate the integration of genomic sequencing into clinical practice to enable personalised disease treatment. PMID:26262351

  8. Integrated circoviral rep-like sequences in the genome of cyprinid fish.

    PubMed

    Fehér, Enikő; Székely, Csaba; Lőrincz, Márta; Cech, Gábor; Tuboly, Tamás; Singh, Hridaya Shanker; Bányai, Krisztián; Farkas, Szilvia L

    2013-10-01

    Recently, a new group of circoviruses has been detected in tissues of Barbel fish and European catfish in Hungary. In our study, eight additional fish species were screened for the detection and characterization of circovirus genomes. Two of these species bore circoviral sequences based on a conventional PCR assay targeting fragments of the gene coding for the replication-associated protein. Interestingly, the methods successfully used before failed to amplify other parts of the circular viral genome, suggesting the presence of partial, integrated genetic elements in the genome of the host. The successfully sequenced fragments of the Indian rohu (Labeo rohita) encoded mutations which may cause frameshifts or termination in the coding region described previously in other vertebrates. Phylogenetic analyses suggested that integration of the viral genetic elements might have progressed concurrently with or following the diversification of cyprinid fish. Further studies on the nature of whole circovirus genomes and integrated elements may help to understand their potential role and evolution in different fish species. PMID:23780219

  9. Annotating novel genes by integrating synthetic lethals and genomic information

    PubMed Central

    Schöner, Daniel; Kalisch, Markus; Leisner, Christian; Meier, Lukas; Sohrmann, Marc; Faty, Mahamadou; Barral, Yves; Peter, Matthias; Gruissem, Wilhelm; Bühlmann, Peter

    2008-01-01

    Background: Large-scale screening for synthetic lethality serves as a common tool in yeast genetics to systematically search for genes that play a role in specific biological processes. Often the amounts of data resulting from a single large-scale screen far exceed the capacities of experimental characterization of every identified target. Thus, there is need for computational tools that select promising candidate genes in order to reduce the number of follow-up experiments to a manageable size. Results: We analyze synthetic lethality data for arp1 and jnm1, two spindle migration genes, in order to identify novel members in this process. To this end, we use an unsupervised statistical method that integrates additional information from biological data sources, such as gene expression, phenotypic profiling, RNA degradation and sequence similarity. Different from existing methods that require large amounts of synthetic lethal data, our method merely relies on synthetic lethality information from two single screens. Using a Multivariate Gaussian Mixture Model, we determine the best subset of features that assign the target genes to two groups. The approach identifies a small group of genes as candidates involved in spindle migration. Experimental testing confirms the majority of our candidates and we present she1 (YBL031W) as a novel gene involved in spindle migration. We applied the statistical methodology also to TOR2 signaling as another example. Conclusion: We demonstrate the general use of Multivariate Gaussian Mixture Modeling for selecting candidate genes for experimental characterization from synthetic lethality data sets. For the given example, integration of different data sources contributes to the identification of genetic interaction partners of arp1 and jnm1 that play a role in the same biological process. PMID:18194531

  10. Pattern discovery and cancer gene identification in integrated cancer genomic data

    PubMed Central

    Mo, Qianxing; Wang, Sijian; Seshan, Venkatraman E.; Olshen, Adam B.; Schultz, Nikolaus; Sander, Chris; Powers, R. Scott; Ladanyi, Marc; Shen, Ronglai

    2013-01-01

    Large-scale integrated cancer genome characterization efforts including The Cancer Genome Atlas and the Cancer Cell Line Encyclopedia have created unprecedented opportunities to study cancer biology in the context of knowing the entire catalog of genetic alterations. A clinically important challenge is to discover cancer subtypes and their molecular drivers in a comprehensive genetic context. Curtis et al. [Nature (2012) 486(7403):346–352] have recently shown that integrative clustering of copy number and gene expression in 2,000 breast tumors reveals novel subgroups beyond the classic expression subtypes that show distinct clinical outcomes. To extend the scope of integrative analysis for the inclusion of somatic mutation data by massively parallel sequencing, we propose a framework for joint modeling of discrete and continuous variables that arise from integrated genomic, epigenomic, and transcriptomic profiling. The core idea is motivated by the hypothesis that diverse molecular phenotypes can be predicted by a set of orthogonal latent variables that represent distinct molecular drivers, and thus can reveal tumor subgroups of biological and clinical importance. Using the Cancer Cell Line Encyclopedia dataset, we demonstrate our method can accurately group cell lines by their cell-of-origin for several cancer types, and precisely pinpoint their known and potential cancer driver genes. Our integrative analysis also demonstrates the power for revealing subgroups that are not lineage-dependent, but consist of different cancer types driven by a common genetic alteration. Application to The Cancer Genome Atlas colorectal cancer data reveals distinct integrated tumor subtypes, suggesting different genetic pathways in colon cancer progression. PMID:23431203

  11. New Insights into the Classification and Integration Specificity of Streptococcus Integrative Conjugative Elements through Extensive Genome Exploration

    PubMed Central

    Ambroset, Chloé; Coluzzi, Charles; Guédon, Gérard; Devignes, Marie-Dominique; Loux, Valentin; Lacroix, Thomas; Payot, Sophie; Leblond-Bourget, Nathalie

    2016-01-01

    Recent genome analyses suggest that integrative and conjugative elements (ICEs) are widespread in bacterial genomes and therefore play an essential role in horizontal transfer. However, only a few of these elements are precisely characterized and correctly delineated within sequenced bacterial genomes. Even though previous analysis showed the presence of ICEs in some species of Streptococci, the global prevalence and diversity of ICEs was not analyzed in this genus. In this study, we searched for ICEs in the completely sequenced genomes of 124 strains belonging to 27 streptococcal species. These exhaustive analyses revealed 105 putative ICEs and 26 slightly decayed elements whose limits were assessed and whose insertion site was identified. These ICEs were grouped in seven distinct unrelated or distantly related families, according to their conjugation modules. Integration of these streptococcal ICEs is catalyzed either by a site-specific tyrosine integrase, a low-specificity tyrosine integrase, a site-specific single serine integrase, a triplet of site-specific serine integrases or a DDE transposase. Analysis of their integration site led to the detection of 18 target-genes for streptococcal ICE insertion including eight that had not been identified previously (ftsK, guaA, lysS, mutT, rpmG, rpsI, traG, and ebfC). It also suggests that all specificities have evolved to minimize the impact of the insertion on the host. This overall analysis of streptococcal ICEs emphasizes their prevalence and diversity and demonstrates that exchanges or acquisitions of conjugation and recombination modules are frequent. PMID:26779141

  12. Transcriptional stalling in B-lymphocytes: a mechanism for antibody diversification and maintenance of genomic integrity.

    PubMed

    Sun, Jianbo; Rothschild, Gerson; Pefanis, Evangelos; Basu, Uttiya

    2013-01-01

    B cells utilize three DNA alteration strategies, namely V(D)J recombination, somatic hypermutation (SHM) and class switch recombination (CSR), to somatically mutate their genome, thereby expressing a plethora of antibodies tailor-made against the innumerable antigens they encounter while in circulation. Of these three events, SHM and CSR depend on the single-strand DNA cytidine deaminase Activation Induced cytidine Deaminase (AID). Recent advances, discussed in this review article, point toward various components of RNA polymerase II "stalling" machinery as regulators of AID activity during antibody diversification and maintenance of B cell genome integrity. PMID:23584095

  13. Interplay between arginine methylation and ubiquitylation regulates KLF4-mediated genome stability and carcinogenesis

    PubMed Central

    Hu, Dong; Gur, Mert; Zhou, Zhuan; Gamper, Armin; Hung, Mien-Chie; Fujita, Naoya; Lan, Li; Bahar, Ivet; Wan, Yong

    2015-01-01

    KLF4 is an important regulator of cell-fate decision, including DNA damage response and apoptosis. We identify a novel interplay between protein modifications in regulating KLF4 function. Here we show that arginine methylation of KLF4 by PRMT5 inhibits KLF4 ubiquitylation by VHL and thereby reduces KLF4 turnover, resulting in the elevation of KLF4 protein levels concomitant with increased transcription of KLF4-dependent p21 and reduced expression of KLF4-repressed Bax. Structure-based modelling and simulations provide insight into the molecular mechanisms of KLF4 recognition and catalysis by PRMT5. Following genotoxic stress, disruption of PRMT5-mediated KLF4 methylation leads to abrogation of KLF4 accumulation, which, in turn, attenuates cell cycle arrest. Mutating KLF4 methylation sites suppresses breast tumour initiation and progression, and immunohistochemical stain shows increased levels of both KLF4 and PRMT5 in breast cancer tissues. Taken together, our results point to a critical role for aberrant KLF4 regulation by PRMT5 in genome stability and breast carcinogenesis. PMID:26420673

  14. Interplay between arginine methylation and ubiquitylation regulates KLF4-mediated genome stability and carcinogenesis.

    PubMed

    Hu, Dong; Gur, Mert; Zhou, Zhuan; Gamper, Armin; Hung, Mien-Chie; Fujita, Naoya; Lan, Li; Bahar, Ivet; Wan, Yong

    2015-01-01

    KLF4 is an important regulator of cell-fate decision, including DNA damage response and apoptosis. We identify a novel interplay between protein modifications in regulating KLF4 function. Here we show that arginine methylation of KLF4 by PRMT5 inhibits KLF4 ubiquitylation by VHL and thereby reduces KLF4 turnover, resulting in the elevation of KLF4 protein levels concomitant with increased transcription of KLF4-dependent p21 and reduced expression of KLF4-repressed Bax. Structure-based modelling and simulations provide insight into the molecular mechanisms of KLF4 recognition and catalysis by PRMT5. Following genotoxic stress, disruption of PRMT5-mediated KLF4 methylation leads to abrogation of KLF4 accumulation, which, in turn, attenuates cell cycle arrest. Mutating KLF4 methylation sites suppresses breast tumour initiation and progression, and immunohistochemical stain shows increased levels of both KLF4 and PRMT5 in breast cancer tissues. Taken together, our results point to a critical role for aberrant KLF4 regulation by PRMT5 in genome stability and breast carcinogenesis. PMID:26420673

  15. Genome-wide analysis of FOXO3 mediated transcription regulation through RNA polymerase II profiling.

    PubMed

    Eijkelenboom, Astrid; Mokry, Michal; de Wit, Elzo; Smits, Lydia M; Polderman, Paulien E; van Triest, Miranda H; van Boxtel, Ruben; Schulze, Almut; de Laat, Wouter; Cuppen, Edwin; Burgering, Boudewijn M T

    2013-01-01

    Forkhead box O (FOXO) transcription factors are key players in diverse cellular processes affecting tumorigenesis, stem cell maintenance and lifespan. To gain insight into the mechanisms of FOXO-regulated target gene expression, we studied genome-wide effects of FOXO3 activation. Profiling RNA polymerase II changes shows that FOXO3 regulates gene expression through transcription initiation. Correlative analysis of FOXO3 and RNA polymerase II ChIP-seq profiles demonstrates FOXO3 to act as a transcriptional activator. Furthermore, this analysis reveals a significant part of FOXO3 gene regulation proceeds through enhancer regions. FOXO3 binds to pre-existing enhancers and further activates these enhancers as shown by changes in histone acetylation and RNA polymerase II recruitment. In addition, FOXO3-mediated enhancer activation correlates with regulation of adjacent genes and pre-existence of chromatin loops between FOXO3 bound enhancers and target genes. Combined, our data elucidate how FOXOs regulate gene transcription and provide insight into mechanisms by which FOXOs can induce different gene expression programs depending on chromatin architecture. PMID:23340844

  16. Genomic Access to Monarch Migration Using TALEN and CRISPR/Cas9-Mediated Targeted Mutagenesis.

    PubMed

    Markert, Matthew J; Zhang, Ying; Enuameh, Metewo S; Reppert, Steven M; Wolfe, Scot A; Merlin, Christine

    2016-01-01

    The eastern North American monarch butterfly, Danaus plexippus, is an emerging model system to study the neural, molecular, and genetic basis of animal long-distance migration and animal clockwork mechanisms. While genomic studies have provided new insight into migration-associated and circadian clock genes, the general lack of simple and versatile reverse-genetic methods has limited in vivo functional analysis of candidate genes in this species. Here, we report the establishment of highly efficient and heritable gene mutagenesis methods in the monarch butterfly using transcriptional activator-like effector nucleases (TALENs) and CRISPR-associated RNA-guided nuclease Cas9 (CRISPR/Cas9). Using two clock gene loci, cryptochrome 2 and clock (clk), as candidates, we show that both TALENs and CRISPR/Cas9 generate high-frequency nonhomologous end-joining (NHEJ)-mediated mutations at targeted sites (up to 100%), and that injecting fewer than 100 eggs is sufficient to recover mutant progeny and generate monarch knockout lines in about 3 months. Our study also genetically defines monarch CLK as an essential component of the transcriptional activation complex of the circadian clock. The methods presented should not only greatly accelerate functional analyses of many aspects of monarch biology, but are also anticipated to facilitate the development of these tools in other nontraditional insect species as well as the development of homology-directed knock-ins. PMID:26837953

  17. Genomic Access to Monarch Migration Using TALEN and CRISPR/Cas9-Mediated Targeted Mutagenesis

    PubMed Central

    Markert, Matthew J.; Zhang, Ying; Enuameh, Metewo S.; Reppert, Steven M.; Wolfe, Scot A.; Merlin, Christine

    2016-01-01

    The eastern North American monarch butterfly, Danaus plexippus, is an emerging model system to study the neural, molecular, and genetic basis of animal long-distance migration and animal clockwork mechanisms. While genomic studies have provided new insight into migration-associated and circadian clock genes, the general lack of simple and versatile reverse-genetic methods has limited in vivo functional analysis of candidate genes in this species. Here, we report the establishment of highly efficient and heritable gene mutagenesis methods in the monarch butterfly using transcriptional activator-like effector nucleases (TALENs) and CRISPR-associated RNA-guided nuclease Cas9 (CRISPR/Cas9). Using two clock gene loci, cryptochrome 2 and clock (clk), as candidates, we show that both TALENs and CRISPR/Cas9 generate high-frequency nonhomologous end-joining (NHEJ)-mediated mutations at targeted sites (up to 100%), and that injecting fewer than 100 eggs is sufficient to recover mutant progeny and generate monarch knockout lines in about 3 months. Our study also genetically defines monarch CLK as an essential component of the transcriptional activation complex of the circadian clock. The methods presented should not only greatly accelerate functional analyses of many aspects of monarch biology, but are also anticipated to facilitate the development of these tools in other nontraditional insect species as well as the development of homology-directed knock-ins. PMID:26837953

  18. Integration and Querying of Genomic and Proteomic Semantic Annotations for Biomedical Knowledge Extraction.

    PubMed

    Masseroli, Marco; Canakoglu, Arif; Ceri, Stefano

    2016-01-01

    Understanding complex biological phenomena involves answering complex biomedical questions on multiple biomolecular information simultaneously, which are expressed through multiple genomic and proteomic semantic annotations scattered in many distributed and heterogeneous data sources; such heterogeneity and dispersion hamper the biologists' ability of asking global queries and performing global evaluations. To overcome this problem, we developed a software architecture to create and maintain a Genomic and Proteomic Knowledge Base (GPKB), which integrates several of the most relevant sources of such dispersed information (including Entrez Gene, UniProt, IntAct, Expasy Enzyme, GO, GOA, BioCyc, KEGG, Reactome, and OMIM). Our solution is general, as it uses a flexible, modular, and multilevel global data schema based on abstraction and generalization of integrated data features, and a set of automatic procedures for easing data integration and maintenance, also when the integrated data sources evolve in data content, structure, and number. These procedures also assure consistency, quality, and provenance tracking of all integrated data, and perform the semantic closure of the hierarchical relationships of the integrated biomedical ontologies. At http://www.bioinformatics.deib.polimi.it/GPKB/, a Web interface allows graphical easy composition of queries, although complex, on the knowledge base, supporting also semantic query expansion and comprehensive explorative search of the integrated data to better sustain biomedical knowledge extraction. PMID:27045824

  19. Exploring breast carcinogenesis through integrative genomics and epigenomics analyses.

    PubMed

    Minning, Chin; Mokhtar, Norfilza Mohd; Abdullah, Norlia; Muhammad, Rohaizak; Emran, Nor Aina; Ali, Siti Aishah M D; Harun, Roslan; Jamal, Rahman

    2014-11-01

    There have been many DNA methylation studies on breast cancer which showed various methylation patterns involving tumour suppressor genes and oncogenes but only a few of those studies link the methylation data with gene expression. More data are required especially from the Asian region and to analyse how the epigenome data correlate with the transcriptome. DNA methylation profiling was carried out on 76 fresh frozen primary breast tumour tissues and 25 adjacent non-cancerous breast tissues using the Illumina Infinium® HumanMethylation27 BeadChip. Validation of methylation results was performed on 7 genes using either MS-MLPA or MS-qPCR. Gene expression profiling was done on 15 breast tumours and 5 adjacent non-cancerous breast tissues using the Affymetrix GeneChip® Human Gene 1.0 ST array. The overlapping genes between DNA methylation and gene expression datasets were further mapped to the KEGG database to identify the molecular pathways that linked these genes together. Supervised hierarchical cluster analysis revealed 1,389 hypermethylated CpG sites and 22 hypomethylated CpG sites in cancer compared to the normal samples. Gene expression microarray analysis using a fold-change of at least 1.5 and a false discovery rate (FDR) at p>0.05 identified 404 upregulated and 463 downregulated genes in cancer samples. Integration of both datasets identified 51 genes with hypermethylation with low expression (negative association) and 13 genes with hypermethylation with high expression (positive association). Most of the overlapping genes belong to the focal adhesion and extracellular matrix-receptor interaction that play important roles in breast carcinogenesis. The present study displayed the value of using multiple datasets in the same set of tissues and how the integrative analysis can create a list of well-focused genes as well as to show the correlation between epigenetic changes and gene expression. These gene signatures can help us understand the epigenetic

  20. Integrating Hormone- and Micromolecule-Mediated Signaling with Plasmodesmal Communication.

    PubMed

    Han, Xiao; Kim, Jae-Yean

    2016-01-01

    Intercellular and supracellular communications through plasmodesmata are involved in vital processes for plant development and physiological responses. Micro- and macromolecules, including hormones, RNA, and proteins, serve as biological information vectors that traffic through the plasmodesmata between cells. Previous studies demonstrated that the plasmodesmata are elaborately regulated, whereby a long queue of multiple signaling molecules forms. However, the mechanism by which these signals are coupled or coordinated in terms of simultaneous transport in a single channel remains a puzzle. In the last few years, several phytohormones that could function as both non-cell-autonomous signals and plasmodesmal regulators have been disclosed. Plasmodesmal regulators such as auxin, salicylic acid, reactive oxygen species, gibberellic acids, chitin, and jasmonic acid could regulate intercellular trafficking by adjusting plasmodesmal permeability. Here, callose, along with β-glucan synthase and β-glucanase, plays a critical role in regulating plasmodesmal permeability. Interestingly, most of the previously identified regulators are capable of diffusing through the plasmodesmata. Given the small sizes of these molecules, the plasmodesmata are prominent intercellular channels that allow diffusion-based movement of those signaling molecules. Obviously, intercellular communication is under the control of a major mechanism, named a feedback loop, at the plasmodesmata, which mediates complicated biological behaviors. Prospective research on the mechanism of coupling micromolecules at the plasmodesmata for developmental signaling and nutrient provision will help us to understand how plants coordinate their development and photosynthetic assimilation, which is important for agriculture. PMID:26384246

  1. Integrating analysis reveals microRNA-mediated pathway crosstalk among Crohn's disease, ulcerative colitis and colorectal cancer.

    PubMed

    Bai, Jing; Li, Yongsheng; Shao, Tingting; Zhao, Zheng; Wang, Yuan; Wu, Aiwei; Chen, Hong; Li, Shengli; Jiang, Chunjie; Xu, Juan; Li, Xia

    2014-07-29

    Inflammatory bowel disease (IBD), which can increase the risk of colorectal cancer (CRC), includes two primary subtypes, ulcerative colitis (UC) and Crohn's disease (CD). Although several individual genes involved in inflammation or cancer characterization have been identified, it is still difficult to elucidate functional relationship details between the molecules underlying pathogenesis at the system level. The global effect of miRNAs on genes or their involved functions is also poorly understood. We first integrated genome-wide gene expression profiles and biological pathway information to explore the underlying associations among UC, CD and CRC at the function and gene level. After identifying the pathways regulated by miRNAs, a global map of miRNA-mediated pathway crosstalk shared by the three diseases was further constructed to vertically explain the links of three level alterations. The three types of diseases have close associations with each other at the levels of function, gene and miRNA regulation. Several key biological pathways are involved in the three diseases, related to the immune system and inflammation, metabolism, or cell proliferation and apoptosis. Moreover, miRNAs exhibit dominant effects on multiple pathways. It is worth noting that UC shows relatively close associations with CD and CRC at the three levels. Finally, the miRNAs could mediate the crosstalk within or between pathways. For example, hsa-miR-125b, hsa-miR-335 and hsa-miR-155 mediated the crosstalk between three metabolic pathways. The crosstalk within the Toll-like receptor signaling pathway could be mediated by hsa-miR-124, hsa-miR-146a and hsa-miR-221/222. Our results are relevant to the prevention and treatment of intestinal-related chronic inflammation or cancer. PMID:24949825

  2. Transient spatiotopic integration across saccadic eye movements mediates visual stability.

    PubMed

    Cicchini, Guido M; Binda, Paola; Burr, David C; Morrone, M Concetta

    2013-02-01

    Eye movements pose major problems to the visual system, because each new saccade changes the mapping of external objects on the retina. It is known that stimuli briefly presented around the time of saccades are systematically mislocalized, whereas continuously visible objects are perceived as spatially stable even when they undergo large transsaccadic displacements. In this study we investigated the relationship between these two phenomena and measured how human subjects perceive the position of pairs of bars briefly displayed around the time of large horizontal saccades. We show that they interact strongly, with the perisaccadic bar being drawn toward the other, dramatically altering the pattern of perisaccadic mislocalization. The interaction field extends over a wide range (200 ms and 20°) and is oriented along the retinotopic trajectory of the saccade-induced motion, suggesting a mechanism that integrates pre- and postsaccadic stimuli at different retinal locations but similar external positions. We show how transient changes in spatial integration mechanisms, which are consistent with the present psychophysical results and with the properties of "remapping cells" reported in the literature, can create transient craniotopy by merging the distinct retinal images of the pre- and postsaccadic fixations to signal a single stable object. PMID:23197453

  3. Transient spatiotopic integration across saccadic eye movements mediates visual stability

    PubMed Central

    Cicchini, Guido M.; Binda, Paola; Burr, David C.

    2013-01-01

    Eye movements pose major problems to the visual system, because each new saccade changes the mapping of external objects on the retina. It is known that stimuli briefly presented around the time of saccades are systematically mislocalized, whereas continuously visible objects are perceived as spatially stable even when they undergo large transsaccadic displacements. In this study we investigated the relationship between these two phenomena and measured how human subjects perceive the position of pairs of bars briefly displayed around the time of large horizontal saccades. We show that they interact strongly, with the perisaccadic bar being drawn toward the other, dramatically altering the pattern of perisaccadic mislocalization. The interaction field extends over a wide range (200 ms and 20°) and is oriented along the retinotopic trajectory of the saccade-induced motion, suggesting a mechanism that integrates pre- and postsaccadic stimuli at different retinal locations but similar external positions. We show how transient changes in spatial integration mechanisms, which are consistent with the present psychophysical results and with the properties of “remapping cells” reported in the literature, can create transient craniotopy by merging the distinct retinal images of the pre- and postsaccadic fixations to signal a single stable object. PMID:23197453

  4. Identification of metastasis-associated genes in colorectal cancer through an integrated genomic and transcriptomic analysis

    PubMed Central

    Peng, Sihua

    2013-01-01

    Objective: Identification of colorectal cancer (CRC) metastasis genes is one of the most important issues in CRC research. For the purpose of mining CRC metastasis-associated genes, an integrated analysis of microarray data was presented, combined with evidence acquired from comparative genomic hybridization (CGH) data. Methods: Gene expression profile data of CRC samples were obtained from the Gene Expression Omnibus (GEO) website. The 15 important chromosomal aberration sites detected by using CGH technology were used for integrated genomic and transcriptomic analysis. Significance Analysis of Microarrays (SAM) was used to detect significantly differentially expressed genes across the whole genome. The overlapping genes were selected in their corresponding chromosomal aberration regions, and analyzed by using the Database for Annotation, Visualization and Integrated Discovery (DAVID). Finally, SVM-T-RFE gene selection algorithm was applied to identify metastasis-associated genes in CRC. Results: A minimum gene set was obtained with the minimum number [14] of genes, and the highest classification accuracy (100%) in both PRI and META datasets. A fraction of selected genes are associated with CRC or its metastasis. Conclusions: Our results demonstrated that integration analysis is an effective strategy for mining cancer-associated genes. PMID:24385689

  5. Integrating genomics, proteomics and bioinformatics in translational studies of molecular medicine.

    PubMed

    Ostrowski, Jerzy; Wyrwicz, Lucjan S

    2009-09-01

    Understanding the molecular mechanisms of disease requires the introduction of molecular diagnostics into medical practice. Current medicine employs only elements of molecular diagnostics, which are usually applied on the scale of single genes. Medicine in the postgenomic era will utilize thousands of disease-associated molecular markers provided by high-throughput sequencing and functional genomic, proteomic and metabolomic studies. Such a spectrum of techniques will link clinical medicine based on molecularly oriented diagnostics with the prediction and prevention of disease. To achieve this task, large-scale and genome-wide biological and medical data must be combined with biostatistical and bioinformatic analyses to model biological systems. Collecting, cataloging and comparing data from molecular studies, and the subsequent development of conclusions, creates the fundamentals of systems biology. This highly complex analytical process reflects a new scientific paradigm known as integrative genomics. PMID:19732006

  6. The Multifunctions of WD40 Proteins in Genome Integrity and Cell Cycle Progression

    PubMed Central

    Zhang, Caiguo; Zhang, Fan

    2015-01-01

    The eukaryotic genome encodes numerous WD40 repeat proteins, which generally function as platforms for protein-protein interactions and are involved in numerous biological processes, such as signal transduction, gene transcriptional regulation, protein modifications, cytoskeleton assembly, vesicular trafficking, DNA damage and repair, cell death and cell cycle progression. Among these diverse functions, genome integrity maintenance and cell cycle progression are extremely important, as their deregulation is clinically linked to uncontrolled proliferative diseases such as cancer. Thus, in this review we mainly summarize and discuss the recent understanding of WD40 proteins and their molecular mechanisms linked to genome stability and cell cycle progression, thereby demonstrating their pervasiveness and importance in cellular networks. PMID:25653723

  7. Integrative genomic characterization of oral squamous cell carcinoma identifies frequent somatic drivers

    PubMed Central

    Pickering, Curtis R.; Zhang, Jiexin; Yoo, Suk Young; Bengtsson, Linnea; Moorthy, Shhyam; Neskey, David M.; Zhao, Mei; Alves, Marcus V Ortega; Chang, Kyle; Drummond, Jennifer; Cortez, Elsa; Xie, Tong-xin; Zhang, Di; Chung, Woonbok; Issa, Jean-Pierre J.; Zweidler-McKay, Patrick A.; Wu, Xifeng; El-Naggar, Adel K.; Weinstein, John N.; Wang, Jing; Muzny, Donna M.; Gibbs, Richard A.; Wheeler, David A.; Myers, Jeffrey N.; Frederick, Mitchell J.

    2013-01-01

    The survival of patients with oral squamous cell carcinoma (OSCC) has not changed significantly in several decades, leading clinicians and investigators to search for promising molecular targets. To this end, we performed comprehensive genomic analysis of gene expression, copy number, methylation and point mutations in OSCC. Integrated analysis revealed more somatic events than previously reported, identifying four major driver pathways (mitogenic signaling, Notch, cell cycle, TP53) and two additional key genes (FAT1, CASP8). The Notch pathway was defective in 66% of patients, and in follow-up studies of mechanism, functional NOTCH1 signaling inhibited proliferation of OSCC cell lines. Frequent mutation of CASP8 defines a new molecular subtype of OSCC with few copy number changes. Although genomic alterations are dominated by loss of tumor suppressor genes, 80% of patients harbored at least one genomic alteration in a targetable gene, suggesting that novel approaches to treatment may be possible for this debilitating disease. PMID:23619168

  8. Integrative Analysis of the Caenorhabditis elegans Genome by the modENCODE Project

    PubMed Central

    Gerstein, Mark B.; Lu, Zhi John; Van Nostrand, Eric L.; Cheng, Chao; Arshinoff, Bradley I.; Liu, Tao; Yip, Kevin Y.; Robilotto, Rebecca; Rechtsteiner, Andreas; Ikegami, Kohta; Alves, Pedro; Chateigner, Aurelien; Perry, Marc; Morris, Mitzi; Auerbach, Raymond K.; Feng, Xin; Leng, Jing; Vielle, Anne; Niu, Wei; Rhrissorrakrai, Kahn; Agarwal, Ashish; Alexander, Roger P.; Barber, Galt; Brdlik, Cathleen M.; Brennan, Jennifer; Brouillet, Jeremy Jean; Carr, Adrian; Cheung, Ming-Sin; Clawson, Hiram; Contrino, Sergio; Dannenberg, Luke O.; Dernburg, Abby F.; Desai, Arshad; Dick, Lindsay; Dosé, Andréa C.; Du, Jiang; Egelhofer, Thea; Ercan, Sevinc; Euskirchen, Ghia; Ewing, Brent; Feingold, Elise A.; Gassmann, Reto; Good, Peter J.; Green, Phil; Gullier, Francois; Gutwein, Michelle; Guyer, Mark S.; Habegger, Lukas; Han, Ting; Henikoff, Jorja G.; Henz, Stefan R.; Hinrichs, Angie; Holster, Heather; Hyman, Tony; Iniguez, A. Leo; Janette, Judith; Jensen, Morten; Kato, Masaomi; Kent, W. James; Kephart, Ellen; Khivansara, Vishal; Khurana, Ekta; Kim, John K.; Kolasinska-Zwierz, Paulina; Lai, Eric C.; Latorre, Isabel; Leahey, Amber; Lewis, Suzanna; Lloyd, Paul; Lochovsky, Lucas; Lowdon, Rebecca F.; Lubling, Yaniv; Lyne, Rachel; MacCoss, Michael; Mackowiak, Sebastian D.; Mangone, Marco; McKay, Sheldon; Mecenas, Desirea; Merrihew, Gennifer; Miller, David M.; Muroyama, Andrew; Murray, John I.; Ooi, Siew-Loon; Pham, Hoang; Phippen, Taryn; Preston, Elicia A.; Rajewsky, Nikolaus; Rätsch, Gunnar; Rosenbaum, Heidi; Rozowsky, Joel; Rutherford, Kim; Ruzanov, Peter; Sarov, Mihail; Sasidharan, Rajkumar; Sboner, Andrea; Scheid, Paul; Segal, Eran; Shin, Hyunjin; Shou, Chong; Slack, Frank J.; Slightam, Cindie; Smith, Richard; Spencer, William C.; Stinson, E. O.; Taing, Scott; Takasaki, Teruaki; Vafeados, Dionne; Voronina, Ksenia; Wang, Guilin; Washington, Nicole L.; Whittle, Christina M.; Wu, Beijing; Yan, Koon-Kiu; Zeller, Georg; Zha, Zheng; Zhong, Mei; Zhou, Xingliang; Ahringer, Julie; Strome, Susan; Gunsalus, Kristin C.; Micklem, Gos; Liu, X. Shirley; Reinke, Valerie; Kim, Stuart K.; Hillier, LaDeana W.; Henikoff, Steven; Piano, Fabio; Snyder, Michael; Stein, Lincoln; Lieb, Jason D.; Waterston, Robert H.

    2011-01-01

    We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor–binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor–binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome. PMID:21177976

  9. Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project.

    PubMed

    Gerstein, Mark B; Lu, Zhi John; Van Nostrand, Eric L; Cheng, Chao; Arshinoff, Bradley I; Liu, Tao; Yip, Kevin Y; Robilotto, Rebecca; Rechtsteiner, Andreas; Ikegami, Kohta; Alves, Pedro; Chateigner, Aurelien; Perry, Marc; Morris, Mitzi; Auerbach, Raymond K; Feng, Xin; Leng, Jing; Vielle, Anne; Niu, Wei; Rhrissorrakrai, Kahn; Agarwal, Ashish; Alexander, Roger P; Barber, Galt; Brdlik, Cathleen M; Brennan, Jennifer; Brouillet, Jeremy Jean; Carr, Adrian; Cheung, Ming-Sin; Clawson, Hiram; Contrino, Sergio; Dannenberg, Luke O; Dernburg, Abby F; Desai, Arshad; Dick, Lindsay; Dosé, Andréa C; Du, Jiang; Egelhofer, Thea; Ercan, Sevinc; Euskirchen, Ghia; Ewing, Brent; Feingold, Elise A; Gassmann, Reto; Good, Peter J; Green, Phil; Gullier, Francois; Gutwein, Michelle; Guyer, Mark S; Habegger, Lukas; Han, Ting; Henikoff, Jorja G; Henz, Stefan R; Hinrichs, Angie; Holster, Heather; Hyman, Tony; Iniguez, A Leo; Janette, Judith; Jensen, Morten; Kato, Masaomi; Kent, W James; Kephart, Ellen; Khivansara, Vishal; Khurana, Ekta; Kim, John K; Kolasinska-Zwierz, Paulina; Lai, Eric C; Latorre, Isabel; Leahey, Amber; Lewis, Suzanna; Lloyd, Paul; Lochovsky, Lucas; Lowdon, Rebecca F; Lubling, Yaniv; Lyne, Rachel; MacCoss, Michael; Mackowiak, Sebastian D; Mangone, Marco; McKay, Sheldon; Mecenas, Desirea; Merrihew, Gennifer; Miller, David M; Muroyama, Andrew; Murray, John I; Ooi, Siew-Loon; Pham, Hoang; Phippen, Taryn; Preston, Elicia A; Rajewsky, Nikolaus; Rätsch, Gunnar; Rosenbaum, Heidi; Rozowsky, Joel; Rutherford, Kim; Ruzanov, Peter; Sarov, Mihail; Sasidharan, Rajkumar; Sboner, Andrea; Scheid, Paul; Segal, Eran; Shin, Hyunjin; Shou, Chong; Slack, Frank J; Slightam, Cindie; Smith, Richard; Spencer, William C; Stinson, E O; Taing, Scott; Takasaki, Teruaki; Vafeados, Dionne; Voronina, Ksenia; Wang, Guilin; Washington, Nicole L; Whittle, Christina M; Wu, Beijing; Yan, Koon-Kiu; Zeller, Georg; Zha, Zheng; Zhong, Mei; Zhou, Xingliang; Ahringer, Julie; Strome, Susan; Gunsalus, Kristin C; Micklem, Gos; Liu, X Shirley; Reinke, Valerie; Kim, Stuart K; Hillier, LaDeana W; Henikoff, Steven; Piano, Fabio; Snyder, Michael; Stein, Lincoln; Lieb, Jason D; Waterston, Robert H

    2010-12-24

    We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor-binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor-binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome. PMID:21177976

  10. Sendai virus, an RNA virus with no risk of genomic integration, delivers CRISPR/Cas9 for efficient gene editing.

    PubMed

    Park, Arnold; Hong, Patrick; Won, Sohui T; Thibault, Patricia A; Vigant, Frederic; Oguntuyo, Kasopefoluwa Y; Taft, Justin D; Lee, Benhur

    2016-01-01

    The advent of RNA-guided endonuclease (RGEN)-mediated gene editing, specifically via CRISPR/Cas9, has spurred intensive efforts to improve the efficiency of both RGEN delivery and targeted mutagenesis. The major viral vectors in use for delivery of Cas9 and its associated guide RNA, lentiviral and adeno-associated viral systems, have the potential for undesired random integration into the host genome. Here, we repurpose Sendai virus, an RNA virus with no viral DNA phase and that replicates solely in the cytoplasm, as a delivery system for efficient Cas9-mediated gene editing. The high efficiency of Sendai virus infection resulted in high rates of on-target mutagenesis in cell lines (75-98% at various endogenous and transgenic loci) and primary human monocytes (88% at the ccr5 locus) in the absence of any selection. In conjunction with extensive former work on Sendai virus as a promising gene therapy vector that can infect a wide range of cell types including hematopoietic stem cells, this proof-of-concept study opens the door to using Sendai virus as well as other related paramyxoviruses as versatile and efficient tools for gene editing. PMID:27606350

  11. Sendai virus, an RNA virus with no risk of genomic integration, delivers CRISPR/Cas9 for efficient gene editing

    PubMed Central

    Park, Arnold; Hong, Patrick; Won, Sohui T; Thibault, Patricia A; Vigant, Frederic; Oguntuyo, Kasopefoluwa Y; Taft, Justin D; Lee, Benhur

    2016-01-01

    The advent of RNA-guided endonuclease (RGEN)-mediated gene editing, specifically via CRISPR/Cas9, has spurred intensive efforts to improve the efficiency of both RGEN delivery and targeted mutagenesis. The major viral vectors in use for delivery of Cas9 and its associated guide RNA, lentiviral and adeno-associated viral systems, have the potential for undesired random integration into the host genome. Here, we repurpose Sendai virus, an RNA virus with no viral DNA phase and that replicates solely in the cytoplasm, as a delivery system for efficient Cas9-mediated gene editing. The high efficiency of Sendai virus infection resulted in high rates of on-target mutagenesis in cell lines (75–98% at various endogenous and transgenic loci) and primary human monocytes (88% at the ccr5 locus) in the absence of any selection. In conjunction with extensive former work on Sendai virus as a promising gene therapy vector that can infect a wide range of cell types including hematopoietic stem cells, this proof-of-concept study opens the door to using Sendai virus as well as other related paramyxoviruses as versatile and efficient tools for gene editing. PMID:27606350

  12. Integrating Genomics into Clinical Oncology: Ethical and Social Challenges from Proponents of Personalized Medicine

    PubMed Central

    Settersten, Richard A.; Juengst, Eric T.; Fishman, Jennifer R.

    2013-01-01

    Summary: The use of molecular tools to individualize health care, predict appropriate therapies and prevent adverse health outcomes has gained significant traction in the field of oncology, under the banner of “personalized medicine.” Enthusiasm for personalized medicine in oncology has been fueled by success stories of targeted treatments for a variety of cancers based on their molecular profiles. Though these are clear indications of optimism for personalized medicine, little is known about the ethical and social implications of personalized approaches in clinical oncology. The objective of this study is to assess how a range of stakeholders engaged in promoting, monitoring, and providing personalized medicine understand the challenges of integrating genomic testing and targeted therapies into clinical oncology. The study involved the analysis of in-depth interviews with 117 basic scientists, clinician-researchers, clinicians in private practice, health professional educators, representatives of funding agencies, medical journal editors, entrepreneurs, and insurers whose experiences and perspectives on personalized medicine span a wide variety of institutional and professional settings. Despite considerable enthusiasm for this shift, promoters, monitors and providers of personalized medicine identified four domains which will still provoke heightened ethical and social concerns: (1) informed consent for cancer genomic testing, (2) privacy, confidentiality, and disclosure of genomic test results, (3) access to genomic testing and targeted therapies in oncology, and (4) the costs of scaling up pharmacogenomic testing and targeted cancer therapies. These specific concerns are not unique to oncology, or even genomics. However, those most invested in the success of personalized medicine view oncologists’ responses to these challenges as precedent-setting because oncology is farther along the path of clinical integration of genomic technologies than other fields

  13. Epiviz: a view inside the design of an integrated visual analysis software for genomics

    PubMed Central

    2015-01-01

    Background: Computational and visual data analysis for genomics has traditionally involved a combination of tools and resources, of which the most ubiquitous consist of genome browsers, focused mainly on integrative visualization of large numbers of big datasets, and computational environments, focused on data modeling of a small number of moderately sized datasets. Workflows that involve the integration and exploration of multiple heterogeneous data sources, small and large, public and user specific have been poorly addressed by these tools. In our previous work, we introduced Epiviz, which bridges the gap between the two types of tools, simplifying these workflows. Results: In this paper we expand on the design decisions behind Epiviz, and introduce a series of new advanced features that further support the type of interactive exploratory workflow we have targeted. We discuss three ways in which Epiviz advances the field of genomic data analysis: 1) it brings code to interactive visualizations at various different levels; 2) it takes the first steps in the direction of collaborative data analysis by incorporating user plugins from source control providers, as well as by allowing analysis states to be shared among the scientific community; 3) it combines established analysis features that have never before been available simultaneously in a genome browser. In our discussion section, we present security implications of the current design, as well as a series of limitations and future research steps. Conclusions: Since many of the design choices of Epiviz are novel in genomics data analysis, this paper serves both as a document of our own approaches with lessons learned and as a starting point for future efforts in the same direction for the genomics community. PMID:26328750

  14. NAHR-mediated copy-number variants in a clinical population: Mechanistic insights into both genomic disorders and Mendelizing traits

    PubMed Central

    Dittwald, Piotr; Gambin, Tomasz; Szafranski, Przemyslaw; Li, Jian; Amato, Stephen; Divon, Michael Y.; Rodríguez Rojas, Lisa Ximena; Elton, Lindsay E.; Scott, Daryl A.; Schaaf, Christian P.; Torres-Martinez, Wilfredo; Stevens, Abby K.; Rosenfeld, Jill A.; Agadi, Satish; Francis, David; Kang, Sung-Hae L.; Breman, Amy; Lalani, Seema R.; Bacino, Carlos A.; Bi, Weimin; Milosavljevic, Aleksandar; Beaudet, Arthur L.; Patel, Ankita; Shaw, Chad A.; Lupski, James R.; Gambin, Anna; Cheung, Sau Wai; Stankiewicz, Pawel

    2013-01-01

    We delineated and analyzed directly oriented paralogous low-copy repeats (DP-LCRs) in the most recent version of the human haploid reference genome. The computationally defined DP-LCRs were cross-referenced with our chromosomal microarray analysis (CMA) database of 25,144 patients subjected to genome-wide assays. This computationally guided approach to the empirically derived large data set allowed us to investigate genomic rearrangement relative frequencies and identify new loci for recurrent nonallelic homologous recombination (NAHR)-mediated copy-number variants (CNVs). The most commonly observed recurrent CNVs were NPHP1 duplications (233), CHRNA7 duplications (175), and 22q11.21 deletions (DiGeorge/velocardiofacial syndrome, 166). In the ∼25% of CMA cases for which parental studies were available, we identified 190 de novo recurrent CNVs. In this group, the most frequently observed events were deletions of 22q11.21 (48), 16p11.2 (autism, 34), and 7q11.23 (Williams-Beuren syndrome, 11). Several features of DP-LCRs, including length, distance between NAHR substrate elements, DNA sequence identity (fraction matching), GC content, and concentration of the homologous recombination (HR) hot spot motif 5′-CCNCCNTNNCCNC-3′, correlate with the frequencies of the recurrent CNV events. Four novel adjacent DP-LCR-flanked and NAHR-prone regions, involving 2q12.2q13, were elucidated in association with novel genomic disorders. Our study quantitates genome architectural features responsible for NAHR-mediated genomic instability and further elucidates the role of NAHR in human disease. PMID:23657883
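
    The hot spot motif concentration mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names and the per-kilobase normalization are assumptions made for illustration, N is treated as any of A, C, G or T, and only the forward strand is scanned.

```python
import re

# Motif quoted in the abstract; "N" is the IUPAC code for any base.
HOTSPOT_MOTIF = "CCNCCNTNNCCNC"


def motif_to_regex(motif):
    """Translate a motif with N wildcards into a compiled regex.

    The pattern is wrapped in a lookahead so overlapping hits are counted.
    """
    body = "".join("[ACGT]" if base == "N" else base for base in motif)
    return re.compile("(?=" + body + ")")


def count_motif(seq, motif=HOTSPOT_MOTIF):
    """Count forward-strand occurrences of the motif (hypothetical helper)."""
    return len(motif_to_regex(motif).findall(seq.upper()))


def motif_density_per_kb(seq, motif=HOTSPOT_MOTIF):
    """Occurrences per 1,000 bases; this normalization is an assumption."""
    if not seq:
        return 0.0
    return count_motif(seq, motif) * 1000 / len(seq)
```

    Calling count_motif on a repeat sequence and dividing by its length gives one simple way to compare motif concentration between repeats; a fuller analysis would presumably also scan the reverse complement.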

  15. Genomic Analysis of Sleeping Beauty Transposon Integration in Human Somatic Cells

    PubMed Central

    Turchiano, Giandomenico; Latella, Maria Carmela; Gogol-Döring, Andreas; Cattoglio, Claudia; Mavilio, Fulvio; Izsvák, Zsuzsanna; Ivics, Zoltán; Recchia, Alessandra

    2014-01-01

    The Sleeping Beauty (SB) transposon is a non-viral integrating vector system with proven efficacy for gene transfer and functional genomics. However, integration efficiency is negatively affected by the length of the transposon. To optimize the SB transposon machinery, the inverted repeats and the transposase gene underwent several modifications, resulting in the generation of the hyperactive SB100X transposase and of the high-capacity “sandwich” (SA) transposon. In this study, we report a side-by-side comparison of the SA and the widely used T2 arrangement of transposon vectors carrying increasing DNA cargoes, up to 18 kb. Clonal analysis of SA integrants in human epithelial cells and in immortalized keratinocytes demonstrates stability and integrity of the transposon independently from the cargo size and copy number-dependent expression of the cargo cassette. A genome-wide analysis of unambiguously mapped SA integrations in keratinocytes showed an almost random distribution, with an overrepresentation in repetitive elements (satellite, LINE and small RNAs) compared to a library representing insertions of the first-generation transposon vector and to gammaretroviral and lentiviral libraries. The SA transposon/SB100X integrating system therefore shows important features as a system for delivering large gene constructs for gene therapy applications. PMID:25390293

  16. Integration of genome scale data for identifying new players in colorectal cancer

    PubMed Central

    Sokolova, Viktorija; Crippa, Elisabetta; Gariboldi, Manuela

    2016-01-01

    Colorectal cancers (CRCs) display a wide variety of genomic aberrations that may be either causally linked to their development and progression, or might serve as biomarkers for their presence. Recent advances in rapid high-throughput genetic and genomic analysis have helped to identify a plethora of alterations that can potentially serve as new cancer biomarkers, and thus help to improve CRC diagnosis, prognosis, and treatment. Each distinct data type (copy number variations, gene and microRNAs expression, CpG island methylation) provides an investigator with a different, partially independent, and complementary view of the entire genome. However, elucidation of gene function will require more information than can be provided by analyzing a single type of data. The integration of knowledge obtained from different sources is becoming increasingly essential for obtaining an interdisciplinary view of large amounts of information, and also for cross-validating experimental results. The integration of numerous types of genetic and genomic data derived from public sources, and via the use of ad-hoc bioinformatics tools and statistical methods facilitates the discovery and validation of novel, informative biomarkers. This combinatory approach will also enable researchers to more accurately and comprehensively understand the associations between different biologic pathways, mechanisms, and phenomena, and gain new insights into the etiology of CRC. PMID:26811605

  17. A comprehensive whole-genome integrated cytogenetic map for the alpaca (Lama pacos).

    PubMed

    Avila, Felipe; Baily, Malorie P; Perelman, Polina; Das, Pranab J; Pontius, Joan; Chowdhary, Renuka; Owens, Elaine; Johnson, Warren E; Merriwether, David A; Raudsepp, Terje

    2014-01-01

    Genome analysis of the alpaca (Lama pacos, LPA) has progressed slowly compared to other domestic species. Here, we report the development of the first comprehensive whole-genome integrated cytogenetic map for the alpaca using fluorescence in situ hybridization (FISH) and CHORI-246 BAC library clones. The map is comprised of 230 linearly ordered markers distributed among all 36 alpaca autosomes and the sex chromosomes. For the first time, markers were assigned to LPA14, 21, 22, 28, and 36. Additionally, 86 genes from 15 alpaca chromosomes were mapped in the dromedary camel (Camelus dromedarius, CDR), demonstrating exceptional synteny and linkage conservation between the 2 camelid genomes. Cytogenetic mapping of 191 protein-coding genes improved and refined the known Zoo-FISH homologies between camelids and humans: we discovered new homologous synteny blocks (HSBs) corresponding to HSA1-LPA/CDR11, HSA4-LPA/CDR31 and HSA7-LPA/CDR36, and revised the location of breakpoints for others. Overall, gene mapping was in good agreement with the Zoo-FISH and revealed remarkable evolutionary conservation of gene order within many human-camelid HSBs. Most importantly, 91 FISH-mapped markers effectively integrated the alpaca whole-genome sequence and the radiation hybrid maps with physical chromosomes, thus facilitating the improvement of the sequence assembly and the discovery of genes of biological importance. PMID:25662411

  18. Integrated Syntenic and Phylogenomic Analyses Reveal an Ancient Genome Duplication in Monocots

    PubMed Central

    Jiao, Yuannian; Li, Jingping; Tang, Haibao; Paterson, Andrew H.

    2014-01-01

    Unraveling widespread polyploidy events throughout plant evolution is necessary for inferring the impacts of whole-genome duplication (WGD) on speciation and functional innovations, and for guiding the identification of true orthologs in divergent taxa. Here, we employed integrated syntenic and phylogenomic analyses to reveal an ancient WGD that shaped the genomes of all commelinid monocots, including grasses, bromeliads, bananas (Musa acuminata), ginger, palms, and other plants of fundamental, agricultural, and/or horticultural interest. First, comprehensive phylogenomic analyses revealed 1421 putative gene families that retained ancient duplication shared by Musa (Zingiberales) and grass (Poales) genomes, indicating an ancient WGD in monocots. Intergenomic synteny blocks of Musa and Oryza were investigated, and 30 blocks were shown to be duplicated before Musa-Oryza divergence an estimated 120 to 150 million years ago. Synteny comparisons of four monocot (rice [Oryza sativa], sorghum [Sorghum bicolor], banana, and oil palm [Elaeis guineensis]) and two eudicot (grape [Vitis vinifera] and sacred lotus [Nelumbo nucifera]) genomes also support this additional WGD in monocots, herein called Tau (τ). Integrating synteny and phylogenomic comparisons achieves better resolution of ancient polyploidy events than either approach individually, a principle that is exemplified in the disambiguation of a WGD series of rho (ρ)-sigma (σ)-tau (τ) in the grass lineages that echoes the alpha (α)-beta (β)-gamma (γ) series previously revealed in the Arabidopsis thaliana lineage. PMID:25082857

  19. Drosophila Sld5 is essential for normal cell cycle progression and maintenance of genomic integrity

    SciTech Connect

    Gouge, Catherine A.; Christensen, Tim W.

    2010-09-10

    Research highlights: Drosophila Sld5 interacts with Psf1, Psf2, and Mcm10. Haploinsufficiency of Sld5 leads to M-phase delay and genomic instability. Sld5 is also required for normal S phase progression. Abstract: Essential for the normal functioning of a cell is the maintenance of genomic integrity. Failure in this process is often catastrophic for the organism, leading to cell death or mis-proliferation. Central to genomic integrity is the faithful replication of DNA during S phase. The GINS complex has recently come to light as a critical player in DNA replication through stabilization of MCM2-7 and Cdc45 as a member of the CMG complex, which is likely responsible for the processivity of helicase activity during S phase. The GINS complex is made up of 4 members in a 1:1:1:1 ratio: Psf1, Psf2, Psf3, and Sld5. Here we present the first analysis of the function of the Sld5 subunit in a multicellular organism. We show that Drosophila Sld5 interacts with Psf1, Psf2, and Mcm10 and that mutations in Sld5 lead to M and S phase delays with chromosomes exhibiting hallmarks of genomic instability.

  20. Integrated syntenic and phylogenomic analyses reveal an ancient genome duplication in monocots.

    PubMed

    Jiao, Yuannian; Li, Jingping; Tang, Haibao; Paterson, Andrew H

    2014-07-01

    Unraveling widespread polyploidy events throughout plant evolution is necessary for inferring the impacts of whole-genome duplication (WGD) on speciation and functional innovations, and for guiding the identification of true orthologs in divergent taxa. Here, we employed integrated syntenic and phylogenomic analyses to reveal an ancient WGD that shaped the genomes of all commelinid monocots, including grasses, bromeliads, bananas (Musa acuminata), ginger, palms, and other plants of fundamental, agricultural, and/or horticultural interest. First, comprehensive phylogenomic analyses revealed 1421 putative gene families that retained ancient duplication shared by Musa (Zingiberales) and grass (Poales) genomes, indicating an ancient WGD in monocots. Intergenomic synteny blocks of Musa and Oryza were investigated, and 30 blocks were shown to be duplicated before Musa-Oryza divergence an estimated 120 to 150 million years ago. Synteny comparisons of four monocot (rice [Oryza sativa], sorghum [Sorghum bicolor], banana, and oil palm [Elaeis guineensis]) and two eudicot (grape [Vitis vinifera] and sacred lotus [Nelumbo nucifera]) genomes also support this additional WGD in monocots, herein called Tau (τ). Integrating synteny and phylogenomic comparisons achieves better resolution of ancient polyploidy events than either approach individually, a principle that is exemplified in the disambiguation of a WGD series of rho (ρ)-sigma (σ)-tau (τ) in the grass lineages that echoes the alpha (α)-beta (β)-gamma (γ) series previously revealed in the Arabidopsis thaliana lineage. PMID:25082857

  1. RegPredict: an integrated system for regulon inference in prokaryotes by comparative genomics approach

    PubMed Central

    Novichkov, Pavel S.; Rodionov, Dmitry A.; Stavrovskaya, Elena D.; Novichkova, Elena S.; Kazakov, Alexey E.; Gelfand, Mikhail S.; Arkin, Adam P.; Mironov, Andrey A.; Dubchak, Inna

    2010-01-01

    The RegPredict web server is designed to provide tools for the reconstruction and analysis of microbial regulons using a comparative genomics approach. The server allows the user to rapidly generate reference sets of regulons and regulatory motif profiles in a group of prokaryotic genomes. The new concept of a cluster of co-regulated orthologous operons allows the user to distribute the analysis of large regulons and to perform the comparative analysis of multiple clusters independently. Two major workflows currently implemented in RegPredict are: (i) regulon reconstruction for a known regulatory motif and (ii) ab initio inference of a novel regulon using several scenarios for the generation of starting gene sets. RegPredict provides a comprehensive collection of manually curated positional weight matrices of regulatory motifs. It is based on genomic sequences and on ortholog and operon predictions from MicrobesOnline. An interactive web interface of RegPredict integrates and presents diverse genomic and functional information about the candidate regulon members from several web resources. RegPredict is freely accessible at http://regpredict.lbl.gov. PMID:20542910

  2. RegPredict: an integrated system for regulon inference in prokaryotes by comparative genomics approach

    SciTech Connect

    Novichkov, Pavel S.; Rodionov, Dmitry A.; Stavrovskaya, Elena D.; Novichkova, Elena S.; Kazakov, Alexey E.; Gelfand, Mikhail S.; Arkin, Adam P.; Mironov, Andrey A.; Dubchak, Inna

    2010-05-26

    The RegPredict web server is designed to provide tools for the reconstruction and analysis of microbial regulons using a comparative genomics approach. The server allows the user to rapidly generate reference sets of regulons and regulatory motif profiles in a group of prokaryotic genomes. The new concept of a cluster of co-regulated orthologous operons allows the user to distribute the analysis of large regulons and to perform the comparative analysis of multiple clusters independently. Two major workflows currently implemented in RegPredict are: (i) regulon reconstruction for a known regulatory motif and (ii) ab initio inference of a novel regulon using several scenarios for the generation of starting gene sets. RegPredict provides a comprehensive collection of manually curated positional weight matrices of regulatory motifs. It is based on genomic sequences and on ortholog and operon predictions from MicrobesOnline. An interactive web interface of RegPredict integrates and presents diverse genomic and functional information about the candidate regulon members from several web resources. RegPredict is freely accessible at http://regpredict.lbl.gov.

  3. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models

    PubMed Central

    King, Zachary A.; Lu, Justin; Dräger, Andreas; Miller, Philip; Federowicz, Stephen; Lerman, Joshua A.; Ebrahim, Ali; Palsson, Bernhard O.; Lewis, Nathan E.

    2016-01-01

    Genome-scale metabolic models are mathematically-structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data. PMID:26476456

  4. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models.

    PubMed

    King, Zachary A; Lu, Justin; Dräger, Andreas; Miller, Philip; Federowicz, Stephen; Lerman, Joshua A; Ebrahim, Ali; Palsson, Bernhard O; Lewis, Nathan E

    2016-01-01

    Genome-scale metabolic models are mathematically-structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data. PMID:26476456

  5. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models

    DOE PAGES

    King, Zachary A.; Lu, Justin; Dräger, Andreas; Miller, Philip; Federowicz, Stephen; Lerman, Joshua A.; Ebrahim, Ali; Palsson, Bernhard O.; Lewis, Nathan E.

    2015-10-17

    In this study, genome-scale metabolic models are mathematically structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data.

  6. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models

    SciTech Connect

    King, Zachary A.; Lu, Justin; Dräger, Andreas; Miller, Philip; Federowicz, Stephen; Lerman, Joshua A.; Ebrahim, Ali; Palsson, Bernhard O.; Lewis, Nathan E.

    2015-10-17

    In this study, genome-scale metabolic models are mathematically structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data.

  7. FISH Oracle 2: a web server for integrative visualization of genomic data in cancer research

    PubMed Central

    2014-01-01

    Background A comprehensive view of all relevant genomic data is instrumental for understanding the complex patterns of molecular alterations typically found in cancer cells. One of the most effective ways to rapidly obtain an overview of genomic alterations in large amounts of genomic data is the integrative visualization of genomic events. Results We developed FISH Oracle 2, a web server for the interactive visualization of different kinds of downstream processed genomics data typically available in cancer research. A powerful search interface and a fast visualization engine provide a highly interactive visualization for such data. High quality image export enables life scientists to easily communicate their results. Comprehensive data administration allows users to keep track of the available data sets. We applied FISH Oracle 2 to published data and found evidence that, in colorectal cancer cells, the gene TTC28 may be inactivated in two different ways, a fact that has not been published before. Conclusions The interactive nature of FISH Oracle 2 and the possibility to store, select and visualize large amounts of downstream processed data support life scientists in generating hypotheses. The export of high quality images supports explanatory data visualization, simplifying the communication of new biological findings. A FISH Oracle 2 demo server and the software are available at http://www.zbh.uni-hamburg.de/fishoracle. PMID:24684958

  8. Integrative molecular characterization of head and neck cancer cell model genomes

    PubMed Central

    Tsui, Ivy F.L.; Garnis, Cathie

    2010-01-01

    Background Cell lines are invaluable model systems for the investigation of cancer. Knowledge of the molecular alterations that exist within cell models is required to define the mechanisms governing cellular phenotypes. Methods Five tongue squamous cell carcinoma cell lines and one submaxillary salivary gland epidermoid carcinoma cell line were analyzed for copy number and mRNA expression by tiling-path DNA microarrays and Agilent Whole Human Genome Oligoarrays, respectively. Results Integrative analysis of genetic and expression alterations revealed the molecular landscape of each cell line. Molecular results for individual cell lines and across all samples have been summarized and made available for easy reference. Conclusion Our integrative genomic analyses have defined the DNA and RNA alterations for each individual line. These data will be useful to anyone modelling oral cancer behaviour, providing a molecular context that will be useful for deciphering cell phenotypes. PMID:20014447

  9. Bioinformatics visualization and integration with open standards: the Bluejay genomic browser.

    PubMed

    Turinsky, Andrei L; Ah-Seng, Andrew C; Gordon, Paul M K; Stromer, Julie N; Taschuk, Morgan L; Xu, Emily W; Sensen, Christoph W

    2005-01-01

    We have created a new Java-based integrated computational environment for the exploration of genomic data, called Bluejay. The system is capable of using almost any XML file related to genomic data. Non-XML data sources can be accessed via a proxy server. Bluejay has several features, which are new to Bioinformatics, including an unlimited semantic zoom capability, coupled with Scalable Vector Graphics (SVG) outputs; an implementation of the XLink standard, which features access to MAGPIE Genecards as well as any BioMOBY service accessible over the Internet; and the integration of gene chip analysis tools with the functional assignments. The system can be used as a signed web applet, Web Start, and a local stand-alone application, with or without connection to the Internet. It is available free of charge and as open source via http://bluejay.ucalgary.ca. PMID:15972014

  10. Complete genome sequence of Brachyspira intermedia reveals unique genomic features in Brachyspira species and phage-mediated horizontal gene transfer

    PubMed Central

    2011-01-01

    Background Brachyspira spp. colonize the intestines of some mammalian and avian species and show different degrees of enteropathogenicity. Brachyspira intermedia can cause production losses in chickens and strain PWS/AT now becomes the fourth genome to be completed in the genus Brachyspira. Results 15 classes of unique and shared genes were analyzed in B. intermedia, B. murdochii, B. hyodysenteriae and B. pilosicoli. The largest number of unique genes was found in B. intermedia and B. murdochii. This indicates the presence of larger pan-genomes. In general, hypothetical protein annotations are overrepresented among the unique genes. A 3.2 kb plasmid was found in B. intermedia strain PWS/AT. The plasmid was also present in the B. murdochii strain but not in nine other Brachyspira isolates. Within the Brachyspira genomes, genes had been translocated and also frequently switched between leading and lagging strands, a process that can be followed by different AT-skews in the third positions of synonymous codons. We also found evidence that bacteriophages were being remodeled and genes incorporated into them. Conclusions The accessory gene pool shapes species-specific traits. It is also influenced by reductive genome evolution and horizontal gene transfer. Gene-transfer events can cross both species and genus boundaries and bacteriophages appear to play an important role in this process. A mechanism for horizontal gene transfer appears to be gene translocations leading to remodeling of bacteriophages in combination with broad tropism. PMID:21816042

  11. Maintaining Pedagogical Integrity of a Computer Mediated Course Delivery in Social Foundations

    ERIC Educational Resources Information Center

    Stewart, Shelley; Cobb-Roberts, Deirdre; Shircliffe, Barbara J.

    2013-01-01

    Transforming a face to face course to a computer mediated format in social foundations (interdisciplinary field in education), while maintaining pedagogical integrity, involves strategic collaboration between instructional technologists and content area experts. This type of planned partnership requires open dialogue and a mutual respect for prior…

  12. Impact of Nucleoporin-Mediated Chromatin Localization and Nuclear Architecture on HIV Integration Site Selection.

    PubMed

    Wong, Richard W; Mamede, João I; Hope, Thomas J

    2015-10-01

    It has been known for a number of years that integration sites of human immunodeficiency virus type 1 (HIV-1) DNA show a preference for actively expressed chromosomal locations. A number of viral and cellular proteins are implicated in this process, but the underlying mechanism is not clear. Two recent breakthrough publications advance our understanding of HIV integration site selection by focusing on the localization of the preferred target genes of integration. These studies reveal that knockdown of certain nucleoporins and components of nucleocytoplasmic trafficking alters integration site preference, not by altering the trafficking of the viral genome but by altering the chromatin subtype localization relative to the structure of the nucleus. Here, we describe the link between the nuclear basket nucleoporins (Tpr and Nup153) and chromatin organization and how altering the host environment by manipulating nuclear structure may have important implications for the preferential integration of HIV into actively transcribed genes, facilitating efficient viral replication. PMID:26136574

  13. A series of conditional shuttle vectors for targeted genomic integration in budding yeast

    PubMed Central

    Chou, Chia-Ching; Patel, Michael T.; Gartenberg, Marc R.

    2015-01-01

    The capacity of Saccharomyces cerevisiae to repair exposed DNA ends by homologous recombination has long been used by experimentalists to assemble plasmids from DNA fragments in vivo. While this approach works well for engineering extrachromosomal vectors, it is not well suited to the generation, recovery and reuse of integrative vectors. Here, we describe the creation of a series of conditional centromeric shuttle vectors, termed pXR vectors, that can be used for both plasmid assembly in vivo and targeted genomic integration. The defining feature of pXR vectors is that the DNA segment bearing the centromere and origin of replication, termed CEN/ARS, is flanked by a pair of loxP sites. Passaging the vectors through bacteria that express Cre recombinase reduces the loxP-CEN/ARS-loxP module to a single loxP site, thereby eliminating the ability to replicate autonomously in yeast. Each vector also contains a selectable marker gene, as well as a fragment of the HO locus, which permits targeted integration at a neutral genomic site. The pXR vectors provide a convenient and robust method to assemble DNAs for targeted genomic modifications. PMID:25736914

  14. Transformation of Ulva mutabilis (Chlorophyta) by vector plasmids integrating into the genome.

    PubMed

    Oertel, Wolfgang; Wichard, Thomas; Weissgerber, Adelheid

    2015-10-01

    A method for the stable transformation of the green marine macroalga Ulva mutabilis was developed based on vector plasmids integrating into the genome. By combination of the expression signals (promoter, enhancer, and transcriptional termination sequences) of a chromosomal rbcS gene from U. mutabilis with the bleomycin resistance gene (ble) from Streptoalloteichus hindustanus, a dominant selectable marker gene was constructed for the preparation of a series of E. coli-U. mutabilis shuttle vector plasmids. Special vectors were prepared for the introduction and expression of foreign genes in Ulva, for insertional mutagenesis and gene tagging by plasmid integration into the genome, and for protein tagging by the green fluorescent protein, as well as tools for posttranscriptional gene silencing and cosmid cloning to prepare genomic gene libraries for mutant gene complementation. The vectors were successfully tested in pilot experiments, where they were efficiently introduced into Ulva gametes, zoospores or protoplasts of somatic blade cells by treatment with Ca(2+) ions and polyethylene glycol under isotonic conditions at low ionic strength. The parthenogenetically propagated phleomycin-resistant transformants of the mutant slender (sl) and the wildtype (wt) were demonstrated to be carrying the plasmids randomly integrated into the chromosomes often as tandem repeat clusters. PMID:26986891

  15. A geminivirus-based guide RNA delivery system for CRISPR/Cas9 mediated plant genome editing

    PubMed Central

    Yin, Kangquan; Han, Ting; Liu, Guang; Chen, Tianyuan; Wang, Ying; Yu, Alice Yunzi L.; Liu, Yule

    2015-01-01

    CRISPR/Cas has emerged as a potent genome editing technology and has successfully been applied in many organisms, including several plant species. However, delivery of genome editing reagents remains a challenge in plants. Here, we report a virus-based guide RNA (gRNA) delivery system for CRISPR/Cas9 mediated plant genome editing (VIGE) that can be used to precisely target genome locations and cause mutations. VIGE is performed by using a modified Cabbage Leaf Curl virus (CaLCuV) vector to express gRNAs in stable transgenic plants expressing Cas9. DNA sequencing confirmed VIGE of endogenous NbPDS3 and NbIspH genes in non-inoculated leaves because CaLCuV can infect plants systemically. Moreover, VIGE of NbPDS3 and NbIspH in newly developed leaves caused a photo-bleached phenotype. These results demonstrate that geminivirus-based VIGE could be a powerful tool in plant genome editing. PMID:26450012

  16. A geminivirus-based guide RNA delivery system for CRISPR/Cas9 mediated plant genome editing.

    PubMed

    Yin, Kangquan; Han, Ting; Liu, Guang; Chen, Tianyuan; Wang, Ying; Yu, Alice Yunzi L; Liu, Yule

    2015-01-01

    CRISPR/Cas has emerged as a potent genome editing technology and has successfully been applied in many organisms, including several plant species. However, delivery of genome editing reagents remains a challenge in plants. Here, we report a virus-based guide RNA (gRNA) delivery system for CRISPR/Cas9 mediated plant genome editing (VIGE) that can be used to precisely target genome locations and cause mutations. VIGE is performed by using a modified Cabbage Leaf Curl virus (CaLCuV) vector to express gRNAs in stable transgenic plants expressing Cas9. DNA sequencing confirmed VIGE of endogenous NbPDS3 and NbIspH genes in non-inoculated leaves because CaLCuV can infect plants systemically. Moreover, VIGE of NbPDS3 and NbIspH in newly developed leaves caused a photo-bleached phenotype. These results demonstrate that geminivirus-based VIGE could be a powerful tool in plant genome editing. PMID:26450012

  17. Gross deletions involving IGHM, BTK, or Artemis: a model for genomic lesions mediated by transposable elements.

    PubMed

    van Zelm, Menno C; Geertsema, Corinne; Nieuwenhuis, Nicole; de Ridder, Dick; Conley, Mary Ellen; Schiff, Claudine; Tezcan, Ilhan; Bernatowska, Ewa; Hartwig, Nico G; Sanders, Elisabeth A M; Litzman, Jiri; Kondratenko, Irina; van Dongen, Jacques J M; van der Burg, Mirjam

    2008-02-01

    Most genetic disruptions underlying human disease are microlesions, whereas gross lesions are rare with gross deletions being most frequently found (6%). Similar observations have been made in primary immunodeficiency genes, such as BTK, but for unknown reasons the IGHM and DCLRE1C (Artemis) gene defects frequently represent gross deletions (approximately 60%). We characterized the gross deletion breakpoints in IGHM-, BTK-, and Artemis-deficient patients. The IGHM deletion breakpoints did not show involvement of recombination signal sequences or immunoglobulin switch regions. Instead, five IGHM, eight BTK, and five unique Artemis breakpoints were located in or near sequences derived from transposable elements (TE). The breakpoints of four out of five disrupted Artemis alleles were located in highly homologous regions, similar to Ig subclass deficiencies and Vh deletion polymorphisms. Nevertheless, these observations suggest a role for TEs in mediating gross deletions. The identified gross deletion breakpoints were mostly located in TE subclasses that were specifically overrepresented in the involved gene as compared to the average in the human genome. This concerned both long (LINE1) and short (Alu, MIR) interspersed elements, as well as LTR retrotransposons (ERV). Furthermore, a high total TE content (>40%) was associated with an increased frequency of gross deletions. Both findings were further investigated and confirmed in a total set of 20 genes disrupted in human disease. Thus, to our knowledge for the first time, we provide evidence that a high TE content, irrespective of the type of element, results in the increased incidence of gross deletions as gene disruption underlying human disease. PMID:18252213

  18. Gross Deletions Involving IGHM, BTK, or Artemis: A Model for Genomic Lesions Mediated by Transposable Elements

    PubMed Central

    van Zelm, Menno C.; Geertsema, Corinne; Nieuwenhuis, Nicole; de Ridder, Dick; Conley, Mary Ellen; Schiff, Claudine; Tezcan, Ilhan; Bernatowska, Ewa; Hartwig, Nico G.; Sanders, Elisabeth A.M.; Litzman, Jiri; Kondratenko, Irina; van Dongen, Jacques J.M.; van der Burg, Mirjam

    2008-01-01

    Most genetic disruptions underlying human disease are microlesions, whereas gross lesions are rare with gross deletions being most frequently found (6%). Similar observations have been made in primary immunodeficiency genes, such as BTK, but for unknown reasons the IGHM and DCLRE1C (Artemis) gene defects frequently represent gross deletions (∼60%). We characterized the gross deletion breakpoints in IGHM-, BTK-, and Artemis-deficient patients. The IGHM deletion breakpoints did not show involvement of recombination signal sequences or immunoglobulin switch regions. Instead, five IGHM, eight BTK, and five unique Artemis breakpoints were located in or near sequences derived from transposable elements (TE). The breakpoints of four out of five disrupted Artemis alleles were located in highly homologous regions, similar to Ig subclass deficiencies and Vh deletion polymorphisms. Nevertheless, these observations suggest a role for TEs in mediating gross deletions. The identified gross deletion breakpoints were mostly located in TE subclasses that were specifically overrepresented in the involved gene as compared to the average in the human genome. This concerned both long (LINE1) and short (Alu, MIR) interspersed elements, as well as LTR retrotransposons (ERV). Furthermore, a high total TE content (>40%) was associated with an increased frequency of gross deletions. Both findings were further investigated and confirmed in a total set of 20 genes disrupted in human disease. Thus, to our knowledge for the first time, we provide evidence that a high TE content, irrespective of the type of element, results in the increased incidence of gross deletions as gene disruption underlying human disease. PMID:18252213

  19. DNA bending facilitates the error-free DNA damage tolerance pathway and upholds genome integrity

    PubMed Central

    Gonzalez-Huici, Victor; Szakal, Barnabas; Urulangodi, Madhusoodanan; Psakhye, Ivan; Castellucci, Federica; Menolfi, Demis; Rajakumara, Eerappa; Fumasoni, Marco; Bermejo, Rodrigo; Jentsch, Stefan; Branzei, Dana

    2014-01-01

    DNA replication is sensitive to damage in the template. To bypass lesions and complete replication, cells activate recombination-mediated (error-free) and translesion synthesis-mediated (error-prone) DNA damage tolerance pathways. Crucial for error-free DNA damage tolerance is template switching, which depends on the formation and resolution of damage-bypass intermediates consisting of sister chromatid junctions. Here we show that a chromatin architectural pathway involving the high mobility group box protein Hmo1 channels replication-associated lesions into the error-free DNA damage tolerance pathway mediated by Rad5 and PCNA polyubiquitylation, while preventing mutagenic bypass and toxic recombination. In the process of template switching, Hmo1 also promotes sister chromatid junction formation predominantly during replication. Its C-terminal tail, implicated in chromatin bending, facilitates the formation of catenations/hemicatenations and mediates the roles of Hmo1 in DNA damage tolerance pathway choice and sister chromatid junction formation. Together, the results suggest that replication-associated topological changes involving the molecular DNA bender, Hmo1, set the stage for dedicated repair reactions that limit errors during replication and impact on genome stability. PMID:24473148

  20. Integrated physical, genetic and genome map of chickpea (Cicer arietinum L.).

    PubMed

    Varshney, Rajeev K; Mir, Reyazul Rouf; Bhatia, Sabhyata; Thudi, Mahendar; Hu, Yuqin; Azam, Sarwar; Zhang, Yong; Jaganathan, Deepa; You, Frank M; Gao, Jinliang; Riera-Lizarazu, Oscar; Luo, Ming-Cheng

    2014-03-01

    A physical map of chickpea was developed for the reference chickpea genotype (ICC 4958) using bacterial artificial chromosome (BAC) libraries targeting 71,094 clones (~12× coverage). High information content fingerprinting (HICF) of these clones gave high-quality fingerprinting data for 67,483 clones, and 1,174 contigs comprising 46,112 clones and 3,256 singletons were defined. In brief, 574 Mb of the genome was assembled into 1,174 contigs averaging 0.49 Mb per contig, and the 3,256 singletons represent 407 Mb of the genome. The physical map was linked with two genetic maps with the help of 245 BAC-end sequence (BES)-derived simple sequence repeat (SSR) markers. This allowed locating some of the BACs in the vicinity of some important quantitative trait loci (QTLs) for drought tolerance and resistance to Fusarium wilt and Ascochyta blight. In addition, the fingerprinted contig (FPC) assembly was also integrated with the draft genome sequence of chickpea. As a result, ~965 BACs including 163 minimum tiling path (MTP) clones could be mapped on eight pseudo-molecules of chickpea forming 491 hypothetical contigs representing 54,013,992 bp (~54 Mb) of the draft genome. Comprehensive analysis of markers in abiotic and biotic stress tolerance QTL regions led to identification of 654, 306 and 23 genes in drought tolerance "QTL-hotspot" region, Ascochyta blight resistance QTL region and Fusarium wilt resistance QTL region, respectively. The integrated physical, genetic and genome map should provide a foundation for cloning and isolation of QTLs/genes for molecular dissection of traits as well as markers for molecular breeding for chickpea improvement. PMID:24610029
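
    The per-contig average quoted in this abstract can be rechecked from its own figures. The short Python sketch below is not part of the record; the variable and function names are illustrative, and it assumes the singleton figure can be treated the same way as the contig figure (total size divided by count).

```python
# Figures quoted in the chickpea physical-map abstract above.
ASSEMBLED_MB = 574      # genome size assembled into contigs, in Mb
CONTIG_COUNT = 1174     # number of contigs
SINGLETON_MB = 407      # genome represented by singletons, in Mb
SINGLETON_COUNT = 3256  # number of singletons


def mean_size_mb(total_mb: float, count: int) -> float:
    """Return the mean size in Mb of `count` pieces spanning `total_mb`."""
    return total_mb / count


mean_contig_mb = mean_size_mb(ASSEMBLED_MB, CONTIG_COUNT)
mean_singleton_mb = mean_size_mb(SINGLETON_MB, SINGLETON_COUNT)

print(f"mean contig size:    {mean_contig_mb:.2f} Mb")     # 0.49 Mb, as stated
print(f"mean singleton size: {mean_singleton_mb:.3f} Mb")  # 0.125 Mb
```

    The contig average reproduces the 0.49 Mb stated in the abstract; the singleton average of 0.125 Mb is derived here and is not a figure the authors report.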

  1. Loss of p53-mediated cell-cycle arrest, senescence and apoptosis promotes genomic instability and premature aging

    PubMed Central

    Li, Tongyuan; Liu, Xiangyu; Jiang, Le; Manfredi, James; Zha, Shan; Gu, Wei

    2016-01-01

    Although p53-mediated cell cycle arrest, senescence and apoptosis are well accepted as major tumor suppression mechanisms, the loss of these functions does not directly lead to tumorigenesis, suggesting that the precise roles of these canonical activities of p53 need to be redefined. Here, we report that the cells derived from the mutant mice expressing p53(3KR), an acetylation-defective mutant that fails to induce cell-cycle arrest, senescence and apoptosis, exhibit high levels of aneuploidy upon DNA damage. Moreover, the embryonic lethality caused by the deficiency of XRCC4, a key DNA double strand break repair factor, can be fully rescued in the p53(3KR/3KR) background. Notably, despite high levels of genomic instability, p53(3KR/3KR) XRCC4(−/−) mice, unlike p53(−/−) XRCC4(−/−) mice, do not succumb to pro-B-cell lymphomas. Nevertheless, p53(3KR/3KR) XRCC4(−/−) mice display aging-like phenotypes including testicular atrophy, kyphosis, and premature death. Further analyses demonstrate that SLC7A11 is downregulated and that p53-mediated ferroptosis is significantly induced in spleens and testes of p53(3KR/3KR) XRCC4(−/−) mice. These results demonstrate that the direct role of p53-mediated cell cycle arrest, senescence and apoptosis is to control genomic stability in vivo. Our study not only validates the importance of ferroptosis in p53-mediated tumor suppression in vivo but also reveals that the combination of genomic instability and activation of ferroptosis may promote aging-associated phenotypes. PMID:26943586

  2. Construction of an Ortholog Database Using the Semantic Web Technology for Integrative Analysis of Genomic Data

    PubMed Central

    Chiba, Hirokazu; Nishide, Hiroyo; Uchiyama, Ikuo

    2015-01-01

    Recently, various types of biological data, including genomic sequences, have been rapidly accumulating. To discover biological knowledge from such growing heterogeneous data, a flexible framework for data integration is necessary. Ortholog information is a central resource for interlinking corresponding genes among different organisms, and the Semantic Web provides a key technology for the flexible integration of heterogeneous data. We have constructed an ortholog database using the Semantic Web technology, aiming at the integration of numerous genomic data and various types of biological information. To formalize the structure of the ortholog information in the Semantic Web, we have constructed the Ortholog Ontology (OrthO). While the OrthO is a compact ontology for general use, it is designed to be extended to the description of database-specific concepts. On the basis of OrthO, we described the ortholog information from our Microbial Genome Database for Comparative Analysis (MBGD) in the form of Resource Description Framework (RDF) and made it available through the SPARQL endpoint, which accepts arbitrary queries specified by users. In this framework based on the OrthO, the biological data of different organisms can be integrated using the ortholog information as a hub. Besides, the ortholog information from different data sources can be compared with each other using the OrthO as a shared ontology. Here we show some examples demonstrating that the ortholog information described in RDF can be used to link various biological data such as taxonomy information and Gene Ontology. Thus, the ortholog database using the Semantic Web technology can contribute to biological knowledge discovery through integrative data analysis. PMID:25875762

  3. Construction of an ortholog database using the semantic web technology for integrative analysis of genomic data.

    PubMed

    Chiba, Hirokazu; Nishide, Hiroyo; Uchiyama, Ikuo

    2015-01-01

    Recently, various types of biological data, including genomic sequences, have been rapidly accumulating. To discover biological knowledge from such growing heterogeneous data, a flexible framework for data integration is necessary. Ortholog information is a central resource for interlinking corresponding genes among different organisms, and the Semantic Web provides a key technology for the flexible integration of heterogeneous data. We have constructed an ortholog database using the Semantic Web technology, aiming at the integration of numerous genomic data and various types of biological information. To formalize the structure of the ortholog information in the Semantic Web, we have constructed the Ortholog Ontology (OrthO). While the OrthO is a compact ontology for general use, it is designed to be extended to the description of database-specific concepts. On the basis of OrthO, we described the ortholog information from our Microbial Genome Database for Comparative Analysis (MBGD) in the form of Resource Description Framework (RDF) and made it available through the SPARQL endpoint, which accepts arbitrary queries specified by users. In this framework based on the OrthO, the biological data of different organisms can be integrated using the ortholog information as a hub. Besides, the ortholog information from different data sources can be compared with each other using the OrthO as a shared ontology. Here we show some examples demonstrating that the ortholog information described in RDF can be used to link various biological data such as taxonomy information and Gene Ontology. Thus, the ortholog database using the Semantic Web technology can contribute to biological knowledge discovery through integrative data analysis. PMID:25875762

  4. Genome-wide variant analysis of simplex autism families with an integrative clinical-bioinformatics pipeline

    PubMed Central

    Jiménez-Barrón, Laura T.; O'Rawe, Jason A.; Wu, Yiyang; Yoon, Margaret; Fang, Han; Iossifov, Ivan; Lyon, Gholson J.

    2015-01-01

    Autism spectrum disorders (ASDs) are a group of developmental disabilities that affect social interaction and communication and are characterized by repetitive behaviors. There is now a large body of evidence that suggests a complex role of genetics in ASDs, in which many different loci are involved. Although many current population-scale genomic studies have been demonstrably fruitful, these studies generally focus on analyzing a limited part of the genome or use a limited set of bioinformatics tools. These limitations preclude the analysis of genome-wide perturbations that may contribute to the development and severity of ASD-related phenotypes. To overcome these limitations, we have developed and utilized an integrative clinical and bioinformatics pipeline for generating a more complete and reliable set of genomic variants for downstream analyses. Our study focuses on the analysis of three simplex autism families consisting of one affected child, unaffected parents, and one unaffected sibling. All members were clinically evaluated and widely phenotyped. Genotyping arrays and whole-genome sequencing were performed on each member, and the resulting sequencing data were analyzed using a variety of available bioinformatics tools. We searched for rare variants of putative functional impact that were found to be segregating according to de novo, autosomal recessive, X-linked, mitochondrial, and compound heterozygote transmission models. The resulting candidate variants included three small heterozygous copy-number variations (CNVs), a rare heterozygous de novo nonsense mutation in MYBBP1A located within exon 1, and a novel de novo missense variant in LAMB3. Our work demonstrates how more comprehensive analyses that include rich clinical data and whole-genome sequencing data can generate reliable results for use in downstream investigations. PMID:27148569

  5. An Integrated Linkage, Chromosome, and Genome Map for the Yellow Fever Mosquito Aedes aegypti

    PubMed Central

    Timoshevskiy, Vladimir A.; Severson, David W.; deBruyn, Becky S.; Black, William C.; Sharakhov, Igor V.; Sharakhova, Maria V.

    2013-01-01

    Background: Aedes aegypti, the yellow fever mosquito, is an efficient vector of arboviruses and a convenient model system for laboratory research. Extensive linkage mapping of morphological and molecular markers localized a number of quantitative trait loci (QTLs) related to the mosquito's ability to transmit various pathogens. However, linking the QTLs to Ae. aegypti chromosomes and genomic sequences has been challenging because of the poor quality of polytene chromosomes and the highly fragmented genome assembly for this species. Methodology/Principal Findings: Based on the approach developed in our previous study, we constructed idiograms for mitotic chromosomes of Ae. aegypti based on their banding patterns at early metaphase. These idiograms represent the first cytogenetic map developed for mitotic chromosomes of Ae. aegypti. One hundred bacterial artificial chromosome clones carrying major genetic markers were hybridized to the chromosomes using fluorescent in situ hybridization. As a result, QTLs related to the transmission of the filarioid nematode Brugia malayi, the avian malaria parasite Plasmodium gallinaceum, and the dengue virus, as well as sex determination locus and 183 Mbp of genomic sequences were anchored to the exact positions on Ae. aegypti chromosomes. A linear regression analysis demonstrated a good correlation between positions of the markers on the physical and linkage maps. As a result of the recombination rate variation along the chromosomes, 12 QTLs on the linkage map were combined into five major clusters of QTLs on the chromosome map. Conclusion: This study developed an integrated linkage, chromosome, and genome map (iMap) for the yellow fever mosquito. Our discovery of the localization of multiple QTLs in a few major chromosome clusters suggests a possibility that the transmission of various pathogens is controlled by the same genomic loci. Thus, the iMap will facilitate the identification of genomic determinants of traits responsible

  6. Integration of genotypic and phenotypic screening reveals molecular mediators of melanoma-stromal interaction.

    PubMed

    Stine, Megan J; Wang, C Joanne; Moriarty, Whei F; Ryu, Byungwoo; Cheong, Raymond; Westra, William H; Levchenko, Andre; Alani, Rhoda M

    2011-04-01

    Tumor-endothelium interactions are critical for tumor survival and metastasis. Melanomas can rapidly metastasize early in tumor progression, but the dependence of this aggressive behavior on tumor-stromal interaction is poorly understood. To probe the mechanisms involved, we developed a heterotypic coculture methodology, allowing simultaneous tracking of genomic and phenotypic changes in interacting tumor and endothelial cells in vitro. We found a dramatic rearrangement of endothelial cell networks into patterns reminiscent of vascular beds, even on plastic and glass. Multiple genes were upregulated in the process, many coding for cell surface and secreted proteins, including Neuropilin-2 (NRP2). A critical role of NRP2 in coordinated cell patterning and growth was confirmed using the coculture system. We conclude that NRP2 represents an important mediator of melanoma-endothelial interactions. Furthermore, the described methodology represents a powerful yet simple system to elucidate heterotypic intercellular interactions mediating diverse physiological and pathological processes. PMID:21324919

  7. Different Foreign Genes Incidentally Integrated into the Same Locus of the Streptococcus suis Genome

    PubMed Central

    Sekizaki, Tsutomu; Takamatsu, Daisuke; Osaki, Makoto; Shimoji, Yoshihiro

    2005-01-01

    Some strains of Streptococcus suis possess a type II restriction-modification (RM) system, whose genes are thought to be inserted into the genome between purH and purD from a foreign source by illegitimate recombination. In this study, we characterized the purHD locus of the S. suis genomes of 28 serotype reference strains by DNA sequencing. Four strains contained the RM genes in the locus, as described before, whereas 11 strains possessed other genetic regions of seven classes. The genetic regions contained a single gene or multiple genes that were either unknown or similar to hypothetical genes of other bacteria. The mutually exclusive localization of the genetic regions with the atypical G+C contents indicated that these regions were also acquired from foreign sources. No transposable element or long-repeat sequence was found in the neighboring regions. An alignment of the nucleotide sequences, including the RM gene regions, suggested that the foreign regions were integrated by illegitimate recombination via short stretches of nucleotide identity. By using a thermosensitive suicide plasmid, the RM genes were experimentally introduced into an S. suis strain that did not contain any foreign genes in that locus. Integration of the plasmid into the S. suis genome did not occur in the purHD locus but occurred at various chromosomal loci, where there were 2 to 10 bp of nucleotide identity between the chromosome and the plasmid. These results suggest that various foreign genes described here were incidentally integrated into the same locus of the S. suis genome. PMID:15659665

  8. Latent feature decompositions for integrative analysis of multi-platform genomic data

    PubMed Central

    Gregory, Karl B.; Momin, Amin A.; Coombes, Kevin R.; Baladandayuthapani, Veerabhadran

    2015-01-01

    Increased availability of multi-platform genomics data on matched samples has sparked research efforts to discover how diverse molecular features interact both within and between platforms. In addition, simultaneous measurements of genetic and epigenetic characteristics illuminate the roles their complex relationships play in disease progression and outcomes. However, integrative methods for diverse genomics data are faced with the challenges of ultra-high dimensionality and the existence of complex interactions both within and between platforms. We propose a novel modeling framework for integrative analysis based on decompositions of the large number of platform-specific features into a smaller number of latent features. Subsequently we build a predictive model for clinical outcomes accounting for both within- and between-platform interactions based on Bayesian model averaging procedures. Principal components, partial least squares and non-negative matrix factorization as well as sparse counterparts of each are used to define the latent features, and the performance of these decompositions is compared both on real and simulated data. The latent feature interactions are shown to preserve interactions between the original features and not only aid prediction but also allow explicit selection of outcome-related features. The methods are motivated by, and applied to, a glioblastoma multiforme dataset from The Cancer Genome Atlas to predict patient survival times integrating gene expression, microRNA, copy number and methylation data. For the glioblastoma data, we find a high concordance between our selected prognostic genes and genes with known associations with glioblastoma. In addition, our model discovers several relevant cross-platform interactions such as copy number variation associated gene dosing and epigenetic regulation through promoter methylation. On simulated data, we show that our proposed method successfully incorporates interactions within and between

  9. Transposon Mediated Integration of Plasmid DNA into the Subventricular Zone of Neonatal Mice to Generate Novel Models of Glioblastoma

    PubMed Central

    Calinescu, Anda-Alexandra; Núñez, Felipe Javier; Koschmann, Carl; Kolb, Bradley L.; Lowenstein, Pedro R.; Castro, Maria G.

    2015-01-01

    An urgent need exists to test the contribution of new genes to the pathogenesis and progression of human glioblastomas (GBM), the most common primary brain tumor in adults with dismal prognosis. New potential therapies are rapidly emerging from the bench and require systematic testing in experimental models which closely reproduce the salient features of the human disease. Herein we describe in detail a method to induce new models of GBM with transposon-mediated integration of plasmid DNA into cells of the subventricular zone of neonatal mice. We present a simple way to clone new transposons amenable for genomic integration using the Sleeping Beauty transposon system and illustrate how to monitor plasmid uptake and disease progression using bioluminescence, histology and immuno-histochemistry. We also describe a method to create new primary GBM cell lines. Ideally, this report will allow further dissemination of the Sleeping Beauty transposon system among brain tumor researchers, leading to an in depth understanding of GBM pathogenesis and progression and to the timely design and testing of effective therapies for patients. PMID:25741859

  10. High-Resolution Linkage and Quantitative Trait Locus Mapping Aided by Genome Survey Sequencing: Building Up An Integrative Genomic Framework for a Bivalve Mollusc

    PubMed Central

    Jiao, Wenqian; Fu, Xiaoteng; Dou, Jinzhuang; Li, Hengde; Su, Hailin; Mao, Junxia; Yu, Qian; Zhang, Lingling; Hu, Xiaoli; Huang, Xiaoting; Wang, Yangfan; Wang, Shi; Bao, Zhenmin

    2014-01-01

    Genetic linkage maps are indispensable tools in genetic and genomic studies. Recent development of genotyping-by-sequencing (GBS) methods holds great promise for constructing high-resolution linkage maps in organisms lacking extensive genomic resources. In the present study, linkage mapping was conducted for a bivalve mollusc (Chlamys farreri) using a newly developed GBS method—2b-restriction site-associated DNA (2b-RAD). Genome survey sequencing was performed to generate a preliminary reference genome that was utilized to facilitate linkage and quantitative trait locus (QTL) mapping in C. farreri. A high-resolution linkage map was constructed with a marker density (3806) that has, to our knowledge, never been achieved in any other molluscs. The linkage map covered nearly the whole genome (99.5%) with a resolution of 0.41 cM. QTL mapping and association analysis congruously revealed two growth-related QTLs and one potential sex-determination region. An important candidate QTL gene named PROP1, which functions in the regulation of growth hormone production in vertebrates, was identified from the growth-related QTL region detected on the linkage group LG3. We demonstrate that this linkage map can serve as an important platform for improving genome assembly and unifying multiple genomic resources. Our study, therefore, exemplifies how to build up an integrative genomic framework in a non-model organism. PMID:24107803

  11. Messenger RNA- Versus Retrovirus-Based Induced Pluripotent Stem Cell Reprogramming Strategies: Analysis of Genomic Integrity

    PubMed Central

    Steichen, Clara; Luce, Eléanor; Maluenda, Jérôme; Tosca, Lucie; Moreno-Gimeno, Inmaculada; Desterke, Christophe; Dianat, Noushin; Goulinet-Mainot, Sylvie; Awan-Toor, Sarah; Burks, Deborah; Marie, Joëlle; Weber, Anne; Tachdjian, Gérard; Melki, Judith

    2014-01-01

    The use of synthetic messenger RNAs to generate human induced pluripotent stem cells (iPSCs) is particularly appealing for potential regenerative medicine applications, because it overcomes the common drawbacks of DNA-based or virus-based reprogramming strategies, including transgene integration in particular. We compared the genomic integrity of mRNA-derived iPSCs with that of retrovirus-derived iPSCs generated in strictly comparable conditions, by single-nucleotide polymorphism (SNP) and copy number variation (CNV) analyses. We showed that mRNA-derived iPSCs do not differ significantly from the parental fibroblasts in SNP analysis, whereas retrovirus-derived iPSCs do. We found that the number of CNVs seemed independent of the reprogramming method, instead appearing to be clone-dependent. Furthermore, differentiation studies indicated that mRNA-derived iPSCs differentiated efficiently into hepatoblasts and that these cells did not load additional CNVs during differentiation. The integration-free hepatoblasts that were generated constitute a new tool for the study of diseased hepatocytes derived from patients’ iPSCs and their use in the context of stem cell-derived hepatocyte transplantation. Our findings also highlight the need to conduct careful studies on genome integrity for the selection of iPSC lines before using them for further applications. PMID:24736403

  12. Npl3, a new link between RNA-binding proteins and the maintenance of genome integrity

    PubMed Central

    Santos-Pereira, José M; Herrero, Ana B; Moreno, Sergio; Aguilera, Andrés

    2014-01-01

    The mRNA is co-transcriptionally bound by a number of RNA-binding proteins (RBPs) that contribute to its processing and formation of an export-competent messenger ribonucleoprotein particle (mRNP). In the last few years, increasing evidence suggests that RBPs play a key role in preventing transcription-associated genome instability. Part of this instability is mediated by the accumulation of co-transcriptional R loops, which may impair replication fork (RF) progression due to collisions between transcription and replication machineries. In addition, some RBPs have been implicated in DNA repair and/or the DNA damage response (DDR). Recently, the Npl3 protein, one of the most abundant heterogeneous nuclear ribonucleoproteins (hnRNPs) in yeast, has been shown to prevent transcription-associated genome instability and accumulation of RF obstacles, partially associated with R-loop formation. Interestingly, Npl3 seems to have additional functions in DNA repair, and npl3∆ mutants are highly sensitive to genotoxic agents, such as the antitumor drug trabectedin. Here we discuss the role of Npl3 in particular, and RBPs in general, in the connection of transcription with replication and genome instability, and its effect on the DDR. PMID:24694687

  13. The Fanconi Anemia Pathway Protects Genome Integrity from R-loops

    PubMed Central

    García-Rubio, María L.; Pérez-Calero, Carmen; Barroso, Sonia I.; Tumini, Emanuela; Herrera-Moyano, Emilia; Rosado, Iván V.; Aguilera, Andrés

    2015-01-01

    Co-transcriptional RNA-DNA hybrids (R loops) cause genome instability. To prevent harmful R loop accumulation, cells have evolved specific eukaryotic factors, one being the BRCA2 double-strand break repair protein. As BRCA2 also protects stalled replication forks and is the FANCD1 member of the Fanconi Anemia (FA) pathway, we investigated the FA role in R loop-dependent genome instability. Using human and murine cells defective in FANCD2 or FANCA and primary bone marrow cells from FANCD2 deficient mice, we show that the FA pathway removes R loops, and that many DNA breaks accumulated in FA cells are R loop-dependent. Importantly, FANCD2 foci in untreated and MMC-treated cells are largely R loop dependent, suggesting that the FA functions at R loop-containing sites. We conclude that co-transcriptional R loops and R loop-mediated DNA damage greatly contribute to genome instability and that one major function of the FA pathway is to protect cells from R loops. PMID:26584049

  14. Importance of Mediator complex in the regulation and integration of diverse signaling pathways in plants

    PubMed Central

    Samanta, Subhasis; Thakur, Jitendra K.

    2015-01-01

    Basic transcriptional machinery in eukaryotes is assisted by a number of cofactors, which either increase or decrease the rate of transcription. The Mediator complex is one such cofactor, and has recently drawn a lot of interest because of its integrative power to converge different signaling pathways before channeling the transcription instructions to the RNA polymerase II machinery. Like yeast and metazoans, plants possess the Mediator complex across the kingdom, and its isolation and subunit analyses have been reported from the model plant Arabidopsis. Genetic and molecular analyses have unraveled important regulatory roles of Mediator subunits at every stage of the plant life cycle, from flowering to embryo and organ development, and even size determination. It also contributes immensely to the survival of plants against different environmental vagaries by the timely activation of their resistance mechanisms. Here, we provide an overview of the plant Mediator complex, from its discovery to the regulation of the stoichiometry of its subunits. We also review the involvement of different Mediator subunits in different processes and pathways, including defense response pathways evoked by diverse biotic cues. Wherever possible, attempts have been made to provide mechanistic insight into Mediator's involvement in these processes. PMID:26442070

  15. Identification of a gene causing human cytochrome c oxidase deficiency by integrative genomics.

    PubMed

    Mootha, Vamsi K; Lepage, Pierre; Miller, Kathleen; Bunkenborg, Jakob; Reich, Michael; Hjerrild, Majbrit; Delmonte, Terrye; Villeneuve, Amelie; Sladek, Robert; Xu, Fenghao; Mitchell, Grant A; Morin, Charles; Mann, Matthias; Hudson, Thomas J; Robinson, Brian; Rioux, John D; Lander, Eric S

    2003-01-21

    Identifying the genes responsible for human diseases requires combining information about gene position with clues about biological function. The recent availability of whole-genome data sets of RNA and protein expression provides powerful new sources of functional insight. Here we illustrate how such data sets can expedite disease-gene discovery, by using them to identify the gene causing Leigh syndrome, French-Canadian type (LSFC, Online Mendelian Inheritance in Man no. 220111), a human cytochrome c oxidase deficiency that maps to chromosome 2p16-21. Using four public RNA expression data sets, we assigned to all human genes a "score" reflecting their similarity in RNA-expression profiles to known mitochondrial genes. Using a large survey of organellar proteomics, we similarly classified human genes according to the likelihood of their protein product being associated with the mitochondrion. By intersecting this information with the relevant genomic region, we identified a single clear candidate gene, LRPPRC. Resequencing identified two mutations on two independent haplotypes, providing definitive genetic proof that LRPPRC indeed causes LSFC. LRPPRC encodes an mRNA-binding protein likely involved with mtDNA transcript processing, suggesting an additional mechanism of mitochondrial pathophysiology. Similar strategies to integrate diverse genomic information can be applied likewise to other disease pathways and will become increasingly powerful with the growing wealth of diverse, functional genomics data. PMID:12529507
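
    The intersection strategy described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the gene names, coordinates, scores, and thresholds are hypothetical placeholders, and the two scores only stand in for the expression-similarity score and the proteomics-based likelihood mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str          # hypothetical gene label
    chrom: str
    start: int         # hypothetical coordinates
    end: int
    expr_score: float  # similarity of RNA-expression profile to known mitochondrial genes
    prot_score: float  # likelihood that the protein product is mitochondrial


def candidates(genes, chrom, lo, hi, expr_min, prot_min):
    """Genes overlapping [lo, hi] on chrom that pass both score thresholds, best first."""
    hits = [
        g for g in genes
        if g.chrom == chrom and g.end >= lo and g.start <= hi
        and g.expr_score >= expr_min and g.prot_score >= prot_min
    ]
    return sorted(hits, key=lambda g: g.expr_score + g.prot_score, reverse=True)


# Toy input (all values invented for illustration).
genes = [
    Gene("GENE_A", "2", 100, 200, 0.90, 0.80),
    Gene("GENE_B", "2", 300, 400, 0.20, 0.90),  # fails the expression threshold
    Gene("GENE_C", "2", 900, 950, 0.95, 0.95),  # lies outside the interval
    Gene("GENE_D", "5", 120, 180, 0.90, 0.90),  # lies on another chromosome
]

result = candidates(genes, "2", 50, 500, expr_min=0.5, prot_min=0.5)
print([g.name for g in result])
```

    With this toy input only GENE_A remains, mirroring how combining positional and functional evidence can narrow a mapped region to a single candidate.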

  16. MicrobesOnline: an integrated portal for comparative and functional genomics

    SciTech Connect

    Dehal, Paramvir S.; Joachimiak, Marcin P.; Price, Morgan N.; Bates, John T.; Baumohl, Jason K.; Chivian, Dylan; Friedland, Greg D.; Huang, Katherine H.; Keller, Keith; Novichkov, Pavel S.; Dubchak, Inna L.; Alm, Eric J.; Arkin, Adam P.

    2009-09-17

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.

  17. MicrobesOnline: an integrated portal for comparative and functional genomics

    SciTech Connect

    Dehal, Paramvir; Joachimiak, Marcin; Price, Morgan; Bates, John; Baumohl, Jason; Chivian, Dylan; Friedland, Greg; Huang, Kathleen; Keller, Keith; Novichkov, Pavel; Dubchak, Inna; Alm, Eric; Arkin, Adam

    2011-07-14

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.

  18. MicrobesOnline: an integrated portal for comparative and functional genomics.

    PubMed

    Dehal, Paramvir S; Joachimiak, Marcin P; Price, Morgan N; Bates, John T; Baumohl, Jason K; Chivian, Dylan; Friedland, Greg D; Huang, Katherine H; Keller, Keith; Novichkov, Pavel S; Dubchak, Inna L; Alm, Eric J; Arkin, Adam P

    2010-01-01

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html. PMID:19906701

  19. Opposite transcriptional regulation of integrated vs unintegrated HIV genomes by the NF-κB pathway.

    PubMed

    Thierry, Sylvain; Thierry, Eloïse; Subra, Frédéric; Deprez, Eric; Leh, Hervé; Bury-Moné, Stéphanie; Delelis, Olivier

    2016-01-01

    Integration of HIV-1 linear DNA into host chromatin is required for high levels of viral expression, and constitutes a key therapeutic target. Unintegrated viral DNA (uDNA) can support only limited transcription but may contribute to viral propagation, persistence and/or treatment escape under specific situations. The molecular mechanisms involved in the differential expression of HIV uDNA vs integrated genome (iDNA) remain to be elucidated. Here, we demonstrate, for the first time, that the expression of HIV uDNA is mainly supported by 1-LTR circles, and regulated in the opposite way, relative to iDNA, following NF-κB pathway modulation. Upon treatment activating the NF-κB pathway, NF-κB p65 and AP-1 (cFos/cJun) binding to HIV LTR iDNA correlates with increased iDNA expression, while uDNA expression decreases. Conversely, inhibition of the NF-κB pathway promotes the expression of circular uDNA, and correlates with Bcl-3 and AP-1 binding to its LTR region. Finally, this study identifies NF-κB subunits and Bcl-3 as transcription factors binding the HIV promoter differently depending on viral genome topology, and offers new insights into the potential roles of episomal genomes during HIV-1 latency and persistence. PMID:27167871

  20. Opposite transcriptional regulation of integrated vs unintegrated HIV genomes by the NF-κB pathway

    PubMed Central

    Thierry, Sylvain; Thierry, Eloïse; Subra, Frédéric; Deprez, Eric; Leh, Hervé; Bury-Moné, Stéphanie; Delelis, Olivier

    2016-01-01

    Integration of HIV-1 linear DNA into host chromatin is required for high levels of viral expression, and constitutes a key therapeutic target. Unintegrated viral DNA (uDNA) can support only limited transcription but may contribute to viral propagation, persistence and/or treatment escape under specific situations. The molecular mechanisms involved in the differential expression of HIV uDNA vs integrated genome (iDNA) remain to be elucidated. Here, we demonstrate, for the first time, that the expression of HIV uDNA is mainly supported by 1-LTR circles, and regulated in the opposite way, relative to iDNA, following NF-κB pathway modulation. Upon treatment activating the NF-κB pathway, NF-κB p65 and AP-1 (cFos/cJun) binding to HIV LTR iDNA correlates with increased iDNA expression, while uDNA expression decreases. Conversely, inhibition of the NF-κB pathway promotes the expression of circular uDNA, and correlates with Bcl-3 and AP-1 binding to its LTR region. Finally, this study identifies NF-κB subunits and Bcl-3 as transcription factors binding the HIV promoter differently depending on viral genome topology, and offers new insights into the potential roles of episomal genomes during HIV-1 latency and persistence. PMID:27167871

  1. Genome-guided transcript assembly from integrative analysis of RNA sequence data

    PubMed Central

    Boley, Nathan; Stoiber, Marcus H.; Booth, Benjamin W.; Wan, Kenneth H.; Hoskins, Roger A.; Bickel, Peter J.; Celniker, Susan E.; Brown, James B.

    2014-01-01

    The identification of full length transcripts entirely from short-read RNA sequencing data (RNA-seq) remains a challenge in genome annotation pipelines. Here we describe an automated pipeline for genome annotation that integrates RNA-seq and gene-boundary data sets, which we call generalized RNA integration tool, or GRIT. By applying GRIT to Drosophila melanogaster short-read RNA-seq, cap analysis of gene expression (CAGE) and poly(A)-site-seq data collected for the modENCODE project, we recover the vast majority of previously annotated transcripts and double the total number of transcripts cataloged. We find that 20% of protein coding genes encode multiple protein-localization signals, and that, in 20 day old adult fly heads, genes with multiple poly-adenylation sites are more common than genes with alternate splicing or alternate promoters. When compared to the most widely used transcript assembly tools, GRIT recovers a larger fraction of annotated transcripts at higher precision. GRIT will enable the automated generation of high-quality genome annotations without necessitating extensive manual annotation. PMID:24633242

  2. Matrix Factorization-Based Prediction of Novel Drug Indications by Integrating Genomic Space

    PubMed Central

    Dai, Wen; Liu, Xi; Gao, Yibo; Chen, Lin; Gao, Kuo; Jiang, Yongshi; Yang, Yiping; Chen, Jianxin

    2015-01-01

    There has been rising interest in the discovery of novel drug indications because of the high cost of introducing new drugs. Many computational techniques have been proposed to detect potential drug-disease associations based on the creation of explicit profiles of drugs and diseases, while research seldom takes advantage of the immense accumulation of interaction data. In this work, we propose a matrix factorization model based on known drug-disease associations to predict novel drug indications. In addition, genomic space is integrated into our framework. The introduction of genomic space, which includes drug-gene interactions, disease-gene interactions, and gene-gene interactions, is aimed at providing molecular biological information for prediction of drug-disease associations. The rationale lies in our belief that an association between a drug and a disease has its evidence in the interactome network of genes. Experiments show that the integration of genomic space is indeed effective. Drugs, diseases, and genes are described with feature vectors of the same dimension, which are retrieved from the interaction data. Then a matrix factorization model is set up to quantify the association between drugs and diseases. Finally, we use the matrix factorization model to predict novel indications for drugs. PMID:26078775
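
    The core idea of scoring unobserved drug-disease pairs with a matrix factorization can be sketched as follows. This is a generic, plain stochastic-gradient factorization written for illustration, not the model from the paper: it omits the genomic-space terms entirely, and the association matrix, rank, learning rate, and regularization values are all assumptions.

```python
import random


def factorize(R, k=2, steps=2000, lr=0.05, reg=0.01, seed=0):
    """Fit R[i][j] ~ dot(P[i], Q[j]) on observed entries (None marks unknown pairs)."""
    rng = random.Random(seed)
    n, m = len(R), len(R[0])
    P = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n)]  # drug vectors
    Q = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(m)]  # disease vectors
    observed = [(i, j) for i in range(n) for j in range(m) if R[i][j] is not None]
    for _ in range(steps):
        for i, j in observed:
            err = R[i][j] - score(P, Q, i, j)
            for f in range(k):
                p, q = P[i][f], Q[j][f]
                # Gradient step on squared error with L2 regularization.
                P[i][f] += lr * (err * q - reg * p)
                Q[j][f] += lr * (err * p - reg * q)
    return P, Q


def score(P, Q, i, j):
    """Predicted association strength between drug i and disease j."""
    return sum(a * b for a, b in zip(P[i], Q[j]))


def mse(R, P, Q):
    """Mean squared error over the observed entries only."""
    errs = [(R[i][j] - score(P, Q, i, j)) ** 2
            for i in range(len(R)) for j in range(len(R[0])) if R[i][j] is not None]
    return sum(errs) / len(errs)


# Hypothetical 4 drugs x 3 diseases matrix: 1 = known indication, 0 = none, None = unknown.
R = [
    [1, 0, None],
    [1, 0, 1],
    [0, 1, 0],
    [None, 1, 0],
]

P0, Q0 = factorize(R, steps=0)  # untrained factors (same seed, no updates)
P, Q = factorize(R)
print("MSE before training:", mse(R, P0, Q0))
print("MSE after training: ", mse(R, P, Q))
print("score for unknown pair (drug 0, disease 2):", score(P, Q, 0, 2))
```

    Unknown pairs with the highest predicted scores would be the candidates for novel indications. Integrating drug-gene, disease-gene, and gene-gene interactions, as the abstract describes, would require additional factor matrices that this sketch does not attempt.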

  3. Integration of HIV in the Human Genome: Which Sites Are Preferential? A Genetic and Statistical Assessment

    PubMed Central

    Gonçalves, Juliana; Moreira, Elsa; Sequeira, Inês J.; Rodrigues, António S.; Rueff, José; Brás, Aldina

    2016-01-01

    Chromosomal fragile sites (FSs) are loci where gaps and breaks may occur and are preferential integration targets for some viruses, for example, Hepatitis B, Epstein-Barr virus, HPV16, HPV18, and MLV vectors. However, the integration of the human immunodeficiency virus (HIV) in Giemsa bands and in FSs is not yet completely clear. This study aimed to assess the integration preferences of HIV in FSs and in Giemsa bands using an in silico study. HIV integration positions from Jurkat cells were used and two nonparametric tests were applied to compare HIV integration in dark versus light bands and in FS versus non-FS (NFSs). The results show that light bands are preferential targets for integration of HIV-1 in Jurkat cells and also that it integrates with equal intensity in FSs and in NFSs. The data indicates that HIV displays different preferences for FSs compared to other viruses. The aim was to develop and apply an approach to predict the conditions and constraints of HIV insertion in the human genome which seems to adequately complement empirical data. PMID:27294106
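
    The abstract does not name the two nonparametric tests that were applied. As one illustrative possibility only, the sketch below computes the Mann-Whitney U statistic for two groups of per-band integration counts. The counts are invented, and no p-value is computed.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two samples, using average ranks for tied values."""
    # Pool both samples, remembering the group of each value (0 = x, 1 = y).
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    return u_x, len(x) * len(y) - u_x


# Hypothetical integration counts per band (invented for illustration).
light_bands = [14, 18, 21, 25, 30]
dark_bands = [3, 5, 8, 9, 16]

u_light, u_dark = mann_whitney_u(light_bands, dark_bands)
print("U (light):", u_light, "U (dark):", u_dark)
```

    A U value close to the product of the two sample sizes for one group indicates that its values tend to exceed those of the other group. Turning U into a significance level would need an exact table or a normal approximation, which is left out here.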

  4. Pancreatic cancer modeling using retrograde viral vector delivery and in vivo CRISPR/Cas9-mediated somatic genome editing.

    PubMed

    Chiou, Shin-Heng; Winters, Ian P; Wang, Jing; Naranjo, Santiago; Dudgeon, Crissy; Tamburini, Fiona B; Brady, Jennifer J; Yang, Dian; Grüner, Barbara M; Chuang, Chen-Hua; Caswell, Deborah R; Zeng, Hong; Chu, Pauline; Kim, Grace E; Carpizo, Darren R; Kim, Seung K; Winslow, Monte M

    2015-07-15

    Pancreatic ductal adenocarcinoma (PDAC) is a genomically diverse, prevalent, and almost invariably fatal malignancy. Although conventional genetically engineered mouse models of human PDAC have been instrumental in understanding pancreatic cancer development, these models are much too labor-intensive, expensive, and slow to perform the extensive molecular analyses needed to adequately understand this disease. Here we demonstrate that retrograde pancreatic ductal injection of either adenoviral-Cre or lentiviral-Cre vectors allows titratable initiation of pancreatic neoplasias that progress into invasive and metastatic PDAC. To enable in vivo CRISPR/Cas9-mediated gene inactivation in the pancreas, we generated a Cre-regulated Cas9 allele and lentiviral vectors that express Cre and a single-guide RNA. CRISPR-mediated targeting of Lkb1 in combination with oncogenic Kras expression led to selection for inactivating genomic alterations, absence of Lkb1 protein, and rapid tumor growth that phenocopied Cre-mediated genetic deletion of Lkb1. This method will transform our ability to rapidly interrogate gene function during the development of this recalcitrant cancer. PMID:26178787

  5. An integrated approach to reconstructing genome-scale transcriptional regulatory networks

    SciTech Connect

    Imam, Saheed; Noguera, Daniel R.; Donohue, Timothy J.; Leslie, Christina

    2015-02-27

    Transcriptional regulatory networks (TRNs) program cells to dynamically alter their gene expression in response to changing internal or environmental conditions. In this study, we develop a novel workflow for generating large-scale TRN models that integrates comparative genomics data, global gene expression analyses, and intrinsic properties of transcription factors (TFs). An assessment of this workflow using benchmark datasets for the well-studied γ-proteobacterium Escherichia coli showed that it outperforms expression-based inference approaches, having a significantly larger area under the precision-recall curve. Further analysis indicated that this integrated workflow captures different aspects of the E. coli TRN than expression-based approaches, potentially making them highly complementary. We leveraged this new workflow and observations to build a large-scale TRN model for the α-Proteobacterium Rhodobacter sphaeroides that comprises 120 gene clusters, 1211 genes (including 93 TFs), 1858 predicted protein-DNA interactions and 76 DNA binding motifs. We found that ~67% of the predicted gene clusters in this TRN are enriched for functions ranging from photosynthesis or central carbon metabolism to environmental stress responses. We also found that members of many of the predicted gene clusters were consistent with prior knowledge in R. sphaeroides and/or other bacteria. Experimental validation of predictions from this R. sphaeroides TRN model showed that high precision and recall was also obtained for TFs involved in photosynthesis (PpsR), carbon metabolism (RSP_0489) and iron homeostasis (RSP_3341). In addition, this integrative approach enabled generation of TRNs with increased information content relative to R. sphaeroides TRN models built via other approaches. We also show how this approach can be used to simultaneously produce TRN models for each related organism used in the comparative genomics analysis. Our results highlight the advantages of integrating

  6. An integrated approach to reconstructing genome-scale transcriptional regulatory networks

    DOE PAGESBeta

    Imam, Saheed; Noguera, Daniel R.; Donohue, Timothy J.; Leslie, Christina

    2015-02-27

    Transcriptional regulatory networks (TRNs) program cells to dynamically alter their gene expression in response to changing internal or environmental conditions. In this study, we develop a novel workflow for generating large-scale TRN models that integrates comparative genomics data, global gene expression analyses, and intrinsic properties of transcription factors (TFs). An assessment of this workflow using benchmark datasets for the well-studied γ-proteobacterium Escherichia coli showed that it outperforms expression-based inference approaches, having a significantly larger area under the precision-recall curve. Further analysis indicated that this integrated workflow captures different aspects of the E. coli TRN than expression-based approaches, potentially making them highly complementary. We leveraged this new workflow and observations to build a large-scale TRN model for the α-Proteobacterium Rhodobacter sphaeroides that comprises 120 gene clusters, 1211 genes (including 93 TFs), 1858 predicted protein-DNA interactions and 76 DNA binding motifs. We found that ~67% of the predicted gene clusters in this TRN are enriched for functions ranging from photosynthesis or central carbon metabolism to environmental stress responses. We also found that members of many of the predicted gene clusters were consistent with prior knowledge in R. sphaeroides and/or other bacteria. Experimental validation of predictions from this R. sphaeroides TRN model showed that high precision and recall was also obtained for TFs involved in photosynthesis (PpsR), carbon metabolism (RSP_0489) and iron homeostasis (RSP_3341). In addition, this integrative approach enabled generation of TRNs with increased information content relative to R. sphaeroides TRN models built via other approaches. We also show how this approach can be used to simultaneously produce TRN models for each related organism used in the comparative genomics analysis. Our results highlight the advantages of

  7. An Integrated Approach to Reconstructing Genome-Scale Transcriptional Regulatory Networks

    PubMed Central

    Imam, Saheed; Noguera, Daniel R.; Donohue, Timothy J.

    2015-01-01

    Transcriptional regulatory networks (TRNs) program cells to dynamically alter their gene expression in response to changing internal or environmental conditions. In this study, we develop a novel workflow for generating large-scale TRN models that integrates comparative genomics data, global gene expression analyses, and intrinsic properties of transcription factors (TFs). An assessment of this workflow using benchmark datasets for the well-studied γ-proteobacterium Escherichia coli showed that it outperforms expression-based inference approaches, having a significantly larger area under the precision-recall curve. Further analysis indicated that this integrated workflow captures different aspects of the E. coli TRN than expression-based approaches, potentially making them highly complementary. We leveraged this new workflow and observations to build a large-scale TRN model for the α-Proteobacterium Rhodobacter sphaeroides that comprises 120 gene clusters, 1211 genes (including 93 TFs), 1858 predicted protein-DNA interactions and 76 DNA binding motifs. We found that ~67% of the predicted gene clusters in this TRN are enriched for functions ranging from photosynthesis or central carbon metabolism to environmental stress responses. We also found that members of many of the predicted gene clusters were consistent with prior knowledge in R. sphaeroides and/or other bacteria. Experimental validation of predictions from this R. sphaeroides TRN model showed that high precision and recall was also obtained for TFs involved in photosynthesis (PpsR), carbon metabolism (RSP_0489) and iron homeostasis (RSP_3341). In addition, this integrative approach enabled generation of TRNs with increased information content relative to R. sphaeroides TRN models built via other approaches. We also show how this approach can be used to simultaneously produce TRN models for each related organism used in the comparative genomics analysis. Our results highlight the advantages of integrating

  8. An integrated map of structural variation in 2,504 human genomes.

    PubMed

    Sudmant, Peter H; Rausch, Tobias; Gardner, Eugene J; Handsaker, Robert E; Abyzov, Alexej; Huddleston, John; Zhang, Yan; Ye, Kai; Jun, Goo; Hsi-Yang Fritz, Markus; Konkel, Miriam K; Malhotra, Ankit; Stütz, Adrian M; Shi, Xinghua; Paolo Casale, Francesco; Chen, Jieming; Hormozdiari, Fereydoun; Dayama, Gargi; Chen, Ken; Malig, Maika; Chaisson, Mark J P; Walter, Klaudia; Meiers, Sascha; Kashin, Seva; Garrison, Erik; Auton, Adam; Lam, Hugo Y K; Jasmine Mu, Xinmeng; Alkan, Can; Antaki, Danny; Bae, Taejeong; Cerveira, Eliza; Chines, Peter; Chong, Zechen; Clarke, Laura; Dal, Elif; Ding, Li; Emery, Sarah; Fan, Xian; Gujral, Madhusudan; Kahveci, Fatma; Kidd, Jeffrey M; Kong, Yu; Lameijer, Eric-Wubbo; McCarthy, Shane; Flicek, Paul; Gibbs, Richard A; Marth, Gabor; Mason, Christopher E; Menelaou, Androniki; Muzny, Donna M; Nelson, Bradley J; Noor, Amina; Parrish, Nicholas F; Pendleton, Matthew; Quitadamo, Andrew; Raeder, Benjamin; Schadt, Eric E; Romanovitch, Mallory; Schlattl, Andreas; Sebra, Robert; Shabalin, Andrey A; Untergasser, Andreas; Walker, Jerilyn A; Wang, Min; Yu, Fuli; Zhang, Chengsheng; Zhang, Jing; Zheng-Bradley, Xiangqun; Zhou, Wanding; Zichner, Thomas; Sebat, Jonathan; Batzer, Mark A; McCarroll, Steven A; Mills, Ryan E; Gerstein, Mark B; Bashir, Ali; Stegle, Oliver; Devine, Scott E; Lee, Charles; Eichler, Evan E; Korbel, Jan O

    2015-10-01

    Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association. PMID:26432246

  9. The GDB Human Genome Data Base: a source of integrated genetic mapping and disease data.

    PubMed Central

    Brandt, K A

    1993-01-01

    The GDB Human Genome Data Base refers collectively to GDB and OMIM, Online Mendelian Inheritance in Man. GDB and OMIM are linked databases that provide an international repository for information generated by the Human Genome Initiative. GDB contains human gene mapping data, while OMIM offers the text of Dr. Victor A. McKusick's catalog of genetic disease and phenotype descriptions. These databases, updated and edited continuously, integrate bibliographic and full-text information with several types of mapping data. They are accessible through a flexible interface and are available through SprintNet and the Internet to the scientific community without cost. This paper provides an overview of the context, development, structure, content, and use of these databases. PMID:8374584

  10. An integrated map of structural variation in 2,504 human genomes

    PubMed Central

    Jun, Goo; Fritz, Markus Hsi-Yang; Konkel, Miriam K.; Malhotra, Ankit; Stütz, Adrian M.; Shi, Xinghua; Casale, Francesco Paolo; Chen, Jieming; Hormozdiari, Fereydoun; Dayama, Gargi; Chen, Ken; Malig, Maika; Chaisson, Mark J.P.; Walter, Klaudia; Meiers, Sascha; Kashin, Seva; Garrison, Erik; Auton, Adam; Lam, Hugo Y. K.; Mu, Xinmeng Jasmine; Alkan, Can; Antaki, Danny; Bae, Taejeong; Cerveira, Eliza; Chines, Peter; Chong, Zechen; Clarke, Laura; Dal, Elif; Ding, Li; Emery, Sarah; Fan, Xian; Gujral, Madhusudan; Kahveci, Fatma; Kidd, Jeffrey M.; Kong, Yu; Lameijer, Eric-Wubbo; McCarthy, Shane; Flicek, Paul; Gibbs, Richard A.; Marth, Gabor; Mason, Christopher E.; Menelaou, Androniki; Muzny, Donna M.; Nelson, Bradley J.; Noor, Amina; Parrish, Nicholas F.; Pendleton, Matthew; Quitadamo, Andrew; Raeder, Benjamin; Schadt, Eric E.; Romanovitch, Mallory; Schlattl, Andreas; Sebra, Robert; Shabalin, Andrey A.; Untergasser, Andreas; Walker, Jerilyn A.; Wang, Min; Yu, Fuli; Zhang, Chengsheng; Zhang, Jing; Zheng-Bradley, Xiangqun; Zhou, Wanding; Zichner, Thomas; Sebat, Jonathan; Batzer, Mark A.; McCarroll, Steven A.; Mills, Ryan E.; Gerstein, Mark B.; Bashir, Ali; Stegle, Oliver; Devine, Scott E.; Lee, Charles; Eichler, Evan E.; Korbel, Jan O.

    2015-01-01

    Summary Structural variants (SVs) are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight SV classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype-blocks in 26 human populations. Analyzing this set, we identify numerous gene-intersecting SVs exhibiting population stratification and describe naturally occurring homozygous gene knockouts suggesting the dispensability of a variety of human genes. We demonstrate that SVs are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of SV complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex SVs with multiple breakpoints likely formed through individual mutational events. Our catalog will enhance future studies into SV demography, functional impact and disease association. PMID:26432246

  11. An integrative genomics screen uncovers ncRNA T-UCR functions in neuroblastoma tumours.

    PubMed

    Mestdagh, P; Fredlund, E; Pattyn, F; Rihani, A; Van Maerken, T; Vermeulen, J; Kumps, C; Menten, B; De Preter, K; Schramm, A; Schulte, J; Noguera, R; Schleiermacher, G; Janoueix-Lerosey, I; Laureys, G; Powel, R; Nittner, D; Marine, J-C; Ringnér, M; Speleman, F; Vandesompele, J

    2010-06-17

    Different classes of non-coding RNAs, including microRNAs, have recently been implicated in the process of tumourigenesis. In this study, we examined the expression and putative functions of a novel class of non-coding RNAs known as transcribed ultraconserved regions (T-UCRs) in neuroblastoma. Genome-wide expression profiling revealed correlations between specific T-UCR expression levels and important clinicogenetic parameters such as MYCN amplification status. A functional genomics approach based on the integration of multi-level transcriptome data was adapted to gain insights into T-UCR functions. Assignments of T-UCRs to cellular processes such as TP53 response, differentiation and proliferation were verified using various cellular model systems. For the first time, our results define a T-UCR expression landscape in neuroblastoma and suggest widespread T-UCR involvement in diverse cellular processes that are deregulated in the process of tumourigenesis. PMID:20383195

  12. Conditional Epistatic Interaction Maps Reveal Global Functional Rewiring of Genome Integrity Pathways in Escherichia coli.

    PubMed

    Kumar, Ashwani; Beloglazova, Natalia; Bundalovic-Torma, Cedoljub; Phanse, Sadhna; Deineko, Viktor; Gagarinova, Alla; Musso, Gabriel; Vlasblom, James; Lemak, Sofia; Hooshyar, Mohsen; Minic, Zoran; Wagih, Omar; Mosca, Roberto; Aloy, Patrick; Golshani, Ashkan; Parkinson, John; Emili, Andrew; Yakunin, Alexander F; Babu, Mohan

    2016-01-26

    As antibiotic resistance is increasingly becoming a public health concern, an improved understanding of the bacterial DNA damage response (DDR), which is commonly targeted by antibiotics, could be of tremendous therapeutic value. Although the genetic components of the bacterial DDR have been studied extensively in isolation, how the underlying biological pathways interact functionally remains unclear. Here, we address this by performing systematic, unbiased, quantitative synthetic genetic interaction (GI) screens and uncover widespread changes in the GI network of the entire genomic integrity apparatus of Escherichia coli under standard and DNA-damaging growth conditions. The GI patterns of untreated cultures implicated two previously uncharacterized proteins (YhbQ and YqgF) as nucleases, whereas reorganization of the GI network after DNA damage revealed DDR roles for both annotated and uncharacterized genes. Analyses of pan-bacterial conservation patterns suggest that DDR mechanisms and functional relationships are near universal, highlighting a modular and highly adaptive genomic stress response. PMID:26774489

  13. BiologicalNetworks 2.0 - an integrative view of genome biology data

    PubMed Central

    2010-01-01

    Background A significant problem in the study of mechanisms of an organism's development is the elucidation of interrelated factors which are making an impact on the different levels of the organism, such as genes, biological molecules, cells, and cell systems. Numerous sources of heterogeneous data which exist for these subsystems are still not integrated sufficiently enough to give researchers a straightforward opportunity to analyze them together in the same frame of study. Systematic application of data integration methods is also hampered by a multitude of such factors as the orthogonal nature of the integrated data and naming problems. Results Here we report on a new version of BiologicalNetworks, a research environment for the integral visualization and analysis of heterogeneous biological data. BiologicalNetworks can be queried for properties of thousands of different types of biological entities (genes/proteins, promoters, COGs, pathways, binding sites, and other) and their relations (interactions, co-expression, co-citations, and other). The system includes the build-pathways infrastructure for molecular interactions/relations and module discovery in high-throughput experiments. Also implemented in BiologicalNetworks are the Integrated Genome Viewer and Comparative Genomics Browser applications, which allow for the search and analysis of gene regulatory regions and their conservation in multiple species in conjunction with molecular pathways/networks, experimental data and functional annotations. Conclusions The new release of BiologicalNetworks together with its back-end database introduces extensive functionality for a more efficient integrated multi-level analysis of microarray, sequence, regulatory, and other data. BiologicalNetworks is freely available at http://www.biologicalnetworks.org. PMID:21190573

  14. An integrative approach for efficient analysis of whole genome bisulfite sequencing data

    PubMed Central

    2015-01-01

    Background Whole genome bisulfite sequencing (WGBS) is a high-throughput technique for profiling genome-wide DNA methylation at single nucleotide resolution. However, the applications of WGBS are limited by low accuracy resulting from bisulfite-induced damage on DNA fragments. Although many computer programs have been developed for accurate detection, most of the programs have barely succeeded in improving either quantity or quality of the methylation results. To improve both, we attempted to develop a novel integration of the most widely used bisulfite-read mappers: Bismark, BSMAP, and BS-seeker2. Results A comprehensive analysis of the three mappers revealed that the mapping results of the mappers were mutually complementary under diverse read conditions. Therefore, we sought to integrate the characteristics of the mappers by scoring them to gain robustness against artifacts. As a result, the integration significantly increased detection accuracy compared with the individual mappers. In addition, the amount of detected cytosine was higher than that by Bismark. Furthermore, the integration successfully reduced the fluctuation of detection accuracy induced by read conditions. We applied the integration to real WGBS samples and succeeded in classifying the samples according to the originated tissues by both CpG and CpH methylation patterns. Conclusions In this study, we improved both quality and quantity of methylation results from WGBS data by integrating the mapping results of three bisulfite-read mappers. Also, we succeeded in combining and comparing WGBS samples by reducing the effects of read heterogeneity on methylation detection. This study contributes to DNA methylation research by improving the efficiency of methylation detection from WGBS data and facilitating the comprehensive analysis of public WGBS data. PMID:26680746

  15. USF-1 Is Critical for Maintaining Genome Integrity in Response to UV-Induced DNA Photolesions

    PubMed Central

    Mouchet, Nicolas; Vaulont, Sophie; Prince, Sharon; Galibert, Marie-Dominique

    2012-01-01

    An important function of all organisms is to ensure that their genetic material remains intact and unaltered through generations. This is an extremely challenging task since the cell's DNA is constantly under assault by endogenous and environmental agents. To protect against this, cells have evolved effective mechanisms to recognize DNA damage, signal its presence, and mediate its repair. While these responses are expected to be highly regulated because they are critical to avoid human diseases, very little is known about the regulation of the expression of genes involved in mediating their effects. The Nucleotide Excision Repair (NER) is the major DNA–repair process involved in the recognition and removal of UV-mediated DNA damage. Here we use a combination of in vitro and in vivo assays with an intermittent UV-irradiation protocol to investigate the regulation of key players in the DNA–damage recognition step of NER sub-pathways (TCR and GGR). We show an up-regulation in gene expression of CSA and HR23A, which are involved in TCR and GGR, respectively. Importantly, we show that this occurs through a p53 independent mechanism and that it is coordinated by the stress-responsive transcription factor USF-1. Furthermore, using a mouse model we show that the loss of USF-1 compromises DNA repair, which suggests that USF-1 plays an important role in maintaining genomic stability. PMID:22291606

  16. Integration of Multiple Genomic and Phenotype Data to Infer Novel miRNA-Disease Associations.

    PubMed

    Shi, Hongbo; Zhang, Guangde; Zhou, Meng; Cheng, Liang; Yang, Haixiu; Wang, Jing; Sun, Jie; Wang, Zhenzhen

    2016-01-01

    MicroRNAs (miRNAs) play an important role in the development and progression of human diseases. The identification of disease-associated miRNAs will be helpful for understanding the molecular mechanisms of diseases at the post-transcriptional level. Based on different types of genomic data sources, computational methods for miRNA-disease association prediction have been proposed. However, an individual source of genomic data tends to be incomplete and noisy; therefore, the integration of various types of genomic data for inferring reliable miRNA-disease associations is urgently needed. In this study, we present a computational framework, CHNmiRD, for identifying miRNA-disease associations by integrating multiple genomic and phenotype data, including protein-protein interaction data, gene ontology data, experimentally verified miRNA-target relationships, disease phenotype information and known miRNA-disease connections. The performance of CHNmiRD was evaluated by experimentally verified miRNA-disease associations, which achieved an area under the ROC curve (AUC) of 0.834 for 5-fold cross-validation. In particular, CHNmiRD displayed excellent performance for diseases without any known related miRNAs. The results of case studies for three human diseases (glioblastoma, myocardial infarction and type 1 diabetes) showed that all of the top 10 ranked miRNAs having no known associations with these three diseases in existing miRNA-disease databases were directly or indirectly confirmed by our latest literature mining. All these results demonstrated the reliability and efficiency of CHNmiRD, and it is anticipated that CHNmiRD will serve as a powerful bioinformatics method for mining novel disease-related miRNAs and providing a new perspective into molecular mechanisms underlying human diseases at the post-transcriptional level. CHNmiRD is freely available at http://www.bio-bigdata.com/CHNmiRD. PMID:26849207
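
    For readers unfamiliar with the evaluation metric quoted above, the following sketch shows how an area under the ROC curve can be computed from prediction scores, and how items can be split into five folds. It is a generic illustration, not CHNmiRD code: the `roc_auc` and `k_fold_indices` helpers and the example scores are assumptions. The AUC is computed as the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counted as one half.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


def k_fold_indices(n_items, k=5):
    """Assign item indices to k folds in round-robin order."""
    return [list(range(fold, n_items, k)) for fold in range(k)]


# Hypothetical scores for six candidate associations (1 = verified association).
example_scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1]
example_labels = [1, 1, 0, 1, 0, 0]
print(roc_auc(example_scores, example_labels))
print(k_fold_indices(10, k=5))
```

    In a 5-fold cross-validation, each fold of known associations is hidden in turn, the model is fitted on the remaining four folds, and the hidden associations are scored; the AUC then summarizes how well hidden positives are ranked above unlabeled pairs.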

  17. Genome-Wide Analysis of Transposon and Retroviral Insertions Reveals Preferential Integrations in Regions of DNA Flexibility

    PubMed Central

    Vrljicak, Pavle; Tao, Shijie; Varshney, Gaurav K.; Quach, Helen Ngoc Bao; Joshi, Adita; LaFave, Matthew C.; Burgess, Shawn M.; Sampath, Karuna

    2016-01-01

    DNA transposons and retroviruses are important transgenic tools for genome engineering. An important consideration affecting the choice of transgenic vector is their insertion site preferences. Previous large-scale analyses of Ds transposon integration sites in plants were done on the basis of reporter gene expression or germ-line transmission, making it difficult to discern vertebrate integration preferences. Here, we compare over 1300 Ds transposon integration sites in zebrafish with Tol2 transposon and retroviral integration sites. Genome-wide analysis shows that Ds integration sites in the presence or absence of marker selection are remarkably similar and distributed throughout the genome. No strict motif was found, but a preference for structural features in the target DNA associated with DNA flexibility (Twist, Tilt, Rise, Roll, Shift, and Slide) was observed. Remarkably, this feature is also found in transposon and retroviral integrations in maize and mouse cells. Our findings show that structural features influence the integration of heterologous DNA in genomes, and have implications for targeted genome engineering. PMID:26818075

  18. Genome-Wide Analysis of Transposon and Retroviral Insertions Reveals Preferential Integrations in Regions of DNA Flexibility.

    PubMed

    Vrljicak, Pavle; Tao, Shijie; Varshney, Gaurav K; Quach, Helen Ngoc Bao; Joshi, Adita; LaFave, Matthew C; Burgess, Shawn M; Sampath, Karuna

    2016-01-01

    DNA transposons and retroviruses are important transgenic tools for genome engineering. An important consideration affecting the choice of transgenic vector is their insertion site preferences. Previous large-scale analyses of Ds transposon integration sites in plants were done on the basis of reporter gene expression or germ-line transmission, making it difficult to discern vertebrate integration preferences. Here, we compare over 1300 Ds transposon integration sites in zebrafish with Tol2 transposon and retroviral integration sites. Genome-wide analysis shows that Ds integration sites in the presence or absence of marker selection are remarkably similar and distributed throughout the genome. No strict motif was found, but a preference for structural features in the target DNA associated with DNA flexibility (Twist, Tilt, Rise, Roll, Shift, and Slide) was observed. Remarkably, this feature is also found in transposon and retroviral integrations in maize and mouse cells. Our findings show that structural features influence the integration of heterologous DNA in genomes, and have implications for targeted genome engineering. PMID:26818075

  19. INDIGO – INtegrated Data Warehouse of MIcrobial GenOmes with Examples from the Red Sea Extremophiles

    PubMed Central

    Alam, Intikhab; Antunes, André; Kamau, Allan Anthony; Ba alawi, Wail; Kalkatawi, Manal; Stingl, Ulrich; Bajic, Vladimir B.

    2013-01-01

    Background The next generation sequencing technologies substantially increased the throughput of microbial genome sequencing. To functionally annotate newly sequenced microbial genomes, a variety of experimental and computational methods are used. Integration of information from different sources is a powerful approach to enhance such annotation. Functional analysis of microbial genomes, necessary for downstream experiments, crucially depends on this annotation but it is hampered by the current lack of suitable information integration and exploration systems for microbial genomes. Results We developed a data warehouse system (INDIGO) that enables the integration of annotations for exploration and analysis of newly sequenced microbial genomes. INDIGO offers an opportunity to construct complex queries and combine annotations from multiple sources starting from genomic sequence to protein domain, gene ontology and pathway levels. This data warehouse is aimed at being populated with information from genomes of pure cultures and uncultured single cells of Red Sea bacteria and Archaea. Currently, INDIGO contains information from Salinisphaera shabanensis, Haloplasma contractile, and Halorhabdus tiamatea - extremophiles isolated from deep-sea anoxic brine lakes of the Red Sea. We provide examples of utilizing the system to gain new insights into specific aspects on the unique lifestyle and adaptations of these organisms to extreme environments. Conclusions We developed a data warehouse system, INDIGO, which enables comprehensive integration of information from various resources to be used for annotation, exploration and analysis of microbial genomes. It will be regularly updated and extended with new genomes. It is aimed to serve as a resource dedicated to the Red Sea microbes. In addition, through INDIGO, we provide our Automatic Annotation of Microbial Genomes (AAMG) pipeline. The INDIGO web server is freely available at http://www.cbrc.kaust.edu.sa/indigo. PMID

  20. Rice Annotation Project Database (RAP-DB): an integrative and interactive database for rice genomics.

    PubMed

    Sakai, Hiroaki; Lee, Sung Shin; Tanaka, Tsuyoshi; Numa, Hisataka; Kim, Jungsok; Kawahara, Yoshihiro; Wakimoto, Hironobu; Yang, Ching-chia; Iwamoto, Masao; Abe, Takashi; Yamada, Yuko; Muto, Akira; Inokuchi, Hachiro; Ikemura, Toshimichi; Matsumoto, Takashi; Sasaki, Takuji; Itoh, Takeshi

    2013-02-01

    The Rice Annotation Project Database (RAP-DB, http://rapdb.dna.affrc.go.jp/) has been providing a comprehensive set of gene annotations for the genome sequence of rice, Oryza sativa (japonica group) cv. Nipponbare. Since the first release in 2005, RAP-DB has been updated several times along with the genome assembly updates. Here, we present our newest RAP-DB based on the latest genome assembly, Os-Nipponbare-Reference-IRGSP-1.0 (IRGSP-1.0), which was released in 2011. We detected 37,869 loci by mapping transcript and protein sequences of 150 monocot species. To provide plant researchers with highly reliable and up to date rice gene annotations, we have been incorporating literature-based manually curated data, and 1,626 loci currently incorporate literature-based annotation data, including commonly used gene names or gene symbols. Transcriptional activities are shown at the nucleotide level by mapping RNA-Seq reads derived from 27 samples. We also mapped the Illumina reads of a Japanese leading japonica cultivar, Koshihikari, and a Chinese indica cultivar, Guangluai-4, to the genome and show alignments together with the single nucleotide polymorphisms (SNPs) and gene functional annotations through a newly developed browser, Short-Read Assembly Browser (S-RAB). We have developed two satellite databases, Plant Gene Family Database (PGFD) and Integrative Database of Cereal Gene Phylogeny (IDCGP), which display gene family and homologous gene relationships among diverse plant species. RAP-DB and the satellite databases offer simple and user-friendly web interfaces, enabling plant and genome researchers to access the data easily and facilitating a broad range of plant research topics. PMID:23299411

  1. A statistical framework to predict functional non-coding regions in the human genome through integrated analysis of annotation data.

    PubMed

    Lu, Qiongshi; Hu, Yiming; Sun, Jiehuan; Cheng, Yuwei; Cheung, Kei-Hoi; Zhao, Hongyu

    2015-01-01

    Identifying functional regions in the human genome is a major goal in human genetics. Great efforts have been made to functionally annotate the human genome either through computational predictions, such as genomic conservation, or high-throughput experiments, such as the ENCODE project. These efforts have resulted in a rich collection of functional annotation data of diverse types that need to be jointly analyzed for integrated interpretation and annotation. Here we present GenoCanyon, a whole-genome annotation method that performs unsupervised statistical learning using 22 computational and experimental annotations, thereby inferring the functional potential of each position in the human genome. With GenoCanyon, we are able to predict many of the known functional regions. Its ability to predict functional regions, together with its generalizable statistical framework, makes GenoCanyon a unique and powerful tool for whole-genome annotation. The GenoCanyon web server is available at http://genocanyon.med.yale.edu. PMID:26015273

  2. The GLOBE 3D Genome Platform - towards a novel system-biological paper tool to integrate the huge complexity of genome organization and function.

    PubMed

    Knoch, Tobias A; Lesnussa, Michael; Kepper, Nick; Eussen, Hubert B; Grosveld, Frank G

    2009-01-01

    Genomes are tremendous co-evolutionary holistic systems for molecular storage, processing and fabrication of information. Their system-biological complexity remains, however, still largely mysterious, despite immense sequencing achievements and huge advances in the understanding of the general sequential, three-dimensional and regulatory organization. Here, we present the GLOBE 3D Genome Platform, a completely novel grid-based virtual "paper" tool and in fact the first system-biological genome browser integrating the holistic complexity of genomes in a single, easily comprehensible platform. Based on a detailed study of biophysical and IT requirements, every architectural level from sequence to morphology of one or several genomes can be approached in a real and in a symbolic representation simultaneously and navigated by continuous scale-free zooming within a unique three-dimensional OpenGL and grid-driven environment. In principle, an unlimited number of multi-dimensional data sets can be visualized, customized in terms of arrangement, shape, colour, texture, etc., as well as accessed and annotated individually or in groups using internal or external databases/facilities. Any information can be searched and correlated by importing or calculating simple relations in real time using grid resources. A general correlation and application platform for more complex correlative analysis and a front-end for system-biological simulations, both again using the huge capabilities of grid infrastructures, are currently under development. Hence, the GLOBE 3D Genome Platform is an example of a grid-based approach towards a virtual desktop for genomic work combining the three fundamental distributed resources: i) visual data representation, ii) data access and management, and iii) data analysis and creation. Thus, the GLOBE 3D Genome Platform is the novel system-biology oriented information system urgently needed to access, present, annotate, and to simulate the holistic genome

  3. MicroRNA-mediated immune modulation as a therapeutic strategy in host-implant integration.

    PubMed

    Ong, Siew-Min; Biswas, Subhra K; Wong, Siew-Cheng

    2015-07-01

    The concept of implanting an artificial device into the human body was once the preserve of science fiction, yet this approach is now often used to replace lost or damaged biological structures in human patients. However, assimilation of medical devices into host tissues is a complex process, and successful implant integration into patients is far from certain. The body's immediate response to a foreign object is an immune-mediated reaction; hence, there has been extensive research into biomaterials that can reduce or even ablate anti-implant immune responses. There have also been attempts to embed or coat anti-inflammatory drugs and pro-regulatory molecules onto medical devices with the aim of preventing implant rejection by the host. In this review, we summarize the key immune mediators of medical implant reaction, and we evaluate the potential of microRNAs to regulate these processes to promote wound healing and prolong host-implant integration. PMID:26024977

  4. Building a BRIDGE for the integration of heterogeneous data from functional genomics into a platform for systems biology.

    PubMed

    Goesmann, Alexander; Linke, Burkhard; Rupp, Oliver; Krause, Lutz; Bartels, Daniela; Dondrup, Michael; McHardy, Alice C; Wilke, Andreas; Pühler, Alfred; Meyer, Folker

    2003-12-19

    The flood of data acquired from the increasing number of publicly available genomes has led to new demands for bioinformatics software. With the growing amount of information resulting from high throughput experiments new questions arise that often focus on the comparison of genes, genomes, and their expression profiles. Inferring new knowledge by combining different kinds of "post-genomics" data obviously necessitates the development of new approaches that allow the integration of variable data sources into a flexible framework. In this paper, we describe our concept for the integration of heterogeneous data into a platform for systems biology. We have implemented a Bioinformatics Resource for the Integration of heterogeneous Data from Genomic Explorations (BRIDGE) and illustrate the usability of our approach as a platform for systems biology for two sample applications. PMID:14651858

  5. Complete genome sequence of the Sporosarcina psychrophila DSM 6497, a psychrophilic Bacillus strain that mediates the calcium carbonate precipitation.

    PubMed

    Yan, Wenkai; Xiao, Xiang; Zhang, Yu

    2016-05-20

    Sporosarcina psychrophila DSM 6497 is a Gram-positive, spore-forming psychrophilic bacterial strain, widely distributed in terrestrial and aquatic environments. Here we report its complete sequence, including one circular chromosome of 4,674,191 bp with a GC content of 40.3%. Genes encoding urease are predicted in the genome, which provide insight into the microbiologically mediated urea hydrolysis process. This urea hydrolysis can further lead to an increase of carbonate anion and alkalinity in the environment, which promotes the microbiologically induced carbonate precipitation with various applications, such as the bioremediation of calcium rich wastewater and bio-reservation of architectural patrimony. PMID:27015981

  6. Ethics orientation as a mediator of organizational integrity in health services organizations.

    PubMed

    Proenca, E Jose

    2004-01-01

    Increasing scrutiny of ethical misconduct by federal and state agencies has prompted health services organizations to adopt codes of ethics and institute legal compliance programs. However, there is little understanding of the impact of ethics programs or the manner in which program elements act to enhance organizational integrity. This study examined the effect of five ethics program elements on organizational integrity and the mediating role played by ethics orientation in this relationship. It found that program elements influence organizational integrity by engendering among employees a values orientation, a compliance orientation, or both. Furthermore, program elements that induced both orientations have a larger impact on integrity. These findings have important implications for health services managers involved in designing and implementing an ethics program. PMID:14992483

  7. Integrative functional genomic delineation of the cascades of transcriptional changes involved in hepatocellular carcinoma progression.

    PubMed

    Ramesh, Vignesh; Ganesan, Kumaresan

    2016-10-01

    Development of targeted therapeutics is still at an early stage for hepatocellular carcinoma (HCC) due to the incomplete understanding of the confounding regulations at the signaling pathway level. In this investigation, gene co-expression-based networking and integrative functional genomic modeling of HCC mRNA profiles as signaling processes were employed to understand the complex signaling cascades involved in HCC development toward understanding the avenues for targeted therapeutics. Multiple sets of genes and molecular biological processes involved during HCC development were identified from this integrative analysis: (i) loss of liver cellular features due to the reduced HNF4A & PPAR signaling in the early stages of HCC, (ii) activated inflammatory and stress signals in the cirrhosis stages and (iii) highly activated cellular proliferation with the activated E2F-MYC oncogenic signaling with the gain of embryonic liver stem cell-like features in the advanced stage tumors. Upon connecting these gene-sets with the established drug sensitivity-related gene signatures, targeted therapeutic strategies for the heterogeneous HCC conditions have been identified. PPAR agonist class of drugs for early stage HCC conditions, anti-inflammatory drugs for cirrhosis and topoisomerase inhibitors for the advanced HCC conditions were inferred. Integrative functional genomic analysis of HCC transcriptome profiles in the context of signaling pathways has defined the key molecular processes involved in HCC development. Further, the study highlights the stage-specific and pathway-focused targeted therapeutics for HCC. These findings deserve extensive preclinical explorations toward the establishment of targeted therapeutics. PMID:27194100

  8. Perspectives on Clinical Informatics: Integrating Large-Scale Clinical, Genomic, and Health Information for Clinical Care

    PubMed Central

    Choi, In Young; Kim, Tae-Min; Kim, Myung Shin; Mun, Seong K.

    2013-01-01

    The advances in electronic medical records (EMRs) and bioinformatics (BI) represent two significant trends in healthcare. The widespread adoption of EMR systems and the completion of the Human Genome Project developed the technologies for data acquisition, analysis, and visualization in two different domains. The massive amount of data from both clinical and biology domains is expected to provide personalized, preventive, and predictive healthcare services in the near future. The integrated use of EMR and BI data needs to consider four key informatics areas: data modeling, analytics, standardization, and privacy. Bioclinical data warehouses integrating heterogeneous patient-related clinical or omics data should be considered. The representative standardization effort by the Clinical Bioinformatics Ontology (CBO) aims to provide uniquely identified concepts to include molecular pathology terminologies. Since individual genome data are easily used to predict current and future health status, different safeguards to ensure confidentiality should be considered. In this paper, we focused on the informatics aspects of integrating the EMR community and BI community by identifying opportunities, challenges, and approaches to provide the best possible care service for our patients and the population. PMID:24465229

  9. Perspectives on clinical informatics: integrating large-scale clinical, genomic, and health information for clinical care.

    PubMed

    Choi, In Young; Kim, Tae-Min; Kim, Myung Shin; Mun, Seong K; Chung, Yeun-Jun

    2013-12-01

    The advances in electronic medical records (EMRs) and bioinformatics (BI) represent two significant trends in healthcare. The widespread adoption of EMR systems and the completion of the Human Genome Project developed the technologies for data acquisition, analysis, and visualization in two different domains. The massive amount of data from both clinical and biology domains is expected to provide personalized, preventive, and predictive healthcare services in the near future. The integrated use of EMR and BI data needs to consider four key informatics areas: data modeling, analytics, standardization, and privacy. Bioclinical data warehouses integrating heterogeneous patient-related clinical or omics data should be considered. The representative standardization effort by the Clinical Bioinformatics Ontology (CBO) aims to provide uniquely identified concepts to include molecular pathology terminologies. Since individual genome data are easily used to predict current and future health status, different safeguards to ensure confidentiality should be considered. In this paper, we focused on the informatics aspects of integrating the EMR community and BI community by identifying opportunities, challenges, and approaches to provide the best possible care service for our patients and the population. PMID:24465229

  10. Triplex-Inspector: an analysis tool for triplex-mediated targeting of genomic loci

    PubMed Central

    Buske, Fabian A.; Bauer, Denis C.; Mattick, John S.; Bailey, Timothy L.

    2013-01-01

    Summary: At the heart of many modern biotechnological and therapeutic applications lies the need to target specific genomic loci with pinpoint accuracy. Although landmark experiments demonstrate technological maturity in manufacturing and delivering genetic material, the genomic sequence analysis to find suitable targets lags behind. We provide a computational aid for the sophisticated design of sequence-specific ligands and selection of appropriate targets, taking gene location and genomic architecture into account. Availability: Source code and binaries are downloadable from www.bioinformatics.org.au/triplexator/inspector. Contact: t.bailey@uq.edu.au Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23740745

  11. Genome-wide analysis uncovers high frequency, strong differential chromosomal interactions and their associated epigenetic patterns in E2-mediated gene regulation

    PubMed Central

    2013-01-01

    Background: An emerging Hi-C protocol has the ability to probe three-dimensional (3D) architecture and capture chromatin interactions on a genome-wide scale. It provides informative results to address how chromatin organization changes contribute to disease/tumor occurrence and progression in response to stimulation of environmental chemicals or hormones. Results: In this study, using MCF7 cells as a model system, we found that estrogen stimulation significantly impacts chromatin interactions, leading to alteration of gene regulation and the associated histone modification states. Many chromosomal interaction regions at different levels of interaction frequency were identified. In particular, the top 10 hot regions with the highest interaction frequency are enriched with breast cancer specific genes. Furthermore, four types of E2-mediated strong differential (gain- or loss-) chromosomal (intra- or inter-) interactions were classified, in which the number of gain-chromosomal interactions is less than the number of loss-chromosomal interactions upon E2 stimulation. Finally, by integrating with eight histone modification marks, DNA methylation, regulatory elements regions, ERα and Pol-II binding activities, associations between epigenetic patterns and high chromosomal interaction frequency were revealed in E2-mediated gene regulation. Conclusions: The work provides insight into the effect of chromatin interaction on E2/ERα regulated downstream genes in breast cancer cells. PMID:23368971

  12. Nuclear-receptor-mediated telomere insertion leads to genome instability in ALT cancers.

    PubMed

    Marzec, Paulina; Armenise, Claudia; Pérot, Gaëlle; Roumelioti, Fani-Marlen; Basyuk, Eugenia; Gagos, Sarantis; Chibon, Frédéric; Déjardin, Jérôme

    2015-02-26

    The breakage-fusion-bridge cycle is a classical mechanism of telomere-driven genome instability in which dysfunctional telomeres are fused to other chromosomal extremities, creating dicentric chromosomes that eventually break at mitosis. Here, we uncover a distinct pathway of telomere-driven genome instability, specifically occurring in cells that maintain telomeres with the alternative lengthening of telomeres mechanism. We show that, in these cells, telomeric DNA is added to multiple discrete sites throughout the genome, corresponding to regions regulated by NR2C/F transcription factors. These proteins drive local telomere DNA addition by recruiting telomeric chromatin. This mechanism, which we name targeted telomere insertion (TTI), generates potential common fragile sites that destabilize the genome. We propose that TTI driven by NR2C/F proteins contributes to the formation of complex karyotypes in ALT tumors. PMID:25723166

  13. A Genome-Wide Map of AAV-Mediated Human Gene Targeting

    PubMed Central

    Deyle, David R.; Hansen, R. Scott; Cornea, Anda M.; Li, Li B.; Burt, Amber A.; Alexander, Ian E.; Sandstrom, Richard S.; Stamatoyannopoulos, John A.; Wei, Chia-Lin; Russell, David W.

    2014-01-01

    To determine which genomic features promote homologous recombination, we created a genome-wide map of gene targeting sites. An adeno-associated virus vector was used to target identical loci introduced as transcriptionally active retroviral vector proviruses. A comparison of ~2,000 targeted and untargeted sites showed that targeting occurred throughout the human genome and was not influenced by the presence of nearby CpG islands, sequence repeats, or DNase I hypersensitive sites. Targeted sites were preferentially found within transcription units, especially when the target loci were transcribed in the opposite orientation to their surrounding chromosomal genes. The impact of DNA replication was determined by mapping replication forks, which revealed a preference for recombination at target loci transcribed towards an incoming fork. Our results constitute the first genome-wide screen of gene targeting in mammalian cells, and they demonstrate a strong recombinogenic effect of colliding polymerases. PMID:25282150

  14. Mycobacterium tuberculosis EsxO (Rv2346c) promotes bacillary survival by inducing oxidative stress mediated genomic instability in macrophages.

    PubMed

    Mohanty, Soumitra; Dal Molin, Michael; Ganguli, Geetanjali; Padhi, Avinash; Jena, Prajna; Selchow, Petra; Sengupta, Srabasti; Meuli, Michael; Sander, Peter; Sonawane, Avinash

    2016-01-01

    Mycobacterium tuberculosis (Mtb) survives inside the macrophages by modulating the host immune responses in its favor. The 6-kDa early secretory antigenic target (ESAT-6; esxA) of Mtb is known as a potent virulence and T-cell antigenic determinant. At least 23 such ESAT-6 family proteins are encoded in the genome of Mtb; however, the function of many of them is still unknown. We herein report that ectopic expression of Mtb Rv2346c (esxO), a member of ESAT-6 family proteins, in non-pathogenic Mycobacterium smegmatis strain (MsmRv2346c) aids host cell invasion and intracellular bacillary persistence. Further mechanistic studies revealed that MsmRv2346c infection abated macrophage immunity by inducing host cell death and genomic instability as evident from the appearance of several DNA damage markers. We further report that the induction of genomic instability in infected cells was due to an increase in the host's oxidative stress responses. MsmRv2346c infection was also found to induce autophagy and modulate the immune function of macrophages. In contrast, blockade of Rv2346c induced oxidative stress by treatment with ROS inhibitor N-acetyl-L-cysteine prevented the host cell death, autophagy induction and genomic instability in infected macrophages. Conversely, MtbΔRv2346c mutant did not show any difference in intracellular survival and oxidative stress responses. We envision that Mtb ESAT-6 family protein Rv2346c dampens antibacterial effector functions namely by inducing oxidative stress mediated genomic instability in infected macrophages, while loss of Rv2346c gene function may be compensated by other redundant ESAT-6 family proteins. Thus EsxO plays an important role in mycobacterial pathogenesis in the context of innate immunity. PMID:26786654

  15. InvFEST, a database integrating information of polymorphic inversions in the human genome

    PubMed Central

    Martínez-Fundichely, Alexander; Casillas, Sònia; Egea, Raquel; Ràmia, Miquel; Barbadilla, Antonio; Pantano, Lorena; Puig, Marta; Cáceres, Mario

    2014-01-01

    The newest genomic advances have uncovered an unprecedented degree of structural variation throughout genomes, with great amounts of data accumulating rapidly. Here we introduce InvFEST (http://invfestdb.uab.cat), a database combining multiple sources of information to generate a complete catalogue of non-redundant human polymorphic inversions. Due to the complexity of this type of change and the underlying high false-positive discovery rate, it is necessary to integrate all the available data to get a reliable estimate of the real number of inversions. InvFEST automatically merges predictions into different inversions, refines the breakpoint locations, and finds associations with genes and segmental duplications. In addition, it includes data on experimental validation, population frequency, functional effects and evolutionary history. All this information is readily accessible through a complete and user-friendly web report for each inversion. In its current version, InvFEST combines information from 34 different studies and contains 1092 candidate inversions, which are categorized based on internal scores and manual curation. Therefore, InvFEST aims to represent the most reliable set of human inversions and become a central repository to share information, guide future studies and contribute to the analysis of the functional and evolutionary impact of inversions on the human genome. PMID:24253300

  16. Integrative functional genomics identifies regulatory mechanisms at coronary artery disease loci.

    PubMed

    Miller, Clint L; Pjanic, Milos; Wang, Ting; Nguyen, Trieu; Cohain, Ariella; Lee, Jonathan D; Perisic, Ljubica; Hedin, Ulf; Kundu, Ramendra K; Majmudar, Deshna; Kim, Juyong B; Wang, Oliver; Betsholtz, Christer; Ruusalepp, Arno; Franzén, Oscar; Assimes, Themistocles L; Montgomery, Stephen B; Schadt, Eric E; Björkegren, Johan L M; Quertermous, Thomas

    2016-01-01

    Coronary artery disease (CAD) is the leading cause of mortality and morbidity, driven by both genetic and environmental risk factors. Meta-analyses of genome-wide association studies have identified >150 loci associated with CAD and myocardial infarction susceptibility in humans. A majority of these variants reside in non-coding regions and are co-inherited with hundreds of candidate regulatory variants, presenting a challenge to elucidate their functions. Herein, we use integrative genomic, epigenomic and transcriptomic profiling of perturbed human coronary artery smooth muscle cells and tissues to begin to identify causal regulatory variation and mechanisms responsible for CAD associations. Using these genome-wide maps, we prioritize 64 candidate variants and perform allele-specific binding and expression analyses at seven top candidate loci: 9p21.3, SMAD3, PDGFD, IL6R, BMP1, CCDC97/TGFB1 and LMOD1. We validate our findings in expression quantitative trait loci cohorts, which together reveal new links between CAD associations and regulatory function in the appropriate disease context. PMID:27386823

  17. Integrative functional genomics identifies regulatory mechanisms at coronary artery disease loci

    PubMed Central

    Miller, Clint L.; Pjanic, Milos; Wang, Ting; Nguyen, Trieu; Cohain, Ariella; Lee, Jonathan D.; Perisic, Ljubica; Hedin, Ulf; Kundu, Ramendra K.; Majmudar, Deshna; Kim, Juyong B.; Wang, Oliver; Betsholtz, Christer; Ruusalepp, Arno; Franzén, Oscar; Assimes, Themistocles L.; Montgomery, Stephen B.; Schadt, Eric E.; Björkegren, Johan L.M.; Quertermous, Thomas

    2016-01-01

    Coronary artery disease (CAD) is the leading cause of mortality and morbidity, driven by both genetic and environmental risk factors. Meta-analyses of genome-wide association studies have identified >150 loci associated with CAD and myocardial infarction susceptibility in humans. A majority of these variants reside in non-coding regions and are co-inherited with hundreds of candidate regulatory variants, presenting a challenge to elucidate their functions. Herein, we use integrative genomic, epigenomic and transcriptomic profiling of perturbed human coronary artery smooth muscle cells and tissues to begin to identify causal regulatory variation and mechanisms responsible for CAD associations. Using these genome-wide maps, we prioritize 64 candidate variants and perform allele-specific binding and expression analyses at seven top candidate loci: 9p21.3, SMAD3, PDGFD, IL6R, BMP1, CCDC97/TGFB1 and LMOD1. We validate our findings in expression quantitative trait loci cohorts, which together reveal new links between CAD associations and regulatory function in the appropriate disease context. PMID:27386823

  18. An integrated map of genetic variation from 1,092 human genomes.

    PubMed

    Abecasis, Goncalo R; Auton, Adam; Brooks, Lisa D; DePristo, Mark A; Durbin, Richard M; Handsaker, Robert E; Kang, Hyun Min; Marth, Gabor T; McVean, Gil A

    2012-11-01

    By characterizing the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help to understand the genetic contribution to disease. Here we describe the genomes of 1,092 individuals from 14 populations, constructed using a combination of low-coverage whole-genome and exome sequencing. By developing methods to integrate information across several algorithms and diverse data sources, we provide a validated haplotype map of 38 million single nucleotide polymorphisms, 1.4 million short insertions and deletions, and more than 14,000 larger deletions. We show that individuals from different populations carry different profiles of rare and common variants, and that low-frequency variants show substantial geographic differentiation, which is further increased by the action of purifying selection. We show that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways, and that each individual contains hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites. This resource, which captures up to 98% of accessible single nucleotide polymorphisms at a frequency of 1% in related populations, enables analysis of common and low-frequency variants in individuals from diverse, including admixed, populations. PMID:23128226

  19. An integrated map of genetic variation from 1,092 human genomes

    PubMed Central

    2012-01-01

    Summary Through characterising the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help understand the genetic contribution to disease. We describe the genomes of 1,092 individuals from 14 populations, constructed using a combination of low-coverage whole-genome and exome sequencing. By developing methodologies to integrate information across multiple algorithms and diverse data sources we provide a validated haplotype map of 38 million SNPs, 1.4 million indels and over 14 thousand larger deletions. We show that individuals from different populations carry different profiles of rare and common variants and that low-frequency variants show substantial geographic differentiation, which is further increased by the action of purifying selection. We show that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways and that each individual harbours hundreds of rare non-coding variants at conserved sites, such as transcription-factor-motif disrupting changes. This resource, which captures up to 98% of accessible SNPs at a frequency of 1% in populations of medical genetics focus, enables analysis of common and low-frequency variants in individuals from diverse, including admixed, populations. PMID:23128226

  20. InvFEST, a database integrating information of polymorphic inversions in the human genome.

    PubMed

    Martínez-Fundichely, Alexander; Casillas, Sònia; Egea, Raquel; Ràmia, Miquel; Barbadilla, Antonio; Pantano, Lorena; Puig, Marta; Cáceres, Mario

    2014-01-01

    The newest genomic advances have uncovered an unprecedented degree of structural variation throughout genomes, with great amounts of data accumulating rapidly. Here we introduce InvFEST (http://invfestdb.uab.cat), a database combining multiple sources of information to generate a complete catalogue of non-redundant human polymorphic inversions. Due to the complexity of this type of change and the underlying high false-positive discovery rate, it is necessary to integrate all the available data to get a reliable estimate of the real number of inversions. InvFEST automatically merges predictions into different inversions, refines the breakpoint locations, and finds associations with genes and segmental duplications. In addition, it includes data on experimental validation, population frequency, functional effects and evolutionary history. All this information is readily accessible through a complete and user-friendly web report for each inversion. In its current version, InvFEST combines information from 34 different studies and contains 1092 candidate inversions, which are categorized based on internal scores and manual curation. Therefore, InvFEST aims to represent the most reliable set of human inversions and become a central repository to share information, guide future studies and contribute to the analysis of the functional and evolutionary impact of inversions on the human genome. PMID:24253300

  1. Transgene expression after rep-mediated site-specific integration into chromosome 19.

    PubMed

    Philpott, Nicola J; Gomos, Janette; Falck-Pedersen, Erik

    2004-01-01

    We have used a plasmid-based transfection model of the adeno-associated virus (AAV) Rep-mediated site-specific integration (RMSSI) pathway to characterize the stability and expression of a site-specifically integrated transgene (either green fluorescent protein [GFP] or chloramphenicol acetyltransferase [CAT]). Three plasmids containing the AAV p5 integration efficiency element (p5IEE) have been used to study integration and transgene expression in HeLa cells: (1) pRepGFP(itr+) contains both AAV ITRs, rep, and p5IEE and can be used as either a plasmid or rAAV vehicle for integration; (2) pRepGFP(itr-) contains the AAV rep gene and the p5IEE; (3) pAd-p5CAT contains only the 138-bp p5IEE of AAV. The data presented demonstrate that in the absence of drug selection, all three constructs undergo site-specific integration (efficiencies of between 10 and 40% of transduced cell lines). At 6 weeks posttransfection most cell lines that underwent RMSSI also expressed the appropriate transgene product. By 18 weeks posttransfection cell lines that were established with rep in cis to the transgene showed a decline in transgene expression as well as a loss of transgene DNA. In many cell lines, there appears to be transgene-containing DNA that does not contribute to gene expression. Data support a model of gene expression and transgene instability through a Rep-mediated pathway. In contrast to rep-containing cell lines, clonal cell lines containing p5IEECAT (with Rep provided in trans) maintained both the integrated transgene and transgene expression throughout the entire experimental time course (18 weeks). PMID:14965377

  2. Multiple proviral integration events after virological synapse-mediated HIV-1 spread

    SciTech Connect

    Russell, Rebecca A.; Martin, Nicola; Mitar, Ivonne; Jones, Emma; Sattentau, Quentin J.

    2013-08-15

    HIV-1 can move directly between T cells via virological synapses (VS). Although aspects of the molecular and cellular mechanisms underlying this mode of spread have been elucidated, the outcomes for infection of the target cell remain incompletely understood. We set out to determine whether HIV-1 transfer via VS results in productive, high-multiplicity HIV-1 infection. We found that HIV-1 cell-to-cell spread resulted in nuclear import of multiple proviruses into target cells as seen by fluorescence in-situ hybridization. Proviral integration into the target cell genome was significantly higher than that seen in a cell-free infection system, and consequent de novo viral DNA and RNA production in the target cell detected by quantitative PCR increased over time. Our data show efficient proviral integration across VS, implying the probability of multiple integration events in target cells that drive productive T cell infection. - Highlights: • Cell-to-cell HIV-1 infection delivers multiple vRNA copies to the target cell. • Cell-to-cell infection results in productive infection of the target cell. • Cell-to-cell transmission is more efficient than cell-free HIV-1 infection. • Suggests a mechanism for recombination in cells infected with multiple viral genomes.

  3. Integrative Genomic Analyses Identify BRF2 as a Novel Lineage-Specific Oncogene in Lung Squamous Cell Carcinoma

    PubMed Central

    Lockwood, William W.; Chari, Raj; Coe, Bradley P.; Thu, Kelsie L.; Garnis, Cathie; Malloff, Chad A.; Campbell, Jennifer; Williams, Ariane C.; Hwang, Dorothy; Zhu, Chang-Qi; Buys, Timon P. H.; Yee, John; English, John C.; MacAulay, Calum; Tsao, Ming-Sound; Gazdar, Adi F.; Minna, John D.; Lam, Stephen; Lam, Wan L.

    2010-01-01

    Background Traditionally, non-small cell lung cancer is treated as a single disease entity in terms of systemic therapy. Emerging evidence suggests the major subtypes, adenocarcinoma (AC) and squamous cell carcinoma (SqCC), respond differently to therapy. Identification of the molecular differences between these tumor types will have a significant impact in designing novel therapies that can improve the treatment outcome. Methods and Findings We used an integrative genomics approach, combining high-resolution comparative genomic hybridization and gene expression microarray profiles, to compare AC and SqCC tumors in order to uncover alterations at the DNA level, with corresponding gene transcription changes, which are selected for during development of lung cancer subtypes. Through the analysis of multiple independent cohorts of clinical tumor samples (>330), normal lung tissues and bronchial epithelial cells obtained by bronchial brushing in smokers without lung cancer, we identified the overexpression of BRF2, a gene on Chromosome 8p12, which is specific for development of SqCC of lung. Genetic activation of BRF2, which encodes an RNA polymerase III (Pol III) transcription initiation factor, was found to be associated with increased expression of small nuclear RNAs (snRNAs) that are involved in processes essential for cell growth, such as RNA splicing. Ectopic expression of BRF2 in human bronchial epithelial cells induced a transformed phenotype and demonstrated downstream oncogenic effects, whereas RNA interference (RNAi)-mediated knockdown suppressed growth and colony formation of SqCC cells overexpressing BRF2, but not AC cells. Frequent activation of BRF2 in >35% preinvasive bronchial carcinoma in situ, as well as in dysplastic lesions, provides evidence that BRF2 expression is an early event in cancer development of this cell lineage. Conclusions This is the first study, to our knowledge, to show that the focal amplification of a gene in Chromosome 8p12, plays

  4. metabolicMine: an integrated genomics, genetics and proteomics data warehouse for common metabolic disease research.

    PubMed

    Lyne, Mike; Smith, Richard N; Lyne, Rachel; Aleksic, Jelena; Hu, Fengyuan; Kalderimis, Alex; Stepan, Radek; Micklem, Gos

    2013-01-01

    Common metabolic and endocrine diseases such as diabetes affect millions of people worldwide and have a major health impact, frequently leading to complications and mortality. In a search for better prevention and treatment, there is ongoing research into the underlying molecular and genetic bases of these complex human diseases, as well as into the links with risk factors such as obesity. Although an increasing number of relevant genomic and proteomic data sets have become available, the quantity and diversity of the data make their efficient exploitation challenging. Here, we present metabolicMine, a data warehouse with a specific focus on the genomics, genetics and proteomics of common metabolic diseases. Developed in collaboration with leading UK metabolic disease groups, metabolicMine integrates data sets from a range of experiments and model organisms alongside tools for exploring them. The current version brings together information covering genes, proteins, orthologues, interactions, gene expression, pathways, ontologies, diseases, genome-wide association studies and single nucleotide polymorphisms. Although the emphasis is on human data, key data sets from mouse and rat are included. These are complemented by interoperation with the RatMine rat genomics database, with a corresponding mouse version under development by the Mouse Genome Informatics (MGI) group. The web interface contains a number of features including keyword search, a library of Search Forms, the QueryBuilder and list analysis tools. This provides researchers with many different ways to analyse, view and flexibly export data. Programming interfaces and automatic code generation in several languages are supported, and many of the features of the web interface are available through web services. The combination of diverse data sets integrated with analysis tools and a powerful query system makes metabolicMine a valuable research resource. The web interface makes it accessible to first

  5. metabolicMine: an integrated genomics, genetics and proteomics data warehouse for common metabolic disease research

    PubMed Central

    Lyne, Mike; Smith, Richard N; Lyne, Rachel; Aleksic, Jelena; Hu, Fengyuan; Kalderimis, Alex; Stepan, Radek; Micklem, Gos

    2013-01-01

    Common metabolic and endocrine diseases such as diabetes affect millions of people worldwide and have a major health impact, frequently leading to complications and mortality. In a search for better prevention and treatment, there is ongoing research into the underlying molecular and genetic bases of these complex human diseases, as well as into the links with risk factors such as obesity. Although an increasing number of relevant genomic and proteomic data sets have become available, the quantity and diversity of the data make their efficient exploitation challenging. Here, we present metabolicMine, a data warehouse with a specific focus on the genomics, genetics and proteomics of common metabolic diseases. Developed in collaboration with leading UK metabolic disease groups, metabolicMine integrates data sets from a range of experiments and model organisms alongside tools for exploring them. The current version brings together information covering genes, proteins, orthologues, interactions, gene expression, pathways, ontologies, diseases, genome-wide association studies and single nucleotide polymorphisms. Although the emphasis is on human data, key data sets from mouse and rat are included. These are complemented by interoperation with the RatMine rat genomics database, with a corresponding mouse version under development by the Mouse Genome Informatics (MGI) group. The web interface contains a number of features including keyword search, a library of Search Forms, the QueryBuilder and list analysis tools. This provides researchers with many different ways to analyse, view and flexibly export data. Programming interfaces and automatic code generation in several languages are supported, and many of the features of the web interface are available through web services. The combination of diverse data sets integrated with analysis tools and a powerful query system makes metabolicMine a valuable research resource. The web interface makes it accessible to first

  6. Genome scale models of yeast: towards standardized evaluation and consistent omic integration.

    PubMed

    Sánchez, Benjamín J; Nielsen, Jens

    2015-08-01

    Genome scale models (GEMs) have enabled remarkable advances in systems biology, acting as functional databases of metabolism, and as scaffolds for the contextualization of high-throughput data. In the case of Saccharomyces cerevisiae (budding yeast), several GEMs have been published and are currently used for metabolic engineering and elucidating biological interactions. Here we review the history of yeast's GEMs, focusing on recent developments. We study how these models are typically evaluated, using both descriptive and predictive metrics. Additionally, we analyze the different ways in which all levels of omics data (from gene expression to flux) have been integrated in yeast GEMs. Relevant conclusions and current challenges for both GEM evaluation and omic integration are highlighted. PMID:26079294

  7. Integrative genomic mining for enzyme function to enable engineering of a non-natural biosynthetic pathway

    PubMed Central

    Mak, Wai Shun; Tran, Stephen; Marcheschi, Ryan; Bertolani, Steve; Thompson, James; Baker, David; Liao, James C.; Siegel, Justin B.

    2015-01-01

    The ability to biosynthetically produce chemicals beyond what is commonly found in Nature requires the discovery of novel enzyme function. Here we utilize two approaches to discover enzymes that enable specific production of longer-chain (C5–C8) alcohols from sugar. The first approach combines bioinformatics and molecular modelling to mine sequence databases, resulting in a diverse panel of enzymes capable of catalysing the targeted reaction. The median catalytic efficiency of the computationally selected enzymes is 75-fold greater than a panel of naively selected homologues. This integrative genomic mining approach establishes a unique avenue for enzyme function discovery in the rapidly expanding sequence databases. The second approach uses computational enzyme design to reprogramme specificity. Both approaches result in enzymes with >100-fold increase in specificity for the targeted reaction. When enzymes from either approach are integrated in vivo, longer-chain alcohol production increases over 10-fold and represents >95% of the total alcohol products. PMID:26598135

  8. Integrative genomic mining for enzyme function to enable engineering of a non-natural biosynthetic pathway.

    PubMed

    Mak, Wai Shun; Tran, Stephen; Marcheschi, Ryan; Bertolani, Steve; Thompson, James; Baker, David; Liao, James C; Siegel, Justin B

    2015-01-01

    The ability to biosynthetically produce chemicals beyond what is commonly found in Nature requires the discovery of novel enzyme function. Here we utilize two approaches to discover enzymes that enable specific production of longer-chain (C5-C8) alcohols from sugar. The first approach combines bioinformatics and molecular modelling to mine sequence databases, resulting in a diverse panel of enzymes capable of catalysing the targeted reaction. The median catalytic efficiency of the computationally selected enzymes is 75-fold greater than a panel of naively selected homologues. This integrative genomic mining approach establishes a unique avenue for enzyme function discovery in the rapidly expanding sequence databases. The second approach uses computational enzyme design to reprogramme specificity. Both approaches result in enzymes with >100-fold increase in specificity for the targeted reaction. When enzymes from either approach are integrated in vivo, longer-chain alcohol production increases over 10-fold and represents >95% of the total alcohol products. PMID:26598135

  9. New approaches to assessing the effects of mutagenic agents on the integrity of the human genome.

    PubMed

    Elespuru, R K; Sankaranarayanan, K

    2007-03-01

    Heritable genetic alterations, although individually rare, have a substantial collective health impact. Approximately 20% of these are new mutations of unknown cause. Assessment of the effect of exposures to DNA damaging agents, i.e. mutagenic chemicals and radiations, on the integrity of the human genome and on the occurrence of genetic disease remains a daunting challenge. Recent insights may explain why previous examination of human exposures to ionizing radiation, as in Hiroshima and Nagasaki, failed to reveal heritable genetic effects. New opportunities to assess the heritable genetic damaging effects of environmental mutagens are afforded by: (1) integration of knowledge on the molecular nature of genetic disorders and the molecular effects of mutagens; (2) the development of more practical assays for germline mutagenesis; (3) the likely use of population-based genetic screening in personalized medicine. PMID:17174354

  10. CRISPR/Cas9 mediated genome editing in ES cells and its application for chimeric analysis in mice.

    PubMed

    Oji, Asami; Noda, Taichi; Fujihara, Yoshitaka; Miyata, Haruhiko; Kim, Yeon Joo; Muto, Masanaga; Nozawa, Kaori; Matsumura, Takafumi; Isotani, Ayako; Ikawa, Masahito

    2016-01-01

    Targeted gene disrupted mice can be efficiently generated by expressing a single guide RNA (sgRNA)/CAS9 complex in the zygote. However, the limited success of complicated genome editing, such as large deletions, point mutations, and knockins, remains to be improved. Further, the mosaicism in founder generations complicates the genotypic and phenotypic analyses in these animals. Here we show that large deletions with two sgRNAs as well as dsDNA-mediated point mutations are efficient in mouse embryonic stem cells (ESCs). The dsDNA-mediated gene knockins are also feasible in ESCs. Finally, we generated chimeric mice with biallelic mutant ESCs for a lethal gene, Dnajb13, and analyzed their phenotypes. Not only was the lethal phenotype of hydrocephalus suppressed, but we also found that Dnajb13 is required for sperm cilia formation. The combination of biallelic genome editing in ESCs and subsequent chimeric analysis provides a useful tool for rapid gene function analysis in the whole organism. PMID:27530713

  11. Tracing Phosphate Ions Generated during Loop-Mediated Isothermal Amplification for Electrochemical Detection of Nosema bombycis Genomic DNA PTP1.

    PubMed

    Xie, Shunbi; Yuan, Yali; Chai, Yaqin; Yuan, Ruo

    2015-10-20

    Traditionally, amplified DNA detection in a loop-mediated isothermal amplification (LAMP) was carried out in a complicated gel electrophoresis or with expensive fluorescence-based methods. Here, instead of direct detection that relies on amplified DNA, the indirect detection based on tracing phosphate ions (Pi) generated during LAMP by using an electrochemical method has been proposed for sensitive nucleic acid detection. Pyrophosphate (PPi) as the byproduct of nucleic acid polymerization reaction in LAMP was hydrolyzed into Pi by the preaddition of thermostable inorganic pyrophosphatase (PPase). Thus, the total amount of Pi in the LAMP-amplified sample was proportional to the amount of starting DNA templates. The obtained Pi could then react with acidic molybdate to form the molybdophosphate precipitates on the electrode surface, which serve as redox mediators to give a readily measurable electrochemical signal. The practicality of this strategy has been further demonstrated by employing it for sensitive and accurate quantification of Nosema bombycis genomic DNA PTP1. The electrochemical method allowed the quantitative analysis for target genomic DNA with a detection limit of 17 fg/μL. Thus, we suppose that the novel method proposed in this work with superior sensitivity and specificity, as well as the simple feature, can be easily established for quantitative analysis of many other kinds of nucleic acids in the assistance of LAMP. PMID:26412581

  12. CRISPR/Cas9 mediated genome editing in ES cells and its application for chimeric analysis in mice

    PubMed Central

    Oji, Asami; Noda, Taichi; Fujihara, Yoshitaka; Miyata, Haruhiko; Kim, Yeon Joo; Muto, Masanaga; Nozawa, Kaori; Matsumura, Takafumi; Isotani, Ayako; Ikawa, Masahito

    2016-01-01

    Targeted gene disrupted mice can be efficiently generated by expressing a single guide RNA (sgRNA)/CAS9 complex in the zygote. However, the limited success of complicated genome editing, such as large deletions, point mutations, and knockins, remains to be improved. Further, the mosaicism in founder generations complicates the genotypic and phenotypic analyses in these animals. Here we show that large deletions with two sgRNAs as well as dsDNA-mediated point mutations are efficient in mouse embryonic stem cells (ESCs). The dsDNA-mediated gene knockins are also feasible in ESCs. Finally, we generated chimeric mice with biallelic mutant ESCs for a lethal gene, Dnajb13, and analyzed their phenotypes. Not only was the lethal phenotype of hydrocephalus suppressed, but we also found that Dnajb13 is required for sperm cilia formation. The combination of biallelic genome editing in ESCs and subsequent chimeric analysis provides a useful tool for rapid gene function analysis in the whole organism. PMID:27530713

  13. Microenvironmental Heterogeneity Parallels Breast Cancer Progression: A Histology–Genomic Integration Analysis

    PubMed Central

    Natrajan, Rachael; Sailem, Heba; Mardakheh, Faraz K.; Arias Garcia, Mar; Tape, Christopher J.; Dowsett, Mitch; Bakal, Chris; Yuan, Yinyin

    2016-01-01

    Background The intra-tumor diversity of cancer cells is under intense investigation; however, little is known about the heterogeneity of the tumor microenvironment that is key to cancer progression and evolution. We aimed to assess the degree of microenvironmental heterogeneity in breast cancer and correlate this with genomic and clinical parameters. Methods and Findings We developed a quantitative measure of microenvironmental heterogeneity along three spatial dimensions (3-D) in solid tumors, termed the tumor ecosystem diversity index (EDI), using fully automated histology image analysis coupled with statistical measures commonly used in ecology. This measure was compared with disease-specific survival, key mutations, genome-wide copy number, and expression profiling data in a retrospective study of 510 breast cancer patients as a test set and 516 breast cancer patients as an independent validation set. In high-grade (grade 3) breast cancers, we uncovered a striking link between high microenvironmental heterogeneity measured by EDI and a poor prognosis that cannot be explained by tumor size, genomics, or any other data types. However, this association was not observed in low-grade (grade 1 and 2) breast cancers. The prognostic value of EDI was superior to known prognostic factors and was enhanced with the addition of TP53 mutation status (multivariate analysis test set, p = 9 × 10⁻⁴, hazard ratio = 1.47, 95% CI 1.17–1.84; validation set, p = 0.0011, hazard ratio = 1.78, 95% CI 1.26–2.52). Integration with genome-wide profiling data identified losses of specific genes on 4p14 and 5q13 that were enriched in grade 3 tumors with high microenvironmental diversity that also substratified patients into poor prognostic groups. Limitations of this study include the number of cell types included in the model, that EDI has prognostic value only in grade 3 tumors, and that our spatial heterogeneity measure was dependent on spatial scale and tumor size. Conclusions To

  14. Purdue Ionomics Information Management System. An Integrated Functional Genomics Platform

    PubMed Central

    Baxter, Ivan; Ouzzani, Mourad; Orcun, Seza; Kennedy, Brad; Jandhyala, Shrinivas S.; Salt, David E.

    2007-01-01

    The advent of high-throughput phenotyping technologies has created a deluge of information that is difficult to deal with without the appropriate data management tools. These data management tools should integrate defined workflow controls for genomic-scale data acquisition and validation, data storage and retrieval, and data analysis, indexed around the genomic information of the organism of interest. To maximize the impact of these large datasets, it is critical that they are rapidly disseminated to the broader research community, allowing open access for data mining and discovery. We describe here a system that incorporates such functionalities developed around the Purdue University high-throughput ionomics phenotyping platform. The Purdue Ionomics Information Management System (PiiMS) provides integrated workflow control, data storage, and analysis to facilitate high-throughput data acquisition, along with integrated tools for data search, retrieval, and visualization for hypothesis development. PiiMS is deployed as a World Wide Web-enabled system, allowing for integration of distributed workflow processes and open access to raw data for analysis by numerous laboratories. PiiMS currently contains data on shoot concentrations of P, Ca, K, Mg, Cu, Fe, Zn, Mn, Co, Ni, B, Se, Mo, Na, As, and Cd in over 60,000 shoot tissue samples of Arabidopsis (Arabidopsis thaliana), including ethyl methanesulfonate, fast-neutron and defined T-DNA mutants, and natural accession and populations of recombinant inbred lines from over 800 separate experiments, representing over 1,000,000 fully quantitative elemental concentrations. PiiMS is accessible at www.purdue.edu/dp/ionomics. PMID:17189337

  15. The structure of adenovirus type 12 DNA integration sites in the hamster cell genome.

    PubMed Central

    Knoblauch, M; Schröer, J; Schmitz, B; Doerfler, W

    1996-01-01

    Foreign DNA can integrate into the genomes of mammalian cells, and this process plays major roles in viral oncogenesis and in the generation of transgenic organisms and will be important in evolving regimens for human somatic gene therapy. In the present study, the insertion sites of adenovirus type 12 (Ad12) DNA genomes have been analyzed in detail in the Ad12-transformed hamster cell line T637, its revertants, which have lost most of the >20 Ad12 genome equivalents integrated chromosomally in cell line T637, and in the Ad12-induced tumor T191. Some of these junction sites have been molecularly cloned, and the nucleotide sequences at the sites of transition between viral and cellular DNAs have been determined. The sites of linkage between the hamster cellular and the foreign (viral) DNA are characterized by the frequent occurrence of patch homologies between the recombination partners. The cellular junction sites investigated here are not transcriptionally active. One of the cellular DNA sequences abutting the right Ad12 DNA terminus in cell line T637 (os2) is represented only once in the hamster genome and has a strikingly low abundance of 5'-CG-3' dinucleotide sequences. One 5'-GCGC-3' sequence close to the Ad12 DNA integration site is heavily methylated in normal cells, Ad12-transformed cells, and Ad12-induced tumor cells. The second such sequence is more remote from the junction site, is partly methylated in BHK21 hamster cells, and shows differences in methylation in different Ad12-transformed cell lines. This site is unmethylated in liver DNA. The cellular DNA sequence at the site of Ad12 linkage in the tumor T191 exhibits homologies to highly repetitive sequences of the Alu family and to an origin of hamster DNA replication containing an Alu element. A number of junction sites between Ad12 DNA and hamster or mouse DNA in Ad12-transformed cell lines or Ad12-induced tumor cell lines, investigated here and previously, are characterized by stem-loop structures

  16. Bridging the Gap from Bench to Bedside--An Informatics Infrastructure for Integrating Clinical, Genomics and Environmental Data (ICGED).

    PubMed

    2015-01-01

    The abundance of heterogeneous biomedical data from a variety of sources demands the development of strategies to address data integration and management issues, so that the data can be used effectively in clinical practices and biomedical research. This research presents an Informatics Infrastructure for Integrating Clinical, Genomics and Environmental Data (ICGED) and provides a roadmap that envisions utilizing the clinical and biomedical resources in our case study. This work describes a data integration approach, proposed by ICGED, with a two-fold purpose: personalized medicine and biomedical data storage and sharing platform. It describes our experiences integrating disease specific clinical and genomics datasets with Data Integration and Analysis Tools (DIAT)--using Informatics for Integrating Biology and the Bedside, and discusses work in progress and future work for extending DIAT, and the development of Risk Assessment and Prediction Tools, Clinical Decision Support Systems and a Bioinformatics Data Warehouse. PMID:26262353

  17. Integrated high-throughput analysis identifies Sp1 as a crucial determinant of p53-mediated apoptosis

    PubMed Central

    Li, H; Zhang, Y; Ströse, A; Tedesco, D; Gurova, K; Selivanova, G

    2014-01-01

    The restoration of p53 tumor suppressor function is a promising therapeutic strategy to combat cancer. However, the biological outcomes of p53 activation, ranging from the promotion of growth arrest to the induction of cell death, are hard to predict, which limits the clinical application of p53-based therapies. In the present study, we performed an integrated analysis of genome-wide short hairpin RNA screen and gene expression data and uncovered a previously unrecognized role of Sp1 as a central modulator of the transcriptional response induced by p53 that leads to robust induction of apoptosis. Sp1 is indispensable for the pro-apoptotic transcriptional repression by p53, but not for the induction of pro-apoptotic genes. Furthermore, the p53-dependent pro-apoptotic transcriptional repression required the co-binding of Sp1 to p53 target genes. Our results also highlight that Sp1 shares with p53 a common regulator, MDM2, which targets Sp1 for proteasomal degradation. This uncovers a new mechanism of the tight control of apoptosis in cells. Our study advances the understanding of the molecular basis of p53-mediated apoptosis and implicates Sp1 as one of its key modulators. We found that small molecules reactivating p53 can differentially modulate Sp1, thus providing insights into how to manipulate p53 response in a controlled way. PMID:24971482

  18. Shade avoidance 6 encodes an Arabidopsis flap endonuclease required for maintenance of genome integrity and development.

    PubMed

    Zhang, Yijuan; Wen, Chunhong; Liu, Songbai; Zheng, Li; Shen, Binghui; Tao, Yi

    2016-02-18

    Flap endonuclease-1 (FEN1) belongs to the Rad2 family of structure-specific nucleases. It is required for several DNA metabolic pathways, including DNA replication and DNA damage repair. Here, we have identified a shade avoidance mutant, sav6, which reduces the mRNA splicing efficiency of SAV6. We have demonstrated that SAV6 is an FEN1 homologue that shows double-flap endonuclease and gap-dependent endonuclease activity, but lacks exonuclease activity. sav6 mutants are hypersensitive to DNA damage induced by ultraviolet (UV)-C radiation and reagents that induce double-stranded DNA breaks, but exhibit normal responses to chemicals that block DNA replication. Signalling components that respond to DNA damage are constitutively activated in sav6 mutants. These data indicate that SAV6 is required for DNA damage repair and the maintenance of genome integrity. Mutant sav6 plants also show reduced root apical meristem (RAM) size and defective quiescent centre (QC) development. The expression of SMR7, a cell cycle regulatory gene, and ERF115 and PSK5, regulators of QC division, is increased in sav6 mutants. Their constitutive induction is likely due to the elevated DNA damage responses in sav6 and may lead to defects in the development of the RAM and QC. Therefore, SAV6 assures proper root development through maintenance of genome integrity. PMID:26721386

  19. Shade avoidance 6 encodes an Arabidopsis flap endonuclease required for maintenance of genome integrity and development

    PubMed Central

    Zhang, Yijuan; Wen, Chunhong; Liu, Songbai; Zheng, Li; Shen, Binghui; Tao, Yi

    2016-01-01

    Flap endonuclease-1 (FEN1) belongs to the Rad2 family of structure-specific nucleases. It is required for several DNA metabolic pathways, including DNA replication and DNA damage repair. Here, we have identified a shade avoidance mutant, sav6, which reduces the mRNA splicing efficiency of SAV6. We have demonstrated that SAV6 is an FEN1 homologue that shows double-flap endonuclease and gap-dependent endonuclease activity, but lacks exonuclease activity. sav6 mutants are hypersensitive to DNA damage induced by ultraviolet (UV)-C radiation and reagents that induce double-stranded DNA breaks, but exhibit normal responses to chemicals that block DNA replication. Signalling components that respond to DNA damage are constitutively activated in sav6 mutants. These data indicate that SAV6 is required for DNA damage repair and the maintenance of genome integrity. Mutant sav6 plants also show reduced root apical meristem (RAM) size and defective quiescent centre (QC) development. The expression of SMR7, a cell cycle regulatory gene, and ERF115 and PSK5, regulators of QC division, is increased in sav6 mutants. Their constitutive induction is likely due to the elevated DNA damage responses in sav6 and may lead to defects in the development of the RAM and QC. Therefore, SAV6 assures proper root development through maintenance of genome integrity. PMID:26721386

  20. An Integrative Genomic Study Implicates the Postsynaptic Density in the Pathogenesis of Bipolar Disorder.

    PubMed

    Akula, Nirmala; Wendland, Jens R; Choi, Kwang H; McMahon, Francis J

    2016-02-01

    Genome-wide association studies (GWAS) have identified several common variants associated with bipolar disorder (BD), but the biological meaning of these findings remains unclear. Integrative genomics (the integration of GWAS signals with gene expression data) may illuminate genes and gene networks that have key roles in the pathogenesis of BD. We applied weighted gene co-expression network analysis (WGCNA), which exploits patterns of co-expression among genes, to brain transcriptome data obtained by sequencing of poly-A RNA derived from postmortem dorsolateral prefrontal cortex from people with BD, along with age- and sex-matched controls. WGCNA identified 33 gene modules. Many of the modules corresponded closely to those previously reported in human cortex. Three modules were associated with BD, enriched for genes differentially expressed in BD, and also enriched for signals in prior GWAS of BD. Functional analysis of genes within these modules revealed significant enrichment of several functionally related sets of genes, especially those involved in the postsynaptic density (PSD). These results provide convergent support for the hypothesis that dysregulation of genes involved in the PSD is a key factor in the pathogenesis of BD. If replicated in larger samples, these findings could point toward new therapeutic targets for BD. PMID:26211730

  1. Gene organization and transcription of TED, a lepidopteran retrotransposon integrated within the baculovirus genome.

    PubMed Central

    Friesen, P D; Nissen, M S

    1990-01-01

    A single copy of the retrotransposon TED, from the moth Trichoplusia ni (a lepidopteran noctuid), was identified within the DNA genome of the baculovirus Autographa californica nuclear polyhedrosis virus. Determination of the complete nucleotide sequence (7,510 base pairs) of the integrated copy indicated that TED belongs to the family of retrotransposons that includes Drosophila melanogaster elements 17.6 and gypsy and thus represents the first nondipteran member of this invertebrate group to be identified. The internal portion of TED, flanked by long terminal repeats (LTRs), is composed of three long open reading frames comparable in size and location to the gag, pol, and env genes of the vertebrate retroviruses. Sequence similarity with the dipteran elements was the highest within individual domains of TED open reading frame 2 (pol region) that are also conserved among the retroviruses and encode protease, reverse transcriptase, and integrase functions, respectively. Mapping the 5' and 3' termini of TED RNAs indicated that the LTRs have a retroviral U3-R-U5 structural organization that is capable of directing the synthesis of transcripts that represent potential substrates for reverse transcription and intermediates in transposition. Abundant RNAs were also initiated from a site within the 5' LTR that matches the consensus motif for the promoter of late, hyperexpressed baculovirus genes. The presence of this viruslike promoter within TED and its subsequent activation only after integration within the viral genome suggest a possible symbiotic relationship with the baculovirus that could extend transposon host range. PMID:1692964

  2. Integrative Genomics-Based Discovery of Novel Regulators of the Innate Antiviral Response

    PubMed Central

    van der Lee, Robin; ter Horst, Rob; Szklarczyk, Radek; Netea, Mihai G.; Andeweg, Arno C.; van Kuppeveld, Frank J. M.; Huynen, Martijn A.

    2015-01-01

    The RIG-I-like receptor (RLR) pathway is essential for detecting cytosolic viral RNA to trigger the production of type I interferons (IFNα/β) that initiate an innate antiviral response. Through systematic assessment of a wide variety of genomics data, we discovered 10 molecular signatures of known RLR pathway components that collectively predict novel members. We demonstrate that RLR pathway genes, among others, tend to evolve rapidly, interact with viral proteins, contain a limited set of protein domains, are regulated by specific transcription factors, and form a tightly connected interaction network. Using a Bayesian approach to integrate these signatures, we propose likely novel RLR regulators. RNAi knockdown experiments revealed a high prediction accuracy, identifying 94 genes among 187 candidates tested (~50%) that affected viral RNA-induced production of IFNβ. The discovered antiviral regulators may participate in a wide range of processes that highlight the complexity of antiviral defense (e.g. MAP3K11, CDK11B, PSMA3, TRIM14, HSPA9B, CDC37, NUP98, G3BP1), and include uncharacterized factors (DDX17, C6orf58, C16orf57, PKN2, SNW1). Our validated RLR pathway list (http://rlr.cmbi.umcn.nl/), obtained using a combination of integrative genomics and experiments, is a new resource for innate antiviral immunity research. PMID:26485378
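
    One common way to integrate several signatures in a Bayesian manner is to multiply per-signature likelihood ratios into the prior odds. The sketch below illustrates that idea only; the abstract does not specify the model, so the conditional-independence (naive Bayes) assumption, the prior and the likelihood ratios are all hypothetical.

```python
import math

def posterior_probability(prior, likelihood_ratios):
    """Posterior probability from a prior and independent likelihood ratios.

    Works in log-odds space: posterior odds = prior odds * product of LRs.
    Assumes the signatures are conditionally independent (an assumption of
    this sketch, not a statement about the published method).
    """
    log_odds = math.log(prior / (1.0 - prior))
    log_odds += sum(math.log(lr) for lr in likelihood_ratios)
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical values: 1% of genes are pathway members a priori, and a
# candidate gene scores positive on three signatures with these LRs.
prior = 0.01
candidate_lrs = [4.0, 2.5, 3.0]
p = posterior_probability(prior, candidate_lrs)
print(round(p, 4))
```

    Ranking genes by such a posterior is one way to arrive at a shortlist of candidates for experimental follow-up, such as the RNAi knockdown screen described in the abstract.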

  3. Integrative genome analysis reveals an oncomir/oncogene cluster regulating glioblastoma survivorship.

    PubMed

    Kim, Hyunsoo; Huang, Wei; Jiang, Xiuli; Pennicooke, Brenton; Park, Peter J; Johnson, Mark D

    2010-02-01

    Using a multidimensional genomic data set on glioblastoma from The Cancer Genome Atlas, we identified hsa-miR-26a as a cooperating component of a frequently occurring amplicon that also contains CDK4 and CENTG1, two oncogenes that regulate the RB1 and PI3 kinase/AKT pathways, respectively. By integrating DNA copy number, mRNA, microRNA, and DNA methylation data, we identified functionally relevant targets of miR-26a in glioblastoma, including PTEN, RB1, and MAP3K2/MEKK2. We demonstrate that miR-26a alone can transform cells and it promotes glioblastoma cell growth in vitro and in the mouse brain by decreasing PTEN, RB1, and MAP3K2/MEKK2 protein expression, thereby increasing AKT activation, promoting proliferation, and decreasing c-JUN N-terminal kinase-dependent apoptosis. Overexpression of miR-26a in PTEN-competent and PTEN-deficient glioblastoma cells promoted tumor growth in vivo, and it further increased growth in cells overexpressing CDK4 or CENTG1. Importantly, glioblastoma patients harboring this amplification displayed markedly decreased survival. Thus, hsa-miR-26a, CDK4, and CENTG1 comprise a functionally integrated oncomir/oncogene DNA cluster that promotes aggressiveness in human cancers by cooperatively targeting the RB1, PI3K/AKT, and JNK pathways. PMID:20080666

  4. A novel integrated cytogenetic and genomic classification refines risk stratification in pediatric acute lymphoblastic leukemia.

    PubMed

    Moorman, Anthony V; Enshaei, Amir; Schwab, Claire; Wade, Rachel; Chilton, Lucy; Elliott, Alannah; Richardson, Stacey; Hancock, Jeremy; Kinsey, Sally E; Mitchell, Christopher D; Goulden, Nicholas; Vora, Ajay; Harrison, Christine J

    2014-08-28

    Recent genomic studies have provided a refined genetic map of acute lymphoblastic leukemia (ALL) and increased the number of potential prognostic markers. Therefore, we integrated copy-number alteration data from the 8 most commonly deleted genes, subordinately, with established chromosomal abnormalities to derive a 2-tier genetic classification. The classification was developed using 809 ALL97/99 patients and validated using 742 United Kingdom (UK)ALL2003 patients. Good-risk (GR) genetic features included ETV6-RUNX1, high hyperdiploidy, normal copy-number status for all 8 genes, isolated deletions affecting ETV6/PAX5/BTG1, and ETV6 deletions with a single additional deletion of BTG1/PAX5/CDKN2A/B. All other genetic features were classified as poor risk (PR). Three-quarters of UKALL2003 patients had a GR genetic profile and a significantly improved event-free survival (EFS) (94%) compared with patients with a PR genetic profile (79%). This difference was driven by a lower relapse rate (4% vs 17%), was seen across all patient subgroups, and was independent of other risk factors. Even genetic GR patients with minimal residual disease (>0.01%) at day 29 had an EFS in excess of 90%. In conclusion, the integration of genomic and cytogenetic data defines 2 subgroups with distinct responses to treatment and identifies a large subset of children suitable for treatment deintensification. PMID:24957142

  5. Drug-target interaction prediction by integrating chemical, genomic, functional and pharmacological data.

    PubMed

    Yang, Fan; Xu, Jinbo; Zeng, Jianyang

    2014-01-01

    In silico prediction of unknown drug-target interactions (DTIs) has become a popular tool for drug repositioning and drug development. A key challenge in DTI prediction lies in integrating multiple types of data for accurate DTI prediction. Although recent studies have demonstrated that genomic, chemical and pharmacological data can provide reliable information for DTI prediction, it remains unclear whether functional information on proteins can also contribute to this task. Little work has been developed to combine such information with other data to identify new interactions between drugs and targets. In this paper, we introduce functional data into DTI prediction and construct biological space for targets using the functional similarity measure. We present a probabilistic graphical model, called conditional random field (CRF), to systematically integrate genomic, chemical, functional and pharmacological data plus the topology of DTI networks into a unified framework to predict missing DTIs. Tests on two benchmark datasets show that our method can achieve excellent prediction performance with the area under the precision-recall curve (AUPR) up to 94.9. These results demonstrate that our CRF model can successfully exploit heterogeneous data to capture the latent correlations of DTIs, and thus will be practically useful for drug repositioning. Supplementary Material is available at http://iiis.tsinghua.edu.cn/~compbio/papers/psb2014/psb2014_sm.pdf. PMID:24297542

  6. An integrated framework for discovery and genotyping of genomic variants from high-throughput sequencing experiments.

    PubMed

    Duitama, Jorge; Quintero, Juan Camilo; Cruz, Daniel Felipe; Quintero, Constanza; Hubmann, Georg; Foulquié-Moreno, Maria R; Verstrepen, Kevin J; Thevelein, Johan M; Tohme, Joe

    2014-04-01

    Recent advances in high-throughput sequencing (HTS) technologies and computing capacity have produced unprecedented amounts of genomic data that have unraveled the genetics of phenotypic variability in several species. However, operating and integrating current software tools for data analysis still require important investments in highly skilled personnel. Developing accurate, efficient and user-friendly software packages for HTS data analysis will lead to a more rapid discovery of genomic elements relevant to medical, agricultural and industrial applications. We therefore developed Next-Generation Sequencing Eclipse Plug-in (NGSEP), a new software tool for integrated, efficient and user-friendly detection of single nucleotide variants (SNVs), indels and copy number variants (CNVs). NGSEP includes modules for read alignment, sorting, merging, functional annotation of variants, filtering and quality statistics. Analysis of sequencing experiments in yeast, rice and human samples shows that NGSEP has superior accuracy and efficiency, compared with currently available packages for variants detection. We also show that only a comprehensive and accurate identification of repeat regions and CNVs allows researchers to properly separate SNVs from differences between copies of repeat elements. We expect that NGSEP will become a strong support tool to empower the analysis of sequencing data in a wide range of research projects on different species. PMID:24413664

  8. Structural Maintenance of Chromosome (SMC) Proteins Link Microtubule Stability to Genome Integrity

    PubMed Central

    Laflamme, Guillaume; Tremblay-Boudreault, Thierry; Roy, Marc-André; Andersen, Parker; Bonneil, Éric; Atchia, Kaleem; Thibault, Pierre; D'Amours, Damien; Kwok, Benjamin H.

    2014-01-01

    Structural maintenance of chromosome (SMC) proteins are key organizers of chromosome architecture and are essential for genome integrity. They act by binding to chromatin and connecting distinct parts of chromosomes together. Interestingly, their potential role in providing connections between chromatin and the mitotic spindle has not been explored. Here, we show that yeast SMC proteins bind directly to microtubules and can provide a functional link between microtubules and DNA. We mapped the microtubule-binding region of Smc5 and generated a mutant with impaired microtubule binding activity. This mutant is viable in yeast but exhibited a cold-specific conditional lethality associated with mitotic arrest, aberrant spindle structures, and chromosome segregation defects. In an in vitro reconstitution assay, this Smc5 mutant also showed a compromised ability to protect microtubules from cold-induced depolymerization. Collectively, these findings demonstrate that SMC proteins can bind to and stabilize microtubules and that SMC-microtubule interactions are essential to establish a robust system to maintain genome integrity. PMID:25135640

  9. A Genetic Response Score for Hydrochlorothiazide Use: Insights From Genomics and Metabolomics Integration.

    PubMed

    Shahin, Mohamed H; Gong, Yan; McDonough, Caitrin W; Rotroff, Daniel M; Beitelshees, Amber L; Garrett, Timothy J; Gums, John G; Motsinger-Reif, Alison; Chapman, Arlene B; Turner, Stephen T; Boerwinkle, Eric; Frye, Reginald F; Fiehn, Oliver; Cooper-DeHoff, Rhonda M; Kaddurah-Daouk, Rima; Johnson, Julie A

    2016-09-01

    Hydrochlorothiazide is among the most commonly prescribed antihypertensives; yet, <50% of hydrochlorothiazide-treated patients achieve blood pressure (BP) control. Herein, we integrated metabolomic and genomic profiles of hydrochlorothiazide-treated patients to identify novel genetic markers associated with hydrochlorothiazide BP response. The primary analysis included 228 white hypertensives treated with hydrochlorothiazide from the Pharmacogenomic Evaluation of Antihypertensive Responses (PEAR) study. Genome-wide analysis was conducted using Illumina Omni 1M-Quad Chip, and untargeted metabolomics was performed on baseline fasting plasma samples using a gas chromatography-time-of-flight mass spectrometry platform. We found 13 metabolites significantly associated with hydrochlorothiazide systolic BP (SBP) and diastolic BP (DBP) responses (false discovery rate, <0.05). In addition, integrating genomic and metabolomic data revealed 3 polymorphisms (rs2727563 PRKAG2, rs12604940 DCC, and rs13262930 EPHX2) along with arachidonic acid, converging in the netrin signaling pathway (P=1×10(-5)), as potential markers, significantly influencing hydrochlorothiazide BP response. We successfully replicated the 3 genetic signals in 212 white hypertensives treated with hydrochlorothiazide and created a response score by summing their BP-lowering alleles. We found patients carrying 1 response allele had a significantly lower response than carriers of 6 alleles (∆SBP/∆DBP: -1.5/1.2 versus -16.3/-10.4 mm Hg, respectively, SBP score, P=1×10(-8) and DBP score, P=3×10(-9)). This score explained 11.3% and 11.9% of the variability in hydrochlorothiazide SBP and DBP responses, respectively, and was further validated in another independent study of 196 whites treated with hydrochlorothiazide (DBP score, P=0.03; SBP score, P=0.07). This study suggests that PRKAG2, DCC, and EPHX2 might be important determinants of hydrochlorothiazide BP response. PMID:27381900
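
    The response score described above is an allele count, which can be sketched in a few lines. The three SNP identifiers come from the abstract; the input format (a count of 0, 1 or 2 BP-lowering alleles per SNP) and the example patients are assumptions made for illustration, and the abstract does not say which allele at each SNP is the BP-lowering one.

```python
# SNP identifiers as listed in the abstract (PRKAG2, DCC, EPHX2).
SNPS = ("rs2727563", "rs12604940", "rs13262930")

def response_score(genotype):
    """Sum the number of BP-lowering alleles (0, 1 or 2) across the three SNPs.

    `genotype` maps each SNP id to its count of BP-lowering alleles, so the
    score ranges from 0 to 6.
    """
    score = 0
    for snp in SNPS:
        count = genotype[snp]
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: allele count must be 0, 1 or 2, got {count!r}")
        score += count
    return score

# Hypothetical patients at the two extremes compared in the abstract.
one_allele = {"rs2727563": 1, "rs12604940": 0, "rs13262930": 0}
six_alleles = {"rs2727563": 2, "rs12604940": 2, "rs13262930": 2}
print(response_score(one_allele), response_score(six_alleles))
```

    An unweighted sum treats every allele as contributing equally; a weighted score using per-SNP effect sizes would be an alternative, but the abstract describes only the simple sum.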

  10. Oligonucleotide-Mediated Genome Editing Provides Precision and Function to Engineered Nucleases and Antibiotics in Plants.

    PubMed

    Sauer, Noel J; Narváez-Vásquez, Javier; Mozoruk, Jerry; Miller, Ryan B; Warburg, Zachary J; Woodward, Melody J; Mihiret, Yohannes A; Lincoln, Tracey A; Segami, Rosa E; Sanders, Steven L; Walker, Keith A; Beetham, Peter R; Schöpke, Christian R; Gocal, Greg F W

    2016-04-01

    Here, we report a form of oligonucleotide-directed mutagenesis for precision genome editing in plants that uses single-stranded oligonucleotides (ssODNs) to precisely and efficiently generate genome edits at DNA strand lesions made by DNA double strand break reagents. Employing a transgene model in Arabidopsis (Arabidopsis thaliana), we obtained a high frequency of precise targeted genome edits when ssODNs were introduced into protoplasts that were pretreated with the glycopeptide antibiotic phleomycin, a nonspecific DNA double strand breaker. Simultaneous delivery of ssODN and a site-specific DNA double strand breaker, either transcription activator-like effector nucleases (TALENs) or clustered, regularly interspaced, short palindromic repeats (CRISPR/Cas9), resulted in a much greater targeted genome-editing frequency compared with treatment with DNA double strand-breaking reagents alone. Using this site-specific approach, we applied the combination of ssODN and CRISPR/Cas9 to develop an herbicide tolerance trait in flax (Linum usitatissimum) by precisely editing the 5'-ENOLPYRUVYLSHIKIMATE-3-PHOSPHATE SYNTHASE (EPSPS) genes. EPSPS edits occurred at sufficient frequency that we could regenerate whole plants from edited protoplasts without employing selection. These plants were subsequently determined to be tolerant to the herbicide glyphosate in greenhouse spray tests. Progeny (C1) of these plants showed the expected Mendelian segregation of EPSPS edits. Our findings show the enormous potential of using a genome-editing platform for precise, reliable trait development in crop plants. PMID:26864017

  11. Fitness Cost Implications of PhiC31-Mediated Site-Specific Integrations in Target-Site Strains of the Mexican Fruit Fly, Anastrepha ludens (Diptera: Tephritidae)

    PubMed Central

    Meza, José S.; Díaz-Fleischer, Francisco; Sánchez-Velásquez, Lázaro R.; Zepeda-Cisneros, Cristina Silvia; Handler, Alfred M.; Schetelig, Marc F.

    2014-01-01

    Site-specific recombination technologies are powerful new tools for the manipulation of genomic DNA in insects that can improve transgenesis strategies such as targeting transgene insertions, allowing transgene cassette exchange and DNA mobilization for transgene stabilization. However, understanding the fitness cost implications of these manipulations for transgenic strain applications is critical. In this study independent piggyBac-mediated attP target-sites marked with DsRed were created in several genomic positions in the Mexican fruit fly, Anastrepha ludens. Two of these strains, one having an autosomal (attP_F7) and the other a Y-linked (attP_2-M6y) integration, exhibited fitness parameters (dynamic demography and sexual competitiveness) similar to wild type flies. These strains were thus selected for targeted insertion using, for the first time in mexfly, the phiC31-integrase recombination system to insert an additional EGFP-marked transgene to determine its effect on host strain fitness. Fitness tests showed that the integration event in the int_2-M6y recombinant strain had no significant effect, while the int_F7 recombinant strain exhibited significantly lower fitness relative to the original attP_F7 target-site host strain. These results indicate that while targeted transgene integrations can be achieved without an additional fitness cost, at some genomic positions insertion of additional DNA into a previously integrated transgene can have a significant negative effect. Thus, for targeted transgene insertions fitness costs must be evaluated both previous to and subsequent to new site-specific insertions in the target-site strain. PMID:25303238

  12. OncDRS: An integrative clinical and genomic data platform for enabling translational research and precision medicine

    PubMed Central

    Orechia, John; Pathak, Ameet; Shi, Yunling; Nawani, Aniket; Belozerov, Andrey; Fontes, Caitlin; Lakhiani, Camille; Jawale, Chetan; Patel, Chetansharan; Quinn, Daniel; Botvinnik, Dmitry; Mei, Eddie; Cotter, Elizabeth; Byleckie, James; Ullman-Cullere, Mollie; Chhetri, Padam; Chalasani, Poornima; Karnam, Purushotham; Beaudoin, Ronald; Sahu, Sandeep; Belozerova, Yelena; Mathew, Jomol P.

    2015-01-01

    We live in the genomic era of medicine, where a patient's genomic/molecular data is becoming increasingly important for disease diagnosis, identification of targeted therapy, and risk assessment for adverse reactions. However, decoding the genomic test results and integrating it with clinical data for retrospective studies and cohort identification for prospective clinical trials is still a challenging task. In order to overcome these barriers, we developed an overarching enterprise informatics framework for translational research and personalized medicine called Synergistic Patient and Research Knowledge Systems (SPARKS) and a suite of tools called Oncology Data Retrieval Systems (OncDRS). OncDRS enables seamless data integration, secure and self-navigated query and extraction of clinical and genomic data from heterogeneous sources. Within a year of release, the system has facilitated more than 1500 research queries and has delivered data for more than 50 research studies. PMID:27054074

  13. [Six-layer structure for genomics: A powerful conceptual framework that integrates research, test and treatment].

    PubMed

    Kamatani, Naoyuki

    2016-04-01

    Genomic data are now available from various fields of medicine and biology, and I proposed a conceptual framework, the "Six-layer structure", to integrate various areas of genomics. The proposed layers are "life" as the uppermost layer, followed by "species", "population", "family", "individual", and finally "cell" as the bottommost layer. In each pair of adjacent layers, each member of the upper layer comprises a set of members of the lower layer. In each layer, we can define consistent partial orders of members based on genomic data in the forms of phylogenic and pedigree trees. Based on this framework, we can give integrated explanations for various research studies, tests and drug therapies concerning genomic data, and we can use this framework for new discoveries as well as for the development of new tests and drugs. PMID:27013629

  14. A Novel Ty1-Mediated Fragmentation Method for Native and Artificial Yeast Chromosomes Reveals That the Mouse Steel Gene Is a Hotspot for Ty1 Integration

    PubMed Central

    Dalgaard, J. Z.; Banerjee, M.; Curcio, M. J.

    1996-01-01

    We have developed a powerful new tool for the physical analysis of genomes called Ty1-mediated chromosomal fragmentation and have used the method to map 24 retrotransposon insertions into two different mouse-derived yeast artificial chromosomes (YACs). Expression of a plasmid-encoded GAL1:Ty1 fusion element marked with the retrotransposition indicator gene, ade2AI, resulted in a high fraction of cells that sustained a single Ty1 insertion marked with ADE2. Strains in which Ty1ADE2 inserted into a YAC were identified by cosegregation of the ADE2 gene with the URA3-marked YAC. Ty1ADE2 elements also carried a site for the endonuclease I-DmoI, which we demonstrate is not present anywhere in the yeast genome. Consequently, I-DmoI cleaved a single chromosome or YAC at the unique site of Ty1ADE2 insertion, allowing rapid mapping of integration events. Our analyses showed that the frequency of Ty1ADE2 integration into YACs is equivalent to or higher than that expected based on random insertion. Remarkably, the 50-kb transcription unit of the mouse Steel locus was shown to be a highly significant hotspot for Ty1 integration. The accessibility of mammalian transcription units to Ty1 insertion stands in contrast to that of yeast transcription units. PMID:8725218

  15. CRISPR/Cas9-Mediated Genome Editing of Epigenetic Factors for Cancer Therapy.

    PubMed

    Yao, Shaohua; He, Zhiyao; Chen, Chong

    2015-07-01

    Advances in engineered recombinant nuclease have provided facile and reliable methods for genome editing. Especially with the development of the CRISPR (clustered regularly interspaced short palindromic repeats)/Cas9 (CRISPR-associated protein-9 nuclease) system, the discovery of various versions of Cas9 proteins and delivery carriers, it is now practicable to introduce desired mutations into the genome, to correct disease-related mutations, and to activate or suppress genes of interest. Epigenetic regulators are often disturbed in cancer cells and are essential for the transformation of normal to cancerous cells. Tumor-related epigenetic alterations or epigenetic factor mutations play a major part during the various steps of carcinogenesis and affect a variety of cancer-related genes and a wide range of cancerous phenotypes. Therefore, epigenetic regulatory enzymes might be candidate targets for cancer therapy. In this review, we discuss prospects of CRISPR/Cas9-based genome editing in targeting epigenetics for cancer gene therapy. PMID:26075804

  16. Integrative analyses reveal a long noncoding RNA-mediated sponge regulatory network in prostate cancer

    PubMed Central

    Du, Zhou; Sun, Tong; Hacisuleyman, Ezgi; Fei, Teng; Wang, Xiaodong; Brown, Myles; Rinn, John L.; Lee, Mary Gwo-Shu; Chen, Yiwen; Kantoff, Philip W.; Liu, X. Shirley

    2016-01-01

    Mounting evidence suggests that long noncoding RNAs (lncRNAs) can function as microRNA sponges and compete for microRNA binding to protein-coding transcripts. However, the prevalence, functional significance and targets of lncRNA-mediated sponge regulation of cancer are mostly unknown. Here we identify a lncRNA-mediated sponge regulatory network that affects the expression of many protein-coding prostate cancer driver genes, by integrating analysis of sequence features and gene expression profiles of both lncRNAs and protein-coding genes in tumours. We confirm the tumour-suppressive function of two lncRNAs (TUG1 and CTB-89H12.4) and their regulation of PTEN expression in prostate cancer. Surprisingly, one of the two lncRNAs, TUG1, was previously known for its function in polycomb repressive complex 2 (PRC2)-mediated transcriptional regulation, suggesting its sub-cellular localization-dependent function. Our findings not only suggest an important role of lncRNA-mediated sponge regulation in cancer, but also underscore the critical influence of cytoplasmic localization on the efficacy of a sponge lncRNA. PMID:26975529

  17. Network Analysis of Epidermal Growth Factor Signaling using Integrated Genomic, Proteomic and Phosphorylation Data

    SciTech Connect

    Waters, Katrina M.; Liu, Tao; Quesenberry, Ryan D.; Willse, Alan R.; Bandyopadhyay, Somnath; Kathmann, Loel E.; Weber, Thomas J.; Smith, Richard D.; Wiley, H. S.; Thrall, Brian D.

    2012-03-29

    To understand how integration of multiple data types can help decipher cellular responses at the systems level, we analyzed the mitogenic response of human mammary epithelial cells to epidermal growth factor (EGF) using whole genome microarrays, mass spectrometry-based proteomics and large-scale western blots with over 1000 antibodies. A time course analysis revealed significant differences in the expression of 3172 genes and 596 proteins, including protein phosphorylation changes measured by western blot. Integration of these disparate data types showed that each contributed qualitatively different components to the observed cell response to EGF and that varying degrees of concordance in gene expression and protein abundance measurements could be linked to specific biological processes. Networks inferred from individual data types were relatively limited, whereas networks derived from the integrated data recapitulated the known major cellular responses to EGF and exhibited more highly connected signaling nodes than networks derived from any individual dataset. While cell cycle regulatory pathways were altered as anticipated, we found the most robust response to mitogenic concentrations of EGF was induction of matrix metalloprotease cascades, highlighting the importance of the EGFR system as a regulator of the extracellular environment. These results demonstrate the value of integrating multiple levels of biological information to more accurately reconstruct networks of cellular response.

  18. Getting ready for the future: integration of genomics into public health research, policy and practice in Europe and globally.

    PubMed

    Brand, Angela; Schroder, Peter; Brand, Helmut; Zimmern, Ron

    2006-01-01

    The integration of genomics into public health research, policy and practice will be one of the most important future challenges that our health care systems will face. The next decade will provide a window of opportunity to establish infrastructures that will enable the scientific advances to be translated into evidence-based policies and interventions that improve population health. Approaches for national, European and international institutionalization of public health genomics are shown that aim to champion these challenges. PMID:16490962

  19. Oxidative stress signalling: a potential mediator of tumour necrosis factor alpha-induced genomic instability in primary vascular endothelial cells.

    PubMed

    Natarajan, M; Gibbons, C F; Mohan, S; Moore, S; Kadhim, M A

    2007-09-01

    Studying the potential role of tumour necrosis factor (TNF)alpha in the initiation of genomic instability is necessary to understand whether TNFalpha can serve as a signalling mediator of radiation-induced genomic instability in non-irradiated bystander cells. In this study, we examined whether TNFalpha could initiate processes through oxidative stress signalling that lead to DNA damage and genomic instability in primary vascular endothelium. In these cells, low linear energy transfer (LET) radiation (0.1-2 Gy) induced the secretion of TNFalpha into the culture medium. When added ectopically, TNFalpha at concentrations ranging from 0.1 ng ml(-1) to 10 ng ml(-1) increased (twofold to threefold) intracellular oxidative stress. Next, to examine whether TNFalpha induces genetic damage, cells were treated with TNFalpha for 5 h and analysed immediately using the single cell gel electrophoresis assay or after 3 days, 12 days and 20 days using solid stain chromosomal analysis. Cells exposed to 0.1 Gy, 1 Gy or 2 Gy or treated with 100 microM H2O2 were used as positive controls. The results showed that TNFalpha as low as 0.1 ng ml(-1) could initiate increased DNA damage compared with untreated controls. When examined in the progeny cells after several generations, the chromosomal instability appeared to be carried over even after day 12 and day 20. The increased genetic damage is inhibited in cells that are pre-incubated with the antioxidant enzyme catalase, the antioxidant N-acetyl-L-cysteine or the metal chelator pyrrolidine dithiocarbamate. These results clearly indicate that TNFalpha at concentrations at which no cytotoxicity is observed could induce genetic damage through free radical generation, which could, in turn, lead to the delayed events associated with genomic instability. PMID:17704321

  20. Genomic paradigms for food-borne enteric pathogen analysis at the USFDA: case studies highlighting method utility, integration and resolution.

    PubMed

    Elkins, C A; Kotewicz, M L; Jackson, S A; Lacher, D W; Abu-Ali, G S; Patel, I R

    2013-01-01

    Modern risk control and food safety practices involving food-borne bacterial pathogens are benefiting from new genomic technologies for rapid, yet highly specific, strain characterisations. Within the United States Food and Drug Administration (USFDA) Center for Food Safety and Applied Nutrition (CFSAN), optical genome mapping and DNA microarray genotyping have been used for several years to quickly assess genomic architecture and gene content, respectively, for outbreak strain subtyping and to enhance retrospective trace-back analyses. The application and relative utility of each method varies with outbreak scenario and the suspect pathogen, with comparative analytical power enhanced by database scale and depth. Integration of these two technologies allows high-resolution scrutiny of the genomic landscapes of enteric food-borne pathogens with notable examples including Shiga toxin-producing Escherichia coli (STEC) and Salmonella enterica serovars from a variety of food commodities. Moreover, the recent application of whole genome sequencing technologies to food-borne pathogen outbreaks and surveillance has enhanced resolution to the single nucleotide scale. This new wealth of sequence data will support more refined next-generation custom microarray designs, targeted re-sequencing and "genomic signature recognition" approaches involving a combination of genes and single nucleotide polymorphism detection to distil strain-specific fingerprinting to a minimised scale. This paper examines the utility of microarrays and optical mapping in analysing outbreaks, reviews best practices and the limits of these technologies for pathogen differentiation, and it considers future integration with whole genome sequencing efforts. PMID:23199033

  1. Integrated Genomic and Transcriptional Profiling Identifies Chromosomal Loci with Altered Gene Expression in Cervical Cancer

    PubMed Central

    Wilting, Saskia M.; de Wilde, Jillian; Meijer, Chris J. L. M.; Berkhof, Johannes; Yi, Yajun; van Wieringen, Wessel N.; Braakhuis, Boudewijn J. M.; Meijer, Gerrit A.; Ylstra, Bauke; Snijders, Peter J. F.; Steenbergen, Renske D. M.

    2009-01-01

    For a better understanding of the consequences of recurrent chromosomal alterations in cervical carcinomas, we integrated genome-wide chromosomal and transcriptional profiles of 10 squamous cell carcinomas (SCCs), 5 adenocarcinomas (AdCAs) and 6 normal controls. Previous genomic profiling showed that gains at chromosome arms 1q, 3q, and 20q as well as losses at 8q, 10q, 11q, and 13q were common in cervical carcinomas. Altered regions spanned multiple megabases, and the extent to which expression of genes located there is affected remains unclear. Expression analysis of these previously chromosomally profiled carcinomas yielded 83 genes with significantly differential expression between carcinomas and normal epithelium. Application of differential gene locus mapping (DIGMAP) analysis and the array CGH expression integration tool (ACE-it) identified hotspots within large chromosomal alterations in which gene expression was altered as well. Chromosomal gains of the long arms of chromosome 1, 3, and 20 resulted in increased expression of genes located at 1q32.1-32.2, 3q13.32-23, 3q26.32-27.3, and 20q11.21-13.33, whereas a chromosomal loss of 11q22.3-25 was related to decreased expression of genes located in this region. Overexpression of DTX3L, PIK3R4, ATP2C1, and SLC25A36, all located at 3q21.1-23 and identified by DIGMAP, ACE-it or both, was confirmed in an independent validation sample set consisting of 12 SCCs and 13 normal ectocervical samples. In conclusion, integrated chromosomal and transcriptional profiling identified chromosomal hotspots at 1q, 3q, 11q, and 20q with altered gene expression within large commonly altered chromosomal regions in cervical cancer. PMID:18618715

  2. Plant Clonal Integration Mediates the Horizontal Redistribution of Soil Resources, Benefiting Neighboring Plants

    PubMed Central

    Ye, Xue-Hua; Zhang, Ya-Lin; Liu, Zhi-Lan; Gao, Shu-Qin; Song, Yao-Bin; Liu, Feng-Hong; Dong, Ming

    2016-01-01

    Resources such as water taken up by plants can be released into soils through hydraulic redistribution and can also be translocated by clonal integration within a plant clonal network. We hypothesized that the resources from one (donor) microsite could be translocated within a clonal network, released into different (recipient) microsites and subsequently used by neighbor plants in the recipient microsite. To test these hypotheses, we conducted two experiments in which connected and disconnected ramet pairs of Potentilla anserina were grown under both homogeneous and heterogeneous water regimes, with seedlings of Artemisia ordosica as neighbors. The isotopes [15N] and deuterium were used to trace the translocation of nitrogen and water, respectively, within the clonal network. The water and nitrogen taken up by P. anserina ramets in the donor microsite were translocated into the connected ramets in the recipient microsites. Most notably, portions of the translocated water and nitrogen were released into the recipient microsite and were used by the neighboring A. ordosica, which increased growth of the neighboring A. ordosica significantly. Therefore, our hypotheses were supported, and plant clonal integration mediated the horizontal hydraulic redistribution of resources, thus benefiting neighboring plants. Such a plant clonal integration-mediated resource redistribution in horizontal space may have substantial effects on the interspecific relations and composition of the community and consequently on ecosystem processes. PMID:26904051

  3. Snat: a SNP annotation tool for bovine by integrating various sources of genomic information

    PubMed Central

    2011-01-01

    Background Most recently, with the maturing of bovine genome sequencing and high-throughput SNP genotyping technologies, a large number of significant SNPs associated with economically important traits can be identified by genome-wide association studies (GWAS). To further determine true association findings in GWAS, the common strategy is to sift out the most promising SNPs for follow-up replication studies. Hence it is crucial to explore the functional significance of the candidate SNPs in order to screen and select the potentially functional ones. To systematically prioritize these statistically significant SNPs and facilitate follow-up replication studies, we developed a bovine SNP annotation tool (Snat) based on a web interface. Results With Snat, various sources of genomic information are integrated and retrieved from several leading online databases, including SNP information from dbSNP, gene information from Entrez Gene, protein features from UniProt, linkage information from AnimalQTLdb, conserved elements from UCSC Genome Browser Database and gene functions from Gene Ontology (GO), KEGG PATHWAY and Online Mendelian Inheritance in Animals (OMIA). Snat provides two different applications, a CGI-based web utility and a command-line version, to access the integrated database, target any single nucleotide loci of interest and perform multi-level functional annotations. For further validation of the practical significance of our study, SNPs involved in two commercial bovine SNP chips, i.e., the Affymetrix Bovine 10K chip array and the Illumina 50K chip array, have been annotated by Snat, and the corresponding outputs can be directly downloaded from the Snat website. Furthermore, a real dataset involving 20 identified SNPs associated with milk yield in our recent GWAS was employed to demonstrate the practical significance of Snat. Conclusions To the best of our knowledge, Snat is one of the first tools focusing on SNP annotation for livestock. Snat confers researchers with a

  4. Phage-mediated horizontal transfer of a Staphylococcus aureus virulence-associated genomic island.

    PubMed

    Moon, Bo Youn; Park, Joo Youn; Hwang, Sun Yung; Robinson, D Ashley; Thomas, Jonathan C; Fitzgerald, J Ross; Park, Yong Ho; Seo, Keun Seok

    2015-01-01

    Staphylococcus aureus is a major pathogen of humans and animals. The capacity of S. aureus to adapt to different host species and tissue types is strongly influenced by the acquisition of mobile genetic elements encoding determinants involved in niche adaptation. The genomic islands νSaα and νSaβ are found in almost all S. aureus strains and are characterized by extensive variation in virulence gene content. However, the basis for the diversity and the mechanism underlying mobilization of the genomic islands between strains are unexplained. Here, we demonstrated that the genomic island νSaβ, encoding an array of virulence factors including staphylococcal superantigens, proteases, and leukotoxins, in addition to bacteriocins, was transferable in vitro to human and animal strains of multiple S. aureus clones via a resident prophage. The transfer of νSaβ appears to have been accomplished by multiple conversions of transducing phage particles carrying overlapping segments of νSaβ. Our findings solve a long-standing mystery regarding the diversification and spread of the genomic island νSaβ, highlighting the central role of bacteriophages in the pathogenic evolution of S. aureus. PMID:25891795

  5. Genomics of peanut leaf-spot pathogens; and RNA-interference-mediated control of aflatoxins

    Technology Transfer Automated Retrieval System (TEKTRAN)

    An overview update of the research done at USDA-ARS National Peanut Research Laboratory will be presented: including: the release of the Cercospora arachidicola genome, sequencing of Cercosporidium personatum, a workflow to study genetic diversity of aflatoxigenic Aspergillus, and progress on the us...

  6. Phage-mediated horizontal transfer of a Staphylococcus aureus virulence-associated genomic island

    PubMed Central

    Moon, Bo Youn; Park, Joo Youn; Hwang, Sun Yung; Robinson, D. Ashley; Thomas, Jonathan C.; Fitzgerald, J. Ross; Park, Yong Ho; Seo, Keun Seok

    2015-01-01

    Staphylococcus aureus is a major pathogen of humans and animals. The capacity of S. aureus to adapt to different host species and tissue types is strongly influenced by the acquisition of mobile genetic elements encoding determinants involved in niche adaptation. The genomic islands νSaα and νSaβ are found in almost all S. aureus strains and are characterized by extensive variation in virulence gene content. However, the basis for the diversity and the mechanism underlying mobilization of the genomic islands between strains are unexplained. Here, we demonstrated that the genomic island νSaβ, encoding an array of virulence factors including staphylococcal superantigens, proteases, and leukotoxins, in addition to bacteriocins, was transferable in vitro to human and animal strains of multiple S. aureus clones via a resident prophage. The transfer of νSaβ appears to have been accomplished by multiple conversions of transducing phage particles carrying overlapping segments of νSaβ. Our findings solve a long-standing mystery regarding the diversification and spread of the genomic island νSaβ, highlighting the central role of bacteriophages in the pathogenic evolution of S. aureus. PMID:25891795

  7. Elastomeric microposts integrated into microfluidics for flow-mediated endothelial mechanotransduction analysis.

    PubMed

    Lam, Raymond H W; Sun, Yubing; Chen, Weiqiang; Fu, Jianping

    2012-04-24

    Mechanotransduction is known as the cellular mechanism converting insoluble biophysical signals in the local cellular microenvironment (e.g. matrix rigidity, external mechanical forces, and fluid shear) into intracellular signalling to regulate cellular behaviours. While microfluidic technologies support a precise and independent control of soluble factors in the cellular microenvironment (e.g. growth factors, nutrients, and dissolved gases), the regulation of insoluble biophysical signals in microfluidics, especially matrix rigidity and adhesive pattern, has not yet been achieved. Here we reported an integrated soft lithography-compatible microfluidic methodology that could enable independent controls and modulations of fluid shear, substrate rigidity, and adhesive pattern in a microfluidic environment, by integrating micromolded elastomeric micropost arrays and microcontact printing with microfluidics. The geometry of the elastomeric micropost array could be regulated to mediate substrate rigidity and adhesive pattern, and further the elastomeric microposts could be utilized as force sensors to map live-cell subcellular contractile forces. To illustrate the general application of our methodology, we investigated the flow-mediated endothelial mechanotransduction process and examined specifically the involvement of subcellular contractile forces in the morphological realignment process of endothelial cells under a sustained directional fluid shear. Our results showed that the cytoskeletal contractile forces of endothelial cells were spatiotemporally regulated and coordinated to facilitate their morphology elongation process along the direction of flow. Together, our study provided an integrated microfluidic strategy to modulate the in vitro cellular microenvironment with both defined soluble and insoluble signals, and we demonstrated its application to investigate quantitatively the involvement of cytoskeletal contractile forces in the flow-mediated

  8. Integrative genomics identifies molecular alterations that challenge the linear model of melanoma progression.

    PubMed

    Rose, Amy E; Poliseno, Laura; Wang, Jinhua; Clark, Michael; Pearlman, Alexander; Wang, Guimin; Vega Y Saenz de Miera, Eleazar C; Medicherla, Ratna; Christos, Paul J; Shapiro, Richard; Pavlick, Anna; Darvishian, Farbod; Zavadil, Jiri; Polsky, David; Hernando, Eva; Ostrer, Harry; Osman, Iman

    2011-04-01

    Superficial spreading melanoma (SSM) and nodular melanoma (NM) are believed to represent sequential phases of linear progression from radial to vertical growth. Several lines of clinical, pathologic, and epidemiologic evidence suggest, however, that SSM and NM might be the result of independent pathways of tumor development. We utilized an integrative genomic approach that combines single nucleotide polymorphism array (6.0; Affymetrix) with gene expression array (U133A 2.0; Affymetrix) to examine molecular differences between SSM and NM. Pathway analysis of the most differentially expressed genes between SSM and NM (N = 114) revealed significant differences related to metabolic processes. We identified 8 genes (DIS3, FGFR1OP, G3BP2, GALNT7, MTAP, SEC23IP, USO1, and ZNF668) in which NM/SSM-specific copy number alterations correlated with differential gene expression (P < 0.05; Spearman's rank). SSM-specific genomic deletions in G3BP2, MTAP, and SEC23IP were independently verified in two external data sets. Forced overexpression of metabolism-related gene MTAP (methylthioadenosine phosphorylase) in SSM resulted in reduced cell growth. The differential expression of another metabolic-related gene, aldehyde dehydrogenase 7A1 (ALDH7A1), was validated at the protein level by using tissue microarrays of human melanoma. In addition, we show that the decreased ALDH7A1 expression in SSM may be the result of epigenetic modifications. Our data reveal recurrent genomic deletions in SSM not present in NM, which challenge the linear model of melanoma progression. Furthermore, our data suggest a role for altered regulation of metabolism-related genes as a possible cause of the different clinical behavior of SSM and NM. PMID:21343389

  9. Predictive biomarker discovery through the parallel integration of clinical trial and functional genomics datasets.

    PubMed

    Swanton, Charles; Larkin, James M; Gerlinger, Marco; Eklund, Aron C; Howell, Michael; Stamp, Gordon; Downward, Julian; Gore, Martin; Futreal, P Andrew; Escudier, Bernard; Andre, Fabrice; Albiges, Laurence; Beuselinck, Benoit; Oudard, Stephane; Hoffmann, Jens; Gyorffy, Balázs; Torrance, Chris J; Boehme, Karen A; Volkmer, Hansjuergen; Toschi, Luisella; Nicke, Barbara; Beck, Marlene; Szallasi, Zoltan

    2010-01-01

    The European Union multi-disciplinary Personalised RNA interference to Enhance the Delivery of Individualised Cytotoxic and Targeted therapeutics (PREDICT) consortium has recently initiated a framework to accelerate the development of predictive biomarkers of individual patient response to anti-cancer agents. The consortium focuses on the identification of reliable predictive biomarkers to approved agents with anti-angiogenic activity for which no reliable predictive biomarkers exist: sunitinib, a multi-targeted tyrosine kinase inhibitor and everolimus, a mammalian target of rapamycin (mTOR) pathway inhibitor. Through the analysis of tumor tissue derived from pre-operative renal cell carcinoma (RCC) clinical trials, the PREDICT consortium will use established and novel methods to integrate comprehensive tumor-derived genomic data with personalized tumor-derived small hairpin RNA and high-throughput small interfering RNA screens to identify and validate functionally important genomic or transcriptomic predictive biomarkers of individual drug response in patients. PREDICT's approach to predictive biomarker discovery differs from conventional associative learning approaches, which can be susceptible to the detection of chance associations that lead to overestimation of true clinical accuracy. These methods will identify molecular pathways important for survival and growth of RCC cells and particular targets suitable for therapeutic development. Importantly, our results may enable individualized treatment of RCC, reducing ineffective therapy in drug-resistant disease, leading to improved quality of life and higher cost efficiency, which in turn should broaden patient access to beneficial therapeutics, thereby enhancing clinical outcome and cancer survival. The consortium will also establish and consolidate a European network providing the technological and clinical platform for large-scale functional genomic biomarker discovery. Here we review our current understanding

  10. Interaction with PALB2 Is Essential for Maintenance of Genomic Integrity by BRCA2.

    PubMed

    Hartford, Suzanne A; Chittela, Rajanikant; Ding, Xia; Vyas, Aradhana; Martin, Betty; Burkett, Sandra; Haines, Diana C; Southon, Eileen; Tessarollo, Lino; Sharan, Shyam K

    2016-08-01

    Human breast cancer susceptibility gene, BRCA2, encodes a 3418-amino acid protein that is essential for maintaining genomic integrity. Among the proteins that physically interact with BRCA2, Partner and Localizer of BRCA2 (PALB2), which binds to the N-terminal region of BRCA2, is vital for its function by facilitating its subnuclear localization. A functional redundancy has been reported between this N-terminal PALB2-binding domain and the C-terminal DNA-binding domain of BRCA2, which undermines the relevance of the interaction between these two proteins. Here, we describe a genetic approach to examine the functional significance of the interaction between BRCA2 and PALB2 by generating a knock-in mouse model of Brca2 carrying a single amino acid change (Gly25Arg, Brca2G25R) that disrupts this interaction. In addition, we have combined Brca2G25R homozygosity as well as hemizygosity with Palb2 and Trp53 heterozygosity to generate an array of genotypically and phenotypically distinct mouse models. Our findings reveal defects in body size, fertility, meiotic progression, and genome stability, as well as increased tumor susceptibility in these mice. The severity of the phenotype increased with a decrease in the interaction between BRCA2 and PALB2, highlighting the significance of this interaction. In addition, our findings also demonstrate that hypomorphic mutations such as Brca2G25R have the potential to be more detrimental than the functionally null alleles by increasing genomic instability to a level that induces tumorigenesis, rather than apoptosis. PMID:27490902

  11. Interaction with PALB2 Is Essential for Maintenance of Genomic Integrity by BRCA2

    PubMed Central

    Hartford, Suzanne A.; Chittela, Rajanikant; Ding, Xia; Martin, Betty; Burkett, Sandra; Haines, Diana C.; Southon, Eileen; Tessarollo, Lino; Sharan, Shyam K.

    2016-01-01

    Human breast cancer susceptibility gene, BRCA2, encodes a 3418-amino acid protein that is essential for maintaining genomic integrity. Among the proteins that physically interact with BRCA2, Partner and Localizer of BRCA2 (PALB2), which binds to the N-terminal region of BRCA2, is vital for its function by facilitating its subnuclear localization. A functional redundancy has been reported between this N-terminal PALB2-binding domain and the C-terminal DNA-binding domain of BRCA2, which undermines the relevance of the interaction between these two proteins. Here, we describe a genetic approach to examine the functional significance of the interaction between BRCA2 and PALB2 by generating a knock-in mouse model of Brca2 carrying a single amino acid change (Gly25Arg, Brca2G25R) that disrupts this interaction. In addition, we have combined Brca2G25R homozygosity as well as hemizygosity with Palb2 and Trp53 heterozygosity to generate an array of genotypically and phenotypically distinct mouse models. Our findings reveal defects in body size, fertility, meiotic progression, and genome stability, as well as increased tumor susceptibility in these mice. The severity of the phenotype increased with a decrease in the interaction between BRCA2 and PALB2, highlighting the significance of this interaction. In addition, our findings also demonstrate that hypomorphic mutations such as Brca2G25R have the potential to be more detrimental than the functionally null alleles by increasing genomic instability to a level that induces tumorigenesis, rather than apoptosis. PMID:27490902

  12. MicroScope--an integrated microbial resource for the curation and comparative analysis of genomic and metabolic data.

    PubMed

    Vallenet, David; Belda, Eugeni; Calteau, Alexandra; Cruveiller, Stéphane; Engelen, Stefan; Lajus, Aurélie; Le Fèvre, François; Longin, Cyrille; Mornico, Damien; Roche, David; Rouy, Zoé; Salvignol, Gregory; Scarpelli, Claude; Thil Smith, Adam Alexander; Weiman, Marion; Médigue, Claudine

    2013-01-01

    MicroScope is an integrated platform dedicated to both the methodical updating of microbial genome annotation and to comparative analysis. The resource provides data from completed and ongoing genome projects (automatic and expert annotations), together with data sources from post-genomic experiments (i.e. transcriptomics, mutant collections) allowing users to perfect and improve the understanding of gene functions. MicroScope (http://www.genoscope.cns.fr/agc/microscope) combines tools and graphical interfaces to analyse genomes and to perform the manual curation of gene annotations in a comparative context. Since its first publication in January 2006, the system (previously named MaGe for Magnifying Genomes) has been continuously extended both in terms of data content and analysis tools. The last update of MicroScope was published in 2009 in the Database journal. Today, the resource contains data for >1600 microbial genomes, of which ∼300 are manually curated and maintained by biologists (1200 personal accounts today). Expert annotations are continuously gathered in the MicroScope database (∼50 000 a year), contributing to the improvement of the quality of microbial genomes annotations. Improved data browsing and searching tools have been added, original tools useful in the context of expert annotation have been developed and integrated and the website has been significantly redesigned to be more user-friendly. Furthermore, in the context of the European project Microme (Framework Program 7 Collaborative Project), MicroScope is becoming a resource providing for the curation and analysis of both genomic and metabolic data. An increasing number of projects are related to the study of environmental bacterial (meta)genomes that are able to metabolize a large variety of chemical compounds that may be of high industrial interest. PMID:23193269

  13. Off-target Effects in CRISPR/Cas9-mediated Genome Engineering

    PubMed Central

    Zhang, Xiao-Hui; Tee, Louis Y; Wang, Xiao-Gang; Huang, Qun-Shan; Yang, Shi-Hua

    2015-01-01

    CRISPR/Cas9 is a versatile genome-editing technology that is widely used for studying the functionality of genetic elements, creating genetically modified organisms, and preclinical research on genetic disorders. However, the high frequency of off-target activity (≥50%), that is, RGEN (RNA-guided endonuclease)-induced mutations at sites other than the intended on-target site, is one major concern, especially for therapeutic and clinical applications. Here, we review the basic mechanisms underlying off-target cutting in the CRISPR/Cas9 system, methods for detecting off-target mutations, and strategies for minimizing off-target cleavage. The improvement of off-target specificity in the CRISPR/Cas9 system will provide solid genotype–phenotype correlations, and thus enable faithful interpretation of genome-editing data, which will certainly facilitate the basic and clinical application of this technology. PMID:26575098

  14. Genome-wide binding and mechanistic analyses of Smchd1-mediated epigenetic regulation.

    PubMed

    Chen, Kelan; Hu, Jiang; Moore, Darcy L; Liu, Ruijie; Kessans, Sarah A; Breslin, Kelsey; Lucet, Isabelle S; Keniry, Andrew; Leong, Huei San; Parish, Clare L; Hilton, Douglas J; Lemmers, Richard J L F; van der Maarel, Silvère M; Czabotar, Peter E; Dobson, Renwick C J; Ritchie, Matthew E; Kay, Graham F; Murphy, James M; Blewitt, Marnie E

    2015-07-01

    Structural maintenance of chromosomes flexible hinge domain containing 1 (Smchd1) is an epigenetic repressor with described roles in X inactivation and genomic imprinting, but Smchd1 is also critically involved in the pathogenesis of facioscapulohumeral dystrophy. The underlying molecular mechanism by which Smchd1 functions in these instances remains unknown. Our genome-wide transcriptional and epigenetic analyses show that Smchd1 binds cis-regulatory elements, many of which coincide with CCCTC-binding factor (Ctcf) binding sites, for example, the clustered protocadherin (Pcdh) genes, where we show Smchd1 and Ctcf act in opposing ways. We provide biochemical and biophysical evidence that Smchd1-chromatin interactions are established through the homodimeric hinge domain of Smchd1 and, intriguingly, that the hinge domain also has the capacity to bind DNA and RNA. Our results suggest Smchd1 imparts epigenetic regulation via physical association with chromatin, which may antagonize Ctcf-facilitated chromatin interactions, resulting in coordinated transcriptional control. PMID:26091879

  15. Integrating genomics in head and neck cancer treatment: Promises and pitfalls.

    PubMed

    Thariat, Juliette; Vignot, Stéphane; Lapierre, Ariane; Falk, Alexander T; Guigay, Joel; Van Obberghen-Schilling, Ellen; Milano, Gerard

    2015-09-01

    Head and neck squamous cell carcinomas (HNSCC) represent a multifactorial disease of poor prognosis. They have lagged behind other cancers in terms of personalized therapy. With the expansion of high-throughput sequencing methods, recent landmark exonic studies and Cancer Genome Atlas data have identified genes relevant to carcinogenesis and cancer progression. Mutational profiles and rates vary widely depending on exposure to carcinogens, anatomic subsites and human papilloma virus (HPV) infection. Tumors may exhibit specific, tissue-specific, not exclusively HPV-related, gene alterations, such as those observed in oral cavity cancers in Asia or the Occident. Except for the PI3K pathway, the rate of mutations in HPV+ cancers is much lower than in tobacco/alcohol-related cancers. Somatic driver mutation analyses show that relatively few driver genes are druggable in HNSCC and that tumor suppressor gene alterations prevail. More mature for therapeutic applications is the oncogenic PI3K pathway, with preclinical human xenograft models suggesting that PI3KCA pathway mutations may be used as predictive biomarkers and clinical data showing efficacy of mTOR/Akt inhibitors. Therapeutic guidance, to date, relies on classical histoprognostic factors, anatomic subsite and HPV status, with integration of hierarchized supervised mutational profiling to provide additional therapeutic options in advanced HNSCC in the near future. Unsupervised controlled genomic analyses remain necessary to unravel potentially relevant genes. PMID:25979769

  16. A bayesian integrative model for genetical genomics with spatially informed variable selection.

    PubMed

    Cassese, Alberto; Guindani, Michele; Vannucci, Marina

    2014-01-01

    We consider a Bayesian hierarchical model for the integration of gene expression levels with comparative genomic hybridization (CGH) array measurements collected on the same subjects. The approach defines a measurement error model that relates the gene expression levels to latent copy number states. In turn, the latent states are related to the observed surrogate CGH measurements via a hidden Markov model. The model further incorporates variable selection with a spatial prior based on a probit link that exploits dependencies across adjacent DNA segments. Posterior inference is carried out via Markov chain Monte Carlo stochastic search techniques. We study the performance of the model in simulations and show better results than those achieved with recently proposed alternative priors. We also show an application to data from a genomic study on lung squamous cell carcinoma, where we identify potential candidates of associations between copy number variants and the transcriptional activity of target genes. Gene ontology (GO) analyses of our findings reveal enrichments in genes that code for proteins involved in cancer. Our model also identifies a number of potential candidate biomarkers for further experimental validation. PMID:25288877

  17. Genomic DNA extraction from cells by electroporation on an integrated microfluidic platform

    PubMed Central

    Geng, Tao; Bao, Ning; Sriranganathan, Nammalwar; Li, Liwu; Lu, Chang

    2012-01-01

    The vast majority of genetic analysis of cells involves chemical lysis for release of DNA molecules. However, chemical reagents required in the lysis interfere with downstream molecular biology and often require removal after the step. Electrical lysis based on irreversible electroporation is a promising technique to prepare samples for genetic analysis due to its purely physical nature, fast speed, and simple operation. However, there has been no experimental confirmation on whether electrical lysis extracts genomic DNA from cells in a reproducible and efficient fashion in comparison to chemical lysis, especially for eukaryotic cells that have most of their DNA enclosed in the nucleus. In this work, we construct an integrated microfluidic chip that physically traps a low number of cells, rapidly lyses the cells using electrical pulses, then purifies and concentrates genomic DNA. We demonstrate that electrical lysis offers high efficiency for DNA extraction from both eukaryotic cells (up to ~36% for Chinese hamster ovary cells) and bacterial cells (up to ~45% for Salmonella typhimurium) that is comparable to the widely used chemical lysis. The DNA extraction efficiency depends on both the electric parameters and the relative amount of beads used for DNA adsorption. We envision that electroporation-based DNA extraction will find use in ultrasensitive assays that benefit from minimal dilution and simple procedure. PMID:23061629

  18. Integrated genome and transcriptome sequencing identifies a novel form of hybrid and aggressive prostate cancer.

    PubMed

    Wu, Chunxiao; Wyatt, Alexander W; Lapuk, Anna V; McPherson, Andrew; McConeghy, Brian J; Bell, Robert H; Anderson, Shawn; Haegert, Anne; Brahmbhatt, Sonal; Shukin, Robert; Mo, Fan; Li, Estelle; Fazli, Ladan; Hurtado-Coll, Antonio; Jones, Edward C; Butterfield, Yaron S; Hach, Faraz; Hormozdiari, Fereydoun; Hajirasouliha, Iman; Boutros, Paul C; Bristow, Robert G; Jones, Steven Jm; Hirst, Martin; Marra, Marco A; Maher, Christopher A; Chinnaiyan, Arul M; Sahinalp, S Cenk; Gleave, Martin E; Volik, Stanislav V; Collins, Colin C

    2012-05-01

    Next-generation sequencing is making sequence-based molecular pathology and personalized oncology viable. We selected an individual initially diagnosed with conventional but aggressive prostate adenocarcinoma and sequenced the genome and transcriptome from primary and metastatic tissues collected prior to hormone therapy. The histology-pathology and copy number profiles were remarkably homogeneous, yet it was possible to propose the quadrant of the prostate tumour that likely seeded the metastatic diaspora. Despite a homogeneous cell type, our transcriptome analysis revealed signatures of both luminal and neuroendocrine cell types. Remarkably, the repertoire of expressed but apparently private gene fusions, including C15orf21:MYC, recapitulated this biology. We hypothesize that the amplification and over-expression of the stem cell gene MSI2 may have contributed to the stable hybrid cellular identity. This hybrid luminal-neuroendocrine tumour appears to represent a novel and highly aggressive case of prostate cancer with unique biological features and, conceivably, a propensity for rapid progression to castrate-resistance. Overall, this work highlights the importance of integrated analyses of genome, exome and transcriptome sequences for basic tumour biology, sequence-based molecular pathology and personalized oncology. PMID:22294438

  19. Starch biosynthesis in cassava: a genome-based pathway reconstruction and its exploitation in data integration

    PubMed Central

    2013-01-01

    Background: Cassava is a well-known starchy root crop used for food, feed and biofuel production. However, the process of starch production in cassava is not yet well understood. Results: In this work, we exploited the recently released genome information and post-genomic approaches to reconstruct the metabolic pathway of starch biosynthesis in cassava using multiple plant templates. The quality of the pathway reconstruction was assured by the parsimonious reconstruction framework employed and by collective validation steps. Our reconstructed pathway is presented in the form of an informative map, which describes all important information of the pathway, and an interactive map, which facilitates the integration of omics data into the metabolic pathway. Additionally, to demonstrate the advantage of the reconstructed pathway beyond a schematic presentation, the pathway could be used to incorporate gene expression data obtained from various developmental stages of cassava roots. Our results showed distinct activities of the starch biosynthesis pathway at different stages of root development at the transcriptional level, with pathway activity higher toward the development of mature storage roots. Conclusions: To expand its applications, the interactive map of the reconstructed starch biosynthesis pathway is available for download at the SBI group’s website (http://sbi.pdti.kmutt.ac.th/?page_id=33). This work is considered a big step in the quantitative modeling pipeline aiming to investigate the dynamic regulation of starch biosynthesis in cassava roots. PMID:23938102

  20. Integrative genome analyses identify key somatic driver mutations of small-cell lung cancer.

    PubMed

    Peifer, Martin; Fernández-Cuesta, Lynnette; Sos, Martin L; George, Julie; Seidel, Danila; Kasper, Lawryn H; Plenker, Dennis; Leenders, Frauke; Sun, Ruping; Zander, Thomas; Menon, Roopika; Koker, Mirjam; Dahmen, Ilona; Müller, Christian; Di Cerbo, Vincenzo; Schildhaus, Hans-Ulrich; Altmüller, Janine; Baessmann, Ingelore; Becker, Christian; de Wilde, Bram; Vandesompele, Jo; Böhm, Diana; Ansén, Sascha; Gabler, Franziska; Wilkening, Ines; Heynck, Stefanie; Heuckmann, Johannes M; Lu, Xin; Carter, Scott L; Cibulskis, Kristian; Banerji, Shantanu; Getz, Gad; Park, Kwon-Sik; Rauh, Daniel; Grütter, Christian; Fischer, Matthias; Pasqualucci, Laura; Wright, Gavin; Wainer, Zoe; Russell, Prudence; Petersen, Iver; Chen, Yuan; Stoelben, Erich; Ludwig, Corinna; Schnabel, Philipp; Hoffmann, Hans; Muley, Thomas; Brockmann, Michael; Engel-Riedel, Walburga; Muscarella, Lucia A; Fazio, Vito M; Groen, Harry; Timens, Wim; Sietsma, Hannie; Thunnissen, Erik; Smit, Egbert; Heideman, Daniëlle A M; Snijders, Peter J F; Cappuzzo, Federico; Ligorio, Claudia; Damiani, Stefania; Field, John; Solberg, Steinar; Brustugun, Odd Terje; Lund-Iversen, Marius; Sänger, Jörg; Clement, Joachim H; Soltermann, Alex; Moch, Holger; Weder, Walter; Solomon, Benjamin; Soria, Jean-Charles; Validire, Pierre; Besse, Benjamin; Brambilla, Elisabeth; Brambilla, Christian; Lantuejoul, Sylvie; Lorimier, Philippe; Schneider, Peter M; Hallek, Michael; Pao, William; Meyerson, Matthew; Sage, Julien; Shendure, Jay; Schneider, Robert; Büttner, Reinhard; Wolf, Jürgen; Nürnberg, Peter; Perner, Sven; Heukamp, Lukas C; Brindle, Paul K; Haas, Stefan; Thomas, Roman K

    2012-10-01

    Small-cell lung cancer (SCLC) is an aggressive lung tumor subtype with poor prognosis. We sequenced 29 SCLC exomes, 2 genomes and 15 transcriptomes and found an extremely high mutation rate of 7.4±1 protein-changing mutations per million base pairs. Therefore, we conducted integrated analyses of the various data sets to identify pathogenetically relevant mutated genes. In all cases, we found evidence for inactivation of TP53 and RB1 and identified recurrent mutations in the CREBBP, EP300 and MLL genes that encode histone modifiers. Furthermore, we observed mutations in PTEN, SLIT2 and EPHA7, as well as focal amplifications of the FGFR1 tyrosine kinase gene. Finally, we detected many of the alterations found in humans in SCLC tumors from Tp53 and Rb1 double knockout mice. Our study implicates histone modification as a major feature of SCLC, reveals potentially therapeutically tractable genomic alterations and provides a generalizable framework for the identification of biologically relevant genes in the context of high mutational background. PMID:22941188

  1. Integrative genome analyses identify key somatic driver mutations of small cell lung cancer

    PubMed Central

    Peifer, Martin; Fernández-Cuesta, Lynnette; Sos, Martin L; George, Julie; Seidel, Danila; Kasper, Lawryn H; Plenker, Dennis; Leenders, Frauke; Sun, Ruping; Zander, Thomas; Menon, Roopika; Koker, Mirjam; Dahmen, Ilona; Müller, Christian; Di Cerbo, Vincenzo; Schildhaus, Hans-Ulrich; Altmüller, Janine; Baessmann, Ingelore; Becker, Christian; de Wilde, Bram; Vandesompele, Jo; Böhm, Diana; Ansén, Sascha; Gabler, Franziska; Wilkening, Ines; Heynck, Stefanie; Heuckmann, Johannes M; Lu, Xin; Carter, Scott L; Cibulskis, Kristian; Banerji, Shantanu; Getz, Gad; Park, Kwon-Sik; Rauh, Daniel; Grütter, Christian; Fischer, Matthias; Pasqualucci, Laura; Wright, Gavin; Wainer, Zoe; Russell, Prudence; Petersen, Iver; Chen, Yuan; Stoelben, Erich; Ludwig, Corinna; Schnabel, Philipp; Hoffmann, Hans; Muley, Thomas; Brockmann, Michael; Engel-Riedel, Walburga; Muscarella, Lucia A; Fazio, Vito M; Groen, Harry; Timens, Wim; Sietsma, Hannie; Thunnissen, Erik; Smit, Egbert; Heideman, Daniëlle AM; Snijders, Peter JF; Cappuzzo, Federico; Ligorio, Claudia; Damiani, Stefania; Field, John; Solberg, Steinar; Brustugun, Odd Terje; Lund-Iversen, Marius; Sänger, Jörg; Clement, Joachim H; Soltermann, Alex; Moch, Holger; Weder, Walter; Solomon, Benjamin; Soria, Jean-Charles; Validire, Pierre; Besse, Benjamin; Brambilla, Elisabeth; Brambilla, Christian; Lantuejoul, Sylvie; Lorimier, Philippe; Schneider, Peter M; Hallek, Michael; Pao, William; Meyerson, Matthew; Sage, Julien; Shendure, Jay; Schneider, Robert; Büttner, Reinhard; Wolf, Jürgen; Nürnberg, Peter; Perner, Sven; Heukamp, Lukas C; Brindle, Paul K; Haas, Stefan; Thomas, Roman K

    2016-01-01

    Small-cell lung cancer (SCLC) is an aggressive lung tumor subtype with poor survival. We sequenced 29 SCLC exomes, two genomes and 15 transcriptomes and found an extremely high mutation rate of 7.4±1 protein-changing mutations per million base pairs. Therefore, we conducted integrated analyses of the various data sets to identify pathogenetically relevant mutated genes. In all cases we found evidence for inactivation of TP53 and RB1 and identified recurrent mutations in histone-modifying genes, CREBBP, EP300, and MLL. Furthermore, we observed mutations in PTEN, in SLIT2, and EPHA7, as well as focal amplifications of the FGFR1 tyrosine kinase gene. Finally, we detected many of the alterations found in humans in SCLC tumors from p53/Rb1-deficient mice. Our study implicates histone modification as a major feature of SCLC, reveals potentially therapeutically tractable genome alterations, and provides a generalizable framework for identification of biologically relevant genes in the context of high mutational background. PMID:22941188

  2. Histone H3.3 maintains genome integrity during mammalian development

    PubMed Central

    Jang, Chuan-Wei; Shibata, Yoichiro; Starmer, Joshua; Yee, Della; Magnuson, Terry

    2015-01-01

    Histone H3.3 is a highly conserved histone H3 replacement variant in metazoans and has been implicated in many important biological processes, including cell differentiation and reprogramming. Germline and somatic mutations in H3.3 genomic incorporation pathway components or in H3.3 encoding genes have been associated with human congenital diseases and cancers, respectively. However, the role of H3.3 in mammalian development remains unclear. To address this question, we generated H3.3-null mouse models through classical genetic approaches. We found that H3.3 plays an essential role in mouse development. Complete depletion of H3.3 leads to developmental retardation and early embryonic lethality. At the cellular level, H3.3 loss triggers cell cycle suppression and cell death. Surprisingly, H3.3 depletion does not dramatically disrupt gene regulation in the developing embryo. Instead, H3.3 depletion causes dysfunction of heterochromatin structures at telomeres, centromeres, and pericentromeric regions of chromosomes, leading to mitotic defects. The resulting karyotypical abnormalities and DNA damage lead to p53 pathway activation. In summary, our results reveal that an important function of H3.3 is to support chromosomal heterochromatic structures, thus maintaining genome integrity during mammalian development. PMID:26159997

  3. Integrated Genomic and Network-Based Analyses of Complex Diseases and Human Disease Network.

    PubMed

    Al-Harazi, Olfat; Al Insaif, Sadiq; Al-Ajlan, Monirah A; Kaya, Namik; Dzimiri, Nduna; Colak, Dilek

    2016-06-20

    A disease phenotype generally reflects various pathobiological processes that interact in a complex network. The highly interconnected nature of the human protein interaction network (interactome) indicates that, at the molecular level, it is difficult to consider diseases as being independent of one another. Recently, genome-wide molecular measurements, data mining and bioinformatics approaches have provided the means to explore human diseases from a molecular basis. The exploration of diseases and a system of disease relationships based on the integration of genome-wide molecular data with the human interactome could offer a powerful perspective for understanding the molecular architecture of diseases. Recently, subnetwork markers have proven to be more robust and reliable than individual biomarker genes selected based on gene expression profiles alone, and achieve higher accuracy in disease classification. We have applied one of these methodologies to idiopathic dilated cardiomyopathy (IDCM) data that we have generated using a microarray and identified significant subnetworks associated with the disease. In this paper, we review the recent endeavours in this direction, and summarize the existing methodologies and computational tools for network-based analysis of complex diseases, the molecular relationships among apparently different disorders, and the human disease network. We also discuss the future research trends and topics of this promising field. PMID:27318646

  4. Integrative computational approach for genome-based study of microbial lipid-degrading enzymes.

    PubMed

    Vorapreeda, Tayvich; Thammarongtham, Chinae; Laoteng, Kobkul

    2016-07-01

    Lipid-degrading or lipolytic enzymes have gained enormous attention in academic and industrial sectors. Several efforts are underway to discover new lipase enzymes from a variety of microorganisms with particular catalytic properties to be used for extensive applications. In addition, various tools and strategies have been implemented to unravel the functional relevance of the versatile lipid-degrading enzymes for special purposes. This review highlights the study of microbial lipid-degrading enzymes through an integrative computational approach. The identification of putative lipase genes from microbial genomes and metagenomic libraries using homology-based mining is discussed, with an emphasis on sequence analysis of conserved motifs and enzyme topology. Molecular modelling of three-dimensional structure on the basis of sequence similarity is shown to be a potential approach for exploring the structural and functional relationships of candidate lipase enzymes. The perspectives on a discriminative framework of cutting-edge tools and technologies, including bioinformatics, computational biology, functional genomics and functional proteomics, intended to facilitate rapid progress in understanding lipolysis mechanism and to discover novel lipid-degrading enzymes of microorganisms are discussed. PMID:27263017

  5. Regulation of active genome integrity and expression by Rad26p

    PubMed Central

    Malik, Shivani; Bhaumik, Sukesh R

    2014-01-01

    Rad26p is a SWI/SNF-like ATPase in yeast, and is conserved among eukaryotes. Both Rad26p and its human homolog CSB (Cockayne syndrome group B) are involved in regulation of chromatin structure, transcription and DNA repair. Thus, mutations or malfunctions of these proteins have significant effects on cellular functions. Mutations in CSB are associated with Cockayne syndrome (CS), which is characterized by heterogeneous pathologies such as mental and physical retardation, sun sensitivity, premature aging, muscular and skeletal abnormalities, and progressive decline in neurological and cognitive functions. Many research groups have therefore focused on understanding the mechanisms of Rad26p/CSB function to illuminate the molecular bases of CS. These studies have provided significant functional and mechanistic insights into the roles of Rad26p/CSB in the regulation of gene expression and genome integrity, as described here. PMID:25484185

  6. Analysis and visualization of RNA-Seq expression data using RStudio, Bioconductor, and Integrated Genome Browser.

    PubMed

    Loraine, Ann E; Blakley, Ivory Clabaugh; Jagadeesan, Sridharan; Harper, Jeff; Miller, Gad; Firon, Nurit

    2015-01-01

    Sequencing costs are falling, but the cost of data analysis remains high, often because unforeseen problems arise, such as insufficient depth of sequencing or batch effects. Experimenting with data analysis methods during the planning phase of an experiment can reveal unanticipated problems and build valuable bioinformatics expertise in the organism or process being studied. This protocol describes using R Markdown and RStudio, user-friendly tools for statistical analysis and reproducible research in bioinformatics, to analyze and document the analysis of an example RNA-Seq data set from tomato pollen undergoing chronic heat stress. Also, we show how to use Integrated Genome Browser to visualize read coverage graphs for differentially expressed genes. Applying the protocol described here and using the provided data sets represent a useful first step toward building RNA-Seq data analysis expertise in a research group. PMID:25757788

  7. Expression of Active Subunit of Nitrogenase via Integration into Plant Organelle Genome

    PubMed Central

    Groat, Jeanna; Staub, Jeffrey M.; Stephens, Michael

    2016-01-01

    Nitrogen availability is crucial for crop yield, with nitrogen fertilizer accounting for a large percentage of farmers’ expenses. However, untimely or excessive application of fertilizer can increase the risk of negative environmental effects. These factors, along with the environmental and energy costs of synthesizing nitrogen fertilizer, led us to seek out novel biotechnology-driven approaches to supply nitrogen to plants. The strategy we focused on involves transgenic expression of nitrogenase, a bacterial multi-subunit enzyme that can capture atmospheric nitrogen. Here we report expression of the active Fe subunit of nitrogenase via integration into the tobacco plastid genome of bacterial gene sequences modified for expression in plastids. Our study suggests that it will be possible to engineer plants that are able to produce their own nitrogen fertilizer by expressing nitrogenase genes in plant plastids. PMID:27529475

  8. The next generation of literature analysis: integration of genomic analysis into text mining.