Science.gov

Sample records for genomic integration mediated

  1. Altering genomic integrity: heavy metal exposure promotes transposable element-mediated damage

    PubMed Central

    Morales, Maria E.; Servant, Geraldine; Ade, Catherine; Roy-Engel, Astrid M.

    2015-01-01

    Maintenance of genomic integrity is critical for cellular homeostasis and survival. The active transposable elements (TEs), composed primarily of three mobile element lineages (LINE-1, Alu, and SVA), comprise approximately 30% of the mass of the human genome. For the past two decades, studies have shown that TEs significantly contribute to genetic instability and that TE-caused damage is associated with genetic diseases and cancer. Different environmental exposures, including several heavy metals, influence how TEs interact with their host genome, increasing their negative impact. This mini-review provides some basic knowledge on TEs, their contribution to disease and an overview of the current knowledge on how heavy metals influence TE-mediated damage. PMID:25774044

  2. iGWAS: Integrative Genome-Wide Association Studies of Genetic and Genomic Data for Disease Susceptibility Using Mediation Analysis.

    PubMed

    Huang, Yen-Tsung; Liang, Liming; Moffatt, Miriam F; Cookson, William O C M; Lin, Xihong

    2015-07-01

    Genome-wide association studies (GWAS) have been a standard practice in identifying single nucleotide polymorphisms (SNPs) for disease susceptibility. We propose a new approach, termed integrative GWAS (iGWAS) that exploits the information of gene expressions to investigate the mechanisms of the association of SNPs with a disease phenotype, and to incorporate the family-based design for genetic association studies. Specifically, the relations among SNPs, gene expression, and disease are modeled within the mediation analysis framework, which allows us to disentangle the genetic effect on a disease phenotype into two parts: an effect mediated through a gene expression (mediation effect, ME) and an effect through other biological mechanisms or environment-mediated mechanisms (alternative effect, AE). We develop omnibus tests for the ME and AE that are robust to underlying true disease models. Numerical studies show that the iGWAS approach is able to facilitate discovering genetic association mechanisms, and outperforms the SNP-only method for testing genetic associations. We conduct a family-based iGWAS of childhood asthma that integrates genetic and genomic data. The iGWAS approach identifies six novel susceptibility genes (MANEA, MRPL53, LYCAT, ST8SIA4, NDFIP1, and PTCH1) using the omnibus test with false discovery rate less than 1%, whereas no gene using SNP-only analyses survives with the same cut-off. The iGWAS analyses further characterize that genetic effects of these genes are mostly mediated through their gene expressions. In summary, the iGWAS approach provides a new analytic framework to investigate the mechanism of genetic etiology, and identifies novel susceptibility genes of childhood asthma that were biologically meaningful. PMID:25997986
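
The decomposition described in this abstract, a genetic effect split into a part mediated through gene expression (ME) and a part acting through other routes (AE), can be illustrated with a generic linear product-of-coefficients sketch. This is not the iGWAS omnibus test, which also handles family-based designs and different disease models; the function name, the simulated data, and the effect sizes below are all assumptions made for illustration.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Population covariance of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def mediation_effects(snp, expression, phenotype):
    """Linear mediation by ordinary least squares (hypothetical helper).

    Fits expression ~ snp and phenotype ~ snp + expression, then reports
    the mediated effect (a * b), the alternative effect (coefficient on
    snp adjusted for expression), and the total effect (phenotype ~ snp).
    """
    s_ss = cov(snp, snp)
    s_m = cov(snp, expression)
    s_y = cov(snp, phenotype)
    m_mm = cov(expression, expression)
    m_y = cov(expression, phenotype)

    a = s_m / s_ss                # slope of expression on snp
    total = s_y / s_ss            # slope of phenotype on snp alone

    # Solve the 2x2 normal equations for phenotype ~ snp + expression.
    det = s_ss * m_mm - s_m ** 2
    alternative = (s_y * m_mm - s_m * m_y) / det   # coefficient on snp
    b = (s_ss * m_y - s_m * s_y) / det             # coefficient on expression

    return {"mediated": a * b, "alternative": alternative, "total": total}


# Simulated toy data: genotype coded 0/1/2, effect sizes chosen arbitrarily.
rng = random.Random(0)
n = 2000
snp = [float(rng.choice([0, 1, 2])) for _ in range(n)]
expression = [0.8 * s + rng.gauss(0, 1) for s in snp]
phenotype = [0.5 * m + 0.1 * s + rng.gauss(0, 1) for s, m in zip(snp, expression)]

effects = mediation_effects(snp, expression, phenotype)
print(effects)
```

In a purely linear model fitted by least squares, the total effect equals the sum of the mediated and alternative effects, which is a convenient sanity check on the fit.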

  4. Genome-wide profiling of HPV integration in cervical cancer identifies clustered genomic hot spots and a potential microhomology-mediated integration mechanism.

    PubMed

    Hu, Zheng; Zhu, Da; Wang, Wei; Li, Weiyang; Jia, Wenlong; Zeng, Xi; Ding, Wencheng; Yu, Lan; Wang, Xiaoli; Wang, Liming; Shen, Hui; Zhang, Changlin; Liu, Hongjie; Liu, Xiao; Zhao, Yi; Fang, Xiaodong; Li, Shuaicheng; Chen, Wei; Tang, Tang; Fu, Aisi; Wang, Zou; Chen, Gang; Gao, Qinglei; Li, Shuang; Xi, Ling; Wang, Changyu; Liao, Shujie; Ma, Xiangyi; Wu, Peng; Li, Kezhen; Wang, Shixuan; Zhou, Jianfeng; Wang, Jun; Xu, Xun; Wang, Hui; Ma, Ding

    2015-02-01

    Human papillomavirus (HPV) integration is a key genetic event in cervical carcinogenesis. By conducting whole-genome sequencing and high-throughput viral integration detection, we identified 3,667 HPV integration breakpoints in 26 cervical intraepithelial neoplasias, 104 cervical carcinomas and five cell lines. Beyond recalculating frequencies for the previously reported frequent integration sites POU5F1B (9.7%), FHIT (8.7%), KLF12 (7.8%), KLF5 (6.8%), LRP1B (5.8%) and LEPREL1 (4.9%), we discovered new hot spots HMGA2 (7.8%), DLG2 (4.9%) and SEMA3D (4.9%). Protein expression from FHIT and LRP1B was downregulated when HPV integrated in their introns. Protein expression from MYC and HMGA2 was elevated when HPV integrated into flanking regions. Moreover, microhomologous sequence between the human and HPV genomes was significantly enriched near integration breakpoints, indicating that fusion between viral and human DNA may have occurred by microhomology-mediated DNA repair pathways. Our data provide insights into HPV integration-driven cervical carcinogenesis. PMID:25581428

  5. Biologically inspired survival analysis based on integrating gene expression as mediator with genomic variants.

    PubMed

    Youssef, Ibrahim; Clarke, Robert; Shih, Ie-Ming; Wang, Yue; Yu, Guoqiang

    2016-10-01

    Accurately linking cancer molecular profiling with survival can lead to improvements in the clinical management of cancer. However, existing survival analysis relies on statistical evidence from a single level of data, without paying much attention to the integration of interacting multi-level data and the underlying biology. Advances in genomic techniques provide unprecedented power to characterize cancer tissue more completely than before, offering the opportunity to design biologically informed and integrative approaches for survival data analysis. Human cancer is characterized by somatic copy number alteration and unique gene expression profiles. However, it remains largely unclear how to integrate the gene expression and genetic variant data to achieve a better prediction of patient survival and an improved understanding of disease progression. Consistent with the biological hierarchy from DNA to RNA, we prioritize each survival-relevant feature with two separate scores, predictive and mechanistic. For mRNA expression levels, predictive features are those mRNAs whose variation in expression levels is associated with survival outcome, and mechanistic features are those mRNAs whose variation in expression levels is associated with genomic variants. Further, we simultaneously integrate information from both the predictive model and the mechanistic model through our new approach, GEMPS (Gene Expression as a Mediator for Predicting Survival). Applied to two cancer types (ovarian and glioblastoma multiforme), our method achieved better prediction power (p-value: 6.18E-03 to 5.15E-11) than peer methods (GE.CNAs and GE.CNAs.Lasso). Gene set enrichment analysis confirms that the genes utilized for the final survival analysis are biologically important and relevant. PMID:27619193

  6. Integrated Genomics Identifies Convergence of Ankylosing Spondylitis with Global Immune Mediated Disease Pathways

    PubMed Central

    Uddin, Mohammed; Codner, Dianne; Mahmud Hasan, S M; Scherer, Stephen W; O’Rielly, Darren D; Rahman, Proton

    2015-01-01

    Ankylosing spondylitis (AS) is a highly heritable complex inflammatory arthritis. Although a handful of non-HLA risk loci have been identified, capturing the unexplained genetic contribution to AS pathogenesis remains a challenge attributed to additive, pleiotropic and epistatic interactions at the molecular level. Here, we developed multiple integrated genomic approaches to quantify molecular convergence of non-HLA loci with global immune mediated diseases. We show that non-HLA genes are significantly sensitive to deleterious mutation accumulation in the general population compared with tolerant genes. Human developmental proteomics (prenatal to adult) analysis revealed that proteins encoded by non-HLA AS risk loci are 2-fold more expressed in adult hematopoietic cells. Enrichment analysis revealed AS risk genes overlap with a significant number of immune related pathways (p < 0.0001 to 9.8 × 10^-12). Protein-protein interaction analysis revealed non-shared AS risk genes are highly clustered seeds that significantly converge (empirical; p < 0.01 to 1.6 × 10^-4) into networks of global immune mediated disease risk loci. We have also provided initial evidence for the involvement of STAT2/3 in AS pathogenesis. Collectively, these findings highlight molecular insight on non-HLA AS risk loci that are not exclusively connected with overlapping immune mediated diseases; rather a component of common pathophysiological pathways with other immune mediated diseases. This information will be pivotal to fully explain AS pathogenesis and identify new therapeutic targets. PMID:25980808

  7. Expression and genomic integration of transgenes after Agrobacterium-mediated transformation of mature barley embryos.

    PubMed

    Uçarlı, C; Tufan, F; Gürel, F

    2015-01-01

    Mature embryos in tissue cultures are advantageous because of their abundance and rapid germination, which reduces genomic instability problems. In this study, 2-day-old isolated mature barley embryos were infected with 2 Agrobacterium hypervirulent strains (AGL1 and EHA105), followed by a 3-day period of co-cultivation in the presence of the amino acid L-cysteine. Chimeric expression of the β-glucuronidase gene (gusA) directed by a viral promoter of strawberry vein banding virus was observed in coleoptile epidermal cells and seminal roots in 5-day-old germinated seedlings. In addition to varying infectivity patterns in different strains, there was a higher ratio of transient β-glucuronidase expression in developing coleoptiles than in embryonic roots, indicating the high competency of shoot apical meristem cells in the mature embryo. A total of 548 explants were transformed and 156 plants developed to maturity on G418 media after 18-25 days. We detected transgenes in 74% of the screened plant leaves by polymerase chain reaction, and 49% of these expressed the neomycin phosphotransferase II gene following AGL1 transformation. Ten randomly selected T0 transformants were analyzed using thermal asymmetric interlaced polymerase chain reaction, and 24 fragments ranging from 200 to 600 base pairs were sequenced. Three of the sequences flanking the transferred DNA showed high similarity to coding regions of the barley genome, including alpha tubulin5, homeobox 1, and mitochondrial 16S genes. We observed 70-200-base pair filler sequences only in the coding regions of barley in this study. PMID:25730049

  9. XerD-mediated FtsK-independent integration of TLCϕ into the Vibrio cholerae genome.

    PubMed

    Midonet, Caroline; Das, Bhabatosh; Paly, Evelyne; Barre, Francois-Xavier

    2014-11-25

    As in most bacteria, topological problems arising from the circularity of the two Vibrio cholerae chromosomes, chrI and chrII, are resolved by the addition of a crossover at a specific site of each chromosome, dif, by two tyrosine recombinases, XerC and XerD. The reaction is under the control of a cell division protein, FtsK, which activates the formation of a Holliday Junction (HJ) intermediate by XerD catalysis that is resolved into product by XerC catalysis. Many plasmids and phages exploit Xer recombination for dimer resolution and for integration, respectively. In all cases so far described, they rely on an alternative recombination pathway in which XerC catalyzes the formation of a HJ independently of FtsK. This is notably the case for CTXϕ, the cholera toxin phage. Here, we show that in contrast, integration of TLCϕ, a toxin-linked cryptic satellite phage that is almost always found integrated at the chrI dif site before CTXϕ, depends on the formation of a HJ by XerD catalysis, which is then resolved by XerC catalysis. The reaction nevertheless escapes the normal cellular control exerted by FtsK on XerD. In addition, we show that the same reaction promotes the excision of TLCϕ, along with any CTXϕ copy present between dif and its left attachment site, providing a plausible mechanism for how chrI CTXϕ copies can be eliminated, as occurred in the second wave of the current cholera pandemic. PMID:25385643

  10. Establishment of an improved high-efficiency thermal asymmetric interlaced PCR for identification of genomic integration sites mediated by phiC31 integrase.

    PubMed

    Zhou, Zaiwei; Ma, Haiyan; Qu, Lijuan; Xie, Fei; Ma, Qingwen; Ren, Zhaorui

    2012-03-01

    Streptomyces phage phiC31 integrase is widely used to mediate the integration of exogenous genes into host genomes for gene therapy and genomic modification, as it autonomously performs efficient, unidirectional, site-specific integration into pseudo attP sites of the host genome. Although pseudo attP sites are rarely found within exons, it is necessary to map their precise locations to avoid the risk of insertion mutagenesis. High-efficiency thermal asymmetric interlaced PCR (hiTAIL-PCR) is a technique that has been developed to recover genomic sequences that flank insertion tags. We have found, however, that this technique is poorly efficient, as it amplifies many non-specific targets and frequently does not generate sufficient product for downstream analysis. Therefore, we have modified the hiTAIL-PCR procedure and re-designed the random primers. As a result, both the amount and specificity of the reaction product were enhanced for each integration site. Restriction analysis of known sequences within the integrated vector, which co-amplified with the flanking genomic sequences, validated 90% of these bands for sequencing. In contrast, only 30% of the bands produced by previous hiTAIL-PCR could be validated. Compared with the original hiTAIL-PCR, our improved hiTAIL-PCR procedure identified phiC31 integration sites more accurately and efficiently.

  11. DNA-PK-mediated phosphorylation of EZH2 regulates the DNA damage-induced apoptosis to maintain T-cell genomic integrity

    PubMed Central

    Wang, Y; Sun, H; Wang, J; Wang, H; Meng, L; Xu, C; Jin, M; Wang, B; Zhang, Y; Zhang, Y; Zhu, T

    2016-01-01

    EZH2 is a histone methyltransferase whose functions in stem cells and tumor cells are well established. Accumulating evidence shows that EZH2 has critical roles in T cells and could be a promising therapeutic target for several immune diseases. To further reveal the novel functions of EZH2 in human T cells, protein co-immunoprecipitation combined with mass spectrometry was conducted and several previously unknown EZH2-interacting proteins were identified. Of them, we focused on a DNA damage responsive protein, Ku80, because of the limited knowledge regarding EZH2 in the DNA damage response. Then, we demonstrated that instead of being methylated by EZH2, Ku80 bridges the interaction between the DNA-dependent protein kinase (DNA-PK) complex and EZH2, thus facilitating EZH2 phosphorylation. Moreover, EZH2 histone methyltransferase activity was enhanced when Ku80 was knocked down or DNA-PK activity was inhibited, suggesting that DNA-PK-mediated EZH2 phosphorylation impairs EZH2 histone methyltransferase activity. On the other hand, EZH2 inhibition increased the DNA damage level at the late phase of T-cell activation, suggesting that EZH2 is involved in genomic integrity maintenance. In conclusion, our study is the first to demonstrate that EZH2 is phosphorylated by the DNA damage responsive complex DNA-PK and regulates DNA damage-mediated T-cell apoptosis, which reveals a novel functional crosstalk between epigenetic regulation and genomic integrity. PMID:27468692

  12. Integration of transcriptomic and genomic data suggests candidate mechanisms for APOE4-mediated pathogenic action in Alzheimer’s disease

    PubMed Central

    Caberlotto, Laura; Marchetti, Luca; Lauria, Mario; Scotti, Marco; Parolo, Silvia

    2016-01-01

    Among the genetic factors known to increase the risk of late onset Alzheimer’s disease (AD), the presence of the apolipoprotein E4 (APOE4) allele has been recognized as the one with the strongest effect. However, despite decades of research, the pathogenic role of APOE4 in Alzheimer’s disease has not yet been clearly elucidated. In order to investigate the pathogenic action of APOE4, we applied a systems biology approach to the analysis of transcriptomic and genomic data of APOE44 vs. APOE33 allele carriers affected by Alzheimer’s disease. Network analysis combined with a novel technique for biomarker computation allowed the identification of an alteration in aging-associated processes such as inflammation, oxidative stress and metabolic pathways, indicating that APOE4 possibly accelerates pathological processes physiologically induced by aging. Subsequent integration with genomic data indicates that the Notch pathway could be the nodal molecular mechanism altered in APOE44 allele carriers with Alzheimer’s disease. Interestingly, PSEN1 and APP, genes whose mutations are known to be linked to early onset Alzheimer’s disease, are closely linked to this pathway. In conclusion, the role of APOE4 in inflammation and oxidation through the Notch signaling pathway could be crucial in elucidating the risk factors of Alzheimer’s disease. PMID:27585646

  16. Integrative Genomics Implicates EGFR as a Downstream Mediator in NKX2-1 Amplified Non-Small Cell Lung Cancer

    PubMed Central

    Clarke, Nicole; Biscocho, Jewison; Kwei, Kevin A.; Davidson, Jean M.; Sridhar, Sushmita; Gong, Xue; Pollack, Jonathan R.

    2015-01-01

    NKX2-1, encoding a homeobox transcription factor, is amplified in approximately 15% of non-small cell lung cancers (NSCLC), where it is thought to drive cancer cell proliferation and survival. However, its mechanism of action remains largely unknown. To identify relevant downstream transcriptional targets, here we carried out a combined NKX2-1 transcriptome (NKX2-1 knockdown followed by RNAseq) and cistrome (NKX2-1 binding sites by ChIPseq) analysis in four NKX2-1-amplified human NSCLC cell lines. While NKX2-1 regulated genes differed among the four cell lines assayed, cell proliferation emerged as a common theme. Moreover, in 3 of the 4 cell lines, epidermal growth factor receptor (EGFR) was among the top NKX2-1 upregulated targets, which we confirmed at the protein level by western blot. Interestingly, EGFR knockdown led to upregulation of NKX2-1, suggesting a negative feedback loop. Consistent with this finding, combined knockdown of NKX2-1 and EGFR in NCI-H1819 lung cancer cells reduced cell proliferation (as well as MAP-kinase and PI3-kinase signaling) more than knockdown of either alone. Likewise, NKX2-1 knockdown enhanced the growth-inhibitory effect of the EGFR-inhibitor erlotinib. Taken together, our findings implicate EGFR as a downstream effector of NKX2-1 in NKX2-1 amplified NSCLC, with possible clinical implications, and provide a rich dataset for investigating additional mediators of NKX2-1 driven oncogenesis. PMID:26556242

  17. Yeast Oligo-mediated Genome Engineering (YOGE)

    PubMed Central

    DiCarlo, JE; Conley, AJ; Penttilä, M; Jäntti, J; Wang, HH; Church, GM

    2014-01-01

    High-frequency oligonucleotide-directed recombination engineering (recombineering) has enabled rapid modification of several prokaryotic genomes to date. Here, we present a method for oligonucleotide-mediated recombineering in the model eukaryote and industrial production host S. cerevisiae, which we call Yeast Oligo-mediated Genome Engineering (YOGE). Through a combination of overexpression and knockouts of relevant genes and optimization of transformation and oligonucleotide designs, we achieve high gene modification frequencies at levels that only require screening of dozens of cells. We demonstrate the robustness of our approach in three divergent yeast strains, including those involved in industrial production of bio-based chemicals. Furthermore, YOGE can be iteratively executed via cycling to generate genomic libraries up to 10^5 individuals at each round for diversity generation. YOGE cycling alone, or in combination with phenotypic selections or endonuclease-based negative genotypic selections, can be used to easily generate modified alleles in yeast populations with high frequencies. PMID:24160921

  18. Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine

    PubMed Central

    Elsik, Christine G.; Tayal, Aditi; Diesh, Colin M.; Unni, Deepak R.; Emery, Marianne L.; Nguyen, Hung N.; Hagen, Darren E.

    2016-01-01

    We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. PMID:26578564

  1. Chromatin and the genome integrity network

    PubMed Central

    Papamichos-Chronakis, Manolis; Peterson, Craig L.

    2013-01-01

    The maintenance of genome integrity is essential for organism survival and for the inheritance of traits to offspring. Genomic instability is caused by DNA damage, aberrant DNA replication or uncoordinated cell division, which can lead to chromosomal aberrations and gene mutations. Recently, chromatin regulators that shape the epigenetic landscape have emerged as potential gatekeepers and signalling coordinators for the maintenance of genome integrity. Here, we review chromatin functions during the two major pathways that control genome integrity: namely, repair of DNA damage and DNA replication. We also discuss recent evidence that suggests a novel role for chromatin-remodelling factors in chromosome segregation and in the prevention of aneuploidy. PMID:23247436

  2. Integrative genomics identifies DSCR1 (RCAN1) as a novel NFAT-dependent mediator of phenotypic modulation in vascular smooth muscle cells.

    PubMed

    Lee, Monica Y; Garvey, Sean M; Baras, Alex S; Lemmon, Julia A; Gomez, Maria F; Schoppee Bortz, Pamela D; Daum, Guenter; LeBoeuf, Renee C; Wamhoff, Brian R

    2010-02-01

    Vascular smooth muscle cells (SMCs) display remarkable phenotypic plasticity in response to environmental cues. The nuclear factor of activated T-cells (NFAT) family of transcription factors plays a critical role in vascular pathology. However, known functional NFAT gene targets in vascular SMCs are currently limited. Publicly available whole-genome expression array data sets were analyzed to identify differentially expressed genes in human, mouse and rat SMCs. Comparison between vehicle and phenotypic modulatory stimuli identified 63 species-conserved, upregulated genes. Integration of the 63 upregulated genes with an in silico NFAT-ome (a species-conserved list of gene promoters containing at least one NFAT binding site) identified 18 putative NFAT-dependent genes. Further intersection of these 18 potential NFAT target genes with a mouse in vivo vascular injury microarray identified four putative NFAT-dependent, injury-responsive genes. In vitro validations substantiated the NFAT-dependent role of Cyclooxygenase 2 (COX2/PTGS2) in SMC phenotypic modulation and uncovered Down Syndrome Candidate Region 1 (DSCR1/RCAN1) as a novel NFAT target gene in SMCs. We show that induction of DSCR1 inhibits calcineurin/NFAT signaling through a negative feedback mechanism; DSCR1 overexpression attenuates NFAT transcriptional activity and COX2 protein expression, whereas knockdown of endogenous DSCR1 enhances NFAT transcriptional activity. Our integrative genomics approach illustrates how the combination of publicly available gene expression arrays, computational databases and empirical research methods can answer specific questions in any cell type for a transcriptional network of interest. Herein, we report DSCR1 as a novel NFAT-dependent, injury-inducible, early gene that may serve to negatively regulate SMC phenotypic switching.
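    The successive filtering described in this abstract (63 conserved upregulated genes, intersected with an in silico NFAT-ome to give 18 candidates, then with an in vivo injury data set to give four) amounts to a chain of set intersections. The sketch below illustrates only that idea. The function name and the placeholder gene identifiers are hypothetical and do not come from the study; the real gene lists are not reproduced here.

```python
def intersect_gene_sets(upregulated, nfat_ome, injury_responsive):
    """Chain two set intersections, mirroring the filtering steps in the abstract.

    All three arguments are iterables of gene identifiers (hypothetical toy data here).
    Returns (putative NFAT-dependent genes, injury-responsive subset of those genes).
    """
    # Step 1: conserved upregulated genes whose promoters carry an NFAT site
    nfat_dependent = set(upregulated) & set(nfat_ome)
    # Step 2: keep only those also seen in the in vivo injury data set
    injury_subset = nfat_dependent & set(injury_responsive)
    return nfat_dependent, injury_subset


# Toy example with placeholder identifiers (not real study data)
upregulated = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
nfat_ome = {"GENE_B", "GENE_C", "GENE_D", "GENE_X"}
injury_responsive = {"GENE_C", "GENE_D", "GENE_Y"}

nfat_dependent, injury_subset = intersect_gene_sets(
    upregulated, nfat_ome, injury_responsive
)
print(sorted(nfat_dependent))  # ['GENE_B', 'GENE_C', 'GENE_D']
print(sorted(injury_subset))   # ['GENE_C', 'GENE_D']
```

    Sets make each step order-independent and discard duplicate identifiers, which is convenient when lists come from different array platforms; mapping orthologous gene names across species would have to happen before this step.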

  3. Integrative Genomics of Chronic Obstructive Pulmonary Disease

    PubMed Central

    Hobbs, Brian D.; Hersh, Craig P.

    2014-01-01

    Chronic obstructive pulmonary disease (COPD) is a complex disease with both environmental and genetic determinants, the most important of which is cigarette smoking. There is marked heterogeneity in the development of COPD among persons with similar cigarette smoking histories, which is likely partially explained by genetic variation. Genomic approaches such as genomewide association studies and gene expression studies have been used to discover genes and molecular pathways involved in COPD pathogenesis; however, these “first generation” omics studies have limitations. Integrative genomic studies are emerging which can combine genomic datasets to further examine the molecular underpinnings of COPD. Future research in COPD genetics will likely use network-based approaches to integrate multiple genomic data types in order to model the complex molecular interactions involved in COPD pathogenesis. This article reviews the genomic research to date and offers a vision for the future of integrative genomic research in COPD. PMID:25078622

  4. Reverse transcriptase: mediator of genomic plasticity.

    PubMed

    Brosius, J; Tiedge, H

    1995-01-01

    Reverse transcription has been an important mediator of genomic change. This influence dates back more than three billion years, when the RNA genome was converted into the DNA genome. While the current cellular role(s) of reverse transcriptase are not yet completely understood, it has become clear over the last few years that this enzyme is still responsible for generating significant genomic change and that its activities are one of the driving forces of evolution. Reverse transcriptase generates, for example, extra gene copies (retrogenes), using as a template mature messenger RNAs. Such retrogenes do not always end up as nonfunctional pseudogenes but form, after reinsertion into the genome, new unions with resident promoter elements that may alter the gene's temporal and/or spatial expression levels. More frequently, reverse transcriptase produces copies of nonmessenger RNAs, such as small nuclear or cytoplasmic RNAs. Extremely high copy numbers can be generated by this process. The resulting reinserted DNA copies are therefore referred to as short interspersed repetitive elements (SINEs). SINEs have long been considered selfish DNA, littering the genome via exponential propagation but not contributing to the host's fitness. Many SINEs, however, can give rise to novel genes encoding small RNAs, and are the migrant carriers of numerous control elements and sequence motifs that can equip resident genes with novel regulatory elements [Brosius J. and Gould S.J., Proc Natl Acad Sci USA 89, 10706-10710, 1992]. Retrosequences, such as SINEs and portions of retroelements (e.g., long terminal repeats, LTRs), are capable of donating sequence motifs for nucleosome positioning, DNA methylation, transcriptional enhancers and silencers, poly(A) addition sequences, determinants of RNA stability or transport, splice sites, and even amino acid codons for incorporation into open reading frames as novel protein domains. Retroposition can therefore be considered as a major

  5. How RNA viruses maintain their genome integrity.

    PubMed

    Barr, John N; Fearns, Rachel

    2010-06-01

    RNA genomes are vulnerable to corruption by a range of activities, including inaccurate replication by the error-prone replicase, damage from environmental factors, and attack by nucleases and other RNA-modifying enzymes that comprise the cellular intrinsic or innate immune response. Damage to coding regions and loss of critical cis-acting signals inevitably impair genome fitness; as a consequence, RNA viruses have evolved a variety of mechanisms to protect their genome integrity. These include mechanisms to promote replicase fidelity, recombination activities that allow exchange of sequences between different RNA templates, and mechanisms to repair the genome termini. In this article, we review examples of these processes from a range of RNA viruses to showcase the diverse approaches that viruses have evolved to maintain their genome sequence integrity, focusing first on mechanisms that viruses use to protect their entire genome, and then concentrating on mechanisms that allow protection of the genome termini, which are especially vulnerable. In addition, we discuss examples in which it might be beneficial for a virus to 'lose' its genomic termini and reduce its replication efficiency.

  6. Integrated genome browser: visual analytics platform for genomics

    PubMed Central

    Norris, David C.; Loraine, Ann E.

    2016-01-01

    Motivation: Genome browsers that support fast navigation through vast datasets and provide interactive visual analytics functions can help scientists achieve deeper insight into biological systems. Toward this end, we developed Integrated Genome Browser (IGB), a highly configurable, interactive and fast open source desktop genome browser. Results: Here we describe multiple updates to IGB, including all-new capabilities to display and interact with data from high-throughput sequencing experiments. To demonstrate, we describe example visualizations and analyses of datasets from RNA-Seq, ChIP-Seq and bisulfite sequencing experiments. Understanding results from genome-scale experiments requires viewing the data in the context of reference genome annotations and other related datasets. To facilitate this, we enhanced IGB’s ability to consume data from diverse sources, including Galaxy, Distributed Annotation and IGB-specific Quickload servers. To support future visualization needs as new genome-scale assays enter wide use, we transformed the IGB codebase into a modular, extensible platform for developers to create and deploy all-new visualizations of genomic data. Availability and implementation: IGB is open source and is freely available from http://bioviz.org/igb. Contact: aloraine@uncc.edu PMID:27153568

  7. Integrating Mediators and Moderators in Research Design

    ERIC Educational Resources Information Center

    MacKinnon, David P.

    2011-01-01

    The purpose of this article is to describe mediating variables and moderating variables and provide reasons for integrating them in outcome studies. Separate sections describe examples of moderating and mediating variables and the simplest statistical model for investigating each variable. The strengths and limitations of incorporating mediating…

  8. Methods of Genomic Competency Integration in Practice

    PubMed Central

    Jenkins, Jean; Calzone, Kathleen A.; Caskey, Sarah; Culp, Stacey; Weiner, Marsha; Badzek, Laurie

    2015-01-01

    Purpose Genomics is increasingly relevant to health care, necessitating support for nurses to incorporate genomic competencies into practice. The primary aim of this project was to develop, implement, and evaluate a year-long genomic education intervention that trained, supported, and supervised institutional administrator and educator champion dyads to increase nursing capacity to integrate genomics through assessments of program satisfaction and institutional achieved outcomes. Design Longitudinal study of 23 Magnet Recognition Program® Hospitals (21 intervention, 2 controls) participating in a 1-year new competency integration effort aimed at increasing genomic nursing competency and overcoming barriers to genomics integration in practice. Methods Champion dyads underwent genomic training consisting of one in-person kick-off training meeting followed by monthly education webinars. Champion dyads designed institution-specific action plans detailing objectives, methods or strategies used to engage and educate nursing staff, timeline for implementation, and outcomes achieved. Action plans focused on a minimum of seven genomic priority areas: champion dyad personal development; practice assessment; policy content assessment; staff knowledge needs assessment; staff development; plans for integration; and anticipated obstacles and challenges. Action plans were updated quarterly, outlining progress made as well as inclusion of new methods or strategies. Progress was validated through virtual site visits with the champion dyads and chief nursing officers. Descriptive data were collected on all strategies or methods utilized, and timeline for achievement. Descriptive data were analyzed using content analysis. Findings The complexity of the competency content and the uniqueness of social systems and infrastructure resulted in a significant variation of champion dyad interventions. Conclusions Nursing champions can facilitate change in genomic nursing capacity through

  9. Methods for TALEN-Mediated Genomic Manipulations in Drosophila.

    PubMed

    Liu, Jiyong; Guo, Yawen; Li, Changqing; Chen, Yixu; Jiao, Renjie

    2016-01-01

    TALEN (transcription activator-like effector nuclease) is a powerful tool for gene disruption and other genomic modifications. In the past 3 years or so, it has attracted eyes from every corner of the biological world, due to its characteristics of simplicity, high efficiency, low toxicity, and applicability across almost all species. In our lab, we first reported the TALEN-mediated gene disruption in Drosophila, and recently employed this technique in precisely modifying the Drosophila genome, such as in vivo tagging and gene correction. Here, we describe in detail the protocols and experiences in TALEN-mediated genomic modifications to share with the Drosophilists all over the world.

  10. Integrating Mediators and Moderators in Research Design

    PubMed Central

    MacKinnon, David P.

    2012-01-01

    The purpose of this article is to describe mediating variables and moderating variables and provide reasons for integrating them in outcome studies. Separate sections describe examples of moderating and mediating variables and the simplest statistical model for investigating each variable. The strengths and limitations of incorporating mediating and moderating variables in a research study are discussed as well as approaches to routinely including these variables in outcome research. The routine inclusion of mediating and moderating variables holds the promise of increasing the amount of information from outcome studies by generating practical information about interventions as well as testing theory. The primary focus is on mediating and moderating variables for intervention research but many issues apply to nonintervention research as well. PMID:22675239
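    As an illustration of what a simple model for a mediating variable can look like, the sketch below fits a common textbook single-mediator formulation by ordinary least squares: M = i1 + a·X + e1 and Y = i2 + c'·X + b·M + e2, with the mediated (indirect) effect estimated as a·b. This formulation is an assumption made for illustration and is not quoted from the article; the function names and the toy data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def cov(u, v):
    """Sample covariance of two equal-length sequences (cov(u, u) is the variance)."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def single_mediator(x, m, y):
    """OLS estimates for the single-mediator model X -> M -> Y.

    a      : slope of M regressed on X
    b      : slope of Y on M, adjusting for X
    direct : slope of Y on X, adjusting for M (often written c')
    total  : slope of Y regressed on X alone (often written c)
    """
    sxx, smm = cov(x, x), cov(m, m)
    sxm, sxy, smy = cov(x, m), cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2  # zero only if X and M are perfectly collinear
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / det
    direct = (smm * sxy - sxm * smy) / det
    total = sxy / sxx
    return {"a": a, "b": b, "direct": direct, "indirect": a * b, "total": total}


# Hypothetical toy data: x is a 0/1 intervention indicator
x = [0, 0, 0, 0, 1, 1, 1, 1]
m = [1.0, 1.2, 0.8, 1.1, 2.0, 2.3, 1.9, 2.2]
y = [2.1, 2.4, 1.8, 2.2, 3.9, 4.5, 3.7, 4.2]

effects = single_mediator(x, m, y)
print(effects)
```

    With least-squares estimates the total effect decomposes exactly as total = direct + a·b, which gives a quick sanity check on the fit. A moderating variable would instead be examined with an interaction term (X multiplied by the moderator), which this sketch does not cover.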

  11. Integrating Mediators and Moderators in Research Design.

    PubMed

    Mackinnon, David P

    2011-11-01

    The purpose of this article is to describe mediating variables and moderating variables and provide reasons for integrating them in outcome studies. Separate sections describe examples of moderating and mediating variables and the simplest statistical model for investigating each variable. The strengths and limitations of incorporating mediating and moderating variables in a research study are discussed as well as approaches to routinely including these variables in outcome research. The routine inclusion of mediating and moderating variables holds the promise of increasing the amount of information from outcome studies by generating practical information about interventions as well as testing theory. The primary focus is on mediating and moderating variables for intervention research but many issues apply to nonintervention research as well.

  12. An Integrated Approach to Predictive Genomic Analytics

    SciTech Connect

    McDermott, Jason E.; Sanfilippo, Antonio P.; Taylor, Ronald C.; Baddeley, Robert L.; Riensche, Roderick M.; Jensen, Russell S.

    2010-08-02

    A variety of methods and algorithms have recently been employed in the analysis of gene expression data, including reverse-engineering and knowledge-based pathway modeling, semantic gene similarity, network analysis and clustering. These methods and algorithms address different subparts of the same overall challenge and need to be applied in combination to address predictive genomic analysis as a whole. In this paper, we present an integrated approach to predictive genomic analysis that achieves this objective and describe an application of the approach to the study of neuroprotection in stroke.

  13. Integrative Genomics and Computational Systems Medicine

    SciTech Connect

    McDermott, Jason E.; Huang, Yufei; Zhang, Bing; Xu, Hua; Zhao, Zhongming

    2014-01-01

    The exponential growth in generation of large amounts of genomic data from biological samples has driven the emerging field of systems medicine. This field is promising because it improves our understanding of disease processes at the systems level. However, the field is still in its young stage. There exists a great need for novel computational methods and approaches to effectively utilize and integrate various omics data.

  14. Integrating Computer-Mediated Communication Strategy Instruction

    ERIC Educational Resources Information Center

    McNeil, Levi

    2016-01-01

    Communication strategies (CSs) play important roles in resolving problematic second language interaction and facilitating language learning. While studies in face-to-face contexts demonstrate the benefits of communication strategy instruction (CSI), there have been few attempts to integrate computer-mediated communication and CSI. The study…

  15. TALEN-Mediated Mutagenesis and Genome Editing.

    PubMed

    Ma, Alvin C H; Chen, Yi; Blackburn, Patrick R; Ekker, Stephen C

    2016-01-01

    Transcription activator-like effectors (TALEs) are important genomic tools with customizable DNA-binding motifs for locus-specific modifications. In particular, TALE nucleases or TALENs have been successfully used in the zebrafish model system to introduce targeted mutations via repair of double-stranded breaks (DSBs) either through nonhomologous end joining (NHEJ) or by homology-directed repair (HDR) and homology-independent repair in the presence of a donor template. Compared with other customizable nucleases, TALENs offer high binding specificity and fewer sequence constraints in targeting the genome, with comparable mutagenic activity. Here, we describe a detailed in silico design tool for zebrafish genome editing for TALENs and CRISPR/Cas9 custom restriction enzymes using Mojo Hand 2.0 software. PMID:27464798

  16. Adeno-Associated Virus Type 2 Wild-Type and Vector-Mediated Genomic Integration Profiles of Human Diploid Fibroblasts Analyzed by Third-Generation PacBio DNA Sequencing

    PubMed Central

    Hüser, Daniela; Gogol-Döring, Andreas; Chen, Wei

    2014-01-01

    ABSTRACT Genome-wide analysis of adeno-associated virus (AAV) type 2 integration in HeLa cells has shown that wild-type AAV integrates at numerous genomic sites, including AAVS1 on chromosome 19q13.42. Multiple GAGY/C repeats, resembling consensus AAV Rep-binding sites are preferred, whereas rep-deficient AAV vectors (rAAV) regularly show a random integration profile. This study is the first study to analyze wild-type AAV integration in diploid human fibroblasts. Applying high-throughput third-generation PacBio-based DNA sequencing, integration profiles of wild-type AAV and rAAV are compared side by side. Bioinformatic analysis reveals that both wild-type AAV and rAAV prefer open chromatin regions. Although genomic features of AAV integration largely reproduce previous findings, the pattern of integration hot spots differs from that described in HeLa cells before. DNase-Seq data for human fibroblasts and for HeLa cells reveal variant chromatin accessibility at preferred AAV integration hot spots that correlates with variant hot spot preferences. DNase-Seq patterns of these sites in human tissues, including liver, muscle, heart, brain, skin, and embryonic stem cells further underline variant chromatin accessibility. In summary, AAV integration is dependent on cell-type-specific, variant chromatin accessibility leading to random integration profiles for rAAV, whereas wild-type AAV integration sites cluster near GAGY/C repeats. IMPORTANCE Adeno-associated virus type 2 (AAV) is assumed to establish latency by chromosomal integration of its DNA. This is the first genome-wide analysis of wild-type AAV2 integration in diploid human cells and the first to compare wild-type to recombinant AAV vector integration side by side under identical experimental conditions. Major determinants of wild-type AAV integration represent open chromatin regions with accessible consensus AAV Rep-binding sites. The variant chromatin accessibility of different human tissues or cell types will

  17. RNA-Mediated Epigenetic Programming of Genome Rearrangements

    PubMed Central

    Nowacki, Mariusz; Shetty, Keerthi; Landweber, Laura F.

    2012-01-01

    RNA, normally thought of as a conduit in gene expression, has a novel mode of action in ciliated protozoa. Maternal RNA templates provide both an organizing guide for DNA rearrangements and a template that can transport somatic mutations to the next generation. This opportunity for RNA-mediated genome rearrangement and DNA repair is profound in the ciliate Oxytricha, which deletes 95% of its germline genome during development in a process that severely fragments its chromosomes and then sorts and reorders the hundreds of thousands of pieces remaining. Oxytricha’s somatic nuclear genome is therefore an epigenome formed through RNA templates and signals arising from the previous generation. Furthermore, this mechanism of RNA-mediated epigenetic inheritance can function across multiple generations, and the discovery of maternal template RNA molecules has revealed new biological roles for RNA and has hinted at the power of RNA molecules to sculpt genomic information in cells. PMID:21801022

  18. Enhancing cancer clonality analysis with integrative genomics

    PubMed Central

    2015-01-01

    Introduction It is understood that cancer is a clonal disease initiated by a single cell, and that metastasis, which is the spread of cancer from the primary site, is also initiated by a single cell. The seemingly natural capability of cancer to adapt dynamically in a Darwinian manner is a primary reason for therapeutic failures. Survival advantages may be induced by cancer therapies and also occur as a result of inherent cell and microenvironmental factors. The selected "more fit" clones outmatch their competition and then become dominant in the tumor via propagation of progeny. This clonal expansion leads to relapse, therapeutic resistance and eventually death. The goal of this study is to develop and demonstrate a more detailed clonality approach by utilizing integrative genomics. Methods Patient tumor samples were profiled by Whole Exome Sequencing (WES) and RNA-seq on an Illumina HiSeq 2500 and methylation profiling was performed on the Illumina Infinium 450K array. STAR and the Haplotype Caller were used for RNA-seq processing. Custom approaches were used for the integration of the multi-omic datasets. Results Reported are major enhancements to CloneViz, which now provides capabilities enabling a formal tumor multi-dimensional clonality analysis by integrating: i) DNA mutations, ii) RNA expressed mutations, and iii) DNA methylation data. RNA and DNA methylation integration were not previously possible with CloneViz (previous version) or any other clonality method to date. This new approach, named iCloneViz (integrated CloneViz), employs visualization and quantitative methods, revealing an integrative genomic mutational dissection and traceability (DNA, RNA, epigenetics) through the different layers of molecular structures. Conclusion The iCloneViz approach can be used for analysis of clonal evolution and mutational dynamics of multi-omic data sets. Revealing tumor clonal complexity in an integrative and quantitative manner facilitates improved mutational

  19. Multidimensional Genome-wide Analyses Show Accurate FVIII Integration by ZFN in Primary Human Cells

    PubMed Central

    Sivalingam, Jaichandran; Kenanov, Dimitar; Han, Hao; Nirmal, Ajit Johnson; Ng, Wai Har; Lee, Sze Sing; Masilamani, Jeyakumar; Phan, Toan Thang; Maurer-Stroh, Sebastian; Kon, Oi Lian

    2016-01-01

    Costly coagulation factor VIII (FVIII) replacement therapy is a barrier to optimal clinical management of hemophilia A. Therapy using FVIII-secreting autologous primary cells is potentially efficacious and more affordable. Zinc finger nucleases (ZFN) mediate transgene integration into the AAVS1 locus but comprehensive evaluation of off-target genome effects is currently lacking. In light of serious adverse effects in clinical trials which employed genome-integrating viral vectors, this study evaluated potential genotoxicity of ZFN-mediated transgenesis using different techniques. We employed deep sequencing of predicted off-target sites, copy number analysis, whole-genome sequencing, and RNA-seq in primary human umbilical cord-lining epithelial cells (CLECs) with AAVS1 ZFN-mediated FVIII transgene integration. We combined molecular features to enhance the accuracy and activity of ZFN-mediated transgenesis. Our data showed a low frequency of ZFN-associated indels, no detectable off-target transgene integrations or chromosomal rearrangements. ZFN-modified CLECs had very few dysregulated transcripts and no evidence of activated oncogenic pathways. We also showed AAVS1 ZFN activity and durable FVIII transgene secretion in primary human dermal fibroblasts, bone marrow- and adipose tissue-derived stromal cells. Our study suggests that, with close attention to the molecular design of genome-modifying constructs, AAVS1 ZFN-mediated FVIII integration in several primary human cell types may be safe and efficacious. PMID:26689265

  20. Site-specific recombination in the chicken genome using Flipase recombinase-mediated cassette exchange.

    PubMed

    Lee, Hong Jo; Lee, Hyung Chul; Kim, Young Min; Hwang, Young Sun; Park, Young Hyun; Park, Tae Sub; Han, Jae Yong

    2016-02-01

    Targeted genome recombination has been applied in diverse research fields and has a wide range of possible applications. In particular, the discovery of specific loci in the genome that support robust and ubiquitous expression of integrated genes and the development of genome-editing technology have facilitated rapid advances in various scientific areas. In this study, we produced transgenic (TG) chickens that can induce recombinase-mediated gene cassette exchange (RMCE), one of the site-specific recombination technologies, and confirmed RMCE in TG chicken-derived cells. As a result, we established TG chicken lines that have Flipase (Flp) recognition target (FRT) pairs in the chicken genome, mediated by piggyBac transposition. The transgene integration patterns were diverse in each TG chicken line, and the integration diversity resulted in diverse levels of expression of exogenous genes in each tissue of the TG chickens. In addition, the replaced gene cassette was expressed successfully and maintained by RMCE in the FRT predominant loci of TG chicken-derived cells. These results indicate that targeted genome recombination technology with RMCE could be adaptable to TG chicken models and that the technology would be applicable to specific gene regulation by cis-element insertion and customized expression of functional proteins at predicted levels without epigenetic influence.

  1. Population Genomics of Infectious and Integrated Wolbachia pipientis Genomes in Drosophila ananassae

    PubMed Central

    Choi, Jae Young; Bubnell, Jaclyn E.; Aquadro, Charles F.

    2015-01-01

    Coevolution between Drosophila and its endosymbiont Wolbachia pipientis has many intriguing aspects. For example, Drosophila ananassae hosts two forms of W. pipientis genomes: One being the infectious bacterial genome and the other integrated into the host nuclear genome. Here, we characterize the infectious and integrated genomes of W. pipientis infecting D. ananassae (wAna), by genome sequencing 15 strains of D. ananassae that have either the infectious or integrated wAna genomes. Results indicate evolutionarily stable maternal transmission for the infectious wAna genome suggesting a relatively long-term coevolution with its host. In contrast, the integrated wAna genome showed pseudogene-like characteristics accumulating many variants that are predicted to have deleterious effects if present in an infectious bacterial genome. Phylogenomic analysis of sequence variation together with genotyping by polymerase chain reaction of large structural variations indicated several wAna variants among the eight infectious wAna genomes. In contrast, only a single wAna variant was found among the seven integrated wAna genomes examined in lines from Africa, south Asia, and south Pacific islands suggesting that the integration occurred once from a single infectious wAna genome and then spread geographically. Further analysis revealed that for all D. ananassae we examined with the integrated wAna genomes, the majority of the integrated wAna genomic regions is represented in at least two copies suggesting a double integration or single integration followed by an integrated genome duplication. The possible evolutionary mechanism underlying the widespread geographical presence of the duplicate integration of the wAna genome is an intriguing question remaining to be answered. PMID:26254486

  2. Transposon-mediated Genome Manipulations in Vertebrates

    PubMed Central

    Ivics, Zoltán; Li, Meng Amy; Mátés, Lajos; Boeke, Jef D.; Bradley, Allan; Izsvák, Zsuzsanna

    2010-01-01

    Transposable elements are segments of DNA with the unique ability to move about in the genome. This inherent feature can be exploited to harness these elements as gene vectors for diverse genome manipulations. Transposon-based genetic strategies have been established in vertebrate species over the last decade, and current progress in this field indicates that transposable elements will serve as indispensable tools in the genetic toolkit of vertebrate models. In particular, transposons can be applied as vectors for somatic and germline transgenesis, and as insertional mutagens in both loss-of-function and gain-of-function forward mutagenesis screens. The major advantage of using transposons as genetic tools is that they facilitate analysis of gene function in an easy, controlled and scalable manner. Transposon-based technologies are beginning to be exploited to link sequence information to gene functions in vertebrate models. In this article, we provide an overview of transposon-based methods used in vertebrate model organisms, and highlight the most important considerations concerning genetic applications of the transposon systems. PMID:19478801

  3. Integrated genomic characterization of endometrial carcinoma.

    PubMed

    Kandoth, Cyriac; Schultz, Nikolaus; Cherniack, Andrew D; Akbani, Rehan; Liu, Yuexin; Shen, Hui; Robertson, A Gordon; Pashtan, Itai; Shen, Ronglai; Benz, Christopher C; Yau, Christina; Laird, Peter W; Ding, Li; Zhang, Wei; Mills, Gordon B; Kucherlapati, Raju; Mardis, Elaine R; Levine, Douglas A

    2013-05-01

    We performed an integrated genomic, transcriptomic and proteomic characterization of 373 endometrial carcinomas using array- and sequencing-based technologies. Uterine serous tumours and ∼25% of high-grade endometrioid tumours had extensive copy number alterations, few DNA methylation changes, low oestrogen receptor/progesterone receptor levels, and frequent TP53 mutations. Most endometrioid tumours had few copy number alterations or TP53 mutations, but frequent mutations in PTEN, CTNNB1, PIK3CA, ARID1A and KRAS and novel mutations in the SWI/SNF chromatin remodelling complex gene ARID5B. A subset of endometrioid tumours that we identified had a markedly increased transversion mutation frequency and newly identified hotspot mutations in POLE. Our results classified endometrial cancers into four categories: POLE ultramutated, microsatellite instability hypermutated, copy-number low, and copy-number high. Uterine serous carcinomas share genomic features with ovarian serous and basal-like breast carcinomas. We demonstrated that the genomic features of endometrial carcinomas permit a reclassification that may affect post-surgical adjuvant treatment for women with aggressive tumours.

  4. MycoCosm, an Integrated Fungal Genomics Resource

    SciTech Connect

    Shabalov, Igor; Grigoriev, Igor

    2012-03-16

    MycoCosm is a web-based interactive fungal genomics resource, which was first released in March 2010, in response to an urgent call from the fungal community for integration of all fungal genomes and analytical tools in one place (Pan-fungal data resources meeting, Feb 21-22, 2010, Alexandria, VA). MycoCosm integrates genomics data and analysis tools to navigate through over 100 fungal genomes sequenced at JGI and elsewhere. This resource allows users to explore fungal genomes in the context of both genome-centric analysis and comparative genomics, and promotes user community participation in data submission, annotation and analysis. MycoCosm has over 4500 unique visitors/month or 35000+ visitors/year as well as hundreds of registered users contributing their data and expertise to this resource. Its scalable architecture allows significant expansion of the data expected from JGI Fungal Genomics Program, its users, and integration with external resources used by fungal community.

  5. Receptor-mediated delivery of engineered nucleases for genome modification.

    PubMed

    Chen, Zhong; Jaafar, Lahcen; Agyekum, Davies G; Xiao, Haiyan; Wade, Marlene F; Kumaran, R Ileng; Spector, David L; Bao, Gang; Porteus, Matthew H; Dynan, William S; Meiler, Steffen E

    2013-10-01

    Engineered nucleases, which incise the genome at predetermined sites, have a number of laboratory and clinical applications. There is, however, a need for better methods for controlled intracellular delivery of nucleases. Here, we demonstrate a method for ligand-mediated delivery of zinc finger nuclease (ZFN) proteins using transferrin receptor-mediated endocytosis. Uptake is rapid and efficient in established mammalian cell lines and in primary cells, including mouse and human hematopoietic stem-progenitor cell populations. In contrast to cDNA expression, ZFN protein levels decline rapidly following internalization, affording better temporal control of nuclease activity. We show that transferrin-mediated ZFN uptake leads to site-specific in situ cleavage of the target locus. Additionally, despite the much shorter duration of ZFN activity, the efficiency of gene correction approaches that seen with cDNA-mediated expression. The approach is flexible and general, with the potential for extension to other targeting ligands and nuclease architectures.

  6. TG1 integrase-based system for site-specific gene integration into bacterial genomes.

    PubMed

    Muroi, Tetsurou; Kokuzawa, Takaaki; Kihara, Yoshihiko; Kobayashi, Ryuichi; Hirano, Nobutaka; Takahashi, Hideo; Haruki, Mitsuru

    2013-05-01

    Serine-type phage integrases catalyze unidirectional site-specific recombination between the attachment sites, attP and attB, in the phage and host bacterial genomes, respectively; these integrases and DNA target sites function efficiently when transferred into heterologous cells. We previously developed an in vivo site-specific genomic integration system based on actinophage TG1 integrase that introduces ∼2-kbp DNA into an att site inserted into a heterologous Escherichia coli genome. Here, we analyzed the TG1 integrase-mediated integrations of att site-containing ∼10-kbp DNA into the corresponding att site pre-inserted into various genomic locations; moreover, we developed a system that introduces ∼10-kbp DNA into the genome with an efficiency of ∼10^4 transformants/μg DNA. Integrations of attB-containing DNA into an attP-containing genome were more efficient than integrations of attP-containing DNA into an attB-containing genome, and integrations targeting attP inserted near the replication origin, oriC, and the E. coli "centromere" analogue, migS, were more efficient than those targeting attP within other regions of the genome. Because the genomic region proximal to the oriC and migS sites is located at the extreme poles of the cell during chromosomal segregation, the oriC-migS region may be more exposed to the cytosol than are other regions of the E. coli chromosome. Thus, accessibility of pre-inserted attP to attB-containing incoming DNA may be crucial for the integration efficiency by serine-type integrases in heterologous cells. These results may be beneficial to the development of serine-type integrases-based genomic integration systems for various bacterial species.
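
    The efficiency figure quoted above is expressed in transformants per microgram of DNA. As a minimal sketch of how such a figure is conventionally computed from a plate count, the snippet below uses the standard definition (colonies divided by the amount of DNA actually plated). The function name, its parameters, and the example numbers are hypothetical and are not taken from the abstract.

```python
def transformation_efficiency(colonies, dna_ug, fraction_plated=1.0):
    """Return transformants per microgram of DNA.

    colonies        -- colonies counted on the plate
    dna_ug          -- micrograms of DNA used in the transformation
    fraction_plated -- fraction of the transformation mix that was plated
    """
    if dna_ug <= 0:
        raise ValueError("dna_ug must be positive")
    if not 0 < fraction_plated <= 1:
        raise ValueError("fraction_plated must be in (0, 1]")
    # Only the plated fraction of the DNA contributed to the observed colonies.
    return colonies / (dna_ug * fraction_plated)


# Hypothetical plate: 50 colonies, 0.05 ug DNA, one tenth of the mix plated.
efficiency = transformation_efficiency(colonies=50, dna_ug=0.05, fraction_plated=0.1)
print(f"{efficiency:.2e} transformants/ug DNA")
```

    With these illustrative inputs the result is on the order of 10^4 transformants/μg, the same order of magnitude as the value reported in the abstract.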

  7. Comparative Genome Analysis in the Integrated Microbial Genomes (IMG) System

    SciTech Connect

    Kyrpides, Nikos C.; Markowitz, Victor M.

    2006-03-01

    Comparative genome analysis is critical for the effective exploration of a rapidly growing number of complete and draft sequences for microbial genomes. The Integrated Microbial Genomes (IMG) system (img.jgi.doe.gov) has been developed as a community resource that provides support for comparative analysis of microbial genomes in an integrated context. IMG allows users to navigate the multidimensional microbial genome data space and focus their analysis on a subset of genes, genomes, and functions of interest. IMG provides graphical viewers, summaries and occurrence profile tools for comparing genes, pathways and functions (terms) across specific genomes. Genes can be further examined using gene neighborhoods and compared with sequence alignment tools.

  9. Genomic islands are dynamic, ancient integrative elements in bacterial evolution.

    PubMed

    Boyd, E Fidelma; Almagro-Moreno, Salvador; Parent, Michelle A

    2009-02-01

    Acquisition of genomic islands plays a central part in bacterial evolution as a mechanism of diversification and adaptation. Genomic islands are non-self-mobilizing integrative and excisive elements that encode diverse functional characteristics but all contain a recombination module comprised of an integrase, associated attachment sites and, in some cases, a recombination directionality factor. Here, we discuss how a group of related genomic islands are evolutionarily ancient elements unrelated to plasmids, phages, integrons and integrative conjugative elements. In addition, we explore the diversity of genomic islands and their insertion sites among Gram-negative bacteria and discuss why they integrate at a limited number of tRNA genes. PMID:19162481

  10. Small tumor virus genomes are integrated near nuclear matrix attachment regions in transformed cells.

    PubMed

    Shera, K A; Shera, C A; McDougall, J K

    2001-12-01

    More than 15% of human cancers have a viral etiology. In benign lesions induced by the small DNA tumor viruses, viral genomes are typically maintained extrachromosomally. Malignant progression is often associated with viral integration into host cell chromatin. To study the role of viral integration in tumorigenesis, we analyzed the positions of integrated viral genomes in tumors and tumor cell lines induced by the small oncogenic viruses, including the high-risk human papillomaviruses, hepatitis B virus, simian virus 40, and human T-cell leukemia virus type 1. We show that viral integrations in tumor cells lie near cellular sequences identified as nuclear matrix attachment regions (MARs), while integrations in nonneoplastic cells show no significant correlation with these regions. In mammalian cells, the nuclear matrix functions in gene expression and DNA replication. MARs play varied but poorly understood roles in eukaryotic gene expression. Our results suggest that integrated tumor virus genomes are subject to MAR-mediated transcriptional regulation, providing insight into mechanisms of viral carcinogenesis. Furthermore, the viral oncoproteins serve as invaluable tools for the study of mechanisms controlling cellular growth. Similarly, our demonstration that integrated viral genomes may be subject to MAR-mediated transcriptional effects should facilitate elucidation of fundamental mechanisms regulating eukaryotic gene expression.

  11. Genome-wide analysis of T-DNA integration into the chromosomes of Magnaporthe oryzae

    PubMed Central

    Choi, Jaehyuk; Park, Jongsun; Jeon, Junhyun; Chi, Myoung-Hwan; Goh, Jaeduk; Yoo, Sung-Yong; Park, Jaejin; Jung, Kyongyong; Kim, Hyojeong; Park, Sook-Young; Rho, Hee-Sool; Kim, Soonok; Kim, Byeong Ryun; Han, Seong-Sook; Kang, Seogchan; Lee, Yong-Hwan

    2007-01-01

    Agrobacterium tumefaciens-mediated transformation (ATMT) has become a prevalent tool for functional genomics of fungi, but our understanding of T-DNA integration into the fungal genome remains limited relative to that in plants. Using a model plant-pathogenic fungus, Magnaporthe oryzae, here we report the most comprehensive analysis of T-DNA integration events in fungi and the development of an informatics infrastructure, termed a T-DNA analysis platform (TAP). We identified a total of 1110 T-DNA-tagged locations (TTLs) and processed the resulting data via TAP. Analysis of the TTLs showed that T-DNA integration was biased among chromosomes and preferred the promoter region of genes. In addition, irregular patterns of T-DNA integration, such as chromosomal rearrangement and readthrough of plasmid vectors, were also observed, showing that T-DNA integration patterns into the fungal genome are as diverse as those of their plant counterparts. However, overall the observed junction structures between T-DNA borders and flanking genomic DNA sequences revealed that T-DNA integration into the fungal genome was more canonical than those observed in plants. Our results support the potential of ATMT as a tool for functional genomics of fungi and show that the TAP is an effective informatics platform for handling data from large-scale insertional mutagenesis. PMID:17850257

  12. Homology-Independent Integration of Plasmid DNA into the Zebrafish Genome.

    PubMed

    Auer, Thomas O; Del Bene, Filippo

    2016-01-01

    Targeting nucleases like zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and the clustered regularly interspaced short palindromic repeats/CRISPR-associated (CRISPR/Cas) system have revolutionized genome-editing possibilities in many model organisms. They allow the generation of loss-of-function alleles by the introduction of double-strand breaks at defined sites within genes, but also more sophisticated genome-editing approaches have become possible. These include the integration of donor plasmid DNA into the genome by homology-independent repair mechanisms after CRISPR/Cas9-mediated cleavage. Here we present a protocol outlining the most important steps to target a genomic site and to integrate a donor plasmid at this defined locus.

  13. International regulatory landscape and integration of corrective genome editing into in vitro fertilization.

    PubMed

    Araki, Motoko; Ishii, Tetsuya

    2014-11-24

    Genome editing technology, including zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeat (CRISPR)/Cas, has enabled far more efficient genetic engineering even in non-human primates. This biotechnology is more likely to develop into medicine for preventing a genetic disease if corrective genome editing is integrated into assisted reproductive technology, represented by in vitro fertilization. Although rapid advances in genome editing are expected to make germline gene correction feasible in a clinical setting, there are many issues that still need to be addressed before this could occur. We herein examine the current status of genome editing in mammalian embryonic stem cells and zygotes and discuss potential issues in the international regulatory landscape regarding human germline gene modification. Moreover, we address some ethical and social issues that would be raised when each country considers whether genome editing-mediated germline gene correction for preventive medicine should be permitted.

  14. Herpesvirus Genome Integration into Telomeric Repeats of Host Cell Chromosomes.

    PubMed

    Osterrieder, Nikolaus; Wallaschek, Nina; Kaufer, Benedikt B

    2014-11-01

    It is well known that numerous viruses integrate their genetic material into host cell chromosomes. Human herpesvirus 6 (HHV-6) and oncogenic Marek's disease virus (MDV) have been shown to integrate their genomes into host telomeres of latently infected cells. This is unusual for herpesviruses as most maintain their genomes as circular episomes during the quiescent stage of infection. The genomic DNA of HHV-6, MDV, and several other herpesviruses harbors telomeric repeats (TMRs) that are identical to host telomere sequences (TTAGGG). At least in the case of MDV, viral TMRs facilitate integration into host telomeres. Integration of HHV-6 occurs not only in lymphocytes but also in the germline of some individuals, allowing vertical virus transmission. Although the molecular mechanism of telomere integration is poorly understood, the presence of TMRs in a number of herpesviruses suggests it is their default program for genome maintenance during latency and also allows efficient reactivation.

  15. Integration of new alternative reference strain genome sequences into the Saccharomyces genome database

    PubMed Central

    Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C.; Dalusag, Kyla; Demeter, Janos; Engel, Stacia; Hellerstedt, Sage T.; Karra, Kalpana; Hitz, Benjamin C.; Nash, Robert S.; Paskov, Kelley; Sheppard, Travis; Skrzypek, Marek; Weng, Shuai; Wong, Edith; Michael Cherry, J.

    2016-01-01

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. To provide a wider scope of genetic and phenotypic variation in yeast, the genome sequences and their corresponding annotations from 11 alternative S. cerevisiae reference strains have been integrated into SGD. Genomic and protein sequence information for genes from these strains is now available on the Sequence and Protein tab of the corresponding Locus Summary pages. We illustrate how these genome sequences can be utilized to aid our understanding of strain-specific functional and phenotypic differences. Database URL: www.yeastgenome.org PMID:27252399

  16. Integrated proteomic and genomic analysis of colorectal cancer

    Cancer.gov

    Investigators who analyzed 95 human colorectal tumor samples have determined how gene alterations identified in previous analyses of the same samples are expressed at the protein level. The integration of proteomic and genomic data, or proteogenomics, pro

  17. Integrated Genome-Based Studies of Shewanella Ecophysiology

    SciTech Connect

    Zhou, Jizhong; He, Zhili

    2014-04-08

    As a part of the Shewanella Federation project, we have used integrated genomic, proteomic and computational technologies to study various aspects of energy metabolism of two Shewanella strains from a systems-level perspective.

  18. Target-specific variants of Flp recombinase mediate genome engineering reactions in mammalian cells.

    PubMed

    Shah, Riddhi; Li, Feng; Voziyanova, Eugenia; Voziyanov, Yuri

    2015-09-01

    Genome engineering relies on DNA-modifying enzymes that are able to locate a DNA sequence of interest and initiate a desired genome rearrangement. Currently, the field predominantly utilizes site-specific DNA nucleases that depend on the host DNA repair machinery to complete a genome modification task. We show here that genome engineering approaches that employ target-specific variants of the self-sufficient, versatile site-specific DNA recombinase Flp can be developed into promising alternatives. We demonstrate that the Flp variant evolved to recombine an FRT-like sequence, FL-IL10A, which is located upstream of the human interleukin-10 gene, and can target this sequence in the model setting of Chinese hamster ovary and human embryonic kidney 293 cells. This target-specific Flp variant is able to perform the integration reaction and, when paired with another recombinase, the dual recombinase-mediated cassette exchange reaction. The efficiency of the integration reaction in human cells can be enhanced by 'humanizing' the Flp variant gene and by adding the nuclear localization sequence to the recombinase.

  19. Principles and methods of integrative genomic analyses in cancer.

    PubMed

    Kristensen, Vessela N; Lingjærde, Ole Christian; Russnes, Hege G; Vollan, Hans Kristian M; Frigessi, Arnoldo; Børresen-Dale, Anne-Lise

    2014-05-01

    Combined analyses of molecular data, such as DNA copy-number alteration, mRNA and protein expression, point to biological functions and molecular pathways being deregulated in multiple cancers. Genomic, metabolomic and clinical data from various solid cancers and model systems are emerging and can be used to identify novel patient subgroups for tailored therapy and monitoring. The integrative genomics methodologies that are used to interpret these data require expertise in different disciplines, such as biology, medicine, mathematics, statistics and bioinformatics, and they can seem daunting. The objectives, methods and computational tools of integrative genomics that are available to date are reviewed here, as is their implementation in cancer research.

  20. Integrated Microbial Genomes (IMG) System from the DOE Joint Genome Institute (JGI)

    DOE Data Explorer

    The integrated microbial genomes (IMG) system is a data management, analysis and annotation platform for all publicly available genomes. IMG contains both draft and complete JGI microbial genomes integrated with all other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and annotating genomes, genes and functions, individually or in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through quarterly releases. IMG is provided by the DOE-Joint Genome Institute (JGI) and is available from http://img.jgi.doe.gov. [Abstract from The integrated microbial genomes (IMG) system in 2007: data content and analysis tool extensions; Victor M. Markowitz, Ernest Szeto, Krishna Palaniappan, Yuri Grechkin, Ken Chu, I-Min A. Chen, Inna Dubchak, Iain Anderson, Athanasios Lykidis, Konstantinos Mavromatis, Natalia N. Ivanova and Nikos C. Kyrpides; Nucleic Acids Research, 2008, Vol. 36 (Database Issue).] See also the companion system, Integrated Microbial Genomes with Microbiome Samples.

  1. Genomic and oncogenic preference of HBV integration in hepatocellular carcinoma

    PubMed Central

    Zhao, Ling-Hao; Liu, Xiao; Yan, He-Xin; Li, Wei-Yang; Zeng, Xi; Yang, Yuan; Zhao, Jie; Liu, Shi-Ping; Zhuang, Xue-Han; Lin, Chuan; Qin, Chen-Jie; Zhao, Yi; Pan, Ze-Ya; Huang, Gang; Liu, Hui; Zhang, Jin; Wang, Ruo-Yu; Yang, Yun; Wen, Wen; Lv, Gui-Shuai; Zhang, Hui-Lu; Wu, Han; Huang, Shuai; Wang, Ming-Da; Tang, Liang; Cao, Hong-Zhi; Wang, Ling; Lee, T.P.; Jiang, Hui; Tan, Ye-Xiong; Yuan, Sheng-Xian; Hou, Guo-Jun; Tao, Qi-Fei; Xu, Qin-Guo; Zhang, Xiu-Qing; Wu, Meng-Chao; Xu, Xun; Wang, Jun; Yang, Huan-Ming; Zhou, Wei-Ping; Wang, Hong-Yang

    2016-01-01

    Hepatitis B virus (HBV) can integrate into the human genome, contributing to genomic instability and hepatocarcinogenesis. Here by conducting high-throughput viral integration detection and RNA sequencing, we identify 4,225 HBV integration events in tumour and adjacent non-tumour samples from 426 patients with HCC. We show that HBV is prone to integrate into rare fragile sites and functional genomic regions including CpG islands. We observe a distinct pattern in the preferential sites of HBV integration between tumour and non-tumour tissues. HBV insertional sites are significantly enriched in the proximity of telomeres in tumours. Recurrent HBV target genes are identified with few that overlap. The overall HBV integration frequency is much higher in tumour genomes of males than in females, with a significant enrichment of integration into chromosome 17. Furthermore, a cirrhosis-dependent HBV integration pattern is observed, affecting distinct targeted genes. Our data suggest that HBV integration has a high potential to drive oncogenic transformation. PMID:27703150

  2. Nuclease-mediated genome editing: At the front-line of functional genomics technology.

    PubMed

    Sakuma, Tetsushi; Woltjen, Knut

    2014-01-01

    Genome editing with engineered endonucleases is rapidly becoming a staple method in developmental biology studies. Engineered nucleases permit random or designed genomic modification at precise loci through the stimulation of endogenous double-strand break repair. Homology-directed repair following targeted DNA damage is mediated by co-introduction of a custom repair template, allowing the derivation of knock-out and knock-in alleles in animal models previously refractory to classic gene targeting procedures. Currently there are three main types of customizable site-specific nucleases delineated by the source mechanism of DNA binding that guides nuclease activity to a genomic target: zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeats (CRISPR). Among these genome engineering tools, characteristics such as the ease of design and construction, mechanism of inducing DNA damage, and DNA sequence specificity all differ, making their application complementary. By understanding the advantages and disadvantages of each method, one may make the best choice for their particular purpose.

  3. An Integrative Breakage Model of genome architecture, reshuffling and evolution: The Integrative Breakage Model of genome evolution, a novel multidisciplinary hypothesis for the study of genome plasticity.

    PubMed

    Farré, Marta; Robinson, Terence J; Ruiz-Herrera, Aurora

    2015-05-01

    Our understanding of genomic reorganization, the mechanics of genomic transmission to offspring during germ line formation, and how these structural changes contribute to the speciation process and genetic disease is far from complete. Earlier attempts to understand the mechanism(s) and constraints that govern genome remodeling suffered from being too narrowly focused, and failed to provide a unified and encompassing view of how genomes are organized and regulated inside cells. Here, we propose a new multidisciplinary Integrative Breakage Model for the study of genome evolution. The analysis of the high-level structural organization of genomes (nucleome), together with the functional constraints that accompany genome reshuffling, provides insights into the origin and plasticity of genome organization that may assist with the detection and isolation of therapeutic targets for the treatment of complex human disorders. PMID:25739389

  5. Bacterial conjugation protein MobA mediates integration of complex DNA structures into plant cells.

    PubMed

    Bravo-Angel, A M; Gloeckler, V; Hohn, B; Tinland, B

    1999-09-01

    Agrobacterium tumefaciens transfers T-DNA to plant cells, where it integrates into the genome, a property that is ensured by bacterial proteins VirD2 and VirE2. Under natural conditions, the protein MobA mobilizes its encoding plasmid, RSF1010, between different bacteria. A detailed analysis of MobA-mediated DNA mobilization by Agrobacterium to plants was performed. We compared the ability of MobA to transfer DNA and integrate it into the plant genome to that of pilot protein VirD2. MobA was found to be about 100-fold less efficient than VirD2 in conducting the DNA from the pTi plasmid to the plant cell nucleus. However, interestingly, DNAs transferred by the two proteins were integrated into the plant cell genome with similar efficiencies. In contrast, most of the integrated DNA copies transferred from a MobA-containing strain were truncated at the 5' end. Isolation and analysis of the most conserved 5' ends revealed patterns which resulted from the illegitimate integration of one transferred DNA within another. These complex integration patterns indicate a specific deficiency in MobA. The data conform to a model according to which efficiency of T-DNA integration is determined by plant enzymes and integrity is determined by bacterial proteins. PMID:10482518

  6. Integration of genomic datasets to predict protein complexes in yeast.

    PubMed

    Jansen, Ronald; Lan, Ning; Qian, Jiang; Gerstein, Mark

    2002-01-01

    The ultimate goal of functional genomics is to define the function of all the genes in the genome of an organism. A large body of information on the biological roles of genes has been accumulated and aggregated in the past decades of research, both from traditional experiments detailing the role of individual genes and proteins, and from newer experimental strategies that aim to characterize gene function on a genomic scale. It is clear that the goal of functional genomics can only be achieved by integrating information and data sources from the variety of these different experiments. Integration of different data is thus an important challenge for bioinformatics. The integration of different data sources often helps to uncover non-obvious relationships between genes, but there are also two further benefits. First, it is likely that whenever information from multiple independent sources agrees, it should be more valid and reliable. Secondly, by looking at the union of multiple sources, one can cover larger parts of the genome. This is obvious for integrating results from multiple single gene or protein experiments, but also necessary for many of the results from genome-wide experiments since they are often confined to certain (although sizable) subsets of the genome. In this paper, we explore an example of such a data integration procedure. We focus on the prediction of membership in protein complexes for individual genes. For this, we recruit six different data sources that include expression profiles, interaction data, essentiality and localization information. Each of these data sources individually contains some weakly predictive information with respect to protein complexes, but we show how this prediction can be improved by combining all of them. Supplementary information is available at http://bioinfo.mbb.yale.edu/integrate/interactions/. PMID:12836664
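
    The abstract above does not say how the weakly predictive sources are combined. One common way to pool independent evidence is a naive-Bayes product of likelihood ratios, sketched below purely as an illustration. The independence assumption, the function names, the prior odds, and every likelihood-ratio value are hypothetical and are not taken from the paper.

```python
from math import prod


def combine_likelihood_ratios(prior_odds, likelihood_ratios):
    """Posterior odds, assuming the evidence sources are conditionally independent."""
    return prior_odds * prod(likelihood_ratios)


def odds_to_probability(odds):
    """Convert odds in favour of an event into a probability."""
    return odds / (1.0 + odds)


# Hypothetical inputs: prior odds that a gene pair shares a complex, and one
# likelihood ratio per data source. Each ratio alone is only weakly informative.
prior_odds = 1 / 600
evidence = {
    "expression": 3.0,
    "interaction": 20.0,
    "essentiality": 2.0,
    "localization": 5.0,
}

posterior_odds = combine_likelihood_ratios(prior_odds, evidence.values())
posterior_prob = odds_to_probability(posterior_odds)
print(f"posterior odds = {posterior_odds:.3f}, probability = {posterior_prob:.3f}")
```

    In this toy setting no single source moves the prior odds far on its own, while their product does, which is the general intuition behind combining several weak predictors.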

  7. A physical map of the papaya genome with integrated genetic map and genome sequence

    PubMed Central

    2009-01-01

    Background: Papaya is a major fruit crop in tropical and subtropical regions worldwide and has primitive sex chromosomes controlling sex determination in this trioecious species. The papaya genome was recently sequenced because of its agricultural importance, unique biological features, and successful application of transgenic papaya for resistance to papaya ringspot virus. As a part of the genome sequencing project, we constructed a BAC-based physical map using a high information-content fingerprinting approach to assist whole genome shotgun sequence assembly. Results: The physical map consists of 963 contigs, representing 9.4× genome equivalents, and was integrated with the genetic map and genome sequence using BAC end sequences and a sequence-tagged high-density genetic map. The estimated genome coverage of the physical map is about 95.8%, while 72.4% of the genome was aligned to the genetic map. A total of 1,181 high quality overgo (overlapping oligonucleotide) probes representing conserved sequences in Arabidopsis and genetically mapped loci in Brassica were anchored on the physical map, which provides a foundation for comparative genomics in the Brassicales. The integrated genetic and physical map aligned with the genome sequence revealed recombination hotspots as well as regions suppressed for recombination across the genome, particularly on the recently evolved sex chromosomes. Suppression of recombination spread to the adjacent region of the male specific region of the Y chromosome (MSY), and recombination rates were recovered gradually and then exceeded the genome average. Recombination hotspots were observed at about 10 Mb away on both sides of the MSY, showing 7-fold increase compared with the genome wide average, demonstrating the dynamics of recombination of the sex chromosomes. Conclusion: A BAC-based physical map of papaya was constructed and integrated with the genetic map and genome sequence. The integrated map facilitated the draft genome assembly

  8. Orchidstra: An Integrated Orchid Functional Genomics Database

    PubMed Central

    Su, Chun-lin; Chao, Ya-Ting; Yen, Shao-Hua; Chen, Chun-Yi; Chen, Wan-Chieh; Chang, Yao-Chien Alex; Shih, Ming-Che

    2013-01-01

    A specialized orchid database, named Orchidstra (URL: http://orchidstra.abrc.sinica.edu.tw), has been constructed to collect, annotate and share genomic information for orchid functional genomics studies. The Orchidaceae is a large family of Angiosperms that exhibits extraordinary biodiversity in terms of both the number of species and their distribution worldwide. Orchids exhibit many unique biological features; however, investigation of these traits is currently constrained due to the limited availability of genomic information. Transcriptome information for five orchid species and one commercial hybrid has been included in the Orchidstra database. Altogether, these comprise >380,000 non-redundant orchid transcript sequences, of which >110,000 are protein-coding genes. Sequences from the transcriptome shotgun assembly (TSA) were obtained either from output reads from next-generation sequencing technologies assembled into contigs, or from conventional cDNA library approaches. An annotation pipeline using Gene Ontology, KEGG and Pfam was built to assign gene descriptions and functional annotation to protein-coding genes. Deep sequencing of small RNA was also performed for Phalaenopsis aphrodite to search for microRNAs (miRNAs), extending the information archived for this species to miRNA annotation, precursors and putative target genes. The P. aphrodite transcriptome information was further used to design probes for an oligonucleotide microarray, and expression profiling analysis was carried out. The intensities of hybridized probes derived from microarray assays of various tissues were incorporated into the database as part of the functional evidence. In the future, the content of the Orchidstra database will be expanded with transcriptome data and genomic information from more orchid species. PMID:23324169

  9. An integrated approach to structural genomics.

    PubMed

    Heinemann, U; Frevert, J; Hofmann, K; Illing, G; Maurer, C; Oschkinat, H; Saenger, W

    2000-01-01

    Structural genomics aims at determining a set of protein structures that will represent all domain folds present in the biosphere. These structures can be used as the basis for the homology modelling of the majority of all remaining protein domains or, indeed, proteins. Structural genomics therefore promises to provide a comprehensive structural description of the protein universe. To achieve this, a broad scientific effort is required. The Berlin-based "Protein Structure Factory" (PSF) plans to contribute to this effort by setting up a local infrastructure for the low-cost, high-throughput analysis of soluble human proteins. In close collaboration with the German Human Genome Project (DHGP), protein-coding genes will be expressed in Escherichia coli or yeast. Affinity-tagged proteins will be purified semi-automatically for biophysical characterization and structure analysis by X-ray diffraction methods and NMR spectroscopy. In all steps of the structure analysis process, possibilities for automation, parallelization and standardization will be explored. Major new facilities that are created for the PSF include a robotic station for large-scale protein crystallization, an NMR center and an experimental station for protein crystallography at the synchrotron storage ring BESSY II in Berlin. PMID:11063780

  10. Overview of the Integrated Genomic Data system (IGD)

    SciTech Connect

    Hagstrom, R.; Overbeek, R.; Price, M.; Micheals, G.S.; Taylor, R. (Div. of Computer Resources and Technology)

    1992-01-01

    In previous work, we developed a database system to support analysis of the E. coli genome. That system provided a pidgin-English query facility, rudimentary pattern-matching capabilities, and the ability to rapidly extract answers to a wide variety of questions about the organization of the E. coli genome. To enable the comparative analysis of the genomes from different species, we have designed and implemented a new prototype database system, called the Integrated Genomic Database (IGD). IGD extends our earlier effort by incorporating a set of curator's tools that facilitate the incorporation of physical and genetic data, together with the results of genome organization analysis, into a common database system. Additional tools for extracting, manipulating, and analyzing data are planned.

  11. Overview of the Integrated Genomic Data system (IGD)

    SciTech Connect

    Hagstrom, R.; Overbeek, R.; Price, M.; Micheals, G.S.; Taylor, R.

    1992-12-01

    In previous work, we developed a database system to support analysis of the E. coli genome. That system provided a pidgin-English query facility, rudimentary pattern-matching capabilities, and the ability to rapidly extract answers to a wide variety of questions about the organization of the E. coli genome. To enable the comparative analysis of the genomes from different species, we have designed and implemented a new prototype database system, called the Integrated Genomic Database (IGD). IGD extends our earlier effort by incorporating a set of curator's tools that facilitate the incorporation of physical and genetic data, together with the results of genome organization analysis, into a common database system. Additional tools for extracting, manipulating, and analyzing data are planned.

  12. IMGD: an integrated platform supporting comparative genomics and phylogenetics of insect mitochondrial genomes

    PubMed Central

    Lee, Wonhoon; Park, Jongsun; Choi, Jaeyoung; Jung, Kyongyong; Park, Bongsoo; Kim, Donghan; Lee, Jaeyoung; Ahn, Kyohun; Song, Wonho; Kang, Seogchan; Lee, Yong-Hwan; Lee, Seunghwan

    2009-01-01

    Background Sequences and organization of the mitochondrial genome have been used as markers to investigate evolutionary history and relationships in many taxonomic groups. The rapidly increasing mitochondrial genome sequences from diverse insects provide ample opportunities to explore various global evolutionary questions in the superclass Hexapoda. To adequately support such questions, it is imperative to establish an informatics platform that facilitates the retrieval and utilization of available mitochondrial genome sequence data. Results The Insect Mitochondrial Genome Database (IMGD) is a new integrated platform that archives the mitochondrial genome sequences from 25,747 hexapod species, including 112 completely sequenced and 20 nearly completed genomes and 113,985 partially sequenced mitochondrial genomes. The Species-driven User Interface (SUI) of IMGD supports data retrieval and diverse analyses at multi-taxon levels. The Phyloviewer implemented in IMGD provides three methods for drawing phylogenetic trees and displays the resulting trees on the web. The SNP database incorporated to IMGD presents the distribution of SNPs and INDELs in the mitochondrial genomes of multiple isolates within eight species. A newly developed comparative SNU Genome Browser supports the graphical presentation and interactive interface for the identified SNPs/INDELs. Conclusion The IMGD provides a solid foundation for the comparative mitochondrial genomics and phylogenetics of insects. All data and functions described here are available at the web site . PMID:19351385

  13. Identifying potential cancer driver genes by genomic data integration

    NASA Astrophysics Data System (ADS)

    Chen, Yong; Hao, Jingjing; Jiang, Wei; He, Tong; Zhang, Xuegong; Jiang, Tao; Jiang, Rui

    2013-12-01

    Cancer is a genomic disease associated with a plethora of gene mutations resulting in a loss of control over vital cellular functions. Among these mutated genes, driver genes are defined as being causally linked to oncogenesis, while passenger genes are thought to be irrelevant for cancer development. With increasing numbers of large-scale genomic datasets available, integrating these genomic data to identify driver genes from aberration regions of cancer genomes becomes an important goal of cancer genome analysis and investigations into mechanisms responsible for cancer development. A computational method, MAXDRIVER, is proposed here to identify potential driver genes on the basis of copy number aberration (CNA) regions of cancer genomes, by integrating publicly available human genomic data. MAXDRIVER employs several optimization strategies to construct a heterogeneous network, by means of combining a fused gene functional similarity network, gene-disease associations and a disease phenotypic similarity network. MAXDRIVER was validated to effectively recall known associations among genes and cancers. Previously identified as well as novel driver genes were detected by scanning CNAs of breast cancer, melanoma and liver carcinoma. Three predicted driver genes (CDKN2A, AKT1, RNF139) were found common in these three cancers by comparative analysis.

  14. Identifying potential cancer driver genes by genomic data integration

    PubMed Central

    Chen, Yong; Hao, Jingjing; Jiang, Wei; He, Tong; Zhang, Xuegong; Jiang, Tao; Jiang, Rui

    2013-01-01

    Cancer is a genomic disease associated with a plethora of gene mutations resulting in a loss of control over vital cellular functions. Among these mutated genes, driver genes are defined as being causally linked to oncogenesis, while passenger genes are thought to be irrelevant for cancer development. With increasing numbers of large-scale genomic datasets available, integrating these genomic data to identify driver genes from aberration regions of cancer genomes becomes an important goal of cancer genome analysis and investigations into mechanisms responsible for cancer development. A computational method, MAXDRIVER, is proposed here to identify potential driver genes on the basis of copy number aberration (CNA) regions of cancer genomes, by integrating publicly available human genomic data. MAXDRIVER employs several optimization strategies to construct a heterogeneous network, by means of combining a fused gene functional similarity network, gene-disease associations and a disease phenotypic similarity network. MAXDRIVER was validated to effectively recall known associations among genes and cancers. Previously identified as well as novel driver genes were detected by scanning CNAs of breast cancer, melanoma and liver carcinoma. Three predicted driver genes (CDKN2A, AKT1, RNF139) were found common in these three cancers by comparative analysis. PMID:24346768

  15. Integrated Genomic Analyses in Bronchopulmonary Dysplasia

    PubMed Central

    Ambalavanan, Namasivayam; Cotten, C. Michael; Page, Grier P.; Carlo, Waldemar A.; Murray, Jeffrey C.; Bhattacharya, Soumyaroop; Mariani, Thomas J.; Cuna, Alain C.; Faye-Petersen, Ona M.; Kelly, David; Higgins, Rosemary D.

    2014-01-01

    Objective To identify single nucleotide polymorphisms (SNPs) and pathways associated with bronchopulmonary dysplasia (BPD) because O2 requirement at 36 weeks’ post-menstrual age risk is strongly influenced by heritable factors. Study design A genome-wide scan was conducted on 1.2 million genotyped SNPs, and an additional 7 million imputed SNPs, using a DNA repository of extremely low birth weight infants. Genome-wide association and gene set analysis was performed for BPD or death, severe BPD or death, and severe BPD in survivors. Specific targets were validated using gene expression in BPD lung tissue and in mouse models. Results Of 751 infants analyzed, 428 developed BPD or died. No SNPs achieved genome-wide significance (p<10^-8) although multiple SNPs in adenosine deaminase (ADARB2), CD44, and other genes were just below p<10^-6. Of approximately 8000 pathways, 75 were significant at False Discovery Rate (FDR) <0.1 and p<0.001 for BPD/death, 95 for severe BPD/death, and 90 for severe BPD in survivors. The pathway with lowest FDR was miR-219 targets (p=1.41E-08, FDR 9.5E-05) for BPD/death and Phosphorus Oxygen Lyase Activity (includes adenylate and guanylate cyclases) for both severe BPD/death (p=5.68E-08, FDR 0.00019) and severe BPD in survivors (p=3.91E-08, FDR 0.00013). Gene expression analysis confirmed significantly increased miR-219 and CD44 in BPD. Conclusions Pathway analyses confirmed involvement of known pathways of lung development and repair (CD44, Phosphorus Oxygen Lyase Activity) and indicated novel molecules and pathways (ADARB2, Targets of miR-219) involved in genetic predisposition to BPD. PMID:25449221

  16. Integrated Genomic Characterization of Papillary Thyroid Carcinoma

    PubMed Central

    Agrawal, Nishant; Akbani, Rehan; Aksoy, B. Arman; Ally, Adrian; Arachchi, Harindra; Asa, Sylvia L.; Auman, J. Todd; Balasundaram, Miruna; Balu, Saianand; Baylin, Stephen B.; Behera, Madhusmita; Bernard, Brady; Beroukhim, Rameen; Bishop, Justin A.; Black, Aaron D.; Bodenheimer, Tom; Boice, Lori; Bootwalla, Moiz S.; Bowen, Jay; Bowlby, Reanne; Bristow, Christopher A.; Brookens, Robin; Brooks, Denise; Bryant, Robert; Buda, Elizabeth; Butterfield, Yaron S.N.; Carling, Tobias; Carlsen, Rebecca; Carter, Scott L.; Carty, Sally E.; Chan, Timothy A.; Chen, Amy Y.; Cherniack, Andrew D.; Cheung, Dorothy; Chin, Lynda; Cho, Juok; Chu, Andy; Chuah, Eric; Cibulskis, Kristian; Ciriello, Giovanni; Clarke, Amanda; Clayman, Gary L.; Cope, Leslie; Copland, John; Covington, Kyle; Danilova, Ludmila; Davidsen, Tanja; Demchok, John A.; DiCara, Daniel; Dhalla, Noreen; Dhir, Rajiv; Dookran, Sheliann S.; Dresdner, Gideon; Eldridge, Jonathan; Eley, Greg; El-Naggar, Adel K.; Eng, Stephanie; Fagin, James A.; Fennell, Timothy; Ferris, Robert L.; Fisher, Sheila; Frazer, Scott; Frick, Jessica; Gabriel, Stacey B.; Ganly, Ian; Gao, Jianjiong; Garraway, Levi A.; Gastier-Foster, Julie M.; Getz, Gad; Gehlenborg, Nils; Ghossein, Ronald; Gibbs, Richard A.; Giordano, Thomas J.; Gomez-Hernandez, Karen; Grimsby, Jonna; Gross, Benjamin; Guin, Ranabir; Hadjipanayis, Angela; Harper, Hollie A.; Hayes, D. Neil; Heiman, David I.; Herman, James G.; Hoadley, Katherine A.; Hofree, Matan; Holt, Robert A.; Hoyle, Alan P.; Huang, Franklin W.; Huang, Mei; Hutter, Carolyn M.; Ideker, Trey; Iype, Lisa; Jacobsen, Anders; Jefferys, Stuart R.; Jones, Corbin D.; Jones, Steven J.M.; Kasaian, Katayoon; Kebebew, Electron; Khuri, Fadlo R.; Kim, Jaegil; Kramer, Roger; Kreisberg, Richard; Kucherlapati, Raju; Kwiatkowski, David J.; Ladanyi, Marc; Lai, Phillip H.; Laird, Peter W.; Lander, Eric; Lawrence, Michael S.; Lee, Darlene; Lee, Eunjung; Lee, Semin; Lee, William; Leraas, Kristen M.; Lichtenberg, Tara M.; Lichtenstein, Lee; Lin, Pei; Ling, Shiyun; Liu, Jinze; Liu, Wenbin; Liu, Yingchun; LiVolsi, Virginia A.; Lu, Yiling; Ma, Yussanne; Mahadeshwar, Harshad S.; Marra, Marco A.; Mayo, Michael; McFadden, David G.; Meng, Shaowu; Meyerson, Matthew; Mieczkowski, Piotr A.; Miller, Michael; Mills, Gordon; Moore, Richard A.; Mose, Lisle E.; Mungall, Andrew J.; Murray, Bradley A.; Nikiforov, Yuri E.; Noble, Michael S.; Ojesina, Akinyemi I.; Owonikoko, Taofeek K.; Ozenberger, Bradley A.; Pantazi, Angeliki; Parfenov, Michael; Park, Peter J.; Parker, Joel S.; Paull, Evan O.; Pedamallu, Chandra Sekhar; Perou, Charles M.; Prins, Jan F.; Protopopov, Alexei; Ramalingam, Suresh S.; Ramirez, Nilsa C.; Ramirez, Ricardo; Raphael, Benjamin J.; Rathmell, W. Kimryn; Ren, Xiaojia; Reynolds, Sheila M.; Rheinbay, Esther; Ringel, Matthew D.; Rivera, Michael; Roach, Jeffrey; Robertson, A. Gordon; Rosenberg, Mara W.; Rosenthall, Matthew; Sadeghi, Sara; Saksena, Gordon; Sander, Chris; Santoso, Netty; Schein, Jacqueline E.; Schultz, Nikolaus; Schumacher, Steven E.; Seethala, Raja R.; Seidman, Jonathan; Senbabaoglu, Yasin; Seth, Sahil; Sharpe, Samantha; Mills Shaw, Kenna R.; Shen, John P.; Shen, Ronglai; Sherman, Steven; Sheth, Margi; Shi, Yan; Shmulevich, Ilya; Sica, Gabriel L.; Simons, Janae V.; Sipahimalani, Payal; Smallridge, Robert C.; Sofia, Heidi J.; Soloway, Matthew G.; Song, Xingzhi; Sougnez, Carrie; Stewart, Chip; Stojanov, Petar; Stuart, Joshua M.; Tabak, Barbara; Tam, Angela; Tan, Donghui; Tang, Jiabin; Tarnuzzer, Roy; Taylor, Barry S.; Thiessen, Nina; Thorne, Leigh; Thorsson, Vésteinn; Tuttle, R. Michael; Umbricht, Christopher B.; Van Den Berg, David J.; Vandin, Fabio; Veluvolu, Umadevi; Verhaak, Roel G.W.; Vinco, Michelle; Voet, Doug; Walter, Vonn; Wang, Zhining; Waring, Scot; Weinberger, Paul M.; Weinstein, John N.; Weisenberger, Daniel J.; Wheeler, David; Wilkerson, Matthew D.; Wilson, Jocelyn; Williams, Michelle; Winer, Daniel A.; Wise, Lisa; Wu, Junyuan; Xi, Liu; Xu, Andrew W.; Yang, Liming; Yang, Lixing; Zack, Travis I.; Zeiger, Martha A.; Zeng, Dong; Zenklusen, Jean Claude; Zhao, Ni; Zhang, Hailei; Zhang, Jianhua; Zhang, Jiashan (Julia); Zhang, Wei; Zmuda, Erik; Zou, Lihua

    2014-01-01

    Summary Papillary thyroid carcinoma (PTC) is the most common type of thyroid cancer. Here, we describe the genomic landscape of 496 PTCs. We observed a low frequency of somatic alterations (relative to other carcinomas) and extended the set of known PTC driver alterations to include EIF1AX, PPM1D and CHEK2 and diverse gene fusions. These discoveries reduced the fraction of PTC cases with unknown oncogenic driver from 25% to 3.5%. Combined analyses of genomic variants, gene expression, and methylation demonstrated that different driver groups lead to different pathologies with distinct signaling and differentiation characteristics. Similarly, we identified distinct molecular subgroups of BRAF-mutant tumors and multidimensional analyses highlighted a potential involvement of oncomiRs in less-differentiated subgroups. Our results propose a reclassification of thyroid cancers into molecular subtypes that better reflect their underlying signaling and differentiation properties, which has the potential to improve their pathological classification and better inform the management of the disease. PMID:25417114

  17. Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace

    PubMed Central

    Thorvaldsdottir, Helga; Liefeld, Ted; Ocana, Marco; Borges-Rivera, Diego; Pochet, Nathalie; Robinson, James T.; Demchak, Barry; Hull, Tim; Ben-Artzi, Gil; Blankenberg, Daniel; Barber, Galt P.; Lee, Brian T.; Kuhn, Robert M.; Nekrutenko, Anton; Segal, Eran; Ideker, Trey; Reich, Michael; Regev, Aviv; Chang, Howard Y.; Mesirov, Jill P.

    2015-01-01

    Integrative analysis of multiple data types to address complex biomedical questions requires the use of multiple software tools in concert and remains an enormous challenge for most of the biomedical research community. Here we introduce GenomeSpace (http://www.genomespace.org), a cloud-based, cooperative community resource. Seeded as a collaboration of six of the most popular genomics analysis tools, GenomeSpace now supports the streamlined interaction of 20 bioinformatics tools and data resources. To help non-programming users leverage GenomeSpace in integrative analysis, it offers a growing set of ‘recipes’, short workflows involving a few tools and steps to guide investigators through high utility analysis tasks. PMID:26780094

  18. Mutation Detection with Next-Generation Resequencing through a Mediator Genome

    SciTech Connect

    Wurtzel, Omri; Dori-Bachash, Mally; Pietrokovski, Shmuel; Jurkevitch, Edouard; Sorek, Rotem

    2010-12-20

    The affordability of next generation sequencing (NGS) is transforming the field of mutation analysis in bacteria. The genetic basis for phenotype alteration can be identified directly by sequencing the entire genome of the mutant and comparing it to the wild-type (WT) genome, thus identifying acquired mutations. A major limitation for this approach is the need for an a-priori sequenced reference genome for the WT organism, as the short reads of most current NGS approaches usually prohibit de-novo genome assembly. To overcome this limitation we propose a general framework that utilizes the genomes of related organisms as mediators for comparing WT and mutant bacteria. Under this framework, both mutant and WT genomes are sequenced with NGS, and the short sequencing reads are mapped to the mediator genome. Variations between the mutant and the mediator that recur in the WT are ignored, thus pinpointing the differences between the mutant and the WT. To validate this approach we sequenced the genome of Bdellovibrio bacteriovorus 109J, an obligatory bacterial predator, and its prey-independent mutant, and compared both to the mediator species Bdellovibrio bacteriovorus HD100. Although the mutant and the mediator sequences differed in more than 28,000 nucleotide positions, our approach enabled pinpointing the single causative mutation. Experimental validation in 53 additional mutants further established the implicated gene. Our approach extends the applicability of NGS-based mutant analyses beyond the domain of available reference genomes.
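
    The filtering step described in this abstract can be pictured as a set difference over variant calls made against the mediator genome. The sketch below is a minimal illustration under stated assumptions: variants are modelled as hypothetical (position, reference base, alternate base) tuples, the function name and the toy calls are invented for the example, and none of it is taken from the authors' software.

```python
# Illustrative sketch only. Each variant is a hypothetical
# (position, ref_base, alt_base) tuple called against the mediator genome.

def mutant_specific_variants(mutant_vs_mediator, wt_vs_mediator):
    """Keep variants seen in the mutant that do not recur in the wild type."""
    # Variants shared by mutant and WT reflect strain-vs-mediator divergence,
    # so removing them leaves only the mutant-vs-WT differences.
    return sorted(set(mutant_vs_mediator) - set(wt_vs_mediator))


# Toy calls (assumed data, not from the study).
wt_calls = [(101, "A", "G"), (2050, "C", "T"), (7300, "G", "A")]
mutant_calls = [(101, "A", "G"), (2050, "C", "T"), (4512, "T", "C"), (7300, "G", "A")]

candidates = mutant_specific_variants(mutant_calls, wt_calls)
print(candidates)  # [(4512, 'T', 'C')]
```

    In this toy case the three shared calls are discarded and the single mutant-only call remains as the candidate causative mutation.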

  19. Integrated translational genomics for analysis of complex traits in sorghum

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We will report on the integration of sequencing and genotype data from natural variation (by whole genome resequencing [wgs] or genotype by sequencing [gbs]), transcriptome (RNA-seq) and mutant analysis (also by wgs) with the goal of identifying genes controlling important agronomic traits and tran...

  20. Integrating genomic selection into dairy cattle breeding programmes: a review.

    PubMed

    Bouquet, A; Juga, J

    2013-05-01

    Extensive genetic progress has been achieved in dairy cattle populations on many traits of economic importance because of efficient breeding programmes. Success of these programmes has relied on progeny testing of the best young males to accurately assess their genetic merit and hence their potential for breeding. Over the last few years, the integration of dense genomic information into statistical tools used to make selection decisions, commonly referred to as genomic selection, has enabled gains in predicting accuracy of breeding values for young animals without own performance. The possibility to select animals at an early stage allows defining new breeding strategies aimed at boosting genetic progress while reducing costs. The first objective of this article was to review methods used to model and optimize breeding schemes integrating genomic selection and to discuss their relative advantages and limitations. The second objective was to summarize the main results and perspectives on the use of genomic selection in practical breeding schemes, on the basis of the example of dairy cattle populations. Two main designs of breeding programmes integrating genomic selection were studied in dairy cattle. Genomic selection can be used either for pre-selecting males to be progeny tested or for selecting males to be used as active sires in the population. The first option produces moderate genetic gains without changing the structure of breeding programmes. The second option leads to large genetic gains, up to double those of conventional schemes because of a major reduction in the mean generation interval, but it requires greater changes in breeding programme structure. The literature suggests that genomic selection becomes more attractive when it is coupled with embryo transfer technologies to further increase selection intensity on the dam-to-sire pathway. The use of genomic information also offers new opportunities to improve preservation of genetic variation. 
However

  1. Defining nephrotic syndrome from an integrative genomics perspective

    PubMed Central

    Sampson, Matthew G.; Hodgin, Jeffrey B.; Kretzler, Matthias

    2014-01-01

    Nephrotic syndrome (NS) is a clinical condition with a high degree of morbidity and mortality, caused by failure of the glomerular filtration barrier, resulting in massive proteinuria. Our current diagnostic, prognostic and therapeutic decisions in NS are largely based upon clinical or histological patterns such as “focal segmental glomerulosclerosis” or “steroid sensitive”. Yet these descriptive classifications lack the precision to explain the physiologic origins and clinical heterogeneity observed in this syndrome. A more precise definition of NS is required to identify mechanisms of disease and capture various clinical trajectories. An integrative genomics approach to NS applies bioinformatics and computational methods to comprehensive experimental, molecular and clinical data for holistic disease definition. A unique aspect is analysis of data together to discover NS-associated molecules, pathways and networks. Integrating multidimensional datasets from the outset highlights how molecular lesions impact the entire individual. Data sets integrated range from mutation to gene expression, to histologic changes, to progression of chronic kidney disease (CKD). This review will introduce the tenets of integrative genomics and suggest how it can increase our understanding of NS from molecular and pathophysiological perspectives. A diverse group of genome-scale experiments are presented that have sought to define molecular signatures of NS. Finally, the Nephrotic Syndrome Study Network (NEPTUNE) will be introduced as an international, prospective cohort study of patients with NS that utilizes an integrated systems genomics approach from the outset. A major NEPTUNE goal is to achieve comprehensive disease definition from a genomics perspective and identify shared molecular drivers of disease. PMID:24890338

  2. Integrative prescreening in analysis of multiple cancer genomic studies

    PubMed Central

    2012-01-01

    Background In high throughput cancer genomic studies, results from the analysis of single datasets often suffer from a lack of reproducibility because of small sample sizes. Integrative analysis can effectively pool and analyze multiple datasets and provides a cost effective way to improve reproducibility. In integrative analysis, simultaneously analyzing all genes profiled may incur high computational cost. A computationally affordable remedy is prescreening, which fits marginal models, can be conducted in a parallel manner, and has low computational cost. Results An integrative prescreening approach is developed for the analysis of multiple cancer genomic datasets. Simulation shows that the proposed integrative prescreening has better performance than alternatives, particularly including prescreening with individual datasets, an intensity approach and meta-analysis. We also analyze multiple microarray gene profiling studies on liver and pancreatic cancers using the proposed approach. Conclusions The proposed integrative prescreening provides an effective way to reduce the dimensionality in cancer genomic studies. It can be coupled with existing analysis methods to identify cancer markers. PMID:22799431
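
    As a rough illustration of prescreening with marginal models, the sketch below scores each gene separately in every dataset and then pools the scores across datasets. The squared Pearson correlation as the marginal statistic, the sum as the pooling rule, and all function and gene names are assumptions made for this example; they are not the authors' published procedure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def integrative_prescreen(datasets, keep):
    """Rank genes by a pooled marginal score and return the top `keep`.

    `datasets` is a list of (expression, outcome) pairs, where `expression`
    maps a gene name to its values across samples (hypothetical layout).
    """
    # Only genes profiled in every dataset can be pooled.
    genes = set.intersection(*(set(expr) for expr, _ in datasets))
    # Each gene is scored on its own (a marginal model), so this loop
    # could be distributed across workers.
    score = {
        g: sum(pearson(expr[g], outcome) ** 2 for expr, outcome in datasets)
        for g in genes
    }
    return sorted(genes, key=lambda g: (-score[g], g))[:keep]


# Toy data (assumed): GENE_A tracks the outcome in both studies.
study_1 = ({"GENE_A": [1, 2, 3, 4], "GENE_B": [2, 1, 2, 1], "GENE_C": [1, 3, 2, 4]},
           [0, 1, 2, 3])
study_2 = ({"GENE_A": [6, 2, 4, 0], "GENE_B": [1, 1, 2, 2]},
           [3, 1, 2, 0])

selected = integrative_prescreen([study_1, study_2], keep=1)
print(selected)  # ['GENE_A']
```

    Genes that survive such a screen would then be passed to a joint analysis method, which is where the reduction in dimensionality pays off.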

  3. DemaDb: an integrated dematiaceous fungal genomes database

    PubMed Central

    Kuan, Chee Sian; Yew, Su Mei; Chan, Chai Ling; Toh, Yue Fen; Lee, Kok Wei; Cheong, Wei-Hien; Yee, Wai-Yan; Hoh, Chee-Choong; Yap, Soon-Joo; Ng, Kee Peng

    2016-01-01

    Many species of dematiaceous fungi are associated with allergic reactions and potentially fatal diseases in humans, especially in tropical climates. Over the past 10 years, we have isolated more than 400 dematiaceous fungi from various clinical samples. In this study, DemaDb, an integrated database, was designed to support the integration and analysis of dematiaceous fungal genomes. A total of 92 072 putative genes and 6527 pathways identified in eight dematiaceous fungi (Bipolaris papendorfii UM 226, Daldinia eschscholtzii UM 1400, D. eschscholtzii UM 1020, Pyrenochaeta unguis-hominis UM 256, Ochroconis mirabilis UM 578, Cladosporium sphaerospermum UM 843, Herpotrichiellaceae sp. UM 238 and Pleosporales sp. UM 1110) were deposited in DemaDb. DemaDb includes functional annotations for all predicted gene models in all genomes, such as Gene Ontology, EuKaryotic Orthologous Groups, Kyoto Encyclopedia of Genes and Genomes (KEGG), Pfam and InterProScan. All predicted protein models were further functionally annotated to Carbohydrate-Active enzymes, peptidases, secondary metabolites and virulence factors. DemaDb Genome Browser enables users to browse and visualize entire genomes with annotation data including gene prediction, structure, orientation and custom feature tracks. The Pathway Browser based on the KEGG pathway database allows users to look into molecular interaction and reaction networks for all KEGG annotated genes. The availability of downloadable files containing assembly, nucleic acid, as well as protein data allows direct retrieval for further downstream work. DemaDb is a useful resource for the fungal research community, especially those involved in genome-scale analysis, functional genomics, genetics and disease studies of dematiaceous fungi. Database URL: http://fungaldb.um.edu.my PMID:26980516

  4. Megx.net: integrated database resource for marine ecological genomics.

    PubMed

    Kottmann, Renzo; Kostadinov, Ivalyo; Duhaime, Melissa Beth; Buttigieg, Pier Luigi; Yilmaz, Pelin; Hankeln, Wolfgang; Waldmann, Jost; Glöckner, Frank Oliver

    2010-01-01

    Megx.net is a database and portal that provides integrated access to georeferenced marker genes, environment data and marine genome and metagenome projects for microbial ecological genomics. All data are stored in the Microbial Ecological Genomics DataBase (MegDB), which is subdivided to hold both sequence and habitat data and global environmental data layers. The extended system provides access to several hundreds of genomes and metagenomes from prokaryotes and phages, as well as over a million small and large subunit ribosomal RNA sequences. With the refined Genes Mapserver, all data can be interactively visualized on a world map and statistics describing environmental parameters can be calculated. Sequence entries have been curated to comply with the proposed minimal standards for genomes and metagenomes (MIGS/MIMS) of the Genomic Standards Consortium. Access to data is facilitated by Web Services. The updated megx.net portal offers microbial ecologists greatly enhanced database content, and new features and tools for data analysis, all of which are freely accessible from our webpage http://www.megx.net.

  5. GEMINI: Integrative Exploration of Genetic Variation and Genome Annotations

    PubMed Central

    Paila, Umadevi; Chapman, Brad A.; Kirchner, Rory; Quinlan, Aaron R.

    2013-01-01

    Modern DNA sequencing technologies enable geneticists to rapidly identify genetic variation among many human genomes. However, isolating the minority of variants underlying disease remains an important, yet formidable challenge for medical genetics. We have developed GEMINI (GEnome MINIng), a flexible software package for exploring all forms of human genetic variation. Unlike existing tools, GEMINI integrates genetic variation with a diverse and adaptable set of genome annotations (e.g., dbSNP, ENCODE, UCSC, ClinVar, KEGG) into a unified database to facilitate interpretation and data exploration. Whereas other methods provide an inflexible set of variant filters or prioritization methods, GEMINI allows researchers to compose complex queries based on sample genotypes, inheritance patterns, and both pre-installed and custom genome annotations. GEMINI also provides methods for ad hoc queries and data exploration, a simple programming interface for custom analyses that leverage the underlying database, and both command line and graphical tools for common analyses. We demonstrate GEMINI's utility for exploring variation in personal genomes and family based genetic studies, and illustrate its ability to scale to studies involving thousands of human samples. GEMINI is designed for reproducibility and flexibility and our goal is to provide researchers with a standard framework for medical genomics. PMID:23874191

  6. Knowledge integration at the center of genomic medicine.

    PubMed

    Khoury, Muin J; Gwinn, Marta; Dotson, W David; Schully, Sheri D

    2012-07-01

    Three articles in this issue of Genetics in Medicine describe examples of "knowledge integration," involving methods for generating and synthesizing rapidly emerging information on health-related genomic technologies and engaging stakeholders around the evidence. Knowledge integration, the central process in translating genomic research, involves three closely related, iterative components: knowledge management, knowledge synthesis, and knowledge translation. Knowledge management is the ongoing process of obtaining, organizing, and displaying evolving evidence. For example, horizon scanning and "infoveillance" use emerging technologies to scan databases, registries, publications, and cyberspace for information on genomic applications. Knowledge synthesis is the process of conducting systematic reviews using a priori rules of evidence. For example, methods including meta-analysis, decision analysis, and modeling can be used to combine information from basic, clinical, and population research. Knowledge translation refers to stakeholder engagement and brokering to influence policy, guidelines and recommendations, as well as the research agenda to close knowledge gaps. The ultrarapid production of information requires adequate public and private resources for knowledge integration to support the evidence-based development of genomic medicine. PMID:22555656

  7. Integrated genomic analysis of breast cancers.

    PubMed

    Addou-Klouche, L; Adélaïde, J; Cornen, S; Bekhouche, I; Finetti, P; Guille, A; Sircoulomb, F; Raynaud, S; Bertucci, F; Birnbaum, D; Chaffanet, M

    2012-12-01

    Breast cancer is the most frequent and the most deadly cancer in women in Western countries. Different classifications of disease (anatomoclinical, pathological, prognostic, genetic) are used for guiding the management of patients. Unfortunately, they fail to reflect the whole clinical heterogeneity of the disease. Consequently, molecularly distinct diseases are grouped in similar clinical classes, likely explaining the different clinical outcome between patients in a given class, and the fact that selection of the most appropriate diagnostic or therapeutic strategy for each patient is not done accurately. Today, treatment is efficient in only 70.0-75.0% of cases overall. Our repertoire of efficient drugs is limited but is being expanded with the discovery of new molecular targets for new drugs, based on the identification of candidate oncogenes and tumor suppressor genes (TSG) functionally relevant in disease. Development of new drugs makes therapeutical decisions even more demanding of reliable classifiers and prognostic/predictive tests. Breast cancer is a complex, heterogeneous disease at the molecular level. The combinatorial molecular origin and the heterogeneity of malignant cells, and the variability of the host background, create distinct subgroups of tumors endowed with different phenotypic features such as response to therapy and clinical outcome. Cellular and molecular analyses can identify new classes biologically and clinically relevant, as well as provide new clinically relevant markers and targets. The various stages of mammary tumorigenesis are not clearly defined and the genetic and epigenetic events critical to the development and aggressiveness of breast cancer are not precisely known. Because the phenotype of tumors is dependent on many genes, a large-scale and integrated molecular characterization of the genetic and epigenetic alterations and gene expression deregulation should allow the identification of new molecular classes clinically

  8. TALEN-mediated genome engineering to generate targeted mice.

    PubMed

    Sommer, Daniel; Peters, Annika E; Baumgart, Ann-Kathrin; Beyer, Marc

    2015-02-01

    Genetic mouse models are critical for biomedical research to understand gene function and pathophysiology. In recent years, the generation of genetic mouse models has been revolutionized by the emergence of transcription activator-like effector nucleases (TALENs). TALENs are programmable, sequence-specific DNA-binding proteins fused to a non-specific endonuclease domain and are used as powerful tools for site-specific induction of DNA double-strand breaks. These result in disruption of the gene product of the targeted locus by mutations induced during repair by error-prone non-homologous end-joining. Alternatively, these DNA double-strand breaks can be exploited to integrate a user-defined sequence by homologous recombination if an appropriate repair plasmid is provided. In this review, we highlight the major technological improvements for genome editing in murine oocytes which have been achieved using TALENs, discuss current limitations of the technology, suggest strategies to broadly apply TALENs, and describe possible future directions to facilitate gene editing in murine oocytes.

  9. Integrating hospital information systems in healthcare institutions: a mediation architecture.

    PubMed

    El Azami, Ikram; Cherkaoui Malki, Mohammed Ouçamah; Tahon, Christian

    2012-10-01

    Many studies have examined the integration of information systems into healthcare institutions, leading to several standards in the healthcare domain (CORBAmed: Common Object Request Broker Architecture in Medicine; HL7: Health Level Seven International; DICOM: Digital Imaging and Communications in Medicine; and IHE: Integrating the Healthcare Enterprise). Due to the existence of a wide diversity of heterogeneous systems, three essential factors are necessary to fully integrate a system: data, functions and workflow. However, most of the previous studies have dealt with only one or two of these factors, and this makes the system integration unsatisfactory. In this paper, we propose a flexible, scalable architecture for Hospital Information Systems (HIS). Our main purpose is to provide a practical solution to ensure HIS interoperability so that healthcare institutions can communicate without being obliged to change their local information systems and without altering the tasks of the healthcare professionals. Our architecture is a mediation architecture with 3 levels: 1) a database level, 2) a middleware level and 3) a user interface level. The mediation is based on two central components: the Mediator and the Adapter. Using the XML format allows us to establish a structured, secured exchange of healthcare data. The notion of medical ontology is introduced to solve semantic conflicts and to unify the language used for the exchange. Our mediation architecture provides an effective, promising model that promotes the integration of hospital information systems that are autonomous, heterogeneous, semantically interoperable and platform-independent.
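
    The Mediator and Adapter roles described in this abstract can be illustrated with a minimal Python sketch. All class names, field names and sample records below are hypothetical and are not taken from the paper; the sketch uses plain dictionaries in place of XML and a small lookup table in place of a medical ontology.

```python
from typing import Callable, Dict, List

# Hypothetical shared vocabulary standing in for a medical ontology:
# it maps each local field name to one unified term.
ONTOLOGY: Dict[str, str] = {
    "pat_name": "patient_name",
    "nom_patient": "patient_name",
    "dob": "birth_date",
    "date_naissance": "birth_date",
}


class Adapter:
    """Wraps one local system and translates its records to unified terms."""

    def __init__(self, name: str, fetch: Callable[[], List[dict]]):
        self.name = name
        self._fetch = fetch  # local data access stays unchanged

    def query(self) -> List[dict]:
        unified = []
        for record in self._fetch():
            # Rename known fields; keep unknown fields as they are.
            row = {ONTOLOGY.get(key, key): value for key, value in record.items()}
            row["source"] = self.name
            unified.append(row)
        return unified


class Mediator:
    """Single entry point that fans a request out to every adapter."""

    def __init__(self, adapters: List[Adapter]):
        self.adapters = adapters

    def query_all(self) -> List[dict]:
        results: List[dict] = []
        for adapter in self.adapters:
            results.extend(adapter.query())
        return results


# Two illustrative local systems with different field names.
hospital_a = Adapter("A", lambda: [{"pat_name": "Doe", "dob": "1980-01-01"}])
hospital_b = Adapter("B", lambda: [{"nom_patient": "Roe", "date_naissance": "1975-05-05"}])

mediator = Mediator([hospital_a, hospital_b])
records = mediator.query_all()
```

    In this sketch each local system keeps its own field names, and only the adapter knows how to translate them, which mirrors the stated goal of not forcing institutions to change their local information systems.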

  10. Transgene integration and organization in cotton (Gossypium hirsutum L.) genome.

    PubMed

    Zhang, Jun; Cai, Lin; Cheng, Jiaqin; Mao, Huizhu; Fan, Xiaoping; Meng, Zhaohong; Chan, Ka Man; Zhang, Huijun; Qi, Jianfei; Ji, Lianghui; Hong, Yan

    2008-04-01

    While genetically modified upland cotton (Gossypium hirsutum L.) varieties are ranked among the most successful genetically modified organisms (GMO), there is little knowledge on transgene integration in the cotton genome, partly because of the difficulty in obtaining large numbers of transgenic plants. In this study, we analyzed 139 independently derived T0 transgenic cotton plants transformed by Agrobacterium tumefaciens strain AGL1 carrying a binary plasmid pPZP-GFP. It was found by PCR that as many as 31% of the plants had integration of vector backbone sequences. Of the 110 plants with good genomic Southern blot results, 37% had integration of a single T-DNA, 24% had two T-DNA copies and 39% had three or more copies. Multiple copies of the T-DNA existed either as repeats in complex loci or unlinked loci. Our further analysis of two T1 populations showed that segregants with a single T-DNA and no vector sequence could be obtained from T0 plants having multiple T-DNA copies and vector sequence. Out of the 57 T-DNA/T-DNA junctions cloned from complex loci, 27 had canonical T-DNA tandem repeats; the rest (30) had deletions to T-DNAs or had inclusion of vector sequences. Overlapping micro-homology was present for most of the T-DNA/T-DNA junctions (38/57). Right border (RB) ends of the T-DNA were precise while most left border (LB) ends (64%) had truncations to internal border sequences. Sequencing of collinear vector integration outside LB in 33 plants gave evidence that collinear vector sequence was determined in Agrobacterium culture. Among the 130 plants with characterized flanking sequences, 12% had the transgene integrated into coding sequences, 12% into repetitive sequences, 7% into rDNAs. Interestingly, 7% had the transgene integrated into chloroplast derived sequences. Nucleotide sequence comparison of target sites in cotton genome before and after T-DNA integration revealed overlapping microhomology between target sites and the T-DNA (8/8), deletions to

  11. CRISPR-mediated genome editing of Plasmodium falciparum malaria parasites.

    PubMed

    Lee, Marcus Cs; Fidock, David A

    2014-01-01

    The development of the CRISPR-Cas system is revolutionizing genome editing in a variety of organisms. The system has now been used to manipulate the genome of Plasmodium falciparum, the most lethal malaria-causing species. The ability to generate gene deletions or nucleotide substitutions rapidly and economically promises to accelerate the analysis of novel drug targets and to help elucidate the function of specific genes or gene families, while complementing genome-wide association studies.

  12. PhytoPath: an integrative resource for plant pathogen genomics

    PubMed Central

    Pedro, Helder; Maheswari, Uma; Urban, Martin; Irvine, Alistair George; Cuzick, Alayne; McDowall, Mark D.; Staines, Daniel M.; Kulesha, Eugene; Hammond-Kosack, Kim Elizabeth; Kersey, Paul Julian

    2016-01-01

    PhytoPath (www.phytopathdb.org) is a resource for genomic and phenotypic data from plant pathogen species, that integrates phenotypic data for genes from PHI-base, an expertly curated catalog of genes with experimentally verified pathogenicity, with the Ensembl tools for data visualization and analysis. The resource is focused on fungi, protists (oomycetes) and bacterial plant pathogens that have genomes that have been sequenced and annotated. Genes with associated PHI-base data can be easily identified across all plant pathogen species using a BioMart-based query tool and visualized in their genomic context on the Ensembl genome browser. The PhytoPath resource contains data for 135 genomic sequences from 87 plant pathogen species, and 1364 genes curated for their role in pathogenicity and as targets for chemical intervention. Support for community annotation of gene models is provided using the WebApollo online gene editor, and we are working with interested communities to improve reference annotation for selected species. PMID:26476449

  13. PhytoPath: an integrative resource for plant pathogen genomics.

    PubMed

    Pedro, Helder; Maheswari, Uma; Urban, Martin; Irvine, Alistair George; Cuzick, Alayne; McDowall, Mark D; Staines, Daniel M; Kulesha, Eugene; Hammond-Kosack, Kim Elizabeth; Kersey, Paul Julian

    2016-01-01

    PhytoPath (www.phytopathdb.org) is a resource for genomic and phenotypic data from plant pathogen species, that integrates phenotypic data for genes from PHI-base, an expertly curated catalog of genes with experimentally verified pathogenicity, with the Ensembl tools for data visualization and analysis. The resource is focused on fungi, protists (oomycetes) and bacterial plant pathogens that have genomes that have been sequenced and annotated. Genes with associated PHI-base data can be easily identified across all plant pathogen species using a BioMart-based query tool and visualized in their genomic context on the Ensembl genome browser. The PhytoPath resource contains data for 135 genomic sequences from 87 plant pathogen species, and 1364 genes curated for their role in pathogenicity and as targets for chemical intervention. Support for community annotation of gene models is provided using the WebApollo online gene editor, and we are working with interested communities to improve reference annotation for selected species.

  14. Integrative functional genomic analysis unveils the differing dysregulated metabolic processes across hepatocellular carcinoma stages.

    PubMed

    Ramesh, Vignesh; Ganesan, Kumaresan

    2016-08-15

    Hepatocellular carcinoma (HCC) is a highly heterogeneous disease and the development of targeted therapeutics is still at an early stage. The 'omics'-based genome-wide profiling approaches comprising the transcriptome, miRNome and proteome are highly useful in identifying the deregulated molecular processes involved in hepatocarcinogenesis. Metabolites and metabolic processes, being among the end products and processes of the central dogma, mediate cellular functions. In recent years, metabolomics-based investigations have revealed the major deregulated metabolic processes involved in carcinogenesis. However, the integrative analysis of the holistic metabolic processes with genomics is at an early stage. Since gene-sets are highly useful in assessing the biological processes and pathways, we attempted to infer the deregulated cellular metabolic processes involved in HCC by employing metabolism-associated gene-set enrichment analysis. Further, the metabolic process enrichment scores were integrated with the transcriptome profiles of HCC. Integrative analysis shows three distinct metabolic deregulations: i) hepatocyte function-related molecular processes involving lipid/fatty acid/bile acid synthesis, ii) inflammatory processes with cytokine, sphingolipid & chondroitin sulphate metabolism and iii) enriched nucleotide metabolic process involving purine/pyrimidine & glucose-mediated catabolic process, in hepatocarcinogenesis. The three distinct metabolic processes were found to occur both in tumor and liver cancer cell line profiles. Unsupervised hierarchical clustering of the metabolic processes along with clinical sample information has identified two major clusters based on AFP (alpha-fetoprotein) and metastasis. The study reveals the three major regulatory processes involved in HCC stages. PMID:27107678

  16. Evaluating the genomic and sequence integrity of human ES cell lines; comparison to normal genomes.

    PubMed

    Funk, Walter D; Labat, Ivan; Sampathkumar, Janani; Gourraud, Pierre-Antoine; Oksenberg, Jorge R; Rosler, Elen; Steiger, Daniel; Sheibani, Nadia; Caillier, Stacy; Stache-Crain, Birgit; Johnson, Julie A; Meisner, Lorraine; Lacher, Markus D; Chapman, Karen B; Park, Myung Jin; Shin, Kyoung-Jin; Drmanac, Rade; West, Michael D

    2012-03-01

    Copy number variation (CNV) is a common chromosomal alteration that can occur during in vitro cultivation of human cells and can be accompanied by the accumulation of mutations in coding region sequences. We describe here a systematic application of current molecular technologies to provide a detailed understanding of genomic and sequence profiles of human embryonic stem cell (hESC) lines that were derived under GMP-compliant conditions. We first examined the overall chromosomal integrity using cytogenetic techniques to determine chromosome count, and to detect the presence of cytogenetically aberrant cells in the culture (mosaicism). Assays of copy number variation, using both microarray and sequence-based analyses, provide a detailed view of genomic variation in these lines and show that in early passage cultures of these lines, the size range and distribution of CNVs are entirely consistent with those seen in the genomes of normal individuals. Similarly, genome sequencing shows variation within these lines that is completely within the range seen in normal genomes. Important gene classes, such as tumor suppressors and genetic disease genes, do not display overtly disruptive mutations that could affect the overall safety of cell-based therapeutics. Complete sequence also allows the analysis of important transplantation antigens, such as ABO and HLA types. The combined application of cytogenetic and molecular technologies provides a detailed understanding of genomic and sequence profiles of GMP-produced ES lines for potential use as therapeutic agents. PMID:22265736

  17. INTEGRATE: gene fusion discovery using whole genome and transcriptome data

    PubMed Central

    Zhang, Jin; White, Nicole M.; Schmidt, Heather K.; Fulton, Robert S.; Tomlinson, Chad; Warren, Wesley C.; Wilson, Richard K.; Maher, Christopher A.

    2016-01-01

    While next-generation sequencing (NGS) has become the primary technology for discovering gene fusions, we are still faced with the challenge of ensuring that causative mutations are not missed while minimizing false positives. Currently, there are many computational tools that predict structural variations (SV) and gene fusions using whole genome (WGS) and transcriptome sequencing (RNA-seq) data separately. However, as both WGS and RNA-seq have their limitations when used independently, we hypothesize that the orthogonal validation from integrating both data types could generate a sensitive and specific approach for detecting high-confidence gene fusion predictions. Fortunately, decreasing NGS costs have resulted in a growing number of patients with both data types available. Therefore, we developed a gene fusion discovery tool, INTEGRATE, that leverages both RNA-seq and WGS data to reconstruct gene fusion junctions and genomic breakpoints by split-read mapping. To evaluate INTEGRATE, we compared it with eight additional gene fusion discovery tools using the well-characterized breast cell line HCC1395 and peripheral blood lymphocytes derived from the same patient (HCC1395BL). The predictions subsequently underwent a targeted validation leading to the discovery of 131 novel fusions in addition to the seven previously reported fusions. Overall, INTEGRATE only missed six out of the 138 validated fusions and had the highest accuracy of the nine tools evaluated. Additionally, we applied INTEGRATE to 62 breast cancer patients from The Cancer Genome Atlas (TCGA) and found multiple recurrent gene fusions including a subset involving estrogen receptor. Taken together, INTEGRATE is a highly sensitive and accurate tool that is freely available for academic use. PMID:26556708
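
    The validation counts reported in this abstract can be turned into a detection rate with a few lines of Python. The abstract gives only the counts (131 novel plus seven previously reported fusions, six missed); treating the fraction of validated fusions recovered as the sensitivity is an assumption made here for illustration, and the variable names are hypothetical.

```python
# Counts taken from the abstract.
novel_fusions = 131
known_fusions = 7
missed_by_integrate = 6

validated_total = novel_fusions + known_fusions      # 138 validated fusions
detected = validated_total - missed_by_integrate     # 132 recovered

# Assumed definition: sensitivity = detected / validated.
sensitivity = detected / validated_total
print(f"{detected}/{validated_total} = {sensitivity:.1%}")
```

    Under that assumed definition, 132 of the 138 validated fusions were recovered, or roughly 95.7%.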

  18. Construction of an integrated database to support genomic sequence analysis

    SciTech Connect

    Gilbert, W.; Overbeek, R.

    1994-11-01

    The central goal of this project is to develop an integrated database to support comparative analysis of genomes including DNA sequence data, protein sequence data, gene expression data and metabolism data. In developing the logic-based system GenoBase, a broader integration of available data was achieved due to assistance from collaborators. Current goals are to easily include new forms of data as they become available and to easily navigate through the ensemble of objects described within the database. This report comments on progress made in these areas.

  19. Integrative gene transfer in the truffle Tuber borchii by Agrobacterium tumefaciens-mediated transformation

    PubMed Central

    2014-01-01

    Agrobacterium tumefaciens-mediated transformation is a powerful tool for reverse genetics and functional genomic analysis in a wide variety of plants and fungi. Tuber spp. are ecologically important and gastronomically prized fungi (“truffles”) with a cryptic life cycle, a subterranean habitat and a symbiotic, but also facultative saprophytic lifestyle. The genome of a representative member of this group of fungi has recently been sequenced. However, because of their poor genetic tractability, including transformation, truffles have so far eluded in-depth functional genomic investigations. Here we report that A. tumefaciens can infect Tuber borchii mycelia, thereby conveying its transfer DNA with the production of stably integrated transformants. We constructed two new binary plasmids (pABr1 and pABr3) and tested them as improved transformation vectors using the green fluorescent protein as reporter gene and hygromycin phosphotransferase as selection marker. Transformants were stable for at least 12 months of in vitro culture propagation and, as revealed by TAIL-PCR analysis, integration sites appear to be heterogeneous, with a preference for repeat element-containing genome sites. PMID:24949275

  20. Integrative gene transfer in the truffle Tuber borchii by Agrobacterium tumefaciens-mediated transformation.

    PubMed

    Brenna, Andrea; Montanini, Barbara; Muggiano, Eleonora; Proietto, Marco; Filetici, Patrizia; Ottonello, Simone; Ballario, Paola

    2014-01-01

    Agrobacterium tumefaciens-mediated transformation is a powerful tool for reverse genetics and functional genomic analysis in a wide variety of plants and fungi. Tuber spp. are ecologically important and gastronomically prized fungi ("truffles") with a cryptic life cycle, a subterranean habitat and a symbiotic, but also facultative saprophytic lifestyle. The genome of a representative member of this group of fungi has recently been sequenced. However, because of their poor genetic tractability, including transformation, truffles have so far eluded in-depth functional genomic investigations. Here we report that A. tumefaciens can infect Tuber borchii mycelia, thereby conveying its transfer DNA with the production of stably integrated transformants. We constructed two new binary plasmids (pABr1 and pABr3) and tested them as improved transformation vectors using the green fluorescent protein as reporter gene and hygromycin phosphotransferase as selection marker. Transformants were stable for at least 12 months of in vitro culture propagation and, as revealed by TAIL-PCR analysis, integration sites appear to be heterogeneous, with a preference for repeat element-containing genome sites.

  1. CRISPR/Cas9-mediated genome modification in the mollusc, Crepidula fornicata.

    PubMed

    Perry, Kimberly J; Henry, Jonathan Q

    2015-02-01

    The discovery and application of the CRISPR/Cas9 genome editing method has greatly enhanced the ease with which transgenic manipulation can occur. We applied this technology to the mollusc, Crepidula fornicata, and have successfully created transgenic embryos expressing mCherry fused to endogenous β-catenin. Specific integration of the fluorescent reporter was achieved by homologous recombination with a β-catenin-specific donor DNA containing the mCherry coding sequence. This fluorescent gene knock-in strategy permits in vivo observations of β-catenin expression during embryonic development and represents the first demonstration of CRISPR/Cas9-mediated transgenesis in the Lophotrochozoa superphylum. The CRISPR/Cas9 method is a powerful and economical tool for genome modification and presents an option for analysis of gene expression in not only major model systems, but also in those more diverse species that may not have been amenable to the classic methods of transgenesis. This approach will allow one to generate transgenic lines of snails for future studies.

  2. Integrated Genome-Based Studies of Shewanella Ecophysiology

    SciTech Connect

    Andrei L. Osterman, Ph.D.

    2012-12-17

    Integration of bioinformatics and experimental techniques was applied to mapping and characterization of the key components (pathways, enzymes, transporters, regulators) of the core metabolic machinery in Shewanella oneidensis and related species, with the main focus on metabolic and regulatory pathways involved in utilization of various carbon and energy sources. Among the main accomplishments reflected in ten joint publications with other participants of the Shewanella Federation are: (i) A systems-level reconstruction of carbohydrate utilization pathways in the genus of Shewanella (19 species). This analysis yielded reconstruction of 18 sugar utilization pathways including 10 novel pathway variants and prediction of > 60 novel protein families of enzymes, transporters and regulators involved in these pathways. Selected functional predictions were verified by focused biochemical and genetic experiments. Observed growth phenotypes were consistent with bioinformatic predictions, providing strong validation of the technology; and (ii) Global genomic reconstruction of transcriptional regulons in 16 Shewanella genomes. The inferred regulatory network includes 82 transcription factors, 8 riboswitches and 6 translational attenuators. Of those, 45 regulons were inferred directly from the genome context analysis, whereas others were propagated from previously characterized regulons in other species. Selected regulatory predictions were experimentally tested. Integration of this analysis with microarray data revealed overall consistency and provided an additional layer of interactions between regulons. All the results were captured in the new database RegPrecise, which is a joint development with the LBNL team. A more detailed analysis of the individual subsystems, pathways and regulons in Shewanella spp. included bioinformatics-based prediction and experimental characterization of: (i) N-Acetylglucosamine catabolic pathway; (ii) Lactate utilization machinery; (iii) Novel Nrt

  3. Applications of CRISPR-Cas9 mediated genome engineering.

    PubMed

    Yang, Xiao

    2015-01-01

    Targeted mutagenesis based on homologous recombination has been a powerful tool for understanding the mechanisms underlying development, normal physiology, and disease. A recent breakthrough in genome engineering technology based on the class of RNA-guided endonucleases, such as clustered regularly interspaced short palindromic repeats (CRISPR)-associated Cas9, is further revolutionizing biology and medical studies. The simplicity of the CRISPR-Cas9 system has enabled its widespread applications in generating germline animal models, somatic genome engineering, and functional genomic screening and in treating genetic and infectious diseases. This technology will likely be used in all fields of biomedicine, ranging from basic research to human gene therapy.

  4. The Npl3 hnRNP prevents R-loop-mediated transcription–replication conflicts and genome instability

    PubMed Central

    Santos-Pereira, José M.; Herrero, Ana B.; García-Rubio, María L.; Marín, Antonio; Moreno, Sergio; Aguilera, Andrés

    2013-01-01

    Transcription is a major obstacle for replication fork (RF) progression and a cause of genome instability. Part of this instability is mediated by cotranscriptional R loops, which are believed to increase by suboptimal assembly of the nascent messenger ribonucleoprotein particle (mRNP). However, no clear evidence exists that heterogeneous nuclear RNPs (hnRNPs), the basic mRNP components, prevent R-loop stabilization. Here we show that yeast Npl3, the most abundant RNA-binding hnRNP, prevents R-loop-mediated genome instability. npl3Δ cells show transcription-dependent and R-loop-dependent hyperrecombination and genome-wide replication obstacles as determined by accumulation of the Rrm3 helicase. Such obstacles preferentially occur at long and highly expressed genes, to which Npl3 is preferentially bound in wild-type cells, and are reduced by RNase H1 overexpression. The resulting replication stress confers hypersensitivity to double-strand break-inducing agents. Therefore, our work demonstrates that mRNP factors are critical for genome integrity and opens the option of using them as therapeutic targets in anti-cancer treatment. PMID:24240235

  5. Precision genome editing in plants via gene targeting and piggyBac-mediated marker excision

    PubMed Central

    Nishizawa-Yokoi, Ayako; Endo, Masaki; Ohtsuki, Namie; Saika, Hiroaki; Toki, Seiichi

    2015-01-01

    Precise genome engineering via homologous recombination (HR)-mediated gene targeting (GT) has become an essential tool in molecular breeding as well as in basic plant science. As HR-mediated GT is an extremely rare event, positive–negative selection has been used extensively in flowering plants to isolate cells in which GT has occurred. In order to utilize GT as a methodology for precision mutagenesis, the positive selectable marker gene should be completely eliminated from the GT locus. Here, we introduce targeted point mutations conferring resistance to herbicide into the rice acetolactate synthase (ALS) gene via GT with subsequent marker excision by piggyBac transposition. Almost all regenerated plants expressing piggyBac transposase contained exclusively targeted point mutations without concomitant re-integration of the transposon, resulting in these progeny showing a herbicide bispyribac sodium (BS)-tolerant phenotype. This approach was also applied successfully to the editing of a microRNA targeting site in the rice cleistogamy 1 gene. Therefore, our approach provides a general strategy for the targeted modification of endogenous genes in plants. PMID:25284193

  6. Tetrahymena functional genomics database (TetraFGD): an integrated resource for Tetrahymena functional genomics.

    PubMed

    Xiong, Jie; Lu, Yuming; Feng, Jinmei; Yuan, Dongxia; Tian, Miao; Chang, Yue; Fu, Chengjie; Wang, Guangying; Zeng, Honghui; Miao, Wei

    2013-01-01

    The ciliated protozoan Tetrahymena thermophila is a useful unicellular model organism for studies of eukaryotic cellular and molecular biology. Research on T. thermophila has contributed to a series of remarkable basic biological principles. Since the macronuclear genome was sequenced, substantial progress has been made in functional genomics research on T. thermophila, including genome-wide microarray analysis of the T. thermophila life cycle, a T. thermophila gene network analysis based on the microarray data and transcriptome analysis by deep RNA sequencing. To meet the growing demands of the Tetrahymena research community, we integrated these data to provide a public access database: Tetrahymena functional genomics database (TetraFGD). TetraFGD contains three major resources, including the RNA-Seq transcriptome, microarray and gene networks. The RNA-Seq data define gene structures and transcriptome, with special emphasis on exon-intron boundaries; the microarray data describe gene expression of 20 time points during three major stages of the T. thermophila life cycle; the gene network data identify potential gene-gene interactions of 15 049 genes. The TetraFGD provides user-friendly search functions that assist researchers in accessing gene models, transcripts, gene expression data and gene-gene relationships. In conclusion, the TetraFGD is an important functional genomic resource for researchers who focus on Tetrahymena or other ciliates. Database URL: http://tfgd.ihb.ac.cn/

  7. Preventing Replication Fork Collapse to Maintain Genome Integrity

    PubMed Central

    Cortez, David

    2015-01-01

    Billions of base pairs of DNA must be replicated trillions of times in a human lifetime. Complete and accurate replication once and only once per cell division cycle is essential to maintain genome integrity and prevent disease. Impediments to replication fork progression including difficult to replicate DNA sequences, conflicts with transcription, and DNA damage further add to the genome maintenance challenge. These obstacles frequently cause fork stalling, but only rarely cause a failure to complete replication. Robust mechanisms ensure that stalled forks remain stable and capable of either resuming DNA synthesis or being rescued by converging forks. However, when failures do happen the fork collapses leading to genome rearrangements, cell death and disease. Despite intense interest, the mechanisms to repair damaged replication forks, stabilize them, and ensure successful replication remain only partly understood. Different models of fork collapse have been proposed with varying descriptions of what happens to the DNA and replisome. Here, I will define fork collapse and describe what is known about how the replication checkpoint prevents it to maintain genome stability. PMID:25957489

  8. An integrated semiconductor device enabling non-optical genome sequencing.

    PubMed

    Rothberg, Jonathan M; Hinz, Wolfgang; Rearick, Todd M; Schultz, Jonathan; Mileski, William; Davey, Mel; Leamon, John H; Johnson, Kim; Milgrew, Mark J; Edwards, Matthew; Hoon, Jeremy; Simons, Jan F; Marran, David; Myers, Jason W; Davidson, John F; Branting, Annika; Nobile, John R; Puc, Bernard P; Light, David; Clark, Travis A; Huber, Martin; Branciforte, Jeffrey T; Stoner, Isaac B; Cawley, Simon E; Lyons, Michael; Fu, Yutao; Homer, Nils; Sedova, Marina; Miao, Xin; Reed, Brian; Sabina, Jeffrey; Feierstein, Erika; Schorn, Michelle; Alanjary, Mohammad; Dimalanta, Eileen; Dressman, Devin; Kasinskas, Rachel; Sokolsky, Tanya; Fidanza, Jacqueline A; Namsaraev, Eugeni; McKernan, Kevin J; Williams, Alan; Roth, G Thomas; Bustillo, James

    2011-07-21

    The seminal importance of DNA sequencing to the life sciences, biotechnology and medicine has driven the search for more scalable and lower-cost solutions. Here we describe a DNA sequencing technology in which scalable, low-cost semiconductor manufacturing techniques are used to make an integrated circuit able to directly perform non-optical DNA sequencing of genomes. Sequence data are obtained by directly sensing the ions produced by template-directed DNA polymerase synthesis using all-natural nucleotides on this massively parallel semiconductor-sensing device or ion chip. The ion chip contains ion-sensitive, field-effect transistor-based sensors in perfect register with 1.2 million wells, which provide confinement and allow parallel, simultaneous detection of independent sequencing reactions. Use of the most widely used technology for constructing integrated circuits, the complementary metal-oxide semiconductor (CMOS) process, allows for low-cost, large-scale production and scaling of the device to higher densities and larger array sizes. We show the performance of the system by sequencing three bacterial genomes, its robustness and scalability by producing ion chips with up to 10 times as many sensors and sequencing a human genome. PMID:21776081

  9. ONTOFUSION: ontology-based integration of genomic and clinical databases.

    PubMed

    Pérez-Rey, D; Maojo, V; García-Remesal, M; Alonso-Calvo, R; Billhardt, H; Martin-Sánchez, F; Sousa, A

    2006-01-01

    ONTOFUSION is an ontology-based system designed for biomedical database integration. It is based on two processes: mapping and unification. Mapping is a semi-automated process that uses ontologies to link a database schema with a conceptual framework, called a virtual schema. There are three methodologies for creating virtual schemas, according to the origin of the domain ontology used: (1) top-down, using an existing ontology such as the UMLS or Gene Ontology; (2) bottom-up, building a new domain ontology; and (3) a hybrid combination. Unification is an automated process for integrating ontologies and hence the databases to which they are linked. Using these methods, we employed ONTOFUSION to integrate a large number of public genomic and clinical databases, as well as biomedical ontologies.
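
    The two processes named in this abstract, mapping and unification, can be illustrated with a minimal sketch. This is not ONTOFUSION's implementation: the database names, column names, ontology concepts, and both functions below are hypothetical and chosen only to show the idea of linking columns to shared concepts and then merging the resulting virtual schemas.

```python
from collections import defaultdict

# Hypothetical mappings: each physical column is linked to a concept in a
# shared domain ontology. The concept-level view of one database plays the
# role of its "virtual schema". All names here are invented for illustration.
GENOMIC_DB_MAPPING = {
    "genes.symbol": "Gene",
    "genes.omim_id": "Disease",
}
CLINICAL_DB_MAPPING = {
    "patients.diagnosis": "Disease",
    "patients.age": "Age",
}


def build_virtual_schema(db_name, mapping):
    """Mapping step: group one database's columns under ontology concepts."""
    schema = defaultdict(list)
    for column, concept in mapping.items():
        schema[concept].append((db_name, column))
    return dict(schema)


def unify(*virtual_schemas):
    """Unification step: merge virtual schemas on the concepts they share."""
    unified = defaultdict(list)
    for schema in virtual_schemas:
        for concept, sources in schema.items():
            unified[concept].extend(sources)
    return dict(unified)


genomic = build_virtual_schema("genomic_db", GENOMIC_DB_MAPPING)
clinical = build_virtual_schema("clinical_db", CLINICAL_DB_MAPPING)
unified = unify(genomic, clinical)

# A query phrased against the concept "Disease" can now be routed to every
# database column that was mapped to it.
for source in unified["Disease"]:
    print(source)
```

    In this toy setting the concept "Disease" ends up linked to one column in each database, which is the sense in which unifying ontologies also integrates the databases linked to them.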

  10. STINGRAY: system for integrated genomic resources and analysis

    PubMed Central

    2014-01-01

    Background The STINGRAY system has been conceived to ease the tasks of integrating, analyzing, annotating and presenting genomic and expression data from Sanger and Next Generation Sequencing (NGS) platforms. Findings STINGRAY includes: (a) a complete and integrated workflow (more than 20 bioinformatics tools) ranging from functional annotation to phylogeny; (b) a MySQL database schema, suitable for data integration and user access control; and (c) a user-friendly graphical web-based interface that makes the system intuitive, facilitating the tasks of data analysis and annotation. Conclusion STINGRAY proved to be an easy-to-use and complete system for analyzing sequencing data. While both Sanger and NGS platforms are supported, the system may run faster with Sanger data, since large NGS datasets could slow down MySQL database usage. STINGRAY is available at http://stingray.biowebdb.org and the open source code at http://sourceforge.net/projects/stingray-biowebdb/. PMID:24606808

  11. Efficient homologous recombination-mediated genome engineering in zebrafish using TALE nucleases.

    PubMed

    Shin, Jimann; Chen, Jiakun; Solnica-Krezel, Lilianna

    2014-10-01

    Custom-designed nucleases afford a powerful reverse genetic tool for direct gene disruption and genome modification in vivo. Among various applications of the nucleases, homologous recombination (HR)-mediated genome editing is particularly useful for inserting heterologous DNA fragments, such as GFP, into a specific genomic locus in a sequence-specific fashion. However, precise HR-mediated genome editing is still technically challenging in zebrafish. Here, we establish a GFP reporter system for measuring the frequency of HR events in live zebrafish embryos. By co-injecting a TALE nuclease and GFP reporter targeting constructs with homology arms of different size, we defined the length of homology arms that increases the recombination efficiency. In addition, we found that the configuration of the targeting construct can be a crucial parameter in determining the efficiency of HR-mediated genome engineering. Implementing these modifications improved the efficiency of zebrafish knock-in generation, with over 10% of the injected F0 animals transmitting gene-targeting events through their germline. We generated two HR-mediated insertion alleles of sox2 and gfap loci that express either superfolder GFP (sfGFP) or tandem dimeric Tomato (tdTomato) in a spatiotemporal pattern that mirrors the endogenous loci. This efficient strategy provides new opportunities not only to monitor expression of endogenous genes and proteins and follow specific cell types in vivo, but it also paves the way for other sophisticated genetic manipulations of the zebrafish genome.

  12. A model for integration of DNA into the genome during transformation of Fusarium graminearum.

    PubMed

    Watson, R J; Burchat, S; Bosley, J

    2008-10-01

    Transformants of Fusarium graminearum were derived using linearized DNA of plasmids designed to replace the trichodiene synthase gene, a cutinase gene or a xylanase gene with a hygromycin-resistance marker cassette by homologous recombination between 1-kbp segments of flanking DNA. Most transformants did not exhibit the DNA structure expected of integration by classical double recombination. Instead, they contained linearized plasmid joined end-to-end and variably incorporated into the genome. Transformant types included ectopic integrations and integrations at the target site with or without removal of the targeted gene. We have analyzed a large number of transformants using cloning, PCR and DNA sequencing to determine the structures of their integrated DNA, and describe a model to explain their derivations. The data indicate that 1-3 copies of input DNA are first joined end-to-end to produce either linear or circular structures, probably mediated by the non-homologous end-joining (NHEJ) system. The end-joins typically have 1-5 nucleotides in common and are near or within the original cleavage site of the plasmid. Ectopic integrations occur by attaching linear DNA to two ends of genomic DNA via the same joining mechanism. Integration at the target site is consistent with replication around circularized input DNA, beginning and ending within the flanking homologous DNA, resulting in the integration of multiple copies of the entire structure. This results in deletion or duplication of the target site, or leaves one copy at either end of the integrated multimer. Reiterated DNA in the more complex structures is unstable due to homologous recombination, such that conversion to simpler forms is detected. PMID:18722542

  13. Integration Preferences of Wildtype AAV-2 for Consensus Rep-Binding Sites at Numerous Loci in the Human Genome

    PubMed Central

    Hüser, Daniela; Gogol-Döring, Andreas; Lutter, Timo; Weger, Stefan; Winter, Kerstin; Hammer, Eva-Maria; Cathomen, Toni; Reinert, Knut; Heilbronn, Regine

    2010-01-01

    Adeno-associated virus type 2 (AAV) is known to establish latency by preferential integration in human chromosome 19q13.42. The AAV non-structural protein Rep appears to target a site called AAVS1 by simultaneously binding to Rep-binding sites (RBS) present on the AAV genome and within AAVS1. In the absence of Rep, as is the case with AAV vectors, chromosomal integration is rare and random. For a genome-wide survey of wildtype AAV integration, a linker-selection-mediated (LSM)-PCR strategy was designed to retrieve AAV-chromosomal junctions. DNA sequence determination revealed wildtype AAV integration sites scattered over the entire human genome. The bioinformatic analysis of these integration sites compared to those of rep-deficient AAV vectors revealed a highly significant overrepresentation of integration events near consensus RBS. Integration hotspots included AAVS1 with 10% of total events. Novel hotspots near consensus RBS were identified on chromosome 5p13.3, denoted AAVS2, and on chromosome 3p24.3, denoted AAVS3. AAVS2 displayed seven independent junctions clustered within only 14 bp of a consensus RBS which proved to bind Rep in vitro similar to the RBS in AAVS3. Expression of Rep in the presence of rep-deficient AAV vectors shifted targeting preferences from random integration back to the neighbourhood of consensus RBS at hotspots and numerous additional sites in the human genome. In summary, targeted AAV integration is not as specific for AAVS1 as previously assumed. Rather, Rep targets AAV to integrate into open chromatin regions in the reach of various, consensus RBS homologues in the human genome. PMID:20628575
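
    The comparison described in this abstract, integration events near consensus RBS for wildtype AAV versus rep-deficient vectors, can be sketched as a nearest-site distance calculation. This is only an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the coordinates are invented toy values on a single chromosome, and the 14 bp window is borrowed from the AAVS2 observation purely as an example threshold.

```python
import bisect


def nearest_distance(position, sorted_sites):
    """Distance in bp from a position to the closest site in a sorted list."""
    if not sorted_sites:
        raise ValueError("need at least one reference site")
    i = bisect.bisect_left(sorted_sites, position)
    candidates = []
    if i < len(sorted_sites):
        candidates.append(sorted_sites[i] - position)      # next site downstream
    if i > 0:
        candidates.append(position - sorted_sites[i - 1])  # previous site upstream
    return min(candidates)


def fraction_near(integration_sites, motif_sites, window):
    """Fraction of integration sites lying within `window` bp of any motif site."""
    motif_sites = sorted(motif_sites)
    near = sum(
        1 for p in integration_sites
        if nearest_distance(p, motif_sites) <= window
    )
    return near / len(integration_sites)


# Toy data (hypothetical coordinates, one chromosome).
rbs_sites = [1000, 5000]
wildtype_junctions = [990, 1010, 5014, 7000]
vector_junctions = [200, 3000, 5010, 9000]

wt_fraction = fraction_near(wildtype_junctions, rbs_sites, window=14)
vec_fraction = fraction_near(vector_junctions, rbs_sites, window=14)
print(wt_fraction, vec_fraction)
```

    A real analysis would work per chromosome on genome-wide motif predictions and would test the difference between the two fractions for statistical significance, which this sketch does not attempt.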

  14. Integrating mass spectrometry and genomics for cyanobacterial metabolite discovery.

    PubMed

    Moss, Nathan A; Bertin, Matthew J; Kleigrewe, Karin; Leão, Tiago F; Gerwick, Lena; Gerwick, William H

    2016-03-01

    Filamentous marine cyanobacteria produce bioactive natural products with both potential therapeutic value and capacity to be harmful to human health. Genome sequencing has revealed that cyanobacteria have the capacity to produce many more secondary metabolites than have been characterized. The biosynthetic pathways that encode cyanobacterial natural products are mostly uncharacterized, and lack of cyanobacterial genetic tools has largely prevented their heterologous expression. Hence, a combination of cutting edge and traditional techniques has been required to elucidate their secondary metabolite biosynthetic pathways. Here, we review the discovery and refined biochemical understanding of the olefin synthase and fatty acid ACP reductase/aldehyde deformylating oxygenase pathways to hydrocarbons, and the curacin A, jamaicamide A, lyngbyabellin, columbamide, and a trans-acyltransferase macrolactone pathway encoding phormidolide. We integrate into this discussion the use of genomics, mass spectrometric networking, biochemical characterization, and isolation and structure elucidation techniques.

  15. Integrating mass spectrometry and genomics for cyanobacterial metabolite discovery

    PubMed Central

    Bertin, Matthew J.; Kleigrewe, Karin; Leão, Tiago F.; Gerwick, Lena

    2016-01-01

    Filamentous marine cyanobacteria produce bioactive natural products with both potential therapeutic value and capacity to be harmful to human health. Genome sequencing has revealed that cyanobacteria have the capacity to produce many more secondary metabolites than have been characterized. The biosynthetic pathways that encode cyanobacterial natural products are mostly uncharacterized, and lack of cyanobacterial genetic tools has largely prevented their heterologous expression. Hence, a combination of cutting edge and traditional techniques has been required to elucidate their secondary metabolite biosynthetic pathways. Here, we review the discovery and refined biochemical understanding of the olefin synthase and fatty acid ACP reductase/aldehyde deformylating oxygenase pathways to hydrocarbons, and the curacin A, jamaicamide A, lyngbyabellin, columbamide, and a trans-acyltransferase macrolactone pathway encoding phormidolide. We integrate into this discussion the use of genomics, mass spectrometric networking, biochemical characterization, and isolation and structure elucidation techniques. PMID:26578313

  16. Integrating mass spectrometry and genomics for cyanobacterial metabolite discovery.

    PubMed

    Moss, Nathan A; Bertin, Matthew J; Kleigrewe, Karin; Leão, Tiago F; Gerwick, Lena; Gerwick, William H

    2016-03-01

    Filamentous marine cyanobacteria produce bioactive natural products with both potential therapeutic value and capacity to be harmful to human health. Genome sequencing has revealed that cyanobacteria have the capacity to produce many more secondary metabolites than have been characterized. The biosynthetic pathways that encode cyanobacterial natural products are mostly uncharacterized, and lack of cyanobacterial genetic tools has largely prevented their heterologous expression. Hence, a combination of cutting edge and traditional techniques has been required to elucidate their secondary metabolite biosynthetic pathways. Here, we review the discovery and refined biochemical understanding of the olefin synthase and fatty acid ACP reductase/aldehyde deformylating oxygenase pathways to hydrocarbons, and the curacin A, jamaicamide A, lyngbyabellin, columbamide, and a trans-acyltransferase macrolactone pathway encoding phormidolide. We integrate into this discussion the use of genomics, mass spectrometric networking, biochemical characterization, and isolation and structure elucidation techniques. PMID:26578313

  17. Theobroma cacao: A genetically integrated physical map and genome-scale comparative synteny analysis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A comprehensive integrated genomic framework is considered a centerpiece of genomic research. In collaboration with the USDA-ARS (SHRS) and Mars Inc., the Clemson University Genomics Institute (CUGI) has developed a genetically anchored physical map of the T. cacao genome. Three BAC libraries contai...

  18. Excision and episomal replication of cauliflower mosaic virus integrated into a plant genome.

    PubMed

    Squires, Julie; Gillespie, Trudi; Schoelz, James E; Palukaitis, Peter

    2011-04-01

    Transgenic Arabidopsis (Arabidopsis thaliana) plants containing a monomeric copy of the cauliflower mosaic virus (CaMV) genome exhibited the generation of infectious, episomally replicating virus. The circular viral genome had been split within the nonessential gene II for integration into the Arabidopsis genome by Agrobacterium tumefaciens-mediated transformation. Transgenic plants were assessed for episomal infections at flowering, seed set, and/or senescence. The infections were confirmed by western blot for the CaMV P6 and P4 proteins, electron microscopy for the presence of icosahedral virions, and through polymerase chain reaction across the recombination junction. By the end of the test period, a majority of the transgenic Arabidopsis plants had developed episomal infections. The episomal form of the virus was infectious to nontransgenic plants, indicating that no essential functions were lost after release from the Arabidopsis chromosome. An analysis of the viral genomes recovered from either transgenic Arabidopsis or nontransgenic turnip (Brassica rapa var rapa) revealed that the viruses contained deletions within gene II, and in some cases, the deletions extended to the beginning of gene III. In addition, many of the progeny viruses contained small regions of nonviral sequence derived from the flanking transformation vector. The nature of the nucleotide sequences at the recombination junctions in the circular progeny virus indicated that most were generated by nonhomologous recombination during the excision event. The release of the CaMV viral genomes from an integrated copy was not dependent upon the application of environmental stresses but occurred with greater frequency with either age or the late stages of plant maturation.

  19. Survey of Nursing Integration of Genomics Into Nursing Practice

    PubMed Central

    Calzone, Kathleen A.; Jenkins, Jean; Yates, Jan; Cusack, Georgie; Wallen, Gwenyth R.; Liewehr, David J.; Steinberg, Seth M.; McBride, Colleen

    2012-01-01

    Purpose Translating clinically valid genomic discoveries into practice is hinged not only on technologic advances, but also on nurses—the largest global contingent of health providers—acquiring requisite competencies to apply these discoveries in clinical care. The study aim was to assess practicing nurse attitudes, practices, receptivity, confidence, and competency of integrating genomics into nursing practice. Design A convenience sample of practicing nurses was recruited to complete an online survey that assessed domains from Roger’s Diffusion of Innovations Theory and used family history utilization as the basis for competency assessment. Methods Results were tabulated and analyzed using descriptive statistical techniques. Findings Two-hundred-thirty-nine licensed registered nurses, 22 to 72 years of age, with a median of 20 years in practice, responded, for an overall response rate of 28%. Most were White (83%), female (92%), and held baccalaureate degrees (56%). Seventy-one percent considered genetics to be very important to nursing practice; however, 81% rated their understanding of the genetics of common diseases as poor or fair. Per-question response rates varied widely. Instrument assessment indicated that modifications were necessary to decrease respondent burden. Conclusions Respondents’ perceived genomic competency was inadequate, family history was not routinely utilized in care delivery, and the extent of family history varied widely. However, most nurses indicated interest in pursuing continuing genomic education. Clinical Relevance Findings from this study can lead to the development of targeted education that will facilitate optimal workforce preparation for the ongoing influx of genetics and genomics information, technologies, and targeted therapies into the healthcare arena. This pilot study provides a foundation on which to build the next step, which includes a national nursing workforce study. PMID:23205780

  20. Integrated Analysis of Whole Genome and Transcriptome Sequencing Reveals Diverse Transcriptomic Aberrations Driven by Somatic Genomic Changes in Liver Cancers

    PubMed Central

    Shiraishi, Yuichi; Fujimoto, Akihiro; Furuta, Mayuko; Tanaka, Hiroko; Chiba, Ken-ichi; Boroevich, Keith A.; Abe, Tetsuo; Kawakami, Yoshiiku; Ueno, Masaki; Gotoh, Kunihito; Ariizumi, Shun-ichi; Shibuya, Tetsuo; Nakano, Kaoru; Sasaki, Aya; Maejima, Kazuhiro; Kitada, Rina; Hayami, Shinya; Shigekawa, Yoshinobu; Marubashi, Shigeru; Yamada, Terumasa; Kubo, Michiaki; Ishikawa, Osamu; Aikata, Hiroshi; Arihiro, Koji; Ohdan, Hideki; Yamamoto, Masakazu; Yamaue, Hiroki; Chayama, Kazuaki; Tsunoda, Tatsuhiko; Miyano, Satoru; Nakagawa, Hidewaki

    2014-01-01

    Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, the transcriptional consequences of these genomic alterations in cancer genomes remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on the affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with or without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate that genomic alterations in cancer genomes have diverse transcriptomic effects, and that integrated analysis of WGS and RNA-Seq can facilitate the interpretation of a large number of genomic alterations detected in cancer genomes. PMID:25526364

  1. Integrated Genome-Based Studies of Shewanella Ecophysiology

    SciTech Connect

    Margrethe H. Serres

    2012-06-29

    Shewanella oneidensis MR-1 is a motile, facultative γ-Proteobacterium with remarkable respiratory versatility; it can utilize a range of organic and inorganic compounds as terminal electron acceptors for anaerobic metabolism. The ability to effectively reduce nitrate, S0, polyvalent metals and radionuclides has established MR-1 as an important model dissimilatory metal-reducing microorganism for genome-based investigations of biogeochemical transformation of metals and radionuclides that are of concern at U.S. Department of Energy (DOE) sites nationwide. Metal-reducing bacteria such as Shewanella also have a highly developed capacity for extracellular transfer of respiratory electrons to solid phase Fe and Mn oxides as well as directly to anode surfaces in microbial fuel cells. More broadly, Shewanellae are recognized free-living microorganisms and members of microbial communities involved in the decomposition of organic matter and the cycling of elements in aquatic and sedimentary systems. To function and compete in environments that are subject to spatial and temporal environmental change, Shewanella must be able to sense and respond to such changes and therefore requires relatively robust sensing and regulation systems. The overall goal of this project is to apply the tools of genomics, leveraging the availability of genome sequence for 18 additional strains of Shewanella, to better understand the ecophysiology and speciation of respiratory-versatile members of this important genus. To understand these systems we propose to use genome-based approaches to investigate Shewanella as a system of integrated networks; first describing key cellular subsystems - those involved in signal transduction, regulation, and metabolism - then building towards understanding the function of whole cells and, eventually, cells within populations. As a general approach, this project will employ complementary "top-down" - bioinformatics-based genome functional predictions, high...

  2. An integrative computational approach for prioritization of genomic variants

    DOE PAGESBeta

    Dubchak, Inna; Balasubramanian, Sandhya; Wang, Sheng; Meydan, Cem; Sulakhe, Dinanath; Poliakov, Alexander; Börnigen, Daniela; Xie, Bingqing; Taylor, Andrew; Ma, Jianzhu; et al

    2014-12-15

    An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. The study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest.

  3. An integrative computational approach for prioritization of genomic variants

    SciTech Connect

    Dubchak, Inna; Balasubramanian, Sandhya; Wang, Sheng; Meydan, Cem; Sulakhe, Dinanath; Poliakov, Alexander; Börnigen, Daniela; Xie, Bingqing; Taylor, Andrew; Ma, Jianzhu; Paciorkowski, Alex R.; Mirzaa, Ghayda M.; Dave, Paul; Agam, Gady; Xu, Jinbo; Al-Gazali, Lihadh; Mason, Christopher E.; Ross, M. Elizabeth; Maltsev, Natalia; Gilliam, T. Conrad; Huang, Qingyang

    2014-12-15

    An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. The study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest.

  4. An integrative computational approach for prioritization of genomic variants.

    PubMed

    Dubchak, Inna; Balasubramanian, Sandhya; Wang, Sheng; Cem, Meydan; Meyden, Cem; Sulakhe, Dinanath; Poliakov, Alexander; Börnigen, Daniela; Xie, Bingqing; Taylor, Andrew; Ma, Jianzhu; Paciorkowski, Alex R; Mirzaa, Ghayda M; Dave, Paul; Agam, Gady; Xu, Jinbo; Al-Gazali, Lihadh; Mason, Christopher E; Ross, M Elizabeth; Maltsev, Natalia; Gilliam, T Conrad

    2014-01-01

    An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. The study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest. PMID:25506935

  5. An integrative computational approach for prioritization of genomic variants.

    PubMed

    Dubchak, Inna; Balasubramanian, Sandhya; Wang, Sheng; Cem, Meydan; Meyden, Cem; Sulakhe, Dinanath; Poliakov, Alexander; Börnigen, Daniela; Xie, Bingqing; Taylor, Andrew; Ma, Jianzhu; Paciorkowski, Alex R; Mirzaa, Ghayda M; Dave, Paul; Agam, Gady; Xu, Jinbo; Al-Gazali, Lihadh; Mason, Christopher E; Ross, M Elizabeth; Maltsev, Natalia; Gilliam, T Conrad

    2014-01-01

    An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. The study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest.

  6. TALEN-mediated genome editing: prospects and perspectives

    SciTech Connect

    Wright, DA; Li, T; Yang, B; Spalding, MH

    2014-08-15

    Genome editing is the practice of making predetermined and precise changes to a genome by controlling the location of DNA DSBs (double-strand breaks) and manipulating the cell's repair mechanisms. This technology results from harnessing natural processes that have taken decades and multiple lines of inquiry to understand. Through many false starts and iterative technology advances, the goal of genome editing is just now falling under the control of human hands as a routine and broadly applicable method. The present review attempts to define the technique and capture the discovery process while following its evolution from meganucleases and zinc finger nucleases to the current state of the art: TALEN (transcription-activator-like effector nuclease) technology. We also discuss factors that influence success, technical challenges, and future prospects of this quickly evolving area of study and application.

  7. Integrative Genomics Identifies Gene Signature Associated with Melanoma Ulceration

    PubMed Central

    Toth, Reka; Vizkeleti, Laura; Hernandez-Vargas, Hector; Lazar, Viktoria; Emri, Gabriella; Szatmari, Istvan; Herceg, Zdenko; Adany, Roza; Balazs, Margit

    2013-01-01

    Background Despite the extensive research approaches applied to characterise malignant melanoma, no specific molecular markers are available that are clearly related to the progression of this disease. In this study, our aims were to define a gene expression signature associated with the clinical outcome of melanoma patients and to provide an integrative interpretation of the gene expression, copy number alteration (CNA), and promoter methylation patterns that contribute to clinically relevant molecular functional alterations. Methods Gene expression profiles were determined using the Affymetrix U133 Plus2.0 array. The NimbleGen Human CGH Whole-Genome Tiling array was used to define CNAs, and the Illumina GoldenGate Methylation platform was applied to characterise the methylation patterns of overlapping genes. Results We identified two subclasses of primary melanoma: one representing patients with better prognoses and the other being characteristic of patients with unfavourable outcomes. We assigned 1,080 genes as being significantly correlated with ulceration; 987 genes were downregulated and significantly enriched in the p53, Nf-kappaB, and WNT/beta-catenin pathways. Through integrated genome analysis, we defined 150 downregulated genes whose expression correlated with copy number losses in ulcerated samples. These genes were significantly enriched on chromosome 6q and 10q, which contained a total of 36 genes. Ten of these genes were downregulated and involved in cell-cell and cell-matrix adhesion or apoptosis. The expression and methylation patterns of additional genes exhibited an inverse correlation, suggesting that transcriptional silencing of these genes is driven by epigenetic events. Conclusion Using an integrative genomic approach, we were able to identify functionally relevant molecular hotspots characterised by copy number losses and promoter hypermethylation in distinct molecular subtypes of melanoma that contribute to specific transcriptomic silencing...

  8. Bilayer-thickness-mediated interactions between integral membrane proteins.

    PubMed

    Kahraman, Osman; Koch, Peter D; Klug, William S; Haselwandter, Christoph A

    2016-04-01

    Hydrophobic thickness mismatch between integral membrane proteins and the surrounding lipid bilayer can produce lipid bilayer thickness deformations. Experiment and theory have shown that protein-induced lipid bilayer thickness deformations can yield energetically favorable bilayer-mediated interactions between integral membrane proteins, and large-scale organization of integral membrane proteins into protein clusters in cell membranes. Within the continuum elasticity theory of membranes, the energy cost of protein-induced bilayer thickness deformations can be captured by considering compression and expansion of the bilayer hydrophobic core, membrane tension, and bilayer bending, resulting in biharmonic equilibrium equations describing the shape of lipid bilayers for a given set of bilayer-protein boundary conditions. Here we develop a combined analytic and numerical methodology for the solution of the equilibrium elastic equations associated with protein-induced lipid bilayer deformations. Our methodology allows accurate prediction of thickness-mediated protein interactions for arbitrary protein symmetries at arbitrary protein separations and relative orientations. We provide exact analytic solutions for cylindrical integral membrane proteins with constant and varying hydrophobic thickness, and develop perturbative analytic solutions for noncylindrical protein shapes. We complement these analytic solutions, and assess their accuracy, by developing both finite element and finite difference numerical solution schemes. We provide error estimates of our numerical solution schemes and systematically assess their convergence properties. Taken together, the work presented here puts into place an analytic and numerical framework which allows calculation of bilayer-mediated elastic interactions between integral membrane proteins for the complicated protein shapes suggested by structural biology and at the small protein separations most relevant for the crowded membrane
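
    As a hedged illustration of the continuum elasticity framework described in this abstract, one commonly used form of the thickness deformation energy and the biharmonic equilibrium equation that follows from it is sketched below. The symbols are assumptions and do not appear in the abstract: u is the thickness deformation field of one leaflet, a is half the unperturbed hydrophobic thickness, K_b is the bending rigidity, K_t is the thickness deformation modulus, and \tau is the membrane tension.

```latex
% Elastic energy: bending, compression/expansion of the hydrophobic core, and tension
\begin{align}
G &= \frac{1}{2}\int dx\,dy\,\left\{ K_b\left(\nabla^2 u\right)^2
     + K_t\left(\frac{u}{a}\right)^2
     + \tau\left[\frac{2u}{a} + \left(\nabla u\right)^2\right] \right\} \\
% Euler-Lagrange equation obtained by varying G with respect to u
0 &= K_b\,\nabla^4 u - \tau\,\nabla^2 u + \frac{K_t}{a^2}\,u + \frac{\tau}{a} \\
% Shifting the field removes the constant term and exposes the biharmonic structure
\bar{u} &= u + \frac{\tau a}{K_t}, \qquad
\left(\nabla^2 - \nu_+\right)\left(\nabla^2 - \nu_-\right)\bar{u} = 0, \qquad
\nu_\pm = \frac{1}{2K_b}\left[\tau \pm \sqrt{\tau^2 - \frac{4K_bK_t}{a^2}}\,\right]
\end{align}
```

    In this form the bilayer-protein boundary conditions mentioned in the abstract would fix u and its normal derivative along each protein boundary; the specific boundary values are left open here.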

  9. Integrative Genomic Characterization and a Genomic Staging System for Gastrointestinal Stromal Tumors

    PubMed Central

    Ylipää, Antti; Hunt, Kelly K.; Yang, Jilong; Lazar, Alexander J. F.; Torres, Keila E.; Lev, Dina Chelouche; Nykter, Matti; Pollock, Raphael E.; Trent, Jonathan; Zhang, Wei

    2010-01-01

    Gastrointestinal stromal tumors (GISTs) were historically grouped with leiomyosarcomas (LMSs) based on their morphological similarities, but recently they have been unequivocally established as a distinct type of sarcoma based on the molecular features and response to imatinib treatment. To gain further insight into the genomic differences between GISTs and LMSs, we mapped gene copy number aberrations (CNAs) in 42 GISTs and 30 LMSs and integrated them with gene expression profiles. Our studies revealed distinct patterns of CNAs between GISTs and LMSs. Losses in chromosomes 1p, 14q, 15q, and 22q were significantly more frequent in GISTs than in LMSs (P < 0.001), whereas losses in chromosomes 10 and 16 as well as gains in 1q, 14q, and 15q (P < 0.001) were more common in LMSs. By integrating CNAs with gene expression data and clinical information, we found several clinically relevant CNAs that were prognostic of survival in patients with GIST. Furthermore, GISTs were categorized into four groups according to an accumulating pattern of genetic alterations. Many key cellular pathways were differently expressed in the four groups and the patients had increasingly worse prognosis as the extent of genomic alterations increased. These findings lead us to propose a new tumor-progression genetic staging system termed Genomic Instability Stage (GIS) to complement the current prognostic predictive system based on tumor size, mitotic index (MI), and KIT mutation. PMID:20818650

  10. An Integrated Genetic and Cytogenetic Map of the Cucumber Genome

    PubMed Central

    Staub, Jack E.; Han, Yonghua; Cheng, Zhouchao; Li, Xuefeng; Lu, Jingyuan; Miao, Han; Kang, Houxiang; Xie, Bingyan; Gu, Xingfang; Wang, Xiaowu; Du, Yongchen; Jin, Weiwei; Huang, Sanwen

    2009-01-01

    The Cucurbitaceae includes important crops such as cucumber, melon, watermelon, squash and pumpkin. However, few genetic and genomic resources are available for plant improvement. Some cucurbit species such as cucumber have a narrow genetic base, which impedes construction of saturated molecular linkage maps. We report herein the development of highly polymorphic simple sequence repeat (SSR) markers derived from whole genome shotgun sequencing and the subsequent construction of a high-density genetic linkage map. This map includes 995 SSRs in seven linkage groups spanning a total of 573 cM, and defines ∼680 recombination breakpoints with an average of 0.58 cM between two markers. These linkage groups were then assigned to seven corresponding chromosomes using fluorescent in situ hybridization (FISH). FISH assays also revealed a chromosomal inversion between Cucumis subspecies [C. sativus var. sativus L. and var. hardwickii (R.) Alef], which resulted in marker clustering on the genetic map. A quarter of the mapped markers showed relatively high polymorphism levels among 11 inbred lines of cucumber. Among the 995 markers, 49%, 26% and 22% were conserved in melon, watermelon and pumpkin, respectively. This map will facilitate whole genome sequencing, positional cloning, and molecular breeding in cucumber, and enable the integration of knowledge of genes and traits in cucurbits. PMID:19495411
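
    The average marker spacing quoted in this abstract can be checked with a short calculation. The sketch below is illustrative only: the function name and the option to subtract one interval per linkage group are assumptions, not part of the published analysis. Both ways of counting intervals round to the reported 0.58 cM.

```python
def mean_marker_spacing(total_cm, n_markers, n_groups=0):
    """Average genetic distance between adjacent markers, in cM.

    n_groups is a hypothetical refinement: each linkage group with k markers
    has k - 1 intervals, so subtracting the number of groups counts intervals
    rather than markers. With n_groups=0 this is simply map length / markers.
    """
    intervals = n_markers - n_groups
    if intervals <= 0:
        raise ValueError("need more markers than linkage groups")
    return total_cm / intervals


# Values taken from the abstract: 995 SSRs, seven linkage groups, 573 cM.
simple = mean_marker_spacing(573, 995)
by_interval = mean_marker_spacing(573, 995, n_groups=7)
print(round(simple, 2), round(by_interval, 2))
```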

  11. Integrative genome-wide approaches in embryonic stem cell research.

    PubMed

    Zhang, Xinyue; Huang, Jing

    2010-10-01

    Embryonic stem (ES) cells are derived from blastocysts. They can differentiate into the three embryonic germ layers and essentially any type of somatic cell. They therefore hold great potential in tissue regeneration therapy. The ethical issues associated with the use of human embryonic stem cells are resolved by the technical breakthrough of generating induced pluripotent stem (iPS) cells from various types of somatic cells. However, how ES and iPS cells self-renew and maintain their pluripotency is still largely unknown in spite of the great progress that has been made in the last two decades. Integrative genome-wide approaches, such as the gene expression microarray, chromatin immunoprecipitation based microarray (ChIP-chip) and chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-seq), offer unprecedented opportunities to elucidate the mechanisms of pluripotency, reprogramming and the DNA damage response of ES and iPS cells. This frontier article summarizes the fundamental biological questions about ES and iPS cells and reviews the recent advances in ES and iPS cell research using genome-wide technologies. Finally, we offer our perspectives on the future of genome-wide studies on stem cells.

  12. Inverted genomic segments and complex triplication rearrangements are mediated by inverted repeats in the human genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We identified complex genomic rearrangements consisting of intermixed duplications and triplications of genomic segments at the MECP2 and PLP1 loci. These complex rearrangements were characterized by a triplicated segment embedded within a duplication in 11 unrelated subjects. Notably, only two brea...

  13. A New Approach to Dissect Nuclear Organization: TALE-Mediated Genome Visualization (TGV).

    PubMed

    Miyanari, Yusuke

    2016-01-01

    Spatiotemporal organization of chromatin within the nucleus has so far remained elusive. Live visualization of nuclear remodeling could be a promising approach to understand its functional relevance in genome functions and mechanisms regulating genome architecture. Recent technological advances in live imaging of chromosomes have begun to explore the biological roles of the movement of the chromatin within the nucleus. Here I describe a new technique, called TALE-mediated genome visualization (TGV), which allows us to visualize endogenous repetitive sequences, including centromeric, pericentromeric, and telomeric repeats, in living cells.

  14. Random tag insertions by Transposon Integration mediated Mutagenesis (TIM).

    PubMed

    Hoeller, Brigitte M; Reiter, Birgit; Abad, Sandra; Graze, Ina; Glieder, Anton

    2008-10-01

    Transposon Integration mediated Mutagenesis (TIM) is a broadly applicable tool for protein engineering. This method combines random integration of modified bacteriophage Mu transposons with their subsequent defined excision employing type IIS restriction endonuclease AarI. TIM enables deletion or insertion of an arbitrary number of bases at random positions, insertion of functional sequence tags at random positions, replacement of randomly selected triplets by a specific codon (e.g. scanning) and site-saturation mutagenesis. As a proof of concept, a transposon named GeneOpenerAarIKan was designed and employed to introduce 6xHis tags randomly into the esterase EstC from Burkholderia gladioli. A TIM library was screened with colony-based assays for clones with an integrated 6xHis tag and for clones exhibiting esterase activity. The employed strategy enables the isolation of randomly tagged active enzymes in single mutagenesis experiments.

  15. The integrated microbial genomes (IMG) system in 2007: data content and analysis tool extensions

    SciTech Connect

    Markowitz, Victor M.; Szeto, Ernest; Palaniappan, Krishna; Grechkin, Yuri; Chu, Ken; Chen, I-Min A.; Dubchak, Inna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2007-08-01

    The Integrated Microbial Genomes (IMG) system is a data management, analysis and annotation platform for all publicly available genomes. IMG contains both draft and complete JGI microbial genomes integrated with all other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and annotating genomes, genes and functions, individually or in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through quarterly releases. IMG is provided by the DOE-Joint Genome Institute (JGI) and is available from http://img.jgi.doe.gov.

  16. Integrated cytogenetics and genomics analysis of transposable elements in the Nile tilapia, Oreochromis niloticus.

    PubMed

    Valente, Guilherme; Kocher, Thomas; Eickbush, Thomas; Simões, Rafael P; Martins, Cesar

    2016-06-01

    Integration of cytogenetics and genomics has become essential to a better view of the architecture and function of genomes. Although advances in genomic sequencing have contributed to the study of genes and genomes, the repetitive DNA fraction of the genome is still enigmatic and poorly understood. Among repeated DNAs, transposable elements (TEs) are major components of eukaryotic chromatin, and their investigation has been hindered even after the availability of whole sequenced genomes. The cytogenetic mapping of TEs in chromosomes has proved to be of high value to integrate information from the micro level of nucleotide sequence to a cytological view of chromosomes. Different TEs have been cytogenetically mapped in cichlids; however, neither details about their genomic arrangement nor appropriated copy number are well defined by these approaches. The current study integrates TE distribution in the Nile tilapia Oreochromis niloticus genome based on cytogenetic and genomics/bioinformatics approaches. The results showed that some elements are not randomly distributed and that some are genomic dependent on each other. Moreover, we found extensive overlap between genomics and cytogenetics data and that tandem duplication may be the major mechanism responsible for the genomic dynamics of the TEs analyzed here. This paper provides insights into the genomic organization of TEs under an integrated view based on cytogenetics and genomics. PMID:26860923

  17. Potential pitfalls of CRISPR/Cas9-mediated genome editing.

    PubMed

    Peng, Rongxue; Lin, Guigao; Li, Jinming

    2016-04-01

    Recently, a novel technique named the clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein (Cas)9 system has been rapidly developed. This genome editing tool has improved our ability tremendously with respect to exploring the pathogenesis of diseases and correcting disease mutations, as well as phenotypes. With a short guide RNA, Cas9 can be precisely directed to target sites, and functions as an endonuclease to efficiently produce breaks in DNA double strands. Over the past 30 years, CRISPR has evolved from the 'curious sequences of unknown biological function' into a promising genome editing tool. As a result of the incessant development in the CRISPR/Cas9 system, Cas9 co-expressed with custom guide RNAs has been successfully used in a variety of cells and organisms. This genome editing technology can also be applied to synthetic biology, functional genomic screening, transcriptional modulation and gene therapy. However, although CRISPR/Cas9 has a broad range of action in science, there are several aspects that affect its efficiency and specificity, including Cas9 activity, target site selection and short guide RNA design, delivery methods, off-target effects and the incidence of homology-directed repair. In the present review, we highlight the factors that affect the utilization of CRISPR/Cas9, as well as possible strategies for handling any problems. Addressing these issues will allow us to take better advantage of this technique. In addition, we also review the history and rapid development of the CRISPR/Cas system from the time of its initial discovery in 2012.

  18. CRISPR/Cas9-mediated genome editing of Epstein-Barr virus in human cells.

    PubMed

    Yuen, Kit-San; Chan, Chi-Ping; Wong, Nok-Hei Mickey; Ho, Chau-Ha; Ho, Ting-Hin; Lei, Ting; Deng, Wen; Tsao, Sai Wah; Chen, Honglin; Kok, Kin-Hang; Jin, Dong-Yan

    2015-03-01

    The CRISPR (clustered regularly interspaced short palindromic repeats)/Cas9 (CRISPR-associated 9) system is a highly efficient and powerful tool for RNA-guided editing of the cellular genome. Whether CRISPR/Cas9 can also cleave the genome of DNA viruses such as Epstein-Barr virus (EBV), which undergo episomal replication in human cells, remains to be established. Here, we reported on CRISPR/Cas9-mediated editing of the EBV genome in human cells. Two guide RNAs (gRNAs) were used to direct a targeted deletion of 558 bp in the promoter region of BART (BamHI A rightward transcript) which encodes viral microRNAs (miRNAs). Targeted editing was achieved in several human epithelial cell lines latently infected with EBV, including nasopharyngeal carcinoma C666-1 cells. CRISPR/Cas9-mediated editing of the EBV genome was efficient. A recombinant virus with the desired deletion was obtained after puromycin selection of cells expressing Cas9 and gRNAs. No off-target cleavage was found by deep sequencing. The loss of BART miRNA expression and activity was verified, supporting the BART promoter as the major promoter of BART RNA. Although CRISPR/Cas9-mediated editing of the multicopy episome of EBV in infected HEK293 cells was mostly incomplete, viruses could be recovered and introduced into other cells at low m.o.i. Recombinant viruses with an edited genome could be further isolated through single-cell sorting. Finally, a DsRed selectable marker was successfully introduced into the EBV genome during the course of CRISPR/Cas9-mediated editing. Taken together, our work provided not only the first genetic evidence that the BART promoter drives the expression of the BART transcript, but also a new and efficient method for targeted editing of EBV genome in human cells.
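
    The two-gRNA deletion strategy described in this abstract can be illustrated with a minimal sketch. It assumes the commonly used Streptococcus pyogenes Cas9, which recognises an NGG PAM and cuts about 3 bp upstream of it; the abstract does not state which Cas9 variant or guide sequences were used, and the function names and toy sequence here are hypothetical. Only the forward strand is scanned.

```python
import re


def find_targets(seq, spacer_len=20):
    """Return (spacer_start, cut_index) for each forward-strand NGG PAM site.

    cut_index is the position of the first base after the expected blunt cut,
    assumed to lie 3 bp upstream of the PAM (an SpCas9 assumption).
    """
    seq = seq.upper()
    pattern = r"(?=([ACGT]{%d})[ACGT]GG)" % spacer_len  # lookahead keeps overlapping sites
    hits = []
    for match in re.finditer(pattern, seq):
        start = match.start()
        hits.append((start, start + spacer_len - 3))
    return hits


def deletion_length(cut_a, cut_b):
    """Number of bases removed if the segment between two cuts is lost."""
    return abs(cut_b - cut_a)


# Toy sequence with two target sites; not a real EBV sequence.
toy = "A" * 20 + "TGG" + "C" * 30 + "T" * 20 + "AGG"
sites = find_targets(toy)
print(sites, deletion_length(sites[0][1], sites[1][1]))
```

    A real design would also scan the reverse strand and score candidate guides for off-target matches, which this sketch does not attempt.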

  19. IMG 4 version of the integrated microbial genomes comparative analysis system.

    PubMed

    Markowitz, Victor M; Chen, I-Min A; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N; Kyrpides, Nikos C

    2014-01-01

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG's data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG's annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu). PMID:24165883

  20. IMG 4 version of the integrated microbial genomes comparative analysis system

    SciTech Connect

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2013-10-27

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Finally, different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  1. Integrated genome-wide analysis of genomic changes and gene regulation in human adrenocortical tissue samples

    PubMed Central

    Gara, Sudheer Kumar; Wang, Yonghong; Patel, Dhaval; Liu-Chittenden, Yi; Jain, Meenu; Boufraqech, Myriem; Zhang, Lisa; Meltzer, Paul S.; Kebebew, Electron

    2015-01-01

    To gain insight into the pathogenesis of adrenocortical carcinoma (ACC) and whether there is progression from normal-to-adenoma-to-carcinoma, we performed genome-wide gene expression, gene methylation, microRNA expression and comparative genomic hybridization (CGH) analysis in human adrenocortical tissue (normal, adrenocortical adenomas and ACC) samples. A pairwise comparison of normal, adrenocortical adenomas and ACC gene expression profiles with more than four-fold expression differences and an adjusted P-value < 0.05 revealed no major differences in normal versus adrenocortical adenoma, whereas 808 and 1085 genes were dysregulated in ACC versus adrenocortical adenoma and ACC versus normal, respectively. The majority of the dysregulated genes in ACC were downregulated. By integrating the CGH, gene methylation and expression profiles of potential miRNAs with the gene expression of dysregulated genes, we found more alterations in ACC versus normal than in ACC versus adrenocortical adenoma. Importantly, we identified several novel molecular pathways that are associated with dysregulated genes and further experimentally validated that oncostatin m signaling induces caspase 3 dependent apoptosis and suppresses cell proliferation. Finally, we propose that there is a higher number of genomic changes from normal-to-adenoma-to-carcinoma and identify oncostatin m signaling as a plausible druggable pathway for therapeutics. PMID:26446994
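
    The selection rule stated in this abstract (more than four-fold expression difference and adjusted P-value < 0.05) can be sketched as a simple filter. The abstract does not say which multiple-testing adjustment was applied; the Benjamini-Hochberg procedure below is an assumption, and the function names and example values are hypothetical.

```python
import math


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def dysregulated(names, log2_fold_changes, pvalues, min_fold=4.0, alpha=0.05):
    """Names of genes passing both the fold-change and adjusted-p thresholds."""
    threshold = math.log2(min_fold)  # four-fold corresponds to |log2 FC| > 2
    adjusted = bh_adjust(pvalues)
    return [
        name
        for name, lfc, q in zip(names, log2_fold_changes, adjusted)
        if abs(lfc) > threshold and q < alpha
    ]


# Made-up example values, not data from the study.
print(dysregulated(["A", "B", "C", "D"], [3.0, -2.5, 1.0, -4.0],
                   [0.001, 0.2, 0.03, 0.004]))
```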

  2. Integrated genome-wide analysis of genomic changes and gene regulation in human adrenocortical tissue samples.

    PubMed

    Gara, Sudheer Kumar; Wang, Yonghong; Patel, Dhaval; Liu-Chittenden, Yi; Jain, Meenu; Boufraqech, Myriem; Zhang, Lisa; Meltzer, Paul S; Kebebew, Electron

    2015-10-30

    To gain insight into the pathogenesis of adrenocortical carcinoma (ACC) and whether there is progression from normal-to-adenoma-to-carcinoma, we performed genome-wide gene expression, gene methylation, microRNA expression and comparative genomic hybridization (CGH) analysis in human adrenocortical tissue (normal, adrenocortical adenomas and ACC) samples. A pairwise comparison of normal, adrenocortical adenomas and ACC gene expression profiles with more than four-fold expression differences and an adjusted P-value < 0.05 revealed no major differences in normal versus adrenocortical adenoma, whereas 808 and 1085 genes were dysregulated in ACC versus adrenocortical adenoma and ACC versus normal, respectively. The majority of the dysregulated genes in ACC were downregulated. By integrating the CGH, gene methylation and expression profiles of potential miRNAs with the gene expression of dysregulated genes, we found more alterations in ACC versus normal than in ACC versus adrenocortical adenoma. Importantly, we identified several novel molecular pathways that are associated with dysregulated genes and further experimentally validated that oncostatin m signaling induces caspase 3 dependent apoptosis and suppresses cell proliferation. Finally, we propose that there is a higher number of genomic changes from normal-to-adenoma-to-carcinoma and identify oncostatin m signaling as a plausible druggable pathway for therapeutics.

  3. Functional profile of the binary brain corticosteroid receptor system: mediating, multitasking, coordinating, integrating.

    PubMed

    de Kloet, E R

    2013-11-01

    This contribution is focused on the action of the naturally occurring corticosteroids, cortisol and corticosterone, which are secreted from the adrenals in hourly pulses and after stress with the goal to maintain resilience and health. To achieve this goal the action of the corticosteroids displays an impressive diversity, because it is cell-specific and context-dependent in coordinating the individual's response to changing environments. These diverse actions of corticosterone are mediated by mineralocorticoid- and glucocorticoid-receptors that operate as a binary system in concert with neurotransmitter and neuropeptide signals to activate and inhibit stress reactions, respectively. Classically MR and GR are gene transcription factors, but recently these receptors appear to mediate also rapid non-genomic actions on excitatory neurotransmission suggesting that they integrate functions over time. Hence the balance of receptor-mediated actions is crucial for homeostasis. This balanced function of mineralo- and glucocorticoid-receptors can be altered epigenetically by a history of traumatic (early) life events and the experience of repeated stressors as well as by predisposing genetic variants in signaling pathways of these receptors. One of these variants, mineralocorticoid receptor haplotype 2, is associated with dispositional optimism in appraisal of environmental challenges. Imbalance in receptor-mediated corticosterone actions was found to leave a genomic signature highlighting the role of master switches such as cAMP response element-binding protein and mammalian target of rapamycin to compromise health, and to promote vulnerability to disease. Diabetic encephalopathy is a pathology of imbalanced corticosterone action, which can be corrected in its pre-stage by a brief treatment with the antiglucocorticoid mifepristone.

  4. Dual sgRNAs facilitate CRISPR/Cas9-mediated mouse genome targeting.

    PubMed

    Zhou, Jiankui; Wang, Jianying; Shen, Bin; Chen, Li; Su, Yang; Yang, Jing; Zhang, Wensheng; Tian, Xuemei; Huang, Xingxu

    2014-04-01

    The bacterial clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated 9 (Cas9) system is a versatile RNA-guided mammalian genome modification system. One-step generation of mouse genome targeting has been achieved by co-microinjection of one-cell stage embryos with Cas9 mRNA and small/single guide (sg)RNA. Many studies have focused on enhancing the efficiency of this system. In the present study, we report that simultaneous use of dual sgRNAs to target an individual gene significantly improved the Cas9-mediated genome targeting with a bi-allelic modification efficiency of up to 78%. We further observed that the target gene modifications were characterized by efficient germline transmission and site-dependent off-target effects, and also that the apolipoprotein E gene knockout-mediated defects in blood biochemical parameters were recapitulated by CRISPR/Cas9-mediated heritable gene modification. Our results provide a dual sgRNAs strategy to facilitate CRISPR/Cas9-mediated mouse genome targeting.

  5. CSN6 deregulation impairs genome integrity in a COP1-dependent pathway

    PubMed Central

    Choi, Hyun Ho; Su, Chun-Hui; Fang, Lekun; Zhang, Jin; Yeung, Sai-Ching J.; Lee, Mong-Hong

    2015-01-01

    Understanding genome integrity and DNA damage response are critical to cancer treatment. In this study, we identify CSN6's biological function in regulating genome integrity. Constitutive photomorphogenic 1 (COP1), an E3 ubiquitin ligase regulated by CSN6, is downregulated by DNA damage, but the biological consequences of this phenomenon are poorly understood. p27Kip1 is a critical CDK inhibitor involved in cell cycle regulation, but its response to DNA damage remains unclear. Here, we report that p27Kip1 levels are elevated after DNA damage, with concurrent reduction of COP1 levels. Mechanistic studies showed that during DNA damage response COP1's function as an E3 ligase of p27 is compromised, thereby reducing the ubiquitin-mediated degradation of p27Kip1. Also, COP1 overexpression leads to downregulation of p27Kip1, thereby promoting the expression of mitotic kinase Aurora A. Overexpression of Aurora A correlates with poor survival. These findings provide new insight into CSN6-COP1-p27Kip1-Aurora A axis in DNA damage repair and tumorigenesis. PMID:25957415

  6. Integrative pathway genomics of lung function and airflow obstruction.

    PubMed

    Gharib, Sina A; Loth, Daan W; Soler Artigas, María; Birkland, Timothy P; Wilk, Jemma B; Wain, Louise V; Brody, Jennifer A; Obeidat, Ma'en; Hancock, Dana B; Tang, Wenbo; Rawal, Rajesh; Boezen, H Marike; Imboden, Medea; Huffman, Jennifer E; Lahousse, Lies; Alves, Alexessander C; Manichaikul, Ani; Hui, Jennie; Morrison, Alanna C; Ramasamy, Adaikalavan; Smith, Albert Vernon; Gudnason, Vilmundur; Surakka, Ida; Vitart, Veronique; Evans, David M; Strachan, David P; Deary, Ian J; Hofman, Albert; Gläser, Sven; Wilson, James F; North, Kari E; Zhao, Jing Hua; Heckbert, Susan R; Jarvis, Deborah L; Probst-Hensch, Nicole; Schulz, Holger; Barr, R Graham; Jarvelin, Marjo-Riitta; O'Connor, George T; Kähönen, Mika; Cassano, Patricia A; Hysi, Pirro G; Dupuis, Josée; Hayward, Caroline; Psaty, Bruce M; Hall, Ian P; Parks, William C; Tobin, Martin D; London, Stephanie J

    2015-12-01

    Chronic respiratory disorders are important contributors to the global burden of disease. Genome-wide association studies (GWASs) of lung function measures have identified several trait-associated loci, but explain only a modest portion of the phenotypic variability. We postulated that integrating pathway-based methods with GWASs of pulmonary function and airflow obstruction would identify a broader repertoire of genes and processes influencing these traits. We performed two independent GWASs of lung function and applied gene set enrichment analysis to one of the studies and validated the results using the second GWAS. We identified 131 significantly enriched gene sets associated with lung function and clustered them into larger biological modules involved in diverse processes including development, immunity, cell signaling, proliferation and arachidonic acid. We found that enrichment of gene sets was not driven by GWAS-significant variants or loci, but instead by those with less stringent association P-values. Next, we applied pathway enrichment analysis to a meta-analyzed GWAS of airflow obstruction. We identified several biologic modules that functionally overlapped with those associated with pulmonary function. However, differences were also noted, including enrichment of extracellular matrix (ECM) processes specifically in the airflow obstruction study. Network analysis of the ECM module implicated a candidate gene, matrix metalloproteinase 10 (MMP10), as a putative disease target. We used a knockout mouse model to functionally validate MMP10's role in influencing lung's susceptibility to cigarette smoke-induced emphysema. By integrating pathway analysis with population-based genomics, we unraveled biologic processes underlying pulmonary function traits and identified a candidate gene for obstructive lung disease. PMID:26395457
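
    To make the idea of testing gene sets for enrichment concrete, the sketch below computes a one-sided hypergeometric over-representation P-value. This is a simpler stand-in and not the rank-based gene set enrichment analysis named in the abstract; the function name, parameters and example numbers are hypothetical.

```python
from math import comb


def enrichment_pvalue(n_universe, n_in_set, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn at random.

    n_universe: total genes tested; n_in_set: genes annotated to the gene set;
    n_selected: genes picked by the association analysis;
    n_overlap: selected genes that fall in the gene set.
    """
    upper = min(n_in_set, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 10 genes overall, 4 in the set, 3 selected, 2 of them in the set.
print(enrichment_pvalue(10, 4, 3, 2))
```

    With many gene sets tested at once, the resulting P-values would still need a multiple-testing correction before any set is called enriched.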

  8. Integrated Genomic and Epigenomic Analysis of Breast Cancer Brain Metastasis

    PubMed Central

    Salhia, Bodour; Kiefer, Jeff; Ross, Julianna T. D.; Metapally, Raghu; Martinez, Rae Anne; Johnson, Kyle N.; DiPerna, Danielle M.; Paquette, Kimberly M.; Jung, Sungwon; Nasser, Sara; Wallstrom, Garrick; Tembe, Waibhav; Baker, Angela; Carpten, John; Resau, Jim; Ryken, Timothy; Sibenaller, Zita; Petricoin, Emanuel F.; Liotta, Lance A.; Ramanathan, Ramesh K.; Berens, Michael E.; Tran, Nhan L.

    2014-01-01

    The brain is a common site of metastatic disease in patients with breast cancer, which has few therapeutic options and dismal outcomes. The purpose of our study was to identify common and rare events that underlie breast cancer brain metastasis. We performed deep genomic profiling, which integrated gene copy number, gene expression and DNA methylation datasets on a collection of breast brain metastases. We identified frequent large chromosomal gains in 1q, 5p, 8q, 11q, and 20q and frequent broad-level deletions involving 8p, 17p, 21p and Xq. Frequently amplified and overexpressed genes included ATAD2, BRAF, DERL1, DNMTRB and NEK2A. The ATM, CRYAB and HSPB2 genes were commonly deleted and underexpressed. Knowledge mining revealed enrichment in cell cycle and G2/M transition pathways, which contained AURKA, AURKB and FOXM1. Using the PAM50 breast cancer intrinsic classifier, Luminal B, Her2+/ER negative, and basal-like tumors were identified as the most commonly represented breast cancer subtypes in our brain metastasis cohort. While overall methylation levels were increased in breast cancer brain metastasis, basal-like brain metastases were associated with significantly lower levels of methylation. Integrating DNA methylation data with gene expression revealed defects in cell migration and adhesion due to hypermethylation and downregulation of PENK, EDN3, and ITGAM. Hypomethylation and upregulation of KRT8 likely affects adhesion and permeability. Genomic and epigenomic profiling of breast brain metastasis has provided insight into the somatic events underlying this disease, which have potential in forming the basis of future therapeutic strategies. PMID:24489661

  9. Cas9-Mediated Genome Engineering in Drosophila melanogaster.

    PubMed

    Housden, Benjamin E; Perrimon, Norbert

    2016-01-01

    The recent development of the CRISPR-Cas9 system for genome engineering has revolutionized our ability to modify the endogenous DNA sequence of many organisms, including Drosophila. This system allows alteration of DNA sequences in situ with single base-pair precision and is now being used for a wide variety of applications. To use the CRISPR system effectively, various design parameters must be considered, including single guide RNA target site selection and identification of successful editing events. Here, we review recent advances in CRISPR methodology in Drosophila and introduce protocols for some of the more difficult aspects of CRISPR implementation: designing and generating CRISPR reagents and detecting indel mutations by high-resolution melt analysis. PMID:27587786

  10. Molecular Assemblies, Genes and Genomics Integrated Efficiently (MAGGIE)

    SciTech Connect

    Baliga, Nitin S

    2011-05-26

    applied to the manually curated training set. Applying this method to the data representing around a quarter of the fraction space for water soluble proteins in D. vulgaris, we obtained 854 reliable pairwise interactions. Further, we have developed algorithms to analyze and assign significance to protein interaction data from bait pull-down experiments and integrate these data with other systems biology data through associative biclustering in a parallel computing environment. We will 'fill-in' missing information in these interaction data using a 'Transitive Closure' algorithm and subsequently use 'Between Commonality Decomposition' algorithm to discover complexes within these large graphs of protein interactions. To characterize the metabolic activities of proteins and their complexes we are developing algorithms to deconvolute pure mass spectra, estimate chemical formula for m/z values, and fit isotopic fine structure to metabolomics data. We have discovered that, in comparison to isotopic pattern fitting methods, restricting the chemical formula by these two dimensions actually facilitates unique solutions for chemical formula generators. To understand how microbial functions are regulated we have developed complementary algorithms for reconstructing gene regulatory networks (GRNs). Whereas the network inference algorithms cMonkey and Inferelator developed enable de novo reconstruction of predictive models for GRNs from diverse systems biology data, the RegPrecise and RegPredict framework developed uses evolutionary comparisons of genomes from closely related organisms to reconstruct conserved regulons. We have integrated the two complementary algorithms to rapidly generate comprehensive models for gene regulation of understudied organisms. Our preliminary analyses of these reconstructed GRNs have revealed novel regulatory mechanisms and cis-regulatory motifs, as well as others that are conserved across species. Finally, we are supporting scientific efforts in ENIGMA

  11. Integrative bioinformatics for functional genome annotation: trawling for G protein-coupled receptors.

    PubMed

    Flower, Darren R; Attwood, Teresa K

    2004-12-01

    G protein-coupled receptors (GPCRs) are amongst the best studied and most functionally diverse types of cell-surface protein. The importance of GPCRs as mediators of cell function and organismal development underlies their involvement in key physiological roles and their prominence as targets for pharmacological therapeutics. In this review, we highlight the requirement for integrated protocols which underline the different perspectives offered by different sequence analysis methods. BLAST and FastA offer broad brush strokes. Motif-based search methods add the fine detail. Structural modelling offers another perspective which allows us to elucidate the physicochemical properties that underlie ligand binding. Together, these different views provide a more informative and a more detailed picture of GPCR structure and function. Many GPCRs remain orphan receptors with no identified ligand, yet as computer-driven functional genomics starts to elaborate their functions, a new understanding of their roles in cell and developmental biology will follow. PMID:15561589

  12. An Integrated Genome-Wide Systems Genetics Screen for Breast Cancer Metastasis Susceptibility Genes.

    PubMed

    Bai, Ling; Yang, Howard H; Hu, Ying; Shukla, Anjali; Ha, Ngoc-Han; Doran, Anthony; Faraji, Farhoud; Goldberger, Natalie; Lee, Maxwell P; Keane, Thomas; Hunter, Kent W

    2016-04-01

    Metastasis remains the primary cause of patient morbidity and mortality in solid tumors and is due to the action of a large number of tumor-autonomous and non-autonomous factors. Here we report the results of a genome-wide integrated strategy to identify novel metastasis susceptibility candidate genes and molecular pathways in breast cancer metastasis. This analysis implicates a number of transcriptional regulators and suggests cell-mediated immunity is an important determinant. Moreover, the analysis identified novel or FDA-approved drugs as potentially useful for anti-metastatic therapy. Further explorations implementing this strategy may therefore provide a variety of information for clinical applications in the control and treatment of advanced neoplastic disease. PMID:27074153

  13. Nuclear pores protect genome integrity by assembling a premitotic and Mad1-dependent anaphase inhibitor

    PubMed Central

    Rodriguez-Bravo, Veronica; Maciejowski, John; Corona, Jennifer; Buch, Håkon Kirkeby; Collin, Philippe; Kanemaki, Masato T.; Shah, Jagesh V.; Jallepalli, Prasad V.

    2014-01-01

    The spindle assembly checkpoint (SAC) delays anaphase until all chromosomes are bi-oriented on the mitotic spindle. Under current models, unattached kinetochores transduce the SAC by catalyzing the intramitotic production of a diffusible APC/C-Cdc20 inhibitor. Here we show that nuclear pore complexes (NPCs) in interphase cells also function as scaffolds for anaphase-inhibitory signaling. This role is mediated by Mad1-Mad2 complexes tethered to the nuclear basket, which activate soluble Mad2 as a binding partner and inhibitor of Cdc20 in the cytoplasm. Displacing Mad1-Mad2 from nuclear pores accelerated anaphase onset, prevented effective correction of merotelic errors, and increased the threshold of kinetochore-dependent signaling needed to halt mitosis in response to spindle poisons. A heterologous Mad1-NPC tether restored Cdc20 inhibitor production and normal M phase control. We conclude that nuclear pores and kinetochores both emit “wait anaphase” signals that preserve genome integrity. PMID:24581499

  14. Androgen receptor-mediated non-genomic regulation of prostate cancer cell proliferation

    PubMed Central

    Liao, Ross S.; Ma, Shihong; Miao, Lu; Li, Rui; Yin, Yi

    2013-01-01

    Androgen receptor (AR)-mediated signaling is necessary for prostate cancer cell proliferation and an important target for therapeutic drug development. Canonically, AR signals through a genomic or transcriptional pathway, involving the translocation of androgen-bound AR to the nucleus, its binding to cognate androgen response elements on promoters, with ensuing modulation of target gene expression, leading to cell proliferation. However, prostate cancer cells can show dose-dependent proliferation responses to androgen within minutes, without the need for genomic AR signaling. This proliferation response, known as non-genomic AR signaling, is mediated by cytoplasmic AR, which facilitates the activation of kinase-signaling cascades, including the Ras-Raf-1, phosphatidyl-inositol 3-kinase (PI3K)/Akt and protein kinase C (PKC), which in turn converge on mitogen-activated protein kinase (MAPK)/extracellular signal-regulated kinase (ERK) activation, leading to cell proliferation. Further, since activated ERK may also phosphorylate AR and its coactivators, non-genomic AR signaling may enhance AR genomic activity. Non-genomic AR signaling may occur in an ERK-independent manner, via activation of mammalian target of rapamycin (mTOR) pathway, or modulation of intracellular Ca2+ concentration through plasma membrane G protein-coupled receptors (GPCRs). These data suggest that therapeutic strategies aimed at preventing AR nuclear translocation and genomic AR signaling alone may not completely abrogate AR signaling. Thus, elucidation of mechanisms that underlie non-genomic AR signaling may identify potential mechanisms of resistance to current anti-androgens and help develop novel therapies that abolish all AR signaling in prostate cancer. PMID:26816736

  15. Examination of host genome for the presence of integrated fragments of Solenopsis invicta virus 1.

    PubMed

    Valles, Steven M; Bextine, Blake

    2011-07-01

    A series of oligonucleotide primer pairs covering the entire genome of Solenopsis invicta virus 1 (SINV-1) were used to probe the genome of its host, S. invicta, for integrated fragments of the viral genome. All of the oligonucleotide primer sets yielded amplicons of anticipated size from cDNA created from an RNA template from SINV-1. However, no corresponding amplification was observed when genomic DNA (from 32 colonies of S. invicta) was used as template for the PCR amplifications. Host DNA integrity was verified by amplification of an ant-specific gene, SiGSTS1. The representation of fire ant colonies included both social forms, monogyne and polygyne, and those infected and uninfected with SINV-1. Furthermore, no amplification was observed from genomic DNA from ant samples collected from Argentina or the US. Thus, it appears that integration of the SINV-1 genome, or a portion thereof, has likely not occurred within the S. invicta host genome.

  16. CRISPR/Cas9-mediated genome engineering of CHO cell factories: Application and perspectives.

    PubMed

    Lee, Jae Seong; Grav, Lise Marie; Lewis, Nathan E; Faustrup Kildegaard, Helene

    2015-07-01

    Chinese hamster ovary (CHO) cells are the most widely used production host for therapeutic proteins. With the recent emergence of CHO genome sequences, CHO cell line engineering has taken on a new aspect through targeted genome editing. The bacterial clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) system enables rapid, easy and efficient engineering of mammalian genomes. It has a wide range of applications from modification of individual genes to genome-wide screening or regulation of genes. Facile genome editing using CRISPR/Cas9 empowers researchers in the CHO community to elucidate the mechanistic basis behind high level production of proteins and product quality attributes of interest. In this review, we describe the basis of CRISPR/Cas9-mediated genome editing and its application for development of next generation CHO cell factories while highlighting both future perspectives and challenges. As one of the main drivers for the CHO systems biology era, genome engineering with CRISPR/Cas9 will pave the way for rational design of CHO cell factories.

  17. Integrated database of information from structural genomics experiments.

    PubMed

    Asada, Yukuhiko; Sugahara, Michihiro; Mizutani, Hisashi; Naitow, Hisashi; Tanaka, Tomoyuki; Matsuura, Yoshinori; Agari, Yoshihiro; Ebihara, Akio; Shinkai, Akeo; Kuramitsu, Seiki; Yokoyama, Shigeyuki; Kaminuma, Eri; Kobayashi, Norio; Nishikata, Koro; Shimoyama, Sayoko; Toyoda, Tetsuro; Ishikawa, Tetsuya; Kunishima, Naoki

    2013-05-01

    Information from structural genomics experiments at the RIKEN SPring-8 Center, Japan has been compiled and published as an integrated database. The contents of the database are (i) experimental data from nine species of bacteria that cover a large variety of protein molecules in terms of both evolution and properties (http://database.riken.jp/db/bacpedia), (ii) experimental data from mutant proteins that were designed systematically to study the influence of mutations on the diffraction quality of protein crystals (http://database.riken.jp/db/bacpedia) and (iii) experimental data from heavy-atom-labelled proteins from the heavy-atom database HATODAS (http://database.riken.jp/db/hatodas). The database integration adopts the semantic web, which is suitable for data reuse and automatic processing, thereby allowing batch downloads of full data and data reconstruction to produce new databases. In addition, to enhance the use of data (i) and (ii) by general researchers in biosciences, a comprehensible user interface, Bacpedia (http://bacpedia.harima.riken.jp), has been developed.

  18. The integrated web service and genome database for agricultural plants with biotechnology information

    PubMed Central

    Kim, ChangKug; Park, DongSuk; Seol, YoungJoo; Hahn, JangHo

    2011-01-01

    The National Agricultural Biotechnology Information Center (NABIC) constructed an agricultural biology-based infrastructure and developed a Web based relational database for agricultural plants with biotechnology information. The NABIC has concentrated on functional genomics of major agricultural plants, building an integrated biotechnology database for agro-biotech information that focuses on genomics of major agricultural resources. This genome database provides annotated genome information from 1,039,823 records mapped to rice, Arabidopsis, and Chinese cabbage. PMID:21887015

  19. Accessing integrated genomic data using GenoBase: A tutorial, Part 1

    SciTech Connect

    Overbeek, R.; Price, M.

    1993-01-01

    GenoBase integrates genomic information from many existing databases, offering convenient access to the curated data. This document is the first part of a two-part tutorial on how to use GenoBase for accessing integrated genomic data.

  20. Non-coding RNAs mediate the rearrangements of genomic DNA in ciliates.

    PubMed

    Feng, Xuezhu; Guang, Shouhong

    2013-10-01

    Most eukaryotes employ a variety of mechanisms to defend the integrity of their genome by recognizing and silencing parasitic mobile nucleic acids. However, recent studies have shown that genomic DNA undergoes extensive rearrangements, including DNA elimination, fragmentation, and unscrambling, during the sexual reproduction of ciliated protozoa. Non-coding RNAs have been identified to program and regulate genome rearrangement events. In Paramecium and Tetrahymena, scan RNAs (scnRNAs) are produced from micronuclei and transported to vegetative macronuclei, in which scnRNA elicits the elimination of cognate genomic DNA. In contrast, Piwi-interacting RNAs (piRNAs) in Oxytricha enable the retention of genomic DNA that exhibits sequence complementarity in macronuclei. An RNA interference (RNAi)-like mechanism has been found to direct these genomic rearrangements. Furthermore, in Oxytricha, maternal RNA templates can guide the unscrambling process of genomic DNA. The non-coding RNA-directed genome rearrangements may have profound evolutionary implications, for example, eliciting the multigenerational inheritance of acquired adaptive traits. PMID:24008384

  1. Integration of molecular functions at the ecosystemic level: breakthroughs and future goals of environmental genomics and post-genomics

    PubMed Central

    Vandenkoornhuyse, Philippe; Dufresne, Alexis; Quaiser, Achim; Gouesbet, Gwenola; Binet, Françoise; Francez, André-Jean; Mahé, Stéphane; Bormans, Myriam; Lagadeuc, Yvan; Couée, Ivan

    2010-01-01

    Environmental genomics and genome-wide expression approaches deal with large-scale sequence-based information obtained from environmental samples, at organismal, population or community levels. To date, environmental genomics, transcriptomics and proteomics are arguably the most powerful approaches to discover completely novel ecological functions and to link organismal capabilities, organism–environment interactions, functional diversity, ecosystem processes, evolution and Earth history. Thus, environmental genomics is not merely a toolbox of new technologies but also a source of novel ecological concepts and hypotheses. By removing previous dichotomies between ecophysiology, population ecology, community ecology and ecosystem functioning, environmental genomics enables the integration of sequence-based information into higher ecological and evolutionary levels. However, environmental genomics, along with transcriptomics and proteomics, must involve pluridisciplinary research, such as new developments in bioinformatics, in order to integrate high-throughput molecular biology techniques into ecology. In this review, the validity of environmental genomics and post-genomics for studying ecosystem functioning is discussed in terms of major advances and expectations, as well as in terms of potential hurdles and limitations. Novel avenues for improving the use of these approaches to test theory-driven ecological hypotheses are also explored. PMID:20426792

  2. Divergence of RNA polymerase α subunits in angiosperm plastid genomes is mediated by genomic rearrangement

    PubMed Central

    Blazier, J. Chris; Ruhlman, Tracey A.; Weng, Mao-Lun; Rehman, Sumaiyah K.; Sabir, Jamal S. M.; Jansen, Robert K.

    2016-01-01

    Genes for the plastid-encoded RNA polymerase (PEP) persist in the plastid genomes of all photosynthetic angiosperms. However, three unrelated lineages (Annonaceae, Passifloraceae and Geraniaceae) have been identified with unusually divergent open reading frames (ORFs) in the conserved region of rpoA, the gene encoding the PEP α subunit. We used sequence-based approaches to evaluate whether these genes retain function. Both gene sequences and complete plastid genome sequences were assembled and analyzed from each of the three angiosperm families. Multiple lines of evidence indicated that the rpoA sequences are likely functional despite retaining as low as 30% nucleotide sequence identity with rpoA genes from outgroups in the same angiosperm order. The ratio of non-synonymous to synonymous substitutions indicated that these genes are under purifying selection, and bioinformatic prediction of conserved domains indicated that functional domains are preserved. One of the lineages (Pelargonium, Geraniaceae) contains species with multiple rpoA-like ORFs that show evidence of ongoing inter-paralog gene conversion. The plastid genomes containing these divergent rpoA genes have experienced extensive structural rearrangement, including large expansions of the inverted repeat. We propose that illegitimate recombination, not positive selection, has driven the divergence of rpoA. PMID:27087667

  3. Divergence of RNA polymerase α subunits in angiosperm plastid genomes is mediated by genomic rearrangement.

    PubMed

    Blazier, J Chris; Ruhlman, Tracey A; Weng, Mao-Lun; Rehman, Sumaiyah K; Sabir, Jamal S M; Jansen, Robert K

    2016-01-01

    Genes for the plastid-encoded RNA polymerase (PEP) persist in the plastid genomes of all photosynthetic angiosperms. However, three unrelated lineages (Annonaceae, Passifloraceae and Geraniaceae) have been identified with unusually divergent open reading frames (ORFs) in the conserved region of rpoA, the gene encoding the PEP α subunit. We used sequence-based approaches to evaluate whether these genes retain function. Both gene sequences and complete plastid genome sequences were assembled and analyzed from each of the three angiosperm families. Multiple lines of evidence indicated that the rpoA sequences are likely functional despite retaining as low as 30% nucleotide sequence identity with rpoA genes from outgroups in the same angiosperm order. The ratio of non-synonymous to synonymous substitutions indicated that these genes are under purifying selection, and bioinformatic prediction of conserved domains indicated that functional domains are preserved. One of the lineages (Pelargonium, Geraniaceae) contains species with multiple rpoA-like ORFs that show evidence of ongoing inter-paralog gene conversion. The plastid genomes containing these divergent rpoA genes have experienced extensive structural rearrangement, including large expansions of the inverted repeat. We propose that illegitimate recombination, not positive selection, has driven the divergence of rpoA. PMID:27087667

  4. Small-RNA-Mediated Genome-wide trans-Recognition Network in Tetrahymena DNA Elimination.

    PubMed

    Noto, Tomoko; Kataoka, Kensuke; Suhren, Jan H; Hayashi, Azusa; Woolcock, Katrina J; Gorovsky, Martin A; Mochizuki, Kazufumi

    2015-07-16

    Small RNAs are used to silence transposable elements (TEs) in many eukaryotes, which use diverse evolutionary solutions to identify TEs. In ciliated protozoans, small-RNA-mediated comparison of the germline and somatic genomes underlies identification of TE-related sequences, which are then eliminated from the soma. Here, we describe an additional mechanism of small-RNA-mediated identification of TE-related sequences in the ciliate Tetrahymena. We show that a limited set of internal eliminated sequences (IESs) containing potentially active TEs produces a class of small RNAs that recognize not only the IESs from which they are derived, but also other IESs in trans. This trans recognition triggers the expression of yet another class of small RNAs that identify other IESs. Therefore, TE-related sequences in Tetrahymena are robustly targeted for elimination by a genome-wide trans-recognition network accompanied by a chain reaction of small RNA production.

  5. Small-RNA-Mediated Genome-wide trans-Recognition Network in Tetrahymena DNA Elimination

    PubMed Central

    Noto, Tomoko; Kataoka, Kensuke; Suhren, Jan H.; Hayashi, Azusa; Woolcock, Katrina J.; Gorovsky, Martin A.; Mochizuki, Kazufumi

    2015-01-01

    Small RNAs are used to silence transposable elements (TEs) in many eukaryotes, which use diverse evolutionary solutions to identify TEs. In ciliated protozoans, small-RNA-mediated comparison of the germline and somatic genomes underlies identification of TE-related sequences, which are then eliminated from the soma. Here, we describe an additional mechanism of small-RNA-mediated identification of TE-related sequences in the ciliate Tetrahymena. We show that a limited set of internal eliminated sequences (IESs) containing potentially active TEs produces a class of small RNAs that recognize not only the IESs from which they are derived, but also other IESs in trans. This trans recognition triggers the expression of yet another class of small RNAs that identify other IESs. Therefore, TE-related sequences in Tetrahymena are robustly targeted for elimination by a genome-wide trans-recognition network accompanied by a chain reaction of small RNA production. PMID:26095658

  6. The Integrated Microbial Genomes (IMG) System: An Expanding Comparative Analysis Resource

    SciTech Connect

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Grechkin, Yuri; Ratner, Anna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2009-09-13

    The integrated microbial genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG contains both draft and complete microbial genomes integrated with other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through regular releases. Several companion IMG systems have been set up in order to serve domain specific needs, such as expert review of genome annotations. IMG is available at .

  7. Human Ageing Genomic Resources: Integrated databases and tools for the biology and genetics of ageing

    PubMed Central

    Tacutu, Robi; Craig, Thomas; Budovsky, Arie; Wuttke, Daniel; Lehmann, Gilad; Taranukha, Dmitri; Costa, Joana; Fraifeld, Vadim E.; de Magalhães, João Pedro

    2013-01-01

    The Human Ageing Genomic Resources (HAGR, http://genomics.senescence.info) is a freely available online collection of research databases and tools for the biology and genetics of ageing. HAGR now features several databases with high-quality manually curated data: (i) GenAge, a database of genes associated with ageing in humans and model organisms; (ii) AnAge, an extensive collection of longevity records and complementary traits for >4000 vertebrate species; and (iii) GenDR, a newly incorporated database, containing both gene mutations that interfere with dietary restriction-mediated lifespan extension and consistent gene expression changes induced by dietary restriction. Since its creation about 10 years ago, major efforts have been undertaken to maintain the quality of data in HAGR, while further continuing to develop, improve and extend it. This article briefly describes the content of HAGR and details the major updates since its previous publications, in terms of both structure and content. The completely redesigned interface, more intuitive and more integrative of HAGR resources, is also presented. Altogether, we hope that through its improvements, the current version of HAGR will continue to provide users with the most comprehensive and accessible resources available today in the field of biogerontology. PMID:23193293

  8. Human Ageing Genomic Resources: integrated databases and tools for the biology and genetics of ageing.

    PubMed

    Tacutu, Robi; Craig, Thomas; Budovsky, Arie; Wuttke, Daniel; Lehmann, Gilad; Taranukha, Dmitri; Costa, Joana; Fraifeld, Vadim E; de Magalhães, João Pedro

    2013-01-01

    The Human Ageing Genomic Resources (HAGR, http://genomics.senescence.info) is a freely available online collection of research databases and tools for the biology and genetics of ageing. HAGR now features several databases with high-quality manually curated data: (i) GenAge, a database of genes associated with ageing in humans and model organisms; (ii) AnAge, an extensive collection of longevity records and complementary traits for >4000 vertebrate species; and (iii) GenDR, a newly incorporated database, containing both gene mutations that interfere with dietary restriction-mediated lifespan extension and consistent gene expression changes induced by dietary restriction. Since its creation about 10 years ago, major efforts have been undertaken to maintain the quality of data in HAGR, while further continuing to develop, improve and extend it. This article briefly describes the content of HAGR and details the major updates since its previous publications, in terms of both structure and content. The completely redesigned interface, more intuitive and more integrative of HAGR resources, is also presented. Altogether, we hope that through its improvements, the current version of HAGR will continue to provide users with the most comprehensive and accessible resources available today in the field of biogerontology.

  9. Gene-centric approach to integrating environmental genomics and biogeochemical models.

    PubMed

    Reed, Daniel C; Algar, Christopher K; Huber, Julie A; Dick, Gregory J

    2014-02-01

    Rapid advances in molecular microbial ecology have yielded an unprecedented amount of data about the evolutionary relationships and functional traits of microbial communities that regulate global geochemical cycles. Biogeochemical models, however, are trailing in the wake of the environmental genomics revolution, and such models rarely incorporate explicit representations of bacteria and archaea, nor are they compatible with nucleic acid or protein sequence data. Here, we present a functional gene-based framework for describing microbial communities in biogeochemical models by incorporating genomics data to provide predictions that are readily testable. To demonstrate the approach in practice, nitrogen cycling in the Arabian Sea oxygen minimum zone (OMZ) was modeled to examine key questions about cryptic sulfur cycling and dinitrogen production pathways in OMZs. Simulations support previous assertions that denitrification dominates over anammox in the central Arabian Sea, which has important implications for the loss of fixed nitrogen from the oceans. Furthermore, cryptic sulfur cycling was shown to attenuate the secondary nitrite maximum often observed in OMZs owing to changes in the composition of the chemolithoautotrophic community and dominant metabolic pathways. Results underscore the need to explicitly integrate microbes into biogeochemical models rather than just the metabolisms they mediate. By directly linking geochemical dynamics to the genetic composition of microbial communities, the method provides a framework for achieving mechanistic insights into patterns and biogeochemical consequences of marine microbes. Such an approach is critical for informing our understanding of the key role microbes play in modulating Earth's biogeochemistry.

  10. Rapid and efficient clathrin-mediated endocytosis revealed in genome-edited mammalian cells.

    PubMed

    Doyon, Jeffrey B; Zeitler, Bryan; Cheng, Jackie; Cheng, Aaron T; Cherone, Jennifer M; Santiago, Yolanda; Lee, Andrew H; Vo, Thuy D; Doyon, Yannick; Miller, Jeffrey C; Paschon, David E; Zhang, Lei; Rebar, Edward J; Gregory, Philip D; Urnov, Fyodor D; Drubin, David G

    2011-03-01

    Clathrin-mediated endocytosis (CME) is the best-studied pathway by which cells selectively internalize molecules from the plasma membrane and surrounding environment. Previous live-cell imaging studies using ectopically overexpressed fluorescent fusions of endocytic proteins indicated that mammalian CME is a highly dynamic but inefficient and heterogeneous process. In contrast, studies of endocytosis in budding yeast using fluorescent protein fusions expressed at physiological levels from native genomic loci have revealed a process that is very regular and efficient. To analyse endocytic dynamics in mammalian cells in which endogenous protein stoichiometry is preserved, we targeted zinc finger nucleases (ZFNs) to the clathrin light chain A and dynamin-2 genomic loci and generated cell lines expressing fluorescent protein fusions from each locus. The genome-edited cells exhibited enhanced endocytic function, dynamics and efficiency when compared with previously studied cells, indicating that CME is highly sensitive to the levels of its protein components. Our study establishes that ZFN-mediated genome editing is a robust tool for expressing protein fusions at endogenous levels to faithfully report subcellular localization and dynamics.

  11. A Phenotype-Driven Dimension Reduction (PhDDR) Approach to Integrated Genomic Association Analyses

    PubMed Central

    Gao, Cuilan; Cheng, Cheng

    2013-01-01

    An immediate challenge in integrated genomic analysis involving several types of genomic factors all measured genome-wide is the ultra-high dimensionality. Screening all possible relationships among the genomic factors is an NP-hard problem; therefore in practice proper dimension reduction is necessary. In this paper we develop the Phenotype-Driven Dimension Reduction (PhDDR) approach to the analysis of gene co-expressions, and discuss its extensions to integration of other genetic factors. This approach is then illustrated by an application to gene co-expression analysis of treatment response of childhood leukemia. PMID:22255909

  12. Genetics of immune-mediated disorders: from genome-wide association to molecular mechanism

    PubMed Central

    Kumar, Vinod; Wijmenga, Cisca; Xavier, Ramnik J.

    2016-01-01

    Genetic association studies have not only identified hundreds of susceptibility loci for immune-mediated diseases but also pinpointed causal amino-acid variants of HLA genes that contribute to many autoimmune reactions. The majority of non-HLA genetic variants are located within non-coding regulatory regions. Expression QTL studies have shown that these variants affect disease mainly by regulating gene expression. We discuss recent findings on shared genetic loci between infectious and immune-mediated diseases and provide potential clues to explore genetic associations in the context of these infectious agents. We propose that interdisciplinary studies (genetics-genomics-immunology-infection-bioinformatics) are the future post-GWAS approaches to advance our understanding of the pathogenesis of immune-mediated diseases. PMID:25458995

  13. Stakeholder engagement: a key component of integrating genomic information into electronic health records.

    PubMed

    Hartzler, Andrea; McCarty, Catherine A; Rasmussen, Luke V; Williams, Marc S; Brilliant, Murray; Bowton, Erica A; Clayton, Ellen Wright; Faucett, William A; Ferryman, Kadija; Field, Julie R; Fullerton, Stephanie M; Horowitz, Carol R; Koenig, Barbara A; McCormick, Jennifer B; Ralston, James D; Sanderson, Saskia C; Smith, Maureen E; Trinidad, Susan Brown

    2013-10-01

    Integrating genomic information into clinical care and the electronic health record can facilitate personalized medicine through genetically guided clinical decision support. Stakeholder involvement is critical to the success of these implementation efforts. Prior work on implementation of clinical information systems provides broad guidance to inform effective engagement strategies. We add to this evidence-based recommendations that are specific to issues at the intersection of genomics and the electronic health record. We describe stakeholder engagement strategies employed by the Electronic Medical Records and Genomics Network, a national consortium of US research institutions funded by the National Human Genome Research Institute to develop, disseminate, and apply approaches that combine genomic and electronic health record data. Through select examples drawn from sites of the Electronic Medical Records and Genomics Network, we illustrate a continuum of engagement strategies to inform genomic integration into commercial and homegrown electronic health records across a range of health-care settings. We frame engagement as activities to consult, involve, and partner with key stakeholder groups throughout specific phases of health information technology implementation. Our aim is to provide insights into engagement strategies to guide genomic integration based on our unique network experiences and lessons learned within the broader context of implementation research in biomedical informatics. On the basis of our collective experience, we describe key stakeholder practices, challenges, and considerations for successful genomic integration to support personalized medicine.

  14. Goldmine integrates information placing genomic ranges into meaningful biological contexts

    PubMed Central

    Bhasin, Jeffrey M.; Ting, Angela H.

    2016-01-01

    Bioinformatic analysis often produces large sets of genomic ranges that can be difficult to interpret in the absence of genomic context. Goldmine annotates genomic ranges from any source with gene model and feature contexts to facilitate global descriptions and candidate loci discovery. We demonstrate the value of genomic context by using Goldmine to elucidate context dynamics in transcription factor binding and to reveal differentially methylated regions (DMRs) with context-specific functional correlations. The open source R package and documentation for Goldmine are available at http://jeffbhasin.github.io/goldmine. PMID:27257071

  15. Goldmine integrates information placing genomic ranges into meaningful biological contexts.

    PubMed

    Bhasin, Jeffrey M; Ting, Angela H

    2016-07-01

    Bioinformatic analysis often produces large sets of genomic ranges that can be difficult to interpret in the absence of genomic context. Goldmine annotates genomic ranges from any source with gene model and feature contexts to facilitate global descriptions and candidate loci discovery. We demonstrate the value of genomic context by using Goldmine to elucidate context dynamics in transcription factor binding and to reveal differentially methylated regions (DMRs) with context-specific functional correlations. The open source R package and documentation for Goldmine are available at http://jeffbhasin.github.io/goldmine. PMID:27257071

  16. Brassica database (BRAD) version 2.0: integrating and mining Brassicaceae species genomic resources.

    PubMed

    Wang, Xiaobo; Wu, Jian; Liang, Jianli; Cheng, Feng; Wang, Xiaowu

    2015-01-01

    The Brassica database (BRAD) was initially built to help users apply Brassica rapa and Arabidopsis thaliana genomic data efficiently to their research. However, many Brassicaceae genomes have been sequenced and released since its construction. These genomes are rich resources for comparative genomics, gene annotation and functional evolutionary studies of Brassica crops. Therefore, we have updated BRAD to version 2.0 (V2.0). In BRAD V2.0, 11 more Brassicaceae genomes have been integrated into the database, namely those of Arabidopsis lyrata, Aethionema arabicum, Brassica oleracea, Brassica napus, Camelina sativa, Capsella rubella, Leavenworthia alabamica, Sisymbrium irio and three extremophiles Schrenkiella parvula, Thellungiella halophila and Thellungiella salsuginea. BRAD V2.0 provides plots of syntenic genomic fragments between pairs of Brassicaceae species, from the level of chromosomes to genomic blocks. The Generic Synteny Browser (GBrowse_syn), a module of the Genome Browser (GBrowse), is used to show syntenic relationships between multiple genomes. Search functions for retrieving syntenic and non-syntenic orthologs, as well as their annotation and sequences, are also provided. Furthermore, genome and annotation information have been imported into GBrowse so that all functional elements can be visualized in one frame. We plan to continually update BRAD by integrating more Brassicaceae genomes into the database. Database URL: http://brassicadb.org/brad/.

  17. Brassica database (BRAD) version 2.0: integrating and mining Brassicaceae species genomic resources

    PubMed Central

    Wang, Xiaobo; Wu, Jian; Liang, Jianli; Cheng, Feng; Wang, Xiaowu

    2015-01-01

    The Brassica database (BRAD) was initially built to help users apply Brassica rapa and Arabidopsis thaliana genomic data efficiently to their research. However, many Brassicaceae genomes have been sequenced and released since its construction. These genomes are rich resources for comparative genomics, gene annotation and functional evolutionary studies of Brassica crops. Therefore, we have updated BRAD to version 2.0 (V2.0). In BRAD V2.0, 11 more Brassicaceae genomes have been integrated into the database, namely those of Arabidopsis lyrata, Aethionema arabicum, Brassica oleracea, Brassica napus, Camelina sativa, Capsella rubella, Leavenworthia alabamica, Sisymbrium irio and three extremophiles Schrenkiella parvula, Thellungiella halophila and Thellungiella salsuginea. BRAD V2.0 provides plots of syntenic genomic fragments between pairs of Brassicaceae species, from the level of chromosomes to genomic blocks. The Generic Synteny Browser (GBrowse_syn), a module of the Genome Browser (GBrowse), is used to show syntenic relationships between multiple genomes. Search functions for retrieving syntenic and non-syntenic orthologs, as well as their annotation and sequences, are also provided. Furthermore, genome and annotation information have been imported into GBrowse so that all functional elements can be visualized in one frame. We plan to continually update BRAD by integrating more Brassicaceae genomes into the database. Database URL: http://brassicadb.org/brad/ PMID:26589635

  18. Characterization of Equine Infectious Anemia Virus Integration in the Horse Genome

    PubMed Central

    Liu, Qiang; Wang, Xue-Feng; Ma, Jian; He, Xi-Jun; Wang, Xiao-Jun; Zhou, Jian-Hua

    2015-01-01

    Human immunodeficiency virus (HIV)-1 has a unique integration profile in the human genome relative to murine and avian retroviruses. Equine infectious anemia virus (EIAV) is another well-studied lentivirus that can also be used as a promising retro-transfection vector, but its integration into its native host has not been characterized. In this study, we mapped 477 integration sites of the EIAV strain EIAVFDDV13 in fetal equine dermal (FED) cells during in vitro infection. Published integration sites of EIAV and HIV-1 in the human genome were also analyzed as references. Our results demonstrated that EIAVFDDV13 tended to integrate into genes and AT-rich regions, and it avoided integrating into transcription start sites (TSS), which is consistent with EIAV and HIV-1 integration in the human genome. Notably, the integration of EIAVFDDV13 favored long interspersed elements (LINEs) and DNA transposons in the horse genome, whereas the integration of HIV-1 favored short interspersed elements (SINEs) in the human genome. The chromosomal environment near LINEs or DNA transposons potentially influences viral transcription and may be related to the unique EIAV latency states in equids. The data on EIAV integration in its natural host will facilitate studies on lentiviral infection and lentivirus-based therapeutic vectors. PMID:26102582

  19. Sinbase: an integrated database to study genomics, genetics and comparative genomics in Sesamum indicum.

    PubMed

    Wang, Linhai; Yu, Jingyin; Li, Donghua; Zhang, Xiurong

    2015-01-01

    Sesame (Sesamum indicum L.) is an ancient and important oilseed crop grown widely in tropical and subtropical areas. It belongs to the gigantic order Lamiales, which includes many well-known or economically important species, such as olive (Olea europaea), leonurus (Leonurus japonicus) and lavender (Lavandula spica), many of which have important pharmacological properties. Despite their importance, genetic and genomic analyses on these species have been insufficient due to a lack of reference genome information. The now available S. indicum genome will provide an unprecedented opportunity for studying both S. indicum genetic traits and comparative genomics. To deliver S. indicum genomic information to the worldwide research community, we designed Sinbase, a web-based database with comprehensive sesame genomic, genetic and comparative genomic information. Sinbase includes sequences of assembled sesame pseudomolecular chromosomes, protein-coding genes (27,148), transposable elements (372,167) and non-coding RNAs (1,748). In particular, Sinbase provides unique and valuable information on colinear regions with various plant genomes, including Arabidopsis thaliana, Glycine max, Vitis vinifera and Solanum lycopersicum. Sinbase also provides a useful search function and data mining tools, including a keyword search and local BLAST service. Sinbase will be updated regularly with new features, improvements to genome annotation and new genomic sequences, and is freely accessible at http://ocri-genomics.org/Sinbase/. PMID:25480115

  1. The genomic landscape underlying phenotypic integrity in the face of gene flow in crows.

    PubMed

    Poelstra, J W; Vijay, N; Bossu, C M; Lantz, H; Ryll, B; Müller, I; Baglione, V; Unneberg, P; Wikelski, M; Grabherr, M G; Wolf, J B W

    2014-06-20

    The importance, extent, and mode of interspecific gene flow for the evolution of species have long been debated. Characterization of genomic differentiation in a classic example of hybridization between all-black carrion crows and gray-coated hooded crows identified genome-wide introgression extending far beyond the morphological hybrid zone. Gene expression divergence was concentrated in pigmentation genes expressed in gray versus black feather follicles. Only a small number of narrow genomic islands exhibited resistance to gene flow. One prominent genomic region (<2 megabases) harbored 81 of all 82 fixed differences (of 8.4 million single-nucleotide polymorphisms in total) linking genes involved in pigmentation and in visual perception, a genomic signal reflecting color-mediated prezygotic isolation. Thus, localized genomic selection can cause marked heterogeneity in introgression landscapes while maintaining phenotypic divergence. PMID:24948738

  3. The Populus Genome Integrative Explorer (PopGenIE): a new resource for exploring the Populus genome.

    PubMed

    Sjödin, Andreas; Street, Nathaniel Robert; Sandberg, Göran; Gustafsson, Petter; Jansson, Stefan

    2009-06-01

    Populus has become an important model plant system. However, utilization of the increasingly extensive collection of genetics and genomics data created by the community is currently hindered by the lack of a central resource, such as a model organism database (MOD). Such MODs offer a single entry point to the collection of resources available within a model system, typically including tools for exploring and querying those resources. As a starting point to overcoming the lack of such an MOD for Populus, we present the Populus Genome Integrative Explorer (PopGenIE), an integrated set of tools for exploring the Populus genome and transcriptome. The resource includes genome, synteny and quantitative trait locus (QTL) browsers for exploring genetic data. Expression tools include an electronic fluorescent pictograph (eFP) browser, expression profile plots, co-regulation within collated transcriptomics data sets, and identification of over-represented functional categories and genomic hotspot locations. A number of collated transcriptomics data sets are made available in the eFP browser to facilitate functional exploration of gene function. Additional homology and data extraction tools are provided. PopGenIE significantly increases accessibility to Populus genomics resources and allows exploration of transcriptomics data without the need to learn or understand complex statistical analysis methods. PopGenIE is available at www.popgenie.org or via www.populusgenome.info.

  4. Collaboration of MLLT1/ENL, Polycomb and ATM for transcription and genome integrity.

    PubMed

    Ui, Ayako; Yasui, Akira

    2016-04-25

    Polycomb group (PcG) proteins repress, whereas Trithorax group (TrxG) proteins activate, transcription for tissue development and cellular proliferation, and misregulation of these factors is often associated with cancer. ENL (MLLT1) and AF9 (MLLT3) are fusion partners of Mixed Lineage Leukemia (MLL), a TrxG protein, and are factors in the Super Elongation Complex (SEC). SEC controls transcriptional elongation to release RNA polymerase II paused around the transcription start site. In MLL-rearranged leukemia, several components of SEC have been found as MLL-fusion partners, and the control of transcriptional elongation is misregulated, leading to tumorigenesis in MLL-SEC fused leukemia. It has been suggested that an unexpected collaboration of ENL/AF9-MLL and PcG is involved in tumorigenesis in leukemia. Recently, we found that the collaboration of ENL/AF9 and PcG led to a novel mechanism of transcriptional switch from elongation to repression under ATM-signaling for genome integrity. Activated ATM phosphorylates ENL/AF9 in SEC, and the phosphorylated ENL/AF9 binds BMI1 and RING1B, a heterodimeric E3-ubiquitin-ligase complex in Polycomb Repressive Complex 1 (PRC1), and recruits PRC1 to transcriptional elongation sites to rapidly repress transcription. The ENL/AF9 in SEC- and PcG-mediated transcriptional repression promotes DSB repair near transcription sites. The implication is that the collaboration of ENL/AF9 in SEC and PcG ensures a rapid transcriptional switch from elongation to repression in response to neighboring genotoxic stresses for DSB repair. Therefore, these results suggest that the collaboration of ENL/AF9 and PcG in transcriptional control is required to maintain genome integrity and may be linked to MLL-ENL/AF9 leukemia. PMID:27310306

  5. A high utility integrated map of the pig genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: The domestic pig is being increasingly exploited as a system for modeling human disease. It also has substantial economic importance for meat-based protein production. Physical clone maps have underpinned large-scale genomic sequencing and enabled focused cloning efforts for many genome...

  6. Comprehensive, Integrative Genomic Analysis of Diffuse Lower-Grade Gliomas

    PubMed Central

    2015-01-01

    BACKGROUND Diffuse low-grade and intermediate-grade gliomas (which together make up the lower-grade gliomas, World Health Organization grades II and III) have highly variable clinical behavior that is not adequately predicted on the basis of histologic class. Some are indolent; others quickly progress to glioblastoma. The uncertainty is compounded by interobserver variability in histologic diagnosis. Mutations in IDH, TP53, and ATRX and codeletion of chromosome arms 1p and 19q (1p/19q codeletion) have been implicated as clinically relevant markers of lower-grade gliomas. METHODS We performed genomewide analyses of 293 lower-grade gliomas from adults, incorporating exome sequence, DNA copy number, DNA methylation, messenger RNA expression, microRNA expression, and targeted protein expression. These data were integrated and tested for correlation with clinical outcomes. RESULTS Unsupervised clustering of mutations and data from RNA, DNA-copy-number, and DNA-methylation platforms uncovered concordant classification of three robust, nonoverlapping, prognostically significant subtypes of lower-grade glioma that were captured more accurately by IDH, 1p/19q, and TP53 status than by histologic class. Patients who had lower-grade gliomas with an IDH mutation and 1p/19q codeletion had the most favorable clinical outcomes. Their gliomas harbored mutations in CIC, FUBP1, NOTCH1, and the TERT promoter. Nearly all lower-grade gliomas with IDH mutations and no 1p/19q codeletion had mutations in TP53 (94%) and ATRX inactivation (86%). The large majority of lower-grade gliomas without an IDH mutation had genomic aberrations and clinical behavior strikingly similar to those found in primary glioblastoma. CONCLUSIONS The integration of genomewide data from multiple platforms delineated three molecular classes of lower-grade gliomas that were more concordant with IDH, 1p/19q, and TP53 status than with histologic class. Lower-grade gliomas with an IDH mutation either had 1p/19q

  7. Integrated consensus map of cultivated peanut and wild relatives reveals structures of the A and B genomes of Arachis and divergence of the legume genomes.

    PubMed

    Shirasawa, Kenta; Bertioli, David J; Varshney, Rajeev K; Moretzsohn, Marcio C; Leal-Bertioli, Soraya C M; Thudi, Mahendar; Pandey, Manish K; Rami, Jean-Francois; Foncéka, Daniel; Gowda, Makanahally V C; Qin, Hongde; Guo, Baozhu; Hong, Yanbin; Liang, Xuanqiang; Hirakawa, Hideki; Tabata, Satoshi; Isobe, Sachiko

    2013-04-01

    The complex, tetraploid genome structure of peanut (Arachis hypogaea) has obstructed advances in genetics and genomics in the species. The aim of this study is to understand the genome structure of Arachis by developing a high-density integrated consensus map. Three recombinant inbred line populations derived from crosses between the A genome diploid species, Arachis duranensis and Arachis stenosperma; the B genome diploid species, Arachis ipaënsis and Arachis magna; and between the AB genome tetraploids, A. hypogaea and an artificial amphidiploid (A. ipaënsis × A. duranensis)(4×), were used to construct genetic linkage maps: 10 linkage groups (LGs) of 544 cM with 597 loci for the A genome; 10 LGs of 461 cM with 798 loci for the B genome; and 20 LGs of 1442 cM with 1469 loci for the AB genome. The resultant maps plus 13 published maps were integrated into a consensus map covering 2651 cM with 3693 marker loci, which was anchored to 20 consensus LGs corresponding to the A and B genomes. Comparative genomics with the genome sequences of Cajanus cajan, Glycine max, Lotus japonicus, and Medicago truncatula revealed that the Arachis genome has a segmented synteny relationship to the other legumes. The comparative maps in legumes, integrated tetraploid consensus maps, and genome-specific diploid maps will increase the genetic and genomic understanding of Arachis and should facilitate molecular breeding. PMID:23315685

  8. The genome clinic: a multidisciplinary approach to assessing the opportunities and challenges of integrating genomic analysis into clinical care.

    PubMed

    Bowdin, Sarah; Ray, Peter N; Cohn, Ronald D; Meyn, M Stephen

    2014-05-01

    Our increasing knowledge of how genomic variants affect human health and the falling costs of whole-genome sequencing are driving the development of individualized genetic medicine. This new clinical paradigm uses knowledge of an individual's genomic variants to guide health care decisions throughout life, to anticipate, diagnose, and manage disease. While individualized genetic medicine offers the promise of transformative change in health care, it forces us to reconsider existing ethical, scientific, and clinical paradigms. The potential benefits of presymptomatic identification of at-risk individuals, improved diagnostics, individualized therapy, accurate prognosis, and avoidance of adverse drug reactions coexist with the potential risks of uninterpretable results, psychological harm, outmoded counseling models, and increased health care costs. Here, we review the challenges of integrating genomic analysis into clinical practice and describe a prototype for implementing genetic medicine. Our multidisciplinary team of bioinformaticians, health economists, ethicists, geneticists, genetic counselors, and clinicians has designed a "Genome Clinic" research project that addresses multiple challenges in genomic medicine, ranging from the development of bioinformatics tools for the clinical assessment of genomic variants and the discovery of disease genes to health policy inquiries, assessment of clinical care models, patient preference, and the ethics of consent.

  9. Ku-Mediated Coupling of DNA Cleavage and Repair during Programmed Genome Rearrangements in the Ciliate Paramecium tetraurelia

    PubMed Central

    Marmignon, Antoine; Bischerour, Julien; Silve, Aude; Fojcik, Clémentine; Dubois, Emeline; Arnaiz, Olivier; Kapusta, Aurélie; Malinsky, Sophie; Bétermier, Mireille

    2014-01-01

    During somatic differentiation, physiological DNA double-strand breaks (DSB) can drive programmed genome rearrangements (PGR), during which DSB repair pathways are mobilized to safeguard genome integrity. Because of their unique nuclear dimorphism, ciliates are powerful unicellular eukaryotic models to study the mechanisms involved in PGR. At each sexual cycle, the germline nucleus is transmitted to the progeny, but the somatic nucleus, essential for gene expression, is destroyed and a new somatic nucleus differentiates from a copy of the germline nucleus. In Paramecium tetraurelia, the development of the somatic nucleus involves massive PGR, including the precise elimination of at least 45,000 germline sequences (Internal Eliminated Sequences, IES). IES excision proceeds through a cut-and-close mechanism: a domesticated transposase, PiggyMac, is essential for DNA cleavage, and DSB repair at excision sites involves the Ligase IV, a specific component of the non-homologous end-joining (NHEJ) pathway. At the genome-wide level, a huge number of programmed DSBs must be repaired during this process to allow the assembly of functional somatic chromosomes. To understand how DNA cleavage and DSB repair are coordinated during PGR, we have focused on Ku, the earliest actor of NHEJ-mediated repair. Two Ku70 and three Ku80 paralogs are encoded in the genome of P. tetraurelia: Ku70a and Ku80c are produced during sexual processes and localize specifically in the developing new somatic nucleus. Using RNA interference, we show that the development-specific Ku70/Ku80c heterodimer is essential for the recovery of a functional somatic nucleus. Strikingly, at the molecular level, PiggyMac-dependent DNA cleavage is abolished at IES boundaries in cells depleted for Ku80c, resulting in IES retention in the somatic genome. PiggyMac and Ku70a/Ku80c co-purify as a complex when overproduced in a heterologous system. We conclude that Ku has been integrated in the Paramecium DNA cleavage

  10. CTCF-Mediated Human 3D Genome Architecture Reveals Chromatin Topology for Transcription.

    PubMed

    Tang, Zhonghui; Luo, Oscar Junhong; Li, Xingwang; Zheng, Meizhen; Zhu, Jacqueline Jufen; Szalaj, Przemyslaw; Trzaskoma, Pawel; Magalska, Adriana; Wlodarczyk, Jakub; Ruszczycki, Blazej; Michalski, Paul; Piecuch, Emaly; Wang, Ping; Wang, Danjuan; Tian, Simon Zhongyuan; Penrad-Mobayed, May; Sachs, Laurent M; Ruan, Xiaoan; Wei, Chia-Lin; Liu, Edison T; Wilczynski, Grzegorz M; Plewczynski, Dariusz; Li, Guoliang; Ruan, Yijun

    2015-12-17

    Spatial genome organization and its effect on transcription remain a fundamental question. We applied an advanced chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) strategy to comprehensively map higher-order chromosome folding and specific chromatin interactions mediated by CCCTC-binding factor (CTCF) and RNA polymerase II (RNAPII) with haplotype specificity and nucleotide resolution in different human cell lineages. We find that CTCF/cohesin-mediated interaction anchors serve as structural foci for spatial organization of constitutive genes concordant with CTCF-motif orientation, whereas RNAPII interacts within these structures by selectively drawing cell-type-specific genes toward CTCF foci for coordinated transcription. Furthermore, we show that haplotype variants and allelic interactions have differential effects on chromosome configuration, influencing gene expression, and may provide mechanistic insights into functions associated with disease susceptibility. 3D genome simulation suggests a model of chromatin folding around chromosomal axes, where CTCF is involved in defining the interface between condensed and open compartments for structural regulation. Our 3D genome strategy thus provides unique insights into the topological mechanism of human variations and diseases. PMID:26686651

  11. Genome-wide signatures of male-mediated migration shaping the Indian gene pool.

    PubMed

    ArunKumar, GaneshPrasad; Tatarinova, Tatiana V; Duty, Jeff; Rollo, Debra; Syama, Adhikarla; Arun, Varatharajan Santhakumari; Kavitha, Valampuri John; Triska, Petr; Greenspan, Bennett; Wells, R Spencer; Pitchappan, Ramasamy

    2015-09-01

    Multiple questions relating to the contributions of cultural and demographic factors to the process of human geographical dispersal remain largely unanswered. India, a land of early human settlement and the resulting diversity, is a good place to look for some of the answers. In this study, we explored the genetic structure of India using a diverse panel of 78 males genotyped using the GenoChip. Their genome-wide single-nucleotide polymorphism (SNP) diversity was examined in the context of various covariates that influence the Indian gene pool. Admixture analysis of genome-wide SNP data showed a high proportion of the Southwest Asian component in all of the Indian samples. Hierarchical clustering based on admixture proportions revealed seven distinct clusters correlating to geographical and linguistic affiliations. Convex hull overlay of Y-chromosomal haplogroups on the genome-wide SNP principal component analysis brought out distinct non-overlapping polygons of F*-M89, H*-M69, L1-M27, O2a-M95 and O3a3c1-M117, suggesting a male-mediated migration and expansion of the Indian gene pool. The lack of a similar correlation with mitochondrial DNA clades indicated a shared genetic ancestry of females. We suggest that ancient male-mediated migratory events and settlement in various regional niches led to the present-day scenario and peopling of India.

  12. CTCF-Mediated Human 3D Genome Architecture Reveals Chromatin Topology for Transcription.

    PubMed

    Tang, Zhonghui; Luo, Oscar Junhong; Li, Xingwang; Zheng, Meizhen; Zhu, Jacqueline Jufen; Szalaj, Przemyslaw; Trzaskoma, Pawel; Magalska, Adriana; Wlodarczyk, Jakub; Ruszczycki, Blazej; Michalski, Paul; Piecuch, Emaly; Wang, Ping; Wang, Danjuan; Tian, Simon Zhongyuan; Penrad-Mobayed, May; Sachs, Laurent M; Ruan, Xiaoan; Wei, Chia-Lin; Liu, Edison T; Wilczynski, Grzegorz M; Plewczynski, Dariusz; Li, Guoliang; Ruan, Yijun

    2015-12-17

    Spatial genome organization and its effect on transcription remains a fundamental question. We applied an advanced chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) strategy to comprehensively map higher-order chromosome folding and specific chromatin interactions mediated by CCCTC-binding factor (CTCF) and RNA polymerase II (RNAPII) with haplotype specificity and nucleotide resolution in different human cell lineages. We find that CTCF/cohesin-mediated interaction anchors serve as structural foci for spatial organization of constitutive genes concordant with CTCF-motif orientation, whereas RNAPII interacts within these structures by selectively drawing cell-type-specific genes toward CTCF foci for coordinated transcription. Furthermore, we show that haplotype variants and allelic interactions have differential effects on chromosome configuration, influencing gene expression, and may provide mechanistic insights into functions associated with disease susceptibility. 3D genome simulation suggests a model of chromatin folding around chromosomal axes, where CTCF is involved in defining the interface between condensed and open compartments for structural regulation. Our 3D genome strategy thus provides unique insights in the topological mechanism of human variations and diseases.

  13. Genome-wide signatures of male-mediated migration shaping the Indian gene pool.

    PubMed

    ArunKumar, GaneshPrasad; Tatarinova, Tatiana V; Duty, Jeff; Rollo, Debra; Syama, Adhikarla; Arun, Varatharajan Santhakumari; Kavitha, Valampuri John; Triska, Petr; Greenspan, Bennett; Wells, R Spencer; Pitchappan, Ramasamy

    2015-09-01

    Multiple questions relating to contributions of cultural and demographical factors in the process of human geographical dispersal remain largely unanswered. India, a land of early human settlement and the resulting diversity, is a good place to look for some of the answers. In this study, we explored the genetic structure of India using a diverse panel of 78 males genotyped using the GenoChip. Their genome-wide single-nucleotide polymorphism (SNP) diversity was examined in the context of various covariates that influence the Indian gene pool. Admixture analysis of genome-wide SNP data showed a high proportion of the Southwest Asian component in all of the Indian samples. Hierarchical clustering based on admixture proportions revealed seven distinct clusters correlating to geographical and linguistic affiliations. Convex hull overlay of Y-chromosomal haplogroups on the genome-wide SNP principal component analysis brought out distinct non-overlapping polygons of F*-M89, H*-M69, L1-M27, O2a-M95 and O3a3c1-M117, suggesting a male-mediated migration and expansion of the Indian gene pool. Lack of similar correlation with mitochondrial DNA clades indicated a shared genetic ancestry of females. We suggest that ancient male-mediated migratory events and settlement in various regional niches led to the present-day scenario and peopling of India. PMID:25994871

  14. A second generation integrated map of the rainbow trout (Oncorhynchus mykiss) genome: analysis of synteny with model fish genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In this paper we generated DNA fingerprints and end sequences from bacterial artificial chromosomes (BACs) from two new libraries to improve the first generation integrated physical and genetic map of the rainbow trout (Oncorhynchus mykiss) genome. The current version of the physical map is compose...

  15. Zygote-mediated generation of genome-modified mice using Streptococcus thermophilus 1-derived CRISPR/Cas system.

    PubMed

    Fujii, Wataru; Kakuta, Shigeru; Yoshioka, Shin; Kyuwa, Shigeru; Sugiura, Koji; Naito, Kunihiko

    2016-08-26

    Mammalian zygote-mediated genome-engineering by CRISPR/Cas is currently used for the generation of genome-modified animals. Here we report that a Streptococcus thermophilus-1 derived orthologous CRISPR/Cas system, which recognizes the 5'-NNAGAA sequence as a protospacer adjacent motif (PAM), is useful in mouse zygotes and is applicable for generating knockout mice (87.5%) and targeted knock-in mice (45.5%). The induced mutation could be inherited in the next generation. This novel CRISPR/Cas can expand the feasibility of the zygote-mediated generation of genome-modified animals that require an exact mutation design. PMID:27318086
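
    As a rough illustration of the PAM requirement described in this abstract, the sketch below scans a DNA string for candidate target sites immediately followed by a 5'-NNAGAA motif. It is not taken from the paper: the function name, the 20-nt protospacer length, and the forward-strand-only search are assumptions made for the example.

```python
import re


def find_nnagaa_sites(sequence, protospacer_len=20):
    """Return (start, protospacer, pam) tuples for forward-strand sites where
    a protospacer is immediately followed by a 5'-NNAGAA PAM.

    protospacer_len=20 is an illustrative assumption, not a value from the abstract.
    """
    seq = sequence.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]{2}AGAA))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Hypothetical demo sequence: a 20-nt stretch, then TGAGAA (NN = TG), then CC.
demo = "GATTACAGATTACAGATTAC" + "TGAGAA" + "CC"
sites = find_nnagaa_sites(demo)
for start, protospacer, pam in sites:
    print(start, protospacer, pam)
```

    A fuller search would also scan the reverse complement, since a PAM can lie on either strand.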

  16. Challenges in experimental data integration within genome-scale metabolic models.

    PubMed

    Bourguignon, Pierre-Yves; Samal, Areejit; Képès, François; Jost, Jürgen; Martin, Olivier C

    2010-01-01

    A report of the meeting "Challenges in experimental data integration within genome-scale metabolic models", Institut Henri Poincaré, Paris, October 10-11 2009, organized by the CNRS-MPG joint program in Systems Biology.

  17. Genetic and statistical study of HIV integration in the human genome

    NASA Astrophysics Data System (ADS)

    Sequeira, Inês J.; Gonçalves, Juliana; Moreira, Elsa; Mexia, João T.; Rueff, José; Brás, Aldina

    2013-10-01

    Integration of human immunodeficiency virus (HIV) DNA into the human genome is essential for HIV-induced disease. The human genome is organized into chromosomes, within which chromosomal fragile sites can be defined. Our aim is to help clarify the integration-site preferences of HIV1 and HIV2 in fragile and non-fragile regions. Here we apply statistical techniques, namely non-parametric tests and analysis of variance, to two data sets of HIV1 and HIV2 integrations in the human genome. The results show that integrations occur with significantly greater intensity in the non-fragile regions of the human genome and that HIV1 in particular contributes most to this effect. This study could have implications for human disease.

  18. Challenges in experimental data integration within genome-scale metabolic models

    PubMed Central

    2010-01-01

    A report of the meeting "Challenges in experimental data integration within genome-scale metabolic models", Institut Henri Poincaré, Paris, October 10-11 2009, organized by the CNRS-MPG joint program in Systems Biology. PMID:20412574

  19. Exclusion of small terminase mediated DNA threading models for genome packaging in bacteriophage T4.

    PubMed

    Gao, Song; Zhang, Liang; Rao, Venigalla B

    2016-05-19

    Tailed bacteriophages and herpes viruses use powerful molecular machines to package their genomes. The packaging machine consists of three components: portal, motor (large terminase; TerL) and regulator (small terminase; TerS). Portal, a dodecamer, and motor, a pentamer, form two concentric rings at the special five-fold vertex of the icosahedral capsid. Powered by ATPase, the motor ratchets DNA into the capsid through the portal channel. TerS is essential for packaging, particularly for genome recognition, but its mechanism is unknown and controversial. Structures of gear-shaped TerS rings inspired models that invoke DNA threading through the central channel. Here, we report that mutations of basic residues that line phage T4 TerS (gp16) channel do not disrupt DNA binding. Even deletion of the entire channel helix retained DNA binding and produced progeny phage in vivo. On the other hand, large oligomers of TerS (11-mers/12-mers), but not small oligomers (trimers to hexamers), bind DNA. These results suggest that TerS oligomerization creates a large outer surface, which, but not the interior of the channel, is critical for function, probably to wrap viral genome around the ring during packaging initiation. Hence, models involving TerS-mediated DNA threading may be excluded as an essential mechanism for viral genome packaging. PMID:26984529

  20. Exclusion of small terminase mediated DNA threading models for genome packaging in bacteriophage T4

    PubMed Central

    Gao, Song; Zhang, Liang; Rao, Venigalla B.

    2016-01-01

    Tailed bacteriophages and herpes viruses use powerful molecular machines to package their genomes. The packaging machine consists of three components: portal, motor (large terminase; TerL) and regulator (small terminase; TerS). Portal, a dodecamer, and motor, a pentamer, form two concentric rings at the special five-fold vertex of the icosahedral capsid. Powered by ATPase, the motor ratchets DNA into the capsid through the portal channel. TerS is essential for packaging, particularly for genome recognition, but its mechanism is unknown and controversial. Structures of gear-shaped TerS rings inspired models that invoke DNA threading through the central channel. Here, we report that mutations of basic residues that line phage T4 TerS (gp16) channel do not disrupt DNA binding. Even deletion of the entire channel helix retained DNA binding and produced progeny phage in vivo. On the other hand, large oligomers of TerS (11-mers/12-mers), but not small oligomers (trimers to hexamers), bind DNA. These results suggest that TerS oligomerization creates a large outer surface, which, but not the interior of the channel, is critical for function, probably to wrap viral genome around the ring during packaging initiation. Hence, models involving TerS-mediated DNA threading may be excluded as an essential mechanism for viral genome packaging. PMID:26984529

  1. R-loop-mediated genomic instability is caused by impairment of replication fork progression.

    PubMed

    Gan, Wenjian; Guan, Zhishuang; Liu, Jie; Gui, Ting; Shen, Keng; Manley, James L; Li, Xialu

    2011-10-01

    Transcriptional R loops are anomalous RNA:DNA hybrids that have been detected in organisms from bacteria to humans. These structures have been shown in eukaryotes to result in DNA damage and rearrangements; however, the mechanisms underlying these effects have remained largely unknown. To investigate this, we first show that R-loop formation induces chromosomal DNA rearrangements and recombination in Escherichia coli, just as it does in eukaryotes. More importantly, we then show that R-loop formation causes DNA replication fork stalling, and that this in fact underlies the effects of R loops on genomic stability. Strikingly, we found that attenuation of replication strongly suppresses R-loop-mediated DNA rearrangements in both E. coli and HeLa cells. Our findings thus provide a direct demonstration that R-loop formation impairs DNA replication and that this is responsible for the deleterious effects of R loops on genome stability from bacteria to humans.

  2. INTEGRATED GENOME-BASED STUDIES OF SHEWANELLA ECOPHYSIOLOGY

    SciTech Connect

    NEALSON, KENNETH H.

    2013-10-15

    products of dissimilatory iron reduction. Geochim. Cosmochim. Acta. 74:574-583. 10. Karpinets, T.V., A.Y Obraztsova, Y. Wang, D.D. Schmoyer, G.H. Kora, B.H. Park, M.H. Serres, M.F. Romine, M.L. Land, T.B. Kothe, J.K. Fredrickson, K.H. Nealson, and E.C. Uberbacher 2010. Conserved synteny at the protein family level reveals genes underlying Shewanella species' cold tolerance and predicts their novel phenotypes. Funct. Integr. Genomics 10: 97-110. (DOI 10.1007/s10143-009-0142-y) 11. Bretschger, O., A.C.M. Cheung, F. Mansfeld, and K.H. Nealson. 2010. Comparative microbial fuel cell evaluations of Shewanella spp. Electroanalysis 22: 883-894. 12. McLean, J.S., G. Wanger, Y.A. Gorby, M. Wainstein, J. McQuaid, Shun'ichi Ishii, O. Bretschger, H. Beyanal, K.H. Nealson. 2010. Quantification of electron transfer rates to a solid phase electron acceptor through the stages of biofilm formation from single cells to multicellular communities. Env. Sci. Technol. 44:2721-2717. 13. El-Naggar, M., G. Wanger, K.M. Leung, T.D. Yuzvinsky, G. Southam, J. Yang, W.M. Lau, K.H. Nealson, and Y.A. Gorby. 2010. Electrical Transport Along Bacterial Nanowires from Shewanella oneidensis MR-1 Proc. Nat. Acad. Sci. USA 107:18127-18131. 14. Biffinger, J.C., L.A. Fitzgerald, R. Ray, B.J. Little, S.E. Lizewski, E.R. Petersen, B.R. Ringeisen, W.C. Sanders, P.E. Sheehan, J.J. Pietron, J.W. Baldwin, L.J. Nadeau, G.R. Johnson, M. Ribbens, S.E. Finkel, K.H. Nealson. 2010. The utility of Shewanella japonica for microbial fuel cells. Bioresource Technol. 102:290-297. 15. Rodionov, D., C. Yang, X. Li, I. Rodionova, Y. Wang, A.Y. Obraztsova, O. P. Zagnitko, R. Overbeek, M. F. Romine, S. Reed, J.K. Fredrickson, K.H. Nealson, A.L. Osterman. 2010. Genomic encyclopedia of sugar utilization pathways in the Shewanella genus. BMC Genomics 2010, 11:494 16. Kan, J., L. Hsu, A.C.M. Cheung, M. Pirbazari, and K.H. Nealson. 2011. Current production by bacterial communities in microbial fuel cells enriched from wastewater sludge

  3. Methods for integrating moderation and mediation: a general analytical framework using moderated path analysis.

    PubMed

    Edwards, Jeffrey R; Lambert, Lisa Schurer

    2007-03-01

    Studies that combine moderation and mediation are prevalent in basic and applied psychology research. Typically, these studies are framed in terms of moderated mediation or mediated moderation, both of which involve similar analytical approaches. Unfortunately, these approaches have important shortcomings that conceal the nature of the moderated and the mediated effects under investigation. This article presents a general analytical framework for combining moderation and mediation that integrates moderated regression analysis and path analysis. This framework clarifies how moderator variables influence the paths that constitute the direct, indirect, and total effects of mediated models. The authors empirically illustrate this framework and give step-by-step instructions for estimation and interpretation. They summarize the advantages of their framework over current approaches, explain how it subsumes moderated mediation and mediated moderation, and describe how it can accommodate additional moderator and mediator variables, curvilinear relationships, and structural equation models with latent variables.
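
    To make the idea of moderator-dependent paths concrete, the sketch below evaluates direct, indirect, and total effects at a chosen moderator value. It assumes one common formulation in which the moderator Z can alter both stages of the indirect path and the direct path: M = a0 + a_x*X + a_z*Z + a_xz*X*Z and Y = b0 + b_x*X + b_m*M + b_z*Z + b_xz*X*Z + b_mz*M*Z. The coefficient names and numeric values are hypothetical and are not taken from the article.

```python
def conditional_effects(a_x, a_xz, b_x, b_xz, b_m, b_mz, z):
    """Effects of X on Y at moderator value z under the assumed model above."""
    first_stage = a_x + a_xz * z    # effect of X on the mediator M at Z = z
    second_stage = b_m + b_mz * z   # effect of M on Y at Z = z
    direct = b_x + b_xz * z         # effect of X on Y not passing through M
    indirect = first_stage * second_stage
    return {"direct": direct, "indirect": indirect, "total": direct + indirect}


# Hypothetical coefficients, evaluated at two moderator levels.
coefs = dict(a_x=0.5, a_xz=0.2, b_x=0.3, b_xz=-0.1, b_m=0.4, b_mz=0.1)
low = conditional_effects(z=0.0, **coefs)
high = conditional_effects(z=1.0, **coefs)
print(low)
print(high)
```

    Comparing the two dictionaries shows how each component of the effect shifts as the moderator changes; in practice the coefficients would come from fitted regression equations.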

  4. The Encapsidated Genome of Microplitis demolitor Bracovirus Integrates into the Host Pseudoplusia includens

    PubMed Central

    Beck, Markus H.; Zhang, Shu; Bitra, Kavita; Burke, Gaelen R.; Strand, Michael R.

    2011-01-01

    Polydnaviruses (PDVs) are symbionts of parasitoid wasps that function as gene delivery vehicles in the insects (hosts) that the wasps parasitize. PDVs persist in wasps as integrated proviruses but are packaged as circularized and segmented double-stranded DNAs into the virions that wasps inject into hosts. In contrast, little is known about how PDV genomic DNAs persist in host cells. Microplitis demolitor carries Microplitis demolitor bracovirus (MdBV) and parasitizes the host Pseudoplusia includens. MdBV infects primarily host hemocytes and also infects a hemocyte-derived cell line from P. includens called CiE1 cells. Here we report that all 15 genomic segments of the MdBV encapsidated genome exhibited long-term persistence in CiE1 cells. Most MdBV genes expressed in hemocytes were persistently expressed in CiE1 cells, including members of the glc gene family whose products transformed CiE1 cells into a suspension culture. PCR-based integration assays combined with cloning and sequencing of host-virus junctions confirmed that genomic segments J and C persisted in CiE1 cells by integration. These genomic DNAs also rapidly integrated into parasitized P. includens. Sequence analysis of wasp-viral junction clones showed that the integration of proviral segments in M. demolitor was associated with a wasp excision/integration motif (WIM) known from other bracoviruses. However, integration into host cells occurred in association with a previously unknown domain that we named the host integration motif (HIM). The presence of HIMs in most MdBV genomic DNAs suggests that the integration of each genomic segment into host cells occurs through a shared mechanism. PMID:21880747

  5. Integration with the human genome of peptide sequences obtained by high-throughput mass spectrometry

    PubMed Central

    Desiere, Frank; Deutsch, Eric W; Nesvizhskii, Alexey I; Mallick, Parag; King, Nichole L; Eng, Jimmy K; Aderem, Alan; Boyle, Rose; Brunner, Erich; Donohoe, Samuel; Fausto, Nelson; Hafen, Ernst; Hood, Lee; Katze, Michael G; Kennedy, Kathleen A; Kregenow, Floyd; Lee, Hookeun; Lin, Biaoyang; Martin, Dan; Ranish, Jeffrey A; Rawlings, David J; Samelson, Lawrence E; Shiio, Yuzuru; Watts, Julian D; Wollscheid, Bernd; Wright, Michael E; Yan, Wei; Yang, Lihong; Yi, Eugene C; Zhang, Hui; Aebersold, Ruedi

    2005-01-01

    A crucial aim upon the completion of the human genome is the verification and functional annotation of all predicted genes and their protein products. Here we describe the mapping of peptides derived from accurate interpretations of protein tandem mass spectrometry (MS) data to eukaryotic genomes and the generation of an expandable resource for integration of data from many diverse proteomics experiments. Furthermore, we demonstrate that peptide identifications obtained from high-throughput proteomics can be integrated on a large scale with the human genome. This resource could serve as an expandable repository for MS-derived proteome information. PMID:15642101

  6. ssODN-mediated knock-in with CRISPR-Cas for large genomic regions in zygotes.

    PubMed

    Yoshimi, Kazuto; Kunihiro, Yayoi; Kaneko, Takehito; Nagahora, Hitoshi; Voigt, Birger; Mashimo, Tomoji

    2016-01-01

    The CRISPR-Cas system is a powerful tool for generating genetically modified animals; however, targeted knock-in (KI) via homologous recombination remains difficult in zygotes. Here we show efficient gene KI in rats by combining CRISPR-Cas with single-stranded oligodeoxynucleotides (ssODNs). First, a 1-kb ssODN co-injected with guide RNA (gRNA) and Cas9 messenger RNA produces GFP-KI at the rat Thy1 locus. Then, two gRNAs with two 80-bp ssODNs direct efficient integration of a 5.5-kb CAG-GFP vector into the Rosa26 locus via ssODN-mediated end joining. This protocol also achieves KI of a 200-kb BAC containing the human SIRPA locus, concomitantly knocking out the rat Sirpa gene. Finally, three gRNAs and two ssODNs replace 58 kb of the rat Cyp2d cluster with a 6.2-kb human CYP2D6 gene. These ssODN-mediated KI protocols can be applied to any target site with any donor vector without the need to construct homology arms, thus simplifying genome engineering in living organisms. PMID:26786405

  7. ssODN-mediated knock-in with CRISPR-Cas for large genomic regions in zygotes

    PubMed Central

    Yoshimi, Kazuto; Kunihiro, Yayoi; Kaneko, Takehito; Nagahora, Hitoshi; Voigt, Birger; Mashimo, Tomoji

    2016-01-01

    The CRISPR-Cas system is a powerful tool for generating genetically modified animals; however, targeted knock-in (KI) via homologous recombination remains difficult in zygotes. Here we show efficient gene KI in rats by combining CRISPR-Cas with single-stranded oligodeoxynucleotides (ssODNs). First, a 1-kb ssODN co-injected with guide RNA (gRNA) and Cas9 messenger RNA produces GFP-KI at the rat Thy1 locus. Then, two gRNAs with two 80-bp ssODNs direct efficient integration of a 5.5-kb CAG-GFP vector into the Rosa26 locus via ssODN-mediated end joining. This protocol also achieves KI of a 200-kb BAC containing the human SIRPA locus, concomitantly knocking out the rat Sirpa gene. Finally, three gRNAs and two ssODNs replace 58 kb of the rat Cyp2d cluster with a 6.2-kb human CYP2D6 gene. These ssODN-mediated KI protocols can be applied to any target site with any donor vector without the need to construct homology arms, thus simplifying genome engineering in living organisms. PMID:26786405

  8. ssODN-mediated knock-in with CRISPR-Cas for large genomic regions in zygotes.

    PubMed

    Yoshimi, Kazuto; Kunihiro, Yayoi; Kaneko, Takehito; Nagahora, Hitoshi; Voigt, Birger; Mashimo, Tomoji

    2016-01-01

    The CRISPR-Cas system is a powerful tool for generating genetically modified animals; however, targeted knock-in (KI) via homologous recombination remains difficult in zygotes. Here we show efficient gene KI in rats by combining CRISPR-Cas with single-stranded oligodeoxynucleotides (ssODNs). First, a 1-kb ssODN co-injected with guide RNA (gRNA) and Cas9 messenger RNA produces GFP-KI at the rat Thy1 locus. Then, two gRNAs with two 80-bp ssODNs direct efficient integration of a 5.5-kb CAG-GFP vector into the Rosa26 locus via ssODN-mediated end joining. This protocol also achieves KI of a 200-kb BAC containing the human SIRPA locus, concomitantly knocking out the rat Sirpa gene. Finally, three gRNAs and two ssODNs replace 58 kb of the rat Cyp2d cluster with a 6.2-kb human CYP2D6 gene. These ssODN-mediated KI protocols can be applied to any target site with any donor vector without the need to construct homology arms, thus simplifying genome engineering in living organisms.

  9. Identification of cancer-related lncRNAs through integrating genome, regulome and transcriptome features.

    PubMed

    Zhao, Tingting; Xu, Jinyuan; Liu, Ling; Bai, Jing; Xu, Chaohan; Xiao, Yun; Li, Xia; Zhang, Liming

    2015-01-01

    LncRNAs have become rising stars in biology and medicine, due to their versatile functions in a wide range of important biological processes and active roles in various human cancers. Here, we developed a computational method based on the naïve Bayesian classifier method to identify cancer-related lncRNAs by integrating genome, regulome and transcriptome data, and identified 707 potential cancer-related lncRNAs. We demonstrated the performance of the method by ten-fold cross-validation, and found that integration of multi-omic data was necessary to identify cancer-related lncRNAs. We identified 707 potential cancer-related lncRNAs and our results showed that these lncRNAs tend to exhibit significant differential expression and differential DNA methylation in multiple cancer types, and prognosis effects in prostate cancer. We also found that these lncRNAs were more likely to be direct targets of TP53 family members than others. Moreover, based on 147 lncRNA knockdown data in mice, we validated that four of six mouse orthologous lncRNAs were significantly involved in many cancer-related processes, such as cell differentiation and the Wnt signaling pathway. Notably, one lncRNA, lnc-SNURF-1, which was found to be associated with TNF-mediated signaling pathways, was up-regulated in prostate cancer and the protein-coding genes affected by knockdown of the lncRNA were also significantly aberrant in prostate cancer patients, suggesting its probable importance in tumorigenesis. Taken together, our method underlines the power of integrating multi-omic data to uncover cancer-related lncRNAs.

  10. Integrated consensus map of cultivated peanut and wild relatives reveals structures of the A and B genomes of Arachis and divergence of the legume genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complex, tetraploid genome structure of peanut (Arachis hypogaea) has obstructed advances in genetics and genomics in the species. The aim of this study is to understand the genome structure of Arachis by developing a high-density integrated consensus map. Three recombinant inbred line populatio...

  11. An integrated encyclopedia of DNA elements in the human genome.

    PubMed

    2012-09-01

    The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

  12. An Integrated Encyclopedia of DNA Elements in the Human Genome

    PubMed Central

    2012-01-01

    The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure, and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall the project provides new insights into the organization and regulation of our genes and genome, and an expansive resource of functional annotations for biomedical research. PMID:22955616

  13. NDRG1 links p53 with proliferation-mediated centrosome homeostasis and genome stability

    PubMed Central

    Croessmann, Sarah; Wong, Hong Yuen; Zabransky, Daniel J.; Chu, David; Mendonca, Janet; Sharma, Anup; Mohseni, Morassa; Rosen, D. Marc; Scharpf, Robert B.; Cidado, Justin; Cochran, Rory L.; Parsons, Heather A.; Dalton, W. Brian; Erlanger, Bracha; Button, Berry; Cravero, Karen; Kyker-Snowman, Kelly; Beaver, Julia A.; Kachhap, Sushant; Hurley, Paula J.; Lauring, Josh; Park, Ben Ho

    2015-01-01

    The tumor protein 53 (TP53) tumor suppressor gene is the most frequently somatically altered gene in human cancers. Here we show expression of N-Myc down-regulated gene 1 (NDRG1) is induced by p53 during physiologic low proliferative states, and mediates centrosome homeostasis, thus maintaining genome stability. When placed in physiologic low-proliferating conditions, human TP53 null cells fail to increase expression of NDRG1 compared with isogenic wild-type controls and TP53 R248W knockin cells. Overexpression and RNA interference studies demonstrate that NDRG1 regulates centrosome number and amplification. Mechanistically, NDRG1 physically associates with γ-tubulin, a key component of the centrosome, with reduced association in p53 null cells. Strikingly, TP53 homozygous loss was mutually exclusive of NDRG1 overexpression in over 96% of human cancers, supporting the broad applicability of these results. Our study elucidates a mechanism of how TP53 loss leads to abnormal centrosome numbers and genomic instability mediated by NDRG1. PMID:26324937

  14. RNase H2 roles in genome integrity revealed by unlinking its activities

    PubMed Central

    Chon, Hyongi; Sparks, Justin L.; Rychlik, Monika; Nowotny, Marcin; Burgers, Peter M.; Crouch, Robert J.; Cerritelli, Susana M.

    2013-01-01

    Ribonuclease H2 (RNase H2) protects genome integrity by its dual roles of resolving transcription-related R-loops and ribonucleotides incorporated in DNA during replication. To unlink these two functions, we generated a Saccharomyces cerevisiae RNase H2 mutant that can resolve R-loops but cannot cleave single ribonucleotides in DNA. This mutant definitively correlates the 2–5 bp deletions observed in rnh201Δ strains with single rNMPs in DNA. It also establishes a connection between R-loops and Sgs1-mediated replication reinitiation at stalled forks and identifies R-loops uniquely processed by RNase H2. In mouse, deletion of any of the genes coding for RNase H2 results in embryonic lethality, and in humans, RNase H2 hypomorphic mutations cause Aicardi–Goutières syndrome (AGS), a neuroinflammatory disorder. To determine the contribution of R-loops and rNMP in DNA to the defects observed in AGS, we characterized in yeast an AGS-related mutation, which is impaired in processing both substrates, but has sufficient R-loop degradation activity to complement the defects of rnh201Δ sgs1Δ strains. However, this AGS-related mutation accumulates 2–5 bp deletions at a very similar rate as the deletion strain. PMID:23355612

  15. Integrated genome-based studies of Shewanella ecophysiology

    SciTech Connect

    Segre Daniel; Beg Qasim

    2012-02-14

    This project was a component of the Shewanella Federation and, as such, contributed to the overall goal of applying genomic tools to better understand eco-physiology and speciation of respiratory-versatile members of the Shewanella genus. Our role at Boston University was to perform bioreactor and high-throughput gene expression microarrays, and combine dynamic flux balance modeling with experimentally obtained transcriptional and gene expression datasets from different growth conditions. In the first part of the project, we designed the S. oneidensis microarray probes for Affymetrix Inc. (based in California), then we identified the pathways of carbon utilization in the metal-reducing marine bacterium Shewanella oneidensis MR-1, using our newly designed high-density oligonucleotide Affymetrix microarray on Shewanella cells grown with various carbon sources. Next, using a combination of experimental and computational approaches, we built algorithms and methods to integrate the transcriptional and metabolic regulatory networks of S. oneidensis. Specifically, we combined mRNA microarray and metabolite measurements with statistical inference and dynamic flux balance analysis (dFBA) to study the transcriptional response of S. oneidensis MR-1 as it passes through exponential, stationary, and transition phases. By measuring time-dependent mRNA expression levels during batch growth of S. oneidensis MR-1 under two radically different nutrient compositions (minimal lactate and nutritionally rich LB medium), we obtain detailed snapshots of the regulatory strategies used by this bacterium to cope with gradually changing nutrient availability. In addition to traditional clustering, which provides a first indication of major regulatory trends and transcription factor activities, we developed and implemented a new computational approach for Dynamic Detection of Transcriptional Triggers (D2T2). This new method allows us to infer a putative topology of transcriptional dependencies

  16. Integrated genomics of Mucorales reveals novel therapeutic targets

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Mucormycosis is a life-threatening infection caused by Mucorales fungi. We sequenced 30 fungal genomes and performed transcriptomics with three representative Rhizopus and Mucor strains with human airway epithelial cells during fungal invasion to reveal key host and fungal determinants contributing ...

  17. INTEGRATED GENOME-BASED STUDIES OF SHEWANELLA ECOPHYSIOLOGY

    SciTech Connect

    TIEDJE, JAMES M; KONSTANTINIDIS, KOSTAS; WORDEN, MARK

    2014-01-08

    The aim of the work reported is to study Shewanella population genomics, and to understand the evolution, ecophysiology, and speciation of Shewanella. The tasks supporting this aim are: to study genetic and ecophysiological bases defining the core and diversification of Shewanella species; to determine gene content patterns along redox gradients; and to investigate the evolutionary processes, patterns and mechanisms of Shewanella.

  18. An Integrated Genetic and Cytogenetic Map of the Cucumber Genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The Cucurbitaceae includes important crops such as cucumber, melon, watermelon, squash, and pumpkin. However, few genetic and genomic resources are available for plant improvement. Some cucurbit species such as cucumber have a narrow genetic base, which impedes construction of saturated molecular li...

  19. The Mouse Genome Database: integration of and access to knowledge about the laboratory mouse.

    PubMed

    Blake, Judith A; Bult, Carol J; Eppig, Janan T; Kadin, James A; Richardson, Joel E

    2014-01-01

    The Mouse Genome Database (MGD) (http://www.informatics.jax.org) is the community model organism database resource for the laboratory mouse, a premier animal model for the study of genetic and genomic systems relevant to human biology and disease. MGD maintains a comprehensive catalog of genes, functional RNAs and other genome features as well as heritable phenotypes and quantitative trait loci. The genome feature catalog is generated by the integration of computational and manual genome annotations generated by NCBI, Ensembl and Vega/HAVANA. MGD curates and maintains the comprehensive listing of functional annotations for mouse genes using the Gene Ontology, and MGD curates and integrates comprehensive phenotype annotations including associations of mouse models with human diseases. Recent improvements include integration of the latest mouse genome build (GRCm38), improved access to comparative and functional annotations for mouse genes with expanded representation of comparative vertebrate genomes and new loads of phenotype data from high-throughput phenotyping projects. All MGD resources are freely available to the research community.

  20. Enhanced targeted integration mediated by translocated I-SceI during the Agrobacterium mediated transformation of yeast.

    PubMed

    Rolloos, Martijn; Hooykaas, Paul J J; van der Zaal, Bert J

    2015-02-09

    Agrobacterium mediated transformation (AMT) has been embraced by biotechnologists as the technology of choice to introduce or alter genetic traits of plants. However, in plants it is virtually impossible to predetermine the integration site of the transferred T-strand unless one is able to generate a double stranded break (DSB) in the DNA at the site of interest. In this study, we used the model organism Saccharomyces cerevisiae to investigate whether the Agrobacterium mediated translocation of site-specific endonucleases via the type IV secretion system (T4SS), concomitantly with T-DNA transfer, is possible and whether this can improve the gene targeting efficiency. In addition, the effect of different chromatin states on targeted integration was investigated. It was found that Agrobacterium mediated translocation of the homing endonuclease I-SceI has a positive effect on the integration of T-DNA via the homologous repair (HR) pathway. Furthermore, we obtained evidence that nucleosome removal has a positive effect on I-SceI facilitated T-DNA integration by HR. Conversely, inducing nucleosome formation at the site of integration removes the positive effect of translocated I-SceI on T-DNA integration.

  1. Heritable CRISPR/Cas9-mediated genome editing in the yellow fever mosquito, Aedes aegypti.

    PubMed

    Dong, Shengzhang; Lin, Jingyi; Held, Nicole L; Clem, Rollie J; Passarelli, A Lorena; Franz, Alexander W E

    2015-01-01

    In vivo targeted gene disruption is a powerful tool to study gene function. Thus far, two tools for genome editing in Aedes aegypti have been applied, zinc-finger nucleases (ZFN) and transcription activator-like effector nucleases (TALEN). As a promising alternative to ZFN and TALEN, which are difficult to produce and validate using standard molecular biological techniques, the clustered regularly interspaced short palindromic repeats/CRISPR-associated sequence 9 (CRISPR/Cas9) system has recently been discovered as a "do-it-yourself" genome editing tool. Here, we describe the use of CRISPR/Cas9 in the mosquito vector, Aedes aegypti. In a transgenic mosquito line expressing both DsRed and enhanced cyan fluorescent protein (ECFP) from the eye tissue-specific 3xP3 promoter in separated but tightly linked expression cassettes, we targeted the ECFP nucleotide sequence for disruption. When supplying the Cas9 enzyme and two sgRNAs targeting different regions of the ECFP gene as in vitro transcribed mRNAs for germline transformation, we recovered four different G1 pools (5.5% knockout efficiency) where individuals still expressed DsRed but no longer ECFP. PCR amplification, cloning, and sequencing of PCR amplicons revealed indels in the ECFP target gene ranging from 2 to 27 nucleotides. These results show for the first time that CRISPR/Cas9 mediated gene editing is achievable in Ae. aegypti, paving the way for further functional genomics related studies in this mosquito species. PMID:25815482

  2. Integrating cytogenetics and genomics in comparative evolutionary studies of cichlid fish

    PubMed Central

    2012-01-01

    Background The availability of a large number of recently sequenced vertebrate genomes opens new avenues to integrate cytogenetics and genomics in comparative and evolutionary studies. Cytogenetic mapping can offer alternative means to identify conserved synteny shared by distinct genomes and also to define genome regions that are still not finely characterized even after wide-ranging nucleotide sequencing efforts. An efficient way to perform comparative cytogenetic mapping is to map BAC clones by fluorescence in situ hybridization. In this report, to address the knowledge gap on genome evolution in cichlid fishes, BAC clones of an Oreochromis niloticus library covering the linkage groups (LG) 1, 3, 5, and 7 were mapped onto the chromosomes of 9 African cichlid species. The cytogenetic mapping data were also integrated with BAC-end sequence information of O. niloticus and comparatively analyzed against the genomes of other fish species and vertebrates. Results The location of BACs from LG1, 3, 5, and 7 revealed a strong chromosomal conservation among the analyzed cichlid species genomes, which evidenced a synteny of the markers of each LG. Comparative in silico analysis also identified large genomic blocks that were conserved in distantly related fish groups and also in other vertebrates. Conclusions Although it has been suggested that fishes contain plastic genomes with high rates of chromosomal rearrangements and probably low rates of synteny conservation, our results show that large syntenic chromosome segments have been conserved during evolution, at least for the considered markers. Additionally, our current cytogenetic mapping efforts, integrated with genomic approaches, lead to a new perspective for addressing important questions involving chromosome evolution in fishes. PMID:22958299

  3. GMATA: An Integrated Software Package for Genome-Scale SSR Mining, Marker Development and Viewing

    PubMed Central

    Wang, Xuewen; Wang, Le

    2016-01-01

    Simple sequence repeats (SSRs), also referred to as microsatellites, are highly variable tandem DNAs that are widely used as genetic markers. The increasing availability of whole-genome and transcript sequences provides information resources for SSR marker development. However, efficient software is required to identify and display SSR information along with other gene features at a genome scale. We developed a novel software package, Genome-wide Microsatellite Analyzing Tool Package (GMATA), integrating SSR mining, statistical analysis and plotting, marker design, polymorphism screening and marker transferability, and enabling simultaneous display of SSR markers with other genome features. GMATA applies novel strategies for SSR analysis and primer design in large genomes, which allows GMATA to perform faster calculation and provide more accurate results than existing tools. Our package is also capable of processing DNA sequences of any size on a standard computer. GMATA is user friendly, requires only mouse clicks or typed inputs on the command line, and is executable on multiple computing platforms. We demonstrated the application of GMATA in plant genomes and revealed a novel distribution pattern of SSRs in 15 grass genomes. The most abundant motifs are dimer GA/TC, the A/T monomer and the GCG/CGC trimer, rather than the rich G/C content in DNA sequence. We also revealed that SSR count is linearly related to chromosome length in fully assembled grass genomes. GMATA represents a powerful application tool that facilitates genomic sequence analyses. GMATA is freely available at http://sourceforge.net/projects/gmata/?source=navbar.
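
    As an illustration of what SSR mining involves, the following is a minimal sketch of perfect tandem-repeat detection with a regular expression. It is not GMATA's algorithm; the function name `find_ssrs` and the default thresholds (motifs of 1 to 6 bases, at least 5 repeats) are assumptions chosen only for this example.

```python
import re

def find_ssrs(seq, min_motif=1, max_motif=6, min_repeats=5):
    """Return (start, motif, repeat_count) for each perfect tandem repeat.

    Hypothetical helper for illustration; thresholds are assumed defaults.
    """
    seq = seq.upper()
    # Group 1 lazily captures the shortest motif of min_motif..max_motif bases;
    # the back-reference then requires at least (min_repeats - 1) more copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits

# Toy sequence (made up for the example) containing six copies of "GA".
example = find_ssrs("TTGAGAGAGAGAGATTCC")
```

    Because the motif group is lazy, a run such as AAAAAA is reported as a monomer repeat rather than as a repeat of AA or AAA. Imperfect or compound repeats are outside the scope of this sketch.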

  4. GMATA: An Integrated Software Package for Genome-Scale SSR Mining, Marker Development and Viewing.

    PubMed

    Wang, Xuewen; Wang, Le

    2016-01-01

    Simple sequence repeats (SSRs), also referred to as microsatellites, are highly variable tandem DNAs that are widely used as genetic markers. The increasing availability of whole-genome and transcript sequences provides information resources for SSR marker development. However, efficient software is required to identify and display SSR information along with other gene features at a genome scale. We developed a novel software package, Genome-wide Microsatellite Analyzing Tool Package (GMATA), integrating SSR mining, statistical analysis and plotting, marker design, polymorphism screening and marker transferability, and enabling simultaneous display of SSR markers with other genome features. GMATA applies novel strategies for SSR analysis and primer design in large genomes, which allows GMATA to perform faster calculation and provide more accurate results than existing tools. Our package is also capable of processing DNA sequences of any size on a standard computer. GMATA is user friendly, requires only mouse clicks or typed inputs on the command line, and is executable on multiple computing platforms. We demonstrated the application of GMATA in plant genomes and revealed a novel distribution pattern of SSRs in 15 grass genomes. The most abundant motifs are dimer GA/TC, the A/T monomer and the GCG/CGC trimer, rather than the rich G/C content in DNA sequence. We also revealed that SSR count is linearly related to chromosome length in fully assembled grass genomes. GMATA represents a powerful application tool that facilitates genomic sequence analyses. GMATA is freely available at http://sourceforge.net/projects/gmata/?source=navbar.

  5. CASFISH: CRISPR/Cas9-mediated in situ labeling of genomic loci in fixed cells.

    PubMed

    Deng, Wulan; Shi, Xinghua; Tjian, Robert; Lionnet, Timothée; Singer, Robert H

    2015-09-22

    Direct visualization of genomic loci in the 3D nucleus is important for understanding the spatial organization of the genome and its association with gene expression. Various DNA FISH methods have been developed in the past decades, all involving denaturing dsDNA and hybridizing fluorescent nucleic acid probes. Here we report a novel approach that uses in vitro constituted nuclease-deficient clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) complexes as probes to label sequence-specific genomic loci fluorescently without global DNA denaturation (Cas9-mediated fluorescence in situ hybridization, CASFISH). Using fluorescently labeled nuclease-deficient Cas9 (dCas9) protein assembled with various single-guide RNA (sgRNA), we demonstrated rapid and robust labeling of repetitive DNA elements in pericentromere, centromere, G-rich telomere, and coding gene loci. Assembling dCas9 with an array of sgRNAs tiling arbitrary target loci, we were able to visualize nonrepetitive genomic sequences. The dCas9/sgRNA binary complex is stable and binds its target DNA with high affinity, allowing sequential or simultaneous probing of multiple targets. CASFISH assays using differently colored dCas9/sgRNA complexes allow multicolor labeling of target loci in cells. In addition, the CASFISH assay is remarkably rapid under optimal conditions and is applicable for detection in primary tissue sections. This rapid, robust, less disruptive, and cost-effective technology adds a valuable tool for basic research and genetic diagnosis.

  6. CASFISH: CRISPR/Cas9-mediated in situ labeling of genomic loci in fixed cells

    PubMed Central

    Deng, Wulan; Shi, Xinghua; Tjian, Robert; Lionnet, Timothée; Singer, Robert H.

    2015-01-01

    Direct visualization of genomic loci in the 3D nucleus is important for understanding the spatial organization of the genome and its association with gene expression. Various DNA FISH methods have been developed in the past decades, all involving denaturing dsDNA and hybridizing fluorescent nucleic acid probes. Here we report a novel approach that uses in vitro constituted nuclease-deficient clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) complexes as probes to label sequence-specific genomic loci fluorescently without global DNA denaturation (Cas9-mediated fluorescence in situ hybridization, CASFISH). Using fluorescently labeled nuclease-deficient Cas9 (dCas9) protein assembled with various single-guide RNA (sgRNA), we demonstrated rapid and robust labeling of repetitive DNA elements in pericentromere, centromere, G-rich telomere, and coding gene loci. Assembling dCas9 with an array of sgRNAs tiling arbitrary target loci, we were able to visualize nonrepetitive genomic sequences. The dCas9/sgRNA binary complex is stable and binds its target DNA with high affinity, allowing sequential or simultaneous probing of multiple targets. CASFISH assays using differently colored dCas9/sgRNA complexes allow multicolor labeling of target loci in cells. In addition, the CASFISH assay is remarkably rapid under optimal conditions and is applicable for detection in primary tissue sections. This rapid, robust, less disruptive, and cost-effective technology adds a valuable tool for basic research and genetic diagnosis. PMID:26324940

  7. Childhood Acute Lymphoblastic Leukemia: Integrating Genomics into Therapy

    PubMed Central

    Tasian, Sarah K; Loh, Mignon L; Hunger, Stephen P

    2015-01-01

    Acute lymphoblastic leukemia (ALL), the most common malignancy of childhood, is a genetically complex entity that remains a major cause of childhood cancer-related mortality. Major advances in genomic and epigenomic profiling during the past decade have appreciably enhanced knowledge of the biology of de novo and relapsed ALL and have facilitated more precise risk stratification of patients. These achievements have also provided critical insights regarding potentially targetable lesions for development of new therapeutic approaches in the era of precision medicine. This review delineates the current genetic landscape of childhood ALL with emphasis upon patient outcomes with contemporary treatment regimens, as well as therapeutic implications of newly identified genomic alterations in specific subsets of ALL. PMID:26194091

  8. Towards integration of population and comparative genomics in forest trees.

    PubMed

    Ingvarsson, Pär K; Hvidsten, Torgeir R; Street, Nathaniel R

    2016-10-01

    The past decade saw the initiation of an ongoing revolution in sequencing technologies that is transforming all fields of biology. This has been driven by the advent and widespread availability of high-throughput, massively parallel short-read sequencing (MPS) platforms. These technologies have enabled previously unimaginable studies, including draft assemblies of the massive genomes of coniferous species and population-scale resequencing. Transcriptomics studies have likewise been transformed, with RNA-sequencing enabling studies in nonmodel organisms, the discovery of previously unannotated genes (novel transcripts), entirely new classes of RNAs and previously unknown regulatory mechanisms. Here we touch upon current developments in the areas of genome assembly, comparative regulomics and population genetics as they relate to studies of forest tree species. PMID:27575589

  9. Integrative environmental genomics of Cod (Gadus morhua): the proteomics approach.

    PubMed

    Karlsen, Odd André; Bjørneklett, Silje; Berg, Karin; Brattås, Marianne; Bohne-Kjersem, Anneli; Grøsvik, Bjørn Einar; Goksøyr, Anders

    2011-01-01

    Atlantic cod (Gadus morhua) is an essential species in North Atlantic fisheries and increasingly relevant as an aquaculture species. However, potential conflicts with both coastal industry and petroleum industry expanding into northern waters make it important to understand how effluents (produced water, pharmaceuticals, food contaminants, and feed contaminants) affect the growth, reproduction, and health of this species in order to maintain a sustainable cod population and a healthy human food source, and to discover biomarkers for environmental monitoring and risk assessment. The ongoing genome sequencing effort of Atlantic cod has opened the possibility for a systems biology approach to elucidate molecular mechanisms of toxicity. Our study aims to be a first step toward such a systems toxicology understanding of genomic responses to environmental insults. A toxicogenomic approach was initiated that combines data generated from proteomics analyses and transcriptomics analyses with the concurrent development of searchable expressed sequence tag (EST) databases and genomic databases. This interdisciplinary study may also open new possibilities for gene annotation and pathway analyses.

  10. Genome maintenance and transcription integrity in aging and disease

    PubMed Central

    Wolters, Stefanie; Schumacher, Björn

    2013-01-01

    DNA damage contributes to cancer development and aging. Congenital syndromes that affect DNA repair processes are characterized by cancer susceptibility, developmental defects, and accelerated aging (Schumacher et al., 2008). DNA damage interferes with DNA metabolism by blocking replication and transcription. DNA polymerase blockage leads to replication arrest and can give rise to genome instability. Transcription, on the other hand, is an essential process for utilizing the information encoded in the genome. DNA damage that interferes with transcription can lead to apoptosis and cellular senescence. Both processes are powerful tumor suppressors (Bartek and Lukas, 2007). Cellular response mechanisms to stalled RNA polymerase II complexes have only recently started to be uncovered. Transcription-coupled DNA damage responses might thus play important roles for the adjustments to DNA damage accumulation in the aging organism (Garinis et al., 2009). Here we review human disorders that are caused by defects in genome stability to explore the role of DNA damage in aging and disease. We discuss how the nucleotide excision repair system functions at the interface of transcription and repair and conclude with concepts of how therapeutic targeting of transcription might be utilized in the treatment of cancer. PMID:23443494

  11. A pilot bridging data integration and analytics: BioMediator and R?

    PubMed

    Jeng, S; Wang, K; Barbero, J; Brinkley, J; Tarczy-Hornoch, P

    2005-01-01

    Biological research today involves aggregating and analyzing large amounts of data from disparate sources. Tools such as the University of Washington's BioMediator system integrate heterogeneous data. Analytic packages such as the R environment have a rich set of tools to analyze biomedical research data. Our pilot project bridged data integration and analytics in a general way by successfully incorporating the BioMediator system into the R platform for specific analyses on neurophysiologic research data.

  12. HIV-1 Integrates Widely throughout the Genome of the Human Blood Fluke Schistosoma mansoni

    PubMed Central

    Mann, Victoria H.; Dubrovsky, Larisa; Yan, Hong-bin; Huckvale, Thomas; Protasio, Anna V.; Pushkarsky, Tatiana; Iordanskiy, Sergey; Bukrinsky, Michael I.

    2016-01-01

    Schistosomiasis is the most important helminthic disease of humanity in terms of morbidity and mortality. Facile manipulation of schistosomes using lentiviruses would enable advances in functional genomics in these and related neglected tropical disease pathogens including tapeworms, and including their non-dividing cells. Such approaches have hitherto been unavailable. Blood stream forms of the human blood fluke, Schistosoma mansoni, the causative agent of hepatointestinal schistosomiasis, were infected with the human HIV-1 isolate NL4-3 pseudotyped with vesicular stomatitis virus glycoprotein. The appearance of strong stop and positive strand cDNAs indicated that virions fused to schistosome cells, the nucleocapsid was internalized and the RNA genome was reverse transcribed. Anchored PCR analysis, sequencing of HIV-1-specific anchored Illumina libraries and Whole Genome Sequencing (WGS) of schistosomes confirmed chromosomal integration; >8,000 integrations were mapped, distributed throughout the eight pairs of chromosomes including the sex chromosomes. The rate of integrations in the genome exceeded five per 1,000 kb and HIV-1 integrated into protein-encoding loci and elsewhere with integration bias dissimilar to that of human T cells. We estimated ~ 2,100 integrations per schistosomulum based on WGS, i.e. about two or three events per cell, comparable to integration rates in human cells. Accomplishment in schistosomes of post-entry processes essential for HIV-1 replication, including integrase-catalyzed integration, was remarkable given the phylogenetic distance between schistosomes and primates, the natural hosts of the genus Lentivirus. These enigmatic findings revealed that HIV-1 was active within cells of S. mansoni, and provided the first demonstration that HIV-1 can integrate into the genome of an invertebrate. PMID:27764257

  13. Genomic characterization of viral integration sites in HPV-related cancers.

    PubMed

    Bodelon, Clara; Untereiner, Michael E; Machiela, Mitchell J; Vinokurova, Svetlana; Wentzensen, Nicolas

    2016-11-01

    Persistent infection with carcinogenic human papillomaviruses (HPV) causes the majority of anogenital cancers and a subset of head and neck cancers. The HPV genome is frequently found integrated into the host genome of invasive cancers. The mechanisms by which integration may promote disease progression are not well understood. Thoroughly characterizing integration events can provide insights into HPV carcinogenesis. Individual studies have reported a limited number of integration sites in cell lines and human samples. We performed a systematic review of published integration sites in HPV-related cancers and conducted a pooled analysis to formally test for integration hotspots and genomic features enriched in integration events using data from the Encyclopedia of DNA Elements (ENCODE). Over 1,500 integration sites were reported in the literature, of which 90.8% (N = 1,407) were in human tissues. We found 10 cytobands enriched for integration events, three previously reported ones (3q28, 8q24.21 and 13q22.1) and seven additional ones (2q22.3, 3p14.2, 8q24.22, 14q24.1, 17p11.1, 17q23.1 and 17q23.2). Cervical infections with HPV18 were more likely to have breakpoints in 8q24.21 (p = 7.68 × 10^-4) than those with HPV16. Overall, integration sites were more likely to be in gene regions than expected by chance (p = 6.93 × 10^-9). They were also significantly closer to CpG regions, fragile sites, transcriptionally active regions and enhancers. Few integration events occurred within 50 Kb of known cervical cancer driver genes. This suggests that HPV integrates in accessible regions of the genome, preferentially genes and enhancers, which may affect the expression of target genes. PMID:27343048

  14. Integrated pathway-genome databases and their role in drug discovery.

    PubMed

    Karp, P D; Krummenacker, M; Paley, S; Wagg, J

    1999-07-01

    Integrated pathway-genome databases describe the genes and genome of an organism, as well as its predicted pathways, reactions, enzymes and metabolites. In conjunction with visualization and analysis software, these databases provide a framework for improved understanding of microbial physiology and for antimicrobial drug discovery. We describe pathway-based analyses of the genomes of a number of medically relevant microorganisms and a novel software tool that visualizes gene-expression data on a diagram showing the whole metabolic network of the microorganism.

  15. ITEP: An integrated toolkit for exploration of microbial pan-genomes

    PubMed Central

    2014-01-01

    Background Comparative genomics is a powerful approach for studying variation in physiological traits as well as the evolution and ecology of microorganisms. Recent technological advances have enabled sequencing large numbers of related genomes in a single project, requiring computational tools for their integrated analysis. In particular, accurate annotations and identification of gene presence and absence are critical for understanding and modeling the cellular physiology of newly sequenced genomes. Although many tools are available to compare the gene contents of related genomes, new tools are necessary to enable close examination and curation of protein families from large numbers of closely related organisms, to integrate curation with the analysis of gain and loss, and to generate metabolic networks linking the annotations to observed phenotypes. Results We have developed ITEP, an Integrated Toolkit for Exploration of microbial Pan-genomes, to curate protein families, compute similarities to externally-defined domains, analyze gene gain and loss, and generate draft metabolic networks from one or more curated reference network reconstructions in groups of related microbial species among which the combination of core and variable genes constitute their "pan-genomes". The ITEP toolkit consists of: (1) a series of modular command-line scripts for identification, comparison, curation, and analysis of protein families and their distribution across many genomes; (2) a set of Python libraries for programmatic access to the same data; and (3) pre-packaged scripts to perform common analysis workflows on a collection of genomes. ITEP’s capabilities include de novo protein family prediction, ortholog detection, analysis of functional domains, identification of core and variable genes and gene regions, sequence alignments and tree generation, annotation curation, and the integration of cross-genome analysis and metabolic networks for study of metabolic network

  16. Integrating Genomic Resources with Electronic Health Records using the HL7 Infobutton Standard

    PubMed Central

    Overby, Casey Lynnette; Del Fiol, Guilherme; Rubinstein, Wendy S.; Maglott, Donna R.; Nelson, Tristan H.; Milosavljevic, Aleksandar; Martin, Christa L.; Goehringer, Scott R.; Freimuth, Robert R.; Williams, Marc S.

    2016-01-01

    Summary Background The Clinical Genome Resource (ClinGen) Electronic Health Record (EHR) Workgroup aims to integrate ClinGen resources with EHRs. A promising option to enable this integration is through the Health Level Seven (HL7) Infobutton Standard. EHR systems that are certified according to the US Meaningful Use program provide HL7-compliant infobutton capabilities, which can be leveraged to support clinical decision-making in genomics. Objectives To integrate genomic knowledge resources using the HL7 infobutton standard. Two tactics to achieve this objective were: (1) creating an HL7-compliant search interface for ClinGen, and (2) proposing guidance for genomic resources on achieving HL7 Infobutton standard accessibility and compliance. Methods We built a search interface utilizing OpenInfobutton, an open source reference implementation of the HL7 Infobutton standard. ClinGen resources were assessed for readiness towards HL7 compliance. Finally, based upon our experiences we provide recommendations for publishers seeking to achieve HL7 compliance. Results Eight genomic resources and two sub-resources were integrated with the ClinGen search engine via OpenInfobutton and the HL7 infobutton standard. Resources we assessed have varying levels of readiness towards HL7-compliance. Furthermore, we found that adoption of standard terminologies used by EHR systems is the main gap to achieve compliance. Conclusion Genomic resources can be integrated with EHR systems via the HL7 Infobutton standard using OpenInfobutton. Full compliance of genomic resources with the Infobutton standard would further enhance interoperability with EHR systems. PMID:27579472
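
    To make the idea of an HL7 infobutton request more concrete, the sketch below builds a URL-style knowledge request in Python. The parameter names (`taskContext.c.c`, `mainSearchCriteria.v.c`, `mainSearchCriteria.v.cs`, `mainSearchCriteria.v.dn`) reflect our understanding of the URL-based Infobutton implementation guide and should be checked against the standard before use. The base URL, the helper name `build_infobutton_url`, and the example concept are assumptions made for illustration and are not taken from the ClinGen work.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_infobutton_url(base_url, code, code_system_oid, display_name,
                         task_context="PROBLISTREV"):
    """Assemble a URL-style infobutton request (hypothetical helper)."""
    params = {
        # Clinical task the user is performing (assumed: problem list review).
        "taskContext.c.c": task_context,
        # Concept of interest: code, code-system OID, and display name.
        "mainSearchCriteria.v.c": code,
        "mainSearchCriteria.v.cs": code_system_oid,
        "mainSearchCriteria.v.dn": display_name,
    }
    return base_url + "?" + urlencode(params)

# Placeholder endpoint and an illustrative ICD-10-CM concept.
url = build_infobutton_url(
    "https://example.org/infobutton",
    code="E84.0",
    code_system_oid="2.16.840.1.113883.6.90",
    display_name="Cystic fibrosis with pulmonary manifestations",
)
query = parse_qs(urlsplit(url).query)
```

    Encoding the concept with a standard terminology matters here because, as the abstract notes, adoption of the terminologies used by EHR systems was the main gap to compliance.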

  17. Integration of genomic medicine into pathology residency training: the Stanford open curriculum.

    PubMed

    Schrijver, Iris; Natkunam, Yasodha; Galli, Stephen; Boyd, Scott D

    2013-03-01

    Next-generation sequencing methods provide an opportunity for molecular pathology laboratories to perform genomic testing that is far more comprehensive than single-gene analyses. Genome-based test results are expected to develop into an integral component of diagnostic clinical medicine and to provide the basis for individually tailored health care. To achieve these goals, rigorous interpretation of high-quality data must be informed by the medical history and the phenotype of the patient. The discipline of pathology is well positioned to implement genome-based testing and to interpret its results, but new knowledge and skills must be included in the training of pathologists to develop expertise in this area. Pathology residents should be trained in emerging technologies to integrate genomic test results appropriately with more traditional testing, to accelerate clinical studies using genomic data, and to help develop appropriate standards of data quality and evidence-based interpretation of these test results. We have created a genomic pathology curriculum as a first step in helping pathology residents build a foundation for the understanding of genomic medicine and its implications for clinical practice. This curriculum is freely accessible online.

  18. Gene context analysis in the Integrated Microbial Genomes (IMG) data management system

    SciTech Connect

    Mavromatis, Konstantinos; Chu, Ken; Ivanova, Natalia; Hooper, Sean D.; Markowitz, Victor M.; Kyrpides, Nikos C.

    2009-05-01

    Computational methods for determining the function of genes in newly sequenced genomes have been traditionally based on sequence similarity to genes whose function has been identified experimentally. Function prediction methods can be extended using gene context analysis approaches such as examining the conservation of chromosomal gene clusters, gene fusion events and co-occurrence profiles across genomes. Context analysis is based on the observation that functionally related genes often have similar gene context and relies on the identification of such events across a statistically significant and phylogenetically diverse collection of genomes. We have used the data management system of the Integrated Microbial Genomes (IMG) as the framework to implement and explore the power of gene context analysis methods because it provides one of the largest available genome integrations. Visualization and search tools to facilitate and explore gene context analysis have been developed and applied across all publicly available archaeal and bacterial genomes in IMG. These computations are now maintained as part of IMG's regular genome content update cycle. IMG is available at: http://img.jgi.doe.gov.
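
    One of the context signals mentioned above, co-occurrence profiles across genomes, can be sketched in a few lines of Python. This is a generic illustration using Jaccard similarity of presence/absence profiles, not the method implemented in IMG; the function names, the 0.8 threshold, and the toy gene and genome identifiers are all assumptions made for the example.

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity between two sets of genome identifiers."""
    a, b = set(profile_a), set(profile_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cooccurring_pairs(presence, threshold=0.8):
    """Return gene pairs whose occurrence profiles overlap strongly.

    presence maps a gene family name to the set of genomes containing it.
    """
    genes = sorted(presence)
    pairs = []
    for i, gene_1 in enumerate(genes):
        for gene_2 in genes[i + 1:]:
            score = jaccard(presence[gene_1], presence[gene_2])
            if score >= threshold:
                pairs.append((gene_1, gene_2, score))
    return pairs

# Toy presence/absence data (hypothetical gene families and genomes).
toy_presence = {
    "famA": {"genome1", "genome2", "genome3", "genome4"},
    "famB": {"genome1", "genome2", "genome3", "genome4"},
    "famC": {"genome5"},
}
toy_pairs = cooccurring_pairs(toy_presence)
```

    A raw similarity threshold ignores phylogenetic relatedness among genomes, which is one reason the abstract stresses a statistically significant and phylogenetically diverse genome collection.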

  19. Databases and information integration for the Medicago truncatula genome and transcriptome.

    PubMed

    Cannon, Steven B; Crow, John A; Heuer, Michael L; Wang, Xiaohong; Cannon, Ethalinda K S; Dwan, Christopher; Lamblin, Anne-Francoise; Vasdewani, Jayprakash; Mudge, Joann; Cook, Andrew; Gish, John; Cheung, Foo; Kenton, Steve; Kunau, Timothy M; Brown, Douglas; May, Gregory D; Kim, Dongjin; Cook, Douglas R; Roe, Bruce A; Town, Chris D; Young, Nevin D; Retzel, Ernest F

    2005-05-01

    An international consortium is sequencing the euchromatic genespace of Medicago truncatula. Extensive bioinformatic and database resources support the marker-anchored bacterial artificial chromosome (BAC) sequencing strategy. Existing physical and genetic maps and deep BAC-end sequencing help to guide the sequencing effort, while EST databases provide essential resources for genome annotation as well as transcriptome characterization and microarray design. Finished BAC sequences are joined into overlapping sequence assemblies and undergo an automated annotation process that integrates ab initio predictions with EST, protein, and other recognizable features. Because of the sequencing project's international and collaborative nature, data production, storage, and visualization tools are broadly distributed. This paper describes databases and Web resources for the project, which provide support for physical and genetic maps, genome sequence assembly, gene prediction, and integration of EST data. A central project Web site at medicago.org/genome provides access to genome viewers and other resources project-wide, including an Ensembl implementation at medicago.org, physical map and marker resources at mtgenome.ucdavis.edu, and genome viewers at the University of Oklahoma (www.genome.ou.edu), the Institute for Genomic Research (www.tigr.org), and Munich Information for Protein Sequences Center (mips.gsf.de). PMID:15888676

  20. Figure 2 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Cancer.gov

    Grouping and sorting genomic data in IGV. The IGV user interface displaying 202 glioblastoma samples from TCGA. Samples are grouped by tumor subtype (second annotation column) and data type (first annotation column) and sorted by copy number of the EGFR locus (middle column). Adapted from Figure 1; Robinson et al. 2011

  1. Figure 5 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Cancer.gov

    Split-Screen View. The split-screen view is useful for exploring relationships of genomic features that are independent of chromosomal location. Color is used here to indicate mate pairs that map to different chromosomes, chromosomes 1 and 6, suggesting a translocation event. Adapted from Figure 8; Thorvaldsdottir H et al. 2012

  2. Figure 4 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Cancer.gov

    Gene-list view of genomic data. The gene-list view allows users to compare data across a set of loci. The data in this figure includes copy number, mutation, and clinical data from 202 glioblastoma samples from TCGA. Adapted from Figure 7; Thorvaldsdottir H et al. 2012

  3. High-throughput genomic mapping of vector integration sites in gene therapy studies.

    PubMed

    Beard, Brian C; Adair, Jennifer E; Trobridge, Grant D; Kiem, Hans-Peter

    2014-01-01

    Gene therapy has enormous potential to treat a variety of infectious and genetic diseases. To date hundreds of patients worldwide have received hematopoietic cell products that have been gene-modified with retrovirus vectors carrying therapeutic transgenes, and many patients have been cured or demonstrated disease stabilization as a result (Adair et al., Sci Transl Med 4:133ra57, 2012; Biffi et al., Science 341:1233158, 2013; Aiuti et al., Science 341:1233151, 2013; Fischer et al., Gene 525:170-173, 2013). Unfortunately, for some patients the provirus integration dysregulated the expression of nearby genes leading to clonal outgrowth and, in some cases, cancer. Thus, the unwanted side effect of insertional mutagenesis has become a major concern for retrovirus gene therapy. The careful study of retrovirus integration sites (RIS) and the contribution of individual gene-modified clones to hematopoietic repopulating cells is of crucial importance for all gene therapy studies. Supporting this, the US Food and Drug Administration (FDA) has mandated the careful monitoring of RIS in all clinical trials of gene therapy. An invaluable method was developed: linear amplification mediated-polymerase chain reaction (LAM-PCR) capable of analyzing in vitro and complex in vivo samples, capturing valuable genomic information directly flanking the site of provirus integration. Linking this method and similar methods to high-throughput sequencing has now made possible an unprecedented understanding of the integration profile of various retrovirus vectors, and allows for sensitive monitoring of their safety. It also allows for a detailed comparison of improved safety-enhanced gene therapy vectors. An important readout of safety is the relative contribution of individual gene-modified repopulating clones. One limitation of LAM-PCR is that the ability to capture the relative contribution of individual clones is compromised because of the initial linear PCR common to all current methods

  4. VISPA: a computational pipeline for the identification and analysis of genomic vector integration sites.

    PubMed

    Calabria, Andrea; Leo, Simone; Benedicenti, Fabrizio; Cesana, Daniela; Spinozzi, Giulio; Orsini, Massimilano; Merella, Stefania; Stupka, Elia; Zanetti, Gianluigi; Montini, Eugenio

    2014-01-01

    The analysis of the genomic distribution of viral vector integration sites is a key step in hematopoietic stem cell-based gene therapy applications, making it possible to assess both the safety and the efficacy of the treatment and to study the basic aspects of hematopoiesis and stem cell biology. Identifying vector integration sites requires ad-hoc bioinformatics tools with stringent requirements in terms of computational efficiency, flexibility, and usability. We developed VISPA (Vector Integration Site Parallel Analysis), a pipeline for automated integration site identification and annotation based on a distributed environment with a simple Galaxy web interface. VISPA was successfully used for the bioinformatics analysis of the follow-up of two lentiviral vector-based hematopoietic stem-cell gene therapy clinical trials. Our pipeline provides a reliable and efficient tool to assess the safety and efficacy of integrating vectors in clinical settings. PMID:25342980

  5. Integrated genome-wide chromatin occupancy and expression analyses identify key myeloid pro-differentiation transcription factors repressed by Myb.

    PubMed

    Zhao, Liang; Glazov, Evgeny A; Pattabiraman, Diwakar R; Al-Owaidi, Faisal; Zhang, Ping; Brown, Matthew A; Leo, Paul J; Gonda, Thomas J

    2011-06-01

    To gain insight into the mechanisms by which the Myb transcription factor controls normal hematopoiesis and particularly, how it contributes to leukemogenesis, we mapped the genome-wide occupancy of Myb by chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-Seq) in ERMYB myeloid progenitor cells. By integrating the genome occupancy data with whole genome expression profiling data, we identified a Myb-regulated transcriptional program. Gene signatures for leukemia stem cells, normal hematopoietic stem/progenitor cells and myeloid development were overrepresented in 2368 Myb regulated genes. Of these, Myb bound directly near or within 793 genes. Myb directly activates some genes known to be critical in maintaining hematopoietic stem cells, such as Gfi1 and Cited2. Importantly, we also show that, despite being usually considered as a transactivator, Myb also functions to repress approximately half of its direct targets, including several key regulators of myeloid differentiation, such as Sfpi1 (also known as Pu.1), Runx1, Junb and Cebpb. Furthermore, our results demonstrate that interaction with p300, an established coactivator for Myb, is unexpectedly required for Myb-mediated transcriptional repression. We propose that the repression of the above-mentioned key pro-differentiation factors may contribute essentially to Myb's ability to suppress differentiation and promote self-renewal, thus maintaining progenitor cells in an undifferentiated state and promoting leukemic transformation.
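
    The integration step described above can be pictured as intersecting the set of bound genes (occupancy) with the set of regulated genes (expression), then splitting the overlap by direction of regulation. The sketch below is only an illustration under stated assumptions: the function name, the +1/-1 direction encoding, and the placeholder genes geneX and geneY are hypothetical, and the directions for the named genes follow the abstract rather than any measured values.

```python
def classify_direct_targets(bound_genes, expression_direction):
    """Split genes that are both bound and regulated into activated and repressed.

    expression_direction maps gene -> +1 (activated) or -1 (repressed);
    this encoding is an assumption made for the sketch.
    """
    direct = set(bound_genes) & set(expression_direction)
    activated = sorted(g for g in direct if expression_direction[g] > 0)
    repressed = sorted(g for g in direct if expression_direction[g] < 0)
    return activated, repressed

# Toy inputs: geneX is bound but not regulated, geneY is regulated but not bound.
bound = {"Gfi1", "Cited2", "Cebpb", "Junb", "geneX"}
direction = {"Gfi1": +1, "Cited2": +1, "Cebpb": -1, "Junb": -1, "geneY": +1}

activated, repressed = classify_direct_targets(bound, direction)
```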

  6. Site-specific in situ amplification of the integrated polyomavirus genome: a case for a context-specific over-replication model of gene amplification.

    PubMed

    Syu, L J; Fluck, M M

    1997-08-01

    The fate of the genome of the polyoma (Py) tumor virus following integration in the chromosomes of transformed rat FR3T3 cells was re-examined. The viral sequences were integrated at a single transformant-specific chromosomal site in each of 22 transformants tested. In situ amplification of the viral sequences was observed in 24 of 34 transformants analyzed. Large T antigen, the unique viral function involved in initiating DNA replication from the viral origin, was essential for the amplification process. There was an absolute requirement for a reiteration of viral sequences and the extent of the reiteration affected the degree of amplification. The reiteration may be important for homologous recombination-mediated resolution of in situ amplified sequences. Among 11 transformants harboring a 1 to 2 kb repeat, the degree of amplification was transformant-specific and varied over a wide range. At the high end of the spectrum, the genome copy number increased 1300-fold at steady state, while at the low end, amplification was below twofold. Some aspect of the host chromatin at the site of integration that affected viral gene expression also directly or indirectly modulated the amplification. Use of high-resolution electrophoresis for the analysis of the integrated amplified sequences revealed a recurring novel pattern, consisting of a ladder with numerous bands separated by a constant distance approximately the size of the Py genome. We suggest that this pattern was generated by conversion of the amplified viral genomes to head to tail linear arrays with cell to cell variations in the number of genome repeats at single, transformant-specific, chromosomal sites. In light of the known "out of schedule" firing of the Py origin, we propose an "onion skin" structure intermediate and present a homologous recombination model for the conversion from onion skins to linear arrays. The relevance of the in situ amplification of the Py genome to cellular gene amplification is

  7. From integrative genomics to systems genetics in the rat to link genotypes to phenotypes

    PubMed Central

    Moreno-Moral, Aida

    2016-01-01

    Complementary to traditional gene mapping approaches used to identify the hereditary components of complex diseases, integrative genomics and systems genetics have emerged as powerful strategies to decipher the key genetic drivers of molecular pathways that underlie disease. Broadly speaking, integrative genomics aims to link cellular-level traits (such as mRNA expression) to the genome to identify their genetic determinants. With the characterization of several cellular-level traits within the same system, the integrative genomics approach evolved into a more comprehensive study design, called systems genetics, which aims to unravel the complex biological networks and pathways involved in disease, and in turn map their genetic control points. The first fully integrated systems genetics study was carried out in rats, and the results, which revealed conserved trans-acting genetic regulation of a pro-inflammatory network relevant to type 1 diabetes, were translated to humans. Many studies using different organisms subsequently stemmed from this example. The aim of this Review is to describe the most recent advances in the fields of integrative genomics and systems genetics applied in the rat, with a focus on studies of complex diseases ranging from inflammatory to cardiometabolic disorders. We aim to provide the genetics community with a comprehensive insight into how the systems genetics approach came to life, starting from the first integrative genomics strategies [such as expression quantitative trait loci (eQTLs) mapping] and concluding with the most sophisticated gene network-based analyses in multiple systems and disease states. Although not limited to studies that have been directly translated to humans, we will focus particularly on the successful investigations in the rat that have led to primary discoveries of genes and pathways relevant to human disease. PMID:27736746

  8. Cerebral White Matter Integrity Mediates Adult Age Differences in Cognitive Performance

    ERIC Educational Resources Information Center

    Madden, David J.; Spaniol, Julia; Costello, Matthew C.; Bucur, Barbara; White, Leonard E.; Cabeza, Roberto; Davis, Simon W.; Dennis, Nancy A.; Provenzale, James M.; Huettel, Scott A.

    2009-01-01

    Previous research has established that age-related decline occurs in measures of cerebral white matter integrity, but the role of this decline in age-related cognitive changes is not clear. To conclude that white matter integrity has a mediating (causal) contribution, it is necessary to demonstrate that statistical control of the white…

  9. Increasing the Efficiency of CRISPR/Cas9-mediated Precise Genome Editing of HSV-1 Virus in Human Cells

    PubMed Central

    Lin, Chaolong; Li, Huanhuan; Hao, Mengru; Xiong, Dan; Luo, Yong; Huang, Chenghao; Yuan, Quan; Zhang, Jun; Xia, Ningshao

    2016-01-01

    Genetically modified HSV-1 viruses serve as promising vectors for tumour therapy and vaccine development. The CRISPR/Cas9 system is one of the most powerful tools for precise gene editing of the genomes of organisms. However, whether the CRISPR/Cas9 system can precisely and efficiently make gene replacements in the genome of HSV-1 remains essentially unknown. Here, we reported CRISPR/Cas9-mediated editing of the HSV-1 genome in human cells, including the knockout and replacement of large genes. In established cells stably expressing CRISPR/Cas9, gRNA in coordination with Cas9 could direct a precise cleavage within a pre-defined target region, and foreign genes were successfully used to replace the target gene seamlessly by HDR-mediated gene replacement. Introducing the NHEJ inhibitor SCR7 to the CRISPR/Cas9 system greatly facilitated HDR-mediated gene replacement in the HSV-1 genome. We provided the first genetic evidence that two copies of the ICP0 gene in different locations on the same HSV-1 genome could be simultaneously modified with high efficiency and with no off-target modifications. We also developed a revolutionized isolation platform for desired recombinant viruses using single-cell sorting. Together, our work provides a significantly improved method for targeted editing of DNA viruses, which will facilitate the development of anti-cancer oncolytic viruses and vaccines. PMID:27713537

  10. Integration of physical and genetic maps in apple confirms whole-genome and segmental duplications in the apple genome

    PubMed Central

    Han, Yuepeng; Zheng, Danman; Vimolmangkang, Sornkanok; Khan, Muhammad A.; Beever, Jonathan E.; Korban, Schuyler S.

    2011-01-01

    A total of 355 simple sequence repeat (SSR) markers were developed, based on expressed sequence tag (EST) and bacterial artificial chromosome (BAC)-end sequence databases, and successfully used to construct an SSR-based genetic linkage map of the apple. The consensus linkage map spanned 1143 cM, with an average density of 2.5 cM per marker. Newly developed SSR markers along with 279 SSR markers previously published by the HiDRAS project were further used to integrate physical and genetic maps of the apple using a PCR-based BAC library screening approach. A total of 470 contigs were unambiguously anchored onto all 17 linkage groups of the apple genome, and 158 contigs contained two or more molecular markers. The genetically mapped contigs spanned ∼421 Mb in cumulative physical length, representing 60.0% of the genome. The sizes of anchored contigs ranged from 97 kb to 4.0 Mb, with an average of 995 kb. The average physical length of anchored contigs on each linkage group was ∼24.8 Mb, ranging from 17.0 Mb to 37.73 Mb. Using BAC DNA as templates, PCR screening of the BAC library amplified fragments of highly homologous sequences from homoeologous chromosomes. Upon integrating physical and genetic maps of the apple, the presence of not only homoeologous chromosome pairs, but also of multiple locus markers mapped to adjacent sites on the same chromosome was detected. These findings demonstrated the presence of both genome-wide and segmental duplications in the apple genome and provided further insights into the complex polyploid ancestral origin of the apple. PMID:21743103
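
    The figures quoted above can be cross-checked with simple arithmetic: if ∼421 Mb of anchored contigs represents 60.0% of the genome, the implied genome size is roughly 702 Mb, and 17 linkage groups averaging ∼24.8 Mb each give a total close to the same ∼421 Mb. The snippet below is only a consistency check on the abstract's rounded numbers; the variable names are our own.

```python
# Values as quoted in the abstract (approximate, rounded by the authors).
anchored_mb = 421.0          # cumulative physical length of mapped contigs
fraction_of_genome = 0.60    # share of the genome those contigs represent
linkage_groups = 17
mean_per_group_mb = 24.8     # average anchored length per linkage group

implied_genome_mb = anchored_mb / fraction_of_genome
total_from_groups_mb = linkage_groups * mean_per_group_mb

print(f"implied genome size: {implied_genome_mb:.0f} Mb")
print(f"sum over linkage groups: {total_from_groups_mb:.1f} Mb")
```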

  11. Production of α1,3-galactosyltransferase targeted pigs using transcription activator-like effector nuclease-mediated genome editing technology.

    PubMed

    Kang, Jung-Taek; Kwon, Dae-Kee; Park, A-Rum; Lee, Eun-Jin; Yun, Yun-Jin; Ji, Dal-Young; Lee, Kiho; Park, Kwang-Wook

    2016-03-01

    Recent developments in genome editing technology using meganucleases demonstrate an efficient method of producing gene edited pigs. In this study, we examined the effectiveness of the transcription activator-like effector nuclease (TALEN) system in generating specific mutations on the pig genome. Specific TALEN was designed to induce a double-strand break on exon 9 of the porcine α1,3-galactosyltransferase (GGTA1) gene as it is the main cause of hyperacute rejection after xenotransplantation. Human decay-accelerating factor (hDAF) gene, which can produce a complement inhibitor to protect cells from complement attack after xenotransplantation, was also integrated into the genome simultaneously. Plasmids coding for the TALEN pair and hDAF gene were transfected into porcine cells by electroporation to disrupt the porcine GGTA1 gene and express hDAF. The transfected cells were then sorted using a biotin-labeled IB4 lectin attached to magnetic beads to obtain GGTA1 deficient cells. As a result, we established GGTA1 knockout (KO) cell lines with biallelic modification (35.0%) and GGTA1 KO cell lines expressing hDAF (13.0%). When these cells were used for somatic cell nuclear transfer, we successfully obtained live GGTA1 KO pigs expressing hDAF. Our results demonstrate that TALEN-mediated genome editing is efficient and can be successfully used to generate gene edited pigs. PMID:27051344

  12. A genome-wide analysis of promoter-mediated phenotypic noise in Escherichia coli.

    PubMed

    Silander, Olin K; Nikolic, Nela; Zaslaver, Alon; Bren, Anat; Kikoin, Ilya; Alon, Uri; Ackermann, Martin

    2012-01-01

    Gene expression is subject to random perturbations that lead to fluctuations in the rate of protein production. As a consequence, for any given protein, genetically identical organisms living in a constant environment will contain different amounts of that particular protein, resulting in different phenotypes. This phenomenon is known as "phenotypic noise." In bacterial systems, previous studies have shown that, for specific genes, both transcriptional and translational processes affect phenotypic noise. Here, we focus on how the promoter regions of genes affect noise and ask whether levels of promoter-mediated noise are correlated with genes' functional attributes, using data for over 60% of all promoters in Escherichia coli. We find that essential genes and genes with a high degree of evolutionary conservation have promoters that confer low levels of noise. We also find that the level of noise cannot be attributed to the evolutionary time that different genes have spent in the genome of E. coli. In contrast to previous results in eukaryotes, we find no association between promoter-mediated noise and gene expression plasticity. These results are consistent with the hypothesis that, in bacteria, natural selection can act to reduce gene expression noise and that some of this noise is controlled through the sequence of the promoter region alone.

  13. A Genome-Wide Analysis of Promoter-Mediated Phenotypic Noise in Escherichia coli

    PubMed Central

    Silander, Olin K.; Nikolic, Nela; Zaslaver, Alon; Bren, Anat; Kikoin, Ilya; Alon, Uri; Ackermann, Martin

    2012-01-01

    Gene expression is subject to random perturbations that lead to fluctuations in the rate of protein production. As a consequence, for any given protein, genetically identical organisms living in a constant environment will contain different amounts of that particular protein, resulting in different phenotypes. This phenomenon is known as “phenotypic noise.” In bacterial systems, previous studies have shown that, for specific genes, both transcriptional and translational processes affect phenotypic noise. Here, we focus on how the promoter regions of genes affect noise and ask whether levels of promoter-mediated noise are correlated with genes' functional attributes, using data for over 60% of all promoters in Escherichia coli. We find that essential genes and genes with a high degree of evolutionary conservation have promoters that confer low levels of noise. We also find that the level of noise cannot be attributed to the evolutionary time that different genes have spent in the genome of E. coli. In contrast to previous results in eukaryotes, we find no association between promoter-mediated noise and gene expression plasticity. These results are consistent with the hypothesis that, in bacteria, natural selection can act to reduce gene expression noise and that some of this noise is controlled through the sequence of the promoter region alone. PMID:22275871

  14. Functional validation of tensin2 SH2-PTB domain by CRISPR/Cas9-mediated genome editing

    PubMed Central

    MARUSUGI, Kiyoma; NAKANO, Kenta; SASAKI, Hayato; KIMURA, Junpei; YANOBU-TAKANASHI, Rieko; OKAMURA, Tadashi; SASAKI, Nobuya

    2016-01-01

    Podocytes are terminally differentiated and highly specialized cells in the glomerulus, and they form a crucial component of the glomerular filtration barrier. The ICGN mouse is a model of glomerular dysfunction that shows gross morphological changes in the podocyte foot process, accompanied by proteinuria. Previously, we demonstrated that proteinuria in ICR-derived glomerulonephritis (ICGN) mice might be caused by a deletion mutation in the tensin2 (Tns2) gene (designated Tns2nph). To test whether this mutation causes the mutant phenotype, we created knockout (KO) mice carrying a Tns2 protein deletion in the C-terminal Src homology and phosphotyrosine binding (SH2-PTB) domains (designated Tns2ΔC) via CRISPR/Cas9-mediated genome editing. Tns2nph/Tns2ΔC compound heterozygotes and Tns2ΔC/Tns2ΔC homozygous KO mice displayed podocyte abnormalities and massive proteinuria similar to ICGN mice, indicating that these two mutations are allelic. Further, this result suggests that the SH2-PTB domain of Tns2 is required for podocyte integrity. Tns2 knockdown in a mouse podocyte cell line significantly enhanced actin stress fiber formation and cell migration. Thus, this study provides evidence that alteration of actin remodeling resulting from Tns2 deficiency causes morphological changes in podocytes and subsequent proteinuria. PMID:27246398

  15. Tc1-like Transposase Thm3 of Silver Carp (Hypophthalmichthys molitrix) Can Mediate Gene Transposition in the Genome of Blunt Snout Bream (Megalobrama amblycephala)

    PubMed Central

    Guo, Xiu-Ming; Zhang, Qian-Qian; Sun, Yi-Wen; Jiang, Xia-Yun; Zou, Shu-Ming

    2015-01-01

    Tc1-like transposons consist of an inverted repeat sequence flanking a transposase gene that exhibits similarity to the mobile DNA element, Tc1, of the nematode, Caenorhabditis elegans. They are widely distributed within vertebrate genomes including teleost fish; however, few active Tc1-like transposases have been discovered. In this study, 17 Tc1-like transposon sequences were isolated from 10 freshwater fish species belonging to the families Cyprinidae, Adrianichthyidae, Cichlidae, and Salmonidae. We conducted phylogenetic analyses of these sequences using previously isolated Tc1-like transposases and report that 16 of these elements comprise a new subfamily of Tc1-like transposons. In particular, we show that one transposon, Thm3 from silver carp (Hypophthalmichthys molitrix; Cyprinidae), can encode a 335-aa transposase with apparently intact domains, containing three to five copies in its genome. We then coinjected donor plasmids harboring 367 bp of the left end and 230 bp of the right end of the nonautonomous silver carp Thm1 cis-element along with capped Thm3 transposase RNA into the embryos of blunt snout bream (Megalobrama amblycephala; one- to two-cell embryos). This experiment revealed that the average integration rate could reach 50.6% in adult fish. Within the blunt snout bream genome, the TA dinucleotide direct repeat, which is the signature of the Tc1-like family of transposons, was created adjacent to both ends of Thm1 at the integration sites. Our results indicate that the silver carp Thm3 transposase can mediate gene insertion by transposition within the blunt snout bream genome, and that this occurs with a TA position preference. PMID:26438298

  17. Integration of complete chloroplast genome sequences with small amplicon datasets improves phylogenetic resolution in Acacia.

    PubMed

    Williams, Anna V; Miller, Joseph T; Small, Ian; Nevill, Paul G; Boykin, Laura M

    2016-03-01

    Combining whole genome data with previously obtained amplicon sequences has the potential to increase the resolution of phylogenetic analyses, particularly at low taxonomic levels or where recent divergence, rapid speciation or slow genome evolution has resulted in limited sequence variation. However, the integration of these types of data for large scale phylogenetic studies has rarely been investigated. Here we conduct a phylogenetic analysis of the whole chloroplast genome and two nuclear ribosomal loci for 65 Acacia species from across the most recent Acacia phylogeny. We then combine this data with previously generated amplicon sequences (four chloroplast loci and two nuclear ribosomal loci) for 508 Acacia species. We use several phylogenetic methods, including maximum likelihood bootstrapping (with and without constraint) and ExaBayes, in order to determine the success of combining a dataset of 4000 bp with one of 189,000 bp. The results of our study indicate that the inclusion of whole genome data gave a far better resolved and well supported representation of the phylogenetic relationships within Acacia than using only amplicon sequences, with the greatest support observed when using a whole genome phylogeny as a constraint on the amplicon sequences. Our study therefore provides methods for optimal integration of genomic and amplicon sequences. PMID:26702955

  18. WIT: integrated system for high-throughput genome sequence analysis and metabolic reconstruction.

    SciTech Connect

    Overbeek, R.; Larsen, N.; Pusch, G. D.; D'Souza, M.; Selkov, E., Jr.; Kyrpides, N.; Fonstein, M.; Maltsev, N.; Selkov, S.; Mathematics and Computer Science; Integrated Genomics, Inc.

    2000-01-01

    The WIT (What Is There) (http://wit.mcs.anl.gov/WIT2/) system has been designed to support comparative analysis of sequenced genomes and to generate metabolic reconstructions based on chromosomal sequences and metabolic modules from the EMP/MPW family of databases. This system contains data derived from about 40 completed or nearly completed genomes. Sequence homologies, various ORF-clustering algorithms, relative gene positions on the chromosome and placement of gene products in metabolic pathways (metabolic reconstruction) can be used for the assignment of gene functions and for development of overviews of genomes within WIT. The integration of a large number of phylogenetically diverse genomes in WIT facilitates the understanding of the physiology of different organisms.

  19. [Prolonging the vase life of carnation "Mabel" through integrating repeated ACC oxidase genes into its genome].

    PubMed

    Yu, Yi-Xun; Bao, Man-Zhu

    2004-10-01

    Carnation (Dianthus caryophyllus L.) is one of the most important cut flowers. The carnation cultivar "Mabel" was transformed, via Agrobacterium tumefaciens, with a direct repeat gene of ACC oxidase, the key enzyme in ethylene synthesis, driven by the CaMV35S promoter. The hygromycin phosphotransferase (HPT) gene was used as the selection marker. Leaf explants were pre-cultured on shoot-inducing medium for 2 d, then immersed in Agrobacterium suspension for 8-12 min. Co-cultivation was carried out on the medium (MS+BA 1.0 mg/L+NAA 0.3 mg/L +Acetosyringone 100 micromol/L, pH 5.8-6.0) for 3 d. Transformants were then obtained by transferring explants to selection medium supplemented with 5 mg/L hygromycin (Hyg) and 400 mg/L cefotaxime (Cef). Southern blotting showed that the foreign gene was integrated into the carnation genome and that 3 transgenic lines (T257, T299 and T273) were obtained. Addition of acetosyringone and the duration of co-culture were the main factors that influenced transformation frequency. After being transplanted to soil, transgenic plants grew normally in the greenhouse. Ethylene production of cut flowers of transgenic line T257 was 95% lower than that of the control and that of line T299 was 90% lower, while that of line T273 did not differ significantly from the control. Vase life of transgenic line T257 was 5 d longer than that of the control line at 25 degrees C.

  1. Integrating Genomic Data Sets for Knowledge Discovery: An Informed Approach to Management of Captive Endangered Species.

    PubMed

    Irizarry, Kristopher J L; Bryant, Doug; Kalish, Jordan; Eng, Curtis; Schmidt, Peggy L; Barrett, Gini; Barr, Margaret C

    2016-01-01

    Many endangered captive populations exhibit reduced genetic diversity resulting in health issues that impact reproductive fitness and quality of life. Numerous cost effective genomic sequencing and genotyping technologies provide unparalleled opportunity for incorporating genomics knowledge in management of endangered species. Genomic data, such as sequence data, transcriptome data, and genotyping data, provide critical information about a captive population that, when leveraged correctly, can be utilized to maximize population genetic variation while simultaneously reducing unintended introduction or propagation of undesirable phenotypes. Current approaches aimed at managing endangered captive populations utilize species survival plans (SSPs) that rely upon mean kinship estimates to maximize genetic diversity while simultaneously avoiding artificial selection in the breeding program. However, as genomic resources increase for each endangered species, the potential knowledge available for management also increases. Unlike model organisms in which considerable scientific resources are used to experimentally validate genotype-phenotype relationships, endangered species typically lack the necessary sample sizes and economic resources required for such studies. Even so, in the absence of experimentally verified genetic discoveries, genomics data still provides value. In fact, bioinformatics and comparative genomics approaches offer mechanisms for translating these raw genomics data sets into integrated knowledge that enable an informed approach to endangered species management. PMID:27376076

  2. Integrating Genomic Data Sets for Knowledge Discovery: An Informed Approach to Management of Captive Endangered Species

    PubMed Central

    Irizarry, Kristopher J. L.; Bryant, Doug; Kalish, Jordan; Eng, Curtis; Schmidt, Peggy L.; Barrett, Gini; Barr, Margaret C.

    2016-01-01

    Many endangered captive populations exhibit reduced genetic diversity resulting in health issues that impact reproductive fitness and quality of life. Numerous cost effective genomic sequencing and genotyping technologies provide unparalleled opportunity for incorporating genomics knowledge in management of endangered species. Genomic data, such as sequence data, transcriptome data, and genotyping data, provide critical information about a captive population that, when leveraged correctly, can be utilized to maximize population genetic variation while simultaneously reducing unintended introduction or propagation of undesirable phenotypes. Current approaches aimed at managing endangered captive populations utilize species survival plans (SSPs) that rely upon mean kinship estimates to maximize genetic diversity while simultaneously avoiding artificial selection in the breeding program. However, as genomic resources increase for each endangered species, the potential knowledge available for management also increases. Unlike model organisms in which considerable scientific resources are used to experimentally validate genotype-phenotype relationships, endangered species typically lack the necessary sample sizes and economic resources required for such studies. Even so, in the absence of experimentally verified genetic discoveries, genomics data still provides value. In fact, bioinformatics and comparative genomics approaches offer mechanisms for translating these raw genomics data sets into integrated knowledge that enable an informed approach to endangered species management. PMID:27376076
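    The mean kinship estimates mentioned in this abstract can be illustrated with a small pedigree calculation. The sketch below is not taken from the article: the function name `mean_kinship`, the toy pedigree, and the convention of including each individual's self-kinship in its mean are all assumptions made for illustration.

```python
from functools import lru_cache


def mean_kinship(pedigree):
    """Return {individual: mean kinship} for a small pedigree.

    pedigree maps id -> (sire, dam) and must list parents before
    offspring; None marks an unknown (founder) parent.
    Assumption: the mean includes the individual's kinship with itself.
    """
    order = {ind: k for k, ind in enumerate(pedigree)}

    @lru_cache(maxsize=None)
    def kin(a, b):
        # Unknown parents are treated as unrelated to everyone.
        if a is None or b is None:
            return 0.0
        if a == b:
            sire, dam = pedigree[a]
            return 0.5 * (1.0 + kin(sire, dam))
        # Always recurse through the younger individual's parents.
        if order[a] < order[b]:
            a, b = b, a
        sire, dam = pedigree[a]
        return 0.5 * (kin(sire, b) + kin(dam, b))

    ids = list(pedigree)
    return {i: sum(kin(i, j) for j in ids) / len(ids) for i in ids}


# Hypothetical pedigree: two unrelated founders and two full siblings.
toy = {
    "A": (None, None),
    "B": (None, None),
    "C": ("A", "B"),
    "D": ("A", "B"),
}
mk = mean_kinship(toy)
```

    Under this convention, individuals with the lowest mean kinship carry the least-represented founder lineages and would usually be prioritized for breeding.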

  3. Improved bacteriophage genome data is necessary for integrating viral and bacterial ecology.

    PubMed

    Bibby, Kyle

    2014-02-01

    The recent rise in "omics"-enabled approaches has led to improved understanding in many areas of microbial ecology. However, despite the important role that viruses play in the broader microbial ecology context, viral ecology remains largely unintegrated into high-throughput microbial ecology studies. A fundamental hindrance to the integration of viral ecology into omics-enabled microbial ecology studies is the lack of suitable reference bacteriophage genomes in reference databases: currently, only 0.001% of bacteriophage diversity is represented in genome sequence databases. This commentary serves to highlight this issue and to promote bacteriophage genome sequencing as a valuable scientific undertaking to both better understand bacteriophage diversity and move towards a more holistic view of microbial ecology.

  4. Barriers and potential solutions for Critical Zone data integration between environmental genomics and the geosciences

    NASA Astrophysics Data System (ADS)

    Aronson, E. L.; Meyer, F.; Packman, A. I.; Mayorga, E.

    2015-12-01

    The Earth's permeable near-surface layer from bedrock to canopy is referred to as the Critical Zone (CZ). Integration of bio- and geoscience data is critical for understanding physical, biological and chemical interactions in the CZ. Genomic and meta-genomic scientists study organisms both in laboratory settings and in the environment, in order to understand the interactions of organisms with the environment. Geoscientists are using environmental data to describe and model dynamics of physical and chemical properties. Yet, there is no agreed-upon method for integrating genomic and environmental data to address interactions of living and non-living components of the CZ. Standards for data interchange are being developed in the geosciences and genomic sciences, via standards organizations such as the Open Geospatial Consortium (OGC), as well as by research communities in biogeochemistry, hydrology, climatology, and other fields. These efforts run in parallel to, but typically not in coordination with, the standards that the Genomic Standards Consortium (GSC) is developing for genomics. In addition, efforts are being made to allow for intercompatibility of these CZ data with data generated by NEON, Inc. The interoperability of these types of data is limited with current software and cyberinfrastructure. A group of CZ geoscientists, environmental genomic scientists and cyberinfrastructure scientists are coming together to develop a set of common data collection and integration methods and sets of common standards. The data generated by this effort across multiple CZ sites (including the US CZ Observatories, or CZOs) around the world, along with NEON facility data, will be used to test EarthCube (an NSF initiative to develop cyberinfrastructure for the geosciences) cyberinfrastructure, with the goal of bridging this gap in standards and interoperability. Potential solutions to these issues of interoperability will be presented, and a way forward will be described.

  5. Filling the knowledge gap: Integrating quantitative genetics and genomics in graduate education and outreach

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genomics revolution provides vital tools to address global food security. Yet to be incorporated into livestock breeding, molecular techniques need to be integrated into a quantitative genetics framework. Within the U.S., with shrinking faculty numbers with the requisite skills, the capacity to ...

  6. Integrated and translational genomics for analysis of complex traits in crops

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We report here on integration of sequencing and genotype data from natural variation (by whole genome resequencing [wgs] or genotype by sequencing [gbs]), transcriptome (RNA-seq) and mutant analysis (also by wgs) with the goal of translating gems from these resources into useable DNA markers in the ...

  7. Microhomology-mediated end-joining-dependent integration of donor DNA in cells and animals using TALENs and CRISPR/Cas9.

    PubMed

    Nakade, Shota; Tsubota, Takuya; Sakane, Yuto; Kume, Satoshi; Sakamoto, Naoaki; Obara, Masanobu; Daimon, Takaaki; Sezutsu, Hideki; Yamamoto, Takashi; Sakuma, Tetsushi; Suzuki, Ken-ichi T

    2014-01-01

    Genome engineering using programmable nucleases enables homologous recombination (HR)-mediated gene knock-in. However, the labour required to construct targeting vectors containing homology arms and the difficulty of inducing HR in some cell types and organisms represent technical hurdles for the application of HR-mediated knock-in technology. Here, we introduce an alternative strategy for gene knock-in using transcription activator-like effector nucleases (TALENs) and clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated 9 (Cas9) mediated by microhomology-mediated end-joining, termed the PITCh (Precise Integration into Target Chromosome) system. TALEN-mediated PITCh, termed TAL-PITCh, enables efficient integration of exogenous donor DNA in human cells and animals, including silkworms and frogs. We further demonstrate that CRISPR/Cas9-mediated PITCh, termed CRIS-PITCh, can be applied in human cells without carrying the plasmid backbone sequence. Thus, our PITCh-ing strategies will be useful for a variety of applications, not only in cultured cells, but also in various organisms, including invertebrates and vertebrates. PMID:25410609

  8. Unexpected inheritance: multiple integrations of ancient bornavirus and ebolavirus/marburgvirus sequences in vertebrate genomes.

    PubMed

    Belyi, Vladimir A; Levine, Arnold J; Skalka, Anna Marie

    2010-01-01

    Vertebrate genomes contain numerous copies of retroviral sequences, acquired over the course of evolution. Until recently they were thought to be the only type of RNA viruses to be so represented, because integration of a DNA copy of their genome is required for their replication. In this study, an extensive sequence comparison was conducted in which 5,666 viral genes from all known non-retroviral families with single-stranded RNA genomes were matched against the germline genomes of 48 vertebrate species, to determine if such viruses could also contribute to the vertebrate genetic heritage. In 19 of the tested vertebrate species, we discovered as many as 80 high-confidence examples of genomic DNA sequences that appear to be derived, as long ago as 40 million years, from ancestral members of 4 currently circulating virus families with single strand RNA genomes. Surprisingly, almost all of the sequences are related to only two families in the Order Mononegavirales: the Bornaviruses and the Filoviruses, which cause lethal neurological disease and hemorrhagic fevers, respectively. Based on signature landmarks some, and perhaps all, of the endogenous virus-like DNA sequences appear to be LINE element-facilitated integrations derived from viral mRNAs. The integrations represent genes that encode viral nucleocapsid, RNA-dependent-RNA-polymerase, matrix and, possibly, glycoproteins. Integrations are generally limited to one or very few copies of a related viral gene per species, suggesting that once the initial germline integration was obtained (or selected), later integrations failed or provided little advantage to the host. The conservation of relatively long open reading frames for several of the endogenous sequences, the virus-like protein regions represented, and a potential correlation between their presence and a species' resistance to the diseases caused by these pathogens, are consistent with the notion that their products provide some important biological

  9. Non-Random Integration of the HPV Genome in Cervical Cancer

    PubMed Central

    Schmitz, Martina; Driesch, Corina; Jansen, Lars; Runnebaum, Ingo B.; Dürst, Matthias

    2012-01-01

    HPV DNA integration into the host genome is a characteristic but not an exclusive step during cervical carcinogenesis. It is still a matter of debate whether viral integration contributes to the transformation process beyond ensuring the constitutive expression of the viral oncogenes. There is mounting evidence for a non-random distribution of integration loci and the direct involvement of cellular cancer-related genes. In this study we addressed this topic by extending the existing data set by an additional 47 HPV16- and HPV18-positive cervical carcinomas. We provide supportive evidence for previously defined integration hotspots and have revealed another cluster of integration sites within the cytogenetic band 3q28. Moreover, numerous microRNAs (miRNAs) are located in the vicinity of these hotspots and may be influenced by the integrated HPV DNA. By compiling our data and published reports, 9 genes could be identified which were affected by HPV integration at least twice in independent tumors. In some tumors the viral-cellular fusion transcripts were even identical with respect to the viral donor and cellular acceptor sites used. However, the exact integration sites are likely to differ since none of the integration sites analysed thus far have shown more than a few nucleotides of homology between viral and host sequences. Therefore, DNA recombination involving large stretches of homology at the integration site can be ruled out. It is however intriguing that by sequence alignment several regions of the HPV16 genome were found to have highly homologous stretches of up to 50 nucleotides to the aforementioned genes and the integration hotspots. One common region of homology with cellular sequences lies between the viral genes E5 and L2 (nucleotide positions 4100 to 4240). We speculate that this and other regions of homology are involved in the integration process. Our observations suggest that targeted disruption, possibly also of critical cellular genes, by HPV

  10. GMATA: An Integrated Software Package for Genome-Scale SSR Mining, Marker Development and Viewing.

    PubMed

    Wang, Xuewen; Wang, Le

    2016-01-01

    Simple sequence repeats (SSRs), also referred to as microsatellites, are highly variable tandem DNAs that are widely used as genetic markers. The increasing availability of whole-genome and transcript sequences provides information resources for SSR marker development. However, software is required to efficiently identify and display SSR information along with other gene features at a genome scale. We developed a novel software package, the Genome-wide Microsatellite Analyzing Tool Package (GMATA), which integrates SSR mining, statistical analysis and plotting, marker design, polymorphism screening and marker transferability, and enables simultaneous display of SSR markers with other genome features. GMATA applies novel strategies for SSR analysis and primer design in large genomes, which allow GMATA to perform faster calculations and provide more accurate results than existing tools. Our package is also capable of processing DNA sequences of any size on a standard computer. GMATA is user friendly, requires only mouse clicks or typed input on the command line, and is executable on multiple computing platforms. We demonstrated the application of GMATA in plant genomes and revealed a novel distribution pattern of SSRs in 15 grass genomes. The most abundant motifs are dimer GA/TC, the A/T monomer and the GCG/CGC trimer, rather than the rich G/C content in DNA sequence. We also revealed that SSR count is linearly related to chromosome length in fully assembled grass genomes. GMATA represents a powerful application tool that facilitates genomic sequence analyses. GMATA is freely available at http://sourceforge.net/projects/gmata/?source=navbar. PMID:27679641
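    As a rough illustration of what SSR mining involves, the sketch below scans a sequence for perfect tandem repeats with a regular expression. It is not GMATA's algorithm: the function name `find_ssrs`, the motif-length range, and the single minimum-repeat threshold are assumptions chosen for brevity.

```python
import re


def find_ssrs(seq, min_motif=1, max_motif=3, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Assumption: one repeat threshold is applied to every motif length;
    real tools usually let the threshold vary with motif length.
    """
    seq = seq.upper()
    # The lazy group captures the shortest motif that works; the
    # backreference then requires at least (min_repeats - 1) more copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


# Hypothetical test sequence: six GA repeats flanked by short spacers.
example = find_ssrs("TT" + "GA" * 6 + "CC")
```

    Counting hits per chromosome with a function like this is the kind of tally behind the abstract's observation that SSR count scales with chromosome length.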

  11. Integrating sequencing technologies in personal genomics: optimal low cost reconstruction of structural variants.

    PubMed

    Du, Jiang; Bjornson, Robert D; Zhang, Zhengdong D; Kong, Yong; Snyder, Michael; Gerstein, Mark B

    2009-07-01

    The goal of human genome re-sequencing is obtaining an accurate assembly of an individual's genome. Recently, there has been great excitement in the development of many technologies for this (e.g. medium and short read sequencing from companies such as 454 and SOLiD, and high-density oligo-arrays from Affymetrix and NimbleGen), with even more expected to appear. The costs and sensitivities of these technologies differ considerably from each other. As an important goal of personal genomics is to reduce the cost of re-sequencing to an affordable point, it is worthwhile to consider optimally integrating technologies. Here, we build a simulation toolbox that will help us optimally combine different technologies for genome re-sequencing, especially in reconstructing large structural variants (SVs). SV reconstruction is considered the most challenging step in human genome re-sequencing. (It is sometimes even harder than de novo assembly of small genomes because of the duplications and repetitive sequences in the human genome.) To this end, we formulate canonical problems that are representative of issues in reconstruction and are of small enough scale to be computationally tractable and simulatable. Using semi-realistic simulations, we show how we can combine different technologies to optimally solve the assembly at low cost. With mappability maps, our simulations efficiently handle the inhomogeneous repeat-containing structure of the human genome and the computational complexity of practical assembly algorithms. They quantitatively show how combining different read lengths is more cost-effective than using one length, how an optimal mixed sequencing strategy for reconstructing large novel SVs usually also gives accurate detection of SNPs/indels, how paired-end reads can improve reconstruction efficiency, and how adding in arrays is more efficient than just sequencing for disentangling some complex SVs. Our strategy should facilitate the sequencing of human genomes at

  12. GMATA: An Integrated Software Package for Genome-Scale SSR Mining, Marker Development and Viewing

    PubMed Central

    Wang, Xuewen; Wang, Le

    2016-01-01

    Simple sequence repeats (SSRs), also referred to as microsatellites, are highly variable tandem DNAs that are widely used as genetic markers. The increasing availability of whole-genome and transcript sequences provides information resources for SSR marker development. However, software is required to efficiently identify and display SSR information along with other gene features at a genome scale. We developed a novel software package, the Genome-wide Microsatellite Analyzing Tool Package (GMATA), which integrates SSR mining, statistical analysis and plotting, marker design, polymorphism screening and marker transferability, and enables simultaneous display of SSR markers with other genome features. GMATA applies novel strategies for SSR analysis and primer design in large genomes, which allow GMATA to perform faster calculations and provide more accurate results than existing tools. Our package is also capable of processing DNA sequences of any size on a standard computer. GMATA is user friendly, requires only mouse clicks or typed input on the command line, and is executable on multiple computing platforms. We demonstrated the application of GMATA in plant genomes and revealed a novel distribution pattern of SSRs in 15 grass genomes. The most abundant motifs are dimer GA/TC, the A/T monomer and the GCG/CGC trimer, rather than the rich G/C content in DNA sequence. We also revealed that SSR count is linearly related to chromosome length in fully assembled grass genomes. GMATA represents a powerful application tool that facilitates genomic sequence analyses. GMATA is freely available at http://sourceforge.net/projects/gmata/?source=navbar. PMID:27679641

  13. The REST remodeling complex protects genomic integrity during embryonic neurogenesis

    PubMed Central

    Nechiporuk, Tamilla; McGann, James; Mullendorff, Karin; Hsieh, Jenny; Wurst, Wolfgang; Floss, Thomas; Mandel, Gail

    2016-01-01

    The timely transition from neural progenitor to post-mitotic neuron requires down-regulation and loss of the neuronal transcriptional repressor, REST. Here, we have used mice containing a gene trap in the Rest gene, eliminating transcription from all coding exons, to remove REST prematurely from neural progenitors. We find that catastrophic DNA damage occurs during S-phase of the cell cycle, with long-term consequences including abnormal chromosome separation, apoptosis, and smaller brains. Persistent effects are evident by latent appearance of proneural glioblastoma in adult mice deleted additionally for the tumor suppressor p53 protein (p53). A previous line of mice deleted for REST in progenitors by conventional gene targeting does not exhibit these phenotypes, likely due to a remaining C-terminal peptide that still binds chromatin and recruits co-repressors. Our results suggest that REST-mediated chromatin remodeling is required in neural progenitors for proper S-phase dynamics, as part of its well-established role in repressing neuronal genes until terminal differentiation. DOI: http://dx.doi.org/10.7554/eLife.09584.001 PMID:26745185

  14. The REST remodeling complex protects genomic integrity during embryonic neurogenesis.

    PubMed

    Nechiporuk, Tamilla; McGann, James; Mullendorff, Karin; Hsieh, Jenny; Wurst, Wolfgang; Floss, Thomas; Mandel, Gail

    2016-01-01

    The timely transition from neural progenitor to post-mitotic neuron requires down-regulation and loss of the neuronal transcriptional repressor, REST. Here, we have used mice containing a gene trap in the Rest gene, eliminating transcription from all coding exons, to remove REST prematurely from neural progenitors. We find that catastrophic DNA damage occurs during S-phase of the cell cycle, with long-term consequences including abnormal chromosome separation, apoptosis, and smaller brains. Persistent effects are evident by latent appearance of proneural glioblastoma in adult mice deleted additionally for the tumor suppressor p53 protein (p53). A previous line of mice deleted for REST in progenitors by conventional gene targeting does not exhibit these phenotypes, likely due to a remaining C-terminal peptide that still binds chromatin and recruits co-repressors. Our results suggest that REST-mediated chromatin remodeling is required in neural progenitors for proper S-phase dynamics, as part of its well-established role in repressing neuronal genes until terminal differentiation.

  15. The nucleolus—guardian of cellular homeostasis and genome integrity.

    PubMed

    Grummt, Ingrid

    2013-12-01

    All organisms sense and respond to conditions that stress their homeostasis by downregulating the synthesis of rRNA and ribosome biogenesis, thus designating the nucleolus as the central hub in coordinating the cellular stress response. One of the most intriguing roles of the nucleolus, long regarded as a mere ribosome-producing factory, is its participation in monitoring cellular stress signals and transmitting them to the RNA polymerase I (Pol I) transcription machinery. As rRNA synthesis is a most energy-consuming process, switching off transcription of rRNA genes is an effective way of saving the energy required to maintain cellular homeostasis during acute stress. The Pol I transcription machinery is the key convergence point that collects and integrates a vast array of information from cellular signaling cascades to regulate ribosome production which, in turn, guides cell growth and proliferation. This review focuses on the mechanisms that link cell physiology to rDNA silencing, a prerequisite for nucleolar integrity and cell survival.

  16. Neuroscience Data Integration through Mediation: An (F)BIRN Case Study

    PubMed Central

    Ashish, Naveen; Ambite, José Luis; Muslea, Maria; Turner, Jessica A.

    2010-01-01

    We describe an application of the BIRN mediator to the integration of neuroscience experimental data sources. The BIRN mediator is a general purpose solution to the problem of providing integrated, semantically-consistent access to biomedical data from multiple, distributed, heterogeneous data sources. The system follows the mediation approach, where the data remains at the sources, providers maintain control of the data, and the integration system retrieves data from the sources in real-time in response to client queries. Our aim with this paper is to illustrate how domain-specific data integration applications can be developed quickly and in a principled way by using our general mediation technology. We describe in detail the integration of two leading, but radically different, experimental neuroscience sources, namely, the human imaging database, a relational database, and the eXtensible neuroimaging archive toolkit, an XML web services system. We discuss the steps, sources of complexity, effort, and time required to build such applications, as well as outline directions of ongoing and future research on biomedical data integration. PMID:21228907
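    The mediation approach described here (data stays at the sources, and the integration layer queries them at request time and returns results in one common schema) can be sketched as follows. This is not the BIRN mediator's API: every class name, field name, and the toy data are hypothetical, with an in-memory SQLite table and an XML string standing in for the relational and XML sources.

```python
import sqlite3
import xml.etree.ElementTree as ET


class RelationalSource:
    """Stand-in for a relational imaging database (hypothetical schema)."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE scans (subject TEXT, modality TEXT)")
        self.db.executemany(
            "INSERT INTO scans VALUES (?, ?)",
            [("s01", "fMRI"), ("s02", "MRI")],
        )

    def query(self, modality):
        rows = self.db.execute(
            "SELECT subject, modality FROM scans WHERE modality = ?",
            (modality,),
        )
        # Wrapper step: map source columns onto the shared schema.
        return [{"subject_id": s, "modality": m} for s, m in rows]


class XmlSource:
    """Stand-in for an XML-based archive (hypothetical document)."""

    DOC = (
        "<sessions>"
        "<session subj='x9' type='fMRI'/>"
        "<session subj='x7' type='PET'/>"
        "</sessions>"
    )

    def query(self, modality):
        root = ET.fromstring(self.DOC)
        return [
            {"subject_id": e.get("subj"), "modality": e.get("type")}
            for e in root.iter("session")
            if e.get("type") == modality
        ]


class Mediator:
    """Forwards each client query to every source; stores no data itself."""

    def __init__(self, sources):
        self.sources = sources

    def query(self, modality):
        results = []
        for name, source in self.sources.items():
            for record in source.query(modality):
                results.append({**record, "source": name})
        return results


mediator = Mediator({"rel": RelationalSource(), "xml": XmlSource()})
fmri_records = mediator.query("fMRI")
```

    Each source keeps control of its own storage, and only the per-source wrapper needs to know how local fields map to the shared ones.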

  17. iBAG: integrative Bayesian analysis of high-dimensional multiplatform genomics data

    PubMed Central

    Wang, Wenting; Baladandayuthapani, Veerabhadran; Morris, Jeffrey S.; Broom, Bradley M.; Manyam, Ganiraju; Do, Kim-Anh

    2013-01-01

    Motivation: Analyzing data from multi-platform genomics experiments combined with patients’ clinical outcomes helps us understand the complex biological processes that characterize a disease, as well as how these processes relate to the development of the disease. Current data integration approaches are limited in that they do not consider the fundamental biological relationships that exist among the data obtained from different platforms. Statistical Model: We propose an integrative Bayesian analysis of genomics data (iBAG) framework for identifying important genes/biomarkers that are associated with clinical outcome. This framework uses hierarchical modeling to combine the data obtained from multiple platforms into one model. Results: We assess the performance of our methods using several synthetic and real examples. Simulations show our integrative methods to have higher power to detect disease-related genes than non-integrative methods. Using the Cancer Genome Atlas glioblastoma dataset, we apply the iBAG model to integrate gene expression and methylation data to study their associations with patient survival. Our proposed method discovers multiple methylation-regulated genes that are related to patient survival, most of which have important biological functions in other diseases but have not been previously studied in glioblastoma. Availability: http://odin.mdacc.tmc.edu/~vbaladan/. Contact: veera@mdanderson.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23142963

  18. Component-based mediation services for the integration of medical applications.

    PubMed

    Xu, Y; Sauquet, D; Degoulet, P; Jaulent, M-C

    2003-03-01

    Allowing exchange of information and cooperation among network-wide distributed and heterogeneous applications is a major need of current health-care information systems. The European project SynEx aims at developing an integration platform for both new and legacy applications on each partner's site. In this project, we developed mediation services based on generic and reusable software components that facilitate the construction of an integration platform and ease communication and meaningful transformation among distributed and heterogeneous applications. The main component of the mediation services is named Pilot, which serves as an intelligent broker. It uses a multi-agent service model, allowing the integration platform to be multi-server. It transforms a client request into a valid high-level service on the platform. Each service is broken up into several elementary steps by the Pilot. For each step, the Pilot uses an agent to carry out the operation configured by the step. At runtime, the Pilot synchronizes the execution of the different steps. To ease communication and interaction with the heterogeneous systems, an agent can integrate a Mediator. The Mediators are the communication and interpretation tools within the mediation services. We have developed a generic model that can be specialized for creating specific mediators for the different use cases. The mediator model uses two interfaces to connect the mediator with the two systems that need to communicate. Each interface handles three aspects through three managers (the Communication Manager, the Syntax Manager and the Semantic Manager). Some ready-to-use specializations have been developed for well-defined cases, which can reduce the development effort. Once a manager is specialized, it can be used in different combinations with other managers to resolve different problems. The meaningful transformation is ensured on a semantic level in each mediator through the Semantic Model

  19. Detecting DNA Double-Stranded Breaks in Mammalian Genomes by Linear Amplification-mediated High-Throughput Genome-wide Translocation Sequencing (LAM-HTGTS)

    PubMed Central

    Hu, Jiazhi; Meyers, Robin M.; Dong, Junchao; Panchakshari, Rohit A.; Alt, Frederick W.; Frock, Richard L.

    2016-01-01

    Unbiased, high-throughput assays to detect and quantify DNA double-stranded breaks (DSBs) genome-wide in mammalian cells will facilitate basic studies of mechanisms that generate and repair endogenous DSBs. They will also enable more applied studies, such as evaluating on- and off-target activities of engineered nucleases. Here we describe a linear amplification-mediated high-throughput genome-wide translocation sequencing (LAM-HTGTS) method for detecting genome-wide “prey” DSBs via their translocation in cultured mammalian cells to a fixed “bait” DSB. Bait-prey junctions are cloned directly from isolated genomic DNA using LAM-PCR and unidirectionally ligated to bridge adapters; subsequent PCR steps amplify the single-stranded DNA junction library in preparation for Illumina paired-end MiSeq sequencing. A custom bioinformatic pipeline identifies prey sequences that contribute to junctions and maps them across the genome. LAM-HTGTS differs from related approaches because it detects a wide range of broken end structures with nucleotide-level resolution. Familiarity with nucleic acid methods and next-generation sequencing analysis is necessary for library generation and data interpretation. LAM-HTGTS assays are sensitive, reproducible, relatively inexpensive, scalable, and straightforward to implement with a turnaround time of less than one week. PMID:27031497

  20. New Insights into the Classification and Integration Specificity of Streptococcus Integrative Conjugative Elements through Extensive Genome Exploration

    PubMed Central

    Ambroset, Chloé; Coluzzi, Charles; Guédon, Gérard; Devignes, Marie-Dominique; Loux, Valentin; Lacroix, Thomas; Payot, Sophie; Leblond-Bourget, Nathalie

    2016-01-01

    Recent genome analyses suggest that integrative and conjugative elements (ICEs) are widespread in bacterial genomes and therefore play an essential role in horizontal transfer. However, only a few of these elements are precisely characterized and correctly delineated within sequenced bacterial genomes. Even though previous analysis showed the presence of ICEs in some species of Streptococci, the global prevalence and diversity of ICEs was not analyzed in this genus. In this study, we searched for ICEs in the completely sequenced genomes of 124 strains belonging to 27 streptococcal species. These exhaustive analyses revealed 105 putative ICEs and 26 slightly decayed elements whose limits were assessed and whose insertion site was identified. These ICEs were grouped in seven distinct unrelated or distantly related families, according to their conjugation modules. Integration of these streptococcal ICEs is catalyzed either by a site-specific tyrosine integrase, a low-specificity tyrosine integrase, a site-specific single serine integrase, a triplet of site-specific serine integrases or a DDE transposase. Analysis of their integration site led to the detection of 18 target-genes for streptococcal ICE insertion including eight that had not been identified previously (ftsK, guaA, lysS, mutT, rpmG, rpsI, traG, and ebfC). It also suggests that all specificities have evolved to minimize the impact of the insertion on the host. This overall analysis of streptococcal ICEs emphasizes their prevalence and diversity and demonstrates that exchanges or acquisitions of conjugation and recombination modules are frequent. PMID:26779141

  1. Brd4 Is Required for E2-Mediated Transcriptional Activation but Not Genome Partitioning of All Papillomaviruses

    PubMed Central

    McPhillips, M. G.; Oliveira, J. G.; Spindler, J. E.; Mitra, R.; McBride, A. A.

    2006-01-01

    Bromodomain protein 4 (Brd4) has been identified as the cellular binding target through which the E2 protein of bovine papillomavirus type 1 links the viral genome to mitotic chromosomes. This tethering ensures retention and efficient partitioning of genomes to daughter cells following cell division. E2 is also a regulator of viral gene expression and a replication factor, in association with the viral E1 protein. In this study, we show that E2 proteins from a wide range of papillomaviruses interact with Brd4, albeit with variations in efficiency. Moreover, disruption of the E2-Brd4 interaction abrogates the transactivation function of E2, indicating that Brd4 is required for E2-mediated transactivation of all papillomaviruses. However, the interaction of E2 and Brd4 is not required for genome partitioning of all papillomaviruses since a number of papillomavirus E2 proteins associate with mitotic chromosomes independently of Brd4 binding. Furthermore, mutations in E2 that disrupt the interaction with Brd4 do not affect the ability of these E2s to associate with chromosomes. Thus, while all papillomaviruses attach their genomes to cellular chromosomes to facilitate genome segregation, they target different cellular binding partners. In summary, the E2 proteins from many papillomaviruses, including the clinically important alpha genus human papillomaviruses, interact with Brd4 to mediate transcriptional activation function but not all depend on this interaction to efficiently associate with mitotic chromosomes. PMID:16973557

  2. Brd4 is required for E2-mediated transcriptional activation but not genome partitioning of all papillomaviruses.

    PubMed

    McPhillips, M G; Oliveira, J G; Spindler, J E; Mitra, R; McBride, A A

    2006-10-01

    Bromodomain protein 4 (Brd4) has been identified as the cellular binding target through which the E2 protein of bovine papillomavirus type 1 links the viral genome to mitotic chromosomes. This tethering ensures retention and efficient partitioning of genomes to daughter cells following cell division. E2 is also a regulator of viral gene expression and a replication factor, in association with the viral E1 protein. In this study, we show that E2 proteins from a wide range of papillomaviruses interact with Brd4, albeit with variations in efficiency. Moreover, disruption of the E2-Brd4 interaction abrogates the transactivation function of E2, indicating that Brd4 is required for E2-mediated transactivation of all papillomaviruses. However, the interaction of E2 and Brd4 is not required for genome partitioning of all papillomaviruses since a number of papillomavirus E2 proteins associate with mitotic chromosomes independently of Brd4 binding. Furthermore, mutations in E2 that disrupt the interaction with Brd4 do not affect the ability of these E2s to associate with chromosomes. Thus, while all papillomaviruses attach their genomes to cellular chromosomes to facilitate genome segregation, they target different cellular binding partners. In summary, the E2 proteins from many papillomaviruses, including the clinically important alpha genus human papillomaviruses, interact with Brd4 to mediate transcriptional activation function but not all depend on this interaction to efficiently associate with mitotic chromosomes.

  3. Integrated and sequence-ordered BAC- and YAC-based physical maps for the rat genome.

    PubMed

    Krzywinski, Martin; Wallis, John; Gösele, Claudia; Bosdet, Ian; Chiu, Readman; Graves, Tina; Hummel, Oliver; Layman, Dan; Mathewson, Carrie; Wye, Natasja; Zhu, Baoli; Albracht, Derek; Asano, Jennifer; Barber, Sarah; Brown-John, Mabel; Chan, Susanna; Chand, Steve; Cloutier, Alison; Davito, Jonathon; Fjell, Chris; Gaige, Tony; Ganten, Detlev; Girn, Noreen; Guggenheimer, Kurtis; Himmelbauer, Heinz; Kreitler, Thomas; Leach, Stephen; Lee, Darlene; Lehrach, Hans; Mayo, Michael; Mead, Kelly; Olson, Teika; Pandoh, Pawan; Prabhu, Anna-Liisa; Shin, Heesun; Tänzer, Simone; Thompson, Jason; Tsai, Miranda; Walker, Jason; Yang, George; Sekhon, Mandeep; Hillier, LaDeana; Zimdahl, Heike; Marziali, Andre; Osoegawa, Kazutoyo; Zhao, Shaying; Siddiqui, Asim; de Jong, Pieter J; Warren, Wes; Mardis, Elaine; McPherson, John D; Wilson, Richard; Hübner, Norbert; Jones, Steven; Marra, Marco; Schein, Jacqueline

    2004-04-01

    As part of the effort to sequence the genome of Rattus norvegicus, we constructed a physical map comprised of fingerprinted bacterial artificial chromosome (BAC) clones from the CHORI-230 BAC library. These BAC clones provide approximately 13-fold redundant coverage of the genome and have been assembled into 376 fingerprint contigs. A yeast artificial chromosome (YAC) map was also constructed and aligned with the BAC map via fingerprinted BAC and P1 artificial chromosome clones (PACs) sharing interspersed repetitive sequence markers with the YAC-based physical map. We have annotated 95% of the fingerprint map clones in contigs with coordinates on the version 3.1 rat genome sequence assembly, using BAC-end sequences and in silico mapping methods. These coordinates have allowed anchoring 358 of the 376 fingerprint map contigs onto the sequence assembly. Of these, 324 contigs are anchored to rat genome sequences localized to chromosomes, and 34 contigs are anchored to unlocalized portions of the rat sequence assembly. The remaining 18 contigs, containing 54 clones, still require placement. The fingerprint map is a high-resolution integrative data resource that provides genome-ordered associations among BAC, YAC, and PAC clones and the assembled sequence of the rat genome. PMID:15060021
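    The contig counts reported in the abstract above can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch: the function name and the dictionary keys are hypothetical, and the numbers are copied from the abstract rather than taken from the underlying map data.

```python
# Contig counts as stated in the rat physical-map abstract.
TOTAL_CONTIGS = 376          # fingerprint contigs in the BAC map
ANCHORED_CHROMOSOMAL = 324   # contigs anchored to chromosome-localized sequence
ANCHORED_UNLOCALIZED = 34    # contigs anchored to unlocalized sequence


def contig_accounting(total, chromosomal, unlocalized):
    """Return anchored and unplaced contig counts plus the anchored fraction.

    Hypothetical helper for illustration only; not part of the published work.
    """
    anchored = chromosomal + unlocalized
    unplaced = total - anchored
    return {
        "anchored": anchored,
        "unplaced": unplaced,
        "anchored_fraction": anchored / total,
    }


summary = contig_accounting(TOTAL_CONTIGS, ANCHORED_CHROMOSOMAL, ANCHORED_UNLOCALIZED)
print(summary)
```

    The result agrees with the abstract: 324 + 34 gives the 358 anchored contigs, and 376 - 358 leaves the 18 contigs that still require placement.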

  4. Ensembl Plants: Integrating Tools for Visualizing, Mining, and Analyzing Plant Genomics Data.

    PubMed

    Bolser, Dan; Staines, Daniel M; Pritchard, Emily; Kersey, Paul

    2016-01-01

    Ensembl Plants ( http://plants.ensembl.org ) is an integrative resource presenting genome-scale information for a growing number of sequenced plant species (currently 33). Data provided include genome sequence, gene models, functional annotation, and polymorphic loci. Various additional information is provided for variation data, including population structure, individual genotypes, linkage, and phenotype data. In each release, comparative analyses are performed on whole genome and protein sequences, and genome alignments and gene trees are made available that show the implied evolutionary history of each gene family. Access to the data is provided through a genome browser incorporating many specialist interfaces for different data types, and through a variety of additional methods for programmatic access and data mining. These access routes are consistent with those offered through the Ensembl interface for the genomes of non-plant species, including those of plant pathogens, pests, and pollinators. Ensembl Plants is updated 4-5 times a year and is developed in collaboration with our international partners in the Gramene ( http://www.gramene.org ) and transPLANT projects ( http://www.transplantdb.org ).

  5. Integration of Brassica A genome genetic linkage map between Brassica napus and B. rapa.

    PubMed

    Suwabe, Keita; Morgan, Colin; Bancroft, Ian

    2008-03-01

    An integrated linkage map between B. napus and B. rapa was constructed based on a total of 44 common markers comprising 41 SSR (33 BRMS, 6 Saskatoon, and 2 BBSRC) and 3 SNP/indel markers. Between 3 and 7 common markers were mapped onto each of the linkage groups A1 to A10. The position and order of most common markers revealed a high level of colinearity between species, although two small regions on A4, A5, and A10 revealed apparent local inversions between them. These results indicate that the A genome of Brassica has retained a high degree of colinearity between species, despite each species having evolved independently after the integration of the A and C genomes in the amphidiploid state. Our results provide a genetic integration of the Brassica A genome between B. napus and B. rapa. As the analysis employed sequence-based molecular markers, the information will accelerate the exploitation of the B. rapa genome sequence for the improvement of oilseed rape.

  6. GeneWeaver: a web-based system for integrative functional genomics.

    PubMed

    Baker, Erich J; Jay, Jeremy J; Bubier, Jason A; Langston, Michael A; Chesler, Elissa J

    2012-01-01

    High-throughput genome technologies have produced a wealth of data on the association of genes and gene products to biological functions. Investigators have discovered value in combining their experimental results with published genome-wide association studies, quantitative trait locus, microarray, RNA-sequencing and mutant phenotyping studies to identify gene-function associations across diverse experiments, species, conditions, behaviors or biological processes. These experimental results are typically derived from disparate data repositories, publication supplements or reconstructions from primary data stores. This leaves bench biologists with the complex and unscalable task of integrating data by identifying and gathering relevant studies, reanalyzing primary data, unifying gene identifiers and applying ad hoc computational analysis to the integrated set. The freely available GeneWeaver (http://www.GeneWeaver.org) powered by the Ontological Discovery Environment is a curated repository of genomic experimental results with an accompanying tool set for dynamic integration of these data sets, enabling users to interactively address questions about sets of biological functions and their relations to sets of genes. Thus, large numbers of independently published genomic results can be organized into new conceptual frameworks driven by the underlying, inferred biological relationships rather than a pre-existing semantic framework. An empirical 'ontology' is discovered from the aggregate of experimental knowledge around user-defined areas of biological inquiry.

  7. Transcriptional stalling in B-lymphocytes: a mechanism for antibody diversification and maintenance of genomic integrity.

    PubMed

    Sun, Jianbo; Rothschild, Gerson; Pefanis, Evangelos; Basu, Uttiya

    2013-01-01

    B cells utilize three DNA alteration strategies, namely V(D)J recombination, somatic hypermutation (SHM) and class switch recombination (CSR), to somatically mutate their genome, thereby expressing a plethora of antibodies tailor-made against the innumerable antigens they encounter while in circulation. Of these three events, the single-strand DNA cytidine deaminase, Activation Induced cytidine Deaminase (AID), is responsible for SHM and CSR. Recent advances, discussed in this review article, point toward various components of RNA polymerase II "stalling" machinery as regulators of AID activity during antibody diversification and maintenance of B cell genome integrity. PMID:23584095

  8. The integrated landscape of driver genomic alterations in glioblastoma

    PubMed Central

    Frattini, Veronique; Trifonov, Vladimir; Chan, Joseph Minhow; Castano, Angelica; Lia, Marie; Abate, Francesco; Keir, Stephen T.; Ji, Alan X.; Zoppoli, Pietro; Niola, Francesco; Danussi, Carla; Dolgalev, Igor; Porrati, Paola; Pellegatta, Serena; Heguy, Adriana; Gupta, Gaurav; Pisapia, David J.; Canoll, Peter; Bruce, Jeffrey N.; McLendon, Roger E.; Yan, Hai; Aldape, Ken; Finocchiaro, Gaetano; Mikkelsen, Tom; Privé, Gilbert G.; Bigner, Darell D.; Lasorella, Anna; Rabadan, Raul; Iavarone, Antonio

    2013-01-01

    Glioblastoma remains one of the most challenging forms of cancer to treat. Here, we develop a computational platform that integrates the analysis of copy number variations and somatic mutations and unravels the landscape of in-frame gene fusions in glioblastoma. We find mutations with loss of heterozygosity of LZTR-1, an adaptor of Cul3-containing E3 ligase complexes. Mutations and deletions disrupt LZTR-1 function, which restrains self-renewal and growth of glioma spheres retaining stem cell features. Loss-of-function mutations of CTNND2 target a neural-specific gene and are associated with transformation of glioma cells along the very aggressive mesenchymal phenotype. We also report recurrent translocations that fuse the coding sequence of EGFR to several partners, with EGFR-SEPT14 as the most frequent functional gene fusion in human glioblastoma. EGFR-SEPT14 fusions activate Stat3 signaling and confer mitogen independency and sensitivity to EGFR inhibition. These results provide important insights into the pathogenesis of glioblastoma and highlight new targets for therapeutic intervention. PMID:23917401

  9. Integrating Hormone- and Micromolecule-Mediated Signaling with Plasmodesmal Communication.

    PubMed

    Han, Xiao; Kim, Jae-Yean

    2016-01-01

    Intercellular and supracellular communications through plasmodesmata are involved in vital processes for plant development and physiological responses. Micro- and macromolecules, including hormones, RNA, and proteins, serve as biological information vectors that traffic through the plasmodesmata between cells. Previous studies demonstrated that the plasmodesmata are elaborately regulated, whereby a long queue of multiple signaling molecules forms. However, the mechanism by which these signals are coupled or coordinated in terms of simultaneous transport in a single channel remains a puzzle. In the last few years, several phytohormones that could function as both non-cell-autonomous signals and plasmodesmal regulators have been disclosed. Plasmodesmal regulators such as auxin, salicylic acid, reactive oxygen species, gibberellic acids, chitin, and jasmonic acid could regulate intercellular trafficking by adjusting plasmodesmal permeability. Here, callose, along with β-glucan synthase and β-glucanase, plays a critical role in regulating plasmodesmal permeability. Interestingly, most of the previously identified regulators are capable of diffusing through the plasmodesmata. Given the small sizes of these molecules, the plasmodesmata are prominent intercellular channels that allow diffusion-based movement of those signaling molecules. Obviously, intercellular communication is under the control of a major mechanism, named a feedback loop, at the plasmodesmata, which mediates complicated biological behaviors. Prospective research on the mechanism of coupling micromolecules at the plasmodesmata for developmental signaling and nutrient provision will help us to understand how plants coordinate their development and photosynthetic assimilation, which is important for agriculture.

  10. The RAG2 C-terminus and ATM protect genome integrity by controlling antigen receptor gene cleavage.

    PubMed

    Chaumeil, Julie; Micsinai, Mariann; Ntziachristos, Panagiotis; Roth, David B; Aifantis, Iannis; Kluger, Yuval; Deriano, Ludovic; Skok, Jane A

    2013-01-01

    Tight control of antigen-receptor gene rearrangement is required to preserve genome integrity and prevent the occurrence of leukaemia and lymphoma. Nonetheless, mistakes can happen, leading to the generation of aberrant rearrangements, such as Tcra/d-Igh inter-locus translocations that are a hallmark of ataxia telangiectasia-mutated (ATM) deficiency. Current evidence indicates that these translocations arise from the persistence of unrepaired breaks converging at different stages of thymocyte differentiation. Here we show that a defect in feedback control of RAG2 activity gives rise to bi-locus breaks and damage on Tcra/d and Igh in the same T cell at the same developmental stage, which provides a direct mechanism for generating these inter-locus rearrangements. Both the RAG2 C-terminus and ATM prevent bi-locus RAG-mediated cleavage through modulation of three-dimensional conformation (higher-order loops) and nuclear organization of the two loci. This limits the number of potential substrates for translocation and provides an important mechanism for protecting genome stability. PMID:23900513

  11. Salt Stress in Desulfovibrio vulgaris Hildenborough: an Integrated Genomics Approach

    PubMed Central

    Mukhopadhyay, Aindrila; He, Zhili; Alm, Eric J.; Arkin, Adam P.; Baidoo, Edward E.; Borglin, Sharon C.; Chen, Wenqiong; Hazen, Terry C.; He, Qiang; Holman, Hoi-Ying; Huang, Katherine; Huang, Rick; Joyner, Dominique C.; Katz, Natalie; Keller, Martin; Oeller, Paul; Redding, Alyssa; Sun, Jun; Wall, Judy; Wei, Jing; Yang, Zamin; Yen, Huei-Che; Zhou, Jizhong; Keasling, Jay D.

    2006-01-01

    The ability of Desulfovibrio vulgaris Hildenborough to reduce, and therefore contain, toxic and radioactive metal waste has made all factors that affect the physiology of this organism of great interest. Increased salinity is an important and frequent fluctuation faced by D. vulgaris in its natural habitat. In liquid culture, exposure to excess salt resulted in striking elongation of D. vulgaris cells. Using data from transcriptomics, proteomics, metabolite assays, phospholipid fatty acid profiling, and electron microscopy, we used a systems approach to explore the effects of excess NaCl on D. vulgaris. In this study we demonstrated that import of osmoprotectants, such as glycine betaine and ectoine, is the primary mechanism used by D. vulgaris to counter hyperionic stress. Several efflux systems were also highly up-regulated, as was the ATP synthesis pathway. Increases in the levels of both RNA and DNA helicases suggested that salt stress affected the stability of nucleic acid base pairing. An overall increase in the level of branched fatty acids indicated that there were changes in cell wall fluidity. The immediate response to salt stress included up-regulation of chemotaxis genes, although flagellar biosynthesis was down-regulated. Other down-regulated systems included lactate uptake permeases and ABC transport systems. The results of an extensive NaCl stress analysis were compared with microarray data from a KCl stress analysis, and unlike many other bacteria, D. vulgaris responded similarly to the two stresses. Integration of data from multiple methods allowed us to develop a conceptual model for the salt stress response in D. vulgaris that can be compared to those in other microorganisms. PMID:16707698

  12. Genomic Access to Monarch Migration Using TALEN and CRISPR/Cas9-Mediated Targeted Mutagenesis

    PubMed Central

    Markert, Matthew J.; Zhang, Ying; Enuameh, Metewo S.; Reppert, Steven M.; Wolfe, Scot A.; Merlin, Christine

    2016-01-01

    The eastern North American monarch butterfly, Danaus plexippus, is an emerging model system to study the neural, molecular, and genetic basis of animal long-distance migration and animal clockwork mechanisms. While genomic studies have provided new insight into migration-associated and circadian clock genes, the general lack of simple and versatile reverse-genetic methods has limited in vivo functional analysis of candidate genes in this species. Here, we report the establishment of highly efficient and heritable gene mutagenesis methods in the monarch butterfly using transcriptional activator-like effector nucleases (TALENs) and CRISPR-associated RNA-guided nuclease Cas9 (CRISPR/Cas9). Using two clock gene loci, cryptochrome 2 and clock (clk), as candidates, we show that both TALENs and CRISPR/Cas9 generate high-frequency nonhomologous end-joining (NHEJ)-mediated mutations at targeted sites (up to 100%), and that injecting fewer than 100 eggs is sufficient to recover mutant progeny and generate monarch knockout lines in about 3 months. Our study also genetically defines monarch CLK as an essential component of the transcriptional activation complex of the circadian clock. The methods presented should not only greatly accelerate functional analyses of many aspects of monarch biology, but are also anticipated to facilitate the development of these tools in other nontraditional insect species as well as the development of homology-directed knock-ins. PMID:26837953

  13. Interplay between arginine methylation and ubiquitylation regulates KLF4-mediated genome stability and carcinogenesis.

    PubMed

    Hu, Dong; Gur, Mert; Zhou, Zhuan; Gamper, Armin; Hung, Mien-Chie; Fujita, Naoya; Lan, Li; Bahar, Ivet; Wan, Yong

    2015-01-01

    KLF4 is an important regulator of cell-fate decisions, including DNA damage response and apoptosis. We identify a novel interplay between protein modifications in regulating KLF4 function. Here we show that arginine methylation of KLF4 by PRMT5 inhibits KLF4 ubiquitylation by VHL and thereby reduces KLF4 turnover, resulting in the elevation of KLF4 protein levels concomitant with increased transcription of KLF4-dependent p21 and reduced expression of KLF4-repressed Bax. Structure-based modelling and simulations provide insight into the molecular mechanisms of KLF4 recognition and catalysis by PRMT5. Following genotoxic stress, disruption of PRMT5-mediated KLF4 methylation leads to abrogation of KLF4 accumulation, which, in turn, attenuates cell cycle arrest. Mutating KLF4 methylation sites suppresses breast tumour initiation and progression, and immunohistochemical staining shows increased levels of both KLF4 and PRMT5 in breast cancer tissues. Taken together, our results point to a critical role for aberrant KLF4 regulation by PRMT5 in genome stability and breast carcinogenesis. PMID:26420673

  14. Genomic Access to Monarch Migration Using TALEN and CRISPR/Cas9-Mediated Targeted Mutagenesis.

    PubMed

    Markert, Matthew J; Zhang, Ying; Enuameh, Metewo S; Reppert, Steven M; Wolfe, Scot A; Merlin, Christine

    2016-01-01

    The eastern North American monarch butterfly, Danaus plexippus, is an emerging model system to study the neural, molecular, and genetic basis of animal long-distance migration and animal clockwork mechanisms. While genomic studies have provided new insight into migration-associated and circadian clock genes, the general lack of simple and versatile reverse-genetic methods has limited in vivo functional analysis of candidate genes in this species. Here, we report the establishment of highly efficient and heritable gene mutagenesis methods in the monarch butterfly using transcriptional activator-like effector nucleases (TALENs) and CRISPR-associated RNA-guided nuclease Cas9 (CRISPR/Cas9). Using two clock gene loci, cryptochrome 2 and clock (clk), as candidates, we show that both TALENs and CRISPR/Cas9 generate high-frequency nonhomologous end-joining (NHEJ)-mediated mutations at targeted sites (up to 100%), and that injecting fewer than 100 eggs is sufficient to recover mutant progeny and generate monarch knockout lines in about 3 months. Our study also genetically defines monarch CLK as an essential component of the transcriptional activation complex of the circadian clock. The methods presented should not only greatly accelerate functional analyses of many aspects of monarch biology, but are also anticipated to facilitate the development of these tools in other nontraditional insect species as well as the development of homology-directed knock-ins. PMID:26837953

  15. Identification of metastasis-associated genes in colorectal cancer through an integrated genomic and transcriptomic analysis

    PubMed Central

    Peng, Sihua

    2013-01-01

    Objective: Identification of colorectal cancer (CRC) metastasis genes is one of the most important issues in CRC research. To mine CRC metastasis-associated genes, an integrated analysis of microarray data was performed, combined with evidence acquired from comparative genomic hybridization (CGH) data. Methods: Gene expression profile data of CRC samples were obtained from the Gene Expression Omnibus (GEO) website. The 15 important chromosomal aberration sites detected using CGH technology were used for integrated genomic and transcriptomic analysis. Significance Analysis of Microarrays (SAM) was used to detect significantly differentially expressed genes across the whole genome. The overlapping genes were selected in their corresponding chromosomal aberration regions and analyzed using the Database for Annotation, Visualization and Integrated Discovery (DAVID). Finally, the SVM-T-RFE gene selection algorithm was applied to identify metastasis-associated genes in CRC. Results: A minimum gene set was obtained with the minimum number (14) of genes and the highest classification accuracy (100%) in both PRI and META datasets. A fraction of the selected genes are associated with CRC or its metastasis. Conclusions: Our results demonstrate that integrative analysis is an effective strategy for mining cancer-associated genes. PMID:24385689
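    The abstract above names SAM and SVM-T-RFE but gives no implementation details. The sketch below is not either of those methods. It swaps in a much simpler technique, ranking genes by the absolute Welch t-statistic between two sample groups, only to illustrate what scoring genes for differential expression between primary and metastatic samples can look like. The gene names, the expression values, and the function names are all hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (each needs >= 2 values)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def rank_genes(primary, metastatic):
    """Rank genes by |t| between the two groups, largest difference first."""
    scores = {gene: abs(welch_t(primary[gene], metastatic[gene])) for gene in primary}
    return sorted(scores, key=scores.get, reverse=True)


# Made-up expression values for three hypothetical genes, four samples per group.
primary = {
    "GENE_A": [1.0, 1.2, 0.9, 1.1],
    "GENE_B": [2.0, 2.1, 1.9, 2.2],
    "GENE_C": [0.5, 1.5, 1.0, 2.0],
}
metastatic = {
    "GENE_A": [3.0, 3.2, 2.9, 3.1],
    "GENE_B": [2.1, 2.0, 2.2, 1.9],
    "GENE_C": [1.0, 2.0, 0.6, 1.8],
}

ranking = rank_genes(primary, metastatic)
print(ranking)
```

    In this toy data GENE_A separates the two groups cleanly and ranks first, while GENE_B has identical group means and ranks last. A recursive elimination scheme would typically repeat a scoring step like this while discarding the weakest genes in each round.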

  16. Integrative Analysis of the Caenorhabditis elegans Genome by the modENCODE Project

    PubMed Central

    Gerstein, Mark B.; Lu, Zhi John; Van Nostrand, Eric L.; Cheng, Chao; Arshinoff, Bradley I.; Liu, Tao; Yip, Kevin Y.; Robilotto, Rebecca; Rechtsteiner, Andreas; Ikegami, Kohta; Alves, Pedro; Chateigner, Aurelien; Perry, Marc; Morris, Mitzi; Auerbach, Raymond K.; Feng, Xin; Leng, Jing; Vielle, Anne; Niu, Wei; Rhrissorrakrai, Kahn; Agarwal, Ashish; Alexander, Roger P.; Barber, Galt; Brdlik, Cathleen M.; Brennan, Jennifer; Brouillet, Jeremy Jean; Carr, Adrian; Cheung, Ming-Sin; Clawson, Hiram; Contrino, Sergio; Dannenberg, Luke O.; Dernburg, Abby F.; Desai, Arshad; Dick, Lindsay; Dosé, Andréa C.; Du, Jiang; Egelhofer, Thea; Ercan, Sevinc; Euskirchen, Ghia; Ewing, Brent; Feingold, Elise A.; Gassmann, Reto; Good, Peter J.; Green, Phil; Gullier, Francois; Gutwein, Michelle; Guyer, Mark S.; Habegger, Lukas; Han, Ting; Henikoff, Jorja G.; Henz, Stefan R.; Hinrichs, Angie; Holster, Heather; Hyman, Tony; Iniguez, A. Leo; Janette, Judith; Jensen, Morten; Kato, Masaomi; Kent, W. James; Kephart, Ellen; Khivansara, Vishal; Khurana, Ekta; Kim, John K.; Kolasinska-Zwierz, Paulina; Lai, Eric C.; Latorre, Isabel; Leahey, Amber; Lewis, Suzanna; Lloyd, Paul; Lochovsky, Lucas; Lowdon, Rebecca F.; Lubling, Yaniv; Lyne, Rachel; MacCoss, Michael; Mackowiak, Sebastian D.; Mangone, Marco; McKay, Sheldon; Mecenas, Desirea; Merrihew, Gennifer; Miller, David M.; Muroyama, Andrew; Murray, John I.; Ooi, Siew-Loon; Pham, Hoang; Phippen, Taryn; Preston, Elicia A.; Rajewsky, Nikolaus; Rätsch, Gunnar; Rosenbaum, Heidi; Rozowsky, Joel; Rutherford, Kim; Ruzanov, Peter; Sarov, Mihail; Sasidharan, Rajkumar; Sboner, Andrea; Scheid, Paul; Segal, Eran; Shin, Hyunjin; Shou, Chong; Slack, Frank J.; Slightam, Cindie; Smith, Richard; Spencer, William C.; Stinson, E. O.; Taing, Scott; Takasaki, Teruaki; Vafeados, Dionne; Voronina, Ksenia; Wang, Guilin; Washington, Nicole L.; Whittle, Christina M.; Wu, Beijing; Yan, Koon-Kiu; Zeller, Georg; Zha, Zheng; Zhong, Mei; Zhou, Xingliang; Ahringer, Julie; Strome, Susan; Gunsalus, Kristin C.; Micklem, Gos; Liu, X. Shirley; Reinke, Valerie; Kim, Stuart K.; Hillier, LaDeana W.; Henikoff, Steven; Piano, Fabio; Snyder, Michael; Stein, Lincoln; Lieb, Jason D.; Waterston, Robert H.

    2011-01-01

    We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor–binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor–binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome. PMID:21177976

  17. The Multifunctions of WD40 Proteins in Genome Integrity and Cell Cycle Progression

    PubMed Central

    Zhang, Caiguo; Zhang, Fan

    2015-01-01

    The eukaryotic genome encodes numerous WD40 repeat proteins, which generally function as platforms for protein-protein interactions and are involved in numerous biological processes, such as signal transduction, gene transcriptional regulation, protein modifications, cytoskeleton assembly, vesicular trafficking, DNA damage and repair, cell death and cell cycle progression. Among these diverse functions, genome integrity maintenance and cell cycle progression are extremely important, as their deregulation is clinically linked to uncontrolled proliferative diseases such as cancer. Thus, in this review we mainly summarize and discuss the recent understanding of WD40 proteins and their molecular mechanisms linked to genome stability and cell cycle progression, thereby demonstrating their pervasiveness and importance in cellular networks. PMID:25653723

  18. Efficient generation of knock-in transgenic zebrafish carrying reporter/driver genes by CRISPR/Cas9-mediated genome engineering.

    PubMed

    Kimura, Yukiko; Hisano, Yu; Kawahara, Atsuo; Higashijima, Shin-ichi

    2014-01-01

    The type II bacterial CRISPR/Cas9 system is rapidly becoming popular for genome engineering due to its simplicity, flexibility, and high efficiency. Recently, targeted knock-in of a long DNA fragment via homology-independent DNA repair has been achieved in zebrafish using the CRISPR/Cas9 system. This raised the possibility that knock-in transgenic zebrafish could be efficiently generated using CRISPR/Cas9. However, how widely this method can be applied for the targeted integration of foreign genes into endogenous genomic loci is unclear. Here, we report efficient generation of knock-in transgenic zebrafish that have cell-type specific Gal4 or reporter gene expression. A donor plasmid containing a heat-shock promoter was co-injected with a short guide RNA (sgRNA) targeted for genome digestion, an sgRNA targeted for donor plasmid digestion, and Cas9 mRNA. We have succeeded in establishing stable knock-in transgenic fish with several different constructs for 4 genetic loci at a frequency exceeding 25%. Due to its simplicity, design flexibility, and high efficiency, we propose that CRISPR/Cas9-mediated knock-in will become a standard method for the generation of transgenic zebrafish.

  19. Sendai virus, an RNA virus with no risk of genomic integration, delivers CRISPR/Cas9 for efficient gene editing.

    PubMed

    Park, Arnold; Hong, Patrick; Won, Sohui T; Thibault, Patricia A; Vigant, Frederic; Oguntuyo, Kasopefoluwa Y; Taft, Justin D; Lee, Benhur

    2016-01-01

    The advent of RNA-guided endonuclease (RGEN)-mediated gene editing, specifically via CRISPR/Cas9, has spurred intensive efforts to improve the efficiency of both RGEN delivery and targeted mutagenesis. The major viral vectors in use for delivery of Cas9 and its associated guide RNA, lentiviral and adeno-associated viral systems, have the potential for undesired random integration into the host genome. Here, we repurpose Sendai virus, an RNA virus with no viral DNA phase and that replicates solely in the cytoplasm, as a delivery system for efficient Cas9-mediated gene editing. The high efficiency of Sendai virus infection resulted in high rates of on-target mutagenesis in cell lines (75-98% at various endogenous and transgenic loci) and primary human monocytes (88% at the ccr5 locus) in the absence of any selection. In conjunction with extensive former work on Sendai virus as a promising gene therapy vector that can infect a wide range of cell types including hematopoietic stem cells, this proof-of-concept study opens the door to using Sendai virus as well as other related paramyxoviruses as versatile and efficient tools for gene editing. PMID:27606350

  20. Sendai virus, an RNA virus with no risk of genomic integration, delivers CRISPR/Cas9 for efficient gene editing

    PubMed Central

    Park, Arnold; Hong, Patrick; Won, Sohui T; Thibault, Patricia A; Vigant, Frederic; Oguntuyo, Kasopefoluwa Y; Taft, Justin D; Lee, Benhur

    2016-01-01

    The advent of RNA-guided endonuclease (RGEN)-mediated gene editing, specifically via CRISPR/Cas9, has spurred intensive efforts to improve the efficiency of both RGEN delivery and targeted mutagenesis. The major viral vectors in use for delivery of Cas9 and its associated guide RNA, lentiviral and adeno-associated viral systems, have the potential for undesired random integration into the host genome. Here, we repurpose Sendai virus, an RNA virus with no viral DNA phase and that replicates solely in the cytoplasm, as a delivery system for efficient Cas9-mediated gene editing. The high efficiency of Sendai virus infection resulted in high rates of on-target mutagenesis in cell lines (75–98% at various endogenous and transgenic loci) and primary human monocytes (88% at the ccr5 locus) in the absence of any selection. In conjunction with extensive former work on Sendai virus as a promising gene therapy vector that can infect a wide range of cell types including hematopoietic stem cells, this proof-of-concept study opens the door to using Sendai virus as well as other related paramyxoviruses as versatile and efficient tools for gene editing. PMID:27606350

  2. Genomic Analysis of Sleeping Beauty Transposon Integration in Human Somatic Cells

    PubMed Central

    Turchiano, Giandomenico; Latella, Maria Carmela; Gogol-Döring, Andreas; Cattoglio, Claudia; Mavilio, Fulvio; Izsvák, Zsuzsanna; Ivics, Zoltán; Recchia, Alessandra

    2014-01-01

    The Sleeping Beauty (SB) transposon is a non-viral integrating vector system with proven efficacy for gene transfer and functional genomics. However, integration efficiency is negatively affected by the length of the transposon. To optimize the SB transposon machinery, the inverted repeats and the transposase gene underwent several modifications, resulting in the generation of the hyperactive SB100X transposase and of the high-capacity “sandwich” (SA) transposon. In this study, we report a side-by-side comparison of the SA and the widely used T2 arrangement of transposon vectors carrying increasing DNA cargoes, up to 18 kb. Clonal analysis of SA integrants in human epithelial cells and in immortalized keratinocytes demonstrates stability and integrity of the transposon independently from the cargo size and copy number-dependent expression of the cargo cassette. A genome-wide analysis of unambiguously mapped SA integrations in keratinocytes showed an almost random distribution, with an overrepresentation in repetitive elements (satellite, LINE and small RNAs) compared to a library representing insertions of the first-generation transposon vector and to gammaretroviral and lentiviral libraries. The SA transposon/SB100X integrating system therefore shows important features as a system for delivering large gene constructs for gene therapy applications. PMID:25390293

  3. AACR precision medicine series: Highlights of the integrating clinical genomics and cancer therapy meeting.

    PubMed

    Maggi, Elaine; Montagna, Cristina

    2015-12-01

    The American Association for Cancer Research (AACR) Precision Medicine Series "Integrating Clinical Genomics and Cancer Therapy" took place June 13-16, 2015 in Salt Lake City, Utah. The conference was co-chaired by Charles L. Sawyers from Memorial Sloan Kettering Cancer Center in New York, Elaine R. Mardis from Washington University School of Medicine in St. Louis, and Arul M. Chinnaiyan from University of Michigan in Ann Arbor. About 500 clinicians, basic science investigators, bioinformaticians, and postdoctoral fellows joined together to discuss the current state of Clinical Genomics and the advances and challenges of integrating Next Generation Sequencing (NGS) technologies into clinical practice. The plenary sessions and panel discussions covered current platforms and sequencing approaches adopted for NGS assays of the cancer genome at several national and international institutions, different approaches used to map and classify targetable sequence variants, and how information acquired with the sequencing of the cancer genome is used to guide treatment options. While challenges still exist from a technological perspective, it emerged that there exists considerable need for the development of tools to aid the identification of the most suitable therapy based on the mutational profile of the somatic cancer genome. The process to match patients to ongoing clinical trials is still complex. In addition, the need for centralized data repositories, preferably linked to well-annotated clinical records, that aid sharing of sequencing information is central to begin understanding the contribution of variants of unknown significance to tumor etiology and response to therapy. Here we summarize the highlights of this stimulating four-day conference with a major emphasis on the open problems that the clinical genomics community is currently facing and the tools most needed for advancing this field. PMID:26554403

  4. Epiviz: a view inside the design of an integrated visual analysis software for genomics

    PubMed Central

    2015-01-01

    Background: Computational and visual data analysis for genomics has traditionally involved a combination of tools and resources, of which the most ubiquitous consist of genome browsers, focused mainly on integrative visualization of large numbers of big datasets, and computational environments, focused on data modeling of a small number of moderately sized datasets. Workflows that involve the integration and exploration of multiple heterogeneous data sources, small and large, public and user specific, have been poorly addressed by these tools. In our previous work, we introduced Epiviz, which bridges the gap between the two types of tools, simplifying these workflows. Results: In this paper we expand on the design decisions behind Epiviz, and introduce a series of new advanced features that further support the type of interactive exploratory workflow we have targeted. We discuss three ways in which Epiviz advances the field of genomic data analysis: 1) it brings code to interactive visualizations at various levels; 2) it takes the first steps in the direction of collaborative data analysis by incorporating user plugins from source control providers, as well as by allowing analysis states to be shared among the scientific community; 3) it combines established analysis features that have never before been available simultaneously in a genome browser. In our discussion section, we present security implications of the current design, as well as a series of limitations and future research steps. Conclusions: Since many of the design choices of Epiviz are novel in genomics data analysis, this paper serves both as a document of our own approaches with lessons learned and as a starting point for future efforts in the same direction for the genomics community. PMID:26328750

  6. TALE nickase mediates high efficient targeted transgene integration at the human multi-copy ribosomal DNA locus.

    PubMed

    Wu, Yong; Gao, Tieli; Wang, Xiaolin; Hu, Youjin; Hu, Xuyun; Hu, Zhiqing; Pang, Jialun; Li, Zhuo; Xue, Jinfeng; Feng, Mai; Wu, Lingqian; Liang, Desheng

    2014-03-28

    Although targeted gene addition can be stimulated strikingly by a DNA double-strand break (DSB) created by either zinc finger nucleases (ZFNs) or TALE nucleases (TALENs), DSBs are mutagenic and toxic to human cells. As a compromise, a DNA single-strand break (SSB), or nick, has been reported to mediate highly efficient gene addition with a marked reduction of random mutagenesis. We previously demonstrated effective targeted gene addition at the human multicopy ribosomal DNA (rDNA) locus, a genomic safe harbor for transgenes with therapeutic potential. To improve the transgene integration efficiency by using TALENs while lowering the cytotoxicity of DSBs, we created both TALENs and TALE nickases (TALENickases) targeting this multicopy locus. A targeting vector that could integrate a GFP cassette at the rDNA locus was constructed and co-transfected with TALENs or TALENickases. Although the fraction of GFP-positive cells using TALENs was greater than that using TALENickases during the first few days after transfection, it fell below that using TALENickases after continuous culture. Our findings showed that the TALENickases were more effective than their TALEN counterparts at the multicopy rDNA locus, though earlier studies using ZFNs and ZFNickases targeting single-copy loci showed the reverse. In addition, TALENickases mediated the targeted integration of a 5.4 kb fragment at a frequency of up to 0.62% in HT1080 cells after drug selection, suggesting that their potential application in targeted gene modification is not limited to the rDNA locus.

  7. Drosophila Sld5 is essential for normal cell cycle progression and maintenance of genomic integrity

    SciTech Connect

    Gouge, Catherine A.; Christensen, Tim W.

    2010-09-10

    Research highlights: Drosophila Sld5 interacts with Psf1, Psf2, and Mcm10. Haploinsufficiency of Sld5 leads to M-phase delay and genomic instability. Sld5 is also required for normal S phase progression. Abstract: Essential for the normal functioning of a cell is the maintenance of genomic integrity. Failure in this process is often catastrophic for the organism, leading to cell death or mis-proliferation. Central to genomic integrity is the faithful replication of DNA during S phase. The GINS complex has recently come to light as a critical player in DNA replication through stabilization of MCM2-7 and Cdc45 as a member of the CMG complex, which is likely responsible for the processivity of helicase activity during S phase. The GINS complex is made up of 4 members in a 1:1:1:1 ratio: Psf1, Psf2, Psf3, and Sld5. Here we present the first analysis of the function of the Sld5 subunit in a multicellular organism. We show that Drosophila Sld5 interacts with Psf1, Psf2, and Mcm10 and that mutations in Sld5 lead to M and S phase delays with chromosomes exhibiting hallmarks of genomic instability.

  8. Integration of genome scale data for identifying new players in colorectal cancer

    PubMed Central

    Sokolova, Viktorija; Crippa, Elisabetta; Gariboldi, Manuela

    2016-01-01

    Colorectal cancers (CRCs) display a wide variety of genomic aberrations that may be either causally linked to their development and progression, or might serve as biomarkers for their presence. Recent advances in rapid high-throughput genetic and genomic analysis have helped to identify a plethora of alterations that can potentially serve as new cancer biomarkers, and thus help to improve CRC diagnosis, prognosis, and treatment. Each distinct data type (copy number variations, gene and microRNA expression, CpG island methylation) provides an investigator with a different, partially independent, and complementary view of the entire genome. However, elucidation of gene function will require more information than can be provided by analyzing a single type of data. The integration of knowledge obtained from different sources is becoming increasingly essential for obtaining an interdisciplinary view of large amounts of information, and also for cross-validating experimental results. The integration of numerous types of genetic and genomic data derived from public sources, via the use of ad hoc bioinformatics tools and statistical methods, facilitates the discovery and validation of novel, informative biomarkers. This combinatory approach will also enable researchers to more accurately and comprehensively understand the associations between different biologic pathways, mechanisms, and phenomena, and gain new insights into the etiology of CRC. PMID:26811605

  9. A comprehensive whole-genome integrated cytogenetic map for the alpaca (Lama pacos).

    PubMed

    Avila, Felipe; Baily, Malorie P; Perelman, Polina; Das, Pranab J; Pontius, Joan; Chowdhary, Renuka; Owens, Elaine; Johnson, Warren E; Merriwether, David A; Raudsepp, Terje

    2014-01-01

    Genome analysis of the alpaca (Lama pacos, LPA) has progressed slowly compared to other domestic species. Here, we report the development of the first comprehensive whole-genome integrated cytogenetic map for the alpaca using fluorescence in situ hybridization (FISH) and CHORI-246 BAC library clones. The map is comprised of 230 linearly ordered markers distributed among all 36 alpaca autosomes and the sex chromosomes. For the first time, markers were assigned to LPA14, 21, 22, 28, and 36. Additionally, 86 genes from 15 alpaca chromosomes were mapped in the dromedary camel (Camelus dromedarius, CDR), demonstrating exceptional synteny and linkage conservation between the 2 camelid genomes. Cytogenetic mapping of 191 protein-coding genes improved and refined the known Zoo-FISH homologies between camelids and humans: we discovered new homologous synteny blocks (HSBs) corresponding to HSA1-LPA/CDR11, HSA4-LPA/CDR31 and HSA7-LPA/CDR36, and revised the location of breakpoints for others. Overall, gene mapping was in good agreement with the Zoo-FISH and revealed remarkable evolutionary conservation of gene order within many human-camelid HSBs. Most importantly, 91 FISH-mapped markers effectively integrated the alpaca whole-genome sequence and the radiation hybrid maps with physical chromosomes, thus facilitating the improvement of the sequence assembly and the discovery of genes of biological importance.

  10. Integrated syntenic and phylogenomic analyses reveal an ancient genome duplication in monocots.

    PubMed

    Jiao, Yuannian; Li, Jingping; Tang, Haibao; Paterson, Andrew H

    2014-07-01

    Unraveling widespread polyploidy events throughout plant evolution is necessary for inferring the impacts of whole-genome duplication (WGD) on speciation and functional innovations, and for guiding identification of true orthologs in divergent taxa. Here, we employed integrated syntenic and phylogenomic analyses to reveal an ancient WGD that shaped the genomes of all commelinid monocots, including grasses, bromeliads, bananas (Musa acuminata), ginger, palms, and other plants of fundamental, agricultural, and/or horticultural interest. First, comprehensive phylogenomic analyses revealed 1421 putative gene families that retained ancient duplication shared by Musa (Zingiberales) and grass (Poales) genomes, indicating an ancient WGD in monocots. Intergenomic synteny blocks of Musa and Oryza were investigated, and 30 blocks were shown to be duplicated before Musa-Oryza divergence an estimated 120 to 150 million years ago. Synteny comparisons of four monocot (rice [Oryza sativa], sorghum [Sorghum bicolor], banana, and oil palm [Elaeis guineensis]) and two eudicot (grape [Vitis vinifera] and sacred lotus [Nelumbo nucifera]) genomes also support this additional WGD in monocots, herein called Tau (τ). Integrating synteny and phylogenomic comparisons achieves better resolution of ancient polyploidy events than either approach individually, a principle that is exemplified in the disambiguation of a WGD series of rho (ρ)-sigma (σ)-tau (τ) in the grass lineages that echoes the alpha (α)-beta (β)-gamma (γ) series previously revealed in the Arabidopsis thaliana lineage. PMID:25082857

  11. Genome puzzle master (GPM): an integrated pipeline for building and editing pseudomolecules from fragmented sequences

    PubMed Central

    Zhang, Jianwei; Kudrna, Dave; Mu, Ting; Li, Weiming; Copetti, Dario; Yu, Yeisoo; Goicoechea, Jose Luis; Lei, Yang; Wing, Rod A.

    2016-01-01

    Motivation: Next generation sequencing technologies have revolutionized our ability to rapidly and affordably generate vast quantities of sequence data. Once generated, raw sequences are assembled into contigs or scaffolds. However, these assemblies are mostly fragmented and inaccurate at the whole genome scale, largely due to the inability to integrate additional informative datasets (e.g. physical, optical and genetic maps). To address this problem, we developed a semi-automated software tool—Genome Puzzle Master (GPM)—that enables the integration of additional genomic signposts to edit and build ‘new-gen-assemblies’ that result in high-quality ‘annotation-ready’ pseudomolecules. Results: With GPM, loaded datasets can be connected to each other via their logical relationships which accomplishes tasks to ‘group,’ ‘merge,’ ‘order and orient’ sequences in a draft assembly. Manual editing can also be performed with a user-friendly graphical interface. Final pseudomolecules reflect a user’s total data package and are available for long-term project management. GPM is a web-based pipeline and an important part of a Laboratory Information Management System (LIMS) which can be easily deployed on local servers for any genome research laboratory. Availability and Implementation: The GPM (with LIMS) package is available at https://github.com/Jianwei-Zhang/LIMS Contacts: jzhang@mail.hzau.edu.cn or rwing@mail.arizona.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27318200
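
    The "order and orient" task named in the abstract above can be illustrated with a minimal sketch. This is not GPM's code or interface: the names Placement, build_pseudomolecule and GAP, the fixed 100-N gap, and the toy sequences are all assumptions made for illustration. It only shows the general idea of sorting contigs by a map position, reverse-complementing those placed on the reverse strand, and joining them into one pseudomolecule.

```python
# Hypothetical sketch of "order and orient": NOT the GPM implementation.
from dataclasses import dataclass

GAP = "N" * 100  # assumed fixed-size gap placeholder between neighbouring contigs

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T/N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass
class Placement:
    contig_id: str       # key into the contig dictionary
    map_position: float  # position taken from a genetic, physical or optical map
    forward: bool        # False means the contig must be reverse-complemented


def build_pseudomolecule(contigs: dict, placements: list) -> str:
    """Order contigs by map position, orient each one, and join them with gaps."""
    ordered = sorted(placements, key=lambda p: p.map_position)
    pieces = []
    for p in ordered:
        seq = contigs[p.contig_id]
        pieces.append(seq if p.forward else reverse_complement(seq))
    return GAP.join(pieces)


# Toy data (made up): three short contigs with map positions out of order.
contigs = {"c1": "ACGT", "c2": "GGA", "c3": "TTC"}
placements = [
    Placement("c2", 20.0, True),
    Placement("c1", 5.0, True),
    Placement("c3", 31.5, False),
]
pseudo = build_pseudomolecule(contigs, placements)
```

    In this toy run the contigs come out in the order c1, c2, c3, with c3 reverse-complemented. A real pipeline would also have to handle conflicting map evidence, overlapping contigs that should be merged, and gap sizes estimated from the maps, none of which this sketch attempts.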

  12. Maintaining Pedagogical Integrity of a Computer Mediated Course Delivery in Social Foundations

    ERIC Educational Resources Information Center

    Stewart, Shelley; Cobb-Roberts, Deirdre; Shircliffe, Barbara J.

    2013-01-01

    Transforming a face to face course to a computer mediated format in social foundations (interdisciplinary field in education), while maintaining pedagogical integrity, involves strategic collaboration between instructional technologists and content area experts. This type of planned partnership requires open dialogue and a mutual respect for prior…

  13. Bioinformatics visualization and integration with open standards: the Bluejay genomic browser.

    PubMed

    Turinsky, Andrei L; Ah-Seng, Andrew C; Gordon, Paul M K; Stromer, Julie N; Taschuk, Morgan L; Xu, Emily W; Sensen, Christoph W

    2005-01-01

    We have created a new Java-based integrated computational environment for the exploration of genomic data, called Bluejay. The system is capable of using almost any XML file related to genomic data. Non-XML data sources can be accessed via a proxy server. Bluejay has several features, which are new to Bioinformatics, including an unlimited semantic zoom capability, coupled with Scalable Vector Graphics (SVG) outputs; an implementation of the XLink standard, which features access to MAGPIE Genecards as well as any BioMOBY service accessible over the Internet; and the integration of gene chip analysis tools with the functional assignments. The system can be used as a signed web applet, Web Start, and a local stand-alone application, with or without connection to the Internet. It is available free of charge and as open source via http://bluejay.ucalgary.ca. PMID:15972014

  14. Genome, integration, and transduction of a novel temperate phage of Helicobacter pylori.

    PubMed

    Luo, Cheng-Hung; Chiou, Pei-Yu; Yang, Chiou-Ying; Lin, Nien-Tsung

    2012-08-01

    Helicobacter pylori is a common human pathogen that has been identified to be carcinogenic. This study isolated the temperate bacteriophage 1961P from the lysate of a clinical strain of H. pylori isolated in Taiwan. The bacteriophage has an icosahedral head and a short tail, typical of the Podoviridae family. Its double-stranded DNA genome is 26,836 bp long and has 33 open reading frames. Only 9 of the predicted proteins have homologs of known functions, while the remaining 24 are only similar to unknown proteins encoded by Helicobacter prophages and remnants. Analysis of sequences proximal to the phage-host junctions suggests that 1961P may integrate into the host chromosome via a mechanism similar to that of bacteriophage lambda. In addition, 1961P is capable of generalized transduction. To the best of our knowledge, this is the first report of the isolation, characterization, genome analysis, integration, and transduction of a Helicobacter pylori phage.

  15. Genome, Integration, and Transduction of a Novel Temperate Phage of Helicobacter pylori

    PubMed Central

    Luo, Cheng-Hung; Chiou, Pei-Yu; Yang, Chiou-Ying

    2012-01-01

    Helicobacter pylori is a common human pathogen that has been identified to be carcinogenic. This study isolated the temperate bacteriophage 1961P from the lysate of a clinical strain of H. pylori isolated in Taiwan. The bacteriophage has an icosahedral head and a short tail, typical of the Podoviridae family. Its double-stranded DNA genome is 26,836 bp long and has 33 open reading frames. Only 9 of the predicted proteins have homologs of known functions, while the remaining 24 are only similar to unknown proteins encoded by Helicobacter prophages and remnants. Analysis of sequences proximal to the phage-host junctions suggests that 1961P may integrate into the host chromosome via a mechanism similar to that of bacteriophage lambda. In addition, 1961P is capable of generalized transduction. To the best of our knowledge, this is the first report of the isolation, characterization, genome analysis, integration, and transduction of a Helicobacter pylori phage. PMID:22696647

  16. RegPredict: an integrated system for regulon inference in prokaryotes by comparative genomics approach

    SciTech Connect

    Novichkov, Pavel S.; Rodionov, Dmitry A.; Stavrovskaya, Elena D.; Novichkova, Elena S.; Kazakov, Alexey E.; Gelfand, Mikhail S.; Arkin, Adam P.; Mironov, Andrey A.; Dubchak, Inna

    2010-05-26

    The RegPredict web server is designed to provide tools for reconstruction and analysis of microbial regulons using a comparative genomics approach. The server allows the user to rapidly generate reference sets of regulons and regulatory motif profiles in a group of prokaryotic genomes. The new concept of a cluster of co-regulated orthologous operons allows the user to distribute the analysis of large regulons and to perform the comparative analysis of multiple clusters independently. Two major workflows currently implemented in RegPredict are: (i) regulon reconstruction for a known regulatory motif and (ii) ab initio inference of a novel regulon using several scenarios for the generation of starting gene sets. RegPredict provides a comprehensive collection of manually curated positional weight matrices of regulatory motifs. It is based on genomic sequences and on ortholog and operon predictions from MicrobesOnline. An interactive web interface of RegPredict integrates and presents diverse genomic and functional information about the candidate regulon members from several web resources. RegPredict is freely accessible at http://regpredict.lbl.gov.

  17. Integrated genomics and molecular breeding approaches for dissecting the complex quantitative traits in crop plants.

    PubMed

    Kujur, Alice; Saxena, Maneesha S; Bajaj, Deepak; Laxmi; Parida, Swarup K

    2013-12-01

    Enormous population growth, climate change and global warming are now considered major threats to agriculture and the world's food security. To improve the productivity and sustainability of agriculture, the development of high-yielding, durable, abiotic and biotic stress-tolerant cultivars and climate-resilient crops is essential. Hence, understanding the molecular mechanisms and dissecting complex quantitative yield and stress tolerance traits is the prime objective in current agricultural biotechnology research. In recent years, tremendous progress has been made in plant genomics and molecular breeding research pertaining to conventional and next-generation whole genome, transcriptome and epigenome sequencing efforts, generation of huge genomic, transcriptomic and epigenomic resources and development of modern genomics-assisted breeding approaches in diverse crop genotypes with contrasting yield and abiotic stress tolerance traits. Unfortunately, the detailed molecular mechanisms and gene regulatory networks controlling such complex quantitative traits are not yet well understood in crop plants. Therefore, we propose integrated strategies involving the enormous and diverse traditional and modern -omics (structural, functional, comparative and epigenomics) approaches and resources now available, together with genomics-assisted breeding methods, that agricultural biotechnologists can adopt to dissect and decode the molecular and gene regulatory networks involved in the complex quantitative yield and stress tolerance traits in crop plants. This would provide clues and much-needed inputs for rapid selection of novel functionally relevant molecular tags regulating such complex traits to expedite traditional and modern marker-assisted genetic enhancement studies in target crop species for developing high-yielding stress-tolerant varieties.

  19. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models.

    PubMed

    King, Zachary A; Lu, Justin; Dräger, Andreas; Miller, Philip; Federowicz, Stephen; Lerman, Joshua A; Ebrahim, Ali; Palsson, Bernhard O; Lewis, Nathan E

    2016-01-01

    Genome-scale metabolic models are mathematically-structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data. PMID:26476456

  20. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models

    PubMed Central

    King, Zachary A.; Lu, Justin; Dräger, Andreas; Miller, Philip; Federowicz, Stephen; Lerman, Joshua A.; Ebrahim, Ali; Palsson, Bernhard O.; Lewis, Nathan E.

    2016-01-01

    Genome-scale metabolic models are mathematically-structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data. PMID:26476456

  1. Prediction of clinical phenotypes in invasive breast carcinomas from the integration of radiomics and genomics data

    PubMed Central

    Guo, Wentian; Li, Hui; Zhu, Yitan; Lan, Li; Yang, Shengjie; Drukker, Karen; Morris, Elizabeth; Burnside, Elizabeth; Whitman, Gary; Giger, Maryellen L.; Ji, Yuan; TCGA Breast Phenotype Research Group

    2015-01-01

    Genomic and radiomic imaging profiles of invasive breast carcinomas from The Cancer Genome Atlas and The Cancer Imaging Archive were integrated and a comprehensive analysis was conducted to predict clinical outcomes using the radiogenomic features. Variable selection via LASSO and logistic regression were used to select the most-predictive radiogenomic features for the clinical phenotypes, including pathological stage, lymph node metastasis, and status of estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2). Cross-validation with receiver operating characteristic (ROC) analysis was performed and the area under the ROC curve (AUC) was employed as the prediction metric. Higher AUCs were obtained in the prediction of pathological stage, ER, and PR status than for lymph node metastasis and HER2 status. Overall, the prediction performances by genomics alone, radiomics alone, and combined radiogenomics features showed statistically significant correlations with clinical outcomes; however, improvement in the prediction performance by combining genomics and radiomics data was not found to be statistically significant, most likely due to the small sample size of 91 cancer cases with 38 radiomic features and 144 genomic features. PMID:26835491
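
    The abstract above uses the area under the ROC curve (AUC) as its prediction metric. As a minimal sketch of what that number measures, and not the authors' pipeline, the AUC can be computed as the fraction of (positive, negative) case pairs in which the positive case receives the higher score. The function name `roc_auc` and the toy labels and scores are illustrative assumptions.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outscores
    a random negative case; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = receptor-positive, 0 = receptor-negative.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
example_auc = roc_auc(example_labels, example_scores)
```

    An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of cases, which is harmless at the scale of 91 cases reported above.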

  2. MEGANTE: a web-based system for integrated plant genome annotation.

    PubMed

    Numa, Hisataka; Itoh, Takeshi

    2014-01-01

    The recent advancement of high-throughput genome sequencing technologies has resulted in a considerable increase in demand for large-scale genome annotation. While annotation is a crucial step for downstream data analyses and experimental studies, this process requires substantial expertise and knowledge of bioinformatics. Here we present MEGANTE, a web-based annotation system that makes plant genome annotation easy for researchers unfamiliar with bioinformatics. Without any complicated configuration, users can perform genomic sequence annotations simply by uploading a sequence and selecting the species to query. MEGANTE automatically runs several analysis programs and integrates the results to select the appropriate consensus exon-intron structures and to predict open reading frames (ORFs) at each locus. Functional annotation, including a similarity search against known proteins and a functional domain search, is also performed for the predicted ORFs. The resultant annotation information is visualized with a widely used genome browser, GBrowse. For ease of analysis, the results can be downloaded in Microsoft Excel format. All of the query sequences and annotation results are stored on the server side so that users can access their own data from virtually anywhere on the web. The current release of MEGANTE targets 24 plant species from the Brassicaceae, Fabaceae, Musaceae, Poaceae, Salicaceae, Solanaceae, Rosaceae and Vitaceae families, and it allows users to submit a sequence up to 10 Mb in length and to save up to 100 sequences with the annotation information on the server. The MEGANTE web service is available at https://megante.dna.affrc.go.jp/.

  3. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models

    SciTech Connect

    King, Zachary A.; Lu, Justin; Dräger, Andreas; Miller, Philip; Federowicz, Stephen; Lerman, Joshua A.; Ebrahim, Ali; Palsson, Bernhard O.; Lewis, Nathan E.

    2015-10-17

    In this study, genome-scale metabolic models are mathematically structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data.

  4. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models

    DOE PAGESBeta

    King, Zachary A.; Lu, Justin; Dräger, Andreas; Miller, Philip; Federowicz, Stephen; Lerman, Joshua A.; Ebrahim, Ali; Palsson, Bernhard O.; Lewis, Nathan E.

    2015-10-17

    In this study, genome-scale metabolic models are mathematically structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data.

  5. Heterogeneity revealed by integrated genomic analysis uncovers a molecular switch in malignant uveal melanoma.

    PubMed

    de Lange, Mark J; van Pelt, Sake I; Versluis, Mieke; Jordanova, Ekaterina S; Kroes, Wilma G M; Ruivenkamp, Claudia; van der Burg, Sjoerd H; Luyten, Grégorius P M; van Hall, Thorbald; Jager, Martine J; van der Velden, Pieter A

    2015-11-10

    Gene expression profiles as well as genomic imbalances are correlated with disease progression in uveal melanoma (UM). We integrated expression and genomic profiles to obtain insight into the oncogenic mechanisms in development and progression of UM. We used tumor tissue from 64 enucleated eyes of UM patients for profiling. Mutations and genomic imbalances were quantified with digital PCR to study tumor heterogeneity and molecular pathogenesis. Gene expression analysis divided the UM panel into three classes. Class I presented tumors with a good prognosis and a distinct genomic makeup that is characterized by 6p gain. The UM with a bad prognosis were subdivided into class IIa and class IIb. These classes presented similar survival risks but could be distinguished by tumor heterogeneity. Class IIa presented homogeneous tumors while class IIb tumors, on average, contained 30% non-mutant cells. Tumor heterogeneity coincided with expression of a set of immune genes revealing an extensive immune infiltrate in class IIb tumors. Molecularly, class IIa and IIb presented the same genomic configuration and could only be distinguished by 8q copy number. Moreover, UM become established in the void of the immune-privileged eye, indicating that in IIb tumors the infiltrate is attracted by the UM. Combined, our data show that chromosome 8q contains the locus that causes the immune phenotype of UM. UM thereby provides a unique opportunity to study immune attraction by tumors. PMID:26462151

  6. Impact of Nucleoporin-Mediated Chromatin Localization and Nuclear Architecture on HIV Integration Site Selection.

    PubMed

    Wong, Richard W; Mamede, João I; Hope, Thomas J

    2015-10-01

    It has been known for a number of years that integration sites of human immunodeficiency virus type 1 (HIV-1) DNA show a preference for actively expressed chromosomal locations. A number of viral and cellular proteins are implicated in this process, but the underlying mechanism is not clear. Two recent breakthrough publications advance our understanding of HIV integration site selection by focusing on the localization of the preferred target genes of integration. These studies reveal that knockdown of certain nucleoporins and components of nucleocytoplasmic trafficking alter integration site preference, not by altering the trafficking of the viral genome but by altering the chromatin subtype localization relative to the structure of the nucleus. Here, we describe the link between the nuclear basket nucleoporins (Tpr and Nup153) and chromatin organization and how altering the host environment by manipulating nuclear structure may have important implications for the preferential integration of HIV into actively transcribed genes, facilitating efficient viral replication. PMID:26136574

  7. Impact of Nucleoporin-Mediated Chromatin Localization and Nuclear Architecture on HIV Integration Site Selection.

    PubMed

    Wong, Richard W; Mamede, João I; Hope, Thomas J

    2015-10-01

    It has been known for a number of years that integration sites of human immunodeficiency virus type 1 (HIV-1) DNA show a preference for actively expressed chromosomal locations. A number of viral and cellular proteins are implicated in this process, but the underlying mechanism is not clear. Two recent breakthrough publications advance our understanding of HIV integration site selection by focusing on the localization of the preferred target genes of integration. These studies reveal that knockdown of certain nucleoporins and components of nucleocytoplasmic trafficking alter integration site preference, not by altering the trafficking of the viral genome but by altering the chromatin subtype localization relative to the structure of the nucleus. Here, we describe the link between the nuclear basket nucleoporins (Tpr and Nup153) and chromatin organization and how altering the host environment by manipulating nuclear structure may have important implications for the preferential integration of HIV into actively transcribed genes, facilitating efficient viral replication.

  8. Impact of Nucleoporin-Mediated Chromatin Localization and Nuclear Architecture on HIV Integration Site Selection

    PubMed Central

    Mamede, João I.

    2015-01-01

    It has been known for a number of years that integration sites of human immunodeficiency virus type 1 (HIV-1) DNA show a preference for actively expressed chromosomal locations. A number of viral and cellular proteins are implicated in this process, but the underlying mechanism is not clear. Two recent breakthrough publications advance our understanding of HIV integration site selection by focusing on the localization of the preferred target genes of integration. These studies reveal that knockdown of certain nucleoporins and components of nucleocytoplasmic trafficking alter integration site preference, not by altering the trafficking of the viral genome but by altering the chromatin subtype localization relative to the structure of the nucleus. Here, we describe the link between the nuclear basket nucleoporins (Tpr and Nup153) and chromatin organization and how altering the host environment by manipulating nuclear structure may have important implications for the preferential integration of HIV into actively transcribed genes, facilitating efficient viral replication. PMID:26136574

  9. NAHR-mediated copy-number variants in a clinical population: mechanistic insights into both genomic disorders and Mendelizing traits.

    PubMed

    Dittwald, Piotr; Gambin, Tomasz; Szafranski, Przemyslaw; Li, Jian; Amato, Stephen; Divon, Michael Y; Rodríguez Rojas, Lisa Ximena; Elton, Lindsay E; Scott, Daryl A; Schaaf, Christian P; Torres-Martinez, Wilfredo; Stevens, Abby K; Rosenfeld, Jill A; Agadi, Satish; Francis, David; Kang, Sung-Hae L; Breman, Amy; Lalani, Seema R; Bacino, Carlos A; Bi, Weimin; Milosavljevic, Aleksandar; Beaudet, Arthur L; Patel, Ankita; Shaw, Chad A; Lupski, James R; Gambin, Anna; Cheung, Sau Wai; Stankiewicz, Pawel

    2013-09-01

    We delineated and analyzed directly oriented paralogous low-copy repeats (DP-LCRs) in the most recent version of the human haploid reference genome. The computationally defined DP-LCRs were cross-referenced with our chromosomal microarray analysis (CMA) database of 25,144 patients subjected to genome-wide assays. This computationally guided approach to the empirically derived large data set allowed us to investigate genomic rearrangement relative frequencies and identify new loci for recurrent nonallelic homologous recombination (NAHR)-mediated copy-number variants (CNVs). The most commonly observed recurrent CNVs were NPHP1 duplications (233), CHRNA7 duplications (175), and 22q11.21 deletions (DiGeorge/velocardiofacial syndrome, 166). In the ∼25% of CMA cases for which parental studies were available, we identified 190 de novo recurrent CNVs. In this group, the most frequently observed events were deletions of 22q11.21 (48), 16p11.2 (autism, 34), and 7q11.23 (Williams-Beuren syndrome, 11). Several features of DP-LCRs, including length, distance between NAHR substrate elements, DNA sequence identity (fraction matching), GC content, and concentration of the homologous recombination (HR) hot spot motif 5'-CCNCCNTNNCCNC-3', correlate with the frequencies of the recurrent CNV events. Four novel adjacent DP-LCR-flanked and NAHR-prone regions, involving 2q12.2q13, were elucidated in association with novel genomic disorders. Our study quantitates genome architectural features responsible for NAHR-mediated genomic instability and further elucidates the role of NAHR in human disease. PMID:23657883
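
    One of the DP-LCR features named above is the concentration of the HR hot spot motif 5'-CCNCCNTNNCCNC-3'. The sketch below shows one way such a concentration could be measured, not the authors' method: the IUPAC code N is expanded to any base and overlapping matches are counted. Counting on both strands and reporting matches per kilobase are assumptions made here for illustration, as are all function names.

```python
import re

# Motif as written in the abstract; N stands for any of A, C, G, T.
HOTSPOT_MOTIF = "CCNCCNTNNCCNC"


def motif_to_regex(motif):
    """Expand each N in the motif to a character class of all four bases."""
    return "".join("[ACGT]" if base == "N" else base for base in motif)


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_motif(seq, motif=HOTSPOT_MOTIF):
    """Count motif occurrences, overlaps included, on both strands."""
    # A zero-width lookahead lets findall report overlapping matches.
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    seq = seq.upper()
    forward = len(pattern.findall(seq))
    reverse = len(pattern.findall(reverse_complement(seq)))
    return forward + reverse


def motif_density_per_kb(seq, motif=HOTSPOT_MOTIF):
    """Motif matches per 1,000 bases of sequence (0.0 for an empty sequence)."""
    return 1000.0 * count_motif(seq, motif) / len(seq) if seq else 0.0


# Hypothetical 19-base fragment carrying one forward-strand match.
example_count = count_motif("AAACCACCATAACCACGGG")
```

    A density computed this way for each repeat could then be compared against the observed CNV frequency at the locus it flanks.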

  10. NAHR-mediated copy-number variants in a clinical population: Mechanistic insights into both genomic disorders and Mendelizing traits

    PubMed Central

    Dittwald, Piotr; Gambin, Tomasz; Szafranski, Przemyslaw; Li, Jian; Amato, Stephen; Divon, Michael Y.; Rodríguez Rojas, Lisa Ximena; Elton, Lindsay E.; Scott, Daryl A.; Schaaf, Christian P.; Torres-Martinez, Wilfredo; Stevens, Abby K.; Rosenfeld, Jill A.; Agadi, Satish; Francis, David; Kang, Sung-Hae L.; Breman, Amy; Lalani, Seema R.; Bacino, Carlos A.; Bi, Weimin; Milosavljevic, Aleksandar; Beaudet, Arthur L.; Patel, Ankita; Shaw, Chad A.; Lupski, James R.; Gambin, Anna; Cheung, Sau Wai; Stankiewicz, Pawel

    2013-01-01

    We delineated and analyzed directly oriented paralogous low-copy repeats (DP-LCRs) in the most recent version of the human haploid reference genome. The computationally defined DP-LCRs were cross-referenced with our chromosomal microarray analysis (CMA) database of 25,144 patients subjected to genome-wide assays. This computationally guided approach to the empirically derived large data set allowed us to investigate genomic rearrangement relative frequencies and identify new loci for recurrent nonallelic homologous recombination (NAHR)-mediated copy-number variants (CNVs). The most commonly observed recurrent CNVs were NPHP1 duplications (233), CHRNA7 duplications (175), and 22q11.21 deletions (DiGeorge/velocardiofacial syndrome, 166). In the ∼25% of CMA cases for which parental studies were available, we identified 190 de novo recurrent CNVs. In this group, the most frequently observed events were deletions of 22q11.21 (48), 16p11.2 (autism, 34), and 7q11.23 (Williams-Beuren syndrome, 11). Several features of DP-LCRs, including length, distance between NAHR substrate elements, DNA sequence identity (fraction matching), GC content, and concentration of the homologous recombination (HR) hot spot motif 5′-CCNCCNTNNCCNC-3′, correlate with the frequencies of the recurrent CNV events. Four novel adjacent DP-LCR-flanked and NAHR-prone regions, involving 2q12.2q13, were elucidated in association with novel genomic disorders. Our study quantitates genome architectural features responsible for NAHR-mediated genomic instability and further elucidates the role of NAHR in human disease. PMID:23657883

  11. NAHR-mediated copy-number variants in a clinical population: mechanistic insights into both genomic disorders and Mendelizing traits.

    PubMed

    Dittwald, Piotr; Gambin, Tomasz; Szafranski, Przemyslaw; Li, Jian; Amato, Stephen; Divon, Michael Y; Rodríguez Rojas, Lisa Ximena; Elton, Lindsay E; Scott, Daryl A; Schaaf, Christian P; Torres-Martinez, Wilfredo; Stevens, Abby K; Rosenfeld, Jill A; Agadi, Satish; Francis, David; Kang, Sung-Hae L; Breman, Amy; Lalani, Seema R; Bacino, Carlos A; Bi, Weimin; Milosavljevic, Aleksandar; Beaudet, Arthur L; Patel, Ankita; Shaw, Chad A; Lupski, James R; Gambin, Anna; Cheung, Sau Wai; Stankiewicz, Pawel

    2013-09-01

    We delineated and analyzed directly oriented paralogous low-copy repeats (DP-LCRs) in the most recent version of the human haploid reference genome. The computationally defined DP-LCRs were cross-referenced with our chromosomal microarray analysis (CMA) database of 25,144 patients subjected to genome-wide assays. This computationally guided approach to the empirically derived large data set allowed us to investigate genomic rearrangement relative frequencies and identify new loci for recurrent nonallelic homologous recombination (NAHR)-mediated copy-number variants (CNVs). The most commonly observed recurrent CNVs were NPHP1 duplications (233), CHRNA7 duplications (175), and 22q11.21 deletions (DiGeorge/velocardiofacial syndrome, 166). In the ∼25% of CMA cases for which parental studies were available, we identified 190 de novo recurrent CNVs. In this group, the most frequently observed events were deletions of 22q11.21 (48), 16p11.2 (autism, 34), and 7q11.23 (Williams-Beuren syndrome, 11). Several features of DP-LCRs, including length, distance between NAHR substrate elements, DNA sequence identity (fraction matching), GC content, and concentration of the homologous recombination (HR) hot spot motif 5'-CCNCCNTNNCCNC-3', correlate with the frequencies of the recurrent CNV events. Four novel adjacent DP-LCR-flanked and NAHR-prone regions, involving 2q12.2q13, were elucidated in association with novel genomic disorders. Our study quantitates genome architectural features responsible for NAHR-mediated genomic instability and further elucidates the role of NAHR in human disease.

  12. The RNAPII-CTD Maintains Genome Integrity through Inhibition of Retrotransposon Gene Expression and Transposition

    PubMed Central

    Aristizabal, Maria J.; Negri, Gian Luca; Kobor, Michael S.

    2015-01-01

    RNA polymerase II (RNAPII) contains a unique C-terminal domain that is composed of heptapeptide repeats and which plays important regulatory roles during gene expression. RNAPII is responsible for the transcription of most protein-coding genes, a subset of non-coding genes, and retrotransposons. Retrotransposon transcription is the first step in their multiplication cycle, given that the RNA intermediate is required for the synthesis of cDNA, the material that is ultimately incorporated into a new genomic location. Retrotransposition can have grave consequences to genome integrity, as integration events can change the gene expression landscape or lead to alteration or loss of genetic information. Given that RNAPII transcribes retrotransposons, we sought to investigate if the RNAPII-CTD played a role in the regulation of retrotransposon gene expression. Importantly, we found that the RNAPII-CTD functioned to maintain genome integrity through inhibition of retrotransposon gene expression, as reducing CTD length significantly increased expression and transposition rates of Ty1 elements. Mechanistically, the increased Ty1 mRNA levels in the rpb1-CTD11 mutant were partly due to Cdk8-dependent alterations to the RNAPII-CTD phosphorylation status. In addition, Cdk8 alone contributed to Ty1 gene expression regulation by altering the occupancy of the gene-specific transcription factor Ste12. Loss of STE12 and TEC1 suppressed growth phenotypes of the RNAPII-CTD truncation mutant. Collectively, our results implicate Ste12 and Tec1 as general and important contributors to the Cdk8, RNAPII-CTD regulatory circuitry as it relates to the maintenance of genome integrity. PMID:26496706

  13. Combining qualitative and quantitative imaging evaluation for the assessment of genomic DNA integrity: The SPIDIA experience.

    PubMed

    Ciniselli, Chiara Maura; Pizzamiglio, Sara; Malentacchi, Francesca; Gelmini, Stefania; Pazzagli, Mario; Hartmann, Christina C; Ibrahim-Gawel, Hady; Verderio, Paolo

    2015-06-15

    In this note, we present an ad hoc procedure that combines qualitative (visual evaluation) and quantitative (ImageJ software) evaluations of Pulsed-Field Gel Electrophoresis (PFGE) images to assess the genomic DNA (gDNA) integrity of analyzed samples. This procedure could be suitable for the analysis of a large number of images by taking into consideration both the expertise of researchers and the objectiveness of the software. We applied this procedure on the first SPIDIA DNA External Quality Assessment (EQA) samples. Results show that the classification obtained by this ad hoc procedure allows a more accurate evaluation of gDNA integrity with respect to a single approach.

  14. An integrated rat genome map based on genetic and cytogenetic data.

    PubMed

    Kitada, K; Voigt, B; Kondo, Y; Serikawa, T

    2000-04-01

    In this study we combined three major rat genome maps, by adding 66 markers to the Kyoto Laboratory Animal Science map (KLAS map), and constructed an integrated map. The resultant integrated map consists of 5,682 redundant markers, spanning a genetic length of 2,028 cM. Eighty genetic markers were anchored to the cytogenetic map, fixing all the genetic maps in the physically correct orientation. This map encapsulates the progress in rat mapping studies in past years and offers useful information for QTL analysis. The map figures are available at http://www.anim.med.kyoto-u.ac.jp/.

  15. Anti-infectious drug repurposing using an integrated chemical genomics and structural systems biology approach.

    PubMed

    Ng, Clara; Hauptman, Ruth; Zhang, Yinliang; Bourne, Philip E; Xie, Lei

    2014-01-01

    The emergence of multi-drug and extensive drug resistance of microbes to antibiotics poses a great threat to human health. Although drug repurposing is a promising solution for accelerating the drug development process, its application to anti-infectious drug discovery is limited by the scope of existing phenotype-, ligand-, or target-based methods. In this paper we introduce a new computational strategy to determine the genome-wide molecular targets of bioactive compounds in both human and bacterial genomes. Our method is based on the use of a novel algorithm, ligand Enrichment of Network Topological Similarity (ligENTS), to map the chemical universe to its global pharmacological space. ligENTS outperforms the state-of-the-art algorithms in identifying novel drug-target relationships. Furthermore, we integrate ligENTS with our structural systems biology platform to identify drug repurposing opportunities via target similarity profiling. Using this integrated strategy, we have identified novel P. falciparum targets of drug-like active compounds from the Malaria Box, and suggest that a number of approved drugs may be active against malaria. This study demonstrates the potential of an integrative chemical genomics and structural systems biology approach to drug repurposing.

  16. CTDB: An Integrated Chickpea Transcriptome Database for Functional and Applied Genomics.

    PubMed

    Verma, Mohit; Kumar, Vinay; Patel, Ravi K; Garg, Rohini; Jain, Mukesh

    2015-01-01

    Chickpea is an important grain legume used as a rich source of protein in the human diet. The narrow genetic diversity and limited availability of genomic resources are the major constraints in implementing breeding strategies and biotechnological interventions for genetic enhancement of chickpea. We developed an integrated Chickpea Transcriptome Database (CTDB), which provides a comprehensive web interface for visualization and easy retrieval of transcriptome data in chickpea. The database features many tools for similarity search, functional annotation (putative function, PFAM domain and gene ontology) search and comparative gene expression analysis. The current release of CTDB (v2.0) hosts transcriptome datasets with high quality functional annotation from cultivated (desi and kabuli types) and wild chickpea. A catalog of transcription factor families and their expression profiles in chickpea is available in the database. The gene expression data have been integrated to study the expression profiles of chickpea transcripts in major tissues/organs and various stages of flower development. The utilities, such as similarity search, ortholog identification and comparative gene expression, have also been implemented in the database to facilitate comparative genomic studies among different legumes and Arabidopsis. Furthermore, the CTDB represents a resource for the discovery of functional molecular markers (microsatellites and single nucleotide polymorphisms) between different chickpea types. We anticipate that the integrated information content of this database will accelerate the functional and applied genomic research for improvement of chickpea. The CTDB web service is freely available at http://nipgr.res.in/ctdb.html. PMID:26322998

  17. A series of conditional shuttle vectors for targeted genomic integration in budding yeast.

    PubMed

    Chou, Chia-Ching; Patel, Michael T; Gartenberg, Marc R

    2015-05-01

    The capacity of Saccharomyces cerevisiae to repair exposed DNA ends by homologous recombination has long been used by experimentalists to assemble plasmids from DNA fragments in vivo. While this approach works well for engineering extrachromosomal vectors, it is not well suited to the generation, recovery and reuse of integrative vectors. Here, we describe the creation of a series of conditional centromeric shuttle vectors, termed pXR vectors, that can be used for both plasmid assembly in vivo and targeted genomic integration. The defining feature of pXR vectors is that the DNA segment bearing the centromere and origin of replication, termed CEN/ARS, is flanked by a pair of loxP sites. Passaging the vectors through bacteria that express Cre recombinase reduces the loxP-CEN/ARS-loxP module to a single loxP site, thereby eliminating the ability to replicate autonomously in yeast. Each vector also contains a selectable marker gene, as well as a fragment of the HO locus, which permits targeted integration at a neutral genomic site. The pXR vectors provide a convenient and robust method to assemble DNAs for targeted genomic modifications.

  18. CTDB: An Integrated Chickpea Transcriptome Database for Functional and Applied Genomics.

    PubMed

    Verma, Mohit; Kumar, Vinay; Patel, Ravi K; Garg, Rohini; Jain, Mukesh

    2015-01-01

    Chickpea is an important grain legume used as a rich source of protein in the human diet. The narrow genetic diversity and limited availability of genomic resources are the major constraints in implementing breeding strategies and biotechnological interventions for genetic enhancement of chickpea. We developed an integrated Chickpea Transcriptome Database (CTDB), which provides a comprehensive web interface for visualization and easy retrieval of transcriptome data in chickpea. The database features many tools for similarity search, functional annotation (putative function, PFAM domain and gene ontology) search and comparative gene expression analysis. The current release of CTDB (v2.0) hosts transcriptome datasets with high quality functional annotation from cultivated (desi and kabuli types) and wild chickpea. A catalog of transcription factor families and their expression profiles in chickpea is available in the database. The gene expression data have been integrated to study the expression profiles of chickpea transcripts in major tissues/organs and various stages of flower development. The utilities, such as similarity search, ortholog identification and comparative gene expression, have also been implemented in the database to facilitate comparative genomic studies among different legumes and Arabidopsis. Furthermore, the CTDB represents a resource for the discovery of functional molecular markers (microsatellites and single nucleotide polymorphisms) between different chickpea types. We anticipate that the integrated information content of this database will accelerate the functional and applied genomic research for improvement of chickpea. The CTDB web service is freely available at http://nipgr.res.in/ctdb.html.

  19. Incidence of Genome Structure, DNA Asymmetry, and Cell Physiology on T-DNA Integration in Chromosomes of the Phytopathogenic Fungus Leptosphaeria maculans

    PubMed Central

    Bourras, Salim; Meyer, Michel; Grandaubert, Jonathan; Lapalu, Nicolas; Fudal, Isabelle; Linglin, Juliette; Ollivier, Benedicte; Blaise, Françoise; Balesdent, Marie-Hélène; Rouxel, Thierry

    2012-01-01

    The ever-increasing generation of sequence data is accompanied by unsatisfactory functional annotation, and complex genomes, such as those of plants and filamentous fungi, show a large number of genes with no predicted or known function. For functional annotation of unknown or hypothetical genes, the production of collections of mutants using Agrobacterium tumefaciens–mediated transformation (ATMT) associated with genotyping and phenotyping has gained wide acceptance. ATMT is also widely used to identify pathogenicity determinants in pathogenic fungi. A systematic analysis of T-DNA borders was performed in an ATMT-mutagenized collection of the phytopathogenic fungus Leptosphaeria maculans to evaluate the features of T-DNA integration in its particular transposable element-rich compartmentalized genome. A total of 318 T-DNA tags were recovered and analyzed for biases in chromosome and genic compartments, existence of CG/AT skews at the insertion site, and occurrence of microhomologies between the T-DNA left border (LB) and the target sequence. Functional annotation of targeted genes was done using the Gene Ontology annotation. The T-DNA integration mainly targeted gene-rich, transcriptionally active regions, and it favored biological processes consistent with the physiological status of a germinating spore. T-DNA integration was strongly biased toward regulatory regions, and mainly promoters. Consistent with the T-DNA intranuclear-targeting model, the density of T-DNA insertion correlated with CG skew near the transcription initiation site. The existence of microhomologies between promoter sequences and the T-DNA LB flanking sequence was also consistent with T-DNA integration to host DNA mediated by homologous recombination based on the microhomology-mediated end-joining pathway. PMID:22908038

  20. Integrated physical, genetic and genome map of chickpea (Cicer arietinum L.).

    PubMed

    Varshney, Rajeev K; Mir, Reyazul Rouf; Bhatia, Sabhyata; Thudi, Mahendar; Hu, Yuqin; Azam, Sarwar; Zhang, Yong; Jaganathan, Deepa; You, Frank M; Gao, Jinliang; Riera-Lizarazu, Oscar; Luo, Ming-Cheng

    2014-03-01

    A physical map of chickpea was developed for the reference chickpea genotype (ICC 4958) using bacterial artificial chromosome (BAC) libraries targeting 71,094 clones (~12× coverage). High information content fingerprinting (HICF) of these clones gave high-quality fingerprinting data for 67,483 clones, and 1,174 contigs comprising 46,112 clones and 3,256 singletons were defined. In brief, 574 Mb genome size was assembled in 1,174 contigs with an average of 0.49 Mb per contig and 3,256 singletons represent 407 Mb genome. The physical map was linked with two genetic maps with the help of 245 BAC-end sequence (BES)-derived simple sequence repeat (SSR) markers. This allowed locating some of the BACs in the vicinity of some important quantitative trait loci (QTLs) for drought tolerance and resistance to Fusarium wilt and Ascochyta blight. In addition, fingerprinted contig (FPC) assembly was also integrated with the draft genome sequence of chickpea. As a result, ~965 BACs including 163 minimum tiling path (MTP) clones could be mapped on eight pseudo-molecules of chickpea forming 491 hypothetical contigs representing 54,013,992 bp (~54 Mb) of the draft genome. Comprehensive analysis of markers in abiotic and biotic stress tolerance QTL regions led to identification of 654, 306 and 23 genes in drought tolerance "QTL-hotspot" region, Ascochyta blight resistance QTL region and Fusarium wilt resistance QTL region, respectively. The integrated physical, genetic and genome map should provide a foundation for cloning and isolation of QTLs/genes for molecular dissection of traits as well as markers for molecular breeding for chickpea improvement.

  1. Identification of genes for complex diseases using integrated analysis of multiple types of genomic data.

    PubMed

    Cao, Hongbao; Lei, Shufeng; Deng, Hong-Wen; Wang, Yu-Ping

    2012-01-01

    Various types of genomic data (e.g., SNPs and mRNA transcripts) have been employed to identify risk genes for complex diseases. However, the analysis of these data has largely been performed in isolation. Combining these multiple data for integrative analysis can take advantage of complementary information and thus can have higher power to identify genes (and/or their functions) that would otherwise be impossible with individual data analysis. Due to the different nature, structure, and format of diverse sets of genomic data, multiple genomic data integration is challenging. Here we address the problem by developing a sparse representation based clustering (SRC) method for integrative data analysis. As an example, we applied the SRC method to the integrative analysis of 376821 SNPs in 200 subjects (100 cases and 100 controls) and expression data for 22283 genes in 80 subjects (40 cases and 40 controls) to identify significant genes for osteoporosis (OP). Comparing our results with previous studies, we identified some genes known to be related to OP risk (e.g., 'THSD4', 'CRHR1', 'HSD11B1', 'THSD7A', 'BMPR1B', 'ADCY10', 'PRL', 'CA8', 'ESRRA', 'CALM1', 'SPARC', and 'LRP1'). Moreover, we uncovered novel osteoporosis susceptibility genes ('DICER1', 'PTMA', etc.) that were not found previously but play functionally important roles in osteoporosis etiology according to existing studies. In addition, the genes identified by the SRC method can lead to higher accuracy for the diagnosis/classification of osteoporosis subjects when compared with the traditional T-test and Fisher-exact test, which further validates the proposed SRC approach for integrative analysis.

  2. Construction of an Ortholog Database Using the Semantic Web Technology for Integrative Analysis of Genomic Data

    PubMed Central

    Chiba, Hirokazu; Nishide, Hiroyo; Uchiyama, Ikuo

    2015-01-01

    Recently, various types of biological data, including genomic sequences, have been rapidly accumulating. To discover biological knowledge from such growing heterogeneous data, a flexible framework for data integration is necessary. Ortholog information is a central resource for interlinking corresponding genes among different organisms, and the Semantic Web provides a key technology for the flexible integration of heterogeneous data. We have constructed an ortholog database using the Semantic Web technology, aiming at the integration of numerous genomic data and various types of biological information. To formalize the structure of the ortholog information in the Semantic Web, we have constructed the Ortholog Ontology (OrthO). While the OrthO is a compact ontology for general use, it is designed to be extended to the description of database-specific concepts. On the basis of OrthO, we described the ortholog information from our Microbial Genome Database for Comparative Analysis (MBGD) in the form of Resource Description Framework (RDF) and made it available through the SPARQL endpoint, which accepts arbitrary queries specified by users. In this framework based on the OrthO, the biological data of different organisms can be integrated using the ortholog information as a hub. Besides, the ortholog information from different data sources can be compared with each other using the OrthO as a shared ontology. Here we show some examples demonstrating that the ortholog information described in RDF can be used to link various biological data such as taxonomy information and Gene Ontology. Thus, the ortholog database using the Semantic Web technology can contribute to biological knowledge discovery through integrative data analysis. PMID:25875762

  3. Identification of Genes for Complex Diseases Using Integrated Analysis of Multiple Types of Genomic Data

    PubMed Central

    Cao, Hongbao; Lei, Shufeng; Deng, Hong-Wen; Wang, Yu-Ping

    2012-01-01

    Various types of genomic data (e.g., SNPs and mRNA transcripts) have been employed to identify risk genes for complex diseases. However, the analysis of these data has largely been performed in isolation. Combining these multiple data for integrative analysis can take advantage of complementary information and thus can have higher power to identify genes (and/or their functions) that would otherwise be impossible with individual data analysis. Due to the different nature, structure, and format of diverse sets of genomic data, multiple genomic data integration is challenging. Here we address the problem by developing a sparse representation based clustering (SRC) method for integrative data analysis. As an example, we applied the SRC method to the integrative analysis of 376821 SNPs in 200 subjects (100 cases and 100 controls) and expression data for 22283 genes in 80 subjects (40 cases and 40 controls) to identify significant genes for osteoporosis (OP). Comparing our results with previous studies, we identified some genes known to be related to OP risk (e.g., ‘THSD4’, ‘CRHR1’, ‘HSD11B1’, ‘THSD7A’, ‘BMPR1B’, ‘ADCY10’, ‘PRL’, ‘CA8’, ‘ESRRA’, ‘CALM1’, ‘SPARC’, and ‘LRP1’). Moreover, we uncovered novel osteoporosis susceptibility genes (‘DICER1’, ‘PTMA’, etc.) that were not found previously but play functionally important roles in osteoporosis etiology according to existing studies. In addition, the genes identified by the SRC method can lead to higher accuracy for the diagnosis/classification of osteoporosis subjects when compared with the traditional T-test and Fisher-exact test, which further validates the proposed SRC approach for integrative analysis. PMID:22957024

  4. Construction of an ortholog database using the semantic web technology for integrative analysis of genomic data.

    PubMed

    Chiba, Hirokazu; Nishide, Hiroyo; Uchiyama, Ikuo

    2015-01-01

    Recently, various types of biological data, including genomic sequences, have been rapidly accumulating. To discover biological knowledge from such growing heterogeneous data, a flexible framework for data integration is necessary. Ortholog information is a central resource for interlinking corresponding genes among different organisms, and the Semantic Web provides a key technology for the flexible integration of heterogeneous data. We have constructed an ortholog database using the Semantic Web technology, aiming at the integration of numerous genomic data and various types of biological information. To formalize the structure of the ortholog information in the Semantic Web, we have constructed the Ortholog Ontology (OrthO). While the OrthO is a compact ontology for general use, it is designed to be extended to the description of database-specific concepts. On the basis of OrthO, we described the ortholog information from our Microbial Genome Database for Comparative Analysis (MBGD) in the form of Resource Description Framework (RDF) and made it available through the SPARQL endpoint, which accepts arbitrary queries specified by users. In this framework based on the OrthO, the biological data of different organisms can be integrated using the ortholog information as a hub. Besides, the ortholog information from different data sources can be compared with each other using the OrthO as a shared ontology. Here we show some examples demonstrating that the ortholog information described in RDF can be used to link various biological data such as taxonomy information and Gene Ontology. Thus, the ortholog database using the Semantic Web technology can contribute to biological knowledge discovery through integrative data analysis.

  5. Construction of an ortholog database using the semantic web technology for integrative analysis of genomic data.

    PubMed

    Chiba, Hirokazu; Nishide, Hiroyo; Uchiyama, Ikuo

    2015-01-01

    Recently, various types of biological data, including genomic sequences, have been rapidly accumulating. To discover biological knowledge from such growing heterogeneous data, a flexible framework for data integration is necessary. Ortholog information is a central resource for interlinking corresponding genes among different organisms, and the Semantic Web provides a key technology for the flexible integration of heterogeneous data. We have constructed an ortholog database using the Semantic Web technology, aiming at the integration of numerous genomic data and various types of biological information. To formalize the structure of the ortholog information in the Semantic Web, we have constructed the Ortholog Ontology (OrthO). While the OrthO is a compact ontology for general use, it is designed to be extended to the description of database-specific concepts. On the basis of OrthO, we described the ortholog information from our Microbial Genome Database for Comparative Analysis (MBGD) in the form of Resource Description Framework (RDF) and made it available through the SPARQL endpoint, which accepts arbitrary queries specified by users. In this framework based on the OrthO, the biological data of different organisms can be integrated using the ortholog information as a hub. Besides, the ortholog information from different data sources can be compared with each other using the OrthO as a shared ontology. Here we show some examples demonstrating that the ortholog information described in RDF can be used to link various biological data such as taxonomy information and Gene Ontology. Thus, the ortholog database using the Semantic Web technology can contribute to biological knowledge discovery through integrative data analysis. PMID:25875762

  6. A geminivirus-based guide RNA delivery system for CRISPR/Cas9 mediated plant genome editing

    PubMed Central

    Yin, Kangquan; Han, Ting; Liu, Guang; Chen, Tianyuan; Wang, Ying; Yu, Alice Yunzi L.; Liu, Yule

    2015-01-01

    CRISPR/Cas has emerged as a potent genome editing technology and has successfully been applied in many organisms, including several plant species. However, delivery of genome editing reagents remains a challenge in plants. Here, we report a virus-based guide RNA (gRNA) delivery system for CRISPR/Cas9 mediated plant genome editing (VIGE) that can be used to precisely target genome locations and cause mutations. VIGE is performed by using a modified Cabbage Leaf Curl virus (CaLCuV) vector to express gRNAs in stable transgenic plants expressing Cas9. DNA sequencing confirmed VIGE of endogenous NbPDS3 and NbIspH genes in non-inoculated leaves because CaLCuV can infect plants systemically. Moreover, VIGE of NbPDS3 and NbIspH in newly developed leaves caused a photo-bleached phenotype. These results demonstrate that geminivirus-based VIGE could be a powerful tool in plant genome editing. PMID:26450012

  7. A geminivirus-based guide RNA delivery system for CRISPR/Cas9 mediated plant genome editing.

    PubMed

    Yin, Kangquan; Han, Ting; Liu, Guang; Chen, Tianyuan; Wang, Ying; Yu, Alice Yunzi L; Liu, Yule

    2015-01-01

    CRISPR/Cas has emerged as a potent genome editing technology and has successfully been applied in many organisms, including several plant species. However, delivery of genome editing reagents remains a challenge in plants. Here, we report a virus-based guide RNA (gRNA) delivery system for CRISPR/Cas9 mediated plant genome editing (VIGE) that can be used to precisely target genome locations and cause mutations. VIGE is performed by using a modified Cabbage Leaf Curl virus (CaLCuV) vector to express gRNAs in stable transgenic plants expressing Cas9. DNA sequencing confirmed VIGE of endogenous NbPDS3 and NbIspH genes in non-inoculated leaves because CaLCuV can infect plants systemically. Moreover, VIGE of NbPDS3 and NbIspH in newly developed leaves caused a photo-bleached phenotype. These results demonstrate that geminivirus-based VIGE could be a powerful tool in plant genome editing. PMID:26450012

  8. Agrobacterium proteins VirD2 and VirE2 mediate precise integration of synthetic T-DNA complexes in mammalian cells.

    PubMed

    Pelczar, Pawel; Kalck, Véronique; Gomez, Divina; Hohn, Barbara

    2004-06-01

    Agrobacterium tumefaciens-mediated plant transformation, a unique example of interkingdom gene transfer, has been widely adopted for the generation of transgenic plants. In vitro synthesized transferred DNA (T-DNA) complexes comprising single-stranded DNA and Agrobacterium virulence proteins VirD2 and VirE2, essential for plant transformation, were used to stably transfect HeLa cells. Both proteins positively influenced efficiency and precision of transgene integration by increasing overall transformation rates and by promoting full-length single-copy integration events. These findings demonstrate that the virulence proteins are sufficient for the integration of a T-DNA into a eukaryotic genome in the absence of other bacterial or plant factors. Synthetic T-DNA complexes are therefore unique protein:DNA delivery vectors with potential applications in the field of mammalian transgenesis. PMID:15153934

  9. Integrated Database And Knowledge Base For Genomic Prospective Cohort Study In Tohoku Medical Megabank Toward Personalized Prevention And Medicine.

    PubMed

    Ogishima, Soichi; Takai, Takako; Shimokawa, Kazuro; Nagaie, Satoshi; Tanaka, Hiroshi; Nakaya, Jun

    2015-01-01

    The Tohoku Medical Megabank project is a national project to revitalize the area of the Tohoku region affected by the Great East Japan Earthquake, and it has conducted a large-scale prospective genome-cohort study. Along with the prospective genome-cohort study, we have developed an integrated database and knowledge base, which will be a key database for realizing personalized prevention and medicine.

  10. Genome-wide variant analysis of simplex autism families with an integrative clinical-bioinformatics pipeline

    PubMed Central

    Jiménez-Barrón, Laura T.; O'Rawe, Jason A.; Wu, Yiyang; Yoon, Margaret; Fang, Han; Iossifov, Ivan; Lyon, Gholson J.

    2015-01-01

    Autism spectrum disorders (ASDs) are a group of developmental disabilities that affect social interaction and communication and are characterized by repetitive behaviors. There is now a large body of evidence that suggests a complex role of genetics in ASDs, in which many different loci are involved. Although many current population-scale genomic studies have been demonstrably fruitful, these studies generally focus on analyzing a limited part of the genome or use a limited set of bioinformatics tools. These limitations preclude the analysis of genome-wide perturbations that may contribute to the development and severity of ASD-related phenotypes. To overcome these limitations, we have developed and utilized an integrative clinical and bioinformatics pipeline for generating a more complete and reliable set of genomic variants for downstream analyses. Our study focuses on the analysis of three simplex autism families consisting of one affected child, unaffected parents, and one unaffected sibling. All members were clinically evaluated and widely phenotyped. Genotyping arrays and whole-genome sequencing were performed on each member, and the resulting sequencing data were analyzed using a variety of available bioinformatics tools. We searched for rare variants of putative functional impact that were found to be segregating according to de novo, autosomal recessive, X-linked, mitochondrial, and compound heterozygote transmission models. The resulting candidate variants included three small heterozygous copy-number variations (CNVs), a rare heterozygous de novo nonsense mutation in MYBBP1A located within exon 1, and a novel de novo missense variant in LAMB3. Our work demonstrates how more comprehensive analyses that include rich clinical data and whole-genome sequencing data can generate reliable results for use in downstream investigations. PMID:27148569

  11. Different Foreign Genes Incidentally Integrated into the Same Locus of the Streptococcus suis Genome

    PubMed Central

    Sekizaki, Tsutomu; Takamatsu, Daisuke; Osaki, Makoto; Shimoji, Yoshihiro

    2005-01-01

    Some strains of Streptococcus suis possess a type II restriction-modification (RM) system, whose genes are thought to be inserted into the genome between purH and purD from a foreign source by illegitimate recombination. In this study, we characterized the purHD locus of the S. suis genomes of 28 serotype reference strains by DNA sequencing. Four strains contained the RM genes in the locus, as described before, whereas 11 strains possessed other genetic regions of seven classes. The genetic regions contained a single gene or multiple genes that were either unknown or similar to hypothetical genes of other bacteria. The mutually exclusive localization of the genetic regions with the atypical G+C contents indicated that these regions were also acquired from foreign sources. No transposable element or long-repeat sequence was found in the neighboring regions. An alignment of the nucleotide sequences, including the RM gene regions, suggested that the foreign regions were integrated by illegitimate recombination via short stretches of nucleotide identity. By using a thermosensitive suicide plasmid, the RM genes were experimentally introduced into an S. suis strain that did not contain any foreign genes in that locus. Integration of the plasmid into the S. suis genome did not occur in the purHD locus but occurred at various chromosomal loci, where there were 2 to 10 bp of nucleotide identity between the chromosome and the plasmid. These results suggest that various foreign genes described here were incidentally integrated into the same locus of the S. suis genome. PMID:15659665

  12. Loss of p53-mediated cell-cycle arrest, senescence and apoptosis promotes genomic instability and premature aging

    PubMed Central

    Li, Tongyuan; Liu, Xiangyu; Jiang, Le; Manfredi, James; Zha, Shan; Gu, Wei

    2016-01-01

    Although p53-mediated cell cycle arrest, senescence and apoptosis are well accepted as major tumor suppression mechanisms, the loss of these functions does not directly lead to tumorigenesis, suggesting that the precise roles of these canonical activities of p53 need to be redefined. Here, we report that the cells derived from the mutant mice expressing p53(3KR), an acetylation-defective mutant that fails to induce cell-cycle arrest, senescence and apoptosis, exhibit high levels of aneuploidy upon DNA damage. Moreover, the embryonic lethality caused by the deficiency of XRCC4, a key DNA double strand break repair factor, can be fully rescued in the p53(3KR/3KR) background. Notably, despite high levels of genomic instability, p53(3KR/3KR) XRCC4−/− mice, unlike p53−/− XRCC4−/− mice, do not succumb to pro-B-cell lymphomas. Nevertheless, p53(3KR/3KR) XRCC4−/− mice display aging-like phenotypes including testicular atrophy, kyphosis, and premature death. Further analyses demonstrate that SLC7A11 is downregulated and that p53-mediated ferroptosis is significantly induced in spleens and testis of p53(3KR/3KR) XRCC4−/− mice. These results demonstrate that the direct role of p53-mediated cell cycle arrest, senescence and apoptosis is to control genomic stability in vivo. Our study not only validates the importance of ferroptosis in p53-mediated tumor suppression in vivo but also reveals that the combination of genomic instability and activation of ferroptosis may promote aging-associated phenotypes. PMID:26943586

  13. Importance of Mediator complex in the regulation and integration of diverse signaling pathways in plants

    PubMed Central

    Samanta, Subhasis; Thakur, Jitendra K.

    2015-01-01

    Basic transcriptional machinery in eukaryotes is assisted by a number of cofactors, which either increase or decrease the rate of transcription. Mediator complex is one such cofactor, and recently has drawn a lot of interest because of its integrative power to converge different signaling pathways before channeling the transcription instructions to the RNA polymerase II machinery. Like yeast and metazoans, plants do possess the Mediator complex across the kingdom, and its isolation and subunit analyses have been reported from the model plant, Arabidopsis. Genetic and molecular analyses have unraveled important regulatory roles of Mediator subunits at every stage of the plant life cycle, from flowering to embryo and organ development, and even size determination. It also contributes immensely to the survival of plants against different environmental vagaries by the timely activation of their resistance mechanisms. Here, we have provided an overview of the plant Mediator complex, from its discovery to the regulation of the stoichiometry of its subunits. We have also reviewed the involvement of different Mediator subunits in different processes and pathways, including defense response pathways evoked by diverse biotic cues. Wherever possible, attempts have been made to provide mechanistic insight into Mediator's involvement in these processes. PMID:26442070

  14. Importance of Mediator complex in the regulation and integration of diverse signaling pathways in plants.

    PubMed

    Samanta, Subhasis; Thakur, Jitendra K

    2015-01-01

    Basic transcriptional machinery in eukaryotes is assisted by a number of cofactors, which either increase or decrease the rate of transcription. Mediator complex is one such cofactor, and recently has drawn a lot of interest because of its integrative power to converge different signaling pathways before channeling the transcription instructions to the RNA polymerase II machinery. Like yeast and metazoans, plants do possess the Mediator complex across the kingdom, and its isolation and subunit analyses have been reported from the model plant, Arabidopsis. Genetic and molecular analyses have unraveled important regulatory roles of Mediator subunits at every stage of the plant life cycle, from flowering to embryo and organ development, and even size determination. It also contributes immensely to the survival of plants against different environmental vagaries by the timely activation of their resistance mechanisms. Here, we have provided an overview of the plant Mediator complex, from its discovery to the regulation of the stoichiometry of its subunits. We have also reviewed the involvement of different Mediator subunits in different processes and pathways, including defense response pathways evoked by diverse biotic cues. Wherever possible, attempts have been made to provide mechanistic insight into Mediator's involvement in these processes.

  15. Messenger RNA- Versus Retrovirus-Based Induced Pluripotent Stem Cell Reprogramming Strategies: Analysis of Genomic Integrity

    PubMed Central

    Steichen, Clara; Luce, Eléanor; Maluenda, Jérôme; Tosca, Lucie; Moreno-Gimeno, Inmaculada; Desterke, Christophe; Dianat, Noushin; Goulinet-Mainot, Sylvie; Awan-Toor, Sarah; Burks, Deborah; Marie, Joëlle; Weber, Anne; Tachdjian, Gérard; Melki, Judith

    2014-01-01

    The use of synthetic messenger RNAs to generate human induced pluripotent stem cells (iPSCs) is particularly appealing for potential regenerative medicine applications, because it overcomes the common drawbacks of DNA-based or virus-based reprogramming strategies, including transgene integration in particular. We compared the genomic integrity of mRNA-derived iPSCs with that of retrovirus-derived iPSCs generated in strictly comparable conditions, by single-nucleotide polymorphism (SNP) and copy number variation (CNV) analyses. We showed that mRNA-derived iPSCs do not differ significantly from the parental fibroblasts in SNP analysis, whereas retrovirus-derived iPSCs do. We found that the number of CNVs seemed independent of the reprogramming method, instead appearing to be clone-dependent. Furthermore, differentiation studies indicated that mRNA-derived iPSCs differentiated efficiently into hepatoblasts and that these cells did not load additional CNVs during differentiation. The integration-free hepatoblasts that were generated constitute a new tool for the study of diseased hepatocytes derived from patients’ iPSCs and their use in the context of stem cell-derived hepatocyte transplantation. Our findings also highlight the need to conduct careful studies on genome integrity for the selection of iPSC lines before using them for further applications. PMID:24736403

  16. Messenger RNA- versus retrovirus-based induced pluripotent stem cell reprogramming strategies: analysis of genomic integrity.

    PubMed

    Steichen, Clara; Luce, Eléanor; Maluenda, Jérôme; Tosca, Lucie; Moreno-Gimeno, Inmaculada; Desterke, Christophe; Dianat, Noushin; Goulinet-Mainot, Sylvie; Awan-Toor, Sarah; Burks, Deborah; Marie, Joëlle; Weber, Anne; Tachdjian, Gérard; Melki, Judith; Dubart-Kupperschmitt, Anne

    2014-06-01

    The use of synthetic messenger RNAs to generate human induced pluripotent stem cells (iPSCs) is particularly appealing for potential regenerative medicine applications, because it overcomes the common drawbacks of DNA-based or virus-based reprogramming strategies, including transgene integration in particular. We compared the genomic integrity of mRNA-derived iPSCs with that of retrovirus-derived iPSCs generated in strictly comparable conditions, by single-nucleotide polymorphism (SNP) and copy number variation (CNV) analyses. We showed that mRNA-derived iPSCs do not differ significantly from the parental fibroblasts in SNP analysis, whereas retrovirus-derived iPSCs do. We found that the number of CNVs seemed independent of the reprogramming method, instead appearing to be clone-dependent. Furthermore, differentiation studies indicated that mRNA-derived iPSCs differentiated efficiently into hepatoblasts and that these cells did not load additional CNVs during differentiation. The integration-free hepatoblasts that were generated constitute a new tool for the study of diseased hepatocytes derived from patients' iPSCs and their use in the context of stem cell-derived hepatocyte transplantation. Our findings also highlight the need to conduct careful studies on genome integrity for the selection of iPSC lines before using them for further applications.

  17. Large-scale metabolome analysis and quantitative integration with genomics and proteomics data in Mycoplasma pneumoniae.

    PubMed

    Maier, Tobias; Marcos, Josep; Wodke, Judith A H; Paetzold, Bernhard; Liebeke, Manuel; Gutiérrez-Gallego, Ricardo; Serrano, Luis

    2013-07-01

    Systems metabolomics, the identification and quantification of cellular metabolites and their integration with genomics and proteomics data, promises valuable functional insights into cellular biology. However, technical constraints, sample complexity issues and the lack of suitable complementary quantitative data sets prevented accomplishing such studies in the past. Here, we present an integrative metabolomics study of the genome-reduced bacterium Mycoplasma pneumoniae. We experimentally analysed its metabolome using a cross-platform approach. We explain intracellular metabolite homeostasis by quantitatively integrating our results with the cellular inventory of proteins, DNA and other macromolecules, as well as with available building blocks from the growth medium. We calculated in vivo catalytic parameters of glycolytic enzymes, making use of measured reaction velocities, as well as enzyme and metabolite pool sizes. A quantitative, inter-species comparison of absolute and relative metabolite abundances indicated that metabolic pathways are regulated as functional units, thereby simplifying adaptive responses. Our analysis demonstrates the potential for new scientific insight by integrating different types of large-scale experimental data from a single biological source.

  18. Integration of HIV in the Human Genome: Which Sites Are Preferential? A Genetic and Statistical Assessment

    PubMed Central

    Gonçalves, Juliana; Moreira, Elsa; Sequeira, Inês J.; Rodrigues, António S.; Rueff, José; Brás, Aldina

    2016-01-01

    Chromosomal fragile sites (FSs) are loci where gaps and breaks may occur and are preferential integration targets for some viruses, for example, Hepatitis B, Epstein-Barr virus, HPV16, HPV18, and MLV vectors. However, the integration of the human immunodeficiency virus (HIV) in Giemsa bands and in FSs is not yet completely clear. This study aimed to assess the integration preferences of HIV in FSs and in Giemsa bands using an in silico study. HIV integration positions from Jurkat cells were used and two nonparametric tests were applied to compare HIV integration in dark versus light bands and in FS versus non-FS (NFSs). The results show that light bands are preferential targets for integration of HIV-1 in Jurkat cells and also that it integrates with equal intensity in FSs and in NFSs. The data indicates that HIV displays different preferences for FSs compared to other viruses. The aim was to develop and apply an approach to predict the conditions and constraints of HIV insertion in the human genome which seems to adequately complement empirical data. PMID:27294106

  19. ZikaVR: An Integrated Zika Virus Resource for Genomics, Proteomics, Phylogenetic and Therapeutic Analysis.

    PubMed

    Gupta, Amit Kumar; Kaur, Karambir; Rajput, Akanksha; Dhanda, Sandeep Kumar; Sehgal, Manika; Khan, Md Shoaib; Monga, Isha; Dar, Showkat Ahmad; Singh, Sandeep; Nagpal, Gandharva; Usmani, Salman Sadullah; Thakur, Anamika; Kaur, Gazaldeep; Sharma, Shivangi; Bhardwaj, Aman; Qureshi, Abid; Raghava, Gajendra Pal Singh; Kumar, Manoj

    2016-01-01

    Current Zika virus (ZIKV) outbreaks, which have spread across several areas of Africa, Southeast Asia, and the Pacific islands, have been declared a global health emergency by the World Health Organization (WHO). ZIKV causes Zika fever and illness ranging from severe autoimmune to neurological complications in humans. To facilitate research on this virus, we have developed an integrative multi-omics platform, ZikaVR (http://bioinfo.imtech.res.in/manojk/zikavr/), dedicated to ZIKV genomic, proteomic and therapeutic knowledge. It comprises whole genome sequences and their respective functional information regarding proteins, genes, and structural content. It also delivers sophisticated analyses such as whole-genome alignments, conservation and variation, CpG islands, codon context, usage bias and phylogenetic inferences at the whole genome and proteome level in a user-friendly visual environment. Further, glycosylation sites and molecular diagnostic primers were also analyzed. Most importantly, we also propose potential therapeutically important constituents, namely vaccine epitopes, siRNAs, miRNAs, sgRNAs and repurposing drug candidates. PMID:27633273

  20. MicrobesOnline: an integrated portal for comparative and functional genomics

    SciTech Connect

    Dehal, Paramvir; Joachimiak, Marcin; Price, Morgan; Bates, John; Baumohl, Jason; Chivian, Dylan; Friedland, Greg; Huang, Kathleen; Keller, Keith; Novichkov, Pavel; Dubchak, Inna; Alm, Eric; Arkin, Adam

    2011-07-14

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.

  1. MicrobesOnline: an integrated portal for comparative and functional genomics

    SciTech Connect

    Dehal, Paramvir S.; Joachimiak, Marcin P.; Price, Morgan N.; Bates, John T.; Baumohl, Jason K.; Chivian, Dylan; Friedland, Greg D.; Huang, Katherine H.; Keller, Keith; Novichkov, Pavel S.; Dubchak, Inna L.; Alm, Eric J.; Arkin, Adam P.

    2009-09-17

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.

  2. ZikaVR: An Integrated Zika Virus Resource for Genomics, Proteomics, Phylogenetic and Therapeutic Analysis

    PubMed Central

    Gupta, Amit Kumar; Kaur, Karambir; Rajput, Akanksha; Dhanda, Sandeep Kumar; Sehgal, Manika; Khan, Md. Shoaib; Monga, Isha; Dar, Showkat Ahmad; Singh, Sandeep; Nagpal, Gandharva; Usmani, Salman Sadullah; Thakur, Anamika; Kaur, Gazaldeep; Sharma, Shivangi; Bhardwaj, Aman; Qureshi, Abid; Raghava, Gajendra Pal Singh; Kumar, Manoj

    2016-01-01

    Current Zika virus (ZIKV) outbreaks, which have spread across several areas of Africa, Southeast Asia, and the Pacific islands, have been declared a global health emergency by the World Health Organization (WHO). ZIKV causes Zika fever and illness ranging from severe autoimmune to neurological complications in humans. To facilitate research on this virus, we have developed an integrative multi-omics platform, ZikaVR (http://bioinfo.imtech.res.in/manojk/zikavr/), dedicated to ZIKV genomic, proteomic and therapeutic knowledge. It comprises whole genome sequences and their respective functional information regarding proteins, genes, and structural content. It also delivers sophisticated analyses such as whole-genome alignments, conservation and variation, CpG islands, codon context, usage bias and phylogenetic inferences at the whole genome and proteome level in a user-friendly visual environment. Further, glycosylation sites and molecular diagnostic primers were also analyzed. Most importantly, we also propose potential therapeutically important constituents, namely vaccine epitopes, siRNAs, miRNAs, sgRNAs and repurposing drug candidates. PMID:27633273

  3. ZikaVR: An Integrated Zika Virus Resource for Genomics, Proteomics, Phylogenetic and Therapeutic Analysis.

    PubMed

    Gupta, Amit Kumar; Kaur, Karambir; Rajput, Akanksha; Dhanda, Sandeep Kumar; Sehgal, Manika; Khan, Md Shoaib; Monga, Isha; Dar, Showkat Ahmad; Singh, Sandeep; Nagpal, Gandharva; Usmani, Salman Sadullah; Thakur, Anamika; Kaur, Gazaldeep; Sharma, Shivangi; Bhardwaj, Aman; Qureshi, Abid; Raghava, Gajendra Pal Singh; Kumar, Manoj

    2016-01-01

    Current Zika virus (ZIKV) outbreaks, which have spread across several areas of Africa, Southeast Asia, and the Pacific islands, have been declared a global health emergency by the World Health Organization (WHO). ZIKV causes Zika fever and illness ranging from severe autoimmune to neurological complications in humans. To facilitate research on this virus, we have developed an integrative multi-omics platform, ZikaVR (http://bioinfo.imtech.res.in/manojk/zikavr/), dedicated to ZIKV genomic, proteomic and therapeutic knowledge. It comprises whole genome sequences and their respective functional information regarding proteins, genes, and structural content. It also delivers sophisticated analyses such as whole-genome alignments, conservation and variation, CpG islands, codon context, usage bias and phylogenetic inferences at the whole genome and proteome level in a user-friendly visual environment. Further, glycosylation sites and molecular diagnostic primers were also analyzed. Most importantly, we also propose potential therapeutically important constituents, namely vaccine epitopes, siRNAs, miRNAs, sgRNAs and repurposing drug candidates.

  4. Functional visualization and disruption of targeted genes using CRISPR/Cas9-mediated eGFP reporter integration in zebrafish

    PubMed Central

    Ota, Satoshi; Taimatsu, Kiyohito; Yanagi, Kanoko; Namiki, Tomohiro; Ohga, Rie; Higashijima, Shin-ichi; Kawahara, Atsuo

    2016-01-01

    The CRISPR/Cas9 complex, which is composed of a guide RNA (gRNA) and the Cas9 nuclease, is useful for carrying out genome modifications in various organisms. Recently, the CRISPR/Cas9-mediated locus-specific integration of a reporter, which contains the Mbait sequence targeted using Mbait-gRNA, the hsp70 promoter and the eGFP gene, has allowed the visualization of the target gene expression. However, it has not been ascertained whether the reporter integrations at both targeted alleles cause loss-of-function phenotypes in zebrafish. In this study, we have inserted the Mbait-hs-eGFP reporter into the pax2a gene because the disruption of pax2a causes the loss of the midbrain-hindbrain boundary (MHB) in zebrafish. In the heterozygous Tg[pax2a-hs:eGFP] embryos, MHB formed normally and the eGFP expression recapitulated the endogenous pax2a expression, including the MHB. We observed the loss of the MHB in homozygous Tg[pax2a-hs:eGFP] embryos. Furthermore, we succeeded in integrating the Mbait-hs-eGFP reporter into an uncharacterized gene epdr1. The eGFP expression in heterozygous Tg[epdr1-hs:eGFP] embryos overlapped the epdr1 expression, whereas the distribution of eGFP-positive cells was disorganized in the MHB of homozygous Tg[epdr1-hs:eGFP] embryos. We propose that the locus-specific integration of the Mbait-hs-eGFP reporter is a powerful method to investigate both gene expression profiles and loss-of-function phenotypes. PMID:27725766

  5. The Fanconi Anemia Pathway Protects Genome Integrity from R-loops

    PubMed Central

    García-Rubio, María L.; Pérez-Calero, Carmen; Barroso, Sonia I.; Tumini, Emanuela; Herrera-Moyano, Emilia; Rosado, Iván V.; Aguilera, Andrés

    2015-01-01

    Co-transcriptional RNA-DNA hybrids (R loops) cause genome instability. To prevent harmful R loop accumulation, cells have evolved specific eukaryotic factors, one being the BRCA2 double-strand break repair protein. As BRCA2 also protects stalled replication forks and is the FANCD1 member of the Fanconi Anemia (FA) pathway, we investigated the FA role in R loop-dependent genome instability. Using human and murine cells defective in FANCD2 or FANCA and primary bone marrow cells from FANCD2 deficient mice, we show that the FA pathway removes R loops, and that many DNA breaks accumulated in FA cells are R loop-dependent. Importantly, FANCD2 foci in untreated and MMC-treated cells are largely R loop dependent, suggesting that the FA functions at R loop-containing sites. We conclude that co-transcriptional R loops and R loop-mediated DNA damage greatly contribute to genome instability and that one major function of the FA pathway is to protect cells from R loops. PMID:26584049

  6. Npl3, a new link between RNA-binding proteins and the maintenance of genome integrity

    PubMed Central

    Santos-Pereira, José M; Herrero, Ana B; Moreno, Sergio; Aguilera, Andrés

    2014-01-01

    The mRNA is co-transcriptionally bound by a number of RNA-binding proteins (RBPs) that contribute to its processing and formation of an export-competent messenger ribonucleoprotein particle (mRNP). In the last few years, increasing evidence suggests that RBPs play a key role in preventing transcription-associated genome instability. Part of this instability is mediated by the accumulation of co-transcriptional R loops, which may impair replication fork (RF) progression due to collisions between transcription and replication machineries. In addition, some RBPs have been implicated in DNA repair and/or the DNA damage response (DDR). Recently, the Npl3 protein, one of the most abundant heterogeneous nuclear ribonucleoproteins (hnRNPs) in yeast, has been shown to prevent transcription-associated genome instability and accumulation of RF obstacles, partially associated with R-loop formation. Interestingly, Npl3 seems to have additional functions in DNA repair, and npl3∆ mutants are highly sensitive to genotoxic agents, such as the antitumor drug trabectedin. Here we discuss the role of Npl3 in particular, and RBPs in general, in the connection of transcription with replication and genome instability, and its effect on the DDR. PMID:24694687

  7. High-resolution linkage and quantitative trait locus mapping aided by genome survey sequencing: building up an integrative genomic framework for a bivalve mollusc.

    PubMed

    Jiao, Wenqian; Fu, Xiaoteng; Dou, Jinzhuang; Li, Hengde; Su, Hailin; Mao, Junxia; Yu, Qian; Zhang, Lingling; Hu, Xiaoli; Huang, Xiaoting; Wang, Yangfan; Wang, Shi; Bao, Zhenmin

    2014-02-01

    Genetic linkage maps are indispensable tools in genetic and genomic studies. Recent development of genotyping-by-sequencing (GBS) methods holds great promise for constructing high-resolution linkage maps in organisms lacking extensive genomic resources. In the present study, linkage mapping was conducted for a bivalve mollusc (Chlamys farreri) using a newly developed GBS method, 2b-restriction site-associated DNA (2b-RAD). Genome survey sequencing was performed to generate a preliminary reference genome that was utilized to facilitate linkage and quantitative trait locus (QTL) mapping in C. farreri. A high-resolution linkage map was constructed with a marker density (3806) that has, to our knowledge, never been achieved in any other molluscs. The linkage map covered nearly the whole genome (99.5%) with a resolution of 0.41 cM. QTL mapping and association analysis congruously revealed two growth-related QTLs and one potential sex-determination region. An important candidate QTL gene named PROP1, which functions in the regulation of growth hormone production in vertebrates, was identified from the growth-related QTL region detected on the linkage group LG3. We demonstrate that this linkage map can serve as an important platform for improving genome assembly and unifying multiple genomic resources. Our study, therefore, exemplifies how to build up an integrative genomic framework in a non-model organism.

  8. High-Resolution Linkage and Quantitative Trait Locus Mapping Aided by Genome Survey Sequencing: Building Up An Integrative Genomic Framework for a Bivalve Mollusc

    PubMed Central

    Jiao, Wenqian; Fu, Xiaoteng; Dou, Jinzhuang; Li, Hengde; Su, Hailin; Mao, Junxia; Yu, Qian; Zhang, Lingling; Hu, Xiaoli; Huang, Xiaoting; Wang, Yangfan; Wang, Shi; Bao, Zhenmin

    2014-01-01

    Genetic linkage maps are indispensable tools in genetic and genomic studies. Recent development of genotyping-by-sequencing (GBS) methods holds great promise for constructing high-resolution linkage maps in organisms lacking extensive genomic resources. In the present study, linkage mapping was conducted for a bivalve mollusc (Chlamys farreri) using a newly developed GBS method—2b-restriction site-associated DNA (2b-RAD). Genome survey sequencing was performed to generate a preliminary reference genome that was utilized to facilitate linkage and quantitative trait locus (QTL) mapping in C. farreri. A high-resolution linkage map was constructed with a marker density (3806) that has, to our knowledge, never been achieved in any other molluscs. The linkage map covered nearly the whole genome (99.5%) with a resolution of 0.41 cM. QTL mapping and association analysis congruously revealed two growth-related QTLs and one potential sex-determination region. An important candidate QTL gene named PROP1, which functions in the regulation of growth hormone production in vertebrates, was identified from the growth-related QTL region detected on the linkage group LG3. We demonstrate that this linkage map can serve as an important platform for improving genome assembly and unifying multiple genomic resources. Our study, therefore, exemplifies how to build up an integrative genomic framework in a non-model organism. PMID:24107803

  9. Matrix Factorization-Based Prediction of Novel Drug Indications by Integrating Genomic Space.

    PubMed

    Dai, Wen; Liu, Xi; Gao, Yibo; Chen, Lin; Song, Jianglong; Chen, Di; Gao, Kuo; Jiang, Yongshi; Yang, Yiping; Chen, Jianxin; Lu, Peng

    2015-01-01

    There has been rising interest in the discovery of novel drug indications because of the high cost of introducing new drugs. Many computational techniques have been proposed to detect potential drug-disease associations based on the creation of explicit profiles of drugs and diseases, while little research takes advantage of the immense accumulation of interaction data. In this work, we propose a matrix factorization model based on known drug-disease associations to predict novel drug indications. In addition, genomic space is also integrated into our framework. The introduction of genomic space, which includes drug-gene interactions, disease-gene interactions, and gene-gene interactions, is aimed at providing molecular biological information for prediction of drug-disease associations. The rationale lies in our belief that the association between a drug and a disease has its evidence in the interactome network of genes. Experiments show that the integration of genomic space is indeed effective. Drugs, diseases, and genes are described with feature vectors of the same dimension, which are retrieved from the interaction data. Then a matrix factorization model is set up to quantify the association between drugs and diseases. Finally, we use the matrix factorization model to predict novel indications for drugs.
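
    The core idea of scoring drug-disease pairs by factorizing a matrix of known associations can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' model: it omits the genomic space entirely, and the toy matrix, the function names (`factorize`, `score`, `squared_error`) and all hyperparameters are hypothetical.

```python
import random


def score(drug_f, dis_f, i, j):
    """Predicted association strength: dot product of the two feature vectors."""
    return sum(a * b for a, b in zip(drug_f[i], dis_f[j]))


def factorize(observed, n_drugs, n_diseases, rank=2, epochs=1000,
              lr=0.05, reg=0.01, seed=0):
    """Fit low-rank drug and disease feature vectors to observed cells by SGD."""
    rng = random.Random(seed)
    drug_f = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n_drugs)]
    dis_f = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n_diseases)]
    for _ in range(epochs):
        for (i, j), value in observed.items():
            err = value - score(drug_f, dis_f, i, j)
            for k in range(rank):
                d, s = drug_f[i][k], dis_f[j][k]
                # Gradient step on squared error with L2 regularization.
                drug_f[i][k] += lr * (err * s - reg * d)
                dis_f[j][k] += lr * (err * d - reg * s)
    return drug_f, dis_f


def squared_error(observed, drug_f, dis_f):
    """Sum of squared residuals over the observed cells only."""
    return sum((v - score(drug_f, dis_f, i, j)) ** 2
               for (i, j), v in observed.items())


# Hypothetical toy data: drugs 0-1 treat diseases 0-1, drugs 2-3 treat diseases 2-3.
truth = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
held_out = (1, 1)  # pretend this association is not yet known
observed = {(i, j): truth[i][j]
            for i in range(4) for j in range(4) if (i, j) != held_out}

drug_f, dis_f = factorize(observed, 4, 4)
candidate = score(drug_f, dis_f, *held_out)  # score for the unseen pair
```

    The held-out cell plays the role of a candidate novel indication: it is never shown to the model, and its score is read off the learned feature vectors and compared with scores for pairs known to be unassociated.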

  10. Advances in the integration of transcriptional regulatory information into genome-scale metabolic models.

    PubMed

    Vivek-Ananth, R P; Samal, Areejit

    2016-09-01

    A major goal of systems biology is to build predictive computational models of cellular metabolism. The availability of complete genome sequences and a wealth of legacy biochemical information has led to the reconstruction of genome-scale metabolic networks in the last 15 years for several organisms across the three domains of life. Due to the paucity of information on kinetic parameters associated with metabolic reactions, the constraint-based modelling approach, flux balance analysis (FBA), has proved to be a vital alternative to investigate the capabilities of reconstructed metabolic networks. In parallel, the advent of high-throughput technologies has led to the generation of massive amounts of omics data on transcriptional regulation, comprising mRNA transcript levels and genome-wide binding profiles of transcriptional regulators. A frontier area in metabolic systems biology has been the development of methods to integrate the available transcriptional regulatory information into constraint-based models of reconstructed metabolic networks in order to increase the predictive capabilities of computational models and understand the regulation of cellular metabolism. Here, we review the existing methods to integrate transcriptional regulatory information into constraint-based models of metabolic networks.
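
    For orientation, the standard FBA problem mentioned above can be written as a linear program. The notation is an assumption introduced here (the abstract gives none), and the last line shows only one simple way regulatory information could enter, namely by closing the bounds of reactions judged inactive; it is not a summary of the methods the review covers.

```latex
% Assumed notation: S = stoichiometric matrix, v = flux vector,
% c = objective weights (e.g. biomass), v^{min}, v^{max} = flux bounds,
% R_off = hypothetical set of reactions whose genes are judged not expressed.
\max_{v}\; c^{\top} v
\quad \text{subject to} \quad
S\,v = 0, \qquad
v^{\min}_j \le v_j \le v^{\max}_j \;\; \text{for all reactions } j,

v^{\min}_j = v^{\max}_j = 0 \;\; \text{for } j \in R_{\mathrm{off}}
```

    The steady-state constraint replaces the missing kinetic parameters, which is why the approach needs only stoichiometry, bounds and an objective.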

  11. The importance of safeguarding genome integrity in germination and seed longevity.

    PubMed

    Waterworth, Wanda M; Bray, Clifford M; West, Christopher E

    2015-06-01

    Seeds are important to agriculture and conservation of plant biodiversity. In agriculture, seed germination performance is an important determinant of crop yield, in particular under adverse climatic conditions. Deterioration in seed quality is associated with the accumulation of cellular damage to macromolecules including lipids, protein, and DNA. Mechanisms that mitigate the deleterious cellular damage incurred in the quiescent state and in cycles of desiccation-hydration are crucial for the maintenance of seed viability and germination vigour. In early-imbibing seeds, damage to the embryo genome must be repaired prior to initiation of cell division to minimize growth inhibition and mutation of genetic information. Here we review recent advances that have established molecular links between genome integrity and seed quality. These studies identified that maintenance of genome integrity is particularly important to the seed stage of the plant lifecycle, revealing new insight into the physiological roles of plant DNA repair and recombination mechanisms. The high conservation of DNA repair and recombination factors across plant species underlines their potential as promising targets for the improvement of crop performance and development of molecular markers for prediction of seed vigour.

  12. Telomeric repeat-containing RNA TERRA: a noncoding RNA connecting telomere biology to genome integrity.

    PubMed

    Cusanelli, Emilio; Chartrand, Pascal

    2015-01-01

    Telomeres are dynamic nucleoprotein structures that protect the ends of chromosomes from degradation and activation of the DNA damage response. For this reason, telomeres are essential to genome integrity. Chromosome ends are enriched in heterochromatic marks and proper organization of telomeric chromatin is important to telomere stability. Despite their heterochromatic state, telomeres are transcribed, giving rise to long noncoding RNAs (lncRNA) called TERRA (telomeric repeat-containing RNA). TERRA molecules play critical roles in telomere biology, including regulation of telomerase activity and heterochromatin formation at chromosome ends. Emerging evidence indicates that TERRA transcripts form DNA-RNA hybrids at chromosome ends which can promote homologous recombination among telomeres, delaying cellular senescence and sustaining genome instability. Intriguingly, TERRA RNA-telomeric DNA hybrids are involved in telomere length homeostasis of telomerase-negative cancer cells. Furthermore, TERRA transcripts play a role in the DNA damage response (DDR) triggered by dysfunctional telomeres. We discuss here recent developments on TERRA's role in telomere biology and genome integrity, and its implication in cancer.

  13. Opposite transcriptional regulation of integrated vs unintegrated HIV genomes by the NF-κB pathway

    PubMed Central

    Thierry, Sylvain; Thierry, Eloïse; Subra, Frédéric; Deprez, Eric; Leh, Hervé; Bury-Moné, Stéphanie; Delelis, Olivier

    2016-01-01

    Integration of HIV-1 linear DNA into host chromatin is required for high levels of viral expression, and constitutes a key therapeutic target. Unintegrated viral DNA (uDNA) can support only limited transcription but may contribute to viral propagation, persistence and/or treatment escape under specific situations. The molecular mechanisms involved in the differential expression of HIV uDNA vs integrated genome (iDNA) remain to be elucidated. Here, we demonstrate, for the first time, that the expression of HIV uDNA is mainly supported by 1-LTR circles, and is regulated in the opposite way, relative to iDNA, following NF-κB pathway modulation. Upon treatment activating the NF-κB pathway, NF-κB p65 and AP-1 (cFos/cJun) binding to HIV LTR iDNA correlates with increased iDNA expression, while uDNA expression decreases. In contrast, inhibition of the NF-κB pathway promotes the expression of circular uDNA, and correlates with Bcl-3 and AP-1 binding to its LTR region. Finally, this study identifies NF-κB subunits and Bcl-3 as transcription factors binding the HIV promoter differently depending on viral genome topology, and offers new insights into the potential roles of episomal genomes during HIV-1 latency and persistence. PMID:27167871

  14. An Integrated Approach to Reconstructing Genome-Scale Transcriptional Regulatory Networks

    PubMed Central

    Imam, Saheed; Noguera, Daniel R.; Donohue, Timothy J.

    2015-01-01

    Transcriptional regulatory networks (TRNs) program cells to dynamically alter their gene expression in response to changing internal or environmental conditions. In this study, we develop a novel workflow for generating large-scale TRN models that integrates comparative genomics data, global gene expression analyses, and intrinsic properties of transcription factors (TFs). An assessment of this workflow using benchmark datasets for the well-studied γ-proteobacterium Escherichia coli showed that it outperforms expression-based inference approaches, having a significantly larger area under the precision-recall curve. Further analysis indicated that this integrated workflow captures different aspects of the E. coli TRN than expression-based approaches, potentially making them highly complementary. We leveraged this new workflow and observations to build a large-scale TRN model for the α-Proteobacterium Rhodobacter sphaeroides that comprises 120 gene clusters, 1211 genes (including 93 TFs), 1858 predicted protein-DNA interactions and 76 DNA binding motifs. We found that ~67% of the predicted gene clusters in this TRN are enriched for functions ranging from photosynthesis or central carbon metabolism to environmental stress responses. We also found that members of many of the predicted gene clusters were consistent with prior knowledge in R. sphaeroides and/or other bacteria. Experimental validation of predictions from this R. sphaeroides TRN model showed that high precision and recall was also obtained for TFs involved in photosynthesis (PpsR), carbon metabolism (RSP_0489) and iron homeostasis (RSP_3341). In addition, this integrative approach enabled generation of TRNs with increased information content relative to R. sphaeroides TRN models built via other approaches. We also show how this approach can be used to simultaneously produce TRN models for each related organism used in the comparative genomics analysis. Our results highlight the advantages of integrating
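
    The comparison by area under the precision-recall curve can be illustrated with a small sketch. It uses average precision, a common step-wise estimate of that area, and the scores and labels are hypothetical; the abstract does not say how the authors computed the area. The function name `average_precision` is introduced here for illustration, and tied scores are not handled specially.

```python
def average_precision(scores, labels):
    """Step-wise area under the precision-recall curve for ranked predictions.

    scores: confidence for each predicted TF-target interaction.
    labels: 1 if the interaction is in the benchmark, else 0.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    area = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            area += hits / rank  # precision at the rank of this true positive
    return area / total_pos


# Hypothetical predictions from two inference methods on the same benchmark.
benchmark = [1, 0, 1, 0]
method_a = average_precision([0.9, 0.2, 0.8, 0.1], benchmark)  # both positives ranked first
method_b = average_precision([0.9, 0.8, 0.7, 0.6], benchmark)  # one negative ranked second
```

    A method that ranks benchmark interactions ahead of non-interactions gets a larger area, which is the sense in which one workflow can be said to outperform another on this metric.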

  15. An integrated approach to reconstructing genome-scale transcriptional regulatory networks

    DOE PAGESBeta

    Imam, Saheed; Noguera, Daniel R.; Donohue, Timothy J.; Leslie, Christina

    2015-02-27

    Transcriptional regulatory networks (TRNs) program cells to dynamically alter their gene expression in response to changing internal or environmental conditions. In this study, we develop a novel workflow for generating large-scale TRN models that integrates comparative genomics data, global gene expression analyses, and intrinsic properties of transcription factors (TFs). An assessment of this workflow using benchmark datasets for the well-studied γ-proteobacterium Escherichia coli showed that it outperforms expression-based inference approaches, having a significantly larger area under the precision-recall curve. Further analysis indicated that this integrated workflow captures different aspects of the E. coli TRN than expression-based approaches, potentially making them highly complementary. We leveraged this new workflow and observations to build a large-scale TRN model for the α-Proteobacterium Rhodobacter sphaeroides that comprises 120 gene clusters, 1211 genes (including 93 TFs), 1858 predicted protein-DNA interactions and 76 DNA binding motifs. We found that ~67% of the predicted gene clusters in this TRN are enriched for functions ranging from photosynthesis or central carbon metabolism to environmental stress responses. We also found that members of many of the predicted gene clusters were consistent with prior knowledge in R. sphaeroides and/or other bacteria. Experimental validation of predictions from this R. sphaeroides TRN model showed that high precision and recall was also obtained for TFs involved in photosynthesis (PpsR), carbon metabolism (RSP_0489) and iron homeostasis (RSP_3341). In addition, this integrative approach enabled generation of TRNs with increased information content relative to R. sphaeroides TRN models built via other approaches. We also show how this approach can be used to simultaneously produce TRN models for each related organism used in the comparative genomics analysis. Our results highlight the advantages of

  16. Correlation between EBV co-infection and HPV16 genome integrity in Tunisian cervical cancer patients

    PubMed Central

    Kahla, Saloua; Oueslati, Sarra; Achour, Mongia; Kochbati, Lotfi; Chanoufi, Mohamed Badis; Maalej, Mongi; Oueslati, Ridha

    2012-01-01

    Infection with high-risk human papillomavirus (HR-HPV) is necessary but not sufficient to cause cervical carcinoma. This study explored whether multiple HR-HPV infection or coinfection with Epstein-Barr virus (EBV) influences the integration status of the HPV16 genome. The presence and typing of HPV in a series of 125 cervical specimens were assessed by polymerase chain reaction (PCR) using specific primers for the HPV L1 region. EBV infection was detected through PCR amplification of the viral EBNA1 gene. Disruption of the HPV E2 gene was assessed by amplification of the entire E2 gene with a single set of primers, while E2 transcripts were evaluated by reverse transcription PCR (RT-PCR). The overall prevalence of HPV DNA was 81.8% in cervical cancers versus 26.9% in benign lesions. In HPV-positive cases, HPV16 and HPV18 were the most prevalent types, followed by HPV types 33 and 31. EBV EBNA1 was significantly more frequent in cervical carcinomas than in benign lesions (29.5% vs 9.6%; P = 0.01). No viral infection was detected in healthy control women. The uninterrupted E2 gene was correlated with the presence of E2 transcripts originating from the HPV episomal forms. It was observed that integration was more common in HPV18 and EBV coinfection. The presence of EBV caused a five-fold [OR = 5; CI = 1.15-21.8; P = 0.04] increase in the risk of HPV16 genome integration in the host genome. This study indicates that EBV infection acts as a cofactor for induction of cervical cancer by favoring HPV DNA integration. PMID:24031886
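
    To show how an odds ratio and its confidence interval of the kind reported above are typically obtained from a 2x2 table, here is a minimal sketch using the Woolf (log-scale) approximation. The counts are hypothetical and chosen only so that the odds ratio equals 5; they are not the study's data, and the abstract does not state which interval method the authors used.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this approximation")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts: EBV-positive (integrated, episomal) = (10, 5),
# EBV-negative (integrated, episomal) = (4, 10).
ratio, lower, upper = odds_ratio_ci(10, 5, 4, 10)
```

    With small cell counts the interval is wide, which is consistent with the broad interval quoted in the abstract even though the point estimate is large.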

  17. An integrated approach to reconstructing genome-scale transcriptional regulatory networks

    SciTech Connect

    Imam, Saheed; Noguera, Daniel R.; Donohue, Timothy J.; Leslie, Christina

    2015-02-27

    Transcriptional regulatory networks (TRNs) program cells to dynamically alter their gene expression in response to changing internal or environmental conditions. In this study, we develop a novel workflow for generating large-scale TRN models that integrates comparative genomics data, global gene expression analyses, and intrinsic properties of transcription factors (TFs). An assessment of this workflow using benchmark datasets for the well-studied γ-proteobacterium Escherichia coli showed that it outperforms expression-based inference approaches, having a significantly larger area under the precision-recall curve. Further analysis indicated that this integrated workflow captures different aspects of the E. coli TRN than expression-based approaches, potentially making them highly complementary. We leveraged this new workflow and observations to build a large-scale TRN model for the α-Proteobacterium Rhodobacter sphaeroides that comprises 120 gene clusters, 1211 genes (including 93 TFs), 1858 predicted protein-DNA interactions and 76 DNA binding motifs. We found that ~67% of the predicted gene clusters in this TRN are enriched for functions ranging from photosynthesis or central carbon metabolism to environmental stress responses. We also found that members of many of the predicted gene clusters were consistent with prior knowledge in R. sphaeroides and/or other bacteria. Experimental validation of predictions from this R. sphaeroides TRN model showed that high precision and recall was also obtained for TFs involved in photosynthesis (PpsR), carbon metabolism (RSP_0489) and iron homeostasis (RSP_3341). In addition, this integrative approach enabled generation of TRNs with increased information content relative to R. sphaeroides TRN models built via other approaches. We also show how this approach can be used to simultaneously produce TRN models for each related organism used in the comparative genomics analysis. Our results highlight the advantages of integrating

  18. An integrated approach to reconstructing genome-scale transcriptional regulatory networks.

    PubMed

    Imam, Saheed; Noguera, Daniel R; Donohue, Timothy J

    2015-02-01

    Transcriptional regulatory networks (TRNs) program cells to dynamically alter their gene expression in response to changing internal or environmental conditions. In this study, we develop a novel workflow for generating large-scale TRN models that integrates comparative genomics data, global gene expression analyses, and intrinsic properties of transcription factors (TFs). An assessment of this workflow using benchmark datasets for the well-studied γ-proteobacterium Escherichia coli showed that it outperforms expression-based inference approaches, having a significantly larger area under the precision-recall curve. Further analysis indicated that this integrated workflow captures different aspects of the E. coli TRN than expression-based approaches, potentially making them highly complementary. We leveraged this new workflow and observations to build a large-scale TRN model for the α-Proteobacterium Rhodobacter sphaeroides that comprises 120 gene clusters, 1211 genes (including 93 TFs), 1858 predicted protein-DNA interactions and 76 DNA binding motifs. We found that ~67% of the predicted gene clusters in this TRN are enriched for functions ranging from photosynthesis or central carbon metabolism to environmental stress responses. We also found that members of many of the predicted gene clusters were consistent with prior knowledge in R. sphaeroides and/or other bacteria. Experimental validation of predictions from this R. sphaeroides TRN model showed that high precision and recall was also obtained for TFs involved in photosynthesis (PpsR), carbon metabolism (RSP_0489) and iron homeostasis (RSP_3341). In addition, this integrative approach enabled generation of TRNs with increased information content relative to R. sphaeroides TRN models built via other approaches. We also show how this approach can be used to simultaneously produce TRN models for each related organism used in the comparative genomics analysis. Our results highlight the advantages of integrating

  19. A case of bilateral human herpes virus 6 panuveitis with genomic viral DNA integration

    PubMed Central

    2014-01-01

    Background We report a rare case of bilateral panuveitis from human herpes virus 6 (HHV-6) with genomic viral DNA integration in an immunocompromised man. Findings A 59-year-old man with history of multiple myeloma presented with altered mental status, bilateral eye redness, and blurry vision. Examination revealed bilateral diffuse keratic precipitates, 4+ anterior chamber cell, hypopyon, vitritis, and intraretinal hemorrhages. Intraocular fluid testing by polymerase chain reaction (PCR) was positive for HHV-6. The patient was successfully treated with intravitreal foscarnet and intravenous ganciclovir and foscarnet. Despite clinical improvement, his serum HHV-6 levels remained high, and it was concluded that he had HHV-6 chromosomal integration. Conclusions HHV-6 should be considered in the differential for infectious uveitis in immunocompromised hosts who may otherwise have a negative work-up. HHV-6 DNA integration may lead to difficulties in disease diagnosis and determining disease resolution. PMID:24995045

  20. An integrative approach for efficient analysis of whole genome bisulfite sequencing data

    PubMed Central

    2015-01-01

    Background Whole genome bisulfite sequencing (WGBS) is a high-throughput technique for profiling genome-wide DNA methylation at single nucleotide resolution. However, the applications of WGBS are limited by low accuracy resulting from bisulfite-induced damage on DNA fragments. Although many computer programs have been developed for accurate detection, most of the programs have barely succeeded in improving either quantity or quality of the methylation results. To improve both, we attempted to develop a novel integration of the most widely used bisulfite-read mappers: Bismark, BSMAP, and BS-seeker2. Results A comprehensive analysis of the three mappers revealed that the mapping results of the mappers were mutually complementary under diverse read conditions. Therefore, we sought to integrate the characteristics of the mappers by scoring them to gain robustness against artifacts. As a result, the integration significantly increased detection accuracy compared with the individual mappers. In addition, the amount of detected cytosine was higher than that by Bismark. Furthermore, the integration successfully reduced the fluctuation of detection accuracy induced by read conditions. We applied the integration to real WGBS samples and succeeded in classifying the samples according to their tissues of origin by both CpG and CpH methylation patterns. Conclusions In this study, we improved both quality and quantity of methylation results from WGBS data by integrating the mapping results of three bisulfite-read mappers. Also, we succeeded in combining and comparing WGBS samples by reducing the effects of read heterogeneity on methylation detection. This study contributes to DNA methylation research by improving the efficiency of methylation detection from WGBS data and facilitating the comprehensive analysis of public WGBS data. PMID:26680746
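
    The idea of integrating several mappers by scoring can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not describe the actual scoring scheme, so the equal weights, the `min_score` threshold, and the function name `integrate_read` below are all hypothetical. The sketch simply keeps a read's mapped locus when the summed weight of the mappers agreeing on it reaches the threshold.

```python
from collections import defaultdict

# Hypothetical weights: the abstract does not state how each mapper is scored.
MAPPER_WEIGHTS = {"Bismark": 1.0, "BSMAP": 1.0, "BS-seeker2": 1.0}


def integrate_read(calls, min_score=2.0):
    """Pick a consensus locus for one read.

    calls maps a mapper name to a (chrom, pos, strand) tuple, or to None
    when that mapper left the read unmapped. The locus with the highest
    summed weight is returned if its score reaches min_score; otherwise
    the read is treated as unresolved and None is returned.
    """
    scores = defaultdict(float)
    for mapper, locus in calls.items():
        if locus is not None:
            scores[locus] += MAPPER_WEIGHTS.get(mapper, 0.0)
    if not scores:
        return None
    best_locus, best_score = max(scores.items(), key=lambda kv: kv[1])
    return best_locus if best_score >= min_score else None
```

    With equal weights and a threshold of 2.0 this reduces to a two-out-of-three majority vote; a real integration would presumably also weigh mapping quality and read conditions.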

  1. BiologicalNetworks 2.0 - an integrative view of genome biology data

    PubMed Central

    2010-01-01

    Background A significant problem in the study of mechanisms of an organism's development is the elucidation of interrelated factors that affect the different levels of the organism, such as genes, biological molecules, cells, and cell systems. Numerous sources of heterogeneous data which exist for these subsystems are still not sufficiently integrated to give researchers a straightforward opportunity to analyze them together in the same frame of study. Systematic application of data integration methods is also hampered by a multitude of such factors as the orthogonal nature of the integrated data and naming problems. Results Here we report on a new version of BiologicalNetworks, a research environment for the integral visualization and analysis of heterogeneous biological data. BiologicalNetworks can be queried for properties of thousands of different types of biological entities (genes/proteins, promoters, COGs, pathways, binding sites, and others) and their relations (interactions, co-expression, co-citations, and others). The system includes the build-pathways infrastructure for molecular interactions/relations and module discovery in high-throughput experiments. Also implemented in BiologicalNetworks are the Integrated Genome Viewer and Comparative Genomics Browser applications, which allow for the search and analysis of gene regulatory regions and their conservation in multiple species in conjunction with molecular pathways/networks, experimental data and functional annotations. Conclusions The new release of BiologicalNetworks together with its back-end database introduces extensive functionality for a more efficient integrated multi-level analysis of microarray, sequence, regulatory, and other data. BiologicalNetworks is freely available at http://www.biologicalnetworks.org. PMID:21190573

  2. Conditional Epistatic Interaction Maps Reveal Global Functional Rewiring of Genome Integrity Pathways in Escherichia coli.

    PubMed

    Kumar, Ashwani; Beloglazova, Natalia; Bundalovic-Torma, Cedoljub; Phanse, Sadhna; Deineko, Viktor; Gagarinova, Alla; Musso, Gabriel; Vlasblom, James; Lemak, Sofia; Hooshyar, Mohsen; Minic, Zoran; Wagih, Omar; Mosca, Roberto; Aloy, Patrick; Golshani, Ashkan; Parkinson, John; Emili, Andrew; Yakunin, Alexander F; Babu, Mohan

    2016-01-26

    As antibiotic resistance is increasingly becoming a public health concern, an improved understanding of the bacterial DNA damage response (DDR), which is commonly targeted by antibiotics, could be of tremendous therapeutic value. Although the genetic components of the bacterial DDR have been studied extensively in isolation, how the underlying biological pathways interact functionally remains unclear. Here, we address this by performing systematic, unbiased, quantitative synthetic genetic interaction (GI) screens and uncover widespread changes in the GI network of the entire genomic integrity apparatus of Escherichia coli under standard and DNA-damaging growth conditions. The GI patterns of untreated cultures implicated two previously uncharacterized proteins (YhbQ and YqgF) as nucleases, whereas reorganization of the GI network after DNA damage revealed DDR roles for both annotated and uncharacterized genes. Analyses of pan-bacterial conservation patterns suggest that DDR mechanisms and functional relationships are near universal, highlighting a modular and highly adaptive genomic stress response.

  3. An integrated map of structural variation in 2,504 human genomes.

    PubMed

    Sudmant, Peter H; Rausch, Tobias; Gardner, Eugene J; Handsaker, Robert E; Abyzov, Alexej; Huddleston, John; Zhang, Yan; Ye, Kai; Jun, Goo; Hsi-Yang Fritz, Markus; Konkel, Miriam K; Malhotra, Ankit; Stütz, Adrian M; Shi, Xinghua; Paolo Casale, Francesco; Chen, Jieming; Hormozdiari, Fereydoun; Dayama, Gargi; Chen, Ken; Malig, Maika; Chaisson, Mark J P; Walter, Klaudia; Meiers, Sascha; Kashin, Seva; Garrison, Erik; Auton, Adam; Lam, Hugo Y K; Jasmine Mu, Xinmeng; Alkan, Can; Antaki, Danny; Bae, Taejeong; Cerveira, Eliza; Chines, Peter; Chong, Zechen; Clarke, Laura; Dal, Elif; Ding, Li; Emery, Sarah; Fan, Xian; Gujral, Madhusudan; Kahveci, Fatma; Kidd, Jeffrey M; Kong, Yu; Lameijer, Eric-Wubbo; McCarthy, Shane; Flicek, Paul; Gibbs, Richard A; Marth, Gabor; Mason, Christopher E; Menelaou, Androniki; Muzny, Donna M; Nelson, Bradley J; Noor, Amina; Parrish, Nicholas F; Pendleton, Matthew; Quitadamo, Andrew; Raeder, Benjamin; Schadt, Eric E; Romanovitch, Mallory; Schlattl, Andreas; Sebra, Robert; Shabalin, Andrey A; Untergasser, Andreas; Walker, Jerilyn A; Wang, Min; Yu, Fuli; Zhang, Chengsheng; Zhang, Jing; Zheng-Bradley, Xiangqun; Zhou, Wanding; Zichner, Thomas; Sebat, Jonathan; Batzer, Mark A; McCarroll, Steven A; Mills, Ryan E; Gerstein, Mark B; Bashir, Ali; Stegle, Oliver; Devine, Scott E; Lee, Charles; Eichler, Evan E; Korbel, Jan O

    2015-10-01

    Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association. PMID:26432246

  4. Advances in environmental genomics: towards an integrated view of micro-organisms and ecosystems.

    PubMed

    Bertin, Philippe N; Médigue, Claudine; Normand, Philippe

    2008-02-01

    Microbial genome sequencing has, for the first time, made accessible all the components needed for both the elaboration and the functioning of a cell. Associated with other global methods such as protein and mRNA profiling, genomics has considerably extended our knowledge of physiological processes and their diversity not only in human, animal and plant pathogens but also in environmental isolates. At a higher level of complexity, the so-called meta approaches have recently shown great promise in investigating microbial communities, including uncultured micro-organisms. Combined with classical methods of physico-chemistry and microbiology, these endeavours should provide us with an integrated view of how micro-organisms adapt to particular ecological niches and participate in the dynamics of ecosystems.

  5. The GDB Human Genome Data Base: a source of integrated genetic mapping and disease data.

    PubMed Central

    Brandt, K A

    1993-01-01

    The GDB Human Genome Data Base refers collectively to GDB and OMIM, Online Mendelian Inheritance in Man. GDB and OMIM are linked databases that provide an international repository for information generated by the Human Genome Initiative. GDB contains human gene mapping data, while OMIM offers the text of Dr. Victor A. McKusick's catalog of genetic disease and phenotype descriptions. These databases, updated and edited continuously, integrate bibliographic and full-text information with several types of mapping data. They are accessible through a flexible interface and are available through SprintNet and the Internet to the scientific community without cost. This paper provides an overview of the context, development, structure, content, and use of these databases. PMID:8374584

  6. An integrated map of structural variation in 2,504 human genomes.

    PubMed

    Sudmant, Peter H; Rausch, Tobias; Gardner, Eugene J; Handsaker, Robert E; Abyzov, Alexej; Huddleston, John; Zhang, Yan; Ye, Kai; Jun, Goo; Hsi-Yang Fritz, Markus; Konkel, Miriam K; Malhotra, Ankit; Stütz, Adrian M; Shi, Xinghua; Paolo Casale, Francesco; Chen, Jieming; Hormozdiari, Fereydoun; Dayama, Gargi; Chen, Ken; Malig, Maika; Chaisson, Mark J P; Walter, Klaudia; Meiers, Sascha; Kashin, Seva; Garrison, Erik; Auton, Adam; Lam, Hugo Y K; Jasmine Mu, Xinmeng; Alkan, Can; Antaki, Danny; Bae, Taejeong; Cerveira, Eliza; Chines, Peter; Chong, Zechen; Clarke, Laura; Dal, Elif; Ding, Li; Emery, Sarah; Fan, Xian; Gujral, Madhusudan; Kahveci, Fatma; Kidd, Jeffrey M; Kong, Yu; Lameijer, Eric-Wubbo; McCarthy, Shane; Flicek, Paul; Gibbs, Richard A; Marth, Gabor; Mason, Christopher E; Menelaou, Androniki; Muzny, Donna M; Nelson, Bradley J; Noor, Amina; Parrish, Nicholas F; Pendleton, Matthew; Quitadamo, Andrew; Raeder, Benjamin; Schadt, Eric E; Romanovitch, Mallory; Schlattl, Andreas; Sebra, Robert; Shabalin, Andrey A; Untergasser, Andreas; Walker, Jerilyn A; Wang, Min; Yu, Fuli; Zhang, Chengsheng; Zhang, Jing; Zheng-Bradley, Xiangqun; Zhou, Wanding; Zichner, Thomas; Sebat, Jonathan; Batzer, Mark A; McCarroll, Steven A; Mills, Ryan E; Gerstein, Mark B; Bashir, Ali; Stegle, Oliver; Devine, Scott E; Lee, Charles; Eichler, Evan E; Korbel, Jan O

    2015-10-01

    Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.

  7. An integrated map of structural variation in 2,504 human genomes

    PubMed Central

    Jun, Goo; Fritz, Markus Hsi-Yang; Konkel, Miriam K.; Malhotra, Ankit; Stütz, Adrian M.; Shi, Xinghua; Casale, Francesco Paolo; Chen, Jieming; Hormozdiari, Fereydoun; Dayama, Gargi; Chen, Ken; Malig, Maika; Chaisson, Mark J.P.; Walter, Klaudia; Meiers, Sascha; Kashin, Seva; Garrison, Erik; Auton, Adam; Lam, Hugo Y. K.; Mu, Xinmeng Jasmine; Alkan, Can; Antaki, Danny; Bae, Taejeong; Cerveira, Eliza; Chines, Peter; Chong, Zechen; Clarke, Laura; Dal, Elif; Ding, Li; Emery, Sarah; Fan, Xian; Gujral, Madhusudan; Kahveci, Fatma; Kidd, Jeffrey M.; Kong, Yu; Lameijer, Eric-Wubbo; McCarthy, Shane; Flicek, Paul; Gibbs, Richard A.; Marth, Gabor; Mason, Christopher E.; Menelaou, Androniki; Muzny, Donna M.; Nelson, Bradley J.; Noor, Amina; Parrish, Nicholas F.; Pendleton, Matthew; Quitadamo, Andrew; Raeder, Benjamin; Schadt, Eric E.; Romanovitch, Mallory; Schlattl, Andreas; Sebra, Robert; Shabalin, Andrey A.; Untergasser, Andreas; Walker, Jerilyn A.; Wang, Min; Yu, Fuli; Zhang, Chengsheng; Zhang, Jing; Zheng-Bradley, Xiangqun; Zhou, Wanding; Zichner, Thomas; Sebat, Jonathan; Batzer, Mark A.; McCarroll, Steven A.; Mills, Ryan E.; Gerstein, Mark B.; Bashir, Ali; Stegle, Oliver; Devine, Scott E.; Lee, Charles; Eichler, Evan E.; Korbel, Jan O.

    2015-01-01

    Summary Structural variants (SVs) are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight SV classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype-blocks in 26 human populations. Analyzing this set, we identify numerous gene-intersecting SVs exhibiting population stratification and describe naturally occurring homozygous gene knockouts suggesting the dispensability of a variety of human genes. We demonstrate that SVs are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of SV complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex SVs with multiple breakpoints likely formed through individual mutational events. Our catalog will enhance future studies into SV demography, functional impact and disease association. PMID:26432246

  8. Pancreatic cancer modeling using retrograde viral vector delivery and in vivo CRISPR/Cas9-mediated somatic genome editing

    PubMed Central

    Chiou, Shin-Heng; Winters, Ian P.; Wang, Jing; Naranjo, Santiago; Dudgeon, Crissy; Tamburini, Fiona B.; Brady, Jennifer J.; Yang, Dian; Grüner, Barbara M.; Chuang, Chen-Hua; Caswell, Deborah R.; Zeng, Hong; Chu, Pauline; Kim, Grace E.; Carpizo, Darren R.; Kim, Seung K.; Winslow, Monte M.

    2015-01-01

    Pancreatic ductal adenocarcinoma (PDAC) is a genomically diverse, prevalent, and almost invariably fatal malignancy. Although conventional genetically engineered mouse models of human PDAC have been instrumental in understanding pancreatic cancer development, these models are much too labor-intensive, expensive, and slow to perform the extensive molecular analyses needed to adequately understand this disease. Here we demonstrate that retrograde pancreatic ductal injection of either adenoviral-Cre or lentiviral-Cre vectors allows titratable initiation of pancreatic neoplasias that progress into invasive and metastatic PDAC. To enable in vivo CRISPR/Cas9-mediated gene inactivation in the pancreas, we generated a Cre-regulated Cas9 allele and lentiviral vectors that express Cre and a single-guide RNA. CRISPR-mediated targeting of Lkb1 in combination with oncogenic Kras expression led to selection for inactivating genomic alterations, absence of Lkb1 protein, and rapid tumor growth that phenocopied Cre-mediated genetic deletion of Lkb1. This method will transform our ability to rapidly interrogate gene function during the development of this recalcitrant cancer. PMID:26178787

  9. Pancreatic cancer modeling using retrograde viral vector delivery and in vivo CRISPR/Cas9-mediated somatic genome editing.

    PubMed

    Chiou, Shin-Heng; Winters, Ian P; Wang, Jing; Naranjo, Santiago; Dudgeon, Crissy; Tamburini, Fiona B; Brady, Jennifer J; Yang, Dian; Grüner, Barbara M; Chuang, Chen-Hua; Caswell, Deborah R; Zeng, Hong; Chu, Pauline; Kim, Grace E; Carpizo, Darren R; Kim, Seung K; Winslow, Monte M

    2015-07-15

    Pancreatic ductal adenocarcinoma (PDAC) is a genomically diverse, prevalent, and almost invariably fatal malignancy. Although conventional genetically engineered mouse models of human PDAC have been instrumental in understanding pancreatic cancer development, these models are much too labor-intensive, expensive, and slow to perform the extensive molecular analyses needed to adequately understand this disease. Here we demonstrate that retrograde pancreatic ductal injection of either adenoviral-Cre or lentiviral-Cre vectors allows titratable initiation of pancreatic neoplasias that progress into invasive and metastatic PDAC. To enable in vivo CRISPR/Cas9-mediated gene inactivation in the pancreas, we generated a Cre-regulated Cas9 allele and lentiviral vectors that express Cre and a single-guide RNA. CRISPR-mediated targeting of Lkb1 in combination with oncogenic Kras expression led to selection for inactivating genomic alterations, absence of Lkb1 protein, and rapid tumor growth that phenocopied Cre-mediated genetic deletion of Lkb1. This method will transform our ability to rapidly interrogate gene function during the development of this recalcitrant cancer. PMID:26178787

  10. USF-1 Is Critical for Maintaining Genome Integrity in Response to UV-Induced DNA Photolesions

    PubMed Central

    Mouchet, Nicolas; Vaulont, Sophie; Prince, Sharon; Galibert, Marie-Dominique

    2012-01-01

    An important function of all organisms is to ensure that their genetic material remains intact and unaltered through generations. This is an extremely challenging task since the cell's DNA is constantly under assault by endogenous and environmental agents. To protect against this, cells have evolved effective mechanisms to recognize DNA damage, signal its presence, and mediate its repair. While these responses are expected to be highly regulated because they are critical to avoid human diseases, very little is known about the regulation of the expression of genes involved in mediating their effects. The Nucleotide Excision Repair (NER) is the major DNA–repair process involved in the recognition and removal of UV-mediated DNA damage. Here we use a combination of in vitro and in vivo assays with an intermittent UV-irradiation protocol to investigate the regulation of key players in the DNA–damage recognition step of NER sub-pathways (TCR and GGR). We show an up-regulation in gene expression of CSA and HR23A, which are involved in TCR and GGR, respectively. Importantly, we show that this occurs through a p53 independent mechanism and that it is coordinated by the stress-responsive transcription factor USF-1. Furthermore, using a mouse model we show that the loss of USF-1 compromises DNA repair, which suggests that USF-1 plays an important role in maintaining genomic stability. PMID:22291606

  11. Ih-mediated depolarization enhances the temporal precision of neuronal integration

    PubMed Central

    Pavlov, Ivan; Scimemi, Annalisa; Savtchenko, Leonid; Kullmann, Dimitri M.; Walker, Matthew C.

    2011-01-01

    Feed-forward inhibition mediated by ionotropic GABAA receptors contributes to the temporal precision of neuronal signal integration. These receptors exert their inhibitory effect by shunting excitatory currents and by hyperpolarizing neurons. The relative roles of these mechanisms in neuronal computations are, however, incompletely understood. In this study, we show that by depolarizing the resting membrane potential relative to the reversal potential for GABAA receptors, the hyperpolarization-activated mixed cation current (Ih) maintains a voltage gradient for fast synaptic inhibition in hippocampal pyramidal cells. Pharmacological or genetic ablation of Ih broadens the depolarizing phase of afferent synaptic waveforms by hyperpolarizing the resting membrane potential. This increases the integration time window for action potential generation. These results indicate that the hyperpolarizing component of GABAA receptor-mediated inhibition has an important role in maintaining the temporal fidelity of coincidence detection and suggest a previously unrecognized mechanism by which Ih modulates information processing in the hippocampus. PMID:21326231

  12. Effects of Integrating and Non-Integrating Reprogramming Methods on Copy Number Variation and Genomic Stability of Human Induced Pluripotent Stem Cells.

    PubMed

    Kang, Xiangjin; Yu, Qian; Huang, Yuling; Song, Bing; Chen, Yaoyong; Gao, Xingcheng; He, Wenyin; Sun, Xiaofang; Fan, Yong

    2015-01-01

    Human-induced pluripotent stem cells (iPSCs) are derived from differentiated somatic cells using defined factors and provide a renewable source of autologous cells for cell therapy. Many reprogramming methods have been employed to generate human iPSCs, including the use of integrating vectors and non-integrating vectors. Maintenance of the genomic integrity of iPSCs is highly desirable if the cells are to be used in clinical applications. Here, using the Affymetrix Cytoscan HD array, we investigated the genomic aberration profiles of 19 human cell lines: 5 embryonic stem cell (ESC) lines, 6 iPSC lines derived using integrating vectors ("integrating iPSC lines"), 6 iPSC lines derived using non-integrating vectors ("non-integrating iPSC lines"), and the 2 parental cell lines from which the iPSCs were derived. The genome-wide copy number variation (CNV), loss of heterozygosity (LOH) and mosaicism patterns of integrating and non-integrating iPSC lines were investigated. The maximum sizes of CNVs in the genomes of the integrating iPSC lines were 20 times larger than those of the non-integrating iPSC lines. Moreover, the total number of CNVs was much higher in integrating iPSC lines than in other cell lines. The average numbers of novel CNVs with a low degree of overlap with the DGV and of likely pathogenic CNVs with a high degree of overlap with the ISCA (International Standards for Cytogenomic Arrays) database were highest in integrating iPSC lines. Different single nucleotide polymorphisms (SNP) calls revealed that, using the parental cell genotype as a reference, integrating iPSC lines displayed more single nucleotide variations and mosaicism than did non-integrating iPSC lines. This study describes the genome stability of human iPSCs generated using either a DNA-integrating or non-integrating reprogramming method, of the corresponding somatic cells, and of hESCs. Our results highlight the importance of using a high-resolution method to monitor genomic aberrations

  13. Effects of Integrating and Non-Integrating Reprogramming Methods on Copy Number Variation and Genomic Stability of Human Induced Pluripotent Stem Cells.

    PubMed

    Kang, Xiangjin; Yu, Qian; Huang, Yuling; Song, Bing; Chen, Yaoyong; Gao, Xingcheng; He, Wenyin; Sun, Xiaofang; Fan, Yong

    2015-01-01

    Human-induced pluripotent stem cells (iPSCs) are derived from differentiated somatic cells using defined factors and provide a renewable source of autologous cells for cell therapy. Many reprogramming methods have been employed to generate human iPSCs, including the use of integrating vectors and non-integrating vectors. Maintenance of the genomic integrity of iPSCs is highly desirable if the cells are to be used in clinical applications. Here, using the Affymetrix Cytoscan HD array, we investigated the genomic aberration profiles of 19 human cell lines: 5 embryonic stem cell (ESC) lines, 6 iPSC lines derived using integrating vectors ("integrating iPSC lines"), 6 iPSC lines derived using non-integrating vectors ("non-integrating iPSC lines"), and the 2 parental cell lines from which the iPSCs were derived. The genome-wide copy number variation (CNV), loss of heterozygosity (LOH) and mosaicism patterns of integrating and non-integrating iPSC lines were investigated. The maximum sizes of CNVs in the genomes of the integrating iPSC lines were 20 times larger than those of the non-integrating iPSC lines. Moreover, the total number of CNVs was much higher in integrating iPSC lines than in other cell lines. The average numbers of novel CNVs with a low degree of overlap with the DGV and of likely pathogenic CNVs with a high degree of overlap with the ISCA (International Standards for Cytogenomic Arrays) database were highest in integrating iPSC lines. Different single nucleotide polymorphisms (SNP) calls revealed that, using the parental cell genotype as a reference, integrating iPSC lines displayed more single nucleotide variations and mosaicism than did non-integrating iPSC lines. This study describes the genome stability of human iPSCs generated using either a DNA-integrating or non-integrating reprogramming method, of the corresponding somatic cells, and of hESCs. Our results highlight the importance of using a high-resolution method to monitor genomic

  14. Integration of Multiple Genomic and Phenotype Data to Infer Novel miRNA-Disease Associations

    PubMed Central

    Zhou, Meng; Cheng, Liang; Yang, Haixiu; Wang, Jing; Sun, Jie; Wang, Zhenzhen

    2016-01-01

    MicroRNAs (miRNAs) play an important role in the development and progression of human diseases. The identification of disease-associated miRNAs will be helpful for understanding the molecular mechanisms of diseases at the post-transcriptional level. Based on different types of genomic data sources, computational methods for miRNA-disease association prediction have been proposed. However, an individual source of genomic data tends to be incomplete and noisy; therefore, the integration of various types of genomic data for inferring reliable miRNA-disease associations is urgently needed. In this study, we present a computational framework, CHNmiRD, for identifying miRNA-disease associations by integrating multiple genomic and phenotype data, including protein-protein interaction data, gene ontology data, experimentally verified miRNA-target relationships, disease phenotype information and known miRNA-disease connections. The performance of CHNmiRD was evaluated on experimentally verified miRNA-disease associations, achieving an area under the ROC curve (AUC) of 0.834 for 5-fold cross-validation. In particular, CHNmiRD displayed excellent performance for diseases without any known related miRNAs. The results of case studies for three human diseases (glioblastoma, myocardial infarction and type 1 diabetes) showed that all of the top 10 ranked miRNAs having no known associations with these three diseases in existing miRNA-disease databases were directly or indirectly confirmed by our latest literature mining. All these results demonstrated the reliability and efficiency of CHNmiRD, and it is anticipated that CHNmiRD will serve as a powerful bioinformatics method for mining novel disease-related miRNAs and providing a new perspective into molecular mechanisms underlying human diseases at the post-transcriptional level. CHNmiRD is freely available at http://www.bio-bigdata.com/CHNmiRD. PMID:26849207
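
    For readers unfamiliar with the evaluation metric quoted above, the following is a minimal sketch of how an area under the ROC curve and a 5-fold split can be computed. It does not reproduce CHNmiRD or its data; the function names `roc_auc` and `k_fold_indices` are hypothetical, and the contiguous (unshuffled) fold assignment is a simplifying assumption.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    scores are predicted association scores, labels are 1 (known
    association) or 0 (no known association). Ties count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k=5):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds
```

    In a cross-validation run, each fold in turn would be held out, the model fitted on the remaining folds, and `roc_auc` computed on the held-out scores; an AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking.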

  17. Contrasting growth phenology of native and invasive forest shrubs mediated by genome size.

    PubMed

    Fridley, Jason D; Craddock, Alaä

    2015-08-01

    Examination of the significance of genome size to plant invasions has been largely restricted to its association with growth rate. We investigated the novel hypothesis that genome size is related to forest invasions through its association with growth phenology, as a result of the ability of large-genome species to grow more effectively through cell expansion at cool temperatures. We monitored the spring leaf phenology of 54 species of eastern USA deciduous forests, including native and invasive shrubs of six common genera. We used new measurements of genome size to evaluate its association with spring budbreak, cell size, summer leaf production rate, and photosynthetic capacity. In a phylogenetic hierarchical model that differentiated native and invasive species as a function of summer growth rate and spring budbreak timing, species with smaller genomes exhibited both faster growth and delayed budbreak compared with those with larger nuclear DNA content. Growth rate, but not budbreak timing, was associated with whether a species was native or invasive. Our results support genome size as a broad indicator of the growth behavior of woody species. Surprisingly, invaders of deciduous forests show the same small-genome tendencies of invaders of more open habitats, supporting genome size as a robust indicator of invasiveness.

  18. Silicon-on-insulator sensors using integrated resonance-enhanced defect-mediated photodetectors.

    PubMed

    Fard, Sahba Talebi; Murray, Kyle; Caverley, Michael; Donzella, Valentina; Flueckiger, Jonas; Grist, Samantha M; Huante-Ceron, Edgar; Schmidt, Shon A; Kwok, Ezra; Jaeger, Nicolas A F; Knights, Andrew P; Chrostowski, Lukas

    2014-11-17

    A resonance-enhanced, defect-mediated, ring resonator photodetector has been implemented as a single unit biosensor on a silicon-on-insulator platform, providing a cost effective means of integrating ring resonator sensors with photodetectors for lab-on-chip applications. This method overcomes the challenge of integrating hybrid photodetectors on the chip. The demonstrated responsivity of the photodetector-sensor was 90 mA/W. Devices were characterized using refractive index modified solutions and showed sensitivities of 30 nm/RIU.

  19. Genome-Wide Analysis of Transposon and Retroviral Insertions Reveals Preferential Integrations in Regions of DNA Flexibility

    PubMed Central

    Vrljicak, Pavle; Tao, Shijie; Varshney, Gaurav K.; Quach, Helen Ngoc Bao; Joshi, Adita; LaFave, Matthew C.; Burgess, Shawn M.; Sampath, Karuna

    2016-01-01

    DNA transposons and retroviruses are important transgenic tools for genome engineering. An important consideration affecting the choice of transgenic vector is their insertion site preferences. Previous large-scale analyses of Ds transposon integration sites in plants were done on the basis of reporter gene expression or germ-line transmission, making it difficult to discern vertebrate integration preferences. Here, we compare over 1300 Ds transposon integration sites in zebrafish with Tol2 transposon and retroviral integration sites. Genome-wide analysis shows that Ds integration sites in the presence or absence of marker selection are remarkably similar and distributed throughout the genome. No strict motif was found, but a preference for structural features in the target DNA associated with DNA flexibility (Twist, Tilt, Rise, Roll, Shift, and Slide) was observed. Remarkably, this feature is also found in transposon and retroviral integrations in maize and mouse cells. Our findings show that structural features influence the integration of heterologous DNA in genomes, and have implications for targeted genome engineering. PMID:26818075

  20. Rice Annotation Project Database (RAP-DB): an integrative and interactive database for rice genomics.

    PubMed

    Sakai, Hiroaki; Lee, Sung Shin; Tanaka, Tsuyoshi; Numa, Hisataka; Kim, Jungsok; Kawahara, Yoshihiro; Wakimoto, Hironobu; Yang, Ching-chia; Iwamoto, Masao; Abe, Takashi; Yamada, Yuko; Muto, Akira; Inokuchi, Hachiro; Ikemura, Toshimichi; Matsumoto, Takashi; Sasaki, Takuji; Itoh, Takeshi

    2013-02-01

    The Rice Annotation Project Database (RAP-DB, http://rapdb.dna.affrc.go.jp/) has been providing a comprehensive set of gene annotations for the genome sequence of rice, Oryza sativa (japonica group) cv. Nipponbare. Since the first release in 2005, RAP-DB has been updated several times along with the genome assembly updates. Here, we present our newest RAP-DB based on the latest genome assembly, Os-Nipponbare-Reference-IRGSP-1.0 (IRGSP-1.0), which was released in 2011. We detected 37,869 loci by mapping transcript and protein sequences of 150 monocot species. To provide plant researchers with highly reliable and up to date rice gene annotations, we have been incorporating literature-based manually curated data, and 1,626 loci currently incorporate literature-based annotation data, including commonly used gene names or gene symbols. Transcriptional activities are shown at the nucleotide level by mapping RNA-Seq reads derived from 27 samples. We also mapped the Illumina reads of a Japanese leading japonica cultivar, Koshihikari, and a Chinese indica cultivar, Guangluai-4, to the genome and show alignments together with the single nucleotide polymorphisms (SNPs) and gene functional annotations through a newly developed browser, Short-Read Assembly Browser (S-RAB). We have developed two satellite databases, Plant Gene Family Database (PGFD) and Integrative Database of Cereal Gene Phylogeny (IDCGP), which display gene family and homologous gene relationships among diverse plant species. RAP-DB and the satellite databases offer simple and user-friendly web interfaces, enabling plant and genome researchers to access the data easily and facilitating a broad range of plant research topics.

  1. Efficient site-specific integration in Plasmodium falciparum chromosomes mediated by mycobacteriophage Bxb1 integrase

    PubMed Central

    Nkrumah, Louis J; Muhle, Rebecca A; Moura, Pedro A; Ghosh, Pallavi; Hatfull, Graham F; Jacobs, William R; Fidock, David A

    2010-01-01

    Here we report an efficient, site-specific system of genetic integration into Plasmodium falciparum malaria parasite chromosomes. This is mediated by mycobacteriophage Bxb1 integrase, which catalyzes recombination between an incoming attP and a chromosomal attB site. We developed P. falciparum lines with the attB site integrated into the glutaredoxin-like cg6 gene. Transfection of these attB+ lines with a dual-plasmid system, expressing a transgene on an attP-containing plasmid together with a drug resistance gene and the integrase on a separate plasmid, produced recombinant parasites within 2 to 4 weeks that were genetically uniform for single-copy plasmid integration. Integrase-mediated recombination resulted in proper targeting of parasite proteins to intra-erythrocytic compartments, including the apicoplast, a plastid-like organelle. Recombinant attB × attP parasites were genetically stable in the absence of drug and were phenotypically homogeneous. This system can be exploited for rapid genetic integration and complementation analyses at any stage of the P. falciparum life cycle, and it illustrates the utility of Bxb1-based integrative recombination for genetic studies of intracellular eukaryotic organisms. PMID:16862136

  2. Perspectives on clinical informatics: integrating large-scale clinical, genomic, and health information for clinical care.

    PubMed

    Choi, In Young; Kim, Tae-Min; Kim, Myung Shin; Mun, Seong K; Chung, Yeun-Jun

    2013-12-01

    The advances in electronic medical records (EMRs) and bioinformatics (BI) represent two significant trends in healthcare. The widespread adoption of EMR systems and the completion of the Human Genome Project developed the technologies for data acquisition, analysis, and visualization in two different domains. The massive amount of data from both clinical and biology domains is expected to provide personalized, preventive, and predictive healthcare services in the near future. The integrated use of EMR and BI data needs to consider four key informatics areas: data modeling, analytics, standardization, and privacy. Bioclinical data warehouses integrating heterogeneous patient-related clinical or omics data should be considered. The representative standardization effort by the Clinical Bioinformatics Ontology (CBO) aims to provide uniquely identified concepts to include molecular pathology terminologies. Since individual genome data are easily used to predict current and future health status, different safeguards to ensure confidentiality should be considered. In this paper, we focused on the informatics aspects of integrating the EMR community and BI community by identifying opportunities, challenges, and approaches to provide the best possible care service for our patients and the population.

  3. Perspectives on Clinical Informatics: Integrating Large-Scale Clinical, Genomic, and Health Information for Clinical Care

    PubMed Central

    Choi, In Young; Kim, Tae-Min; Kim, Myung Shin; Mun, Seong K.

    2013-01-01

    The advances in electronic medical records (EMRs) and bioinformatics (BI) represent two significant trends in healthcare. The widespread adoption of EMR systems and the completion of the Human Genome Project developed the technologies for data acquisition, analysis, and visualization in two different domains. The massive amount of data from both clinical and biology domains is expected to provide personalized, preventive, and predictive healthcare services in the near future. The integrated use of EMR and BI data needs to consider four key informatics areas: data modeling, analytics, standardization, and privacy. Bioclinical data warehouses integrating heterogeneous patient-related clinical or omics data should be considered. The representative standardization effort by the Clinical Bioinformatics Ontology (CBO) aims to provide uniquely identified concepts to include molecular pathology terminologies. Since individual genome data are easily used to predict current and future health status, different safeguards to ensure confidentiality should be considered. In this paper, we focused on the informatics aspects of integrating the EMR community and BI community by identifying opportunities, challenges, and approaches to provide the best possible care service for our patients and the population. PMID:24465229

  4. Approaches for the study of cancer: towards the integration of genomics, proteomics and metabolomics.

    PubMed

    Casado-Vela, Juan; Cebrián, Arancha; Gómez del Pulgar, María Teresa; Lacal, Juan Carlos

    2011-09-01

    Recent technological advances, combined with the development of bioinformatic tools, allow us to better address biological questions combining -omic approaches (i.e., genomics, metabolomics and proteomics). This novel comprehensive perspective addresses the identification, characterisation and quantitation of the whole repertoire of genes, proteins and metabolites occurring in living organisms. Here we provide an overview of recent significant advances and technologies used in genomics, metabolomics and proteomics. We also underline the importance and limits of mass accuracy in mass spectrometry-based -omics and briefly describe emerging types of fragmentation used in mass spectrometry. The range of instruments and techniques used to address the study of each -omic approach, which provide vast amounts of information (usually termed "high-throughput" technologies in the literature) is briefly discussed, including names, links and descriptions of the main databases, data repositories and resources used. Integration of multiple -omic results and procedures seems necessary. Therefore, an emerging challenge is the integration of the huge amount of data generated and the standardisation of the procedures and methods used. Functional data integration will lead to answers to unsolved questions, hopefully, applicable to clinical practice and management of patients.

  5. Integration of the Rat Recombination and EST Maps in the Rat Genomic Sequence and Comparative Mapping Analysis With the Mouse Genome

    PubMed Central

    Wilder, Steven P.; Bihoreau, Marie-Thérèse; Argoud, Karène; Watanabe, Takeshi K.; Lathrop, Mark; Gauguier, Dominique

    2004-01-01

    Inbred strains of the laboratory rat are widely used for identifying genetic regions involved in the control of complex quantitative phenotypes of biomedical importance. The draft genomic sequence of the rat now provides essential information for annotating rat quantitative trait locus (QTL) maps. Following the survey of unique rat microsatellite (11,585 including 1648 new markers) and EST (10,067) markers currently available, we have incorporated a selection of 7952 rat EST sequences in an improved version of the integrated linkage-radiation hybrid map of the rat containing 2058 microsatellite markers which provided over 10,000 potential anchor points between rat QTL and the genomic sequence of the rat. A total of 996 genetic positions were resolved (avg. spacing 1.77 cM) in a single large intercross and anchored in the rat genomic sequence (avg. spacing 1.62 Mb). Comparative genome maps between rat and mouse were constructed by successful computational alignment of 6108 mapped rat ESTs in the mouse genome. The integration of rat linkage maps in the draft genomic sequence of the rat and that of other species represents an essential step for translating rat QTL intervals into human chromosomal targets. PMID:15060020

  6. A statistical framework to predict functional non-coding regions in the human genome through integrated analysis of annotation data.

    PubMed

    Lu, Qiongshi; Hu, Yiming; Sun, Jiehuan; Cheng, Yuwei; Cheung, Kei-Hoi; Zhao, Hongyu

    2015-05-27

    Identifying functional regions in the human genome is a major goal in human genetics. Great efforts have been made to functionally annotate the human genome either through computational predictions, such as genomic conservation, or high-throughput experiments, such as the ENCODE project. These efforts have resulted in a rich collection of functional annotation data of diverse types that need to be jointly analyzed for integrated interpretation and annotation. Here we present GenoCanyon, a whole-genome annotation method that performs unsupervised statistical learning using 22 computational and experimental annotations thereby inferring the functional potential of each position in the human genome. With GenoCanyon, we are able to predict many of the known functional regions. The ability of predicting functional regions as well as its generalizable statistical framework makes GenoCanyon a unique and powerful tool for whole-genome annotation. The GenoCanyon web server is available at http://genocanyon.med.yale.edu.

  7. The GLOBE 3D Genome Platform - towards a novel system-biological paper tool to integrate the huge complexity of genome organization and function.

    PubMed

    Knoch, Tobias A; Lesnussa, Michael; Kepper, Nick; Eussen, Hubert B; Grosveld, Frank G

    2009-01-01

    Genomes are tremendous co-evolutionary holistic systems for molecular storage, processing and fabrication of information. Their system-biological complexity remains, however, still largely mysterious, despite immense sequencing achievements and huge advances in the understanding of the general sequential, three-dimensional and regulatory organization. Here, we present the GLOBE 3D Genome Platform, a completely novel grid-based virtual "paper" tool and in fact the first system-biological genome browser integrating the holistic complexity of genomes in a single, easily comprehensible platform. Based on a detailed study of biophysical and IT requirements, every architectural level from sequence to morphology of one or several genomes can be approached in a real and in a symbolic representation simultaneously and navigated by continuous scale-free zooming within a unique three-dimensional OpenGL and grid-driven environment. In principle, an unlimited number of multi-dimensional data sets can be visualized, customized in terms of arrangement, shape, colour, texture, etc., as well as accessed and annotated individually or in groups using internal or external databases/facilities. Any information can be searched and correlated by importing or calculating simple relations in real time using grid resources. A general correlation and application platform for more complex correlative analysis and a front-end for system-biological simulations, both again using the huge capabilities of grid infrastructures, is currently under development. Hence, the GLOBE 3D Genome Platform is an example of a grid-based approach towards a virtual desktop for genomic work combining the three fundamental distributed resources: i) visual data representation, ii) data access and management, and iii) data analysis and creation. Thus, the GLOBE 3D Genome Platform is the novel system-biology oriented information system urgently needed to access, present, annotate, and to simulate the holistic genome

  9. Organellar genome copy number variation and integrity during moderate maturation of roots and leaves of maize seedlings.

    PubMed

    Ma, Jin; Li, Xiu-Qing

    2015-11-01

    Little information is available about organellar genome copy numbers and integrity in plant roots, although it was reported recently that the plastid and mitochondrial genomes were damaged under light, resulting in non-functional fragments in green seedling leaves in a maize line. In the present study, we investigated organellar genome copy numbers and integrity, after assessing the cellular ploidy, in seedling leaves and roots of two elite maize (Zea mays) cultivars using both long-fragment polymerase chain reaction (long-PCR) and real-time quantitative polymerase chain reaction (qPCR, a type of short-PCR). Since maize leaf and root cells are mainly diploid according to chromosome number counting and the literature, the DNA amount ratio between the organellar genomes and the nuclear genome could be used to estimate average organellar genome copy numbers per cell. In the present study, both long-PCR and qPCR analyses found that green leaves had dramatically more plastid DNA and less mitochondrial DNA than roots had in both cultivars. The similarity in results from long-PCR and qPCR suggests that green leaves and roots during moderate maturation have largely intact plastid and mitochondrial genomes. The high resolution of qPCR led to the detection of an increase in copies in the plastid genome and a decrease in copies in the analyzed mitochondrial sub-genomes during the moderate maturation of seedling leaves and roots. These results suggest that green seedling leaves and roots of these two maize cultivars during moderate maturation had essentially intact organellar genomes, an increased copy number of the plastid genome, and decreased copy numbers of certain mitochondrial sub-genomes.
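
    The abstract does not give the authors' exact calculation, but the idea of estimating organellar genome copies per cell from the organellar-to-nuclear DNA ratio can be sketched with the common delta-Ct approach. The function name, parameters, and Ct values below are hypothetical; the sketch assumes equal amplification efficiency for both amplicons and a nuclear reference locus present once per haploid genome (two copies in a diploid cell).

```python
def organelle_copies_per_cell(ct_organelle, ct_nuclear,
                              nuclear_copies_per_cell=2, efficiency=2.0):
    """Estimate organellar genome copies per cell from qPCR Ct values.

    Assumptions (not taken from the source abstract):
    - both amplicons amplify with the same per-cycle efficiency
      (2.0 means perfect doubling each cycle);
    - the nuclear reference locus is single-copy per haploid genome,
      so a diploid cell carries `nuclear_copies_per_cell` = 2 copies.
    """
    # A lower Ct means more starting template; each cycle of difference
    # corresponds to one `efficiency`-fold difference in template amount.
    ratio = efficiency ** (ct_nuclear - ct_organelle)
    return nuclear_copies_per_cell * ratio


# Hypothetical Ct values: the plastid amplicon crosses threshold 7 cycles
# earlier than the nuclear reference, i.e. 2**7 = 128-fold more template.
copies = organelle_copies_per_cell(ct_organelle=18.0, ct_nuclear=25.0)
print(copies)  # 256.0
```

    Because the estimate scales with the assumed nuclear copy number, the ploidy check described in the abstract matters: endoreduplicated cells would inflate the nuclear reference and lower the apparent organellar copy number.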

  10. Bacterial populations as perfect gases: genomic integrity and diversification tensions in Helicobacter pylori.

    PubMed

    Kang, Josephine; Blaser, Martin J

    2006-11-01

    Microorganisms that persist in single hosts face particular challenges. Helicobacter pylori, an obligate bacterial parasite of the human stomach, has evolved a lifestyle that features interstrain competition and intraspecies cooperation, both of which involve horizontal gene transfer. Microbial species must maintain genomic integrity, yet H. pylori has evolved a complex nonlinear system for diversification that exists in dynamic tension with the mechanisms for ensuring fidelity. Here, we review these tensions and propose that they create a dynamic pool of genetic variants that is sufficiently genetically diverse to allow H. pylori to occupy all of the potential niches in the stomach. PMID:17041630

  11. Methods for integration of transcriptomic data in genome-scale metabolic models

    PubMed Central

    Kim, Min Kyung; Lun, Desmond S.

    2014-01-01

    Several computational methods have been developed that integrate transcriptomic data with genome-scale metabolic reconstructions to infer condition-specific system-wide intracellular metabolic flux distributions. In this mini-review, we describe each of the methods published to date and categorize them by four grouping criteria (requirement for multiple gene expression datasets as input, requirement for a threshold to define a gene's high and low expression, requirement for an a priori assumption of an appropriate objective function, and validation of predicted fluxes directly against measured intracellular fluxes). We then recommend which group of methods would be more suitable from a practical perspective. PMID:25379144

  12. Evolutionary aspects of plastid proteins involved in transcription: the transcription of a tiny genome is mediated by a complicated machinery.

    PubMed

    Yagi, Yusuke; Shiina, Takashi

    2012-01-01

    Chloroplasts in land plants have a small genome consisting of only 100 genes encoding partial sets of proteins for photosynthesis, transcription and translation. Although it has been thought that chloroplast transcription is mediated by a basically cyanobacterium-derived system, due to the endosymbiotic origin of plastids, recent studies suggest the existence of a hybrid transcription machinery containing non-bacterial proteins that have been newly acquired during plant evolution. Here, we highlight chloroplast-specific non-bacterial transcription mechanisms by which land plant chloroplasts have gained novel functions.

  13. Complete genome sequence of the Sporosarcina psychrophila DSM 6497, a psychrophilic Bacillus strain that mediates the calcium carbonate precipitation.

    PubMed

    Yan, Wenkai; Xiao, Xiang; Zhang, Yu

    2016-05-20

    Sporosarcina psychrophila DSM 6497 is a Gram-positive, spore-forming psychrophilic bacterial strain, widely distributed in terrestrial and aquatic environments. Here we report its complete genome sequence, comprising one circular chromosome of 4,674,191 bp with a GC content of 40.3%. Genes encoding urease are predicted in the genome, providing insight into the microbiologically mediated urea hydrolysis process. This urea hydrolysis can further increase the carbonate anion concentration and alkalinity of the environment, promoting microbiologically induced carbonate precipitation, which has various applications such as the bioremediation of calcium-rich wastewater and the bio-reservation of architectural patrimony.

  14. A ddRAD-based genetic map and its integration with the genome assembly of Japanese eel (Anguilla japonica) provides insights into genome evolution after the teleost-specific genome duplication

    PubMed Central

    2014-01-01

    Background: Recent advancements in next-generation sequencing technology have enabled cost-effective sequencing of whole or partial genomes, permitting the discovery and characterization of molecular polymorphisms. Double-digest restriction-site associated DNA sequencing (ddRAD-seq) is a powerful and inexpensive approach to developing numerous single nucleotide polymorphism (SNP) markers and constructing a high-density genetic map. To enrich genomic resources for Japanese eel (Anguilla japonica), we constructed a ddRAD-based genetic map using an Ion Torrent Personal Genome Machine and anchored scaffolds of the current genome assembly to 19 linkage groups of the Japanese eel. Furthermore, we compared the Japanese eel genome with genomes of model fishes to infer the history of genome evolution after the teleost-specific genome duplication. Results: We generated the ddRAD-based linkage map of the Japanese eel, where the maps for female and male spanned 1748.8 cM and 1294.5 cM, respectively, and were arranged into 19 linkage groups. A total of 2,672 SNP markers and 115 Simple Sequence Repeat markers provide anchor points to 1,252 scaffolds covering 151 Mb (13%) of the current genome assembly of the Japanese eel. Comparisons among the Japanese eel, medaka, zebrafish and spotted gar genomes showed highly conserved synteny among teleosts and revealed part of the eight major chromosomal rearrangement events that occurred soon after the teleost-specific genome duplication. Conclusions: The ddRAD-seq approach combined with the Ion Torrent Personal Genome Machine sequencing allowed us to conduct efficient and flexible SNP genotyping. The integration of the genetic map and the assembled sequence provides a valuable resource for fine mapping and positional cloning of quantitative trait loci associated with economically important traits and for investigating comparative genomics of the Japanese eel. PMID:24669946

  15. High-resolution FISH of the entire integrated Epstein-Barr virus genome on extended human DNA.

    PubMed

    Lestou, V S; Strehl, S; Lion, T; Gadner, H; Ambros, P F

    1996-01-01

    Here we report a high-resolution fluorescence in situ hybridization (FISH) analysis of the integrated Epstein-Barr virus (EBV) genome in chromosomes, decondensed interphase nuclear chromatin, and linearly extended chromatin fibers. We analyzed the EBV DNA integrated into the human genome in the well-characterized Burkitt's lymphoma cell line Namalwa, which contains two complete EBV genomes. The integration occurs via the terminal repeats of the virus and was always detectable at chromosome band 1p35. Using the biotinylated BamHIW fragment of the viral DNA, we observed distinct pairs of signals or small nuclear RNA "tracks" within interphase nuclei. FISH to stretched DNA fibers has a higher resolving power and, therefore, enables analysis of the structural organization of DNA. Application of this methodology to linearly extended chromatin of Namalwa cells using different EBV fragments allowed us to visualize the ordered arrangement of the integrated virus. Based on the predicted span of 0.34 nm per base pair for relaxed DNA, length measurements of 30 images showed a good correlation between the mean physical length of hybridized EBV DNA of 52.8 microns (158 kb) without the terminal repeats, and the EBV genomic length of 172 kb, including the terminal repeats. This DNA mapping procedure represents a useful tool for studying the structural organization of integrated viral genomes, and its application will have implications for the understanding of integration processes. PMID:8941376

  16. Multiple proviral integration events after virological synapse-mediated HIV-1 spread

    SciTech Connect

    Russell, Rebecca A.; Martin, Nicola; Mitar, Ivonne; Jones, Emma; Sattentau, Quentin J.

    2013-08-15

    HIV-1 can move directly between T cells via virological synapses (VS). Although aspects of the molecular and cellular mechanisms underlying this mode of spread have been elucidated, the outcomes for infection of the target cell remain incompletely understood. We set out to determine whether HIV-1 transfer via VS results in productive, high-multiplicity HIV-1 infection. We found that HIV-1 cell-to-cell spread resulted in nuclear import of multiple proviruses into target cells as seen by fluorescence in-situ hybridization. Proviral integration into the target cell genome was significantly higher than that seen in a cell-free infection system, and consequent de novo viral DNA and RNA production in the target cell detected by quantitative PCR increased over time. Our data show efficient proviral integration across VS, implying the probability of multiple integration events in target cells that drive productive T cell infection. - Highlights: • Cell-to-cell HIV-1 infection delivers multiple vRNA copies to the target cell. • Cell-to-cell infection results in productive infection of the target cell. • Cell-to-cell transmission is more efficient than cell-free HIV-1 infection. • Suggests a mechanism for recombination in cells infected with multiple viral genomes.

  17. Cytologically integrated physical restriction fragment length polymorphism maps for the barley genome based on translocation breakpoints.

    PubMed Central

    Künzel, G; Korzun, L; Meister, A

    2000-01-01

    We have developed a new technique for the physical mapping of barley chromosomes using microdissected translocation chromosomes for PCR with sequence-tagged site primers derived from >300 genetically mapped RFLP probes. The positions of 240 translocation breakpoints were integrated as physical landmarks into linkage maps of the seven barley chromosomes. This strategy proved to be highly efficient in relating physical to genetic distances. A very heterogeneous distribution of recombination rates was found along individual chromosomes. Recombination is mainly confined to a few relatively small areas spaced by large segments in which recombination is severely suppressed. The regions of highest recombination frequency (genome and harbor 47.3% of the 429 markers of the studied RFLP map. The results for barley correspond well with those obtained by deletion mapping in wheat. This indicates that chromosomal regions characterized by similar recombination frequencies and marker densities are highly conserved between the genomes of barley and wheat. The findings for barley support the conclusions drawn from deletion mapping in wheat that for all plant genomes, notwithstanding their size, the marker-rich regions are all of similar gene density and recombination activity and, therefore, should be equally accessible to map-based cloning. PMID:10628998

  18. Integrative functional genomics identifies regulatory mechanisms at coronary artery disease loci

    PubMed Central

    Miller, Clint L.; Pjanic, Milos; Wang, Ting; Nguyen, Trieu; Cohain, Ariella; Lee, Jonathan D.; Perisic, Ljubica; Hedin, Ulf; Kundu, Ramendra K.; Majmudar, Deshna; Kim, Juyong B.; Wang, Oliver; Betsholtz, Christer; Ruusalepp, Arno; Franzén, Oscar; Assimes, Themistocles L.; Montgomery, Stephen B.; Schadt, Eric E.; Björkegren, Johan L.M.; Quertermous, Thomas

    2016-01-01

    Coronary artery disease (CAD) is the leading cause of mortality and morbidity, driven by both genetic and environmental risk factors. Meta-analyses of genome-wide association studies have identified >150 loci associated with CAD and myocardial infarction susceptibility in humans. A majority of these variants reside in non-coding regions and are co-inherited with hundreds of candidate regulatory variants, presenting a challenge to elucidate their functions. Herein, we use integrative genomic, epigenomic and transcriptomic profiling of perturbed human coronary artery smooth muscle cells and tissues to begin to identify causal regulatory variation and mechanisms responsible for CAD associations. Using these genome-wide maps, we prioritize 64 candidate variants and perform allele-specific binding and expression analyses at seven top candidate loci: 9p21.3, SMAD3, PDGFD, IL6R, BMP1, CCDC97/TGFB1 and LMOD1. We validate our findings in expression quantitative trait loci cohorts, which together reveal new links between CAD associations and regulatory function in the appropriate disease context. PMID:27386823

  19. An integrated map of genetic variation from 1,092 human genomes.

    PubMed

    Abecasis, Goncalo R; Auton, Adam; Brooks, Lisa D; DePristo, Mark A; Durbin, Richard M; Handsaker, Robert E; Kang, Hyun Min; Marth, Gabor T; McVean, Gil A

    2012-11-01

    By characterizing the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help to understand the genetic contribution to disease. Here we describe the genomes of 1,092 individuals from 14 populations, constructed using a combination of low-coverage whole-genome and exome sequencing. By developing methods to integrate information across several algorithms and diverse data sources, we provide a validated haplotype map of 38 million single nucleotide polymorphisms, 1.4 million short insertions and deletions, and more than 14,000 larger deletions. We show that individuals from different populations carry different profiles of rare and common variants, and that low-frequency variants show substantial geographic differentiation, which is further increased by the action of purifying selection. We show that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways, and that each individual contains hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites. This resource, which captures up to 98% of accessible single nucleotide polymorphisms at a frequency of 1% in related populations, enables analysis of common and low-frequency variants in individuals from diverse, including admixed, populations. PMID:23128226

  20. An integrated map of genetic variation from 1,092 human genomes

    PubMed Central

    2012-01-01

    Summary Through characterising the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help understand the genetic contribution to disease. We describe the genomes of 1,092 individuals from 14 populations, constructed using a combination of low-coverage whole-genome and exome sequencing. By developing methodologies to integrate information across multiple algorithms and diverse data sources we provide a validated haplotype map of 38 million SNPs, 1.4 million indels and over 14 thousand larger deletions. We show that individuals from different populations carry different profiles of rare and common variants and that low-frequency variants show substantial geographic differentiation, which is further increased by the action of purifying selection. We show that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways and that each individual harbours hundreds of rare non-coding variants at conserved sites, such as transcription-factor-motif disrupting changes. This resource, which captures up to 98% of accessible SNPs at a frequency of 1% in populations of medical genetics focus, enables analysis of common and low-frequency variants in individuals from diverse, including admixed, populations. PMID:23128226

  1. Integrative functional genomics identifies regulatory mechanisms at coronary artery disease loci.

    PubMed

    Miller, Clint L; Pjanic, Milos; Wang, Ting; Nguyen, Trieu; Cohain, Ariella; Lee, Jonathan D; Perisic, Ljubica; Hedin, Ulf; Kundu, Ramendra K; Majmudar, Deshna; Kim, Juyong B; Wang, Oliver; Betsholtz, Christer; Ruusalepp, Arno; Franzén, Oscar; Assimes, Themistocles L; Montgomery, Stephen B; Schadt, Eric E; Björkegren, Johan L M; Quertermous, Thomas

    2016-01-01

    Coronary artery disease (CAD) is the leading cause of mortality and morbidity, driven by both genetic and environmental risk factors. Meta-analyses of genome-wide association studies have identified >150 loci associated with CAD and myocardial infarction susceptibility in humans. A majority of these variants reside in non-coding regions and are co-inherited with hundreds of candidate regulatory variants, presenting a challenge to elucidate their functions. Herein, we use integrative genomic, epigenomic and transcriptomic profiling of perturbed human coronary artery smooth muscle cells and tissues to begin to identify causal regulatory variation and mechanisms responsible for CAD associations. Using these genome-wide maps, we prioritize 64 candidate variants and perform allele-specific binding and expression analyses at seven top candidate loci: 9p21.3, SMAD3, PDGFD, IL6R, BMP1, CCDC97/TGFB1 and LMOD1. We validate our findings in expression quantitative trait loci cohorts, which together reveal new links between CAD associations and regulatory function in the appropriate disease context. PMID:27386823

  2. An integrated map of genetic variation from 1,092 human genomes.

    PubMed

    Abecasis, Goncalo R; Auton, Adam; Brooks, Lisa D; DePristo, Mark A; Durbin, Richard M; Handsaker, Robert E; Kang, Hyun Min; Marth, Gabor T; McVean, Gil A

    2012-11-01

    By characterizing the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help to understand the genetic contribution to disease. Here we describe the genomes of 1,092 individuals from 14 populations, constructed using a combination of low-coverage whole-genome and exome sequencing. By developing methods to integrate information across several algorithms and diverse data sources, we provide a validated haplotype map of 38 million single nucleotide polymorphisms, 1.4 million short insertions and deletions, and more than 14,000 larger deletions. We show that individuals from different populations carry different profiles of rare and common variants, and that low-frequency variants show substantial geographic differentiation, which is further increased by the action of purifying selection. We show that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways, and that each individual contains hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites. This resource, which captures up to 98% of accessible single nucleotide polymorphisms at a frequency of 1% in related populations, enables analysis of common and low-frequency variants in individuals from diverse, including admixed, populations.

  3. InvFEST, a database integrating information of polymorphic inversions in the human genome.

    PubMed

    Martínez-Fundichely, Alexander; Casillas, Sònia; Egea, Raquel; Ràmia, Miquel; Barbadilla, Antonio; Pantano, Lorena; Puig, Marta; Cáceres, Mario

    2014-01-01

    The newest genomic advances have uncovered an unprecedented degree of structural variation throughout genomes, with great amounts of data accumulating rapidly. Here we introduce InvFEST (http://invfestdb.uab.cat), a database combining multiple sources of information to generate a complete catalogue of non-redundant human polymorphic inversions. Due to the complexity of this type of change and the underlying high false-positive discovery rate, it is necessary to integrate all the available data to get a reliable estimate of the real number of inversions. InvFEST automatically merges predictions into different inversions, refines the breakpoint locations, and finds associations with genes and segmental duplications. In addition, it includes data on experimental validation, population frequency, functional effects and evolutionary history. All this information is readily accessible through a complete and user-friendly web report for each inversion. In its current version, InvFEST combines information from 34 different studies and contains 1092 candidate inversions, which are categorized based on internal scores and manual curation. Therefore, InvFEST aims to represent the most reliable set of human inversions and become a central repository to share information, guide future studies and contribute to the analysis of the functional and evolutionary impact of inversions on the human genome.

  4. Microbial modulators of soil carbon storage: integrating genomic and metabolic knowledge for global prediction.

    PubMed

    Trivedi, Pankaj; Anderson, Ian C; Singh, Brajesh K

    2013-12-01

    Soil organic carbon performs a number of functions in ecosystems and it is clear that microbial communities play important roles in land-atmosphere carbon (C) exchange and soil C storage. In this review, we discuss microbial modulators of soil C storage, 'omics'-based approaches to characterize microbial system interactions impacting terrestrial C sequestration, and how data related to microbial composition and activities can be incorporated into mechanistic and predictive models. We argue that although making direct linkage of genomes to global phenomena is a significant challenge, many connections at intermediate scales are viable with integrated application of new systems biology approaches and powerful analytical and modelling techniques. This integration could enhance our capability to develop and evaluate microbial strategies for capturing and sequestering atmospheric CO2.

  5. A genomic DNA segment from Petunia hybrida leads to increased transformation frequencies and simple integration patterns.

    PubMed Central

    Meyer, P; Kartzke, S; Niedenhof, I; Heidmann, I; Bussmann, K; Saedler, H

    1988-01-01

    A 2-kilobase (kb) genomic fragment was selected from Petunia hybrida that increased transformation efficiencies by at least a factor of 20 after direct DNA transfer to petunia and tobacco protoplasts when supercoiled plasmid DNA was used. Because of this effect this fragment was named transformation booster sequence (TBS). Increased transformation frequencies were observed for plasmids that contained either the 2-kb fragment in dimeric or monomeric form or an internal 1.1-kb fragment of TBS. Analysis of transformants revealed that preferentially one copy of foreign DNA is integrated. Thus, TBS improves the poor transformation frequencies of direct gene transfer using circular plasmids, while it conserves the simple integration pattern that is important for practical applications. Possible mechanisms of TBS action are discussed. PMID:3186747

  6. Integrative genomic mining for enzyme function to enable engineering of a non-natural biosynthetic pathway

    PubMed Central

    Mak, Wai Shun; Tran, Stephen; Marcheschi, Ryan; Bertolani, Steve; Thompson, James; Baker, David; Liao, James C.; Siegel, Justin B.

    2015-01-01

    The ability to biosynthetically produce chemicals beyond what is commonly found in Nature requires the discovery of novel enzyme function. Here we utilize two approaches to discover enzymes that enable specific production of longer-chain (C5–C8) alcohols from sugar. The first approach combines bioinformatics and molecular modelling to mine sequence databases, resulting in a diverse panel of enzymes capable of catalysing the targeted reaction. The median catalytic efficiency of the computationally selected enzymes is 75-fold greater than a panel of naively selected homologues. This integrative genomic mining approach establishes a unique avenue for enzyme function discovery in the rapidly expanding sequence databases. The second approach uses computational enzyme design to reprogramme specificity. Both approaches result in enzymes with >100-fold increase in specificity for the targeted reaction. When enzymes from either approach are integrated in vivo, longer-chain alcohol production increases over 10-fold and represents >95% of the total alcohol products. PMID:26598135

  7. Integrative genomic mining for enzyme function to enable engineering of a non-natural biosynthetic pathway.

    PubMed

    Mak, Wai Shun; Tran, Stephen; Marcheschi, Ryan; Bertolani, Steve; Thompson, James; Baker, David; Liao, James C; Siegel, Justin B

    2015-11-24

    The ability to biosynthetically produce chemicals beyond what is commonly found in Nature requires the discovery of novel enzyme function. Here we utilize two approaches to discover enzymes that enable specific production of longer-chain (C5-C8) alcohols from sugar. The first approach combines bioinformatics and molecular modelling to mine sequence databases, resulting in a diverse panel of enzymes capable of catalysing the targeted reaction. The median catalytic efficiency of the computationally selected enzymes is 75-fold greater than a panel of naively selected homologues. This integrative genomic mining approach establishes a unique avenue for enzyme function discovery in the rapidly expanding sequence databases. The second approach uses computational enzyme design to reprogramme specificity. Both approaches result in enzymes with >100-fold increase in specificity for the targeted reaction. When enzymes from either approach are integrated in vivo, longer-chain alcohol production increases over 10-fold and represents >95% of the total alcohol products.

  8. metabolicMine: an integrated genomics, genetics and proteomics data warehouse for common metabolic disease research.

    PubMed

    Lyne, Mike; Smith, Richard N; Lyne, Rachel; Aleksic, Jelena; Hu, Fengyuan; Kalderimis, Alex; Stepan, Radek; Micklem, Gos

    2013-01-01

    Common metabolic and endocrine diseases such as diabetes affect millions of people worldwide and have a major health impact, frequently leading to complications and mortality. In a search for better prevention and treatment, there is ongoing research into the underlying molecular and genetic bases of these complex human diseases, as well as into the links with risk factors such as obesity. Although an increasing number of relevant genomic and proteomic data sets have become available, the quantity and diversity of the data make their efficient exploitation challenging. Here, we present metabolicMine, a data warehouse with a specific focus on the genomics, genetics and proteomics of common metabolic diseases. Developed in collaboration with leading UK metabolic disease groups, metabolicMine integrates data sets from a range of experiments and model organisms alongside tools for exploring them. The current version brings together information covering genes, proteins, orthologues, interactions, gene expression, pathways, ontologies, diseases, genome-wide association studies and single nucleotide polymorphisms. Although the emphasis is on human data, key data sets from mouse and rat are included. These are complemented by interoperation with the RatMine rat genomics database, with a corresponding mouse version under development by the Mouse Genome Informatics (MGI) group. The web interface contains a number of features including keyword search, a library of Search Forms, the QueryBuilder and list analysis tools. This provides researchers with many different ways to analyse, view and flexibly export data. Programming interfaces and automatic code generation in several languages are supported, and many of the features of the web interface are available through web services. The combination of diverse data sets integrated with analysis tools and a powerful query system makes metabolicMine a valuable research resource. The web interface makes it accessible to first

  9. Mycobacterium tuberculosis EsxO (Rv2346c) promotes bacillary survival by inducing oxidative stress mediated genomic instability in macrophages.

    PubMed

    Mohanty, Soumitra; Dal Molin, Michael; Ganguli, Geetanjali; Padhi, Avinash; Jena, Prajna; Selchow, Petra; Sengupta, Srabasti; Meuli, Michael; Sander, Peter; Sonawane, Avinash

    2016-01-01

    Mycobacterium tuberculosis (Mtb) survives inside the macrophages by modulating the host immune responses in its favor. The 6-kDa early secretory antigenic target (ESAT-6; esxA) of Mtb is known as a potent virulence and T-cell antigenic determinant. At least 23 such ESAT-6 family proteins are encoded in the genome of Mtb; however, the function of many of them is still unknown. We herein report that ectopic expression of Mtb Rv2346c (esxO), a member of ESAT-6 family proteins, in non-pathogenic Mycobacterium smegmatis strain (MsmRv2346c) aids host cell invasion and intracellular bacillary persistence. Further mechanistic studies revealed that MsmRv2346c infection abated macrophage immunity by inducing host cell death and genomic instability as evident from the appearance of several DNA damage markers. We further report that the induction of genomic instability in infected cells was due to an increase in the host's oxidative stress responses. MsmRv2346c infection was also found to induce autophagy and modulate the immune function of macrophages. In contrast, blockade of Rv2346c-induced oxidative stress by treatment with ROS inhibitor N-acetyl-L-cysteine prevented the host cell death, autophagy induction and genomic instability in infected macrophages. Conversely, MtbΔRv2346c mutant did not show any difference in intracellular survival and oxidative stress responses. We envision that Mtb ESAT-6 family protein Rv2346c dampens antibacterial effector functions namely by inducing oxidative stress mediated genomic instability in infected macrophages, while loss of Rv2346c gene function may be compensated by other redundant ESAT-6 family proteins. Thus EsxO plays an important role in mycobacterial pathogenesis in the context of innate immunity. PMID:26786654

  10. NGS-based approach to determine the presence of HPV and their sites of integration in human cancer genome

    PubMed Central

    Chandrani, P; Kulkarni, V; Iyer, P; Upadhyay, P; Chaubal, R; Das, P; Mulherkar, R; Singh, R; Dutt, A

    2015-01-01

    Background: Human papilloma virus (HPV) is the most common cause of virus-associated human cancers. Here, we describe the first graphic user interface (GUI)-based automated tool ‘HPVDetector’, for non-computational biologists, exclusively for detection and annotation of the HPV genome based on next-generation sequencing data sets. Methods: We developed a custom-made reference genome that comprises human chromosomes along with the annotated genomes of 143 HPV types as pseudochromosomes. The tool runs in one of two modes, as defined by the user: a ‘quick mode’ to identify the presence of HPV types and an ‘integration mode’ to determine the genomic location of the site of integration. The input data can be a paired-end whole-exome, whole-genome or whole-transcriptome data set. The HPVDetector is available in public domain for download: http://www.actrec.gov.in/pi-webpages/AmitDutt/HPVdetector/HPVDetector.html. Results: On the basis of our evaluation of 116 whole-exome, 23 whole-transcriptome and 2 whole-genome data, we were able to identify presence of HPV in 20 exomes and 4 transcriptomes of cervical and head and neck cancer tumour samples. Using the inbuilt annotation module of HPVDetector, we found predominant integration of viral gene E7, a known oncogene, at known 17q21, 3q27, 7q35, Xq28 and novel sites of integration in the human genome. Furthermore, co-infection with high-risk HPVs such as 16 and 31 were found to be mutually exclusive compared with low-risk HPV71. Conclusions: HPVDetector is a simple yet precise and robust tool for detecting HPV from tumour samples using a variety of next-generation sequencing platforms including whole genome, whole exome and transcriptome. Two different modes (quick detection and integration mode) along with a GUI widen the usability of HPVDetector for biologists and clinicians with minimal computational knowledge. PMID:25973533
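
    The two modes described in this abstract can be illustrated with a toy sketch over paired-end alignments to a combined human-plus-HPV reference. This is not HPVDetector's implementation: the `ReadPair` record, the function names, the convention that viral pseudochromosomes are named with an "HPV" prefix, and the example coordinates are all assumptions made for illustration.

```python
from collections import Counter
from typing import List, NamedTuple, Tuple


class ReadPair(NamedTuple):
    """Hypothetical record of where the two mates of a read pair aligned."""
    name: str
    mate1_contig: str
    mate1_pos: int
    mate2_contig: str
    mate2_pos: int


def is_viral(contig: str) -> bool:
    # Assumed naming convention: viral pseudochromosomes start with "HPV".
    return contig.startswith("HPV")


def detect_types(pairs: List[ReadPair]) -> Counter:
    """'Quick mode' idea: count pairs whose mates both hit the same HPV type."""
    counts: Counter = Counter()
    for p in pairs:
        if is_viral(p.mate1_contig) and p.mate1_contig == p.mate2_contig:
            counts[p.mate1_contig] += 1
    return counts


def integration_candidates(pairs: List[ReadPair]) -> List[Tuple[str, int, str]]:
    """'Integration mode' idea: keep pairs with one human and one viral mate.

    Returns (human_contig, human_position, hpv_type) for each such pair.
    """
    hits = []
    for p in pairs:
        v1, v2 = is_viral(p.mate1_contig), is_viral(p.mate2_contig)
        if v1 and not v2:
            hits.append((p.mate2_contig, p.mate2_pos, p.mate1_contig))
        elif v2 and not v1:
            hits.append((p.mate1_contig, p.mate1_pos, p.mate2_contig))
    return hits


# Made-up alignments, for illustration only.
EXAMPLE_PAIRS = [
    ReadPair("r1", "HPV16", 100, "HPV16", 350),   # both mates viral
    ReadPair("r2", "chr17", 43_000_000, "HPV16", 7_600),  # human/viral pair
    ReadPair("r3", "HPV16", 500, "chr3", 186_000_000),    # viral/human pair
    ReadPair("r4", "chr1", 1_000, "chr1", 1_300),  # both mates human
]

if __name__ == "__main__":
    print(detect_types(EXAMPLE_PAIRS))
    print(integration_candidates(EXAMPLE_PAIRS))
```

    Treating each HPV type as a pseudochromosome means a human/viral mate pair looks like an ordinary inter-chromosomal discordant pair, so a single alignment pass can support both modes. A real tool would also need mapping-quality filters and clustering of supporting reads, which this sketch omits.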

  11. Purdue Ionomics Information Management System. An Integrated Functional Genomics Platform

    PubMed Central

    Baxter, Ivan; Ouzzani, Mourad; Orcun, Seza; Kennedy, Brad; Jandhyala, Shrinivas S.; Salt, David E.

    2007-01-01

    The advent of high-throughput phenotyping technologies has created a deluge of information that is difficult to deal with without the appropriate data management tools. These data management tools should integrate defined workflow controls for genomic-scale data acquisition and validation, data storage and retrieval, and data analysis, indexed around the genomic information of the organism of interest. To maximize the impact of these large datasets, it is critical that they are rapidly disseminated to the broader research community, allowing open access for data mining and discovery. We describe here a system that incorporates such functionalities developed around the Purdue University high-throughput ionomics phenotyping platform. The Purdue Ionomics Information Management System (PiiMS) provides integrated workflow control, data storage, and analysis to facilitate high-throughput data acquisition, along with integrated tools for data search, retrieval, and visualization for hypothesis development. PiiMS is deployed as a World Wide Web-enabled system, allowing for integration of distributed workflow processes and open access to raw data for analysis by numerous laboratories. PiiMS currently contains data on shoot concentrations of P, Ca, K, Mg, Cu, Fe, Zn, Mn, Co, Ni, B, Se, Mo, Na, As, and Cd in over 60,000 shoot tissue samples of Arabidopsis (Arabidopsis thaliana), including ethyl methanesulfonate, fast-neutron and defined T-DNA mutants, and natural accession and populations of recombinant inbred lines from over 800 separate experiments, representing over 1,000,000 fully quantitative elemental concentrations. PiiMS is accessible at www.purdue.edu/dp/ionomics. PMID:17189337

  12. An Integrative Genomic and Transcriptomic Analysis Reveals Potential Targets Associated with Cell Proliferation in Uterine Leiomyomas

    PubMed Central

    Cirilo, Priscila Daniele Ramos; Marchi, Fábio Albuquerque; Barros Filho, Mateus de Camargo; Rocha, Rafael Malagoli; Domingues, Maria Aparecida Custódio; Jurisica, Igor; Pontes, Anagloria; Rogatto, Silvia Regina

    2013-01-01

    Background Uterine Leiomyomas (ULs) are the most common benign tumours affecting women of reproductive age. ULs represent a major problem in public health, as they are the main indication for hysterectomy. Approximately 40–50% of ULs have non-random cytogenetic abnormalities, and half of ULs may have copy number alterations (CNAs). Gene expression microarray studies have demonstrated that cell proliferation genes act in response to growth factors and steroids. However, only a few genes mapping to CNA regions were found to be associated with ULs. Methodology We applied an integrative analysis using genomic and transcriptomic data to identify the pathways and molecular markers associated with ULs. Fifty-one fresh frozen specimens were evaluated by array CGH (JISTIC) and gene expression microarrays (SAM). The CONEXIC algorithm was applied to integrate the data. Principal Findings The integrated analysis identified the top 30 significant genes (P<0.01), which comprised genes associated with cancer, whereas the protein-protein interaction analysis indicated a strong association between FANCA and BRCA1. Functional in silico analysis revealed target molecules for drugs involved in cell proliferation, including FGFR1 and IGFBP5. Transcriptional and protein analyses showed that FGFR1 (P = 0.006 and P<0.01, respectively) and IGFBP5 (P = 0.0002 and P = 0.006, respectively) were up-regulated in the tumours when compared with the adjacent normal myometrium. Conclusions The integrative genomic and transcriptomic approach indicated that FGFR1 and IGFBP5 amplification, as well as the consequent up-regulation of the protein products, plays an important role in the aetiology of ULs and thus provides data for the development of potential drug therapies targeting genes associated with cellular proliferation in ULs. PMID:23483937

  13. Bayesian Cue Integration as a Developmental Outcome of Reward Mediated Learning

    PubMed Central

    Weisswange, Thomas H.; Rothkopf, Constantin A.; Rodemann, Tobias; Triesch, Jochen

    2011-01-01

    Average human behavior in cue combination tasks is well predicted by Bayesian inference models. As this capability is acquired over developmental timescales, the question arises of how it is learned. Here we investigated whether reward-dependent learning, which is well established at the computational, behavioral, and neuronal levels, could contribute to this development. It is shown that a model-free reinforcement learning algorithm can indeed learn to do cue integration, i.e., weight uncertain cues according to their respective reliabilities, and can even do so if reliabilities are changing. We also consider the case of causal inference, where multimodal signals can originate from one or multiple separate objects and should not always be integrated. In this case, the learner is shown to develop a behavior that is closest to Bayesian model averaging. We conclude that reward-mediated learning could be a driving force for the development of cue integration and causal inference. PMID:21750717
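
    For readers unfamiliar with the benchmark that such a learner is compared against, the sketch below shows the standard reliability-weighted combination of independent Gaussian cues, in which each cue is weighted by its inverse variance. It is not the reinforcement learning model from the paper; the function name and the example numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def combine_cues(
    estimates: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float, List[float]]:
    """Combine independent Gaussian cues by inverse-variance weighting.

    Returns (combined_estimate, combined_variance, weights).
    """
    if len(estimates) != len(variances) or len(estimates) == 0:
        raise ValueError("need one positive variance per estimate")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # Reliability of a cue is the inverse of its variance.
    reliabilities = [1.0 / v for v in variances]
    total = sum(reliabilities)

    # Each weight is the cue's share of the total reliability.
    weights = [r / total for r in reliabilities]
    combined = sum(w * x for w, x in zip(weights, estimates))

    # The combined estimate is more reliable than any single cue.
    combined_variance = 1.0 / total
    return combined, combined_variance, weights


if __name__ == "__main__":
    # A reliable cue (variance 1) and a less reliable one (variance 4).
    estimate, variance, weights = combine_cues([10.0, 20.0], [1.0, 4.0])
    print(estimate, variance, weights)
```

    In this example the more reliable cue receives four times the weight of the other, so the combined estimate sits much closer to it, and the combined variance is lower than that of either cue alone.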

  14. Microenvironmental Heterogeneity Parallels Breast Cancer Progression: A Histology–Genomic Integration Analysis

    PubMed Central

    Natrajan, Rachael; Sailem, Heba; Mardakheh, Faraz K.; Arias Garcia, Mar; Tape, Christopher J.; Dowsett, Mitch; Bakal, Chris; Yuan, Yinyin

    2016-01-01

    Background The intra-tumor diversity of cancer cells is under intense investigation; however, little is known about the heterogeneity of the tumor microenvironment that is key to cancer progression and evolution. We aimed to assess the degree of microenvironmental heterogeneity in breast cancer and correlate this with genomic and clinical parameters. Methods and Findings We developed a quantitative measure of microenvironmental heterogeneity along three spatial dimensions (3-D) in solid tumors, termed the tumor ecosystem diversity index (EDI), using fully automated histology image analysis coupled with statistical measures commonly used in ecology. This measure was compared with disease-specific survival, key mutations, genome-wide copy number, and expression profiling data in a retrospective study of 510 breast cancer patients as a test set and 516 breast cancer patients as an independent validation set. In high-grade (grade 3) breast cancers, we uncovered a striking link between high microenvironmental heterogeneity measured by EDI and a poor prognosis that cannot be explained by tumor size, genomics, or any other data types. However, this association was not observed in low-grade (grade 1 and 2) breast cancers. The prognostic value of EDI was superior to known prognostic factors and was enhanced with the addition of TP53 mutation status (multivariate analysis test set, p = 9 × 10⁻⁴, hazard ratio = 1.47, 95% CI 1.17–1.84; validation set, p = 0.0011, hazard ratio = 1.78, 95% CI 1.26–2.52). Integration with genome-wide profiling data identified losses of specific genes on 4p14 and 5q13 that were enriched in grade 3 tumors with high microenvironmental diversity that also substratified patients into poor prognostic groups. Limitations of this study include the number of cell types included in the model, that EDI has prognostic value only in grade 3 tumors, and that our spatial heterogeneity measure was dependent on spatial scale and tumor size. Conclusions To

  15. Circulating nucleic acids damage DNA of healthy cells by integrating into their genomes.

    PubMed

    Mittra, Indraneel; Khare, Naveen Kumar; Raghuram, Gorantla Venkata; Chaubal, Rohan; Khambatti, Fatema; Gupta, Deepika; Gaikwad, Ashwini; Prasannan, Preeti; Singh, Akshita; Iyer, Aishwarya; Singh, Ankita; Upadhyay, Pawan; Nair, Naveen Kumar; Mishra, Pradyumna Kumar; Dutt, Amit

    2015-03-01

    Whether nucleic acids that circulate in blood have any patho-physiological functions in the host has not been explored. We report here that far from being inert molecules, circulating nucleic acids have significant biological activities of their own that are deleterious to healthy cells of the body. Fragmented DNA and chromatin (DNAfs and Cfs) isolated from blood of cancer patients and healthy volunteers are readily taken up by a variety of cells in culture to be localized in their nuclei within a few minutes. The intra-nuclear DNAfs and Cfs associate themselves with host cell chromosomes to evoke a cellular DNA-damage-repair-response (DDR) followed by their incorporation into the host cell genomes. Whole genome sequencing detected the presence of tens of thousands of human sequence reads in the recipient mouse cells. Genomic incorporation of DNAfs and Cfs leads to dsDNA breaks and activation of apoptotic pathways in the treated cells. When injected intravenously into Balb/C mice, DNAfs and Cfs undergo genomic integration into cells of their vital organs resulting in activation of DDR and apoptotic proteins in the recipient cells. Cfs have significantly greater activity than DNAfs with respect to all parameters examined, while both DNAfs and Cfs isolated from cancer patients are more active than those from normal volunteers. All the pathological actions of DNAfs and Cfs described above can be abrogated by concurrent treatment with DNase I and/or anti-histone antibody complexed nanoparticles both in vitro and in vivo. Taken together, our results suggest that circulating DNAfs and Cfs are physiological, continuously arising, endogenous DNA damaging agents with implications for ageing and a multitude of human pathologies including initiation of cancer.

  16. Polymorphic integrations of an endogenous gammaretrovirus in the mule deer genome.

    PubMed

    Elleder, Daniel; Kim, Oekyung; Padhi, Abinash; Bankert, Jason G; Simeonov, Ivan; Schuster, Stephan C; Wittekindt, Nicola E; Motameny, Susanne; Poss, Mary

    2012-03-01

    Endogenous retroviruses constitute a significant genomic fraction in all mammalian species. Typically they are evolutionarily old and fixed in the host species population. Here we report on a novel endogenous gammaretrovirus (CrERVγ; for cervid endogenous gammaretrovirus) in the mule deer (Odocoileus hemionus) that is insertionally polymorphic among individuals from the same geographical location, suggesting that it has a more recent evolutionary origin. Using PCR-based methods, we identified seven CrERVγ proviruses and demonstrated that they show various levels of insertional polymorphism in mule deer individuals. One CrERVγ provirus was detected in all mule deer sampled but was absent from white-tailed deer, indicating that this virus originally integrated after the split of the two species, which occurred approximately one million years ago. There are, on average, 100 CrERVγ copies in the mule deer genome based on quantitative PCR analysis. A CrERVγ provirus was sequenced and contained intact open reading frames (ORFs) for three virus genes. Transcripts were identified covering the entire provirus. CrERVγ forms a distinct branch of the gammaretrovirus phylogeny, with the closest relatives of CrERVγ being endogenous gammaretroviruses from sheep and pig. We demonstrated that white-tailed deer (Odocoileus virginianus) and elk (Cervus canadensis) DNA contain proviruses that are closely related to mule deer CrERVγ in a conserved region of pol; more distantly related sequences can be identified in the genome of another member of the Cervidae, the muntjac (Muntiacus muntjak). The discovery of a novel transcriptionally active and insertionally polymorphic retrovirus in mammals could provide a useful model system to study the dynamic interaction between the host genome and an invading retrovirus.

  18. Bridging the Gap from Bench to Bedside--An Informatics Infrastructure for Integrating Clinical, Genomics and Environmental Data (ICGED).

    PubMed

    2015-01-01

    The abundance of heterogeneous biomedical data from a variety of sources demands strategies for data integration and management, so that the data can be used effectively in clinical practice and biomedical research. This research presents an Informatics Infrastructure for Integrating Clinical, Genomics and Environmental Data (ICGED) and provides a roadmap that envisions utilizing the clinical and biomedical resources in our case study. This work describes a data integration approach, proposed by ICGED, with a two-fold purpose: supporting personalized medicine and providing a platform for biomedical data storage and sharing. It describes our experiences integrating disease-specific clinical and genomics datasets with Data Integration and Analysis Tools (DIAT), using Informatics for Integrating Biology and the Bedside, and discusses work in progress and future work for extending DIAT and developing Risk Assessment and Prediction Tools, Clinical Decision Support Systems and a Bioinformatics Data Warehouse. PMID:26262353

  19. An integrated framework for discovery and genotyping of genomic variants from high-throughput sequencing experiments.

    PubMed

    Duitama, Jorge; Quintero, Juan Camilo; Cruz, Daniel Felipe; Quintero, Constanza; Hubmann, Georg; Foulquié-Moreno, Maria R; Verstrepen, Kevin J; Thevelein, Johan M; Tohme, Joe

    2014-04-01

    Recent advances in high-throughput sequencing (HTS) technologies and computing capacity have produced unprecedented amounts of genomic data that have unraveled the genetics of phenotypic variability in several species. However, operating and integrating current software tools for data analysis still require important investments in highly skilled personnel. Developing accurate, efficient and user-friendly software packages for HTS data analysis will lead to a more rapid discovery of genomic elements relevant to medical, agricultural and industrial applications. We therefore developed Next-Generation Sequencing Eclipse Plug-in (NGSEP), a new software tool for integrated, efficient and user-friendly detection of single nucleotide variants (SNVs), indels and copy number variants (CNVs). NGSEP includes modules for read alignment, sorting, merging, functional annotation of variants, filtering and quality statistics. Analysis of sequencing experiments in yeast, rice and human samples shows that NGSEP has superior accuracy and efficiency, compared with currently available packages for variants detection. We also show that only a comprehensive and accurate identification of repeat regions and CNVs allows researchers to properly separate SNVs from differences between copies of repeat elements. We expect that NGSEP will become a strong support tool to empower the analysis of sequencing data in a wide range of research projects on different species. PMID:24413664

  20. An integrated framework for discovery and genotyping of genomic variants from high-throughput sequencing experiments

    PubMed Central

    Duitama, Jorge; Quintero, Juan Camilo; Cruz, Daniel Felipe; Quintero, Constanza; Hubmann, Georg; Foulquié-Moreno, Maria R.; Verstrepen, Kevin J.; Thevelein, Johan M.; Tohme, Joe

    2014-01-01

    Recent advances in high-throughput sequencing (HTS) technologies and computing capacity have produced unprecedented amounts of genomic data that have unraveled the genetics of phenotypic variability in several species. However, operating and integrating current software tools for data analysis still require important investments in highly skilled personnel. Developing accurate, efficient and user-friendly software packages for HTS data analysis will lead to a more rapid discovery of genomic elements relevant to medical, agricultural and industrial applications. We therefore developed Next-Generation Sequencing Eclipse Plug-in (NGSEP), a new software tool for integrated, efficient and user-friendly detection of single nucleotide variants (SNVs), indels and copy number variants (CNVs). NGSEP includes modules for read alignment, sorting, merging, functional annotation of variants, filtering and quality statistics. Analysis of sequencing experiments in yeast, rice and human samples shows that NGSEP has superior accuracy and efficiency, compared with currently available packages for variants detection. We also show that only a comprehensive and accurate identification of repeat regions and CNVs allows researchers to properly separate SNVs from differences between copies of repeat elements. We expect that NGSEP will become a strong support tool to empower the analysis of sequencing data in a wide range of research projects on different species. PMID:24413664

  1. Shade avoidance 6 encodes an Arabidopsis flap endonuclease required for maintenance of genome integrity and development

    PubMed Central

    Zhang, Yijuan; Wen, Chunhong; Liu, Songbai; Zheng, Li; Shen, Binghui; Tao, Yi

    2016-01-01

    Flap endonuclease-1 (FEN1) belongs to the Rad2 family of structure-specific nucleases. It is required for several DNA metabolic pathways, including DNA replication and DNA damage repair. Here, we have identified a shade avoidance mutant, sav6, which reduces the mRNA splicing efficiency of SAV6. We have demonstrated that SAV6 is an FEN1 homologue that shows double-flap endonuclease and gap-dependent endonuclease activity, but lacks exonuclease activity. sav6 mutants are hypersensitive to DNA damage induced by ultraviolet (UV)-C radiation and reagents that induce double-stranded DNA breaks, but exhibit normal responses to chemicals that block DNA replication. Signalling components that respond to DNA damage are constitutively activated in sav6 mutants. These data indicate that SAV6 is required for DNA damage repair and the maintenance of genome integrity. Mutant sav6 plants also show reduced root apical meristem (RAM) size and defective quiescent centre (QC) development. The expression of SMR7, a cell cycle regulatory gene, and ERF115 and PSK5, regulators of QC division, is increased in sav6 mutants. Their constitutive induction is likely due to the elevated DNA damage responses in sav6 and may lead to defects in the development of the RAM and QC. Therefore, SAV6 assures proper root development through maintenance of genome integrity. PMID:26721386

  2. Drug-target interaction prediction by integrating chemical, genomic, functional and pharmacological data.

    PubMed

    Yang, Fan; Xu, Jinbo; Zeng, Jianyang

    2014-01-01

    In silico prediction of unknown drug-target interactions (DTIs) has become a popular tool for drug repositioning and drug development. A key challenge in DTI prediction lies in integrating multiple types of data for accurate DTI prediction. Although recent studies have demonstrated that genomic, chemical and pharmacological data can provide reliable information for DTI prediction, it remains unclear whether functional information on proteins can also contribute to this task. Little work has been developed to combine such information with other data to identify new interactions between drugs and targets. In this paper, we introduce functional data into DTI prediction and construct biological space for targets using the functional similarity measure. We present a probabilistic graphical model, called conditional random field (CRF), to systematically integrate genomic, chemical, functional and pharmacological data plus the topology of DTI networks into a unified framework to predict missing DTIs. Tests on two benchmark datasets show that our method can achieve excellent prediction performance with the area under the precision-recall curve (AUPR) up to 94.9. These results demonstrate that our CRF model can successfully exploit heterogeneous data to capture the latent correlations of DTIs, and thus will be practically useful for drug repositioning. Supplementary Material is available at http://iiis.tsinghua.edu.cn/~compbio/papers/psb2014/psb2014_sm.pdf. PMID:24297542
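
    For readers unfamiliar with the reported metric, the area under the precision-recall curve can be estimated from a ranked list of predicted interactions. The sketch below uses the average-precision estimator, which is one common choice; the abstract does not say which estimator the authors used, and the function name and toy inputs are illustrative assumptions rather than anything taken from the paper.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Estimate the area under the precision-recall curve by average precision.

    labels: 1 for a true drug-target interaction, 0 otherwise (hypothetical data).
    scores: predicted confidence for each candidate pair, higher = more likely.
    Ties in score are broken by input order, which is an arbitrary choice here.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_positive = sum(labels)
    if n_positive == 0:
        raise ValueError("at least one positive label is required")

    # Rank candidate pairs from highest to lowest predicted score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, index in enumerate(order, start=1):
        if labels[index]:
            hits += 1
            # Precision at the rank where this true interaction is recovered.
            precision_sum += hits / rank
    return precision_sum / n_positive
```

    A ranking that places every true interaction ahead of every false one gives a value of 1.0; the value falls as false candidates are ranked above true ones.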

  4. Regulators of cyclin-dependent kinases are crucial for maintaining genome integrity in S phase

    PubMed Central

    Beck, Halfdan; Nähse, Viola; Larsen, Marie Sofie Yoo; Groth, Petra; Clancy, Trevor; Lees, Michael; Jørgensen, Mette; Helleday, Thomas; Syljuåsen, Randi G.

    2010-01-01

    Maintenance of genome integrity is of critical importance to cells. To identify key regulators of genomic integrity, we screened a human cell line with a kinome small interfering RNA library. WEE1, a major regulator of mitotic entry, and CHK1 were among the genes identified. Both kinases are important negative regulators of CDK1 and -2. Strikingly, WEE1 depletion rapidly induced DNA damage in S phase in newly replicated DNA, which was accompanied by a marked increase in single-stranded DNA. This DNA damage is dependent on CDK1 and -2 as well as the replication proteins MCM2 and CDT1 but not CDC25A. Conversely, DNA damage after CHK1 inhibition is highly dependent on CDC25A. Furthermore, the inferior proliferation of CHK1-depleted cells is improved substantially by codepletion of CDC25A. We conclude that the mitotic kinase WEE1 and CHK1 jointly maintain balanced cellular control of Cdk activity during normal DNA replication, which is crucial to prevent the generation of harmful DNA lesions during replication. PMID:20194642

  5. A novel integrated cytogenetic and genomic classification refines risk stratification in pediatric acute lymphoblastic leukemia.

    PubMed

    Moorman, Anthony V; Enshaei, Amir; Schwab, Claire; Wade, Rachel; Chilton, Lucy; Elliott, Alannah; Richardson, Stacey; Hancock, Jeremy; Kinsey, Sally E; Mitchell, Christopher D; Goulden, Nicholas; Vora, Ajay; Harrison, Christine J

    2014-08-28

    Recent genomic studies have provided a refined genetic map of acute lymphoblastic leukemia (ALL) and increased the number of potential prognostic markers. Therefore, we integrated copy-number alteration data from the 8 most commonly deleted genes, subordinately, with established chromosomal abnormalities to derive a 2-tier genetic classification. The classification was developed using 809 ALL97/99 patients and validated using 742 United Kingdom (UK)ALL2003 patients. Good-risk (GR) genetic features included ETV6-RUNX1, high hyperdiploidy, normal copy-number status for all 8 genes, isolated deletions affecting ETV6/PAX5/BTG1, and ETV6 deletions with a single additional deletion of BTG1/PAX5/CDKN2A/B. All other genetic features were classified as poor risk (PR). Three-quarters of UKALL2003 patients had a GR genetic profile and a significantly improved event-free survival (EFS) (94%) compared with patients with a PR genetic profile (79%). This difference was driven by a lower relapse rate (4% vs 17%), was seen across all patient subgroups, and was independent of other risk factors. Even genetic GR patients with minimal residual disease (>0.01%) at day 29 had an EFS in excess of 90%. In conclusion, the integration of genomic and cytogenetic data defines 2 subgroups with distinct responses to treatment and identifies a large subset of children suitable for treatment deintensification.

  6. Integrative Genomics-Based Discovery of Novel Regulators of the Innate Antiviral Response

    PubMed Central

    van der Lee, Robin; ter Horst, Rob; Szklarczyk, Radek; Netea, Mihai G.; Andeweg, Arno C.; van Kuppeveld, Frank J. M.; Huynen, Martijn A.

    2015-01-01

    The RIG-I-like receptor (RLR) pathway is essential for detecting cytosolic viral RNA to trigger the production of type I interferons (IFNα/β) that initiate an innate antiviral response. Through systematic assessment of a wide variety of genomics data, we discovered 10 molecular signatures of known RLR pathway components that collectively predict novel members. We demonstrate that RLR pathway genes, among others, tend to evolve rapidly, interact with viral proteins, contain a limited set of protein domains, are regulated by specific transcription factors, and form a tightly connected interaction network. Using a Bayesian approach to integrate these signatures, we propose likely novel RLR regulators. RNAi knockdown experiments revealed a high prediction accuracy, identifying 94 genes among 187 candidates tested (~50%) that affected viral RNA-induced production of IFNβ. The discovered antiviral regulators may participate in a wide range of processes that highlight the complexity of antiviral defense (e.g. MAP3K11, CDK11B, PSMA3, TRIM14, HSPA9B, CDC37, NUP98, G3BP1), and include uncharacterized factors (DDX17, C6orf58, C16orf57, PKN2, SNW1). Our validated RLR pathway list (http://rlr.cmbi.umcn.nl/), obtained using a combination of integrative genomics and experiments, is a new resource for innate antiviral immunity research. PMID:26485378
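
    The abstract does not describe the Bayesian model itself. As a rough illustration of how several independent signatures can be combined into a single prediction, the sketch below applies a naive Bayes update in log-odds space. This is an assumed stand-in, not the authors' method: `posterior_probability`, the prior, and the likelihood ratios are hypothetical. Only the validation count (94 of 187 candidates) comes from the abstract.

```python
import math
from typing import Iterable


def posterior_probability(prior: float, likelihood_ratios: Iterable[float]) -> float:
    """Combine independent evidence with a prior via naive Bayes.

    prior: assumed prior probability that a gene belongs to the pathway.
    likelihood_ratios: one ratio per signature, P(signature | member) divided
        by P(signature | non-member). Values here are hypothetical.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        if ratio <= 0.0:
            raise ValueError("likelihood ratios must be positive")
        # Independence assumption: log-likelihood ratios simply add.
        log_odds += math.log(ratio)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Validation rate reported in the abstract: 94 of 187 tested candidates.
validated_fraction = 94 / 187
```

    With these definitions, `validated_fraction` is about 0.503, consistent with the "~50%" quoted in the abstract.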

  7. Integrative genome analysis reveals an oncomir/oncogene cluster regulating glioblastoma survivorship.

    PubMed

    Kim, Hyunsoo; Huang, Wei; Jiang, Xiuli; Pennicooke, Brenton; Park, Peter J; Johnson, Mark D

    2010-02-01

    Using a multidimensional genomic data set on glioblastoma from The Cancer Genome Atlas, we identified hsa-miR-26a as a cooperating component of a frequently occurring amplicon that also contains CDK4 and CENTG1, two oncogenes that regulate the RB1 and PI3 kinase/AKT pathways, respectively. By integrating DNA copy number, mRNA, microRNA, and DNA methylation data, we identified functionally relevant targets of miR-26a in glioblastoma, including PTEN, RB1, and MAP3K2/MEKK2. We demonstrate that miR-26a alone can transform cells and it promotes glioblastoma cell growth in vitro and in the mouse brain by decreasing PTEN, RB1, and MAP3K2/MEKK2 protein expression, thereby increasing AKT activation, promoting proliferation, and decreasing c-JUN N-terminal kinase-dependent apoptosis. Overexpression of miR-26a in PTEN-competent and PTEN-deficient glioblastoma cells promoted tumor growth in vivo, and it further increased growth in cells overexpressing CDK4 or CENTG1. Importantly, glioblastoma patients harboring this amplification displayed markedly decreased survival. Thus, hsa-miR-26a, CDK4, and CENTG1 comprise a functionally integrated oncomir/oncogene DNA cluster that promotes aggressiveness in human cancers by cooperatively targeting the RB1, PI3K/AKT, and JNK pathways. PMID:20080666

  8. Flowering and genome integrity control by a nuclear matrix protein in Arabidopsis.

    PubMed

    Xu, Yifeng; Gan, Eng-Seng; He, Yuehui; Ito, Toshiro

    2013-01-01

    Matrix attachment region (MAR)-binding proteins can finely orchestrate temporal and spatial gene expression during development. In Arabidopsis, transposable elements (TEs) and TE-like repeat sequences are transcriptionally repressed or attenuated by the coordination of many key players including DNA methyltransferases, histone deacetylases, histone methyltransferases and the siRNA pathway, which help to protect genomic integrity and control multiple developmental processes such as flowering. We have recently reported that an AT-hook nuclear matrix binding protein, TRANSPOSABLE ELEMENT SILENCING VIA AT-HOOK (TEK), participates in a histone deacetylation (HDAC) complex to silence TEs and genes containing a TE-like sequence, including AtMu1, FWA and FLOWERING LOCUS C (FLC) in the Ler background. We have shown that TEK knockdown causes increased histone acetylation, reduced H3K9me2 and moderate reduction of DNA methylation in the target loci, leading to the de-repression of FLC and FWA, as well as TE reactivation. Here we discuss the role of TEK as a putative MAR-binding protein that functions in the maintenance of genome integrity and in flowering control by silencing TEs and repeat-containing genes. PMID:23836195

  9. A Genetic Response Score for Hydrochlorothiazide Use: Insights From Genomics and Metabolomics Integration.

    PubMed

    Shahin, Mohamed H; Gong, Yan; McDonough, Caitrin W; Rotroff, Daniel M; Beitelshees, Amber L; Garrett, Timothy J; Gums, John G; Motsinger-Reif, Alison; Chapman, Arlene B; Turner, Stephen T; Boerwinkle, Eric; Frye, Reginald F; Fiehn, Oliver; Cooper-DeHoff, Rhonda M; Kaddurah-Daouk, Rima; Johnson, Julie A

    2016-09-01

    Hydrochlorothiazide is among the most commonly prescribed antihypertensives; yet, <50% of hydrochlorothiazide-treated patients achieve blood pressure (BP) control. Herein, we integrated metabolomic and genomic profiles of hydrochlorothiazide-treated patients to identify novel genetic markers associated with hydrochlorothiazide BP response. The primary analysis included 228 white hypertensives treated with hydrochlorothiazide from the Pharmacogenomic Evaluation of Antihypertensive Responses (PEAR) study. Genome-wide analysis was conducted using Illumina Omni 1M-Quad Chip, and untargeted metabolomics was performed on baseline fasting plasma samples using a gas chromatography-time-of-flight mass spectrometry platform. We found 13 metabolites significantly associated with hydrochlorothiazide systolic BP (SBP) and diastolic BP (DBP) responses (false discovery rate, <0.05). In addition, integrating genomic and metabolomic data revealed 3 polymorphisms (rs2727563 PRKAG2, rs12604940 DCC, and rs13262930 EPHX2) along with arachidonic acid, converging in the netrin signaling pathway (P=1×10^-5), as potential markers, significantly influencing hydrochlorothiazide BP response. We successfully replicated the 3 genetic signals in 212 white hypertensives treated with hydrochlorothiazide and created a response score by summing their BP-lowering alleles. We found patients carrying 1 response allele had a significantly lower response than carriers of 6 alleles (∆SBP/∆DBP: -1.5/1.2 versus -16.3/-10.4 mm Hg, respectively, SBP score, P=1×10^-8 and DBP score, P=3×10^-9). This score explained 11.3% and 11.9% of the variability in hydrochlorothiazide SBP and DBP responses, respectively, and was further validated in another independent study of 196 whites treated with hydrochlorothiazide (DBP score, P=0.03; SBP score, P=0.07). This study suggests that PRKAG2, DCC, and EPHX2 might be important determinants of hydrochlorothiazide BP response. PMID:27381900
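
    The response score described above is a simple allele count, which can be sketched as follows. The three SNP identifiers are taken from the abstract; which allele at each site is the BP-lowering one is not stated there, so this sketch assumes the caller already supplies the number of response alleles (0, 1, or 2) per SNP. The function name and input format are hypothetical.

```python
from typing import Mapping

# SNP identifiers as listed in the abstract (PRKAG2, DCC, EPHX2 respectively).
RESPONSE_SNPS = ("rs2727563", "rs12604940", "rs13262930")


def response_score(response_allele_counts: Mapping[str, int]) -> int:
    """Sum the BP-lowering (response) alleles carried across the three SNPs.

    response_allele_counts maps each SNP id to the number of response alleles
    the patient carries at that site. Identifying the response allele is left
    to the caller because the abstract does not specify it.
    """
    total = 0
    for snp in RESPONSE_SNPS:
        count = response_allele_counts[snp]
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: a diploid genotype carries 0, 1, or 2 copies")
        total += count
    return total
```

    Under this counting scheme the score ranges from 0 to 6, which matches the abstract's comparison of patients carrying 1 response allele with carriers of 6.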

  10. One-step high-efficiency CRISPR/Cas9-mediated genome editing in Streptomyces.

    PubMed

    Huang, He; Zheng, Guosong; Jiang, Weihong; Hu, Haifeng; Lu, Yinhua

    2015-04-01

    The RNA-guided DNA editing technology CRISPR (clustered regularly interspaced short palindromic repeats)/Cas9 has been used to introduce double-stranded breaks into genomes and to direct subsequent site-specific insertions/deletions or the replacement of genetic material in bacteria, such as Escherichia coli, Streptococcus pneumoniae, and Lactobacillus reuteri. In this study, we established a high-efficiency CRISPR/Cas9 genome editing plasmid pKCcas9dO for use in Streptomyces genetic manipulation, which comprises a target-specific guide RNA, a codon-optimized cas9, and two homology-directed repair templates. By delivering pKCcas9dO series editing plasmids into the model strain Streptomyces coelicolor M145, through one-step intergeneric transfer, we achieved genome editing at different levels with high efficiencies of 60%-100%, including single gene deletion, such as actII-orf4, redD, and glnR, and single large-size gene cluster deletion, such as the antibiotic biosynthetic clusters of actinorhodin (ACT) (21.3 kb), undecylprodigiosin (RED) (31.6 kb), and Ca²⁺-dependent antibiotic (82.8 kb). Furthermore, we also realized simultaneous deletions of actII-orf4 and redD, and of the ACT and RED biosynthetic gene clusters with high efficiencies of 54% and 45%, respectively. Finally, we applied this system to introduce nucleotide point mutations into the rpsL gene, which conferred streptomycin resistance on the mutants. Notably, using this system, the time required for one round of genome modification is reduced by one-third or one-half compared with conventional methods. These results clearly indicate that the established CRISPR/Cas9 genome editing system substantially improves the genome editing efficiency compared with the currently existing methods in Streptomyces, and it has promise for application to genome modification in other Actinomyces species.

  11. Integrative analyses reveal a long noncoding RNA-mediated sponge regulatory network in prostate cancer

    PubMed Central

    Du, Zhou; Sun, Tong; Hacisuleyman, Ezgi; Fei, Teng; Wang, Xiaodong; Brown, Myles; Rinn, John L.; Lee, Mary Gwo-Shu; Chen, Yiwen; Kantoff, Philip W.; Liu, X. Shirley

    2016-01-01

    Mounting evidence suggests that long noncoding RNAs (lncRNAs) can function as microRNA sponges and compete for microRNA binding to protein-coding transcripts. However, the prevalence, functional significance and targets of lncRNA-mediated sponge regulation of cancer are mostly unknown. Here we identify a lncRNA-mediated sponge regulatory network that affects the expression of many protein-coding prostate cancer driver genes, by integrating analysis of sequence features and gene expression profiles of both lncRNAs and protein-coding genes in tumours. We confirm the tumour-suppressive function of two lncRNAs (TUG1 and CTB-89H12.4) and their regulation of PTEN expression in prostate cancer. Surprisingly, one of the two lncRNAs, TUG1, was previously known for its function in polycomb repressive complex 2 (PRC2)-mediated transcriptional regulation, suggesting its sub-cellular localization-dependent function. Our findings not only suggest an important role of lncRNA-mediated sponge regulation in cancer, but also underscore the critical influence of cytoplasmic localization on the efficacy of a sponge lncRNA. PMID:26975529

  12. CRISPR/Cas9 mediated genome editing in ES cells and its application for chimeric analysis in mice

    PubMed Central

    Oji, Asami; Noda, Taichi; Fujihara, Yoshitaka; Miyata, Haruhiko; Kim, Yeon Joo; Muto, Masanaga; Nozawa, Kaori; Matsumura, Takafumi; Isotani, Ayako; Ikawa, Masahito

    2016-01-01

    Targeted gene-disrupted mice can be efficiently generated by expressing a single guide RNA (sgRNA)/CAS9 complex in the zygote. However, the success rate of complicated genome editing, such as large deletions, point mutations, and knockins, is limited and remains to be improved. Further, the mosaicism in founder generations complicates the genotypic and phenotypic analyses in these animals. Here we show that large deletions with two sgRNAs as well as dsDNA-mediated point mutations are efficient in mouse embryonic stem cells (ESCs). The dsDNA-mediated gene knockins are also feasible in ESCs. Finally, we generated chimeric mice with biallelic mutant ESCs for a lethal gene, Dnajb13, and analyzed their phenotypes. Not only was the lethal phenotype of hydrocephalus suppressed, but we also found that Dnajb13 is required for sperm cilia formation. The combination of biallelic genome editing in ESCs and subsequent chimeric analysis provides a useful tool for rapid gene function analysis in the whole organism. PMID:27530713

  13. Fitness Cost Implications of PhiC31-Mediated Site-Specific Integrations in Target-Site Strains of the Mexican Fruit Fly, Anastrepha ludens (Diptera: Tephritidae)

    PubMed Central

    Meza, José S.; Díaz-Fleischer, Francisco; Sánchez-Velásquez, Lázaro R.; Zepeda-Cisneros, Cristina Silvia; Handler, Alfred M.; Schetelig, Marc F.

    2014-01-01

    Site-specific recombination technologies are powerful new tools for the manipulation of genomic DNA in insects that can improve transgenesis strategies such as targeting transgene insertions, allowing transgene cassette exchange and DNA mobilization for transgene stabilization. However, understanding the fitness cost implications of these manipulations for transgenic strain applications is critical. In this study independent piggyBac-mediated attP target-sites marked with DsRed were created in several genomic positions in the Mexican fruit fly, Anastrepha ludens. Two of these strains, one having an autosomal (attP_F7) and the other a Y-linked (attP_2-M6y) integration, exhibited fitness parameters (dynamic demography and sexual competitiveness) similar to wild type flies. These strains were thus selected for targeted insertion using, for the first time in mexfly, the phiC31-integrase recombination system to insert an additional EGFP-marked transgene to determine its effect on host strain fitness. Fitness tests showed that the integration event in the int_2-M6y recombinant strain had no significant effect, while the int_F7 recombinant strain exhibited significantly lower fitness relative to the original attP_F7 target-site host strain. These results indicate that while targeted transgene integrations can be achieved without an additional fitness cost, at some genomic positions insertion of additional DNA into a previously integrated transgene can have a significant negative effect. Thus, for targeted transgene insertions fitness costs must be evaluated both previous to and subsequent to new site-specific insertions in the target-site strain. PMID:25303238

  14. OncDRS: An integrative clinical and genomic data platform for enabling translational research and precision medicine

    PubMed Central

    Orechia, John; Pathak, Ameet; Shi, Yunling; Nawani, Aniket; Belozerov, Andrey; Fontes, Caitlin; Lakhiani, Camille; Jawale, Chetan; Patel, Chetansharan; Quinn, Daniel; Botvinnik, Dmitry; Mei, Eddie; Cotter, Elizabeth; Byleckie, James; Ullman-Cullere, Mollie; Chhetri, Padam; Chalasani, Poornima; Karnam, Purushotham; Beaudoin, Ronald; Sahu, Sandeep; Belozerova, Yelena; Mathew, Jomol P.

    2015-01-01

    We live in the genomic era of medicine, where a patient's genomic/molecular data is becoming increasingly important for disease diagnosis, identification of targeted therapy, and risk assessment for adverse reactions. However, decoding genomic test results and integrating them with clinical data for retrospective studies and for cohort identification in prospective clinical trials is still a challenging task. In order to overcome these barriers, we developed an overarching enterprise informatics framework for translational research and personalized medicine called Synergistic Patient and Research Knowledge Systems (SPARKS) and a suite of tools called Oncology Data Retrieval Systems (OncDRS). OncDRS enables seamless data integration and secure, self-navigated query and extraction of clinical and genomic data from heterogeneous sources. Within a year of release, the system has facilitated more than 1500 research queries and has delivered data for more than 50 research studies. PMID:27054074

  15. Genome-wide conserved non-coding microsatellite (CNMS) marker-based integrative genetical genomics for quantitative dissection of seed weight in chickpea.

    PubMed

    Bajaj, Deepak; Saxena, Maneesha S; Kujur, Alice; Das, Shouvik; Badoni, Saurabh; Tripathi, Shailesh; Upadhyaya, Hari D; Gowda, C L L; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K; Parida, Swarup K

    2015-03-01

    Phylogenetic footprinting identified 666 genome-wide paralogous and orthologous CNMS (conserved non-coding microsatellite) markers from 5'-untranslated and regulatory regions (URRs) of 603 protein-coding chickpea genes. The (CT)n and (GA)n CNMS carrying CTRMCAMV35S and GAGA8BKN3 regulatory elements, respectively, are abundant in the chickpea genome. The mapped genic CNMS markers with robust amplification efficiencies (94.7%) detected higher intraspecific polymorphic potential (37.6%) among genotypes, implying their immense utility in chickpea breeding and genetic analyses. Seventeen differentially expressed CNMS marker-associated genes showing strong preferential and seed tissue/developmental stage-specific expression in contrasting genotypes were selected to narrow down the gene targets underlying seed weight quantitative trait loci (QTLs)/eQTLs (expression QTLs) through integrative genetical genomics. The integration of transcript profiling with seed weight QTL/eQTL mapping, molecular haplotyping, and association analyses identified potential molecular tags (GAGA8BKN3 and RAV1AAT regulatory elements and alleles/haplotypes) in the LOB-domain-containing protein- and KANADI protein-encoding transcription factor genes controlling the cis-regulated expression for seed weight in the chickpea. This emphasizes the potential of CNMS marker-based integrative genetical genomics for the quantitative genetic dissection of complex seed weight in chickpea.

  16. Oligonucleotide-Mediated Genome Editing Provides Precision and Function to Engineered Nucleases and Antibiotics in Plants.

    PubMed

    Sauer, Noel J; Narváez-Vásquez, Javier; Mozoruk, Jerry; Miller, Ryan B; Warburg, Zachary J; Woodward, Melody J; Mihiret, Yohannes A; Lincoln, Tracey A; Segami, Rosa E; Sanders, Steven L; Walker, Keith A; Beetham, Peter R; Schöpke, Christian R; Gocal, Greg F W

    2016-04-01

    Here, we report a form of oligonucleotide-directed mutagenesis for precision genome editing in plants that uses single-stranded oligonucleotides (ssODNs) to precisely and efficiently generate genome edits at DNA strand lesions made by DNA double strand break reagents. Employing a transgene model in Arabidopsis (Arabidopsis thaliana), we obtained a high frequency of precise targeted genome edits when ssODNs were introduced into protoplasts that were pretreated with the glycopeptide antibiotic phleomycin, a nonspecific DNA double strand breaker. Simultaneous delivery of ssODN and a site-specific DNA double strand breaker, either transcription activator-like effector nucleases (TALENs) or clustered, regularly interspaced, short palindromic repeats (CRISPR/Cas9), resulted in a much greater targeted genome-editing frequency compared with treatment with DNA double strand-breaking reagents alone. Using this site-specific approach, we applied the combination of ssODN and CRISPR/Cas9 to develop an herbicide tolerance trait in flax (Linum usitatissimum) by precisely editing the 5'-ENOLPYRUVYLSHIKIMATE-3-PHOSPHATE SYNTHASE (EPSPS) genes. EPSPS edits occurred at sufficient frequency that we could regenerate whole plants from edited protoplasts without employing selection. These plants were subsequently determined to be tolerant to the herbicide glyphosate in greenhouse spray tests. Progeny (C1) of these plants showed the expected Mendelian segregation of EPSPS edits. Our findings show the enormous potential of using a genome-editing platform for precise, reliable trait development in crop plants. PMID:26864017

  17. Ca2+ channels as integrators of G protein-mediated signaling in neurons.

    PubMed

    Strock, Jesse; Diversé-Pierluissi, María A

    2004-11-01

    The observations from Dunlap and Fischbach that transmitter-mediated shortening of the duration of action potentials could be caused by a decrease in calcium conductance led to numerous studies of the mechanisms of modulation of voltage-dependent calcium channels. Calcium channels are well known targets for inhibition by receptor-G protein pathways, and multiple forms of inhibition have been described. Inhibition of Ca2+ channels can be mediated by G protein βγ-subunits or by kinases, such as protein kinase C and tyrosine kinases. In the last few years, it has been shown that integration of G protein signaling can take place at the level of the calcium channel by regulation of the interaction of the channel pore-forming subunit with different cellular proteins.

  18. Plant Clonal Integration Mediates the Horizontal Redistribution of Soil Resources, Benefiting Neighboring Plants.

    PubMed

    Ye, Xue-Hua; Zhang, Ya-Lin; Liu, Zhi-Lan; Gao, Shu-Qin; Song, Yao-Bin; Liu, Feng-Hong; Dong, Ming

    2016-01-01

    Resources such as water taken up by plants can be released into soils through hydraulic redistribution and can also be translocated by clonal integration within a plant clonal network. We hypothesized that the resources from one (donor) microsite could be translocated within a clonal network, released into different (recipient) microsites and subsequently used by neighbor plants in the recipient microsite. To test these hypotheses, we conducted two experiments in which connected and disconnected ramet pairs of Potentilla anserina were grown under both homogeneous and heterogeneous water regimes, with seedlings of Artemisia ordosica as neighbors. The isotopes [15N] and deuterium were used to trace the translocation of nitrogen and water, respectively, within the clonal network. The water and nitrogen taken up by P. anserina ramets in the donor microsite were translocated into the connected ramets in the recipient microsites. Most notably, portions of the translocated water and nitrogen were released into the recipient microsite and were used by the neighboring A. ordosica, which increased growth of the neighboring A. ordosica significantly. Therefore, our hypotheses were supported, and plant clonal integration mediated the horizontal hydraulic redistribution of resources, thus benefiting neighboring plants. Such a plant clonal integration-mediated resource redistribution in horizontal space may have substantial effects on the interspecific relations and composition of the community and consequently on ecosystem processes.

  19. Plant Clonal Integration Mediates the Horizontal Redistribution of Soil Resources, Benefiting Neighboring Plants

    PubMed Central

    Ye, Xue-Hua; Zhang, Ya-Lin; Liu, Zhi-Lan; Gao, Shu-Qin; Song, Yao-Bin; Liu, Feng-Hong; Dong, Ming

    2016-01-01

    Resources such as water taken up by plants can be released into soils through hydraulic redistribution and can also be translocated by clonal integration within a plant clonal network. We hypothesized that the resources from one (donor) microsite could be translocated within a clonal network, released into different (recipient) microsites and subsequently used by neighbor plants in the recipient microsite. To test these hypotheses, we conducted two experiments in which connected and disconnected ramet pairs of Potentilla anserina were grown under both homogeneous and heterogeneous water regimes, with seedlings of Artemisia ordosica as neighbors. The isotopes [15N] and deuterium were used to trace the translocation of nitrogen and water, respectively, within the clonal network. The water and nitrogen taken up by P. anserina ramets in the donor microsite were translocated into the connected ramets in the recipient microsites. Most notably, portions of the translocated water and nitrogen were released into the recipient microsite and were used by the neighboring A. ordosica, which increased growth of the neighboring A. ordosica significantly. Therefore, our hypotheses were supported, and plant clonal integration mediated the horizontal hydraulic redistribution of resources, thus benefiting neighboring plants. Such a plant clonal integration-mediated resource redistribution in horizontal space may have substantial effects on the interspecific relations and composition of the community and consequently on ecosystem processes. PMID:26904051

  20. Integrated, genome-wide screening for hypomethylated oncogenes in salivary gland adenoid cystic carcinoma

    PubMed Central

    Shao, Chunbo; Sun, Wenyue; Tan, Marietta; Glazer, Chad A.; Bhan, Sheetal; Zhong, Xiaoli; Fakhry, Carole; Sharma, Rajni; Westra, William H.; Hoque, Mohammad O.; Moskaluk, Christopher A.; Sidransky, David; Califano, Joseph A.; Ha, Patrick K.

    2011-01-01

    Purpose: Salivary gland adenoid cystic carcinoma (ACC) is a rare malignancy that is poorly understood. In order to look for relevant oncogene candidates under the control of promoter methylation, an integrated, genome-wide screen was performed. Experimental Design: Global demethylation of normal salivary gland cell strains using 5-aza-2′-deoxycytidine (5-Aza dC) and Trichostatin A (TSA), followed by expression array analysis, was performed. ACC-specific expression profiling was generated using expression microarray analysis of primary ACC and normal samples. Next, the two profiles were integrated to identify a subset of genes for further validation of promoter demethylation in ACC versus normal. Finally, promising candidates were further validated for mRNA, protein, and promoter methylation levels in larger ACC cohorts. Functional validation was then performed in cancer cell lines. Results: We found 159 genes that were significantly re-expressed after 5-Aza dC/TSA treatment and overexpressed in ACC. After initial validation, eight candidates showed hypomethylation in ACC: AQP1, CECR1, C1QR1, CTAG2, P53AIP1, TDRD12, BEX1, and DYNLT3. Aquaporin 1 (AQP1) showed the most significant hypomethylation and was further validated. AQP1 hypomethylation in ACC was confirmed with two independent cohorts. Of note, there was significant overexpression of AQP1 in both mRNA and protein in the paraffin-embedded ACC cohort. Furthermore, AQP1 was up-regulated in 5-Aza dC/TSA treated SACC83. Lastly, AQP1 promoted cell proliferation and colony formation in SACC83. Conclusions: Our integrated, genome-wide screening method proved to be an effective strategy for detecting novel oncogenes in ACC. AQP1 is a promising oncogene candidate for ACC and is transcriptionally regulated by promoter hypomethylation. PMID:21551254
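
    The integration step described in this abstract (keeping genes that are both re-expressed after demethylating treatment and overexpressed in tumor versus normal tissue) can be pictured as a set intersection over two expression profiles. The following is only a minimal sketch under stated assumptions: the function name, the fold-change representation, the 2.0 cut-off, and the placeholder gene names and values are all hypothetical and are not taken from the study, which does not state its thresholds or statistics here.

```python
from typing import Dict, List


def integrate_profiles(
    reexpression_fc: Dict[str, float],
    tumor_vs_normal_fc: Dict[str, float],
    min_fc: float = 2.0,  # hypothetical cut-off, not from the abstract
) -> List[str]:
    """Return genes passing the fold-change cut-off in BOTH profiles."""
    # Genes re-expressed after demethylating treatment of normal cells.
    reexpressed = {g for g, fc in reexpression_fc.items() if fc >= min_fc}
    # Genes overexpressed in tumor relative to normal samples.
    overexpressed = {g for g, fc in tumor_vs_normal_fc.items() if fc >= min_fc}
    # Candidates are the intersection of the two sets, sorted for stable output.
    return sorted(reexpressed & overexpressed)


# Invented placeholder data; these are not genes or values from the paper.
demethylation_profile = {"GENE_A": 4.1, "GENE_B": 1.2, "GENE_C": 3.0, "GENE_D": 2.5}
tumor_profile = {"GENE_A": 5.3, "GENE_B": 6.0, "GENE_C": 0.9, "GENE_D": 2.0}

candidates = integrate_profiles(demethylation_profile, tumor_profile)
print(candidates)  # ['GENE_A', 'GENE_D']
```

    In a real screen the candidate list would presumably be ranked by statistical significance and then validated experimentally, as the abstract describes for promoter methylation, mRNA, and protein levels.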