Science.gov

Sample records for genomic integration mediated

  1. Altering genomic integrity: heavy metal exposure promotes transposable element-mediated damage

    PubMed Central

    Morales, Maria E.; Servant, Geraldine; Ade, Catherine; Roy-Engel, Astrid M.

    2015-01-01

    Maintenance of genomic integrity is critical for cellular homeostasis and survival. The active transposable elements (TEs), composed primarily of three mobile element lineages (LINE-1, Alu, and SVA), comprise approximately 30% of the mass of the human genome. For the past two decades, studies have shown that TEs significantly contribute to genetic instability and that TE-caused damage is associated with genetic diseases and cancer. Different environmental exposures, including several heavy metals, influence how TEs interact with their host genome, increasing their negative impact. This mini-review provides some basic knowledge on TEs, their contribution to disease and an overview of the current knowledge on how heavy metals influence TE-mediated damage. PMID:25774044

  2. iGWAS: Integrative Genome-Wide Association Studies of Genetic and Genomic Data for Disease Susceptibility Using Mediation Analysis.

    PubMed

    Huang, Yen-Tsung; Liang, Liming; Moffatt, Miriam F; Cookson, William O C M; Lin, Xihong

    2015-07-01

    Genome-wide association studies (GWAS) have been a standard practice in identifying single nucleotide polymorphisms (SNPs) for disease susceptibility. We propose a new approach, termed integrative GWAS (iGWAS) that exploits the information of gene expressions to investigate the mechanisms of the association of SNPs with a disease phenotype, and to incorporate the family-based design for genetic association studies. Specifically, the relations among SNPs, gene expression, and disease are modeled within the mediation analysis framework, which allows us to disentangle the genetic effect on a disease phenotype into two parts: an effect mediated through a gene expression (mediation effect, ME) and an effect through other biological mechanisms or environment-mediated mechanisms (alternative effect, AE). We develop omnibus tests for the ME and AE that are robust to underlying true disease models. Numerical studies show that the iGWAS approach is able to facilitate discovering genetic association mechanisms, and outperforms the SNP-only method for testing genetic associations. We conduct a family-based iGWAS of childhood asthma that integrates genetic and genomic data. The iGWAS approach identifies six novel susceptibility genes (MANEA, MRPL53, LYCAT, ST8SIA4, NDFIP1, and PTCH1) using the omnibus test with false discovery rate less than 1%, whereas no gene using SNP-only analyses survives with the same cut-off. The iGWAS analyses further characterize that genetic effects of these genes are mostly mediated through their gene expressions. In summary, the iGWAS approach provides a new analytic framework to investigate the mechanism of genetic etiology, and identifies novel susceptibility genes of childhood asthma that were biologically meaningful. PMID:25997986
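
The decomposition described in this abstract, a genetic effect split into a part mediated through gene expression (ME) and an alternative part (AE), can be illustrated with a minimal sketch. This is not the authors' iGWAS omnibus test: it assumes a continuous phenotype, ordinary least squares, unrelated individuals, and simulated data, and every function name and effect size below is hypothetical.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Population covariance of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def slope(x, y):
    """OLS slope of y regressed on x (model includes an intercept)."""
    return cov(x, y) / cov(x, x)


def two_predictor_slopes(x1, x2, y):
    """OLS slopes of y on x1 and x2 (with intercept), via the 2x2 normal equations."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


def mediation_decomposition(snp, expr, pheno):
    """Product-of-coefficients decomposition for a linear mediation model."""
    a = slope(snp, expr)                                 # SNP -> expression
    direct, b = two_predictor_slopes(snp, expr, pheno)   # SNP and expression -> phenotype
    total = slope(snp, pheno)                            # SNP -> phenotype, ignoring expression
    return {"total": total, "mediated": a * b, "alternative": direct}


# Simulated data; the effect sizes 0.8, 0.5 and 0.2 are arbitrary assumptions.
rng = random.Random(0)
n = 2000
snp = [rng.choice([0, 1, 2]) for _ in range(n)]          # minor-allele counts
expr = [0.8 * s + rng.gauss(0, 1) for s in snp]
pheno = [0.5 * g + 0.2 * s + rng.gauss(0, 1) for s, g in zip(snp, expr)]

effects = mediation_decomposition(snp, expr, pheno)
print(effects)
```

In this purely linear setting the total effect equals the mediated plus the alternative effect; with a binary disease outcome or family-based data, as in the study, that simple identity is not expected to carry over unchanged.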

  3. iGWAS: Integrative Genome-Wide Association Studies of Genetic and Genomic Data for Disease Susceptibility Using Mediation Analysis

    PubMed Central

    Huang, Yen-Tsung; Liang, Liming; Moffatt, Miriam F.; Cookson, William O. C. M.; Lin, Xihong

    2015-01-01

    Genome-wide association studies (GWAS) have been a standard practice in identifying single nucleotide polymorphisms (SNPs) for disease susceptibility. We propose a new approach, termed integrative GWAS (iGWAS) that exploits the information of gene expressions to investigate the mechanisms of the association of SNPs with a disease phenotype, and to incorporate the family-based design for genetic association studies. Specifically, the relations among SNPs, gene expression, and disease are modeled within the mediation analysis framework, which allows us to disentangle the genetic effect on a disease phenotype into two parts: an effect mediated through a gene expression (mediation effect, ME) and an effect through other biological mechanisms or environment-mediated mechanisms (alternative effect, AE). We develop omnibus tests for the ME and AE that are robust to underlying true disease models. Numerical studies show that the iGWAS approach is able to facilitate discovering genetic association mechanisms, and outperforms the SNP-only method for testing genetic associations. We conduct a family-based iGWAS of childhood asthma that integrates genetic and genomic data. The iGWAS approach identifies six novel susceptibility genes (MANEA, MRPL53, LYCAT, ST8SIA4, NDFIP1, and PTCH1) using the omnibus test with false discovery rate less than 1%, whereas no gene using SNP-only analyses survives with the same cut-off. The iGWAS analyses further characterize that genetic effects of these genes are mostly mediated through their gene expressions. In summary, the iGWAS approach provides a new analytic framework to investigate the mechanism of genetic etiology, and identifies novel susceptibility genes of childhood asthma that were biologically meaningful. PMID:25997986

  4. An Integrated Genomic Strategy Delineates Candidate Mediator Genes Regulating Grain Size and Weight in Rice

    PubMed Central

    Malik, Naveen; Dwivedi, Nidhi; Singh, Ashok K.; Parida, Swarup K.; Agarwal, Pinky; Thakur, Jitendra K.; Tyagi, Akhilesh K.

    2016-01-01

    The present study deployed a Mediator (MED) genes-mediated integrated genomic strategy for understanding the complex genetic architecture of grain size/weight quantitative trait in rice. The targeted multiplex amplicon resequencing of 55 MED genes annotated from whole rice genome in 384 accessions discovered 3971 SNPs, which were structurally and functionally annotated in diverse coding and non-coding sequence-components of genes. Association analysis, using the genotyping information of 3971 SNPs in a structured population of 384 accessions (with 50–100 kb linkage disequilibrium decay), detected 10 MED gene-derived SNPs significantly associated (46% combined phenotypic variation explained) with grain length, width and weight in rice. Of these, one strong grain weight-associated non-synonymous SNP (G/A)-carrying OsMED4_2 gene was validated successfully in low- and high-grain weight parental accessions and homozygous individuals of a rice mapping population. The seed-specific expression, including differential up/down-regulation of three grain size/weight-associated MED genes (including OsMED4_2) in six low and high-grain weight rice accessions was evident. Altogether, combinatorial genomic approach involving haplotype-based association analysis delineated diverse functionally relevant natural SNP-allelic variants in 10 MED genes, including three potential novel SNP haplotypes in an OsMED4_2 gene governing grain size/weight differentiation in rice. These molecular tags have potential to accelerate genomics-assisted crop improvement in rice. PMID:27000976

  5. Integrative modeling of multi-platform genomic data under the framework of mediation analysis.

    PubMed

    Huang, Yen-Tsung

    2015-01-15

    Given the availability of genomic data, there have been emerging interests in integrating multi-platform data. Here, we propose to model genetics (single nucleotide polymorphism (SNP)), epigenetics (DNA methylation), and gene expression data as a biological process to delineate phenotypic traits under the framework of causal mediation modeling. We propose a regression model for the joint effect of SNPs, methylation, gene expression, and their nonlinear interactions on the outcome and develop a variance component score test for any arbitrary set of regression coefficients. The test statistic under the null follows a mixture of chi-square distributions, which can be approximated using a characteristic function inversion method or a perturbation procedure. We construct tests for candidate models determined by different combinations of SNPs, DNA methylation, gene expression, and interactions and further propose an omnibus test to accommodate different models. We then study three path-specific effects: the direct effect of SNPs on the outcome, the effect mediated through expression, and the effect through methylation. We characterize correspondences between the three path-specific effects and coefficients in the regression model, which are influenced by causal relations among SNPs, DNA methylation, and gene expression. We illustrate the utility of our method in two genomic studies and numerical simulation studies. PMID:25316269
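
The abstract notes that the test statistic follows a mixture of chi-square distributions under the null. As a rough illustration of what a p-value from such a mixture looks like, the sketch below uses plain Monte Carlo simulation. This is a simplified stand-in, not the characteristic function inversion or the perturbation procedure used in the paper; the function name is hypothetical, and the weights (which in practice would be derived from the fitted model) are assumed to be given.

```python
import random


def mixture_chi2_pvalue(q_obs, weights, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(sum_i w_i * chi2_1 >= q_obs).

    Each chi-square variable with 1 degree of freedom is drawn as a
    squared standard normal. The +1 terms keep the estimate above zero.
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_draws):
        q = sum(w * rng.gauss(0.0, 1.0) ** 2 for w in weights)
        if q >= q_obs:
            exceed += 1
    return (exceed + 1) / (n_draws + 1)


# Hypothetical weights and observed statistic, for illustration only.
print(mixture_chi2_pvalue(4.2, [0.6, 0.3, 0.1]))
```

The resolution of a simulated p-value is limited by `n_draws`, which is one reason analytic approximations are preferred when very small p-values are needed.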

  6. Integrative modeling of multi-platform genomic data under the framework of mediation analysis

    PubMed Central

    Huang, Yen-Tsung

    2014-01-01

    Given the availability of genomic data, there have been emerging interests in integrating multi-platform data. Here, we propose to model genetics (single nucleotide polymorphism (SNP)), epigenetics (DNA methylation), and gene expression data as a biological process to delineate phenotypic traits under the framework of causal mediation modeling. We propose a regression model for the joint effect of SNPs, methylation, gene expression, and their nonlinear interactions on the outcome and develop a variance component score test for any arbitrary set of regression coefficients. The test statistic under the null follows a mixture of chi-square distributions, which can be approximated using a characteristic function inversion method or a perturbation procedure. We construct tests for candidate models determined by different combinations of SNPs, DNA methylation, gene expression, and interactions and further propose an omnibus test to accommodate different models. We then study three path-specific effects: the direct effect of SNPs on the outcome, the effect mediated through expression, and the effect through methylation. We characterize correspondences between the three path-specific effects and coefficients in the regression model, which are influenced by causal relations among SNPs, DNA methylation, and gene expression. We illustrate the utility of our method in two genomic studies and numerical simulation studies. PMID:25316269

  7. Integrated Genomics Identifies Convergence of Ankylosing Spondylitis with Global Immune Mediated Disease Pathways

    PubMed Central

    Uddin, Mohammed; Codner, Dianne; Mahmud Hasan, S M; Scherer, Stephen W; O’Rielly, Darren D; Rahman, Proton

    2015-01-01

    Ankylosing spondylitis (AS) is a highly heritable complex inflammatory arthritis. Although a handful of non-HLA risk loci have been identified, capturing the unexplained genetic contribution to AS pathogenesis remains a challenge attributed to additive, pleiotropic and epistatic interactions at the molecular level. Here, we developed multiple integrated genomic approaches to quantify molecular convergence of non-HLA loci with global immune mediated diseases. We show that non-HLA genes are significantly sensitive to deleterious mutation accumulation in the general population compared with tolerant genes. Human developmental proteomics (prenatal to adult) analysis revealed that proteins encoded by non-HLA AS risk loci are 2-fold more expressed in adult hematopoietic cells. Enrichment analysis revealed AS risk genes overlap with a significant number of immune related pathways (p < 0.0001 to 9.8 × 10^-12). Protein-protein interaction analysis revealed non-shared AS risk genes are highly clustered seeds that significantly converge (empirical; p < 0.01 to 1.6 × 10^-4) into networks of global immune mediated disease risk loci. We have also provided initial evidence for the involvement of STAT2/3 in AS pathogenesis. Collectively, these findings highlight molecular insight on non-HLA AS risk loci that are not exclusively connected with overlapping immune mediated diseases; rather, they are a component of common pathophysiological pathways with other immune mediated diseases. This information will be pivotal to fully explain AS pathogenesis and identify new therapeutic targets. PMID:25980808
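
The abstract does not say which test underlies its pathway enrichment analysis. A common choice for gene-set overlap is the one-sided hypergeometric test, sketched below under that assumption; the function name and the gene counts in the example are hypothetical.

```python
from math import comb


def enrichment_pvalue(n_universe, n_pathway, n_risk, n_overlap):
    """One-sided hypergeometric p-value.

    Probability of seeing at least n_overlap pathway genes when n_risk
    genes are drawn at random from a universe of n_universe genes that
    contains n_pathway pathway genes.
    """
    total = comb(n_universe, n_risk)
    upper = min(n_pathway, n_risk)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_risk - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 genes, a 150-gene pathway, 40 risk genes, 6 shared.
print(enrichment_pvalue(20000, 150, 40, 6))
```

When many pathways are tested, the resulting p-values would normally be corrected for multiple testing before being reported.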

  8. Zbtb1 Safeguards Genome Integrity and Prevents p53-Mediated Apoptosis in Proliferating Lymphoid Progenitors.

    PubMed

    Cao, Xin; Lu, Ying; Zhang, Xianyu; Kovalovsky, Damian

    2016-08-15

    Expression of the transcription factor Zbtb1 is required for normal lymphoid development. We report in the present study that Zbtb1 maintains genome integrity in immune progenitors, without which cells undergo increased DNA damage and p53-mediated apoptosis during replication and differentiation. Increased DNA damage in Zbtb1-mutant (ScanT) progenitors was due to increased sensitivity to replication stress, which was a consequence of inefficient activation of the S-phase checkpoint response. Increased p53-mediated apoptosis affected not only lymphoid but also myeloid development in competitive bone marrow chimeras, and prevention of apoptosis by transgenic Bcl2 expression and p53 deficiency rescued lymphoid as well as myeloid development from Zbtb1-mutant progenitors. Interestingly, however, protection from apoptosis rescued only the early stages of T cell development, and thymocytes remained arrested at the double-negative 3 developmental stage, indicating a strict requirement of Zbtb1 at later T cell developmental stages. Collectively, these results indicate that Zbtb1 prevents DNA damage in replicating immune progenitors, allowing the generation of B cells, T cells, and myeloid cells. PMID:27402700

  9. Integrative genomic analysis identifies epigenetic marks that mediate genetic risk for epithelial ovarian cancer

    PubMed Central

    2014-01-01

    Background: Both genetic and epigenetic factors influence the development and progression of epithelial ovarian cancer (EOC). However, there is an incomplete understanding of the interrelationship between these factors and the extent to which they interact to impact disease risk. In the present study, we aimed to gain insight into this relationship by identifying DNA methylation marks that are candidate mediators of ovarian cancer genetic risk. Methods: We used 214 cases and 214 age-matched controls from the Mayo Clinic Ovarian Cancer Study. Pretreatment, blood-derived DNA was profiled for genome-wide methylation (Illumina Infinium HumanMethylation27 BeadArray) and single nucleotide polymorphisms (SNPs, Illumina Infinium HD Human610-Quad BeadArray). The Causal Inference Test (CIT) was implemented to distinguish CpG sites that mediate genetic risk from those that are consequential or independently acted on by genotype. Results: Controlling for the estimated distribution of immune cells and other key covariates, our initial epigenome-wide association analysis revealed 1,993 significantly differentially methylated CpGs between cases and controls (FDR, q < 0.05). The relationship between methylation and case-control status for these 1,993 CpGs was found to be highly consistent with the results of a previously published, independent study that consisted of peripheral blood DNA methylation signatures in 131 pretreatment cases and 274 controls. Implementation of the CIT revealed 17 CpG/SNP pairs, comprising 13 unique CpGs and 17 unique SNPs, which represent potential methylation-mediated relationships between genotype and EOC risk. Of these 13 CpGs, several are associated with immune related genes and genes that have been previously shown to exhibit altered expression in the context of cancer. Conclusions: These findings provide additional insight into EOC etiology and may serve as novel biomarkers for EOC susceptibility. PMID:24479488
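
The abstract reports an FDR threshold of q < 0.05 without naming the procedure behind it. The sketch below shows the Benjamini-Hochberg adjustment, which is one widely used way to obtain such q-values; whether the study used this exact procedure is an assumption, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum
    # of p * m / rank over its own rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical per-CpG p-values.
pvals = [0.9, 0.001, 0.5]
qvals = benjamini_hochberg(pvals)
significant = [i for i, q in enumerate(qvals) if q < 0.05]
print(qvals, significant)
```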

  10. An Integrated Genomic Analysis of Aryl Hydrocarbon Receptor-Mediated Inhibition of B-Cell Differentiation

    PubMed Central

    De Abrew, K. Nadira; Kaminski, Norbert E.; Thomas, Russell S.

    2010-01-01

    The aryl hydrocarbon receptor (AHR) agonist 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) alters differentiation of B cells and suppresses antibody production. A combination of whole-genome, microarray-based chromatin immunoprecipitation (ChIP-on-chip), and time course gene expression microarray analysis was performed on the mouse B-cell line CH12.LX following exposure to lipopolysaccharide (LPS) or LPS and TCDD to identify the primary and downstream transcriptional elements of B-cell differentiation that are altered by the AHR. ChIP-on-chip analysis identified 1893 regions with a significant increase in AHR binding with TCDD treatment. Transcription factor binding site analysis on the ChIP-on-chip data showed enrichment in AHR response elements. Other transcription factors showed significant coenrichment with AHR response elements. When ChIP-on-chip regions were compared with gene expression changes at the early time points, 78 genes were identified as potential direct targets of the AHR. AHR binding and expression changes were confirmed for a subset of genes in primary mouse B cells. Network analysis examining connections between the 78 potential AHR target genes and three transcription factors known to regulate B-cell differentiation indicated multiple paths for potential regulation by the AHR. Enrichment analysis on the differentially expressed genes at each time point evaluated the downstream impact of AHR-regulated gene expression changes on B-cell–related processes. AHR-mediated impairment of B-cell differentiation occurred at multiple nodes of the B-cell differentiation network and potentially through multiple mechanisms including direct cis-acting effects on key regulators of B-cell differentiation, indirect regulation of B-cell differentiation–related pathways, and transcriptional coregulation of target genes by AHR and other transcription factors. PMID:20819909

  11. Exogenous gene can be integrated into Nosema bombycis genome by mediating with a non-transposon vector.

    PubMed

    Guo, Rui; Cao, Guangli; Lu, Yahong; Xue, Renyu; Kumar, Dhiraj; Hu, Xiaolong; Gong, Chengliang

    2016-08-01

    Nosema bombycis, a microsporidium, is a pathogen of pebrine disease of silkworms, and its genomic DNA sequences have been determined. Thus far, research on gene functions of microsporidia, including N. bombycis, cannot be performed with gain/loss of function. In the present study, we aimed to construct transgenic N. bombycis. Therefore, hemocytes of the infected silkworm were transfected with the non-transposon vector pIZT/V5-His in vivo, and the blood, in which hemocytes with green fluorescence could be observed, was added to cultured BmN cells. Furthermore, normal BmN cells were infected with germinated N. bombycis, and the infected cells were transfected with pIZT/V5-His. Continuous fluorescence observations revealed that there were N. bombycis with green fluorescence in some N. bombycis-infected cells, and the genome extracted from purified N. bombycis spores was used as a template. PCR amplification was carried out with a pair of primers for specifically amplifying the green fluorescence protein (GFP) gene; a specific product representing the gfp gene could be amplified. Expression of the GFP protein through Western blotting also demonstrated that the gfp gene was perfectly inserted into the genome of N. bombycis. These results illustrated that an exogenous gene can be integrated into the N. bombycis genome by mediating with a non-transposon vector. Our research not only offers a strategy for research on gene function of N. bombycis but also provides an important reference for constructing genetically modified microsporidia utilized for biocontrol of pests. PMID:27083186

  12. Expression and genomic integration of transgenes after Agrobacterium-mediated transformation of mature barley embryos.

    PubMed

    Uçarlı, C; Tufan, F; Gürel, F

    2015-01-01

    Mature embryos in tissue cultures are advantageous because of their abundance and rapid germination, which reduces genomic instability problems. In this study, 2-day-old isolated mature barley embryos were infected with 2 Agrobacterium hypervirulent strains (AGL1 and EHA105), followed by a 3-day period of co-cultivation in the presence of the amino acid L-cysteine. Chimeric expression of the β-glucuronidase gene (gusA) directed by a viral promoter of strawberry vein banding virus was observed in coleoptile epidermal cells and seminal roots in 5-day-old germinated seedlings. In addition to varying infectivity patterns in different strains, there was a higher ratio of transient β-glucuronidase expression in developing coleoptiles than in embryonic roots, indicating the high competency of shoot apical meristem cells in the mature embryo. A total of 548 explants were transformed and 156 plants developed to maturity on G418 media after 18-25 days. We detected transgenes in 74% of the screened plant leaves by polymerase chain reaction, and 49% of these expressed the neomycin phosphotransferase II gene following AGL1 transformation. Ten randomly selected T0 transformants were analyzed using thermal asymmetric interlaced polymerase chain reaction, and 24 fragments ranging between 200-600 base pairs were sequenced. Three of the sequences flanking the transferred DNA showed high similarity to coding regions of the barley genome, including alpha tubulin5, homeobox 1, and mitochondrial 16S genes. We observed 70-200-base pair filler sequences only in the coding regions of barley in this study. PMID:25730049

  13. XerD-mediated FtsK-independent integration of TLCϕ into the Vibrio cholerae genome.

    PubMed

    Midonet, Caroline; Das, Bhabatosh; Paly, Evelyne; Barre, Francois-Xavier

    2014-11-25

    As in most bacteria, topological problems arising from the circularity of the two Vibrio cholerae chromosomes, chrI and chrII, are resolved by the addition of a crossover at a specific site of each chromosome, dif, by two tyrosine recombinases, XerC and XerD. The reaction is under the control of a cell division protein, FtsK, which activates the formation of a Holliday Junction (HJ) intermediate by XerD catalysis that is resolved into product by XerC catalysis. Many plasmids and phages exploit Xer recombination for dimer resolution and for integration, respectively. In all cases so far described, they rely on an alternative recombination pathway in which XerC catalyzes the formation of a HJ independently of FtsK. This is notably the case for CTXϕ, the cholera toxin phage. Here, we show that in contrast, integration of TLCϕ, a toxin-linked cryptic satellite phage that is almost always found integrated at the chrI dif site before CTXϕ, depends on the formation of a HJ by XerD catalysis, which is then resolved by XerC catalysis. The reaction nevertheless escapes the normal cellular control exerted by FtsK on XerD. In addition, we show that the same reaction promotes the excision of TLCϕ, along with any CTXϕ copy present between dif and its left attachment site, providing a plausible mechanism for how chrI CTXϕ copies can be eliminated, as occurred in the second wave of the current cholera pandemic. PMID:25385643

  14. DNA-PK-mediated phosphorylation of EZH2 regulates the DNA damage-induced apoptosis to maintain T-cell genomic integrity

    PubMed Central

    Wang, Y; Sun, H; Wang, J; Wang, H; Meng, L; Xu, C; Jin, M; Wang, B; Zhang, Y; Zhang, Y; Zhu, T

    2016-01-01

    EZH2 is a histone methyltransferase whose functions in stem cells and tumor cells are well established. Accumulating evidence shows that EZH2 has critical roles in T cells and could be a promising therapeutic target for several immune diseases. To further reveal the novel functions of EZH2 in human T cells, protein co-immunoprecipitation combined with mass spectrometry was conducted and several previously unknown EZH2-interacting proteins were identified. Of these, we focused on a DNA damage responsive protein, Ku80, because of the limited knowledge regarding EZH2 in the DNA damage response. Then, we demonstrated that instead of being methylated by EZH2, Ku80 bridges the interaction between the DNA-dependent protein kinase (DNA-PK) complex and EZH2, thus facilitating EZH2 phosphorylation. Moreover, EZH2 histone methyltransferase activity was enhanced when Ku80 was knocked down or DNA-PK activity was inhibited, suggesting that DNA-PK-mediated EZH2 phosphorylation impairs EZH2 histone methyltransferase activity. On the other hand, EZH2 inhibition increased the DNA damage level at the late phase of T-cell activation, suggesting that EZH2 is involved in genomic integrity maintenance. In conclusion, our study is the first to demonstrate that EZH2 is phosphorylated by the DNA damage responsive complex DNA-PK and regulates DNA damage-mediated T-cell apoptosis, which reveals a novel functional crosstalk between epigenetic regulation and genomic integrity. PMID:27468692

  15. DNA-PK-mediated phosphorylation of EZH2 regulates the DNA damage-induced apoptosis to maintain T-cell genomic integrity.

    PubMed

    Wang, Y; Sun, H; Wang, J; Wang, H; Meng, L; Xu, C; Jin, M; Wang, B; Zhang, Y; Zhang, Y; Zhu, T

    2016-01-01

    EZH2 is a histone methyltransferase whose functions in stem cells and tumor cells are well established. Accumulating evidence shows that EZH2 has critical roles in T cells and could be a promising therapeutic target for several immune diseases. To further reveal the novel functions of EZH2 in human T cells, protein co-immunoprecipitation combined with mass spectrometry was conducted and several previously unknown EZH2-interacting proteins were identified. Of these, we focused on a DNA damage responsive protein, Ku80, because of the limited knowledge regarding EZH2 in the DNA damage response. Then, we demonstrated that instead of being methylated by EZH2, Ku80 bridges the interaction between the DNA-dependent protein kinase (DNA-PK) complex and EZH2, thus facilitating EZH2 phosphorylation. Moreover, EZH2 histone methyltransferase activity was enhanced when Ku80 was knocked down or DNA-PK activity was inhibited, suggesting that DNA-PK-mediated EZH2 phosphorylation impairs EZH2 histone methyltransferase activity. On the other hand, EZH2 inhibition increased the DNA damage level at the late phase of T-cell activation, suggesting that EZH2 is involved in genomic integrity maintenance. In conclusion, our study is the first to demonstrate that EZH2 is phosphorylated by the DNA damage responsive complex DNA-PK and regulates DNA damage-mediated T-cell apoptosis, which reveals a novel functional crosstalk between epigenetic regulation and genomic integrity. PMID:27468692

  16. Integration of transcriptomic and genomic data suggests candidate mechanisms for APOE4-mediated pathogenic action in Alzheimer’s disease

    PubMed Central

    Caberlotto, Laura; Marchetti, Luca; Lauria, Mario; Scotti, Marco; Parolo, Silvia

    2016-01-01

    Among the genetic factors known to increase the risk of late onset Alzheimer’s disease (AD), the presence of the apolipoprotein E ε4 (APOE4) allele has been recognized as the one with the strongest effect. However, despite decades of research, the pathogenic role of APOE4 in Alzheimer’s disease has not been clearly elucidated yet. In order to investigate the pathogenic action of APOE4, we applied a systems biology approach to the analysis of transcriptomic and genomic data of APOE44 vs. APOE33 allele carriers affected by Alzheimer’s disease. Network analysis combined with a novel technique for biomarker computation allowed the identification of an alteration in aging-associated processes such as inflammation, oxidative stress and metabolic pathways, indicating that APOE4 possibly accelerates pathological processes physiologically induced by aging. Subsequent integration with genomic data indicates that the Notch pathway could be the nodal molecular mechanism altered in APOE44 allele carriers with Alzheimer’s disease. Interestingly, PSEN1 and APP, genes whose mutations are known to be linked to early onset Alzheimer’s disease, are closely linked to this pathway. In conclusion, the role of APOE4 in inflammation and oxidation through the Notch signaling pathway could be crucial in elucidating the risk factors of Alzheimer’s disease. PMID:27585646

  17. Integration of transcriptomic and genomic data suggests candidate mechanisms for APOE4-mediated pathogenic action in Alzheimer's disease.

    PubMed

    Caberlotto, Laura; Marchetti, Luca; Lauria, Mario; Scotti, Marco; Parolo, Silvia

    2016-01-01

    Among the genetic factors known to increase the risk of late onset Alzheimer's disease (AD), the presence of the apolipoprotein E ε4 (APOE4) allele has been recognized as the one with the strongest effect. However, despite decades of research, the pathogenic role of APOE4 in Alzheimer's disease has not been clearly elucidated yet. In order to investigate the pathogenic action of APOE4, we applied a systems biology approach to the analysis of transcriptomic and genomic data of APOE44 vs. APOE33 allele carriers affected by Alzheimer's disease. Network analysis combined with a novel technique for biomarker computation allowed the identification of an alteration in aging-associated processes such as inflammation, oxidative stress and metabolic pathways, indicating that APOE4 possibly accelerates pathological processes physiologically induced by aging. Subsequent integration with genomic data indicates that the Notch pathway could be the nodal molecular mechanism altered in APOE44 allele carriers with Alzheimer's disease. Interestingly, PSEN1 and APP, genes whose mutations are known to be linked to early onset Alzheimer's disease, are closely linked to this pathway. In conclusion, the role of APOE4 in inflammation and oxidation through the Notch signaling pathway could be crucial in elucidating the risk factors of Alzheimer's disease. PMID:27585646

  18. CUL9 mediates the functions of the 3M complex and ubiquitylates survivin to maintain genome integrity

    PubMed Central

    Li, Zhijun; Pei, Xin-Hai; Yan, Jun; Yan, Feng; Cappell, Kathryn M.; Whitehurst, Angelique W.; Xiong, Yue

    2014-01-01

    SUMMARY The Cullin 9 (CUL9) gene encodes a putative E3 ligase that localizes in the cytoplasm. Cul9 null mice develop spontaneous tumors in multiple organs; however, neither the cellular nor the molecular mechanisms of CUL9 in tumor suppression are currently known. We show here that deletion of Cul9 leads to abnormal nuclear morphology, increased DNA damage and aneuploidy. CUL9 knockdown rescues the microtubule and mitosis defects in cells depleted for CUL7 or OBSL1, two genes that are mutated in a mutually exclusive manner in 3M growth retardation syndrome and function in microtubule dynamics. CUL9 promotes the ubiquitylation and degradation of survivin and is inhibited by CUL7. Depletion of CUL7 decreases survivin level and overexpression of survivin rescues the defects caused by CUL7 depletion. We propose a 3M–CUL9-survivin pathway in maintaining microtubule and genome integrity, normal development and tumor suppression. PMID:24793696

  19. Integrative Genomics Implicates EGFR as a Downstream Mediator in NKX2-1 Amplified Non-Small Cell Lung Cancer

    PubMed Central

    Clarke, Nicole; Biscocho, Jewison; Kwei, Kevin A.; Davidson, Jean M.; Sridhar, Sushmita; Gong, Xue; Pollack, Jonathan R.

    2015-01-01

    NKX2-1, encoding a homeobox transcription factor, is amplified in approximately 15% of non-small cell lung cancers (NSCLC), where it is thought to drive cancer cell proliferation and survival. However, its mechanism of action remains largely unknown. To identify relevant downstream transcriptional targets, here we carried out a combined NKX2-1 transcriptome (NKX2-1 knockdown followed by RNAseq) and cistrome (NKX2-1 binding sites by ChIPseq) analysis in four NKX2-1-amplified human NSCLC cell lines. While NKX2-1 regulated genes differed among the four cell lines assayed, cell proliferation emerged as a common theme. Moreover, in 3 of the 4 cell lines, epidermal growth factor receptor (EGFR) was among the top NKX2-1 upregulated targets, which we confirmed at the protein level by western blot. Interestingly, EGFR knockdown led to upregulation of NKX2-1, suggesting a negative feedback loop. Consistent with this finding, combined knockdown of NKX2-1 and EGFR in NCI-H1819 lung cancer cells reduced cell proliferation (as well as MAP-kinase and PI3-kinase signaling) more than knockdown of either alone. Likewise, NKX2-1 knockdown enhanced the growth-inhibitory effect of the EGFR-inhibitor erlotinib. Taken together, our findings implicate EGFR as a downstream effector of NKX2-1 in NKX2-1 amplified NSCLC, with possible clinical implications, and provide a rich dataset for investigating additional mediators of NKX2-1 driven oncogenesis. PMID:26556242

  20. Yeast oligo-mediated genome engineering (YOGE).

    PubMed

    DiCarlo, James E; Conley, Andrew J; Penttilä, Merja; Jäntti, Jussi; Wang, Harris H; Church, George M

    2013-12-20

    High-frequency oligonucleotide-directed recombination engineering (recombineering) has enabled rapid modification of several prokaryotic genomes to date. Here, we present a method for oligonucleotide-mediated recombineering in the model eukaryote and industrial production host Saccharomyces cerevisiae, which we call yeast oligo-mediated genome engineering (YOGE). Through a combination of overexpression and knockouts of relevant genes and optimization of transformation and oligonucleotide designs, we achieve high gene-modification frequencies at levels that only require screening of dozens of cells. We demonstrate the robustness of our approach in three divergent yeast strains, including those involved in industrial production of biobased chemicals. Furthermore, YOGE can be iteratively executed via cycling to generate genomic libraries up to 10^5 individuals at each round for diversity generation. YOGE cycling alone or in combination with phenotypic selections or endonuclease-based negative genotypic selections can be used to generate modified alleles easily in yeast populations with high frequencies. PMID:24160921

  1. Yeast Oligo-mediated Genome Engineering (YOGE)

    PubMed Central

    DiCarlo, JE; Conley, AJ; Penttilä, M; Jäntti, J; Wang, HH; Church, GM

    2014-01-01

    High-frequency oligonucleotide-directed recombination engineering (recombineering) has enabled rapid modification of several prokaryotic genomes to date. Here, we present a method for oligonucleotide-mediated recombineering in the model eukaryote and industrial production host S. cerevisiae, which we call Yeast Oligo-mediated Genome Engineering (YOGE). Through a combination of overexpression and knockouts of relevant genes and optimization of transformation and oligonucleotide designs, we achieve high gene modification frequencies at levels that only require screening of dozens of cells. We demonstrate the robustness of our approach in three divergent yeast strains, including those involved in industrial production of bio-based chemicals. Furthermore, YOGE can be iteratively executed via cycling to generate genomic libraries up to 10^5 individuals at each round for diversity generation. YOGE cycling alone, or in combination with phenotypic selections or endonuclease-based negative genotypic selections, can be used to easily generate modified alleles in yeast populations with high frequencies. PMID:24160921

  2. TAL effector-mediated genome visualization (TGV).

    PubMed

    Miyanari, Yusuke

    2014-09-01

    The three-dimensional remodeling of chromatin within the nucleus is being recognized as a determinant of genome regulation. Recent technological advances in live imaging of chromosome loci have begun to explore the biological roles of the movement of chromatin within the nucleus. To facilitate better understanding of the functional relevance and mechanisms regulating genome architecture, we applied transcription activator-like effector (TALE) technology to visualize endogenous repetitive genomic sequences in mouse cells. The application, called TAL effector-mediated genome visualization (TGV), allows us to label specific repetitive sequences and trace nuclear remodeling in living cells. Using this system, parental origin of chromosomes was specifically traced by distinction of single-nucleotide polymorphisms (SNPs). This review will present our approaches to monitor nuclear dynamics of target sequences and highlight key properties and potential uses of TGV. PMID:24704356

  3. Triplex-mediated genome targeting and editing.

    PubMed

    Reza, Faisal; Glazer, Peter M

    2014-01-01

    Genome targeting and editing in vitro and in vivo can be achieved through an interplay of exogenously introduced molecules and the induction of endogenous recombination machinery. The former includes a repertoire of sequence-specific binding molecules for targeted induction and appropriation of this machinery, such as by triplex-forming oligonucleotides (TFOs) or triplex-forming peptide nucleic acids (PNAs) and recombinagenic donor DNA, respectively. This versatile targeting and editing via recombination approach facilitates high-fidelity and low-off-target genome mutagenesis, repair, expression, and regulation. Herein, we describe the current state-of-the-art in triplex-mediated genome targeting and editing with a perspective towards potential translational and therapeutic applications. We detail several materials and methods for the design, delivery, and use of triplex-forming and recombinagenic molecules for mediating and introducing specific, heritable, and safe genomic modifications. Furthermore we denote some guidelines for endogenous genome targeting and editing site identification and techniques to test targeting and editing efficiency. PMID:24557900

  4. Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine.

    PubMed

    Elsik, Christine G; Tayal, Aditi; Diesh, Colin M; Unni, Deepak R; Emery, Marianne L; Nguyen, Hung N; Hagen, Darren E

    2016-01-01

    We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. PMID:26578564

  5. Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine

    PubMed Central

    Elsik, Christine G.; Tayal, Aditi; Diesh, Colin M.; Unni, Deepak R.; Emery, Marianne L.; Nguyen, Hung N.; Hagen, Darren E.

    2016-01-01

    We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. PMID:26578564

  6. Zinc Finger Nuclease-Expressing Baculoviral Vectors Mediate Targeted Genome Integration of Reprogramming Factor Genes to Facilitate the Generation of Human Induced Pluripotent Stem Cells

    PubMed Central

    Phang, Rui-Zhe; Tay, Felix Chang; Goh, Sal-Lee; Lau, Cia-Hin; Zhu, Haibao; Tan, Wee-Kiat; Liang, Qingle; Chen, Can; Du, Shouhui; Li, Zhendong; Tay, Johan Chin-Kang; Wu, Chunxiao; Zeng, Jieming; Fan, Weimin; Toh, Han Chong

    2013-01-01

    Integrative gene transfer using retroviruses to express reprogramming factors displays high efficiency in generating induced pluripotent stem cells (iPSCs), but the value of the method is limited because of the concern over mutagenesis associated with random insertion of transgenes. Site-specific integration into a preselected locus by engineered zinc-finger nuclease (ZFN) technology provides a potential way to overcome the problem. Here, we report the successful reprogramming of human fibroblasts into a state of pluripotency by baculoviral transduction-mediated, site-specific integration of OKSM (Oct3/4, Klf4, Sox2, and c-myc) transcription factor genes into the AAVS1 locus in human chromosome 19. Two nonintegrative baculoviral vectors were used for cotransduction, one expressing ZFNs and another as a donor vector encoding the four transcription factors. iPSC colonies were obtained at a high efficiency of 12% (the mean value of eight individual experiments). All characterized iPSC clones carried the transgenic cassette only at the ZFN-specified AAVS1 locus. We further demonstrated that when the donor cassette was flanked by heterospecific loxP sequences, the reprogramming genes in iPSCs could be replaced by another transgene using a baculoviral vector-based Cre recombinase-mediated cassette exchange system, thereby producing iPSCs free of exogenous reprogramming factors. Although the use of nonintegrating methods to generate iPSCs is rapidly becoming a standard approach, methods based on site-specific integration of reprogramming factor genes as reported here hold the potential for efficient generation of genetically amenable iPSCs suitable for future gene therapy applications. PMID:24167318

  7. Integrative Genomics of Chronic Obstructive Pulmonary Disease

    PubMed Central

    Hobbs, Brian D.; Hersh, Craig P.

    2014-01-01

    Chronic obstructive pulmonary disease (COPD) is a complex disease with both environmental and genetic determinants, the most important of which is cigarette smoking. There is marked heterogeneity in the development of COPD among persons with similar cigarette smoking histories, which is likely partially explained by genetic variation. Genomic approaches such as genomewide association studies and gene expression studies have been used to discover genes and molecular pathways involved in COPD pathogenesis; however, these “first generation” omics studies have limitations. Integrative genomic studies are emerging which can combine genomic datasets to further examine the molecular underpinnings of COPD. Future research in COPD genetics will likely use network-based approaches to integrate multiple genomic data types in order to model the complex molecular interactions involved in COPD pathogenesis. This article reviews the genomic research to date and offers a vision for the future of integrative genomic research in COPD. PMID:25078622

  8. Reverse transcriptase: mediator of genomic plasticity.

    PubMed

    Brosius, J; Tiedge, H

    1995-01-01

    Reverse transcription has been an important mediator of genomic change. This influence dates back more than three billion years, when the RNA genome was converted into the DNA genome. While the current cellular role(s) of reverse transcriptase are not yet completely understood, it has become clear over the last few years that this enzyme is still responsible for generating significant genomic change and that its activities are one of the driving forces of evolution. Reverse transcriptase generates, for example, extra gene copies (retrogenes), using as a template mature messenger RNAs. Such retrogenes do not always end up as nonfunctional pseudogenes but form, after reinsertion into the genome, new unions with resident promoter elements that may alter the gene's temporal and/or spatial expression levels. More frequently, reverse transcriptase produces copies of nonmessenger RNAs, such as small nuclear or cytoplasmic RNAs. Extremely high copy numbers can be generated by this process. The resulting reinserted DNA copies are therefore referred to as short interspersed repetitive elements (SINEs). SINEs have long been considered selfish DNA, littering the genome via exponential propagation but not contributing to the host's fitness. Many SINEs, however, can give rise to novel genes encoding small RNAs, and are the migrant carriers of numerous control elements and sequence motifs that can equip resident genes with novel regulatory elements [Brosius J. and Gould S.J., Proc Natl Acad Sci USA 89, 10706-10710, 1992]. Retrosequences, such as SINEs and portions of retroelements (e.g., long terminal repeats, LTRs), are capable of donating sequence motifs for nucleosome positioning, DNA methylation, transcriptional enhancers and silencers, poly(A) addition sequences, determinants of RNA stability or transport, splice sites, and even amino acid codons for incorporation into open reading frames as novel protein domains. Retroposition can therefore be considered as a major

  9. Next-Generation Genomics: an Integrative Approach

    PubMed Central

    Hawkins, R. David; Hon, Gary C.; Ren, Bing

    2011-01-01

    Integrating results from diverse experiments is an essential process in our effort to understand the logic of complex systems, such as development, homeostasis and responses to the environment. With the advent of high-throughput methods, including genome-wide association studies (GWAS), ChIP-Seq, and RNA-Seq, acquisition of genome-scale data has never been easier. Epigenetics, transcriptomics, proteomics and genomics each provide an insightful, and yet single-dimensional, view of genome function; integrative analysis promises a unified, global view. However, the large amount of information and diverse technology platforms pose multiple challenges for data access and processing. This Review discusses emerging issues and strategies related to data integration in the era of next-generation genomics. PMID:20531367

  10. Integrated genome browser: visual analytics platform for genomics

    PubMed Central

    Norris, David C.; Loraine, Ann E.

    2016-01-01

    Motivation: Genome browsers that support fast navigation through vast datasets and provide interactive visual analytics functions can help scientists achieve deeper insight into biological systems. Toward this end, we developed Integrated Genome Browser (IGB), a highly configurable, interactive and fast open source desktop genome browser. Results: Here we describe multiple updates to IGB, including all-new capabilities to display and interact with data from high-throughput sequencing experiments. To demonstrate, we describe example visualizations and analyses of datasets from RNA-Seq, ChIP-Seq and bisulfite sequencing experiments. Understanding results from genome-scale experiments requires viewing the data in the context of reference genome annotations and other related datasets. To facilitate this, we enhanced IGB’s ability to consume data from diverse sources, including Galaxy, Distributed Annotation and IGB-specific Quickload servers. To support future visualization needs as new genome-scale assays enter wide use, we transformed the IGB codebase into a modular, extensible platform for developers to create and deploy all-new visualizations of genomic data. Availability and implementation: IGB is open source and is freely available from http://bioviz.org/igb. Contact: aloraine@uncc.edu PMID:27153568

  11. Transcription as a Threat to Genome Integrity.

    PubMed

    Gaillard, Hélène; Aguilera, Andrés

    2016-06-01

    Genomes undergo different types of sporadic alterations, including DNA damage, point mutations, and genome rearrangements, that constitute the basis for evolution. However, these changes may occur at high levels as a result of cell pathology and trigger genome instability, a hallmark of cancer and a number of genetic diseases. In the last two decades, evidence has accumulated that transcription constitutes an important natural source of DNA metabolic errors that can compromise the integrity of the genome. Transcription can create the conditions for high levels of mutations and recombination by its ability to open the DNA structure and remodel chromatin, making it more accessible to DNA insulting agents, and by its ability to become a barrier to DNA replication. Here we review the molecular basis of such events from a mechanistic perspective with particular emphasis on the role of transcription as a genome instability determinant. PMID:27023844

  12. Integrating Mediators and Moderators in Research Design

    ERIC Educational Resources Information Center

    MacKinnon, David P.

    2011-01-01

    The purpose of this article is to describe mediating variables and moderating variables and provide reasons for integrating them in outcome studies. Separate sections describe examples of moderating and mediating variables and the simplest statistical model for investigating each variable. The strengths and limitations of incorporating mediating…

  13. Methods of Genomic Competency Integration in Practice

    PubMed Central

    Jenkins, Jean; Calzone, Kathleen A.; Caskey, Sarah; Culp, Stacey; Weiner, Marsha; Badzek, Laurie

    2015-01-01

    Purpose: Genomics is increasingly relevant to health care, necessitating support for nurses to incorporate genomic competencies into practice. The primary aim of this project was to develop, implement, and evaluate a year-long genomic education intervention that trained, supported, and supervised institutional administrator and educator champion dyads to increase nursing capacity to integrate genomics through assessments of program satisfaction and institutional achieved outcomes. Design: Longitudinal study of 23 Magnet Recognition Program® Hospitals (21 intervention, 2 controls) participating in a 1-year new competency integration effort aimed at increasing genomic nursing competency and overcoming barriers to genomics integration in practice. Methods: Champion dyads underwent genomic training consisting of one in-person kick-off training meeting followed by monthly education webinars. Champion dyads designed institution-specific action plans detailing objectives, methods or strategies used to engage and educate nursing staff, timeline for implementation, and outcomes achieved. Action plans focused on a minimum of seven genomic priority areas: champion dyad personal development; practice assessment; policy content assessment; staff knowledge needs assessment; staff development; plans for integration; and anticipated obstacles and challenges. Action plans were updated quarterly, outlining progress made as well as inclusion of new methods or strategies. Progress was validated through virtual site visits with the champion dyads and chief nursing officers. Descriptive data were collected on all strategies or methods utilized, and timeline for achievement. Descriptive data were analyzed using content analysis. Findings: The complexity of the competency content and the uniqueness of social systems and infrastructure resulted in a significant variation of champion dyad interventions. Conclusions: Nursing champions can facilitate change in genomic nursing capacity through

  14. An Integrated System for Precise Genome Modification in Escherichia coli

    PubMed Central

    Tas, Huseyin; Nguyen, Cac T.; Patel, Ravish; Kim, Neil H.; Kuhlman, Thomas E.

    2015-01-01

    We describe an optimized system for the easy, effective, and precise modification of the Escherichia coli genome. Genome changes are introduced first through the integration of a 1.3 kbp Landing Pad consisting of a gene conferring resistance to tetracycline (tetA) or the ability to metabolize the sugar galactose (galK). The Landing Pad is then excised as a result of double-strand breaks by the homing endonuclease I-SceI, and replaced with DNA fragments bearing the desired change via λ-Red mediated homologous recombination. Repair of the double strand breaks and counterselection against the Landing Pad (using NiCl2 for tetA or 2-deoxy-galactose for galK) allows the isolation of modified bacteria without the use of additional antibiotic selection. We demonstrate the power of this method to make a variety of genome modifications: the exact integration, without any extraneous sequence, of the lac operon (~6.5 kbp) to any desired location in the genome and without the integration of antibiotic markers; the scarless deletion of ribosomal rrn operons (~6 kbp) through either intrachromosomal or oligonucleotide recombination; and the in situ fusion of native genes to fluorescent reporter genes without additional perturbation. PMID:26332675

  15. An Integrated System for Precise Genome Modification in Escherichia coli.

    PubMed

    Tas, Huseyin; Nguyen, Cac T; Patel, Ravish; Kim, Neil H; Kuhlman, Thomas E

    2015-01-01

    We describe an optimized system for the easy, effective, and precise modification of the Escherichia coli genome. Genome changes are introduced first through the integration of a 1.3 kbp Landing Pad consisting of a gene conferring resistance to tetracycline (tetA) or the ability to metabolize the sugar galactose (galK). The Landing Pad is then excised as a result of double-strand breaks by the homing endonuclease I-SceI, and replaced with DNA fragments bearing the desired change via λ-Red mediated homologous recombination. Repair of the double strand breaks and counterselection against the Landing Pad (using NiCl2 for tetA or 2-deoxy-galactose for galK) allows the isolation of modified bacteria without the use of additional antibiotic selection. We demonstrate the power of this method to make a variety of genome modifications: the exact integration, without any extraneous sequence, of the lac operon (~6.5 kbp) to any desired location in the genome and without the integration of antibiotic markers; the scarless deletion of ribosomal rrn operons (~6 kbp) through either intrachromosomal or oligonucleotide recombination; and the in situ fusion of native genes to fluorescent reporter genes without additional perturbation. PMID:26332675

  16. An Integrated Approach to Predictive Genomic Analytics

    SciTech Connect

    McDermott, Jason E.; Sanfilippo, Antonio P.; Taylor, Ronald C.; Baddeley, Robert L.; Riensche, Roderick M.; Jensen, Russell S.

    2010-08-02

    A variety of methods and algorithms have recently been employed in the analysis of gene expression data, including reverse-engineering and knowledge-based pathway modeling, semantic gene similarity, network analysis and clustering. These methods and algorithms address different subparts of the same overall challenge and need to be applied in combination to address predictive genomic analysis as a whole. In this paper, we present an integrated approach to predictive genomic analysis that achieves this objective and describe an application of the approach to the study of neuroprotection in stroke.

  17. Integrating sequence, evolution and functional genomics in regulatory genomics

    PubMed Central

    Vingron, Martin; Brazma, Alvis; Coulson, Richard; van Helden, Jacques; Manke, Thomas; Palin, Kimmo; Sand, Olivier; Ukkonen, Esko

    2009-01-01

    With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome. PMID:19226437

  18. RNA-Mediated Epigenetic Programming of Genome Rearrangements

    PubMed Central

    Nowacki, Mariusz; Shetty, Keerthi; Landweber, Laura F.

    2012-01-01

    RNA, normally thought of as a conduit in gene expression, has a novel mode of action in ciliated protozoa. Maternal RNA templates provide both an organizing guide for DNA rearrangements and a template that can transport somatic mutations to the next generation. This opportunity for RNA-mediated genome rearrangement and DNA repair is profound in the ciliate Oxytricha, which deletes 95% of its germline genome during development in a process that severely fragments its chromosomes and then sorts and reorders the hundreds of thousands of pieces remaining. Oxytricha’s somatic nuclear genome is therefore an epigenome formed through RNA templates and signals arising from the previous generation. Furthermore, this mechanism of RNA-mediated epigenetic inheritance can function across multiple generations, and the discovery of maternal template RNA molecules has revealed new biological roles for RNA and has hinted at the power of RNA molecules to sculpt genomic information in cells. PMID:21801022

  19. Integrative Genomics and Computational Systems Medicine

    SciTech Connect

    McDermott, Jason E.; Huang, Yufei; Zhang, Bing; Xu, Hua; Zhao, Zhongming

    2014-01-01

    The exponential growth in generation of large amounts of genomic data from biological samples has driven the emerging field of systems medicine. This field is promising because it improves our understanding of disease processes at the systems level. However, the field is still in its young stage. There exists a great need for novel computational methods and approaches to effectively utilize and integrate various omics data.

  20. Genomic, Proteomic, and Metabolomic Data Integration Strategies

    PubMed Central

    Wanichthanarak, Kwanjeera; Fahrmann, Johannes F; Grapov, Dmitry

    2015-01-01

    Robust interpretation of experimental results measuring discrete biological domains remains a significant challenge in the face of complex biochemical regulation processes such as organismal versus tissue versus cellular metabolism, epigenetics, and protein post-translational modification. Integration of analyses carried out across multiple measurement or omic platforms is an emerging approach to help address these challenges. This review focuses on select methods and tools for the integration of metabolomic with genomic and proteomic data using a variety of approaches including biochemical pathway-, ontology-, network-, and empirical-correlation-based methods. PMID:26396492
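
    As a rough illustration of the empirical-correlation-based integration named in this abstract, the sketch below pairs metabolite and gene features whose abundances correlate across the same samples. It is a minimal sketch under stated assumptions rather than a method taken from the review: the feature names, the measurements, and the 0.9 cutoff are all invented for the example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant feature carries no correlation signal
    return cov / sqrt(vx * vy)

def correlated_pairs(metabolites, genes, cutoff=0.9):
    """Return (metabolite, gene, r) for every pair with |r| >= cutoff."""
    pairs = []
    for m_name, m_vals in metabolites.items():
        for g_name, g_vals in genes.items():
            r = pearson(m_vals, g_vals)
            if abs(r) >= cutoff:
                pairs.append((m_name, g_name, r))
    return pairs

# Hypothetical measurements over the same four samples (not real data)
metabolites = {"met_A": [1.0, 2.0, 3.0, 4.0], "met_B": [2.0, 2.0, 2.0, 2.0]}
genes = {"gene_X": [2.0, 4.0, 6.0, 8.0], "gene_Y": [4.0, 3.0, 2.0, 1.0]}
hits = correlated_pairs(metabolites, genes)
```

    Real multi-omic data sets would additionally call for normalization and multiple-testing correction, which this sketch leaves out.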

  1. Integrating Computer-Mediated Communication Strategy Instruction

    ERIC Educational Resources Information Center

    McNeil, Levi

    2016-01-01

    Communication strategies (CSs) play important roles in resolving problematic second language interaction and facilitating language learning. While studies in face-to-face contexts demonstrate the benefits of communication strategy instruction (CSI), there have been few attempts to integrate computer-mediated communication and CSI. The study…

  2. Adeno-Associated Virus Type 2 Wild-Type and Vector-Mediated Genomic Integration Profiles of Human Diploid Fibroblasts Analyzed by Third-Generation PacBio DNA Sequencing

    PubMed Central

    Hüser, Daniela; Gogol-Döring, Andreas; Chen, Wei

    2014-01-01

    ABSTRACT Genome-wide analysis of adeno-associated virus (AAV) type 2 integration in HeLa cells has shown that wild-type AAV integrates at numerous genomic sites, including AAVS1 on chromosome 19q13.42. Multiple GAGY/C repeats, resembling consensus AAV Rep-binding sites are preferred, whereas rep-deficient AAV vectors (rAAV) regularly show a random integration profile. This study is the first study to analyze wild-type AAV integration in diploid human fibroblasts. Applying high-throughput third-generation PacBio-based DNA sequencing, integration profiles of wild-type AAV and rAAV are compared side by side. Bioinformatic analysis reveals that both wild-type AAV and rAAV prefer open chromatin regions. Although genomic features of AAV integration largely reproduce previous findings, the pattern of integration hot spots differs from that described in HeLa cells before. DNase-Seq data for human fibroblasts and for HeLa cells reveal variant chromatin accessibility at preferred AAV integration hot spots that correlates with variant hot spot preferences. DNase-Seq patterns of these sites in human tissues, including liver, muscle, heart, brain, skin, and embryonic stem cells further underline variant chromatin accessibility. In summary, AAV integration is dependent on cell-type-specific, variant chromatin accessibility leading to random integration profiles for rAAV, whereas wild-type AAV integration sites cluster near GAGY/C repeats. IMPORTANCE Adeno-associated virus type 2 (AAV) is assumed to establish latency by chromosomal integration of its DNA. This is the first genome-wide analysis of wild-type AAV2 integration in diploid human cells and the first to compare wild-type to recombinant AAV vector integration side by side under identical experimental conditions. Major determinants of wild-type AAV integration represent open chromatin regions with accessible consensus AAV Rep-binding sites. The variant chromatin accessibility of different human tissues or cell types will

  3. Enhancing cancer clonality analysis with integrative genomics

    PubMed Central

    2015-01-01

    Introduction: It is understood that cancer is a clonal disease initiated by a single cell, and that metastasis, which is the spread of cancer from the primary site, is also initiated by a single cell. The seemingly natural capability of cancer to adapt dynamically in a Darwinian manner is a primary reason for therapeutic failures. Survival advantages may be induced by cancer therapies and also occur as a result of inherent cell and microenvironmental factors. The selected "more fit" clones outmatch their competition and then become dominant in the tumor via propagation of progeny. This clonal expansion leads to relapse, therapeutic resistance and eventually death. The goal of this study is to develop and demonstrate a more detailed clonality approach by utilizing integrative genomics. Methods: Patient tumor samples were profiled by Whole Exome Sequencing (WES) and RNA-seq on an Illumina HiSeq 2500 and methylation profiling was performed on the Illumina Infinium 450K array. STAR and the Haplotype Caller were used for RNA-seq processing. Custom approaches were used for the integration of the multi-omic datasets. Results: Reported are major enhancements to CloneViz, which now provides capabilities enabling a formal tumor multi-dimensional clonality analysis by integrating: i) DNA mutations, ii) RNA expressed mutations, and iii) DNA methylation data. RNA and DNA methylation integration were not previously possible, by CloneViz (previous version) or any other clonality method to date. This new approach, named iCloneViz (integrated CloneViz) employs visualization and quantitative methods, revealing an integrative genomic mutational dissection and traceability (DNA, RNA, epigenetics) through the different layers of molecular structures. Conclusion: The iCloneViz approach can be used for analysis of clonal evolution and mutational dynamics of multi-omic data sets. Revealing tumor clonal complexity in an integrative and quantitative manner facilitates improved mutational

  4. Multidimensional Genome-wide Analyses Show Accurate FVIII Integration by ZFN in Primary Human Cells

    PubMed Central

    Sivalingam, Jaichandran; Kenanov, Dimitar; Han, Hao; Nirmal, Ajit Johnson; Ng, Wai Har; Lee, Sze Sing; Masilamani, Jeyakumar; Phan, Toan Thang; Maurer-Stroh, Sebastian; Kon, Oi Lian

    2016-01-01

    Costly coagulation factor VIII (FVIII) replacement therapy is a barrier to optimal clinical management of hemophilia A. Therapy using FVIII-secreting autologous primary cells is potentially efficacious and more affordable. Zinc finger nucleases (ZFN) mediate transgene integration into the AAVS1 locus but comprehensive evaluation of off-target genome effects is currently lacking. In light of serious adverse effects in clinical trials which employed genome-integrating viral vectors, this study evaluated potential genotoxicity of ZFN-mediated transgenesis using different techniques. We employed deep sequencing of predicted off-target sites, copy number analysis, whole-genome sequencing, and RNA-seq in primary human umbilical cord-lining epithelial cells (CLECs) with AAVS1 ZFN-mediated FVIII transgene integration. We combined molecular features to enhance the accuracy and activity of ZFN-mediated transgenesis. Our data showed a low frequency of ZFN-associated indels, no detectable off-target transgene integrations or chromosomal rearrangements. ZFN-modified CLECs had very few dysregulated transcripts and no evidence of activated oncogenic pathways. We also showed AAVS1 ZFN activity and durable FVIII transgene secretion in primary human dermal fibroblasts, bone marrow- and adipose tissue-derived stromal cells. Our study suggests that, with close attention to the molecular design of genome-modifying constructs, AAVS1 ZFN-mediated FVIII integration in several primary human cell types may be safe and efficacious. PMID:26689265

  5. Site-specific recombination in the chicken genome using Flipase recombinase-mediated cassette exchange.

    PubMed

    Lee, Hong Jo; Lee, Hyung Chul; Kim, Young Min; Hwang, Young Sun; Park, Young Hyun; Park, Tae Sub; Han, Jae Yong

    2016-02-01

    Targeted genome recombination has been applied in diverse research fields and has a wide range of possible applications. In particular, the discovery of specific loci in the genome that support robust and ubiquitous expression of integrated genes and the development of genome-editing technology have facilitated rapid advances in various scientific areas. In this study, we produced transgenic (TG) chickens that can induce recombinase-mediated gene cassette exchange (RMCE), one of the site-specific recombination technologies, and confirmed RMCE in TG chicken-derived cells. As a result, we established TG chicken lines that have Flipase (Flp) recognition target (FRT) pairs in the chicken genome, mediated by piggyBac transposition. The transgene integration patterns were diverse in each TG chicken line, and the integration diversity resulted in diverse levels of expression of exogenous genes in each tissue of the TG chickens. In addition, the replaced gene cassette was expressed successfully and maintained by RMCE in the FRT predominant loci of TG chicken-derived cells. These results indicate that targeted genome recombination technology with RMCE could be adaptable to TG chicken models and that the technology would be applicable to specific gene regulation by cis-element insertion and customized expression of functional proteins at predicted levels without epigenetic influence. PMID:26443821

  6. Domain-mediated protein interaction prediction: From genome to network.

    PubMed

    Reimand, Jüri; Hui, Shirley; Jain, Shobhit; Law, Brian; Bader, Gary D

    2012-08-14

    Protein-protein interactions (PPIs), involved in many biological processes such as cellular signaling, are ultimately encoded in the genome. Solving the problem of predicting protein interactions from the genome sequence will lead to increased understanding of complex networks, evolution and human disease. We can learn the relationship between genomes and networks by focusing on an easily approachable subset of high-resolution protein interactions that are mediated by peptide recognition modules (PRMs) such as PDZ, WW and SH3 domains. This review focuses on computational prediction and analysis of PRM-mediated networks and discusses sequence- and structure-based interaction predictors, techniques and datasets for identifying physiologically relevant PPIs, and interpreting high-resolution interaction networks in the context of evolution and human disease. PMID:22561014

  7. Population Genomics of Infectious and Integrated Wolbachia pipientis Genomes in Drosophila ananassae

    PubMed Central

    Choi, Jae Young; Bubnell, Jaclyn E.; Aquadro, Charles F.

    2015-01-01

    Coevolution between Drosophila and its endosymbiont Wolbachia pipientis has many intriguing aspects. For example, Drosophila ananassae hosts two forms of W. pipientis genomes: One being the infectious bacterial genome and the other integrated into the host nuclear genome. Here, we characterize the infectious and integrated genomes of W. pipientis infecting D. ananassae (wAna), by genome sequencing 15 strains of D. ananassae that have either the infectious or integrated wAna genomes. Results indicate evolutionarily stable maternal transmission for the infectious wAna genome suggesting a relatively long-term coevolution with its host. In contrast, the integrated wAna genome showed pseudogene-like characteristics accumulating many variants that are predicted to have deleterious effects if present in an infectious bacterial genome. Phylogenomic analysis of sequence variation together with genotyping by polymerase chain reaction of large structural variations indicated several wAna variants among the eight infectious wAna genomes. In contrast, only a single wAna variant was found among the seven integrated wAna genomes examined in lines from Africa, south Asia, and south Pacific islands suggesting that the integration occurred once from a single infectious wAna genome and then spread geographically. Further analysis revealed that for all D. ananassae we examined with the integrated wAna genomes, the majority of the integrated wAna genomic regions is represented in at least two copies suggesting a double integration or single integration followed by an integrated genome duplication. The possible evolutionary mechanism underlying the widespread geographical presence of the duplicate integration of the wAna genome is an intriguing question remaining to be answered. PMID:26254486

  8. Transposon-mediated Genome Manipulations in Vertebrates

    PubMed Central

    Ivics, Zoltán; Li, Meng Amy; Mátés, Lajos; Boeke, Jef D.; Bradley, Allan; Izsvák, Zsuzsanna

    2010-01-01

    Transposable elements are segments of DNA with the unique ability to move about in the genome. This inherent feature can be exploited to harness these elements as gene vectors for diverse genome manipulations. Transposon-based genetic strategies have been established in vertebrate species over the last decade, and current progress in this field indicates that transposable elements will serve as indispensable tools in the genetic toolkit of vertebrate models. In particular, transposons can be applied as vectors for somatic and germline transgenesis, and as insertional mutagens in both loss-of-function and gain-of-function forward mutagenesis screens. The major advantage of using transposons as genetic tools is that they facilitate analysis of gene function in an easy, controlled and scalable manner. Transposon-based technologies are beginning to be exploited to link sequence information to gene functions in vertebrate models. In this article, we provide an overview of transposon-based methods used in vertebrate model organisms, and highlight the most important considerations concerning genetic applications of the transposon systems. PMID:19478801

  9. Integrated Genomic Characterization of Endometrial Carcinoma

    PubMed Central

    2013-01-01

    Summary We performed an integrated genomic, transcriptomic, and proteomic characterization of 373 endometrial carcinomas using array- and sequencing-based technologies. Uterine serous tumors and ~25% of high-grade endometrioid tumors have extensive copy number alterations, few DNA methylation changes, low ER/PR levels, and frequent TP53 mutations. Most endometrioid tumors have few copy number alterations or TP53 mutations but frequent mutations in PTEN, CTNNB1, PIK3CA, ARID1A, KRAS and novel mutations in the SWI/SNF gene ARID5B. A subset of endometrioid tumors we identified had a dramatically increased transversion mutation frequency, and newly identified hotspot mutations in POLE. Our results classified endometrial cancers into four categories: POLE ultramutated, microsatellite instability hypermutated, copy number low, and copy number high. Uterine serous carcinomas share genomic features with ovarian serous and basal-like breast carcinomas. We demonstrated that the genomic features of endometrial carcinomas permit a reclassification that may impact post-surgical adjuvant treatment for women with aggressive tumors. PMID:23636398

  10. Site-Specific Integration of Exogenous Genes Using Genome Editing Technologies in Zebrafish.

    PubMed

    Kawahara, Atsuo; Hisano, Yu; Ota, Satoshi; Taimatsu, Kiyohito

    2016-01-01

    The zebrafish (Danio rerio) is an ideal vertebrate model to investigate the developmental molecular mechanism of organogenesis and regeneration. Recent innovation in genome editing technologies, such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR associated protein 9 (Cas9) system, have allowed researchers to generate diverse genomic modifications in whole animals and in cultured cells. The CRISPR/Cas9 and TALEN techniques frequently induce DNA double-strand breaks (DSBs) at the targeted gene, resulting in frameshift-mediated gene disruption. As a useful application of genome editing technology, several groups have recently reported efficient site-specific integration of exogenous genes into targeted genomic loci. In this review, we provide an overview of TALEN- and CRISPR/Cas9-mediated site-specific integration of exogenous genes in zebrafish. PMID:27187373

  11. Site-Specific Integration of Exogenous Genes Using Genome Editing Technologies in Zebrafish

    PubMed Central

    Kawahara, Atsuo; Hisano, Yu; Ota, Satoshi; Taimatsu, Kiyohito

    2016-01-01

    The zebrafish (Danio rerio) is an ideal vertebrate model to investigate the developmental molecular mechanism of organogenesis and regeneration. Recent innovation in genome editing technologies, such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR associated protein 9 (Cas9) system, have allowed researchers to generate diverse genomic modifications in whole animals and in cultured cells. The CRISPR/Cas9 and TALEN techniques frequently induce DNA double-strand breaks (DSBs) at the targeted gene, resulting in frameshift-mediated gene disruption. As a useful application of genome editing technology, several groups have recently reported efficient site-specific integration of exogenous genes into targeted genomic loci. In this review, we provide an overview of TALEN- and CRISPR/Cas9-mediated site-specific integration of exogenous genes in zebrafish. PMID:27187373

  12. LINE-1 Retrotransposons: Mediators of Somatic Variation in Neuronal Genomes?

    PubMed Central

    Singer, Tatjana; McConnell, Michael J.; Marchetto, Maria C.N.; Coufal, Nicole G.; Gage, Fred H.

    2010-01-01

    LINE-1 (L1) elements are retrotransposons that insert extra copies of themselves throughout the genome using a “copy and paste” mechanism. L1s have contributed ~20% to total human genome content and are able to influence chromosome integrity and gene expression upon reinsertion. Recent studies show that L1 elements are active and “jumping” during neuronal differentiation. New somatic L1 insertions may generate “genomic plasticity” in neurons by causing variation in genomic DNA sequences and by altering the transcriptome of individual cells. Thus, L1-induced variation may affect neuronal plasticity and behavior. Here, we discuss potential consequences of L1-induced neuronal diversity and propose that a mechanism generating diversity in the brain could broaden the spectrum of behavioral phenotypes that can originate from any single genome. PMID:20471112

  13. Genome integrity, stem cells and hyaluronan

    PubMed Central

    Darzynkiewicz, Zbigniew; Balazs, Endre A.

    2012-01-01

    Faithful preservation of genome integrity is the critical mission of stem cells as well as of germ cells. Reviewed are the following mechanisms involved in protecting DNA in these cells: (a) the efflux machinery that can pump out a variety of genotoxins in an ATP-dependent manner; (b) the mechanisms maintaining minimal metabolic activity, which reduces generation of reactive oxidants, by-products of aerobic respiration; (c) the role of the hypoxic niche of stem cells in providing a gradient of variable oxygen tension; (d) the presence of hyaluronan (HA) and HA receptors on stem cells and in the niche; (e) the role of HA in protecting DNA from oxidative damage; (f) the specific function of HA in protecting DNA in stem cells; (g) the interactions of HA with sperm cells and oocytes that also may shield their DNA from oxidative damage; and (h) mechanisms by which HA exerts its anti-oxidant activity. While HA has a multitude of functions, its anti-oxidant capabilities are often overlooked but may be of significance in preserving the integrity of stem and germ cell genomes. PMID:22383371

  14. MycoCosm, an Integrated Fungal Genomics Resource

    SciTech Connect

    Shabalov, Igor; Grigoriev, Igor

    2012-03-16

    MycoCosm is a web-based interactive fungal genomics resource, which was first released in March 2010, in response to an urgent call from the fungal community for integration of all fungal genomes and analytical tools in one place (Pan-fungal data resources meeting, Feb 21-22, 2010, Alexandria, VA). MycoCosm integrates genomics data and analysis tools to navigate through over 100 fungal genomes sequenced at JGI and elsewhere. This resource allows users to explore fungal genomes in the context of both genome-centric analysis and comparative genomics, and promotes user community participation in data submission, annotation and analysis. MycoCosm has over 4500 unique visitors/month or 35000+ visitors/year as well as hundreds of registered users contributing their data and expertise to this resource. Its scalable architecture allows significant expansion of the data expected from JGI Fungal Genomics Program, its users, and integration with external resources used by fungal community.

  15. Comparative Genome Analysis in the Integrated Microbial Genomes (IMG) System

    SciTech Connect

    Kyrpides, Nikos C.; Markowitz, Victor M.

    2006-03-01

    Comparative genome analysis is critical for the effective exploration of a rapidly growing number of complete and draft sequences for microbial genomes. The Integrated Microbial Genomes (IMG) system (img.jgi.doe.gov) has been developed as a community resource that provides support for comparative analysis of microbial genomes in an integrated context. IMG allows users to navigate the multidimensional microbial genome data space and focus their analysis on a subset of genes, genomes, and functions of interest. IMG provides graphical viewers, summaries and occurrence profile tools for comparing genes, pathways and functions (terms) across specific genomes. Genes can be further examined using gene neighborhoods and compared with sequence alignment tools.

  16. Efficient strategies for TALEN-mediated genome editing in mammalian cell lines.

    PubMed

    Valton, Julien; Cabaniols, Jean-Pierre; Galetto, Romàn; Delacote, Fabien; Duhamel, Marianne; Paris, Sebastien; Blanchard, Domique Alain; Lebuhotel, Céline; Thomas, Séverine; Moriceau, Sandra; Demirdjian, Raffy; Letort, Gil; Jacquet, Adeline; Gariboldi, Annabelle; Rolland, Sandra; Daboussi, Fayza; Juillerat, Alexandre; Bertonati, Claudia; Duclert, Aymeric; Duchateau, Philippe

    2014-09-01

    TALEN is one of the most widely used tools in the field of genome editing. It enables gene integration and gene inactivation in a highly efficient and specific fashion. Although very attractive, the apparent simplicity and high success rate of TALEN could be misleading for novices in the field of gene editing. Depending on the application, specific TALEN designs, activity assessments and screening strategies need to be adopted. Here we report different methods to efficiently perform TALEN-mediated gene integration and inactivation in different mammalian cell systems including induced pluripotent stem cells and delineate experimental examples associated with these approaches. PMID:25047178

  17. LKB1 preserves genome integrity by stimulating BRCA1 expression

    PubMed Central

    Gupta, Romi; Liu, Alex. Y.; Glazer, Peter M.; Wajapeyee, Narendra

    2015-01-01

    Serine/threonine kinase 11 (STK11, also known as LKB1) functions as a tumor suppressor in many human cancers. Paradoxically, however, loss of LKB1 in mouse embryonic fibroblasts results in resistance to oncogene-induced transformation. Therefore, it is unclear why loss of LKB1 leads to increased predisposition to develop a wide variety of cancers. Here, we show that LKB1 protects cells from genotoxic stress. Cells lacking LKB1 display increased sensitivity to irradiation, accumulate more DNA double-strand breaks, display defective homology-directed DNA repair (HDR) and exhibit an increased mutation rate compared with LKB1-expressing cells. Conversely, the ectopic expression of LKB1 in cells lacking LKB1 protects them against genotoxic stress-induced DNA damage and prevents the accumulation of mutations. We find that LKB1 post-transcriptionally stimulates expression of the HDR gene BRCA1 by inhibiting the cytoplasmic localization of the RNA-binding protein HU antigen R in an AMP kinase-dependent manner, and stabilizes BRCA1 mRNA. Cells lacking BRCA1, like cells lacking LKB1, display increased genomic instability, and ectopic expression of BRCA1 rescues LKB1 loss-induced sensitivity to genotoxic stress. Collectively, our results demonstrate that LKB1 is a crucial regulator of genome integrity and reveal a novel mechanism for LKB1-mediated tumor suppression with direct therapeutic implications for cancer prevention. PMID:25488815

  18. CRISPR mediated somatic cell genome engineering in the chicken.

    PubMed

    Véron, Nadège; Qu, Zhengdong; Kipen, Phoebe A S; Hirst, Claire E; Marcelle, Christophe

    2015-11-01

    Gene-targeted knockout technologies are invaluable tools for understanding the functions of genes in vivo. The CRISPR/Cas9 system of RNA-guided genome editing is revolutionizing genetics research in a wide spectrum of organisms. Here, we combined CRISPR with in vivo electroporation in the chicken embryo to efficiently target the transcription factor PAX7 in tissues of the developing embryo. This approach generated mosaic genetic mutations within a wild-type cellular background. This series of proof-of-principle experiments indicates that in vivo CRISPR-mediated cell genome engineering is an effective method to achieve gene loss-of-function in the tissues of the chicken embryo, and it completes the growing genetic toolbox to study the molecular mechanisms regulating development in this important animal model. PMID:26277216

  19. Precise in-frame integration of exogenous DNA mediated by CRISPR/Cas9 system in zebrafish.

    PubMed

    Hisano, Yu; Sakuma, Tetsushi; Nakade, Shota; Ohga, Rie; Ota, Satoshi; Okamoto, Hitoshi; Yamamoto, Takashi; Kawahara, Atsuo

    2015-01-01

    The CRISPR/Cas9 system provides a powerful tool for genome editing in various model organisms, including zebrafish. The establishment of targeted gene-disrupted zebrafish (knockouts) is readily achieved by CRISPR/Cas9-mediated genome modification. Recently, exogenous DNA integration into the zebrafish genome via homology-independent DNA repair was reported, but this integration contained various mutations at the junctions of genomic and integrated DNA. Thus, precise genome modification into targeted genomic loci remains to be achieved. Here, we describe efficient, precise CRISPR/Cas9-mediated integration using a donor vector harbouring short homologous sequences (10-40 bp) flanking the genomic target locus. We succeeded in integrating with high efficiency an exogenous mCherry or eGFP gene into targeted genes (tyrosinase and krtt1c19e) in frame. We found the precise in-frame integration of exogenous DNA without backbone vector sequences when Cas9 cleavage sites were introduced at both sides of the left homology arm, the eGFP sequence and the right homology arm. Furthermore, we confirmed that this precise genome modification was heritable. This simple method enables precise targeted gene knock-in in zebrafish. PMID:25740433

  20. Perspectives of integrative cancer genomics in next generation sequencing era.

    PubMed

    Kwon, So Mee; Cho, Hyunwoo; Choi, Ji Hye; Jee, Byul A; Jo, Yuna; Woo, Hyun Goo

    2012-06-01

    The explosive development of genomics technologies including microarrays and next generation sequencing (NGS) has provided comprehensive maps of cancer genomes, including the expression of mRNAs and microRNAs, DNA copy numbers, sequence variations, and epigenetic changes. These genome-wide profiles of the genetic aberrations could reveal the candidates for diagnostic and/or prognostic biomarkers as well as mechanistic insights into tumor development and progression. Recent efforts to establish the huge cancer genome compendium and integrative omics analyses, so-called "integromics", have extended our understanding on the cancer genome, showing its daunting complexity and heterogeneity. However, the challenges of the structured integration, sharing, and interpretation of the big omics data still remain to be resolved. Here, we review several issues raised in cancer omics data analysis, including NGS, focusing particularly on the study design and analysis strategies. This might be helpful to understand the current trends and strategies of the rapidly evolving cancer genomics research. PMID:23105932

  1. Integration of new alternative reference strain genome sequences into the Saccharomyces genome database.

    PubMed

    Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C; Dalusag, Kyla; Demeter, Janos; Engel, Stacia; Hellerstedt, Sage T; Karra, Kalpana; Hitz, Benjamin C; Nash, Robert S; Paskov, Kelley; Sheppard, Travis; Skrzypek, Marek; Weng, Shuai; Wong, Edith; Michael Cherry, J

    2016-01-01

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. To provide a wider scope of genetic and phenotypic variation in yeast, the genome sequences and their corresponding annotations from 11 alternative S. cerevisiae reference strains have been integrated into SGD. Genomic and protein sequence information for genes from these strains are now available on the Sequence and Protein tab of the corresponding Locus Summary pages. We illustrate how these genome sequences can be utilized to aid our understanding of strain-specific functional and phenotypic differences. Database URL: www.yeastgenome.org. PMID:27252399

  2. Integration of new alternative reference strain genome sequences into the Saccharomyces genome database

    PubMed Central

    Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C.; Dalusag, Kyla; Demeter, Janos; Engel, Stacia; Hellerstedt, Sage T.; Karra, Kalpana; Hitz, Benjamin C.; Nash, Robert S.; Paskov, Kelley; Sheppard, Travis; Skrzypek, Marek; Weng, Shuai; Wong, Edith; Michael Cherry, J.

    2016-01-01

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. To provide a wider scope of genetic and phenotypic variation in yeast, the genome sequences and their corresponding annotations from 11 alternative S. cerevisiae reference strains have been integrated into SGD. Genomic and protein sequence information for genes from these strains are now available on the Sequence and Protein tab of the corresponding Locus Summary pages. We illustrate how these genome sequences can be utilized to aid our understanding of strain-specific functional and phenotypic differences. Database URL: www.yeastgenome.org PMID:27252399

  3. Report from the First Snake Genomics and Integrative Biology Meeting

    PubMed Central

    Castoe, Todd A.; Braun, Edward L.; Bronikowski, Anne M.; Cox, Christian L.; Rabosky, Alison R. Davis; Jason de Koning, A.P.; Dobry, Jason; Fujita, Matthew K.; Giorgianni, Matt W; Hargreaves, Adam; Henkel, Christiaan V.; Mackessy, Stephen P.; O’Meally, Denis; Rokyta, Darin R.; Secor, Stephen M.; Streicher, Jeffrey W.; Wray, Kenneth P.; Yokoyama, Ken D.; Pollock, David D.

    2012-01-01

    This report summarizes the proceedings of the 1st Snake Genomics and Integrative Biology Meeting held in Vail, CO USA, 5-8 October 2011. The meeting had over twenty registered participants, and was conducted as a single session of presentations. Goals of the meeting included coordination of genomic data collection and fostering collaborative interactions among researchers using snakes as model systems. PMID:23451292

  4. Methods for integration site distribution analyses in animal cell genomes

    PubMed Central

    Ciuffi, Angela; Ronen, Keshet; Brady, Troy; Malani, Nirav; Wang, Gary; Berry, Charles C.; Bushman, Frederic D.

    2014-01-01

    The question of where retroviral DNA becomes integrated in chromosomes is important for (i) understanding the mechanisms of viral growth, (ii) devising new anti-retroviral therapy, (iii) understanding how genomes evolve, and (iv) developing safer methods for gene therapy. With the completion of genome sequences for many organisms, it has become possible to study integration targeting by cloning and sequencing large numbers of host–virus DNA junctions, then mapping the host DNA segments back onto the genomic sequence. This allows statistical analysis of the distribution of integration sites relative to the myriad types of genomic features that are also being mapped onto the sequence scaffold. Here we present methods for recovering and analyzing integration site sequences. PMID:19038346
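
The idea of relating mapped integration sites to annotated genomic features can be illustrated with a minimal Python sketch. It is not the authors' method: the function names, the half-open coordinate convention and the toy data are assumptions made for illustration. It simply reports the fraction of integration sites that fall inside a set of feature intervals (for example, annotated genes).

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Merge overlapping half-open (start, end) intervals into a sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def fraction_in_features(sites, features):
    """Fraction of integration sites lying inside any feature interval.

    sites:    {chrom: [position, ...]}
    features: {chrom: [(start, end), ...]}, half-open (assumed convention)
    """
    total = inside = 0
    for chrom, positions in sites.items():
        merged = merge_intervals(features.get(chrom, []))
        starts = [start for start, _ in merged]
        for pos in positions:
            total += 1
            # Locate the last merged interval that starts at or before pos.
            i = bisect_right(starts, pos) - 1
            if i >= 0 and pos < merged[i][1]:
                inside += 1
    return inside / total if total else 0.0


# Toy data, hypothetical coordinates.
genes = {"chr1": [(100, 200), (150, 300)], "chr2": [(0, 10)]}
observed = {"chr1": [50, 120, 250, 300], "chr2": [5], "chr3": [1]}
observed_fraction = fraction_in_features(observed, genes)
```

In practice the same function would typically also be applied to a set of control sites (for example, randomly placed positions) so that the observed fraction can be compared against a background expectation; how such controls are chosen is outside the scope of this sketch.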

  5. Integrated proteomic and genomic analysis of colorectal cancer

    Cancer.gov

    Investigators who analyzed 95 human colorectal tumor samples have determined how gene alterations identified in previous analyses of the same samples are expressed at the protein level. The integration of proteomic and genomic data, or proteogenomics, pro

  6. Integrated Genome-Based Studies of Shewanella Ecophysiology

    SciTech Connect

    Zhou, Jizhong; He, Zhili

    2014-04-08

    As a part of the Shewanella Federation project, we have used integrated genomic, proteomic and computational technologies to study various aspects of energy metabolism of two Shewanella strains from a systems-level perspective.

  7. Applied plant genomics: the secret is integration.

    PubMed

    Osterlund, Mark T; Paterson, Andrew H

    2002-04-01

    Although concerted efforts to understand selected botanical models have been made, the resulting basic knowledge varies in its applicability to other diverse species including the major crops. Recent advances in high-throughput genomics are offering new avenues through which to exploit model systems for the study of botanical diversity, providing prospects for crop improvement. In particular, whole-genome sequencing has provided opportunities for the broader application of reverse genetics, expression profiling, and molecular mapping in diverse species. PMID:11856610

  8. Integrated Microbial Genomes (IMG) System from the DOE Joint Genome Institute (JGI)

    DOE Data Explorer

    The integrated microbial genomes (IMG) system is a data management, analysis and annotation platform for all publicly available genomes. IMG contains both draft and complete JGI microbial genomes integrated with all other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and annotating genomes, genes and functions, individually or in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through quarterly releases. IMG is provided by the DOE-Joint Genome Institute (JGI) and is available from http://img.jgi.doe.gov. [Abstract from The integrated microbial genomes (IMG) system in 2007: data content and analysis tool extensions; Victor M. Markowitz, Ernest Szeto, Krishna Palaniappan, Yuri Grechkin, Ken Chu, I-Min A. Chen, Inna Dubchak, Iain Anderson, Athanasios Lykidis, Konstantinos Mavromatis, Natalia N. Ivanova and Nikos C. Kyrpides; Nucleic Acids Research, 2008, Vol. 36 (Database Issue).] See also the companion system, Integrated Microbial Genomes with Microbiome Samples.

  9. Recombination-mediated genetic engineering of large genomic DNA transgenes.

    PubMed

    Ejsmont, Radoslaw Kamil; Ahlfeld, Peter; Pozniakovsky, Andrei; Stewart, A Francis; Tomancak, Pavel; Sarov, Mihail

    2011-01-01

    Faithful gene activity reporters are a useful tool for evo-devo studies enabling selective introduction of specific loci between species and assaying the activity of large gene regulatory sequences. The use of large genomic constructs such as BACs and fosmids provides an efficient platform for exploration of gene function under endogenous regulatory control. Despite their large size they can be easily engineered using in vivo homologous recombination in Escherichia coli (recombineering). We have previously demonstrated that the efficiency and fidelity of recombineering are sufficient to allow high-throughput transgene engineering in liquid culture, and have successfully applied this approach in several model systems. Here, we present a detailed protocol for recombineering of BAC/fosmid transgenes for expression of fluorescent or affinity tagged proteins in Drosophila under endogenous in vivo regulatory control. The tag coding sequence is seamlessly recombineered into the genomic region contained in the BAC/fosmid clone, which is then integrated into the fly genome using ϕC31 recombination. This protocol can be easily adapted to other recombineering projects. PMID:22065454

  10. CRISPR-Cas9-Mediated Genome Editing in Leishmania donovani

    PubMed Central

    Zhang, Wen-Wei

    2015-01-01

    ABSTRACT The prokaryotic CRISPR (clustered regularly interspaced short palindromic repeat)-Cas9, an RNA-guided endonuclease, has been shown to mediate efficient genome editing in a wide variety of organisms. In the present study, the CRISPR-Cas9 system has been adapted to Leishmania donovani, a protozoan parasite that causes fatal human visceral leishmaniasis. We introduced the Cas9 nuclease into L. donovani and generated guide RNA (gRNA) expression vectors by using the L. donovani rRNA promoter and the hepatitis delta virus (HDV) ribozyme. It is demonstrated within that L. donovani mainly used homology-directed repair (HDR) and microhomology-mediated end joining (MMEJ) to repair the Cas9 nuclease-created double-strand DNA break (DSB). The nonhomologous end-joining (NHEJ) pathway appears to be absent in L. donovani. With this CRISPR-Cas9 system, it was possible to generate knockouts without selection by insertion of an oligonucleotide donor with stop codons and 25-nucleotide homology arms into the Cas9 cleavage site. Likewise, we disrupted and precisely tagged endogenous genes by inserting a bleomycin drug selection marker and GFP gene into the Cas9 cleavage site. With the use of Hammerhead and HDV ribozymes, a double-gRNA expression vector that further improved gene-targeting efficiency was developed, and it was used to make precise deletion of the 3-kb miltefosine transporter gene (LdMT). In addition, this study identified a novel single point mutation caused by CRISPR-Cas9 in LdMT (M381T) that led to miltefosine resistance, a concern for the only available oral antileishmanial drug. Together, these results demonstrate that the CRISPR-Cas9 system represents an effective genome engineering tool for L. donovani. PMID:26199327
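
The knockout strategy described above (an oligonucleotide donor carrying stop codons, flanked by 25-nucleotide homology arms around the Cas9 cleavage site) can be sketched in Python. This is a minimal illustration under stated assumptions, not the published design: the insert sequence `TAGATAGATAG`, the function names and the toy target sequence are hypothetical, and the cleavage position is supplied by the caller.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def build_donor(target, cut_index, insert, arm_len=25):
    """Flank `insert` with arm_len nucleotides from each side of the cut.

    target:    DNA sequence around the cleavage site (5'->3', one strand)
    cut_index: index of the first base downstream of the cut (caller-supplied)
    """
    if cut_index < arm_len or cut_index + arm_len > len(target):
        raise ValueError("cut site too close to the sequence end for these arms")
    left_arm = target[cut_index - arm_len:cut_index]
    right_arm = target[cut_index:cut_index + arm_len]
    return left_arm + insert + right_arm


def frames_with_stop(seq):
    """Return the set of forward reading frames (0, 1, 2) containing a stop codon."""
    return {i % 3 for i in range(len(seq) - 2) if seq[i:i + 3] in STOP_CODONS}


# Hypothetical insert with a stop codon in each forward frame.
STOP_INSERT = "TAGATAGATAG"

# Toy target: 30 A's followed by 30 C's, cut assumed between them.
toy_target = "A" * 30 + "C" * 30
donor = build_donor(toy_target, cut_index=30, insert=STOP_INSERT)
```

Placing a stop codon in every forward frame is one common way to make the disruption independent of the reading frame at the insertion point; whether the original study used this particular arrangement is not stated in the abstract.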

  11. Integrated genomic characterization of IDH1-mutant glioma malignant progression

    PubMed Central

    Bai, Hanwen; Harmanci, Akdes Serin; Erson-Omay, E Zeynep; Li, Jie; Coşkun, Süleyman; Simon, Matthias; Krischek, Boris; Özduman, Koray; Omay, S Bülent; Sorensen, Eric A; Turcan, Şevin; Bakırcığlu, Mehmet; Carrión-Grant, Geneive; Murray, Phillip B; Clark, Victoria E; Ercan-Sencicek, A Gulhan; Knight, James; Sencar, Leman; Altınok, Selin; Kaulen, Leon D; Gülez, Burcu; Timmer, Marco; Schramm, Johannes; Mishra-Gorur, Ketu; Henegariu, Octavian; Moliterno, Jennifer; Louvi, Angeliki; Chan, Timothy A; Tannheimer, Stacey L; Pamir, M Necmettin; Vortmeyer, Alexander O; Bilguvar, Kaya; Yasuno, Katsuhito; Günel, Murat

    2016-01-01

    Gliomas represent approximately 30% of all central nervous system tumors and 80% of malignant brain tumors. To understand the molecular mechanisms underlying the malignant progression of low-grade gliomas with mutations in IDH1 (encoding isocitrate dehydrogenase 1), we studied paired tumor samples from 41 patients, comparing higher-grade, progressed samples to their lower-grade counterparts. Integrated genomic analyses, including whole-exome sequencing and copy number, gene expression and DNA methylation profiling, demonstrated nonlinear clonal expansion of the original tumors and identified oncogenic pathways driving progression. These include activation of the MYC and RTK-RAS-PI3K pathways and upregulation of the FOXM1- and E2F2-mediated cell cycle transitions, as well as epigenetic silencing of developmental transcription factor genes bound by Polycomb repressive complex 2 in human embryonic stem cells. Our results not only provide mechanistic insight into the genetic and epigenetic mechanisms driving glioma progression but also identify inhibition of the bromodomain and extraterminal (BET) family as a potential therapeutic approach. PMID:26618343

  12. Genome Instability Mediates the Loss of Key Traits by Acinetobacter baylyi ADP1 during Laboratory Evolution

    PubMed Central

    Renda, Brian A.; Dasgupta, Aurko; Leon, Dacia

    2014-01-01

    Acinetobacter baylyi ADP1 has the potential to be a versatile bacterial host for synthetic biology because it is naturally transformable. To examine the genetic reliability of this desirable trait and to understand the potential stability of other engineered capabilities, we propagated ADP1 for 1,000 generations of growth in rich nutrient broth and analyzed the genetic changes that evolved by whole-genome sequencing. Substantially reduced transformability and increased cellular aggregation evolved during the experiment. New insertions of IS1236 transposable elements and IS1236-mediated deletions led to these phenotypes in most cases and were common overall among the selected mutations. We also observed a 49-kb deletion of a prophage region that removed an integration site, which has been used for genome engineering, from every evolved genome. The comparatively low rates of these three classes of mutations in lineages that were propagated with reduced selection for 7,500 generations indicate that they increase ADP1 fitness under common laboratory growth conditions. Our results suggest that eliminating transposable elements and other genetic failure modes that affect key organismal traits is essential for improving the reliability of metabolic engineering and genome editing in undomesticated microbial hosts, such as Acinetobacter baylyi ADP1. PMID:25512307

  13. Amplification, Next-generation Sequencing, and Genomic DNA Mapping of Retroviral Integration Sites.

    PubMed

    Serrao, Erik; Cherepanov, Peter; Engelman, Alan N

    2016-01-01

    Retroviruses exhibit signature integration preferences on both the local and global scales. Here, we present a detailed protocol for (1) generation of diverse libraries of retroviral integration sites using ligation-mediated PCR (LM-PCR) amplification and next-generation sequencing (NGS), (2) mapping the genomic location of each virus-host junction using BEDTools, and (3) analyzing the data for statistical relevance. Genomic DNA extracted from infected cells is fragmented by digestion with restriction enzymes or by sonication. After suitable DNA end-repair, double-stranded linkers are ligated onto the DNA ends, and semi-nested PCR is conducted using primers complementary to both the long terminal repeat (LTR) end of the virus and the ligated linker DNA. The PCR primers carry sequences required for DNA clustering during NGS, negating the requirement for separate adapter ligation. Quality control (QC) is conducted to assess DNA fragment size distribution and adapter DNA incorporation prior to NGS. Sequence output files are filtered for LTR-containing reads, and the sequences defining the LTR and the linker are cropped away. Trimmed host cell sequences are mapped to a reference genome using BLAT and are filtered for minimally 97% identity to a unique point in the reference genome. Unique integration sites are scrutinized for adjacent nucleotide (nt) sequence and distribution relative to various genomic features. Using this protocol, integration site libraries of high complexity can be constructed from genomic DNA in three days. The entire protocol that encompasses exogenous viral infection of susceptible tissue culture cells to integration site analysis can therefore be conducted in approximately one to two weeks. Recent applications of this technology pertain to longitudinal analysis of integration sites from HIV-infected patients. PMID:27023428
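    The read-processing steps described in this abstract (keep LTR-containing reads, crop the LTR and linker, then accept only hits with at least 97% identity to a unique genomic position) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy LTR and linker sequences, and the hit dictionaries are hypothetical, and "unique" is simplified here to mean that exactly one hit passes the identity threshold.

```python
def trim_read(read, ltr_end, linker_start):
    """Return the host sequence between the LTR end and the linker.

    Returns None when the read does not begin with the LTR end sequence.
    ltr_end and linker_start are hypothetical placeholder sequences.
    """
    if not read.startswith(ltr_end):
        return None  # not an LTR-containing read, so it is filtered out
    host = read[len(ltr_end):]
    cut = host.find(linker_start)
    if cut != -1:
        host = host[:cut]  # crop the linker and anything after it
    return host


def unique_hit(hits, min_identity=97.0):
    """Return the single hit meeting the identity threshold, else None.

    hits is a list of dicts with keys chrom, pos, strand, identity
    (an assumed representation of parsed aligner output).
    """
    passing = [h for h in hits if h["identity"] >= min_identity]
    return passing[0] if len(passing) == 1 else None


def integration_sites(alignments_per_read):
    """Collapse accepted hits into a set of unique (chrom, pos, strand) sites."""
    sites = set()
    for hits in alignments_per_read:
        hit = unique_hit(hits)
        if hit is not None:
            sites.add((hit["chrom"], hit["pos"], hit["strand"]))
    return sites
```

    In practice the hit lists would come from parsed aligner output (the abstract names BLAT), and the unique sites would then be compared with genomic feature annotations.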

  14. A physical map of the papaya genome with integrated genetic map and genome sequence

    PubMed Central

    2009-01-01

    Background Papaya is a major fruit crop in tropical and subtropical regions worldwide and has primitive sex chromosomes controlling sex determination in this trioecious species. The papaya genome was recently sequenced because of its agricultural importance, unique biological features, and successful application of transgenic papaya for resistance to papaya ringspot virus. As a part of the genome sequencing project, we constructed a BAC-based physical map using a high information-content fingerprinting approach to assist whole genome shotgun sequence assembly. Results The physical map consists of 963 contigs, representing 9.4× genome equivalents, and was integrated with the genetic map and genome sequence using BAC end sequences and a sequence-tagged high-density genetic map. The estimated genome coverage of the physical map is about 95.8%, while 72.4% of the genome was aligned to the genetic map. A total of 1,181 high quality overgo (overlapping oligonucleotide) probes representing conserved sequences in Arabidopsis and genetically mapped loci in Brassica were anchored on the physical map, which provides a foundation for comparative genomics in the Brassicales. The integrated genetic and physical map aligned with the genome sequence revealed recombination hotspots as well as regions suppressed for recombination across the genome, particularly on the recently evolved sex chromosomes. Suppression of recombination spread to the adjacent region of the male specific region of the Y chromosome (MSY), and recombination rates were recovered gradually and then exceeded the genome average. Recombination hotspots were observed at about 10 Mb away on both sides of the MSY, showing 7-fold increase compared with the genome wide average, demonstrating the dynamics of recombination of the sex chromosomes. Conclusion A BAC-based physical map of papaya was constructed and integrated with the genetic map and genome sequence. The integrated map facilitated the draft genome assembly

  15. Roles of DNA helicases in the maintenance of genome integrity

    PubMed Central

    Bochman, Matthew L

    2014-01-01

    Genome integrity is achieved and maintained by the sum of all of the processes in the cell that ensure the faithful duplication and repair of DNA, as well as its genetic transmission from one cell division to the next. As central players in virtually all of the DNA transactions that occur in vivo, DNA helicases (molecular motors that unwind double-stranded DNA to produce single-stranded substrates) represent a crucial enzyme family that is necessary for genomic stability. Indeed, mutations in many human helicase genes are linked to a variety of diseases with symptoms that can be generally described as genomic instability, such as predispositions to cancers. This review focuses on the roles of both DNA replication helicases and recombination/repair helicases in maintaining genome integrity and provides a brief overview of the diseases related to defects in these enzymes. PMID:27308340

  16. Overview of the Integrated Genomic Data system (IGD)

    SciTech Connect

    Hagstrom, R.; Overbeek, R.; Price, M.; Micheals, G.S.; Taylor, R.

    1992-12-01

    In previous work, we developed a database system to support analysis of the E. coli genome. That system provided a pidgin-English query facility, rudimentary pattern-matching capabilities, and the ability to rapidly extract answers to a wide variety of questions about the organization of the E. coli genome. To enable the comparative analysis of the genomes from different species, we have designed and implemented a new prototype database system, called the Integrated Genomic Database (IGD). IGD extends our earlier effort by incorporating a set of curator's tools that facilitate the incorporation of physical and genetic data, together with the results of genome organization analysis, into a common database system. Additional tools for extracting, manipulating, and analyzing data are planned.

  17. Overview of the Integrated Genomic Data system (IGD)

    SciTech Connect

    Hagstrom, R.; Overbeek, R.; Price, M.; Micheals, G.S.; Taylor, R.

    1992-01-01

    In previous work, we developed a database system to support analysis of the E. coli genome. That system provided a pidgin-English query facility, rudimentary pattern-matching capabilities, and the ability to rapidly extract answers to a wide variety of questions about the organization of the E. coli genome. To enable the comparative analysis of the genomes from different species, we have designed and implemented a new prototype database system, called the Integrated Genomic Database (IGD). IGD extends our earlier effort by incorporating a set of curator's tools that facilitate the incorporation of physical and genetic data, together with the results of genome organization analysis, into a common database system. Additional tools for extracting, manipulating, and analyzing data are planned.

  18. Orchidstra: an integrated orchid functional genomics database.

    PubMed

    Su, Chun-lin; Chao, Ya-Ting; Yen, Shao-Hua; Chen, Chun-Yi; Chen, Wan-Chieh; Chang, Yao-Chien Alex; Shih, Ming-Che

    2013-02-01

    A specialized orchid database, named Orchidstra (URL: http://orchidstra.abrc.sinica.edu.tw), has been constructed to collect, annotate and share genomic information for orchid functional genomics studies. The Orchidaceae is a large family of Angiosperms that exhibits extraordinary biodiversity in terms of both the number of species and their distribution worldwide. Orchids exhibit many unique biological features; however, investigation of these traits is currently constrained due to the limited availability of genomic information. Transcriptome information for five orchid species and one commercial hybrid has been included in the Orchidstra database. Altogether, these comprise >380,000 non-redundant orchid transcript sequences, of which >110,000 are protein-coding genes. Sequences from the transcriptome shotgun assembly (TSA) were obtained either from output reads from next-generation sequencing technologies assembled into contigs, or from conventional cDNA library approaches. An annotation pipeline using Gene Ontology, KEGG and Pfam was built to assign gene descriptions and functional annotation to protein-coding genes. Deep sequencing of small RNA was also performed for Phalaenopsis aphrodite to search for microRNAs (miRNAs), extending the information archived for this species to miRNA annotation, precursors and putative target genes. The P. aphrodite transcriptome information was further used to design probes for an oligonucleotide microarray, and expression profiling analysis was carried out. The intensities of hybridized probes derived from microarray assays of various tissues were incorporated into the database as part of the functional evidence. In the future, the content of the Orchidstra database will be expanded with transcriptome data and genomic information from more orchid species. PMID:23324169

  19. Identifying potential cancer driver genes by genomic data integration

    NASA Astrophysics Data System (ADS)

    Chen, Yong; Hao, Jingjing; Jiang, Wei; He, Tong; Zhang, Xuegong; Jiang, Tao; Jiang, Rui

    2013-12-01

    Cancer is a genomic disease associated with a plethora of gene mutations resulting in a loss of control over vital cellular functions. Among these mutated genes, driver genes are defined as being causally linked to oncogenesis, while passenger genes are thought to be irrelevant for cancer development. With increasing numbers of large-scale genomic datasets available, integrating these genomic data to identify driver genes from aberration regions of cancer genomes becomes an important goal of cancer genome analysis and investigations into mechanisms responsible for cancer development. A computational method, MAXDRIVER, is proposed here to identify potential driver genes on the basis of copy number aberration (CNA) regions of cancer genomes, by integrating publicly available human genomic data. MAXDRIVER employs several optimization strategies to construct a heterogeneous network, by means of combining a fused gene functional similarity network, gene-disease associations and a disease phenotypic similarity network. MAXDRIVER was validated to effectively recall known associations among genes and cancers. Previously identified as well as novel driver genes were detected by scanning CNAs of breast cancer, melanoma and liver carcinoma. Three predicted driver genes (CDKN2A, AKT1, RNF139) were found common in these three cancers by comparative analysis.

  20. Mutation Detection with Next-Generation Resequencing through a Mediator Genome

    SciTech Connect

    Wurtzel, Omri; Dori-Bachash, Mally; Pietrokovski, Shmuel; Jurkevitch, Edouard; Sorek, Rotem

    2010-12-20

    The affordability of next generation sequencing (NGS) is transforming the field of mutation analysis in bacteria. The genetic basis for phenotype alteration can be identified directly by sequencing the entire genome of the mutant and comparing it to the wild-type (WT) genome, thus identifying acquired mutations. A major limitation for this approach is the need for an a-priori sequenced reference genome for the WT organism, as the short reads of most current NGS approaches usually prohibit de-novo genome assembly. To overcome this limitation we propose a general framework that utilizes the genome of relative organisms as mediators for comparing WT and mutant bacteria. Under this framework, both mutant and WT genomes are sequenced with NGS, and the short sequencing reads are mapped to the mediator genome. Variations between the mutant and the mediator that recur in the WT are ignored, thus pinpointing the differences between the mutant and the WT. To validate this approach we sequenced the genome of Bdellovibrio bacteriovorus 109J, an obligatory bacterial predator, and its prey-independent mutant, and compared both to the mediator species Bdellovibrio bacteriovorus HD100. Although the mutant and the mediator sequences differed in more than 28,000 nucleotide positions, our approach enabled pinpointing the single causative mutation. Experimental validation in 53 additional mutants further established the implicated gene. Our approach extends the applicability of NGS-based mutant analyses beyond the domain of available reference genomes.
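    The core filtering idea in this abstract (ignore variations against the mediator that recur in the WT, and keep what is specific to the mutant) amounts to a set difference. The Python sketch below is illustrative only and is not the authors' software: the variant representation as (position, reference base, alternate base) tuples and the function name are assumptions.

```python
def candidate_mutations(mutant_vs_mediator, wt_vs_mediator):
    """Return variants seen in the mutant but not in the wild type.

    Both arguments are iterables of (position, ref_base, alt_base) tuples,
    each called against the same mediator genome (assumed representation).
    Variants shared by mutant and WT reflect strain-versus-mediator
    divergence rather than acquired mutations, so they are discarded.
    """
    return sorted(set(mutant_vs_mediator) - set(wt_vs_mediator))


# Toy example: three differences against the mediator are shared,
# and one is private to the mutant.
wt = [(120, "A", "G"), (3400, "C", "T"), (78000, "G", "A")]
mutant = wt + [(51234, "T", "C")]
print(candidate_mutations(mutant, wt))
```

    Real data would also need handling of coverage gaps and sequencing errors, which this sketch does not attempt.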

  1. Integrator mediates the biogenesis of enhancer RNAs

    PubMed Central

    Lai, Fan; Gardini, Alessandro; Zhang, Anda; Shiekhattar, Ramin

    2015-01-01

    Integrator is a multi-subunit complex stably associated with the C-terminal domain (CTD) of RNA polymerase II (RNAPII) 1. Integrator is endowed with a core catalytic RNA endonuclease activity, which is required for the 3′-end processing of non-polyadenylated RNAPII-dependent uridylate-rich small nuclear RNA genes (UsnRNAs) 1. Here, we examined the requirement of Integrator in the biogenesis of transcripts derived from distal regulatory elements (enhancers) involved in tissue- and temporal-specific regulation of gene expression 2–5. Integrator is recruited to enhancers and super-enhancers in a stimulus-dependent manner. Functional depletion of Integrator subunits diminishes the signal-dependent induction of eRNAs and abrogates the stimulus-induced enhancer-promoter chromatin looping. Global nuclear run-on and RNAPII profiling reveal a role for Integrator in 3′-end cleavage of eRNA primary transcripts leading to transcriptional termination. In the absence of Integrator, eRNAs remain bound to RNAPII and their primary transcripts accumulate. Importantly, the induction of eRNAs and gene expression responsiveness require the catalytic activity of the Integrator complex. We propose a role for Integrator in biogenesis of eRNAs and enhancer function in metazoans. PMID:26308897

  2. Integrated genomic characterization of papillary thyroid carcinoma.

    PubMed

    2014-10-23

    Papillary thyroid carcinoma (PTC) is the most common type of thyroid cancer. Here, we describe the genomic landscape of 496 PTCs. We observed a low frequency of somatic alterations (relative to other carcinomas) and extended the set of known PTC driver alterations to include EIF1AX, PPM1D, and CHEK2 and diverse gene fusions. These discoveries reduced the fraction of PTC cases with unknown oncogenic driver from 25% to 3.5%. Combined analyses of genomic variants, gene expression, and methylation demonstrated that different driver groups lead to different pathologies with distinct signaling and differentiation characteristics. Similarly, we identified distinct molecular subgroups of BRAF-mutant tumors, and multidimensional analyses highlighted a potential involvement of oncomiRs in less-differentiated subgroups. Our results propose a reclassification of thyroid cancers into molecular subtypes that better reflect their underlying signaling and differentiation properties, which has the potential to improve their pathological classification and better inform the management of the disease. PMID:25417114

  3. Integrated Genomic Characterization of Papillary Thyroid Carcinoma

    PubMed Central

    Agrawal, Nishant; Akbani, Rehan; Aksoy, B. Arman; Ally, Adrian; Arachchi, Harindra; Asa, Sylvia L.; Auman, J. Todd; Balasundaram, Miruna; Balu, Saianand; Baylin, Stephen B.; Behera, Madhusmita; Bernard, Brady; Beroukhim, Rameen; Bishop, Justin A.; Black, Aaron D.; Bodenheimer, Tom; Boice, Lori; Bootwalla, Moiz S.; Bowen, Jay; Bowlby, Reanne; Bristow, Christopher A.; Brookens, Robin; Brooks, Denise; Bryant, Robert; Buda, Elizabeth; Butterfield, Yaron S.N.; Carling, Tobias; Carlsen, Rebecca; Carter, Scott L.; Carty, Sally E.; Chan, Timothy A.; Chen, Amy Y.; Cherniack, Andrew D.; Cheung, Dorothy; Chin, Lynda; Cho, Juok; Chu, Andy; Chuah, Eric; Cibulskis, Kristian; Ciriello, Giovanni; Clarke, Amanda; Clayman, Gary L.; Cope, Leslie; Copland, John; Covington, Kyle; Danilova, Ludmila; Davidsen, Tanja; Demchok, John A.; DiCara, Daniel; Dhalla, Noreen; Dhir, Rajiv; Dookran, Sheliann S.; Dresdner, Gideon; Eldridge, Jonathan; Eley, Greg; El-Naggar, Adel K.; Eng, Stephanie; Fagin, James A.; Fennell, Timothy; Ferris, Robert L.; Fisher, Sheila; Frazer, Scott; Frick, Jessica; Gabriel, Stacey B.; Ganly, Ian; Gao, Jianjiong; Garraway, Levi A.; Gastier-Foster, Julie M.; Getz, Gad; Gehlenborg, Nils; Ghossein, Ronald; Gibbs, Richard A.; Giordano, Thomas J.; Gomez-Hernandez, Karen; Grimsby, Jonna; Gross, Benjamin; Guin, Ranabir; Hadjipanayis, Angela; Harper, Hollie A.; Hayes, D. Neil; Heiman, David I.; Herman, James G.; Hoadley, Katherine A.; Hofree, Matan; Holt, Robert A.; Hoyle, Alan P.; Huang, Franklin W.; Huang, Mei; Hutter, Carolyn M.; Ideker, Trey; Iype, Lisa; Jacobsen, Anders; Jefferys, Stuart R.; Jones, Corbin D.; Jones, Steven J.M.; Kasaian, Katayoon; Kebebew, Electron; Khuri, Fadlo R.; Kim, Jaegil; Kramer, Roger; Kreisberg, Richard; Kucherlapati, Raju; Kwiatkowski, David J.; Ladanyi, Marc; Lai, Phillip H.; Laird, Peter W.; Lander, Eric; Lawrence, Michael S.; Lee, Darlene; Lee, Eunjung; Lee, Semin; Lee, William; Leraas, Kristen M.; Lichtenberg, Tara M.; Lichtenstein, Lee; Lin, Pei; Ling, Shiyun; Liu, Jinze; Liu, Wenbin; Liu, Yingchun; LiVolsi, Virginia A.; Lu, Yiling; Ma, Yussanne; Mahadeshwar, Harshad S.; Marra, Marco A.; Mayo, Michael; McFadden, David G.; Meng, Shaowu; Meyerson, Matthew; Mieczkowski, Piotr A.; Miller, Michael; Mills, Gordon; Moore, Richard A.; Mose, Lisle E.; Mungall, Andrew J.; Murray, Bradley A.; Nikiforov, Yuri E.; Noble, Michael S.; Ojesina, Akinyemi I.; Owonikoko, Taofeek K.; Ozenberger, Bradley A.; Pantazi, Angeliki; Parfenov, Michael; Park, Peter J.; Parker, Joel S.; Paull, Evan O.; Pedamallu, Chandra Sekhar; Perou, Charles M.; Prins, Jan F.; Protopopov, Alexei; Ramalingam, Suresh S.; Ramirez, Nilsa C.; Ramirez, Ricardo; Raphael, Benjamin J.; Rathmell, W. Kimryn; Ren, Xiaojia; Reynolds, Sheila M.; Rheinbay, Esther; Ringel, Matthew D.; Rivera, Michael; Roach, Jeffrey; Robertson, A. Gordon; Rosenberg, Mara W.; Rosenthall, Matthew; Sadeghi, Sara; Saksena, Gordon; Sander, Chris; Santoso, Netty; Schein, Jacqueline E.; Schultz, Nikolaus; Schumacher, Steven E.; Seethala, Raja R.; Seidman, Jonathan; Senbabaoglu, Yasin; Seth, Sahil; Sharpe, Samantha; Mills Shaw, Kenna R.; Shen, John P.; Shen, Ronglai; Sherman, Steven; Sheth, Margi; Shi, Yan; Shmulevich, Ilya; Sica, Gabriel L.; Simons, Janae V.; Sipahimalani, Payal; Smallridge, Robert C.; Sofia, Heidi J.; Soloway, Matthew G.; Song, Xingzhi; Sougnez, Carrie; Stewart, Chip; Stojanov, Petar; Stuart, Joshua M.; Tabak, Barbara; Tam, Angela; Tan, Donghui; Tang, Jiabin; Tarnuzzer, Roy; Taylor, Barry S.; Thiessen, Nina; Thorne, Leigh; Thorsson, Vésteinn; Tuttle, R. Michael; Umbricht, Christopher B.; Van Den Berg, David J.; Vandin, Fabio; Veluvolu, Umadevi; Verhaak, Roel G.W.; Vinco, Michelle; Voet, Doug; Walter, Vonn; Wang, Zhining; Waring, Scot; Weinberger, Paul M.; Weinstein, John N.; Weisenberger, Daniel J.; Wheeler, David; Wilkerson, Matthew D.; Wilson, Jocelyn; Williams, Michelle; Winer, Daniel A.; Wise, Lisa; Wu, Junyuan; Xi, Liu; Xu, Andrew W.; Yang, Liming; Yang, Lixing; Zack, Travis I.; Zeiger, Martha A.; Zeng, Dong; Zenklusen, Jean Claude; Zhao, Ni; Zhang, Hailei; Zhang, Jianhua; Zhang, Jiashan (Julia); Zhang, Wei; Zmuda, Erik; Zou, Lihua

    2014-01-01

    Summary Papillary thyroid carcinoma (PTC) is the most common type of thyroid cancer. Here, we describe the genomic landscape of 496 PTCs. We observed a low frequency of somatic alterations (relative to other carcinomas) and extended the set of known PTC driver alterations to include EIF1AX, PPM1D and CHEK2 and diverse gene fusions. These discoveries reduced the fraction of PTC cases with unknown oncogenic driver from 25% to 3.5%. Combined analyses of genomic variants, gene expression, and methylation demonstrated that different driver groups lead to different pathologies with distinct signaling and differentiation characteristics. Similarly, we identified distinct molecular subgroups of BRAF-mutant tumors and multidimensional analyses highlighted a potential involvement of oncomiRs in less-differentiated subgroups. Our results propose a reclassification of thyroid cancers into molecular subtypes that better reflect their underlying signaling and differentiation properties, which has the potential to improve their pathological classification and better inform the management of the disease. PMID:25417114

  4. Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace

    PubMed Central

    Thorvaldsdottir, Helga; Liefeld, Ted; Ocana, Marco; Borges-Rivera, Diego; Pochet, Nathalie; Robinson, James T.; Demchak, Barry; Hull, Tim; Ben-Artzi, Gil; Blankenberg, Daniel; Barber, Galt P.; Lee, Brian T.; Kuhn, Robert M.; Nekrutenko, Anton; Segal, Eran; Ideker, Trey; Reich, Michael; Regev, Aviv; Chang, Howard Y.; Mesirov, Jill P.

    2015-01-01

    Integrative analysis of multiple data types to address complex biomedical questions requires the use of multiple software tools in concert and remains an enormous challenge for most of the biomedical research community. Here we introduce GenomeSpace (http://www.genomespace.org), a cloud-based, cooperative community resource. Seeded as a collaboration of six of the most popular genomics analysis tools, GenomeSpace now supports the streamlined interaction of 20 bioinformatics tools and data resources. To help non-programming users leverage GenomeSpace in integrative analysis, it offers a growing set of ‘recipes’, short workflows involving a few tools and steps to guide investigators through high utility analysis tasks. PMID:26780094

  5. G protein-coupled receptors: extranuclear mediators for the non-genomic actions of steroids.

    PubMed

    Wang, Chen; Liu, Yi; Cao, Ji-Min

    2014-01-01

    Steroid hormones possess two distinct actions, a delayed genomic effect and a rapid non-genomic effect. Rapid steroid-triggered signaling is mediated by specific receptors localized most often to the plasma membrane. The nature of these receptors is of great interest and accumulated data suggest that G protein-coupled receptors (GPCRs) are appealing candidates. Increasing evidence regarding the interaction between steroids and specific membrane proteins, as well as the involvement of G protein and corresponding downstream signaling, has led to identification of physiologically relevant GPCRs as steroid extranuclear receptors. Examples include G protein-coupled receptor 30 (GPR30) for estrogen, membrane progestin receptor for progesterone, G protein-coupled receptor family C group 6 member A (GPRC6A) and zinc transporter member 9 (ZIP9) for androgen, and trace amine associated receptor 1 (TAAR1) for thyroid hormone. These receptor-mediated biological effects have been extended to reproductive development, cardiovascular function, neuroendocrinology and cancer pathophysiology. However, although great progress has been achieved, there are still important questions that need to be answered, including the identities of GPCRs responsible for the remaining steroids (e.g., glucocorticoid), the structural basis of the interaction between steroids and GPCRs, and the integration of extranuclear and nuclear signaling to the final physiological function. Here, we reviewed several significant developments in this field and highlighted a hypothesis that attempts to explain the general interaction between steroids and GPCRs. PMID:25257522

  6. Integration of cancer genomics with treatment selection: from the genome to predictive biomarkers

    PubMed Central

    Ow, Thomas J.; Sandulache, Vlad C.; Skinner, Heath D.; Myers, Jeffrey N.

    2013-01-01

    The field of cancer genomics is rapidly advancing as new technology provides detailed genetic and epigenetic profiling of human cancers. The amount of new data available describing the genetic make-up of tumors is paralleled by rapid advances in drug discovery and molecular therapy currently under investigation to treat these diseases. This review summarizes the challenges and approaches associated with the integration of genomic data into the development of new biomarkers in the management of cancer. PMID:24037788

  7. The genomic basis of vomeronasal-mediated behaviour.

    PubMed

    Ibarra-Soria, Ximena; Levitin, Maria O; Logan, Darren W

    2014-02-01

    The vomeronasal organ (VNO) is a chemosensory subsystem found in the nose of most mammals. It is principally tasked with detecting pheromones and other chemical signals that initiate innate behavioural responses. The VNO expresses subfamilies of vomeronasal receptors (VRs) in a cell-specific manner: each sensory neuron expresses just one or two receptors and silences all the other receptor genes. VR genes vary greatly in number within mammalian genomes, from no functional genes in some primates to many hundreds in rodents. They bind semiochemicals, some of which are also encoded in gene families that are coexpanded in species with correspondingly large VR repertoires. Protein and peptide cues that activate the VNO tend to be expressed in exocrine tissues in sexually dimorphic, and sometimes individually variable, patterns. Few chemical ligand-VR-behaviour relationships have been fully elucidated to date, largely due to technical difficulties in working with large, homologous gene families with high sequence identity. However, analysis of mouse lines with mutations in genes involved in ligand-VR signal transduction has revealed that the VNO mediates a range of social behaviours, including male-male and maternal aggression, sexual attraction, lordosis, and selective pregnancy termination, as well as interspecific responses such as avoidance and defensive behaviours. The unusual logic of VR expression now offers an opportunity to map the specific neural circuits that drive these behaviours. PMID:23884334

  8. Integrative analysis of genome-wide RNA interference screens.

    PubMed

    Berndt, Jason D; Biechele, Travis L; Moon, Randall T; Major, Michael B

    2009-01-01

    High-throughput genetic screens have exponentially increased the functional annotation of the genome over the past 10 years. Likewise, genome-scale efforts to map DNA methylation, chromatin state and occupancy, messenger RNA expression patterns, and disease-associated genetic polymorphisms, and proteome-wide efforts to map protein-protein interactions, have also created vast resources of data. An emerging trend involves combining multiple types of data, referred to as integrative screening. Examples include papers that report integrated data generated from large-scale RNA interference screens on the Wnt/beta-catenin pathway with either genotypic or proteomic data in colorectal cancer. These studies demonstrate the power of data integration to generate focused, validated data sets and to identify high-confidence candidate genes for follow-up experiments. We present the ongoing evolution and new strategies for the integrative screening approach with respect to understanding and treating human disease. PMID:19436058

  9. Integrated translational genomics for analysis of complex traits in sorghum

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We will report on the integration of sequencing and genotype data from natural variation (by whole genome resequencing [wgs] or genotype by sequencing [gbs]), transcriptome (RNA-seq) and mutant analysis (also by wgs) with the goal of identifying genes controlling important agronomic traits and tran...

  10. Performing integrative functional genomics analysis in GeneWeaver.org.

    PubMed

    Jay, Jeremy J; Chesler, Elissa J

    2014-01-01

    Functional genomics experiments and analyses give rise to large sets of results, each typically quantifying the relation of molecular entities including genes, gene products, polymorphisms, and other genomic features with biological characteristics or processes. There is tremendous utility and value in using these data in an integrative fashion to find convergent evidence for the role of genes in various processes, to identify functionally similar molecular entities, or to compare processes based on their genomic correlates. However, these gene-centered data are often deposited in diverse and non-interoperable stores. Therefore, integration requires biologists to implement computational algorithms and harmonization of gene identifiers both within and across species. The GeneWeaver web-based software system brings together a large data archive from diverse functional genomics data with a suite of combinatorial tools in an interactive environment. Account management features allow data and results to be shared among user-defined groups. Users can retrieve curated gene set data, upload, store, and share their own experimental results and perform integrative analyses including novel algorithmic approaches for set-set integration of genes and functions. PMID:24233775
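    The two requirements named in this abstract, harmonizing gene identifiers and then comparing gene sets, can be illustrated with a short Python sketch. It does not reproduce GeneWeaver's algorithms, which the abstract does not detail: the identifier mapping table is a hypothetical toy, and the Jaccard index is used only as a generic stand-in for a set-set comparison.

```python
def harmonize(gene_ids, id_map):
    """Map source identifiers onto a common namespace.

    id_map is a hypothetical lookup table (source ID -> common ID);
    identifiers without an entry are dropped.
    """
    return {id_map[g] for g in gene_ids if g in id_map}


def jaccard(a, b):
    """Jaccard index of two sets: |intersection| / |union| (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy mapping that places mouse and human symbols in one namespace.
id_map = {"Trp53": "TP53", "TP53": "TP53",
          "Brca1": "BRCA1", "BRCA1": "BRCA1",
          "Myc": "MYC"}

mouse_set = harmonize(["Trp53", "Brca1", "Myc"], id_map)
human_set = harmonize(["TP53", "BRCA1"], id_map)
print(jaccard(mouse_set, human_set))
```

    The point of the sketch is the ordering: sets from different species or platforms only become comparable once they share an identifier namespace.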

  11. An integrative genomic and proteomic approach to chemosensitivity prediction

    PubMed Central

    Ma, Yan; Ding, Zhenyu; Qian, Yong; Wan, Ying-Wooi; Tosun, Kursad; Shi, Xianglin; Castranova, Vincent; Harner, E. James; Guo, Nancy I.

    2009-01-01

    New computational approaches are needed to integrate both protein expression and gene expression profiles, extending beyond the correlation analyses of gene and protein expression profiles in current practice. Here, we developed an algorithm to classify cell line chemosensitivity based on integrated transcriptional and proteomic profiles. We sought to determine whether a combination of gene and protein expression profiles of untreated cells was able to enhance the performance of chemosensitivity prediction. An integrative feature selection scheme was employed to identify chemosensitivity determinants from genome-wide transcriptional profiles and 52 protein expression levels in 60 human cancer cell lines (the NCI-60). A set of 118 anti-cancer drugs whose mechanisms of action were putatively understood was evaluated. Classifiers of the complete range of drug response (sensitive, intermediate, or resistant) were generated for the evaluated anti-cancer drugs, one for each agent. The classifiers were designed to be independent of the cells' tissue origins. The classification accuracy for all 118 evaluated agents was remarkably better (P<0.001) than would be achieved by chance. Furthermore, 76 out of the 118 classifiers identified from integrated genomic and protein profiles significantly (P<0.05) improved the accuracy of protein expression-based classifiers identified previously. These results demonstrate that our integrated genomic and proteomic approach enhances the performance of chemosensitivity prediction. This study presents a new analytical framework to identify integrated gene and protein expression signatures for predicting cellular behavior and clinical outcome in general. PMID:19082483

  12. Integrated genomic analyses of ovarian carcinoma.

    PubMed

    2011-06-30

    A catalogue of molecular aberrations that cause ovarian cancer is critical for developing and deploying therapies that will improve patients' lives. The Cancer Genome Atlas project has analysed messenger RNA expression, microRNA expression, promoter methylation and DNA copy number in 489 high-grade serous ovarian adenocarcinomas and the DNA sequences of exons from coding genes in 316 of these tumours. Here we report that high-grade serous ovarian cancer is characterized by TP53 mutations in almost all tumours (96%); low prevalence but statistically recurrent somatic mutations in nine further genes including NF1, BRCA1, BRCA2, RB1 and CDK12; 113 significant focal DNA copy number aberrations; and promoter methylation events involving 168 genes. Analyses delineated four ovarian cancer transcriptional subtypes, three microRNA subtypes, four promoter methylation subtypes and a transcriptional signature associated with survival duration, and shed new light on the impact that tumours with BRCA1/2 (BRCA1 or BRCA2) and CCNE1 aberrations have on survival. Pathway analyses suggested that homologous recombination is defective in about half of the tumours analysed, and that NOTCH and FOXM1 signalling are involved in serous ovarian cancer pathophysiology. PMID:21720365

  13. Integrative clinical genomics of advanced prostate cancer

    PubMed Central

    Robinson, Dan; Van Allen, Eliezer M.; Wu, Yi-Mi; Schultz, Nikolaus; Lonigro, Robert J.; Mosquera, Juan-Miguel; Montgomery, Bruce; Taplin, Mary-Ellen; Pritchard, Colin C; Attard, Gerhardt; Beltran, Himisha; Abida, Wassim M.; Bradley, Robert K.; Vinson, Jake; Cao, Xuhong; Vats, Pankaj; Kunju, Lakshmi P.; Hussain, Maha; Feng, Felix Y.; Tomlins, Scott A.; Cooney, Kathleen A.; Smith, David C.; Brennan, Christine; Siddiqui, Javed; Mehra, Rohit; Chen, Yu; Rathkopf, Dana E.; Morris, Michael J.; Solomon, Stephen B.; Durack, Jeremy C.; Reuter, Victor E.; Gopalan, Anuradha; Gao, Jianjiong; Loda, Massimo; Lis, Rosina T.; Bowden, Michaela; Balk, Stephen P.; Gaviola, Glenn; Sougnez, Carrie; Gupta, Manaswi; Yu, Evan Y.; Mostaghel, Elahe A.; Cheng, Heather H.; Mulcahy, Hyojeong; True, Lawrence D.; Plymate, Stephen R.; Dvinge, Heidi; Ferraldeschi, Roberta; Flohr, Penny; Miranda, Susana; Zafeiriou, Zafeiris; Tunariu, Nina; Mateo, Joaquin; Lopez, Raquel Perez; Demichelis, Francesca; Robinson, Brian D.; Schiffman, Marc A.; Nanus, David M.; Tagawa, Scott T.; Sigaras, Alexandros; Eng, Kenneth W.; Elemento, Olivier; Sboner, Andrea; Heath, Elisabeth I.; Scher, Howard I.; Pienta, Kenneth J.; Kantoff, Philip; de Bono, Johann S.; Rubin, Mark A.; Nelson, Peter S.; Garraway, Levi A.; Sawyers, Charles L.; Chinnaiyan, Arul M.

    2015-01-01

    SUMMARY Toward development of a precision medicine framework for metastatic, castration resistant prostate cancer (mCRPC), we established a multi-institutional clinical sequencing infrastructure to conduct prospective whole exome and transcriptome sequencing of bone or soft tissue tumor biopsies from a cohort of 150 mCRPC affected individuals. Aberrations of AR, ETS genes, TP53 and PTEN were frequent (40–60% of cases), with TP53 and AR alterations enriched in mCRPC compared to primary prostate cancer. We identified novel genomic alterations in PIK3CA/B, R-spondin, BRAF/RAF1, APC, β-catenin and ZBTB16/PLZF. Aberrations of BRCA2, BRCA1 and ATM were observed at substantially higher frequencies (19.3% overall) than seen in primary prostate cancers. 89% of affected individuals harbored a clinically actionable aberration including 62.7% with aberrations in AR, 65% in other cancer-related genes, and 8% with actionable pathogenic germline alterations. This cohort study provides evidence that clinical sequencing in mCRPC is feasible and could impact treatment decisions in significant numbers of affected individuals. PMID:26000489

  14. Integrative clinical genomics of advanced prostate cancer.

    PubMed

    Robinson, Dan; Van Allen, Eliezer M; Wu, Yi-Mi; Schultz, Nikolaus; Lonigro, Robert J; Mosquera, Juan-Miguel; Montgomery, Bruce; Taplin, Mary-Ellen; Pritchard, Colin C; Attard, Gerhardt; Beltran, Himisha; Abida, Wassim; Bradley, Robert K; Vinson, Jake; Cao, Xuhong; Vats, Pankaj; Kunju, Lakshmi P; Hussain, Maha; Feng, Felix Y; Tomlins, Scott A; Cooney, Kathleen A; Smith, David C; Brennan, Christine; Siddiqui, Javed; Mehra, Rohit; Chen, Yu; Rathkopf, Dana E; Morris, Michael J; Solomon, Stephen B; Durack, Jeremy C; Reuter, Victor E; Gopalan, Anuradha; Gao, Jianjiong; Loda, Massimo; Lis, Rosina T; Bowden, Michaela; Balk, Stephen P; Gaviola, Glenn; Sougnez, Carrie; Gupta, Manaswi; Yu, Evan Y; Mostaghel, Elahe A; Cheng, Heather H; Mulcahy, Hyojeong; True, Lawrence D; Plymate, Stephen R; Dvinge, Heidi; Ferraldeschi, Roberta; Flohr, Penny; Miranda, Susana; Zafeiriou, Zafeiris; Tunariu, Nina; Mateo, Joaquin; Perez-Lopez, Raquel; Demichelis, Francesca; Robinson, Brian D; Schiffman, Marc; Nanus, David M; Tagawa, Scott T; Sigaras, Alexandros; Eng, Kenneth W; Elemento, Olivier; Sboner, Andrea; Heath, Elisabeth I; Scher, Howard I; Pienta, Kenneth J; Kantoff, Philip; de Bono, Johann S; Rubin, Mark A; Nelson, Peter S; Garraway, Levi A; Sawyers, Charles L; Chinnaiyan, Arul M

    2015-05-21

    Toward development of a precision medicine framework for metastatic, castration-resistant prostate cancer (mCRPC), we established a multi-institutional clinical sequencing infrastructure to conduct prospective whole-exome and transcriptome sequencing of bone or soft tissue tumor biopsies from a cohort of 150 mCRPC affected individuals. Aberrations of AR, ETS genes, TP53, and PTEN were frequent (40%-60% of cases), with TP53 and AR alterations enriched in mCRPC compared to primary prostate cancer. We identified new genomic alterations in PIK3CA/B, R-spondin, BRAF/RAF1, APC, β-catenin, and ZBTB16/PLZF. Moreover, aberrations of BRCA2, BRCA1, and ATM were observed at substantially higher frequencies (19.3% overall) compared to those in primary prostate cancers. 89% of affected individuals harbored a clinically actionable aberration, including 62.7% with aberrations in AR, 65% in other cancer-related genes, and 8% with actionable pathogenic germline alterations. This cohort study provides clinically actionable information that could impact treatment decisions for these affected individuals. PMID:26000489

  15. Integrated Genomic Analyses of Ovarian Carcinoma

    PubMed Central

    2011-01-01

    Summary The Cancer Genome Atlas (TCGA) project has analyzed mRNA expression, miRNA expression, promoter methylation, and DNA copy number in 489 high-grade serous ovarian adenocarcinomas (HGS-OvCa) and the DNA sequences of exons from coding genes in 316 of these tumors. These results show that HGS-OvCa is characterized by TP53 mutations in almost all tumors (96%); low prevalence but statistically recurrent somatic mutations in 9 additional genes including NF1, BRCA1, BRCA2, RB1, and CDK12; 113 significant focal DNA copy number aberrations; and promoter methylation events involving 168 genes. Analyses delineated four ovarian cancer transcriptional subtypes, three miRNA subtypes, four promoter methylation subtypes, a transcriptional signature associated with survival duration and shed new light on the impact on survival of tumors with BRCA1/2 and CCNE1 aberrations. Pathway analyses suggested that homologous recombination is defective in about half of tumors, and that Notch and FOXM1 signaling are involved in serous ovarian cancer pathophysiology. PMID:21720365

  16. Integrating genomic selection into dairy cattle breeding programmes: a review.

    PubMed

    Bouquet, A; Juga, J

    2013-05-01

    Extensive genetic progress has been achieved in dairy cattle populations on many traits of economic importance because of efficient breeding programmes. Success of these programmes has relied on progeny testing of the best young males to accurately assess their genetic merit and hence their potential for breeding. Over the last few years, the integration of dense genomic information into statistical tools used to make selection decisions, commonly referred to as genomic selection, has enabled gains in predicting accuracy of breeding values for young animals without own performance. The possibility to select animals at an early stage allows defining new breeding strategies aimed at boosting genetic progress while reducing costs. The first objective of this article was to review methods used to model and optimize breeding schemes integrating genomic selection and to discuss their relative advantages and limitations. The second objective was to summarize the main results and perspectives on the use of genomic selection in practical breeding schemes, on the basis of the example of dairy cattle populations. Two main designs of breeding programmes integrating genomic selection were studied in dairy cattle. Genomic selection can be used either for pre-selecting males to be progeny tested or for selecting males to be used as active sires in the population. The first option produces moderate genetic gains without changing the structure of breeding programmes. The second option leads to large genetic gains, up to double those of conventional schemes because of a major reduction in the mean generation interval, but it requires greater changes in breeding programme structure. The literature suggests that genomic selection becomes more attractive when it is coupled with embryo transfer technologies to further increase selection intensity on the dam-to-sire pathway. The use of genomic information also offers new opportunities to improve preservation of genetic variation. 
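    The stated link between a shorter generation interval and a larger genetic gain can be illustrated with the standard breeder's equation. The abstract does not give this formula; the symbols below are assumptions introduced here: selection intensity i, accuracy of the estimated breeding values r, additive genetic standard deviation \sigma_A, and mean generation interval L in years.

```latex
\Delta G_{\text{year}} = \frac{i \, r \, \sigma_A}{L},
\qquad
\frac{i \, r \, \sigma_A}{L/2} = 2 \cdot \frac{i \, r \, \sigma_A}{L}
```

    Under the simplifying assumption that i, r and \sigma_A stay unchanged, halving L doubles the annual gain. In practice the accuracy of genomic predictions for young animals may differ from progeny-test accuracy, which is consistent with the abstract's wording of gains "up to double" those of conventional schemes.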

  17. Defining nephrotic syndrome from an integrative genomics perspective.

    PubMed

    Sampson, Matthew G; Hodgin, Jeffrey B; Kretzler, Matthias

    2015-01-01

    Nephrotic syndrome (NS) is a clinical condition with a high degree of morbidity and mortality, caused by failure of the glomerular filtration barrier, resulting in massive proteinuria. Our current diagnostic, prognostic and therapeutic decisions in NS are largely based upon clinical or histological patterns such as "focal segmental glomerulosclerosis" or "steroid sensitive". Yet these descriptive classifications lack the precision to explain the physiologic origins and clinical heterogeneity observed in this syndrome. A more precise definition of NS is required to identify mechanisms of disease and capture various clinical trajectories. An integrative genomics approach to NS applies bioinformatics and computational methods to comprehensive experimental, molecular and clinical data for holistic disease definition. A unique aspect is analysis of data together to discover NS-associated molecules, pathways, and networks. Integrating multidimensional datasets from the outset highlights how molecular lesions impact the entire individual. Data sets integrated range from genetic variation to gene expression, to histologic changes, to progression of chronic kidney disease (CKD). This review will introduce the tenets of integrative genomics and suggest how it can increase our understanding of NS from molecular and pathophysiological perspectives. A diverse group of genome-scale experiments are presented that have sought to define molecular signatures of NS. Finally, the Nephrotic Syndrome Study Network (NEPTUNE) will be introduced as an international, prospective cohort study of patients with NS that utilizes an integrated systems genomics approach from the outset. A major NEPTUNE goal is to achieve comprehensive disease definition from a genomics perspective and identify shared molecular drivers of disease. PMID:24890338

  18. Computational and molecular tools for scalable rAAV-mediated genome editing.

    PubMed

    Stoimenov, Ivaylo; Ali, Muhammad Akhtar; Pandzic, Tatjana; Sjöblom, Tobias

    2015-03-11

    The rapid discovery of potential driver mutations through large-scale mutational analyses of human cancers generates a need to characterize their cellular phenotypes. Among the techniques for genome editing, recombinant adeno-associated virus (rAAV)-mediated gene targeting is suited for knock-in of single nucleotide substitutions and to a lesser degree for gene knock-outs. However, the generation of gene targeting constructs and the targeting process are time-consuming and labor-intensive. To facilitate rAAV-mediated gene targeting, we developed the first software and complementary automation-friendly vector tools to generate optimized targeting constructs for editing human protein-encoding genes. By computational approaches, rAAV constructs for editing ~71% of bases in protein-coding exons were designed. Similarly, ~81% of genes were predicted to be targetable by rAAV-mediated knock-out. A Gateway-based cloning system for facile generation of rAAV constructs suitable for robotic automation was developed and used in successful generation of targeting constructs. Together, these tools enable automated design and generation of rAAV targeting constructs, as well as enrichment and expansion of targeted cells with desired integrations. PMID:25488813

  19. Integrated Genomic Analysis of Pancreatic Ductal Adenocarcinomas Reveals Genomic Rearrangement Events as Significant Drivers of Disease.

    PubMed

    Murphy, Stephen J; Hart, Steven N; Halling, Geoffrey C; Johnson, Sarah H; Smadbeck, James B; Drucker, Travis; Lima, Joema Felipe; Rohakhtar, Fariborz Rakhshan; Harris, Faye R; Kosari, Farhad; Subramanian, Subbaya; Petersen, Gloria M; Wiltshire, Timothy D; Kipp, Benjamin R; Truty, Mark J; McWilliams, Robert R; Couch, Fergus J; Vasmatzis, George

    2016-02-01

    Many somatic mutations have been detected in pancreatic ductal adenocarcinoma (PDAC), leading to the identification of some key drivers of disease progression, but the involvement of large genomic rearrangements has often been overlooked. In this study, we performed mate pair sequencing (MPseq) on genomic DNA from 24 PDAC tumors, including 15 laser-captured microdissected PDAC and 9 patient-derived xenografts, to identify genome-wide rearrangements. Large genomic rearrangements with intragenic breakpoints altering key regulatory genes involved in PDAC progression were detected in all tumors. SMAD4, ZNF521, and FHIT were among the most frequently hit genes. Conversely, commonly reported genes with copy number gains, including MYC and GATA6, were frequently observed in the absence of direct intragenic breakpoints, suggesting a requirement for sustaining oncogenic function during PDAC progression. Integration of data from MPseq, exome sequencing, and transcriptome analysis of primary PDAC cases identified limited overlap in genes affected by both rearrangements and point mutations. However, significant overlap was observed in major PDAC-associated signaling pathways, with all PDAC exhibiting reduced SMAD4 expression, reduced SMAD-dependent TGFβ signaling, and increased WNT and Hedgehog signaling. The frequent loss of SMAD4 and FHIT due to genomic rearrangements strongly implicates these genes as key drivers of PDAC, thus highlighting the strengths of an integrated genomic and transcriptomic approach for identifying mechanisms underlying disease initiation and progression. PMID:26676757

  20. DemaDb: an integrated dematiaceous fungal genomes database

    PubMed Central

    Kuan, Chee Sian; Yew, Su Mei; Chan, Chai Ling; Toh, Yue Fen; Lee, Kok Wei; Cheong, Wei-Hien; Yee, Wai-Yan; Hoh, Chee-Choong; Yap, Soon-Joo; Ng, Kee Peng

    2016-01-01

    Many species of dematiaceous fungi are associated with allergic reactions and potentially fatal diseases in humans, especially in tropical climates. Over the past 10 years, we have isolated more than 400 dematiaceous fungi from various clinical samples. In this study, DemaDb, an integrated database, was designed to support the integration and analysis of dematiaceous fungal genomes. A total of 92 072 putative genes and 6527 pathways that were identified in eight dematiaceous fungi (Bipolaris papendorfii UM 226, Daldinia eschscholtzii UM 1400, D. eschscholtzii UM 1020, Pyrenochaeta unguis-hominis UM 256, Ochroconis mirabilis UM 578, Cladosporium sphaerospermum UM 843, Herpotrichiellaceae sp. UM 238 and Pleosporales sp. UM 1110) were deposited in DemaDb. DemaDb includes functional annotations for all predicted gene models in all genomes, such as Gene Ontology, EuKaryotic Orthologous Groups, Kyoto Encyclopedia of Genes and Genomes (KEGG), Pfam and InterProScan. All predicted protein models were further functionally annotated to Carbohydrate-Active enzymes, peptidases, secondary metabolites and virulence factors. DemaDb Genome Browser enables users to browse and visualize entire genomes with annotation data including gene prediction, structure, orientation and custom feature tracks. The Pathway Browser based on the KEGG pathway database allows users to look into molecular interaction and reaction networks for all KEGG annotated genes. The availability of downloadable files containing assembly, nucleic acid, as well as protein data allows direct retrieval for further downstream work. DemaDb is a useful resource for the fungal research community, especially those involved in genome-scale analysis, functional genomics, genetics and disease studies of dematiaceous fungi. Database URL: http://fungaldb.um.edu.my PMID:26980516

  1. DemaDb: an integrated dematiaceous fungal genomes database.

    PubMed

    Kuan, Chee Sian; Yew, Su Mei; Chan, Chai Ling; Toh, Yue Fen; Lee, Kok Wei; Cheong, Wei-Hien; Yee, Wai-Yan; Hoh, Chee-Choong; Yap, Soon-Joo; Ng, Kee Peng

    2016-01-01

    Many species of dematiaceous fungi are associated with allergic reactions and potentially fatal diseases in humans, especially in tropical climates. Over the past 10 years, we have isolated more than 400 dematiaceous fungi from various clinical samples. In this study, DemaDb, an integrated database, was designed to support the integration and analysis of dematiaceous fungal genomes. A total of 92 072 putative genes and 6527 pathways that were identified in eight dematiaceous fungi (Bipolaris papendorfii UM 226, Daldinia eschscholtzii UM 1400, D. eschscholtzii UM 1020, Pyrenochaeta unguis-hominis UM 256, Ochroconis mirabilis UM 578, Cladosporium sphaerospermum UM 843, Herpotrichiellaceae sp. UM 238 and Pleosporales sp. UM 1110) were deposited in DemaDb. DemaDb includes functional annotations for all predicted gene models in all genomes, such as Gene Ontology, EuKaryotic Orthologous Groups, Kyoto Encyclopedia of Genes and Genomes (KEGG), Pfam and InterProScan. All predicted protein models were further functionally annotated to Carbohydrate-Active enzymes, peptidases, secondary metabolites and virulence factors. DemaDb Genome Browser enables users to browse and visualize entire genomes with annotation data including gene prediction, structure, orientation and custom feature tracks. The Pathway Browser based on the KEGG pathway database allows users to look into molecular interaction and reaction networks for all KEGG annotated genes. The availability of downloadable files containing assembly, nucleic acid, as well as protein data allows direct retrieval for further downstream work. DemaDb is a useful resource for the fungal research community, especially those involved in genome-scale analysis, functional genomics, genetics and disease studies of dematiaceous fungi. Database URL: http://fungaldb.um.edu.my. PMID:26980516

  2. Megx.net: integrated database resource for marine ecological genomics.

    PubMed

    Kottmann, Renzo; Kostadinov, Ivaylo; Duhaime, Melissa Beth; Buttigieg, Pier Luigi; Yilmaz, Pelin; Hankeln, Wolfgang; Waldmann, Jost; Glöckner, Frank Oliver

    2010-01-01

    Megx.net is a database and portal that provides integrated access to georeferenced marker genes, environment data and marine genome and metagenome projects for microbial ecological genomics. All data are stored in the Microbial Ecological Genomics DataBase (MegDB), which is subdivided to hold both sequence and habitat data and global environmental data layers. The extended system provides access to several hundreds of genomes and metagenomes from prokaryotes and phages, as well as over a million small and large subunit ribosomal RNA sequences. With the refined Genes Mapserver, all data can be interactively visualized on a world map and statistics describing environmental parameters can be calculated. Sequence entries have been curated to comply with the proposed minimal standards for genomes and metagenomes (MIGS/MIMS) of the Genomic Standards Consortium. Access to data is facilitated by Web Services. The updated megx.net portal offers microbial ecologists greatly enhanced database content, and new features and tools for data analysis, all of which are freely accessible from our webpage http://www.megx.net. PMID:19858098

  3. Megx.net: integrated database resource for marine ecological genomics

    PubMed Central

    Kottmann, Renzo; Kostadinov, Ivaylo; Duhaime, Melissa Beth; Buttigieg, Pier Luigi; Yilmaz, Pelin; Hankeln, Wolfgang; Waldmann, Jost; Glöckner, Frank Oliver

    2010-01-01

    Megx.net is a database and portal that provides integrated access to georeferenced marker genes, environment data and marine genome and metagenome projects for microbial ecological genomics. All data are stored in the Microbial Ecological Genomics DataBase (MegDB), which is subdivided to hold both sequence and habitat data and global environmental data layers. The extended system provides access to several hundreds of genomes and metagenomes from prokaryotes and phages, as well as over a million small and large subunit ribosomal RNA sequences. With the refined Genes Mapserver, all data can be interactively visualized on a world map and statistics describing environmental parameters can be calculated. Sequence entries have been curated to comply with the proposed minimal standards for genomes and metagenomes (MIGS/MIMS) of the Genomic Standards Consortium. Access to data is facilitated by Web Services. The updated megx.net portal offers microbial ecologists greatly enhanced database content, and new features and tools for data analysis, all of which are freely accessible from our webpage http://www.megx.net. PMID:19858098

  4. GEMINI: Integrative Exploration of Genetic Variation and Genome Annotations

    PubMed Central

    Paila, Umadevi; Chapman, Brad A.; Kirchner, Rory; Quinlan, Aaron R.

    2013-01-01

    Modern DNA sequencing technologies enable geneticists to rapidly identify genetic variation among many human genomes. However, isolating the minority of variants underlying disease remains an important, yet formidable challenge for medical genetics. We have developed GEMINI (GEnome MINIng), a flexible software package for exploring all forms of human genetic variation. Unlike existing tools, GEMINI integrates genetic variation with a diverse and adaptable set of genome annotations (e.g., dbSNP, ENCODE, UCSC, ClinVar, KEGG) into a unified database to facilitate interpretation and data exploration. Whereas other methods provide an inflexible set of variant filters or prioritization methods, GEMINI allows researchers to compose complex queries based on sample genotypes, inheritance patterns, and both pre-installed and custom genome annotations. GEMINI also provides methods for ad hoc queries and data exploration, a simple programming interface for custom analyses that leverage the underlying database, and both command line and graphical tools for common analyses. We demonstrate GEMINI's utility for exploring variation in personal genomes and family based genetic studies, and illustrate its ability to scale to studies involving thousands of human samples. GEMINI is designed for reproducibility and flexibility and our goal is to provide researchers with a standard framework for medical genomics. PMID:23874191
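    The idea of loading variants, genotypes and annotations into one database and then composing a query over all three can be sketched with Python's built-in sqlite3 module. This is not GEMINI's schema, API or query syntax: every table, column, sample name and function below is hypothetical, and the data are made up for illustration.

```python
import sqlite3

def build_db():
    """Create an in-memory database with hypothetical variant and annotation tables."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE variants (id INTEGER PRIMARY KEY, chrom TEXT, pos INTEGER,
                               ref TEXT, alt TEXT, gene TEXT);
        CREATE TABLE genotypes (variant_id INTEGER, sample TEXT, alt_count INTEGER);
        CREATE TABLE annotations (variant_id INTEGER, source TEXT, value TEXT);
    """)
    con.executemany("INSERT INTO variants VALUES (?,?,?,?,?,?)", [
        (1, "chr1", 1000, "A", "G", "GENE_A"),
        (2, "chr2", 2000, "C", "T", "GENE_B"),
    ])
    con.executemany("INSERT INTO genotypes VALUES (?,?,?)", [
        (1, "child", 2), (1, "mother", 1), (1, "father", 1),
        (2, "child", 1), (2, "mother", 0), (2, "father", 1),
    ])
    con.executemany("INSERT INTO annotations VALUES (?,?,?)", [
        (1, "clinical", "pathogenic"), (2, "clinical", "benign"),
    ])
    return con

def recessive_candidates(con, child, parents):
    """Variants homozygous-alt in the child, heterozygous in both parents,
    and annotated as pathogenic."""
    query = """
        SELECT v.chrom, v.pos, v.gene
        FROM variants v
        JOIN genotypes c ON c.variant_id = v.id AND c.sample = ? AND c.alt_count = 2
        JOIN annotations a ON a.variant_id = v.id AND a.value = 'pathogenic'
        WHERE (SELECT COUNT(*) FROM genotypes p
               WHERE p.variant_id = v.id AND p.sample IN (?, ?)
                 AND p.alt_count = 1) = 2
    """
    return con.execute(query, (child, *parents)).fetchall()
```

    Keeping genotypes and annotations in separate tables keyed by variant means a new annotation source can be added without touching the genotype data, which is one plausible reading of the "adaptable set of genome annotations" the abstract describes.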

  5. Brown Planthopper Nudivirus DNA Integrated in Its Host Genome

    PubMed Central

    Cheng, Ruo-Lin; Xi, Yu; Lou, Yi-Han; Wang, Zhuo; Xu, Ji-Yu; Xu, Hai-Jun

    2014-01-01

    ABSTRACT The brown planthopper (BPH), Nilaparvata lugens (Hemiptera:Delphacidae), is one of the most destructive insect pests of rice crops in Asia. Nudivirus-like sequences were identified during the whole-genome sequencing of BPH. PCR examination showed that the virus sequences were present in all of the 22 BPH populations collected from East, Southeast, and South Asia. Thirty-two of the 33 nudivirus core genes were identified, including 20 homologues of baculovirus core genes. In addition, several gene clusters that were arranged collinearly with those of other nudiviruses were found in the partial virus genome. In a phylogenetic tree constructed using the supermatrix method, the original virus was grouped with other nudiviruses and was closely related to polydnavirus. Taken together, these data indicated that the virus sequences belong to a new member of the family Nudiviridae. More specifically, the virus sequences were integrated into the chromosome of its insect host during coevolution. This study is the first report of a large double-stranded circular DNA virus genome in a sap-sucking hemipteran insect. IMPORTANCE This is the first report of a large double-stranded DNA virus integrated genome in the planthopper, a plant sap-sucking hemipteran insect. It is an exciting addition to the evolutionary story of bracoviruses (polydnaviruses), nudiviruses, and baculoviruses. The results on the virus sequences integrated in the chromosomes of its insect host also represent a story of successful coevolution of an invertebrate virus and a plant sap-sucking insect. PMID:24574410

  6. Knowledge integration at the center of genomic medicine.

    PubMed

    Khoury, Muin J; Gwinn, Marta; Dotson, W David; Schully, Sheri D

    2012-07-01

    Three articles in this issue of Genetics in Medicine describe examples of "knowledge integration," involving methods for generating and synthesizing rapidly emerging information on health-related genomic technologies and engaging stakeholders around the evidence. Knowledge integration, the central process in translating genomic research, involves three closely related, iterative components: knowledge management, knowledge synthesis, and knowledge translation. Knowledge management is the ongoing process of obtaining, organizing, and displaying evolving evidence. For example, horizon scanning and "infoveillance" use emerging technologies to scan databases, registries, publications, and cyberspace for information on genomic applications. Knowledge synthesis is the process of conducting systematic reviews using a priori rules of evidence. For example, methods including meta-analysis, decision analysis, and modeling can be used to combine information from basic, clinical, and population research. Knowledge translation refers to stakeholder engagement and brokering to influence policy, guidelines and recommendations, as well as the research agenda to close knowledge gaps. The ultrarapid production of information requires adequate public and private resources for knowledge integration to support the evidence-based development of genomic medicine. PMID:22555656

  7. TALEN-mediated genome engineering to generate targeted mice.

    PubMed

    Sommer, Daniel; Peters, Annika E; Baumgart, Ann-Kathrin; Beyer, Marc

    2015-02-01

    Genetic mouse models are critical for biomedical research to understand gene function and pathophysiology. In recent years, the generation of genetic mouse models has been revolutionized by the emergence of transcription activator-like effector nucleases (TALENs). TALENs are programmable, sequence-specific DNA-binding proteins fused to a non-specific endonuclease domain, used as powerful tools for site-specific induction of DNA double-strand breaks. These breaks result in disruption of the gene product of the targeted locus by mutations induced during repair by error-prone non-homologous end-joining. Alternatively, these DNA double-strand breaks can be exploited to integrate a user-defined sequence by homologous recombination if an appropriate repair plasmid is provided. In this review, we highlight the major technological improvements for genome editing in murine oocytes which have been achieved using TALENs, discuss current limitations of the technology, suggest strategies to broadly apply TALENs, and describe possible future directions to facilitate gene editing in murine oocytes. PMID:25596827

  8. Transgene integration and organization in cotton (Gossypium hirsutum L.) genome.

    PubMed

    Zhang, Jun; Cai, Lin; Cheng, Jiaqin; Mao, Huizhu; Fan, Xiaoping; Meng, Zhaohong; Chan, Ka Man; Zhang, Huijun; Qi, Jianfei; Ji, Lianghui; Hong, Yan

    2008-04-01

    While genetically modified upland cotton (Gossypium hirsutum L.) varieties are ranked among the most successful genetically modified organisms (GMO), there is little knowledge on transgene integration in the cotton genome, partly because of the difficulty in obtaining large numbers of transgenic plants. In this study, we analyzed 139 independently derived T0 transgenic cotton plants transformed by Agrobacterium tumefaciens strain AGL1 carrying a binary plasmid pPZP-GFP. It was found by PCR that as many as 31% of the plants had integration of vector backbone sequences. Of the 110 plants with good genomic Southern blot results, 37% had integration of a single T-DNA, 24% had two T-DNA copies and 39% had three or more copies. Multiple copies of the T-DNA existed either as repeats in complex loci or in unlinked loci. Our further analysis of two T1 populations showed that segregants with a single T-DNA and no vector sequence could be obtained from T0 plants having multiple T-DNA copies and vector sequence. Out of the 57 T-DNA/T-DNA junctions cloned from complex loci, 27 had canonical T-DNA tandem repeats, the rest (30) had deletions to T-DNAs or had inclusion of vector sequences. Overlapping micro-homology was present for most of the T-DNA/T-DNA junctions (38/57). Right border (RB) ends of the T-DNA were precise while most left border (LB) ends (64%) had truncations to internal border sequences. Sequencing of collinear vector integration outside LB in 33 plants gave evidence that collinear vector sequence was determined in Agrobacterium culture. Among the 130 plants with characterized flanking sequences, 12% had the transgene integrated into coding sequences, 12% into repetitive sequences, 7% into rDNAs. Interestingly, 7% had the transgene integrated into chloroplast derived sequences. Nucleotide sequence comparison of target sites in the cotton genome before and after T-DNA integration revealed overlapping microhomology between target sites and the T-DNA (8/8), deletions to

  9. Integrating hospital information systems in healthcare institutions: a mediation architecture.

    PubMed

    El Azami, Ikram; Cherkaoui Malki, Mohammed Ouçamah; Tahon, Christian

    2012-10-01

    Many studies have examined the integration of information systems into healthcare institutions, leading to several standards in the healthcare domain (CORBAmed: Common Object Request Broker Architecture in Medicine; HL7: Health Level Seven International; DICOM: Digital Imaging and Communications in Medicine; and IHE: Integrating the Healthcare Enterprise). Due to the existence of a wide diversity of heterogeneous systems, three essential factors are necessary to fully integrate a system: data, functions and workflow. However, most of the previous studies have dealt with only one or two of these factors and this makes the system integration unsatisfactory. In this paper, we propose a flexible, scalable architecture for Hospital Information Systems (HIS). Our main purpose is to provide a practical solution to ensure HIS interoperability so that healthcare institutions can communicate without being obliged to change their local information systems and without altering the tasks of the healthcare professionals. Our architecture is a mediation architecture with 3 levels: 1) a database level, 2) a middleware level and 3) a user interface level. The mediation is based on two central components: the Mediator and the Adapter. Using the XML format allows us to establish a structured, secured exchange of healthcare data. The notion of medical ontology is introduced to solve semantic conflicts and to unify the language used for the exchange. Our mediation architecture provides an effective, promising model that promotes the integration of hospital information systems that are autonomous, heterogeneous, semantically interoperable and platform-independent. PMID:22086739
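    The roles of the Mediator, the Adapter, the XML exchange format and the ontology can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the authors' architecture: the class interfaces, the field names and the ontology table are all hypothetical, and the local databases are replaced by in-memory dictionaries.

```python
import xml.etree.ElementTree as ET

# Hypothetical ontology: maps local field names to one shared vocabulary.
ONTOLOGY = {"pat_name": "patientName", "nom_patient": "patientName",
            "dob": "birthDate", "date_naissance": "birthDate"}

class Adapter:
    """Wraps one local system and exposes its records as common XML."""
    def __init__(self, system_id, records):
        self.system_id = system_id
        self._records = records  # stand-in for the local database

    def fetch(self, patient_id):
        record = self._records.get(patient_id)
        if record is None:
            return None
        root = ET.Element("patient", id=patient_id, source=self.system_id)
        for local_field, value in record.items():
            # Resolve the semantic conflict by translating the field name.
            ET.SubElement(root, ONTOLOGY.get(local_field, local_field)).text = value
        return root

class Mediator:
    """Routes a query to every registered adapter and merges the answers."""
    def __init__(self):
        self._adapters = []

    def register(self, adapter):
        self._adapters.append(adapter)

    def query(self, patient_id):
        merged = ET.Element("results")
        for adapter in self._adapters:
            element = adapter.fetch(patient_id)
            if element is not None:
                merged.append(element)
        return ET.tostring(merged, encoding="unicode")
```

    In this sketch each local system keeps its own field names and storage; only its Adapter knows them, so adding a hospital means registering one more Adapter rather than changing existing systems.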

  10. Integrated analysis of genome-wide genetic and epigenetic association data for identification of disease mechanisms.

    PubMed

    Ke, Xiayi; Cortina-Borja, Mario; Silva, Bruno Cesar; Lowe, Robert; Rakyan, Vardhman; Balding, David

    2013-11-01

    Many human diseases are multifactorial, involving multiple genetic and environmental factors impacting on one or more biological pathways. Much of the environmental effect is believed to be mediated through epigenetic changes. Although many genome-wide genetic and epigenetic association studies have been conducted for different diseases and traits, it is still far from clear to what extent the genomic loci and biological pathways identified in the genetic and epigenetic studies are shared. There is also a lack of statistical tools to assess these important aspects of disease mechanisms. In the present study, we describe a protocol for the integrated analysis of genome-wide genetic and epigenetic data based on permutation of a sum statistic for the combined effects in a locus or pathway. The method was then applied to published type 1 diabetes (T1D) genome-wide- and epigenome-wide-association studies data to identify genomic loci and biological pathways that are associated with T1D genetically and epigenetically. Through combined analysis, novel loci and pathways were also identified, which could add to our understanding of disease mechanisms of T1D as well as complex diseases in general. PMID:24071862
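
    As a rough illustration of "permutation of a sum statistic", the sketch below sums per-gene genetic and epigenetic scores over a pathway and compares that sum with sums over randomly drawn gene sets of the same size. The abstract does not specify the permutation scheme or the scores, so the gene-set resampling, the function names, and the score values are all assumptions rather than the authors' protocol.

```python
import random
from typing import List, Sequence


def combined_scores(genetic: Sequence[float], epigenetic: Sequence[float]) -> List[float]:
    """Per-gene combined score: genetic plus epigenetic association score."""
    return [g + e for g, e in zip(genetic, epigenetic)]


def permutation_p_value(
    scores: Sequence[float],
    members: Sequence[int],
    n_perm: int = 2000,
    seed: int = 0,
) -> float:
    """Empirical p-value for the sum of scores over the genes in `members`.

    Null distribution: sums over random gene sets of the same size
    (an assumed scheme, chosen only for illustration).
    """
    rng = random.Random(seed)
    observed = sum(scores[i] for i in members)
    all_indices = range(len(scores))
    hits = 0
    for _ in range(n_perm):
        drawn = rng.sample(all_indices, len(members))
        if sum(scores[i] for i in drawn) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

    A pathway whose genes carry high scores in both data types yields a small empirical p-value, while a pathway of background genes yields a value near 1.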

  11. Integrative functional genomic analysis unveils the differing dysregulated metabolic processes across hepatocellular carcinoma stages.

    PubMed

    Ramesh, Vignesh; Ganesan, Kumaresan

    2016-08-15

    Hepatocellular carcinoma (HCC) is a highly heterogeneous disease and the development of targeted therapeutics is still at an early stage. 'Omics'-based genome-wide profiling of the transcriptome, miRNome and proteome is highly useful in identifying the deregulated molecular processes involved in hepatocarcinogenesis. Metabolites and metabolic processes, among the end products and processes of the central dogma, mediate cellular functions. In recent years, metabolomics-based investigations have revealed the major deregulated metabolic processes involved in carcinogenesis. However, the integrative analysis of the holistic metabolic processes with genomics is at an early stage. Since gene-sets are highly useful in assessing biological processes and pathways, we made an attempt to infer the deregulated cellular metabolic processes involved in HCC by employing metabolism-associated gene-set enrichment analysis. Further, the metabolic process enrichment scores were integrated with the transcriptome profiles of HCC. Integrative analysis shows three distinct metabolic deregulations: i) hepatocyte function-related molecular processes involving lipid/fatty acid/bile acid synthesis, ii) inflammatory processes with cytokine, sphingolipid & chondroitin sulphate metabolism and iii) enriched nucleotide metabolic process involving purine/pyrimidine & glucose-mediated catabolic process, in hepatocarcinogenesis. The three distinct metabolic processes were found to occur both in tumor and liver cancer cell line profiles. Unsupervised hierarchical clustering of the metabolic processes along with clinical sample information has identified two major clusters based on AFP (alpha-fetoprotein) and metastasis. The study reveals the three major regulatory processes involved in HCC stages. PMID:27107678

  12. Using biological networks to integrate, visualize and analyze genomics data.

    PubMed

    Charitou, Theodosia; Bryan, Kenneth; Lynn, David J

    2016-01-01

    Network biology is a rapidly developing area of biomedical research and reflects the current view that complex phenotypes, such as disease susceptibility, are not the result of single gene mutations that act in isolation but are rather due to the perturbation of a gene's network context. Understanding the topology of these molecular interaction networks and identifying the molecules that play central roles in their structure and regulation is a key to understanding complex systems. The falling cost of next-generation sequencing is now enabling researchers to routinely catalogue the molecular components of these networks at a genome-wide scale and over a large number of different conditions. In this review, we describe how to use publicly available bioinformatics tools to integrate genome-wide 'omics' data into a network of experimentally-supported molecular interactions. In addition, we describe how to visualize and analyze these networks to identify topological features of likely functional relevance, including network hubs, bottlenecks and modules. We show that network biology provides a powerful conceptual approach to integrate and find patterns in genome-wide genomic data but we also discuss the limitations and caveats of these methods, of which researchers adopting these methods must remain aware. PMID:27036106
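
    One of the topological features named in this abstract, the network hub, can be sketched as a simple degree ranking over an interaction list. The review points to existing bioinformatics tools that the abstract does not name, so this plain-Python version and its function name are only an assumed stand-in, and bottlenecks and modules are not covered here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def degree_hubs(edges: Iterable[Tuple[str, str]], top_n: int = 1) -> List[Tuple[str, int]]:
    """Rank nodes of an undirected interaction network by degree.

    Returns the `top_n` (node, degree) pairs, highest degree first,
    with ties broken alphabetically so the result is deterministic.
    """
    degree: Dict[str, int] = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]
```

    Degree is only the simplest notion of centrality; identifying bottlenecks would require a path-based measure such as betweenness, which this sketch deliberately leaves out.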

  13. PhytoPath: an integrative resource for plant pathogen genomics.

    PubMed

    Pedro, Helder; Maheswari, Uma; Urban, Martin; Irvine, Alistair George; Cuzick, Alayne; McDowall, Mark D; Staines, Daniel M; Kulesha, Eugene; Hammond-Kosack, Kim Elizabeth; Kersey, Paul Julian

    2016-01-01

    PhytoPath (www.phytopathdb.org) is a resource for genomic and phenotypic data from plant pathogen species, that integrates phenotypic data for genes from PHI-base, an expertly curated catalog of genes with experimentally verified pathogenicity, with the Ensembl tools for data visualization and analysis. The resource is focused on fungi, protists (oomycetes) and bacterial plant pathogens that have genomes that have been sequenced and annotated. Genes with associated PHI-base data can be easily identified across all plant pathogen species using a BioMart-based query tool and visualized in their genomic context on the Ensembl genome browser. The PhytoPath resource contains data for 135 genomic sequences from 87 plant pathogen species, and 1364 genes curated for their role in pathogenicity and as targets for chemical intervention. Support for community annotation of gene models is provided using the WebApollo online gene editor, and we are working with interested communities to improve reference annotation for selected species. PMID:26476449

  14. An Integrative Method for Accurate Comparative Genome Mapping

    PubMed Central

    Swidan, Firas; Rocha, Eduardo P. C; Shmoish, Michael; Pinter, Ron Y

    2006-01-01

    We present MAGIC, an integrative and accurate method for comparative genome mapping. Our method consists of two phases: preprocessing for identifying “maximal similar segments,” and mapping for clustering and classifying these segments. MAGIC's main novelty lies in its biologically intuitive clustering approach, which aims towards both calculating reorder-free segments and identifying orthologous segments. In the process, MAGIC efficiently handles ambiguities resulting from duplications that occurred before the speciation of the considered organisms from their most recent common ancestor. We demonstrate both MAGIC's robustness and scalability: the former is asserted with respect to its initial input and with respect to its parameters' values. The latter is asserted by applying MAGIC to distantly related organisms and to large genomes. We compare MAGIC to other comparative mapping methods and provide detailed analysis of the differences between them. Our improvements allow a comprehensive study of the diversity of genetic repertoires resulting from large-scale mutations, such as indels and duplications, including explicitly transposable and phagic elements. The strength of our method is demonstrated by detailed statistics computed for each type of these large-scale mutations. MAGIC enabled us to conduct a comprehensive analysis of the different forces shaping prokaryotic genomes from different clades, and to quantify the importance of novel gene content introduced by horizontal gene transfer relative to gene duplication in bacterial genome evolution. We use these results to investigate the breakpoint distribution in several prokaryotic genomes. PMID:16933978

  15. PhytoPath: an integrative resource for plant pathogen genomics

    PubMed Central

    Pedro, Helder; Maheswari, Uma; Urban, Martin; Irvine, Alistair George; Cuzick, Alayne; McDowall, Mark D.; Staines, Daniel M.; Kulesha, Eugene; Hammond-Kosack, Kim Elizabeth; Kersey, Paul Julian

    2016-01-01

    PhytoPath (www.phytopathdb.org) is a resource for genomic and phenotypic data from plant pathogen species, that integrates phenotypic data for genes from PHI-base, an expertly curated catalog of genes with experimentally verified pathogenicity, with the Ensembl tools for data visualization and analysis. The resource is focused on fungi, protists (oomycetes) and bacterial plant pathogens that have genomes that have been sequenced and annotated. Genes with associated PHI-base data can be easily identified across all plant pathogen species using a BioMart-based query tool and visualized in their genomic context on the Ensembl genome browser. The PhytoPath resource contains data for 135 genomic sequences from 87 plant pathogen species, and 1364 genes curated for their role in pathogenicity and as targets for chemical intervention. Support for community annotation of gene models is provided using the WebApollo online gene editor, and we are working with interested communities to improve reference annotation for selected species. PMID:26476449

  16. Integrative Functional Genomics Implicates EPB41 Dysregulation in Hepatocellular Carcinoma Risk.

    PubMed

    Yang, Xinyu; Yu, Dianke; Ren, Yanli; Wei, Jinyu; Pan, Wenting; Zhou, Changchun; Zhou, Liqing; Liu, Yu; Yang, Ming

    2016-08-01

    Genome-wide association studies (GWASs) have provided many insights into cancer genetics. However, the molecular mechanisms of many susceptibility SNPs defined by GWASs in cancer heritability and in promoting cancer risk remain elusive. New research strategies, including functional evaluations, are warranted to systematically explore truly causal genetic variants. In this study, we developed an integrative functional genomics methodology to identify cancer susceptibility SNPs in transcription factor-binding sites across the whole genome. Employing integration of functional genomic data from c-Myc cistromics, 1000 Genomes, and the TRANSFAC matrix, we successfully annotated 12 SNPs present in the c-Myc cistrome with properties consistent with modulating c-Myc binding affinity in hepatocellular carcinoma (HCC). After genotyping these 12 SNPs in 1,806 HBV-related HCC case subjects and 1,708 control subjects, we identified an HCC susceptibility SNP, rs157224G>T, in Chinese populations (T allele: odds ratio = 1.64, 95% confidence interval = 1.32-2.02; p = 5.2 × 10^-6). This polymorphism leads to HCC predisposition through modifying c-Myc-mediated transcriptional regulation of EPB41, with the risk rs157224T allele showing significantly decreased gene expression. Based on cell proliferation, wound healing, and transwell assays as well as the mouse xenograft model, we identify EPB41 as an HCC susceptibility gene in vitro and in vivo. Consistent with this notion, we note that EPB41 expression is significantly decreased in HCC tissue specimens, especially in portal vein metastasis or intrahepatic metastasis, compared to normal tissues. Our results highlight the involvement of regulatory genetic variants in HCC and provide pathogenic insights of this malignancy via a genome-wide approach. PMID:27453575

  17. Genome-wide analyses of LINE–LINE-mediated nonallelic homologous recombination

    PubMed Central

    Startek, Michał; Szafranski, Przemyslaw; Gambin, Tomasz; Campbell, Ian M.; Hixson, Patricia; Shaw, Chad A.; Stankiewicz, Paweł; Gambin, Anna

    2015-01-01

    Nonallelic homologous recombination (NAHR), occurring between low-copy repeats (LCRs) >10 kb in size and sharing >97% DNA sequence identity, is responsible for the majority of recurrent genomic rearrangements in the human genome. Recent studies have shown that transposable elements (TEs) can also mediate recurrent deletions and translocations, indicating the features of substrates that mediate NAHR may be significantly less stringent than previously believed. Using >4 kb length and >95% sequence identity criteria, we analyzed the genome-wide distribution of long interspersed element (LINE) retrotransposons and their potential to mediate NAHR. We identified 17 005 directly oriented LINE pairs located <10 Mbp from each other as potential NAHR substrates, placing 82.8% of the human genome at risk of LINE–LINE-mediated instability. Cross-referencing these regions with CNVs in the Baylor College of Medicine clinical chromosomal microarray database of 36 285 patients, we identified 516 CNVs potentially mediated by LINEs. Using long-range PCR of five different genomic regions in a total of 44 patients, we confirmed that the CNV breakpoints in each patient map within the LINE elements. To additionally assess the scale of the LINE–LINE/NAHR phenomenon in the human genome, we tested DNA samples from six healthy individuals on a custom aCGH microarray targeting LINE elements predicted to mediate CNVs and identified 25 LINE–LINE rearrangements. Our data indicate that LINE–LINE-mediated NAHR is widespread and under-recognized, and is an important mechanism of structural rearrangement contributing to human genomic variability. PMID:25613453
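
    The selection criteria quoted in this abstract (>4 kb length, >95% identity, direct orientation, <10 Mbp apart) can be written as a small filter. This is a sketch under stated assumptions, not the authors' pipeline: the LineElement type, the requirement that both elements lie on the same chromosome, the choice to measure distance between the inner edges, and the caller-supplied identity function are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, List, Sequence, Tuple

MIN_LENGTH = 4_000           # > 4 kb, from the abstract
MIN_IDENTITY = 0.95          # > 95% sequence identity, from the abstract
MAX_DISTANCE = 10_000_000    # < 10 Mbp apart, from the abstract


@dataclass(frozen=True)
class LineElement:
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"

    @property
    def length(self) -> int:
        return self.end - self.start


def candidate_pairs(
    elements: Sequence[LineElement],
    identity: Callable[[LineElement, LineElement], float],
) -> List[Tuple[LineElement, LineElement]]:
    """Return LINE pairs meeting the length, orientation, distance and identity cut-offs."""
    long_elements = [e for e in elements if e.length > MIN_LENGTH]
    pairs = []
    for a, b in combinations(long_elements, 2):
        if a.chrom != b.chrom or a.strand != b.strand:
            continue  # assumed: same chromosome; same strand = directly oriented
        first, second = sorted((a, b), key=lambda e: e.start)
        if second.start - first.end >= MAX_DISTANCE:
            continue  # assumed: distance measured between inner edges
        if identity(first, second) > MIN_IDENTITY:
            pairs.append((first, second))
    return pairs
```

    The identity function is left to the caller because the abstract does not say how sequence identity was computed; any pairwise alignment score scaled to the range 0 to 1 would fit this interface.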

  18. INTEGRATE: gene fusion discovery using whole genome and transcriptome data

    PubMed Central

    Zhang, Jin; White, Nicole M.; Schmidt, Heather K.; Fulton, Robert S.; Tomlinson, Chad; Warren, Wesley C.; Wilson, Richard K.; Maher, Christopher A.

    2016-01-01

    While next-generation sequencing (NGS) has become the primary technology for discovering gene fusions, we are still faced with the challenge of ensuring that causative mutations are not missed while minimizing false positives. Currently, there are many computational tools that predict structural variations (SV) and gene fusions using whole genome (WGS) and transcriptome sequencing (RNA-seq) data separately. However, as both WGS and RNA-seq have their limitations when used independently, we hypothesize that the orthogonal validation from integrating both data could generate a sensitive and specific approach for detecting high-confidence gene fusion predictions. Fortunately, decreasing NGS costs have resulted in a growing quantity of patients with both data available. Therefore, we developed a gene fusion discovery tool, INTEGRATE, that leverages both RNA-seq and WGS data to reconstruct gene fusion junctions and genomic breakpoints by split-read mapping. To evaluate INTEGRATE, we compared it with eight additional gene fusion discovery tools using the well-characterized breast cell line HCC1395 and peripheral blood lymphocytes derived from the same patient (HCC1395BL). The predictions subsequently underwent a targeted validation leading to the discovery of 131 novel fusions in addition to the seven previously reported fusions. Overall, INTEGRATE only missed six out of the 138 validated fusions and had the highest accuracy of the nine tools evaluated. Additionally, we applied INTEGRATE to 62 breast cancer patients from The Cancer Genome Atlas (TCGA) and found multiple recurrent gene fusions including a subset involving estrogen receptor. Taken together, INTEGRATE is a highly sensitive and accurate tool that is freely available for academic use. PMID:26556708

  19. Stacking multiple transgenes at a selected genomic site via repeated recombinase-mediated DNA cassette exchanges.

    PubMed

    Li, Zhongsen; Moon, Bryan P; Xing, Aiqiu; Liu, Zhan-Bin; McCardell, Richard P; Damude, Howard G; Falco, S Carl

    2010-10-01

    Recombinase-mediated DNA cassette exchange (RMCE) has been successfully used to insert transgenes at previously characterized genomic sites in plants. Following the same strategy, groups of transgenes can be stacked to the same site through multiple rounds of RMCE. A gene-silencing cassette, designed to simultaneously silence soybean (Glycine max) genes fatty acid ω-6 desaturase 2 (FAD2) and acyl-acyl carrier protein thioesterase 2 (FATB) to improve oleic acid content, was first inserted by RMCE at a precharacterized genomic site in soybean. Selected transgenic events were subsequently retransformed with the second DNA construct containing a Yarrowia lipolytica diacylglycerol acyltransferase gene (DGAT1) to increase oil content by the enhancement of triacylglycerol biosynthesis and three other genes, a Corynebacterium glutamicum dihydrodipicolinate synthetase gene (DHPS), a barley (Hordeum vulgare) high-lysine protein gene (BHL8), and a truncated soybean cysteine synthase gene (CGS), to improve the contents of the essential amino acids lysine and methionine. Molecular characterization confirmed that the second RMCE successfully stacked the four overexpression cassettes to the previously integrated FAD2-FATB gene-silencing cassette. Phenotypic analyses indicated that all the transgenes expressed expected phenotypes. PMID:20720171

  20. Integrative gene transfer in the truffle Tuber borchii by Agrobacterium tumefaciens-mediated transformation

    PubMed Central

    2014-01-01

    Agrobacterium tumefaciens-mediated transformation is a powerful tool for reverse genetics and functional genomic analysis in a wide variety of plants and fungi. Tuber spp. are ecologically important and gastronomically prized fungi (“truffles”) with a cryptic life cycle, a subterranean habitat and a symbiotic, but also facultative saprophytic lifestyle. The genome of a representative member of this group of fungi has recently been sequenced. However, because of their poor genetic tractability, including transformation, truffles have so far eluded in-depth functional genomic investigations. Here we report that A. tumefaciens can infect Tuber borchii mycelia, thereby conveying its transfer DNA with the production of stably integrated transformants. We constructed two new binary plasmids (pABr1 and pABr3) and tested them as improved transformation vectors using the green fluorescent protein as reporter gene and hygromycin phosphotransferase as selection marker. Transformants were stable for at least 12 months of in vitro culture propagation and, as revealed by TAIL-PCR analysis, integration sites appear to be heterogeneous, with a preference for repeat element-containing genome sites. PMID:24949275

  1. Integrative gene transfer in the truffle Tuber borchii by Agrobacterium tumefaciens-mediated transformation.

    PubMed

    Brenna, Andrea; Montanini, Barbara; Muggiano, Eleonora; Proietto, Marco; Filetici, Patrizia; Ottonello, Simone; Ballario, Paola

    2014-01-01

    Agrobacterium tumefaciens-mediated transformation is a powerful tool for reverse genetics and functional genomic analysis in a wide variety of plants and fungi. Tuber spp. are ecologically important and gastronomically prized fungi ("truffles") with a cryptic life cycle, a subterranean habitat and a symbiotic, but also facultative saprophytic lifestyle. The genome of a representative member of this group of fungi has recently been sequenced. However, because of their poor genetic tractability, including transformation, truffles have so far eluded in-depth functional genomic investigations. Here we report that A. tumefaciens can infect Tuber borchii mycelia, thereby conveying its transfer DNA with the production of stably integrated transformants. We constructed two new binary plasmids (pABr1 and pABr3) and tested them as improved transformation vectors using the green fluorescent protein as reporter gene and hygromycin phosphotransferase as selection marker. Transformants were stable for at least 12 months of in vitro culture propagation and, as revealed by TAIL-PCR analysis, integration sites appear to be heterogeneous, with a preference for repeat element-containing genome sites. PMID:24949275

  2. Evolution of simple sequence repeat-mediated phase variation in bacterial genomes.

    PubMed

    Bayliss, Christopher D; Palmer, Michael E

    2012-09-01

    Mutability as a mechanism for rapid adaptation to environmental challenge is an alluringly simple concept whose apotheosis is realized in simple sequence repeats (SSR). Bacterial genomes of several species contain SSRs with a proven role in adaptation to environmental fluctuations. SSRs are hypermutable and generate reversible mutations in localized regions of bacterial genomes, leading to phase-variable ON/OFF switches in gene expression. The application of genetic, bioinformatic, and mathematical/computational modeling approaches is revolutionizing our current understanding of how genomic molecular forces and environmental factors influence SSR-mediated adaptation and led to the evolution of this mechanism of localized hypermutation in bacterial genomes. PMID:22954215

  3. Construction of an integrated database to support genomic sequence analysis

    SciTech Connect

    Gilbert, W.; Overbeek, R.

    1994-11-01

    The central goal of this project is to develop an integrated database to support comparative analysis of genomes including DNA sequence data, protein sequence data, gene expression data and metabolism data. In developing the logic-based system GenoBase, a broader integration of available data was achieved due to assistance from collaborators. Current goals are to easily include new forms of data as they become available and to easily navigate through the ensemble of objects described within the database. This report comments on progress made in these areas.

  4. The Npl3 hnRNP prevents R-loop-mediated transcription–replication conflicts and genome instability

    PubMed Central

    Santos-Pereira, José M.; Herrero, Ana B.; García-Rubio, María L.; Marín, Antonio; Moreno, Sergio; Aguilera, Andrés

    2013-01-01

    Transcription is a major obstacle for replication fork (RF) progression and a cause of genome instability. Part of this instability is mediated by cotranscriptional R loops, which are believed to increase by suboptimal assembly of the nascent messenger ribonucleoprotein particle (mRNP). However, no clear evidence exists that heterogeneous nuclear RNPs (hnRNPs), the basic mRNP components, prevent R-loop stabilization. Here we show that yeast Npl3, the most abundant RNA-binding hnRNP, prevents R-loop-mediated genome instability. npl3Δ cells show transcription-dependent and R-loop-dependent hyperrecombination and genome-wide replication obstacles as determined by accumulation of the Rrm3 helicase. Such obstacles preferentially occur at long and highly expressed genes, to which Npl3 is preferentially bound in wild-type cells, and are reduced by RNase H1 overexpression. The resulting replication stress confers hypersensitivity to double-strand break-inducing agents. Therefore, our work demonstrates that mRNP factors are critical for genome integrity and opens the option of using them as therapeutic targets in anti-cancer treatment. PMID:24240235

  5. MarinegenomicsDB: an integrated genome viewer for community-based annotation of genomes.

    PubMed

    Koyanagi, Ryo; Takeuchi, Takeshi; Hisata, Kanako; Gyoja, Fuki; Shoguchi, Eiichi; Satoh, Nori; Kawashima, Takeshi

    2013-10-01

    We constructed a web-based genome annotation platform, MarinegenomicsDB, to integrate genome data from various marine organisms including the pearl oyster Pinctada fucata and the coral Acropora digitifera. This newly developed viewer application provides open access to published data and a user-friendly environment for community-based manual gene annotation. Development on a flexible framework enables easy expansion of the website on demand. To date, more than 2000 genes have been annotated using this system. In the future, the website will be expanded to host a wider variety of data, more species, and different types of genome-wide analyses. The website is available at the following URL: http://marinegenomics.oist.jp. PMID:24125644

  6. Precision genome editing in plants via gene targeting and piggyBac-mediated marker excision

    PubMed Central

    Nishizawa-Yokoi, Ayako; Endo, Masaki; Ohtsuki, Namie; Saika, Hiroaki; Toki, Seiichi

    2015-01-01

    Precise genome engineering via homologous recombination (HR)-mediated gene targeting (GT) has become an essential tool in molecular breeding as well as in basic plant science. As HR-mediated GT is an extremely rare event, positive–negative selection has been used extensively in flowering plants to isolate cells in which GT has occurred. In order to utilize GT as a methodology for precision mutagenesis, the positive selectable marker gene should be completely eliminated from the GT locus. Here, we introduce targeted point mutations conferring resistance to herbicide into the rice acetolactate synthase (ALS) gene via GT with subsequent marker excision by piggyBac transposition. Almost all regenerated plants expressing piggyBac transposase contained exclusively targeted point mutations without concomitant re-integration of the transposon, resulting in these progeny showing a herbicide bispyribac sodium (BS)-tolerant phenotype. This approach was also applied successfully to the editing of a microRNA targeting site in the rice cleistogamy 1 gene. Therefore, our approach provides a general strategy for the targeted modification of endogenous genes in plants. PMID:25284193

  7. Integrated Genome-Based Studies of Shewanella Ecophysiology

    SciTech Connect

    Andrei L. Osterman, Ph.D.

    2012-12-17

    Integration of bioinformatics and experimental techniques was applied to mapping and characterization of the key components (pathways, enzymes, transporters, regulators) of the core metabolic machinery in Shewanella oneidensis and related species, with the main focus on metabolic and regulatory pathways involved in utilization of various carbon and energy sources. Among the main accomplishments reflected in ten joint publications with other participants of Shewanella Federation are: (i) A systems-level reconstruction of carbohydrate utilization pathways in the genus Shewanella (19 species). This analysis yielded reconstruction of 18 sugar utilization pathways including 10 novel pathway variants and prediction of > 60 novel protein families of enzymes, transporters and regulators involved in these pathways. Selected functional predictions were verified by focused biochemical and genetic experiments. Observed growth phenotypes were consistent with bioinformatic predictions, providing strong validation of the technology; and (ii) Global genomic reconstruction of transcriptional regulons in 16 Shewanella genomes. The inferred regulatory network includes 82 transcription factors, 8 riboswitches and 6 translational attenuators. Of those, 45 regulons were inferred directly from the genome context analysis, whereas others were propagated from previously characterized regulons in other species. Selected regulatory predictions were experimentally tested. Integration of this analysis with microarray data revealed overall consistency and provided an additional layer of interactions between regulons. All the results were captured in the new database RegPrecise, which is a joint development with the LBNL team. A more detailed analysis of the individual subsystems, pathways and regulons in Shewanella spp. included bioinformatics-based prediction and experimental characterization of: (i) N-Acetylglucosamine catabolic pathway; (ii) Lactate utilization machinery; (iii) Novel Nrt

  8. Tetrahymena functional genomics database (TetraFGD): an integrated resource for Tetrahymena functional genomics.

    PubMed

    Xiong, Jie; Lu, Yuming; Feng, Jinmei; Yuan, Dongxia; Tian, Miao; Chang, Yue; Fu, Chengjie; Wang, Guangying; Zeng, Honghui; Miao, Wei

    2013-01-01

    The ciliated protozoan Tetrahymena thermophila is a useful unicellular model organism for studies of eukaryotic cellular and molecular biology. Research on T. thermophila has contributed to a series of remarkable basic biological principles. Since the macronuclear genome was sequenced, substantial progress has been made in functional genomics research on T. thermophila, including genome-wide microarray analysis of the T. thermophila life cycle, a T. thermophila gene network analysis based on the microarray data and transcriptome analysis by deep RNA sequencing. To meet the growing demands of the Tetrahymena research community, we integrated these data to provide a public access database: Tetrahymena functional genomics database (TetraFGD). TetraFGD contains three major resources, including the RNA-Seq transcriptome, microarray and gene networks. The RNA-Seq data define gene structures and transcriptome, with special emphasis on exon-intron boundaries; the microarray data describe gene expression of 20 time points during three major stages of the T. thermophila life cycle; the gene network data identify potential gene-gene interactions of 15 049 genes. The TetraFGD provides user-friendly search functions that assist researchers in accessing gene models, transcripts, gene expression data and gene-gene relationships. In conclusion, the TetraFGD is an important functional genomic resource for researchers who focus on Tetrahymena or other ciliates. Database URL: http://tfgd.ihb.ac.cn/ PMID:23482072

  9. Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace.

    PubMed

    Qu, Kun; Garamszegi, Sara; Wu, Felix; Thorvaldsdottir, Helga; Liefeld, Ted; Ocana, Marco; Borges-Rivera, Diego; Pochet, Nathalie; Robinson, James T; Demchak, Barry; Hull, Tim; Ben-Artzi, Gil; Blankenberg, Daniel; Barber, Galt P; Lee, Brian T; Kuhn, Robert M; Nekrutenko, Anton; Segal, Eran; Ideker, Trey; Reich, Michael; Regev, Aviv; Chang, Howard Y; Mesirov, Jill P

    2016-03-01

    Complex biomedical analyses require the use of multiple software tools in concert and remain challenging for much of the biomedical research community. We introduce GenomeSpace (http://www.genomespace.org), a cloud-based, cooperative community resource that currently supports the streamlined interaction of 20 bioinformatics tools and data resources. To facilitate integrative analysis by non-programmers, it offers a growing set of 'recipes', short workflows to guide investigators through high-utility analysis tasks. PMID:26780094

  10. Comparative genomic analysis of integral membrane transport proteins in ciliates.

    PubMed

    Kumar, Ujjwal; Saier, Milton H

    2015-01-01

    Integral membrane transport proteins homologous to those found in the Transporter Classification Database (TCDB; www.tcdb.org) were identified and bioinformatically characterized by transporter class, family, and substrate specificity in three ciliates, Paramecium tetraurelia (Para), Tetrahymena thermophila (Tetra), and Ichthyophthirius multifiliis (Ich). In these three organisms, 1,326 of 39,600 proteins (3.4%), 1,017 of 24,800 proteins (4.2%), and 504 of 8,100 proteins (6.2%) were identified as integral membrane transport proteins, respectively. Thus, an inverse relationship was observed between the % transporters identified and the number of total proteins per genome reported. This surprising observation provides insight into the evolutionary process, giving rise to genome reduction following whole genome duplication (as in the case of Para) or during pathogenic association with a host organism (Ich). Of these transport proteins in Para and Tetra, about 41% were channels (more than any other type of organism studied), 31% were secondary carriers (fewer than most eukaryotes) and 26% were primary active transporters, mostly ATP-hydrolysis driven (more than most other eukaryotes). In Ich, the number of channels was selectively reduced by 66%, relative to Para and Tetra. Para has four times more inorganic anion transporters than Tetra, and Ich has nonselectively lost most of these. Tetra and Ich preferentially transport sugars and monocarboxylates while Para prefers di- and tricarboxylates. These observations serve to characterize the transport proteins of these related ciliates, providing insight into their nutrition and metabolism. PMID:25099884
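
    The inverse relationship stated in this abstract can be checked with simple arithmetic on the quoted counts. The sketch below recomputes the transporter percentage for each organism; the variable and function names are assumptions, and because the proteome totals quoted in the abstract appear to be rounded, the recomputed values can differ from the quoted percentages by about a tenth of a percentage point.

```python
from typing import Dict, Tuple

# (transporters identified, total proteins) as quoted in the abstract
COUNTS: Dict[str, Tuple[int, int]] = {
    "Para": (1326, 39600),
    "Tetra": (1017, 24800),
    "Ich": (504, 8100),
}


def transporter_percent(transporters: int, total: int) -> float:
    """Percentage of the proteome identified as transport proteins."""
    return 100.0 * transporters / total


def percents_by_proteome_size(counts: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Percentages keyed by organism, ordered from largest to smallest proteome."""
    ordered = sorted(counts.items(), key=lambda item: -item[1][1])
    return {name: transporter_percent(t, n) for name, (t, n) in ordered}
```

    Reading the returned dictionary in order shows the percentage rising as the proteome shrinks, which is the inverse relationship the abstract describes.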

  11. An integrated semiconductor device enabling non-optical genome sequencing.

    PubMed

    Rothberg, Jonathan M; Hinz, Wolfgang; Rearick, Todd M; Schultz, Jonathan; Mileski, William; Davey, Mel; Leamon, John H; Johnson, Kim; Milgrew, Mark J; Edwards, Matthew; Hoon, Jeremy; Simons, Jan F; Marran, David; Myers, Jason W; Davidson, John F; Branting, Annika; Nobile, John R; Puc, Bernard P; Light, David; Clark, Travis A; Huber, Martin; Branciforte, Jeffrey T; Stoner, Isaac B; Cawley, Simon E; Lyons, Michael; Fu, Yutao; Homer, Nils; Sedova, Marina; Miao, Xin; Reed, Brian; Sabina, Jeffrey; Feierstein, Erika; Schorn, Michelle; Alanjary, Mohammad; Dimalanta, Eileen; Dressman, Devin; Kasinskas, Rachel; Sokolsky, Tanya; Fidanza, Jacqueline A; Namsaraev, Eugeni; McKernan, Kevin J; Williams, Alan; Roth, G Thomas; Bustillo, James

    2011-07-21

    The seminal importance of DNA sequencing to the life sciences, biotechnology and medicine has driven the search for more scalable and lower-cost solutions. Here we describe a DNA sequencing technology in which scalable, low-cost semiconductor manufacturing techniques are used to make an integrated circuit able to directly perform non-optical DNA sequencing of genomes. Sequence data are obtained by directly sensing the ions produced by template-directed DNA polymerase synthesis using all-natural nucleotides on this massively parallel semiconductor-sensing device or ion chip. The ion chip contains ion-sensitive, field-effect transistor-based sensors in perfect register with 1.2 million wells, which provide confinement and allow parallel, simultaneous detection of independent sequencing reactions. Use of the most widely used technology for constructing integrated circuits, the complementary metal-oxide semiconductor (CMOS) process, allows for low-cost, large-scale production and scaling of the device to higher densities and larger array sizes. We show the performance of the system by sequencing three bacterial genomes, its robustness and scalability by producing ion chips with up to 10 times as many sensors and sequencing a human genome. PMID:21776081

  12. STINGRAY: system for integrated genomic resources and analysis

    PubMed Central

    2014-01-01

    Background The STINGRAY system has been conceived to ease the tasks of integrating, analyzing, annotating and presenting genomic and expression data from Sanger and Next Generation Sequencing (NGS) platforms. Findings STINGRAY includes: (a) a complete and integrated workflow (more than 20 bioinformatics tools) ranging from functional annotation to phylogeny; (b) a MySQL database schema, suitable for data integration and user access control; and (c) a user-friendly graphical web-based interface that makes the system intuitive, facilitating the tasks of data analysis and annotation. Conclusion STINGRAY proved to be an easy-to-use and complete system for analyzing sequencing data. While both Sanger and NGS platforms are supported, the system could be faster using Sanger data, since the large NGS datasets could potentially slow down the MySQL database usage. STINGRAY is available at http://stingray.biowebdb.org and the open source code at http://sourceforge.net/projects/stingray-biowebdb/. PMID:24606808

  13. An integrated BAC/BIBAC-based physical and genetic map of the cotton genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Integrated genome-wide genetic and physical maps are crucial to many aspects of cotton genome research. We report a genome-wide BAC/BIBAC-based physical and genetic map of the upland cotton genome using a high-resolution and high-throughput capillary-based fingerprinting method. The map was constr...

  14. Theobroma cacao: A genetically integrated physical map and genome-scale comparative synteny analysis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A comprehensive integrated genomic framework is considered a centerpiece of genomic research. In collaboration with the USDA-ARS (SHRS) and Mars Inc., the Clemson University Genomics Institute (CUGI) has developed a genetically anchored physical map of the T. cacao genome. Three BAC libraries contai...

  15. Integrating mass spectrometry and genomics for cyanobacterial metabolite discovery.

    PubMed

    Moss, Nathan A; Bertin, Matthew J; Kleigrewe, Karin; Leão, Tiago F; Gerwick, Lena; Gerwick, William H

    2016-03-01

    Filamentous marine cyanobacteria produce bioactive natural products with both potential therapeutic value and capacity to be harmful to human health. Genome sequencing has revealed that cyanobacteria have the capacity to produce many more secondary metabolites than have been characterized. The biosynthetic pathways that encode cyanobacterial natural products are mostly uncharacterized, and lack of cyanobacterial genetic tools has largely prevented their heterologous expression. Hence, a combination of cutting edge and traditional techniques has been required to elucidate their secondary metabolite biosynthetic pathways. Here, we review the discovery and refined biochemical understanding of the olefin synthase and fatty acid ACP reductase/aldehyde deformylating oxygenase pathways to hydrocarbons, and the curacin A, jamaicamide A, lyngbyabellin, columbamide, and a trans-acyltransferase macrolactone pathway encoding phormidolide. We integrate into this discussion the use of genomics, mass spectrometric networking, biochemical characterization, and isolation and structure elucidation techniques. PMID:26578313

  16. [From population genetics to population genomics of forest trees: integrated population genomics approach].

    PubMed

    Krutovskiĭ, K V

    2006-10-01

    Early works by Altukhov and his associates on pine and spruce laid the foundation for Russian population genetic studies on tree species with the use of molecular genetic markers. In recent years, these species have become especially popular as nontraditional eukaryotic models for population and evolutionary genomic research. Tree species with large, cross-pollinating native populations, high genetic and phenotypic variation, growing in diverse environments and affected by environmental changes during hundreds of years of their individual development, are an ideal model for studying the molecular genetic basis of adaptation. The great advance in this field is due to the rapid development of population genomics in the last few years. In the broad sense, population genomics is a novel, fast-developing discipline, combining traditional population genetic approaches with the genomic level of analysis. Thousands of genes with known function and sometimes known genomic localization can be simultaneously studied in many individuals. This opens new prospects for obtaining statistical estimates for a great number of genes and segregating elements. Mating system, gene exchange, reproductive population size, population disequilibrium, interaction among populations, and many other traditional problems of population genetics can be now studied using data on variation in many genes. Moreover, population genomic analysis allows one to distinguish factors that affect individual genes, alleles, or nucleotides (such as, for example, natural selection) from factors affecting the entire genome (e.g., demography). This paper presents a brief review of traditional methods of studying genetic variation in forest tree species and introduces a new, integrated population genomics approach. The main stages of the latter are: (1) selection of genes, which are tentatively involved in variation of adaptive traits, by means of a detailed examination of the regulation and the expression of

  17. TALEN-mediated genome editing: prospects and perspectives

    SciTech Connect

    Wright, DA; Li, T; Yang, B; Spalding, MH

    2014-08-15

    Genome editing is the practice of making predetermined and precise changes to a genome by controlling the location of DNA DSBs (double-strand breaks) and manipulating the cell's repair mechanisms. This technology results from harnessing natural processes that have taken decades and multiple lines of inquiry to understand. Through many false starts and iterative technology advances, the goal of genome editing is just now falling under the control of human hands as a routine and broadly applicable method. The present review attempts to define the technique and capture the discovery process while following its evolution from meganucleases and zinc finger nucleases to the current state of the art: TALEN (transcription-activator-like effector nuclease) technology. We also discuss factors that influence success, technical challenges, and future prospects of this quickly evolving area of study and application.

  18. Frequency and Spectrum of Genomic Integration of Recombinant Adeno-Associated Virus Serotype 8 Vector in Neonatal Mouse Liver

    PubMed Central

    Inagaki, Katsuya; Piao, Chuncheng; Kotchey, Nicole M.; Wu, Xiaolin; Nakai, Hiroyuki

    2008-01-01

    Neonatal injection of recombinant adeno-associated virus serotype 8 (rAAV8) vectors results in widespread transduction in multiple organs and therefore holds promise in neonatal gene therapy. On the other hand, insertional mutagenesis causing liver cancer has been implicated in rAAV-mediated neonatal gene transfer. Here, to better understand rAAV integration in neonatal livers, we investigated the frequency and spectrum of genomic integration of rAAV8 vectors in the liver following intraperitoneal injection of 2.0 × 10^11 vector genomes at birth. This dose was sufficient to transduce a majority of hepatocytes in the neonatal period. In the first approach, we injected mice with a β-galactosidase-expressing vector at birth and quantified rAAV integration events by taking advantage of liver regeneration in a chronic hepatitis animal model and following partial hepatectomy. In the second approach, we performed a new, quantitative rAAV vector genome rescue assay by which we identified rAAV integration sites and quantified integrations. As a result, we find that at least ∼0.05% of hepatocytes contained rAAV integration, while the average copy number of integrated double-stranded vector genome per cell in the liver was ∼0.2, suggesting concatemer integration. Twenty-three of 34 integrations (68%) occurred in genes, but none of them were near the mir-341 locus, the common rAAV integration site found in mouse hepatocellular carcinoma. Thus, rAAV8 vector integration occurs preferentially in genes at a frequency of 1 in approximately 10^3 hepatocytes when a majority of hepatocytes are once transduced in the neonatal period. Further studies are warranted to elucidate the relationship between vector dose and integration frequency or spectrum. PMID:18614641
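
    The summary figures in this abstract follow from simple arithmetic on the reported counts. The sketch below only restates that arithmetic; the variable names are illustrative assumptions and it is not the authors' quantification method.

```python
# Figures quoted in the abstract.
genic_sites = 23            # integrations that fell within genes
total_sites = 34            # all integration sites recovered
min_percent_cells = 0.05    # lower bound: "at least ~0.05% of hepatocytes"

# Share of recovered integration sites that lie in genes.
genic_percent = 100.0 * genic_sites / total_sites

# Convert a percentage of cells into a "1 in N cells" frequency.
cells_per_event = 100.0 / min_percent_cells

print(f"integrations in genes: {genic_percent:.0f}%")
print(f"about 1 integration-positive cell per {cells_per_event:.0f} hepatocytes")
```

    Because the 0.05% figure is stated as a lower bound, the "1 in N" value computed here is an upper bound on N rather than a point estimate.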

  19. Integrated Genome-Based Studies of Shewanella Ecophysiology

    SciTech Connect

    Margrethe H. Serres

    2012-06-29

    Shewanella oneidensis MR-1 is a motile, facultative γ-Proteobacterium with remarkable respiratory versatility; it can utilize a range of organic and inorganic compounds as terminal electron acceptors for anaerobic metabolism. The ability to effectively reduce nitrate, S^0, polyvalent metals and radionuclides has established MR-1 as an important model dissimilatory metal-reducing microorganism for genome-based investigations of biogeochemical transformation of metals and radionuclides that are of concern to the U.S. Department of Energy (DOE) sites nationwide. Metal-reducing bacteria such as Shewanella also have a highly developed capacity for extracellular transfer of respiratory electrons to solid phase Fe and Mn oxides as well as directly to anode surfaces in microbial fuel cells. More broadly, Shewanellae are recognized free-living microorganisms and members of microbial communities involved in the decomposition of organic matter and the cycling of elements in aquatic and sedimentary systems. To function and compete in environments that are subject to spatial and temporal environmental change, Shewanella must be able to sense and respond to such changes and therefore requires relatively robust sensing and regulation systems. The overall goal of this project is to apply the tools of genomics, leveraging the availability of genome sequence for 18 additional strains of Shewanella, to better understand the ecophysiology and speciation of respiratory-versatile members of this important genus. To understand these systems we propose to use genome-based approaches to investigate Shewanella as a system of integrated networks; first describing key cellular subsystems - those involved in signal transduction, regulation, and metabolism - then building towards understanding the function of whole cells and, eventually, cells within populations. As a general approach, this project will employ complementary "top-down" - bioinformatics-based genome functional predictions, high

  20. An integrative computational approach for prioritization of genomic variants.

    PubMed

    Dubchak, Inna; Balasubramanian, Sandhya; Wang, Sheng; Cem, Meydan; Meyden, Cem; Sulakhe, Dinanath; Poliakov, Alexander; Börnigen, Daniela; Xie, Bingqing; Taylor, Andrew; Ma, Jianzhu; Paciorkowski, Alex R; Mirzaa, Ghayda M; Dave, Paul; Agam, Gady; Xu, Jinbo; Al-Gazali, Lihadh; Mason, Christopher E; Ross, M Elizabeth; Maltsev, Natalia; Gilliam, T Conrad

    2014-01-01

    An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. The study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest. PMID:25506935

  1. An Integrative Computational Approach for Prioritization of Genomic Variants

    PubMed Central

    Wang, Sheng; Meyden, Cem; Sulakhe, Dinanath; Poliakov, Alexander; Börnigen, Daniela; Xie, Bingqing; Taylor, Andrew; Ma, Jianzhu; Paciorkowski, Alex R.; Mirzaa, Ghayda M.; Dave, Paul; Agam, Gady; Xu, Jinbo; Al-Gazali, Lihadh; Mason, Christopher E.; Ross, M. Elizabeth; Maltsev, Natalia; Gilliam, T. Conrad

    2014-01-01

    An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. The study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest. PMID:25506935

  2. An integrative computational approach for prioritization of genomic variants

    SciTech Connect

    Dubchak, Inna; Balasubramanian, Sandhya; Wang, Sheng; Meydan, Cem; Sulakhe, Dinanath; Poliakov, Alexander; Börnigen, Daniela; Xie, Bingqing; Taylor, Andrew; Ma, Jianzhu; Paciorkowski, Alex R.; Mirzaa, Ghayda M.; Dave, Paul; Agam, Gady; Xu, Jinbo; Al-Gazali, Lihadh; Mason, Christopher E.; Ross, M. Elizabeth; Maltsev, Natalia; Gilliam, T. Conrad; Huang, Qingyang

    2014-12-15

    An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. The study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest.

  3. An integrative computational approach for prioritization of genomic variants

    DOE PAGESBeta

    Dubchak, Inna; Balasubramanian, Sandhya; Wang, Sheng; Meydan, Cem; Sulakhe, Dinanath; Poliakov, Alexander; Börnigen, Daniela; Xie, Bingqing; Taylor, Andrew; Ma, Jianzhu; et al

    2014-12-15

    An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. The study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest.

  4. Integrative Genomics Identifies Gene Signature Associated with Melanoma Ulceration

    PubMed Central

    Toth, Reka; Vizkeleti, Laura; Hernandez-Vargas, Hector; Lazar, Viktoria; Emri, Gabriella; Szatmari, Istvan; Herceg, Zdenko; Adany, Roza; Balazs, Margit

    2013-01-01

    Background Despite the extensive research approaches applied to characterise malignant melanoma, no specific molecular markers are available that are clearly related to the progression of this disease. In this study, our aims were to define a gene expression signature associated with the clinical outcome of melanoma patients and to provide an integrative interpretation of the gene expression, copy number alteration, and promoter methylation patterns that contribute to clinically relevant molecular functional alterations. Methods Gene expression profiles were determined using the Affymetrix U133 Plus2.0 array. The NimbleGen Human CGH Whole-Genome Tiling array was used to define copy number alterations (CNAs), and the Illumina GoldenGate Methylation platform was applied to characterise the methylation patterns of overlapping genes. Results We identified two subclasses of primary melanoma: one representing patients with better prognoses and the other being characteristic of patients with unfavourable outcomes. We assigned 1,080 genes as being significantly correlated with ulceration; 987 genes were downregulated and significantly enriched in the p53, Nf-kappaB, and WNT/beta-catenin pathways. Through integrated genome analysis, we defined 150 downregulated genes whose expression correlated with copy number losses in ulcerated samples. These genes were significantly enriched on chromosome 6q and 10q, which contained a total of 36 genes. Ten of these genes were downregulated and involved in cell-cell and cell-matrix adhesion or apoptosis. The expression and methylation patterns of additional genes exhibited an inverse correlation, suggesting that transcriptional silencing of these genes is driven by epigenetic events. Conclusion Using an integrative genomic approach, we were able to identify functionally relevant molecular hotspots characterised by copy number losses and promoter hypermethylation in distinct molecular subtypes of melanoma that contribute to specific transcriptomic silencing

  5. A New Approach to Dissect Nuclear Organization: TALE-Mediated Genome Visualization (TGV).

    PubMed

    Miyanari, Yusuke

    2016-01-01

    Spatiotemporal organization of chromatin within the nucleus has so far remained elusive. Live visualization of nuclear remodeling could be a promising approach to understand its functional relevance in genome functions and mechanisms regulating genome architecture. Recent technological advances in live imaging of chromosomes have begun to explore the biological roles of the movement of chromatin within the nucleus. Here I describe a new technique, called TALE-mediated genome visualization (TGV), which allows us to visualize endogenous repetitive sequences, including centromeric, pericentromeric, and telomeric repeats, in living cells. PMID:26443216

  6. Inverted genomic segments and complex triplication rearrangements are mediated by inverted repeats in the human genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We identified complex genomic rearrangements consisting of intermixed duplications and triplications of genomic segments at the MECP2 and PLP1 loci. These complex rearrangements were characterized by a triplicated segment embedded within a duplication in 11 unrelated subjects. Notably, only two brea...

  7. Bilayer-thickness-mediated interactions between integral membrane proteins.

    PubMed

    Kahraman, Osman; Koch, Peter D; Klug, William S; Haselwandter, Christoph A

    2016-04-01

    Hydrophobic thickness mismatch between integral membrane proteins and the surrounding lipid bilayer can produce lipid bilayer thickness deformations. Experiment and theory have shown that protein-induced lipid bilayer thickness deformations can yield energetically favorable bilayer-mediated interactions between integral membrane proteins, and large-scale organization of integral membrane proteins into protein clusters in cell membranes. Within the continuum elasticity theory of membranes, the energy cost of protein-induced bilayer thickness deformations can be captured by considering compression and expansion of the bilayer hydrophobic core, membrane tension, and bilayer bending, resulting in biharmonic equilibrium equations describing the shape of lipid bilayers for a given set of bilayer-protein boundary conditions. Here we develop a combined analytic and numerical methodology for the solution of the equilibrium elastic equations associated with protein-induced lipid bilayer deformations. Our methodology allows accurate prediction of thickness-mediated protein interactions for arbitrary protein symmetries at arbitrary protein separations and relative orientations. We provide exact analytic solutions for cylindrical integral membrane proteins with constant and varying hydrophobic thickness, and develop perturbative analytic solutions for noncylindrical protein shapes. We complement these analytic solutions, and assess their accuracy, by developing both finite element and finite difference numerical solution schemes. We provide error estimates of our numerical solution schemes and systematically assess their convergence properties. Taken together, the work presented here puts into place an analytic and numerical framework which allows calculation of bilayer-mediated elastic interactions between integral membrane proteins for the complicated protein shapes suggested by structural biology and at the small protein separations most relevant for the crowded membrane

  8. Bilayer-thickness-mediated interactions between integral membrane proteins

    NASA Astrophysics Data System (ADS)

    Kahraman, Osman; Koch, Peter D.; Klug, William S.; Haselwandter, Christoph A.

    2016-04-01

    Hydrophobic thickness mismatch between integral membrane proteins and the surrounding lipid bilayer can produce lipid bilayer thickness deformations. Experiment and theory have shown that protein-induced lipid bilayer thickness deformations can yield energetically favorable bilayer-mediated interactions between integral membrane proteins, and large-scale organization of integral membrane proteins into protein clusters in cell membranes. Within the continuum elasticity theory of membranes, the energy cost of protein-induced bilayer thickness deformations can be captured by considering compression and expansion of the bilayer hydrophobic core, membrane tension, and bilayer bending, resulting in biharmonic equilibrium equations describing the shape of lipid bilayers for a given set of bilayer-protein boundary conditions. Here we develop a combined analytic and numerical methodology for the solution of the equilibrium elastic equations associated with protein-induced lipid bilayer deformations. Our methodology allows accurate prediction of thickness-mediated protein interactions for arbitrary protein symmetries at arbitrary protein separations and relative orientations. We provide exact analytic solutions for cylindrical integral membrane proteins with constant and varying hydrophobic thickness, and develop perturbative analytic solutions for noncylindrical protein shapes. We complement these analytic solutions, and assess their accuracy, by developing both finite element and finite difference numerical solution schemes. We provide error estimates of our numerical solution schemes and systematically assess their convergence properties. Taken together, the work presented here puts into place an analytic and numerical framework which allows calculation of bilayer-mediated elastic interactions between integral membrane proteins for the complicated protein shapes suggested by structural biology and at the small protein separations most relevant for the crowded membrane
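
    For orientation, the elastic energy and the biharmonic equilibrium equation referred to in this abstract can be written in a form commonly used in the continuum elasticity literature on bilayer thickness deformations. This is a sketch under stated assumptions, not a formula quoted from the abstract: the symbols u (thickness deformation field of one leaflet), a (half the unperturbed hydrophobic thickness), K_b (bending rigidity), K_t (thickness deformation modulus), and \tau (membrane tension) are introduced here for illustration.

```latex
% Elastic energy: bending + compression/expansion of the hydrophobic core + tension
G = \frac{1}{2} \int \mathrm{d}x \, \mathrm{d}y
    \left\{ K_b \left( \nabla^2 u \right)^2
          + K_t \left( \frac{u}{a} \right)^2
          + \tau \left[ \frac{2u}{a} + \left( \nabla u \right)^2 \right] \right\}

% Setting the first variation of G with respect to u to zero gives
K_b \nabla^4 u - \tau \nabla^2 u + \frac{K_t}{a^2} u + \frac{\tau}{a} = 0

% With the shifted field \bar{u} = u + \tau a / K_t this factorizes as
\left( \nabla^2 - \nu_+ \right) \left( \nabla^2 - \nu_- \right) \bar{u} = 0,
\qquad
\nu_\pm = \frac{1}{2 K_b}
          \left[ \tau \pm \sqrt{ \tau^2 - \frac{4 K_b K_t}{a^2} } \right]
```

    The protein shape enters only through the boundary conditions imposed on u at the bilayer-protein interface, which is why the abstract treats cylindrical and noncylindrical proteins with different solution strategies.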

  9. Potential pitfalls of CRISPR/Cas9-mediated genome editing.

    PubMed

    Peng, Rongxue; Lin, Guigao; Li, Jinming

    2016-04-01

    Recently, a novel technique named the clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein (Cas)9 system has been rapidly developed. This genome editing tool has improved our ability tremendously with respect to exploring the pathogenesis of diseases and correcting disease mutations, as well as phenotypes. With a short guide RNA, Cas9 can be precisely directed to target sites, and functions as an endonuclease to efficiently produce breaks in DNA double strands. Over the past 30 years, CRISPR has evolved from the 'curious sequences of unknown biological function' into a promising genome editing tool. As a result of the incessant development in the CRISPR/Cas9 system, Cas9 co-expressed with custom guide RNAs has been successfully used in a variety of cells and organisms. This genome editing technology can also be applied to synthetic biology, functional genomic screening, transcriptional modulation and gene therapy. However, although CRISPR/Cas9 has a broad range of action in science, there are several aspects that affect its efficiency and specificity, including Cas9 activity, target site selection and short guide RNA design, delivery methods, off-target effects and the incidence of homology-directed repair. In the present review, we highlight the factors that affect the utilization of CRISPR/Cas9, as well as possible strategies for handling any problems. Addressing these issues will allow us to take better advantage of this technique. In addition, we also review the history and rapid development of the CRISPR/Cas system from the time of its initial discovery in 2012. PMID:26535798

  10. The integrated microbial genomes (IMG) system in 2007: data content and analysis tool extensions

    SciTech Connect

    Markowitz, Victor M.; Szeto, Ernest; Palaniappan, Krishna; Grechkin, Yuri; Chu, Ken; Chen, I-Min A.; Dubchak, Inna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2007-08-01

    The Integrated Microbial Genomes (IMG) system is a data management, analysis and annotation platform for all publicly available genomes. IMG contains both draft and complete JGI microbial genomes integrated with all other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and annotating genomes, genes and functions, individually or in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through quarterly releases. IMG is provided by the DOE-Joint Genome Institute (JGI) and is available from http://img.jgi.doe.gov.

  11. Integrated Genomic and Gene Expression Profiling Identifies Two Major Genomic Circuits in Urothelial Carcinoma

    PubMed Central

    Lindgren, David; Sjödahl, Gottfrid; Lauss, Martin; Staaf, Johan; Chebil, Gunilla; Lövgren, Kristina; Gudjonsson, Sigurdur; Liedberg, Fredrik; Patschan, Oliver; Månsson, Wiking; Fernö, Mårten; Höglund, Mattias

    2012-01-01

    Similar to other malignancies, urothelial carcinoma (UC) is characterized by specific recurrent chromosomal aberrations and gene mutations. However, the interconnection between specific genomic alterations, and how patterns of chromosomal alterations adhere to different molecular subgroups of UC, is less clear. We applied tiling resolution array CGH to 146 cases of UC and identified a number of regions harboring recurrent focal genomic amplifications and deletions. Several potential oncogenes were included in the amplified regions, including known oncogenes like E2F3, CCND1, and CCNE1, as well as new candidate genes, such as SETDB1 (1q21), and BCL2L1 (20q11). We next combined genome profiling with global gene expression, gene mutation, and protein expression data and identified two major genomic circuits operating in urothelial carcinoma. The first circuit was characterized by FGFR3 alterations, overexpression of CCND1, and 9q and CDKN2A deletions. The second circuit was defined by E2F3 amplifications and RB1 deletions, as well as gains of 5p, deletions at PTEN and 2q36, 16q, 20q, and elevated CDKN2A levels. TP53/MDM2 alterations were common for advanced tumors within the two circuits. Our data also suggest a possible RAS/RAF circuit. The tumors with worst prognosis showed a gene expression profile that indicated a keratinized phenotype. Taken together, our integrative approach revealed at least two separate networks of genomic alterations linked to the molecular diversity seen in UC, and that these circuits may reflect distinct pathways of tumor development. PMID:22685613

  12. Integrated cytogenetics and genomics analysis of transposable elements in the Nile tilapia, Oreochromis niloticus.

    PubMed

    Valente, Guilherme; Kocher, Thomas; Eickbush, Thomas; Simões, Rafael P; Martins, Cesar

    2016-06-01

    Integration of cytogenetics and genomics has become essential to a better view of the architecture and function of genomes. Although advances in genomic sequencing have contributed to the study of genes and genomes, the repetitive DNA fraction of the genome is still enigmatic and poorly understood. Among repeated DNAs, transposable elements (TEs) are major components of eukaryotic chromatin and their investigation has been hindered even after the availability of whole sequenced genomes. The cytogenetic mapping of TEs in chromosomes has proved to be of high value to integrate information from the micro level of nucleotide sequence to a cytological view of chromosomes. Different TEs have been cytogenetically mapped in cichlids; however, neither details about their genomic arrangement nor appropriated copy number are well defined by these approaches. The current study integrates the distribution of TEs in the Nile tilapia Oreochromis niloticus genome based on a cytogenetic and genomics/bioinformatics approach. The results showed that some elements are not randomly distributed and that some are genomically dependent on each other. Moreover, we found extensive overlap between genomics and cytogenetics data and that tandem duplication may be the major mechanism responsible for the genomic dynamics of the TEs analyzed here. This paper provides insights into the genomic organization of TEs under an integrated view based on cytogenetics and genomics. PMID:26860923

  13. IMG 4 version of the integrated microbial genomes comparative analysis system

    SciTech Connect

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2013-10-27

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Finally, different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  14. IMG 4 version of the integrated microbial genomes comparative analysis system.

    PubMed

    Markowitz, Victor M; Chen, I-Min A; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N; Kyrpides, Nikos C

    2014-01-01

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG's data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG's annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu). PMID:24165883

  15. Red-Mediated Transposition and Final Release of the Mini-F Vector of a Cloned Infectious Herpesvirus Genome

    PubMed Central

    Wussow, Felix; Fickenscher, Helmut; Tischer, B. Karsten

    2009-01-01

    Bacterial artificial chromosomes (BACs) are well-established cloning vehicles for functional genomics and for constructing targeting vectors and infectious viral DNA clones. Red-recombination-based mutagenesis techniques have enabled the manipulation of BACs in Escherichia coli without any remaining operational sequences. Here, we describe that the F-factor-derived vector sequences can be inserted into a novel position and seamlessly removed from the present location of the BAC-cloned DNA via synchronous Red-recombination in E. coli in an en passant mutagenesis-based procedure. Using this technique, the mini-F elements of a cloned infectious varicella zoster virus (VZV) genome were specifically transposed into novel positions distributed over the viral DNA to generate six different BAC variants. In comparison to the other constructs, a BAC variant with mini-F sequences directly inserted into the junction of the genomic termini resulted in highly efficient viral DNA replication-mediated spontaneous vector excision upon virus reconstitution in transfected VZV-permissive eukaryotic cells. Moreover, the derived vector-free recombinant progeny exhibited virtually indistinguishable genome properties and replication kinetics to the wild-type virus. Thus, a sequence-independent, efficient, and easy-to-apply mini-F vector transposition procedure eliminates the last hurdle to perform virtually any kind of imaginable targeted BAC modifications in E. coli. The herpesviral terminal genomic junction was identified as an optimal mini-F vector integration site for the construction of an infectious BAC, which allows the rapid generation of mutant virus without any unwanted secondary genome alterations. The novel mini-F transposition technique can be a valuable tool to optimize, repair or restructure other established BACs as well and may facilitate the development of gene therapy or vaccine vectors. PMID:19997639

  16. Integrated genome-wide analysis of genomic changes and gene regulation in human adrenocortical tissue samples

    PubMed Central

    Gara, Sudheer Kumar; Wang, Yonghong; Patel, Dhaval; Liu-Chittenden, Yi; Jain, Meenu; Boufraqech, Myriem; Zhang, Lisa; Meltzer, Paul S.; Kebebew, Electron

    2015-01-01

    To gain insight into the pathogenesis of adrenocortical carcinoma (ACC) and whether there is progression from normal-to-adenoma-to-carcinoma, we performed genome-wide gene expression, gene methylation, microRNA expression and comparative genomic hybridization (CGH) analysis in human adrenocortical tissue (normal, adrenocortical adenomas and ACC) samples. A pairwise comparison of normal, adrenocortical adenoma and ACC gene expression profiles with more than four-fold expression differences and an adjusted P-value < 0.05 revealed no major differences in normal versus adrenocortical adenoma, whereas there were 808 and 1085 dysregulated genes in ACC versus adrenocortical adenoma and ACC versus normal, respectively. The majority of the dysregulated genes in ACC were downregulated. By integrating the CGH, gene methylation and expression profiles of potential miRNAs with the gene expression of dysregulated genes, we found that there are more alterations in ACC versus normal than in ACC versus adrenocortical adenoma. Importantly, we identified several novel molecular pathways that are associated with dysregulated genes and further experimentally validated that oncostatin M signaling induces caspase 3-dependent apoptosis and suppresses cell proliferation. Finally, we propose that there is a higher number of genomic changes from normal-to-adenoma-to-carcinoma and identify oncostatin M signaling as a plausible druggable pathway for therapeutics. PMID:26446994
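
    The selection rule stated in this abstract (more than four-fold expression difference and adjusted P-value < 0.05) can be sketched as a simple filter. This is only an illustration under stated assumptions, not the authors' pipeline: the names `GeneResult` and `select_dysregulated`, the record layout, and any gene labels passed in are hypothetical. A four-fold change corresponds to an absolute log2 fold change greater than 2.

```python
import math
from typing import Iterable, List, NamedTuple


class GeneResult(NamedTuple):
    """One gene from a pairwise comparison (hypothetical record layout)."""
    gene: str        # gene label
    log2_fc: float   # log2 fold change between the two tissue groups
    adj_p: float     # multiple-testing adjusted P-value


def select_dysregulated(
    results: Iterable[GeneResult],
    min_fold: float = 4.0,
    max_adj_p: float = 0.05,
) -> List[GeneResult]:
    """Keep genes whose change exceeds min_fold in either direction
    and whose adjusted P-value is below max_adj_p."""
    threshold = math.log2(min_fold)  # four-fold -> 2.0 on the log2 scale
    return [
        r for r in results
        if abs(r.log2_fc) > threshold and r.adj_p < max_adj_p
    ]
```

    Working on the log2 scale treats up- and downregulation symmetrically, which matters here because the abstract reports that most dysregulated genes in ACC were downregulated.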

  17. GDR (Genome Database for Rosaceae): integrated web-database for Rosaceae genomics and genetics data.

    PubMed

    Jung, Sook; Staton, Margaret; Lee, Taein; Blenda, Anna; Svancara, Randall; Abbott, Albert; Main, Dorrie

    2008-01-01

    The Genome Database for Rosaceae (GDR) is a central repository of curated and integrated genetics and genomics data of Rosaceae, an economically important family which includes apple, cherry, peach, pear, raspberry, rose and strawberry. GDR contains annotated databases of all publicly available Rosaceae ESTs, the genetically anchored peach physical map, Rosaceae genetic maps and comprehensively annotated markers and traits. The ESTs are assembled to produce unigene sets of each genus and the entire Rosaceae. Other annotations include putative function, microsatellites, open reading frames, single nucleotide polymorphisms, gene ontology terms and anchored map position where applicable. Most of the published Rosaceae genetic maps can be viewed and compared through CMap, the comparative map viewer. The peach physical map can be viewed using WebFPC/WebChrom, and also through our integrated GDR map viewer, which serves as a portal to the combined genetic, transcriptome and physical mapping information. ESTs, BACs, markers and traits can be queried by various categories and the search result sites are linked to the mapping visualization tools. GDR also provides online analysis tools such as a batch BLAST/FASTA server for the GDR datasets, a sequence assembly server and microsatellite and primer detection tools. GDR is available at http://www.rosaceae.org. PMID:17932055

  18. Cas9-Mediated Genome Engineering in Drosophila melanogaster.

    PubMed

    Housden, Benjamin E; Perrimon, Norbert

    2016-01-01

    The recent development of the CRISPR-Cas9 system for genome engineering has revolutionized our ability to modify the endogenous DNA sequence of many organisms, including Drosophila. This system allows alteration of DNA sequences in situ with single base-pair precision and is now being used for a wide variety of applications. To use the CRISPR system effectively, various design parameters must be considered, including single guide RNA target site selection and identification of successful editing events. Here, we review recent advances in CRISPR methodology in Drosophila and introduce protocols for some of the more difficult aspects of CRISPR implementation: designing and generating CRISPR reagents and detecting indel mutations by high-resolution melt analysis. PMID:27587786

  19. Androgen receptor-mediated non-genomic regulation of prostate cancer cell proliferation

    PubMed Central

    Liao, Ross S.; Ma, Shihong; Miao, Lu; Li, Rui; Yin, Yi

    2013-01-01

    Androgen receptor (AR)-mediated signaling is necessary for prostate cancer cell proliferation and an important target for therapeutic drug development. Canonically, AR signals through a genomic or transcriptional pathway, involving the translocation of androgen-bound AR to the nucleus, its binding to cognate androgen response elements on promoter, with ensuing modulation of target gene expression, leading to cell proliferation. However, prostate cancer cells can show dose-dependent proliferation responses to androgen within minutes, without the need for genomic AR signaling. This proliferation response known as the non-genomic AR signaling is mediated by cytoplasmic AR, which facilitates the activation of kinase-signaling cascades, including the Ras-Raf-1, phosphatidyl-inositol 3-kinase (PI3K)/Akt and protein kinase C (PKC), which in turn converge on mitogen-activated protein kinase (MAPK)/extracellular signal-regulated kinase (ERK) activation, leading to cell proliferation. Further, since activated ERK may also phosphorylate AR and its coactivators, the non-genomic AR signaling may enhance AR genomic activity. Non-genomic AR signaling may occur in an ERK-independent manner, via activation of mammalian target of rapamycin (mTOR) pathway, or modulation of intracellular Ca2+ concentration through plasma membrane G protein-coupled receptors (GPCRs). These data suggest that therapeutic strategies aimed at preventing AR nuclear translocation and genomic AR signaling alone may not completely abrogate AR signaling. Thus, elucidation of mechanisms that underlie non-genomic AR signaling may identify potential mechanisms of resistance to current anti-androgens and help developing novel therapies that abolish all AR signaling in prostate cancer. PMID:26816736

  20. Examination of host genome for the presence of integrated fragments of Solenopsis invicta virus 1

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A series of oligonucleotide primer pairs covering the entire genome of Solenopsis invicta virus 1 (SINV-1) were used to probe the Solenopsis invicta genome for integrated fragments of the viral genome. All of the oligonucleotide primer sets yielded amplicons of anticipated size from cDNA created f...

  1. Integration of optical devices and nanotechnology for conducting genome research

    NASA Astrophysics Data System (ADS)

    Chung, Pei-Yu; Parag, Parekh; Zhu, Zhi; Chegini, Claudine; Schultz, Gregory; Tan, Weihong; Jiang, Peng; Batich, Christopher

    2011-06-01

    SPR-based sensing techniques use spectroscopy to transduce biomolecular binding events into variations in spectra. This label-free, real-time technique has been widely applied in biomedical research. In this study, we present a spectroscopy-based SPR system for monitoring binding between human serum albumin and a nucleic acid library. Compared with the conventional SPR technique, this novel system uses cost-effective nanostructured arrays and a portable UV-Vis spectrometer. These advantages make it promising for developing a portable analytical device for widespread applications. Meanwhile, the multispectral analysis used here also helps increase the sensitivity, and thus transduces the binding event into an optical signal efficiently. The result demonstrates that this cost-effective and portable system could be applied in future to selecting target aptamers. Moreover, we also present surface enhanced Raman spectroscopy (SERS) on the nanostructured arrays in a label-free approach. This integration of multiple spectroscopy technologies is used for conducting genome research efficiently.

  2. Genome and proteome annotation: organization, interpretation and integration

    PubMed Central

    Reeves, Gabrielle A.; Talavera, David; Thornton, Janet M.

    2008-01-01

    Recent years have seen a huge increase in the generation of genomic and proteomic data. This has been due to improvements in current biological methodologies, the development of new experimental techniques and the use of computers as support tools. All these raw data are useless if they cannot be properly analysed, annotated, stored and displayed. Consequently, a vast number of resources have been created to present the data to the wider community. Annotation tools and databases provide the means to disseminate these data and to comprehend their biological importance. This review examines the various aspects of annotation: type, methodology and availability. Moreover, it pays special attention to novel annotation fields, such as that of phenotypes, and highlights the recent efforts focused on integrating annotations. PMID:19019817

  3. CRISPR/Cas9-mediated genome engineering of CHO cell factories: Application and perspectives.

    PubMed

    Lee, Jae Seong; Grav, Lise Marie; Lewis, Nathan E; Faustrup Kildegaard, Helene

    2015-07-01

    Chinese hamster ovary (CHO) cells are the most widely used production host for therapeutic proteins. With the recent emergence of CHO genome sequences, CHO cell line engineering has taken on a new aspect through targeted genome editing. The bacterial clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) system enables rapid, easy and efficient engineering of mammalian genomes. It has a wide range of applications from modification of individual genes to genome-wide screening or regulation of genes. Facile genome editing using CRISPR/Cas9 empowers researchers in the CHO community to elucidate the mechanistic basis behind high level production of proteins and product quality attributes of interest. In this review, we describe the basis of CRISPR/Cas9-mediated genome editing and its application for development of next generation CHO cell factories while highlighting both future perspectives and challenges. As one of the main drivers for the CHO systems biology era, genome engineering with CRISPR/Cas9 will pave the way for rational design of CHO cell factories. PMID:26058577

  4. Application of oocyte cryopreservation technology in TALEN-mediated mouse genome editing.

    PubMed

    Nakagawa, Yoshiko; Sakuma, Tetsushi; Nakagata, Naomi; Yamasaki, Sho; Takeda, Naoki; Ohmuraya, Masaki; Yamamoto, Takashi

    2014-01-01

    Reproductive engineering techniques, such as in vitro fertilization (IVF) and cryopreservation of embryos or spermatozoa, are essential for preservation, reproduction, and transportation of genetically engineered mice. However, it has not yet been elucidated whether these techniques can be applied for the generation of genome-edited mice using engineered nucleases such as transcription activator-like effector nucleases (TALENs). Here, we demonstrate the usefulness of frozen oocytes fertilized in vitro using frozen sperm for TALEN-mediated genome editing in mice. We examined side-by-side comparisons concerning sperm (fresh vs. frozen), fertilization method (mating vs. IVF), and fertilized oocytes (fresh vs. frozen) for the source of oocytes used for TALEN injection; we found that fertilized oocytes created under all tested conditions were applicable for TALEN-mediated mutagenesis. In addition, we investigated whether the ages in weeks of parental female mice can affect the efficiency of gene modification, by comparing 5-week-old and 8-12-week-old mice as the source of oocytes used for TALEN injection. The genome editing efficiency of an endogenous gene was consistently 95-100% when either 5-week-old or 8-12-week-old mice were used with or without freezing the oocytes. Thus, our report describes the availability of freeze-thawed oocytes and oocytes from female mice at various weeks of age for TALEN-mediated genome editing, thus boosting the convenience of such innovative gene targeting strategies. PMID:25077765

  5. Integrated Genomic and Epigenomic Analysis of Breast Cancer Brain Metastasis

    PubMed Central

    Salhia, Bodour; Kiefer, Jeff; Ross, Julianna T. D.; Metapally, Raghu; Martinez, Rae Anne; Johnson, Kyle N.; DiPerna, Danielle M.; Paquette, Kimberly M.; Jung, Sungwon; Nasser, Sara; Wallstrom, Garrick; Tembe, Waibhav; Baker, Angela; Carpten, John; Resau, Jim; Ryken, Timothy; Sibenaller, Zita; Petricoin, Emanuel F.; Liotta, Lance A.; Ramanathan, Ramesh K.; Berens, Michael E.; Tran, Nhan L.

    2014-01-01

    The brain is a common site of metastatic disease in patients with breast cancer, which has few therapeutic options and dismal outcomes. The purpose of our study was to identify common and rare events that underlie breast cancer brain metastasis. We performed deep genomic profiling, which integrated gene copy number, gene expression and DNA methylation datasets on a collection of breast brain metastases. We identified frequent large chromosomal gains in 1q, 5p, 8q, 11q, and 20q and frequent broad-level deletions involving 8p, 17p, 21p and Xq. Frequently amplified and overexpressed genes included ATAD2, BRAF, DERL1, DNMTRB and NEK2A. The ATM, CRYAB and HSPB2 genes were commonly deleted and underexpressed. Knowledge mining revealed enrichment in cell cycle and G2/M transition pathways, which contained AURKA, AURKB and FOXM1. Using the PAM50 breast cancer intrinsic classifier, Luminal B, Her2+/ER negative, and basal-like tumors were identified as the most commonly represented breast cancer subtypes in our brain metastasis cohort. While overall methylation levels were increased in breast cancer brain metastasis, basal-like brain metastases were associated with significantly lower levels of methylation. Integrating DNA methylation data with gene expression revealed defects in cell migration and adhesion due to hypermethylation and downregulation of PENK, EDN3, and ITGAM. Hypomethylation and upregulation of KRT8 likely affects adhesion and permeability. Genomic and epigenomic profiling of breast brain metastasis has provided insight into the somatic events underlying this disease, which have potential in forming the basis of future therapeutic strategies. PMID:24489661

  6. Integrative pathway genomics of lung function and airflow obstruction.

    PubMed

    Gharib, Sina A; Loth, Daan W; Soler Artigas, María; Birkland, Timothy P; Wilk, Jemma B; Wain, Louise V; Brody, Jennifer A; Obeidat, Ma'en; Hancock, Dana B; Tang, Wenbo; Rawal, Rajesh; Boezen, H Marike; Imboden, Medea; Huffman, Jennifer E; Lahousse, Lies; Alves, Alexessander C; Manichaikul, Ani; Hui, Jennie; Morrison, Alanna C; Ramasamy, Adaikalavan; Smith, Albert Vernon; Gudnason, Vilmundur; Surakka, Ida; Vitart, Veronique; Evans, David M; Strachan, David P; Deary, Ian J; Hofman, Albert; Gläser, Sven; Wilson, James F; North, Kari E; Zhao, Jing Hua; Heckbert, Susan R; Jarvis, Deborah L; Probst-Hensch, Nicole; Schulz, Holger; Barr, R Graham; Jarvelin, Marjo-Riitta; O'Connor, George T; Kähönen, Mika; Cassano, Patricia A; Hysi, Pirro G; Dupuis, Josée; Hayward, Caroline; Psaty, Bruce M; Hall, Ian P; Parks, William C; Tobin, Martin D; London, Stephanie J

    2015-12-01

    Chronic respiratory disorders are important contributors to the global burden of disease. Genome-wide association studies (GWASs) of lung function measures have identified several trait-associated loci, but explain only a modest portion of the phenotypic variability. We postulated that integrating pathway-based methods with GWASs of pulmonary function and airflow obstruction would identify a broader repertoire of genes and processes influencing these traits. We performed two independent GWASs of lung function and applied gene set enrichment analysis to one of the studies and validated the results using the second GWAS. We identified 131 significantly enriched gene sets associated with lung function and clustered them into larger biological modules involved in diverse processes including development, immunity, cell signaling, proliferation and arachidonic acid. We found that enrichment of gene sets was not driven by GWAS-significant variants or loci, but instead by those with less stringent association P-values. Next, we applied pathway enrichment analysis to a meta-analyzed GWAS of airflow obstruction. We identified several biologic modules that functionally overlapped with those associated with pulmonary function. However, differences were also noted, including enrichment of extracellular matrix (ECM) processes specifically in the airflow obstruction study. Network analysis of the ECM module implicated a candidate gene, matrix metalloproteinase 10 (MMP10), as a putative disease target. We used a knockout mouse model to functionally validate MMP10's role in influencing lung's susceptibility to cigarette smoke-induced emphysema. By integrating pathway analysis with population-based genomics, we unraveled biologic processes underlying pulmonary function traits and identified a candidate gene for obstructive lung disease. PMID:26395457
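
    The idea of testing whether a gene set is enriched among trait-associated genes can be illustrated with a hypergeometric over-representation test. Note that this is a simpler relative of the gene set enrichment analysis named in the abstract, not the method the authors used; the function name `hypergeom_enrichment_p` and its parameters are hypothetical. The sketch assumes that "hit" genes have already been chosen by some association P-value cut-off, which, following the abstract, may be less stringent than genome-wide significance.

```python
from math import comb


def hypergeom_enrichment_p(
    universe_size: int,   # all genes tested
    set_size: int,        # genes belonging to the gene set
    hits_total: int,      # genes passing the association cut-off
    hits_in_set: int,     # hit genes that also belong to the gene set
) -> float:
    """Upper-tail probability P(X >= hits_in_set) when hits_total genes
    are drawn without replacement from universe_size genes, of which
    set_size belong to the gene set."""
    upper = min(set_size, hits_total)
    total = comb(universe_size, hits_total)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, hits_total - k)
        for k in range(hits_in_set, upper + 1)
    )
    return tail / total
```

    A small value suggests that the gene set contains more hit genes than expected by chance. In practice the P-values from many gene sets would still need correction for multiple testing.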

  7. Integrative bioinformatics for functional genome annotation: trawling for G protein-coupled receptors.

    PubMed

    Flower, Darren R; Attwood, Teresa K

    2004-12-01

    G protein-coupled receptors (GPCRs) are amongst the best studied and most functionally diverse types of cell-surface protein. The importance of GPCRs as mediators of cell function and organismal development underlies their involvement in key physiological roles and their prominence as targets for pharmacological therapeutics. In this review, we highlight the requirement for integrated protocols which underline the different perspectives offered by different sequence analysis methods. BLAST and FastA offer broad brush strokes. Motif-based search methods add the fine detail. Structural modelling offers another perspective which allows us to elucidate the physicochemical properties that underlie ligand binding. Together, these different views provide a more informative and a more detailed picture of GPCR structure and function. Many GPCRs remain orphan receptors with no identified ligand, yet as computer-driven functional genomics starts to elaborate their functions, a new understanding of their roles in cell and developmental biology will follow. PMID:15561589

  8. An Integrated Genome-Wide Systems Genetics Screen for Breast Cancer Metastasis Susceptibility Genes

    PubMed Central

    Hu, Ying; Shukla, Anjali; Ha, Ngoc-Han; Doran, Anthony; Faraji, Farhoud; Goldberger, Natalie; Lee, Maxwell P.; Keane, Thomas

    2016-01-01

    Metastasis remains the primary cause of patient morbidity and mortality in solid tumors and is due to the action of a large number of tumor-autonomous and non-autonomous factors. Here we report the results of a genome-wide integrated strategy to identify novel metastasis susceptibility candidate genes and molecular pathways in breast cancer metastasis. This analysis implicates a number of transcriptional regulators and suggests cell-mediated immunity is an important determinant. Moreover, the analysis identified novel or FDA-approved drugs as potentially useful for anti-metastatic therapy. Further explorations implementing this strategy may therefore provide a variety of information for clinical applications in the control and treatment of advanced neoplastic disease. PMID:27074153

  9. An Integrated Genome-Wide Systems Genetics Screen for Breast Cancer Metastasis Susceptibility Genes.

    PubMed

    Bai, Ling; Yang, Howard H; Hu, Ying; Shukla, Anjali; Ha, Ngoc-Han; Doran, Anthony; Faraji, Farhoud; Goldberger, Natalie; Lee, Maxwell P; Keane, Thomas; Hunter, Kent W

    2016-04-01

    Metastasis remains the primary cause of patient morbidity and mortality in solid tumors and is due to the action of a large number of tumor-autonomous and non-autonomous factors. Here we report the results of a genome-wide integrated strategy to identify novel metastasis susceptibility candidate genes and molecular pathways in breast cancer metastasis. This analysis implicates a number of transcriptional regulators and suggests cell-mediated immunity is an important determinant. Moreover, the analysis identified novel or FDA-approved drugs as potentially useful for anti-metastatic therapy. Further explorations implementing this strategy may therefore provide a variety of information for clinical applications in the control and treatment of advanced neoplastic disease. PMID:27074153

  10. Molecular Assemblies, Genes and Genomics Integrated Efficiently (MAGGIE)

    SciTech Connect

    Baliga, Nitin S

    2011-05-26

    applied to the manually curated training set. Applying this method to the data representing around a quarter of the fraction space for water soluble proteins in D. vulgaris, we obtained 854 reliable pairwise interactions. Further, we have developed algorithms to analyze and assign significance to protein interaction data from bait pull-down experiments and integrate these data with other systems biology data through associative biclustering in a parallel computing environment. We will 'fill-in' missing information in these interaction data using a 'Transitive Closure' algorithm and subsequently use a 'Between Commonality Decomposition' algorithm to discover complexes within these large graphs of protein interactions. To characterize the metabolic activities of proteins and their complexes we are developing algorithms to deconvolute pure mass spectra, estimate chemical formula for m/z values, and fit isotopic fine structure to metabolomics data. We have discovered that, in comparison to isotopic pattern fitting methods, restricting the chemical formula by these two dimensions actually facilitates unique solutions for chemical formula generators. To understand how microbial functions are regulated we have developed complementary algorithms for reconstructing gene regulatory networks (GRNs). Whereas the network inference algorithms cMonkey and Inferelator developed enable de novo reconstruction of predictive models for GRNs from diverse systems biology data, the RegPrecise and RegPredict framework developed uses evolutionary comparisons of genomes from closely related organisms to reconstruct conserved regulons. We have integrated the two complementary algorithms to rapidly generate comprehensive models for gene regulation of understudied organisms. Our preliminary analyses of these reconstructed GRNs have revealed novel regulatory mechanisms and cis-regulatory motifs, as well as others that are conserved across species. Finally, we are supporting scientific efforts in ENIGMA
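
    The record above mentions filling in missing interaction data with a 'Transitive Closure' algorithm but does not describe it. As a rough illustration only, the textbook transitive closure of a symmetric (undirected) pairwise relation can be sketched as follows: if A interacts with B and B with C, the pair A and C is added. The function name `transitive_closure` and the protein labels in the usage note are hypothetical, and the project's actual algorithm may differ, for example by weighting interactions by confidence.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Set, Tuple


def transitive_closure(
    pairs: Iterable[Tuple[str, str]],
) -> Set[FrozenSet[str]]:
    """Return every undirected pair implied by transitivity.

    For a symmetric relation this equals all pairs of nodes that lie
    in the same connected component of the interaction graph.
    """
    # Build an undirected adjacency map.
    neighbours: Dict[str, Set[str]] = {}
    for a, b in pairs:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    closure: Set[FrozenSet[str]] = set()
    seen: Set[str] = set()
    for start in neighbours:
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        component: Set[str] = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        # Every pair inside the component belongs to the closure.
        closure |= {frozenset(p) for p in combinations(component, 2)}
    return closure
```

    For example, calling `transitive_closure([("P1", "P2"), ("P2", "P3")])` adds the inferred pair P1 and P3 to the two observed pairs. Because the closure treats every chain of interactions as evidence of a direct one, it can over-connect large graphs, which is presumably why the record follows it with a separate decomposition step to discover complexes.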

  11. Mediator infrastructure for information integration and semantic data integration environment for biomedical research.

    PubMed

    Grethe, Jeffrey S; Ross, Edward; Little, David; Sanders, Brian; Gupta, Amarnath; Astakhov, Vadim

    2009-01-01

    This paper presents current progress in the development of a semantic data integration environment which is part of the Biomedical Informatics Research Network (BIRN; http://www.nbirn.net) project. BIRN is sponsored by the National Center for Research Resources (NCRR), a component of the National Institutes of Health (NIH). A goal is the development of a cyberinfrastructure for biomedical research that supports advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet. Each participating institution maintains storage of its experimental or computationally derived data. A mediator-based data integration system performs semantic integration over the databases to enable researchers to perform analyses based on larger and broader datasets than would be available from any single institution's data. This paper describes the recent revision of the system architecture, implementation, and capabilities of the semantically based data integration environment for BIRN. PMID:19623485

  12. Accessing integrated genomic data using GenoBase: A tutorial, Part 1

    SciTech Connect

    Overbeek, R.; Price, M.

    1993-01-01

    GenoBase integrates genomic information from many existing databases, offering convenient access to the curated data. This document is the first part of a two-part tutorial on how to use GenoBase for accessing integrated genomic data.

  13. Divergence of RNA polymerase α subunits in angiosperm plastid genomes is mediated by genomic rearrangement.

    PubMed

    Blazier, J Chris; Ruhlman, Tracey A; Weng, Mao-Lun; Rehman, Sumaiyah K; Sabir, Jamal S M; Jansen, Robert K

    2016-01-01

    Genes for the plastid-encoded RNA polymerase (PEP) persist in the plastid genomes of all photosynthetic angiosperms. However, three unrelated lineages (Annonaceae, Passifloraceae and Geraniaceae) have been identified with unusually divergent open reading frames (ORFs) in the conserved region of rpoA, the gene encoding the PEP α subunit. We used sequence-based approaches to evaluate whether these genes retain function. Both gene sequences and complete plastid genome sequences were assembled and analyzed from each of the three angiosperm families. Multiple lines of evidence indicated that the rpoA sequences are likely functional despite retaining as low as 30% nucleotide sequence identity with rpoA genes from outgroups in the same angiosperm order. The ratio of non-synonymous to synonymous substitutions indicated that these genes are under purifying selection, and bioinformatic prediction of conserved domains indicated that functional domains are preserved. One of the lineages (Pelargonium, Geraniaceae) contains species with multiple rpoA-like ORFs that show evidence of ongoing inter-paralog gene conversion. The plastid genomes containing these divergent rpoA genes have experienced extensive structural rearrangement, including large expansions of the inverted repeat. We propose that illegitimate recombination, not positive selection, has driven the divergence of rpoA. PMID:27087667

  14. Divergence of RNA polymerase α subunits in angiosperm plastid genomes is mediated by genomic rearrangement

    PubMed Central

    Blazier, J. Chris; Ruhlman, Tracey A.; Weng, Mao-Lun; Rehman, Sumaiyah K.; Sabir, Jamal S. M.; Jansen, Robert K.

    2016-01-01

    Genes for the plastid-encoded RNA polymerase (PEP) persist in the plastid genomes of all photosynthetic angiosperms. However, three unrelated lineages (Annonaceae, Passifloraceae and Geraniaceae) have been identified with unusually divergent open reading frames (ORFs) in the conserved region of rpoA, the gene encoding the PEP α subunit. We used sequence-based approaches to evaluate whether these genes retain function. Both gene sequences and complete plastid genome sequences were assembled and analyzed from each of the three angiosperm families. Multiple lines of evidence indicated that the rpoA sequences are likely functional despite retaining as low as 30% nucleotide sequence identity with rpoA genes from outgroups in the same angiosperm order. The ratio of non-synonymous to synonymous substitutions indicated that these genes are under purifying selection, and bioinformatic prediction of conserved domains indicated that functional domains are preserved. One of the lineages (Pelargonium, Geraniaceae) contains species with multiple rpoA-like ORFs that show evidence of ongoing inter-paralog gene conversion. The plastid genomes containing these divergent rpoA genes have experienced extensive structural rearrangement, including large expansions of the inverted repeat. We propose that illegitimate recombination, not positive selection, has driven the divergence of rpoA. PMID:27087667

  15. CRISPR/Cas9-Mediated Genome Editing of Mouse Small Intestinal Organoids.

    PubMed

    Schwank, Gerald; Clevers, Hans

    2016-01-01

    The CRISPR/Cas9 system is an RNA-guided genome-editing tool that has been recently developed based on the bacterial CRISPR-Cas immune defense system. Due to its versatility and simplicity, it rapidly became the method of choice for genome editing in various biological systems, including mammalian cells. Here we describe a protocol for CRISPR/Cas9-mediated genome editing in murine small intestinal organoids, a culture system in which somatic stem cells are maintained by self-renewal, while giving rise to all major cell types of the intestinal epithelium. This protocol allows the study of gene function in intestinal epithelial homeostasis and pathophysiology and can be extended to epithelial organoids derived from other internal mouse and human organs. PMID:27246017

  16. The Integrated Microbial Genomes (IMG) System: An Expanding Comparative Analysis Resource

    SciTech Connect

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Grechkin, Yuri; Ratner, Anna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2009-09-13

    The integrated microbial genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG contains both draft and complete microbial genomes integrated with other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through regular releases. Several companion IMG systems have been set up in order to serve domain specific needs, such as expert review of genome annotations. IMG is available at .

  17. Integrating microarray gene expression object model and clinical document architecture for cancer genomics research.

    PubMed

    Park, Yu Rang; Lee, Hye Won; Kim, Ju Han

    2005-01-01

    Systematic integration of genomic-scale expression profiles with clinical information may facilitate cancer genomics research. MAGE-OM (Microarray Gene Expression Object Model) defines standard objects for genomic but not for clinical data. HL7 CDA (Clinical Document Architecture) is a document model for clinical information, describing syntax (generic structure) but not semantics. We designed a document template in XML Schema with additional constraints for CDA to define content semantics, enabling data model-level integration of MAGE-OM and CDA for cancer genomics research. PMID:16779360

  18. A Phenotype-Driven Dimension Reduction (PhDDR) Approach to Integrated Genomic Association Analyses

    PubMed Central

    Gao, Cuilan; Cheng, Cheng

    2013-01-01

    An immediate challenge in integrated genomic analysis involving several types of genomic factors all measured genome-wide is the ultra-high dimensionality. Screening all possible relationships among the genomic factors is an NP-hard problem; therefore in practice proper dimension reduction is necessary. In this paper we develop the Phenotype-Driven Dimension Reduction (PhDDR) approach to the analysis of gene co-expressions, and discuss its extensions to integration of other genetic factors. This approach is then illustrated by an application to gene co-expression analysis of treatment response of childhood leukemia. PMID:22255909

  19. Integrative Functional Genomics of Hepatitis C Virus Infection Identifies Host Dependencies in Complete Viral Replication Cycle

    PubMed Central

    Li, Qisheng; Zhang, Yong-Yuan; Chiu, Stephan; Hu, Zongyi; Lan, Keng-Hsin; Cha, Helen; Sodroski, Catherine; Zhang, Fang; Hsu, Ching-Sheng; Thomas, Emmanuel; Liang, T. Jake

    2014-01-01

    Recent functional genomics studies including genome-wide small interfering RNA (siRNA) screens demonstrated that hepatitis C virus (HCV) exploits an extensive network of host factors for productive infection and propagation. How these co-opted host functions interact with various steps of the HCV replication cycle and exert pro- or antiviral effects on HCV infection remains largely undefined. Here we present an unbiased and systematic strategy to functionally interrogate HCV host dependencies uncovered from our previous infectious HCV (HCVcc) siRNA screen. Applying functional genomics approaches and various in vitro HCV model systems, including HCV pseudoparticles (HCVpp), single-cycle infectious particles (HCVsc), subgenomic replicons, and HCV cell culture systems (HCVcc), we identified and characterized novel host factors or pathways required for each individual step of the HCV replication cycle. Particularly, we uncovered multiple HCV entry factors, including E-cadherin, choline kinase α, NADPH oxidase CYBA, Rho GTPase RAC1 and SMAD family member 6. We also demonstrated that guanine nucleotide binding protein GNB2L1, E2 ubiquitin-conjugating enzyme UBE2J1, and 39 other host factors are required for HCV RNA replication, while the deubiquitinating enzyme USP11 and multiple other cellular genes are specifically involved in HCV IRES-mediated translation. Families of antiviral factors that target HCV replication or translation were also identified. In addition, various virologic assays validated that 66 host factors are involved in HCV assembly or secretion. These genes included insulin-degrading enzyme (IDE), a proviral factor, and N-Myc downstream regulated gene 1 (NDRG1), an antiviral factor. Bioinformatics meta-analyses of our results integrated with literature mining of previously published HCV host factors allow the construction of an extensive roadmap of cellular networks and pathways involved in the complete HCV replication cycle. This comprehensive study of HCV host...

  20. Brassica database (BRAD) version 2.0: integrating and mining Brassicaceae species genomic resources

    PubMed Central

    Wang, Xiaobo; Wu, Jian; Liang, Jianli; Cheng, Feng; Wang, Xiaowu

    2015-01-01

    The Brassica database (BRAD) was built initially to help users apply Brassica rapa and Arabidopsis thaliana genomic data efficiently to their research. However, many Brassicaceae genomes have been sequenced and released after its construction. These genomes are rich resources for comparative genomics, gene annotation and functional evolutionary studies of Brassica crops. Therefore, we have updated BRAD to version 2.0 (V2.0). In BRAD V2.0, 11 more Brassicaceae genomes have been integrated into the database, namely those of Arabidopsis lyrata, Aethionema arabicum, Brassica oleracea, Brassica napus, Camelina sativa, Capsella rubella, Leavenworthia alabamica, Sisymbrium irio and three extremophiles Schrenkiella parvula, Thellungiella halophila and Thellungiella salsuginea. BRAD V2.0 provides plots of syntenic genomic fragments between pairs of Brassicaceae species, from the level of chromosomes to genomic blocks. The Generic Synteny Browser (GBrowse_syn), a module of the Genome Browser (GBrowse), is used to show syntenic relationships between multiple genomes. Search functions for retrieving syntenic and non-syntenic orthologs, as well as their annotation and sequences are also provided. Furthermore, genome and annotation information have been imported into GBrowse so that all functional elements can be visualized in one frame. We plan to continually update BRAD by integrating more Brassicaceae genomes into the database. Database URL: http://brassicadb.org/brad/ PMID:26589635

  1. Brassica database (BRAD) version 2.0: integrating and mining Brassicaceae species genomic resources.

    PubMed

    Wang, Xiaobo; Wu, Jian; Liang, Jianli; Cheng, Feng; Wang, Xiaowu

    2015-01-01

    The Brassica database (BRAD) was built initially to help users apply Brassica rapa and Arabidopsis thaliana genomic data efficiently to their research. However, many Brassicaceae genomes have been sequenced and released after its construction. These genomes are rich resources for comparative genomics, gene annotation and functional evolutionary studies of Brassica crops. Therefore, we have updated BRAD to version 2.0 (V2.0). In BRAD V2.0, 11 more Brassicaceae genomes have been integrated into the database, namely those of Arabidopsis lyrata, Aethionema arabicum, Brassica oleracea, Brassica napus, Camelina sativa, Capsella rubella, Leavenworthia alabamica, Sisymbrium irio and three extremophiles Schrenkiella parvula, Thellungiella halophila and Thellungiella salsuginea. BRAD V2.0 provides plots of syntenic genomic fragments between pairs of Brassicaceae species, from the level of chromosomes to genomic blocks. The Generic Synteny Browser (GBrowse_syn), a module of the Genome Browser (GBrowse), is used to show syntenic relationships between multiple genomes. Search functions for retrieving syntenic and non-syntenic orthologs, as well as their annotation and sequences are also provided. Furthermore, genome and annotation information have been imported into GBrowse so that all functional elements can be visualized in one frame. We plan to continually update BRAD by integrating more Brassicaceae genomes into the database. Database URL: http://brassicadb.org/brad/. PMID:26589635

  2. Goldmine integrates information placing genomic ranges into meaningful biological contexts

    PubMed Central

    Bhasin, Jeffrey M.; Ting, Angela H.

    2016-01-01

    Bioinformatic analysis often produces large sets of genomic ranges that can be difficult to interpret in the absence of genomic context. Goldmine annotates genomic ranges from any source with gene model and feature contexts to facilitate global descriptions and candidate loci discovery. We demonstrate the value of genomic context by using Goldmine to elucidate context dynamics in transcription factor binding and to reveal differentially methylated regions (DMRs) with context-specific functional correlations. The open source R package and documentation for Goldmine are available at http://jeffbhasin.github.io/goldmine. PMID:27257071

  3. Goldmine integrates information placing genomic ranges into meaningful biological contexts.

    PubMed

    Bhasin, Jeffrey M; Ting, Angela H

    2016-07-01

    Bioinformatic analysis often produces large sets of genomic ranges that can be difficult to interpret in the absence of genomic context. Goldmine annotates genomic ranges from any source with gene model and feature contexts to facilitate global descriptions and candidate loci discovery. We demonstrate the value of genomic context by using Goldmine to elucidate context dynamics in transcription factor binding and to reveal differentially methylated regions (DMRs) with context-specific functional correlations. The open source R package and documentation for Goldmine are available at http://jeffbhasin.github.io/goldmine. PMID:27257071
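The idea of annotating genomic ranges with feature context can be illustrated with a minimal sketch. This is not Goldmine's API (Goldmine is an R package); the `Interval` type, the `annotate` function, and the half-open coordinate convention below are hypothetical choices made only for illustration.

```python
from typing import Dict, List, NamedTuple


class Interval(NamedTuple):
    """A named genomic range; coordinates are assumed 0-based, half-open."""
    chrom: str
    start: int  # inclusive
    end: int    # exclusive
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True when both intervals are on the same chromosome and share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def annotate(ranges: List[Interval], features: List[Interval]) -> Dict[str, List[str]]:
    """Map each query range name to the names of the features it overlaps.

    This is a simple all-against-all comparison; a real tool would use an
    interval index to scale to genome-wide inputs.
    """
    return {r.name: [f.name for f in features if overlaps(r, f)] for r in ranges}
```

Given a list of query ranges (for example, binding peaks) and a list of gene-model features, `annotate` returns the feature context of every range, including an empty list for ranges that fall outside all features.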

  4. Sinbase: an integrated database to study genomics, genetics and comparative genomics in Sesamum indicum.

    PubMed

    Wang, Linhai; Yu, Jingyin; Li, Donghua; Zhang, Xiurong

    2015-01-01

    Sesame (Sesamum indicum L.) is an ancient and important oilseed crop grown widely in tropical and subtropical areas. It belongs to the gigantic order Lamiales, which includes many well-known or economically important species, such as olive (Olea europaea), leonurus (Leonurus japonicus) and lavender (Lavandula spica), many of which have important pharmacological properties. Despite their importance, genetic and genomic analyses on these species have been insufficient due to a lack of reference genome information. The now available S. indicum genome will provide an unprecedented opportunity for studying both S. indicum genetic traits and comparative genomics. To deliver S. indicum genomic information to the worldwide research community, we designed Sinbase, a web-based database with comprehensive sesame genomic, genetic and comparative genomic information. Sinbase includes sequences of assembled sesame pseudomolecular chromosomes, protein-coding genes (27,148), transposable elements (372,167) and non-coding RNAs (1,748). In particular, Sinbase provides unique and valuable information on colinear regions with various plant genomes, including Arabidopsis thaliana, Glycine max, Vitis vinifera and Solanum lycopersicum. Sinbase also provides a useful search function and data mining tools, including a keyword search and local BLAST service. Sinbase will be updated regularly with new features, improvements to genome annotation and new genomic sequences, and is freely accessible at http://ocri-genomics.org/Sinbase/. PMID:25480115

  5. Characterization of Equine Infectious Anemia Virus Integration in the Horse Genome

    PubMed Central

    Liu, Qiang; Wang, Xue-Feng; Ma, Jian; He, Xi-Jun; Wang, Xiao-Jun; Zhou, Jian-Hua

    2015-01-01

    Human immunodeficiency virus (HIV)-1 has a unique integration profile in the human genome relative to murine and avian retroviruses. Equine infectious anemia virus (EIAV) is another well-studied lentivirus that can also be used as a promising retro-transfection vector, but its integration into its native host has not been characterized. In this study, we mapped 477 integration sites of the EIAV strain EIAVFDDV13 in fetal equine dermal (FED) cells during in vitro infection. Published integration sites of EIAV and HIV-1 in the human genome were also analyzed as references. Our results demonstrated that EIAVFDDV13 tended to integrate into genes and AT-rich regions, and it avoided integrating into transcription start sites (TSS), which is consistent with EIAV and HIV-1 integration in the human genome. Notably, the integration of EIAVFDDV13 favored long interspersed elements (LINEs) and DNA transposons in the horse genome, whereas the integration of HIV-1 favored short interspersed elements (SINEs) in the human genome. The chromosomal environment near LINEs or DNA transposons potentially influences viral transcription and may be related to the unique EIAV latency states in equids. The data on EIAV integration in its natural host will facilitate studies on lentiviral infection and lentivirus-based therapeutic vectors. PMID:26102582

  6. Zygote-mediated generation of genome-modified mice using Streptococcus thermophilus 1-derived CRISPR/Cas system.

    PubMed

    Fujii, Wataru; Kakuta, Shigeru; Yoshioka, Shin; Kyuwa, Shigeru; Sugiura, Koji; Naito, Kunihiko

    2016-08-26

    Mammalian zygote-mediated genome-engineering by CRISPR/Cas is currently used for the generation of genome-modified animals. Here we report that a Streptococcus thermophilus-1 derived orthologous CRISPR/Cas system, which recognizes the 5'-NNAGAA sequence as a protospacer adjacent motif (PAM), is useful in mouse zygotes and is applicable for generating knockout mice (87.5%) and targeted knock-in mice (45.5%). The induced mutation could be inherited in the next generation. This novel CRISPR/Cas can expand the feasibility of the zygote-mediated generation of genome-modified animals that require an exact mutation design. PMID:27318086
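As a rough illustration of what a 5'-NNAGAA PAM requirement means for target selection, the sketch below scans both strands of a DNA sequence for that motif and reports the bases immediately 5' of each hit as a candidate protospacer. The 20-nt protospacer length, the placement of the PAM directly 3' of the protospacer, and the function names are assumptions made for this example and are not taken from the abstract.

```python
import re
from typing import List, Tuple

# Lookahead so that overlapping PAM occurrences are all reported.
PAM_PATTERN = re.compile(r"(?=([ACGT]{2}AGAA))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_pam_sites(seq: str, spacer_len: int = 20) -> List[Tuple[str, int, str, str]]:
    """List candidate target sites that are followed by an NNAGAA PAM.

    Each hit is (strand, pam_start, protospacer, pam). For the "-" strand,
    pam_start is an offset into the reverse-complemented sequence.
    spacer_len is an assumed protospacer length, not a value from the source.
    """
    seq = seq.upper()
    sites = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for match in PAM_PATTERN.finditer(s):
            pam_start = match.start()
            # Skip PAMs too close to the 5' end to have a full protospacer.
            if pam_start >= spacer_len:
                protospacer = s[pam_start - spacer_len:pam_start]
                sites.append((strand, pam_start, protospacer, match.group(1)))
    return sites
```

Because the motif is six bases long with four fixed positions, it occurs less often than a shorter PAM, so a scan like this shows how many usable sites a given locus offers.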

  7. DNA Damage Response and Spindle Assembly Checkpoint Function throughout the Cell Cycle to Ensure Genomic Integrity

    PubMed Central

    Lawrence, Katherine S.; Chau, Thinh; Engebrecht, JoAnne

    2015-01-01

    Errors in replication or segregation lead to DNA damage, mutations, and aneuploidies. Consequently, cells monitor these events and delay progression through the cell cycle so repair precedes division. The DNA damage response (DDR), which monitors DNA integrity, and the spindle assembly checkpoint (SAC), which responds to defects in spindle attachment/tension during metaphase of mitosis and meiosis, are critical for preventing genome instability. Here we show that the DDR and SAC function together throughout the cell cycle to ensure genome integrity in C. elegans germ cells. Metaphase defects result in enrichment of SAC and DDR components to chromatin, and both SAC and DDR are required for metaphase delays. During persistent metaphase arrest following establishment of bi-oriented chromosomes, stability of the metaphase plate is compromised in the absence of DDR kinases ATR or CHK1 or SAC components, MAD1/MAD2, suggesting SAC functions in metaphase beyond its interactions with APC activator CDC20. In response to DNA damage, MAD2 and the histone variant CENPA become enriched at the nuclear periphery in a DDR-dependent manner. Further, depletion of either MAD1 or CENPA results in loss of peripherally associated damaged DNA. In contrast to a SAC-insensitive CDC20 mutant, germ cells deficient for SAC or CENPA cannot efficiently repair DNA damage, suggesting that SAC mediates DNA repair through CENPA interactions with the nuclear periphery. We also show that replication perturbations result in relocalization of MAD1/MAD2 in human cells, suggesting that the role of SAC in DNA repair is conserved. PMID:25898113

  8. Genome-wide signatures of male-mediated migration shaping the Indian gene pool.

    PubMed

    ArunKumar, GaneshPrasad; Tatarinova, Tatiana V; Duty, Jeff; Rollo, Debra; Syama, Adhikarla; Arun, Varatharajan Santhakumari; Kavitha, Valampuri John; Triska, Petr; Greenspan, Bennett; Wells, R Spencer; Pitchappan, Ramasamy

    2015-09-01

    Multiple questions relating to contributions of cultural and demographical factors in the process of human geographical dispersal remain largely unanswered. India, a land of early human settlement and resulting diversity, is a good place to look for some of the answers. In this study, we explored the genetic structure of India using a diverse panel of 78 males genotyped using the GenoChip. Their genome-wide single-nucleotide polymorphism (SNP) diversity was examined in the context of various covariates that influence the Indian gene pool. Admixture analysis of genome-wide SNP data showed a high proportion of the Southwest Asian component in all of the Indian samples. Hierarchical clustering based on admixture proportions revealed seven distinct clusters correlating to geographical and linguistic affiliations. Convex hull overlay of Y-chromosomal haplogroups on the genome-wide SNP principal component analysis brought out distinct non-overlapping polygons of F*-M89, H*-M69, L1-M27, O2a-M95 and O3a3c1-M117, suggesting a male-mediated migration and expansion of the Indian gene pool. Lack of similar correlation with mitochondrial DNA clades indicated a shared genetic ancestry of females. We suggest that ancient male-mediated migratory events and settlement in various regional niches led to the present day scenario and peopling of India. PMID:25994871

  9. CTCF-Mediated Human 3D Genome Architecture Reveals Chromatin Topology for Transcription.

    PubMed

    Tang, Zhonghui; Luo, Oscar Junhong; Li, Xingwang; Zheng, Meizhen; Zhu, Jacqueline Jufen; Szalaj, Przemyslaw; Trzaskoma, Pawel; Magalska, Adriana; Wlodarczyk, Jakub; Ruszczycki, Blazej; Michalski, Paul; Piecuch, Emaly; Wang, Ping; Wang, Danjuan; Tian, Simon Zhongyuan; Penrad-Mobayed, May; Sachs, Laurent M; Ruan, Xiaoan; Wei, Chia-Lin; Liu, Edison T; Wilczynski, Grzegorz M; Plewczynski, Dariusz; Li, Guoliang; Ruan, Yijun

    2015-12-17

    Spatial genome organization and its effect on transcription remains a fundamental question. We applied an advanced chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) strategy to comprehensively map higher-order chromosome folding and specific chromatin interactions mediated by CCCTC-binding factor (CTCF) and RNA polymerase II (RNAPII) with haplotype specificity and nucleotide resolution in different human cell lineages. We find that CTCF/cohesin-mediated interaction anchors serve as structural foci for spatial organization of constitutive genes concordant with CTCF-motif orientation, whereas RNAPII interacts within these structures by selectively drawing cell-type-specific genes toward CTCF foci for coordinated transcription. Furthermore, we show that haplotype variants and allelic interactions have differential effects on chromosome configuration, influencing gene expression, and may provide mechanistic insights into functions associated with disease susceptibility. 3D genome simulation suggests a model of chromatin folding around chromosomal axes, where CTCF is involved in defining the interface between condensed and open compartments for structural regulation. Our 3D genome strategy thus provides unique insights in the topological mechanism of human variations and diseases. PMID:26686651

  10. A high utility integrated map of the pig genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: The domestic pig is being increasingly exploited as a system for modeling human disease. It also has substantial economic importance for meat-based protein production. Physical clone maps have underpinned large-scale genomic sequencing and enabled focused cloning efforts for many genome...

  11. Integrated and composite genome maps: the bovine example

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Combinations of genome maps representing different types of information are needed to link economically important phenotypic variation with underlying genomic variation in farmed animals. For the cow, data from two linkage populations and three radiation hybrid (RH) panels were combined to construc...

  12. CRISPR-mediated Genome Editing Restores Dystrophin Expression and Function in mdx Mice.

    PubMed

    Xu, Li; Park, Ki Ho; Zhao, Lixia; Xu, Jing; El Refaey, Mona; Gao, Yandi; Zhu, Hua; Ma, Jianjie; Han, Renzhi

    2016-03-01

    Duchenne muscular dystrophy (DMD) is a degenerative muscle disease caused by genetic mutations that lead to the disruption of dystrophin in muscle fibers. There is no curative treatment for this devastating disease. Clustered regularly interspaced short palindromic repeat/Cas9 (CRISPR/Cas9) has emerged as a powerful tool for genetic manipulation and potential therapy. Here we demonstrate that CRISPR-mediated genome editing efficiently excised a 23-kb genomic region on the X-chromosome covering the mutant exon 23 in a mouse model of DMD, and restored dystrophin expression and the dystrophin-glycoprotein complex at the sarcolemma of skeletal muscles in live mdx mice. Electroporation-mediated transfection of the Cas9/gRNA constructs in the skeletal muscles of mdx mice normalized the calcium sparks in response to osmotic shock. Adenovirus-mediated transduction of Cas9/gRNA greatly reduced the Evans blue dye uptake of skeletal muscles at rest and after downhill treadmill running. This study provides proof-of-principle evidence for permanent gene correction in DMD. PMID:26449883

  13. Comprehensive, Integrative Genomic Analysis of Diffuse Lower-Grade Gliomas

    PubMed Central

    2015-01-01

    BACKGROUND Diffuse low-grade and intermediate-grade gliomas (which together make up the lower-grade gliomas, World Health Organization grades II and III) have highly variable clinical behavior that is not adequately predicted on the basis of histologic class. Some are indolent; others quickly progress to glioblastoma. The uncertainty is compounded by interobserver variability in histologic diagnosis. Mutations in IDH, TP53, and ATRX and codeletion of chromosome arms 1p and 19q (1p/19q codeletion) have been implicated as clinically relevant markers of lower-grade gliomas. METHODS We performed genomewide analyses of 293 lower-grade gliomas from adults, incorporating exome sequence, DNA copy number, DNA methylation, messenger RNA expression, microRNA expression, and targeted protein expression. These data were integrated and tested for correlation with clinical outcomes. RESULTS Unsupervised clustering of mutations and data from RNA, DNA-copy-number, and DNA-methylation platforms uncovered concordant classification of three robust, nonoverlapping, prognostically significant subtypes of lower-grade glioma that were captured more accurately by IDH, 1p/19q, and TP53 status than by histologic class. Patients who had lower-grade gliomas with an IDH mutation and 1p/19q codeletion had the most favorable clinical outcomes. Their gliomas harbored mutations in CIC, FUBP1, NOTCH1, and the TERT promoter. Nearly all lower-grade gliomas with IDH mutations and no 1p/19q codeletion had mutations in TP53 (94%) and ATRX inactivation (86%). The large majority of lower-grade gliomas without an IDH mutation had genomic aberrations and clinical behavior strikingly similar to those found in primary glioblastoma. CONCLUSIONS The integration of genomewide data from multiple platforms delineated three molecular classes of lower-grade gliomas that were more concordant with IDH, 1p/19q, and TP53 status than with histologic class. Lower-grade gliomas with an IDH mutation either had 1p/19q

  14. A second generation integrated map of the rainbow trout (Oncorhynchus mykiss) genome: analysis of synteny with model fish genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In this paper we generated DNA fingerprints and end sequences from bacterial artificial chromosomes (BACs) from two new libraries to improve the first generation integrated physical and genetic map of the rainbow trout (Oncorhynchus mykiss) genome. The current version of the physical map is compose...

  15. Exclusion of small terminase mediated DNA threading models for genome packaging in bacteriophage T4.

    PubMed

    Gao, Song; Zhang, Liang; Rao, Venigalla B

    2016-05-19

    Tailed bacteriophages and herpes viruses use powerful molecular machines to package their genomes. The packaging machine consists of three components: portal, motor (large terminase; TerL) and regulator (small terminase; TerS). Portal, a dodecamer, and motor, a pentamer, form two concentric rings at the special five-fold vertex of the icosahedral capsid. Powered by ATPase, the motor ratchets DNA into the capsid through the portal channel. TerS is essential for packaging, particularly for genome recognition, but its mechanism is unknown and controversial. Structures of gear-shaped TerS rings inspired models that invoke DNA threading through the central channel. Here, we report that mutations of basic residues that line phage T4 TerS (gp16) channel do not disrupt DNA binding. Even deletion of the entire channel helix retained DNA binding and produced progeny phage in vivo. On the other hand, large oligomers of TerS (11-mers/12-mers), but not small oligomers (trimers to hexamers), bind DNA. These results suggest that TerS oligomerization creates a large outer surface, which, but not the interior of the channel, is critical for function, probably to wrap viral genome around the ring during packaging initiation. Hence, models involving TerS-mediated DNA threading may be excluded as an essential mechanism for viral genome packaging. PMID:26984529

  16. Exclusion of small terminase mediated DNA threading models for genome packaging in bacteriophage T4

    PubMed Central

    Gao, Song; Zhang, Liang; Rao, Venigalla B.

    2016-01-01

    Tailed bacteriophages and herpes viruses use powerful molecular machines to package their genomes. The packaging machine consists of three components: portal, motor (large terminase; TerL) and regulator (small terminase; TerS). Portal, a dodecamer, and motor, a pentamer, form two concentric rings at the special five-fold vertex of the icosahedral capsid. Powered by ATPase, the motor ratchets DNA into the capsid through the portal channel. TerS is essential for packaging, particularly for genome recognition, but its mechanism is unknown and controversial. Structures of gear-shaped TerS rings inspired models that invoke DNA threading through the central channel. Here, we report that mutations of basic residues that line phage T4 TerS (gp16) channel do not disrupt DNA binding. Even deletion of the entire channel helix retained DNA binding and produced progeny phage in vivo. On the other hand, large oligomers of TerS (11-mers/12-mers), but not small oligomers (trimers to hexamers), bind DNA. These results suggest that TerS oligomerization creates a large outer surface, which, but not the interior of the channel, is critical for function, probably to wrap viral genome around the ring during packaging initiation. Hence, models involving TerS-mediated DNA threading may be excluded as an essential mechanism for viral genome packaging. PMID:26984529

  17. CRISPR/Cas9-Mediated Genome Editing in Soybean Hairy Roots.

    PubMed

    Cai, Yupeng; Chen, Li; Liu, Xiujie; Sun, Shi; Wu, Cunxiang; Jiang, Bingjun; Han, Tianfu; Hou, Wensheng

    2015-01-01

    As a new technology for gene editing, the CRISPR (clustered regularly interspaced short palindromic repeat)/Cas (CRISPR-associated) system has been rapidly and widely used for genome engineering in various organisms. In the present study, we successfully applied the type II CRISPR/Cas9 system to generate and estimate genome editing in the desired target genes in soybean (Glycine max (L.) Merrill.). The single-guide RNA (sgRNA) and Cas9 cassettes were assembled on one vector to improve transformation efficiency, and we designed a sgRNA that targeted a transgene (bar) and six sgRNAs that targeted different sites of two endogenous soybean genes (GmFEI2 and GmSHR). The targeted DNA mutations were detected in soybean hairy roots. The results demonstrated that this customized CRISPR/Cas9 system shared the same efficiency for both endogenous and exogenous genes in soybean hairy roots. We also performed experiments to detect the potential of the CRISPR/Cas9 system to simultaneously edit two endogenous soybean genes using only one customized sgRNA. Overall, generating and detecting the CRISPR/Cas9-mediated genome modifications in target genes of soybean hairy roots could rapidly assess the efficiency of each target locus. The target sites with higher efficiencies can be used for regular soybean transformation. Furthermore, this method provides a powerful tool for root-specific functional genomics studies in soybean. PMID:26284791

  18. Genetic and statistical study of HIV integration in the human genome

    NASA Astrophysics Data System (ADS)

    Sequeira, Inês J.; Gonçalves, Juliana; Moreira, Elsa; Mexia, João T.; Rueff, José; Brás, Aldina

    2013-10-01

    Integration of the human immunodeficiency virus (HIV) DNA into the human genome is essential for HIV-induced disease. The human genome is organized into chromosomes, and within these we can define the chromosomal fragile sites. Our aim is to help clarify the integration site preferences of HIV1 and HIV2 for fragile or non-fragile regions. Here we apply statistical techniques, namely non-parametric tests and analysis of variance, to analyze two data sets of HIV1 and HIV2 integrations in the human genome. The results show that integrations occur with significantly greater intensity in the non-fragile regions of the human genome and that HIV1 in particular is the major contributor to this result. This study could have implications in human disease.

  19. MAR-mediated integration of plasmid vectors for in vivo gene transfer and regulation

    PubMed Central

    2013-01-01

    Background: The in vivo transfer of naked plasmid DNA into organs such as muscles is commonly used to assess the expression of prophylactic or therapeutic genes in animal disease models. Results: In this study, we devised vectors allowing a tight regulation of transgene expression in mice from such non-viral vectors using a doxycycline-controlled network of activator and repressor proteins. Using these vectors, we demonstrate proper physiological response as a consequence of the induced expression of two therapeutically relevant proteins, namely erythropoietin and utrophin. Kinetic studies showed that the induction of transgene expression was only transient, unless epigenetic regulatory elements termed Matrix Attachment Regions, or MAR, were inserted upstream of the regulated promoters. Using episomal plasmid rescue and quantitative PCR assays, we observed that similar amounts of plasmids remained in muscles after electrotransfer with or without MAR elements, but that a significant portion had integrated into the muscle fiber chromosomes. Interestingly, the MAR elements were found to promote plasmid genomic integration but to oppose silencing effects in vivo, thereby mediating long-term expression. Conclusions: This study thus elucidates some of the determinants of transient or sustained expression from the use of non-viral regulated vectors in vivo. PMID:24295286

  20. VESPA: Software to Facilitate Genomic Annotation of Prokaryotic Organisms Through Integration of Proteomic and Transcriptomic Data

    SciTech Connect

    Peterson, Elena S.; McCue, Lee Ann; Rutledge, Alexandra C.; Jensen, Jeffrey L.; Walker, Julia; Kobold, Mark A.; Webb, Samantha R.; Payne, Samuel H.; Ansong, Charles; Adkins, Joshua N.; Cannon, William R.; Webb-Robertson, Bobbie-Jo M.

    2012-04-25

    Visual Exploration and Statistics to Promote Annotation (VESPA) is an interactive visual analysis software tool that facilitates the discovery of structural mis-annotations in prokaryotic genomes. VESPA integrates high-throughput peptide-centric proteomics data and oligo-centric or RNA-Seq transcriptomics data into a genomic context. The data may be interrogated via visual analysis across multiple levels of genomic resolution, linked searches, exports and interaction with BLAST to rapidly identify locations of interest within the genome and evaluate potential mis-annotations.

  1. ssODN-mediated knock-in with CRISPR-Cas for large genomic regions in zygotes.

    PubMed

    Yoshimi, Kazuto; Kunihiro, Yayoi; Kaneko, Takehito; Nagahora, Hitoshi; Voigt, Birger; Mashimo, Tomoji

    2016-01-01

    The CRISPR-Cas system is a powerful tool for generating genetically modified animals; however, targeted knock-in (KI) via homologous recombination remains difficult in zygotes. Here we show efficient gene KI in rats by combining CRISPR-Cas with single-stranded oligodeoxynucleotides (ssODNs). First, a 1-kb ssODN co-injected with guide RNA (gRNA) and Cas9 messenger RNA produces GFP-KI at the rat Thy1 locus. Then, two gRNAs with two 80-bp ssODNs direct efficient integration of a 5.5-kb CAG-GFP vector into the Rosa26 locus via ssODN-mediated end joining. This protocol also achieves KI of a 200-kb BAC containing the human SIRPA locus, concomitantly knocking out the rat Sirpa gene. Finally, three gRNAs and two ssODNs replace 58-kb of the rat Cyp2d cluster with a 6.2-kb human CYP2D6 gene. These ssODN-mediated KI protocols can be applied to any target site with any donor vector without the need to construct homology arms, thus simplifying genome engineering in living organisms. PMID:26786405

  2. ssODN-mediated knock-in with CRISPR-Cas for large genomic regions in zygotes

    PubMed Central

    Yoshimi, Kazuto; Kunihiro, Yayoi; Kaneko, Takehito; Nagahora, Hitoshi; Voigt, Birger; Mashimo, Tomoji

    2016-01-01

    The CRISPR-Cas system is a powerful tool for generating genetically modified animals; however, targeted knock-in (KI) via homologous recombination remains difficult in zygotes. Here we show efficient gene KI in rats by combining CRISPR-Cas with single-stranded oligodeoxynucleotides (ssODNs). First, a 1-kb ssODN co-injected with guide RNA (gRNA) and Cas9 messenger RNA produces GFP-KI at the rat Thy1 locus. Then, two gRNAs with two 80-bp ssODNs direct efficient integration of a 5.5-kb CAG-GFP vector into the Rosa26 locus via ssODN-mediated end joining. This protocol also achieves KI of a 200-kb BAC containing the human SIRPA locus, concomitantly knocking out the rat Sirpa gene. Finally, three gRNAs and two ssODNs replace 58-kb of the rat Cyp2d cluster with a 6.2-kb human CYP2D6 gene. These ssODN-mediated KI protocols can be applied to any target site with any donor vector without the need to construct homology arms, thus simplifying genome engineering in living organisms. PMID:26786405

  3. Human genome-wide expression analysis reorients the study of inflammatory mediators and biomechanics in osteoarthritis.

    PubMed

    Sandy, J D; Chan, D D; Trevino, R L; Wimmer, M A; Plaas, A

    2015-11-01

    A major objective of this article is to examine the research implications of recently available genome-wide expression profiles of cartilage from human osteoarthritis (OA) joints. We propose that, when viewed in the light of extensive earlier work, this novel data provides a unique opportunity to reorient the design of experimental systems toward clinical relevance. Specifically, in the area of cartilage explant biology, this will require a fresh evaluation of existing paradigms, so as to optimize the choices of tissue source, cytokine/growth factor/nutrient addition, and biomechanical environment for discovery. Within this context, we firstly discuss the literature on the nature and role of potential catabolic mediators in OA pathology, including data from human OA cartilage, animal models of OA, and ex vivo studies. Secondly, due to the number and breadth of studies on IL-1β in this area, a major focus of the article is a critical analysis of the design and interpretation of cartilage studies where IL-1β has been used as a model cytokine. Thirdly, the article provides a data-driven perspective (including genome-wide analysis of clinical samples, studies on mutant mice, and clinical trials), which concludes that IL-1β should be replaced by soluble mediators such as IL-17 or TGF-β1, which are much more likely to mimic the disease in OA model systems. We also discuss the evidence that changes in early OA can be attributed to the activity of such soluble mediators, whereas late-stage disease results more from a chronic biomechanical effect on the matrix and cells of the remaining cartilage and on other local mediator-secreting cells. Lastly, an updated protocol for in vitro studies with cartilage explants and chondrocytes (including the use of specific gene expression arrays) is provided to motivate more disease-relevant studies on the interplay of cytokines, growth factors, and biomechanics on cellular behavior. PMID:26521740

  4. INTEGRATED GENOME-BASED STUDIES OF SHEWANELLA ECOPHYSIOLOGY

    SciTech Connect

    NEALSON, KENNETH H.

    2013-10-15

    products of dissimilatory iron reduction. Geochim. Cosmochim. Acta. 74:574-583.
    10. Karpinets, T.V., A.Y. Obraztsova, Y. Wang, D.D. Schmoyer, G.H. Kora, B.H. Park, M.H. Serres, M.F. Romine, M.L. Land, T.B. Kothe, J.K. Fredrickson, K.H. Nealson, and E.C. Uberbacher. 2010. Conserved synteny at the protein family level reveals genes underlying Shewanella species' cold tolerance and predicts their novel phenotypes. Funct. Integr. Genomics 10:97-110. (DOI 10.1007/s10143-009-0142-y)
    11. Bretschger, O., A.C.M. Cheung, F. Mansfeld, and K.H. Nealson. 2010. Comparative microbial fuel cell evaluations of Shewanella spp. Electroanalysis 22:883-894.
    12. McLean, J.S., G. Wanger, Y.A. Gorby, M. Wainstein, J. McQuaid, Shun'ichi Ishii, O. Bretschger, H. Beyanal, K.H. Nealson. 2010. Quantification of electron transfer rates to a solid phase electron acceptor through the stages of biofilm formation from single cells to multicellular communities. Env. Sci. Technol. 44:2721-2717.
    13. El-Naggar, M., G. Wanger, K.M. Leung, T.D. Yuzvinsky, G. Southam, J. Yang, W.M. Lau, K.H. Nealson, and Y.A. Gorby. 2010. Electrical Transport Along Bacterial Nanowires from Shewanella oneidensis MR-1. Proc. Nat. Acad. Sci. USA 107:18127-18131.
    14. Biffinger, J.C., L.A. Fitzgerald, R. Ray, B.J. Little, S.E. Lizewski, E.R. Petersen, B.R. Ringeisen, W.C. Sanders, P.E. Sheehan, J.J. Pietron, J.W. Baldwin, L.J. Nadeau, G.R. Johnson, M. Ribbens, S.E. Finkel, K.H. Nealson. 2010. The utility of Shewanella japonica for microbial fuel cells. Bioresource Technol. 102:290-297.
    15. Rodionov, D., C. Yang, X. Li, I. Rodionova, Y. Wang, A.Y. Obraztsova, O.P. Zagnitko, R. Overbeek, M.F. Romine, S. Reed, J.K. Fredrickson, K.H. Nealson, A.L. Osterman. 2010. Genomic encyclopedia of sugar utilization pathways in the Shewanella genus. BMC Genomics 11:494.
    16. Kan, J., L. Hsu, A.C.M. Cheung, M. Pirbazari, and K.H. Nealson. 2011. Current production by bacterial communities in microbial fuel cells enriched from wastewater sludge

  5. Integrated consensus map of cultivated peanut and wild relatives reveals structures of the A and B genomes of Arachis and divergence of the legume genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complex, tetraploid genome structure of peanut (Arachis hypogaea) has obstructed advances in genetics and genomics in the species. The aim of this study is to understand the genome structure of Arachis by developing a high-density integrated consensus map. Three recombinant inbred line populatio...

  6. The Mediator complex: a central integrator of transcription

    PubMed Central

    Allen, Benjamin L.; Taatjes, Dylan J.

    2016-01-01

    The RNA polymerase II (pol II) enzyme transcribes all protein-coding and most non-coding RNA genes and is globally regulated by Mediator, a large, conformationally flexible protein complex with variable subunit composition (for example, a four-subunit CDK8 module can reversibly associate). These biochemical characteristics are fundamentally important for Mediator's ability to control various processes important for transcription, including organization of chromatin architecture and regulation of pol II pre-initiation, initiation, re-initiation, pausing, and elongation. Although Mediator exists in all eukaryotes, a variety of Mediator functions appear to be specific to metazoans, indicative of more diverse regulatory requirements. PMID:25693131

  7. NDRG1 links p53 with proliferation-mediated centrosome homeostasis and genome stability

    PubMed Central

    Croessmann, Sarah; Wong, Hong Yuen; Zabransky, Daniel J.; Chu, David; Mendonca, Janet; Sharma, Anup; Mohseni, Morassa; Rosen, D. Marc; Scharpf, Robert B.; Cidado, Justin; Cochran, Rory L.; Parsons, Heather A.; Dalton, W. Brian; Erlanger, Bracha; Button, Berry; Cravero, Karen; Kyker-Snowman, Kelly; Beaver, Julia A.; Kachhap, Sushant; Hurley, Paula J.; Lauring, Josh; Park, Ben Ho

    2015-01-01

    The tumor protein 53 (TP53) tumor suppressor gene is the most frequently somatically altered gene in human cancers. Here we show expression of N-Myc down-regulated gene 1 (NDRG1) is induced by p53 during physiologic low proliferative states, and mediates centrosome homeostasis, thus maintaining genome stability. When placed in physiologic low-proliferating conditions, human TP53 null cells fail to increase expression of NDRG1 compared with isogenic wild-type controls and TP53 R248W knockin cells. Overexpression and RNA interference studies demonstrate that NDRG1 regulates centrosome number and amplification. Mechanistically, NDRG1 physically associates with γ-tubulin, a key component of the centrosome, with reduced association in p53 null cells. Strikingly, TP53 homozygous loss was mutually exclusive of NDRG1 overexpression in over 96% of human cancers, supporting the broad applicability of these results. Our study elucidates a mechanism of how TP53 loss leads to abnormal centrosome numbers and genomic instability mediated by NDRG1. PMID:26324937

  8. An Integrated Encyclopedia of DNA Elements in the Human Genome

    PubMed Central

    2012-01-01

    Summary The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure, and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall the project provides new insights into the organization and regulation of our genes and genome, and an expansive resource of functional annotations for biomedical research. PMID:22955616

  9. An integrated encyclopedia of DNA elements in the human genome.

    PubMed

    2012-09-01

    The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research. PMID:22955616

  10. Genomic and Genetic Analysis of Bordetella Bacteriophages Encoding Reverse Transcriptase-Mediated Tropism-Switching Cassettes

    PubMed Central

    Liu, Minghsun; Gingery, Mari; Doulatov, Sergei R.; Liu, Yichin; Hodes, Asher; Baker, Stephen; Davis, Paul; Simmonds, Mark; Churcher, Carol; Mungall, Karen; Quail, Michael A.; Preston, Andrew; Harvill, Eric T.; Maskell, Duncan J.; Eiserling, Frederick A.; Parkhill, Julian; Miller, Jeff F.

    2004-01-01

    Liu et al. recently described a group of related temperate bacteriophages that infect Bordetella subspecies and undergo a unique template-dependent, reverse transcriptase-mediated tropism switching phenomenon (Liu et al., Science 295: 2091-2094, 2002). Tropism switching results from the introduction of single nucleotide substitutions at defined locations in the VR1 (variable region 1) segment of the mtd (major tropism determinant) gene, which determines specificity for receptors on host bacteria. In this report, we describe the complete nucleotide sequences of the 42.5- to 42.7-kb double-stranded DNA genomes of three related phage isolates and characterize two additional regions of variability. Forty-nine coding sequences were identified. Of these coding sequences, bbp36 contained VR2 (variable region 2), which is highly dynamic and consists of a variable number of identical 19-bp repeats separated by one of three 5-bp spacers, and bpm encodes a DNA adenine methylase with unusual site specificity and a homopolymer tract that functions as a hotspot for frameshift mutations. Morphological and sequence analysis suggests that these Bordetella phage are genetic hybrids of P22 and T7 family genomes, lending further support to the idea that regions encoding protein domains, single genes, or blocks of genes are readily exchanged between bacterial and phage genomes. Bordetella bacteriophages are capable of transducing genetic markers in vitro, and by using animal models, we demonstrated that lysogenic conversion can take place in the mouse respiratory tract during infection. PMID:14973019

  11. Integrating Genomes, Brain and Behavior in the Study of Songbirds

    PubMed Central

    Clayton, David F.; Balakrishnan, Christopher N.; London, Sarah E.

    2010-01-01

    Songbirds share some essential traits but are extraordinarily diverse, allowing comparative analyses aimed at identifying specific genotype–phenotype associations. This diversity encompasses traits like vocal communication and complex social behaviors that are of great interest to humans, but that are not well represented in other accessible research organisms. Many songbirds are readily observable in nature and thus afford unique insight into the links between environment and organism. The distinctive organization of the songbird brain will facilitate analysis of genomic links to brain and behavior. Access to the zebra finch genome sequence will, therefore, prompt new questions and provide the ability to answer those questions. PMID:19788884

  12. [Integration sites and their characteristic analysis of piggyBac transposon in cattle genome].

    PubMed

    Du, Xin-Hua; Gao, Xue; Zhang, Lu-Pei; Gao, Hui-Jiang; Li, Jun-Ya; Xu, Shang-Zhong

    2013-06-01

    As a useful tool for genetic engineering, piggyBac (PB) transposons have been widely used for transgenesis and mutagenesis studies in multiple species. To date, however, few studies have examined PB transposons in cattle. To identify PB transposon integration sites in the bovine genome and summarize their characteristics, the donor plasmid PB[CMV-EGFP] and the helper-dependent plasmid pcDNA-PBase were constructed and transferred into bovine fibroblasts using the Amaxa basic nucleofector kit for primary mammalian fibroblasts. Stably transfected cell clones were obtained after G-418 selection. Genomic DNA was extracted from the transgenic cells, and PB integration sites were detected by genome walking. Eight integration sites were obtained in the bovine genome, although only 5 sites were mapped, to chromosomes 1, 2, 11, and X. We found that the PB transposon was inserted at "TTAA" sites and integrated into intergenic, non-regulatory regions between two genes. Analysis of the five bases adjacent to the "TTAA" integration sites showed that the PB 5' end tended to be inserted into GC-rich regions (62.5%). This study shows that PB-mediated transposition occurs in the cattle genome, and the integration site information obtained will provide a theoretical reference for bovine studies using PB transposons. PMID:23774022
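The kind of site analysis this abstract describes (locating "TTAA" target sites and measuring the GC content of the five bases next to each one) can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function names, the choice to examine the five bases on the 5' side of each site, and the toy sequence are all assumptions made for the example.

```python
def find_ttaa_sites(seq):
    """Return 0-based start positions of every 'TTAA' in seq."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == "TTAA"]


def gc_fraction(bases):
    """Fraction of G or C in a string of bases (0.0 for an empty string)."""
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases.upper()) / len(bases)


def upstream_gc(seq, site, flank=5):
    """GC fraction of the `flank` bases immediately 5' of a TTAA site."""
    start = max(0, site - flank)
    return gc_fraction(seq[start:site])


# Hypothetical toy sequence, not data from the study.
toy = "GCGCCTTAAATATATTAAGG"
for pos in find_ttaa_sites(toy):
    print(pos, upstream_gc(toy, pos))
```

Averaging `upstream_gc` over all mapped sites would give a summary figure comparable in spirit to the 62.5% reported, although the abstract does not state exactly how that value was computed.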

  13. An integrated functional genomics approach identifies the regulatory network directed by brachyury (T) in chordoma.

    PubMed

    Nelson, Andrew C; Pillay, Nischalan; Henderson, Stephen; Presneau, Nadège; Tirabosco, Roberto; Halai, Dina; Berisha, Fitim; Flicek, Paul; Stemple, Derek L; Stern, Claudio D; Wardle, Fiona C; Flanagan, Adrienne M

    2012-11-01

    Chordoma is a rare malignant tumour of bone, the molecular marker of which is the expression of the transcription factor, brachyury. Having recently demonstrated that silencing brachyury induces growth arrest in a chordoma cell line, we now seek to identify its downstream target genes. Here we use an integrated functional genomics approach involving shRNA-mediated brachyury knockdown, gene expression microarray, ChIP-seq experiments, and bioinformatics analysis to achieve this goal. We confirm that the T-box binding motif of human brachyury is identical to that found in mouse, Xenopus, and zebrafish development, and that brachyury acts primarily as an activator of transcription. Using human chordoma samples for validation purposes, we show that brachyury binds 99 direct targets and indirectly influences the expression of 64 other genes, thereby acting as a master regulator of an elaborate oncogenic transcriptional network encompassing diverse signalling pathways including components of the cell cycle, and extracellular matrix components. Given the wide repertoire of its active binding and the relative specific localization of brachyury to the tumour cells, we propose that an RNA interference-based gene therapy approach is a plausible therapeutic avenue worthy of investigation. PMID:22847733

  14. Integrative genomic testing of cancer survival using semiparametric linear transformation models.

    PubMed

    Huang, Yen-Tsung; Cai, Tianxi; Kim, Eunhee

    2016-07-20

    The wide availability of multi-dimensional genomic data has spurred increasing interest in integrating multi-platform genomic data. Integrative analysis of the cancer genome landscape can potentially lead to deeper understanding of the biological process of cancer. We integrate epigenetics (DNA methylation and microRNA expression) and gene expression data in the tumor genome to delineate the association between different aspects of the biological processes and brain tumor survival. To model the association, we employ a flexible semiparametric linear transformation model that incorporates both the main effects of these genomic measures as well as the possible interactions among them. We develop variance component tests to examine different coordinated effects by testing various subsets of model coefficients for the genomic markers. A Monte Carlo perturbation procedure is constructed to approximate the null distribution of the proposed test statistics. We further propose omnibus testing procedures to synthesize information from fitting various parsimonious sub-models to improve power. Simulation results suggest that our proposed testing procedures maintain proper size under the null and outperform standard score tests. We further illustrate the utility of our procedure in two genomic analyses for survival of glioblastoma multiforme patients. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26887583

  15. Site-specific T-DNA integration in Arabidopsis thaliana mediated by the combined action of CRE recombinase and ϕC31 integrase.

    PubMed

    De Paepe, Annelies; De Buck, Sylvie; Nolf, Jonah; Van Lerberge, Els; Depicker, Ann

    2013-07-01

    Random T-DNA integration into the plant host genome can be problematic for a variety of reasons, including potentially variable transgene expression as a result of different integration positions and multiple T-DNA copies, the risk of mutating the host genome and the difficulty of stacking well-defined traits. Therefore, recombination systems have been proposed to integrate the T-DNA at a pre-selected site in the host genome. Here, we demonstrate the capacity of the ϕC31 integrase (INT) for efficient targeted T-DNA integration. Moreover, we show that the iterative site-specific integration system (ISSI), which combines the activities of the CRE recombinase and INT, enables the targeting of genes to a pre-selected site with the concomitant removal of the resident selectable marker. To begin, plants expressing both the CRE and INT recombinase and containing the target attP site were constructed. These plants were supertransformed with a T-DNA vector harboring the loxP site, the attB sites, a selectable marker and an expression cassette encoding a reporter protein. Three out of the 35 transformants obtained (9%) showed transgenerational site-specific integration (SSI) of this T-DNA and removal of the resident selectable marker, as demonstrated by PCR, Southern blot and segregation analysis. In conclusion, our results show the applicability of the ISSI system for precise and targeted Agrobacterium-mediated integration, allowing the serial integration of transgenic DNA sequences in plants. PMID:23574114

  16. A physical map of the highly heterozygous Populus genome: integration with the genome sequence and genetic map

    SciTech Connect

    Kelleher, Colin; CHIU, Dr. R.; Shin, Dr. H.; Krywinski, Martin; Fjell, Chris; Wilkin, Jennifer; Yin, Tongming; Difazio, Stephen P.

    2007-01-01

    As part of a larger project to sequence the Populus genome and generate genomic resources for this emerging model tree, we constructed a physical map of the Populus genome, representing one of the few such maps of an undomesticated, highly heterozygous plant species. The physical map, consisting of 2802 contigs, was constructed from fingerprinted bacterial artificial chromosome (BAC) clones. The map represents approximately 9.4-fold coverage of the Populus genome, which has been estimated from the genome sequence assembly to be 485 ± 10 Mb in size. BAC ends were sequenced to assist long-range assembly of whole-genome shotgun sequence scaffolds and to anchor the physical map to the genome sequence. Simple sequence repeat-based markers were derived from the end sequences and used to initiate integration of the BAC and genetic maps. A total of 2411 physical map contigs, representing 97% of all clones assigned to contigs, were aligned to the sequence assembly (JGI Populus trichocarpa, version 1.0). These alignments represent a total coverage of 384 Mb (79%) of the entire poplar sequence assembly and 295 Mb (96%) of linkage group sequence assemblies. A striking result of the physical map contig alignments to the sequence assembly was the co-localization of multiple contigs across numerous regions of the 19 linkage groups. Targeted sequencing of BAC clones and genetic analysis in a small number of representative regions showed that these co-aligning contigs represent distinct haplotypes in the heterozygous individual sequenced, and revealed the nature of these haplotype sequence differences.
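The coverage figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the abstract; the helper name `percent` is illustrative, the comparison of 384 Mb against the 485 Mb genome estimate assumes the assembly is about that size, and the linkage-group assembly size is back-calculated from the quoted 96% and is not reported in the abstract.

```python
def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


genome_mb = 485         # estimated genome size in Mb (from the abstract)
aligned_mb = 384        # Mb of the assembly covered by aligned contigs
contigs_total = 2802    # contigs in the physical map
contigs_aligned = 2411  # contigs aligned to the sequence assembly

# Share of the ~485 Mb estimate covered by the alignments.
assembly_cover = percent(aligned_mb, genome_mb)

# Share of physical-map contigs that aligned (distinct from the 97% of clones).
contig_share = percent(contigs_aligned, contigs_total)

# Linkage-group assembly size implied by "295 Mb (96%)" (back-calculated).
implied_lg_mb = 295 / 0.96

print(round(assembly_cover), round(contig_share), round(implied_lg_mb))
```

The first value agrees with the 79% in the abstract. The second shows that the aligned contigs are a smaller share of contigs than of clones, which is consistent with the unaligned contigs containing few clones.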

  17. An Integrated Genetic and Cytogenetic Map of the Cucumber Genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The Cucurbitaceae includes important crops such as cucumber, melon, watermelon, squash, and pumpkin. However, few genetic and genomic resources are available for plant improvement. Some cucurbit species such as cucumber have a narrow genetic base, which impedes construction of saturated molecular li...

  18. Integrating genomics and plant breeding: whither the breeders?

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Plant breeding has been practiced >5,000 years as an art and >100 years as a science. Selection provides the means where populations are improved for product, such as yield or composition, or for crop protection, such as pest and stress resistance. Such activities have not required use of genomic te...

  19. Integrated genomic approaches to enhance genetic resistance in chickens

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The chicken has led the way amongst agricultural animal species in infectious disease control and, in particular, selection for genetic resistance. The generation of the chicken genome sequence and the availability of other empowering tools and resources greatly enhance the ability to select for enh...

  20. Integrating genomics into applied tropical fruit breeding programs

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The plant genetics group at the SHRS is divided into three CRIS projects. All three are in the thematic National Program (NP) 301, Plant Microbial and Insect Genetic Resources, Genomics and Genetic Improvement. A major germplasm/breeding CRIS was established in 1998 for improving and preserving orna...

  1. INTEGRATED GENOME-BASED STUDIES OF SHEWANELLA ECOPHYSIOLOGY

    SciTech Connect

    TIEDJE, JAMES M; KONSTANTINIDIS, KOSTAS; WORDEN, MARK

    2014-01-08

    The aim of the work reported is to study Shewanella population genomics, and to understand the evolution, ecophysiology, and speciation of Shewanella. The tasks supporting this aim are: to study genetic and ecophysiological bases defining the core and diversification of Shewanella species; to determine gene content patterns along redox gradients; and to investigate the evolutionary processes, patterns and mechanisms of Shewanella.

  2. Integrated genome-based studies of Shewanella ecophysiology

    SciTech Connect

    Segre Daniel; Beg Qasim

    2012-02-14

    This project was a component of the Shewanella Federation and, as such, contributed to the overall goal of applying genomic tools to better understand the eco-physiology and speciation of respiratory-versatile members of the Shewanella genus. Our role at Boston University was to perform bioreactor experiments and high-throughput gene expression microarrays, and to combine dynamic flux balance modeling with experimentally obtained transcriptional and gene expression datasets from different growth conditions. In the first part of the project, we designed the S. oneidensis microarray probes for Affymetrix Inc. (based in California), then we identified the pathways of carbon utilization in the metal-reducing marine bacterium Shewanella oneidensis MR-1, using our newly designed high-density oligonucleotide Affymetrix microarray on Shewanella cells grown with various carbon sources. Next, using a combination of experimental and computational approaches, we built algorithms and methods to integrate the transcriptional and metabolic regulatory networks of S. oneidensis. Specifically, we combined mRNA microarray and metabolite measurements with statistical inference and dynamic flux balance analysis (dFBA) to study the transcriptional response of S. oneidensis MR-1 as it passes through exponential, stationary, and transition phases. By measuring time-dependent mRNA expression levels during batch growth of S. oneidensis MR-1 under two radically different nutrient compositions (minimal lactate and nutritionally rich LB medium), we obtain detailed snapshots of the regulatory strategies used by this bacterium to cope with gradually changing nutrient availability. In addition to traditional clustering, which provides a first indication of major regulatory trends and transcription factor activities, we developed and implemented a new computational approach for Dynamic Detection of Transcriptional Triggers (D2T2). This new method allows us to infer a putative topology of transcriptional dependencies

  3. CASFISH: CRISPR/Cas9-mediated in situ labeling of genomic loci in fixed cells.

    PubMed

    Deng, Wulan; Shi, Xinghua; Tjian, Robert; Lionnet, Timothée; Singer, Robert H

    2015-09-22

    Direct visualization of genomic loci in the 3D nucleus is important for understanding the spatial organization of the genome and its association with gene expression. Various DNA FISH methods have been developed in the past decades, all involving denaturing dsDNA and hybridizing fluorescent nucleic acid probes. Here we report a novel approach that uses in vitro constituted nuclease-deficient clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) complexes as probes to label sequence-specific genomic loci fluorescently without global DNA denaturation (Cas9-mediated fluorescence in situ hybridization, CASFISH). Using fluorescently labeled nuclease-deficient Cas9 (dCas9) protein assembled with various single-guide RNA (sgRNA), we demonstrated rapid and robust labeling of repetitive DNA elements in pericentromere, centromere, G-rich telomere, and coding gene loci. Assembling dCas9 with an array of sgRNAs tiling arbitrary target loci, we were able to visualize nonrepetitive genomic sequences. The dCas9/sgRNA binary complex is stable and binds its target DNA with high affinity, allowing sequential or simultaneous probing of multiple targets. CASFISH assays using differently colored dCas9/sgRNA complexes allow multicolor labeling of target loci in cells. In addition, the CASFISH assay is remarkably rapid under optimal conditions and is applicable for detection in primary tissue sections. This rapid, robust, less disruptive, and cost-effective technology adds a valuable tool for basic research and genetic diagnosis. PMID:26324940
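Tiling a locus with sgRNAs, as mentioned in this abstract, starts from a list of candidate target sites. The sketch below shows one common way to enumerate them and is not the authors' method: it assumes a Streptococcus pyogenes-style Cas9 that recognizes a 20-nt protospacer followed by an NGG PAM, scans the forward strand only, and applies no specificity filtering. The function name and the toy sequence are hypothetical.

```python
import re


def candidate_protospacers(seq, length=20):
    """List (start, protospacer, pam) for each site followed by an NGG PAM.

    Forward strand only; assumes a 20-nt protospacer and an NGG PAM.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Hypothetical toy sequence: a 20-nt stretch followed by an AGG PAM.
toy = "ACGT" * 5 + "AGGTT"
for start, spacer, pam in candidate_protospacers(toy):
    print(start, spacer, pam)
```

A practical design would also scan the reverse complement and screen each candidate against the rest of the genome before choosing a tiling set.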

  4. CASFISH: CRISPR/Cas9-mediated in situ labeling of genomic loci in fixed cells

    PubMed Central

    Deng, Wulan; Shi, Xinghua; Tjian, Robert; Lionnet, Timothée; Singer, Robert H.

    2015-01-01

    Direct visualization of genomic loci in the 3D nucleus is important for understanding the spatial organization of the genome and its association with gene expression. Various DNA FISH methods have been developed in the past decades, all involving denaturing dsDNA and hybridizing fluorescent nucleic acid probes. Here we report a novel approach that uses in vitro constituted nuclease-deficient clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) complexes as probes to label sequence-specific genomic loci fluorescently without global DNA denaturation (Cas9-mediated fluorescence in situ hybridization, CASFISH). Using fluorescently labeled nuclease-deficient Cas9 (dCas9) protein assembled with various single-guide RNA (sgRNA), we demonstrated rapid and robust labeling of repetitive DNA elements in pericentromere, centromere, G-rich telomere, and coding gene loci. Assembling dCas9 with an array of sgRNAs tiling arbitrary target loci, we were able to visualize nonrepetitive genomic sequences. The dCas9/sgRNA binary complex is stable and binds its target DNA with high affinity, allowing sequential or simultaneous probing of multiple targets. CASFISH assays using differently colored dCas9/sgRNA complexes allow multicolor labeling of target loci in cells. In addition, the CASFISH assay is remarkably rapid under optimal conditions and is applicable for detection in primary tissue sections. This rapid, robust, less disruptive, and cost-effective technology adds a valuable tool for basic research and genetic diagnosis. PMID:26324940

  5. Unusual RNA plant virus integration in the soybean genome leads to the production of small RNAs.

    PubMed

    da Fonseca, Guilherme Cordenonsi; de Oliveira, Luiz Felipe Valter; de Morais, Guilherme Loss; Abdelnor, Ricardo Vilela; Nepomuceno, Alexandre Lima; Waterhouse, Peter M; Farinelli, Laurent; Margis, Rogerio

    2016-05-01

    Horizontal gene transfer (HGT) is known to be a major force in genome evolution. The acquisition of genes from viruses by eukaryotic genomes is a well-studied example of HGT, including rare cases of non-retroviral RNA virus integration. The present study describes the integration of cucumber mosaic virus RNA-1 into the soybean genome. After an initial metatranscriptomic analysis of small RNAs derived from soybean, the de novo assembly resulted in a 3029-nt contig homologous to RNA-1. The integration of this sequence in the soybean genome was confirmed by DNA deep sequencing. The locus where the integration occurred harbors the full RNA-1 sequence followed by the partial sequence of an endogenous mRNA and another sequence of RNA-1 as an inverted repeat, allowing the formation of a hairpin structure. This region recombined into a retrotransposon located inside an exon of a soybean gene. The nucleotide similarity of the integrated sequence compared to other cucumber mosaic virus sequences indicates that the integration event occurred recently. We described a rare event of non-retroviral RNA virus integration in soybean that leads to the production of a double-stranded RNA in a similar fashion to virus resistance RNAi plants. PMID:26993236

  6. HiCPlotter integrates genomic data with interaction matrices.

    PubMed

    Akdemir, Kadir Caner; Chin, Lynda

    2015-01-01

    Metazoan genomic material is folded into stable non-randomly arranged chromosomal structures that are tightly associated with transcriptional regulation and DNA replication. Various factors including regulators of pluripotency, long non-coding RNAs, or the presence of architectural proteins have been implicated in regulation and assembly of the chromatin architecture. Therefore, comprehensive visualization of this multi-faceted structure is important to unravel the connections between nuclear architecture and transcriptional regulation. Here, we present an easy-to-use open-source visualization tool, HiCPlotter, to facilitate juxtaposition of Hi-C matrices with diverse genomic assay outputs, as well as to compare interaction matrices between various conditions. https://github.com/kcakdemir/HiCPlotter. PMID:26392354

  7. Breaking bad: R-loops and genome integrity.

    PubMed

    Sollier, Julie; Cimprich, Karlene A

    2015-09-01

    R-loops, nucleic acid structures consisting of an RNA-DNA hybrid and displaced single-stranded (ss) DNA, are ubiquitous in organisms from bacteria to mammals. First described in bacteria where they initiate DNA replication, it now appears that R-loops regulate diverse cellular processes such as gene expression, immunoglobulin (Ig) class switching, and DNA repair. Changes in R-loop regulation induce DNA damage and genome instability, and recently it was shown that R-loops are associated with neurodegenerative disorders. We discuss recent developments in the field; in particular, the regulation and effects of R-loops in cells, their effect on genomic and epigenomic stability, and their potential contribution to the origin of diseases including cancer and neurodegenerative disorders. PMID:26045257

  8. Genome-wide siRNA screen for mediators of NF-κB activation

    PubMed Central

    Gewurz, Benjamin E.; Towfic, Fadi; Mar, Jessica C.; Shinners, Nicholas P.; Takasaki, Kaoru; Zhao, Bo; Cahir-McFarland, Ellen D.; Quackenbush, John; Xavier, Ramnik J.; Kieff, Elliott

    2012-01-01

    Although canonical NFκB is frequently critical for cell proliferation, survival, or differentiation, NFκB hyperactivation can cause malignant, inflammatory, or autoimmune disorders. Despite intensive study, mammalian NFκB pathway loss-of-function RNAi analyses have been limited to specific protein classes. We therefore undertook a human genome-wide siRNA screen for novel NFκB activation pathway components. Using an Epstein-Barr virus latent membrane protein (LMP1) mutant, the transcriptional effects of which are canonical NFκB-dependent, we identified 155 proteins significantly and substantially important for NFκB activation in HEK293 cells. These proteins included many kinases, phosphatases, ubiquitin ligases, and deubiquitinating enzymes not previously known to be important for NFκB activation. Relevance to other canonical NFκB pathways was extended by finding that 118 of the 155 LMP1 NFκB activation pathway components were similarly important for IL-1β–, and 79 for TNFα–mediated NFκB activation in the same cells. MAP3K8, PIM3, and six other enzymes were uniquely relevant to LMP1-mediated NFκB activation. Most novel pathway components functioned upstream of IκB kinase complex (IKK) activation. Robust siRNA knockdown effects were confirmed for all mRNAs or proteins tested. Although multiple ZC3H-family proteins negatively regulate NFκB, ZC3H13 and ZC3H18 were activation pathway components. ZC3H13 was critical for LMP1, TNFα, and IL-1β NFκB-dependent transcription, but not for IKK activation, whereas ZC3H18 was critical for IKK activation. Down-modulators of LMP1-mediated NFκB activation were also identified. These experiments identify multiple targets to inhibit or stimulate LMP1-, IL-1β–, or TNFα–mediated canonical NFκB activation. PMID:22308454

  9. Genome maintenance and transcription integrity in aging and disease

    PubMed Central

    Wolters, Stefanie; Schumacher, Björn

    2013-01-01

    DNA damage contributes to cancer development and aging. Congenital syndromes that affect DNA repair processes are characterized by cancer susceptibility, developmental defects, and accelerated aging (Schumacher et al., 2008). DNA damage interferes with DNA metabolism by blocking replication and transcription. DNA polymerase blockage leads to replication arrest and can give rise to genome instability. Transcription, on the other hand, is an essential process for utilizing the information encoded in the genome. DNA damage that interferes with transcription can lead to apoptosis and cellular senescence. Both processes are powerful tumor suppressors (Bartek and Lukas, 2007). Cellular response mechanisms to stalled RNA polymerase II complexes have only recently started to be uncovered. Transcription-coupled DNA damage responses might thus play important roles in the adjustments to DNA damage accumulation in the aging organism (Garinis et al., 2009). Here we review human disorders that are caused by defects in genome stability to explore the role of DNA damage in aging and disease. We discuss how the nucleotide excision repair system functions at the interface of transcription and repair and conclude with concepts of how therapeutic targeting of transcription might be utilized in the treatment of cancer. PMID:23443494

  10. An integrated computational pipeline and database to support whole-genome sequence annotation

    PubMed Central

    Mungall, CJ; Misra, S; Berman, BP; Carlson, J; Frise, E; Harris, N; Marshall, B; Shu, S; Kaminker, JS; Prochnik, SE; Smith, CD; Smith, E; Tupy, JL; Wiel, C; Rubin, GM; Lewis, SE

    2002-01-01

    We describe here our experience in annotating the Drosophila melanogaster genome sequence, in the course of which we developed several new open-source software tools and a database schema to support large-scale genome annotation. We have developed these into an integrated and reusable software system for whole-genome annotation. The key contributions to overall annotation quality are the marshalling of high-quality sequences for alignments and the design of a system with an adaptable and expandable flexible architecture. PMID:12537570

  11. Multiplex genomic walking: Integration of the wet lab and computer lab into a single prototyping environment

    SciTech Connect

    Gillevet, P.M.

    1993-12-31

    The authors are presently sequencing the entire genome of Mycoplasma capricolum, one of the smallest free-living organisms, by a Multiplex Genomic Walking strategy. This technique involves the repetitive hybridization of sequencing membranes with oligonucleotide probes to acquire sequence data in discrete steps along the genome. The technique allows one to walk a genome in a directed manner, eliminating the problems associated with random shotgun assembly. Furthermore, the repetitive stripping and hybridization process is relatively simple to reproduce and has the potential to be easily automated. The Genetic Data Environment (GDE), an X Windows based Graphic User Interface, has allowed the seamless integration of a core multiple sequence editor with pre-existing external sequence analysis programs and internally developed programs into a single prototypic environment. This system has facilitated linkage of the Harvard Genome Lab's internal database and automated data control systems into one Graphic User Interface which can handle the archiving and analysis of both random fluorescent sequencing data and genomic walking data from the Mycoplasma project. Finally, it has facilitated the integration of the genomic sequence data into a PROLOG database environment for the comparative analysis of Mycoplasma capricolum and other organisms.

  12. Databases and information integration for the Medicago truncatula genome and transcriptome.

    PubMed

    Cannon, Steven B; Crow, John A; Heuer, Michael L; Wang, Xiaohong; Cannon, Ethalinda K S; Dwan, Christopher; Lamblin, Anne-Francoise; Vasdewani, Jayprakash; Mudge, Joann; Cook, Andrew; Gish, John; Cheung, Foo; Kenton, Steve; Kunau, Timothy M; Brown, Douglas; May, Gregory D; Kim, Dongjin; Cook, Douglas R; Roe, Bruce A; Town, Chris D; Young, Nevin D; Retzel, Ernest F

    2005-05-01

    An international consortium is sequencing the euchromatic genespace of Medicago truncatula. Extensive bioinformatic and database resources support the marker-anchored bacterial artificial chromosome (BAC) sequencing strategy. Existing physical and genetic maps and deep BAC-end sequencing help to guide the sequencing effort, while EST databases provide essential resources for genome annotation as well as transcriptome characterization and microarray design. Finished BAC sequences are joined into overlapping sequence assemblies and undergo an automated annotation process that integrates ab initio predictions with EST, protein, and other recognizable features. Because of the sequencing project's international and collaborative nature, data production, storage, and visualization tools are broadly distributed. This paper describes databases and Web resources for the project, which provide support for physical and genetic maps, genome sequence assembly, gene prediction, and integration of EST data. A central project Web site at medicago.org/genome provides access to genome viewers and other resources project-wide, including an Ensembl implementation at medicago.org, physical map and marker resources at mtgenome.ucdavis.edu, and genome viewers at the University of Oklahoma (www.genome.ou.edu), the Institute for Genomic Research (www.tigr.org), and Munich Information for Protein Sequences Center (mips.gsf.de). PMID:15888676

  13. Gene context analysis in the Integrated Microbial Genomes (IMG) data management system

    SciTech Connect

    Mavromatis, Konstantinos; Chu, Ken; Ivanova, Natalia; Hooper, Sean D.; Markowitz, Victor M.; Kyrpides, Nikos C.

    2009-05-01

    Computational methods for determining the function of genes in newly sequenced genomes have traditionally been based on sequence similarity to genes whose function has been identified experimentally. Function prediction methods can be extended using gene context analysis approaches such as examining the conservation of chromosomal gene clusters, gene fusion events and co-occurrence profiles across genomes. Context analysis is based on the observation that functionally related genes often have similar gene context and relies on the identification of such events across a statistically significant and phylogenetically diverse collection of genomes. We have used the data management system of the Integrated Microbial Genomes (IMG) as the framework to implement and explore the power of gene context analysis methods because it provides one of the largest available genome integrations. Visualization and search tools to facilitate and explore gene context analysis have been developed and applied across all publicly available archaeal and bacterial genomes in IMG. These computations are now maintained as part of IMG's regular genome content update cycle. IMG is available at: http://img.jgi.doe.gov.
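
    The co-occurrence idea mentioned in this abstract can be illustrated with a minimal sketch. It is not IMG's implementation: the Jaccard score, the 0.8 threshold, and the family and genome names below are all assumptions chosen for illustration. Each gene family is represented by the set of genomes in which it is present, and families with similar presence profiles are reported as candidate functional partners.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets of genome identifiers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cooccurring_pairs(profiles, min_similarity=0.8):
    """Return (family1, family2, score) for families with similar profiles.

    profiles maps a gene-family name to the set of genomes containing it.
    The threshold is an arbitrary illustrative cut-off, not a published value.
    """
    pairs = []
    for (f1, g1), (f2, g2) in combinations(sorted(profiles.items()), 2):
        score = jaccard(g1, g2)
        if score >= min_similarity:
            pairs.append((f1, f2, score))
    return pairs


# Hypothetical toy data: three gene families across five genomes.
profiles = {
    "famA": {"g1", "g2", "g3", "g4"},
    "famB": {"g1", "g2", "g3", "g4"},
    "famC": {"g1", "g5"},
}

for f1, f2, score in cooccurring_pairs(profiles):
    print(f1, f2, round(score, 2))
```

    A real analysis would also need to correct for phylogenetic relatedness among genomes, which this sketch ignores.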

  14. ITEP: An integrated toolkit for exploration of microbial pan-genomes

    PubMed Central

    2014-01-01

    Background Comparative genomics is a powerful approach for studying variation in physiological traits as well as the evolution and ecology of microorganisms. Recent technological advances have enabled sequencing large numbers of related genomes in a single project, requiring computational tools for their integrated analysis. In particular, accurate annotations and identification of gene presence and absence are critical for understanding and modeling the cellular physiology of newly sequenced genomes. Although many tools are available to compare the gene contents of related genomes, new tools are necessary to enable close examination and curation of protein families from large numbers of closely related organisms, to integrate curation with the analysis of gain and loss, and to generate metabolic networks linking the annotations to observed phenotypes. Results We have developed ITEP, an Integrated Toolkit for Exploration of microbial Pan-genomes, to curate protein families, compute similarities to externally-defined domains, analyze gene gain and loss, and generate draft metabolic networks from one or more curated reference network reconstructions in groups of related microbial species among which the combination of core and variable genes constitute their "pan-genomes". The ITEP toolkit consists of: (1) a series of modular command-line scripts for identification, comparison, curation, and analysis of protein families and their distribution across many genomes; (2) a set of Python libraries for programmatic access to the same data; and (3) pre-packaged scripts to perform common analysis workflows on a collection of genomes. ITEP’s capabilities include de novo protein family prediction, ortholog detection, analysis of functional domains, identification of core and variable genes and gene regions, sequence alignments and tree generation, annotation curation, and the integration of cross-genome analysis and metabolic networks for study of metabolic network

  15. Genomic characterization of viral integration sites in HPV-related cancers.

    PubMed

    Bodelon, Clara; Untereiner, Michael E; Machiela, Mitchell J; Vinokurova, Svetlana; Wentzensen, Nicolas

    2016-11-01

    Persistent infection with carcinogenic human papillomaviruses (HPV) causes the majority of anogenital cancers and a subset of head and neck cancers. The HPV genome is frequently found integrated into the host genome of invasive cancers. The mechanisms of how it may promote disease progression are not well understood. Thoroughly characterizing integration events can provide insights into HPV carcinogenesis. Individual studies have reported a limited number of integration sites in cell lines and human samples. We performed a systematic review of published integration sites in HPV-related cancers and conducted a pooled analysis to formally test for integration hotspots and genomic features enriched in integration events using data from the Encyclopedia of DNA Elements (ENCODE). Over 1,500 integration sites were reported in the literature, of which 90.8% (N = 1,407) were in human tissues. We found 10 cytobands enriched for integration events, three previously reported ones (3q28, 8q24.21 and 13q22.1) and seven additional ones (2q22.3, 3p14.2, 8q24.22, 14q24.1, 17p11.1, 17q23.1 and 17q23.2). Cervical infections with HPV18 were more likely to have breakpoints in 8q24.21 (p = 7.68 × 10^-4) than those with HPV16. Overall, integration sites were more likely to be in gene regions than expected by chance (p = 6.93 × 10^-9). They were also significantly closer to CpG regions, fragile sites, transcriptionally active regions and enhancers. Few integration events occurred within 50 Kb of known cervical cancer driver genes. This suggests that HPV integrates in accessible regions of the genome, preferentially genes and enhancers, which may affect the expression of target genes. PMID:27343048
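
    One common way to test whether a region is an integration hotspot is a one-sided binomial test, sketched below. The abstract does not state which test the authors used, so this is an assumption, as is the null model (each site falls in a region with probability equal to that region's share of the genome). The counts in the example are hypothetical and are not taken from the study.

```python
from math import exp, lgamma, log


def log_binom_pmf(i, n, p):
    """Log of P(X = i) for X ~ Binomial(n, p), with 0 < p < 1."""
    return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log(p) + (n - i) * log(1 - p))


def binomial_upper_tail(k, n, p):
    """P(X >= k): chance of k or more sites in the region under the null."""
    return sum(exp(log_binom_pmf(i, n, p)) for i in range(k, n + 1))


# Hypothetical example: 100 integration sites in total, 8 of them inside a
# region that covers 1% of the genome.
total_sites = 100
sites_in_region = 8
region_fraction = 0.01

p_value = binomial_upper_tail(sites_in_region, total_sites, region_fraction)
print(f"one-sided p-value: {p_value:.2e}")
```

    Working in log space avoids overflow in the binomial coefficients when the number of sites is in the thousands. Testing many regions at once would also call for a multiple-testing correction, which is omitted here.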

  16. Efficient CRISPR/Cas9-Mediated Genome Editing in Mice by Zygote Electroporation of Nuclease.

    PubMed

    Qin, Wenning; Dion, Stephanie L; Kutny, Peter M; Zhang, Yingfan; Cheng, Albert W; Jillette, Nathaniel L; Malhotra, Ankit; Geurts, Aron M; Chen, Yi-Guang; Wang, Haoyi

    2015-06-01

    The clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein (Cas) system is an adaptive immune system in bacteria and archaea that has recently been exploited for genome engineering. Mutant mice can be generated in one step through direct delivery of the CRISPR/Cas9 components into a mouse zygote. Although the technology is robust, delivery remains a bottleneck, as it involves manual injection of the components into the pronuclei or the cytoplasm of mouse zygotes, which is technically demanding and inherently low throughput. To overcome this limitation, we employed electroporation as a means to deliver the CRISPR/Cas9 components, including Cas9 messenger RNA, single-guide RNA, and donor oligonucleotide, into mouse zygotes and recovered live mice with targeted nonhomologous end joining and homology-directed repair mutations with high efficiency. Our results demonstrate that mice carrying CRISPR/Cas9-mediated targeted mutations can be obtained with high efficiency by zygote electroporation. PMID:25819794

  17. Site-specific in situ amplification of the integrated polyomavirus genome: a case for a context-specific over-replication model of gene amplification.

    PubMed

    Syu, L J; Fluck, M M

    1997-08-01

    The fate of the genome of the polyoma (Py) tumor virus following integration in the chromosomes of transformed rat FR3T3 cells was re-examined. The viral sequences were integrated at a single transformant-specific chromosomal site in each of 22 transformants tested. In situ amplification of the viral sequences was observed in 24 of 34 transformants analyzed. Large T antigen, the unique viral function involved in initiating DNA replication from the viral origin, was essential for the amplification process. There was an absolute requirement for a reiteration of viral sequences and the extent of the reiteration affected the degree of amplification. The reiteration may be important for homologous recombination-mediated resolution of in situ amplified sequences. Among 11 transformants harboring a 1 to 2 kb repeat, the degree of amplification was transformant-specific and varied over a wide range. At the high end of the spectrum, the genome copy number increased 1300-fold at steady state, while at the low end, amplification was below twofold. Some aspect of the host chromatin at the site of integration that affected viral gene expression also directly or indirectly modulated the amplification. Use of high-resolution electrophoresis for the analysis of the integrated amplified sequences revealed a recurring novel pattern, consisting of a ladder with numerous bands separated by a constant distance approximately the size of the Py genome. We suggest that this pattern was generated by conversion of the amplified viral genomes to head to tail linear arrays with cell to cell variations in the number of genome repeats at single, transformant-specific, chromosomal sites. In light of the known "out of schedule" firing of the Py origin, we propose an "onion skin" structure intermediate and present a homologous recombination model for the conversion from onion skins to linear arrays. The relevance of the in situ amplification of the Py genome to cellular gene amplification is

  18. Figure 5 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Cancer.gov

    Split-Screen View. The split-screen view is useful for exploring relationships of genomic features that are independent of chromosomal location. Color is used here to indicate mate pairs that map to different chromosomes, chromosomes 1 and 6, suggesting a translocation event. Adapted from Figure 8; Thorvaldsdottir H et al. 2012

  19. Figure 2 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Cancer.gov

    Grouping and sorting genomic data in IGV. The IGV user interface displaying 202 glioblastoma samples from TCGA. Samples are grouped by tumor subtype (second annotation column) and data type (first annotation column) and sorted by copy number of the EGFR locus (middle column). Adapted from Figure 1; Robinson et al. 2011

  20. Figure 4 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Cancer.gov

    Gene-list view of genomic data. The gene-list view allows users to compare data across a set of loci. The data in this figure includes copy number, mutation, and clinical data from 202 glioblastoma samples from TCGA. Adapted from Figure 7; Thorvaldsdottir H et al. 2012

  1. Natural bone fragmentation in the blind cave-dwelling fish, Astyanax mexicanus: candidate gene identification through integrative comparative genomics.

    PubMed

    Gross, Joshua B; Stahl, Bethany A; Powers, Amanda K; Carlson, Brian M

    2016-01-01

    Animals that colonize dark and nutrient-poor subterranean environments evolve numerous extreme phenotypes. These include dramatic changes to the craniofacial complex, many of which are under genetic control. These phenotypes can demonstrate asymmetric genetic signals wherein a QTL is detected on one side of the face but not the other. The causative gene(s) underlying QTL are difficult to identify with limited genomic resources. We approached this task by searching for candidate genes mediating fragmentation of the third suborbital bone (SO3) directly inferior to the orbit of the eye. We integrated positional genomic information using emerging Astyanax resources, and linked these intervals to homologous (syntenic) regions of the Danio rerio genome. We identified a discrete, approximately 6 Mb, conserved region wherein the gene causing SO3 fragmentation likely resides. We interrogated this interval for genes demonstrating significant differential expression using mRNA-seq analysis of cave and surface morphs across life history. We then assessed genes with known roles in craniofacial evolution and development based on GO term annotation. Finally, we screened coding sequence alterations in this region, identifying two key genes: transforming growth factor β3 (tgfb3) and bone morphogenetic protein 4 (bmp4). Of these candidates, tgfb3 is most promising as it demonstrates significant differential expression across multiple stages of development, maps close (<1 Mb) to the fragmentation critical locus, and is implicated in a variety of other animal systems (including humans) in non-syndromic clefting and malformations of the cranial sutures. Both abnormalities are analogous to the failure-to-fuse phenotype that we observe in SO3 fragmentation. This integrative approach will enable discovery of the causative genetic lesions leading to complex craniofacial features analogous to human craniofacial disorders. This work underscores the value of cave-dwelling fish as a

  2. VISPA: a computational pipeline for the identification and analysis of genomic vector integration sites.

    PubMed

    Calabria, Andrea; Leo, Simone; Benedicenti, Fabrizio; Cesana, Daniela; Spinozzi, Giulio; Orsini, Massimilano; Merella, Stefania; Stupka, Elia; Zanetti, Gianluigi; Montini, Eugenio

    2014-01-01

    The analysis of the genomic distribution of viral vector integration sites is a key step in hematopoietic stem cell-based gene therapy applications, allowing assessment of both the safety and the efficacy of the treatment and the study of basic aspects of hematopoiesis and stem cell biology. Identifying vector integration sites requires ad-hoc bioinformatics tools with stringent requirements in terms of computational efficiency, flexibility, and usability. We developed VISPA (Vector Integration Site Parallel Analysis), a pipeline for automated integration site identification and annotation based on a distributed environment with a simple Galaxy web interface. VISPA was successfully used for the bioinformatics analysis of the follow-up of two lentiviral vector-based hematopoietic stem-cell gene therapy clinical trials. Our pipeline provides a reliable and efficient tool to assess the safety and efficacy of integrating vectors in clinical settings. PMID:25342980

  3. The Plant Genome Integrative Explorer Resource: PlantGenIE.org.

    PubMed

    Sundell, David; Mannapperuma, Chanaka; Netotea, Sergiu; Delhomme, Nicolas; Lin, Yao-Cheng; Sjödin, Andreas; Van de Peer, Yves; Jansson, Stefan; Hvidsten, Torgeir R; Street, Nathaniel R

    2015-12-01

    Accessing and exploring large-scale genomics data sets remains a significant challenge to researchers without specialist bioinformatics training. We present the integrated PlantGenIE.org platform for exploration of Populus, conifer and Arabidopsis genomics data, which includes expression networks and associated visualization tools. Standard features of a model organism database are provided, including genome browsers, gene list annotation, Blast homology searches and gene information pages. Community annotation updating is supported via integration of WebApollo. We have produced an RNA-sequencing (RNA-Seq) expression atlas for Populus tremula and have integrated these data within the expression tools. An updated version of the ComPlEx resource for performing comparative plant expression analyses of gene coexpression network conservation between species has also been integrated. The PlantGenIE.org platform provides intuitive access to large-scale and genome-wide genomics data from model forest tree species, facilitating both community contributions to annotation improvement and tools supporting use of the included data resources to inform biological insight. PMID:26192091

  4. A 4103 marker integrated physical and comparative map of the horse genome

    PubMed Central

    Raudsepp, Terje; Gustafson-Seabury, Ashley; Durkin, Keith; Wagner, Michelle L.; Goh, Glenda; Seabury, Christopher M.; Brinkmeyer-Langford, Candice; Lee, Eun-Joon; Agarwala, Richa; Rice, Edward Stallknecht; Schäffer, Alejandro A.; Skow, Loren C.; Tozaki, Teruaki; Yasue, Hiroshi; Penedo, M. Cecilia T.; Lyons, Leslie A.; Khazanehdari, Kamal A.; Binns, Matthew M.; MacLeod, James N.; Distl, Ottmar; Guérin, Gérard; Leeb, Tosso; Mickelson, James R.; Chowdhary, Bhanu P.

    2008-01-01

    A comprehensive second-generation whole genome radiation hybrid (RH II), cytogenetic and comparative map of the horse genome (2n=64) has been developed using the 5000rad horse × hamster radiation hybrid panel and fluorescence in situ hybridization (FISH). The map contains 4,103 markers (3,816 RH, 1,144 FISH) assigned to all 31 pairs of autosomes and the X chromosome. The RH maps of individual chromosomes are anchored and oriented using 857 cytogenetic markers. The overall resolution of the map is one marker per 775 kilobase-pairs (kb), which represents a more than five-fold improvement over the first-generation map. The RH II incorporates 920 markers shared jointly with the two recently reported meiotic maps. Consequently the two maps were aligned with the RH II maps of individual autosomes and the X chromosome. Additionally, a comparative map of the horse genome was generated by connecting 1,904 loci on the horse map with genome sequences available for eight diverse vertebrates to highlight regions of evolutionarily conserved syntenies, linkages and chromosomal breakpoints. The integrated map thus obtained presents the most comprehensive information on the physical and comparative organization of the equine genome and will assist future assemblies of whole genome BAC fingerprint maps and the genome sequence. It will also serve as a tool to identify genes governing health, disease and performance traits in horses and assist us in understanding the evolution of the equine genome in relation to other species. PMID:18931483

  5. Tomato genomic resources database: an integrated repository of useful tomato genomic information for basic and applied research.

    PubMed

    Suresh, B Venkata; Roy, Riti; Sahu, Kamlesh; Misra, Gopal; Chattopadhyay, Debasis

    2014-01-01

    Tomato Genomic Resources Database (TGRD) allows interactive browsing of tomato genes, microRNAs, simple sequence repeats (SSRs), important quantitative trait loci and the Tomato-EXPEN 2000 genetic map, together or separately, along the twelve chromosomes of tomato in a single window. The database is created using the sequence of the cultivar Heinz 1706. High quality single nucleotide polymorphic (SNP) sites between the genes of Heinz 1706 and the wild tomato S. pimpinellifolium LA1589 are also included. Genes are classified into different families. 5'-upstream sequences (5'-US) of all the genes and their tissue-specific expression profiles are provided. Sequences of the microRNA loci and their putative target genes are catalogued. Genes and 5'-US show the presence of SSRs and SNPs. SSRs located in the genomic, genic and 5'-US can be analysed separately for the presence of any particular motif. Primer sequences for all the SSRs and flanking sequences for all the genic SNPs have been provided. TGRD is a user-friendly web-accessible relational database and uses CMAP viewer for graphical scanning of all the features. Integration and graphical presentation of important genomic information will facilitate better and easier use of the tomato genome. TGRD can be accessed as an open source repository at http://59.163.192.91/tomato2/. PMID:24466070

  6. Tomato Genomic Resources Database: An Integrated Repository of Useful Tomato Genomic Information for Basic and Applied Research

    PubMed Central

    Suresh, B. Venkata; Roy, Riti; Sahu, Kamlesh; Misra, Gopal; Chattopadhyay, Debasis

    2014-01-01

    The Tomato Genomic Resources Database (TGRD) allows interactive browsing of tomato genes, microRNAs, simple sequence repeats (SSRs), important quantitative trait loci and the Tomato-EXPEN 2000 genetic map, together or separately, along the twelve tomato chromosomes in a single window. The database was created using the sequence of the cultivar Heinz 1706. High-quality single nucleotide polymorphism (SNP) sites between the genes of Heinz 1706 and the wild tomato S. pimpinellifolium LA1589 are also included. Genes are classified into different families. The 5′-upstream sequences (5′-US) of all the genes and their tissue-specific expression profiles are provided. Sequences of the microRNA loci and their putative target genes are catalogued. SSRs and SNPs present in genes and 5′-US are indicated. SSRs located in genomic, genic and 5′-US regions can be analysed separately for the presence of any particular motif. Primer sequences for all the SSRs and flanking sequences for all the genic SNPs are provided. TGRD is a user-friendly, web-accessible relational database and uses the CMAP viewer for graphical scanning of all the features. Integration and graphical presentation of important genomic information will facilitate better and easier use of the tomato genome. TGRD can be accessed as an open source repository at http://59.163.192.91/tomato2/. PMID:24466070

  7. Production of α1,3-galactosyltransferase targeted pigs using transcription activator-like effector nuclease-mediated genome editing technology.

    PubMed

    Kang, Jung-Taek; Kwon, Dae-Kee; Park, A-Rum; Lee, Eun-Jin; Yun, Yun-Jin; Ji, Dal-Young; Lee, Kiho; Park, Kwang-Wook

    2016-03-01

    Recent developments in genome editing technology using meganucleases demonstrate an efficient method of producing gene-edited pigs. In this study, we examined the effectiveness of the transcription activator-like effector nuclease (TALEN) system in generating specific mutations in the pig genome. A specific TALEN pair was designed to induce a double-strand break in exon 9 of the porcine α1,3-galactosyltransferase (GGTA1) gene, as this gene is the main cause of hyperacute rejection after xenotransplantation. The human decay-accelerating factor (hDAF) gene, which encodes a complement inhibitor that protects cells from complement attack after xenotransplantation, was integrated into the genome at the same time. Plasmids coding for the TALEN pair and the hDAF gene were transfected into porcine cells by electroporation to disrupt the porcine GGTA1 gene and express hDAF. The transfected cells were then sorted using a biotin-labeled IB4 lectin attached to magnetic beads to obtain GGTA1-deficient cells. As a result, we established GGTA1 knockout (KO) cell lines with biallelic modification (35.0%) and GGTA1 KO cell lines expressing hDAF (13.0%). When these cells were used for somatic cell nuclear transfer, we successfully obtained live GGTA1 KO pigs expressing hDAF. Our results demonstrate that TALEN-mediated genome editing is efficient and can be successfully used to generate gene-edited pigs. PMID:27051344

  8. Production of α1,3-galactosyltransferase targeted pigs using transcription activator-like effector nuclease-mediated genome editing technology

    PubMed Central

    Kang, Jung-Taek; Kwon, Dae-Kee; Park, A-Rum; Lee, Eun-Jin; Yun, Yun-Jin; Ji, Dal-Young; Lee, Kiho

    2016-01-01

    Recent developments in genome editing technology using meganucleases demonstrate an efficient method of producing gene-edited pigs. In this study, we examined the effectiveness of the transcription activator-like effector nuclease (TALEN) system in generating specific mutations in the pig genome. A specific TALEN pair was designed to induce a double-strand break in exon 9 of the porcine α1,3-galactosyltransferase (GGTA1) gene, as this gene is the main cause of hyperacute rejection after xenotransplantation. The human decay-accelerating factor (hDAF) gene, which encodes a complement inhibitor that protects cells from complement attack after xenotransplantation, was integrated into the genome at the same time. Plasmids coding for the TALEN pair and the hDAF gene were transfected into porcine cells by electroporation to disrupt the porcine GGTA1 gene and express hDAF. The transfected cells were then sorted using a biotin-labeled IB4 lectin attached to magnetic beads to obtain GGTA1-deficient cells. As a result, we established GGTA1 knockout (KO) cell lines with biallelic modification (35.0%) and GGTA1 KO cell lines expressing hDAF (13.0%). When these cells were used for somatic cell nuclear transfer, we successfully obtained live GGTA1 KO pigs expressing hDAF. Our results demonstrate that TALEN-mediated genome editing is efficient and can be successfully used to generate gene-edited pigs. PMID:27051344

  9. Integrating genetic and genomic information into effective cancer care in diverse populations

    PubMed Central

    Fashoyin-Aje, L.; Sanghavi, K.; Bjornard, K.; Bodurtha, J.

    2013-01-01

    This paper provides an overview of issues in the integration of genetic (related to hereditary DNA) and genomic (related to genes and their functions) information into cancer care for individuals and families served by health care systems worldwide, from low- to high-resourced. National and regional cancer plans have the potential to integrate genetic and genomic information with the goal of identifying and helping individuals and families with, or at risk of, cancer. Healthcare professionals and the public have the opportunity to increase their genetic literacy and their communication about cancer family history to enhance cancer control, prevention, and tailored therapies. PMID:24001763

  10. Tc1-like Transposase Thm3 of Silver Carp (Hypophthalmichthys molitrix) Can Mediate Gene Transposition in the Genome of Blunt Snout Bream (Megalobrama amblycephala)

    PubMed Central

    Guo, Xiu-Ming; Zhang, Qian-Qian; Sun, Yi-Wen; Jiang, Xia-Yun; Zou, Shu-Ming

    2015-01-01

    Tc1-like transposons consist of an inverted repeat sequence flanking a transposase gene that exhibits similarity to the mobile DNA element, Tc1, of the nematode, Caenorhabditis elegans. They are widely distributed within vertebrate genomes including teleost fish; however, few active Tc1-like transposases have been discovered. In this study, 17 Tc1-like transposon sequences were isolated from 10 freshwater fish species belonging to the families Cyprinidae, Adrianichthyidae, Cichlidae, and Salmonidae. We conducted phylogenetic analyses of these sequences using previously isolated Tc1-like transposases and report that 16 of these elements comprise a new subfamily of Tc1-like transposons. In particular, we show that one transposon, Thm3 from silver carp (Hypophthalmichthys molitrix; Cyprinidae), encodes a 335-aa transposase with apparently intact domains and is present in three to five copies in the genome. We then coinjected donor plasmids harboring 367 bp of the left end and 230 bp of the right end of the nonautonomous silver carp Thm1 cis-element along with capped Thm3 transposase RNA into one- to two-cell embryos of blunt snout bream (Megalobrama amblycephala). This experiment revealed that the average integration rate could reach 50.6% in adult fish. Within the blunt snout bream genome, the TA dinucleotide direct repeat, which is the signature of the Tc1-like family of transposons, was created adjacent to both ends of Thm1 at the integration sites. Our results indicate that the silver carp Thm3 transposase can mediate gene insertion by transposition within the blunt snout bream genome, and that this occurs with a TA position preference. PMID:26438298
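
    The TA direct repeat described in this abstract can be illustrated with a toy model. This is a sketch under stated assumptions, not the authors' analysis: the function names, the short genome string and the placeholder element sequence are all hypothetical, and the model only captures that insertion at a TA target leaves a TA on both sides of the element.

```python
def ta_sites(genome):
    """Return the 0-based start index of every TA dinucleotide in genome."""
    g = genome.upper()
    return [i for i in range(len(g) - 1) if g[i:i + 2] == "TA"]


def integrate(genome, element, site):
    """Insert element at the TA starting at site, duplicating the TA.

    Toy model: the single target TA becomes one TA on each side of the
    inserted element (the target site duplication).
    """
    g = genome.upper()
    if g[site:site + 2] != "TA":
        raise ValueError("this model only allows insertion at a TA dinucleotide")
    return g[:site] + "TA" + element.upper() + "TA" + g[site + 2:]


def flanked_by_ta(genome, element):
    """Check that the first copy of element in genome has TA on both sides."""
    g = genome.upper()
    e = element.upper()
    idx = g.find(e)
    if idx < 2:
        return False
    end = idx + len(e)
    return g[idx - 2:idx] == "TA" and g[end:end + 2] == "TA"


# Hypothetical inputs: neither string is a real bream or Thm1 sequence.
toy_genome = "GGCTAGCC"
toy_element = "CAGTGGGGCACTG"

site = ta_sites(toy_genome)[0]
edited = integrate(toy_genome, toy_element, site)
print(site, edited, flanked_by_ta(edited, toy_element))
```

    In real data the same check would be run on sequenced junctions at each integration site rather than on a constructed string.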