Science.gov

Sample records for cancer genome sequences

  1. Cancer genome-sequencing study design.

    PubMed

    Mwenifumbo, Jill C; Marra, Marco A

    2013-05-01

    Discoveries from cancer genome sequencing have the potential to translate into advances in cancer prevention, diagnostics, prognostics, treatment and basic biology. Given the diversity of downstream applications, cancer genome-sequencing studies need to be designed to best fulfil specific aims. Knowledge of second-generation cancer genome-sequencing study design also facilitates assessment of the validity and importance of the rapidly growing number of published studies. In this Review, we focus on the practical application of second-generation sequencing technology (also known as next-generation sequencing) to cancer genomics and discuss how aspects of study design and methodological considerations - such as the size and composition of the discovery cohort - can be tailored to serve specific research aims.

  2. Comprehensive genome sequencing of the liver cancer genome.

    PubMed

    Nakagawa, Hidewaki; Shibata, Tatsuhiro

    2013-11-01

    Hepatocellular carcinoma (HCC) is the third leading cause of cancer-related death worldwide. Recently, comprehensive whole genome and exome sequencing analyses for HCC revealed new cancer-associated genes and a variety of genomic alterations. In particular, frequent genetic alterations of the chromatin remodeling genes were observed, suggesting a new potential therapeutic target for HCC. Sequencing analysis has further identified the molecular complexities of multicentric lesions and intratumoral heterogeneity. Detailed analyses of the somatic substitution pattern of the cancer genome and the HBV virus genome integration sites by using whole-genome sequencing will elucidate the molecular basis and diverse etiological factors involved in liver cancer development.

  3. Cancer whole-genome sequencing: present and future.

    PubMed

    Nakagawa, H; Wardell, C P; Furuta, M; Taniguchi, H; Fujimoto, A

    2015-12-03

    Recent explosive advances in next-generation sequencing technology and computational approaches to massive data enable us to analyze a number of cancer genome profiles by whole-genome sequencing (WGS). To explore cancer genomic alterations and their diversity comprehensively, global and local cancer genome-sequencing projects, including ICGC and TCGA, have been analyzing many types of cancer genomes mainly by exome sequencing. However, there is limited information on somatic mutations in non-coding regions including untranslated regions, introns, regulatory elements and non-coding RNAs, and rearrangements, sometimes producing fusion genes, and pathogen detection in cancer genomes remain widely unexplored. WGS approaches can detect these unexplored mutations, as well as coding mutations and somatic copy number alterations, and help us to better understand the whole landscape of cancer genomes and elucidate functions of these unexplored genomic regions. Analysis of cancer genomes using the present WGS platforms is still primitive and there are substantial improvements to be made in sequencing technologies, informatics and computer resources. Taking account of the extreme diversity of cancer genomes and phenotype, it is also required to analyze much more WGS data and integrate these with multi-omics data, functional data and clinical-pathological data in a large number of sample sets to interpret them more fully and efficiently.

  4. Optimizing cancer genome sequencing and analysis

    PubMed Central

    Griffith, Malachi; Miller, Christopher A.; Griffith, Obi L.; Krysiak, Kilannin; Skidmore, Zachary L.; Ramu, Avinash; Walker, Jason R.; Dang, Ha X.; Trani, Lee; Larson, David E.; Demeter, Ryan T.; Wendl, Michael C.; McMichael, Joshua F.; Austin, Rachel E.; Magrini, Vincent; McGrath, Sean D.; Ly, Amy; Kulkarni, Shashikant; Cordes, Matthew G.; Fronick, Catrina C.; Fulton, Robert S.; Maher, Christopher A.; Ding, Li; Klco, Jeffery M.; Mardis, Elaine R.; Ley, Timothy J.; Wilson, Richard K.

    2015-01-01

    Summary: Tumors are typically sequenced to depths of 75–100× (exome) or 30–50× (whole genome). We demonstrate that current sequencing paradigms are inadequate for tumors that are impure, aneuploid or clonally heterogeneous. To reassess optimal sequencing strategies, we performed ultra-deep (up to ~312×) whole genome sequencing (WGS) and exome capture (up to ~433×) of a primary acute myeloid leukemia, its subsequent relapse, and a matched normal skin sample. We tested multiple alignment and variant calling algorithms and validated ~200,000 putative SNVs by sequencing them to depths of ~1,000×. Additional targeted sequencing provided over 10,000× coverage and ddPCR assays provided up to ~250,000× sampling of selected sites. We evaluated the effects of different library generation approaches, depth of sequencing, and analysis strategies on the ability to effectively characterize a complex tumor. This dataset, representing the most comprehensively sequenced tumor described to date, will serve as an invaluable community resource (dbGaP accession id phs000159). PMID:26645048
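
    The abstract's point about depth versus impure or subclonal tumors can be illustrated with a simple sampling calculation. The sketch below is not the authors' method: it assumes idealized binomial sampling of reads, no sequencing error, and a heterozygous variant in a diploid region. The function names and the threshold of three variant-supporting reads are hypothetical choices made for illustration.

```python
from math import comb


def expected_vaf(purity: float, cancer_cell_fraction: float) -> float:
    """Expected variant allele fraction for a heterozygous variant in a
    diploid region (assumption): half of the tumor-derived alleles carry it."""
    return purity * cancer_cell_fraction / 2.0


def detection_probability(depth: int, vaf: float, min_alt_reads: int = 3) -> float:
    """Probability of observing at least `min_alt_reads` variant reads when
    the variant read count follows Binomial(depth, vaf)."""
    p_fewer = sum(
        comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    # Hypothetical subclone: 40% tumor purity, present in 10% of cancer cells.
    vaf = expected_vaf(purity=0.4, cancer_cell_fraction=0.1)
    for depth in (30, 100, 312, 1000):
        prob = detection_probability(depth, vaf)
        print(f"depth {depth:>4}x, VAF {vaf:.3f}: P(detect) = {prob:.3f}")
```

    Under these assumptions the detection probability rises steeply with depth for low-fraction variants, which is in line with the abstract's argument that 30–50× whole-genome coverage can miss subclonal events.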

  5. Cancer Genome Sequencing: Understanding Malignancy as a Disease of the Genome, its Conformation, and its Evolution

    PubMed Central

    Patel, Lalit R.; Nykter, Matti; Chen, Kexin; Zhang, Wei

    2013-01-01

    Advances in cancer genomics have been propelled by the steady evolution of molecular profiling technologies. Over the past decade, high-throughput sequencing technologies have matured to the point necessary to support disease-specific shotgun sequencing. This has compelled whole-genome sequencing studies across a broad panel of malignancies. The emergence of high-throughput sequencing technologies has inspired new chemical and computational techniques enabling interrogation of cancer-specific genomic and transcriptomic variants, previously unannotated genes, and chromatin structure. Finally, recent progress in single-cell sequencing holds great promise for studies interrogating the consequences of tumor evolution in cancers presenting with genomic heterogeneity. PMID:23111104

  6. The clinical potential and challenges of sequencing cancer genomes for personalized medical genomics.

    PubMed

    Cloonan, Nicole; Waddell, Nic; Grimmond, Sean M

    2010-11-01

    Next-generation sequencing is revolutionizing the way in which genomic-scale biological research is performed, and its effects are beginning to be translated medically. Large-scale international collaborations for the comprehensive sequencing of the genome, epigenome, and transcriptomes of cancers and corresponding 'normal' (germ-line) DNA are heralding the start of personalized medical genomics. The promise of eliminating conjecture when determining treatment approaches is certainly appealing for both patients and clinicians; however, several major issues must be resolved before next-generation sequencing will be adopted as a routine clinical tool for patients. This feature review explores the clinical potential and challenges of studying cancer genomes for personalized medical genomics.

  7. Returning individual research results for genome sequences of pancreatic cancer

    PubMed Central

    2014-01-01

    Background: Disclosure of individual results to participants in genomic research is a complex and contentious issue. There are many existing commentaries and opinion pieces on the topic, but little empirical data concerning actual cases describing how individual results have been returned. Thus, the real life risks and benefits of disclosing individual research results to participants are rarely if ever presented as part of this debate. Methods: The Australian Pancreatic Cancer Genome Initiative (APGI) is an Australian contribution to the International Cancer Genome Consortium (ICGC) that involves prospective sequencing of tumor and normal genomes of study participants with pancreatic cancer in Australia. We present three examples that illustrate different facets of how research results may arise, and how they may be returned to individuals within an ethically defensible and clinically practical framework. This framework includes the necessary elements identified by others including consent, determination of the significance of results and which to return, delineation of the responsibility for communication and the clinical pathway for managing the consequences of returning results. Results: Of 285 recruited patients, we returned results to a total of 25 with no adverse events to date. These included four that were classified as medically actionable, nine as clinically significant and eight that were returned at the request of the treating clinician. Case studies presented depict instances where research results impacted on cancer susceptibility, current treatment and diagnosis, and illustrate key practical challenges of developing an effective framework. Conclusions: We suggest that return of individual results is both feasible and ethically defensible but only within the context of a robust framework that involves a close relationship between researchers and clinicians. PMID:24963353

  8. Targeted or whole genome sequencing of formalin fixed tissue samples: potential applications in cancer genomics.

    PubMed

    Munchel, Sarah; Hoang, Yen; Zhao, Yue; Cottrell, Joseph; Klotzle, Brandy; Godwin, Andrew K; Koestler, Devin; Beyerlein, Peter; Fan, Jian-Bing; Bibikova, Marina; Chien, Jeremy

    2015-09-22

    Current genomic studies are limited by the poor availability of fresh-frozen tissue samples. Although formalin-fixed diagnostic samples are in abundance, they are seldom used in current genomic studies because of concern about formalin-fixation artifacts. Better characterization of these artifacts will allow the use of archived clinical specimens in translational and clinical research studies. To provide a systematic analysis of formalin-fixation artifacts on Illumina sequencing, we generated 26 DNA sequencing data sets from 13 pairs of matched formalin-fixed paraffin-embedded (FFPE) and fresh-frozen (FF) tissue samples. The results indicate a high rate of concordant calls between matched FF/FFPE pairs at reference and variant positions in three commonly used sequencing approaches (whole genome, whole exome, and targeted exon sequencing). Global mismatch rates and C·G > T·A substitutions were comparable between matched FF/FFPE samples, and discordant rates were low (<0.26%) in all samples. Finally, low-pass whole genome sequencing produces a similar pattern of copy number alterations between FF/FFPE pairs. The results from our studies suggest the potential use of diagnostic FFPE samples for cancer genomic studies to characterize and catalog variations in cancer genomes.

  9. Predictive genomics: a cancer hallmark network framework for predicting tumor clinical phenotypes using genome sequencing data.

    PubMed

    Wang, Edwin; Zaman, Naif; Mcgee, Shauna; Milanese, Jean-Sébastien; Masoudi-Nejad, Ali; O'Connor-McCourt, Maureen

    2015-02-01

    Tumor genome sequencing leads to documenting thousands of DNA mutations and other genomic alterations. At present, these data cannot be analyzed adequately to aid in the understanding of tumorigenesis and its evolution. Moreover, we have little insight into how to use these data to predict clinical phenotypes and tumor progression to better design patient treatment. To meet these challenges, we discuss a cancer hallmark network framework for modeling genome sequencing data to predict cancer clonal evolution and associated clinical phenotypes. The framework includes: (1) cancer hallmarks that can be represented by a few molecular/signaling networks. 'Network operational signatures', which represent gene regulatory logics/strengths, enable quantification of state transitions and measures of hallmark traits. Thus, sets of genomic alterations which are associated with network operational signatures could be linked to the state/measure of hallmark traits. The network operational signature transforms genotypic data (i.e., genomic alterations) to regulatory phenotypic profiles (i.e., regulatory logics/strengths), to cellular phenotypic profiles (i.e., hallmark traits) which lead to clinical phenotypic profiles (i.e., a collection of hallmark traits). Furthermore, the framework considers regulatory logics of the hallmark networks under tumor evolutionary dynamics and therefore also includes: (2) a self-promoting positive feedback loop that is dominated by a genomic instability network and a cell survival/proliferation network is the main driver of tumor clonal evolution. Surrounding tumor stroma and its host immune systems shape the evolutionary paths; (3) cell motility initiating metastasis is a byproduct of the above self-promoting loop activity during tumorigenesis; (4) an emerging hallmark network which triggers genome duplication dominates a feed-forward loop which in turn could act as a rate-limiting step for tumor formation; (5) mutations and other genomic alterations have

  10. Genome Sequencing.

    PubMed

    Verma, Mansi; Kulshrestha, Samarth; Puri, Ayush

    2017-01-01

    Genome sequencing is an important step toward correlating genotypes with phenotypic characters. Sequencing technologies are important in many fields in the life sciences, including functional genomics, transcriptomics, oncology, evolutionary biology, forensic sciences, and many more. The era of sequencing has been divided into three generations. First generation sequencing involved sequencing by synthesis (Sanger sequencing) and sequencing by cleavage (Maxam-Gilbert sequencing). Sanger sequencing led to the completion of various genome sequences (including human) and provided the foundation for development of other sequencing technologies. Since then, various techniques have been developed which can overcome some of the limitations of Sanger sequencing. These techniques are collectively known as "Next-generation sequencing" (NGS), and are further classified into second and third generation technologies. Although NGS methods have many advantages in terms of speed, cost, and parallelism, the accuracy and read length of Sanger sequencing are still superior, which has confined the use of NGS mainly to resequencing genomes. Consequently, there is a continuing need to develop improved real time sequencing techniques. This chapter reviews some of the options currently available and provides a generic workflow for sequencing a genome.

  11. A somatic reference standard for cancer genome sequencing

    PubMed Central

    Craig, David W.; Nasser, Sara; Corbett, Richard; Chan, Simon K.; Murray, Lisa; Legendre, Christophe; Tembe, Waibhav; Adkins, Jonathan; Kim, Nancy; Wong, Shukmei; Baker, Angela; Enriquez, Daniel; Pond, Stephanie; Pleasance, Erin; Mungall, Andrew J.; Moore, Richard A.; McDaniel, Timothy; Ma, Yussanne; Jones, Steven J. M.; Marra, Marco A.; Carpten, John D.; Liang, Winnie S.

    2016-01-01

    Large-scale multiplexed identification of somatic alterations in cancer has become feasible with next generation sequencing (NGS). However, calibration of NGS somatic analysis tools has been hampered by a lack of tumor/normal reference standards. We thus performed paired PCR-free whole genome sequencing of a matched metastatic melanoma cell line (COLO829) and normal across three lineages and across separate institutions, with independent library preparations, sequencing, and analysis. We generated mean mapped coverages of 99X for COLO829 and 103X for the paired normal across three institutions. Results were combined with previously generated data allowing for comparison to a fourth lineage on earlier NGS technology. Aggregate variant detection led to the identification of consensus variants, including key events that represent hallmark mutation types including amplified BRAF V600E, a CDKN2A small deletion, a 12 kb PTEN deletion, and a dinucleotide TERT promoter substitution. Overall, common events include >35,000 point mutations, 446 small insertion/deletions, and >6,000 genes affected by copy number changes. We present this reference to the community as an initial standard for enabling quantitative evaluation of somatic mutation pipelines across institutions. PMID:27094764

  12. A somatic reference standard for cancer genome sequencing.

    PubMed

    Craig, David W; Nasser, Sara; Corbett, Richard; Chan, Simon K; Murray, Lisa; Legendre, Christophe; Tembe, Waibhav; Adkins, Jonathan; Kim, Nancy; Wong, Shukmei; Baker, Angela; Enriquez, Daniel; Pond, Stephanie; Pleasance, Erin; Mungall, Andrew J; Moore, Richard A; McDaniel, Timothy; Ma, Yussanne; Jones, Steven J M; Marra, Marco A; Carpten, John D; Liang, Winnie S

    2016-04-20

    Large-scale multiplexed identification of somatic alterations in cancer has become feasible with next generation sequencing (NGS). However, calibration of NGS somatic analysis tools has been hampered by a lack of tumor/normal reference standards. We thus performed paired PCR-free whole genome sequencing of a matched metastatic melanoma cell line (COLO829) and normal across three lineages and across separate institutions, with independent library preparations, sequencing, and analysis. We generated mean mapped coverages of 99X for COLO829 and 103X for the paired normal across three institutions. Results were combined with previously generated data allowing for comparison to a fourth lineage on earlier NGS technology. Aggregate variant detection led to the identification of consensus variants, including key events that represent hallmark mutation types including amplified BRAF V600E, a CDKN2A small deletion, a 12 kb PTEN deletion, and a dinucleotide TERT promoter substitution. Overall, common events include >35,000 point mutations, 446 small insertion/deletions, and >6,000 genes affected by copy number changes. We present this reference to the community as an initial standard for enabling quantitative evaluation of somatic mutation pipelines across institutions.

  13. Analyzing Somatic Genome Rearrangements in Human Cancers by Using Whole-Exome Sequencing | Office of Cancer Genomics

    Cancer.gov

    Although exome sequencing data are generated primarily to detect single-nucleotide variants and indels, they can also be used to identify a subset of genomic rearrangements whose breakpoints are located in or near exons. Using >4,600 tumor and normal pairs across 15 cancer types, we identified over 9,000 high confidence somatic rearrangements, including a large number of gene fusions.

  14. Genome sequencing and analysis of the Tasmanian devil and its transmissible cancer.

    PubMed

    Murchison, Elizabeth P; Schulz-Trieglaff, Ole B; Ning, Zemin; Alexandrov, Ludmil B; Bauer, Markus J; Fu, Beiyuan; Hims, Matthew; Ding, Zhihao; Ivakhno, Sergii; Stewart, Caitlin; Ng, Bee Ling; Wong, Wendy; Aken, Bronwen; White, Simon; Alsop, Amber; Becq, Jennifer; Bignell, Graham R; Cheetham, R Keira; Cheng, William; Connor, Thomas R; Cox, Anthony J; Feng, Zhi-Ping; Gu, Yong; Grocock, Russell J; Harris, Simon R; Khrebtukova, Irina; Kingsbury, Zoya; Kowarsky, Mark; Kreiss, Alexandre; Luo, Shujun; Marshall, John; McBride, David J; Murray, Lisa; Pearse, Anne-Maree; Raine, Keiran; Rasolonjatovo, Isabelle; Shaw, Richard; Tedder, Philip; Tregidgo, Carolyn; Vilella, Albert J; Wedge, David C; Woods, Gregory M; Gormley, Niall; Humphray, Sean; Schroth, Gary; Smith, Geoffrey; Hall, Kevin; Searle, Stephen M J; Carter, Nigel P; Papenfuss, Anthony T; Futreal, P Andrew; Campbell, Peter J; Yang, Fengtang; Bentley, David R; Evers, Dirk J; Stratton, Michael R

    2012-02-17

    The Tasmanian devil (Sarcophilus harrisii), the largest marsupial carnivore, is endangered due to a transmissible facial cancer spread by direct transfer of living cancer cells through biting. Here we describe the sequencing, assembly, and annotation of the Tasmanian devil genome and whole-genome sequences for two geographically distant subclones of the cancer. Genomic analysis suggests that the cancer first arose from a female Tasmanian devil and that the clone has subsequently genetically diverged during its spread across Tasmania. The devil cancer genome contains more than 17,000 somatic base substitution mutations and bears the imprint of a distinct mutational process. Genotyping of somatic mutations in 104 geographically and temporally distributed Tasmanian devil tumors reveals the pattern of evolution and spread of this parasitic clonal lineage, with evidence of a selective sweep in one geographical area and persistence of parallel lineages in other populations.

  15. A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing

    PubMed Central

    Alioto, Tyler S.; Buchhalter, Ivo; Derdak, Sophia; Hutter, Barbara; Eldridge, Matthew D.; Hovig, Eivind; Heisler, Lawrence E.; Beck, Timothy A.; Simpson, Jared T.; Tonon, Laurie; Sertier, Anne-Sophie; Patch, Ann-Marie; Jäger, Natalie; Ginsbach, Philip; Drews, Ruben; Paramasivam, Nagarajan; Kabbe, Rolf; Chotewutmontri, Sasithorn; Diessl, Nicolle; Previti, Christopher; Schmidt, Sabine; Brors, Benedikt; Feuerbach, Lars; Heinold, Michael; Gröbner, Susanne; Korshunov, Andrey; Tarpey, Patrick S.; Butler, Adam P.; Hinton, Jonathan; Jones, David; Menzies, Andrew; Raine, Keiran; Shepherd, Rebecca; Stebbings, Lucy; Teague, Jon W.; Ribeca, Paolo; Giner, Francesc Castro; Beltran, Sergi; Raineri, Emanuele; Dabad, Marc; Heath, Simon C.; Gut, Marta; Denroche, Robert E.; Harding, Nicholas J.; Yamaguchi, Takafumi N.; Fujimoto, Akihiro; Nakagawa, Hidewaki; Quesada, Víctor; Valdés-Mas, Rafael; Nakken, Sigve; Vodák, Daniel; Bower, Lawrence; Lynch, Andrew G.; Anderson, Charlotte L.; Waddell, Nicola; Pearson, John V.; Grimmond, Sean M.; Peto, Myron; Spellman, Paul; He, Minghui; Kandoth, Cyriac; Lee, Semin; Zhang, John; Létourneau, Louis; Ma, Singer; Seth, Sahil; Torrents, David; Xi, Liu; Wheeler, David A.; López-Otín, Carlos; Campo, Elías; Campbell, Peter J.; Boutros, Paul C.; Puente, Xose S.; Gerhard, Daniela S.; Pfister, Stefan M.; McPherson, John D.; Hudson, Thomas J.; Schlesner, Matthias; Lichter, Peter; Eils, Roland; Jones, David T. W.; Gut, Ivo G.

    2015-01-01

    As whole-genome sequencing for cancer genome analysis becomes a clinical tool, a full understanding of the variables affecting sequencing analysis output is required. Here using tumour-normal sample pairs from two different types of cancer, chronic lymphocytic leukaemia and medulloblastoma, we conduct a benchmarking exercise within the context of the International Cancer Genome Consortium. We compare sequencing methods, analysis pipelines and validation methods. We show that using PCR-free methods and increasing sequencing depth to ∼100 × shows benefits, as long as the tumour:control coverage ratio remains balanced. We observe widely varying mutation call rates and low concordance among analysis pipelines, reflecting the artefact-prone nature of the raw data and lack of standards for dealing with the artefacts. However, we show that, using the benchmark mutation set we have created, many issues are in fact easy to remedy and have an immediate positive impact on mutation detection accuracy. PMID:26647970
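
    The benchmarking idea in this abstract, comparing pipeline call sets against a curated mutation set and against each other, can be sketched with plain set arithmetic. This is an illustrative sketch only, not the consortium's evaluation code: the variant tuples, pipeline names, and helper functions below are hypothetical, and real comparisons also need variant normalization and handling of indels and structural variants.

```python
from typing import Set, Tuple

# A variant is keyed here as (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def precision_recall(calls: Set[Variant], truth: Set[Variant]) -> Tuple[float, float]:
    """Precision and recall of a call set against a benchmark (truth) set."""
    true_positives = len(calls & truth)
    precision = true_positives / len(calls) if calls else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


def jaccard(a: Set[Variant], b: Set[Variant]) -> float:
    """Concordance between two call sets: shared calls over all calls."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    truth = {("1", 100, "A", "T"), ("1", 200, "C", "G"),
             ("2", 50, "G", "A"), ("3", 75, "T", "C")}
    pipeline_a = {("1", 100, "A", "T"), ("1", 200, "C", "G"),
                  ("2", 50, "G", "A"), ("5", 999, "A", "C")}
    pipeline_b = {("1", 100, "A", "T"), ("3", 75, "T", "C")}

    for name, calls in (("pipeline_a", pipeline_a), ("pipeline_b", pipeline_b)):
        precision, recall = precision_recall(calls, truth)
        print(f"{name}: precision={precision:.2f} recall={recall:.2f}")
    print(f"concordance (Jaccard) = {jaccard(pipeline_a, pipeline_b):.2f}")
```

    In this toy example each pipeline looks reasonable against the benchmark on its own, yet the two share only one call, which mirrors the abstract's observation of low concordance among pipelines.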

  16. Whole genome sequencing defines the genetic heterogeneity of familial pancreatic cancer

    PubMed Central

    Roberts, Nicholas J.; Norris, Alexis L.; Petersen, Gloria M.; Bondy, Melissa L.; Brand, Randall; Gallinger, Steven; Kurtz, Robert C.; Olson, Sara H.; Rustgi, Anil K.; Schwartz, Ann G.; Stoffel, Elena; Syngal, Sapna; Zogopoulos, George; Ali, Syed Z.; Axilbund, Jennifer; Chaffee, Kari G.; Chen, Yun-Ching; Cote, Michele L.; Childs, Erica J.; Douville, Christopher; Goes, Fernando S.; Herman, Joseph M.; Iacobuzio-Donahue, Christine; Kramer, Melissa; Makohon-Moore, Alvin; McCombie, Richard W.; McMahon, K. Wyatt; Niknafs, Noushin; Parla, Jennifer; Pirooznia, Mehdi; Potash, James B.; Rhim, Andrew D.; Smith, Alyssa L.; Wang, Yuxuan; Wolfgang, Christopher L.; Wood, Laura D.; Zandi, Peter P.; Goggins, Michael; Karchin, Rachel; Eshleman, James R.; Papadopoulos, Nickolas; Kinzler, Kenneth W.; Vogelstein, Bert; Hruban, Ralph H.; Klein, Alison P.

    2015-01-01

    Pancreatic cancer is projected to become the second leading cause of cancer-related death in the United States by 2020. A familial aggregation of pancreatic cancer has been established, but the cause of this aggregation in most families is unknown. To determine the genetic basis of susceptibility in these families, we sequenced the germline genome of 638 familial pancreatic cancer patients. We also sequenced the exomes of 39 familial pancreatic adenocarcinomas. Our analyses support the role of previously identified familial pancreatic cancer susceptibility genes such as BRCA2, CDKN2A and ATM, and identify novel candidate genes harboring rare, deleterious germline variants for further characterization. We also show how somatic point mutations that occur during hematopoiesis can affect the interpretation of genome-wide studies of hereditary traits. Our observations have important implications for the etiology of pancreatic cancer and for the identification of susceptibility genes in other common cancer types. PMID:26658419

  17. Clinical applications of next generation sequencing in cancer: from panels, to exomes, to genomes

    PubMed Central

    Shen, Tony; Pajaro-Van de Stadt, Stefan Hans; Yeat, Nai Chien; Lin, Jimmy C.-H.

    2015-01-01

    This article will review the recent impact of massively parallel next-generation sequencing (NGS) on our understanding and treatment of cancer. While whole exome sequencing (WES) remains popular and effective as a method of genetically profiling different cancers, advances in sequencing technology have enabled an increasing number of whole-genome based studies. Clinically, NGS has been used or is being developed for genetic screening, diagnostics, and clinical assessment. Though challenges remain, clinicians are in the early stages of using genetic data to make treatment decisions for cancer patients. As the integration of NGS in the study and treatment of cancer continues to mature, we believe that the field of cancer genomics will need to move toward more complete 100% genome sequencing. Current technologies and methods are largely limited to coding regions of the genome. A number of recent studies have demonstrated that mutations in non-coding regions may have direct tumorigenic effects or lead to genetic instability. Non-coding regions represent an important frontier in cancer genomics. PMID:26136771

  18. Tumor Genomic Profiling in Breast Cancer Patients Using Targeted Massively Parallel Sequencing

    DTIC Science & Technology

    2015-04-30

    exome sequencing to identify genomic mechanisms of therapeutic resistance The goal of this aim is to perform whole exome sequencing in breast cancer...identify novel resistance mechanisms. Of the 72 patients described above, we have been able to obtain matched pre-treatment primary tissues from 28...advanced breast cancer, and novel strategies to overcome resistance mechanisms. If widely deployed, implementation of this approach may open new

  19. Whole-genome sequencing identifies genomic heterogeneity at a nucleotide and chromosomal level in bladder cancer.

    PubMed

    Morrison, Carl D; Liu, Pengyuan; Woloszynska-Read, Anna; Zhang, Jianmin; Luo, Wei; Qin, Maochun; Bshara, Wiam; Conroy, Jeffrey M; Sabatini, Linda; Vedell, Peter; Xiong, Donghai; Liu, Song; Wang, Jianmin; Shen, He; Li, Yinwei; Omilian, Angela R; Hill, Annette; Head, Karen; Guru, Khurshid; Kunnev, Dimiter; Leach, Robert; Eng, Kevin H; Darlak, Christopher; Hoeflich, Christopher; Veeranki, Srividya; Glenn, Sean; You, Ming; Pruitt, Steven C; Johnson, Candace S; Trump, Donald L

    2014-02-11

    Using complete genome analysis, we sequenced five bladder tumors accrued from patients with muscle-invasive transitional cell carcinoma of the urinary bladder (TCC-UB) and identified a spectrum of genomic aberrations. In three tumors, complex genotype changes were noted. All three had tumor protein p53 mutations and a relatively large number of single-nucleotide variants (SNVs; average of 11.2 per megabase), structural variants (SVs; average of 46), or both. This group was best characterized by chromothripsis and the presence of subclonal populations of neoplastic cells or intratumoral mutational heterogeneity. Here, we provide evidence that the process of chromothripsis in TCC-UB is mediated by nonhomologous end-joining using kilobase, rather than megabase, fragments of DNA, which we refer to as "stitchers," to repair this process. We postulate that a potential unifying theme among tumors with the more complex genotype group is a defective replication-licensing complex. A second group (two bladder tumors) had no chromothripsis, and a simpler genotype, WT tumor protein p53, had relatively few SNVs (average of 5.9 per megabase) and only a single SV. There was no evidence of a subclonal population of neoplastic cells. In this group, we used a preclinical model of bladder carcinoma cell lines to study a unique SV (translocation and amplification) of the gene glutamate receptor ionotropic N-methyl D-aspartate as a potential new therapeutic target in bladder cancer.

  20. Whole-genome sequencing identifies genomic heterogeneity at a nucleotide and chromosomal level in bladder cancer

    PubMed Central

    Morrison, Carl D.; Liu, Pengyuan; Woloszynska-Read, Anna; Zhang, Jianmin; Luo, Wei; Qin, Maochun; Bshara, Wiam; Conroy, Jeffrey M.; Sabatini, Linda; Vedell, Peter; Xiong, Donghai; Liu, Song; Wang, Jianmin; Shen, He; Li, Yinwei; Omilian, Angela R.; Hill, Annette; Head, Karen; Guru, Khurshid; Kunnev, Dimiter; Leach, Robert; Eng, Kevin H.; Darlak, Christopher; Hoeflich, Christopher; Veeranki, Srividya; Glenn, Sean; You, Ming; Pruitt, Steven C.; Johnson, Candace S.; Trump, Donald L.

    2014-01-01

    Using complete genome analysis, we sequenced five bladder tumors accrued from patients with muscle-invasive transitional cell carcinoma of the urinary bladder (TCC-UB) and identified a spectrum of genomic aberrations. In three tumors, complex genotype changes were noted. All three had tumor protein p53 mutations and a relatively large number of single-nucleotide variants (SNVs; average of 11.2 per megabase), structural variants (SVs; average of 46), or both. This group was best characterized by chromothripsis and the presence of subclonal populations of neoplastic cells or intratumoral mutational heterogeneity. Here, we provide evidence that the process of chromothripsis in TCC-UB is mediated by nonhomologous end-joining using kilobase, rather than megabase, fragments of DNA, which we refer to as “stitchers,” to repair this process. We postulate that a potential unifying theme among tumors with the more complex genotype group is a defective replication–licensing complex. A second group (two bladder tumors) had no chromothripsis, and a simpler genotype, WT tumor protein p53, had relatively few SNVs (average of 5.9 per megabase) and only a single SV. There was no evidence of a subclonal population of neoplastic cells. In this group, we used a preclinical model of bladder carcinoma cell lines to study a unique SV (translocation and amplification) of the gene glutamate receptor ionotropic N-methyl D-aspartate as a potential new therapeutic target in bladder cancer. PMID:24469795

  1. Genome and transcriptome sequencing of lung cancers reveal diverse mutational and splicing events

    PubMed Central

    Liu, Jinfeng; Lee, William; Jiang, Zhaoshi; Chen, Zhongqiang; Jhunjhunwala, Suchit; Haverty, Peter M.; Gnad, Florian; Guan, Yinghui; Gilbert, Houston N.; Stinson, Jeremy; Klijn, Christiaan; Guillory, Joseph; Bhatt, Deepali; Vartanian, Steffan; Walter, Kimberly; Chan, Jocelyn; Holcomb, Thomas; Dijkgraaf, Peter; Johnson, Stephanie; Koeman, Julie; Minna, John D.; Gazdar, Adi F.; Stern, Howard M.; Hoeflich, Klaus P.; Wu, Thomas D.; Settleman, Jeff; de Sauvage, Frederic J.; Gentleman, Robert C.; Neve, Richard M.; Stokoe, David; Modrusan, Zora; Seshagiri, Somasekar; Shames, David S.; Zhang, Zemin

    2012-01-01

    Lung cancer is a highly heterogeneous disease in terms of both underlying genetic lesions and response to therapeutic treatments. We performed deep whole-genome sequencing and transcriptome sequencing on 19 lung cancer cell lines and three lung tumor/normal pairs. Overall, our data show that cell line models exhibit similar mutation spectra to human tumor samples. Smoker and never-smoker cancer samples exhibit distinguishable patterns of mutations. A number of epigenetic regulators, including KDM6A, ASH1L, SMARCA4, and ATAD2, are frequently altered by mutations or copy number changes. A systematic survey of splice-site mutations identified 106 splice site mutations associated with cancer specific aberrant splicing, including mutations in several known cancer-related genes. RAC1b, an isoform of the RAC1 GTPase that includes one additional exon, was found to be preferentially up-regulated in lung cancer. We further show that its expression is significantly associated with sensitivity to a MAP2K (MEK) inhibitor PD-0325901. Taken together, these data present a comprehensive genomic landscape of a large number of lung cancer samples and further demonstrate that cancer-specific alternative splicing is a widespread phenomenon that has potential utility as therapeutic biomarkers. The detailed characterizations of the lung cancer cell lines also provide genomic context to the vast amount of experimental data gathered for these lines over the decades, and represent highly valuable resources for cancer biology. PMID:23033341

  2. The current use and attitudes towards tumor genome sequencing in breast cancer.

    PubMed

    Gingras, I; Sonnenblick, A; de Azambuja, E; Paesmans, M; Delaloge, S; Aftimos, Philippe; Piccart, M J; Sotiriou, C; Ignatiadis, M; Azim, H A

    2016-03-02

    There is increasing availability of technologies that can interrogate the genomic landscape of an individual tumor; however, their impact on daily practice remains uncertain. We conducted a 28-item survey to investigate the current attitudes towards the integration of tumor genome sequencing in breast cancer management. A link to the survey was communicated via newsletters of several oncological societies, and dedicated mailing by academic research groups. Multivariable logistic regression modeling was carried out to determine the relationship between predictors and outcomes. A total of 215 physicians participated in the survey. The majority were medical oncologists (88%), practicing in Europe (70%) and working in academic institutions (66%). Tumor genome sequencing was requested by 82 participants (38%), of whom 21% reported low confidence in their genomic knowledge, and 56% considered tumor genome sequencing to be poorly accessible. In multivariable analysis, having time allocated to research (OR 3.37, 95% CI 1.84-6.15, p < 0.0001), working in Asia (OR 5.76, 95% CI 1.57-21.15, p = 0.01) and having institutional guidelines for molecular sequencing (OR 2.09, 95% CI 0.99-4.42, p = 0.05) were associated with a higher probability of use. In conclusion, our survey indicates that tumor genome sequencing is sometimes used, albeit not widely, in guiding management of breast cancer patients.

  3. The current use and attitudes towards tumor genome sequencing in breast cancer

    PubMed Central

    Gingras, I.; Sonnenblick, A.; de Azambuja, E.; Paesmans, M.; Delaloge, S.; Aftimos, Philippe; Piccart, M. J.; Sotiriou, C.; Ignatiadis, M.; Azim, H. A.

    2016-01-01

    There is increasing availability of technologies that can interrogate the genomic landscape of an individual tumor; however, their impact on daily practice remains uncertain. We conducted a 28-item survey to investigate the current attitudes towards the integration of tumor genome sequencing in breast cancer management. A link to the survey was communicated via newsletters of several oncological societies, and dedicated mailing by academic research groups. Multivariable logistic regression modeling was carried out to determine the relationship between predictors and outcomes. A total of 215 physicians participated in the survey. The majority were medical oncologists (88%), practicing in Europe (70%) and working in academic institutions (66%). Tumor genome sequencing was requested by 82 participants (38%), of whom 21% reported low confidence in their genomic knowledge, and 56% considered tumor genome sequencing to be poorly accessible. In multivariable analysis, having time allocated to research (OR 3.37, 95% CI 1.84–6.15, p < 0.0001), working in Asia (OR 5.76, 95% CI 1.57–21.15, p = 0.01) and having institutional guidelines for molecular sequencing (OR 2.09, 95% CI 0.99–4.42, p = 0.05) were associated with a higher probability of use. In conclusion, our survey indicates that tumor genome sequencing is sometimes used, albeit not widely, in guiding management of breast cancer patients. PMID:26931736
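
    The odds ratios in this record come from logistic regression, where 95% confidence intervals are commonly symmetric on the log scale, so the point estimate should lie near the geometric mean of the interval bounds. The following is a minimal sketch of that consistency check for the three reported results. It assumes Wald-type intervals, which the abstract does not state, and the function names are illustrative only.

```python
import math

# (predictor, reported OR, lower bound, upper bound) as quoted in the abstract
REPORTED = [
    ("time allocated to research", 3.37, 1.84, 6.15),
    ("working in Asia", 5.76, 1.57, 21.15),
    ("institutional guidelines", 2.09, 0.99, 4.42),
]


def implied_or(lower, upper):
    """Point estimate implied by a CI that is symmetric on the log scale."""
    return math.exp((math.log(lower) + math.log(upper)) / 2)


def implied_se(lower, upper, z=1.96):
    """Standard error of log(OR) implied by a 95% Wald interval (assumption)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


for label, odds_ratio, lo, hi in REPORTED:
    print(
        f"{label}: reported OR {odds_ratio}, "
        f"implied OR {implied_or(lo, hi):.2f}, "
        f"SE(log OR) {implied_se(lo, hi):.2f}"
    )
```

    A wide interval such as the one for working in Asia corresponds to a large standard error on the log scale, which usually reflects a small number of respondents in that group.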

  4. Mutational and structural analysis of diffuse large B-cell lymphoma using whole genome sequencing | Office of Cancer Genomics

    Cancer.gov

    Abstract: Diffuse large B-cell lymphoma (DLBCL) is a genetically heterogeneous cancer comprising at least two molecular subtypes that differ in gene expression and distribution of mutations. Recently, application of genome/exome sequencing and RNA-seq to DLBCL has revealed numerous genes that are recurrent targets of somatic point mutation in this disease.

  5. Complete Genome Sequence of Helicobacter pylori Strain 29CaP Isolated from a Mexican Patient with Gastric Cancer

    PubMed Central

    Mucito-Varela, Eduardo; Castillo-Rojas, Gonzalo; Cevallos, Miguel A.; Lozano, Luis; Merino, Enrique; López-Leal, Gamaliel

    2016-01-01

    Helicobacter pylori infection is a risk factor for the development of gastric cancer and other gastroduodenal diseases. We report here the complete genome sequence of H. pylori strain 29CaP, isolated from a Mexican patient with gastric cancer. The genomic data analysis revealed a cag-negative H. pylori strain that contains a prophage sequence. PMID:26769924

  6. Complete Genome Sequence of Helicobacter pylori Strain 29CaP Isolated from a Mexican Patient with Gastric Cancer.

    PubMed

    Mucito-Varela, Eduardo; Castillo-Rojas, Gonzalo; Cevallos, Miguel A; Lozano, Luis; Merino, Enrique; López-Leal, Gamaliel; López-Vidal, Yolanda

    2016-01-14

    Helicobacter pylori infection is a risk factor for the development of gastric cancer and other gastroduodenal diseases. We report here the complete genome sequence of H. pylori strain 29CaP, isolated from a Mexican patient with gastric cancer. The genomic data analysis revealed a cag-negative H. pylori strain that contains a prophage sequence.

  7. The Tip of the Iceberg: Clinical Implications of Genomic Sequencing Projects in Head and Neck Cancer

    PubMed Central

    Birkeland, Andrew C.; Ludwig, Megan L.; Meraj, Taha S.; Brenner, J. Chad; Prince, Mark E.

    2015-01-01

    Recent genomic sequencing studies have provided valuable insight into genetic aberrations in head and neck squamous cell carcinoma. Despite these great advances, certain hurdles exist in translating genomic findings to clinical care. Further correlation of genetic findings to clinical outcomes, additional analyses of subgroups of head and neck cancers and follow-up investigation into genetic heterogeneity are needed. While the development of targeted therapy trials is of key importance, numerous challenges exist in establishing and optimizing such programs. This review discusses potential upcoming steps for further genetic evaluation of head and neck cancers and implementation of genetic findings into precision medicine trials. PMID:26506389

  8. Breast cancer genomics from microarrays to massively parallel sequencing: paradigms and new insights.

    PubMed

    Ng, Charlotte K Y; Schultheis, Anne M; Bidard, Francois-Clement; Weigelt, Britta; Reis-Filho, Jorge S

    2015-02-23

    Rapid advancements in massively parallel sequencing methods have enabled the analysis of breast cancer genomes at an unprecedented resolution, which has revealed the remarkable heterogeneity of the disease. As a result, we now accept that despite originating in the breast, estrogen receptor (ER)-positive and ER-negative breast cancers are completely different diseases at the molecular level. It has become apparent that there are very few highly recurrently mutated genes such as TP53, PIK3CA, and GATA3, that no two breast cancers display an identical repertoire of somatic genetic alterations at base-pair resolution and that there might not be a single highly recurrently mutated gene that defines each of the "intrinsic" subtypes of breast cancer (i.e., basal-like, HER2-enriched, luminal A, and luminal B). Breast cancer heterogeneity, however, extends beyond the diversity between tumors. There is burgeoning evidence to demonstrate that at least some primary breast cancers are composed of multiple, genetically diverse clones at diagnosis and that metastatic lesions may differ in their repertoire of somatic genetic alterations when compared with their respective primary tumors. Several biological phenomena may shape the reported intratumor genetic heterogeneity observed in breast cancers, including the different mutational processes and multiple types of genomic instability. Harnessing the emerging concepts of the diversity of breast cancer genomes and the phenomenon of intratumor genetic heterogeneity will be essential for the development of optimal methods for diagnosis, disease monitoring, and the matching of patients to the drugs that would benefit them the most.

  9. Return of Results from Genomic Sequencing: A Policy Discussion of Secondary Findings for Cancer Predisposition

    PubMed Central

    Johnson, Kimberly J.; Gehlert, Sarah

    2014-01-01

    Advances in DNA sequencing technology now allow for the rapid genome-wide identification of inherited and acquired genetic variants including those that have been identified as pathogenic alleles for a number of diseases including cancer. Whole genome and exome sequencing are increasingly becoming a part of both clinical practice and research studies. In 2013 the American College of Medical Genetics and Genomics (ACMG) recommended that results of pathogenic genetic variants in 56 genes, nearly half of which are cancer genes (including BRCA1, BRCA2, TP53, MLH1, MSH2, MSH6, PMS2, and APC), be returned to patients who have their genome sequenced independent of the purpose for the test. This recommendation has been highly controversial for several reasons, particularly the recommendation that individuals be returned secondary findings of disease causing variants for adult onset conditions regardless of age and without consideration of patient preferences. In addition, the policy regarding returning results of secondary findings from genomic sequencing studies in research settings is currently unclear. In response to these emerging ethical issues, the Washington University Brown School in St. Louis, MO, United States hosted a policy forum entitled “First do no harm: Genetic privacy in the age of genomic sequencing” on February 25th, 2014. The forum included a panel of experts to discuss their views on ethical issues related to return of results in both the clinical and research settings. In this report, we highlight key issues related to return of results from genome sequencing tests that emerged during the forum. PMID:25229012

  10. Home - The Cancer Genome Atlas - Cancer Genome - TCGA

    Cancer.gov

    The Cancer Genome Atlas (TCGA) is a comprehensive and coordinated effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing.

  11. Landscape of somatic mutations in 560 breast cancer whole-genome sequences

    SciTech Connect

    Nik-Zainal, Serena; Davies, Helen; Staaf, Johan; Ramakrishna, Manasa; Glodzik, Dominik; Zou, Xueqing; Martincorena, Inigo; Alexandrov, Ludmil B.; Martin, Sancha; Wedge, David C.; Van Loo, Peter; Ju, Young Seok; Smid, Marcel; Brinkman, Arie B.; Morganella, Sandro; Aure, Miriam R.; Lingjærde, Ole Christian; Langerod, Anita; Ringner, Markus; Ahn, Sung-Min; Boyault, Sandrine; Brock, Jane E.; Broeks, Annegien; Butler, Adam; Desmedt, Christine; Dirix, Luc; Dronov, Serge; Fatima, Aquila; Foekens, John A.; Gerstung, Moritz; Hooijer, Gerrit K. J.; Jang, Se Jin; Jones, David R.; Kim, Hyung-Yong; King, Tari A.; Krishnamurthy, Savitri; Lee, Hee Jin; Lee, Jeong-Yeon; Li, Yilong; McLaren, Stuart; Menzies, Andrew; Mustonen, Ville; O’Meara, Sarah; Pauporte, Iris; Pivot, Xavier; Purdie, Colin A.; Raine, Keiran; Ramakrishnan, Kamna; Rodríguez-Gonzalez, F. German; Romieu, Gilles; Sieuwerts, Anieta M.; Simpson, Peter T.; Shepherd, Rebecca; Stebbings, Lucy; Stefansson, Olafur A.; Teague, Jon; Tommasi, Stefania; Treilleux, Isabelle; Van den Eynden, Gert G.; Vermeulen, Peter; Vincent-Salomon, Anne; Yates, Lucy; Caldas, Carlos; van’t Veer, Laura; Tutt, Andrew; Knappskog, Stian; Tan, Benita Kiat Tee; Jonkers, Jos; Borg, Ake; Ueno, Naoto T.; Sotiriou, Christos; Viari, Alain; Futreal, P. Andrew; Campbell, Peter J.; Span, Paul N.; Van Laere, Steven; Lakhani, Sunil R.; Eyfjord, Jorunn E.; Thompson, Alastair M.; Birney, Ewan; Stunnenberg, Hendrik G.; van de Vijver, Marc J.; Martens, John W. M.; Borresen-Dale, Anne-Lise; Richardson, Andrea L.; Kong, Gu; Thomas, Gilles; Stratton, Michael R.

    2016-05-02

    Here, we analysed whole-genome sequences of 560 breast cancers to advance understanding of the driver mutations conferring clonal advantage and the mutational processes generating somatic mutations. We found that 93 protein-coding cancer genes carried probable driver mutations. Some non-coding regions exhibited high mutation frequencies, but most have distinctive structural features probably causing elevated mutation rates and do not contain driver mutations. Mutational signature analysis was extended to genome rearrangements and revealed twelve base substitution and six rearrangement signatures. Three rearrangement signatures, characterized by tandem duplications or deletions, appear associated with defective homologous-recombination-based DNA repair: one with deficient BRCA1 function, another with deficient BRCA1 or BRCA2 function, the cause of the third is unknown. This analysis of all classes of somatic mutation across exons, introns and intergenic regions highlights the repertoire of cancer genes and mutational processes operating, and progresses towards a comprehensive account of the somatic genetic basis of breast cancer.

  12. Landscape of somatic mutations in 560 breast cancer whole genome sequences

    PubMed Central

    Nik-Zainal, Serena; Davies, Helen; Staaf, Johan; Ramakrishna, Manasa; Glodzik, Dominik; Zou, Xueqing; Martincorena, Inigo; Alexandrov, Ludmil B.; Martin, Sancha; Wedge, David C.; Van Loo, Peter; Ju, Young Seok; Smid, Marcel; Brinkman, Arie B; Morganella, Sandro; Aure, Miriam R.; Lingjærde, Ole Christian; Langerød, Anita; Ringnér, Markus; Ahn, Sung-Min; Boyault, Sandrine; Brock, Jane E.; Broeks, Annegien; Butler, Adam; Desmedt, Christine; Dirix, Luc; Dronov, Serge; Fatima, Aquila; Foekens, John A.; Gerstung, Moritz; Hooijer, Gerrit KJ; Jang, Se Jin; Jones, David R.; Kim, Hyung-Yong; King, Tari A.; Krishnamurthy, Savitri; Lee, Hee Jin; Lee, Jeong-Yeon; Li, Yilong; McLaren, Stuart; Menzies, Andrew; Mustonen, Ville; O’Meara, Sarah; Pauporté, Iris; Pivot, Xavier; Purdie, Colin A.; Raine, Keiran; Ramakrishnan, Kamna; Rodríguez-González, F. Germán; Romieu, Gilles; Sieuwerts, Anieta M.; Simpson, Peter T; Shepherd, Rebecca; Stebbings, Lucy; Stefansson, Olafur A; Teague, Jon; Tommasi, Stefania; Treilleux, Isabelle; Van den Eynden, Gert G.; Vermeulen, Peter; Vincent-Salomon, Anne; Yates, Lucy; Caldas, Carlos; van’t Veer, Laura; Tutt, Andrew; Knappskog, Stian; Tan, Benita Kiat Tee; Jonkers, Jos; Borg, Åke; Ueno, Naoto T; Sotiriou, Christos; Viari, Alain; Futreal, P. Andrew; Campbell, Peter J; Span, Paul N.; Van Laere, Steven; Lakhani, Sunil R; Eyfjord, Jorunn E.; Thompson, Alastair M.; Birney, Ewan; Stunnenberg, Hendrik G; van de Vijver, Marc J; Martens, John W.M.; Børresen-Dale, Anne-Lise; Richardson, Andrea L.; Kong, Gu; Thomas, Gilles; Stratton, Michael R.

    2016-01-01

    We analysed whole genome sequences of 560 breast cancers to advance understanding of the driver mutations conferring clonal advantage and the mutational processes generating somatic mutations. 93 protein-coding cancer genes carried likely driver mutations. Some non-coding regions exhibited high mutation frequencies but most have distinctive structural features probably causing elevated mutation rates and do not harbour driver mutations. Mutational signature analysis was extended to genome rearrangements and revealed 12 base substitution and six rearrangement signatures. Three rearrangement signatures, characterised by tandem duplications or deletions, appear associated with defective homologous recombination based DNA repair: one with deficient BRCA1 function; another with deficient BRCA1 or BRCA2 function; the cause of the third is unknown. This analysis of all classes of somatic mutation across exons, introns and intergenic regions highlights the repertoire of cancer genes and mutational processes operative, and progresses towards a comprehensive account of the somatic genetic basis of breast cancer. PMID:27135926
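
    Base substitution signatures of the kind counted in this record are conventionally built from a catalogue in which each single-base substitution is written relative to the pyrimidine of the base pair, giving six classes (C>A, C>G, C>T, T>A, T>C, T>G) that are further split by the flanking bases into 96 channels. The sketch below shows only that bookkeeping step, not the signature extraction itself. The function names and the example mutations are hypothetical and are not taken from the study.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def substitution_channel(context, alt):
    """Map a substitution to its pyrimidine-centred channel, e.g. 'A[C>T]G'.

    context: three reference bases centred on the mutated position.
    alt:     the mutant base at the central position.
    """
    context, alt = context.upper(), alt.upper()
    if len(context) != 3 or not set(context + alt) <= set(COMPLEMENT):
        raise ValueError("expected a 3-base context and a single A/C/G/T alt base")
    if alt == context[1]:
        raise ValueError("alt base must differ from the reference base")
    if context[1] in "AG":
        # Purine reference base: describe the event on the opposite strand.
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


# Hypothetical calls: (reference trinucleotide, mutant base)
calls = [("ACG", "T"), ("CGT", "A"), ("TTA", "C")]
catalogue = Counter(substitution_channel(ctx, alt) for ctx, alt in calls)
print(catalogue)
```

    The first two calls describe the same event seen from opposite strands, so they fall into one channel. Collapsing strands in this way is what reduces the 192 possible trinucleotide substitutions to 96.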

  13. Clinical genomics information management software linking cancer genome sequence and clinical decisions.

    PubMed

    Watt, Stuart; Jiao, Wei; Brown, Andrew M K; Petrocelli, Teresa; Tran, Ben; Zhang, Tong; McPherson, John D; Kamel-Reid, Suzanne; Bedard, Philippe L; Onetto, Nicole; Hudson, Thomas J; Dancey, Janet; Siu, Lillian L; Stein, Lincoln; Ferretti, Vincent

    2013-09-01

    Using sequencing information to guide clinical decision-making requires coordination of a diverse set of people and activities. In clinical genomics, the process typically includes sample acquisition, template preparation, genome data generation, analysis to identify and confirm variant alleles, interpretation of clinical significance, and reporting to clinicians. We describe a software application developed within a clinical genomics study, to support this entire process. The software application tracks patients, samples, genomic results, decisions and reports across the cohort, monitors progress and sends reminders, and works alongside an electronic data capture system for the trial's clinical and genomic data. It incorporates systems to read, store, analyze and consolidate sequencing results from multiple technologies, and provides a curated knowledge base of tumor mutation frequency (from the COSMIC database) annotated with clinical significance and drug sensitivity to generate reports for clinicians. By supporting the entire process, the application provides deep support for clinical decision making, enabling the generation of relevant guidance in reports for verification by an expert panel prior to forwarding to the treating physician.

  14. Identifying driver mutations in sequenced cancer genomes: computational approaches to enable precision medicine

    PubMed Central

    2014-01-01

    High-throughput DNA sequencing is revolutionizing the study of cancer and enabling the measurement of the somatic mutations that drive cancer development. However, the resulting sequencing datasets are large and complex, obscuring the clinically important mutations in a background of errors, noise, and random mutations. Here, we review computational approaches to identify somatic mutations in cancer genome sequences and to distinguish the driver mutations that are responsible for cancer from random, passenger mutations. First, we describe approaches to detect somatic mutations from high-throughput DNA sequencing data, particularly for tumor samples that comprise heterogeneous populations of cells. Next, we review computational approaches that aim to predict driver mutations according to their frequency of occurrence in a cohort of samples, or according to their predicted functional impact on protein sequence or structure. Finally, we review techniques to identify recurrent combinations of somatic mutations, including approaches that examine mutations in known pathways or protein-interaction networks, as well as de novo approaches that identify combinations of mutations according to statistical patterns of mutual exclusivity. These techniques, coupled with advances in high-throughput DNA sequencing, are enabling precision medicine approaches to the diagnosis and treatment of cancer. PMID:24479672
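
    The mutual-exclusivity idea mentioned in this record can be illustrated with a simple pairwise test: if mutations in two genes occurred independently across a cohort, the number of samples carrying both would follow a hypergeometric distribution, and an unusually small overlap suggests exclusivity. The sketch below is a one-sided test of that kind written with the standard library only. It is an illustration under that independence assumption, not one of the methods reviewed in the article, and the cohort numbers are hypothetical.

```python
from math import comb


def mutual_exclusivity_pvalue(n_samples, n_a, n_b, n_both):
    """Lower-tail hypergeometric probability P(overlap <= n_both).

    n_samples: cohort size
    n_a, n_b:  number of samples mutated in gene A and in gene B
    n_both:    number of samples mutated in both genes
    """
    lowest = max(0, n_a + n_b - n_samples)  # smallest overlap that is possible
    if not lowest <= n_both <= min(n_a, n_b) or max(n_a, n_b) > n_samples:
        raise ValueError("counts are not consistent with the cohort size")
    favourable = sum(
        comb(n_a, k) * comb(n_samples - n_a, n_b - k)
        for k in range(lowest, n_both + 1)
    )
    return favourable / comb(n_samples, n_b)


# Hypothetical cohort: 20 tumours, 10 with gene A mutated, 10 with gene B, none with both
p_value = mutual_exclusivity_pvalue(20, 10, 10, 0)
print(f"p = {p_value:.2e}")
```

    A pairwise test like this ignores differences in mutation burden between samples and the multiple testing that arises when many gene pairs are scanned, both of which a real analysis would need to address.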

  15. Sequencing technologies and genome sequencing.

    PubMed

    Pareek, Chandra Shekhar; Smoczynski, Rafal; Tretyn, Andrzej

    2011-11-01

    High-throughput next-generation sequencing (HT-NGS) technologies are currently the hottest topic in human and animal genomics research; they can produce over 100 times more data than the most sophisticated capillary sequencers based on the Sanger method. With the ongoing development of high-throughput sequencing machines and the advancement of modern bioinformatics tools at an unprecedented pace, the goal of sequencing an individual genome of a living organism at a cost of $1,000 seems realistically feasible in the near future. In the relatively short time since 2005, HT-NGS technologies have been revolutionizing human and animal genome research through analysis of chromatin immunoprecipitation coupled to DNA microarray (ChIP-chip) or sequencing (ChIP-seq), RNA sequencing (RNA-seq), whole-genome genotyping, genome-wide structural variation, de novo assembly and re-assembly of genomes, mutation detection and carrier screening, detection of inherited disorders and complex human diseases, DNA library preparation, paired ends and genomic captures, sequencing of mitochondrial genomes and personal genomics. In this review, we address the important features of HT-NGS, including first-generation DNA sequencers, the birth of HT-NGS, second-generation HT-NGS platforms, third-generation HT-NGS platforms (including single-molecule Heliscope™, SMRT™ and RNAP sequencers, and Nanopore), the Archon Genomics X PRIZE foundation, a comparison of second- and third-generation HT-NGS platforms, and the applications, advances and future perspectives of sequencing technologies in human and animal genome research.

  16. Comprehensive genomic sequencing and the molecular profiles of clinically advanced breast cancer.

    PubMed

    Ross, Jeffrey S; Gay, Laurie M

    2017-02-01

    Targeting specific mutations that have arisen within a tumour is a promising means of increasing the efficacy of treatments, and breast cancer is no exception to this new paradigm of personalised medicine. Traditional DNA sequencing methods used to characterise clinical cancer specimens and impact treatment decisions are highly sensitive, but are often limited in their scope to known mutational hot spots. Next-generation sequencing (NGS) technologies can also test for these well-known hot spots, as well as identifying insertions and deletions, copy number changes such as ERBB2 (HER2) gene amplification, and a wide array of fusion or rearrangement events. By rapidly analysing many genes in parallel, NGS technologies can make efficient use of precious biopsy material. Comprehensive genomic profiling (CGP) by NGS can reveal targetable, clinically relevant genomic alterations that can stratify tumours by predicted sensitivity to a variety of therapies, including HER2- or MTOR-targeted therapies, immunotherapies, and other kinase inhibitors. Many clinically relevant genomic alterations would not be identified by IHC or hotspot testing, but can be detected by NGS. In addition to the most common breast carcinoma subtypes, rare subtypes analysed with CGP also harbour clinically relevant genomic alterations that can potentially direct therapy selection, illustrating that CGP is a powerful tool for guiding treatment across all breast cancer subtypes.

  17. Center for Cancer Genomics | Office of Cancer Genomics

    Cancer.gov

    The Center for Cancer Genomics (CCG) was established to unify the National Cancer Institute's activities in cancer genomics, with the goal of advancing genomics research and translating findings into the clinic to improve the precise diagnosis and treatment of cancers. In addition to promoting genomic sequencing app

  18. TCGA's Pan-Cancer Efforts and Expansion to Include Whole Genome Sequence - TCGA

    Cancer.gov

    Carolyn Hutter, Ph.D., Program Director of NHGRI's Division of Genomic Medicine, discusses the expansion of TCGA's Pan-Cancer efforts to include the Pan-Cancer Analysis of Whole Genomes (PCAWG) project.

  19. A genome-wide view of microsatellite instability: old stories of cancer mutations revisited with new sequencing technologies

    PubMed Central

    Kim, Tae-Min; Park, Peter J

    2014-01-01

    Microsatellites are simple tandem repeats that are present at millions of loci in the human genome. Microsatellite instability (MSI) refers to DNA slippage events on microsatellites that occur frequently in cancer genomes when there is a defect in the DNA mismatch repair system. These somatic mutations can result in inactivation of tumor suppressor genes or disrupt other non-coding regulatory sequences, thereby playing a role in carcinogenesis. Here, we will discuss the ways in which high-throughput sequencing data can facilitate a genome- or exome-wide discovery and more detailed investigation of MSI events in microsatellite-unstable cancer genomes. We will address the methodological aspects of this approach and highlight insights from recent analyses of colorectal and endometrial cancer genomes from The Cancer Genome Atlas project. These include identification of novel MSI targets within and across tumor types and the relationship between the likelihood of MSI events and chromatin structure. Given the increasing popularity of exome and genome sequencing of cancer genomes, a comprehensive characterization of MSI may serve as a valuable marker of cancer evolution and aid in a search for therapeutic targets. PMID:25371413
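
    As a rough illustration of the two ideas in this record, the sketch below first locates simple tandem repeats in a sequence with a regular expression and then compares the repeat count at one locus between a normal and a tumor sequence, where a change in count is the kind of slippage event MSI refers to. The thresholds (unit length up to 6 bases, at least 4 copies), the function names and the toy sequences are all assumptions made for the example. Real MSI callers work from aligned reads rather than single sequences.

```python
import re


def find_microsatellites(seq, max_unit=6, min_repeats=4):
    """Return (start, repeat unit, copy number) for each simple tandem repeat."""
    # Lazy quantifier so the shortest repeating unit is tried first.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


def longest_run(seq, unit):
    """Largest number of consecutive copies of `unit` found in `seq`."""
    runs = re.findall(r"(?:%s)+" % re.escape(unit), seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


def repeat_shift(normal_seq, tumor_seq, unit):
    """Change in repeat copy number in the tumor relative to the normal."""
    return longest_run(tumor_seq, unit) - longest_run(normal_seq, unit)


# Toy locus: the tumor allele has lost one CA unit
normal = "GGCACACACACATT"
tumor = "GGCACACACATT"
print(find_microsatellites(normal))
print(repeat_shift(normal, tumor, "CA"))
```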

  20. Center for Cancer Genomics | Office of Cancer Genomics

    Cancer.gov

    The Center for Cancer Genomics (CCG) was established to unify the National Cancer Institute's activities in cancer genomics, with the goal of advancing genomics research and translating findings into the clinic to improve the precise diagnosis and treatment of cancers. In addition to promoting genomic sequencing approaches, CCG aims to accelerate structural, functional and computational research to explore cancer mechanisms, discover new cancer targets, and develop new therapeutics.

  1. Beyond genome sequencing: lineage tracking with barcodes to study the dynamics of evolution, infection, and cancer.

    PubMed

    Blundell, Jamie R; Levy, Sasha F

    2014-12-01

    Evolving cellular communities, such as the gut microbiome, pathogenic infections, and cancer, consist of large populations of ~10^7-10^14 cells. Because of their large population sizes, adaptation within these populations can be driven by many beneficial mutations that never rise above extremely low frequencies. Genome sequencing methods such as clonal, single cell, or whole population sequencing are poorly suited to detect these rare beneficial lineages, and, more generally, to characterize which mutations are most important to the population dynamics. Here, we introduce an alternative approach: high-resolution lineage tracking with DNA barcodes. In contrast to whole genome sequencing, lineage tracking can detect a beneficial mutation at an extremely low frequency within the population, and estimate its time of occurrence and fitness effect. Many lineage trajectories can be observed in parallel, allowing one to observe the population dynamics in exquisite detail. We describe some of the technical and analytical challenges to lineage tracking with DNA barcodes and discuss its applications to studies of evolution, infectious disease and cancer.
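
    One way to picture how a fitness effect could be read off a barcode trajectory: while a beneficial lineage is rare, its frequency grows roughly exponentially, so the slope of log frequency against generation number approximates its fitness advantage per generation. The sketch below fits that slope by ordinary least squares. It is a deliberately simplified model that ignores sequencing noise and the rising mean fitness of the population, it is not the authors' inference method, and the read counts are synthetic.

```python
import math


def estimate_fitness(generations, barcode_reads, total_reads):
    """Least-squares slope of ln(barcode frequency) versus generation."""
    if not (len(generations) == len(barcode_reads) == len(total_reads) >= 2):
        raise ValueError("need at least two matched time points")
    if any(reads <= 0 for reads in barcode_reads):
        raise ValueError("log frequency is undefined for zero read counts")
    log_freq = [math.log(r / n) for r, n in zip(barcode_reads, total_reads)]
    t_mean = sum(generations) / len(generations)
    y_mean = sum(log_freq) / len(log_freq)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(generations, log_freq))
    sxx = sum((t - t_mean) ** 2 for t in generations)
    return sxy / sxx


# Synthetic trajectory generated with a 5% advantage per generation
generations = [0, 8, 16, 24]
total_reads = [1e7] * len(generations)
barcode_reads = [100 * math.exp(0.05 * t) for t in generations]
fitness = estimate_fitness(generations, barcode_reads, total_reads)
print(round(fitness, 3))
```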

  2. Landscape of somatic mutations in 560 breast cancer whole-genome sequences

    DOE PAGES

    Nik-Zainal, Serena; Davies, Helen; Staaf, Johan; ...

    2016-05-02

    Here, we analysed whole-genome sequences of 560 breast cancers to advance understanding of the driver mutations conferring clonal advantage and the mutational processes generating somatic mutations. We found that 93 protein-coding cancer genes carried probable driver mutations. Some non-coding regions exhibited high mutation frequencies, but most have distinctive structural features probably causing elevated mutation rates and do not contain driver mutations. Mutational signature analysis was extended to genome rearrangements and revealed twelve base substitution and six rearrangement signatures. Three rearrangement signatures, characterized by tandem duplications or deletions, appear associated with defective homologous-recombination-based DNA repair: one with deficient BRCA1 function, another with deficient BRCA1 or BRCA2 function, the cause of the third is unknown. This analysis of all classes of somatic mutation across exons, introns and intergenic regions highlights the repertoire of cancer genes and mutational processes operating, and progresses towards a comprehensive account of the somatic genetic basis of breast cancer.

  3. The cancer genome

    PubMed Central

    Stratton, Michael R.; Campbell, Peter J.; Futreal, P. Andrew

    2010-01-01

    All cancers arise as a result of changes that have occurred in the DNA sequence of the genomes of cancer cells. Over the past quarter of a century much has been learnt about these mutations and the abnormal genes that operate in human cancers. We are now, however, moving into an era in which it will be possible to obtain the complete DNA sequence of large numbers of cancer genomes. These studies will provide us with a detailed and comprehensive perspective on how individual cancers have developed. PMID:19360079

  4. Exome and genome sequencing of nasopharynx cancer identifies NF-κB pathway activating mutations

    PubMed Central

    Li, Yvonne Y.; Chung, Grace T. Y.; Lui, Vivian W. Y.; To, Ka-Fai; Ma, Brigette B. Y.; Chow, Chit; Woo, John K. S.; Yip, Kevin Y.; Seo, Jeongsun; Hui, Edwin P.; Mak, Michael K. F.; Rusan, Maria; Chau, Nicole G.; Or, Yvonne Y. Y.; Law, Marcus H. N.; Law, Peggy P. Y.; Liu, Zoey W. Y.; Ngan, Hoi-Lam; Hau, Pok-Man; Verhoeft, Krista R.; Poon, Peony H. Y.; Yoo, Seong-Keun; Shin, Jong-Yeon; Lee, Sau-Dan; Lun, Samantha W. M.; Jia, Lin; Chan, Anthony W. H.; Chan, Jason Y. K.; Lai, Paul B. S.; Fung, Choi-Yi; Hung, Suet-Ting; Wang, Lin; Chang, Ann Margaret V.; Chiosea, Simion I.; Hedberg, Matthew L.; Tsao, Sai-Wah; van Hasselt, Andrew C.; Chan, Anthony T. C.; Grandis, Jennifer R.; Hammerman, Peter S.; Lo, Kwok-Wai

    2017-01-01

    Nasopharyngeal carcinoma (NPC) is an aggressive head and neck cancer characterized by Epstein-Barr virus (EBV) infection and dense lymphocyte infiltration. The scarcity of NPC genomic data hinders the understanding of NPC biology, disease progression and rational therapy design. Here we performed whole-exome sequencing (WES) on 111 micro-dissected EBV-positive NPCs, with 15 cases subjected to further whole-genome sequencing (WGS), to determine its mutational landscape. We identified enrichment for genomic aberrations of multiple negative regulators of the NF-κB pathway, including CYLD, TRAF3, NFKBIA and NLRC5, in a total of 41% of cases. Functional analysis confirmed inactivating CYLD mutations as drivers for NPC cell growth. The EBV oncoprotein latent membrane protein 1 (LMP1) functions to constitutively activate NF-κB signalling, and we observed mutual exclusivity among tumours with somatic NF-κB pathway aberrations and LMP1-overexpression, suggesting that NF-κB activation is selected for by both somatic and viral events during NPC pathogenesis. PMID:28098136

  5. Structural variation discovery in the cancer genome using next generation sequencing: Computational solutions and perspectives

    PubMed Central

    Liu, Biao; Conroy, Jeffrey M.; Morrison, Carl D.; Odunsi, Adekunle O.; Qin, Maochun; Wei, Lei; Trump, Donald L.; Johnson, Candace S.; Liu, Song; Wang, Jianmin

    2015-01-01

    Somatic Structural Variations (SVs) are a complex collection of chromosomal mutations that could directly contribute to carcinogenesis. Next Generation Sequencing (NGS) technology has emerged as the primary means of interrogating the SVs of the cancer genome in recent investigations. Sophisticated computational methods are required to accurately identify the SV events and delineate their breakpoints from the massive amounts of reads generated by a NGS experiment. In this review, we provide an overview of current analytic tools used for SV detection in NGS-based cancer studies. We summarize the features of common SV groups and the primary types of NGS signatures that can be used in SV detection methods. We discuss the principles and key similarities and differences of existing computational programs and comment on unresolved issues related to this research field. The aim of this article is to provide a practical guide of relevant concepts, computational methods, software tools and important factors for analyzing and interpreting NGS data for the detection of SVs in the cancer genome. PMID:25849937

  6. Development and validation of a clinical cancer genomic profiling test based on massively parallel DNA sequencing.

    PubMed

    Frampton, Garrett M; Fichtenholtz, Alex; Otto, Geoff A; Wang, Kai; Downing, Sean R; He, Jie; Schnall-Levin, Michael; White, Jared; Sanford, Eric M; An, Peter; Sun, James; Juhn, Frank; Brennan, Kristina; Iwanik, Kiel; Maillet, Ashley; Buell, Jamie; White, Emily; Zhao, Mandy; Balasubramanian, Sohail; Terzic, Selmira; Richards, Tina; Banning, Vera; Garcia, Lazaro; Mahoney, Kristen; Zwirko, Zac; Donahue, Amy; Beltran, Himisha; Mosquera, Juan Miguel; Rubin, Mark A; Dogan, Snjezana; Hedvat, Cyrus V; Berger, Michael F; Pusztai, Lajos; Lechner, Matthias; Boshoff, Chris; Jarosz, Mirna; Vietz, Christine; Parker, Alex; Miller, Vincent A; Ross, Jeffrey S; Curran, John; Cronin, Maureen T; Stephens, Philip J; Lipson, Doron; Yelensky, Roman

    2013-11-01

    As more clinically relevant cancer genes are identified, comprehensive diagnostic approaches are needed to match patients to therapies, raising the challenge of optimization and analytical validation of assays that interrogate millions of bases of cancer genomes altered by multiple mechanisms. Here we describe a test based on massively parallel DNA sequencing to characterize base substitutions, short insertions and deletions (indels), copy number alterations and selected fusions across 287 cancer-related genes from routine formalin-fixed and paraffin-embedded (FFPE) clinical specimens. We implemented a practical validation strategy with reference samples of pooled cell lines that model key determinants of accuracy, including mutant allele frequency, indel length and amplitude of copy change. Test sensitivity achieved was 95-99% across alteration types, with high specificity (positive predictive value >99%). We confirmed accuracy using 249 FFPE cancer specimens characterized by established assays. Application of the test to 2,221 clinical cases revealed clinically actionable alterations in 76% of tumors, three times the number of actionable alterations detected by current diagnostic tests.

  7. Analyzing Somatic Genome Rearrangements in Human Cancers by Using Whole-Exome Sequencing

    PubMed Central

    Yang, Lixing; Lee, Mi-Sook; Lu, Hengyu; Oh, Doo-Yi; Kim, Yeon Jeong; Park, Donghyun; Park, Gahee; Ren, Xiaojia; Bristow, Christopher A.; Haseley, Psalm S.; Lee, Soohyun; Pantazi, Angeliki; Kucherlapati, Raju; Park, Woong-Yang; Scott, Kenneth L.; Choi, Yoon-La; Park, Peter J.

    2016-01-01

    Although exome sequencing data are generated primarily to detect single-nucleotide variants and indels, they can also be used to identify a subset of genomic rearrangements whose breakpoints are located in or near exons. Using >4,600 tumor and normal pairs across 15 cancer types, we identified over 9,000 high confidence somatic rearrangements, including a large number of gene fusions. We find that the 5′ fusion partners of functional fusions are often housekeeping genes, whereas the 3′ fusion partners are enriched in tyrosine kinases. We establish the oncogenic potential of ROR1-DNAJC6 and CEP85L-ROS1 fusions by showing that they can promote cell proliferation in vitro and tumor formation in vivo. Furthermore, we found that ∼4% of the samples have massively rearranged chromosomes, many of which are associated with upregulation of oncogenes such as ERBB2 and TERT. Although the sensitivity of detecting structural alterations from exomes is considerably lower than that from whole genomes, this approach will be fruitful for the multitude of exomes that have been and will be generated, both in cancer and in other diseases. PMID:27153396

  8. Flexible positions, managed hopes: the promissory bioeconomy of a whole genome sequencing cancer study.

    PubMed

    Haase, Rachel; Michie, Marsha; Skinner, Debra

    2015-04-01

    Genomic research has rapidly expanded its scope and ambition over the past decade, promoted by both public and private sectors as having the potential to revolutionize clinical medicine. This promissory bioeconomy of genomic research and technology is generated by, and in turn generates, the hopes and expectations shared by investors, researchers and clinicians, patients, and the general public alike. Examinations of such bioeconomies have often focused on the public discourse, media representations, and capital investments that fuel these "regimes of hope," but also crucial are the more intimate contexts of small-scale medical research, and the private hopes, dreams, and disappointments of those involved. Here we examine one local site of production in a university-based clinical research project that sought to identify novel cancer predisposition genes through whole genome sequencing in individuals at high risk for cancer. In-depth interviews with 24 adults who donated samples to the study revealed an ability to shift flexibly between positioning themselves as research participants on the one hand, and as patients or as family members of patients, on the other. Similarly, interviews with members of the research team highlighted the dual nature of their positions as researchers and as clinicians. For both parties, this dual positioning shaped their investment in the project and valuing of its possible outcomes. In their narratives, all parties shifted between these different relational positions as they managed hopes and expectations for the research project. We suggest that this flexibility facilitated study implementation and participation in the face of potential and probable disappointment on one or more fronts, and acted as a key element in the resilience of this local promissory bioeconomy. We conclude that these multiple dimensions of relationality and positionality are inherent and essential in the creation of any complex economy, "bio" or otherwise.

  9. Flexible Positions, Managed Hopes: The Promissory Bioeconomy of a Whole Genome Sequencing Cancer Study

    PubMed Central

    Haase, Rachel; Michie, Marsha; Skinner, Debra

    2015-01-01

    Genomic research has rapidly expanded its scope and ambition over the past decade, promoted by both public and private sectors as having the potential to revolutionize clinical medicine. This promissory bioeconomy of genomic research and technology is generated by, and in turn generates, the hopes and expectations shared by investors, researchers and clinicians, patients, and the general public alike. Examinations of such bioeconomies have often focused on the public discourse, media representations, and capital investments that fuel these “regimes of hope,” but also crucial are the more intimate contexts of small-scale medical research, and the private hopes, dreams, and disappointments of those involved. Here we examine one local site of production in a university-based clinical research project that sought to identify novel cancer predisposition genes through whole genome sequencing in individuals at high risk for cancer. In-depth interviews with 24 adults who donated samples to the study revealed an ability to shift flexibly between positioning themselves as research participants on the one hand, and as patients or as family members of patients, on the other. Similarly, interviews with members of the research team highlighted the dual nature of their positions as researchers and as clinicians. For both parties, this dual positioning shaped their investment in the project and valuing of its possible outcomes. In their narratives, all parties shifted between these different relational positions as they managed hopes and expectations for the research project. We suggest that this flexibility facilitated study implementation and participation in the face of potential and probable disappointment on one or more fronts, and acted as a key element in the resilience of this local promissory bioeconomy. We conclude that these multiple dimensions of relationality and positionality are inherent and essential in the creation of any complex economy, “bio” or otherwise.

  10. Draft Genome Sequence of Kluyvera intestini Strain GT-16 Isolated from the Stomach of a Patient with Gastric Cancer

    PubMed Central

    Tetz, Victor

    2016-01-01

    Here, we report the complete genome sequence of the novel, non-spore-forming Kluyvera intestini strain GT-16, isolated from the stomach of a patient with gastric cancer. The genome is 5,868,299 bp in length with a G+C content of 53.0%. It possesses 5,350 predicted protein-coding genes encoding virulence factors and antibiotic resistance proteins. PMID:28007864
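
    The G+C content quoted in the record above is the fraction of G and C bases among all called bases. A minimal sketch of that calculation follows; the function name and the toy sequence are illustrative assumptions, and only the genome length and the 53.0% figure come from the abstract.

```python
def gc_content(sequence):
    """Return the fraction of G and C among the A/C/G/T bases of a sequence."""
    seq = sequence.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return gc / acgt


# Toy sequence (hypothetical, not taken from the GT-16 genome).
toy_fraction = gc_content("ATGCGC")

# Figures reported in the abstract: 5,868,299 bp at 53.0% G+C.
genome_length_bp = 5_868_299
reported_gc = 0.530
approx_gc_bases = round(genome_length_bp * reported_gc)
```

    Under these figures, roughly 3.1 million of the 5.87 million bases would be G or C. Ambiguous bases such as N are excluded from the denominator here, which is one common convention rather than the only one.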

  11. Draft Genome Sequence of Elizabethkingia anophelis Strain EM361-97 Isolated from the Blood of a Cancer Patient

    PubMed Central

    Lin, Jiun-Nong; Yang, Chih-Hui; Lai, Chung-Hsu; Huang, Yi-Han

    2016-01-01

    Elizabethkingia anophelis EM361-97 was isolated from the blood of a patient with nasopharyngeal carcinoma and lung cancer. We report the draft genome sequence of EM361-97, which contains a G+C content of 35.7% and 3,611 candidate protein-encoding genes. PMID:27789647

  12. Draft Genome Sequence of Elizabethkingia anophelis Strain EM361-97 Isolated from the Blood of a Cancer Patient.

    PubMed

    Lin, Jiun-Nong; Yang, Chih-Hui; Lai, Chung-Hsu; Huang, Yi-Han; Lin, Hsi-Hsun

    2016-10-27

    Elizabethkingia anophelis EM361-97 was isolated from the blood of a patient with nasopharyngeal carcinoma and lung cancer. We report the draft genome sequence of EM361-97, which contains a G+C content of 35.7% and 3,611 candidate protein-encoding genes.

  13. Whole Genome Sequencing of High-Risk Families to Identify New Mutational Mechanisms of Breast Cancer Predisposition

    DTIC Science & Technology

    2014-10-01

    levels in the whole genome sequencing data of two patients from a severely affected breast cancer Family 1041. ... Figure 1. Non-IBD0 regions for Family 1041. The largest region overlaps BRCA1 on chromosome 17. ... binding site motif score using position weight matrices. We show in Figure 2 an example of a variant from breast cancer Family 1041 that was shared

  14. Genomic Datasets for Cancer Research

    Cancer.gov

    A variety of datasets from genome-wide association studies of cancer and other genotype-phenotype studies, including sequencing and molecular diagnostic assays, are available to approved investigators through the Extramural National Cancer Institute Data Access Committee.

  15. Local sequence assembly reveals a high-resolution profile of somatic structural variations in 97 cancer genomes.

    PubMed

    Zhuang, Jiali; Weng, Zhiping

    2015-09-30

    Genomic structural variations (SVs) are pervasive in many types of cancers. Characterizing their underlying mechanisms and potential molecular consequences is crucial for understanding the basic biology of tumorigenesis. Here, we engineered a local assembly-based algorithm (laSV) that detects SVs with high accuracy from paired-end high-throughput genomic sequencing data and pinpoints their breakpoints at single base-pair resolution. By applying laSV to 97 tumor-normal paired genomic sequencing datasets across six cancer types produced by The Cancer Genome Atlas Research Network, we discovered that non-allelic homologous recombination is the primary mechanism for generating somatic SVs in acute myeloid leukemia. This finding contrasts with results for the other five types of solid tumors, in which non-homologous end joining and microhomology end joining are the predominant mechanisms. We also found that the genes recurrently mutated by single nucleotide alterations differed from the genes recurrently mutated by SVs, suggesting that these two types of genetic alterations play different roles during cancer progression. We further characterized how the gene structures of the oncogene JAK1 and the tumor suppressors KDM6A and RB1 are affected by somatic SVs and discussed the potential functional implications of intergenic SVs.

  16. A whole-genome sequence and transcriptome perspective on HER2-positive breast cancers

    PubMed Central

    Ferrari, Anthony; Vincent-Salomon, Anne; Pivot, Xavier; Sertier, Anne-Sophie; Thomas, Emilie; Tonon, Laurie; Boyault, Sandrine; Mulugeta, Eskeatnaf; Treilleux, Isabelle; MacGrogan, Gaëtan; Arnould, Laurent; Kielbassa, Janice; Le Texier, Vincent; Blanché, Hélène; Deleuze, Jean-François; Jacquemier, Jocelyne; Mathieu, Marie-Christine; Penault-Llorca, Frédérique; Bibeau, Frédéric; Mariani, Odette; Mannina, Cécile; Pierga, Jean-Yves; Trédan, Olivier; Bachelot, Thomas; Bonnefoi, Hervé; Romieu, Gilles; Fumoleau, Pierre; Delaloge, Suzette; Rios, Maria; Ferrero, Jean-Marc; Tarpin, Carole; Bouteille, Catherine; Calvo, Fabien; Gut, Ivo Glynne; Gut, Marta; Martin, Sancha; Nik-Zainal, Serena; Stratton, Michael R.; Pauporté, Iris; Saintigny, Pierre; Birnbaum, Daniel; Viari, Alain; Thomas, Gilles

    2016-01-01

    HER2-positive breast cancer has long proven to be a clinically distinct class of breast cancers for which several targeted therapies are now available. However, resistance to the treatment associated with specific gene expressions or mutations has been observed, revealing the underlying diversity of these cancers. Therefore, understanding the full extent of the HER2-positive disease heterogeneity still remains challenging. Here we carry out an in-depth genomic characterization of 64 HER2-positive breast tumour genomes that exhibit four subgroups, based on the expression data, with distinctive genomic features in terms of somatic mutations, copy-number changes or structural variations. The results suggest that, despite being clinically defined by a specific gene amplification, HER2-positive tumours melt into the whole luminal–basal breast cancer spectrum rather than standing apart. The results also lead to a refined ERBB2 amplicon of 106 kb and show that several cases of amplifications are compatible with a breakage–fusion–bridge mechanism. PMID:27406316

  17. Utility of comprehensive genomic sequencing for detecting HER2-positive colorectal cancer.

    PubMed

    Shimada, Yoshifumi; Yagi, Ryoma; Kameyama, Hitoshi; Nagahashi, Masayuki; Ichikawa, Hiroshi; Tajima, Yosuke; Okamura, Takuma; Nakano, Mae; Nakano, Masato; Sato, Yo; Matsuzawa, Takeaki; Sakata, Jun; Kobayashi, Takashi; Nogami, Hitoshi; Maruyama, Satoshi; Takii, Yasumasa; Kawasaki, Takashi; Homma, Kei-Ichi; Izutsu, Hiroshi; Kodama, Keisuke; Ring, Jennifer E; Protopopov, Alexei; Lyle, Stephen; Okuda, Shujiro; Akazawa, Kohei; Wakai, Toshifumi

    2017-02-21

    HER2-targeted therapy is considered effective for KRAS codon 12/13 wild-type, HER2-positive metastatic colorectal cancer (CRC). In general, HER2 status is determined by the use of immunohistochemistry (IHC) and fluorescence in situ hybridization (FISH). Comprehensive genomic sequencing (CGS) enables the detection of gene mutations and copy number alterations including KRAS mutation and HER2 amplification; however, little is known about the utility of CGS for detecting HER2-positive CRC. To assess its utility, we retrospectively investigated 201 patients with stage I-IV CRC. The HER2 status of the primary site was assessed using IHC and FISH, and HER2 amplification of the primary site was also assessed using CGS, and the findings of these approaches were compared in each patient. CGS successfully detected alterations in 415 genes including KRAS codon 12/13 mutation and HER2 amplification. Fifty-nine (29%) patients had a KRAS codon 12/13 mutation. Ten (5%) patients were diagnosed as HER2-positive because of HER2 IHC 3+, and the same 10 (5%) patients had HER2 amplification evaluated using CGS. The results of HER2 status and HER2 amplification were completely identical in all 201 patients (P < 0.001). Nine of the 10 HER2-positive patients were KRAS 12/13 wild-type and were considered possible candidates for HER2-targeted therapy. CGS has the same utility as IHC and FISH for detecting HER2-positive patients who are candidates for HER2-targeted therapy, and facilitates precision medicine and tailor-made treatment.
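
    The concordance reported in the record above can be checked with a small agreement calculation. The sketch below builds a 2x2 table from the counts stated in the abstract (201 patients, 10 positive by both IHC/FISH and CGS, no discordant cases) and computes observed agreement and Cohen's kappa. The function name and the choice of kappa as the agreement statistic are assumptions for illustration; the abstract does not say which test produced its P value.

```python
def agreement_stats(both_pos, only_a, only_b, both_neg):
    """Return (observed agreement, Cohen's kappa) for a 2x2 table of two tests."""
    n = both_pos + only_a + only_b + both_neg
    observed = (both_pos + both_neg) / n
    # Chance agreement, from each test's marginal positive rate.
    a_pos = (both_pos + only_a) / n
    b_pos = (both_pos + only_b) / n
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Counts taken from the abstract.
total = 201
her2_pos = 10   # positive by IHC/FISH and by CGS
kras_mut = 59   # KRAS codon 12/13 mutation

observed, kappa = agreement_stats(her2_pos, 0, 0, total - her2_pos)
her2_pos_pct = round(100 * her2_pos / total)
kras_mut_pct = round(100 * kras_mut / total)
```

    With no discordant cases, observed agreement and kappa are both 1, and the rounded percentages match the 5% and 29% quoted in the abstract.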

  18. Surveying Breast Cancer's Genomic Landscape.

    PubMed

    2016-07-01

    An in-depth analysis has produced the most comprehensive portrait to date of the myriad genomic alterations involved in breast cancer. In sequencing the whole genomes of 560 breast cancers and combining this information with published data from another 772 breast tumors, the research team uncovered several new genes and mutational signatures that potentially influence this disease.

  19. Prenatal Whole Genome Sequencing

    PubMed Central

    Donley, Greer; Hull, Sara Chandros; Berkman, Benjamin E.

    2014-01-01

    With whole genome sequencing set to become the preferred method of prenatal screening, we need to pay more attention to the massive amount of information it will deliver to parents—and the fact that we don't yet understand what most of it means. PMID:22777977

  20. Toward nanoscale genome sequencing.

    PubMed

    Ryan, Declan; Rahimi, Maryam; Lund, John; Mehta, Ranjana; Parviz, Babak A

    2007-09-01

    This article reports on the state-of-the-art technologies that sequence DNA using miniaturized devices. The article considers the miniaturization of existing technologies for sequencing DNA and the opportunities for cost reduction that 'on-chip' devices can deliver. The ability to construct nano-scale structures and perform measurements using novel nano-scale effects has provided new opportunities to identify nucleotides directly using physical, and not chemical, methods. The challenges that these technologies need to overcome to provide a US$1000-genome sequencing technology are also presented.

  1. Preferences for return of incidental findings from genome sequencing among women diagnosed with breast cancer at a young age.

    PubMed

    Kaphingst, K A; Ivanovich, J; Biesecker, B B; Dresser, R; Seo, J; Dressler, L G; Goodfellow, P J; Goodman, M S

    2016-03-01

    While experts have made recommendations, information is needed regarding what genome sequencing results patients would want returned. We investigated what results women diagnosed with breast cancer at a young age would want returned and why. We conducted 60 semi-structured, in-person individual interviews with women diagnosed with breast cancer at age 40 or younger. We examined interest in six types of incidental findings and reasons for interest or disinterest in each type. Two coders independently coded interview transcripts; analysis was conducted using NVivo 10. Most participants were at least somewhat interested in all six result types, but strongest interest was in actionable results (i.e. variants affecting risk of a preventable or treatable disease and treatment response). Reasons for interest varied between different result types. Some participants were not interested or ambivalent about results not seen as currently actionable. Participants wanted to be able to choose what results are returned. Participants distinguished between types of individual genome sequencing results, with different reasons for wanting different types of information. The findings suggest that a focus on actionable results can be a common ground for all stakeholders in developing a policy for returning individual genome sequencing results.

  2. The Genome-Wide Analysis of Carcinoembryonic Antigen Signaling by Colorectal Cancer Cells Using RNA Sequencing

    PubMed Central

    Gorbunova, Anna; Evsyukov, Igor; Rayko, Michael; Gapon, Svetlana; Bozhokina, Ekaterina; Shishkin, Alexander; O’Brien, Stephen J.

    2016-01-01

    Carcinoembryonic antigen (CEA, CEACAM5, CD66) is a promoter of metastasis in epithelial cancers that is widely used as a prognostic clinical marker of metastasis. The aim of this study is to identify the network of genes that are associated with CEA-induced colorectal cancer liver metastasis. We compared the genome-wide transcriptomic profiles of CEA positive (MIP101 clone 8) and CEA negative (MIP 101) colorectal cancer cell lines with different metastatic potential in vivo. The CEA-producing cells displayed quantitative changes in the level of expression for 100 genes (over-expressed or down-regulated). They were confirmed by quantitative RT-PCR. The KEGG pathway analysis identified 4 significantly enriched pathways: cytokine-cytokine receptor interaction, MAPK signaling pathway, TGF-beta signaling pathway and pyrimidine metabolism. Our results suggest that CEA production by colorectal cancer cells triggers colorectal cancer progression by inducing the epithelial-mesenchymal transition, increasing tumor cell invasiveness into the surrounding tissues and suppressing stress and apoptotic signaling. The novel gene expression distinctions establish the relationships between the existing cancer markers and implicate new potential biomarkers for colorectal cancer hepatic metastasis. PMID:27583792

  3. Genomic Rearrangements in Prostate Cancer

    PubMed Central

    Barbieri, Christopher E.; Rubin, Mark A.

    2014-01-01

    Purpose of review: Genomic instability is a fundamental feature of human cancer, leading to the activation of oncogenes and inactivation of tumor suppressors. In prostate cancer, structural genomic rearrangements, resulting in gene fusions, amplifications and deletions, are a critical mechanism effecting these alterations. Here we review recent literature regarding the importance of genomic rearrangements in the pathogenesis of prostate cancer and the potential impact on patient care. Recent findings: Next generation sequencing has revealed a striking abundance, complexity, and heterogeneity of genomic rearrangements in prostate cancer. These recent studies have nominated a number of processes in predisposing prostate cancer to genomic rearrangements, including androgen-induced transcription. Summary: Structural rearrangements are the critical mechanism resulting in the characteristic genomic changes associated with prostate cancer pathogenesis and progression. Future studies will determine if the impact of these events on tumor phenotypes can be translated to clinical utility for patient prognosis and choices of management strategies. PMID:25393273

  4. Sequencing the maize genome.

    PubMed

    Martienssen, Robert A; Rabinowicz, Pablo D; O'Shaughnessy, Andrew; McCombie, W Richard

    2004-04-01

    Sequencing of complex genomes can be accomplished by enriching shotgun libraries for genes. In maize, gene-enrichment by copy-number normalization (high C(0)t) and methylation filtration (MF) have been used to generate up to two-fold coverage of the gene-space with less than 1 million sequencing reads. Simulations using sequenced bacterial artificial chromosome (BAC) clones predict that 5x coverage of gene-rich regions, accompanied by less than 1x coverage of subclones from BAC contigs, will generate high-quality mapped sequence that meets the needs of geneticists while accommodating unusually high levels of structural polymorphism. By sequencing several inbred strains, we propose a strategy for capturing this polymorphism to investigate hybrid vigor or heterosis.

  5. Computational methods for detecting copy number variations in cancer genome using next generation sequencing: principles and challenges

    PubMed Central

    Liu, Biao; Morrison, Carl D.; Johnson, Candace S.; Trump, Donald L.; Qin, Maochun; Conroy, Jeffrey C.; Wang, Jianmin; Liu, Song

    2013-01-01

    Accurate detection of somatic copy number variations (CNVs) is an essential part of cancer genome analysis, and plays an important role in oncotarget identifications. Next generation sequencing (NGS) holds the promise to revolutionize somatic CNV detection. In this review, we provide an overview of current analytic tools used for CNV detection in NGS-based cancer studies. We summarize the NGS data types used for CNV detection, decipher the principles for data preprocessing, segmentation, and interpretation, and discuss the challenges in somatic CNV detection. This review aims to provide a guide to the analytic tools used in NGS-based cancer CNV studies, and to discuss the important factors that researchers need to consider when analyzing NGS data for somatic CNV detections. PMID:24240121

  6. Cancer systems biology in the genome sequencing era: part 2, evolutionary dynamics of tumor clonal networks and drug resistance.

    PubMed

    Wang, Edwin; Zou, Jinfeng; Zaman, Naif; Beitel, Lenore K; Trifiro, Mark; Paliouras, Miltiadis

    2013-08-01

    A tumor often consists of multiple cell subpopulations (clones). Current chemo-treatments often target one clone of a tumor. Although the drug kills that clone, other clones overtake it and the tumor recurs. Genome sequencing and computational analysis allow computational dissection of clones from tumors, while single-cell genome sequencing, including RNA-Seq, allows profiling of these clones. This opens a new window for treating a tumor as a system in which clones are evolving. Future cancer systems biology studies should consider a tumor as an evolving system with multiple clones. Therefore, topics discussed in Part 2 of this review include evolutionary dynamics of clonal networks, early-warning signals (e.g., genome duplication events) for formation of fast-growing clones, dissecting tumor heterogeneity, and modeling of clone-clone-stroma interactions for drug resistance. The ultimate goal of future systems biology analysis is to obtain a 'whole-system' understanding of a tumor and thereby provide more efficient and personalized management strategies for cancer patients.

  7. From human genome to cancer genome: The first decade

    PubMed Central

    Wheeler, David A.; Wang, Linghua

    2013-01-01

    The realization that cancer progression required the participation of cellular genes provided one of several key rationales, in 1986, for embarking on the human genome project. Only with a reference genome sequence could the full spectrum of somatic changes leading to cancer be understood. Since its completion in 2003, the human reference genome sequence has fulfilled its promise as a foundational tool to illuminate the pathogenesis of cancer. Herein, we review the key historical milestones in cancer genomics since the completion of the genome, and some of the novel discoveries that are shaping our current understanding of cancer. PMID:23817046

  8. Towards Sequencing Cotton (Gossypium) Genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Despite rapidly decreasing costs and innovative technologies, sequencing of angiosperm genomes is not yet undertaken lightly. Generating larger amounts of sequence data more quickly does not address the difficulties of sequencing and assembling complex genomes de novo. The cotton genomes represent a...

  9. Identification of cancer predisposition variants in apparently healthy individuals using a next-generation sequencing-based family genomics approach.

    PubMed

    Karageorgos, Ioannis; Mizzi, Clint; Giannopoulou, Efstathia; Pavlidis, Cristiana; Peters, Brock A; Zagoriti, Zoi; Stenson, Peter D; Mitropoulos, Konstantinos; Borg, Joseph; Kalofonos, Haralabos P; Drmanac, Radoje; Stubbs, Andrew; van der Spek, Peter; Cooper, David N; Katsila, Theodora; Patrinos, George P

    2015-06-20

    Cancer, like many common disorders, has a complex etiology, often with a strong genetic component and with multiple environmental factors contributing to susceptibility. A considerable number of genomic variants have been previously reported to be causative of, or associated with, an increased risk for various types of cancer. Here, we adopted a next-generation sequencing approach in 11 members of two families of Greek descent to identify all genomic variants with the potential to predispose family members to cancer. Cross-comparison with data from the Human Gene Mutation Database identified a total of 571 variants, from which 47 % were disease-associated polymorphisms, 26 % disease-associated polymorphisms with additional supporting functional evidence, 19 % functional polymorphisms with in vitro/laboratory or in vivo supporting evidence but no known disease association, 4 % putative disease-causing mutations but with some residual doubt as to their pathological significance, and 3 % disease-causing mutations. Subsequent analysis, focused on the latter variant class most likely to be involved in cancer predisposition, revealed two variants of prime interest, namely MSH2 c.2732T>A (p.L911R) and BRCA1 c.2955delC, the first of which is novel. KMT2D c.13895delC and c.1940C>A variants are additionally reported as incidental findings. The next-generation sequencing-based family genomics approach described herein has the potential to be applied to other types of complex genetic disorder in order to identify variants of potential pathological significance.

  10. Identification of cancer risk lncRNAs and cancer risk pathways regulated by cancer risk lncRNAs based on genome sequencing data in human cancers

    PubMed Central

    Li, Yiran; Li, Wan; Liang, Binhua; Li, Liansheng; Wang, Li; Huang, Hao; Guo, Shanshan; Wang, Yahui; He, Yuehan; Chen, Lina; He, Weiming

    2016-01-01

    Cancer is a group of diseases involving abnormal cell growth with the potential to invade or spread to other parts of the body. The complexity of cancer can be reduced to a small number of underlying principles like cancer hallmarks which could govern the transformation of normal cells to cancer. Besides, the growth and metastasis of cancer often relate to combined effects of long non-coding RNAs (lncRNAs). Here, we performed comprehensive analysis for lncRNA expression profiles and clinical data of six types of human cancer patients from The Cancer Genome Atlas (TCGA), and identified six risk pathways and twenty three lncRNAs. In addition, twenty three cancer risk lncRNAs which were closely related to the occurrence or development of cancer had a good classification performance for samples of testing datasets of six cancer datasets. More important, these lncRNAs were able to separate samples in the entire cancer dataset into high-risk group and low-risk group with significantly different overall survival (OS), which was further validated in ten validation datasets. In our study, the robust and effective cancer biomarkers were obtained from cancer datasets which had information of normal-tumor samples. Overall, our research can provide a new perspective for the further study of clinical diagnosis and treatment of cancer. PMID:27991568

  11. Identification of cancer risk lncRNAs and cancer risk pathways regulated by cancer risk lncRNAs based on genome sequencing data in human cancers.

    PubMed

    Li, Yiran; Li, Wan; Liang, Binhua; Li, Liansheng; Wang, Li; Huang, Hao; Guo, Shanshan; Wang, Yahui; He, Yuehan; Chen, Lina; He, Weiming

    2016-12-19

    Cancer is a group of diseases involving abnormal cell growth with the potential to invade or spread to other parts of the body. The complexity of cancer can be reduced to a small number of underlying principles like cancer hallmarks which could govern the transformation of normal cells to cancer. Besides, the growth and metastasis of cancer often relate to combined effects of long non-coding RNAs (lncRNAs). Here, we performed comprehensive analysis for lncRNA expression profiles and clinical data of six types of human cancer patients from The Cancer Genome Atlas (TCGA), and identified six risk pathways and twenty three lncRNAs. In addition, twenty three cancer risk lncRNAs which were closely related to the occurrence or development of cancer had a good classification performance for samples of testing datasets of six cancer datasets. More important, these lncRNAs were able to separate samples in the entire cancer dataset into high-risk group and low-risk group with significantly different overall survival (OS), which was further validated in ten validation datasets. In our study, the robust and effective cancer biomarkers were obtained from cancer datasets which had information of normal-tumor samples. Overall, our research can provide a new perspective for the further study of clinical diagnosis and treatment of cancer.

  12. Alternative preprocessing of RNA-Sequencing data in The Cancer Genome Atlas leads to improved analysis results

    PubMed Central

    Rahman, Mumtahena; Jackson, Laurie K.; Johnson, W. Evan; Li, Dean Y.; Bild, Andrea H.; Piccolo, Stephen R.

    2015-01-01

    Motivation: The Cancer Genome Atlas (TCGA) RNA-Sequencing data are used widely for research. TCGA provides ‘Level 3’ data, which have been processed using a pipeline specific to that resource. However, we have found using experimentally derived data that this pipeline produces gene-expression values that vary considerably across biological replicates. In addition, some RNA-Sequencing analysis tools require integer-based read counts, which are not provided with the Level 3 data. As an alternative, we have reprocessed the data for 9264 tumor and 741 normal samples across 24 cancer types using the Rsubread package. We have also collated corresponding clinical data for these samples. We provide these data as a community resource. Results: We compared TCGA samples processed using either pipeline and found that the Rsubread pipeline produced fewer zero-expression genes and more consistent expression levels across replicate samples than the TCGA pipeline. Additionally, we used a genomic-signature approach to estimate HER2 (ERBB2) activation status for 662 breast-tumor samples and found that the Rsubread data resulted in stronger predictions of HER2 pathway activity. Finally, we used data from both pipelines to classify 575 lung cancer samples based on histological type. This analysis identified various non-coding RNA that may influence lung-cancer histology. Availability and implementation: The RNA-Sequencing and clinical data can be downloaded from Gene Expression Omnibus (accession number GSE62944). Scripts and code that were used to process and analyze the data are available from https://github.com/srp33/TCGA_RNASeq_Clinical. Contact: stephen_piccolo@byu.edu or andreab@genetics.utah.edu Supplementary information: Supplementary material is available at Bioinformatics online. PMID:26209429

  13. Office of Cancer Genomics

    Cancer.gov

    The mission of the NCI’s Office of Cancer Genomics (OCG) is to enhance the understanding of the molecular mechanisms of cancer, advance and accelerate genomics science and technology development, and efficiently translate the genomics data to improve cancer research, prevention, early detection, diagnosis and treatment.

  14. Genome Sequence Databases (Overview): Sequencing and Assembly

    SciTech Connect

    Lapidus, Alla L.

    2009-01-01

    From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three dimensional structure of the DNA molecule, biologists tried to decode the secrets of life hidden within these long molecules, and technologists invent and improve methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different length (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Despite the fact that automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases, it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents the genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (approximately 1 error/10,000 bp), validated through a number of computer and laboratory experiments.
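
    To put the two consensus error rates quoted above (about 1/2000 bp for a draft assembly, about 1/10,000 bp for a finished genome) into perspective, the following minimal sketch converts each rate into an expected error count and into the conventional Phred-scaled quality, Q = -10 log10(p). The 5 Mb genome size and all function names are illustrative assumptions and do not come from the record itself.

```python
import math

# Consensus error rates quoted in the abstract (errors per base pair).
DRAFT_ERROR_RATE = 1 / 2000
FINISHED_ERROR_RATE = 1 / 10_000

# Hypothetical genome size chosen only for illustration (not from the abstract).
EXAMPLE_GENOME_SIZE_BP = 5_000_000


def expected_errors(genome_size_bp: int, error_rate_per_bp: float) -> float:
    """Expected number of consensus errors, assuming errors are spread uniformly."""
    return genome_size_bp * error_rate_per_bp


def phred_quality(error_rate_per_bp: float) -> float:
    """Phred-scaled quality: Q = -10 * log10(error probability)."""
    return -10.0 * math.log10(error_rate_per_bp)


if __name__ == "__main__":
    for label, rate in (("draft", DRAFT_ERROR_RATE), ("finished", FINISHED_ERROR_RATE)):
        errors = expected_errors(EXAMPLE_GENOME_SIZE_BP, rate)
        quality = phred_quality(rate)
        print(f"{label}: ~{errors:.0f} expected errors, Phred Q ~ {quality:.1f}")
```

    Under these assumptions the fivefold lower error rate of a finished genome corresponds to a fivefold lower expected error count for the same genome size; the uniform-error assumption is a simplification, since real assembly errors tend to cluster in repeats and low-coverage regions.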

  15. The Global Cancer Genomics Consortium: interfacing genomics and cancer medicine.

    PubMed

    2012-08-01

    The Global Cancer Genomics Consortium (GCGC) is an international collaborative platform that amalgamates cancer biologists, cutting-edge genomics, and high-throughput expertise with medical oncologists and surgical oncologists; they address the most important translational questions that are central to cancer research and treatment. The annual GCGC symposium was held at the Advanced Centre for Treatment Research and Education in Cancer, Mumbai, India, from November 9 to 11, 2011. The symposium showcased international next-generation sequencing efforts that explore cancer-specific transcriptomic changes, single-nucleotide polymorphism, and copy number variations in various types of cancers, as well as the structural genomics approach to develop new therapeutic targets and chemical probes. From the spectrum of studies presented at the symposium, it is evident that the translation of emerging cancer genomics knowledge into clinical applications can only be achieved through the integration of multidisciplinary expertise. In summary, the GCGC symposium provided practical knowledge on structural and cancer genomics approaches, as well as an exclusive platform for focused cancer genomics endeavors.

  16. Fungal Genome Sequencing and Bioenergy

    SciTech Connect

    Baker, Scott E.; Thykaer, Jette; Adney, William S.; Brettin, T.; Brockman, Fred J.; D'haeseleer, Patrik; Martinez, Antonio D.; Miller, R. M.; Rokhsar, Daniel S.; Schadt, Christopher W.; Torok, Tamas; Tuskan, Gerald; Bennett, Joan W.; Berka, Randy; Briggs, Steve; Heitman, Joseph; Taylor, John; Turgeon, Barbara G.; Werner-Washburne, Maggie; Himmel, Michael E.

    2008-09-30

    To date, the number of ongoing filamentous fungal genome sequencing projects is almost tenfold fewer than those of bacterial and archaeal genome projects. The fungi chosen for sequencing represent narrow kingdom diversity; most are pathogens or models. We advocate an ambitious, forward-looking phylogenetic-based genome sequencing program, designed to capture metabolic diversity within the fungal kingdom, thereby enhancing research into alternative bioenergy sources, bioremediation, and fungal-environment interactions.

  17. Whole-genome sequencing of bladder cancers reveals somatic CDKN1A mutations and clinicopathological associations with mutation burden.

    PubMed

    Cazier, J-B; Rao, S R; McLean, C M; Walker, A K; Walker, A L; Wright, B J; Jaeger, E E M; Kartsonaki, C; Marsden, L; Yau, C; Camps, C; Kaisaki, P; Taylor, J; Catto, J W; Tomlinson, I P M; Kiltie, A E; Hamdy, F C

    2014-04-29

    Bladder cancers are a leading cause of death from malignancy. Molecular markers might predict disease progression and behaviour more accurately than the available prognostic factors. Here we use whole-genome sequencing to identify somatic mutations and chromosomal changes in 14 bladder cancers of different grades and stages. As well as detecting the known bladder cancer driver mutations, we report the identification of recurrent protein-inactivating mutations in CDKN1A and FAT1. The former are not mutually exclusive with TP53 mutations or MDM2 amplification, showing that CDKN1A dysfunction is not simply an alternative mechanism for p53 pathway inactivation. We find strong positive associations between higher tumour stage/grade and greater clonal diversity, the number of somatic mutations and the burden of copy number changes. In principle, the identification of sub-clones with greater diversity and/or mutation burden within early-stage or low-grade tumours could identify lesions with a high risk of invasive progression.

  18. Venturia carpophila draft genome sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Venturia carpophila causes peach scab, a disease that renders peach fruit unmarketable. We report a high-quality draft genome sequence (36.9 Mb) of V. carpophila from an isolate collected from a peach tree in central Georgia in the United States. The genome sequence described will be a useful resour...

  19. Targeted Sequencing of the Mitochondrial Genome of Women at High Risk of Breast Cancer without Detectable Mutations in BRCA1/2

    PubMed Central

    Blein, Sophie; Barjhoux, Laure; Damiola, Francesca; Dondon, Marie-Gabrielle; Eon-Marchais, Séverine; Marcou, Morgane; Caron, Olivier; Lortholary, Alain; Buecher, Bruno; Berthet, Pascaline; Noguès, Catherine; Lasset, Christine; Gauthier-Villars, Marion; Mazoyer, Sylvie; Stoppa-Lyonnet, Dominique; Andrieu, Nadine; Cox, David G.

    2015-01-01

    Breast Cancer is a complex multifactorial disease for which high-penetrance mutations have been identified. Approaches used to date have identified genomic features explaining about 50% of breast cancer heritability. A number of low- to medium penetrance alleles (per-allele odds ratio < 1.5 and 4.0, respectively) have been identified, suggesting that the remaining heritability is likely to be explained by the cumulative effect of such alleles and/or by rare high-penetrance alleles. Relatively few studies have specifically explored the mitochondrial genome for variants potentially implicated in breast cancer risk. For these reasons, we propose an exploration of the variability of the mitochondrial genome in individuals diagnosed with breast cancer, having a positive breast cancer family history but testing negative for BRCA1/2 pathogenic mutations. We sequenced the mitochondrial genome of 436 index breast cancer cases from the GENESIS study. As expected, no pathogenic genomic pattern common to the 436 women included in our study was observed. The mitochondrial genes MT-ATP6 and MT-CYB were observed to carry the highest number of variants in the study. The proteins encoded by these genes are involved in the structure of the mitochondrial respiration chain, and variants in these genes may impact reactive oxygen species production contributing to carcinogenesis. More functional and epidemiological studies are needed to further investigate to what extent variants identified may influence familial breast cancer risk. PMID:26406445

  20. Genomic DNA of MCF-7 breast cancer cells not an ideal choice as positive control for PCR amplification based detection of Mouse Mammary Tumor Virus-Like Sequences.

    PubMed

    Kulkarni, Bhushan B; Hiremath, Shivaprakash V; Kulkarni, Suyamindra S; Hallikeri, Umesh R; Patil, Basavaraj R; Gai, Pramod B

    2013-11-01

    The identification of the etiology of breast cancer is a crucial research issue for the development of effective preventive and treatment strategies. Researchers are exploring the possible involvement of Mouse Mammary Tumor Virus (MMTV) in causing human breast cancer. Hence, it becomes very important to use a consistent positive control agent in PCR amplification based detection of MMTV-Like Sequence (MMTV-LS) in human breast cancer for accurate and reproducible results. This study was done to investigate the feasibility of using genomic DNA of MCF-7 breast cancer cells to detect MMTV-LS using PCR amplification based detection. The MMTV env and SAG genes located at the 3' long terminal repeat (LTR) sequences were targeted for the PCR based detection. No amplification was observed with the genomic DNA of MCF-7 breast cancer cells. However, the 2.7 kb DNA fragment comprising MMTV env and SAG LTR sequences yielded products of the desired size. From these results it can be concluded that genomic DNA of MCF-7 cells is not a suitable choice as a positive control for PCR or RT-PCR based detection of MMTV-LS. It is also suggested that plasmids containing the cloned genes or sequences of MMTV be used as a positive control for detection of MMTV-LS.

  1. Genome instability mechanisms and the structure of cancer genomes.

    PubMed

    Cassidy, Liam D; Venkitaraman, Ashok R

    2012-02-01

    Genomic instability is a hallmark of cancer cells, and arises from the aberrations that these cells exhibit in the normal biological mechanisms that repair and replicate the genome, or ensure its accurate segregation during cell division. Increasingly detailed descriptions of cancer genomes have begun to emerge from next-generation sequencing (NGS), providing snapshots of their nature and heterogeneity in different cancers at different stages in their evolution. Here, we attempt to extract from these sequencing studies insights into the role of genome instability mechanisms in carcinogenesis, and to identify challenges impeding further progress.

  2. A comparison of isolated circulating tumor cells and tissue biopsies using whole-genome sequencing in prostate cancer

    PubMed Central

    Chen, Jie-Fu; Lin, Millicent; Li, Fuqiang; Wu, Kui; Wu, Hanjie; Lichterman, Jake; Wan, Haolei; Lu, Chia-Lun; OuYang, William; Ni, Ming; Wang, Linlin; Li, Guibo; Lee, Tom; Zhang, Xiuqing; Yang, Jonathan; Rettig, Matthew; Chung, Leland W.K.; Yang, Huanming; Li, Ker-Chau; Hou, Yong; Tseng, Hsian-Rong; Hou, Shuang; Xu, Xun; Wang, Jun; Posadas, Edwin M.

    2015-01-01

    Previous studies have demonstrated focal but limited molecular similarities between circulating tumor cells (CTCs) and biopsies using isolated genetic assays. We hypothesized that molecular similarity between CTCs and tissue exists at the single cell level when characterized by whole genome sequencing (WGS). By combining the NanoVelcro CTC Chip with laser capture microdissection (LCM), we developed a platform for single-CTC WGS. We performed this procedure on CTCs and tissue samples from a patient with advanced prostate cancer who had serial biopsies over the course of his clinical history. We achieved 30X depth and ≥ 95% coverage. Twenty-nine percent of the somatic single nucleotide variations (SSNVs) identified were founder mutations that were also identified in CTCs. In addition, 86% of the clonal mutations identified in CTCs could be traced back to either the primary or metastatic tumors. In this patient, we identified structural variations (SVs) including an intrachromosomal rearrangement in chr3 and an interchromosomal rearrangement between chr13 and chr15. These rearrangements were shared between tumor tissues and CTCs. At the same time, highly heterogeneous short structural variants were discovered in PTEN, RB1, and BRCA2 in all tumor and CTC samples. Using high-quality WGS on single-CTCs, we identified the shared genomic alterations between CTCs and tumor tissues. This approach yielded insight into the heterogeneity of the mutational landscape of SSNVs and SVs. It may be possible to use this approach to study heterogeneity and characterize the biological evolution of a cancer during the course of its natural history. PMID:26575023

  3. Whole-exome/genome sequencing and genomics.

    PubMed

    Grody, Wayne W; Thompson, Barry H; Hudgins, Louanne

    2013-12-01

    As medical genetics has progressed from a descriptive entity to one focused on the functional relationship between genes and clinical disorders, emphasis has been placed on genomics. Genomics, a subelement of genetics, is the study of the genome, the sum total of all the genes of an organism. The human genome, which is contained in the 23 pairs of nuclear chromosomes and in the mitochondrial DNA of each cell, comprises >6 billion nucleotides of genetic code. There are some 23,000 protein-coding genes, a surprisingly small fraction of the total genetic material, with the remainder composed of noncoding DNA, regulatory sequences, and introns. The Human Genome Project, launched in 1990, produced a draft of the genome in 2001 and then a finished sequence in 2003, on the 50th anniversary of the initial publication of Watson and Crick's paper on the double-helical structure of DNA. Since then, this mass of genetic information has been translated at an ever-increasing pace into useable knowledge applicable to clinical medicine. The recent advent of massively parallel DNA sequencing (also known as shotgun, high-throughput, and next-generation sequencing) has brought whole-genome analysis into the clinic for the first time, and most of the current applications are directed at children with congenital conditions that are undiagnosable by using standard genetic tests for single-gene disorders. Thus, pediatricians must become familiar with this technology, what it can and cannot offer, and its technical and ethical challenges. Here, we address the concepts of human genomic analysis and its clinical applicability for primary care providers.

  4. Information Topics of Greatest Interest for Return of Genome Sequencing Results among Women Diagnosed with Breast Cancer at a Young Age.

    PubMed

    Seo, Joann; Ivanovich, Jennifer; Goodman, Melody S; Biesecker, Barbara B; Kaphingst, Kimberly A

    2016-08-20

    We investigated what information women diagnosed with breast cancer at a young age would want to learn when genome sequencing results are returned. We conducted 60 semi-structured interviews with women diagnosed with breast cancer at age 40 or younger. We examined what specific information participants would want to learn across result types and for each type of result, as well as how much information they would want. Genome sequencing was not offered to participants as part of the study. Two coders independently coded interview transcripts; analysis was conducted using NVivo10. Across result types, participants wanted to learn about health implications, risk and prevalence in quantitative terms, causes of variants, and causes of diseases. Participants wanted to learn actionable information for variants affecting risk of preventable or treatable disease, medication response, and carrier status. The amount of desired information differed for variants affecting risk of unpreventable or untreatable disease, with uncertain significance, and not health-related. Women diagnosed with breast cancer at a young age recognize the value of genome sequencing results in identifying potential causes and effective treatments and expressed interest in using the information to help relatives and to further understand their other health risks. Our findings can inform the development of effective feedback strategies for genome sequencing that meet patients' information needs and preferences.

  5. Comparative effectiveness of next generation genomic sequencing for disease diagnosis: Design of a randomized controlled trial in patients with colorectal cancer/polyposis syndromes

    PubMed Central

    Gallego, Carlos J.; Bennette, Caroline S.; Heagerty, Patrick; Comstock, Bryan; Horike-Pyne, Martha; Hisama, Fuki; Amendola, Laura M.; Bennett, Robin L.; Dorschner, Michael O.; Tarczy-Hornoch, Peter; Grady, William M.; Fullerton, S. Malia; Trinidad, Susan B.; Regier, Dean A.; Nickerson, Deborah A.; Burke, Wylie; Patrick, Donald L.; Jarvik, Gail P.; Veenstra, David L.

    2014-01-01

    Whole exome and whole genome sequencing are applications of next generation sequencing transforming clinical care, but there is little evidence whether these tests improve patient outcomes or if they are cost effective compared to current standard of care. These gaps in knowledge can be addressed by comparative effectiveness and patient-centered outcomes research. We designed a randomized controlled trial that incorporates these research methods to evaluate whole exome sequencing compared to usual care in patients being evaluated for hereditary colorectal cancer and polyposis syndromes. Approximately 220 patients will be randomized and followed for 12 months after return of genomic findings. Patients will receive findings associated with colorectal cancer in a first return of result visit, and findings not associated with colorectal cancer (incidental findings) during a second return of result visit. The primary outcome is efficacy to detect mutations associated with these syndromes; secondary outcomes include psychosocial impact, cost-effectiveness and comparative costs. The secondary outcomes will be obtained via surveys before and after each return visit. The expected challenges in conducting this randomized controlled trial include the relatively low prevalence of genetic disease, difficult interpretation of some genetic variants, and uncertainty about which incidental findings should be returned to patients. The approaches utilized in this study may help guide other investigators in clinical genomics to identify useful outcome measures and strategies to address comparative effectiveness questions about the clinical implementation of genomic sequencing in clinical care. PMID:24997220

  6. Whole-genome sequences of DA and F344 rats with different susceptibilities to arthritis, autoimmunity, inflammation and cancer.

    PubMed

    Guo, Xiaosen; Brenner, Max; Zhang, Xuemei; Laragione, Teresina; Tai, Shuaishuai; Li, Yanhong; Bu, Junjie; Yin, Ye; Shah, Anish A; Kwan, Kevin; Li, Yingrui; Jun, Wang; Gulko, Pércio S

    2013-08-01

    DA (D-blood group of Palm and Agouti, also known as Dark Agouti) and F344 (Fischer) are two inbred rat strains with differences in several phenotypes, including susceptibility to autoimmune disease models and inflammatory responses. While these strains have been extensively studied, little information is available about the DA and F344 genomes, as only the Brown Norway (BN) and spontaneously hypertensive rat strains have been sequenced to date. Here we report the sequencing of the DA and F344 genomes using next-generation Illumina paired-end read technology and the first de novo assembly of a rat genome. DA and F344 were sequenced with an average depth of 32-fold, covered 98.9% of the BN reference genome, and included 97.97% of known rat ESTs. New sequences could be assigned to 59 million positions with previously unknown data in the BN reference genome. Differences between DA, F344, and BN included 19 million positions in novel scaffolds, 4.09 million single nucleotide polymorphisms (SNPs) (including 1.37 million new SNPs), 458,224 short insertions and deletions, and 58,174 structural variants. Genetic differences between DA, F344, and BN, including high-impact SNPs and short insertions and deletions affecting >2500 genes, are likely to account for most of the phenotypic variation between these strains. The new DA and F344 genome sequencing data should facilitate gene discovery efforts in rat models of human disease.

  7. Genomic Sequencing in Determining Treatment in Patients With Metastatic Cancer or Cancer That Cannot Be Removed by Surgery

    ClinicalTrials.gov

    2017-02-20

    Metastatic Neoplasm; Recurrent Neoplasm; Recurrent Non-Small Cell Lung Carcinoma; Stage IIIA Non-Small Cell Lung Cancer; Stage IIIB Non-Small Cell Lung Cancer; Stage IV Non-Small Cell Lung Cancer; Unresectable Malignant Neoplasm

  8. Integrating sequence, evolution and functional genomics in regulatory genomics

    PubMed Central

    Vingron, Martin; Brazma, Alvis; Coulson, Richard; van Helden, Jacques; Manke, Thomas; Palin, Kimmo; Sand, Olivier; Ukkonen, Esko

    2009-01-01

    With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome. PMID:19226437

  9. Network Biomarkers of Bladder Cancer Based on a Genome-Wide Genetic and Epigenetic Network Derived from Next-Generation Sequencing Data

    PubMed Central

    Li, Cheng-Wei

    2016-01-01

    Epigenetic and microRNA (miRNA) regulation are associated with carcinogenesis and the development of cancer. By using the available omics data, including those from next-generation sequencing (NGS), genome-wide methylation profiling, candidate integrated genetic and epigenetic network (IGEN) analysis, and drug response genome-wide microarray analysis, we constructed an IGEN system based on three coupling regression models that characterize protein-protein interaction networks (PPINs), gene regulatory networks (GRNs), miRNA regulatory networks (MRNs), and epigenetic regulatory networks (ERNs). By applying system identification method and principal genome-wide network projection (PGNP) to IGEN analysis, we identified the core network biomarkers to investigate bladder carcinogenic mechanisms and design multiple drug combinations for treating bladder cancer with minimal side-effects. The progression of DNA repair and cell proliferation in stage 1 bladder cancer ultimately results not only in the derepression of miR-200a and miR-200b but also in the regulation of the TNF pathway to metastasis-related genes or proteins, cell proliferation, and DNA repair in stage 4 bladder cancer. We designed a multiple drug combination comprising gefitinib, estradiol, yohimbine, and fulvestrant for treating stage 1 bladder cancer with minimal side-effects, and another multiple drug combination comprising gefitinib, estradiol, chlorpromazine, and LY294002 for treating stage 4 bladder cancer with minimal side-effects. PMID:27034531

  10. Malaria Genome Sequencing Project

    DTIC Science & Technology

    2004-01-01

    spectrometry identified authentic peptides corresponding to proteins encoded by 2,391 of the genes, including...Caenorhabditis elegans...16 P-type ATPases. An Nramp divalent cation transporter was identified which may be...numbers seen in S. cerevisiae, S. pombe or Caenorhabditis elegans...used for sequencing were not available. Although large-insert yeast artificial chromo-...and Caenorhabditis elegans were nearing completion. Two

  11. Genomic Data Commons | Office of Cancer Genomics

    Cancer.gov

    The NCI’s Center for Cancer Genomics launches the Genomic Data Commons (GDC), a unified data sharing platform for the cancer research community. The mission of the GDC is to enable data sharing across the entire cancer research community, to ultimately support precision medicine in oncology.

  12. Genome Sequence of Spizellomyces punctatus

    PubMed Central

    Russ, Carsten; Lang, B. Franz; Chen, Zehua; Gujja, Sharvari; Shea, Terrance; Zeng, Qiandong; Young, Sarah; Nusbaum, Chad

    2016-01-01

    Spizellomyces punctatus is a basally branching chytrid fungus that is found in the Chytridiomycota phylum. Spizellomyces species are common in soil and of importance in terrestrial ecosystems. Here, we report the genome sequence of S. punctatus, which will facilitate the study of this group of early diverging fungi. PMID:27540072

  13. Fusicladium effusum draft genome sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The pecan scab fungus (Fusicladium effusum [G. Winter]) is an economically important pathogen of pecan (Carya illinoinensis [Wangenh]. K. Koch), on account of its impact on yield and quality of valuable nutmeats. We describe the first draft genome sequence of F. effusum, the characteristics of annot...

  14. Identification of novel SNPs by next-generation sequencing of the genomic region containing the APC gene in colorectal cancer patients in China.

    PubMed

    Cheng, Yin; Wang, Jun; Shao, Jiaofang; Chen, Qiyun; Mo, Fan; Ma, Liang; Han, Xu; Zhang, Jing; Chen, Chen; Zhang, Cixiong; Lin, Shuyong; Yu, Jiekai; Zheng, Shu; Lin, Sheng-Cai; Lin, Biaoyang

    2010-06-01

    We describe an approach to identifying single nucleotide polymorphisms (SNPs) in complete genomic regions of key genes, including promoters, exons, introns, and downstream sequences, by combining long-range polymerase chain reaction (PCR) or NimbleGen sequence capture with next-generation sequencing. Using the adenomatous polyposis coli (APC) gene as an example, we identified 210 highly reliable SNPs, of which 69 were novel, using the next-generation sequencing analysis programs MAQ and SAMtools in the 123-kb APC genomic region in 27 pairs of colorectal cancers and normal adjacent tissues. We confirmed all of the eight randomly selected high-quality SNPs by allele-specific PCR, suggesting that our false discovery rate is negligible. We identified 11 SNPs in the exonic region, including one novel SNP that was not previously reported. Although 10 of them are synonymous, they were predicted to affect splicing by creating or removing exonic splicing enhancers or exonic splicing silencers. We also identified seven SNPs in the upstream region of the APC gene, three of which were only identified in the cancer tissues. Six of these upstream SNPs were predicted to affect transcription factor binding. We also observed that long-range PCR was better at capturing GC-rich regions than the NimbleGen sequence capture technique.

  15. Genome Sequences of Eight Morphologically Diverse Alphaproteobacteria

    PubMed Central

    Brown, Pamela J. B.; Kysela, David T.; Buechlein, Aaron; Hemmerich, Chris; Brun, Yves V.

    2011-01-01

    The Alphaproteobacteria comprise morphologically diverse bacteria, including many species of stalked bacteria. Here we announce the genome sequences of eight alphaproteobacteria, including the first genome sequences of species belonging to the genera Asticcacaulis, Hirschia, Hyphomicrobium, and Rhodomicrobium. PMID:21705585

  16. Genome sequences of eight morphologically diverse Alphaproteobacteria.

    PubMed

    Brown, Pamela J B; Kysela, David T; Buechlein, Aaron; Hemmerich, Chris; Brun, Yves V

    2011-09-01

    The Alphaproteobacteria comprise morphologically diverse bacteria, including many species of stalked bacteria. Here we announce the genome sequences of eight alphaproteobacteria, including the first genome sequences of species belonging to the genera Asticcacaulis, Hirschia, Hyphomicrobium, and Rhodomicrobium.

  17. Genome Sequence of Mycobacteriophage Momo.

    PubMed

    Pope, Welkin H; Bina, Elizabeth A; Brahme, Indraneel S; Hill, Amy B; Himmelstein, Philip H; Hunsicker, Sara M; Ish, Amanda R; Le, Tinh S; Martin, Mary M; Moscinski, Catherine N; Shetty, Sameer A; Swierzewski, Tomasz; Iyengar, Varun B; Kim, Hannah; Schafer, Claire E; Grubb, Sarah R; Warner, Marcie H; Bowman, Charles A; Russell, Daniel A; Hatfull, Graham F

    2015-06-18

    Momo is a newly discovered phage of Mycobacterium smegmatis mc²155. Momo has a double-stranded DNA genome 154,553 bp in length, with 233 predicted protein-encoding genes, 34 tRNA genes, and one transfer-messenger RNA (tmRNA) gene. Momo has a myoviral morphology and shares extensive nucleotide sequence similarity with subcluster C1 mycobacteriophages.

  18. Genome Sequence of Mycobacteriophage Momo

    PubMed Central

    Bina, Elizabeth A.; Brahme, Indraneel S.; Hill, Amy B.; Himmelstein, Philip H.; Hunsicker, Sara M.; Ish, Amanda R.; Le, Tinh S.; Martin, Mary M.; Moscinski, Catherine N.; Shetty, Sameer A.; Swierzewski, Tomasz; Iyengar, Varun B.; Kim, Hannah; Schafer, Claire E.; Grubb, Sarah R.; Warner, Marcie H.; Bowman, Charles A.; Russell, Daniel A.; Hatfull, Graham F.

    2015-01-01

    Momo is a newly discovered phage of Mycobacterium smegmatis mc²155. Momo has a double-stranded DNA genome 154,553 bp in length, with 233 predicted protein-encoding genes, 34 tRNA genes, and one transfer-messenger RNA (tmRNA) gene. Momo has a myoviral morphology and shares extensive nucleotide sequence similarity with subcluster C1 mycobacteriophages. PMID:26089415

  19. Personal genome sequencing: current approaches and challenges

    PubMed Central

    Snyder, Michael; Du, Jiang; Gerstein, Mark

    2010-01-01

    The revolution in DNA sequencing technologies has now made it feasible to determine the genome sequences of many individuals; i.e., “personal genomes.” Genome sequences of cells and tissues from both normal and disease states have been determined. Using current approaches, whole human genome sequences are not typically assembled and determined de novo, but, instead, variations relative to a reference sequence are identified. We discuss the current state of personal genome sequencing, the main steps involved in determining a genome sequence (i.e., identifying single-nucleotide polymorphisms [SNPs] and structural variations [SVs], assembling new sequences, and phasing haplotypes), and the challenges and performance metrics for evaluating the accuracy of the reconstruction. Finally, we consider the possible individual and societal benefits of personal genome sequences. PMID:20194435
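    The reference-based approach summarized in the abstract above (reporting variations relative to a reference sequence rather than assembling de novo) can be illustrated with a minimal Python sketch. The function name `find_snps` and the toy sequences are hypothetical and are not taken from the cited work. The sketch assumes the sample sequence is already aligned to the reference with no insertions or deletions; real pipelines call variants from read alignments and are considerably more involved.

```python
from typing import List, Tuple


def find_snps(reference: str, sample: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumes both sequences are pre-aligned and equal in length (no indels).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned to the same length")
    snps = []
    pairs = zip(reference.upper(), sample.upper())
    for pos, (ref_base, alt_base) in enumerate(pairs, start=1):
        # Skip positions where either base is ambiguous (N).
        if "N" in (ref_base, alt_base):
            continue
        if ref_base != alt_base:
            snps.append((pos, ref_base, alt_base))
    return snps


if __name__ == "__main__":
    # Hypothetical toy sequences, not real genomic data.
    reference = "ACGTACGTAC"
    sample = "ACGTTCGTAA"
    for pos, ref_base, alt_base in find_snps(reference, sample):
        print(f"position {pos}: {ref_base} -> {alt_base}")
```

    This covers only single-nucleotide differences; the structural variations and haplotype phasing that the abstract also mentions need alignment-aware methods that a position-by-position comparison cannot provide.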

  20. Whole Genome Sequencing of High-Risk Families to Identify New Mutational Mechanisms of Breast Cancer Predisposition

    DTIC Science & Technology

    2014-10-01

    patients from a severely affected breast cancer Family 1041. [Table fragment; columns: All, Shared, Rare, Excluding IBD0. Intergenic: 3,345,727; 1,650,045; 35,927; 3,990. ncRNA: 266,300...] ...genome (termed IBD0) in each of the 30 families. Figure 1 shows the non-IBD0 regions in Family 1041. [Figure 1. Non-IBD0 regions for Family... 1041; panels: chr1, chr2, chr3, chr4, chr17.] The largest region overlaps BRCA1 on chromosome 17. After the non-IBD0 sharing constraint has been applied

  1. Translating genomic profiling to gastrointestinal cancer treatment.

    PubMed

    Harada, Kazuto; Mizrak Kaya, Dilsa; Shimodaira, Yusuke; Song, Shumei; Baba, Hideo; Ajani, Jaffer A

    2017-04-01

    Next-generation sequencing enables faster, cheaper and more accurate whole-genome sequencing, allowing genome profiling and discovery of molecular features. As molecular targeted drugs are developed, treatment can be tailored according to molecular subtype. Gastric and colorectal cancers have each been divided into four subtypes according to molecular features. Profiling of the esophageal cancer genome is underway and its classification is anticipated. To date, identification of HER2 expression in gastric adenocarcinoma and KRAS, NRAS and BRAF mutations in colon cancer have proved essential for treatment decisions. However, to overcome therapy resistance and improve prognosis, further individualized therapy is required. Here, we summarize the treatment options for gastrointestinal cancer according to genomic profiling and discuss future directions.

  2. Translating genomics in cancer care.

    PubMed

    Bombard, Yvonne; Bach, Peter B; Offit, Kenneth

    2013-11-01

    There is increasing enthusiasm for genomics and its promise in advancing personalized medicine. Genomic information has been used to personalize health care for decades, spanning the fields of cardiovascular disease, infectious disease, endocrinology, metabolic medicine, and hematology. However, oncology has often been the first test bed for the clinical translation of genomics for diagnostic, prognostic, and therapeutic applications. Notable hereditary cancer examples include testing for mutations in BRCA1 or BRCA2 in unaffected women to identify those at significantly elevated risk for developing breast and ovarian cancers, and screening patients with newly diagnosed colorectal cancer for mutations in 4 mismatch repair genes to reduce morbidity and mortality in their relatives. Somatic genomic testing is also increasingly used in oncology, with gene expression profiling of breast tumors and EGFR testing to predict treatment response representing commonly used examples. Health technology assessment provides a rigorous means to inform clinical and policy decision-making through systematic assessment of the evidentiary base, along with precepts of clinical effectiveness, cost-effectiveness, and consideration of risks and benefits for health care delivery and society. Although this evaluation is a fundamental step in the translation of any new therapeutic, procedure, or diagnostic test into clinical care, emerging developments may threaten this standard. These include "direct to consumer" genomic risk assessment services and the challenges posed by incidental results generated from next-generation sequencing (NGS) technologies. This article presents a review of the evidentiary standards and knowledge base supporting the translation of key cancer genomic technologies along the continuum of validity, utility, cost-effectiveness, health service impacts, and ethical and societal issues, and offers future research considerations to guide the responsible introduction of

  3. Collaborators | Office of Cancer Genomics

    Cancer.gov

    The TARGET initiative is jointly managed within the National Cancer Institute (NCI) by the Office of Cancer Genomics (OCG) and the Cancer Therapy Evaluation Program (CTEP).

  4. Translational genomics for plant breeding with the genome sequence explosion.

    PubMed

    Kang, Yang Jae; Lee, Taeyoung; Lee, Jayern; Shim, Sangrea; Jeong, Haneul; Satyawan, Dani; Kim, Moon Young; Lee, Suk-Ha

    2016-04-01

    The use of next-generation sequencers and advanced genotyping technologies has propelled the field of plant genomics in model crops and plants and enhanced the discovery of hidden bridges between genotypes and phenotypes. The newly generated reference sequences of unstudied minor plants can be annotated by the knowledge of model plants via translational genomics approaches. Here, we reviewed the strategies of translational genomics and suggested perspectives on the current databases of genomic resources and the database structures of translated information on the new genome. As a draft picture of phenotypic annotation, translational genomics on newly sequenced plants will provide valuable assistance for breeders and researchers who are interested in genetic studies.

  5. Characterizing genomic alterations in cancer by complementary functional associations | Office of Cancer Genomics

    Cancer.gov

    Systematic efforts to sequence the cancer genome have identified large numbers of mutations and copy number alterations in human cancers. However, elucidating the functional consequences of these variants, and their interactions to drive or maintain oncogenic states, remains a challenge in cancer research. We developed REVEALER, a computational method that identifies combinations of mutually exclusive genomic alterations correlated with functional phenotypes, such as the activation or gene dependency of oncogenic pathways or sensitivity to a drug treatment.

  6. Sequencing Intractable DNA to Close Microbial Genomes

    SciTech Connect

    Hurt, Jr., Richard Ashley; Brown, Steven D; Podar, Mircea; Palumbo, Anthony Vito; Elias, Dwayne A

    2012-01-01

    Advancement in high-throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult-to-sequence regions in microbial genomes are ruled intractable, resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such difficult regions in the non-contiguous finished Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC-rich secondary structures, making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. These developed procedures support accurate gene annotation and provide a step-wise method that reduces the effort required for genome finishing.

  7. Fungal genome sequencing: basic biology to biotechnology.

    PubMed

    Sharma, Krishna Kant

    2016-08-01

    The genome sequences provide a first glimpse into the genomic basis of the biological diversity of filamentous fungi and yeast. The genome sequence of the budding yeast, Saccharomyces cerevisiae, with a small genome size, unicellular growth, and rich history of genetic and molecular analyses was a milestone of early genomics in the 1990s. The subsequent completion of fission yeast, Schizosaccharomyces pombe and genetic model, Neurospora crassa initiated a revolution in the genomics of the fungal kingdom. In due course of time, a substantial number of fungal genomes have been sequenced and publicly released, representing the widest sampling of genomes from any eukaryotic kingdom. An ambitious genome-sequencing program provides a wealth of data on metabolic diversity within the fungal kingdom, thereby enhancing research into medical science, agriculture science, ecology, bioremediation, bioenergy, and the biotechnology industry. Fungal genomics have higher potential to positively affect human health, environmental health, and the planet's stored energy. With a significant increase in sequenced fungal genomes, the known diversity of genes encoding organic acids, antibiotics, enzymes, and their pathways has increased exponentially. Currently, over a hundred fungal genome sequences are publicly available; however, no inclusive review has been published. This review is an initiative to address the significance of the fungal genome-sequencing program and provides the road map for basic and applied research.

  8. Whole-Genome Sequencing: Manual Library Preparation.

    PubMed

    Mardis, Elaine; McCombie, W Richard

    2017-01-03

    This protocol describes a manual approach for the preparation of genomic DNA libraries suitable for Illumina sequencing. Genomic DNA fragments produced by shearing by sonication are ligated to adaptors and amplified by polymerase chain reaction (PCR). The amplified DNA, separated by size and gel-purified, is suitable for use as template in whole-genome sequencing.

  9. Draft Genome Sequence of Lactobacillus rhamnosus 2166

    PubMed Central

    Melnikov, Vyacheslav G.; Kosarev, Igor V.; Abramov, Vyacheslav M.

    2014-01-01

    In this report, we present a draft sequence of the genome of Lactobacillus rhamnosus strain 2166, a potential novel probiotic. Genome annotation and read mapping onto a reference genome of L. rhamnosus strain GG allowed for the identification of the differences and similarities in the genomic contents and gene arrangements of these strains. PMID:24558254

  10. Value of a newly sequenced bacterial genome

    PubMed Central

    Barbosa, Eudes GV; Aburjaile, Flavia F; Ramos, Rommel TJ; Carneiro, Adriana R; Le Loir, Yves; Baumbach, Jan; Miyoshi, Anderson; Silva, Artur; Azevedo, Vasco

    2014-01-01

    Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the “scientific value” of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information. PMID:24921006

  11. Implementing personalized cancer genomics in clinical trials.

    PubMed

    Simon, Richard; Roychowdhury, Sameek

    2013-05-01

    The recent surge in high-throughput sequencing of cancer genomes has supported an expanding molecular classification of cancer. These studies have identified putative predictive biomarkers signifying aberrant oncogene pathway activation and may provide a rationale for matching patients with molecularly targeted therapies in clinical trials. Here, we discuss some of the challenges of adapting these data for rare cancers or molecular subsets of certain cancers, which will require aligning the availability of investigational agents, rapid turnaround of clinical grade sequencing, molecular eligibility and reconsidering clinical trial design and end points.

  12. Pathogenic Mutations in Cancer-Predisposing Genes: A Survey of 300 Patients with Whole-Genome Sequencing and Lifetime Electronic Health Records

    PubMed Central

    He, Karen Y.; McPherson, Elizabeth W.; Li, Quan; Xia, Fan; Weng, Chunhua; Wang, Kai

    2016-01-01

    Background: It is unclear whether and how whole-genome sequencing (WGS) data can be used to implement genomic medicine. Our objective is to retrospectively evaluate whether WGS can facilitate improving prevention and care for patients with susceptibility to cancer syndromes. Methods and Findings: We analyzed genetic mutations in 60 autosomal dominant cancer-predisposition genes in 300 deceased patients with WGS data and nearly complete long-term (over 30 years) medical records. To infer biological insights from massive amounts of WGS data and comprehensive clinical data in a short period of time, we developed an in-house analysis pipeline within the SeqHBase software framework to quickly identify pathogenic or likely pathogenic variants. The clinical data of the patients who carried pathogenic and/or likely pathogenic variants were further reviewed to assess their clinical conditions using their lifetime EHRs. Among the 300 participants, 5 (1.7%) carried pathogenic or likely pathogenic variants in 5 cancer-predisposing genes: one each in APC, BRCA1, BRCA2, NF1, and TP53. When assessing the clinical data, each of the 5 patients had one or more different types of cancers, fully consistent with their genetic profiles. Among these 5 patients, 2 died due to cancer while the others had multiple disorders later in their lifetimes; however, they might have benefited from early diagnosis and treatment, and lived healthier lives, had they undergone genetic testing earlier in life. Conclusions: We demonstrated a case study where the discovery of pathogenic or likely pathogenic germline mutations from population-wide WGS correlates with clinical outcome. The use of WGS may have clinical impacts to improve healthcare delivery. PMID:27930734

  13. Snake Genome Sequencing: Results and Future Prospects

    PubMed Central

    Kerkkamp, Harald M. I.; Kini, R. Manjunatha; Pospelov, Alexey S.; Vonk, Freek J.; Henkel, Christiaan V.; Richardson, Michael K.

    2016-01-01

    Snake genome sequencing is in its infancy—very much behind the progress made in sequencing the genomes of humans, model organisms and pathogens relevant to biomedical research, and agricultural species. We provide here an overview of some of the snake genome projects in progress, and discuss the biological findings, with special emphasis on toxinology, from the small number of draft snake genomes already published. We discuss the future of snake genomics, pointing out that new sequencing technologies will help overcome the problem of repetitive sequences in assembling snake genomes. Genome sequences are also likely to be valuable in examining the clustering of toxin genes on the chromosomes, in designing recombinant antivenoms and in studying the epigenetic regulation of toxin gene expression. PMID:27916957

  14. Genome-wide analysis of aberrant methylation in human breast cancer cells using methyl-DNA immunoprecipitation combined with high-throughput sequencing

    PubMed Central

    2010-01-01

    Background: Cancer cells undergo massive alterations to their DNA methylation patterns that result in aberrant gene expression and malignant phenotypes. However, the mechanisms that underlie methylome changes are not well understood nor is the genomic distribution of DNA methylation changes well characterized. Results: Here, we performed methylated DNA immunoprecipitation combined with high-throughput sequencing (MeDIP-seq) to obtain whole-genome DNA methylation profiles for eight human breast cancer cell (BCC) lines and for normal human mammary epithelial cells (HMEC). The MeDIP-seq analysis generated non-biased DNA methylation maps by covering almost the entire genome with sufficient depth and resolution. The most prominent feature of the BCC lines compared to HMEC was a massively reduced methylation level particularly in CpG-poor regions. While hypomethylation did not appear to be associated with particular genomic features, hypermethylation preferentially occurred at CpG-rich gene-related regions independently of the distance from transcription start sites. We also investigated methylome alterations during epithelial-to-mesenchymal transition (EMT) in MCF7 cells. EMT induction was associated with specific alterations to the methylation patterns of gene-related CpG-rich regions, although overall methylation levels were not significantly altered. Moreover, approximately 40% of the epithelial cell-specific methylation patterns in gene-related regions were altered to those typical of mesenchymal cells, suggesting a cell-type specific regulation of DNA methylation. Conclusions: This study provides the most comprehensive analysis to date of the methylome of human mammary cell lines and has produced novel insights into the mechanisms of methylome alteration during tumorigenesis and the interdependence between DNA methylome alterations and morphological changes. PMID:20181289

  15. Programs | Office of Cancer Genomics

    Cancer.gov

    OCG facilitates cancer genomics research through a series of highly-focused programs. These programs generate and disseminate genomic data for use by the cancer research community. OCG programs also promote advances in technology-based infrastructure and create valuable experimental reagents and tools. OCG programs encourage collaboration by interconnecting with other genomics and cancer projects in order to accelerate translation of findings into the clinic. Below are OCG’s current, completed, and initiated programs:

  16. Marsupial Genome Sequences: Providing Insight into Evolution and Disease

    PubMed Central

    Deakin, Janine E.

    2012-01-01

    Marsupials (metatherians), with their position in vertebrate phylogeny and their unique biological features, have been studied for many years by a dedicated group of researchers, but it has only been since the sequencing of the first marsupial genome that their value has been more widely recognised. We now have genome sequences for three distantly related marsupial species (the grey short-tailed opossum, the tammar wallaby, and Tasmanian devil), with the promise of many more genomes to be sequenced in the near future, making this a particularly exciting time in marsupial genomics. The emergence of a transmissible cancer, which is obliterating the Tasmanian devil population, has increased the importance of obtaining and analysing marsupial genome sequence for understanding such diseases as well as for conservation efforts. In addition, these genome sequences have facilitated studies aimed at answering questions regarding gene and genome evolution and provided insight into the evolution of epigenetic mechanisms. Here I highlight the major advances in our understanding of evolution and disease, facilitated by marsupial genome projects, and speculate on the future contributions to be made by such sequences. PMID:24278712

  17. The Genome Sequencing Center at NCGR

    SciTech Connect

    Schilkey, Faye

    2010-06-02

    Faye Schilkey from the National Center for Genome Resources discusses NCGR's research, sequencing and analysis experience on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM

  18. Dr. Marco Marra: Pioneer and Visionary in Cancer Genomics Research | Office of Cancer Genomics

    Cancer.gov

    Dr. Marco Marra is a highly distinguished genomics and bioinformatics researcher. He is the Director of Canada’s Michael Smith Genome Sciences Centre at the BC Cancer Agency and holds a faculty position at the University of British Columbia. The Centre is a state-of-the-art sequencing facility in Vancouver, Canada, with a major focus on the study of cancers. Many of their research projects are undertaken in collaboration with other Canadian and international institutions.

  19. Genome Sequence of Lactobacillus rhamnosus ATCC 8530

    PubMed Central

    Pittet, Vanessa; Ewen, Emily; Bushell, Barry R.

    2012-01-01

    Lactobacillus rhamnosus is found in the human gastrointestinal tract and is important for probiotics. We became interested in L. rhamnosus isolate ATCC 8530 in relation to beer spoilage and hops resistance. We report here the genome sequence of this isolate, along with a brief comparison to other available L. rhamnosus genome sequences. PMID:22247527

  20. Towards a reference pecan genome sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The cost of generating DNA sequence data has declined dramatically over the previous 15 years as a result of the Human Genome Project and the potential applications of genome sequencing for human medicine. This cost reduction has generated renewed interest among crop breeding scientists in applying...

  1. Maize genome sequencing by methylation filtration.

    PubMed

    Palmer, Lance E; Rabinowicz, Pablo D; O'Shaughnessy, Andrew L; Balija, Vivekanand S; Nascimento, Lidia U; Dike, Sujit; de la Bastide, Melissa; Martienssen, Robert A; McCombie, W Richard

    2003-12-19

    Gene enrichment strategies offer an alternative to sequencing large and repetitive genomes such as that of maize. We report the generation and analysis of nearly 100,000 undermethylated (or methylation filtration) maize sequences. Comparison with the rice genome reveals that methylation filtration results in a more comprehensive representation of maize genes than those that result from expressed sequence tags or transposon insertion sites sequences. About 7% of the repetitive DNA is unmethylated and thus selected in our libraries, but potentially active transposons and unmethylated organelle genomes can be identified. Reverse transcription polymerase chain reaction can be used to finish the maize transcriptome.

  2. Human Genome Sequencing in Health and Disease

    PubMed Central

    Gonzaga-Jauregui, Claudia; Lupski, James R.; Gibbs, Richard A.

    2013-01-01

    Following the “finished,” euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges. PMID:22248320

  3. Human genome sequencing in health and disease.

    PubMed

    Gonzaga-Jauregui, Claudia; Lupski, James R; Gibbs, Richard A

    2012-01-01

    Following the "finished," euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges.

  4. The genome sequence of parrot bornavirus 5.

    PubMed

    Guo, Jianhua; Tizard, Ian

    2015-12-01

    Although several new avian bornaviruses have recently been described, information on their evolution, virulence, and sequence is often limited. Here we report the complete genome sequence of parrot bornavirus 5 (PaBV-5) isolated from a case of proventricular dilatation disease in a Palm cockatoo (Probosciger aterrimus). The complete genome consists of 8842 nucleotides with distinct 5' and 3' end sequences. This virus shares nucleotide sequence identities of 69-74% with other bornaviruses in the genomic regions excluding the 5' and 3' terminal sequences. Phylogenetic analysis based on the genomic regions demonstrated that this new isolate is an isolated branch within the clade that includes the aquatic bird bornaviruses and the passerine bornaviruses. Based on phylogenetic analyses and its low nucleotide sequence identities with other bornaviruses, we support the proposal that PaBV-5 be assigned to a new bornavirus species, Psittaciform 2 bornavirus.

  5. Completely phased genome sequencing through chromosome sorting

    PubMed Central

    Yang, Hong; Chen, Xi; Wong, Wing Hung

    2011-01-01

    The two haploid genome sequences that a person inherits from the two parents represent the most fundamentally useful type of genetic information for the study of heritable diseases and the development of personalized medicine. Because of the difficulty in obtaining long-range phase information, current sequencing methods are unable to provide this information. Here, we introduce and show feasibility of a scalable approach capable of generating genomic sequences completely phased across the entire chromosome. PMID:21169219

  6. Contact | Office of Cancer Genomics

    Cancer.gov

    For more information about the Office of Cancer Genomics, please contact: Office of Cancer Genomics, National Cancer Institute, 31 Center Drive, 10A07, Bethesda, Maryland 20892-2580. Phone: (301) 451-8027. Fax: (301) 480-4368. Email: ocg@mail.nih.gov

  7. Genomic sequencing of Pleistocene cave bears

    SciTech Connect

    Noonan, James P.; Hofreiter, Michael; Smith, Doug; Priest, James R.; Rohland, Nadin; Rabeder, Gernot; Krause, Johannes; Detter, J. Chris; Paabo, Svante; Rubin, Edward M.

    2005-04-01

    Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of approximately 1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.

  8. The genome sequence of Drosophila melanogaster.

    PubMed

    Adams, M D; Celniker, S E; Holt, R A; Evans, C A; Gocayne, J D; Amanatides, P G; Scherer, S E; Li, P W; Hoskins, R A; Galle, R F; George, R A; Lewis, S E; Richards, S; Ashburner, M; Henderson, S N; Sutton, G G; Wortman, J R; Yandell, M D; Zhang, Q; Chen, L X; Brandon, R C; Rogers, Y H; Blazej, R G; Champe, M; Pfeiffer, B D; Wan, K H; Doyle, C; Baxter, E G; Helt, G; Nelson, C R; Gabor, G L; Abril, J F; Agbayani, A; An, H J; Andrews-Pfannkoch, C; Baldwin, D; Ballew, R M; Basu, A; Baxendale, J; Bayraktaroglu, L; Beasley, E M; Beeson, K Y; Benos, P V; Berman, B P; Bhandari, D; Bolshakov, S; Borkova, D; Botchan, M R; Bouck, J; Brokstein, P; Brottier, P; Burtis, K C; Busam, D A; Butler, H; Cadieu, E; Center, A; Chandra, I; Cherry, J M; Cawley, S; Dahlke, C; Davenport, L B; Davies, P; de Pablos, B; Delcher, A; Deng, Z; Mays, A D; Dew, I; Dietz, S M; Dodson, K; Doup, L E; Downes, M; Dugan-Rocha, S; Dunkov, B C; Dunn, P; Durbin, K J; Evangelista, C C; Ferraz, C; Ferriera, S; Fleischmann, W; Fosler, C; Gabrielian, A E; Garg, N S; Gelbart, W M; Glasser, K; Glodek, A; Gong, F; Gorrell, J H; Gu, Z; Guan, P; Harris, M; Harris, N L; Harvey, D; Heiman, T J; Hernandez, J R; Houck, J; Hostin, D; Houston, K A; Howland, T J; Wei, M H; Ibegwam, C; Jalali, M; Kalush, F; Karpen, G H; Ke, Z; Kennison, J A; Ketchum, K A; Kimmel, B E; Kodira, C D; Kraft, C; Kravitz, S; Kulp, D; Lai, Z; Lasko, P; Lei, Y; Levitsky, A A; Li, J; Li, Z; Liang, Y; Lin, X; Liu, X; Mattei, B; McIntosh, T C; McLeod, M P; McPherson, D; Merkulov, G; Milshina, N V; Mobarry, C; Morris, J; Moshrefi, A; Mount, S M; Moy, M; Murphy, B; Murphy, L; Muzny, D M; Nelson, D L; Nelson, D R; Nelson, K A; Nixon, K; Nusskern, D R; Pacleb, J M; Palazzolo, M; Pittman, G S; Pan, S; Pollard, J; Puri, V; Reese, M G; Reinert, K; Remington, K; Saunders, R D; Scheeler, F; Shen, H; Shue, B C; Sidén-Kiamos, I; Simpson, M; Skupski, M P; Smith, T; Spier, E; Spradling, A C; Stapleton, M; Strong, R; Sun, E; Svirskas, R; Tector, C; Turner, R; 
    Venter, E; Wang, A H; Wang, X; Wang, Z Y; Wassarman, D A; Weinstock, G M; Weissenbach, J; Williams, S M; Woodage, T; Worley, K C; Wu, D; Yang, S; Yao, Q A; Ye, J; Yeh, R F; Zaveri, J S; Zhan, M; Zhang, G; Zhao, Q; Zheng, L; Zheng, X H; Zhong, F N; Zhong, W; Zhou, X; Zhu, S; Zhu, X; Smith, H O; Gibbs, R A; Myers, E W; Rubin, G M; Venter, J C

    2000-03-24

    The fly Drosophila melanogaster is one of the most intensively studied organisms in biology and serves as a model system for the investigation of many developmental and cellular processes common to higher eukaryotes, including humans. We have determined the nucleotide sequence of nearly all of the approximately 120-megabase euchromatic portion of the Drosophila genome using a whole-genome shotgun sequencing strategy supported by extensive clone-based sequence and a high-quality bacterial artificial chromosome physical map. Efforts are under way to close the remaining gaps; however, the sequence is of sufficient accuracy and contiguity to be declared substantially complete and to support an initial analysis of genome structure and preliminary gene annotation and interpretation. The genome encodes approximately 13,600 genes, somewhat fewer than the smaller Caenorhabditis elegans genome, but with comparable functional diversity.

  9. The genome sequence of Drosophila melanogaster.

    SciTech Connect

    2000-03-24

    The fly Drosophila melanogaster is one of the most intensively studied organisms in biology and serves as a model system for the investigation of many developmental and cellular processes common to higher eukaryotes, including humans. We have determined the nucleotide sequence of nearly all of the approximately 120-megabase euchromatic portion of the Drosophila genome using a whole-genome shotgun sequencing strategy supported by extensive clone-based sequence and a high-quality bacterial artificial chromosome physical map. Efforts are under way to close the remaining gaps; however, the sequence is of sufficient accuracy and contiguity to be declared substantially complete and to support an initial analysis of genome structure and preliminary gene annotation and interpretation. The genome encodes approximately 13,600 genes, somewhat fewer than the smaller Caenorhabditis elegans genome, but with comparable functional diversity.

  10. Exploring cancer genomic data from the cancer genome atlas project

    PubMed Central

    Lee, Ju-Seog

    2016-01-01

    The Cancer Genome Atlas (TCGA) has compiled genomic, epigenomic, and proteomic data from more than 10,000 samples derived from 33 types of cancer, aiming to improve our understanding of the molecular basis of cancer development. The availability of this genome-wide information provides an unprecedented opportunity for uncovering new key regulators of signaling pathways or new roles of pre-existing members in pathways. To take advantage of this advancement, it will be necessary to learn systematic approaches that can help to uncover novel genes reflecting genetic alterations, prognosis, or response to treatments. This minireview describes the updated status of the TCGA project and explains how to use TCGA data. PMID:27530686

  11. Human genetics and genomics a decade after the release of the draft sequence of the human genome.

    PubMed

    Naidoo, Nasheen; Pawitan, Yudi; Soong, Richie; Cooper, David N; Ku, Chee-Seng

    2011-10-01

    Substantial progress has been made in human genetics and genomics research over the past ten years since the publication of the draft sequence of the human genome in 2001. Findings emanating directly from the Human Genome Project, together with those from follow-on studies, have had an enormous impact on our understanding of the architecture and function of the human genome. Major developments have been made in cataloguing genetic variation, the International HapMap Project, and with respect to advances in genotyping technologies. These developments are vital for the emergence of genome-wide association studies in the investigation of complex diseases and traits. In parallel, the advent of high-throughput sequencing technologies has ushered in the 'personal genome sequencing' era for both normal and cancer genomes, and made possible large-scale genome sequencing studies such as the 1000 Genomes Project and the International Cancer Genome Consortium. The high-throughput sequencing and sequence-capture technologies are also providing new opportunities to study Mendelian disorders through exome sequencing and whole-genome sequencing. This paper reviews these major developments in human genetics and genomics over the past decade.

  12. Strategies for complete plastid genome sequencing.

    PubMed

    Twyford, Alex D; Ness, Rob W

    2016-10-28

    Plastid sequencing is an essential tool in the study of plant evolution. This high-copy organelle is one of the most technically accessible regions of the genome, and its sequence conservation makes it a valuable region for comparative genome evolution, phylogenetic analysis and population studies. Here, we discuss recent innovations and approaches for de novo plastid assembly that harness genomic tools. We focus on technical developments including low-cost sequence library preparation approaches for genome skimming, enrichment via hybrid baits and methylation-sensitive capture, sequence platforms with higher read outputs and longer read lengths, and automated tools for assembly. These developments allow for a much more streamlined assembly than via conventional short-range PCR. Although newer methods make complete plastid sequencing possible for any land plant or green alga, there are still challenges for producing finished plastomes particularly from herbarium material or from structurally divergent plastids such as those of parasitic plants.

  13. Overview | Office of Cancer Genomics

    Cancer.gov

    The Human Cancer Model Initiative (HCMI) is an international consortium that is generating novel human tumor-derived culture models with associated genomic and clinical data. The HCMI consortium includes the US-National Cancer Institute, part of the National Institutes of Health, Cancer Research UK, foundation Hubrecht Organoid Technology, and Wellcome Trust Sanger Institute.

  14. Microbial species delineation using whole genome sequences

    SciTech Connect

    Kyrpides, Nikos; Mukherjee, Supratim; Ivanova, Natalia; Mavrommatis, Kostas; Pati, Amrita; Konstantinidis, Konstantinos

    2014-10-20

    Species assignments in prokaryotes use a manual, poly-phasic approach utilizing both phenotypic traits and sequence information of phylogenetic marker genes. With thousands of genomes being sequenced every year, an automated, uniform and scalable approach exploiting the rich genomic information in whole genome sequences is desired, at least for the initial assignment of species to an organism. We have evaluated pairwise genome-wide Average Nucleotide Identity (gANI) values and alignment fractions (AFs) for nearly 13,000 genomes using our fast implementation of the computation, identifying robust and widely applicable hard cut-offs for species assignments based on AF and gANI. Using these cutoffs, we generated stable species-level clusters of organisms, which enabled the identification of several species mis-assignments and facilitated the assignment of species for organisms without species definitions.
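
    The cutoff-based grouping described in this abstract can be illustrated with a minimal sketch. It assumes pairwise gANI and AF values have already been computed, treats two genomes as the same species when both values meet their cutoffs, and merges such pairs by single linkage. The function name, the single-linkage choice, and the cutoff values in the toy call are illustrative assumptions, not the study's actual method or thresholds.

```python
def cluster_by_cutoffs(genomes, pairs, min_gani, min_af):
    """Group genomes whose pairwise gANI and AF both meet the given cutoffs.

    genomes: iterable of genome names.
    pairs: dict mapping (name_a, name_b) -> (gani_percent, alignment_fraction).
    Single-linkage merging via union-find is an assumption of this sketch.
    """
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (a, b), (gani, af) in pairs.items():
        if gani >= min_gani and af >= min_af:
            parent[find(a)] = find(b)

    clusters = {}
    for g in genomes:
        clusters.setdefault(find(g), set()).add(g)
    return sorted(sorted(members) for members in clusters.values())


# Toy data; the cutoffs below are placeholders, not the values from the study.
toy_genomes = ["g1", "g2", "g3", "g4"]
toy_pairs = {
    ("g1", "g2"): (98.0, 0.9),   # passes both cutoffs
    ("g2", "g3"): (97.5, 0.4),   # fails the AF cutoff
    ("g3", "g4"): (99.0, 0.8),   # passes both cutoffs
}
toy_clusters = cluster_by_cutoffs(toy_genomes, toy_pairs, min_gani=96.0, min_af=0.6)
print(toy_clusters)
```

    Requiring both cutoffs keeps a pair with high identity over only a small aligned fraction (g2 and g3 in the toy data) from being merged.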

  15. Genome sequence of Coxiella burnetii strain Namibia

    PubMed Central

    2014-01-01

    We present the whole genome sequence and annotation of the Coxiella burnetii strain Namibia. This strain was isolated from an aborting goat in 1991 in Windhoek, Namibia. The plasmid type QpRS was confirmed in our work. Further genomic typing placed the strain into a unique genomic group. The genome sequence is 2,101,438 bp long and contains 1,979 protein-coding and 51 RNA genes, including one rRNA operon. To overcome the poor yield from cell culture systems, an additional DNA enrichment with whole genome amplification (WGA) methods was applied. We describe a bioinformatics pipeline for improved genome assembly including several filters with a special focus on WGA characteristics. PMID:25593636

  16. Refined Pichia pastoris reference genome sequence.

    PubMed

    Sturmberger, Lukas; Chappell, Thomas; Geier, Martina; Krainer, Florian; Day, Kasey J; Vide, Ursa; Trstenjak, Sara; Schiefer, Anja; Richardson, Toby; Soriaga, Leah; Darnhofer, Barbara; Birner-Gruenberger, Ruth; Glick, Benjamin S; Tolstorukov, Ilya; Cregg, James; Madden, Knut; Glieder, Anton

    2016-10-10

    Strains of the species Komagataella phaffii are the most frequently used "Pichia pastoris" strains employed for recombinant protein production as well as studies on peroxisome biogenesis, autophagy and secretory pathway analyses. Genome sequencing of several different P. pastoris strains has provided the foundation for understanding these cellular functions in recent genomics, transcriptomics and proteomics experiments. This experimentation has identified mistakes, gaps and incorrectly annotated open reading frames in the previously published draft genome sequences. Here, a refined reference genome is presented, generated with genome and transcriptome sequencing data from multiple P. pastoris strains. Twelve major sequence gaps from 20 to 6000 base pairs were closed and 5111 out of 5256 putative open reading frames were manually curated and confirmed by RNA-seq and published LC-MS/MS data, including the addition of new open reading frames (ORFs) and a reduction in the number of spliced genes from 797 to 571. One chromosomal fragment of 76 kbp between two previous gaps on chromosome 1 and another 134 kbp fragment at the end of chromosome 4, as well as several shorter fragments, needed re-orientation. In total more than 500 positions in the genome have been corrected. This reference genome is presented with new chromosomal numbering, positioning ribosomal repeats at the distal ends of the four chromosomes, and includes predicted chromosomal centromeres as well as the sequence of two linear cytoplasmic plasmids of 13.1 and 9.5 kbp found in some strains of P. pastoris.

  17. Genome sequence and analysis of Lactobacillus helveticus

    PubMed Central

    Cremonesi, Paola; Chessa, Stefania; Castiglioni, Bianca

    2013-01-01

    The microbiological characterization of lactobacilli is historically well developed, but the genomic analysis is recent. Because of the widespread use of Lactobacillus helveticus in cheese technology, information concerning the heterogeneity in this species is accumulating rapidly. Recently, the genomes of five L. helveticus strains were sequenced to completion and compared with other genomically characterized lactobacilli. The genomic analysis of the first sequenced strain, L. helveticus DPC 4571, isolated from cheese and selected for its characteristics of rapid lysis and high proteolytic activity, has revealed a plethora of genes with industrial potential including those responsible for key metabolic functions such as proteolysis, lipolysis, and cell lysis. These genes and their derived enzymes can facilitate the production of cheese and cheese derivatives with potential for use as ingredients in consumer foods. In addition, L. helveticus has the potential to produce peptides with a biological function, such as angiotensin converting enzyme (ACE) inhibitory activity, in fermented dairy products, demonstrating the therapeutic value of this species. A most intriguing feature of the genome of L. helveticus is the remarkable similarity in gene content with many intestinal lactobacilli. Comparative genomics has allowed the identification of key gene sets that facilitate a variety of lifestyles including adaptation to food matrices or the gastrointestinal tract. As genome sequence and functional genomic information continues to explode, key features of the genomes of L. helveticus strains continue to be discovered, answering many questions but also raising many new ones. PMID:23335916

  18. Genomic Resources for Cancer Epidemiology

    Cancer.gov

    This page provides links to research resources, compiled by the Epidemiology and Genomics Research Program, that may be of interest to genetic epidemiologists conducting cancer research, but is not exhaustive.

  19. Genomic sequencing of Pleistocene cave bears.

    PubMed

    Noonan, James P; Hofreiter, Michael; Smith, Doug; Priest, James R; Rohland, Nadin; Rabeder, Gernot; Krause, Johannes; Detter, J Chris; Pääbo, Svante; Rubin, Edward M

    2005-07-22

    Despite the greater information content of genomic DNA, ancient DNA studies have largely been limited to the amplification of mitochondrial sequences. Here we describe metagenomic libraries constructed with unamplified DNA extracted from skeletal remains of two 40,000-year-old extinct cave bears. Analysis of approximately 1 megabase of sequence from each library showed that despite significant microbial contamination, 5.8 and 1.1% of clones contained cave bear inserts, yielding 26,861 base pairs of cave bear genome sequence. Comparison of cave bear and modern bear sequences revealed the evolutionary relationship of these lineages. The metagenomic approach used here establishes the feasibility of ancient DNA genome sequencing programs.

  20. Sequencing and comparing whole mitochondrial genomes of animals

    SciTech Connect

    Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

    2005-04-22

    Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning, sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.

  1. Cancer Genome Anatomy Project | Office of Cancer Genomics

    Cancer.gov

    The National Cancer Institute (NCI) Cancer Genome Anatomy Project (CGAP) is an online resource designed to provide the research community access to biological tissue characterization data. Request a free copy of the CGAP Website Virtual Tour CD from ocg@mail.nih.gov.

  2. Genomics and proteomics in cancer.

    PubMed

    Baak, J P A; Path, F R C; Hermsen, M A J A; Meijer, G; Schmidt, J; Janssen, E A M

    2003-06-01

    Cancer development is driven by the accumulation of DNA changes in the approximately 40000 chromosomal genes. In solid tumours, chromosomal numerical/structural aberrations are common. DNA repair defects may lead to genome-wide genetic instability, which can drive further cancer progression. The genes code the actual players in the cellular processes, the 100000-10 million proteins, which in (pre)malignant cells can also be altered in a variety of ways. Over the past decade, our knowledge of the human genome and Genomics (the study of the human genome) in (pre)malignancies has increased enormously and Proteomics (the analysis of the protein complement of the genome) has taken off as well. Both will play an increasingly important role. In this article, a short description of the essential molecular biological cell processes is given. Important genomic and proteomic research methods are described and illustrated. Applications are still limited, but the evidence so far is exciting. Will genomics replace classical diagnostic or prognostic procedures? In breast cancers, the gene expression array is stronger than classical criteria, but in endometrial hyperplasia, quantitative morphological features are more cost-effective than genetic testing. It is still too early to make strong statements, the more so because it is expected that genomics and proteomics will expand rapidly. However, it is likely that they will take a central place in the understanding, diagnosis, monitoring and treatment of (pre)cancers of many different sites.

  3. My Identical Twin Sequenced our Genome.

    PubMed

    Schilit, Samantha L P; Schilit Nitenson, Arielle

    2017-04-01

    With rapidly declining costs, whole genome sequencing is becoming feasible for widespread use. Although cost-effectiveness is driving increased use of the technology, comprehensive recommendations on how to handle ethical dilemmas have yet to reach a consensus. In this article, Sam shares her experience of undergoing whole genome sequencing. Despite the deeply private nature of the test, the results do not solely belong to Sam; her identical twin sister, Arielle, shares virtually the same genome and received results without a formal consent process. This article explores their parallel experiences as a way of highlighting the controversial ethics of a private test with familial implications.

  4. Pervasive sequence patents cover the entire human genome

    PubMed Central

    2013-01-01

    The scope and eligibility of patents for genetic sequences have been debated for decades, but a critical case regarding gene patents (Association for Molecular Pathology v. Myriad Genetics) is now reaching the US Supreme Court. Recent court rulings have supported the assertion that such patents can provide intellectual property rights on sequences as small as 15 nucleotides (15mers), but an analysis of all current US patent claims and the human genome presented here shows that 15mer sequences from all human genes match at least one other gene. The average gene matches 364 other genes as 15mers; the breast-cancer-associated gene BRCA1 has 15mers matching at least 689 other genes. Longer sequences (1,000 bp) still showed extensive cross-gene matches. Furthermore, 15mer-length claims from bovine and other animal patents could also claim as much as 84% of the genes in the human genome. In addition, when we expanded our analysis to full-length patent claims on DNA from all US patents to date, we found that 41% of the genes in the human genome have been claimed. Thus, current patents for both short and long nucleotide sequences are extraordinarily non-specific and create an uncertain, problematic liability for genomic medicine, especially in regard to targeted re-sequencing and other sequence diagnostic assays. PMID:23522065
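
    The cross-gene 15mer matching described in this abstract can be sketched in a few lines. The example below indexes every 15mer of each gene and reports, for each gene, which other genes share at least one 15mer with it. The function names and toy sequences are hypothetical, and the sketch compares one strand only; it is not the authors' pipeline.

```python
from collections import defaultdict


def kmers(seq, k=15):
    """Return the set of all length-k substrings of seq (one strand only)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_partners(genes, k=15):
    """Map each gene name to the set of other genes sharing at least one k-mer."""
    index = defaultdict(set)          # k-mer -> names of genes containing it
    for name, seq in genes.items():
        for kmer in kmers(seq, k):
            index[kmer].add(name)

    partners = {name: set() for name in genes}
    for names in index.values():
        if len(names) > 1:            # k-mer occurs in more than one gene
            for name in names:
                partners[name] |= names - {name}
    return partners


# Toy sequences (made up): geneA and geneB share an 18-base stretch.
toy_genes = {
    "geneA": "ACGTACGTTAGCCGATAGCTAGGCTA",
    "geneB": "TTTTTACGTACGTTAGCCGATAGGGGG",
    "geneC": "GGGGGGGGGGGGGGGGGGGGCCCCC",
}
partners = shared_kmer_partners(toy_genes, k=15)
for name in sorted(partners):
    print(name, sorted(partners[name]))
```

    Counting the size of each partner set gives the per-gene figure the abstract reports, such as the number of other genes matched by 15mers from a single gene.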

  5. Draft Genome Sequences of Elizabethkingia meningoseptica

    PubMed Central

    Matyi, Stephanie A.; Hoyt, Peter R.; Hosoyama, Akira; Yamazoe, Atsushi; Fujita, Nobuyuki

    2013-01-01

    Elizabethkingia meningoseptica is ubiquitous in nature, exhibits a multiple-antibiotic resistance phenotype, and causes rare opportunistic infections. We now report two draft genome sequences of E. meningoseptica type strains that were sequenced independently in two laboratories. PMID:23846266

  6. Complete Genome Sequencing of Trivittatus virus

    PubMed Central

    Groseth, Allison; Vine, Veronica; Weisend, Carla; Ebihara, Hideki

    2015-01-01

    Trivittatus virus (family Bunyaviridae, genus Orthobunyavirus) represents an important genetic intermediate between the California encephalitis group and the Bwamba/Pongola and Nyando groups. Here, we report the first complete genome sequence of the prototype (Eklund) strain, isolated in 1948, which interestingly shows only a few differences compared to partial sequences of modern strains. PMID:26212363

  7. Complete Genome Sequence of Lleida Bat Lyssavirus

    PubMed Central

    Marston, Denise A.; Ellis, Richard J.; Wise, Emma L.; Aréchiga-Ceballos, Nidia; Freuling, Conrad M.; Banyard, Ashley C.; McElhinney, Lorraine M.; de Lamballerie, Xavier; Müller, Thomas; Echevarría, Juan E.

    2017-01-01

    All lyssaviruses (family Rhabdoviridae) cause the disease rabies, an acute progressive encephalitis for which, once symptoms occur, there is no effective cure. Using next-generation sequencing, the full-genome sequence for a novel lyssavirus, Lleida bat lyssavirus (LLEBV), from the original brain of a common bent-winged bat has been confirmed. PMID:28082487

  8. Whole-genome sequencing for comparative genomics and de novo genome assembly.

    PubMed

    Benjak, Andrej; Sala, Claudia; Hartkoorn, Ruben C

    2015-01-01

    Next-generation sequencing technologies for whole-genome sequencing of mycobacteria are rapidly becoming an attractive alternative to more traditional sequencing methods. In particular, this technology is proving useful for genome-wide identification of mutations in mycobacteria (comparative genomics) as well as for de novo assembly of whole genomes. Next-generation sequencing, however, generates a vast quantity of data that can only be transformed into a usable and comprehensible form using bioinformatics. Here we describe the methodology one would use to prepare libraries for whole-genome sequencing, and the basic bioinformatics to identify mutations in a genome following Illumina HiSeq or MiSeq sequencing, as well as de novo genome assembly following sequencing using Pacific Biosciences (PacBio).

  9. Sequence composition and genome organization of maize

    PubMed Central

    Messing, Joachim; Bharti, Arvind K.; Karlowski, Wojciech M.; Gundlach, Heidrun; Kim, Hye Ran; Yu, Yeisoo; Wei, Fusheng; Fuks, Galina; Soderlund, Carol A.; Mayer, Klaus F. X.; Wing, Rod A.

    2004-01-01

    Zea mays L. ssp. mays, or corn, one of the most important crops and a model for plant genetics, has a genome ≈80% the size of the human genome. To gain global insight into the organization of its genome, we have sequenced the ends of large insert clones, yielding a cumulative length of one-eighth of the genome with a DNA sequence read every 6.2 kb, thereby describing a large percentage of the genes and transposable elements of maize in an unbiased approach. Based on the accumulative 307 Mb of sequence, repeat sequences occupy 58% and genic regions occupy 7.5%. A conservative estimate predicts ≈59,000 genes, which is higher than in any other organism sequenced so far. Because the sequences are derived from bacterial artificial chromosome clones, which are ordered in overlapping bins, tagged genes are also ordered along continuous chromosomal segments. Based on this positional information, roughly one-third of the genes appear to consist of tandemly arrayed gene families. Although the ancestor of maize arose by tetraploidization, fewer than half of the genes appear to be present in two orthologous copies, indicating that the maize genome has undergone significant gene loss since the duplication event. PMID:15388850

  10. Genome Sequence of the Palaeopolyploid soybean

    SciTech Connect

    Schmutz, Jeremy; Cannon, Steven B.; Schlueter, Jessica; Ma, Jianxin; Mitros, Therese; Nelson, William; Hyten, David L.; Song, Qijian; Thelen, Jay J.; Cheng, Jianlin; Xu, Dong; Hellsten, Uffe; May, Gregory D.; Yu, Yeisoo; Sakura, Tetsuya; Umezawa, Taishi; Bhattacharyya, Madan K.; Sandhu, Devinder; Valliyodan, Babu; Lindquist, Erika; Peto, Myron; Grant, David; Shu, Shengqiang; Goodstein, David; Barry, Kerrie; Futrell-Griggs, Montona; Abernathy, Brian; Du, Jianchang; Tian, Zhixi; Zhu, Liucun; Gill, Navdeep; Joshi, Trupti; Libault, Marc; Sethuraman, Anand; Zhang, Xue-Cheng; Shinozaki, Kazuo; Nguyen, Henry T.; Wing, Rod A.; Cregan, Perry; Specht, James; Grimwood, Jane; Rokhsar, Dan; Stacey, Gary; Shoemaker, Randy C.; Jackson, Scott A.

    2009-08-03

    Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70% more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78% of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75% of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.

  11. Future medical applications of single-cell sequencing in cancer

    PubMed Central

    2011-01-01

    Advances in whole genome amplification and next-generation sequencing methods have enabled genomic analyses of single cells, and these techniques are now beginning to be used to detect genomic lesions in individual cancer cells. Previous approaches have been unable to resolve genomic differences in complex mixtures of cells, such as heterogeneous tumors, despite the importance of characterizing such tumors for cancer treatment. Sequencing of single cells is likely to improve several aspects of medicine, including the early detection of rare tumor cells, monitoring of circulating tumor cells (CTCs), measuring intratumor heterogeneity, and guiding chemotherapy. In this review we discuss the challenges and technical aspects of single-cell sequencing, with a strong focus on genomic copy number, and discuss how this information can be used to diagnose and treat cancer patients. PMID:21631906

  12. Future medical applications of single-cell sequencing in cancer.

    PubMed

    Navin, Nicholas; Hicks, James

    2011-05-31

    Advances in whole genome amplification and next-generation sequencing methods have enabled genomic analyses of single cells, and these techniques are now beginning to be used to detect genomic lesions in individual cancer cells. Previous approaches have been unable to resolve genomic differences in complex mixtures of cells, such as heterogeneous tumors, despite the importance of characterizing such tumors for cancer treatment. Sequencing of single cells is likely to improve several aspects of medicine, including the early detection of rare tumor cells, monitoring of circulating tumor cells (CTCs), measuring intratumor heterogeneity, and guiding chemotherapy. In this review we discuss the challenges and technical aspects of single-cell sequencing, with a strong focus on genomic copy number, and discuss how this information can be used to diagnose and treat cancer patients.

  13. Rhipicephalus (Boophilus) microplus strain Deutsch, whole genome shotgun sequencing project first submission of genome sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The size and repetitive nature of the Rhipicephalus microplus genome makes obtaining a full genome sequence difficult. Cot filtration/selection techniques were used to reduce the repetitive fraction of the tick genome and enrich for the fraction of DNA with gene-containing regions. The Cot-selected ...

  14. Genome Update. Let the consumer beware: Streptomyces genome sequence quality.

    PubMed

    Studholme, David J

    2016-01-01

    A genome sequence assembly represents a model of a genome. This article explores some tools and methods for assessing the quality of an assembly, using publicly available data for Streptomyces species as the example. There is great variability in quality of assemblies deposited in GenBank. Only in a small minority of these assemblies are the raw data available, enabling full appraisal of the assembly quality.
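
    As an illustration of assembly quality assessment, the sketch below computes N50, a widely used contiguity statistic: the contig length at which the running total, taken from the longest contig downward, first reaches half of the assembly size. The abstract does not say which metrics the article uses, so the choice of N50, the function name, and the toy lengths are assumptions made here.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths in base pairs."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            # This contig pushes the cumulative length past half the assembly.
            return length
    raise ValueError("contig_lengths must contain at least one contig")


# Toy assembly (made-up lengths): total 300 bp, so half is 150 bp.
toy_lengths = [100, 80, 50, 30, 20, 20]
toy_n50 = n50(toy_lengths)
print(toy_n50)
```

    N50 measures contiguity only; it says nothing about base-level accuracy or misassemblies, which is one reason access to the raw reads matters for a full appraisal.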

  15. Whole Genome Sequencing of High-Risk Families to Identify New Mutational Mechanisms of Breast Cancer Predisposition

    DTIC Science & Technology

    2015-12-01

    genotyping using next-generation DNA sequencing data. Nat Genet 43, 491-498. 4 Garrison E, Marth G (2012). Haplotype-based variant detection from short...Consortium, Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, Handsaker RE, Kang HM, Marth GT, McVean GA. (2012). An integrated map of genetic ...structure and demographic history of the Dutch population. Nat Genet . 46:818-825. 11 Kellis M, Wold B, Snyder MP, Bernstein BE, Kundaje A, Marinov GK, Ward

  16. Exploring cancer genomic data from the cancer genome atlas project.

    PubMed

    Lee, Ju-Seog

    2016-11-01

    The Cancer Genome Atlas (TCGA) has compiled genomic, epigenomic, and proteomic data from more than 10,000 samples derived from 33 types of cancer, aiming to improve our understanding of the molecular basis of cancer development. The availability of this genome-wide information provides an unprecedented opportunity for uncovering new key regulators of signaling pathways or new roles of pre-existing members in pathways. To take advantage of this advancement, it will be necessary to learn systematic approaches that can help to uncover novel genes reflecting genetic alterations, prognosis, or response to treatments. This minireview describes the updated status of the TCGA project and explains how to use TCGA data. [BMB Reports 2016; 49(11): 607-611].

  17. The genome sequence of Schizosaccharomyces pombe.

    PubMed

    Wood, V; Gwilliam, R; Rajandream, M-A; Lyne, M; Lyne, R; Stewart, A; Sgouros, J; Peat, N; Hayles, J; Baker, S; Basham, D; Bowman, S; Brooks, K; Brown, D; Brown, S; Chillingworth, T; Churcher, C; Collins, M; Connor, R; Cronin, A; Davis, P; Feltwell, T; Fraser, A; Gentles, S; Goble, A; Hamlin, N; Harris, D; Hidalgo, J; Hodgson, G; Holroyd, S; Hornsby, T; Howarth, S; Huckle, E J; Hunt, S; Jagels, K; James, K; Jones, L; Jones, M; Leather, S; McDonald, S; McLean, J; Mooney, P; Moule, S; Mungall, K; Murphy, L; Niblett, D; Odell, C; Oliver, K; O'Neil, S; Pearson, D; Quail, M A; Rabbinowitsch, E; Rutherford, K; Rutter, S; Saunders, D; Seeger, K; Sharp, S; Skelton, J; Simmonds, M; Squares, R; Squares, S; Stevens, K; Taylor, K; Taylor, R G; Tivey, A; Walsh, S; Warren, T; Whitehead, S; Woodward, J; Volckaert, G; Aert, R; Robben, J; Grymonprez, B; Weltjens, I; Vanstreels, E; Rieger, M; Schäfer, M; Müller-Auer, S; Gabel, C; Fuchs, M; Düsterhöft, A; Fritzc, C; Holzer, E; Moestl, D; Hilbert, H; Borzym, K; Langer, I; Beck, A; Lehrach, H; Reinhardt, R; Pohl, T M; Eger, P; Zimmermann, W; Wedler, H; Wambutt, R; Purnelle, B; Goffeau, A; Cadieu, E; Dréano, S; Gloux, S; Lelaure, V; Mottier, S; Galibert, F; Aves, S J; Xiang, Z; Hunt, C; Moore, K; Hurst, S M; Lucas, M; Rochet, M; Gaillardin, C; Tallada, V A; Garzon, A; Thode, G; Daga, R R; Cruzado, L; Jimenez, J; Sánchez, M; del Rey, F; Benito, J; Domínguez, A; Revuelta, J L; Moreno, S; Armstrong, J; Forsburg, S L; Cerutti, L; Lowe, T; McCombie, W R; Paulsen, I; Potashkin, J; Shpakovski, G V; Ussery, D; Barrell, B G; Nurse, P; Cerrutti, L

    2002-02-21

    We have sequenced and annotated the genome of fission yeast (Schizosaccharomyces pombe), which contains the smallest number of protein-coding genes yet recorded for a eukaryote: 4,824. The centromeres are between 35 and 110 kilobases (kb) and contain related repeats including a highly conserved 1.8-kb element. Regions upstream of genes are longer than in budding yeast (Saccharomyces cerevisiae), possibly reflecting more-extended control regions. Some 43% of the genes contain introns, of which there are 4,730. Fifty genes have significant similarity with human disease genes; half of these are cancer related. We identify highly conserved genes important for eukaryotic cell organization including those required for the cytoskeleton, compartmentation, cell-cycle control, proteolysis, protein phosphorylation and RNA splicing. These genes may have originated with the appearance of eukaryotic life. Few similarly conserved genes that are important for multicellular organization were identified, suggesting that the transition from prokaryotes to eukaryotes required more new genes than did the transition from unicellular to multicellular organization.

  18. Sequencing and comparative analysis of the gorilla MHC genomic sequence.

    PubMed

    Wilming, Laurens G; Hart, Elizabeth A; Coggill, Penny C; Horton, Roger; Gilbert, James G R; Clee, Chris; Jones, Matt; Lloyd, Christine; Palmer, Sophie; Sims, Sarah; Whitehead, Siobhan; Wiley, David; Beck, Stephan; Harrow, Jennifer L

    2013-01-01

    Major histocompatibility complex (MHC) genes play a critical role in vertebrate immune response and because the MHC is linked to a significant number of auto-immune and other diseases it is of great medical interest. Here we describe the clone-based sequencing and subsequent annotation of the MHC region of the gorilla genome. Because the MHC is subject to extensive variation, both structural and sequence-wise, it is not readily amenable to study in whole genome shotgun sequence such as the recently published gorilla genome. The variation of the MHC also makes it of evolutionary interest and therefore we analyse the sequence in the context of human and chimpanzee. In our comparisons with human and re-annotated chimpanzee MHC sequence we find that gorilla has a trimodular RCCX cluster, versus the reference human bimodular cluster, and additional copies of Class I (pseudo)genes between Gogo-K and Gogo-A (the orthologues of HLA-K and -A). We also find that Gogo-H (and Patr-H) is coding versus the HLA-H pseudogene and, conversely, there is a Gogo-DQB2 pseudogene versus the HLA-DQB2 coding gene. Our analysis, which is freely available through the VEGA genome browser, provides the research community with a comprehensive dataset for comparative and evolutionary research of the MHC.

  19. Accelerating Genome Sequencing 100X with FPGAs

    SciTech Connect

    Storaasli, Olaf O; Strenski, Dave

    2007-01-01

    The performance of two Cray XD1 systems with Virtex-II Pro 50 and Virtex-4 LX160 FPGAs was evaluated using the FASTA computational biology program for human genome (DNA and protein) sequence comparisons. FPGA speedups of 50X (Virtex-II Pro 50) and 100X (Virtex-4 LX160) over a 2.2 GHz Opteron were obtained. FPGA coding issues for human genome data are described.

  20. Sorghum genome sequencing by methylation filtration.

    PubMed

    Bedell, Joseph A; Budiman, Muhammad A; Nunberg, Andrew; Citek, Robert W; Robbins, Dan; Jones, Joshua; Flick, Elizabeth; Rholfing, Theresa; Fries, Jason; Bradford, Kourtney; McMenamy, Jennifer; Smith, Michael; Holeman, Heather; Roe, Bruce A; Wiley, Graham; Korf, Ian F; Rabinowicz, Pablo D; Lakey, Nathan; McCombie, W Richard; Jeddeloh, Jeffrey A; Martienssen, Robert A

    2005-01-01

    Sorghum bicolor is a close relative of maize and is a staple crop in Africa and much of the developing world because of its superior tolerance of arid growth conditions. We have generated sequence from the hypomethylated portion of the sorghum genome by applying methylation filtration (MF) technology. The evidence suggests that 96% of the genes have been sequence tagged, with an average coverage of 65% across their length. Remarkably, this level of gene discovery was accomplished after generating a raw coverage of less than 300 megabases of the 735-megabase genome. MF preferentially captures exons and introns, promoters, microRNAs, and simple sequence repeats, and minimizes interspersed repeats, thus providing a robust view of the functional parts of the genome. The sorghum MF sequence set is beneficial to research on sorghum and is also a powerful resource for comparative genomics among the grasses and across the entire plant kingdom. Thousands of hypothetical gene predictions in rice and Arabidopsis are supported by the sorghum dataset, and genomic similarities highlight evolutionarily conserved regions that will lead to a better understanding of rice and Arabidopsis.

  1. Initial genome sequencing and analysis of multiple myeloma

    PubMed Central

    Chapman, Michael A.; Lawrence, Michael S.; Keats, Jonathan J.; Cibulskis, Kristian; Sougnez, Carrie; Schinzel, Anna C.; Harview, Christina L.; Brunet, Jean-Philippe; Ahmann, Gregory J.; Adli, Mazhar; Anderson, Kenneth C.; Ardlie, Kristin G.; Auclair, Daniel; Baker, Angela; Bergsagel, P. Leif; Bernstein, Bradley E.; Drier, Yotam; Fonseca, Rafael; Gabriel, Stacey B.; Hofmeister, Craig C.; Jagannath, Sundar; Jakubowiak, Andrzej J.; Krishnan, Amrita; Levy, Joan; Liefeld, Ted; Lonial, Sagar; Mahan, Scott; Mfuko, Bunmi; Monti, Stefano; Perkins, Louise M.; Onofrio, Robb; Pugh, Trevor J.; Vincent Rajkumar, S.; Ramos, Alex H.; Siegel, David S.; Sivachenko, Andrey; Trudel, Suzanne; Vij, Ravi; Voet, Douglas; Winckler, Wendy; Zimmerman, Todd; Carpten, John; Trent, Jeff; Hahn, William C.; Garraway, Levi A.; Meyerson, Matthew; Lander, Eric S.; Getz, Gad; Golub, Todd R.

    2013-01-01

    Multiple myeloma is an incurable malignancy of plasma cells, and its pathogenesis is poorly understood. Here we report the massively parallel sequencing of 38 tumor genomes and their comparison to matched normal DNAs. Several new and unexpected oncogenic mechanisms were suggested by the pattern of somatic mutation across the dataset. These include the mutation of genes involved in protein translation (seen in nearly half of the patients), genes involved in histone methylation, and genes involved in blood coagulation. In addition, a broader than anticipated role of NF-κB signaling was suggested by mutations in 11 members of the NF-κB pathway. Of potential immediate clinical relevance, activating mutations of the kinase BRAF were observed in 4% of patients, suggesting the evaluation of BRAF inhibitors in multiple myeloma clinical trials. These results indicate that cancer genome sequencing of large collections of samples will yield new insights into cancer not anticipated by existing knowledge. PMID:21430775

  2. Translating Cancer Genomes and Transcriptomes for Precision Oncology

    PubMed Central

    Roychowdhury, Sameek; Chinnaiyan, Arul M.

    2015-01-01

    Understanding the molecular landscape of cancer has facilitated the development of diagnostic, prognostic, and predictive biomarkers for clinical oncology. Developments in next generation DNA sequencing technologies have increased the speed and reduced the cost of sequencing the nucleic acids of cancer cells. This has unlocked opportunities to characterize the genomic and transcriptomic landscapes of cancer for basic science research through projects such as The Cancer Genome Atlas. The cancer genome includes DNA-based alterations such as point mutations or gene duplications. The cancer transcriptome involves RNA-based alterations including changes in messenger RNAs. Together the genome and transcriptome can provide a comprehensive view of an individual patient’s cancer and are beginning to impact real-time clinical decision-making. We discuss several opportunities for translating this basic science knowledge into clinical practice including a molecular classification of cancer, heritable risk of cancer, eligibility for targeted therapies, and the development of innovative genomic-based clinical trials. In this review, we outline key applications and new directions for translating the cancer genome and transcriptome into patient care in the clinic. PMID:26528881

  3. Complete genome sequence of Caulobacter crescentus

    PubMed Central

    Nierman, William C.; Feldblyum, Tamara V.; Laub, Michael T.; Paulsen, Ian T.; Nelson, Karen E.; Eisen, Jonathan; Heidelberg, John F.; Alley, M. R. K.; Ohta, Noriko; Maddock, Janine R.; Potocka, Isabel; Nelson, William C.; Newton, Austin; Stephens, Craig; Phadke, Nikhil D.; Ely, Bert; DeBoy, Robert T.; Dodson, Robert J.; Durkin, A. Scott; Gwinn, Michelle L.; Haft, Daniel H.; Kolonay, James F.; Smit, John; Craven, M. B.; Khouri, Hoda; Shetty, Jyoti; Berry, Kristi; Utterback, Teresa; Tran, Kevin; Wolf, Alex; Vamathevan, Jessica; Ermolaeva, Maria; White, Owen; Salzberg, Steven L.; Venter, J. Craig; Shapiro, Lucy; Fraser, Claire M.

    2001-01-01

    The complete genome sequence of Caulobacter crescentus was determined to be 4,016,942 base pairs in a single circular chromosome encoding 3,767 genes. This organism, which grows in a dilute aquatic environment, coordinates the cell division cycle and multiple cell differentiation events. With the annotated genome sequence, a full description of the genetic network that controls bacterial differentiation, cell growth, and cell cycle progression is within reach. Two-component signal transduction proteins are known to play a significant role in cell cycle progression. Genome analysis revealed that the C. crescentus genome encodes a significantly higher number of these signaling proteins (105) than any bacterial genome sequenced thus far. Another regulatory mechanism involved in cell cycle progression is DNA methylation. The occurrence of the recognition sequence for an essential DNA methylating enzyme that is required for cell cycle regulation is severely limited and shows a bias to intergenic regions. The genome contains multiple clusters of genes encoding proteins essential for survival in a nutrient poor habitat. Included are those involved in chemotaxis, outer membrane channel function, degradation of aromatic ring compounds, and the breakdown of plant-derived carbon sources, in addition to many extracytoplasmic function sigma factors, providing the organism with the ability to respond to a wide range of environmental fluctuations. C. crescentus is, to our knowledge, the first free-living α-class proteobacterium to be sequenced and will serve as a foundation for exploring the biology of this group of bacteria, which includes the obligate endosymbiont and human pathogen Rickettsia prowazekii, the plant pathogen Agrobacterium tumefaciens, and the bovine and human pathogen Brucella abortus. PMID:11259647

  4. Genomic imprinting syndromes and cancer.

    PubMed

    Lim, Derek Hock Kiat; Maher, Eamonn Richard

    2010-01-01

    Genomic imprinting represents a form of epigenetic control of gene expression in which one allele of a gene is preferentially expressed according to the parent-of-origin of the allele. Genomic imprinting plays an important role in normal growth and development. Disruption of imprinting can result in a number of human imprinting syndromes and predispose to cancer. In this chapter, we describe a number of human imprinting syndromes to illustrate the concepts of genomic imprinting and how loss of imprinting of imprinted genes relates to human neoplasia.

  5. An emerging place for lung cancer genomics in 2013

    PubMed Central

    Bowman, Rayleen V.; Yang, Ian A.; Govindan, Ramaswamy; Fong, Kwun M.

    2013-01-01

    Lung cancer is a disease with a dismal prognosis and is the biggest cause of cancer deaths in many countries. Nonetheless, rapid technological developments in genome science promise more effective prevention and treatment strategies. Since the Human Genome Project, scientific advances have revolutionized the diagnosis and treatment of human cancers, including thoracic cancers. The latest, massively parallel, next generation sequencing (NGS) technologies offer much greater sequencing capacity than traditional, capillary-based Sanger sequencing. These modern but costly technologies have been applied to whole genome-, and whole exome sequencing (WGS and WES) for the discovery of mutations and polymorphisms, transcriptome sequencing for quantification of gene expression, small ribonucleic acid (RNA) sequencing for microRNA profiling, large scale analysis of deoxyribonucleic acid (DNA) methylation and chromatin immunoprecipitation mapping of DNA-protein interaction. With the rise of personalized cancer care, based on the premise of precision medicine, sequencing technologies are constantly changing. To date, the genomic landscape of lung cancer has been captured in several WGS projects. Such work has not only contributed to our understanding of cancer biology, but has also provided impetus for technical advances that may improve our ability to accurately capture the cancer genome. Issues such as short read lengths contribute to sequenced libraries that contain challenging gaps in the aligned genome. Emerging platforms promise longer reads as well as the ability to capture a range of epigenomic signals. In addition, ongoing optimization of bioinformatics strategies for data analysis and interpretation are critical, especially for the differentiation between driver and passenger mutations. Moreover, broader deployment of these and future generations of platforms, coupled with an increasing bioinformatics workforce with access to highly sophisticated technologies, could

  6. An emerging place for lung cancer genomics in 2013.

    PubMed

    Daniels, Marissa G; Bowman, Rayleen V; Yang, Ian A; Govindan, Ramaswamy; Fong, Kwun M

    2013-10-01

    Lung cancer is a disease with a dismal prognosis and is the biggest cause of cancer deaths in many countries. Nonetheless, rapid technological developments in genome science promise more effective prevention and treatment strategies. Since the Human Genome Project, scientific advances have revolutionized the diagnosis and treatment of human cancers, including thoracic cancers. The latest, massively parallel, next generation sequencing (NGS) technologies offer much greater sequencing capacity than traditional, capillary-based Sanger sequencing. These modern but costly technologies have been applied to whole genome-, and whole exome sequencing (WGS and WES) for the discovery of mutations and polymorphisms, transcriptome sequencing for quantification of gene expression, small ribonucleic acid (RNA) sequencing for microRNA profiling, large scale analysis of deoxyribonucleic acid (DNA) methylation and chromatin immunoprecipitation mapping of DNA-protein interaction. With the rise of personalized cancer care, based on the premise of precision medicine, sequencing technologies are constantly changing. To date, the genomic landscape of lung cancer has been captured in several WGS projects. Such work has not only contributed to our understanding of cancer biology, but has also provided impetus for technical advances that may improve our ability to accurately capture the cancer genome. Issues such as short read lengths contribute to sequenced libraries that contain challenging gaps in the aligned genome. Emerging platforms promise longer reads as well as the ability to capture a range of epigenomic signals. In addition, ongoing optimization of bioinformatics strategies for data analysis and interpretation are critical, especially for the differentiation between driver and passenger mutations. Moreover, broader deployment of these and future generations of platforms, coupled with an increasing bioinformatics workforce with access to highly sophisticated technologies, could

  7. Genomic Instability in Cancer

    PubMed Central

    Abbas, Tarek; Keaton, Mignon A.; Dutta, Anindya

    2013-01-01

    One of the fundamental challenges facing the cell is to accurately copy its genetic material to daughter cells. When this process goes awry, genomic instability ensues in which genetic alterations ranging from nucleotide changes to chromosomal translocations and aneuploidy occur. Organisms have developed multiple mechanisms that can be classified into two major classes to ensure the fidelity of DNA replication. The first class includes mechanisms that prevent premature initiation of DNA replication and ensure that the genome is fully replicated once and only once during each division cycle. These include cyclin-dependent kinase (CDK)-dependent mechanisms and CDK-independent mechanisms. Although CDK-dependent mechanisms are largely conserved in eukaryotes, higher eukaryotes have evolved additional mechanisms that seem to play a larger role in preventing aberrant DNA replication and genome instability. The second class ensures that cells are able to respond to various cues that continuously threaten the integrity of the genome by initiating DNA-damage-dependent “checkpoints” and coordinating DNA damage repair mechanisms. Defects in the ability to safeguard against aberrant DNA replication and to respond to DNA damage contribute to genomic instability and the development of human malignancy. In this article, we summarize our current knowledge of how genomic instability arises, with a particular emphasis on how the DNA replication process can give rise to such instability. PMID:23335075

  8. Cancer Genome Anatomy Project (CGAP) | Office of Cancer Genomics

    Cancer.gov

    CGAP generated a wide range of genomics data on cancerous cells that are accessible through easy-to-use online tools. Researchers, educators, and students can find "in silico" answers to biological questions through the CGAP website. Request a free copy of the CGAP Website Virtual Tour CD from ocg@mail.nih.gov to learn how to navigate the website.

  9. Mapping and sequencing the human genome

    SciTech Connect

    1988-01-01

    Numerous meetings have been held and a debate has developed in the biological community over the merits of mapping and sequencing the human genome. In response, a committee to examine the desirability and feasibility of mapping and sequencing the human genome was formed to suggest options for implementing the project. The committee asked many questions. Should the analysis of the human genome be left entirely to the traditionally uncoordinated, but highly successful, support systems that fund the vast majority of biomedical research? Or should a more focused and coordinated additional support system be developed that is limited to encouraging and facilitating the mapping and eventual sequencing of the human genome? If so, how can this be done without distorting the broader goals of biological research that are crucial for any understanding of the data generated in such a human genome project? As the committee became better informed on the many relevant issues, the opinions of its members coalesced, producing a shared consensus of what should be done. This report reflects that consensus.

  10. Mapping and Sequencing the Human Genome

    DOE R&D Accomplishments Database

    1988-01-01

    Numerous meetings have been held and a debate has developed in the biological community over the merits of mapping and sequencing the human genome. In response, a committee to examine the desirability and feasibility of mapping and sequencing the human genome was formed to suggest options for implementing the project. The committee asked many questions. Should the analysis of the human genome be left entirely to the traditionally uncoordinated, but highly successful, support systems that fund the vast majority of biomedical research? Or should a more focused and coordinated additional support system be developed that is limited to encouraging and facilitating the mapping and eventual sequencing of the human genome? If so, how can this be done without distorting the broader goals of biological research that are crucial for any understanding of the data generated in such a human genome project? As the committee became better informed on the many relevant issues, the opinions of its members coalesced, producing a shared consensus of what should be done. This report reflects that consensus.

  11. Dana-Farber Cancer Institute | Office of Cancer Genomics

    Cancer.gov

    Functional Annotation of Cancer Genomes Principal Investigator: William C. Hahn, M.D., Ph.D. The comprehensive characterization of cancer genomes has and will continue to provide an increasingly complete catalog of genetic alterations in specific cancers. However, most epithelial cancers harbor hundreds of genetic alterations as a consequence of genomic instability. Therefore, the functional consequences of the majority of mutations remain unclear.

  12. Complete Genome Sequences of 61 Mycobacteriophages

    PubMed Central

    2016-01-01

    Mycobacteriophages (viruses of mycobacteria) provide insights into viral diversity and evolution as well as numerous tools for genetic dissection of Mycobacterium tuberculosis. Here we report the complete genome sequences of 61 mycobacteriophages newly isolated from environmental samples using Mycobacterium smegmatis mc²155 that expand our understanding of phage diversity. PMID:27389257

  13. Draft Genome Sequence of Lactobacillus plantarum 2165

    PubMed Central

    Abramov, Vyacheslav M.

    2014-01-01

    This report describes a draft genome sequence of Lactobacillus plantarum 2165. The data demonstrate the presence of a large number of genes responsible for sugar metabolism and the fermentation activity of this bacterium. Different cell surface proteins, including fibronectin and mucus-binding adhesins, may contribute to the beneficial probiotic properties of this strain. PMID:24407651

  14. Genome sequence of Bacillus licheniformis WX-02.

    PubMed

    Yangtse, Wuming; Zhou, Yinhua; Lei, Yang; Qiu, Yimin; Wei, Xuetuan; Ji, Zhixia; Qi, Gaofu; Yong, Yangchun; Chen, Lingling; Chen, Shouwen

    2012-07-01

    Bacillus licheniformis is an important bacterium that has been used extensively for large-scale industrial production of exoenzymes and peptide antibiotics. B. licheniformis WX-02 produces increasing amounts of poly-gamma-glutamate when fermented under stress conditions. Here its genome sequence (4,270,104 bp, with G+C content of 46.06%), which comprises a circular chromosome, is announced.

  15. Draft genome sequence of Bacillus oceanisediminis 2691.

    PubMed

    Lee, Yong-Jik; Lee, Sang-Jae; Jeong, Haeyoung; Kim, Hyun Ju; Ryu, Naeun; Kim, Byoung-Chan; Lee, Han-Seung; Lee, Dong-Woo; Lee, Sang Jun

    2012-11-01

    Bacillus oceanisediminis 2691 is an aerobic, Gram-positive, spore-forming, and moderately halophilic bacterium that was isolated from marine sediment of the Yellow Sea coast of South Korea. Here, we report the draft genome sequence of B. oceanisediminis 2691 that may have an important role in the bioremediation of marine sediment.

  16. Translating gastric cancer genomics into targeted therapies.

    PubMed

    Ang, Yvonne L E; Yong, Wei Peng; Tan, Patrick

    2016-04-01

    Gastric cancer is a common disease with limited treatment options and a poor prognosis. Many gastric cancers harbour potentially actionable targets, including over-expression and mutations in tyrosine kinase pathways. Agents have been developed against these targets with varying success; in particular, the use of trastuzumab in HER2-overexpressing gastric cancers has resulted in overall survival benefits. Gastric cancers also have high levels of somatic mutations, making them candidates for immunotherapy; early work in this field has been promising. Recent advances in whole genome and multi-platform sequencing have driven the development of molecular classification systems, which may in turn guide the selection of patients for targeted treatment. Moving forward, challenges will include the development of appropriate biomarkers to predict responses to targeted therapy, and the application of new molecular classifications into trial development and clinical practice.

  17. Whole Genome Sequence of a Turkish Individual

    PubMed Central

    Dogan, Haluk; Can, Handan; Otu, Hasan H.

    2014-01-01

    Although whole human genome sequencing can be done with readily available technical and financial resources, the need for detailed analyses of genomes of certain populations still exists. Here we present, for the first time, sequencing and analysis of a Turkish human genome. We have achieved 35x coverage using paired-end sequencing, where over 95% of sequencing reads are mapped to the reference genome covering more than 99% of the bases. The assembly of unmapped reads rendered 11,654 contigs, 2,168 of which did not reveal any homology to known sequences, resulting in ∼1 Mbp of unmapped sequence. Single nucleotide polymorphism (SNP) discovery resulted in 3,537,794 SNP calls with 29,184 SNPs identified in coding regions, where 106 were nonsense and 259 were categorized as having a high-impact effect. The homo/hetero zygosity (1,415,123∶2,122,671 or 1∶1.5) and transition/transversion ratios (2,383,204∶1,154,590 or 2.06∶1) were within expected limits. Of the identified SNPs, 480,396 were potentially novel with 2,925 in coding regions, including 48 nonsense and 95 high-impact SNPs. Functional analysis of novel high-impact SNPs revealed various interaction networks, notably involving hereditary and neurological disorders or diseases. Assembly results indicated 713,640 indels (1∶1.09 insertion/deletion ratio), ranging from −52 bp to 34 bp in length and causing about 180 codon insertion/deletions and 246 frame shifts. Using paired-end- and read-depth-based methods, we discovered 9,109 structural variants and compared our variant findings with other populations. Our results suggest that whole genome sequencing is a valuable tool for understanding variations in the human genome across different populations. Detailed analyses of genomes of diverse origins greatly benefit research in genetics and medicine and should be conducted on a larger scale. PMID:24416366
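
    The ratios quoted in the abstract above can be checked directly from the counts it reports. The following is a minimal Python sketch of that arithmetic only; the variable and function names are illustrative assumptions and are not taken from the study's own analysis pipeline.

```python
# Counts copied from the abstract; all names below are illustrative.
TOTAL_SNPS = 3_537_794
HOMOZYGOUS = 1_415_123
HETEROZYGOUS = 2_122_671
TRANSITIONS = 2_383_204
TRANSVERSIONS = 1_154_590


def ratio(numerator: int, denominator: int) -> float:
    """Plain ratio of two counts, guarding against an empty denominator."""
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return numerator / denominator


# Both partitions should account for every SNP call.
genotype_total = HOMOZYGOUS + HETEROZYGOUS
substitution_total = TRANSITIONS + TRANSVERSIONS

# Heterozygous calls per homozygous call (the abstract's "1:1.5").
het_per_hom = ratio(HETEROZYGOUS, HOMOZYGOUS)

# Transitions per transversion (the abstract's "2.06:1").
ti_tv = ratio(TRANSITIONS, TRANSVERSIONS)

print(f"hom:het = 1:{het_per_hom:.2f}")
print(f"Ti/Tv   = {ti_tv:.2f}:1")
```

    Both partitions sum to the reported 3,537,794 SNP calls, so the two ratios describe the same call set.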

  18. A Draft Sequence of the Neandertal Genome

    PubMed Central

    Green, Richard E.; Li, Heng; Zhai, Weiwei; Fritz, Markus Hsi-Yang; Hansen, Nancy F.; Durand, Eric Y.; Malaspinas, Anna-Sapfo; Jensen, Jeffrey D.; Marques-Bonet, Tomas; Alkan, Can; Prüfer, Kay; Meyer, Matthias; Burbano, Hernán A.; Good, Jeffrey M.; Schultz, Rigo; Aximu-Petri, Ayinuer; Butthof, Anne; Höber, Barbara; Höffner, Barbara; Siegemund, Madlen; Weihmann, Antje; Nusbaum, Chad; Lander, Eric S.; Russ, Carsten; Novod, Nathaniel; Affourtit, Jason; Egholm, Michael; Verna, Christine; Rudan, Pavao; Brajkovic, Dejana; Kucan, Željko; Gušic, Ivan; Doronichev, Vladimir B.; Golovanova, Liubov V.; Lalueza-Fox, Carles; de la Rasilla, Marco; Fortea, Javier; Rosas, Antonio; Schmitz, Ralf W.; Johnson, Philip L. F.; Eichler, Evan E.; Falush, Daniel; Birney, Ewan; Mullikin, James C.; Slatkin, Montgomery; Nielsen, Rasmus; Kelso, Janet; Lachmann, Michael; Reich, David; Pääbo, Svante

    2016-01-01

    Neandertals, the closest evolutionary relatives of present-day humans, lived in large parts of Europe and western Asia before disappearing 30,000 years ago. We present a draft sequence of the Neandertal genome composed of more than 4 billion nucleotides from three individuals. Comparisons of the Neandertal genome to the genomes of five present-day humans from different parts of the world identify a number of genomic regions that may have been affected by positive selection in ancestral modern humans, including genes involved in metabolism and in cognitive and skeletal development. We show that Neandertals shared more genetic variants with present-day humans in Eurasia than with present-day humans in sub-Saharan Africa, suggesting that gene flow from Neandertals into the ancestors of non-Africans occurred before the divergence of Eurasian groups from each other. PMID:20448178

  19. Gambling on a shortcut to genome sequencing

    SciTech Connect

    Roberts, L.

    1991-06-21

    Almost from the start of the Human Genome Project, a debate has been raging over whether to sequence the entire human genome, all 3 billion bases, or just the genes - a mere 2% or 3% of the genome, and by far the most interesting part. In England, Sydney Brenner convinced the Medical Research Council (MRC) to start with the expressed genes, or complementary DNAs. But the US stance has been that the entire sequence is essential if we are to understand the blueprint of man. Craig Venter of the National Institute of Neurological Disorders and Stroke says that focusing on the expressed genes may be even more useful than expected. His strategy involves randomly selecting clones from cDNA libraries which theoretically contain all the genes that are switched on at a particular time in a particular tissue. Then the researchers sequence just a short stretch of each clone, about 400 to 500 bases, to create an expressed sequence tag or EST. The sequences of these ESTs are then stored in a database. Using that information, other researchers can then recreate that EST by using polymerase chain reaction techniques.

  20. Dominant short repeated sequences in bacterial genomes.

    PubMed

    Avershina, Ekaterina; Rudi, Knut

    2015-03-01

    We use a novel multidimensional searching approach to present the first exhaustive search for all possible repeated sequences in 166 genomes selected to cover the bacterial domain. We found an overrepresentation of repeated sequences in all but one of the genomes. The most prevalent repeats by far were related to interspaced short palindromic repeats (CRISPRs)—conferring bacterial adaptive immunity. We identified a deep branching clade of thermophilic Firmicutes containing the highest number of CRISPR repeats. We also identified a high prevalence of tandem repeated heptamers. In addition, we identified GC-rich repeats that could potentially be involved in recombination events. Finally, we identified repeats in a 16322 amino acid mega protein (involved in biofilm formation) and inverted repeats flanking miniature transposable elements (MITEs). In conclusion, the exhaustive search for repeated sequences identified new elements and distribution of these, which has implications for understanding both the ecology and evolution of bacteria.
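
    As a rough illustration of what searching a sequence for repeated elements involves, the sketch below counts repeated k-mers and measures the longest tandem run of a given unit (for example a heptamer). It is a deliberately naive toy written in Python under assumed names; it is not the multidimensional search method used by the authors, and the example sequence is invented.

```python
from collections import Counter


def repeated_kmers(sequence: str, k: int) -> dict[str, int]:
    """Return every k-mer seen more than once, using overlapping windows."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n > 1}


def longest_tandem_run(sequence: str, unit: str) -> int:
    """Largest number of back-to-back copies of `unit` found in `sequence`."""
    if not unit:
        raise ValueError("unit must be non-empty")
    seq, unit = sequence.upper(), unit.upper()
    best = 0
    for start in range(len(seq)):
        copies = 0
        # Extend the run while another full copy follows immediately.
        while seq.startswith(unit, start + copies * len(unit)):
            copies += 1
        best = max(best, copies)
    return best


# Invented toy sequence: three tandem copies of a heptamer with short flanks.
toy = "TT" + "GACCGTA" * 3 + "AC"
heptamer_counts = repeated_kmers(toy, 7)
tandem_copies = longest_tandem_run(toy, "GACCGTA")
```

    A hash-based count like this scales linearly with sequence length for a fixed k, but it only finds exact repeats of one chosen length at a time.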

  1. Mapping whole genome shotgun sequence and variant calling in mammalian species without their reference genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genomics research in mammals has produced reference genome sequences that are essential for identifying variation associated with disease. High quality reference genome sequences are now available for humans, model species, and economically important agricultural animals. Comparisons between these s...

  2. Genomics of Cancer and a New Era for Cancer Prevention

    PubMed Central

    Brennan, Paul; Wild, Christopher P.

    2015-01-01

    A primary justification for dedicating substantial amounts of research funding to large-scale cancer genomics projects of both somatic and germline DNA is that the biological insights will lead to new treatment targets and strategies for cancer therapy. While it is too early to judge the success of these projects in terms of clinical breakthroughs, an alternative rationale is that new genomics techniques can be used to reduce the overall burden of cancer by prevention of new cases occurring and also by detecting them earlier. In particular, it is now becoming apparent that studying the genomic profile of tumors can help to identify new carcinogens and may subsequently result in implementing strategies that limit exposure. In parallel, it may be feasible to utilize genomic biomarkers to identify cancers at an earlier and more treatable stage using screening or other early detection approaches based on prediagnostic biospecimens. While the potential for these techniques is large, their successful outcome will depend on international collaboration and planning similar to that of recent sequencing initiatives. PMID:26540230

  3. Genome sequence of Aspergillus luchuensis NBRC 4314

    PubMed Central

    Yamada, Osamu; Machida, Masayuki; Hosoyama, Akira; Goto, Masatoshi; Takahashi, Toru; Futagami, Taiki; Yamagata, Youhei; Takeuchi, Michio; Kobayashi, Tetsuo; Koike, Hideaki; Abe, Keietsu; Asai, Kiyoshi; Arita, Masanori; Fujita, Nobuyuki; Fukuda, Kazuro; Higa, Ken-ichi; Horikawa, Hiroshi; Ishikawa, Takeaki; Jinno, Koji; Kato, Yumiko; Kirimura, Kohtaro; Mizutani, Osamu; Nakasone, Kaoru; Sano, Motoaki; Shiraishi, Yohei; Tsukahara, Masatoshi; Gomi, Katsuya

    2016-01-01

    Awamori is a traditional distilled beverage made from steamed Thai-Indica rice in Okinawa, Japan. Two microbes are involved in brewing the liquor: the local kuro (black) koji mold Aspergillus luchuensis and the awamori yeast Saccharomyces cerevisiae. Whereas yeasts are used for ethanol fermentation throughout the world, a characteristic of Japanese fermentation industries is the use of Aspergillus molds as a source of enzymes for the maceration and saccharification of raw materials. Here we report the draft genome of a kuro (black) koji mold, A. luchuensis NBRC 4314 (RIB 2604). The total length of nonredundant sequences was nearly 34.7 Mb, comprising approximately 2,300 contigs with 16 telomere-like sequences. In total, 11,691 genes were predicted to encode proteins. Most of the housekeeping genes, such as transcription factors and the N- and O-glycosylation system, were conserved with respect to Aspergillus niger and Aspergillus oryzae. An alternative oxidase and an acid-stable α-amylase, relevant to citric acid production and fermentation at a low pH, as well as a unique glutamic peptidase, were also found in the genome. Furthermore, key biosynthetic gene clusters of ochratoxin A and fumonisin B were absent when compared with the A. niger genome, showing the safety of A. luchuensis for food and beverage production. This genome information will facilitate not only comparative genomics with industrial kuro-koji molds, but also molecular breeding of the molds to improve awamori fermentation. PMID:27651094

  4. Agaricus bisporus genome sequence: a commentary.

    PubMed

    Kerrigan, Richard W; Challen, Michael P; Burton, Kerry S

    2013-06-01

    The genomes of two isolates of Agaricus bisporus have been sequenced recently. This soil-inhabiting fungus has a wide geographical distribution in nature and it is also cultivated in an industrialized indoor process ($4.7bn annual worldwide value) to produce edible mushrooms. Previously this lignocellulosic fungus has resisted precise econutritional classification, i.e. into white- or brown-rot decomposers. The generation of the genome sequence and transcriptomic analyses has revealed a new classification, 'humicolous', for species adapted to grow in humic-rich, partially decomposed leaf material. The Agaricus bisporus genomes contain a collection of polysaccharide and lignin-degrading genes and more interestingly an expanded number of genes (relative to other lignocellulosic fungi) that enhance degradation of lignin derivatives, i.e. heme-thiolate peroxidases and β-etherases. A motif that is hypothesized to be a promoter element in the humicolous adaptation suite is present in a large number of genes specifically up-regulated when the mycelium is grown on humic-rich substrate. The genome sequence of A. bisporus offers a platform to explore fungal biology in carbon-rich soil environments and terrestrial cycling of carbon, nitrogen, phosphorus and potassium.

  5. Comparative Analysis of Genome Sequences with VISTA

    DOE Data Explorer

    Dubchak, Inna

    VISTA is a comprehensive suite of programs and databases developed by and hosted at the Genomics Division of Lawrence Berkeley National Laboratory. They provide information and tools designed to facilitate comparative analysis of genomic sequences. Users have two ways to interact with the suite of applications at the VISTA portal. They can submit their own sequences and alignments for analysis (VISTA servers) or examine pre-computed whole-genome alignments of different species. A key menu option is the Enhancer Browser and Database at http://enhancer.lbl.gov/. The VISTA Enhancer Browser is a central resource for experimentally validated human noncoding fragments with gene enhancer activity as assessed in transgenic mice. Most of these noncoding elements were selected for testing based on their extreme conservation with other vertebrates. The results of this enhancer screen are provided through this publicly available website. The browser also features relevant results by external contributors and a large collection of additional genome-wide conserved noncoding elements which are candidate enhancer sequences. The LBL developers invite external groups to submit computational predictions of developmental enhancers. As of 10/19/2009 the database contains information on 1109 in vivo tested elements - 508 elements with enhancer activity.

  6. An evaluation of Comparative Genome Sequencing (CGS) by comparing two previously-sequenced bacterial genomes

    PubMed Central

    Herring, Christopher D; Palsson, Bernhard Ø

    2007-01-01

    Background: With the development of new technology, it has recently become practical to resequence the genome of a bacterium after experimental manipulation. It is critical though to know the accuracy of the technique used, and to establish confidence that all of the mutations were detected. Results: In order to evaluate the accuracy of genome resequencing using the microarray-based Comparative Genome Sequencing service provided by Nimblegen Systems Inc., we resequenced the E. coli strain W3110 Kohara using MG1655 as a reference, both of which have been completely sequenced using traditional sequencing methods. CGS detected 7 of 8 small sequence differences, one large deletion, and 9 of 12 IS element insertions present in W3110, but did not detect a large chromosomal inversion. In addition, we confirmed that CGS also detected 2 SNPs, one deletion and 7 IS element insertions that are not present in the genome sequence, which we attribute to changes that occurred after the creation of the W3110 lambda clone library. The false positive rate for SNPs was one per 244 Kb of genome sequence. Conclusion: CGS is an effective way to detect multiple mutations present in one bacterium relative to another, and while highly cost-effective, is prone to certain errors. Mutations occurring in repeated sequences or in sequences with a high degree of secondary structure may go undetected. It is also critical to follow up on regions of interest in which SNPs were not called because they often indicate deletions or IS element insertions. PMID:17697331
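
    The detection figures quoted in this abstract can be turned into rates with a few lines of arithmetic. The sketch below is only illustrative: the function names are hypothetical, and the 4.6 Mb genome size used in the usage line is an assumed round figure for an E. coli K-12 genome, not a number taken from the record.

```python
# Counts reported in the abstract: (detected, present) per class of difference.
REPORTED = {
    "small sequence differences": (7, 8),
    "IS element insertions": (9, 12),
}


def sensitivity(detected: int, present: int) -> float:
    """Fraction of known differences that the method detected."""
    if present <= 0:
        raise ValueError("present must be positive")
    return detected / present


def expected_false_positives(genome_bp: int, kb_per_fp: float = 244.0) -> float:
    """Expected SNP false positives at a rate of one per `kb_per_fp` kb."""
    return genome_bp / (kb_per_fp * 1000)


if __name__ == "__main__":
    for name, (detected, present) in REPORTED.items():
        print(f"{name}: {sensitivity(detected, present):.1%}")
    # Assumed genome size of about 4.6 Mb (illustrative, not from the abstract).
    print(f"expected false SNPs: {expected_false_positives(4_600_000):.1f}")
```

    At the reported rate, a genome of that assumed size would be expected to yield on the order of twenty false SNP calls, which is why the authors stress follow-up of regions of interest.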

  7. Clinical implications of genomics for cancer risk genetics.

    PubMed

    Thomas, David M; James, Paul A; Ballinger, Mandy L

    2015-06-01

    The study of human genetics has provided substantial insight into cancer biology. With an increase in sequencing capacity and a reduction in sequencing costs, genomics will probably transform clinical cancer genetics. A heritable basis for many cancers is accepted, but so far less than half the genetic drivers have been identified. Genomics will increasingly be applied to populations irrespective of family history, which will change the framework of phenotype-directed genetic testing. Panel testing and whole genome sequencing will identify novel, polygenic, and de-novo determinants of cancer risk, often with lower penetrance, which will challenge present binary clinical classification systems and management algorithms. In the future, genotype-stratified public screening and prevention programmes could form part of tailored population risk management. The integration of research with clinical practice will result in so-called discovery cohorts that will help identify clinically significant genetic variation.

  8. The genome of a blood fluke associated with human cancer

    PubMed Central

    Mitreva, Makedonka

    2013-01-01

    The sequencing of the genome and transcriptome of Schistosoma haematobium, a highly prevalent blood fluke and human parasite with a proven link to malignant bladder cancer, marks the 160th anniversary of its discovery as the first schistosome known to infect humans. Comparative genomic analyses of S. haematobium and the more prevalent human-schistosomiasis pathogens (Schistosoma mansoni and Schistosoma japonicum) identified both shared and distinct genomic features. PMID:22281765

  9. Expanding the computational toolbox for mining cancer genomes.

    PubMed

    Ding, Li; Wendl, Michael C; McMichael, Joshua F; Raphael, Benjamin J

    2014-08-01

    High-throughput DNA sequencing has revolutionized the study of cancer genomics with numerous discoveries that are relevant to cancer diagnosis and treatment. The latest sequencing and analysis methods have successfully identified somatic alterations, including single-nucleotide variants, insertions and deletions, copy-number aberrations, structural variants and gene fusions. Additional computational techniques have proved useful for defining the mutations, genes and molecular networks that drive diverse cancer phenotypes and that determine clonal architectures in tumour samples. Collectively, these tools have advanced the study of genomic, transcriptomic and epigenomic alterations in cancer, and their association to clinical properties. Here, we review cancer genomics software and the insights that have been gained from their application.

  10. The topography of mutational processes in breast cancer genomes

    SciTech Connect

    Morganella, Sandro; Alexandrov, Ludmil B.; Glodzik, Dominik; Zou, Xueqing; Davies, Helen; Staaf, Johan; Sieuwerts, Anieta M.; Brinkman, Arie B.; Martin, Sancha; Ramakrishna, Manasa; Butler, Adam; Kim, Hyung-Yong; Borg, Ake; Sotiriou, Christos; Futreal, P. Andrew; Campbell, Peter J.; Span, Paul N.; Van Laere, Steven; Lakhani, Sunil R.; Eyfjord, Jorunn E.; Thompson, Alastair M.; Stunnenberg, Hendrik G.; van de Vijver, Marc J.; Martens, John W. M.; Borresen-Dale, Anne-Lise; Richardson, Andrea L.; Kong, Gu; Thomas, Gilles; Sale, Julian; Rada, Cristina; Stratton, Michael R.; Birney, Ewan; Nik-Zainal, Serena

    2016-01-01

    Somatic mutations in human cancers show unevenness in genomic distribution that correlate with aspects of genome structure and function. These mutations are, however, generated by multiple mutational processes operating through the cellular lineage between the fertilized egg and the cancer cell, each composed of specific DNA damage and repair components and leaving its own characteristic mutational signature on the genome. Using somatic mutation catalogues from 560 breast cancer whole-genome sequences, here we show that each of 12 base substitution, 2 insertion/deletion (indel) and 6 rearrangement mutational signatures present in breast tissue, exhibit distinct relationships with genomic features relating to transcription, DNA replication and chromatin organization. This signature-based approach permits visualization of the genomic distribution of mutational processes associated with APOBEC enzymes, mismatch repair deficiency and homologous recombinational repair deficiency, as well as mutational processes of unknown aetiology. Lastly, it highlights mechanistic insights including a putative replication-dependent mechanism of APOBEC-related mutagenesis.

  11. The topography of mutational processes in breast cancer genomes

    PubMed Central

    Morganella, Sandro; Alexandrov, Ludmil B.; Glodzik, Dominik; Zou, Xueqing; Davies, Helen; Staaf, Johan; Sieuwerts, Anieta M.; Brinkman, Arie B.; Martin, Sancha; Ramakrishna, Manasa; Butler, Adam; Kim, Hyung-Yong; Borg, Åke; Sotiriou, Christos; Futreal, P. Andrew; Campbell, Peter J.; Span, Paul N.; Van Laere, Steven; Lakhani, Sunil R.; Eyfjord, Jorunn E.; Thompson, Alastair M.; Stunnenberg, Hendrik G.; van de Vijver, Marc J.; Martens, John W. M.; Børresen-Dale, Anne-Lise; Richardson, Andrea L.; Kong, Gu; Thomas, Gilles; Sale, Julian; Rada, Cristina; Stratton, Michael R.; Birney, Ewan; Nik-Zainal, Serena

    2016-01-01

    Somatic mutations in human cancers show unevenness in genomic distribution that correlate with aspects of genome structure and function. These mutations are, however, generated by multiple mutational processes operating through the cellular lineage between the fertilized egg and the cancer cell, each composed of specific DNA damage and repair components and leaving its own characteristic mutational signature on the genome. Using somatic mutation catalogues from 560 breast cancer whole-genome sequences, here we show that each of 12 base substitution, 2 insertion/deletion (indel) and 6 rearrangement mutational signatures present in breast tissue, exhibit distinct relationships with genomic features relating to transcription, DNA replication and chromatin organization. This signature-based approach permits visualization of the genomic distribution of mutational processes associated with APOBEC enzymes, mismatch repair deficiency and homologous recombinational repair deficiency, as well as mutational processes of unknown aetiology. Furthermore, it highlights mechanistic insights including a putative replication-dependent mechanism of APOBEC-related mutagenesis. PMID:27136393

  12. Sequencing of Seven Haloarchaeal Genomes Reveals Patterns of Genomic Flux

    PubMed Central

    Lynch, Erin A.; Langille, Morgan G. I.; Darling, Aaron; Wilbanks, Elizabeth G.; Haltiner, Caitlin; Shao, Katie S. Y.; Starr, Michael O.; Teiling, Clotilde; Harkins, Timothy T.; Edwards, Robert A.; Eisen, Jonathan A.; Facciotti, Marc T.

    2012-01-01

    We report the sequencing of seven genomes from two haloarchaeal genera, Haloferax and Haloarcula. Ease of cultivation and the existence of well-developed genetic and biochemical tools for several diverse haloarchaeal species make haloarchaea a model group for the study of archaeal biology. The unique physiological properties of these organisms also make them good candidates for novel enzyme discovery for biotechnological applications. Seven genomes were sequenced to ∼20× coverage and assembled to an average of 50 contigs (range 5 scaffolds - 168 contigs). Comparisons of protein-coding gene complements revealed large-scale differences in COG functional group enrichment between these genera. Analysis of genes encoding machinery for DNA metabolism reveals genera-specific expansions of the general transcription factor TATA binding protein as well as a history of extensive duplication and horizontal transfer of the proliferating cell nuclear antigen. Insights gained from this study emphasize the importance of haloarchaea for investigation of archaeal biology. PMID:22848480

  13. The first genome sequences of human bocaviruses from Vietnam

    PubMed Central

    2016-01-01

    As part of an ongoing effort to generate complete genome sequences of hand, foot and mouth disease-causing enteroviruses directly from clinical specimens, two complete coding sequences and two partial genomic sequences of human bocavirus 1 (n=3) and 2 (n=1) were co-amplified and sequenced, representing the first genome sequences of human bocaviruses from Vietnam. The sequences may aid future study aiming at understanding the evolution of the pathogen. PMID:28090592

  14. Subclonal diversification of primary breast cancer revealed by multiregion sequencing

    PubMed Central

    Yates, Lucy R; Gerstung, Moritz; Knappskog, Stian; Desmedt, Christine; Gundem, Gunes; Loo, Peter Van; Aas, Turid; Alexandrov, Ludmil B; Larsimont, Denis; Davies, Helen; Li, Yilong; Ju, Young Seok; Ramakrishna, Manasa; Haugland, Hans Kristian; Lilleng, Peer Kaare; Nik-Zainal, Serena; McLaren, Stuart; Butler, Adam; Martin, Sancha; Glodzik, Dominic; Menzies, Andrew; Raine, Keiran; Hinton, Jonathan; Jones, David; Mudie, Laura J; Jiang, Bing; Vincent, Delphine; Greene-Colozzi, April; Adnet, Pierre-Yves; Fatima, Aquila; Maetens, Marion; Ignatiadis, Michail; Stratton, Michael R; Sotiriou, Christos; Richardson, Andrea L; Lønning, Per Eystein; Wedge, David C; Campbell, Peter J

    2015-01-01

    Sequencing cancer genomes may enable tailoring of therapeutics to the underlying biological abnormalities driving a particular patient’s tumor. However, sequencing-based strategies rely heavily on representative sampling of tumors. To understand the subclonal structure of primary breast cancer, we applied whole genome and targeted sequencing to multiple samples from each of 50 patients’ tumors (total 303). The extent of subclonal diversification varied among cases and followed spatial patterns. No strict temporal order was evident, with point mutations and rearrangements affecting the most common breast cancer genes, including PIK3CA, TP53, PTEN, BRCA2 and MYC, occurring early in some tumors and late in others. In 13/50 cancers, potentially targetable mutations were subclonal. Landmarks of disease progression, such as resisting chemotherapy and acquiring invasive or metastatic potential, arose within detectable subclones of antecedent lesions. These findings highlight the importance of including analyses of subclonal structure and tumor evolution in clinical trials of primary breast cancer. PMID:26099045

  15. Whole-genome sequencing in bacteriology: state of the art

    PubMed Central

    Dark, Michael J

    2013-01-01

    Over the last ten years, genome sequencing capabilities have expanded exponentially. There have been tremendous advances in sequencing technology, DNA sample preparation, genome assembly, and data analysis. This has led to advances in a number of facets of bacterial genomics, including metagenomics, clinical medicine, bacterial archaeology, and bacterial evolution. This review examines the strengths and weaknesses of techniques in bacterial genome sequencing, upcoming technologies, and assembly techniques, and discusses recent studies that highlight new applications for bacterial genomics. PMID:24143115

  16. Draft Genome Sequence of Actinomyces massiliensis Strain 4401292T

    PubMed Central

    Robert, Catherine; Gimenez, Grégory; Gharbi, Reem; Raoult, Didier

    2012-01-01

    A draft genome sequence of Actinomyces massiliensis, an anaerobic bacterium isolated from a patient's blood culture, is described here. CRISPR-associated proteins, insertion sequences, and toxin-antitoxin loci were found on the genome. PMID:22933754

  17. Whole genome sequence analysis of Mycobacterium suricattae.

    PubMed

    Dippenaar, Anzaan; Parsons, Sven David Charles; Sampson, Samantha Leigh; van der Merwe, Ruben Gerhard; Drewe, Julian Ashley; Abdallah, Abdallah Musa; Siame, Kabengele Keith; Gey van Pittius, Nicolaas Claudius; van Helden, Paul David; Pain, Arnab; Warren, Robin Mark

    2015-12-01

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi.

  18. Genome Sequencing Reveals a Phage in Helicobacter pylori

    PubMed Central

    Lehours, Philippe; Vale, Filipa F.; Bjursell, Magnus K.; Melefors, Ojar; Advani, Reza; Glavas, Steve; Guegueniat, Julia; Gontier, Etienne; Lacomme, Sabrina; Alves Matos, António; Menard, Armelle; Mégraud, Francis; Engstrand, Lars; Andersson, Anders F.

    2011-01-01

    Helicobacter pylori chronically infects the gastric mucosa in more than half of the human population; in a subset of this population, its presence is associated with development of severe disease, such as gastric cancer. Genomic analysis of several strains has revealed an extensive H. pylori pan-genome, likely to grow as more genomes are sampled. Here we describe the draft genome sequence (63 contigs; 26× mean coverage) of H. pylori strain B45, isolated from a patient with gastric mucosa-associated lymphoid tissue (MALT) lymphoma. The major finding was a 24.6-kb prophage integrated in the bacterial genome. The prophage shares most of its genes (22/27) with prophage region II of Helicobacter acinonychis strain Sheeba. After UV treatment of liquid cultures, circular DNA carrying the prophage integrase gene could be detected, and intracellular tailed phage-like particles were observed in H. pylori cells by transmission electron microscopy, indicating that phage production can be induced from the prophage. PCR amplification and sequencing of the integrase gene from 341 H. pylori strains from different geographic regions revealed a high prevalence of the prophage (21.4%). Phylogenetic reconstruction showed four distinct clusters in the integrase gene, three of which tended to be specific for geographic regions. Our study implies that phages may play important roles in the ecology and evolution of H. pylori. PMID:22086490

  19. Simple sequence repeats in prokaryotic genomes

    PubMed Central

    Mrázek, Jan; Guo, Xiangxue; Shah, Apurva

    2007-01-01

    Simple sequence repeats (SSRs) in DNA sequences are composed of tandem iterations of short oligonucleotides and may have functional and/or structural properties that distinguish them from general DNA sequences. They are variable in length because of slip-strand mutations and may also affect local structure of the DNA molecule or the encoded proteins. Long SSRs (LSSRs) are common in eukaryotes but rare in most prokaryotes. In pathogens, SSRs can enhance antigenic variance of the pathogen population in a strategy that counteracts the host immune response. We analyze representations of SSRs in >300 prokaryotic genomes and report significant differences among different prokaryotes as well as among different types of SSRs. LSSRs composed of short oligonucleotides (1–4 bp length, designated LSSR1–4) are often found in host-adapted pathogens with reduced genomes that are not known to readily survive in a natural environment outside the host. In contrast, LSSRs composed of longer oligonucleotides (5–11 bp length, designated LSSR5–11) are found mostly in nonpathogens and opportunistic pathogens with large genomes. Comparisons among SSRs of different lengths suggest that LSSR1–4 are likely maintained by selection. This is consistent with the established role of some LSSR1–4 in enhancing antigenic variance. By contrast, abundance of LSSR5–11 in some genomes may reflect the SSRs' general tendency to expand rather than their specific role in the organisms' physiology. Differences among genomes in terms of SSR representations and their possible interpretations are discussed. PMID:17485665
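
    The definition given in this abstract (tandem iterations of a short oligonucleotide, grouped by unit length into 1–4 bp and 5–11 bp classes) can be illustrated with a small regular-expression scan. This is a minimal sketch and not the authors' method: the function names and the `min_repeats` threshold are hypothetical, and the abstract does not state how many iterations make a repeat "long".

```python
import re


def find_ssrs(seq, min_unit=1, max_unit=11, min_repeats=3):
    """Return (start, unit, repeat_count) for each tandem repeat found.

    `min_repeats` is an illustrative threshold, not one from the abstract.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # A captured unit followed by at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are themselves repeats of a shorter unit,
            # e.g. "AA" is already reported as repeats of "A".
            if any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return hits


def unit_class(unit):
    """Group a repeat unit by length, following the 1-4 / 5-11 bp split."""
    return "1-4" if len(unit) <= 4 else "5-11"


if __name__ == "__main__":
    for start, unit, count in find_ssrs("GGCACACACACTT"):
        print(start, unit, count, unit_class(unit))
```

    A genome-scale survey such as the one described would also need to handle both strands, ambiguous bases and statistical expectations for repeat counts, none of which this sketch attempts.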

  20. Benchmark dataset for Whole Genome sequence compression.

    PubMed

    C L, Biji; Nair, Achuthsankar

    2016-05-16

    Research in DNA data compression lacks a standard dataset for testing DNA-specific compression tools. This paper argues that the current state of DNA compression cannot be benchmarked in the absence of such a scientifically compiled whole-genome sequence dataset, and proposes a benchmark dataset built using a multistage sampling procedure. Taking the genome sequences of organisms available at the National Center for Biotechnology Information (NCBI) as the universe, the proposed dataset selects 1105 prokaryotes, 200 plasmids, 164 viruses and 65 eukaryotes. This paper reports the results of running 3 established tools on the newly compiled dataset and shows that their strengths and weaknesses become evident only in a comparison based on the scientifically compiled benchmark dataset.
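
    A benchmark of this kind ultimately compares tools by a size metric such as bits per base. The sketch below shows how that metric could be computed; it uses general-purpose compressors from the Python standard library (zlib, bz2, lzma) purely as stand-ins, since the abstract does not name the three tools evaluated, and the toy sequence is randomly generated rather than drawn from the benchmark dataset.

```python
import bz2
import lzma
import random
import zlib

# General-purpose compressors used as stand-ins (an assumption for this sketch).
COMPRESSORS = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}


def bits_per_base(seq, compress):
    """Compressed size in bits divided by the number of bases."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    return 8 * len(compress(seq.encode("ascii"))) / len(seq)


def compare(seq):
    """Bits per base achieved by each stand-in compressor on one sequence."""
    return {name: bits_per_base(seq, fn) for name, fn in COMPRESSORS.items()}


if __name__ == "__main__":
    rng = random.Random(0)
    toy = "".join(rng.choice("ACGT") for _ in range(20000))
    for name, bpb in compare(toy).items():
        # Naive packing of A/C/G/T needs exactly 2 bits per base.
        print(f"{name}: {bpb:.2f} bits/base (naive 2-bit packing = 2.00)")
```

    The 2 bits per base of naive packing is a useful reference line: a DNA-specific tool is only interesting where it beats that figure on real sequence data.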

  1. Whole-genome reconstruction and mutational signatures in gastric cancer

    PubMed Central

    2012-01-01

    Background: Gastric cancer is the second highest cause of global cancer mortality. To explore the complete repertoire of somatic alterations in gastric cancer, we combined massively parallel short read and DNA paired-end tag sequencing to present the first whole-genome analysis of two gastric adenocarcinomas, one with chromosomal instability and the other with microsatellite instability. Results: Integrative analysis and de novo assemblies revealed the architecture of a wild-type KRAS amplification, a common driver event in gastric cancer. We discovered three distinct mutational signatures in gastric cancer - against a genome-wide backdrop of oxidative and microsatellite instability-related mutational signatures, we identified the first exome-specific mutational signature. Further characterization of the impact of these signatures by combining sequencing data from 40 complete gastric cancer exomes and targeted screening of an additional 94 independent gastric tumors uncovered ACVR2A, RPL22 and LMAN1 as recurrently mutated genes in microsatellite instability-positive gastric cancer and PAPPA as a recurrently mutated gene in TP53 wild-type gastric cancer. Conclusions: These results highlight how whole-genome cancer sequencing can uncover information relevant to tissue-specific carcinogenesis that would otherwise be missed from exome-sequencing data. PMID:23237666

  2. Complete genome sequence of Candidatus Ruthia magnifica.

    PubMed

    Roeselers, Guus; Newton, Irene L G; Woyke, Tanja; Auchtung, Thomas A; Dilly, Geoffrey F; Dutton, Rachel J; Fisher, Meredith C; Fontanez, Kristina M; Lau, Evan; Stewart, Frank J; Richardson, Paul M; Barry, Kerrie W; Saunders, Elizabeth; Detter, John C; Wu, Dongying; Eisen, Jonathan A; Cavanaugh, Colleen M

    2010-10-27

    The hydrothermal vent clam Calyptogena magnifica (Bivalvia: Mollusca) is a member of the Vesicomyidae. Species within this family form symbioses with chemosynthetic Gammaproteobacteria. They exist in environments such as hydrothermal vents and cold seeps and have a rudimentary gut and feeding groove, indicating a large dependence on their endosymbionts for nutrition. The C. magnifica symbiont, Candidatus Ruthia magnifica, was the first intracellular sulfur-oxidizing endosymbiont to have its genome sequenced (Newton et al. 2007). Here we expand upon the original report and provide additional details complying with the emerging MIGS/MIMS standards. The complete genome exposed the genetic blueprint of the metabolic capabilities of the symbiont. Genes which were predicted to encode the proteins required for all the metabolic pathways typical of free-living chemoautotrophs were detected in the symbiont genome. These include major pathways including carbon fixation, sulfur oxidation, nitrogen assimilation, as well as amino acid and cofactor/vitamin biosynthesis. This genome sequence is invaluable in the study of these enigmatic associations and provides insights into the origin and evolution of autotrophic endosymbiosis.

  3. Draft Genome Sequence of Rubrivivax gelatinosus CBS

    SciTech Connect

    Hu, P. S.; Lang, J.; Wawrousek, K.; Yu, J. P.; Maness, P. C.; Chen, J.

    2012-06-01

    Rubrivivax gelatinosus CBS, a purple nonsulfur photosynthetic bacterium, can grow photosynthetically using CO and N₂ as the sole carbon and nitrogen nutrients, respectively. R. gelatinosus CBS is of particular interest due to its ability to metabolize CO and yield H₂. We present the 5-Mb draft genome sequence of R. gelatinosus CBS with the goal of providing genetic insight into the metabolic properties of this bacterium.

  4. Complete Genome Sequences of 138 Mycobacteriophages

    PubMed Central

    2012-01-01

    Bacteriophages are the most numerous biological entities in the biosphere, and although their genetic diversity is high, it remains ill defined. Mycobacteriophages—the viruses of mycobacterial hosts—provide insights into this diversity as well as tools for manipulating Mycobacterium tuberculosis. We report here the complete genome sequences of 138 new mycobacteriophages, which—together with the 83 mycobacteriophages previously reported—represent the largest collection of phages known to infect a single common host, Mycobacterium smegmatis mc²155. PMID:22282335

  5. Genome Sequence of Aerococcus viridans LL1

    PubMed Central

    Qin, Nan; Zheng, Beiwen; Yang, Fengling; Chen, Yanfei; Guo, Jing; Hu, Xinjun

    2012-01-01

    Aerococcus viridans is a catalase-negative Gram-positive bacterium and has been described as an airborne organism widely distributed in the hospital environment or in clinical specimens. We isolated A. viridans strain LL1 from indoor dust samples collected by a patient. Here, we prepared a genome sequence for this strain consisting of 31 contigs totaling 1,994,039 bases and a GC content of 39.42%. PMID:22815455
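
    Summary figures like those in this abstract (contig count, total bases, GC content) are simple to derive from assembled contigs. The following is a minimal sketch with a hypothetical function name and toy input; it is not the pipeline used for the LL1 assembly.

```python
def assembly_stats(contigs):
    """Return (contig_count, total_bases, gc_percent) for a list of contigs."""
    total = sum(len(c) for c in contigs)
    gc = sum(c.upper().count("G") + c.upper().count("C") for c in contigs)
    gc_percent = 100.0 * gc / total if total else 0.0
    return len(contigs), total, gc_percent


if __name__ == "__main__":
    # Toy contigs for illustration only.
    count, total, gc = assembly_stats(["GGCC", "ATAT", "GCAT"])
    print(f"{count} contigs, {total} bases, GC content {gc:.2f}%")
```

    Real assemblies usually contain ambiguous bases such as N; whether those count toward the denominator is a convention that should be stated alongside any reported GC percentage.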

  6. Genome sequence of Aerococcus viridans LL1.

    PubMed

    Qin, Nan; Zheng, Beiwen; Yang, Fengling; Chen, Yanfei; Guo, Jing; Hu, Xinjun; Li, Lanjuan

    2012-08-01

    Aerococcus viridans is a catalase-negative Gram-positive bacterium and has been described as an airborne organism widely distributed in the hospital environment or in clinical specimens. We isolated A. viridans strain LL1 from indoor dust samples collected by a patient. Here, we prepared a genome sequence for this strain consisting of 31 contigs totaling 1,994,039 bases and a GC content of 39.42%.

  7. Draft genome sequence of an aflatoxigenic Aspergillus species, A. bombycis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genome of the A. bombycis Type strain was sequenced using a Personal Genome Machine, followed by annotation of its predicted genes. The genome size for A. bombycis was found to be approximately 37 Mb and contained 12,266 genes. This announcement introduces a sequenced genome for an aflatoxigenic...

  8. Integrative clinical genomics of advanced prostate cancer

    PubMed Central

    Robinson, Dan; Van Allen, Eliezer M.; Wu, Yi-Mi; Schultz, Nikolaus; Lonigro, Robert J.; Mosquera, Juan-Miguel; Montgomery, Bruce; Taplin, Mary-Ellen; Pritchard, Colin C; Attard, Gerhardt; Beltran, Himisha; Abida, Wassim M.; Bradley, Robert K.; Vinson, Jake; Cao, Xuhong; Vats, Pankaj; Kunju, Lakshmi P.; Hussain, Maha; Feng, Felix Y.; Tomlins, Scott A.; Cooney, Kathleen A.; Smith, David C.; Brennan, Christine; Siddiqui, Javed; Mehra, Rohit; Chen, Yu; Rathkopf, Dana E.; Morris, Michael J.; Solomon, Stephen B.; Durack, Jeremy C.; Reuter, Victor E.; Gopalan, Anuradha; Gao, Jianjiong; Loda, Massimo; Lis, Rosina T.; Bowden, Michaela; Balk, Stephen P.; Gaviola, Glenn; Sougnez, Carrie; Gupta, Manaswi; Yu, Evan Y.; Mostaghel, Elahe A.; Cheng, Heather H.; Mulcahy, Hyojeong; True, Lawrence D.; Plymate, Stephen R.; Dvinge, Heidi; Ferraldeschi, Roberta; Flohr, Penny; Miranda, Susana; Zafeiriou, Zafeiris; Tunariu, Nina; Mateo, Joaquin; Lopez, Raquel Perez; Demichelis, Francesca; Robinson, Brian D.; Schiffman, Marc A.; Nanus, David M.; Tagawa, Scott T.; Sigaras, Alexandros; Eng, Kenneth W.; Elemento, Olivier; Sboner, Andrea; Heath, Elisabeth I.; Scher, Howard I.; Pienta, Kenneth J.; Kantoff, Philip; de Bono, Johann S.; Rubin, Mark A.; Nelson, Peter S.; Garraway, Levi A.; Sawyers, Charles L.; Chinnaiyan, Arul M.

    2015-01-01

    Toward development of a precision medicine framework for metastatic, castration-resistant prostate cancer (mCRPC), we established a multi-institutional clinical sequencing infrastructure to conduct prospective whole exome and transcriptome sequencing of bone or soft tissue tumor biopsies from a cohort of 150 mCRPC-affected individuals. Aberrations of AR, ETS genes, TP53 and PTEN were frequent (40–60% of cases), with TP53 and AR alterations enriched in mCRPC compared to primary prostate cancer. We identified novel genomic alterations in PIK3CA/B, R-spondin, BRAF/RAF1, APC, β-catenin and ZBTB16/PLZF. Aberrations of BRCA2, BRCA1 and ATM were observed at substantially higher frequencies (19.3% overall) than seen in primary prostate cancers. 89% of affected individuals harbored a clinically actionable aberration including 62.7% with aberrations in AR, 65% in other cancer-related genes, and 8% with actionable pathogenic germline alterations. This cohort study provides evidence that clinical sequencing in mCRPC is feasible and could impact treatment decisions in significant numbers of affected individuals. PMID:26000489

  9. Whole genome sequencing analysis of lung adenocarcinoma in Xuanwei, China

    PubMed Central

    Wang, Xiao; Li, Jing; Duan, Yong; Wu, Huifei; Xu, Qiuyue

    2017-01-01

    Background: The lung cancer mortality rate in Xuanwei city is among the highest in China and adenocarcinoma is the major histological type. Lung cancer has been associated with exposure to indoor smoky coal emissions that contain high levels of polycyclic aromatic hydrocarbons; however, the pathogenesis of lung cancer has not yet been fully elucidated. Methods: We performed whole genome sequencing with lung adenocarcinoma and corresponding non‐tumor tissue to explore the genomic features of Xuanwei lung cancer. We used the Molecule Annotation System to determine and plot alterations in genes and signaling pathways. Results: A total of 3 428 060 and 3 416 989 single nucleotide variants were detected in tumor and normal genomes, respectively. After comparison of these two genomes, 977 high‐confidence somatic single nucleotide variants were identified. We observed a remarkably high proportion of C·G‐A·T transversions. HECTD4, RCBTB2, KLF15, and CACNA1C may be cancer‐related genes. Nine copy number variations increased in chromosome 5 and one in chromosome 7. The novel junctions were detected via clustered discordant paired ends and 1955 structural variants were discovered. Among these, we found 44 novel chromosome structural variations. In addition, EGFR and CACNA1C in the mitogen‐activated protein kinase signaling pathway were mutated or amplified in lung adenocarcinoma tumor tissue. Conclusion: We obtained a comprehensive view of somatic alterations of Xuanwei lung adenocarcinoma. These findings provide insight into the genomic landscape in order to further learn about the progress and development of Xuanwei lung adenocarcinoma. PMID:28083984

  10. Identification of ancient remains through genomic sequencing

    PubMed Central

    Blow, Matthew J.; Zhang, Tao; Woyke, Tanja; Speller, Camilla F.; Krivoshapkin, Andrei; Yang, Dongya Y.; Derevianko, Anatoly; Rubin, Edward M.

    2008-01-01

    Studies of ancient DNA have been hindered by the preciousness of remains, the small quantities of undamaged DNA accessible, and the limitations associated with conventional PCR amplification. In these studies, we developed and applied a genomewide adapter-mediated emulsion PCR amplification protocol for ancient mammalian samples estimated to be between 45,000 and 69,000 yr old. Using 454 Life Sciences (Roche) and Illumina sequencing (formerly Solexa sequencing) technologies, we examined over 100 megabases of DNA from amplified extracts, revealing unbiased sequence coverage with substantial amounts of nonredundant nuclear sequences from the sample sources and negligible levels of human contamination. We consistently recorded over 500-fold increases, such that nanogram quantities of starting material could be amplified to microgram quantities. Application of our protocol to a 50,000-yr-old uncharacterized bone sample that was unsuccessful in mitochondrial PCR provided sufficient nuclear sequences for comparison with extant mammals and subsequent phylogenetic classification of the remains. The combined use of emulsion PCR amplification and high-throughput sequencing allows for the generation of large quantities of DNA sequence data from ancient remains. Using such techniques, even small amounts of ancient remains with low levels of endogenous DNA preservation may yield substantial quantities of nuclear DNA, enabling novel applications of ancient DNA genomics to the investigation of extinct phyla. PMID:18426903

  11. Swine Genome Sequencing Consortium (SGSC): A Strategic Roadmap for Sequencing The Pig Genome

    PubMed Central

    Schook, Lawrence B.; Beever, Jonathan E.; Rogers, Jane; Humphray, Sean; Archibald, Alan; Chardon, Patrick; Milan, Denis; Rohrer, Gary; Eversole, Kellye

    2005-01-01

    The Swine Genome Sequencing Consortium (SGSC) was formed in September 2003 by academic, government and industry representatives to provide international coordination for sequencing the pig genome. The SGSC’s mission is to advance biomedical research for animal production and health by the development of DNA-based tools and products resulting from the sequencing of the swine genome. During the past 2 years, the SGSC has met bi-annually to develop a strategic roadmap for creating the required scientific resources, to integrate existing physical maps, and to create a sequencing strategy that captured international participation and a broad funding base. During the past year, SGSC members have integrated their respective physical mapping data with the goal of creating a minimal tiling path (MTP) that will be used as the sequencing template. During the recent Plant and Animal Genome meeting (January 16, 2005 San Diego, CA), presentations demonstrated that a human–pig comparative map has been completed, BAC fingerprint contigs (FPC) for each of the autosomes and X chromosome have been constructed and that BAC end-sequencing has permitted, through BLAST analysis and RH-mapping, anchoring of the contigs. Thus, significant progress has been made towards the creation of a MTP. In addition, whole-genome (WG) shotgun libraries have been constructed and are currently being sequenced in various laboratories around the globe. Thus, a hybrid sequencing approach in which 3x coverage of BACs comprising the MTP and 3x of the WG-shotgun libraries will be used to develop a draft 6x coverage of the pig genome. PMID:18629187
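
    The hybrid strategy above adds 3x coverage from BACs to 3x from whole-genome shotgun libraries to reach a 6x draft. As a rough sketch, the arithmetic can be written out as below. The expected-fraction formula is the standard Lander-Waterman idealisation (reads placed uniformly at random), which is an assumption brought in for illustration and is not stated in the abstract; the function names are hypothetical.

```python
import math


def combined_coverage(depths):
    """Sum per-library fold coverage into one nominal depth."""
    return sum(depths)


def expected_fraction_covered(depth):
    """Idealised fraction of bases hit by at least one read at a given depth.

    Assumes uniformly random read placement (Lander-Waterman model);
    real BAC-based and shotgun data deviate from this.
    """
    return 1.0 - math.exp(-depth)


bac_depth = 3.0  # coverage from BACs on the minimal tiling path
wgs_depth = 3.0  # coverage from whole-genome shotgun libraries

total = combined_coverage([bac_depth, wgs_depth])
print(total)  # 6.0
print(round(expected_fraction_covered(total), 4))  # 0.9975
```

    Under this idealisation, 3x alone would leave about 5% of bases unsampled, while 6x leaves roughly 0.25%, which is one way to see why the two 3x components are combined.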

  12. Genome instability, cancer and aging

    PubMed Central

    Maslov, Alexander Y.; Vijg, Jan

    2015-01-01

    DNA damage-driven genome instability underlies the diversity of life forms generated by the evolutionary process but is detrimental to the somatic cells of individual organisms. The cellular response to DNA damage can be roughly divided into two parts. First, when damage is severe, programmed cell death may occur or, alternatively, temporary or permanent cell cycle arrest. This protects against cancer but can have negative effects in the long term, e.g., by depleting stem cell reservoirs. Second, damage can be repaired through one or more of the many sophisticated genome maintenance pathways. However, erroneous DNA repair and incomplete restoration of chromatin after damage is resolved produce mutations and epimutations, respectively, both of which have been shown to accumulate with age. An increased burden of mutations and/or epimutations in aged tissues increases cancer risk and adversely affects gene transcriptional regulation, leading to progressive decline in organ function. Cellular degeneration and uncontrolled cell proliferation are both major hallmarks of aging. Despite the fact that one seems to exclude the other, they both may be driven by a common mechanism. Here, we review age-related changes in the mammalian genome and their possible functional consequences, with special emphasis on genome instability in stem/progenitor cells. PMID:19344750

  13. International network of cancer genome projects

    PubMed Central

    2010-01-01

    The International Cancer Genome Consortium (ICGC) was launched to coordinate large-scale cancer genome studies in tumors from 50 different cancer types and/or subtypes that are of clinical and societal importance across the globe. Systematic studies of over 25,000 cancer genomes at the genomic, epigenomic, and transcriptomic levels will reveal the repertoire of oncogenic mutations, uncover traces of the mutagenic influences, define clinically-relevant subtypes for prognosis and therapeutic management, and enable the development of new cancer therapies. PMID:20393554

  14. International network of cancer genome projects.

    PubMed

    Hudson, Thomas J; Anderson, Warwick; Artez, Axel; Barker, Anna D; Bell, Cindy; Bernabé, Rosa R; Bhan, M K; Calvo, Fabien; Eerola, Iiro; Gerhard, Daniela S; Guttmacher, Alan; Guyer, Mark; Hemsley, Fiona M; Jennings, Jennifer L; Kerr, David; Klatt, Peter; Kolar, Patrik; Kusada, Jun; Lane, David P; Laplace, Frank; Youyong, Lu; Nettekoven, Gerd; Ozenberger, Brad; Peterson, Jane; Rao, T S; Remacle, Jacques; Schafer, Alan J; Shibata, Tatsuhiro; Stratton, Michael R; Vockley, Joseph G; Watanabe, Koichi; Yang, Huanming; Yuen, Matthew M F; Knoppers, Bartha M; Bobrow, Martin; Cambon-Thomsen, Anne; Dressler, Lynn G; Dyke, Stephanie O M; Joly, Yann; Kato, Kazuto; Kennedy, Karen L; Nicolás, Pilar; Parker, Michael J; Rial-Sebbag, Emmanuelle; Romeo-Casabona, Carlos M; Shaw, Kenna M; Wallace, Susan; Wiesner, Georgia L; Zeps, Nikolajs; Lichter, Peter; Biankin, Andrew V; Chabannon, Christian; Chin, Lynda; Clément, Bruno; de Alava, Enrique; Degos, Françoise; Ferguson, Martin L; Geary, Peter; Hayes, D Neil; Hudson, Thomas J; Johns, Amber L; Kasprzyk, Arek; Nakagawa, Hidewaki; Penny, Robert; Piris, Miguel A; Sarin, Rajiv; Scarpa, Aldo; Shibata, Tatsuhiro; van de Vijver, Marc; Futreal, P Andrew; Aburatani, Hiroyuki; Bayés, Mónica; Botwell, David D L; Campbell, Peter J; Estivill, Xavier; Gerhard, Daniela S; Grimmond, Sean M; Gut, Ivo; Hirst, Martin; López-Otín, Carlos; Majumder, Partha; Marra, Marco; McPherson, John D; Nakagawa, Hidewaki; Ning, Zemin; Puente, Xose S; Ruan, Yijun; Shibata, Tatsuhiro; Stratton, Michael R; Stunnenberg, Hendrik G; Swerdlow, Harold; Velculescu, Victor E; Wilson, Richard K; Xue, Hong H; Yang, Liu; Spellman, Paul T; Bader, Gary D; Boutros, Paul C; Campbell, Peter J; Flicek, Paul; Getz, Gad; Guigó, Roderic; Guo, Guangwu; Haussler, David; Heath, Simon; Hubbard, Tim J; Jiang, Tao; Jones, Steven M; Li, Qibin; López-Bigas, Nuria; Luo, Ruibang; Muthuswamy, Lakshmi; Ouellette, B F Francis; Pearson, John V; Puente, Xose S; Quesada, Victor; Raphael, Benjamin J; 
Sander, Chris; Shibata, Tatsuhiro; Speed, Terence P; Stein, Lincoln D; Stuart, Joshua M; Teague, Jon W; Totoki, Yasushi; Tsunoda, Tatsuhiko; Valencia, Alfonso; Wheeler, David A; Wu, Honglong; Zhao, Shancen; Zhou, Guangyu; Stein, Lincoln D; Guigó, Roderic; Hubbard, Tim J; Joly, Yann; Jones, Steven M; Kasprzyk, Arek; Lathrop, Mark; López-Bigas, Nuria; Ouellette, B F Francis; Spellman, Paul T; Teague, Jon W; Thomas, Gilles; Valencia, Alfonso; Yoshida, Teruhiko; Kennedy, Karen L; Axton, Myles; Dyke, Stephanie O M; Futreal, P Andrew; Gerhard, Daniela S; Gunter, Chris; Guyer, Mark; Hudson, Thomas J; McPherson, John D; Miller, Linda J; Ozenberger, Brad; Shaw, Kenna M; Kasprzyk, Arek; Stein, Lincoln D; Zhang, Junjun; Haider, Syed A; Wang, Jianxin; Yung, Christina K; Cros, Anthony; Cross, Anthony; Liang, Yong; Gnaneshan, Saravanamuttu; Guberman, Jonathan; Hsu, Jack; Bobrow, Martin; Chalmers, Don R C; Hasel, Karl W; Joly, Yann; Kaan, Terry S H; Kennedy, Karen L; Knoppers, Bartha M; Lowrance, William W; Masui, Tohru; Nicolás, Pilar; Rial-Sebbag, Emmanuelle; Rodriguez, Laura Lyman; Vergely, Catherine; Yoshida, Teruhiko; Grimmond, Sean M; Biankin, Andrew V; Bowtell, David D L; Cloonan, Nicole; deFazio, Anna; Eshleman, James R; Etemadmoghadam, Dariush; Gardiner, Brooke B; Gardiner, Brooke A; Kench, James G; Scarpa, Aldo; Sutherland, Robert L; Tempero, Margaret A; Waddell, Nicola J; Wilson, Peter J; McPherson, John D; Gallinger, Steve; Tsao, Ming-Sound; Shaw, Patricia A; Petersen, Gloria M; Mukhopadhyay, Debabrata; Chin, Lynda; DePinho, Ronald A; Thayer, Sarah; Muthuswamy, Lakshmi; Shazand, Kamran; Beck, Timothy; Sam, Michelle; Timms, Lee; Ballin, Vanessa; Lu, Youyong; Ji, Jiafu; Zhang, Xiuqing; Chen, Feng; Hu, Xueda; Zhou, Guangyu; Yang, Qi; Tian, Geng; Zhang, Lianhai; Xing, Xiaofang; Li, Xianghong; Zhu, Zhenggang; Yu, Yingyan; Yu, Jun; Yang, Huanming; Lathrop, Mark; Tost, Jörg; Brennan, Paul; Holcatova, Ivana; Zaridze, David; Brazma, Alvis; Egevard, Lars; Prokhortchouk, Egor; 
Banks, Rosamonde Elizabeth; Uhlén, Mathias; Cambon-Thomsen, Anne; Viksna, Juris; Ponten, Fredrik; Skryabin, Konstantin; Stratton, Michael R; Futreal, P Andrew; Birney, Ewan; Borg, Ake; Børresen-Dale, Anne-Lise; Caldas, Carlos; Foekens, John A; Martin, Sancha; Reis-Filho, Jorge S; Richardson, Andrea L; Sotiriou, Christos; Stunnenberg, Hendrik G; Thoms, Giles; van de Vijver, Marc; van't Veer, Laura; Calvo, Fabien; Birnbaum, Daniel; Blanche, Hélène; Boucher, Pascal; Boyault, Sandrine; Chabannon, Christian; Gut, Ivo; Masson-Jacquemier, Jocelyne D; Lathrop, Mark; Pauporté, Iris; Pivot, Xavier; Vincent-Salomon, Anne; Tabone, Eric; Theillet, Charles; Thomas, Gilles; Tost, Jörg; Treilleux, Isabelle; Calvo, Fabien; Bioulac-Sage, Paulette; Clément, Bruno; Decaens, Thomas; Degos, Françoise; Franco, Dominique; Gut, Ivo; Gut, Marta; Heath, Simon; Lathrop, Mark; Samuel, Didier; Thomas, Gilles; Zucman-Rossi, Jessica; Lichter, Peter; Eils, Roland; Brors, Benedikt; Korbel, Jan O; Korshunov, Andrey; Landgraf, Pablo; Lehrach, Hans; Pfister, Stefan; Radlwimmer, Bernhard; Reifenberger, Guido; Taylor, Michael D; von Kalle, Christof; Majumder, Partha P; Sarin, Rajiv; Rao, T S; Bhan, M K; Scarpa, Aldo; Pederzoli, Paolo; Lawlor, Rita A; Delledonne, Massimo; Bardelli, Alberto; Biankin, Andrew V; Grimmond, Sean M; Gress, Thomas; Klimstra, David; Zamboni, Giuseppe; Shibata, Tatsuhiro; Nakamura, Yusuke; Nakagawa, Hidewaki; Kusada, Jun; Tsunoda, Tatsuhiko; Miyano, Satoru; Aburatani, Hiroyuki; Kato, Kazuto; Fujimoto, Akihiro; Yoshida, Teruhiko; Campo, Elias; López-Otín, Carlos; Estivill, Xavier; Guigó, Roderic; de Sanjosé, Silvia; Piris, Miguel A; Montserrat, Emili; González-Díaz, Marcos; Puente, Xose S; Jares, Pedro; Valencia, Alfonso; Himmelbauer, Heinz; Himmelbaue, Heinz; Quesada, Victor; Bea, Silvia; Stratton, Michael R; Futreal, P Andrew; Campbell, Peter J; Vincent-Salomon, Anne; Richardson, Andrea L; Reis-Filho, Jorge S; van de Vijver, Marc; Thomas, Gilles; Masson-Jacquemier, Jocelyne D; 
Aparicio, Samuel; Borg, Ake; Børresen-Dale, Anne-Lise; Caldas, Carlos; Foekens, John A; Stunnenberg, Hendrik G; van't Veer, Laura; Easton, Douglas F; Spellman, Paul T; Martin, Sancha; Barker, Anna D; Chin, Lynda; Collins, Francis S; Compton, Carolyn C; Ferguson, Martin L; Gerhard, Daniela S; Getz, Gad; Gunter, Chris; Guttmacher, Alan; Guyer, Mark; Hayes, D Neil; Lander, Eric S; Ozenberger, Brad; Penny, Robert; Peterson, Jane; Sander, Chris; Shaw, Kenna M; Speed, Terence P; Spellman, Paul T; Vockley, Joseph G; Wheeler, David A; Wilson, Richard K; Hudson, Thomas J; Chin, Lynda; Knoppers, Bartha M; Lander, Eric S; Lichter, Peter; Stein, Lincoln D; Stratton, Michael R; Anderson, Warwick; Barker, Anna D; Bell, Cindy; Bobrow, Martin; Burke, Wylie; Collins, Francis S; Compton, Carolyn C; DePinho, Ronald A; Easton, Douglas F; Futreal, P Andrew; Gerhard, Daniela S; Green, Anthony R; Guyer, Mark; Hamilton, Stanley R; Hubbard, Tim J; Kallioniemi, Olli P; Kennedy, Karen L; Ley, Timothy J; Liu, Edison T; Lu, Youyong; Majumder, Partha; Marra, Marco; Ozenberger, Brad; Peterson, Jane; Schafer, Alan J; Spellman, Paul T; Stunnenberg, Hendrik G; Wainwright, Brandon J; Wilson, Richard K; Yang, Huanming

    2010-04-15

    The International Cancer Genome Consortium (ICGC) was launched to coordinate large-scale cancer genome studies in tumours from 50 different cancer types and/or subtypes that are of clinical and societal importance across the globe. Systematic studies of more than 25,000 cancer genomes at the genomic, epigenomic and transcriptomic levels will reveal the repertoire of oncogenic mutations, uncover traces of the mutagenic influences, define clinically relevant subtypes for prognosis and therapeutic management, and enable the development of new cancer therapies.

  15. Transforming clinical microbiology with bacterial genome sequencing.

    PubMed

    Didelot, Xavier; Bowden, Rory; Wilson, Daniel J; Peto, Tim E A; Crook, Derrick W

    2012-09-01

    Whole-genome sequencing of bacteria has recently emerged as a cost-effective and convenient approach for addressing many microbiological questions. Here, we review the current status of clinical microbiology and how it has already begun to be transformed by using next-generation sequencing. We focus on three essential tasks: identifying the species of an isolate, testing its properties, such as resistance to antibiotics and virulence, and monitoring the emergence and spread of bacterial pathogens. We predict that the application of next-generation sequencing will soon be sufficiently fast, accurate and cheap to be used in routine clinical microbiology practice, where it could replace many complex current techniques with a single, more efficient workflow.

  16. Transforming clinical microbiology with bacterial genome sequencing

    PubMed Central

    2016-01-01

    Whole genome sequencing of bacteria has recently emerged as a cost-effective and convenient approach for addressing many microbiological questions. Here we review the current status of clinical microbiology and how it has already begun to be transformed by the use of next-generation sequencing. We focus on three essential tasks: identifying the species of an isolate, testing its properties such as resistance to antibiotics and virulence, and monitoring the emergence and spread of bacterial pathogens. The application of next-generation sequencing will soon be sufficiently fast, accurate and cheap to be used in routine clinical microbiology practice, where it could replace many complex current techniques with a single, more efficient workflow. PMID:22868263

  17. Comparative genomic analysis of esophageal cancers.

    PubMed

    Caygill, Christine P J; Gatenby, Piers A C; Herceg, Zdenko; Lima, Sheila C S; Pinto, Luis F R; Watson, Anthony; Wu, Ming-Shiang

    2014-09-01

    The following, from the 12th OESO World Conference: Cancers of the Esophagus, includes commentaries on comparative genomic analysis of esophageal cancers: genomic polymorphisms, the genetic and epigenetic drivers in esophageal cancers, and the collection of data in the UK Barrett's Oesophagus Registry.

  18. Why Assembling Plant Genome Sequences Is So Challenging

    PubMed Central

    Claros, Manuel Gonzalo; Bautista, Rocío; Guerrero-Fernández, Darío; Benzerki, Hicham; Seoane, Pedro; Fernández-Pozo, Noé

    2012-01-01

    In spite of the biological and economic importance of plants, relatively few plant species have been sequenced. Only the genome sequence of plants with relatively small genomes, most of them angiosperms, in particular eudicots, has been determined. The arrival of next-generation sequencing technologies has allowed the rapid and efficient development of new genomic resources for non-model or orphan plant species. But the sequencing pace of plants is far from that of animals and microorganisms. This review focuses on the typical challenges of plant genomes that can explain why plant genomics is less developed than animal genomics. Explanations about the impact of some confounding factors emerging from the nature of plant genomes are given. As a result of these challenges and confounding factors, the correct assembly and annotation of plant genomes is hindered, genome drafts are produced, and advances in plant genomics are delayed. PMID:24832233

  19. Making sense of cancer genomic data

    PubMed Central

    Chin, Lynda; Hahn, William C.; Getz, Gad; Meyerson, Matthew

    2011-01-01

    High-throughput tools for nucleic acid characterization now provide the means to conduct comprehensive analyses of all somatic alterations in the cancer genomes. Both large-scale and focused efforts have identified new targets of translational potential. The deluge of information that emerges from these genome-scale investigations has stimulated a parallel development of new analytical frameworks and tools. The complexity of somatic genomic alterations in cancer genomes also requires the development of robust methods for the interrogation of the function of genes identified by these genomics efforts. Here we provide an overview of the current state of cancer genomics, appraise the current portals and tools for accessing and analyzing cancer genomic data, and discuss emerging approaches to exploring the functions of somatically altered genes in cancer. PMID:21406553

  20. Cancer genetics and genomics: essentials for oncology nurses.

    PubMed

    Boucher, Jean; Habin, Karleen; Underhill, Meghan

    2014-06-01

    Cancer genetics and genomics are rapidly evolving, with new discoveries emerging in genetic mutations, variants, genomic sequencing, risk-reduction methods, and targeted therapies. To educate patients and families, state-of-the-art care requires nurses to understand terminology, scientific and technological advances, and pharmacogenomics. Clinical application of cancer genetics and genomics involves working in interdisciplinary teams to properly identify patient risk through assessing family history, facilitating genetic testing and counseling services, applying risk-reduction methods, and administering and monitoring targeted therapies.

  1. Viral sequences integrated into plant genomes.

    PubMed

    Harper, Glyn; Hull, Roger; Lockhart, Ben; Olszewski, Neil

    2002-01-01

    Sequences of various DNA plant viruses have been found integrated into the host genome. There are two forms of integrant, those that can form episomal viral infections and those that cannot. Integrants of three pararetroviruses, Banana streak virus (BSV), Tobacco vein clearing virus (TVCV), and Petunia vein clearing virus (PVCV), can generate episomal infections in certain hybrid plant hosts in response to stress. In the case of BSV and TVCV, one of the parents contains the integrant but it has not been seen to be activated in that parent; the other parent does not contain the integrant. The number of integrant loci is low for BSV and PVCV and high in TVCV. The structure of the integrants is complex, and it is thought that episomal virus is released by recombination and/or reverse transcription. Geminiviral and pararetroviral sequences are found in plant genomes although not so far associated with a virus disease. It appears that integration of viral sequences is widespread in the plant kingdom and has been occurring for a long period of time.

  2. Simple sequence repeats in bryophyte mitochondrial genomes.

    PubMed

    Zhao, Chao-Xian; Zhu, Rui-Liang; Liu, Yang

    2016-01-01

    Simple sequence repeats (SSRs) are thought to be common in plant mitochondrial (mt) genomes, but have yet to be fully described for bryophytes. We screened the mt genomes of two liverworts (Marchantia polymorpha and Pleurozia purpurea), two mosses (Physcomitrella patens and Anomodon rugelii) and two hornworts (Phaeoceros laevis and Nothoceros aenigmaticus), and detected 475 SSRs. Some SSRs were found to be conserved during evolution: one exists in both liverworts and mosses, and all others are shared only by the two liverworts, the two mosses or the two hornworts. SSRs are known as DNA tracts having high mutation rates; however, according to our observations, they can still evolve slowly. The conservation of these SSRs suggests that they are under strong selection and could play critical roles in maintaining gene functions.

  3. Complete mitochondrial genome sequence of Nectogale elegans.

    PubMed

    Huang, Ting; Yan, Chaochao; Tan, Zheng; Tu, Feiyun; Yue, Bisong; Zhang, Xiuyue

    2014-08-01

    The elegant water shrew (Nectogale elegans) belongs to the family Soricidae, and is distributed in northern South Asia, central and southern China and northern Southeast Asia. In this study, the complete mitochondrial genome of N. elegans was sequenced. It was determined to be 17,460 bases, and included 13 protein-coding genes (PCGs), 22 tRNA genes, 2 ribosomal RNA genes and one non-coding region, which is similar to other mammalian mitochondrial genomes. Bayesian inference and maximum likelihood methods were used to construct phylogenetic trees based on 12 heavy-strand concatenated PCGs. Phylogenetic analyses further confirmed that Crocidurinae diverged prior to Soricinae, and Sorex unguiculatus differentiated earlier than N. elegans.

  4. DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome.

    PubMed

    Ley, Timothy J; Mardis, Elaine R; Ding, Li; Fulton, Bob; McLellan, Michael D; Chen, Ken; Dooling, David; Dunford-Shore, Brian H; McGrath, Sean; Hickenbotham, Matthew; Cook, Lisa; Abbott, Rachel; Larson, David E; Koboldt, Dan C; Pohl, Craig; Smith, Scott; Hawkins, Amy; Abbott, Scott; Locke, Devin; Hillier, Ladeana W; Miner, Tracie; Fulton, Lucinda; Magrini, Vincent; Wylie, Todd; Glasscock, Jarret; Conyers, Joshua; Sander, Nathan; Shi, Xiaoqi; Osborne, John R; Minx, Patrick; Gordon, David; Chinwalla, Asif; Zhao, Yu; Ries, Rhonda E; Payton, Jacqueline E; Westervelt, Peter; Tomasson, Michael H; Watson, Mark; Baty, Jack; Ivanovich, Jennifer; Heath, Sharon; Shannon, William D; Nagarajan, Rakesh; Walter, Matthew J; Link, Daniel C; Graubert, Timothy A; DiPersio, John F; Wilson, Richard K

    2008-11-06

    Acute myeloid leukaemia is a highly malignant haematopoietic tumour that affects about 13,000 adults in the United States each year. The treatment of this disease has changed little in the past two decades, because most of the genetic events that initiate the disease remain undiscovered. Whole-genome sequencing is now possible at a reasonable cost and timeframe to use this approach for the unbiased discovery of tumour-specific somatic mutations that alter the protein-coding genes. Here we present the results obtained from sequencing a typical acute myeloid leukaemia genome, and its matched normal counterpart obtained from the same patient's skin. We discovered ten genes with acquired mutations; two were previously described mutations that are thought to contribute to tumour progression, and eight were new mutations present in virtually all tumour cells at presentation and relapse, the function of which is not yet known. Our study establishes whole-genome sequencing as an unbiased method for discovering cancer-initiating mutations in previously unidentified genes that may respond to targeted therapies.

  5. Next generation sequencing in cancer: opportunities and challenges for precision cancer medicine.

    PubMed

    Paolillo, Carmela; Londin, Eric; Fortina, Paolo

    2016-01-01

    Over the past decade, testing the genes of patients and their specific cancer types has become standardized practice in medical oncology since somatic mutations, changes in gene expression and epigenetic modifications are all hallmarks of cancer. However, while cancer genetic assessment has been limited to single biomarkers to guide the use of therapies, improvements in nucleic acid sequencing technologies and implementation of different genome analysis tools have enabled clinicians to detect these genomic alterations and identify functional and disease-associated genomic variants. Next-generation sequencing (NGS) technologies have provided clues about therapeutic targets and genomic markers for novel clinical applications when standard therapy has failed. While Sanger sequencing, an accurate and sensitive approach, allows for the identification of potential novel variants, it is however limited by the single amplicon being interrogated. Similarly, quantitative and qualitative profiling of gene expression changes also represents a challenge for the cancer field. Both RT-PCR and microarrays are efficient approaches, but are limited to the genes present on the array or being assayed. This leaves vast swaths of the transcriptome, including non-coding RNAs and other features, unexplored. With the advent of the ability to collect and analyze genomic sequence data in a timely fashion and at an ever-decreasing cost, many of these limitations have been overcome and are being incorporated into cancer research and diagnostics giving patients and clinicians new hope for targeted and personalized treatment. Below we highlight the various applications of next-generation sequencing in precision cancer medicine.

  6. Single-cell sequencing in cancer research.

    PubMed

    Mato Prado, Mireia; Frampton, Adam E; Stebbing, Justin; Krell, Jonathan

    2016-01-01

    Genome-wide single-cell sequencing investigations have the potential to classify individual cells within a tumor mass. In recent years, various single-cell DNA and RNA quantification techniques have facilitated significant advances in our ability to classify subpopulations of cells within a heterogeneous population. These approaches provide the possibility of unraveling the complex variability in genetic, epigenetic and transcriptional interactions that occur within identical cells in a tumor. This should enhance our knowledge of the underlying biological phenotypes and could have a huge impact in designing more precise anticancer treatments in order to improve outcomes and avoid tumor resistance. In addition, single-cell sequencing analysis has the potential to allow the development of better diagnostic and prognostic biomarkers, and thus aid the delivery of more personalized targeted cancer therapy. Nevertheless, further research is still required to overcome technical, biological and computational problems before clinical application.

  7. Cancer Genomics: Diversity and Disparity Across Ethnicity and Geography.

    PubMed

    Tan, Daniel S W; Mok, Tony S K; Rebbeck, Timothy R

    2016-01-01

    Ethnic and geographic differences in cancer incidence, prognosis, and treatment outcomes can be attributed to diversity in the inherited (germline) and somatic genome. Although international large-scale sequencing efforts are beginning to unravel the genomic underpinnings of cancer traits, much remains to be known about the underlying mechanisms and determinants of genomic diversity. Carcinogenesis is a dynamic, complex phenomenon representing the interplay between genetic and environmental factors that results in divergent phenotypes across ethnicities and geography. For example, compared with whites, there is a higher incidence of prostate cancer among Africans and African Americans, and the disease is generally more aggressive and fatal. Genome-wide association studies have identified germline susceptibility loci that may account for differences between the African and non-African patients, but the lack of availability of appropriate cohorts for replication studies and the incomplete understanding of genomic architecture across populations pose major limitations. We further discuss the transformative potential of routine diagnostic evaluation for actionable somatic alterations, using lung cancer as an example, highlighting implications of population disparities, current hurdles in implementation, and the far-reaching potential of clinical genomics in enhancing cancer prevention, diagnosis, and treatment. As we enter the era of precision cancer medicine, a concerted multinational effort is key to addressing population and genomic diversity as well as overcoming barriers and geographical disparities in research and health care delivery.

  8. Aligning Two Genomic Sequences That Contain Duplications

    NASA Astrophysics Data System (ADS)

    Hou, Minmei; Riemer, Cathy; Berman, Piotr; Hardison, Ross C.; Miller, Webb

    It is difficult to properly align genomic sequences that contain intra-species duplications. With this goal in mind, we have developed a tool, called TOAST (two-way orthologous alignment selection tool), for predicting whether two aligned regions from different species are orthologous, i.e., separated by a speciation event, as opposed to a duplication event. The advantage of restricting alignment to orthologous pairs is that they constitute the aligning regions that are most likely to share the same biological function, and most easily analyzed for evidence of selection. We evaluate TOAST on 12 human/mouse gene clusters.

  9. Management of familial cancer: sequencing, surveillance and society.

    PubMed

    Samuel, Nardin; Villani, Anita; Fernandez, Conrad V; Malkin, David

    2014-12-01

    The clinical management of familial cancer begins with recognition of patterns of cancer occurrence suggestive of genetic susceptibility in a proband or pedigree, to enable subsequent investigation of the underlying DNA mutations. In this regard, next-generation sequencing of DNA continues to transform cancer diagnostics, by enabling screening for cancer-susceptibility genes in the context of known and emerging familial cancer syndromes. Increasingly, not only are candidate cancer genes sequenced, but also entire 'healthy' genomes are mapped in children with cancer and their family members. Although large-scale genomic analysis is considered intrinsic to the success of cancer research and discovery, a number of accompanying ethical and technical issues must be addressed before this approach can be adopted widely in personalized therapy. In this Perspectives article, we describe our views on how the emergence of new sequencing technologies and cancer surveillance strategies is altering the framework for the clinical management of hereditary cancer. Genetic counselling and disclosure issues are discussed, and strategies for approaching ethical dilemmas are proposed.

  10. Rickettsia felis, from culture to genome sequencing.

    PubMed

    Ogata, H; Robert, C; Audic, S; Robineau, S; Blanc, G; Fournier, P E; Renesto, P; Claverie, J M; Raoult, D

    2005-12-01

    Rickettsia felis has recently been cultured in XTC2 cells. This allows production of enough bacteria to create a genomic bank and to sequence it. The chromosome of R. felis is longer than that of previously sequenced rickettsiae and it possesses 2 plasmids. Microscopically, this bacterium exhibits two forms of pili: one resembles a conjugative pilus and another forms hair-like projections that may play a role in pathogenicity. R. felis also exhibits several copies of ankyrin-repeat genes and tetratricopeptide-encoding genes that are specifically linked to pathogenic host-associated bacteria. It also contains genes encoding a toxin-antitoxin system, which are extremely rare in intracellular bacteria and may be linked to plasmid maintenance.

  11. Identifying Human Genome-Wide CNV, LOH and UPD by Targeted Sequencing of Selected Regions.

    PubMed

    Wang, Yu; Li, Wei; Xia, Yingying; Wang, Chongzhi; Tang, Y Tom; Guo, Wenying; Li, Jinliang; Zhao, Xia; Sun, Yepeng; Hu, Juan; Zhen, Hefu; Zhang, Xiandong; Chen, Chao; Shi, Yujian; Li, Lin; Cao, Hongzhi; Du, Hongli; Li, Jian

    2014-01-01

    Copy-number variations (CNV), loss of heterozygosity (LOH), and uniparental disomy (UPD) are large genomic aberrations leading to many common inherited diseases, cancers, and other complex diseases. An integrated tool to identify these aberrations is essential in understanding diseases and in designing clinical interventions. Previous discovery methods based on whole-genome sequencing (WGS) require very high depth of coverage on the whole genome scale, and are cost-wise inefficient. Another approach, whole exome genome sequencing (WEGS), is limited to discovering variations within exons. Thus, we are lacking efficient methods to detect genomic aberrations on the whole genome scale using next-generation sequencing technology. Here we present a method to identify genome-wide CNV, LOH and UPD for the human genome via selectively sequencing a small portion of genome termed Selected Target Regions (SeTRs). In our experiments, the SeTRs are covered by 99.73%~99.95% with sufficient depth. Our developed bioinformatics pipeline calls genome-wide CNVs with high confidence, revealing 8 credible events of LOH and 3 UPD events larger than 5M from 15 individual samples. We demonstrate that genome-wide CNV, LOH and UPD can be detected using a cost-effective SeTRs sequencing approach, and that LOH and UPD can be identified using just a sample grouping technique, without using a matched sample or familial information.
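The read-depth idea behind CNV calling in the abstract above can be illustrated with a small sketch. This is a generic depth-ratio calculation, not the authors' SeTRs pipeline: the function name, the assumption that depths are already normalized for library size, and the diploid baseline are all illustrative choices that do not come from the record.

```python
from statistics import median

def copy_number_estimates(sample_depth, group_depths, ploidy=2):
    """Estimate copy number per target region from read depth.

    sample_depth: mean depth per region for the sample under test
                  (assumed already normalized for library size).
    group_depths: one list of per-region depths for each reference sample.
    ploidy:       expected copy number of a normal region (2 for autosomes).

    Hypothetical helper for illustration; a region whose baseline depth
    is zero would raise ZeroDivisionError and should be filtered first.
    """
    estimates = []
    for idx, depth in enumerate(sample_depth):
        # Baseline = median depth of the same region across the group.
        baseline = median(sample[idx] for sample in group_depths)
        estimates.append(ploidy * depth / baseline)
    return estimates


if __name__ == "__main__":
    group = [[100, 100, 100], [98, 102, 100], [102, 98, 100]]
    print(copy_number_estimates([100, 50, 150], group))
```

A region at half the baseline depth comes out near one copy (a candidate deletion) and a region at 1.5 times the baseline near three copies (a candidate gain). Real callers also correct for GC content and capture efficiency and segment adjacent regions, none of which is attempted here.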

  12. Identifying Human Genome-Wide CNV, LOH and UPD by Targeted Sequencing of Selected Regions

    PubMed Central

    Guo, Wenying; Li, Jinliang; Zhao, Xia; Sun, Yepeng; Hu, Juan; Zhen, Hefu; Zhang, Xiandong; Chen, Chao; Shi, Yujian; Li, Lin; Cao, Hongzhi; Du, Hongli; Li, Jian

    2015-01-01

    Copy-number variations (CNV), loss of heterozygosity (LOH), and uniparental disomy (UPD) are large genomic aberrations leading to many common inherited diseases, cancers, and other complex diseases. An integrated tool to identify these aberrations is essential in understanding diseases and in designing clinical interventions. Previous discovery methods based on whole-genome sequencing (WGS) require very high depth of coverage on the whole genome scale, and are cost-wise inefficient. Another approach, whole exome genome sequencing (WEGS), is limited to discovering variations within exons. Thus, we are lacking efficient methods to detect genomic aberrations on the whole genome scale using next-generation sequencing technology. Here we present a method to identify genome-wide CNV, LOH and UPD for the human genome via selectively sequencing a small portion of genome termed Selected Target Regions (SeTRs). In our experiments, the SeTRs are covered by 99.73%~99.95% with sufficient depth. Our developed bioinformatics pipeline calls genome-wide CNVs with high confidence, revealing 8 credible events of LOH and 3 UPD events larger than 5M from 15 individual samples. We demonstrate that genome-wide CNV, LOH and UPD can be detected using a cost-effective SeTRs sequencing approach, and that LOH and UPD can be identified using just a sample grouping technique, without using a matched sample or familial information. PMID:25919136

  13. Initial sequencing and comparative analysis of the mouse genome

    SciTech Connect

    Waterston, Robert H.; Lindblad-Toh, Kerstin; Birney, Ewan; Rogers, Jane; Abril, Josep F.; Agarwal, Pankaj; Agarwala, Richa; Ainscough, Rachel; Alexandersson, Marina; An, Peter; Antonarakis, Stylianos E.; Attwood, John; Baertsch, Robert; Bailey, Jonathon; Barlow, Karen; Beck, Stephan; Berry, Eric; Birren, Bruce; Bloom, Toby; Bork, Peer; Botcherby, Marc; Bray, Nicolas; Brent, Michael R.; Brown, Daniel G.; Brown, Stephen D.; Bult, Carol; Burton, John; Butler, Jonathan; Campbell, Robert D.; Carninci, Piero; Cawley, Simon; Chiaromonte, Francesca; Chinwalla, Asif T.; Church, Deanna M.; Clamp, Michele; Clee, Christopher; Collins, Francis S.; Cook, Lisa L.; Copley, Richard R.; Coulson, Alan; Couronne, Olivier; Cuff, James; Curwen, Val; Cutts, Tim; Daly, Mark; David, Robert; Davies, Joy; Delehaunty, Kimberly D.; Deri, Justin; Dermitzakis, Emmanouil T.; Dewey, Colin; Dickens, Nicholas J.; Diekhans, Mark; Dodge, Sheila; Dubchak, Inna; Dunn, Diane M.; Eddy, Sean R.; Elnitski, Laura; Emes, Richard D.; Eswara, Pallavi; Eyras, Eduardo; Felsenfeld, Adam; Fewell, Ginger A.; Flicek, Paul; Foley, Karen; Frankel, Wayne N.; Fulton, Lucinda A.; Fulton, Robert S.; Furey, Terrence S.; Gage, Diane; Gibbs, Richard A.; Glusman, Gustavo; Gnerre, Sante; Goldman, Nick; Goodstadt, Leo; Grafham, Darren; Graves, Tina A.; Green, Eric D.; Gregory, Simon; Guigo, Roderic; Guyer, Mark; Hardison, Ross C.; Haussler, David; Hayashizaki, Yoshihide; Hillier, LaDeana W.; Hinrichs, Angela; Hlavina, Wratko; Holzer, Timothy; Hsu, Fan; Hua, Axin; Hubbard, Tim; Hunt, Adrienne; Jackson, Ian; Jaffe, David B.; Johnson, L. Steven; Jones, Matthew; Jones, Thomas A.; Joy, Ann; Kamal, Michael; Karlsson, Elinor K.; Karolchik, Donna; Kasprzyk, Arkadiusz; Kawai, Jun; Keibler, Evan; Kells, Cristyn; Kent, W. James; Kirby, Andrew; Kolbe, Diana L.; Korf, Ian; Kucherlapati, Raju S.; Kulbokas III, Edward J.; Kulp, David; Landers, Tom; Leger, J.P.; Leonard, Steven; Letunic, Ivica; Levine, Rosie; et al.

    2002-12-15

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

  14. [Current topics in mutations in the cancer genome].

    PubMed

    Iwaya, Takeshi; Mimori, Koshi; Wakabayashi, Go

    2012-03-01

    Several oncogenes and tumor suppressor genes are involved in the multistep process of carcinogenesis in many cancer types. Recently, global mutational analyses have revealed that the cancer genome has far greater numbers of mutations than previously thought. Furthermore, the next-generation sequencing method, which has a different principle from conventional Sanger sequencing, has provided more information on the cancer genome such as new cancer-related genes and the existence of many rearrangements in solid cancers. Somatic mutations occurring in cancer cells are divided into "driver" and "passenger" mutations. Driver mutations confer a growth advantage upon the neoplastic clone and are crucial for carcinogenesis. The remaining large majority of mutations are passengers, which, by definition, do not confer a growth advantage. Driver genes with low-frequency mutation rates (less than 10%) are also involved in carcinogenesis along with well-known drivers with high-frequency mutations. There are now several celebrated examples of anticancer drugs of which the efficacy in cancer patients can be predicted based on the genotype of several driver genes, such as EGFR, KRAS, and BRAF on the EGFR signaling pathway. The complete catalogs of somatic mutations provided by the sequencing of the cancer genome are expected to prompt new approaches to diagnosis, therapy, and potentially prevention.

  15. Next-generation sequencing: advances and applications in cancer diagnosis

    PubMed Central

    Serratì, Simona; De Summa, Simona; Pilato, Brunella; Petriella, Daniela; Lacalamita, Rosanna; Tommasi, Stefania; Pinto, Rosamaria

    2016-01-01

    Technological advances have led to the introduction of next-generation sequencing (NGS) platforms in cancer investigation. NGS allows massive parallel sequencing that affords maximal tumor genomic assessment. NGS approaches are different, and concern DNA and RNA analysis. DNA sequencing includes whole-genome, whole-exome, and targeted sequencing, which focuses on a selection of genes of interest for a specific disease. RNA sequencing facilitates the detection of alternative gene-spliced transcripts, posttranscriptional modifications, gene fusion, mutations/single-nucleotide polymorphisms, small and long noncoding RNAs, and changes in gene expression. Most applications are in the cancer research field, but lately NGS technology has been revolutionizing cancer molecular diagnostics, due to the many advantages it offers compared to traditional methods. There is greater knowledge on solid cancer diagnostics, and recent interest has been shown also in the field of hematologic cancer. In this review, we report the latest data on NGS diagnostic/predictive clinical applications in solid and hematologic cancers. Moreover, since the amount of NGS data produced is very large and their interpretation is very complex, we briefly discuss two bioinformatic aspects, variant-calling accuracy and copy-number variation detection, which are gaining a lot of importance in cancer-diagnostic assessment. PMID:27980425

  16. Investigating core genetic-and-epigenetic cell cycle networks for stemness and carcinogenic mechanisms, and cancer drug design using big database mining and genome-wide next-generation sequencing data.

    PubMed

    Li, Cheng-Wei; Chen, Bor-Sen

    2016-10-01

    Recent studies have demonstrated that cell cycle plays a central role in development and carcinogenesis. Thus, the use of big databases and genome-wide high-throughput data to unravel the genetic and epigenetic mechanisms underlying cell cycle progression in stem cells and cancer cells is a matter of considerable interest. Real genetic-and-epigenetic cell cycle networks (GECNs) of embryonic stem cells (ESCs) and HeLa cancer cells were constructed by applying system modeling, system identification, and big database mining to genome-wide next-generation sequencing data. Real GECNs were then reduced to core GECNs of HeLa cells and ESCs by applying principal genome-wide network projection. In this study, we investigated potential carcinogenic and stemness mechanisms for systems cancer drug design by identifying common core and specific GECNs between HeLa cells and ESCs. Integrating drug database information with the specific GECNs of HeLa cells could lead to identification of multiple drugs for cervical cancer treatment with minimal side-effects on the genes in the common core. We found that dysregulation of miR-29C, miR-34A, miR-98, and miR-215; and methylation of ANKRD1, ARID5B, CDCA2, PIF1, STAMBPL1, TROAP, ZNF165, and HIST1H2AJ in HeLa cells could result in cell proliferation and anti-apoptosis through NFκB, TGF-β, and PI3K pathways. We also identified 3 drugs, methotrexate, quercetin, and mimosine, which repressed the activated cell cycle genes, ARID5B, STK17B, and CCL2, in HeLa cells with minimal side-effects.

  17. Revealing the Complexity of Breast Cancer by Next Generation Sequencing

    PubMed Central

    Verigos, John; Magklara, Angeliki

    2015-01-01

    Over the last few years the increasing usage of “-omic” platforms, supported by next-generation sequencing, in the analysis of breast cancer samples has tremendously advanced our understanding of the disease. New driver and passenger mutations, rare chromosomal rearrangements and other genomic aberrations identified by whole genome and exome sequencing are providing missing pieces of the genomic architecture of breast cancer. High resolution maps of breast cancer methylomes and sequencing of the miRNA microworld are beginning to paint the epigenomic landscape of the disease. Transcriptomic profiling is giving us a glimpse into the gene regulatory networks that govern the fate of the breast cancer cell. At the same time, integrative analysis of sequencing data confirms an extensive intertumor and intratumor heterogeneity and plasticity in breast cancer arguing for a new approach to the problem. In this review, we report on the latest findings on the molecular characterization of breast cancer using NGS technologies, and we discuss their potential implications for the improvement of existing therapies. PMID:26561834

  18. Detecting long tandem duplications in genomic sequences

    PubMed Central

    2012-01-01

    Background: Detecting duplication segments within completely sequenced genomes provides valuable information to address genome evolution and in particular the important question of the emergence of novel functions. The usual approach to gene duplication detection, based on all-pairs protein gene comparisons, provides only a restricted view of duplication. Results: In this paper, we introduce ReD Tandem, software that uses a flow-based chaining algorithm targeted at detecting tandem duplication arrays of moderate to longer length regions, with possibly locally weak similarities, directly at the DNA level. On the A. thaliana genome, using a reference set of tandem duplicated genes built using TAIR, we show that ReD Tandem is able to predict a large fraction of recently duplicated genes (dS < 1) and that it is also able to predict tandem duplications involving non-coding elements such as pseudo-genes or RNA genes. Conclusions: ReD Tandem allows identification of large tandem duplications without any annotation, leading to agnostic identification of tandem duplications. This approach nicely complements the usual protein gene-based approach, which ignores duplications involving non-coding regions. It is, however, inherently restricted to relatively recent duplications. By recovering otherwise ignored events, ReD Tandem gives a more comprehensive view of existing evolutionary processes and may also allow existing annotations to be improved. PMID:22568762
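To make concrete what "detecting tandem duplications directly at the DNA level" means, here is a deliberately naive sketch. It is not ReD Tandem's flow-based chaining algorithm: it slides a fixed-length window along the sequence and compares each block with the block immediately after it. The function name, the fixed unit length, and the identity threshold are assumptions made for illustration.

```python
def tandem_repeats(seq, unit_len, min_identity=0.9):
    """Return (start, identity) for every position where the block
    seq[start:start+unit_len] is followed directly by a similar copy.

    Naive O(n * unit_len) scan without gaps; hypothetical helper that
    only illustrates the annotation-free, DNA-level idea.
    """
    hits = []
    for i in range(len(seq) - 2 * unit_len + 1):
        first = seq[i:i + unit_len]
        second = seq[i + unit_len:i + 2 * unit_len]
        # Fraction of positions at which the two adjacent blocks agree.
        matches = sum(a == b for a, b in zip(first, second))
        identity = matches / unit_len
        if identity >= min_identity:
            hits.append((i, identity))
    return hits


if __name__ == "__main__":
    dna = "TTT" + "ACGTAC" + "ACGTAC" + "GGG"
    print(tandem_repeats(dna, unit_len=6, min_identity=1.0))
```

Lowering `min_identity` lets diverged copies through, which loosely mirrors the abstract's point about "locally weak similarities"; handling insertions, deletions and variable unit lengths is what a chaining approach adds and this sketch omits.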

  19. Rapid whole genome sequencing and precision neonatology.

    PubMed

    Petrikin, Joshua E; Willig, Laurel K; Smith, Laurie D; Kingsmore, Stephen F

    2015-12-01

    Traditionally, genetic testing has been too slow or perceived to be impractical for initial management of the critically ill neonate. Technological advances have led to the ability to sequence and interpret the entire genome of a neonate in as little as 26 h. As the cost and turnaround time of testing decrease, the utility of whole genome sequencing (WGS) of neonates for acute and latent genetic illness increases. Analyzing the entire genome allows for concomitant evaluation of the currently identified 5588 single gene diseases. When applied to a select population of ill infants in a level IV neonatal intensive care unit, WGS yielded a diagnosis of a causative genetic disease in 57% of patients. These diagnoses may lead to clinical management changes ranging from transition to palliative care for uniformly lethal conditions to alteration or initiation of medical or surgical therapy to improve outcomes in others. Thus, institution of 2-day WGS at time of acute presentation opens the possibility of early implementation of precision medicine. This implementation may create opportunities for early interventional, frequently novel or off-label therapies that may alter disease trajectory in infants with what would otherwise be fatal disease. Widespread deployment of rapid WGS and precision medicine will raise ethical issues pertaining to interpretation of variants of unknown significance, discovery of incidental findings related to adult onset conditions and carrier status, and implementation of medical therapies for which little is known in terms of risks and benefits. Despite these challenges, precision neonatology has significant potential both to decrease infant mortality related to genetic diseases with onset in newborns and to facilitate parental decision making regarding transition to palliative care.

  20. Complete genome sequence of Methanoculleus marisnigri type strain JR1

    SciTech Connect

    Anderson, Iain; Sieprawska-Lupa, Magdalena; Goltsman, Eugene; Lapidus, Alla L.; Copeland, A; Glavina Del Rio, Tijana; Tice, Hope; Dalin, Eileen; Barry, Kerrie; Saunders, Elizabeth H; Han, Cliff; Brettin, Tom; Detter, J. Chris; Bruce, David; Mikhailova, Natalia; Pitluck, Sam; Hauser, Loren John; Land, Miriam L; Lucas, Susan; Richardson, P M; Whitman, W. B.; Kyrpides, Nikos C

    2009-01-01

    Methanoculleus marisnigri Romesser et al. 1981 is a methanogen belonging to the order Methanomicrobiales within the archaeal phylum Euryarchaeota. The type strain, JR1, was isolated from anoxic sediments of the Black Sea. M. marisnigri is of phylogenetic interest because at the time the sequencing project began only one genome had previously been sequenced from the order Methanomicrobiales. We report here the complete genome sequence of M. marisnigri type strain JR1 and its annotation. This is part of a Joint Genome Institute 2006 Community Sequencing Program to sequence genomes of diverse Archaea.

  1. Complete genome sequence of Methanocorpusculum labreanum type strain Z

    SciTech Connect

    Anderson, Iain; Sieprawska-Lupa, Magdalena; Goltsman, Eugene; Lapidus, Alla L.; Copeland, A; Glavina Del Rio, Tijana; Tice, Hope; Dalin, Eileen; Barry, Kerrie; Pitluck, Sam; Hauser, Loren John; Land, Miriam L; Lucas, Susan; Richardson, P M; Whitman, W. B.; Kyrpides, Nikos C

    2009-01-01

    Methanocorpusculum labreanum is a methanogen belonging to the order Methanomicrobiales within the archaeal phylum Euryarchaeota. The type strain Z was isolated from surface sediments of Tar Pit Lake in the La Brea Tar Pits in Los Angeles, California. M. labreanum is of phylogenetic interest because at the time the sequencing project began only one genome had previously been sequenced from the order Methanomicrobiales. We report here the complete genome sequence of M. labreanum type strain Z and its annotation. This is part of a 2006 Joint Genome Institute Community Sequencing Program project to sequence genomes of diverse Archaea.

  2. Genomic Sequence Comparisons, 1987-2003 Final Report

    SciTech Connect

    George M. Church

    2004-07-29

    This project was to develop new DNA sequencing and RNA and protein quantitation methods and related genome annotation tools. The project began in 1987 with the development of multiplex sequencing (published in Science in 1988), and one of the first automated sequencing methods. This led to the first commercial genome sequence in 1994 and to the establishment of the main commercial participants (GTC then Agencourt) in the public DOE/NIH genome project. In collaboration with GTC we contributed to one of the first complete DOE genome sequences, in 1997, that of Methanobacterium thermoautotrophicum, a species of great relevance to energy-rich gas production.

  3. Comprehensive characterization of the genomic alterations in human gastric cancer

    PubMed Central

    Cui, Juan; Yin, Yanbin; Ma, Qin; Wang, Guoqing; Olman, Victor; Zhang, Yu; Chou, Wen-Chi; Hong, Celine S.; Zhang, Chi; Cao, Sha; Mao, Xizeng; Li, Ying; Qin, Steve; Zhao, Shaying; Jiang, Jing; Hastings, Phil; Li, Fan; Xu, Ying

    2016-01-01

    Gastric cancer is one of the most prevalent and aggressive cancers worldwide, and its molecular mechanism remains largely elusive. Here we report the genomic landscape of human primary gastric adenocarcinoma, based on the complete genome sequences of five pairs of cancer and matching normal samples. In total, 103,464 somatic point mutations, including 407 nonsynonymous ones, were identified and the most recurrent mutations were harbored by Mucins (MUC3A and MUC12) and transcription factors (ZNF717, ZNF595 and TP53). 679 genomic rearrangements were detected, which affect 355 protein-coding genes; and 76 genes show copy number changes. Through mapping the boundaries of the rearranged regions to the folded three-dimensional structure of human chromosomes, we determined that 79.6% of the chromosomal rearrangements happen among DNA fragments in close spatial proximity, especially when two endpoints stay in a similar replication phase. We present evidence that microhomology-mediated break-induced replication was utilized as a mechanism in inducing ~40.9% of the identified genomic changes in gastric tumor. Our data analyses revealed potential integrations of Helicobacter pylori DNA into the gastric cancer genomes. Overall, a large set of novel genomic variations was detected in these gastric cancer genomes, which may be essential to the study of the genetic basis and molecular mechanism of gastric tumorigenesis. PMID:25422082

  4. First Complete Genome Sequence of Cherry virus A

    PubMed Central

    Koinuma, Hiroaki; Nijo, Takamichi; Iwabuchi, Nozomu; Yoshida, Tetsuya; Keima, Takuya; Okano, Yukari; Maejima, Kensaku; Yamaji, Yasuyuki

    2016-01-01

    The 5′-terminal genomic sequence of Cherry virus A (CVA) has long been unknown. We determined the first complete genome sequence of an apricot isolate of CVA (7,434 nucleotides [nt]). The 5′-untranslated region was 107 nt in length, which was 53 nt longer than those of known CVA sequences. PMID:27284130

  5. Draft Genome Sequence of Bacillus amyloliquefaciens B-1895

    PubMed Central

    Melnikov, Vyacheslav G.; Chistyakov, Vladimir A.

    2014-01-01

    In this report, we present a draft genome sequence of Bacillus amyloliquefaciens strain B-1895. Comparison with the genome of a reference strain demonstrated similar overall organization, as well as differences involving large gene clusters. PMID:24948774

  6. Draft Genome Sequence of Brevibacterium massiliense Strain 541308T

    PubMed Central

    Robert, Catherine; Gimenez, Grégory; Raoult, Didier

    2012-01-01

    A draft genome sequence of Brevibacterium massiliense, an aerobic bacterium isolated from a human ankle discharge, is described here. CRISPR-associated proteins were found to be encoded in the genome, and analysis of transport proteins was performed. PMID:22933772

  7. CGCI Investigators Reveal Comprehensive Landscape of Diffuse Large B-Cell Lymphoma (DLBCL) Genomes | Office of Cancer Genomics

    Cancer.gov

    Researchers from British Columbia Cancer Agency used whole genome sequencing to analyze 40 DLBCL cases and 13 cell lines in order to fill in the gaps of the complex landscape of DLBCL genomes. Their analysis, “Mutational and structural analysis of diffuse large B-cell lymphoma using whole genome sequencing,” was published online in Blood on May 22. The authors are Ryan Morin, Marco Marra, and colleagues.  

  8. From Conventional to Next Generation Sequencing of Epstein-Barr Virus Genomes.

    PubMed

    Kwok, Hin; Chiang, Alan Kwok Shing

    2016-02-24

    Genomic sequences of Epstein-Barr virus (EBV) have been of interest because the virus is associated with cancers, such as nasopharyngeal carcinoma, and conditions such as infectious mononucleosis. The progress of whole-genome EBV sequencing has been limited by the inefficiency and cost of the first-generation sequencing technology. With the advancement of next-generation sequencing (NGS) and target enrichment strategies, an increasing number of EBV genomes have been published. These genomes were sequenced using different approaches, either with or without EBV DNA enrichment. This review provides an overview of the EBV genomes published to date, and a description of the sequencing technology and bioinformatic analyses employed in generating these sequences. We further explored ways through which the quality of sequencing data can be improved, such as using DNA oligos for capture hybridization, and longer insert size and read length in the sequencing runs. These advances will enable large-scale genomic sequencing of EBV, which will facilitate a better understanding of the genetic variations of EBV in different geographic regions and discovery of potentially pathogenic variants in specific diseases.

  9. Weighted gene co-expression network analysis of colorectal cancer liver metastasis genome sequencing data and screening of anti-metastasis drugs.

    PubMed

    Gao, Bo; Shao, Qin; Choudhry, Hani; Marcus, Victoria; Dong, Kung; Ragoussis, Jiannis; Gao, Zu-Hua

    2016-09-01

    Approximately 9% of cancer-related deaths are caused by colorectal cancer (CRC). CRC patients are prone to liver metastasis, which is the most important cause for the high CRC mortality rate. Understanding the molecular mechanism of CRC liver metastasis could help us to find novel targets for the effective treatment of this deadly disease. Using weighted gene co-expression network analysis on the sequencing data of CRC with and without metastasis, we identified 5 colorectal cancer liver metastasis related modules which were labeled as brown, blue, grey, yellow and turquoise. In the brown module, which represents the metastatic tumor in the liver, gene ontology (GO) analysis revealed functions including the G-protein coupled receptor protein signaling pathway, epithelial cell differentiation and cell surface receptor linked signal transduction. In the blue module, which represents the primary CRC that has metastasized, GO analysis showed that the genes were mainly enriched in GO terms including G-protein coupled receptor protein signaling pathway, cell surface receptor linked signal transduction, and negative regulation of cell differentiation. In the yellow and turquoise modules, which represent the primary non-metastatic CRC, 13 downregulated CRC liver metastasis-related candidate miRNAs were identified (e.g. hsa-miR-204, hsa-miR-455, etc.). Furthermore, analyzing the DrugBank database and mining the literature identified 25 and 12 candidate drugs that could potentially block the metastatic processes of the primary tumor and inhibit the progression of metastatic tumors in the liver, respectively. Data generated from this study not only furthers our understanding of the genetic alterations that drive the metastatic process, but also guides the development of molecular-targeted therapy of colorectal cancer liver metastasis.
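As a rough illustration of the first step in weighted gene co-expression network analysis, the sketch below builds an unsigned soft-threshold adjacency, a_ij = |cor(x_i, x_j)|^beta, from per-gene expression vectors. The gene names, expression values, and beta = 6 are assumptions for illustration; the record does not state the study's parameters, and the later steps of the method (topological overlap, hierarchical clustering into colour-labelled modules) are not shown.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists.
    Raises ZeroDivisionError if either vector has zero variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency between every pair of genes.

    expr: dict mapping gene name -> list of expression values,
          one value per sample (hypothetical input layout).
    beta: soft-thresholding power; raising |r| to a power suppresses
          weak correlations while keeping the network weighted.
    """
    adjacency = {}
    for g in expr:
        for h in expr:
            if g == h:
                adjacency[(g, h)] = 1.0
            else:
                adjacency[(g, h)] = abs(pearson(expr[g], expr[h])) ** beta
    return adjacency


if __name__ == "__main__":
    toy = {
        "geneA": [1, 2, 3, 4],
        "geneB": [4, 3, 2, 1],   # perfectly anti-correlated with geneA
        "geneC": [1, 3, 2, 4],   # correlation 0.8 with geneA
    }
    adj = soft_threshold_adjacency(toy)
    print(adj[("geneA", "geneB")], adj[("geneA", "geneC")])
```

Because the adjacency is unsigned, the anti-correlated pair still receives a weight near 1, while the pair with correlation 0.8 drops to about 0.26; that compression of moderate correlations is the purpose of the soft threshold.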

  10. Complete genome sequence of Arcanobacterium haemolyticum type strain (11018T)

    SciTech Connect

    Yasawong, Montri; Teshima, Hazuki; Lapidus, Alla L.; Nolan, Matt; Lucas, Susan; Glavina Del Rio, Tijana; Tice, Hope; Cheng, Jan-Fang; Bruce, David; Detter, J. Chris; Tapia, Roxanne; Han, Cliff; Goodwin, Lynne A.; Pitluck, Sam; Liolios, Konstantinos; Ivanova, N; Mavromatis, K; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam L; Hauser, Loren John; Chang, Yun-Juan; Jeffries, Cynthia; Rohde, Manfred; Sikorski, Johannes; Pukall, Rudiger; Goker, Markus; Woyke, Tanja; Bristow, James; Eisen, Jonathan; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter

    2010-01-01

    Vulcanisaeta distributa Itoh et al. 2002 belongs to the family Thermoproteaceae in the phylum Crenarchaeota. The genus Vulcanisaeta is characterized by a global distribution in hot and acidic springs. This is the first genome sequence from a member of the genus Vulcanisaeta and seventh genome sequence in the family Thermoproteaceae. The 2,374,137 bp long genome with its 2,544 protein-coding and 49 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  11. Complete genome sequence of Acetohalobium arabaticum type strain (Z-7288).

    PubMed

    Sikorski, Johannes; Lapidus, Alla; Chertkov, Olga; Lucas, Susan; Copeland, Alex; Glavina Del Rio, Tijana; Nolan, Matt; Tice, Hope; Cheng, Jan-Fang; Han, Cliff; Brambilla, Evelyne; Pitluck, Sam; Liolios, Konstantinos; Ivanova, Natalia; Mavromatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Bruce, David; Detter, Chris; Tapia, Roxanne; Goodwin, Lynne; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Rohde, Manfred; Göker, Markus; Spring, Stefan; Woyke, Tanja; Bristow, James; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter

    2010-08-20

    Acetohalobium arabaticum Zhilina and Zavarzin 1990 is of special interest because of its physiology and its participation in the anaerobic C(1)-trophic chain in hypersaline environments. This is the first completed genome sequence of the family Halobacteroidaceae and only the second genome sequence in the order Halanaerobiales. The 2,469,596 bp long genome with its 2,353 protein-coding and 90 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  12. Next-generation sequencing strategies for characterizing the turkey genome.

    PubMed

    Dalloul, Rami A; Zimin, Aleksey V; Settlage, Robert E; Kim, Sungwon; Reed, Kent M

    2014-02-01

    The turkey genome sequencing project was initiated in 2008 and has relied primarily on next-generation sequencing (NGS) technologies. Our first efforts used a synergistic combination of 2 NGS platforms (Roche/454 and Illumina GAII), detailed bacterial artificial chromosome (BAC) maps, and unique assembly tools to sequence and assemble the genome of the domesticated turkey, Meleagris gallopavo. Since the first release in 2010, efforts to improve the genome assembly, gene annotation, and genomic analyses continue. The initial assembly build (2.01) represented about 89% of the genome sequence with 17X coverage depth (931 Mb). Sequence contigs were assigned to 30 of the 40 chromosomes with approximately 10% of the assembled sequence corresponding to unassigned chromosomes (ChrUn). The sequence has been refined through both genome-wide and area-focused sequencing, including shotgun and paired-end sequencing, and targeted sequencing of chromosomal regions with low or incomplete coverage. These additional efforts have improved the sequence assembly resulting in 2 subsequent genome builds of higher genome coverage (25X/Build3.0 and 30X/Build4.0) with a current sequence totaling 1,010 Mb. Further, BAC with end sequences assigned to the Z/W and MG18 (MHC) chromosomes, ChrUn, or not placed in the previous build were isolated, deeply sequenced (Hi-Seq), and incorporated into the latest build (5.0). To aid in the annotation and to generate a gene expression atlas of major tissues, a comprehensive set of RNA samples was collected at various developmental stages of female and male turkeys. Transcriptome sequencing data (using Illumina Hi-Seq) will provide information to enhance the final assembly and ultimately improve sequence annotation. The most current sequence covers more than 95% of the turkey genome and should yield a much improved gene level of annotation, making it a valuable resource for studying genetic variations underlying economically important traits in poultry.
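The coverage figures quoted in the abstract above (17X, 25X, 30X) follow the usual definition of mean depth as total sequenced bases divided by genome or assembly size. The sketch below is a back-of-envelope illustration only: the function names are hypothetical, and the idealized covered-fraction estimate assumes uniformly random read placement, which real sequencing data do not satisfy.

```python
from math import exp

def mean_depth(total_bases, genome_size):
    """Mean coverage depth = sequenced bases / genome size."""
    return total_bases / genome_size

def bases_needed(depth, genome_size):
    """Sequenced bases required to reach a target mean depth."""
    return depth * genome_size

def expected_fraction_covered(depth):
    """Idealized fraction of positions covered at least once,
    1 - e^(-depth), assuming reads land uniformly at random."""
    return 1.0 - exp(-depth)


if __name__ == "__main__":
    assembly_size = 931_000_000          # 931 Mb, from the abstract
    print(bases_needed(17, assembly_size))   # bases implied by 17X
    print(expected_fraction_covered(17))
```

Under these assumptions, 17X over a 931 Mb assembly corresponds to roughly 15.8 billion sequenced bases. The idealized model predicts almost complete coverage at that depth, so the gaps the abstract describes (about 89% of the genome in the first build) are better explained by repeats, sequence bias and assembly difficulty than by insufficient raw depth.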

  13. Integration of new alternative reference strain genome sequences into the Saccharomyces genome database.

    PubMed

    Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C; Dalusag, Kyla; Demeter, Janos; Engel, Stacia; Hellerstedt, Sage T; Karra, Kalpana; Hitz, Benjamin C; Nash, Robert S; Paskov, Kelley; Sheppard, Travis; Skrzypek, Marek; Weng, Shuai; Wong, Edith; Michael Cherry, J

    2016-01-01

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. To provide a wider scope of genetic and phenotypic variation in yeast, the genome sequences and their corresponding annotations from 11 alternative S. cerevisiae reference strains have been integrated into SGD. Genomic and protein sequence information for genes from these strains are now available on the Sequence and Protein tab of the corresponding Locus Summary pages. We illustrate how these genome sequences can be utilized to aid our understanding of strain-specific functional and phenotypic differences. Database URL: www.yeastgenome.org.

  14. Genome-wide analysis of HPV integration in human cancers reveals recurrent, focal genomic instability

    PubMed Central

    Akagi, Keiko; Li, Jingfeng; Broutian, Tatevik R.; Padilla-Nash, Hesed; Xiao, Weihong; Jiang, Bo; Rocco, James W.; Teknos, Theodoros N.; Kumar, Bhavna; Wangsa, Danny; He, Dandan; Ried, Thomas; Symer, David E.; Gillison, Maura L.

    2014-01-01

    Genomic instability is a hallmark of human cancers, including the 5% caused by human papillomavirus (HPV). Here we report a striking association between HPV integration and adjacent host genomic structural variation in human cancer cell lines and primary tumors. Whole-genome sequencing revealed HPV integrants flanking and bridging extensive host genomic amplifications and rearrangements, including deletions, inversions, and chromosomal translocations. We present a model of “looping” by which HPV integrant-mediated DNA replication and recombination may result in viral–host DNA concatemers, frequently disrupting genes involved in oncogenesis and amplifying HPV oncogenes E6 and E7. Our high-resolution results shed new light on a catastrophic process, distinct from chromothripsis and other mutational processes, by which HPV directly promotes genomic instability. PMID:24201445

  15. Whole Genome Sequencing of Newly Established Pancreatic Cancer Lines Identifies Novel Somatic Mutation (c.2587G>A) in Axon Guidance Receptor Plexin A1 as Enhancer of Proliferation and Invasion

    PubMed Central

    Abisoye-Ogunniyan, Abisola; Waterfall, Joshua J.; Davis, Sean; Killian, J. Keith; Pineda, Marbin; Ray, Satyajit; McCord, Matt R.; Pflicke, Holger; Burkett, Sandra Sczerba; Meltzer, Paul S.; Rudloff, Udo

    2016-01-01

    The genetic profile of human pancreatic cancers harbors considerable heterogeneity, which suggests a possible explanation for the pronounced inefficacy of single therapies in this disease. This observation has led to a belief that custom therapies based on individual tumor profiles are necessary to more effectively treat pancreatic cancer. It has recently been discovered that axon guidance genes are affected by somatic structural variants in up to 25% of human pancreatic cancers. Thus far, however, some of these mutations have only been correlated to survival probability and no function has been assigned to these observed axon guidance gene mutations in pancreatic cancer. In this study we established three novel pancreatic cancer cell lines and performed whole genome sequencing to discover novel mutations in axon guidance genes that may contribute to the cancer phenotype of these cells. We discovered, among other novel somatic variants in axon guidance pathway genes, a novel mutation in the PLXNA1 receptor (c.2587G>A) in newly established cell line SB.06 that mediates oncogenic cues of increased invasion and proliferation in SB.06 cells and increased invasion in 293T cells upon stimulation with the receptor’s natural ligand semaphorin 3A compared to wild type PLXNA1 cells. Mutant PLXNA1 signaling was associated with increased Rho-GTPase and p42/p44 MAPK signaling activity and cytoskeletal expansion, but not changes in E-cadherin, vimentin, or metalloproteinase 9 expression levels. Pharmacologic inhibition of the Rho-GTPase family member CDC42 selectively abrogated PLXNA1 c.2587G>A-mediated increased invasion. These findings provide in-vitro confirmation that somatic mutations in axon guidance genes can provide oncogenic gain-of-function signals and may contribute to pancreatic cancer progression. PMID:26962861

  16. Whole Genome Sequencing of Newly Established Pancreatic Cancer Lines Identifies Novel Somatic Mutation (c.2587G>A) in Axon Guidance Receptor Plexin A1 as Enhancer of Proliferation and Invasion.

    PubMed

    Sorber, Rebecca; Teper, Yaroslav; Abisoye-Ogunniyan, Abisola; Waterfall, Joshua J; Davis, Sean; Killian, J Keith; Pineda, Marbin; Ray, Satyajit; McCord, Matt R; Pflicke, Holger; Burkett, Sandra Sczerba; Meltzer, Paul S; Rudloff, Udo

    2016-01-01

    The genetic profile of human pancreatic cancers harbors considerable heterogeneity, which suggests a possible explanation for the pronounced inefficacy of single therapies in this disease. This observation has led to a belief that custom therapies based on individual tumor profiles are necessary to more effectively treat pancreatic cancer. It has recently been discovered that axon guidance genes are affected by somatic structural variants in up to 25% of human pancreatic cancers. Thus far, however, some of these mutations have only been correlated to survival probability and no function has been assigned to these observed axon guidance gene mutations in pancreatic cancer. In this study we established three novel pancreatic cancer cell lines and performed whole genome sequencing to discover novel mutations in axon guidance genes that may contribute to the cancer phenotype of these cells. We discovered, among other novel somatic variants in axon guidance pathway genes, a novel mutation in the PLXNA1 receptor (c.2587G>A) in newly established cell line SB.06 that mediates oncogenic cues of increased invasion and proliferation in SB.06 cells and increased invasion in 293T cells upon stimulation with the receptor's natural ligand semaphorin 3A compared to wild type PLXNA1 cells. Mutant PLXNA1 signaling was associated with increased Rho-GTPase and p42/p44 MAPK signaling activity and cytoskeletal expansion, but not changes in E-cadherin, vimentin, or metalloproteinase 9 expression levels. Pharmacologic inhibition of the Rho-GTPase family member CDC42 selectively abrogated PLXNA1 c.2587G>A-mediated increased invasion. These findings provide in-vitro confirmation that somatic mutations in axon guidance genes can provide oncogenic gain-of-function signals and may contribute to pancreatic cancer progression.

  17. Draft Genome Sequence for a Urinary Isolate of Nosocomiicoccus ampullae

    PubMed Central

    Hilt, Evann E.; Price, Travis K.; Diebel, Katherine; Putonti, Catherine

    2016-01-01

    A draft genome sequence for a urinary isolate of Nosocomiicoccus ampullae (UMB0853) was investigated. The size of the genome was 1,578,043 bp, with an observed G+C content of 36.1%. Annotation revealed 10 rRNA sequences, 40 tRNA genes, and 1,532 protein-coding sequences. Genome coverage was 727× and consisted of 32 contigs, with an N50 of 109,831 bp. PMID:27856579
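
The assembly statistics quoted in this record (N50 and G+C content) follow standard definitions. The sketch below shows how such numbers are commonly computed. It is only an illustration: the function names and the toy inputs are hypothetical, and it is not the pipeline the authors used.

```python
def n50(contig_lengths):
    """Return the N50: the largest length L such that contigs of
    length >= L together hold at least half of the assembled bases."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")


def gc_content(sequence):
    """Return the G+C fraction of a DNA string, ignoring ambiguous bases (e.g. N)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence has no unambiguous bases")
    return gc / (gc + at)


# Toy inputs (hypothetical, not data from the record above).
toy_lengths = [100, 80, 60, 40, 20]
print(n50(toy_lengths))          # contig length at which half the assembly is reached
print(gc_content("ATGCGCNNAT"))  # G+C fraction over the A/C/G/T bases only
```

With real data, the contig lengths and sequences would typically be read from the assembly FASTA file.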

  18. Sequence analysis of mutations and translocations across breast cancer subtypes

    PubMed Central

    Banerji, Shantanu; Cibulskis, Kristian; Rangel-Escareno, Claudia; Brown, Kristin K.; Carter, Scott L.; Frederick, Abbie M.; Lawrence, Michael S.; Sivachenko, Andrey Y.; Sougnez, Carrie; Zou, Lihua; Cortes, Maria L.; Fernandez-Lopez, Juan C.; Peng, Shouyong; Ardlie, Kristin G.; Auclair, Daniel; Bautista-Piña, Veronica; Duke, Fujiko; Francis, Joshua; Jung, Joonil; Maffuz-Aziz, Antonio; Onofrio, Robert C.; Parkin, Melissa; Pho, Nam H.; Quintanar-Jurado, Valeria; Ramos, Alex H.; Rebollar-Vega, Rosa; Rodriguez-Cuevas, Sergio; Romero-Cordoba, Sandra L.; Schumacher, Steven E.; Stransky, Nicolas; Thompson, Kristin M.; Uribe-Figueroa, Laura; Baselga, Jose; Beroukhim, Rameen; Polyak, Kornelia; Sgroi, Dennis C.; Richardson, Andrea L.; Jimenez-Sanchez, Gerardo; Lander, Eric S.; Gabriel, Stacey B.; Garraway, Levi A.; Golub, Todd R.; Melendez-Zajgla, Jorge; Toker, Alex; Getz, Gad; Hidalgo-Miranda, Alfredo; Meyerson, Matthew

    2014-01-01

    Breast carcinoma is the leading cause of cancer-related mortality in women worldwide with an estimated 1.38 million new cases and 458,000 deaths in 2008 alone [1]. This malignancy represents a heterogeneous group of tumours with characteristic molecular features, prognosis, and responses to available therapy [2–4]. Recurrent somatic alterations in breast cancer have been described including mutations and copy number alterations, notably ERBB2 amplifications, the first successful therapy target defined by a genomic aberration [5]. Prior DNA sequencing studies of breast cancer genomes have revealed additional candidate mutations and gene rearrangements [6–10]. Here we report the whole-exome sequences of DNA from 103 human breast cancers of diverse subtypes from patients in Mexico and Vietnam compared to matched-normal DNA, together with whole-genome sequences of 22 breast cancer/normal pairs. Beyond confirming recurrent somatic mutations in PIK3CA [11], TP53 [6], AKT1 [12], GATA3 [13], and MAP3K1 [10], we discovered recurrent mutations in the CBFB transcription factor gene and deletions of its partner RUNX1. Furthermore, we have identified a recurrent MAGI3-AKT3 fusion enriched in triple-negative breast cancer lacking estrogen and progesterone receptors and ERBB2 expression. The Magi3-Akt3 fusion leads to constitutive activation of Akt kinase, which is abolished by treatment with an ATP-competitive Akt small-molecule inhibitor. PMID:22722202

  19. Selection to sequence: opportunities in fungal genomics

    SciTech Connect

    Baker, Scott E.

    2009-12-01

    Selection is a biological force, causing genotypic and phenotypic change over time. Whether environmental or human induced, selective pressures shape the genotypes and the phenotypes of organisms both in nature and in the laboratory. In nature, selective pressure is highly dynamic and the sum of the environment and other organisms. In the laboratory, selection is used in genetic studies and industrial strain development programs to isolate mutants affecting biological processes of interest to researchers. Selective pressures are important considerations for fungal biology. In the laboratory a number of fungi are used as experimental systems to study a wide range of biological processes and in nature fungi are important pathogens of plants and animals and play key roles in carbon and nitrogen cycling. The continued development of high throughput sequencing technologies makes it possible to characterize at the genomic level, the effect of selective pressures both in the lab and in nature for filamentous fungi as well as other organisms.

  20. Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions fr...

  1. The complete genome sequence of a dog: a perspective.

    PubMed

    Lee, Soohyun; Kasif, Simon

    2006-06-01

    A complete, high-quality reference sequence of a dog genome was recently produced by a team of researchers led by the Broad Institute, achieving another major milestone in deciphering the genomic landscape of mammalian organisms. The genome sequence provides an indispensable resource for comparative analysis and novel insights into dog and human evolution and history. Together with the survey sequence of a poodle previously published in 2003, the two dog genome sequences allowed identification of more than 2.5 million single nucleotide polymorphisms within and between dog breeds, which can be used in evolutionary analysis, behavioral studies and disease gene mapping.(1)

  2. Enhancing cancer clonality analysis with integrative genomics

    PubMed Central

    2015-01-01

    Introduction: It is understood that cancer is a clonal disease initiated by a single cell, and that metastasis, which is the spread of cancer from the primary site, is also initiated by a single cell. The seemingly natural capability of cancer to adapt dynamically in a Darwinian manner is a primary reason for therapeutic failures. Survival advantages may be induced by cancer therapies and also occur as a result of inherent cell and microenvironmental factors. The selected "more fit" clones outmatch their competition and then become dominant in the tumor via propagation of progeny. This clonal expansion leads to relapse, therapeutic resistance and eventually death. The goal of this study is to develop and demonstrate a more detailed clonality approach by utilizing integrative genomics. Methods: Patient tumor samples were profiled by Whole Exome Sequencing (WES) and RNA-seq on an Illumina HiSeq 2500 and methylation profiling was performed on the Illumina Infinium 450K array. STAR and the Haplotype Caller were used for RNA-seq processing. Custom approaches were used for the integration of the multi-omic datasets. Results: Reported are major enhancements to CloneViz, which now provides capabilities enabling a formal tumor multi-dimensional clonality analysis by integrating: i) DNA mutations, ii) RNA expressed mutations, and iii) DNA methylation data. RNA and DNA methylation integration were not previously possible, by CloneViz (previous version) or any other clonality method to date. This new approach, named iCloneViz (integrated CloneViz), employs visualization and quantitative methods, revealing an integrative genomic mutational dissection and traceability (DNA, RNA, epigenetics) through the different layers of molecular structures. Conclusion: The iCloneViz approach can be used for analysis of clonal evolution and mutational dynamics of multi-omic data sets. Revealing tumor clonal complexity in an integrative and quantitative manner facilitates improved mutational

  3. Next-Generation Sequencing for Cancer Diagnostics: a Practical Perspective

    PubMed Central

    Meldrum, Cliff; Doyle, Maria A; Tothill, Richard W

    2011-01-01

    Next-generation sequencing (NGS) is arguably one of the most significant technological advances in the biological sciences of the last 30 years. The second generation sequencing platforms have advanced rapidly to the point that several genomes can now be sequenced simultaneously in a single instrument run in under two weeks. Targeted DNA enrichment methods allow even higher genome throughput at a reduced cost per sample. Medical research has embraced the technology and the cancer field is at the forefront of these efforts given the genetic aspects of the disease. World-wide efforts to catalogue mutations in multiple cancer types are underway and this is likely to lead to new discoveries that will be translated to new diagnostic, prognostic and therapeutic targets. NGS is now maturing to the point where it is being considered by many laboratories for routine diagnostic use. The sensitivity, speed and reduced cost per sample make it a highly attractive platform compared to other sequencing modalities. Moreover, as we identify more genetic determinants of cancer there is a greater need to adopt multi-gene assays that can quickly and reliably sequence complete genes from individual patient samples. Whilst widespread and routine use of whole genome sequencing is likely to be a few years away, there are immediate opportunities to implement NGS for clinical use. Here we review the technology, methods and applications that can be immediately considered and some of the challenges that lie ahead. PMID:22147957

  4. The Reference Genome Sequence of Saccharomyces cerevisiae: Then and Now

    PubMed Central

    Engel, Stacia R.; Dietrich, Fred S.; Fisk, Dianna G.; Binkley, Gail; Balakrishnan, Rama; Costanzo, Maria C.; Dwight, Selina S.; Hitz, Benjamin C.; Karra, Kalpana; Nash, Robert S.; Weng, Shuai; Wong, Edith D.; Lloyd, Paul; Skrzypek, Marek S.; Miyasato, Stuart R.; Simison, Matt; Cherry, J. Michael

    2014-01-01

    The genome of the budding yeast Saccharomyces cerevisiae was the first completely sequenced from a eukaryote. It was released in 1996 as the work of a worldwide effort of hundreds of researchers. In the time since, the yeast genome has been intensively studied by geneticists, molecular biologists, and computational scientists all over the world. Maintenance and annotation of the genome sequence have long been provided by the Saccharomyces Genome Database, one of the original model organism databases. To deepen our understanding of the eukaryotic genome, the S. cerevisiae strain S288C reference genome sequence was updated recently in its first major update since 1996. The new version, called “S288C 2010,” was determined from a single yeast colony using modern sequencing technologies and serves as the anchor for further innovations in yeast genomic science. PMID:24374639

  5. The reference genome sequence of Saccharomyces cerevisiae: then and now.

    PubMed

    Engel, Stacia R; Dietrich, Fred S; Fisk, Dianna G; Binkley, Gail; Balakrishnan, Rama; Costanzo, Maria C; Dwight, Selina S; Hitz, Benjamin C; Karra, Kalpana; Nash, Robert S; Weng, Shuai; Wong, Edith D; Lloyd, Paul; Skrzypek, Marek S; Miyasato, Stuart R; Simison, Matt; Cherry, J Michael

    2014-03-20

    The genome of the budding yeast Saccharomyces cerevisiae was the first completely sequenced from a eukaryote. It was released in 1996 as the work of a worldwide effort of hundreds of researchers. In the time since, the yeast genome has been intensively studied by geneticists, molecular biologists, and computational scientists all over the world. Maintenance and annotation of the genome sequence have long been provided by the Saccharomyces Genome Database, one of the original model organism databases. To deepen our understanding of the eukaryotic genome, the S. cerevisiae strain S288C reference genome sequence was updated recently in its first major update since 1996. The new version, called "S288C 2010," was determined from a single yeast colony using modern sequencing technologies and serves as the anchor for further innovations in yeast genomic science.

  6. Diagnostic and prognostic roles of IRAK1 in hepatocellular carcinoma tissues: an analysis of immunohistochemistry and RNA-sequencing data from the cancer genome atlas

    PubMed Central

    Ye, Zhi-hua; Gao, Li; Wen, Dong-yue; He, Yun; Pang, Yu-yan; Chen, Gang

    2017-01-01

    Background: IRAK1 has been reported to play an essential role in the development of multiple cancers. However, the clinical significance of IRAK1 in hepatocellular carcinoma (HCC) and the underlying molecular mechanism remain unclear. Therefore, we aimed to investigate the role of IRAK1 in the pathogenesis of HCC in this study. Materials and methods: HCC tissues and para-carcinoma tissues were collected for immunohistochemistry (IHC) analysis to evaluate IRAK1 expression. Data on IRAK1 expression were downloaded from The Cancer Genome Atlas (TCGA) for analyzing the clinical significance of IRAK1. Receiver operating characteristic (ROC) curve and survival analyses were carried out to assess the diagnostic and prognostic significance of IRAK1 in IHC and TCGA data. Additionally, we investigated the alteration of the IRAK1 gene in HCC from cBioPortal to generate a network of the interaction between IRAK1 and the neighboring genes. The influence of IRAK1 gene alteration on the prognosis of HCC patients was evaluated by survival analysis. Results: Analysis of both IHC and TCGA data revealed a significant upregulation of IRAK1 in HCC tissues. The IHC analysis revealed there was an increasing trend in IRAK1 expression among normal liver tissues, liver cirrhosis tissues, para-carcinoma tissues and HCC tissues. The ROC curves for IHC and TCGA data demonstrated that IRAK1 exhibited a significant diagnostic value for HCC. Moreover, IRAK1 expression was observed to be associated with tumor size, metastasis and T-stage. The survival analysis indicated that the upregulation of IRAK1 predicted a worse overall survival of HCC. Additionally, data from cBioPortal confirmed that 29% of HCC tissues possessed an alteration of the IRAK1 gene. Conclusion: IRAK1 may act as an oncogene in the development of HCC with its overexpression in HCC. Moreover, IRAK1 might serve as a promising diagnostic and therapeutic target for HCC. PMID:28356759

  7. A sequence-based survey of the complex structural organization of tumor genomes

    SciTech Connect

    Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav; Yu, Peng; Wu, Chunxiao; Huang, Guiqing; Linardopoulou, Elena V.; Trask, Barbara J.; Waldman, Frederic; Costello, Joseph; Pienta, Kenneth J.; Mills, Gordon B.; Bajsarowicz, Krystyna; Kobayashi, Yasuko; Sridharan, Shivaranjani; Paris, Pamela; Tao, Quanzhou; Aerni, Sarah J.; Brown, Raymond P.; Bashir, Ali; Gray, Joe W.; Cheng, Jan-Fang; de Jong, Pieter; Nefedov, Mikhail; Ried, Thomas; Padilla-Nash, Hesed M.; Collins, Colin C.

    2008-04-03

    The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.

  8. CaPSID: A bioinformatics platform for computational pathogen sequence identification in human genomes and transcriptomes

    PubMed Central

    2012-01-01

    Background: It is now well established that nearly 20% of human cancers are caused by infectious agents, and the list of human oncogenic pathogens will grow in the future for a variety of cancer types. Whole tumor transcriptome and genome sequencing by next-generation sequencing technologies presents an unparalleled opportunity for pathogen detection and discovery in human tissues but requires development of new genome-wide bioinformatics tools. Results: Here we present CaPSID (Computational Pathogen Sequence IDentification), a comprehensive bioinformatics platform for identifying, querying and visualizing both exogenous and endogenous pathogen nucleotide sequences in tumor genomes and transcriptomes. CaPSID includes a scalable, high performance database for data storage and a web application that integrates the genome browser JBrowse. CaPSID also provides useful metrics for sequence analysis of pre-aligned BAM files, such as gene and genome coverage, and is optimized to run efficiently on multiprocessor computers with low memory usage. Conclusions: To demonstrate the usefulness and efficiency of CaPSID, we carried out a comprehensive analysis of both a simulated dataset and transcriptome samples from ovarian cancer. CaPSID correctly identified all of the human and pathogen sequences in the simulated dataset, while in the ovarian dataset CaPSID’s predictions were successfully validated in vitro. PMID:22901030

  9. A taste of pineapple evolution through genome sequencing.

    PubMed

    Xu, Qing; Liu, Zhong-Jian

    2015-12-01

    The genome sequence assembly of the highly heterozygous Ananas comosus and its varieties is an impressive technical achievement. The sequence opens the door to a greater understanding of pineapple morphology and evolution.

  10. Whole-Genome Sequencing in Outbreak Analysis

    PubMed Central

    Turner, Stephen D.; Riley, Margaret F.; Petri, William A.; Hewlett, Erik L.

    2015-01-01

    In addition to the ever-present concern of medical professionals about epidemics of infectious diseases, the relative ease of access and low cost of obtaining, producing, and disseminating pathogenic organisms or biological toxins mean that bioterrorism activity should also be considered when facing a disease outbreak. Utilization of whole-genome sequencing (WGS) in outbreak analysis facilitates the rapid and accurate identification of virulence factors of the pathogen and can be used to identify the path of disease transmission within a population and provide information on the probable source. Molecular tools such as WGS are being refined and advanced at a rapid pace to provide robust and higher-resolution methods for identifying, comparing, and classifying pathogenic organisms. If these methods of pathogen characterization are properly applied, they will enable an improved public health response whether a disease outbreak was initiated by natural events or by accidental or deliberate human activity. The current application of next-generation sequencing (NGS) technology to microbial WGS and microbial forensics is reviewed. PMID:25876885

  11. Insights from twenty years of bacterial genome sequencing

    SciTech Connect

    Land, Miriam L; Hauser, Loren John; Jun, Se Ran; Nookaew, Intawat; Leuze, Michael Rex; Ahn, Tae-Hyuk; Karpinets, Tatiana V; Lund, Ole; Kora, Guruprasad H; Wassenaar, Trudy; Poudel, Suresh; Ussery, David W

    2015-01-01

    Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genomes is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90% of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families.
Why do we care about bacterial genome
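
The pan- and core genome calculation mentioned in this record can be pictured as set operations over the gene families found in each genome: the pan-genome is the union and the core genome is the intersection. The sketch below is a simplified illustration under that assumption. The strain names, family labels and the `pan_and_core` helper are hypothetical, and real analyses must first cluster genes into families, which is not shown here.

```python
def pan_and_core(families_by_genome):
    """Given {genome_name: iterable of gene-family IDs}, return (pan, core):
    pan  = families present in at least one genome (set union),
    core = families present in every genome (set intersection)."""
    family_sets = [set(families) for families in families_by_genome.values()]
    if not family_sets:
        raise ValueError("at least one genome is required")
    pan = set().union(*family_sets)
    core = set.intersection(*family_sets)
    return pan, core


# Toy input (hypothetical strains and family IDs, not real E. coli data).
toy = {
    "strain_A": ["fam1", "fam2", "fam3"],
    "strain_B": ["fam1", "fam2", "fam4"],
    "strain_C": ["fam1", "fam3", "fam5"],
}
pan, core = pan_and_core(toy)
print(len(pan), sorted(core))
```

As more genomes are added, the core set can only shrink or stay the same while the pan set can only grow or stay the same, which is why a large collection can show a core of a few thousand families alongside a far larger total.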

  12. Insights from 20 years of bacterial genome sequencing.

    PubMed

    Land, Miriam; Hauser, Loren; Jun, Se-Ran; Nookaew, Intawat; Leuze, Michael R; Ahn, Tae-Hyuk; Karpinets, Tatiana; Lund, Ole; Kora, Guruprased; Wassenaar, Trudy; Poudel, Suresh; Ussery, David W

    2015-03-01

    Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genomes is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90% of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families.
Why do we care about bacterial genome

  13. Genome Project Standards in a New Era of Sequencing

    SciTech Connect

    GSC Consortia; HMP Jumpstart Consortia; Chain, P. S. G.; Grafham, D. V.; Fulton, R. S.; FitzGerald, M. G.; Hostetler, J.; Muzny, D.; Detter, J. C.; Ali, J.; Birren, B.; Bruce, D. C.; Buhay, C.; Cole, J. R.; Ding, Y.; Dugan, S.; Field, D.; Garrity, G. M.; Gibbs, R.; Graves, T.; Han, C. S.; Harrison, S. H.; Highlander, S.; Hugenholtz, P.; Khouri, H. M.; Kodira, C. D.; Kolker, E.; Kyrpides, N. C.; Lang, D.; Lapidus, A.; Malfatti, S. A.; Markowitz, V.; Metha, T.; Nelson, K. E.; Parkhill, J.; Pitluck, S.; Qin, X.; Read, T. D.; Schmutz, J.; Sozhamannan, S.; Strausberg, R.; Sutton, G.; Thomson, N. R.; Tiedje, J. M.; Weinstock, G.; Wollam, A.

    2009-06-01

    For over a decade, genome sequences have adhered to only two standards that are relied on for purposes of sequence analysis by interested third parties (1, 2). However, ongoing developments in revolutionary sequencing technologies have resulted in a redefinition of traditional whole genome sequencing that requires a careful reevaluation of such standards. With commercially available 454 pyrosequencing (followed by Illumina, SOLiD, and now Helicos), there has been an explosion of genomes sequenced under the moniker 'draft'; however, these can be very poor quality genomes (due to inherent errors in the sequencing technologies, and the inability of assembly programs to fully address these errors). Further, one can only infer that such draft genomes may be of poor quality by navigating through the databases to find the number and type of reads deposited in sequence trace repositories (and not all genomes have this available), or to identify the number of contigs or genome fragments deposited to the database. The difficulty in assessing the quality of such deposited genomes has created some havoc for genome analysis pipelines and contributed to many wasted hours of (mis)interpretation. These same novel sequencing technologies have also brought an exponential leap in raw sequencing capability, and at greatly reduced prices that have further skewed the time- and cost-ratios of draft data generation versus the painstaking process of improving and finishing a genome. The resulting effect is an ever-widening gap between drafted and finished genomes that only promises to continue (Figure 1); hence there is an urgent need to distinguish good and poor datasets. The sequencing institutes in the authorship, along with the NIH's Human Microbiome Project Jumpstart Consortium (3), strongly believe that a new set of standards is required for genome sequences. The following represents a set of six community-defined categories of genome sequence standards that better reflect the

  14. Genome Wide Characterization of Simple Sequence Repeats in Cucumber

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The whole genome sequence of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with the Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs ...

  15. Finishing The Euchromatic Sequence Of The Human Genome

    SciTech Connect

    Rubin, Edward M.; Lucas, Susan; Richardson, Paul; Rokhsar, Daniel; Pennacchio, Len

    2004-09-07

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers approximately 99% of the euchromatic genome and is accurate to an error rate of approximately 1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.
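
    The accuracy figure quoted in this record can be put on the familiar Phred scale. The snippet below is a minimal sketch, not part of the cited report: it assumes that "1 event per 100,000 bases" can be treated as a per-base error probability, and the function and variable names are hypothetical.

```python
import math


def phred_quality(error_rate: float) -> float:
    """Convert a per-base error probability to a Phred-scaled score, Q = -10 * log10(p)."""
    if not 0.0 < error_rate <= 1.0:
        raise ValueError("error_rate must be in the interval (0, 1]")
    return -10.0 * math.log10(error_rate)


def expected_errors(sequence_length: int, error_rate: float) -> float:
    """Expected number of erroneous positions, assuming errors are independent and uniform."""
    return sequence_length * error_rate


# Figures taken from the abstract (both are stated there as approximate).
build35_length = 2_850_000_000      # 2.85 billion nucleotides
build35_error_rate = 1 / 100_000    # about 1 event per 100,000 bases (assumed per-base)

build35_phred = phred_quality(build35_error_rate)
build35_expected_errors = expected_errors(build35_length, build35_error_rate)

print(f"Phred-scaled accuracy: Q{build35_phred:.0f}")
print(f"Expected erroneous positions: {build35_expected_errors:,.0f}")
```

    Under these assumptions the stated error rate corresponds to roughly Q50, or on the order of tens of thousands of residual errors across the whole sequence.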

  16. Draft Genome Sequence of Cystobacter ferrugineus Strain Cbfe23

    PubMed Central

    Akbar, Shukria; Dowd, Scot E.

    2017-01-01

    ABSTRACT In an effort to explore myxobacterial natural product biosynthetic pathways, the draft genome sequence of Cystobacter ferrugineus strain Cbfe23 has been obtained. Analysis of the genome using antiSMASH suggests a multitude of unique natural product biosynthetic pathways. This genome will contribute to the investigation of secondary metabolism in other myxobacterial species. PMID:28183768

  17. Genome sequence of Brevibacillus laterosporus strain GI-9.

    PubMed

    Sharma, Vikas; Singh, Pradip K; Midha, Samriti; Ranjan, Manish; Korpole, Suresh; Patil, Prabhu B

    2012-03-01

    We report the 5.18-Mb genome sequence of Brevibacillus laterosporus strain GI-9, isolated from a subsurface soil sample during a screen for novel strains producing antimicrobial compounds. The draft genome of this strain will aid in biotechnological exploitation and comparative genomics of Brevibacillus laterosporus strains.

  18. Draft Genome Sequence of Archangium sp. Strain Cb G35

    PubMed Central

    Adaikpoh, Barbara I.; Dowd, Scot E.

    2017-01-01

    ABSTRACT In an effort to explore myxobacterial natural product biosynthetic pathways, the draft genome sequence of Archangium sp. strain Cb G35 has been obtained. Analysis of the genome using antiSMASH predicts 49 natural product biosynthetic pathways. This genome will contribute to the investigation of myxobacterial secondary metabolite biosynthetic pathways. PMID:28232451

  19. Complete Genome Sequence of Staphylococcus pseudintermedius Type Strain LMG 22219

    PubMed Central

    Abouelkhair, Mohamed A.; Riley, Matthew C.; Bemis, David A.

    2017-01-01

    ABSTRACT We report the first complete genome sequence of LMG 22219 (=ON 86T = CCUG 49543T), the Staphylococcus pseudintermedius type strain isolated from feline lung tissue. This sequence information will facilitate phylogenetic comparisons of staphylococcal species and other bacteria at the genome level. PMID:28209834

  20. Draft Genome Sequence of the Pelagic Photoferrotroph Chlorobium phaeoferrooxidans

    PubMed Central

    Hahn, Aria S.; Morgan-Lang, Connor; Thompson, Katherine J.; Simister, Rachel L.; Llirós, Marc; Hirst, Martin; Hallam, Steven J.

    2017-01-01

    ABSTRACT Here, we report the draft genome sequence of Chlorobium phaeoferrooxidans, a photoferrotrophic member of the genus Chlorobium in the phylum Chlorobi. This genome sequence provides insight into the metabolic capacity that underpins photoferrotrophy within low-light-adapted pelagic Chlorobi. PMID:28360175

  1. Complete Genome Sequences of Five Paenibacillus larvae Bacteriophages.

    PubMed

    Sheflo, Michael A; Gardner, Adam V; Merrill, Bryan D; Fisher, Joshua N B; Lunt, Bryce L; Breakwell, Donald P; Grose, Julianne H; Burnett, Sandra H

    2013-11-14

    Paenibacillus larvae is a pathogen of honeybees that causes American foulbrood (AFB). We isolated bacteriophages from soil containing bee debris collected near beehives in Utah. We announce five high-quality complete genome sequences, which represent the first completed genome sequences submitted to GenBank for any P. larvae bacteriophage.

  2. Selection of sequence variants to improve dairy cattle genomic predictions

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genomic prediction reliabilities improved when adding selected sequence variants from run 5 of the 1,000 bull genomes project. High density (HD) imputed genotypes for 26,970 progeny tested Holstein bulls were combined with sequence variants for 444 Holstein animals. The first test included 481,904 c...

  3. Complete Genome Sequence of Lactobacillus plantarum CGMCC 8198

    PubMed Central

    Dong, Qing-Qing; Hu, Hai-Jie; Wang, Qiu-Tong; Gu, Xiang-Chao; Zhou, Hao; Zhou, Wen-Juan; Ni, Xiao-Meng

    2017-01-01

    ABSTRACT We report the complete genome sequence of Lactobacillus plantarum CGMCC 8198, a novel probiotic strain isolated from fermented herbage. We have determined the complete genome sequence of strain L. plantarum CGMCC 8198, which consists of genes that are likely to be involved in dairy fermentation and that have probiotic qualities. PMID:28183756

  4. Almost finished: the complete genome sequence of Mycosphaerella graminicola

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Mycosphaerella graminicola causes septoria tritici blotch of wheat. An 8.9x shotgun sequence of bread wheat strain IPO323 was generated through the Community Sequencing Program of the U.S. Department of Energy’s Joint Genome Institute (JGI), and was finished at the Stanford Human Genome Center. The ...

  5. Draft Genome Sequence of Vibrio (Listonella) anguillarum ATCC 14181

    PubMed Central

    Grim, Christopher J.

    2016-01-01

    We report the draft genome sequence of Vibrio anguillarum ATCC 14181, a Gram-negative, hemolytic, O2 serotype marine bacterium that causes mortality in mariculture species. The availability of this genome sequence will add to our knowledge of diversity and virulence mechanisms of Vibrio anguillarum as well as other pathogenic Vibrio spp. PMID:27795288

  6. Complete genome sequence of Enterobacter aerogenes KCTC 2190.

    PubMed

    Shin, Sang Heum; Kim, Sewhan; Kim, Jae Young; Lee, Soojin; Um, Youngsoon; Oh, Min-Kyu; Kim, Young-Rok; Lee, Jinwon; Yang, Kap-Seok

    2012-05-01

    This is the first complete genome sequence of the Enterobacter aerogenes species. Here we present the genome sequence of E. aerogenes KCTC 2190, which contains 5,280,350 bp with a G + C content of 54.8 mol%, 4,912 protein-coding genes, and 109 structural RNAs.

  7. Initial sequencing and analysis of the human genome.

    PubMed

    Lander, E S; Linton, L M; Birren, B; Nusbaum, C; Zody, M C; Baldwin, J; Devon, K; Dewar, K; Doyle, M; FitzHugh, W; Funke, R; Gage, D; Harris, K; Heaford, A; Howland, J; Kann, L; Lehoczky, J; LeVine, R; McEwan, P; McKernan, K; Meldrim, J; Mesirov, J P; Miranda, C; Morris, W; Naylor, J; Raymond, C; Rosetti, M; Santos, R; Sheridan, A; Sougnez, C; Stange-Thomann, Y; Stojanovic, N; Subramanian, A; Wyman, D; Rogers, J; Sulston, J; Ainscough, R; Beck, S; Bentley, D; Burton, J; Clee, C; Carter, N; Coulson, A; Deadman, R; Deloukas, P; Dunham, A; Dunham, I; Durbin, R; French, L; Grafham, D; Gregory, S; Hubbard, T; Humphray, S; Hunt, A; Jones, M; Lloyd, C; McMurray, A; Matthews, L; Mercer, S; Milne, S; Mullikin, J C; Mungall, A; Plumb, R; Ross, M; Shownkeen, R; Sims, S; Waterston, R H; Wilson, R K; Hillier, L W; McPherson, J D; Marra, M A; Mardis, E R; Fulton, L A; Chinwalla, A T; Pepin, K H; Gish, W R; Chissoe, S L; Wendl, M C; Delehaunty, K D; Miner, T L; Delehaunty, A; Kramer, J B; Cook, L L; Fulton, R S; Johnson, D L; Minx, P J; Clifton, S W; Hawkins, T; Branscomb, E; Predki, P; Richardson, P; Wenning, S; Slezak, T; Doggett, N; Cheng, J F; Olsen, A; Lucas, S; Elkin, C; Uberbacher, E; Frazier, M; Gibbs, R A; Muzny, D M; Scherer, S E; Bouck, J B; Sodergren, E J; Worley, K C; Rives, C M; Gorrell, J H; Metzker, M L; Naylor, S L; Kucherlapati, R S; Nelson, D L; Weinstock, G M; Sakaki, Y; Fujiyama, A; Hattori, M; Yada, T; Toyoda, A; Itoh, T; Kawagoe, C; Watanabe, H; Totoki, Y; Taylor, T; Weissenbach, J; Heilig, R; Saurin, W; Artiguenave, F; Brottier, P; Bruls, T; Pelletier, E; Robert, C; Wincker, P; Smith, D R; Doucette-Stamm, L; Rubenfield, M; Weinstock, K; Lee, H M; Dubois, J; Rosenthal, A; Platzer, M; Nyakatura, G; Taudien, S; Rump, A; Yang, H; Yu, J; Wang, J; Huang, G; Gu, J; Hood, L; Rowen, L; Madan, A; Qin, S; Davis, R W; Federspiel, N A; Abola, A P; Proctor, M J; Myers, R M; Schmutz, J; Dickson, M; Grimwood, J; Cox, D R; Olson, M V; Kaul, R; Raymond, C; Shimizu, N; 
Kawasaki, K; Minoshima, S; Evans, G A; Athanasiou, M; Schultz, R; Roe, B A; Chen, F; Pan, H; Ramser, J; Lehrach, H; Reinhardt, R; McCombie, W R; de la Bastide, M; Dedhia, N; Blöcker, H; Hornischer, K; Nordsiek, G; Agarwala, R; Aravind, L; Bailey, J A; Bateman, A; Batzoglou, S; Birney, E; Bork, P; Brown, D G; Burge, C B; Cerutti, L; Chen, H C; Church, D; Clamp, M; Copley, R R; Doerks, T; Eddy, S R; Eichler, E E; Furey, T S; Galagan, J; Gilbert, J G; Harmon, C; Hayashizaki, Y; Haussler, D; Hermjakob, H; Hokamp, K; Jang, W; Johnson, L S; Jones, T A; Kasif, S; Kaspryzk, A; Kennedy, S; Kent, W J; Kitts, P; Koonin, E V; Korf, I; Kulp, D; Lancet, D; Lowe, T M; McLysaght, A; Mikkelsen, T; Moran, J V; Mulder, N; Pollara, V J; Ponting, C P; Schuler, G; Schultz, J; Slater, G; Smit, A F; Stupka, E; Szustakowki, J; Thierry-Mieg, D; Thierry-Mieg, J; Wagner, L; Wallis, J; Wheeler, R; Williams, A; Wolf, Y I; Wolfe, K H; Yang, S P; Yeh, R F; Collins, F; Guyer, M S; Peterson, J; Felsenfeld, A; Wetterstrand, K A; Patrinos, A; Morgan, M J; de Jong, P; Catanese, J J; Osoegawa, K; Shizuya, H; Choi, S; Chen, Y J; Szustakowki, J

    2001-02-15

    The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

  8. Draft Genome Sequence of Tannerella forsythia Type Strain ATCC 43037.

    PubMed

    Friedrich, Valentin; Pabinger, Stephan; Chen, Tsute; Messner, Paul; Dewhirst, Floyd E; Schäffer, Christina

    2015-06-11

    Tannerella forsythia is an oral pathogen implicated in the development of periodontitis. Here, we report the draft genome sequence of the Tannerella forsythia strain ATCC 43037. The previously available genome of this designation (NCBI reference sequence NC_016610.1) was discovered to be derived from a different strain, FDC 92A2 (= ATCC BAA-2717).

  9. Draft genome sequence of Kocuria rhizophila P7-4.

    PubMed

    Kim, Woo-Jin; Kim, Young-Ok; Kim, Dae-Soo; Choi, Sang-Haeng; Kim, Dong-Wook; Lee, Jun-Seo; Kong, Hee Jeong; Nam, Bo-Hye; Kim, Bong-Seok; Lee, Sang-Jun; Park, Hong-Seog; Chae, Sung-Hwa

    2011-08-01

    We report the draft genome sequence of Kocuria rhizophila P7-4, which was isolated from the intestine of Siganus doliatus caught in the Pacific Ocean. The 2.83-Mb genome sequence consists of 75 large contigs (>100 bp in size) and contains 2,462 predicted protein-coding genes.

  10. Draft Genome Sequence of “Cohnella kolymensis” B-2846

    PubMed Central

    Kudryashova, Ekaterina B.; Ariskina, Elena V.

    2016-01-01

    A draft genome sequence of “Cohnella kolymensis” strain B-2846 was derived using IonTorrent sequencing technology. The size of the assembly and G+C content were in agreement with those of other species of this genus. Characterization of the genome of a novel species of Cohnella will assist in bacterial systematics. PMID:26769947

  11. Genome Sequence of Pasteurella multocida Strain Razi_Pm0001

    PubMed Central

    Tadayon, Keyvan

    2017-01-01

    ABSTRACT We report here the genome sequence of Pasteurella multocida Razi_Pm0001 from bovine origin, isolated in Iran in 1936. The genome has a size of 2,360,663 bp, a G+C content of 40.4%, and is predicted to contain 2,052 coding sequences. PMID:28153892

  12. Complete Genome Sequence of Burkholderia cepacia Strain LO6

    PubMed Central

    Belcaid, Mahdi; Kang, Yun; Tuanyok, Apichai

    2015-01-01

    Burkholderia cepacia strain LO6 is a betaproteobacterium that was isolated from a cystic fibrosis patient. Here we report the 6.4 Mb draft genome sequence assembled into 2 contigs. This genome sequence will aid the transcriptomic profiling of this bacterium and help us to better understand the mechanisms specific to pulmonary infections. PMID:26067955

  13. Complete Genome Sequence of Burkholderia cepacia Strain LO6.

    PubMed

    Belcaid, Mahdi; Kang, Yun; Tuanyok, Apichai; Hoang, Tung T

    2015-06-11

    Burkholderia cepacia strain LO6 is a betaproteobacterium that was isolated from a cystic fibrosis patient. Here we report the 6.4 Mb draft genome sequence assembled into 2 contigs. This genome sequence will aid the transcriptomic profiling of this bacterium and help us to better understand the mechanisms specific to pulmonary infections.

  14. Whole-Genome Sequences of 26 Vibrio cholerae Isolates

    PubMed Central

    Watve, Samit S.; Chande, Aroon T.; Rishishwar, Lavanya; Jordan, I. King

    2016-01-01

    The human pathogen Vibrio cholerae employs several adaptive mechanisms for environmental persistence, including natural transformation and type VI secretion, creating a reservoir for the spread of disease. Here, we report whole-genome sequences of 26 diverse V. cholerae isolates, significantly increasing the sequence diversity of publicly available V. cholerae genomes. PMID:28007852

  15. Draft Genome Sequence of the Suttonella ornithocola Bacterium

    PubMed Central

    Waldman Ben-Asher, Hiba; Yerushalmi, Rebecca; Wachtel, Chaim; Barbiro-Michaely, Efrat

    2017-01-01

    ABSTRACT We report here the draft genome sequence of the Suttonella ornithocola bacterium. To date, this bacterium, found in birds, has undergone only phylogenetic and phenotypic analyses. To our knowledge, this is the first publication of the Suttonella ornithocola genome sequence. The genetic profile provides a basis for further analysis of its infection pathways. PMID:28209820

  16. Draft Genome Sequence of Neurospora crassa Strain FGSC 73

    SciTech Connect

    Baker, Scott E.; Schackwitz, Wendy; Lipzen, Anna; Martin, Joel; Haridas, Sajeet; LaButti, Kurt; Grigoriev, Igor V.; Simmons, Blake A.; McCluskey, Kevin

    2015-03-05

    We report the elucidation of the complete genome of the Neurospora crassa (Shear and Dodge) strain FGSC 73, a mat-a, trp-3 mutant strain. The genome sequence around the idiotypic mating type locus represents the only publicly available sequence for a mat-a strain. 40.42 Megabases are assembled into 358 scaffolds carrying 11,978 gene models.

  17. Draft Genome Sequence of Neurospora crassa Strain FGSC 73

    SciTech Connect

    Baker, Scott E.; Schackwitz, Wendy; Lipzen, Anna; Martin, Joel; Haridas, Sajeet; LaButti, Kurt; Grigoriev, Igor V.; Simmons, Blake A.; McCluskey, Kevin

    2015-04-02

    We report the elucidation of the complete genome of the Neurospora crassa (Shear and Dodge) strain FGSC 73, a mat-a, trp-3 mutant strain. The genome sequence around the idiotypic mating type locus represents the only publicly available sequence for a mat-a strain. 40.42 Megabases are assembled into 358 scaffolds carrying 11,978 gene models.

  18. Complete Genome Sequence of Staphylococcus pseudintermedius Type Strain LMG 22219.

    PubMed

    Abouelkhair, Mohamed A; Riley, Matthew C; Bemis, David A; Kania, Stephen A

    2017-02-16

    We report the first complete genome sequence of LMG 22219 (=ON 86(T) = CCUG 49543(T)), the Staphylococcus pseudintermedius type strain isolated from feline lung tissue. This sequence information will facilitate phylogenetic comparisons of staphylococcal species and other bacteria at the genome level.

  19. Complete genome sequence of ‘Candidatus Liberibacter africanus’

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complete genome sequence of ‘Candidatus Liberibacter africanus’ (Laf), strain ptsapsy, was obtained by an Illumina HiSeq 2000. The Laf genome comprises 1,192,232 nucleotides, 34.5% GC content, 1,141 predicted coding sequences, 44 tRNAs, 3 complete copies of ribosomal RNA genes (16S, 23S and 5S) ...

  20. Microbial genome sequencing using optical mapping and Illumina sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Introduction Optical mapping is a technique in which strands of genomic DNA are digested with one or more restriction enzymes, and a physical map of the genome constructed from the resulting image. In outline, genomic DNA is extracted from a pure culture, linearly arrayed on a specialized glass sli...

  1. Trends in Next-Generation Sequencing and a New Era for Whole Genome Sequencing.

    PubMed

    Park, Sang Tae; Kim, Jayoung

    2016-11-01

    This article is a mini-review that provides a general overview for next-generation sequencing (NGS) and introduces one of the most popular NGS applications, whole genome sequencing (WGS), developed from the expansion of human genomics. NGS technology has brought massively high throughput sequencing data to bear on research questions, enabling a new era of genomic research. Development of bioinformatic software for NGS has provided more opportunities for researchers to use various applications in genomic fields. De novo genome assembly and large scale DNA resequencing to understand genomic variations are popular genomic research tools for processing a tremendous amount of data at low cost. Studies on transcriptomes are now available, from previous-hybridization based microarray methods. Epigenetic studies are also available with NGS applications such as whole genome methylation sequencing and chromatin immunoprecipitation followed by sequencing. Human genetics has faced a new paradigm of research and medical genomics by sequencing technologies since the Human Genome Project. The trend of NGS technologies in human genomics has brought a new era of WGS by enabling the building of human genomes databases and providing appropriate human reference genomes, which is a necessary component of personalized medicine and precision medicine.

  2. Trends in Next-Generation Sequencing and a New Era for Whole Genome Sequencing

    PubMed Central

    2016-01-01

    This article is a mini-review that provides a general overview for next-generation sequencing (NGS) and introduces one of the most popular NGS applications, whole genome sequencing (WGS), developed from the expansion of human genomics. NGS technology has brought massively high throughput sequencing data to bear on research questions, enabling a new era of genomic research. Development of bioinformatic software for NGS has provided more opportunities for researchers to use various applications in genomic fields. De novo genome assembly and large scale DNA resequencing to understand genomic variations are popular genomic research tools for processing a tremendous amount of data at low cost. Studies on transcriptomes are now available, from previous-hybridization based microarray methods. Epigenetic studies are also available with NGS applications such as whole genome methylation sequencing and chromatin immunoprecipitation followed by sequencing. Human genetics has faced a new paradigm of research and medical genomics by sequencing technologies since the Human Genome Project. The trend of NGS technologies in human genomics has brought a new era of WGS by enabling the building of human genomes databases and providing appropriate human reference genomes, which is a necessary component of personalized medicine and precision medicine. PMID:27915479

  3. Ten years of bacterial genome sequencing: comparative-genomics-based discoveries.

    PubMed

    Binnewies, Tim T; Motro, Yair; Hallin, Peter F; Lund, Ole; Dunn, David; La, Tom; Hampson, David J; Bellgard, Matthew; Wassenaar, Trudy M; Ussery, David W

    2006-07-01

    It has been more than 10 years since the first bacterial genome sequence was published. Hundreds of bacterial genome sequences are now available for comparative genomics, and searching a given protein against more than a thousand genomes will soon be possible. The subject of this review will address a relatively straightforward question: "What have we learned from this vast amount of new genomic data?" Perhaps one of the most important lessons has been that genetic diversity, at the level of large-scale variation amongst even genomes of the same species, is far greater than was thought. The classical textbook view of evolution relying on the relatively slow accumulation of mutational events at the level of individual bases scattered throughout the genome has changed. One of the most obvious conclusions from examining the sequences from several hundred bacterial genomes is the enormous amount of diversity--even in different genomes from the same bacterial species. This diversity is generated by a variety of mechanisms, including mobile genetic elements and bacteriophages. An examination of the 20 Escherichia coli genomes sequenced so far dramatically illustrates this, with the genome size ranging from 4.6 to 5.5 Mbp; much of the variation appears to be of phage origin. This review also addresses mobile genetic elements, including pathogenicity islands and the structure of transposable elements. There are at least 20 different methods available to compare bacterial genomes. Metagenomics offers the chance to study genomic sequences found in ecosystems, including genomes of species that are difficult to culture. It has become clear that a genome sequence represents more than just a collection of gene sequences for an organism and that information concerning the environment and growth conditions for the organism are important for interpretation of the genomic data. 
    The newly proposed Minimal Information about a Genome Sequence standard has been developed to obtain this

  4. De novo assembly of a bell pepper endornavirus genome sequence using RNA sequencing data.

    PubMed

    Jo, Yeonhwa; Choi, Hoseng; Cho, Won Kyong

    2015-03-19

    The genus Endornavirus is a double-stranded RNA virus that infects a wide range of hosts. In this study, we report on the de novo assembly of a bell pepper endornavirus genome sequence by RNA sequencing (RNA-Seq). Our result demonstrates the successful application of RNA-Seq to obtain a complete viral genome sequence from the transcriptome data.

  5. Draft sequences of the radish (Raphanus sativus L.) genome.

    PubMed

    Kitashiba, Hiroyasu; Li, Feng; Hirakawa, Hideki; Kawanabe, Takahiro; Zou, Zhongwei; Hasegawa, Yoichi; Tonosaki, Kaoru; Shirasawa, Sachiko; Fukushima, Aki; Yokoi, Shuji; Takahata, Yoshihito; Kakizaki, Tomohiro; Ishida, Masahiko; Okamoto, Shunsuke; Sakamoto, Koji; Shirasawa, Kenta; Tabata, Satoshi; Nishio, Takeshi

    2014-10-01

    Radish (Raphanus sativus L., n = 9) is one of the major vegetables in Asia. Since the genomes of Brassica and related species including radish underwent genome rearrangement, it is quite difficult to perform functional analysis based on the reported genomic sequence of Brassica rapa. Therefore, we performed genome sequencing of radish. Short reads of genomic sequences of 191.1 Gb were obtained by next-generation sequencing (NGS) for a radish inbred line, and 76,592 scaffolds of ≥ 300 bp were constructed along with the bacterial artificial chromosome-end sequences. Finally, the whole draft genomic sequence of 402 Mb spanning 75.9% of the estimated genomic size and containing 61,572 predicted genes was obtained. Subsequently, 221 single nucleotide polymorphism markers and 768 PCR-RFLP markers were used together with the 746 markers produced in our previous study for the construction of a linkage map. The map was combined further with another radish linkage map constructed mainly with expressed sequence tag-simple sequence repeat markers into a high-density integrated map of 1,166 cM with 2,553 DNA markers. A total of 1,345 scaffolds were assigned to the linkage map, spanning 116.0 Mb. Bulked PCR products amplified by 2,880 primer pairs were sequenced by NGS, and SNPs in eight inbred lines were identified.
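
    The summary statistics in this record imply a few derived quantities that make the assembly easier to compare with others. The snippet below is a rough sketch using only the numbers quoted in the abstract; the variable names are hypothetical, and the marker-density figure is a simple ratio that ignores the division of the map into linkage groups.

```python
# Values quoted in the abstract.
assembly_mb = 402.0          # total draft assembly size
coverage_fraction = 0.759    # share of the estimated genome spanned by the assembly
anchored_mb = 116.0          # scaffold sequence assigned to the linkage map
map_length_cm = 1166.0       # length of the integrated linkage map
marker_count = 2553          # DNA markers on the integrated map

# Genome size implied by "402 Mb spanning 75.9% of the estimated genomic size".
estimated_genome_mb = assembly_mb / coverage_fraction

# Share of the assembly that was placed on the linkage map.
anchored_fraction = anchored_mb / assembly_mb

# Crude marker density: map length divided by marker count.
cm_per_marker = map_length_cm / marker_count

print(f"Implied genome size: {estimated_genome_mb:.1f} Mb")
print(f"Assembly anchored to map: {anchored_fraction:.1%}")
print(f"Average map length per marker: {cm_per_marker:.2f} cM")
```

    On these figures the estimated genome is about 530 Mb, and a little under 30% of the assembled sequence was anchored to the map.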

  6. Genomic hallmarks of localized, non-indolent prostate cancer.

    PubMed

    Fraser, Michael; Sabelnykova, Veronica Y; Yamaguchi, Takafumi N; Heisler, Lawrence E; Livingstone, Julie; Huang, Vincent; Shiah, Yu-Jia; Yousif, Fouad; Lin, Xihui; Masella, Andre P; Fox, Natalie S; Xie, Michael; Prokopec, Stephenie D; Berlin, Alejandro; Lalonde, Emilie; Ahmed, Musaddeque; Trudel, Dominique; Luo, Xuemei; Beck, Timothy A; Meng, Alice; Zhang, Junyan; D'Costa, Alister; Denroche, Robert E; Kong, Haiying; Espiritu, Shadrielle Melijah G; Chua, Melvin L K; Wong, Ada; Chong, Taryne; Sam, Michelle; Johns, Jeremy; Timms, Lee; Buchner, Nicholas B; Orain, Michèle; Picard, Valérie; Hovington, Helène; Murison, Alexander; Kron, Ken; Harding, Nicholas J; P'ng, Christine; Houlahan, Kathleen E; Chu, Kenneth C; Lo, Bryan; Nguyen, Francis; Li, Constance H; Sun, Ren X; de Borja, Richard; Cooper, Christopher I; Hopkins, Julia F; Govind, Shaylan K; Fung, Clement; Waggott, Daryl; Green, Jeffrey; Haider, Syed; Chan-Seng-Yue, Michelle A; Jung, Esther; Wang, Zhiyuan; Bergeron, Alain; Pra, Alan Dal; Lacombe, Louis; Collins, Colin C; Sahinalp, Cenk; Lupien, Mathieu; Fleshner, Neil E; He, Housheng H; Fradet, Yves; Tetu, Bernard; van der Kwast, Theodorus; McPherson, John D; Bristow, Robert G; Boutros, Paul C

    2017-01-19

    Prostate tumours are highly variable in their response to therapies, but clinically available prognostic factors can explain only a fraction of this heterogeneity. Here we analysed 200 whole-genome sequences and 277 additional whole-exome sequences from localized, non-indolent prostate tumours with similar clinical risk profiles, and carried out RNA and methylation analyses in a subset. These tumours had a paucity of clinically actionable single nucleotide variants, unlike those seen in metastatic disease. Rather, a significant proportion of tumours harboured recurrent non-coding aberrations, large-scale genomic rearrangements, and alterations in which an inversion repressed transcription within its boundaries. Local hypermutation events were frequent, and correlated with specific genomic profiles. Numerous molecular aberrations were prognostic for disease recurrence, including several DNA methylation events, and a signature comprised of these aberrations outperformed well-described prognostic biomarkers. We suggest that intensified treatment of genomically aggressive localized prostate cancer may improve cure rates.

  7. Genome Sequence of Human Rhinovirus A22, Strain Lancaster/2015

    PubMed Central

    Atkinson, Kate V.; Bishop, Lisa A.; Rhodes, Glenn; Salez, Nicolas; McEwan, Neil R.; Hegarty, Matthew J.; Robey, Julie; Harding, Nicola; Wetherell, Simon; Lauder, Robert M.; Pickup, Roger W.; Wilkinson, Mark

    2017-01-01

    ABSTRACT The genome of human rhinovirus A22 (HRV-A22) was assembled by deep sequencing RNA samples from nasopharyngeal swabs. The assembled genome is 8.7% divergent from the HRV-A22 reference strain over its full length, and it is only the second full-length genome sequence for HRV-A22. The new strain is designated strain HRV-A22/Lancaster/2015. PMID:28336607

  8. Whole-Genome Sequences of Thirteen Isolates of Borrelia burgdorferi

    SciTech Connect

    Schutzer S. E.; Dunn J.; Fraser-Liggett, C. M.; Casjens, S. R.; Qiu, W.-G.; Mongodin, E. F.; Luft, B. J.

    2011-02-01

    Borrelia burgdorferi is a causative agent of Lyme disease in North America and Eurasia. The first complete genome sequence of B. burgdorferi strain B31, available for more than a decade, has assisted research on the pathogenesis of Lyme disease. Because a single genome sequence is not sufficient to understand the relationship between genotypic and geographic variation and disease phenotype, we determined the whole-genome sequences of 13 additional B. burgdorferi isolates that span the range of natural variation. These sequences should allow improved understanding of pathogenesis and provide a foundation for novel detection, diagnosis, and prevention strategies.

  9. Deep sequencing of 10,000 human genomes

    PubMed Central

    Pierce, Levi C. T.; Biggs, William H.; di Iulio, Julia; Wong, Emily H. M.; Fabani, Martin M.; Kirkness, Ewen F.; Moustafa, Ahmed; Shah, Naisha; Xie, Chao; Brewerton, Suzanne C.; Bulsara, Nadeem; Garner, Chad; Metzker, Gary; Sandoval, Efren; Perkins, Brad A.; Och, Franz J.; Turpaz, Yaron; Venter, J. Craig

    2016-01-01

    We report on the sequencing of 10,545 human genomes at 30×–40× coverage with an emphasis on quality metrics and novel variant and sequence discovery. We find that 84% of an individual human genome can be sequenced confidently. This high-confidence region includes 91.5% of exon sequence and 95.2% of known pathogenic variant positions. We present the distribution of over 150 million single-nucleotide variants in the coding and noncoding genome. Each newly sequenced genome contributes an average of 8,579 novel variants. In addition, each genome carries on average 0.7 Mb of sequence that is not found in the main build of the hg38 reference genome. The density of this catalog of variation allowed us to construct high-resolution profiles that define genomic sites that are highly intolerant of genetic variation. These results indicate that the data generated by deep genome sequencing is of the quality necessary for clinical use. PMID:27702888

  10. Genomic libraries: II. Subcloning, sequencing, and assembling large-insert genomic DNA clones.

    PubMed

    Quail, Mike A; Matthews, Lucy; Sims, Sarah; Lloyd, Christine; Beasley, Helen; Baxter, Simon W

    2011-01-01

    Sequencing large insert clones to completion is useful for characterizing specific genomic regions, identifying haplotypes, and closing gaps in whole genome sequencing projects. Despite being a standard technique in molecular laboratories, DNA sequencing using the Sanger method can be highly problematic when complex secondary structures or sequence repeats are encountered in genomic clones. Here, we describe methods to isolate DNA from a large insert clone (fosmid or BAC), subclone the sample, and sequence the region to the highest industry standard. Troubleshooting solutions for sequencing difficult templates are discussed.

  11. Chapter 27 -- Breast Cancer Genomics, Section VI, Pathology and Biological Markers of Invasive Breast Cancer

    SciTech Connect

    Spellman, Paul T.; Heiser, Laura; Gray, Joe W.

    2009-06-18

    Breast cancer is predominantly a disease of the genome with cancers arising and progressing through accumulation of aberrations that alter the genome - by changing DNA sequence, copy number, and structure in ways that contribute to diverse aspects of cancer pathophysiology. Classic examples of genomic events that contribute to breast cancer pathophysiology include inherited mutations in BRCA1, BRCA2, TP53, and CHK2 that contribute to the initiation of breast cancer, amplification of ERBB2 (formerly HER2) and mutations of elements of the PI3-kinase pathway that activate aspects of epidermal growth factor receptor (EGFR) signaling and deletion of CDKN2A/B that contributes to cell cycle deregulation and genome instability. It is now apparent that accumulation of these aberrations is a time-dependent process that accelerates with age. Although American women living to an age of 85 have a 1 in 8 chance of developing breast cancer, the incidence of cancer in women younger than 30 years is uncommon. This is consistent with a multistep cancer progression model whereby mutation and selection drive the tumor's development, analogous to traditional Darwinian evolution. In the case of cancer, the driving events are changes in sequence, copy number, and structure of DNA and alterations in chromatin structure or other epigenetic marks. Our understanding of the genetic, genomic, and epigenomic events that influence the development and progression of breast cancer is increasing at a remarkable rate through application of powerful analysis tools that enable genome-wide analysis of DNA sequence and structure, copy number, allelic loss, and epigenomic modification. Application of these techniques to elucidation of the nature and timing of these events is enriching our understanding of mechanisms that increase breast cancer susceptibility, enable tumor initiation and progression to metastatic disease, and determine therapeutic response or resistance. 
These studies also reveal the

  12. Using Partial Genomic Fosmid Libraries for Sequencing Complete Organellar Genomes

    SciTech Connect

    McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.; Kuehl, Jennifer V.; Boore, Jeffrey L.; dePamphilis, Claude W.

    2005-08-26

    Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.

  13. Endometrial and acute myeloid leukemia cancer genomes characterized

    Cancer.gov

    Two studies from The Cancer Genome Atlas (TCGA) program reveal details about the genomic landscapes of acute myeloid leukemia (AML) and endometrial cancer. Both provide new insights into the molecular underpinnings of these cancers.

  14. Real-time, portable genome sequencing for Ebola surveillance.

    PubMed

    Quick, Joshua; Loman, Nicholas J; Duraffour, Sophie; Simpson, Jared T; Severi, Ettore; Cowley, Lauren; Bore, Joseph Akoi; Koundouno, Raymond; Dudas, Gytis; Mikhail, Amy; Ouédraogo, Nobila; Afrough, Babak; Bah, Amadou; Baum, Jonathan H J; Becker-Ziaja, Beate; Boettcher, Jan Peter; Cabeza-Cabrerizo, Mar; Camino-Sánchez, Álvaro; Carter, Lisa L; Doerrbecker, Juliane; Enkirch, Theresa; García-Dorival, Isabel; Hetzelt, Nicole; Hinzmann, Julia; Holm, Tobias; Kafetzopoulou, Liana Eleni; Koropogui, Michel; Kosgey, Abigael; Kuisma, Eeva; Logue, Christopher H; Mazzarelli, Antonio; Meisel, Sarah; Mertens, Marc; Michel, Janine; Ngabo, Didier; Nitzsche, Katja; Pallasch, Elisa; Patrono, Livia Victoria; Portmann, Jasmine; Repits, Johanna Gabriella; Rickett, Natasha Y; Sachse, Andreas; Singethan, Katrin; Vitoriano, Inês; Yemanaberhan, Rahel L; Zekeng, Elsa G; Racine, Trina; Bello, Alexander; Sall, Amadou Alpha; Faye, Ousmane; Faye, Oumar; Magassouba, N'Faly; Williams, Cecelia V; Amburgey, Victoria; Winona, Linda; Davis, Emily; Gerlach, Jon; Washington, Frank; Monteil, Vanessa; Jourdain, Marine; Bererd, Marion; Camara, Alimou; Somlare, Hermann; Camara, Abdoulaye; Gerard, Marianne; Bado, Guillaume; Baillet, Bernard; Delaune, Déborah; Nebie, Koumpingnin Yacouba; Diarra, Abdoulaye; Savane, Yacouba; Pallawo, Raymond Bernard; Gutierrez, Giovanna Jaramillo; Milhano, Natacha; Roger, Isabelle; Williams, Christopher J; Yattara, Facinet; Lewandowski, Kuiama; Taylor, James; Rachwal, Phillip; Turner, Daniel J; Pollakis, Georgios; Hiscox, Julian A; Matthews, David A; O'Shea, Matthew K; Johnston, Andrew McD; Wilson, Duncan; Hutley, Emma; Smit, Erasmus; Di Caro, Antonino; Wölfel, Roman; Stoecker, Kilian; Fleischmann, Erna; Gabriel, Martin; Weller, Simon A; Koivogui, Lamine; Diallo, Boubacar; Keïta, Sakoba; Rambaut, Andrew; Formenty, Pierre; Günther, Stephan; Carroll, Miles W

    2016-02-11

    The Ebola virus disease epidemic in West Africa is the largest on record, responsible for over 28,599 cases and more than 11,299 deaths. Genome sequencing in viral outbreaks is desirable to characterize the infectious agent and determine its evolutionary rate. Genome sequencing also allows the identification of signatures of host adaptation, identification and monitoring of diagnostic targets, and characterization of responses to vaccines and treatments. The Ebola virus (EBOV) genome substitution rate in the Makona strain has been estimated at between 0.87 × 10^-3 and 1.42 × 10^-3 mutations per site per year. This is equivalent to 16-27 mutations in each genome, meaning that sequences diverge rapidly enough to identify distinct sub-lineages during a prolonged epidemic. Genome sequencing provides a high-resolution view of pathogen evolution and is increasingly sought after for outbreak surveillance. Sequence data may be used to guide control measures, but only if the results are generated quickly enough to inform interventions. Genomic surveillance during the epidemic has been sporadic owing to a lack of local sequencing capacity coupled with practical difficulties transporting samples to remote sequencing facilities. To address this problem, here we devise a genomic surveillance system that utilizes a novel nanopore DNA sequencing instrument. In April 2015 this system was transported in standard airline luggage to Guinea and used for real-time genomic surveillance of the ongoing epidemic. We present sequence data and analysis of 142 EBOV samples collected during the period March to October 2015. We were able to generate results less than 24 h after receiving an Ebola-positive sample, with the sequencing process taking as little as 15-60 min. We show that real-time genomic surveillance is possible in resource-limited settings and can be established rapidly to monitor outbreaks.
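
    The "16-27 mutations in each genome" figure in the abstract above can be reproduced by multiplying the per-site substitution rate by the genome length. The following is a minimal sketch only: the genome length of 18,959 nucleotides is an assumption (a commonly cited length for EBOV Makona reference sequences, not stated in the abstract), as is the one-year interval, and the function name is hypothetical.

```python
# Assumed EBOV genome length in nucleotides; this value is NOT given in the abstract.
GENOME_LENGTH_NT = 18_959


def expected_substitutions(rate_per_site_per_year: float,
                           genome_length: int = GENOME_LENGTH_NT,
                           years: float = 1.0) -> float:
    """Expected number of substitutions per genome over the given interval."""
    return rate_per_site_per_year * genome_length * years


# Lower and upper rate estimates quoted in the abstract.
low = expected_substitutions(0.87e-3)
high = expected_substitutions(1.42e-3)
print(f"{low:.1f} to {high:.1f} substitutions per genome per year")
```

    Under these assumptions the two bounds round to 16 and 27, which matches the range quoted in the abstract.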

  15. Returning genome sequences to research participants: Policy and practice

    PubMed Central

    2017-01-01

    Despite advances in genomic science stimulating an explosion of literature around returning health-related findings, the possibility of returning entire genome sequences to individual research participants has not been widely considered. Through direct involvement in large-scale translational genomics studies, we have identified a number of logistical challenges that would need to be overcome prior to returning individual genome sequence data, including verifying that the data belong to the requestor and providing appropriate informatics support. In addition, we identify a number of ethico-legal issues that require careful consideration, including returning data to family members, mitigating against unintended consequences, and ensuring appropriate governance. Finally, recognising that there is an opportunity cost to addressing these issues, we make some specific pragmatic suggestions for studies that are considering whether to share individual genomic datasets with individual study participants. If data are shared, research should be undertaken into the personal, familial and societal impact of receiving individual genome sequence data. PMID:28317033

  16. Identification of optimum sequencing depth especially for de novo genome assembly of small genomes using next generation sequencing data.

    PubMed

    Desai, Aarti; Marwah, Veer Singh; Yadav, Akshay; Jha, Vineet; Dhaygude, Kishor; Bangar, Ujwala; Kulkarni, Vivek; Jere, Abhay

    2013-01-01

    Next Generation Sequencing (NGS) is a disruptive technology that has found widespread acceptance in the life sciences research community. The high throughput and low cost of sequencing have encouraged researchers to undertake ambitious genomic projects, especially in de novo genome sequencing. Currently, NGS systems generate sequence data as short reads and de novo genome assembly using these short reads is computationally very intensive. Due to lower cost of sequencing and higher throughput, NGS systems now provide the ability to sequence genomes at high depth. However, currently no report is available highlighting the impact of high sequence depth on genome assembly using real data sets and multiple assembly algorithms. Recently, some studies have evaluated the impact of sequence coverage, error rate and average read length on genome assembly using multiple assembly algorithms; however, these evaluations were performed using simulated datasets. One limitation of using simulated datasets is that variables such as error rates, read length and coverage, which are known to impact genome assembly, are carefully controlled. Hence, this study was undertaken to identify the minimum depth of sequencing required for de novo assembly for different sized genomes using graph based assembly algorithms and real datasets. Illumina reads for E. coli (4.6 Mb), S. kudriavzevii (11.18 Mb) and C. elegans (100 Mb) were assembled using SOAPdenovo, Velvet, ABySS, Meraculous and IDBA-UD. Our analysis shows that 50X is the optimum read depth for assembling these genomes using all assemblers except Meraculous, which requires 100X read depth. Moreover, our analysis shows that de novo assembly from 50X read data requires only 6-40 GB RAM depending on the genome size and assembly algorithm used. We believe that this information can be extremely valuable for researchers in designing experiments and multiplexing, which will enable optimum utilization of sequencing as well as analysis resources.
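
    To make the depth figures in the abstract above concrete, read depth is conventionally computed as total sequenced bases divided by genome size, so the number of reads needed for a target depth is depth × genome size / read length. The sketch below applies that relation to the three genome sizes quoted in the abstract. The 100 bp read length is an assumption chosen for illustration (the abstract does not state the read length), and the function name is hypothetical.

```python
import math

# Genome sizes in base pairs, as quoted in the abstract.
GENOME_SIZES_BP = {
    "E. coli": 4_600_000,
    "S. kudriavzevii": 11_180_000,
    "C. elegans": 100_000_000,
}

READ_LENGTH_BP = 100  # assumed for illustration; not given in the abstract


def reads_for_depth(genome_size_bp: int, depth: float, read_length_bp: int) -> int:
    """Smallest read count whose total bases reach the target depth."""
    return math.ceil(depth * genome_size_bp / read_length_bp)


for name, size in GENOME_SIZES_BP.items():
    n = reads_for_depth(size, depth=50, read_length_bp=READ_LENGTH_BP)
    print(f"{name}: {n:,} reads for 50X")
```

    Because the read count scales linearly with both depth and genome size, the same function can be used to estimate how many samples fit on a run of known yield when multiplexing.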

  17. Reference genome sequence of the model plant Setaria

    SciTech Connect

    Bennetzen, Jeffrey L; Yang, Xiaohan; Ye, Chuyu; Tuskan, Gerald A

    2012-01-01

    We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

  18. Reference genome sequence of the model plant Setaria

    SciTech Connect

    Bennetzen, Jeffrey L; Schmutz, Jeremy; Wang, Hao; Percifield, Ryan; Hawkins, Jennifer; Pontaroli, Ana C.; Estep, Matt; Feng, Liang; Vaughn, Justin N; Grimwood, Jane; Jenkins, Jerry; Barry, Kerrie; Lindquist, Erika; Hellsten, Uffe; Deshpande, Shweta; Wang, Xuewen; Wu, Xiaomei; Mitros, Therese; Triplett, Jimmy; Yang, Xiaohan; Ye, Chuyu; Mauro-Herrera, Margarita; Wang, Lin; Li, Pinghua; Sharma, Manoj; Sharma, Rita; Ronald, Pamela; Panaud, Olivier; Kellogg, Elizabeth A.; Brutnell, Thomas P.; Doust, Andrew N.; Tuskan, Gerald A; Rokhsar, Daniel; Devos, Katrien M

    2012-01-01

    We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

  19. Genomic Biomarkers for Breast Cancer Risk

    PubMed Central

    Walsh, Michael F.; Nathanson, Katherine L.; Couch, Fergus J.

    2016-01-01

    Clinical risk assessment for cancer predisposition includes a three-generation pedigree and physical examination to identify inherited syndromes. Additionally, genetic and genomic biomarkers may identify individuals with a constitutional basis for their disease that may not be evident clinically. Genomic biomarker testing may detect molecular variations in single genes, panels of genes, or entire genomes. The strength of evidence for the association of a genomic biomarker with disease risk may be weak or strong. The factors contributing to clinical validity and utility of genomic biomarkers include functional laboratory analyses and genetic epidemiologic evidence. Genomic biomarkers may be further classified as having low, moderate or high penetrance based on the likelihood of disease. Genomic biomarkers for breast cancer comprise rare highly penetrant mutations of genes such as BRCA1 or BRCA2, moderately penetrant mutations of genes such as CHEK2, as well as more common genomic variants, including single nucleotide polymorphisms, associated with modest effect sizes. When applied in the context of appropriate counseling and interpretation, identification of genomic biomarkers of inherited risk for breast cancer may decrease morbidity and mortality, allow for definitive prevention through assisted reproduction, and serve as a guide to targeted therapy. PMID:26987529

  20. Complete Genome Sequence of Corynebacterium minutissimum, an Opportunistic Pathogen and the Causative Agent of Erythrasma.

    PubMed

    Penton, Patricia K; Tyagi, Eishita; Humrighouse, Ben W; McQuiston, John R

    2015-03-19

    Corynebacterium minutissimum was first isolated in 1961 from infection sites of patients presenting with erythrasma, a common cutaneous infection characterized by a rash. Since its discovery, C. minutissimum has been identified as an opportunistic pathogen in immunosuppressed cancer and HIV patients. Here, we report the whole-genome sequence of C. minutissimum.

  1. Genome Science: A Video Tour of the Washington University Genome Sequencing Center for High School and Undergraduate Students

    ERIC Educational Resources Information Center

    Flowers, Susan K.; Easter, Carla; Holmes, Andrea; Cohen, Brian; Bednarski, April E.; Mardis, Elaine R.; Wilson, Richard K.; Elgin, Sarah C. R.

    2005-01-01

    Sequencing of the human genome has ushered in a new era of biology. The technologies developed to facilitate the sequencing of the human genome are now being applied to the sequencing of other genomes. In 2004, a partnership was formed between Washington University School of Medicine Genome Sequencing Center's Outreach Program and Washington…

  2. Complete genome sequence of Gordonia bronchialis type strain (3410T)

    SciTech Connect

    Ivanova, N; Sikorski, Johannes; Jando, Marlen; Lapidus, Alla L.; Nolan, Matt; Glavina Del Rio, Tijana; Tice, Hope; Copeland, A; Cheng, Jan-Fang; Chen, Feng; Bruce, David; Goodwin, Lynne A.; Pitluck, Sam; Mavromatis, K; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam L; Hauser, Loren John; Chang, Yun-Juan; Jeffries, Cynthia; Chain, Patrick S. G.; Saunders, Elizabeth H; Han, Cliff; Detter, J C; Brettin, Thomas S; Rohde, Manfred; Goker, Markus; Bristow, James; Eisen, Jonathan; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C

    2010-01-01

    Gordonia bronchialis Tsukamura 1971 is the type species of the genus. G. bronchialis is a human-pathogenic organism that has been isolated from a large variety of human tissues. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the family Gordoniaceae. The 5,290,012 bp long genome with its 4,944 protein-coding and 55 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  3. Complete genome sequence of Acidimicrobium ferrooxidans type strain (ICPT)

    SciTech Connect

    Clum, Alicia; Nolan, Matt; Lang, Elke; Glavina Del Rio, Tijana; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavrommatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Goker, Markus; Spring, Stefan; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C.; Chain, Patrick; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter; Lapidus, Alla

    2009-05-20

    Acidimicrobium ferrooxidans (Clark and Norris 1996) is the sole and type species of the genus, which until recently was the only genus within the actinobacterial family Acidimicrobiaceae and in the order Acidimicrobiales. Rapid oxidation of iron pyrite during autotrophic growth in the absence of an enhanced CO2 concentration is characteristic of A. ferrooxidans. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of the order Acidimicrobiales, and the 2,158,157 bp long single replicon genome with its 2,038 protein-coding and 54 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  4. Complete genome sequence of Gordonia bronchialis type strain (3410T)

    PubMed Central

    Ivanova, Natalia; Sikorski, Johannes; Jando, Marlen; Lapidus, Alla; Nolan, Matt; Lucas, Susan; Del Rio, Tijana Glavina; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Chen, Feng; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Mavromatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D.; Chain, Patrick; Saunders, Elizabeth; Han, Cliff; Detter, John C.; Brettin, Thomas; Rohde, Manfred; Göker, Markus; Bristow, Jim; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C.

    2010-01-01

    Gordonia bronchialis Tsukamura 1971 is the type species of the genus. G. bronchialis is a human-pathogenic organism that has been isolated from a large variety of human tissues. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the family Gordoniaceae. The 5,290,012 bp long genome with its 4,944 protein-coding and 55 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304674

  5. Complete genome sequence of Thermomonospora curvata type strain (B9)

    SciTech Connect

    Chertkov, Olga; Sikorski, Johannes; Nolan, Matt; Lapidus, Alla L.; Lucas, Susan; Glavina Del Rio, Tijana; Tice, Hope; Cheng, Jan-Fang; Goodwin, Lynne A.; Pitluck, Sam; Liolios, Konstantinos; Ivanova, N; Mavromatis, K; Mikhailova, Natalia; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Ngatchou, Olivier Duplex; Land, Miriam L; Hauser, Loren John; Chang, Yun-Juan; Jeffries, Cynthia; Brettin, Thomas S; Han, Cliff; Detter, J. Chris; Rohde, Manfred; Goker, Markus; Woyke, Tanja; Bristow, James; Eisen, Jonathan; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C

    2011-01-01

    Thermomonospora curvata Henssen 1957 is the type species of the genus Thermomonospora. This genus is of interest because members of this clade are sources of new antibiotics, enzymes, and products with pharmacological activity. In addition, members of this genus participate in the active degradation of cellulose. This is the first complete genome sequence of a member of the family Thermomonosporaceae. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 5,639,016 bp long genome with its 4,985 protein-coding and 76 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  6. Complete genome sequence of Sulfurospirillum deleyianum type strain (5175T)

    PubMed Central

    Sikorski, Johannes; Lapidus, Alla; Copeland, Alex; Glavina Del Rio, Tijana; Nolan, Matt; Lucas, Susan; Chen, Feng; Tice, Hope; Cheng, Jan-Fang; Saunders, Elizabeth; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ovchinnikova, Galina; Pati, Amrita; Ivanova, Natalia; Mavromatis, Konstantinos; Chen, Amy; Palaniappan, Krishna; Chain, Patrick; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D.; Brettin, Thomas; Detter, John C.; Han, Cliff; Rohde, Manfred; Lang, Elke; Spring, Stefan; Göker, Markus; Bristow, Jim; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter

    2010-01-01

    Sulfurospirillum deleyianum Schumacher et al. 1993 is the type species of the genus Sulfurospirillum. S. deleyianum is a model organism for studying sulfur reduction and dissimilatory nitrate reduction as an energy source for growth. Also, it is a prominent model organism for studying the structural and functional characteristics of cytochrome c nitrite reductase. Here, we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the genus Sulfurospirillum. The 2,306,351 bp long genome with its 2,291 protein-coding and 52 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304697

  7. The revolution of whole genome sequencing to study parasites.

    PubMed

    Forrester, Sarah Jayne; Hall, Neil

    2014-07-01

    Genome sequencing has revolutionized the way in which we approach biological research from fundamental molecular biology to ecology and epidemiology. In the last 10 years the field of genomics has changed enormously as technology has improved and the tools for genomic sequencing have moved out of a few dedicated centers and now can be performed on bench-top instruments. In this review we will cover some of the key discoveries that were catalyzed by some of the first genome projects and discuss how this field is developing, what the new challenges are and how this may impact on research in the near future.

  8. First Complete Genome Sequences of Two Keystone Viruses from Florida

    PubMed Central

    Stockwell, Timothy B.; Heberlein-Larson, Lea A.; Tan, Yi; Halpin, Rebecca A.; Fedorova, Nadia; Katzel, Daniel A.; Smole, Sandra; Unnasch, Thomas R.; Kramer, Laura D.

    2015-01-01

    We report here the first complete sequences of two Keystone virus (KEYV) genomes isolated from Florida in 2005, which include the first two publicly available complete large (L) gene sequences. The sequences of the KEYV L segments show 75.99 to 83.86% nucleotide similarity with those of other viruses in the California (CAL) serogroup of bunyaviruses. PMID:26514762

  9. Genome sequencing of the redbanded stink bug (Piezodorus guildinii)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We assembled a partial genome sequence from the redbanded stink bug, Piezodorus guildinii from Illumina MiSeq sequencing runs. The sequence has been submitted and published under NCBI GenBank Accession Number JTEQ01000000. The BioProject and BioSample Accession numbers are PRJNA263369 and SAMN030997...

  10. Selecting sequence variants to improve genomic predictions for dairy cattle

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Millions of genetic variants have been identified by population-scale sequencing projects, but subsets are needed for routine genomic predictions or to include on genotyping arrays. Methods of selecting sequence variants were compared using both simulated sequence genotypes and actual data from run ...

  11. A microscopic landscape of the invasive breast cancer genome

    PubMed Central

    Ping, Zheng; Xia, Yuchao; Shen, Tiansheng; Parekh, Vishwas; Siegal, Gene P.; Eltoum, Isam-Eldin; He, Jianbo; Chen, Dongquan; Deng, Minghua; Xi, Ruibin; Shen, Dejun

    2016-01-01

    Histologic grade is one of the most important microscopic features used to predict the prognosis of invasive breast cancer and may serve as a marker for studying cancer driving genomic abnormalities in vivo. We analyzed whole genome sequencing data from 680 cases of TCGA invasive ductal carcinomas of the breast and correlated them to corresponding pathology information. Ten genetic abnormalities were found to be statistically associated with histologic grade, including the three most prevalent cancer driver events: TP53 and PIK3CA mutations and MYC amplification. A distinct genetic interaction among these genomic abnormalities was revealed as measured by the histologic grading score. While TP53 mutation and MYC amplification were synergistic in promoting tumor progression, PIK3CA mutation was found to have alleviated the oncogenic effect of either the TP53 mutation or MYC amplification, and was associated with a significant reduction in mitotic activity in TP53 mutated and/or MYC amplified breast cancer. Furthermore, we discovered that different types of genetic abnormalities (mutation versus amplification) within the same cancer driver gene (PIK3CA or GATA3) were associated with opposite histologic changes in invasive breast cancer. In conclusion, our study suggests that histologic grade may serve as a biomarker to define cancer driving genetic events in vivo. PMID:27283966

  12. Oxford Nanopore MinION Sequencing and Genome Assembly.

    PubMed

    Lu, Hengyun; Giordano, Francesca; Ning, Zemin

    2016-10-01

    The revolution of genome sequencing is continuing after the successful second-generation sequencing (SGS) technology. The third-generation sequencing (TGS) technology, led by Pacific Biosciences (PacBio), is progressing rapidly, moving from a technology once only capable of providing data for small genome analysis, or for performing targeted screening, to one that promises high quality de novo assembly and structural variation detection for human-sized genomes. In 2014, the MinION, the first commercial sequencer using nanopore technology, was released by Oxford Nanopore Technologies (ONT). MinION identifies DNA bases by measuring the changes in electrical conductivity generated as DNA strands pass through a biological pore. Its portability, affordability, and speed in data production make it suitable for real-time applications, and the release of the long-read sequencer MinION has thus generated much excitement and interest in the genomics community. While de novo genome assemblies can be cheaply produced from SGS data, assembly continuity is often relatively poor, due to the limited ability of short reads to handle long repeats. Assembly quality can be greatly improved by using TGS long reads, since repetitive regions can be spanned by the longer reads, despite their higher error rates at the base level. The potential of nanopore sequencing has been demonstrated by various studies in genome surveillance at locations where rapid and reliable sequencing is needed, but where resources are limited.

  13. Genomics of Colorectal Cancer in African Americans

    PubMed Central

    Brim, Hassan; Ashktorab, Hassan

    2016-01-01

    Genome-wide studies are increasingly essential, especially for complex diseases such as cancer, in which multiple genes and diverse molecular mechanisms are known to be involved in altering gene function. In this review, we report our latest genomic and epigenomic findings in African-American colorectal cancer patients. This population suffers a higher burden of the disease, and most investigators in this field are looking for the underlying genetic and epigenetic targets that might be responsible for this disparity. Here we report genome-wide copy number variations, single nucleotide mutations and DNA methylation findings that might be specific to this population. PMID:27917406

  14. The Genomic Scrapheap Challenge; Extracting Relevant Data from Unmapped Whole Genome Sequencing Reads, Including Strain Specific Genomic Segments, in Rats

    PubMed Central

    van der Weide, Robin H.; Simonis, Marieke; Hermsen, Roel; Toonen, Pim; Cuppen, Edwin; de Ligt, Joep

    2016-01-01

    Unmapped next-generation sequencing reads are typically ignored, although they contain biologically relevant information. We systematically analyzed unmapped reads from whole genome sequencing of 33 inbred rat strains. High quality reads were selected and enriched for biologically relevant sequences; similarity-based analysis revealed clustering similar to previously reported phylogenetic trees. Our results demonstrate that on average 20% of all unmapped reads harbor sequences that can be used to improve reference genomes and generate hypotheses on potential genotype-phenotype relationships. Analysis pipelines would benefit from incorporating the described methods and reference genomes would benefit from inclusion of the genomic segments obtained through these efforts. PMID:27501045
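
    As an illustration of the first step such an analysis needs (pulling unmapped, high-quality reads out of an alignment file), the sketch below filters SAM-format text records. It is not the authors' pipeline: it relies only on the SAM convention that FLAG bit 0x4 marks an unmapped segment and on Phred+33 quality encoding, and the mean-quality cutoff of 30, the function names, and the example records are all hypothetical.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def mean_phred(qual: str) -> float:
    """Mean base quality of a read, assuming Phred+33 encoding."""
    return sum(ord(c) - 33 for c in qual) / len(qual)


def unmapped_high_quality(sam_lines, min_mean_quality=30.0):
    """Yield (name, sequence) for unmapped reads at or above the quality cutoff."""
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]
        if not flag & UNMAPPED_FLAG:
            continue  # read is mapped
        if qual == "*":
            continue  # no qualities stored for this read
        if mean_phred(qual) >= min_mean_quality:
            yield name, seq


# Hypothetical records for illustration only.
example = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",  # mapped
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tGGCC\tIIII",         # unmapped, mean quality 40
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tTTAA\t####",         # unmapped, mean quality 2
]
print(list(unmapped_high_quality(example)))
```

    Only the unmapped read with high base quality survives the filter; the retained sequences could then be passed to whatever assembly or similarity search the analysis uses.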

  15. The Genomic Scrapheap Challenge; Extracting Relevant Data from Unmapped Whole Genome Sequencing Reads, Including Strain Specific Genomic Segments, in Rats.

    PubMed

    van der Weide, Robin H; Simonis, Marieke; Hermsen, Roel; Toonen, Pim; Cuppen, Edwin; de Ligt, Joep

    2016-01-01

    Unmapped next-generation sequencing reads are typically ignored, although they contain biologically relevant information. We systematically analyzed unmapped reads from whole genome sequencing of 33 inbred rat strains. High quality reads were selected and enriched for biologically relevant sequences; similarity-based analysis revealed clustering similar to previously reported phylogenetic trees. Our results demonstrate that on average 20% of all unmapped reads harbor sequences that can be used to improve reference genomes and generate hypotheses on potential genotype-phenotype relationships. Analysis pipelines would benefit from incorporating the described methods and reference genomes would benefit from inclusion of the genomic segments obtained through these efforts.

  16. Mechanisms of Base Substitution Mutagenesis in Cancer Genomes

    PubMed Central

    Bacolla, Albino; Cooper, David N.; Vasquez, Karen M.

    2014-01-01

    Cancer genome sequence data provide an invaluable resource for inferring the key mechanisms by which mutations arise in cancer cells, favoring their survival, proliferation and invasiveness. Here we examine recent advances in understanding the molecular mechanisms responsible for the predominant type of genetic alteration found in cancer cells, somatic single base substitutions (SBSs). Cytosine methylation, demethylation and deamination, charge transfer reactions in DNA, DNA replication timing, chromatin status and altered DNA proofreading activities are all now known to contribute to the mechanisms leading to base substitution mutagenesis. We review current hypotheses as to the major processes that give rise to SBSs and evaluate their relative relevance in the light of knowledge acquired from cancer genome sequencing projects and the study of base modifications, DNA repair and lesion bypass. Although gene expression data on APOBEC3B enzymes provide support for a role in cancer mutagenesis through U:G mismatch intermediates, the enzyme preference for single-stranded DNA may limit its activity genome-wide. For SBSs at both CG:CG and YC:GR sites, we outline evidence for a prominent role of damage by charge transfer reactions that follow interactions of the DNA with reactive oxygen species (ROS) and other endogenous or exogenous electron-abstracting molecules. PMID:24705290
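
    When single base substitutions such as those discussed in the abstract above are tabulated, a common convention in the mutational-signature literature is to report each change with the pyrimidine (C or T) of the mutated base pair as the reference, which collapses the twelve possible changes into six classes. That convention is not stated in the abstract; the sketch below simply assumes it, and the function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sbs_class(ref: str, alt: str) -> str:
    """Return the substitution written with a pyrimidine reference base."""
    if ref == alt or ref not in COMPLEMENT or alt not in COMPLEMENT:
        raise ValueError("expected two different bases from A, C, G, T")
    if ref in ("A", "G"):  # purine reference: report the complementary strand
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


print(sbs_class("G", "A"))  # reported as C>T
print(sbs_class("T", "C"))  # already pyrimidine-referenced: T>C
```

    With this normalization, a G>A change and a C>T change at the same base pair are counted as the same event, which is what allows substitutions at sites such as CG:CG to be tallied regardless of the strand on which they were called.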

  17. The topography of mutational processes in breast cancer genomes

    DOE PAGES

    Morganella, Sandro; Alexandrov, Ludmil B.; Glodzik, Dominik; ...

    2016-01-01

    Somatic mutations in human cancers show unevenness in genomic distribution that correlates with aspects of genome structure and function. These mutations are, however, generated by multiple mutational processes operating through the cellular lineage between the fertilized egg and the cancer cell, each composed of specific DNA damage and repair components and leaving its own characteristic mutational signature on the genome. Using somatic mutation catalogues from 560 breast cancer whole-genome sequences, here we show that each of 12 base substitution, 2 insertion/deletion (indel) and 6 rearrangement mutational signatures present in breast tissue exhibits distinct relationships with genomic features relating to transcription, DNA replication and chromatin organization. This signature-based approach permits visualization of the genomic distribution of mutational processes associated with APOBEC enzymes, mismatch repair deficiency and homologous recombinational repair deficiency, as well as mutational processes of unknown aetiology. Lastly, it highlights mechanistic insights including a putative replication-dependent mechanism of APOBEC-related mutagenesis.
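
Base substitution signatures of the kind mentioned above are commonly tabulated over 96 channels: six substitution classes, reported from the pyrimidine strand, times sixteen flanking-base contexts. The sketch below shows that convention only. It is an assumption that this record's analysis used exactly this layout, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def sbs_channel(context, alt):
    """Map a trinucleotide context (reference base in the middle) and the
    mutated base to a pyrimidine-centred channel label such as 'T[C>T]A'."""
    context, alt = context.upper(), alt.upper()
    if len(context) != 3 or len(alt) != 1:
        raise ValueError("expected a trinucleotide and a single alt base")
    if context[1] == alt:
        raise ValueError("alt base must differ from the reference base")
    if context[1] in "AG":
        # Purine reference: report the event from the opposite strand.
        context, alt = revcomp(context), revcomp(alt)
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


# A G>A change in TGA is the same event as C>T in TCA on the other strand.
print(sbs_channel("TGA", "A"))
```

Counting the labels over a mutation catalogue gives the per-sample spectrum that signature extraction methods take as input.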

  18. Toolbox for mobile-element insertion detection on cancer genomes.

    PubMed

    Lee, Wan-Ping; Wu, Jiantao; Marth, Gabor T

    2015-01-01

    Mobile elements constitute greater than 45% of the human genome as a result of repeated insertion events during human genome evolution. Although most mobile elements are fixed within the human population, some elements (including ALU, long interspersed elements (LINE) 1 (L1), and SVA) are still actively duplicating and may result in life-threatening human diseases such as cancer, motivating the need for accurate mobile-element insertion (MEI) detection tools. We developed a software package, TANGRAM, for MEI detection in next-generation sequencing data, currently serving as the primary MEI detection tool in the 1000 Genomes Project. TANGRAM takes advantage of valuable mapping information provided by our own MOSAIK mapper, and until recently required MOSAIK mappings as its input. In this study, we report a new feature that enables TANGRAM to be used on alignments generated by any mainstream short-read mapper, making it accessible for many genomic users. To demonstrate its utility for cancer genome analysis, we have applied TANGRAM to the TCGA (The Cancer Genome Atlas) mutation calling benchmark 4 dataset. TANGRAM is fast, accurate, easy to use, and open source at https://github.com/jiantao/Tangram.

  19. Toolbox for mobile-element insertion detection on cancer genomes.

    PubMed

    Lee, Wan-Ping; Wu, Jiantao; Marth, Gabor T

    2014-01-01

    Mobile elements constitute greater than 45% of the human genome as a result of repeated insertion events during human genome evolution. Although most mobile elements are fixed within the human population, some elements (including ALU, long interspersed elements (LINE) 1 (L1), and SVA) are still actively duplicating and may result in life-threatening human diseases such as cancer, motivating the need for accurate mobile-element insertion (MEI) detection tools. We developed a software package, TANGRAM, for MEI detection in next-generation sequencing data, currently serving as the primary MEI detection tool in the 1000 Genomes Project. TANGRAM takes advantage of valuable mapping information provided by our own MOSAIK mapper, and until recently required MOSAIK mappings as its input. In this study, we report a new feature that enables TANGRAM to be used on alignments generated by any mainstream short-read mapper, making it accessible for many genomic users. To demonstrate its utility for cancer genome analysis, we have applied TANGRAM to the TCGA (The Cancer Genome Atlas) mutation calling benchmark 4 dataset. TANGRAM is fast, accurate, easy to use, and open source at https://github.com/jiantao/Tangram.

  20. Toolbox for Mobile-Element Insertion Detection on Cancer Genomes

    PubMed Central

    Lee, Wan-Ping; Wu, Jiantao; Marth, Gabor T

    2015-01-01

    Mobile elements constitute greater than 45% of the human genome as a result of repeated insertion events during human genome evolution. Although most mobile elements are fixed within the human population, some elements (including ALU, long interspersed elements (LINE) 1 (L1), and SVA) are still actively duplicating and may result in life-threatening human diseases such as cancer, motivating the need for accurate mobile-element insertion (MEI) detection tools. We developed a software package, TANGRAM, for MEI detection in next-generation sequencing data, currently serving as the primary MEI detection tool in the 1000 Genomes Project. TANGRAM takes advantage of valuable mapping information provided by our own MOSAIK mapper, and until recently required MOSAIK mappings as its input. In this study, we report a new feature that enables TANGRAM to be used on alignments generated by any mainstream short-read mapper, making it accessible for many genomic users. To demonstrate its utility for cancer genome analysis, we have applied TANGRAM to the TCGA (The Cancer Genome Atlas) mutation calling benchmark 4 dataset. TANGRAM is fast, accurate, easy to use, and open source at https://github.com/jiantao/Tangram. PMID:25931804

  1. Complete Genome Sequence of Phytopathogenic Pectobacterium atrosepticum Bacteriophage Peat1

    PubMed Central

    Kalischuk, Melanie; Hachey, John

    2015-01-01

    Pectobacterium atrosepticum is a common phytopathogen causing significant economic losses worldwide. To develop a biocontrol strategy for this blackleg pathogen of solanaceous plants, P. atrosepticum bacteriophage Peat1 was isolated and its genome completely sequenced. Interestingly, morphological and sequence analyses of the 45,633-bp genome revealed that phage Peat1 is a member of the family Podoviridae and most closely resembles the Klebsiella pneumoniae bacteriophage KP34. This is the first published complete genome sequence of a phytopathogenic P. atrosepticum bacteriophage, and details provide important information for the development of biocontrol by advancing our understanding of phage-phytopathogen interactions. PMID:26272557

  2. Complete Genome Sequence of Phytopathogenic Pectobacterium atrosepticum Bacteriophage Peat1.

    PubMed

    Kalischuk, Melanie; Hachey, John; Kawchuk, Lawrence

    2015-08-13

    Pectobacterium atrosepticum is a common phytopathogen causing significant economic losses worldwide. To develop a biocontrol strategy for this blackleg pathogen of solanaceous plants, P. atrosepticum bacteriophage Peat1 was isolated and its genome completely sequenced. Interestingly, morphological and sequence analyses of the 45,633-bp genome revealed that phage Peat1 is a member of the family Podoviridae and most closely resembles the Klebsiella pneumoniae bacteriophage KP34. This is the first published complete genome sequence of a phytopathogenic P. atrosepticum bacteriophage, and details provide important information for the development of biocontrol by advancing our understanding of phage-phytopathogen interactions.

  3. Complete genome sequence of Staphylothermus hellenicus P8T

    SciTech Connect

    Anderson, Iain; Wirth, Reinhard; Lucas, Susan; Copeland, A; Lapidus, Alla L.; Cheng, Jan-Fang; Goodwin, Lynne A.; Pitluck, Sam; Davenport, Karen W.; Detter, J. Chris; Han, Cliff; Tapia, Roxanne; Land, Miriam L; Hauser, Loren John; Pati, Amrita; Mikhailova, Natalia; Woyke, Tanja; Klenk, Hans-Peter; Kyrpides, Nikos C; Ivanova, N

    2011-01-01

    Staphylothermus hellenicus belongs to the order Desulfurococcales within the archaeal phylum Crenarchaeota. Strain P8T is the type strain of the species and was isolated from a shallow hydrothermal vent system at Palaeochori Bay, Milos, Greece. It is a hyperthermophilic, anaerobic heterotroph. Here we describe the features of this organism together with the complete genome sequence and annotation. The 1,580,347 bp genome with its 1,668 protein-coding and 48 RNA genes was sequenced as part of a DOE Joint Genome Institute (JGI) Laboratory Sequencing Program (LSP) project.

  4. Genome sequence and comparative virulence of raccoonpox virus: the first North American poxvirus sequence.

    PubMed

    Fleischauer, Clare; Upton, Chris; Victoria, Joseph; Jones, Gwendolyn J B; Roper, Rachel L

    2015-09-01

    We report here the complete genome sequence of raccoonpox virus (RCNV), a naturally occurring North American poxvirus. This is the first such North American sequence to the best of our knowledge, and the data showed that RCNV forms a new phylogenetic branch between orthopoxviruses and Yoka poxvirus. RCNV shared overall similarity in genome organization with orthopoxviruses, and the proteins in the central conserved region shared approximately 90% amino acid identity with orthopoxviruses. RCNV proteins shared approximately 81% amino acid identity with Yokapox virus proteins. RCNV is missing 10 genes normally conserved in orthopoxviruses, most of which are implicated in virulence. These gene deletions may explain the attenuated phenotype of RCNV in mammals. RCNV contained one unique genome region containing approximately 1 kb of DNA sequence that is not present in any reported poxvirus. It contained a unique ORF predicted to encode a protein with a transmembrane domain. RCNV replicates well in mammalian cells, is naturally attenuated and has been shown to be effective as a vaccine vector platform, so we further tested its safety. We showed here that RCNV is substantially more attenuated than even the highly attenuated VACV-A35Del mutant virus in pregnant, nude and severe combined immunodeficient (SCID) mouse models. RCNV was much safer in pregnant mice and was cleared rapidly from tissues, even in immunocompromised animals, whereas the VACV-A35Del mutant retains virulence and persists in tissues. Thus, RCNV is expected to be a superior vaccine vector for infectious diseases and cancer due to its excellent safety profile, reported vaccine efficacy and ability to replicate in mammalian cells.

  5. Genomic Disparities in Breast Cancer Among Latinas

    PubMed Central

    Lynce, Filipa; Graves, Kristi D.; Jandorf, Lina; Ricker, Charité; Castro, Eida; Moreno, Laura; Augusto, Bianca; Fejerman, Laura; Vadaparampil, Susan T.

    2016-01-01

    Background: Breast cancer is the most common cancer diagnosed among Latinas in the United States and the leading cause of cancer-related death among this population. Latinas tend to be diagnosed at a later stage and have worse prognostic features than their non-Hispanic white counterparts. Genetic and genomic factors may contribute to observed breast cancer health disparities in Latinas. Methods: We provide a landscape of our current understanding and the existing gaps that need to be filled across the cancer prevention and control continuum. Results: We summarize available data on mutations in high and moderate penetrance genes for inherited risk of breast cancer and the associated literature on disparities in awareness of and uptake of genetic counseling and testing in Latina populations. We also discuss common genetic polymorphisms and risk of breast cancer in Latinas. In the treatment setting, we examine tumor genomics and pharmacogenomics in Latina patients with breast cancer. Conclusions: As the US population continues to diversify, extending genetic and genomic research into this underserved and understudied population is critical. By understanding the risk of breast cancer among ethnically diverse populations, we will be better positioned to make treatment advancements for earlier stages of cancer, identify more effective and ideally less toxic treatment regimens, and increase rates of survival. PMID:27842325

  6. Subclonal diversification of primary breast cancer revealed by multiregion sequencing

    SciTech Connect

    Yates, Lucy R.; Gerstung, Moritz; Knappskog, Stian; Desmedt, Christine; Gundem, Gunes; Van Loo, Peter; Aas, Turid; Alexandrov, Ludmil B.; Larsimont, Denis; Davies, Helen; Li, Yilong; Ju, Young Seok; Ramakrishna, Manasa; Haugland, Hans Kristian; Lilleng, Peer Kaare; Nik-Zainal, Serena; McLaren, Stuart; Butler, Adam; Martin, Sancha; Glodzik, Dominic; Menzies, Andrew; Raine, Keiran; Hinton, Jonathan; Jones, David; Mudie, Laura J.; Jiang, Bing; Vincent, Delphine; Greene-Colozzi, April; Adnet, Pierre -Yves; Fatima, Aquila; Maetens, Marion; Ignatiadis, Michail; Stratton, Michael R.; Sotiriou, Christos; Richardson, Andrea L.; Lønning, Per Eystein; Wedge, David C.; Campbell, Peter J.

    2015-06-22

    Sequencing cancer genomes may enable tailoring of therapeutics to the underlying biological abnormalities driving a particular patient's tumor. However, sequencing-based strategies rely heavily on representative sampling of tumors. To understand the subclonal structure of primary breast cancer, we applied whole-genome and targeted sequencing to multiple samples from each of 50 patients' tumors (303 samples in total). The extent of subclonal diversification varied among cases and followed spatial patterns. No strict temporal order was evident, with point mutations and rearrangements affecting the most common breast cancer genes, including PIK3CA, TP53, PTEN, BRCA2 and MYC, occurring early in some tumors and late in others. In 13 out of 50 cancers, potentially targetable mutations were subclonal. Landmarks of disease progression, such as resistance to chemotherapy and the acquisition of invasive or metastatic potential, arose within detectable subclones of antecedent lesions. These findings highlight the importance of including analyses of subclonal structure and tumor evolution in clinical trials of primary breast cancer.

  7. Subclonal diversification of primary breast cancer revealed by multiregion sequencing.

    PubMed

    Yates, Lucy R; Gerstung, Moritz; Knappskog, Stian; Desmedt, Christine; Gundem, Gunes; Van Loo, Peter; Aas, Turid; Alexandrov, Ludmil B; Larsimont, Denis; Davies, Helen; Li, Yilong; Ju, Young Seok; Ramakrishna, Manasa; Haugland, Hans Kristian; Lilleng, Peer Kaare; Nik-Zainal, Serena; McLaren, Stuart; Butler, Adam; Martin, Sancha; Glodzik, Dominic; Menzies, Andrew; Raine, Keiran; Hinton, Jonathan; Jones, David; Mudie, Laura J; Jiang, Bing; Vincent, Delphine; Greene-Colozzi, April; Adnet, Pierre-Yves; Fatima, Aquila; Maetens, Marion; Ignatiadis, Michail; Stratton, Michael R; Sotiriou, Christos; Richardson, Andrea L; Lønning, Per Eystein; Wedge, David C; Campbell, Peter J

    2015-07-01

    The sequencing of cancer genomes may enable tailoring of therapeutics to the underlying biological abnormalities driving a particular patient's tumor. However, sequencing-based strategies rely heavily on representative sampling of tumors. To understand the subclonal structure of primary breast cancer, we applied whole-genome and targeted sequencing to multiple samples from each of 50 patients' tumors (303 samples in total). The extent of subclonal diversification varied among cases and followed spatial patterns. No strict temporal order was evident, with point mutations and rearrangements affecting the most common breast cancer genes, including PIK3CA, TP53, PTEN, BRCA2 and MYC, occurring early in some tumors and late in others. In 13 out of 50 cancers, potentially targetable mutations were subclonal. Landmarks of disease progression, such as resistance to chemotherapy and the acquisition of invasive or metastatic potential, arose within detectable subclones of antecedent lesions. These findings highlight the importance of including analyses of subclonal structure and tumor evolution in clinical trials of primary breast cancer.

  8. Subclonal diversification of primary breast cancer revealed by multiregion sequencing

    DOE PAGES

    Yates, Lucy R.; Gerstung, Moritz; Knappskog, Stian; ...

    2015-06-22

    Sequencing cancer genomes may enable tailoring of therapeutics to the underlying biological abnormalities driving a particular patient's tumor. However, sequencing-based strategies rely heavily on representative sampling of tumors. To understand the subclonal structure of primary breast cancer, we applied whole-genome and targeted sequencing to multiple samples from each of 50 patients' tumors (303 samples in total). The extent of subclonal diversification varied among cases and followed spatial patterns. No strict temporal order was evident, with point mutations and rearrangements affecting the most common breast cancer genes, including PIK3CA, TP53, PTEN, BRCA2 and MYC, occurring early in some tumors and late in others. In 13 out of 50 cancers, potentially targetable mutations were subclonal. Landmarks of disease progression, such as resistance to chemotherapy and the acquisition of invasive or metastatic potential, arose within detectable subclones of antecedent lesions. These findings highlight the importance of including analyses of subclonal structure and tumor evolution in clinical trials of primary breast cancer.

  9. MIPS: a database for genomes and protein sequences.

    PubMed

    Mewes, H W; Frishman, D; Güldener, U; Mannhaupt, G; Mayer, K; Mokrejs, M; Morgenstern, B; Münsterkötter, M; Rudd, S; Weil, B

    2002-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the databases for the comprehensive set of genomes (PEDANT genomes), the database of annotated human EST clusters (HIB), the database of complete cDNAs from the DHGP (German Human Genome Project), as well as the project specific databases for the GABI (Genome Analysis in Plants) and HNB (Helmholtz-Netzwerk Bioinformatik) networks. The Arabidopsis thaliana database (MATDB), the database of mitochondrial proteins (MITOP) and our contribution to the PIR International Protein Sequence Database have been described elsewhere [Schoof et al. (2002) Nucleic Acids Res., 30, 91-93; Scharfe et al. (2000) Nucleic Acids Res., 28, 155-158; Barker et al. (2001) Nucleic Acids Res., 29, 29-32]. All databases described, the protein analysis tools provided and the detailed descriptions of our projects can be accessed through the MIPS World Wide Web server (http://mips.gsf.de).

  10. Single-Cell Whole-Genome Amplification and Sequencing: Methodology and Applications.

    PubMed

    Huang, Lei; Ma, Fei; Chapman, Alec; Lu, Sijia; Xie, Xiaoliang Sunney

    2015-01-01

    We present a survey of single-cell whole-genome amplification (WGA) methods, including degenerate oligonucleotide-primed polymerase chain reaction (DOP-PCR), multiple displacement amplification (MDA), and multiple annealing and looping-based amplification cycles (MALBAC). The key parameters to characterize the performance of these methods are defined, including genome coverage, uniformity, reproducibility, unmappable rates, chimera rates, allele dropout rates, false positive rates for calling single-nucleotide variations, and ability to call copy-number variations. Using these parameters, we compare five commercial WGA kits by performing deep sequencing of multiple single cells. We also discuss several major applications of single-cell genomics, including studies of whole-genome de novo mutation rates, the early evolution of cancer genomes, circulating tumor cells (CTCs), meiotic recombination of germ cells, preimplantation genetic diagnosis (PGD), and preimplantation genomic screening (PGS) for in vitro-fertilized embryos.
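
Two of the performance parameters listed above, genome coverage and allele dropout rate, can be sketched as simple ratios. The abstract gives no formulas, so the definitions below are common working assumptions rather than the survey's own, and the function names and inputs are hypothetical.

```python
def genome_coverage(depths):
    """Fraction of reference positions covered by at least one read.

    `depths` holds the per-position read depth for one single-cell library.
    """
    if not depths:
        raise ValueError("no positions given")
    return sum(1 for d in depths if d > 0) / len(depths)


def allele_dropout_rate(calls):
    """Fraction of known-heterozygous sites called homozygous in the cell.

    `calls` holds one entry per site known to be heterozygous (for example
    from bulk sequencing): "het", "hom", or None when no call was made.
    Uncalled sites are left out of the denominator (an assumption).
    """
    called = [c for c in calls if c is not None]
    if not called:
        raise ValueError("no called sites")
    return sum(1 for c in called if c == "hom") / len(called)


# Made-up inputs for illustration.
print(genome_coverage([0, 3, 5, 0]))
print(allele_dropout_rate(["het", "hom", "het", "hom", None]))
```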

  11. Genomic treasure troves: complete genome sequencing of herbarium and insect museum specimens.

    PubMed

    Staats, Martijn; Erkens, Roy H J; van de Vossenberg, Bart; Wieringa, Jan J; Kraaijeveld, Ken; Stielow, Benjamin; Geml, József; Richardson, James E; Bakker, Freek T

    2013-01-01

    Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22-82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4-97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2-71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more horizontal

  12. Genome Science and Personalized Cancer Treatment

    ScienceCinema

    Gray, Joe

    2016-07-12

    August 4, 2009 Berkeley Lab lecture: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

  13. Genome Science and Personalized Cancer Treatment

    SciTech Connect

    Gray, Joe

    2009-08-07

    August 4, 2009 Berkeley Lab lecture: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

  14. CTD² Publication Guidelines | Office of Cancer Genomics

    Cancer.gov

    The Cancer Target Discovery and Development (CTD²) Network is a “community resource project” supported by the National Cancer Institute’s Office of Cancer Genomics. Members of the Network release data to the broader research community by depositing data into NCI-supported or public databases. Data deposition is NOT equivalent to publishing in a peer-reviewed journal. Unless there is a manuscript associated with a dataset, the Network considers data to be formally unpublished.

  15. Real-time, portable genome sequencing for Ebola surveillance

    PubMed Central

    Bore, Joseph Akoi; Koundouno, Raymond; Dudas, Gytis; Mikhail, Amy; Ouédraogo, Nobila; Afrough, Babak; Bah, Amadou; Baum, Jonathan HJ; Becker-Ziaja, Beate; Boettcher, Jan-Peter; Cabeza-Cabrerizo, Mar; Camino-Sanchez, Alvaro; Carter, Lisa L.; Doerrbecker, Juiliane; Enkirch, Theresa; Dorival, Isabel Graciela García; Hetzelt, Nicole; Hinzmann, Julia; Holm, Tobias; Kafetzopoulou, Liana Eleni; Koropogui, Michel; Kosgey, Abigail; Kuisma, Eeva; Logue, Christopher H; Mazzarelli, Antonio; Meisel, Sarah; Mertens, Marc; Michel, Janine; Ngabo, Didier; Nitzsche, Katja; Pallash, Elisa; Patrono, Livia Victoria; Portmann, Jasmine; Repits, Johanna Gabriella; Rickett, Natasha Yasmin; Sachse, Andrea; Singethan, Katrin; Vitoriano, Inês; Yemanaberhan, Rahel L; Zekeng, Elsa G; Trina, Racine; Bello, Alexander; Sall, Amadou Alpha; Faye, Ousmane; Faye, Oumar; Magassouba, N’Faly; Williams, Cecelia V.; Amburgey, Victoria; Winona, Linda; Davis, Emily; Gerlach, Jon; Washington, Franck; Monteil, Vanessa; Jourdain, Marine; Bererd, Marion; Camara, Alimou; Somlare, Hermann; Camara, Abdoulaye; Gerard, Marianne; Bado, Guillaume; Baillet, Bernard; Delaune, Déborah; Nebie, Koumpingnin Yacouba; Diarra, Abdoulaye; Savane, Yacouba; Pallawo, Raymond Bernard; Gutierrez, Giovanna Jaramillo; Milhano, Natacha; Roger, Isabelle; Williams, Christopher J; Yattara, Facinet; Lewandowski, Kuiama; Taylor, Jamie; Rachwal, Philip; Turner, Daniel; Pollakis, Georgios; Hiscox, Julian A.; Matthews, David A.; O’Shea, Matthew K.; Johnston, Andrew McD; Wilson, Duncan; Hutley, Emma; Smit, Erasmus; Di Caro, Antonino; Woelfel, Roman; Stoecker, Kilian; Fleischmann, Erna; Gabriel, Martin; Weller, Simon A.; Koivogui, Lamine; Diallo, Boubacar; Keita, Sakoba; Rambaut, Andrew; Formenty, Pierre; Gunther, Stephan; Carroll, Miles W.

    2016-01-01

    The Ebola virus disease (EVD) epidemic in West Africa is the largest on record, responsible for >28,599 cases and >11,299 deaths [1]. Genome sequencing in viral outbreaks is desirable in order to characterize the infectious agent to determine its evolutionary rate, signatures of host adaptation, identification and monitoring of diagnostic targets and responses to vaccines and treatments. The Ebola virus genome (EBOV) substitution rate in the Makona strain has been estimated at between 0.87 × 10⁻³ and 1.42 × 10⁻³ mutations per site per year. This is equivalent to 16 to 27 mutations in each genome, meaning that sequences diverge rapidly enough to identify distinct sub-lineages during a prolonged epidemic [2-7]. Genome sequencing provides a high-resolution view of pathogen evolution and is increasingly sought-after for outbreak surveillance. Sequence data may be used to guide control measures, but only if the results are generated quickly enough to inform interventions [8]. Genomic surveillance during the epidemic has been sporadic due to a lack of local sequencing capacity coupled with practical difficulties transporting samples to remote sequencing facilities [9]. In order to address this problem, we devised a genomic surveillance system that utilizes a novel nanopore DNA sequencing instrument. In April 2015 this system was transported in standard airline luggage to Guinea and used for real-time genomic surveillance of the ongoing epidemic. Here we present sequence data and analysis of 142 Ebola virus (EBOV) samples collected during the period March to October 2015. We were able to generate results in less than 24 hours after receiving an Ebola positive sample, with the sequencing process taking as little as 15-60 minutes. We show that real-time genomic surveillance is possible in resource-limited settings and can be established rapidly to monitor outbreaks. PMID:26840485
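
The step from a per-site substitution rate to "16 to 27 mutations in each genome" is a multiplication by genome length. The abstract does not state the length it used; the sketch below assumes a genome of 18,959 nucleotides (roughly 19 kb) and reads the result as substitutions per genome per year.

```python
GENOME_LENGTH_NT = 18_959  # assumed EBOV genome length; not given in the abstract


def substitutions_per_genome_per_year(rate_per_site_per_year,
                                      genome_length=GENOME_LENGTH_NT):
    """Expected substitutions accumulated across one genome in one year."""
    return rate_per_site_per_year * genome_length


low = substitutions_per_genome_per_year(0.87e-3)
high = substitutions_per_genome_per_year(1.42e-3)
print(f"{low:.1f} to {high:.1f} substitutions per genome per year")
```

Under that assumed length, the two quoted rates round to the 16 and 27 given in the abstract.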

  16. BAC-pool 454-sequencing: A rapid and efficient approach to sequence complex tetraploid cotton genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    New and emerging next generation sequencing technologies have been promising in reducing sequencing costs, but not significantly for complex polyploid plant genomes such as cotton. Large and highly repetitive genome of G. hirsutum (~2.5GB) is less amenable and cost-intensive with traditional BAC-by...

  17. Genome Sequence of a Novel Iflavirus from mRNA Sequencing of the Butterfly Heliconius erato

    PubMed Central

    Macias-Muñoz, Aide; Briscoe, Adriana D.

    2014-01-01

    Here, we report the genome sequence of a novel iflavirus strain recovered from the neotropical butterfly Heliconius erato. The coding DNA sequence (CDS) of the iflavirus genome was 8,895 nucleotides in length, encoding a polyprotein that was 2,965 amino acids long. PMID:24831145
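
The two lengths quoted above agree with the triplet code: 8,895 nucleotides is exactly 2,965 codons. A minimal check follows; whether the 8,895 figure includes the terminal stop codon is not stated in the abstract, and the function name is hypothetical.

```python
def codon_count(cds_length_nt):
    """Number of complete codons in a coding sequence of the given length."""
    if cds_length_nt % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return cds_length_nt // 3


print(codon_count(8895))
```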

  18. Genome-Wide Association Studies of Cancer

    PubMed Central

    Stadler, Zsofia K.; Thom, Peter; Robson, Mark E.; Weitzel, Jeffrey N.; Kauff, Noah D.; Hurley, Karen E.; Devlin, Vincent; Gold, Bert; Klein, Robert J.; Offit, Kenneth

    2010-01-01

    Knowledge of the inherited risk for cancer is an important component of preventive oncology. In addition to well-established syndromes of cancer predisposition, much remains to be discovered about the genetic variation underlying susceptibility to common malignancies. Increased knowledge about the human genome and advances in genotyping technology have made possible genome-wide association studies (GWAS) of human diseases. These studies have identified many important regions of genetic variation associated with an increased risk for human traits and diseases including cancer. Understanding the principles, major findings, and limitations of GWAS is becoming increasingly important for oncologists as dissemination of genomic risk tests directly to consumers is already occurring through commercial companies. GWAS have contributed to our understanding of the genetic basis of cancer and will shed light on biologic pathways and possible new strategies for targeted prevention. To date, however, the clinical utility of GWAS-derived risk markers remains limited. PMID:20585100

  19. Genome sequence of cultivated Upland cotton (Gossypium hirsutum TM-1) provides insights into genome evolution

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genetic and genomic analyses of Upland cotton (Gossypium hirsutum) are difficult because it has a complex allotetraploid (AADD; 2n = 4x = 52) genome. Here we sequenced, assembled and analyzed the world's most important cultivated cotton genome with 246.2 gigabase (Gb) clean data obtained using whol...

  20. 100K Pathogen Genome Project: 306 Listeria Draft Genome Sequences for Food Safety and Public Health.

    PubMed

    Chen, Poyin; Kong, Nguyet; Huang, Bihua; Thao, Kao; Ng, Whitney; Storey, Dylan Bobby; Arabyan, Narine; Foutouhi, Azarene; Foutouhi, Soraya; Weimer, Bart C

    2017-02-09

    Listeria monocytogenes is a food-associated bacterium that is responsible for food-related illnesses worldwide. This is the initial public release of 306 L. monocytogenes genome sequences as part of the 100K Pathogen Genome Project. These isolates represent global genomic diversity in L. monocytogenes.

  1. 100K Pathogen Genome Project: 306 Listeria Draft Genome Sequences for Food Safety and Public Health

    PubMed Central

    Chen, Poyin; Kong, Nguyet; Huang, Bihua; Thao, Kao; Ng, Whitney; Storey, Dylan Bobby; Arabyan, Narine; Foutouhi, Azarene; Foutouhi, Soraya

    2017-01-01

    Listeria monocytogenes is a food-associated bacterium that is responsible for food-related illnesses worldwide. This is the initial public release of 306 L. monocytogenes genome sequences as part of the 100K Pathogen Genome Project. These isolates represent global genomic diversity in L. monocytogenes. PMID:28183778

  2. Genome sequencing and analysis of the model grass Brachypodium distachyon.

    PubMed

    2010-02-11

    Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum genomes shows a precise history of genome evolution across a broad diversity of the grasses, and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat. The high-quality genome sequence, coupled with ease of cultivation and transformation, small size and rapid life cycle, will help Brachypodium reach its potential as an important model system for developing new energy and food crops.

  3. Complete genome sequence of Cellulomonas flavigena type strain (134T)

    SciTech Connect

    Abt, Birte; Foster, Brian; Lapidus, Alla L.; Clum, Alicia; Sun, Hui; Pukall, Rudiger; Lucas, Susan; Glavina Del Rio, Tijana; Nolan, Matt; Tice, Hope; Cheng, Jan-Fang; Pitluck, Sam; Liolios, Konstantinos; Ivanova, N; Mavromatis, K; Ovchinnikova, Galina; Pati, Amrita; Goodwin, Lynne A.; Chen, Amy; Palaniappan, Krishna; Land, Miriam L; Hauser, Loren John; Chang, Yun-Juan; Jeffries, Cynthia; Rohde, Manfred; Goker, Markus; Woyke, Tanja; Bristow, James; Eisen, Jonathan; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter

    2010-01-01

    Cellulomonas flavigena (Kellerman and McBeth 1912) Bergey et al. 1923 is the type species of the genus Cellulomonas of the actinobacterial family Cellulomonadaceae. Members of the genus Cellulomonas are of special interest for their ability to degrade cellulose and hemicellulose, particularly with regard to the use of biomass as an alternative energy source. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of a member of the genus Cellulomonas, and next to the human pathogen Tropheryma whipplei the second complete genome sequence within the actinobacterial family Cellulomonadaceae. The 4,123,179 bp long single replicon genome with its 3,735 protein-coding and 53 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  4. Complete genome sequence of Cellulomonas flavigena type strain (134).

    PubMed

    Abt, Birte; Foster, Brian; Lapidus, Alla; Clum, Alicia; Sun, Hui; Pukall, Rüdiger; Lucas, Susan; Glavina Del Rio, Tijana; Nolan, Matt; Tice, Hope; Cheng, Jan-Fang; Pitluck, Sam; Liolios, Konstantinos; Ivanova, Natalia; Mavromatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Goodwin, Lynne; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Rohde, Manfred; Göker, Markus; Woyke, Tanja; Bristow, James; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter

    2010-07-29

    Cellulomonas flavigena (Kellerman and McBeth 1912) Bergey et al. 1923 is the type species of the genus Cellulomonas of the actinobacterial family Cellulomonadaceae. Members of the genus Cellulomonas are of special interest for their ability to degrade cellulose and hemicellulose, particularly with regard to the use of biomass as an alternative energy source. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of a member of the genus Cellulomonas, and next to the human pathogen Tropheryma whipplei the second complete genome sequence within the actinobacterial family Cellulomonadaceae. The 4,123,179 bp long single replicon genome with its 3,735 protein-coding and 53 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  5. Genome sequencing and analysis of the model grass Brachypodium distachyon

    SciTech Connect

    Yang, Xiaohan; Kalluri, Udaya C; Tuskan, Gerald A

    2010-01-01

    Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum genomes shows a precise history of genome evolution across a broad diversity of the grasses, and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat. The high-quality genome sequence, coupled with ease of cultivation and transformation, small size and rapid life cycle, will help Brachypodium reach its potential as an important model system for developing new energy and food crops.

  6. Complete genome sequence of Cellulomonas flavigena type strain (134T)

    PubMed Central

    Abt, Birte; Foster, Brian; Lapidus, Alla; Clum, Alicia; Sun, Hui; Pukall, Rüdiger; Lucas, Susan; Glavina Del Rio, Tijana; Nolan, Matt; Tice, Hope; Cheng, Jan-Fang; Pitluck, Sam; Liolios, Konstantinos; Ivanova, Natalia; Mavromatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Goodwin, Lynne; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D.; Rohde, Manfred; Göker, Markus; Woyke, Tanja; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter

    2010-01-01

    Cellulomonas flavigena (Kellerman and McBeth 1912) Bergey et al. 1923 is the type species of the genus Cellulomonas of the actinobacterial family Cellulomonadaceae. Members of the genus Cellulomonas are of special interest for their ability to degrade cellulose and hemicellulose, particularly with regard to the use of biomass as an alternative energy source. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of a member of the genus Cellulomonas, and next to the human pathogen Tropheryma whipplei the second complete genome sequence within the actinobacterial family Cellulomonadaceae. The 4,123,179 bp long single replicon genome with its 3,735 protein-coding and 53 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304688

  7. The Release 6 reference sequence of the Drosophila melanogaster genome

    DOE PAGES

    Hoskins, Roger A.; Carlson, Joseph W.; Wan, Kenneth H.; ...

    2015-01-14

    Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and completeness of this sequence continues to be important to further progress. We previously described improvement of the 117-Mb sequence in the euchromatic portion of the genome and 21 Mb in the heterochromatic portion, using a whole-genome shotgun assembly, BAC physical mapping, and clone-based finishing. Here, we report an improved reference sequence of the single-copy and middle-repetitive regions of the genome, produced using cytogenetic mapping to mitotic and polytene chromosomes, clone-based finishing and BAC fingerprint verification, ordering of scaffolds by alignment to cDNA sequences, incorporation of other map and sequence data, and validation by whole-genome optical restriction mapping. These data substantially improve the accuracy and completeness of the reference sequence and the order and orientation of sequence scaffolds into chromosome arm assemblies. Representation of the Y chromosome and other heterochromatic regions is particularly improved. The new 143.9-Mb reference sequence, designated Release 6, effectively exhausts clone-based technologies for mapping and sequencing. Highly repeat-rich regions, including large satellite blocks and functional elements such as the ribosomal RNA genes and the centromeres, are largely inaccessible to current sequencing and assembly methods and remain poorly represented. In conclusion, further significant improvements will require sequencing technologies that do not depend on molecular cloning and that produce very long reads.

  8. The Release 6 reference sequence of the Drosophila melanogaster genome.

    PubMed

    Hoskins, Roger A; Carlson, Joseph W; Wan, Kenneth H; Park, Soo; Mendez, Ivonne; Galle, Samuel E; Booth, Benjamin W; Pfeiffer, Barret D; George, Reed A; Svirskas, Robert; Krzywinski, Martin; Schein, Jacqueline; Accardo, Maria Carmela; Damia, Elisabetta; Messina, Giovanni; Méndez-Lago, María; de Pablos, Beatriz; Demakova, Olga V; Andreyeva, Evgeniya N; Boldyreva, Lidiya V; Marra, Marco; Carvalho, A Bernardo; Dimitri, Patrizio; Villasante, Alfredo; Zhimulev, Igor F; Rubin, Gerald M; Karpen, Gary H; Celniker, Susan E

    2015-03-01

    Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and completeness of this sequence continues to be important to further progress. We previously described improvement of the 117-Mb sequence in the euchromatic portion of the genome and 21 Mb in the heterochromatic portion, using a whole-genome shotgun assembly, BAC physical mapping, and clone-based finishing. Here, we report an improved reference sequence of the single-copy and middle-repetitive regions of the genome, produced using cytogenetic mapping to mitotic and polytene chromosomes, clone-based finishing and BAC fingerprint verification, ordering of scaffolds by alignment to cDNA sequences, incorporation of other map and sequence data, and validation by whole-genome optical restriction mapping. These data substantially improve the accuracy and completeness of the reference sequence and the order and orientation of sequence scaffolds into chromosome arm assemblies. Representation of the Y chromosome and other heterochromatic regions is particularly improved. The new 143.9-Mb reference sequence, designated Release 6, effectively exhausts clone-based technologies for mapping and sequencing. Highly repeat-rich regions, including large satellite blocks and functional elements such as the ribosomal RNA genes and the centromeres, are largely inaccessible to current sequencing and assembly methods and remain poorly represented. Further significant improvements will require sequencing technologies that do not depend on molecular cloning and that produce very long reads.

  9. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

    SciTech Connect

    Hu, Tina T.; Pattyn, Pedro; Bakker, Erica G.; Cao, Jun; Cheng, Jan-Fang; Clark, Richard M.; Fahlgren, Noah; Fawcett, Jeffrey A.; Grimwood, Jane; Gundlach, Heidrun; Haberer, Georg; Hollister, Jesse D.; Ossowski, Stephan; Ottilar, Robert P.; Salamov, Asaf A.; Schneeberger, Korbinian; Spannagl, Manuel; Wang, Xi; Yang, Liang; Nasrallah, Mikhail E.; Bergelson, Joy; Carrington, James C.; Gaut, Brandon S.; Schmutz, Jeremy; Mayer, Klaus F. X.; Van de Peer, Yves; Grigoriev, Igor V.; Nordborg, Magnus; Weigel, Detlef; Guo, Ya-Long

    2011-04-29

    In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies, etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how A. thaliana lost half its genome in a few million years and lived to tell the tale. The rather surprising conclusion is that there is not a single genomic feature that accounts for the reduced genome, but that every aspect (centromeres, intergenic regions, transposable elements, gene family number) is affected through hundreds of thousands of cuts. This strongly suggests that overall genome size in itself is what has been under selection, a suggestion that is strongly supported by our demonstration (using population genetics data from A. thaliana) that new deletions seem to be driven to fixation.

  10. Draft Genome Sequences of 11 Lactococcus lactis subsp. cremoris Strains

    PubMed Central

    Backus, Lennart; Boekhorst, Jos; Dijkstra, Annereinou; Beerthuyzen, Marke; Siezen, Roland J.; Bachmann, Herwig; van Hijum, Sacha A. F. T.

    2017-01-01

    ABSTRACT The lactic acid bacterium Lactococcus lactis is widely used for the fermentation of dairy products. Here, we present the draft genome sequences of 11 L. lactis subsp. cremoris strains isolated from different environments. PMID:28302789

  11. Draft Genome Sequences of 24 Lactococcus lactis Strains

    PubMed Central

    Backus, Lennart; Wels, Michiel; Boekhorst, Jos; Dijkstra, Annereinou R.; Beerthuyzen, Marke; Kelly, William J.; Siezen, Roland J.; van Hijum, Sacha A. F. T.

    2017-01-01

    ABSTRACT The lactic acid bacterium Lactococcus lactis is widely used for the production of fermented dairy products. Here, we present the draft genome sequences of 24 L. lactis strains isolated from different environments and geographic locations. PMID:28360177

  12. Complete genome sequence of Allochromatium vinosum DSM 180T

    PubMed Central

    Weissgerber, Thomas; Zigann, Renate; Bruce, David; Chang, Yun-juan; Detter, John C.; Han, Cliff; Hauser, Loren; Jeffries, Cynthia D.; Land, Miriam; Munk, A. Christine; Tapia, Roxanne; Dahl, Christiane

    2011-01-01

    Allochromatium vinosum formerly Chromatium vinosum is a mesophilic purple sulfur bacterium belonging to the family Chromatiaceae in the bacterial class Gammaproteobacteria. The genus Allochromatium contains currently five species. All members were isolated from freshwater, brackish water or marine habitats and are predominately obligate phototrophs. Here we describe the features of the organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the Chromatiaceae within the purple sulfur bacteria thriving in globally occurring habitats. The 3,669,074 bp genome with its 3,302 protein-coding and 64 RNA genes was sequenced within the Joint Genome Institute Community Sequencing Program. PMID:22675582

  13. Genome sequence of the fish pathogen Flavobacterium columnare ATCC 49512

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Flavobacterium columnare is a Gram-negative, rod shaped, motile, and highly prevalent fish pathogen causing columnaris disease in freshwater fish worldwide. Here, we present the complete genome sequence of F. columnare strain ATCC 49512. ...

  14. Complete Genome Sequences of Six Strains of the Genus Methylobacterium

    SciTech Connect

    Marx, Christopher J; Bringel, Francoise O.; Christoserdova, Ludmila; Moulin, Lionel; Ul Haque, Muhammad Farhan; Fleischman, Darrell E.; Gruffaz, Christelle; Jourand, Philippe; Knief, Claudia; Lee, Ming-Chun; Muller, Emilie E. L.; Nadalig, Thierry; Peyraud, Remi; Roselli, Sandro; Russ, Lina; Goodwin, Lynne A.; Ivanov, Pavel S.; Ivanova, N; Kyrpides, Nikos C; Lajus, Aurelie; Medigue, Claudine; Nolan, Matt; Woyke, Tanja; Stolyar, Sergey; Vorholt, Julia A.; Vuilleumier, Stephane

    2012-01-01

    The complete and assembled genome sequences were determined for six strains of the alphaproteobacterial genus Methylobacterium, chosen for their key adaptations to different plant-associated niches and environmental constraints.

  15. Complete genome sequences of six strains of the genus Methylobacterium

    SciTech Connect

    Marx, Christopher J; Bringel, Francoise O.; Christoserdova, Ludmila; Moulin, Lionel; Farhan Ul Haque, Muhammad; Fleischman, Darrell E.; Gruffaz, Christelle; Jourand, Philippe; Knief, Claudia; Lee, Ming-Chun; Muller, Emilie E. L.; Nadalig, Thierry; Peyraud, Remi; Roselli, Sandro; Russ, Lina; Aguero, Fernan; Goodwin, Lynne A.; Ivanova, N; Kyrpides, Nikos C; Lajus, Aurelie; Medigue, Claudine; Nolan, Matt; Woyke, Tanja; Stolyar, Sergey; Vorholt, Julia A.; Vuilleumier, Stephane

    2012-01-01

    The complete and assembled genome sequences were determined for six strains of the alphaproteobacterial genus Methylobacterium, chosen for their key adaptations to different plant-associated niches and environmental constraints.

  16. Genome Sequence of the Immunomodulatory Strain Bifidobacterium bifidum LMG 13195

    PubMed Central

    Gueimonde, Miguel; Ventura, Marco; Margolles, Abelardo

    2012-01-01

    In this work, we report the genome sequences of Bifidobacterium bifidum strain LMG13195. Results from our research group show that this strain is able to interact with human immune cells, generating functional regulatory T cells. PMID:23209243

  17. Complete Genome Sequence of Rahnella aquatilis CIP 78.65

    SciTech Connect

    Martinez, Robert J; Bruce, David; Detter, J C; Goodwin, Lynne A.; Han, James; Han, Cliff; Held, Brittany; Land, Miriam L; Mikhailova, Natalia; Nolan, Matt; Pennacchio, Len; Pitluck, Sam; Tapia, Roxanne; Woyke, Tanja; Sobeckya, Patricia A.

    2012-01-01

    Rahnella aquatilis CIP 78.65 is a gammaproteobacterium isolated from a drinking water source in Lille, France. Here we report the complete genome sequence of Rahnella aquatilis CIP 78.65, the type strain of R. aquatilis.

  18. Genome Sequence of Escherichia coli Tailed Phage Utah

    PubMed Central

    Leavitt, Justin C.; Heitkamp, Alexandra J.; Bhattacharjee, Ananda S.; Gilcrease, Eddie B.

    2017-01-01

    ABSTRACT Escherichia coli bacteriophage Utah is a member of the chi-like tailed phage cluster in the Siphoviridae family. We report here the complete 59,024-bp sequence of the genome of phage Utah. PMID:28360173

  19. Draft Genome Sequences of Nine Cyanobacterial Strains from Diverse Habitats

    PubMed Central

    Zhu, Tao; Hou, Shengwei

    2017-01-01

    ABSTRACT Here, we report the annotated draft genome sequences of nine different cyanobacteria, which were originally collected from different habitats, including hot springs, terrestrial, freshwater, and marine environments, and cover four of the five morphological subsections of cyanobacteria. PMID:28254973

  20. Genome Sequence of Mycoplasma hyorhinis Isolated from Cell Cultures

    PubMed Central

    Cibulski, Samuel Paulo; Siqueira, Franciele Maboni; Teixeira, Thais Fumaco; Mayer, Fabiana Quoos; Almeida, Luiz Gonzaga

    2016-01-01

    Mycoplasmas are major contaminants of mammalian cell cultures. Here, the complete genome sequence of Mycoplasma hyorhinis recovered from Madin-Darby bovine kidney (MDBK) cells is reported. PMID:27738034

  1. Draft Genome Sequences of Three Mycobacterium chimaera Respiratory Isolates

    PubMed Central

    Roycroft, Emma; Raftery, Philomena; Mok, Simone; Fitzgibbon, Margaret; Rogers, Thomas R.

    2015-01-01

    Mycobacterium chimaera is an opportunistic human pathogen implicated in both pulmonary and cardiovascular infections. Here, we report the draft genome sequences of three strains isolated from human respiratory specimens. PMID:26634757

  2. Sequence analysis of the complete mitochondrial genome of Youxian sheldrake.

    PubMed

    He, Shao-Ping; Liu, Li-Li; Yu, Qi-Fang; Li, Si; He, Jian-Hua

    2016-01-01

    The Youxian sheldrake is an excellent native breed of Hunan province, China. The complete mitochondrial (mt) genome sequence plays an important role in the accurate determination of phylogenetic relationships among metazoans. This is the first study to determine the complete mitochondrial genome sequence of the Youxian sheldrake using PCR-based amplification and Sanger sequencing. The characteristics of the entire mitochondrial genome were analyzed in detail. The total length of the mitogenome is 16,605 bp, with a base composition of 29.21% A, 22.18% T, 32.84% C and 15.77% G. It contains 2 ribosomal RNA genes, 13 protein-coding genes, 22 transfer RNA genes and a major non-coding control region (D-loop region). The complete mitochondrial genome sequence of the Youxian sheldrake provides important data for further study of poultry phylogenetics, as well as for genetics and breeding.

  3. Draft Genome Sequences of Nine Cyanobacterial Strains from Diverse Habitats.

    PubMed

    Zhu, Tao; Hou, Shengwei; Lu, Xuefeng; Hess, Wolfgang R

    2017-03-02

    Here, we report the annotated draft genome sequences of nine different cyanobacteria, which were originally collected from different habitats, including hot springs, terrestrial, freshwater, and marine environments, and cover four of the five morphological subsections of cyanobacteria.

  4. Draft Genome Sequences of Three Mycobacterium chimaera Respiratory Isolates.

    PubMed

    Mac Aogáin, Micheál; Roycroft, Emma; Raftery, Philomena; Mok, Simone; Fitzgibbon, Margaret; Rogers, Thomas R

    2015-12-03

    Mycobacterium chimaera is an opportunistic human pathogen implicated in both pulmonary and cardiovascular infections. Here, we report the draft genome sequences of three strains isolated from human respiratory specimens.

  5. Draft Genome Sequence of Paecilomyces hepiali, Isolated from Cordyceps sinensis.

    PubMed

    Yu, Yi; Wang, Wenting; Wang, Linping; Pang, Fang; Guo, Lanping; Song, Lai; Liu, Guiming; Feng, Chengqiang

    2016-07-07

    Paecilomyces hepiali is an endoparasitic fungus that commonly exists in the natural Cordyceps sinensis. Here, we report the draft genome sequence of P. hepiali, which will facilitate the exploitation of medicinal compounds produced by the fungus.

  6. Draft Genome Sequence of Paecilomyces hepiali, Isolated from Cordyceps sinensis

    PubMed Central

    Yu, Yi; Wang, Wenting; Wang, Linping; Pang, Fang; Guo, Lanping; Song, Lai

    2016-01-01

    Paecilomyces hepiali is an endoparasitic fungus that commonly exists in the natural Cordyceps sinensis. Here, we report the draft genome sequence of P. hepiali, which will facilitate the exploitation of medicinal compounds produced by the fungus. PMID:27389266

  7. Draft Genome Sequences of Gammaproteobacterial Methanotrophs Isolated from Marine Ecosystems

    PubMed Central

    Flynn, James D.; Hirayama, Hisako; Sakai, Yasuyoshi; Dunfield, Peter F.; Knief, Claudia; Op den Camp, Huub J. M.; Jetten, Mike S. M.; Khmelenina, Valentina N.; Trotsenko, Yuri A.; Murrell, J. Colin; Semrau, Jeremy D.; Svenning, Mette M.; Stein, Lisa Y.; Kyrpides, Nikos; Shapiro, Nicole; Woyke, Tanja; Bringel, Françoise; Vuilleumier, Stéphane; DiSpirito, Alan A.

    2016-01-01

    The genome sequences of Methylobacter marinus A45, Methylobacter sp. strain BBA5.1, and Methylomarinum vadi IT-4 were obtained. These aerobic methanotrophs are typical members of coastal and hydrothermal vent marine ecosystems. PMID:26798114

  8. Genome sequence of vanilla distortion mosaic virus infecting Coriandrum sativum.

    PubMed

    Adams, I P; Rai, S; Deka, M; Harju, V; Hodges, T; Hayward, G; Skelton, A; Fox, A; Boonham, N

    2014-12-01

    The 9573-nucleotide genome of a potyvirus was sequenced from a Coriandrum sativum plant from India with viral symptoms. On analysis, this virus was shown to have greater than 85 % nucleotide sequence identity to vanilla distortion mosaic virus (VDMV). Analysis of the putative coat protein sequence confirmed that this virus was in fact VDMV, with greater than 91 % amino acid sequence identity. The genome appears to encode a 3083-amino-acid polyprotein potentially cleaved into the 10 mature proteins expected in potyviruses. Phylogenetic analysis confirmed that VDMV is a distinct but ungrouped member of the genus Potyvirus.

  9. Intra-species sequence comparisons for annotating genomes

    SciTech Connect

    Boffelli, Dario; Weer, Claire V.; Weng, Li; Lewis, Keith D.; Shoukry, Malak I.; Pachter, Lior; Keys, David N.; Rubin, Edward M.

    2004-07-15

    Analysis of sequence variation among members of a single species offers a potential approach to identify functional DNA elements responsible for biological features unique to that species. Due to its high rate of allelic polymorphism and ease of genetic manipulability, we chose the sea squirt, Ciona intestinalis, to explore intra-species sequence comparisons for genome annotation. A large number of C. intestinalis specimens were collected from four continents and a set of genomic intervals amplified, resequenced and analyzed to determine the mutation rates at each nucleotide in the sequence. We found that regions with low mutation rates efficiently demarcated functionally constrained sequences: these include a set of noncoding elements, which we showed in C. intestinalis transgenic assays to act as tissue-specific enhancers, as well as the location of coding sequences. This illustrates that comparisons of multiple members of a species can be used for genome annotation, suggesting a path for the annotation of the sequenced genomes of organisms occupying uncharacterized phylogenetic branches of the animal kingdom, and raises the possibility that the resequencing of a large number of Homo sapiens individuals might be used to annotate the human genome and identify sequences defining traits unique to our species. The sequence data from this study have been submitted to GenBank under accession nos. AY667278-AY667407.

  10. Functional Genomics for Epithelial-Mesenchymal Transition in Breast Cancer

    DTIC Science & Technology

    2012-09-01

    leads for therapeutic targeting in breast cancer. We are employing the high throughput functional genomic screens using epithelial mesenchymal...of sequencing from in vitro and in vivo hits in stream. We anticipate completion in the coming year. Body Task 1: To identify gene products...focus on Vim induction at the invasive edge of formed tumors generated by shRNA transduction. Task 2: To identify gene products that may

  11. Genome Sequence of Mycobacterium Phage Waterfoul

    PubMed Central

    Jackson, Paige N.; Embry, Ella K.; Johnson, Christa O.; Watson, Tiara L.; Weast, Sayre K.; DeGraw, Caroline J.; Douglas, Jessica R.; Sellers, J. Michael; D’Angelo, William A.

    2016-01-01

    Waterfoul is a newly isolated temperate siphovirus of Mycobacterium smegmatis mc2155. It was identified as a member of the K5 cluster of Mycobacterium phages and has a 61,248-bp genome with 95 predicted genes. PMID:27856585

  12. Genomic sequence for the aflatoxigenic filamentous fungus Aspergillus nomius

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genome of the A. nomius type strain was sequenced using a personal genome machine. Annotation of the genes was undertaken, followed by gene ontology and an investigation into the number of secondary metabolite clusters. Comparative studies with other Aspergillus species involved shared/unique ge...

  13. A snapshot of the emerging tomato genome sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genome of tomato (Solanum lycopersicum) is being sequenced by an international consortium of 10 countries (Korea, China, the United Kingdom, India, the Netherlands, France, Japan, Spain, Italy and the United States) as part of a larger initiative called the ‘International Solanaceae Genome Proje...

  14. Complete Genome Sequence of the Oncolytic Sendai virus Strain Moscow

    PubMed Central

    Zainutdinov, Sergei S.; Tikunov, Artem Y.; Matveeva, Olga V.

    2016-01-01

    We report here the complete genome sequence of Sendai virus Moscow strain. Anecdotal evidence for the efficacy of oncolytic virotherapy exists for this strain. The RNA genome of the Moscow strain is 15,384 nucleotides in length and differs from the nearest strain, BB1, by 18 nucleotides and 11 amino acids. PMID:27516510

  15. Draft genome sequence of Phomopsis longicolla MSPL 10-6

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Phomopsis longicolla T.W. Hobbs is the primary cause of Phomopsis seed decay in soybean. We report the de novo assembled draft genome sequence of P. longicolla isolate MSPL10-6 with a 54.8-fold depth of coverage. The resulting draft genome was estimated to be approximately 64 Mb in size with an over...

  16. Complete Mitochondrial Genome Sequence of the Pezizomycete Pyronema confluens

    PubMed Central

    2016-01-01

    The complete mitochondrial genome of the ascomycete Pyronema confluens has been sequenced. The circular genome has a size of 191 kb and contains 48 protein-coding genes, 26 tRNA genes, and two rRNA genes. Of the protein-coding genes, 14 encode conserved mitochondrial proteins, and 31 encode predicted homing endonuclease genes. PMID:27174271

  17. Mitochondrial Genome Sequence of the Glass Sponge Oopsacas minuta

    PubMed Central

    Santini, Sébastien; Rocher, Caroline; Le Bivic, André

    2015-01-01

    We report the complete mitochondrial genome sequence of the Mediterranean glass sponge Oopsacas minuta. This 19-kb mitochondrial genome has 24 noncoding genes (22 tRNAs and 2 rRNAs) and 14 protein-encoding genes coding for 11 subunits of respiratory chain complexes and 3 ATP synthase subunits. PMID:26227597

  18. Complete genome sequence of pronghorn virus, a pestivirus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complete genome sequence of Pronghorn virus, a member of the Pestivirus genus of the Flaviviridae, was determined. The virus, originally isolated from a pronghorn antelope, had a genome of 12,287 nucleotides with a single open reading frame of 11,694 bases encoding 3898 amino acids....

  19. Complete genome sequence of pronghorn virus, a pestivirus.

    PubMed

    Neill, John D; Ridpath, Julia F; Fischer, Nicole; Grundhoff, Adam; Postel, Alexander; Becher, Paul

    2014-06-12

    The complete genome sequence of pronghorn virus, a member of the Pestivirus genus of the family Flaviviridae, was determined here. The virus, originally isolated from a pronghorn antelope, has a genome of 12,273 nucleotides, with a single open reading frame of 11,694 bases encoding 3,897 amino acids.

  20. Complete Genome Sequence of Pronghorn Virus, a Pestivirus

    PubMed Central

    Ridpath, Julia F.; Fischer, Nicole; Grundhoff, Adam; Postel, Alexander; Becher, Paul

    2014-01-01

    The complete genome sequence of pronghorn virus, a member of the Pestivirus genus of the family Flaviviridae, was determined here. The virus, originally isolated from a pronghorn antelope, has a genome of 12,273 nucleotides, with a single open reading frame of 11,694 bases encoding 3,897 amino acids. PMID:24926058

  1. Complete Genome Sequence of Bacillus thuringiensis Bacteriophage Smudge

    PubMed Central

    Cornell, Jessica L.; Breslin, Eileen; Schuhmacher, Zachary; Himelright, Madison; Berluti, Cassandra; Boyd, Charles; Carson, Rachel; Del Gallo, Elle; Giessler, Caris; Gilliam, Benjamin; Heatherly, Catherine; Nevin, Julius; Nguyen, Bryan; Nguyen, Justin; Parada, Jocelyn; Sutterfield, Blake; Tukruni, Muruj

    2016-01-01

    Smudge, a bacteriophage enriched from soil using Bacillus thuringiensis DSM-350 as the host, had its complete genome sequenced. Smudge is a myovirus with a genome consisting of 292 genes and was identified as belonging to the C1 cluster of Bacillus phages. PMID:27540049

  2. Draft Genome Sequence of Tannerella forsythia Clinical Isolate 9610

    PubMed Central

    Hanson-Drury, Sesha; Liu, Quanhui; Vo, Anh T.; Kim, Michelle; Watling, Michael; Bumgarner, Roger S.

    2017-01-01

    We present here the draft genome sequence of Tannerella forsythia 9610, a clinical isolate obtained from a periodontitis patient. The genome is composed of 79 scaffolds with 82 contigs, for a length of 3,201,941 bp and a G+C content of 47.3%. PMID:28336586
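    The G+C figure quoted in records like this one is the fraction of guanine and cytosine among the called bases of the assembly. A minimal sketch follows; the function name is hypothetical, and skipping ambiguous bases such as N is an assumption, not something the record specifies.

```python
def gc_content(sequence: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    # Ambiguity codes such as N are left out of the denominator.
    return gc / (gc + at)


print(f"{gc_content('ATGCGCNNTA'):.1%}")  # 50.0%
```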

  3. Draft genome sequences of 10 strains of the genus Exiguobacterium.

    PubMed

    Vishnivetskaya, Tatiana A; Chauhan, Archana; Layton, Alice C; Pfiffner, Susan M; Huntemann, Marcel; Copeland, Alex; Chen, Amy; Kyrpides, Nikos C; Markowitz, Victor M; Palaniappan, Krishna; Ivanova, Natalia; Mikhailova, Natalia; Ovchinnikova, Galina; Andersen, Evan W; Pati, Amrita; Stamatis, Dimitrios; Reddy, T B K; Shapiro, Nicole; Nordberg, Henrik P; Cantor, Michael N; Hua, X Susan; Woyke, Tanja

    2014-10-16

    High-quality draft genome sequences were determined for 10 Exiguobacterium strains in order to provide insight into their evolutionary strategies for speciation and environmental adaptation. The selected genomes include psychrotrophic and thermophilic species from a range of habitats, which will allow for a comparison of metabolic pathways and stress response genes.

  4. Draft Genome Sequences of 10 Strains of the Genus Exiguobacterium

    PubMed Central

    Chauhan, Archana; Layton, Alice C.; Pfiffner, Susan M.; Huntemann, Marcel; Copeland, Alex; Chen, Amy; Kyrpides, Nikos C.; Markowitz, Victor M.; Palaniappan, Krishna; Ivanova, Natalia; Mikhailova, Natalia; Ovchinnikova, Galina; Andersen, Evan W.; Pati, Amrita; Stamatis, Dimitrios; Reddy, T. B. K.; Shapiro, Nicole; Nordberg, Henrik P.; Cantor, Michael N.; Hua, X. Susan; Woyke, Tanja

    2014-01-01

    High-quality draft genome sequences were determined for 10 Exiguobacterium strains in order to provide insight into their evolutionary strategies for speciation and environmental adaptation. The selected genomes include psychrotrophic and thermophilic species from a range of habitats, which will allow for a comparison of metabolic pathways and stress response genes. PMID:25323723

  5. Draft Genome Sequence of Enterobacter cloacae Strain JD6301

    PubMed Central

    Wilson, Jessica G.; French, William T.; Lipzen, Anna; Martin, Joel; Schackwitz, Wendy; Woyke, Tanja; Shapiro, Nicole; Bullard, James W.; Champlin, Franklin R.

    2014-01-01

    Enterobacter cloacae strain JD6301 was isolated from a mixed culture with wastewater collected from a municipal treatment facility and oleaginous microorganisms. A draft genome sequence of this organism indicates that it has a genome size of 4,772,910 bp, an average G+C content of 53%, and 4,509 protein-coding genes. PMID:24874669

  6. Insights into vertebrate evolution from the chicken genome sequence

    PubMed Central

    Furlong, Rebecca F

    2005-01-01

    The chicken has recently joined the ever-growing list of fully sequenced animal genomes. Its unique features include expanded gene families involved in egg and feather production as well as more surprising large families, such as those for olfactory receptors. Comparisons with other vertebrate genomes move us closer to defining a set of essential vertebrate genes. PMID:15693954

  7. Draft Genome Sequences of Itaconate-Producing Ustilaginaceae

    PubMed Central

    Geiser, Elena; Ludwig, Florian; Zambanini, Thiemo; Blank, Lars M.

    2016-01-01

    Some smut fungi of the family Ustilaginaceae produce itaconate from glucose. De novo genome sequencing of nine itaconate-producing Ustilaginaceae revealed genome sizes between 19 and 25 Mbp. Comparison to the itaconate cluster of U. maydis MB215 revealed all essential genes for itaconate production contributing to metabolic engineering for improving itaconate production. PMID:27979931

  8. Complete Genome Sequence of Pseudomonas aeruginosa Phage AAT-1.

    PubMed

    Andrade-Domínguez, Andrés; Kolter, Roberto

    2016-08-25

    Aspects of the interaction between phages and animals are of interest and importance for medical applications. Here, we report the genome sequence of the lytic Pseudomonas phage AAT-1, isolated from mammalian serum. AAT-1 is a double-stranded DNA phage, with a genome of 57,599 bp, containing 76 predicted open reading frames.

  9. Complete Genome Sequence of Desulfovibrio piger FI11049

    PubMed Central

    Nueno Palop, Carmen; Mayer, Melinda J.; Crost, Emmanuelle; Narbad, Arjan

    2017-01-01

    The complete genome sequence of Desulfovibrio piger FI11049 was determined. The genome consists of a single circular chromosome of 2,807,531 bp encoding seven rRNA operons, 76 tRNA genes, and 2,535 coding genes. PMID:28209813

  10. Draft genome sequence of the silver pomfret fish, Pampus argenteus.

    PubMed

    AlMomin, Sabah; Kumar, Vinod; Al-Amad, Sami; Al-Hussaini, Mohsen; Dashti, Talal; Al-Enezi, Khaznah; Akbar, Abrar

    2016-01-01

    Silver pomfret, Pampus argenteus, is a fish species from coastal waters. Despite its high commercial value, this edible fish has not been sequenced. Hence, its genetic and genomic studies have been limited. We report the first draft genome sequence of the silver pomfret obtained using a Next Generation Sequencing (NGS) technology. We assembled 38.7 Gb of nucleotides into scaffolds of 350 Mb with N50 of about 1.5 kb, using high quality paired end reads. These scaffolds represent 63.7% of the estimated silver pomfret genome length. The newly sequenced and assembled genome has 11.06% repetitive DNA regions, and this percentage is comparable to that of the tilapia genome. The genome analysis predicted 16,322 genes. About 91% of these genes showed homology with known proteins. Many gene clusters were annotated to protein and fatty-acid metabolism pathways that may be important in the context of the meat texture and immune system developmental processes. The reference genome can pave the way for the identification of many other genomic features that could improve breeding and population-management strategies, and it can also help characterize the genetic diversity of P. argenteus.
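    The N50 quoted for assemblies such as this one is conventionally the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembled bases. The sketch below illustrates that common definition with made-up lengths; it is not the authors' pipeline, and the function name is hypothetical.

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # 8
```

    As a separate check on the abstract's own numbers, 350 Mb of scaffolds representing 63.7% of the genome implies an estimated genome length of roughly 350 / 0.637, or about 549 Mb.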

  11. Complete Genome Sequence of Spiroplasma sp. TU-14

    PubMed Central

    Lo, Wen-Sui; Haryono, Mindia; Gasparich, Gail E.

    2017-01-01

    Spiroplasma sp. TU-14 was isolated from a contaminated sample of Entomoplasma lucivorax PIPN-2T obtained from the International Organization for Mycoplasmology collection. Here, we report the complete genome sequence of this bacterium to facilitate the investigation of its biology and the comparative genomics among Spiroplasma spp. PMID:28082500

  12. Complete genome sequence of Aeromonas hydrophila AL06-06

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Aeromonas hydrophila occurs in freshwater environments and infects fish and mammals. In this work, we report the complete genome sequence of Aeromonas hydrophila AL06-06, which was isolated from diseased goldfish and is being used for comparative genomic studies with A. hydrophila strains causing ba...

  13. Draft Genome Sequence of Subantarctic Rhodococcus sp. Strain 1139

    PubMed Central

    Baker, Anthony L.; Charleston, Michael A.; Britz, Margaret L.

    2017-01-01

    The draft genome sequence of subantarctic Rhodococcus sp. strain 1139 is reported here. The genome size is 7.04 Mb with high G+C content (62.3%) and it contains a large number of genes involved in lipid synthesis. This lipid synthesis system is characteristic of oleaginous Actinobacteria, which are of interest for biofuel production. PMID:28385836

  14. Complete genome sequence of Streptococcus thermophilus strain ND03.

    PubMed

    Sun, Zhihong; Chen, Xia; Wang, Jicheng; Zhao, Wenjing; Shao, Yuyu; Wu, Lan; Zhou, Zhemin; Sun, Tiansong; Wang, Lei; Meng, He; Zhang, Heping; Chen, Wei

    2011-02-01

    Streptococcus thermophilus strain ND03 is a Chinese commercial dairy starter used for the manufacture of yogurt. It was isolated from naturally fermented yak milk in Qinghai, China. We present here the complete genome sequence of ND03 and compare it to three other published genomes of Streptococcus thermophilus strains.

  15. Single-Molecule Sequencing of the Drosophila serrata Genome

    PubMed Central

    Allen, Scott L.; Delaney, Emily K.; Kopp, Artyom; Chenoweth, Stephen F.

    2017-01-01

    Long-read sequencing technology promises to greatly enhance de novo assembly of genomes for nonmodel species. Although the error rates of long reads have been a stumbling block, sequencing at high coverage permits the self-correction of many errors. Here, we sequence and de novo assemble the genome of Drosophila serrata, a species from the montium subgroup that has been well-studied for latitudinal clines, sexual selection, and gene expression, but which lacks a reference genome. Using 11 PacBio single-molecule real-time (SMRT) cells, we generated 12 Gbp of raw sequence data comprising ∼65× whole-genome coverage. Read lengths averaged 8940 bp (NRead50 12,200) with the longest read at 53 kbp. We self-corrected reads using the PBDagCon algorithm and assembled the genome using the MHAP algorithm within the PBcR assembler. Total genome length was 198 Mbp with an N50 just under 1 Mbp. Contigs displayed a high degree of chromosome arm-level conservation with the D. melanogaster genome and many could be sensibly placed on the D. serrata physical map. We also provide an initial annotation for this genome using in silico gene predictions that were supported by RNA-seq data. PMID:28143951

  16. Complete genome sequence of Campylobacter gracilis ATCC 33236T

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The human oral pathogen Campylobacter gracilis has been isolated from periodontal and endodontal infections, and also from non-oral head, neck or lung infections. This study describes the whole-genome sequence of the human periodontal isolate ATCC 33236T (=FDC 1084), which is the first closed genome...

  17. First Complete Genome Sequence of Felis catus Gammaherpesvirus 1

    PubMed Central

    Lee, Justin S.; Vuyisich, Momchilo; Chain, Patrick; Lo, Chien-Chi; Kronmiller, Brent; Bracha, Shay; Avery, Anne C.; VandeWoude, Sue

    2015-01-01

    We sequenced the complete genome of Felis catus gammaherpesvirus 1 (FcaGHV1) from lymph node DNA of an infected cat. The genome includes a 121,556-nucleotide unique region with 87 predicted open reading frames (61 gammaherpesvirus conserved and 26 unique) flanked by multiple copies of a 966-nucleotide terminal repeat. PMID:26543105

  18. The tomato genome sequence provides insight into fleshy fruit evolution

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genome of the inbred tomato cultivar ‘Heinz 1706’ was sequenced and assembled using a combination of Sanger and “next generation” technologies. The predicted genome size is ~900 Mb, consistent with prior estimates, of which 760 Mb were assembled in 91 scaffolds aligned to the 12 tomato chromosom...

  19. Genome sequences of Listeria monocytogenes strains with resistance to arsenic

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Listeria monocytogenes frequently exhibits resistance to arsenic. We report here the draft genome sequences of eight genetically diverse arsenic-resistant L. monocytogenes strains from human listeriosis and food-associated environments. Availability of these genomes would help to elucidate the role ...

  20. Draft genome sequence of Thermincola potens strain JR

    SciTech Connect

    Byrne-Bailey, K.G.; Wrighton, K.C.; Melnyk, R.A.; Agbo, P.; Hazen, T.C.; Coates, J.D.

    2010-07-01

    'Thermincola potens' strain JR is one of the first Gram-positive dissimilatory metal-reducing bacteria (DMRB) for which there is a complete genome sequence. Consistent with the physiology of this organism, preliminary annotation revealed an abundance of multiheme c-type cytochromes that are putatively associated with the periplasm and cell surface in a Gram-positive bacterium. Here we report the complete genome sequence of strain JR.

  1. Draft Genome Sequence of Klebsiella pneumoniae AWD5

    PubMed Central

    Rajkumari, Jina; Singha, L. Paikhomba

    2017-01-01

    Here, we report the draft genome sequence of Klebsiella pneumoniae strain AWD5, isolated from an automobile workshop in India. The de novo assembly resulted in a 4,807,409 bp genome containing 25 rRNA genes, 81 tRNAs, and 4,636 coding sequences (CDS). It carries important genes for polyaromatic hydrocarbon degradation and benzoate degradation. PMID:28153891

  2. Complete Genome Sequences of 38 Gordonia sp. Bacteriophages

    PubMed Central

    Montgomery, Matthew T.; Bonilla, J. Alfred; Dejong, Randall; Garlena, Rebecca A.; Guerrero Bustamante, Carlos; Klyczek, Karen K.; Russell, Daniel A.; Wertz, John T.; Jacobs-Sera, Deborah; Hatfull, Graham F.

    2017-01-01

    We report here the genome sequences of 38 newly isolated bacteriophages using Gordonia terrae 3612 (ATCC 25594) and Gordonia neofelifaecis NRRL59395 as bacterial hosts. All of the phages are double-stranded DNA (dsDNA) tail phages with siphoviral morphologies, with genome sizes ranging from 17,118 bp to 93,843 bp and spanning considerable nucleotide sequence diversity. PMID:28057748

  3. Complete genome sequence of the giant Pseudomonas phage Lu11.

    PubMed

    Adriaenssens, E M; Mattheus, W; Cornelissen, A; Shaburova, O; Krylov, V N; Kropinski, A M; Lavigne, R

    2012-06-01

    The complete genome sequence of the giant Pseudomonas phage Lu11 was determined, comparing 454 and Sanger sequencing. The double-stranded DNA (dsDNA) genome is 280,538 bp long and encodes 391 open reading frames (ORFs) and no tRNAs. The closest relative is Ralstonia phage ϕRSL1, encoding 40 similar proteins. As such, Lu11 can be considered phylogenetically unique within the Myoviridae and indicates the diversity of the giant phages within this family.

  4. Complete Mitochondrial Genome Sequence of Sunflower (Helianthus annuus L.)

    PubMed Central

    Ebert, Daniel P.; Kane, Nolan C.; Rieseberg, Loren H.

    2016-01-01

    This is the first complete mitochondrial genome sequence for sunflower and the first complete mitochondrial genome for any member of Asteraceae, the largest plant family, which includes over 23,000 named species. The master circle is 300,945-bp long and includes 27 protein-coding sequences, 18 tRNAs, and the 26S, 5S, and 18S rRNAs. PMID:27635002

  5. Use of information theory to study genome sequences

    NASA Astrophysics Data System (ADS)

    Ohya, Masanori; Sato, Keiko

    2000-12-01

    The genome sequence carries information about life as an order of four bases. It is considered that this order indicates a special code structure. In this paper we discuss how the mutual entropy, the main concept in Shannon's communication theory, can be used to study genome sequences, and how a measure introduced in our previous paper [10] for the analysis of similarities of code structures is applied for examining the coding structure of several species, in particular, HIV-1.
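    To make the idea concrete, the sketch below computes Shannon mutual information (in bits) between two aligned sequences of equal length, treating each aligned position as one observation of a base pair. This is the textbook quantity only; the specific similarity measure the authors introduce in their earlier paper is not reproduced here, and the function name is hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(seq_a: str, seq_b: str) -> float:
    """Mutual information in bits between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(seq_a)
    joint = Counter(zip(seq_a, seq_b))  # counts of aligned symbol pairs
    count_a = Counter(seq_a)            # marginal counts for sequence A
    count_b = Counter(seq_b)            # marginal counts for sequence B
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) / (p(a) p(b)) written in terms of raw counts
        ratio = (c * n) / (count_a[a] * count_b[b])
        mi += (c / n) * log2(ratio)
    return mi


print(mutual_information("ACGT", "ACGT"))  # 2.0
print(mutual_information("AACC", "ACAC"))  # 0.0
```

    Identical sequences share all of their entropy (2 bits for four equally frequent bases), whereas statistically independent columns share none.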

  6. Coelacanth genome sequence reveals the evolutionary history of vertebrate genes.

    PubMed

    Noonan, James P; Grimwood, Jane; Danke, Joshua; Schmutz, Jeremy; Dickson, Mark; Amemiya, Chris T; Myers, Richard M

    2004-12-01

    The coelacanth is one of the nearest living relatives of tetrapods. However, a teleost species such as zebrafish or Fugu is typically used as the outgroup in current tetrapod comparative sequence analyses. Such studies are complicated by the fact that teleost genomes have undergone a whole-genome duplication event, as well as individual gene-duplication events. Here, we demonstrate the value of coelacanth genome sequence by complete sequencing and analysis of the protocadherin gene cluster of the Indonesian coelacanth, Latimeria menadoensis. We found that coelacanth has 49 protocadherin cluster genes organized in the same three ordered subclusters, alpha, beta, and gamma, as the 54 protocadherin cluster genes in human. In contrast, whole-genome and tandem duplications have generated two zebrafish protocadherin clusters comprised of at least 97 genes. Additionally, zebrafish protocadherins are far more prone to homogenizing gene conversion events than coelacanth protocadherins, suggesting that recombination- and duplication-driven plasticity may be a feature of teleost genomes. Our results indicate that coelacanth provides the ideal outgroup sequence against which tetrapod genomes can be measured. We therefore present L. menadoensis as a candidate for whole-genome sequencing.

  7. Genome Sequence of Cluster W Mycobacteriophage Taptic

    PubMed Central

    Mageeney, Catherine M.; Seier, Emily R.; Esposito, Elise C.; Graham, Lee H.; Heckman, Emily L.; Hipwell, Chelsea M.; Kelliher, Allison B.; Lando, Nicole A.; Morales, Patricia Y.; Russell, Daniel A.; Tsaousis, Barbara E.; Kenna, Margaret A.

    2017-01-01

    The Taptic genome is the first to be annotated from the W cluster of mycobacteriophages infecting Mycobacterium smegmatis mc2155. All 92 predicted open reading frames (ORFs) and a single tRNA specifying glycine (tRNA-gly) are transcribed rightward. Many functionally uncharacterized ORFs appear to be W cluster specific, as nucleotide similarity is shared only with other W cluster genomes. PMID:28302785

  8. [Cancer Genome Atlas Pan-cancer Analysis Project].

    PubMed

    Zhang, Kun; Wang, Hong

    2015-04-01

    Cancer takes different forms depending on the site of origin, the cell type, and the kinds of genetic mutations involved, all of which also affect the response to therapy. Although many genetic alterations have been shown to cause phenotypic changes directly, the complex molecular mechanisms underlying many cancer lineages are still not fully elucidated. The Cancer Genome Atlas (TCGA) Research Network therefore analyzed a large number of human tumors to find molecular changes at the DNA, RNA, protein, and epigenetic levels. The resulting wealth of data provides an opportunity to describe the commonalities, individual characteristics, and emerging themes across cancer lineages as a whole. The Pan-Cancer program first compares 12 cancer types. Analysis of the molecular changes in different tumors and their functions will show how an effective treatment can be applied to tumors with a similar phenotype.

  9. Dissection of the Octoploid Strawberry Genome by Deep Sequencing of the Genomes of Fragaria Species

    PubMed Central

    Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mistuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N.

    2014-01-01

    Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ∼200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb with an N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species. PMID:24282021

  10. Dissection of the octoploid strawberry genome by deep sequencing of the genomes of Fragaria species.

    PubMed

    Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mistuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N

    2014-01-01

    Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ∼200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb with an N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species.

  11. Imaging genome abnormalities in cancer research.

    PubMed

    Heng, Henry HQ; Stevens, Joshua B; Liu, Guo; Bremer, Steven W; Ye, Christine J

    2004-01-13

    Increasing attention is focusing on chromosomal and genome structure in cancer research due to the fact that genomic instability plays a principal role in cancer initiation, progression and response to chemotherapeutic agents. The integrity of the genome (including structural, behavioral and functional aspects) of normal and cancer cells can be monitored with direct visualization by using a variety of cutting edge molecular cytogenetic technologies that are now available in the field of cancer research. Examples are presented in this review by grouping these methodologies into four categories visualizing different yet closely related major levels of genome structures. An integrated discussion is also presented on several ongoing projects involving the illustration of mitotic and meiotic chromatin loops; the identification of defective mitotic figures (DMF), a new type of chromosomal aberration capable of monitoring condensation defects in cancer; the establishment of a method that uses Non-Clonal Chromosomal Aberrations (NCCAs) as an index to monitor genomic instability; and the characterization of apoptosis related chromosomal fragmentations caused by drug treatments.

  12. Mitochondrial Genome Sequence of the Legume Vicia faba.

    PubMed

    Negruk, Valentine

    2013-01-01

    The number of plant mitochondrial genomes sequenced exceeds two dozen. However, for a detailed comparative study of different phylogenetic branches more plant mitochondrial genomes should be sequenced. This article presents sequencing data and comparative analysis of mitochondrial DNA (mtDNA) of the legume Vicia faba. The size of the V. faba circular mitochondrial master chromosome of cultivar Broad Windsor was estimated as 588,000 bp with a genome complexity of 387,745 bp and 52 conservative mitochondrial genes; 32 of them encoding proteins, 3 rRNA, and 17 tRNA genes. Six tRNA genes were highly homologous to chloroplast genome sequences. In addition to the 52 conservative genes, 114 unique open reading frames (ORFs) were found: 36 without significant homology to any known proteins, 29 with homology to the Medicago truncatula nuclear genome and to other plant mitochondrial ORFs, and 49 that were not homologous to M. truncatula but possessed sequences with significant homology to other plant mitochondrial or nuclear ORFs. In general, the unique ORFs revealed very low homology to known closely related legumes, but several sequence homologies were found between V. faba, Beta vulgaris, Nicotiana tabacum, Vitis vinifera, and even the monocots Oryza sativa and Zea mays. Most likely these ORFs arose independently during angiosperm evolution (Kubo and Mikami, 2007; Kubo and Newton, 2008). Computational analysis revealed in total about 45% of V. faba mtDNA sequence being homologous to the Medicago truncatula nuclear genome (more than to any sequenced plant mitochondrial genome), and 35% of this homology ranging from a few dozen to 12,806 bp are located on chromosome 1. Apparently, mitochondrial rrn5, rrn18, rps10, ATP synthase subunit alpha, cox2, and tRNA sequences are part of transcribed nuclear mosaic ORFs.

  13. A 454 sequencing approach to dipteran mitochondrial genome research.

    PubMed

    Ramakodi, Meganathan P; Singh, Baneshwar; Wells, Jeffrey D; Guerrero, Felix; Ray, David A

    2015-01-01

    The availability of complete mitochondrial genome (mtgenome) data for Diptera, one of the largest metazoan orders, in public databases is limited. The advent of high throughput sequencing technology provides the potential to generate mtgenomes for many species affordably and quickly. However, these technologies need to be validated for dipterans as the members of this clade play important economic and research roles. Illumina and 454 sequencing platforms are widely used in genomic research involving non-model organisms. The Illumina platform has already been utilized for generating mitochondrial genomes without using conventional long range PCR for insects whereas the power of 454 sequencing for generating mitochondrial genome drafts without PCR has not yet been validated for insects. Thus, this study examines the utility of 454 sequencing approach for dipteran mtgenomic research. We generated complete or nearly complete mitochondrial genomes for Cochliomyia hominivorax, Haematobia irritans, Phormia regina and Sarcophaga crassipalpis using a 454 sequencing approach. Comparisons between newly obtained and existing assemblies for C. hominivorax and H. irritans revealed no major discrepancies and verified the utility of 454 sequencing for dipteran mitochondrial genomes. We also report the complete mitochondrial sequences for two forensically important flies, P. regina and S. crassipalpis, which could be used to provide useful information to legal personnel. Comparative analyses revealed that dipterans follow similar codon usage and nucleotide biases that could be due to mutational and selection pressures. This study illustrates the utility of 454 sequencing to obtain complete mitochondrial genomes for dipterans without the aid of conventional molecular techniques such as PCR and cloning and validates this method of mtgenome sequencing in arthropods.
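    Codon usage and nucleotide bias of the kind compared in this study can be tallied directly from a coding sequence. The sketch below is illustrative only: the function names are hypothetical, the input is a made-up sequence, and the code simply counts in-frame triplets and the A+T fraction without applying any particular genetic code table.

```python
from collections import Counter


def codon_usage(cds: str) -> Counter:
    """Count in-frame codons of a coding sequence, ignoring a trailing partial codon."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def at_fraction(sequence: str) -> float:
    """Fraction of A and T bases, a simple measure of nucleotide bias."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("A") + seq.count("T")) / len(seq)


toy_cds = "ATGAAATTTAAATAA"  # invented example, not from any real mitochondrial genome
print(codon_usage(toy_cds).most_common(1))  # [('AAA', 2)]
print(round(at_fraction(toy_cds), 3))       # 0.933
```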

  14. A Concise Atlas of Thyroid Cancer Next-Generation Sequencing Panel ThyroSeq v.2

    PubMed Central

    Alsina, Jorge; Alsina, Raul; Gulec, Seza

    2017-01-01

    Next-generation sequencing technology allows high-output genomic analysis. An innovative assay in thyroid cancer, ThyroSeq®, was developed for targeted mutation detection by next-generation sequencing technology in fine needle aspiration and tissue samples. The ThyroSeq v.2 next-generation sequencing panel offers simultaneous sequencing and detection in >1000 hotspots of 14 thyroid cancer-related genes and for 42 types of gene fusions known to occur in thyroid cancer. ThyroSeq is being increasingly used to further narrow the indeterminate category defined by cytology for thyroid nodules. From a surgical perspective, genomic profiling also provides prognostic and predictive information and closely relates to determination of surgical strategy. Both the genomic analysis technology and the informatics for the cancer genome database are rapidly developing. In this paper, we have gathered existing information on the thyroid cancer-related genes involved in the initiation and progression of thyroid cancer. Our goal is to assemble a glossary for the current ThyroSeq genomic panel that can help elucidate the role genomics plays in thyroid cancer oncogenesis. PMID:28117295

  15. Assessing the clinical utility of genomic expression data across human cancers

    PubMed Central

    Xu, Xinsen; Huang, Lei; Chan, Chun Hei; Yu, Tao; Miao, Runchen; Liu, Chang

    2016-01-01

    Cancer molecular profiling provides better understanding of tumor mechanisms and helps to improve the existing cancer management. Here we present the gene expression signatures from ∼9000 human tumors with clinical information across 32 malignancies from The Cancer Genome Atlas project (TCGA). Major predictors from the RNA sequencing data that were significantly correlated with cancer survival were identified. The expression level of these prognostic genes revealed significant genomic pathways that were clinically relevant to survival outcomes across human cancers. Furthermore, it is shown that in most cancer types, combinations of these genomic signatures with clinical information might yield improved predictions. Thus, with respect to clinical utility, our study reveals the promising values of genomic data from the pan-cancer perspective. PMID:27322207

  16. A Million Cancer Genome Warehouse

    DTIC Science & Technology

    2012-11-20

    Fitzpatrick, A. L., Agrawal, A., Barnes, K., Boyd, H. A., et al. (2011). Phenotype harmonization and cross-study collaboration in GWAS consortia... Genome Warehouse is performing genome-wide association studies (GWAS) of both common and rare inherited single nucleotide polymorphisms (SNPs) to compare

  17. Resequencing of the common marmoset genome improves genome assemblies and gene-coding sequence analysis.

    PubMed

    Sato, Kengo; Kuroki, Yoko; Kumita, Wakako; Fujiyama, Asao; Toyoda, Atsushi; Kawai, Jun; Iriki, Atsushi; Sasaki, Erika; Okano, Hideyuki; Sakakibara, Yasubumi

    2015-11-20

    The first draft of the common marmoset (Callithrix jacchus) genome was published by the Marmoset Genome Sequencing and Analysis Consortium. The draft was based on whole-genome shotgun sequencing, and the current assembly version is Callithrix_jacches-3.2.1, but there still exist 187,214 undetermined gap regions and supercontigs and relatively short contigs that are unmapped to chromosomes in the draft genome. We performed resequencing and assembly of the genome of common marmoset by deep sequencing with high-throughput sequencing technology. Several different sequence runs using Illumina sequencing platforms were executed, and 181 Gbp of high-quality bases including mate-pairs with long insert lengths of 3, 8, 20, and 40 Kbp were obtained, that is, approximately 60× coverage. The resequencing significantly improved the MGSAC draft genome sequence. The N50 of the contigs, which is a statistical measure used to evaluate assembly quality, doubled. As a result, 51% of the contigs (total length: 299 Mbp) that were unmapped to chromosomes in the MGSAC draft were merged with chromosomal contigs. The improved genome sequence helped to detect 5,288 new genes that are homologous to human cDNAs, and the gaps in 5,187 transcripts of the Ensembl gene annotations were completely filled.
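    Fold coverage is conventionally the number of sequenced bases divided by the genome size, so the two figures in this abstract (181 Gbp and approximately 60×) together imply a genome of about 3 Gbp. A minimal sketch of that arithmetic follows; the function names are hypothetical and the genome size is derived here, not quoted from the record.

```python
def fold_coverage(sequenced_bases: float, genome_size: float) -> float:
    """Average depth: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome size must be positive")
    return sequenced_bases / genome_size


def implied_genome_size(sequenced_bases: float, coverage: float) -> float:
    """Genome size implied by a yield and a reported fold coverage."""
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return sequenced_bases / coverage


size = implied_genome_size(181e9, 60)      # figures taken from the abstract
print(f"{size / 1e9:.2f} Gbp")             # 3.02 Gbp
print(f"{fold_coverage(181e9, size):.0f}x")  # 60x
```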

  18. Choosing a Benchtop Sequencing Machine to Characterise Helicobacter pylori Genomes

    PubMed Central

    Perkins, Timothy T.; Tay, Chin Yen; Thirriot, Fanny; Marshall, Barry

    2013-01-01

    The fully annotated genome sequence of the European strain 26695 was first published in 1997 and, in 1999, it was directly compared to the USA isolate J99, promoting two standard laboratory isolates for Helicobacter pylori (H. pylori) research. With the genomic scaffolds available from these important genomes and the advent of benchtop high-throughput sequencing technology, a bacterial genome can now be sequenced within a few days. We sequenced and analysed strains J99 and 26695 using the benchtop-sequencing machines Ion Torrent PGM and the Illumina MiSeq Nextera and Nextera XT methodologies. Using publicly available algorithms, we analysed the raw data and interrogated both genomes by mapping the data and by de novo assembly. We compared the accuracy of the coding sequence assemblies to the originally published sequences. With the Ion Torrent PGM, we found an inherently high error rate in the raw sequence data. Using the Illumina MiSeq, we found significantly more non-covered nucleotides when using the less expensive Illumina Nextera XT compared with the Illumina Nextera library creation method. We found the most accurate de novo assemblies using the Nextera technology; however, extracting an accurate multi-locus sequence type was inconsistent compared to the Ion Torrent PGM. We found the cagPAI failed to assemble onto a single contig in all technologies but was more accurate using the Nextera. Our results indicate the Illumina MiSeq Nextera method is the most accurate for de novo whole genome sequencing of H. pylori. PMID:23840736

  19. Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis.

    PubMed

    Carlton, Jane M; Hirt, Robert P; Silva, Joana C; Delcher, Arthur L; Schatz, Michael; Zhao, Qi; Wortman, Jennifer R; Bidwell, Shelby L; Alsmark, U Cecilia M; Besteiro, Sébastien; Sicheritz-Ponten, Thomas; Noel, Christophe J; Dacks, Joel B; Foster, Peter G; Simillion, Cedric; Van de Peer, Yves; Miranda-Saavedra, Diego; Barton, Geoffrey J; Westrop, Gareth D; Müller, Sylke; Dessi, Daniele; Fiori, Pier Luigi; Ren, Qinghu; Paulsen, Ian; Zhang, Hanbang; Bastida-Corcuera, Felix D; Simoes-Barbosa, Augusto; Brown, Mark T; Hayes, Richard D; Mukherjee, Mandira; Okumura, Cheryl Y; Schneider, Rachel; Smith, Alias J; Vanacova, Stepanka; Villalvazo, Maria; Haas, Brian J; Pertea, Mihaela; Feldblyum, Tamara V; Utterback, Terry R; Shu, Chung-Li; Osoegawa, Kazutoyo; de Jong, Pieter J; Hrdy, Ivan; Horvathova, Lenka; Zubacova, Zuzana; Dolezal, Pavel; Malik, Shehre-Banoo; Logsdon, John M; Henze, Katrin; Gupta, Arti; Wang, Ching C; Dunne, Rebecca L; Upcroft, Jacqueline A; Upcroft, Peter; White, Owen; Salzberg, Steven L; Tang, Petrus; Chiu, Cheng-Hsun; Lee, Ying-Shiung; Embley, T Martin; Coombs, Graham H; Mottram, Jeremy C; Tachezy, Jan; Fraser-Liggett, Claire M; Johnson, Patricia J

    2007-01-12

    We describe the genome sequence of the protist Trichomonas vaginalis, a sexually transmitted human pathogen. Repeats and transposable elements comprise about two-thirds of the approximately 160-megabase genome, reflecting a recent massive expansion of genetic material. This expansion, in conjunction with the shaping of metabolic pathways that likely transpired through lateral gene transfer from bacteria, and amplification of specific gene families implicated in pathogenesis and phagocytosis of host proteins may exemplify adaptations of the parasite during its transition to a urogenital environment. The genome sequence predicts previously unknown functions for the hydrogenosome, which support a common evolutionary origin of this unusual organelle with mitochondria.

  20. Complete genome sequence of Serratia plymuthica strain AS12

    PubMed Central

    Finlay, Roger D.; Alström, Sadhna; Goodwin, Lynne; Kyrpides, Nikos C.; Lucas, Susan; Lapidus, Alla; Bruce, David; Pitluck, Sam; Peters, Lin; Ovchinnikova, Galina; Chertkov, Olga; Han, James; Han, Cliff; Tapia, Roxanne; Detter, John C.; Land, Miriam; Hauser, Loren; Cheng, Jan-Fang; Ivanova, Natalia; Pagani, Ioanna; Klenk, Hans-Peter; Woyke, Tanja; Högberg, Nils

    2012-01-01

    A plant-associated member of the family Enterobacteriaceae, Serratia plymuthica strain AS12 was isolated from rapeseed roots. It is of scientific interest because it promotes plant growth and inhibits plant pathogens. The genome of S. plymuthica AS12 comprises a 5,443,009 bp long circular chromosome, which consists of 4,952 protein-coding genes, 87 tRNA genes and 7 rRNA operons. This genome was sequenced within the 2010 DOE-JGI Community Sequencing Program (CSP2010) as part of the project entitled “Genomics of four rapeseed plant growth promoting bacteria with antagonistic effect on plant pathogens”. PMID:22768360

  1. Clinical Application of Targeted Next Generation Sequencing for Colorectal Cancers

    PubMed Central

    Fontanges, Quitterie; De Mendonca, Ricardo; Salmon, Isabelle; Le Mercier, Marie; D’Haene, Nicky

    2016-01-01

    Promising targeted therapy and personalized medicine are making molecular profiling of tumours a priority. For colorectal cancer (CRC) patients, international guidelines made RAS (KRAS and NRAS) status a prerequisite for the use of anti-epidermal growth factor receptor agents (anti-EGFR). Daily, new data emerge on the theranostic and prognostic role of molecular biomarkers, which is a strong incentive for a validated, sensitive and broadly available molecular screening test in order to implement and improve multi-modal therapy strategy and clinical trials. Next generation sequencing (NGS) has begun to supplant other technologies for genomic profiling. Targeted NGS is a method that allows parallel sequencing of thousands of short DNA sequences in a single test offering a cost-effective approach for detecting multiple genetic alterations with a minimum amount of DNA. In the present review, we collected data concerning the clinical application of NGS technology in the setting of colorectal cancer. PMID:27999270

  2. Clinical Sequencing Exploratory Research Consortium: Accelerating Evidence-Based Practice of Genomic Medicine.

    PubMed

    Green, Robert C; Goddard, Katrina A B; Jarvik, Gail P; Amendola, Laura M; Appelbaum, Paul S; Berg, Jonathan S; Bernhardt, Barbara A; Biesecker, Leslie G; Biswas, Sawona; Blout, Carrie L; Bowling, Kevin M; Brothers, Kyle B; Burke, Wylie; Caga-Anan, Charlisse F; Chinnaiyan, Arul M; Chung, Wendy K; Clayton, Ellen W; Cooper, Gregory M; East, Kelly; Evans, James P; Fullerton, Stephanie M; Garraway, Levi A; Garrett, Jeremy R; Gray, Stacy W; Henderson, Gail E; Hindorff, Lucia A; Holm, Ingrid A; Lewis, Michelle Huckaby; Hutter, Carolyn M; Janne, Pasi A; Joffe, Steven; Kaufman, David; Knoppers, Bartha M; Koenig, Barbara A; Krantz, Ian D; Manolio, Teri A; McCullough, Laurence; McEwen, Jean; McGuire, Amy; Muzny, Donna; Myers, Richard M; Nickerson, Deborah A; Ou, Jeffrey; Parsons, Donald W; Petersen, Gloria M; Plon, Sharon E; Rehm, Heidi L; Roberts, J Scott; Robinson, Dan; Salama, Joseph S; Scollon, Sarah; Sharp, Richard R; Shirts, Brian; Spinner, Nancy B; Tabor, Holly K; Tarczy-Hornoch, Peter; Veenstra, David L; Wagle, Nikhil; Weck, Karen; Wilfond, Benjamin S; Wilhelmsen, Kirk; Wolf, Susan M; Wynn, Julia; Yu, Joon-Ho

    2016-06-02

    Despite rapid technical progress and demonstrable effectiveness for some types of diagnosis and therapy, much remains to be learned about clinical genome and exome sequencing (CGES) and its role within the practice of medicine. The Clinical Sequencing Exploratory Research (CSER) consortium includes 18 extramural research projects, one National Human Genome Research Institute (NHGRI) intramural project, and a coordinating center funded by the NHGRI and National Cancer Institute. The consortium is exploring analytic and clinical validity and utility, as well as the ethical, legal, and social implications of sequencing via multidisciplinary approaches; it has thus far recruited 5,577 participants across a spectrum of symptomatic and healthy children and adults by utilizing both germline and cancer sequencing. The CSER consortium is analyzing data and creating publicly available procedures and tools related to participant preferences and consent, variant classification, disclosure and management of primary and secondary findings, health outcomes, and integration with electronic health records. Future research directions will refine measures of clinical utility of CGES in both germline and somatic testing, evaluate the use of CGES for screening in healthy individuals, explore the penetrance of pathogenic variants through extensive phenotyping, reduce discordances in public databases of genes and variants, examine social and ethnic disparities in the provision of genomics services, explore regulatory issues, and estimate the value and downstream costs of sequencing. The CSER consortium has established a shared community of research sites by using diverse approaches to pursue the evidence-based development of best practices in genomic medicine.

  3. Cancer genomics object model: an object model for multiple functional genomics data for cancer research.

    PubMed

    Park, Yu Rang; Lee, Hye Won; Cho, Sung Bum; Kim, Ju Han

    2007-01-01

    The development of functional genomics, including transcriptomics, proteomics and metabolomics, allows us to monitor a large number of key cellular pathways simultaneously. Several technology-specific data models have been introduced for the representation of functional genomics experimental data, including the MicroArray Gene Expression-Object Model (MAGE-OM), the Proteomics Experiment Data Repository (PEDRo), and the Tissue MicroArray-Object Model (TMA-OM). Despite the increasing number of cancer studies using multiple functional genomics technologies, there is still no integrated data model for multiple functional genomics experimental and clinical data. We propose an object-oriented data model for cancer genomics research, Cancer Genomics Object Model (CaGe-OM). We reference four data models: Functional Genomic-Object Model, MAGE-OM, TMA-OM and PEDRo. The clinical and histopathological information models are created by analyzing cancer management workflow and referencing the College of American Pathology Cancer Protocols and National Cancer Institute Common Data Elements. The CaGe-OM provides a comprehensive data model for integrated storage and analysis of clinical and multiple functional genomics data.

  4. A Platform for Designing Genome-Based Personalized Immunotherapy or Vaccine against Cancer

    PubMed Central

    Gupta, Sudheer; Chaudhary, Kumardeep; Dhanda, Sandeep Kumar; Kumar, Rahul; Kumar, Shailesh; Sehgal, Manika; Nagpal, Gandharva

    2016-01-01

    Due to advances in sequencing technology, genomes of thousands of cancer tissues or cell lines have been sequenced. Identification of cancer-specific epitopes or neoepitopes from cancer genomes is one of the major challenges in the field of immunotherapy or vaccine development. This paper describes Cancertope, a platform developed for designing genome-based immunotherapy or vaccine against a cancer cell. Broadly, the integrated resources on this platform are apportioned into three sections. The first section describes a cancer-specific database of neoepitopes generated from the genomes of 905 cancer cell lines. This database harbors a wide range of epitopes (e.g., B-cell, CD8+ T-cell, HLA class I, HLA class II) against 60 cancer-specific vaccine antigens. The second section describes a partially personalized module developed for predicting potential neoepitopes against a user-specific cancer genome. Finally, we describe a fully personalized module developed for identification of neoepitopes from genomes of cancerous and healthy cells of a cancer patient. In order to assist the scientific community, a wide range of tools is incorporated in this platform, including screening of epitopes against the human reference proteome (http://www.imtech.res.in/raghava/cancertope/). PMID:27832200

  5. Widespread mitovirus sequences in plant genomes

    PubMed Central

    Warner, Benjamin E.; Yerramsetty, Pradeep

    2015-01-01

    The exploration of the evolution of RNA viruses has been aided recently by the discovery of copies of fragments or complete genomes of non-retroviral RNA viruses (Non-retroviral Endogenous RNA Viral Elements, or NERVEs) in many eukaryotic nuclear genomes. Among the most prominent NERVEs are partial copies of the RNA dependent RNA polymerase (RdRP) of the mitoviruses in plant mitochondrial genomes. Mitoviruses are in the family Narnaviridae, which are the simplest viruses, encoding only a single protein (the RdRP) in their unencapsidated viral plus strand. Narnaviruses are known only in fungi, and the origin of plant mitochondrial mitovirus NERVEs appears to be horizontal transfer from plant pathogenic fungi. At least one mitochondrial mitovirus NERVE, but not its nuclear copy, is expressed. PMID:25870770

  6. Genomic heterogeneity of multiple synchronous lung cancer

    PubMed Central

    Liu, Yu; Zhang, Jianjun; Li, Lin; Yin, Guangliang; Zhang, Jianhua; Zheng, Shan; Cheung, Hannah; Wu, Ning; Lu, Ning; Mao, Xizeng; Yang, Longhai; Zhang, Jiexin; Zhang, Li; Seth, Sahil; Chen, Huang; Song, Xingzhi; Liu, Kan; Xie, Yongqiang; Zhou, Lina; Zhao, Chuanduo; Han, Naijun; Chen, Wenting; Zhang, Susu; Chen, Longyun; Cai, Wenjun; Li, Lin; Shen, Miaozhong; Xu, Ningzhi; Cheng, Shujun; Yang, Huanming; Lee, J. Jack; Correa, Arlene; Fujimoto, Junya; Behrens, Carmen; Chow, Chi-Wan; William, William N.; Heymach, John V.; Hong, Waun Ki; Swisher, Stephen; Wistuba, Ignacio I.; Wang, Jun; Lin, Dongmei; Liu, Xiangyang; Futreal, P. Andrew; Gao, Yanning

    2016-01-01

    Multiple synchronous lung cancers (MSLCs) present a clinical dilemma as to whether individual tumours represent intrapulmonary metastases or independent tumours. In this study we analyse genomic profiles of 15 lung adenocarcinomas and one regional lymph node metastasis from 6 patients with MSLC. All 15 lung tumours demonstrate distinct genomic profiles, suggesting all are independent primary tumours, which is consistent with comprehensive histopathological assessment in 5 of the 6 patients. Lung tumours of the same individuals are no more similar to each other than are lung adenocarcinomas of different patients from the TCGA cohort matched for tumour size and smoking status. Several known cancer-associated genes have different mutations in different tumours from the same patients. These findings suggest that in the context of identical constitutional genetic background and environmental exposure, different lung cancers in the same individual may have distinct genomic profiles and can be driven by distinct molecular events. PMID:27767028

  7. Multiplex sequencing of plant chloroplast genomes using Solexa sequencing-by-synthesis technology

    PubMed Central

    Cronn, Richard; Liston, Aaron; Parks, Matthew; Gernandt, David S.; Shen, Rongkun; Mockler, Todd

    2008-01-01

    Organellar DNA sequences are widely used in evolutionary and population genetic studies; however, the conservative nature of chloroplast gene and genome evolution often limits phylogenetic resolution and statistical power. To gain maximal access to the historical record contained within chloroplast genomes, we have adapted multiplex sequencing-by-synthesis (MSBS) to simultaneously sequence multiple genomes using the Illumina Genome Analyzer. We PCR-amplified ∼120 kb plastomes from eight species (seven Pinus, one Picea) in 35 reactions. Pooled products were ligated to modified adapters that included 3 bp indexing tags, and samples were multiplexed at four genomes per lane. Tagged microreads were assembled by de novo and reference-guided assembly methods, using previously published Pinus plastomes as surrogate references. Assemblies for these eight genomes are estimated at 88–94% complete, with an average sequence depth of 55× to 186×. Mononucleotide repeats interrupt contig assembly with increasing repeat length, and we estimate that the limit for their assembly is 16 bp. Comparisons to 37 kb of Sanger sequence show a validated error rate of 0.056%, and conspicuous errors are evident from the assembly process. This efficient sequencing approach yields high-quality draft genomes and should have immediate applicability to genomes with comparable complexity. PMID:18753151
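
The multiplexing step in the abstract above (3 bp indexing tags, four genomes per lane) implies that reads must be sorted back to their source genome before assembly. The following is a minimal sketch of that sorting, not the authors' pipeline. It assumes the tag occupies the first 3 bases of each read, and the tag sequences, sample names and function name are hypothetical, since the abstract does not list them.

```python
from collections import defaultdict

# Hypothetical 3 bp tags for the four genomes sharing one lane.
# The abstract does not give the actual tag sequences.
TAGS = {
    "AAT": "genome_1",
    "CCT": "genome_2",
    "GGT": "genome_3",
    "TTT": "genome_4",
}


def demultiplex(reads, tags=TAGS, tag_len=3):
    """Group reads by their leading index tag and strip the tag.

    Reads whose first `tag_len` bases match no known tag are collected
    under the key "unassigned" rather than discarded.
    """
    bins = defaultdict(list)
    for read in reads:
        sample = tags.get(read[:tag_len], "unassigned")
        bins[sample].append(read[tag_len:])
    return dict(bins)


reads = ["AATACGT", "CCTGGGA", "AATTTTT", "NNNACGT"]
for sample, seqs in demultiplex(reads).items():
    print(sample, seqs)
```

This sketch requires an exact tag match. With tags this short, a single sequencing error in the tag can make a read unassignable or assign it to the wrong genome, so a real workflow would need to choose tags and matching rules with that in mind.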

  8. Genome sequence of the cultivated cotton Gossypium arboreum.

    PubMed

    Li, Fuguang; Fan, Guangyi; Wang, Kunbo; Sun, Fengming; Yuan, Youlu; Song, Guoli; Li, Qin; Ma, Zhiying; Lu, Cairui; Zou, Changsong; Chen, Wenbin; Liang, Xinming; Shang, Haihong; Liu, Weiqing; Shi, Chengcheng; Xiao, Guanghui; Gou, Caiyun; Ye, Wuwei; Xu, Xun; Zhang, Xueyan; Wei, Hengling; Li, Zhifang; Zhang, Guiyin; Wang, Junyi; Liu, Kun; Kohel, Russell J; Percy, Richard G; Yu, John Z; Zhu, Yu-Xian; Wang, Jun; Yu, Shuxun

    2014-06-01

    The complex allotetraploid nature of the cotton genome (AADD; 2n = 52) makes genetic, genomic and functional analyses extremely challenging. Here we sequenced and assembled the Gossypium arboreum (AA; 2n = 26) genome, a putative contributor of the A subgenome. A total of 193.6 Gb of clean sequence covering the genome by 112.6-fold was obtained by paired-end sequencing. We further anchored and oriented 90.4% of the assembly on 13 pseudochromosomes and found that 68.5% of the genome is occupied by repetitive DNA sequences. We predicted 41,330 protein-coding genes in G. arboreum. Two whole-genome duplications were shared by G. arboreum and Gossypium raimondii before speciation. Insertions of long terminal repeats in the past 5 million years are responsible for the twofold difference in the sizes of these genomes. Comparative transcriptome studies showed the key role of the nucleotide binding site (NBS)-encoding gene family in resistance to Verticillium dahliae and the involvement of ethylene in the development of cotton fiber cells.
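
The sequencing yield and fold coverage quoted in the abstract above together imply a genome size, which the abstract does not state directly. As a rough check, assuming fold coverage is simply total sequenced bases divided by genome size, the arithmetic can be sketched as follows. The function name is an illustrative assumption.

```python
def implied_genome_size_gb(total_sequence_gb, fold_coverage):
    """Estimate genome size (Gb) from total sequence (Gb) and fold coverage.

    Uses the simple relation: coverage = total bases / genome size.
    """
    if fold_coverage <= 0:
        raise ValueError("fold coverage must be positive")
    return total_sequence_gb / fold_coverage


# Figures quoted in the abstract: 193.6 Gb of clean sequence at 112.6-fold.
size_gb = implied_genome_size_gb(193.6, 112.6)
print(f"implied genome size: {size_gb:.2f} Gb")
```

Under this assumption the quoted figures correspond to a genome of roughly 1.7 Gb. This is a back-of-the-envelope estimate derived from the abstract's own numbers, not a value reported in the record.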

  9. Sequencing of a new target genome: the Pediculus humanus humanus (Phthiraptera: Pediculidae) genome project.

    PubMed

    Pittendrigh, B R; Clark, J M; Johnston, J S; Lee, S H; Romero-Severson, J; Dasch, G A

    2006-11-01

    The human body louse, Pediculus humanus humanus (L.), and the human head louse, Pediculus humanus capitis, belong to the hemimetabolous order Phthiraptera. The body louse is the primary vector that transmits the bacterial agents of louse-borne relapsing fever, trench fever, and epidemic typhus. The genomes of the bacterial causative agents of several of these diseases have been sequenced. Thus, determining the body louse genome will enhance studies of host-vector-pathogen interactions. Although not important as a major disease vector, head lice are of major social concern. Resistance to traditional pesticides used to control head and body lice has developed. It is imperative that new molecular targets be discovered for the development of novel compounds to control these insects. No complete genome sequence exists for a hemimetabolous insect species, primarily because hemimetabolous insects often have large (2000 Mb) to very large (up to 16,300 Mb) genomes. Fortuitously, we determined that the human body louse has one of the smallest genome sizes known in insects, suggesting it may be a suitable choice as a minimal hemimetabolous genome in which many genes have been eliminated during its adaptation to human parasitism. Because many louse species infest birds and mammals, the body louse genome-sequencing project will facilitate studies of their comparative genomics. A 6-8X coverage of the body louse genome, plus sequenced expressed sequence tags, should provide the entomological, evolutionary biology, medical, and public health communities with useful genetic information.

  10. Contrasting DNA sequence organisation patterns in sauropsidian genomes.

    PubMed

    Epplen, J T; Diedrich, U; Wagenmann, M; Schmidtke, J; Engel, W

    1979-11-01

    The genomic DNA organisation patterns of four sauropsidian species, namely Python reticularis, Caiman crocodilus, Terrapene carolina triunguis and Columba livia domestica, were investigated by reassociation of short and long DNA fragments, by hyperchromicity measurements of reannealed fragments and by length estimations of S1-nuclease resistant repetitive duplexes. While the genomic DNA of the three reptilian species shows a short period interspersion pattern, the genome of the avian species is organised in a long period interspersion pattern apparently typical for birds. These findings are discussed in view of the close phylogenetic relationships of birds and reptiles, and also with regard to a possible relationship between the extent of sequence interspersion and genome size.

  11. Sequencing and comparative analyses of the genomes of zoysiagrasses

    PubMed Central

    Tanaka, Hidenori; Hirakawa, Hideki; Kosugi, Shunichi; Nakayama, Shinobu; Ono, Akiko; Watanabe, Akiko; Hashiguchi, Masatsugu; Gondo, Takahiro; Ishigaki, Genki; Muguerza, Melody; Shimizu, Katsuya; Sawamura, Noriko; Inoue, Takayasu; Shigeki, Yuichi; Ohno, Naoki; Tabata, Satoshi; Akashi, Ryo; Sato, Shusei

    2016-01-01

    Zoysia is a warm-season turfgrass, which comprises 11 allotetraploid species (2n = 4x = 40), each possessing different morphological and physiological traits. To characterize the genetic systems of Zoysia plants and to analyse their structural and functional differences in individual species and accessions, we sequenced the genomes of Zoysia species using HiSeq and MiSeq platforms. As a reference sequence of Zoysia species, we generated a high-quality draft sequence of the genome of Z. japonica accession ‘Nagirizaki’ (334 Mb) in which 59,271 protein-coding genes were predicted. In parallel, draft genome sequences of Z. matrella ‘Wakaba’ and Z. pacifica ‘Zanpa’ were also generated for comparative analyses. To investigate the genetic diversity among the Zoysia species, genome sequence reads of three additional accessions, Z. japonica ‘Kyoto’, Z. japonica ‘Miyagi’ and Z. matrella ‘Chiba Fair Green’, were accumulated, and aligned against the reference genome of ‘Nagirizaki’ along with those from ‘Wakaba’ and ‘Zanpa’. As a result, we detected 7,424,163 single-nucleotide polymorphisms and 852,488 short indels among these species. The information obtained in this study will be valuable for basic studies on zoysiagrass evolution and genetics as well as for the breeding of zoysiagrasses, and is made available in the ‘Zoysia Genome Database’ at http://zoysia.kazusa.or.jp. PMID:26975196

  12. The sheep genome reference sequence: a work in progress.

    PubMed

    Archibald, A L; Cockett, N E; Dalrymple, B P; Faraut, T; Kijas, J W; Maddox, J F; McEwan, J C; Hutton Oddy, V; Raadsma, H W; Wade, C; Wang, J; Wang, W; Xun, X

    2010-10-01

    Until recently, the construction of a reference genome was performed using Sanger sequencing alone. The emergence of next-generation sequencing platforms now means reference genomes may incorporate sequence data generated from a range of sequencing platforms, each of which has different read lengths, systematic biases and mate-pair characteristics. The objective of this review is to inform the mammalian genomics community about the experimental strategy being pursued by the International Sheep Genomics Consortium (ISGC) to construct the draft reference genome of sheep (Ovis aries). Component activities such as data generation, sequence assembly and annotation are described, along with information concerning the key researchers performing the work. This aims to foster future participation from across the research community through the coordinated activities of the consortium. The review also serves as a 'marker paper' by providing information concerning the pre-publication release of the reference genome. This ensures the ISGC adheres to the framework for data sharing established at the recent Toronto International Data Release Workshop and provides guidelines for data users.

  13. Genome sequence analysis of the model grass Brachypodium distachyon: insights into grass genome evolution

    SciTech Connect

    Schulman, Al

    2009-08-09

    Three subfamilies of grasses, the Ehrhartoideae (rice), the Panicoideae (maize, sorghum, sugar cane and millet), and the Pooideae (wheat, barley and cool season forage grasses) provide the basis of human nutrition and are poised to become major sources of renewable energy. Here we describe the complete genome sequence of the wild grass Brachypodium distachyon (Brachypodium), the first member of the Pooideae subfamily to be completely sequenced. Comparison of the Brachypodium, rice and sorghum genomes reveals a precise sequence-based history of genome evolution across a broad diversity of the grass family and identifies nested insertions of whole chromosomes into centromeric regions as a predominant mechanism driving chromosome evolution in the grasses. The relatively compact genome of Brachypodium is maintained by a balance of retroelement replication and loss. The complete genome sequence of Brachypodium, coupled to its exceptional promise as a model system for grass research, will support the development of new energy and food crops.

  14. Long-read sequence assembly of the gorilla genome.

    PubMed

    Gordon, David; Huddleston, John; Chaisson, Mark J P; Hill, Christopher M; Kronenberg, Zev N; Munson, Katherine M; Malig, Maika; Raja, Archana; Fiddes, Ian; Hillier, LaDeana W; Dunn, Christopher; Baker, Carl; Armstrong, Joel; Diekhans, Mark; Paten, Benedict; Shendure, Jay; Wilson, Richard K; Haussler, David; Chin, Chen-Shan; Eichler, Evan E

    2016-04-01

    Accurate sequence and assembly of genomes is a critical first step for studies of genetic variation. We generated a high-quality assembly of the gorilla genome using single-molecule, real-time sequence technology and a string graph de novo assembly algorithm. The new assembly improves contiguity by two to three orders of magnitude with respect to previously released assemblies, recovering 87% of missing reference exons and incomplete gene models. Although regions of large, high-identity segmental duplications remain largely unresolved, this comprehensive assembly provides new biological insight into genetic diversity, structural variation, gene loss, and representation of repeat structures within the gorilla genome. The approach provides a path forward for the routine assembly of mammalian genomes at a level approaching that of the current quality of the human genome.

  15. Complete genome sequence of Kangiella koreensis type strain (SW-125).

    PubMed

    Han, Cliff; Sikorski, Johannes; Lapidus, Alla; Nolan, Matt; Glavina Del Rio, Tijana; Tice, Hope; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Copeland, Alex; Ivanova, Natalia; Mavromatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Chain, Patrick; Saunders, Elizabeth; Brettin, Thomas; Göker, Markus; Tindall, Brian J; Bristow, Jim; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Detter, John C

    2009-11-22

    Kangiella koreensis (Yoon et al. 2004) is the type species of the genus and is of phylogenetic interest because of the very isolated location of the genus Kangiella in the gammaproteobacterial order Oceanospirillales. K. koreensis SW-125(T) is a Gram-negative, non-motile, non-spore-forming bacterium isolated from tidal flat sediments at Daepo Beach, Yellow Sea, Korea. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first completed genome sequence from the genus Kangiella and only the fourth genome from the order Oceanospirillales. This 2,852,073 bp long single replicon genome with its 2647 protein-coding and 48 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  16. Complete genome sequence of Haliangium ochraceum type strain (SMP-2).

    PubMed

    Ivanova, Natalia; Daum, Chris; Lang, Elke; Abt, Birte; Kopitz, Markus; Saunders, Elizabeth; Lapidus, Alla; Lucas, Susan; Glavina Del Rio, Tijana; Nolan, Matt; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Chen, Feng; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Mavromatis, Konstantinos; Pati, Amrita; Mikhailova, Natalia; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Detter, John C; Brettin, Thomas; Rohde, Manfred; Göker, Markus; Bristow, Jim; Markowitz, Victor; Eisen, Jonathan A; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter

    2010-01-28

    Haliangium ochraceum Fudou et al. 2002 is the type species of the genus Haliangium in the myxococcal family 'Haliangiaceae'. Members of the genus Haliangium are the first halophilic myxobacterial taxa described. The cells of the species follow a multicellular lifestyle in highly organized biofilms, called swarms, and decompose bacterial and yeast cells as most myxobacteria do. The fruiting bodies contain particularly small coccoid myxospores. H. ochraceum encodes the first actin homologue identified in a bacterial genome. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of a member of the myxococcal suborder Nannocystineae, and the 9,446,314 bp long single replicon genome with its 6,898 protein-coding and 53 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  17. Long-read sequence assembly of the gorilla genome

    PubMed Central

    Gordon, David; Huddleston, John; Chaisson, Mark J. P.; Hill, Christopher M.; Kronenberg, Zev N.; Munson, Katherine M.; Malig, Maika; Raja, Archana; Fiddes, Ian; Hillier, LaDeana W.; Dunn, Christopher; Baker, Carl; Armstrong, Joel; Diekhans, Mark; Paten, Benedict; Shendure, Jay; Wilson, Richard K.; Haussler, David; Chin, Chen-Shan; Eichler, Evan E.

    2016-01-01

    Accurate sequence and assembly of genomes is a critical first step for studies of genetic variation. We generated a high-quality assembly of the gorilla genome using single-molecule, real-time sequence technology and a string graph de novo assembly algorithm. The new assembly improves contiguity by two to three orders of magnitude with respect to previously released assemblies, recovering 87% of missing reference exons and incomplete gene models. Although regions of large, high-identity segmental duplications remain largely unresolved, this comprehensive assembly provides new biological insight into genetic diversity, structural variation, gene loss, and representation of repeat structures within the gorilla genome. The approach provides a path forward for the routine assembly of mammalian genomes at a level approaching that of the current quality of the human genome. PMID:27034376

  18. Comparison of mitochondrial genome sequences of pangolins (Mammalia, Pholidota).

    PubMed

    Hassanin, Alexandre; Hugot, Jean-Pierre; van Vuuren, Bettine Jansen

    2015-04-01

    The complete mitochondrial genome was sequenced for three species of pangolins, Manis javanica, Phataginus tricuspis, and Smutsia temminckii, and comparisons were made with two other species, Manis pentadactyla and Phataginus tetradactyla. The genome of Manidae contains the 37 genes found in a typical mammalian genome, and the structure of the control region is highly conserved among species. In Manis, the overall base composition differs from that found in African genera. Phylogenetic analyses support the monophyly of the genera Manis, Phataginus, and Smutsia, as well as the basal division between Maninae and Smutsiinae. Comparisons with GenBank sequences reveal that the reference genomes of M. pentadactyla and P. tetradactyla (accession numbers NC_016008 and NC_004027) were sequenced from misidentified taxa, and that a new species of tree pangolin should be described in Gabon.

  19. Complete genome sequence of Pirellula staleyi type strain (ATCC 27377).

    PubMed

    Clum, Alicia; Tindall, Brian J; Sikorski, Johannes; Ivanova, Natalia; Mavrommatis, Konstantinos; Lucas, Susan; Glavina Del Rio, Tijana; Nolan, Matt; Chen, Feng; Tice, Hope; Pitluck, Sam; Cheng, Jan-Fang; Chertkov, Olga; Brettin, Thomas; Han, Cliff; Detter, John C; Kuske, Cheryl; Bruce, David; Goodwin, Lynne; Ovchinikova, Galina; Pati, Amrita; Mikhailova, Natalia; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Chain, Patrick; Rohde, Manfred; Göker, Markus; Bristow, Jim; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Lapidus, Alla

    2009-12-30

    Pirellula staleyi Schlesner and Hirsch 1987 is the type species of the genus Pirellula of the family Planctomycetaceae. Members of this pear- or teardrop-shaped bacterium show a clearly visible pointed attachment pole and can be distinguished from other Planctomycetes by a lack of true stalks. Strains closely related to the species have been isolated from fresh and brackish water, as well as from hypersaline lakes. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the order Planctomyces and only the second sequence from the phylum Planctobacteria/Planctomycetes. The 6,196,199 bp long genome with its 4773 protein-coding and 49 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  20. Complete genome sequence of Shewanella putrefaciens. Final report

    SciTech Connect

    Heidelberg, John F.

    2001-04-01

    Seventy percent of the costs for genome sequencing Shewanella putrefaciens (oneidensis) were requested. These funds were expected to allow completion of the low-pass (5-fold) random sequencing and complete closure and annotation of the 200 kbp plasmid. Because of cost reduction that occurred during the period of this grant, these goals have been far exceeded. Currently, the S. putrefaciens genome is very nearly completely closed, even though the genome was significantly larger than expected and extremely repetitive. The entire genome sequence has been made BLAST searchable on the TIGR web page, and an extensive effort has been made to make data and analyses available to all researchers working on S. putrefaciens (oneidensis).

  1. Complete Genome Sequence of the Alfalfa latent virus

    PubMed Central

    Shao, Jonathan; Postnikova, Olga A.

    2015-01-01

    The first complete genome sequence of the Alfalfa latent carlavirus (ALV) was obtained by primer walking and Illumina RNA sequencing. The virus differs substantially from the Czech ALV isolate and the Pea streak virus isolate from Wisconsin. The absence of a clear nucleic acid-binding protein indicates ALV divergence from other carlaviruses. PMID:25883281

  2. Complete Genome Sequence of Spiroplasma sp. NBRC 100390

    PubMed Central

    Haryono, Mindia; Lo, Wen-Sui; Gasparich, Gail E.

    2017-01-01

    Spiroplasma sp. NBRC 100390 was initially described as a duplicate of S. atrichopogonis GNAT3597T (=ATCC BAA-520T) but later found to be different in the 16S rDNA sequences. Here, we report the complete genome sequence of this bacterium to establish its identity and to facilitate future investigation. PMID:28280009

  3. Complete Genome Sequence of Kocuria palustris MU14/1

    PubMed Central

    Foecking, Mark F.

    2015-01-01

    Presented here is the first completely assembled genome sequence of Kocuria palustris, an actinobacterial species with broad ecological distribution. The single, circular chromosome of K. palustris MU14/1 comprises 2,854,447 bp, has a G+C content of 70.5%, and contains a deduced gene set of 2,521 coding sequences. PMID:26472837

  4. Sequencing the Genome of the Heirloom Watermelon Cultivar Charleston Gray

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genome of the watermelon cultivar Charleston Gray, a major heirloom which has been used in breeding programs of many watermelon cultivars, was sequenced. Our strategy involved a hybrid approach using the Illumina and 454/Titanium next-generation sequencing technologies. For Illumina, shotgun g...

  5. Genome sequence of Stachybotrys chartarum Strain 51-11

    EPA Science Inventory

    The Stachybotrys chartarum strain 51-11 genome was sequenced by shotgun sequencing utilizing Illumina HiSeq 2000 and PacBio long-read technology. Since Stachybotrys chartarum has been implicated in health impacts within water-damaged buildings, any information extracted from the geno...

  6. Draft Genome Sequence of Aeromonas sp. Strain EERV15

    PubMed Central

    Ehsani, Elham; Barrantes, Israel; Vandermaesen, Johanna; Geffers, Robert; Jarek, Michael; Boon, Nico; Springael, Dirk; Pieper, Dietmar H.

    2016-01-01

    We report here the draft genome sequence of Aeromonas sp. strain EERV15, isolated from a sand filter. The organism most closely related to Aeromonas sp. EERV15 is Aeromonas veronii B565, with an average 83% amino acid sequence similarity of putatively encoded protein open reading frames. PMID:27540061

  7. Genomic Sequencing of Single Microbial Cells from Environmental Samples

    SciTech Connect

    Ishoey, Thomas; Woyke, Tanja; Stepanauskas, Ramunas; Novotny, Mark; Lasken, Roger S.

    2008-02-01

    Recently developed techniques allow genomic DNA sequencing from single microbial cells [Lasken RS: Single-cell genomic sequencing using multiple displacement amplification, Curr Opin Microbiol 2007, 10:510-516]. Here, we focus on research strategies for putting these methods into practice in the laboratory setting. An immediate consequence of single-cell sequencing is that it provides an alternative to culturing organisms as a prerequisite for genomic sequencing. The microgram amounts of DNA required as template are amplified from a single bacterium by a method called multiple displacement amplification (MDA) avoiding the need to grow cells. The ability to sequence DNA from individual cells will likely have an immense impact on microbiology considering the vast numbers of novel organisms, which have been inaccessible unless culture-independent methods could be used. However, special approaches have been necessary to work with amplified DNA. MDA may not recover the entire genome from the single copy present in most bacteria. Also, some sequence rearrangements can occur during the DNA amplification reaction. Over the past two years many research groups have begun to use MDA, and some practical approaches to single-cell sequencing have been developed. We review the consensus that is emerging on optimum methods, reliability of amplified template, and the proper interpretation of 'composite' genomes which result from the necessity of combining data from several single-cell MDA reactions in order to complete the assembly. Preferred laboratory methods are considered on the basis of experience at several large sequencing centers where >70% of genomes are now often recovered from single cells. Methods are reviewed for preparation of bacterial fractions from environmental samples, single-cell isolation, DNA amplification by MDA, and DNA sequencing.

  8. BreCAN-DB: a repository cum browser of personalized DNA breakpoint profiles of cancer genomes

    PubMed Central

    Narang, Pankaj; Dhapola, Parashar; Chowdhury, Shantanu

    2016-01-01

    BreCAN-DB (http://brecandb.igib.res.in) is a repository cum browser of whole genome somatic DNA breakpoint profiles of cancer genomes, mapped at single nucleotide resolution using deep sequencing data. These breakpoints are associated with deletions, insertions, inversions, tandem duplications, translocations and a combination of these structural genomic alterations. The current release of BreCAN-DB features breakpoint profiles from 99 cancer-normal pairs, comprising five cancer types. We identified DNA breakpoints across genomes using high-coverage next-generation sequencing data obtained from TCGA and dbGaP. Further, in these cancer genomes, we methodically identified breakpoint hotspots which were significantly enriched with somatic structural alterations. To visualize the breakpoint profiles, a next-generation genome browser was integrated with BreCAN-DB. Moreover, we also included previously reported breakpoint profiles from 138 cancer-normal pairs, spanning 10 cancer types, in the browser. Additionally, BreCAN-DB allows one to identify breakpoint hotspots in user-uploaded data sets. We have also included a functionality to query the overlap of any breakpoint profile with regions of the user's interest. Users can download breakpoint profiles from the database or may submit their data to be integrated in BreCAN-DB. We believe that BreCAN-DB will be a useful resource for the genomics scientific community and is a step towards personalized cancer genomics. PMID:26586806

  9. BreCAN-DB: a repository cum browser of personalized DNA breakpoint profiles of cancer genomes.

    PubMed

    Narang, Pankaj; Dhapola, Parashar; Chowdhury, Shantanu

    2016-01-04

    BreCAN-DB (http://brecandb.igib.res.in) is a repository cum browser of whole genome somatic DNA breakpoint profiles of cancer genomes, mapped at single nucleotide resolution using deep sequencing data. These breakpoints are associated with deletions, insertions, inversions, tandem duplications, translocations and a combination of these structural genomic alterations. The current release of BreCAN-DB features breakpoint profiles from 99 cancer-normal pairs, comprising five cancer types. We identified DNA breakpoints across genomes using high-coverage next-generation sequencing data obtained from TCGA and dbGaP. Further, in these cancer genomes, we methodically identified breakpoint hotspots which were significantly enriched with somatic structural alterations. To visualize the breakpoint profiles, a next-generation genome browser was integrated with BreCAN-DB. Moreover, we also included previously reported breakpoint profiles from 138 cancer-normal pairs, spanning 10 cancer types, in the browser. Additionally, BreCAN-DB allows one to identify breakpoint hotspots in user-uploaded data sets. We have also included a functionality to query the overlap of any breakpoint profile with regions of the user's interest. Users can download breakpoint profiles from the database or may submit their data to be integrated in BreCAN-DB. We believe that BreCAN-DB will be a useful resource for the genomics scientific community and is a step towards personalized cancer genomics.

  10. Modeling genome coverage in single-cell sequencing

    PubMed Central

    Daley, Timothy; Smith, Andrew D.

    2014-01-01

    Motivation: Single-cell DNA sequencing is necessary for examining genetic variation at the cellular level, which remains hidden in bulk sequencing experiments. But because such experiments begin with very small amounts of starting material, the amount of information obtained from a single-cell sequencing experiment is highly sensitive to the choice of protocol employed and to variability in library preparation. In particular, the fraction of the genome represented in single-cell sequencing libraries exhibits extreme variability due to quantitative biases in amplification and loss of genetic material. Results: We propose a method to predict the genome coverage of a deep sequencing experiment using information from an initial shallow sequencing experiment mapped to a reference genome. The observed coverage statistics are used in a non-parametric empirical Bayes Poisson model to estimate the gain in coverage from deeper sequencing. This approach allows researchers to know statistical features of deep sequencing experiments without actually sequencing deeply, providing a basis for optimizing and comparing single-cell sequencing protocols or screening libraries. Availability and implementation: The method is available as part of the preseq software package. Source code is available at http://smithlabresearch.org/preseq. Contact: andrewds@usc.edu Supplementary information: Supplementary material is available at Bioinformatics online. PMID:25107873
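
    For orientation, the idealized baseline that single-cell libraries depart from can be written down in a few lines. The sketch below is not the preseq empirical Bayes method described in this record; it is the classical Lander-Waterman uniform-sampling approximation, in which reads land uniformly at random and the expected fraction of bases covered at least once is 1 - exp(-c), with c the mean depth. The function name and the example numbers are illustrative assumptions.

```python
import math


def expected_coverage_fraction(n_reads, read_length, genome_size):
    """Expected fraction of the genome covered by at least one read.

    Assumes reads are placed uniformly at random (Lander-Waterman
    approximation, edge effects ignored). Amplification bias in
    single-cell libraries violates this assumption, so real coverage
    is typically lower than this idealized value.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    mean_depth = n_reads * read_length / genome_size
    return 1.0 - math.exp(-mean_depth)


# Hypothetical example: 1 million 100-bp reads on a 100-Mb genome (1x depth).
shallow = expected_coverage_fraction(1_000_000, 100, 100_000_000)
# Ten times more reads (10x depth) under the same idealized model.
deep = expected_coverage_fraction(10_000_000, 100, 100_000_000)
print(f"1x depth:  {shallow:.3f}")
print(f"10x depth: {deep:.5f}")
```

    Comparing an observed covered fraction against this baseline gives a rough sense of how strongly a library deviates from uniform sampling, which is the gap a data-driven model is meant to capture.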

  11. The Cancer Genome Atlas Pan-Cancer analysis project.

    PubMed

    Weinstein, John N; Collisson, Eric A; Mills, Gordon B; Shaw, Kenna R Mills; Ozenberger, Brad A; Ellrott, Kyle; Shmulevich, Ilya; Sander, Chris; Stuart, Joshua M

    2013-10-01

    The Cancer Genome Atlas (TCGA) Research Network has profiled and analyzed large numbers of human tumors to discover molecular aberrations at the DNA, RNA, protein and epigenetic levels. The resulting rich data provide a major opportunity to develop an integrated picture of commonalities, differences and emergent themes across tumor lineages. The Pan-Cancer initiative compares the first 12 tumor types profiled by TCGA. Analysis of the molecular aberrations and their functional roles across tumor types will teach us how to extend therapies effective in one cancer type to others with a similar genomic profile.

  12. Revisiting the sequencing of the first tree genome: Populus trichocarpa.

    PubMed

    Wullschleger, Stan D; Weston, D J; DiFazio, S P; Tuskan, G A

    2013-04-01

    Ten years ago, it was announced that the Joint Genome Institute with funds provided by the Department of Energy, Office of Science, Biological and Environmental Research would sequence the black cottonwood (Populus trichocarpa Torr. & Gray) genome. This landmark decision was the culmination of work by the forest science community to develop Populus as a model system. Since its public release in late 2006, the availability of the Populus genome has spawned research in plant biology, morphology, genetics and ecology. Here we address how the tree physiologist has used this resource. More specifically, we revisit our earlier contention that the rewards of sequencing the Populus genome would depend on how quickly scientists working with woody perennials could adopt molecular approaches to investigate the mechanistic underpinnings of basic physiological processes. Several examples illustrate the integration of functional and comparative genomics into the forest sciences, especially in areas that target improved understanding of the developmental differences between woody perennials and herbaceous annuals (e.g., phase transitions). Sequencing the Populus genome and the availability of genetic and genomic resources has also been instrumental in identifying candidate genes that underlie physiological and morphological traits of interest. Genome-enabled research has advanced our understanding of how phenotype and genotype are related and provided insights into the genetic mechanisms whereby woody perennials adapt to environmental stress. In the future, we anticipate that low-cost, high-throughput sequencing will continue to facilitate research in tree physiology and enhance our understanding at scales of individual organisms and populations. A challenge remains, however, as to how genomic resources, including the Populus genome, can be used to understand ecosystem function. Although examples are limited, progress in this area is encouraging and will undoubtedly improve as

  13. Draft Genome Sequence of Mycobacterium colombiense

    PubMed Central

    Bouam, Amar; Robert, Catherine; Levasseur, Anthony

    2017-01-01

    Mycobacterium colombiense is a rapidly growing mycobacterium initially isolated from the blood of an HIV-positive patient in Colombia. Its 5,854,893-bp draft genome exhibits a G+C content of 67.64%, 5,233 protein-coding genes, and 54 predicted RNA genes. PMID:28385843

  14. Complete genome sequence of southern tomato virus identified from China using next generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complete genome sequence of a double-stranded RNA (dsRNA) virus, southern tomato virus (STV), on tomatoes in China was elucidated using small RNA deep sequencing. The identified STV_CN12 shares 99% sequence identity with other isolates from Mexico, France, Spain, and the U.S. This is the first report ...

  15. Interplay Between the Cancer Genome and Epigenome

    PubMed Central

    Shen, Hui; Laird, Peter W.

    2013-01-01

    Cancer arises as a consequence of cumulative disruptions to cellular growth control, with Darwinian selection for those heritable changes which provide the greatest clonal advantage. These traits can be acquired and stably maintained by either genetic or epigenetic means. Here we explore the ways in which alterations in the genome and epigenome influence each other and cooperate to promote oncogenic transformation. Disruption of epigenomic control is pervasive in malignancy, and can be classified as an enabling characteristic of cancer cells, akin to genome instability and mutation. PMID:23540689

  16. Sequence Analysis of the Genome of Carnation (Dianthus caryophyllus L.)

    PubMed Central

    Yagi, Masafumi; Kosugi, Shunichi; Hirakawa, Hideki; Ohmiya, Akemi; Tanase, Koji; Harada, Taro; Kishimoto, Kyutaro; Nakayama, Masayoshi; Ichimura, Kazuo; Onozaki, Takashi; Yamaguchi, Hiroyasu; Sasaki, Nobuhiro; Miyahara, Taira; Nishizaki, Yuzo; Ozeki, Yoshihiro; Nakamura, Noriko; Suzuki, Takamasa; Tanaka, Yoshikazu; Sato, Shusei; Shirasawa, Kenta; Isobe, Sachiko; Miyamura, Yoshinori; Watanabe, Akiko; Nakayama, Shinobu; Kishida, Yoshie; Kohara, Mitsuyo; Tabata, Satoshi

    2014-01-01

    The whole-genome sequence of carnation (Dianthus caryophyllus L.) cv. ‘Francesco’ was determined using a combination of different new-generation multiplex sequencing platforms. The total length of the non-redundant sequences was 568 887 315 bp, consisting of 45 088 scaffolds, which covered 91% of the 622 Mb carnation genome estimated by k-mer analysis. The N50 values of contigs and scaffolds were 16 644 bp and 60 737 bp, respectively, and the longest scaffold was 1 287 144 bp. The average GC content of the contig sequences was 36%. A total of 1050, 13, 92 and 143 genes for tRNAs, rRNAs, snoRNA and miRNA, respectively, were identified in the assembled genomic sequences. For protein-encoding genes, 43 266 complete and partial gene structures excluding those in transposable elements were deduced. Gene coverage was ∼98%, as deduced from the coverage of the core eukaryotic genes. Intensive characterization of the assigned carnation genes and comparison with those of other plant species revealed characteristic features of the carnation genome. The results of this study will serve as a valuable resource for fundamental and applied research of carnation, especially for breeding new carnation varieties. Further information on the genomic sequences is available at http://carnation.kazusa.or.jp. PMID:24344172
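
    The N50 values quoted in this record follow the usual assembly-statistics definition: the length L such that sequences of length at least L together account for at least half of the total assembled length. A minimal sketch, assuming that definition, is given below. The function names and the toy contig lengths are illustrative and not taken from the carnation study; the final line only re-checks the reported arithmetic of 568 887 315 bp against the 622 Mb estimate.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the largest length L such that sequences of length >= L
    together cover at least half of the total assembled length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def covered_fraction(assembled_bp, estimated_genome_bp):
    """Fraction of the estimated genome size represented in the assembly."""
    return assembled_bp / estimated_genome_bp


# Hypothetical contig lengths: total 290, half 145; 80 + 70 = 150 >= 145.
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))  # 70

# Figures reported in the abstract: 568,887,315 bp out of an estimated 622 Mb.
print(round(covered_fraction(568_887_315, 622_000_000), 2))  # 0.91
```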

  17. Sequence analysis of the genome of carnation (Dianthus caryophyllus L.).

    PubMed

    Yagi, Masafumi; Kosugi, Shunichi; Hirakawa, Hideki; Ohmiya, Akemi; Tanase, Koji; Harada, Taro; Kishimoto, Kyutaro; Nakayama, Masayoshi; Ichimura, Kazuo; Onozaki, Takashi; Yamaguchi, Hiroyasu; Sasaki, Nobuhiro; Miyahara, Taira; Nishizaki, Yuzo; Ozeki, Yoshihiro; Nakamura, Noriko; Suzuki, Takamasa; Tanaka, Yoshikazu; Sato, Shusei; Shirasawa, Kenta; Isobe, Sachiko; Miyamura, Yoshinori; Watanabe, Akiko; Nakayama, Shinobu; Kishida, Yoshie; Kohara, Mitsuyo; Tabata, Satoshi

    2014-06-01

    The whole-genome sequence of carnation (Dianthus caryophyllus L.) cv. 'Francesco' was determined using a combination of different new-generation multiplex sequencing platforms. The total length of the non-redundant sequences was 568,887,315 bp, consisting of 45,088 scaffolds, which covered 91% of the 622 Mb carnation genome estimated by k-mer analysis. The N50 values of contigs and scaffolds were 16,644 bp and 60,737 bp, respectively, and the longest scaffold was 1,287,144 bp. The average GC content of the contig sequences was 36%. A total of 1050, 13, 92 and 143 genes for tRNAs, rRNAs, snoRNA and miRNA, respectively, were identified in the assembled genomic sequences. For protein-encoding genes, 43,266 complete and partial gene structures excluding those in transposable elements were deduced. Gene coverage was ∼98%, as deduced from the coverage of the core eukaryotic genes. Intensive characterization of the assigned carnation genes and comparison with those of other plant species revealed characteristic features of the carnation genome. The results of this study will serve as a valuable resource for fundamental and applied research of carnation, especially for breeding new carnation varieties. Further information on the genomic sequences is available at http://carnation.kazusa.or.jp.

  18. Human Cancer Models Initiative | Office of Cancer Genomics

    Cancer.gov

    The Human Cancer Models Initiative (HCMI) is an international consortium that is generating novel human tumor-derived culture models, which are annotated with genomic and clinical data. In an effort to advance cancer research and more fully understand how in vitro findings are related to clinical biology, HCMI-developed models and related data will be available as a community resource for cancer research.

  19. Genome sequencing of a single tardigrade Hypsibius dujardini individual

    PubMed Central

    Arakawa, Kazuharu; Yoshida, Yuki; Tomita, Masaru

    2016-01-01

    Tardigrades are ubiquitous microscopic animals that play an important role in the study of metazoan phylogeny. Most terrestrial tardigrades can withstand extreme environments by entering an ametabolic desiccated state termed anhydrobiosis. Due to their small size and the non-axenic nature of laboratory cultures, molecular studies of tardigrades are prone to contamination. To minimize the possibility of microbial contaminations and to obtain high-quality genomic information, we have developed an ultra-low input library sequencing protocol to enable the genome sequencing of a single tardigrade Hypsibius dujardini individual. Here, we describe the details of our sequencing data and the ultra-low input library preparation methodologies. PMID:27529330

  20. Genome Sequence of Luminous Piezophile Photobacterium phosphoreum ANT-2200

    PubMed Central

    Zhang, Sheng-Da; Barbe, Valérie; Garel, Marc; Zhang, Wei-Jia; Chen, Haitao; Santini, Claire-Lise; Murat, Dorothée; Jing, Hongmei; Zhao, Yuan; Lajus, Aurélie; Martini, Séverine; Pradel, Nathalie; Tamburini, Christian

    2014-01-01

    Bacteria of the genus Photobacterium thrive worldwide in oceans and show substantially varied lifestyles, including free-living, commensal, pathogenic, symbiotic, and piezophilic. Here, we present the genome sequence of a luminous, piezophilic Photobacterium phosphoreum strain, ANT-2200, isolated from a water column at 2,200 m depth in the Mediterranean Sea. It is the first genomic sequence of the P. phosphoreum group. An analysis of the sequence provides insight into the adaptation of bacteria to the deep-sea habitat. PMID:24744322

  1. Genome sequencing of a single tardigrade Hypsibius dujardini individual.

    PubMed

    Arakawa, Kazuharu; Yoshida, Yuki; Tomita, Masaru

    2016-08-16

    Tardigrades are ubiquitous microscopic animals that play an important role in the study of metazoan phylogeny. Most terrestrial tardigrades can withstand extreme environments by entering an ametabolic desiccated state termed anhydrobiosis. Due to their small size and the non-axenic nature of laboratory cultures, molecular studies of tardigrades are prone to contamination. To minimize the possibility of microbial contaminations and to obtain high-quality genomic information, we have developed an ultra-low input library sequencing protocol to enable the genome sequencing of a single tardigrade Hypsibius dujardini individual. Here, we describe the details of our sequencing data and the ultra-low input library preparation methodologies.

  2. [Organization of simple sequences in the Drosophila melanogaster genome].

    PubMed

    Vashakidze, R P; Mamulashvili, N A; Kalandarishvili, K G; Kolchinskiĭ, A M; Zaalishvili, M M

    1988-01-01

    Fragments of Drosophila melanogaster DNA that intensively hybridize with the simple sequences poly[(dG-dT).(dC-dA)], poly[(dA).(dT)] and poly[(dG-dA).(dC-dT)] were cloned. The first two types of simple sequences are organized in these clones as separated stretches of moderate length, repeated many times within 12-15 kb. Each cluster contains only one type of simple sequence and originates from a unique region in the genome. In contrast, poly[(dG-dA).(dC-dT)] occurs in the genome as several isolated motifs.

  3. Genomics of medulloblastoma: from Giemsa-banding to next-generation sequencing in 20 years.

    PubMed

    Northcott, Paul A; Rutka, James T; Taylor, Michael D

    2010-01-01

    Advances in the field of genomics have recently enabled the unprecedented characterization of the cancer genome, providing novel insight into the molecular mechanisms underlying malignancies in humans. The application of high-resolution microarray platforms to the study of medulloblastoma has revealed new oncogenes and tumor suppressors and has implicated changes in DNA copy number, gene expression, and methylation state in its etiology. Additionally, the integration of medulloblastoma genomics with patient clinical data has confirmed molecular markers of prognostic significance and highlighted the potential utility of molecular disease stratification. The advent of next-generation sequencing technologies promises to greatly transform our understanding of medulloblastoma pathogenesis in the next few years, permitting comprehensive analyses of all aspects of the genome and increasing the likelihood that genomic medicine will become part of the routine diagnosis and treatment of medulloblastoma.

  4. David Haussler, Ph.D., Lectures on Cancer Genomics - TCGA

    Cancer.gov

    In this lecture, Dr. David Haussler provides a historical overview of the field of genomics leading up to TCGA, including the Cancer Genomics Hub at the University of California, Santa Cruz, and the TCGA Pan-Cancer initiative.

  5. Childhood Cancer Genomics Gaps and Opportunities - Workshop Summary

    Cancer.gov

    NCI convened a workshop of representative research teams that have been leaders in defining the genomic landscape of childhood cancers to discuss the influence of genomic discoveries on the future of childhood cancer research.

  6. The Cancer Genome Atlas (TCGA): The next stage - TCGA

    Cancer.gov

    The Cancer Genome Atlas (TCGA), the NIH research program that has helped set the standards for characterizing the genomic underpinnings of dozens of cancers on a large scale, is moving to its next phase.

  7. Legume genomics: understanding biology through DNA and RNA sequencing

    PubMed Central

    O'Rourke, Jamie A.; Bolon, Yung-Tsi; Bucciarelli, Bruna; Vance, Carroll P.

    2014-01-01

    Background: The legume family (Leguminosae) consists of approx. 17 000 species. A few of these species, including, but not limited to, Phaseolus vulgaris, Cicer arietinum and Cajanus cajan, are important dietary components, providing protein for approx. 300 million people worldwide. Additional species, including soybean (Glycine max) and alfalfa (Medicago sativa), are important crops utilized mainly in animal feed. In addition, legumes are important contributors to biological nitrogen, forming symbiotic relationships with rhizobia to fix atmospheric N2 and providing up to 30% of available nitrogen for the next season of crops. The application of high-throughput genomic technologies including genome sequencing projects, genome re-sequencing (DNA-seq) and transcriptome sequencing (RNA-seq) by the legume research community has provided major insights into genome evolution, genomic architecture and domestication. Scope and Conclusions: This review presents an overview of the current state of legume genomics and explores the role that next-generation sequencing technologies play in advancing legume genomics. The adoption of next-generation sequencing and implementation of associated bioinformatic tools has allowed researchers to turn each species of interest into their own model organism. To illustrate the power of next-generation sequencing, an in-depth overview of the transcriptomes of both soybean and white lupin (Lupinus albus) is provided. The soybean transcriptome focuses on analysing seed development in two near-isogenic lines, examining the role of transporters, oil biosynthesis and nitrogen utilization. The white lupin transcriptome analysis examines how phosphate deficiency alters gene expression patterns, inducing the formation of cluster roots. Such studies illustrate the power of next-generation sequencing and bioinformatic analyses in elucidating the gene networks underlying biological processes. PMID:24769535

  8. Genomic sequencing in clinical practice: applications, challenges, and opportunities

    PubMed Central

    Krier, Joel B.; Kalia, Sarah S.; Green, Robert C.

    2016-01-01

    The development of massively parallel sequencing (or next-generation sequencing) has facilitated a rapid implementation of genomic sequencing in clinical medicine. Genomic sequencing (GS) is now an essential tool for evaluating rare disorders, identifying therapeutic targets in neoplasms, and screening for prenatal aneuploidy. Emerging applications, such as GS for preconception carrier screening and predisposition screening in healthy individuals, are being explored in research settings and utilized by members of the public eager to incorporate genomic information into their health management. The rapid pace of adoption has created challenges for all stakeholders in clinical GS, from standardizing variant interpretation approaches in clinical molecular laboratories to ensuring that nongeneticist clinicians are prepared for new types of clinical information. Clinical GS faces a pivotal moment, as the vast potential of new quantities and types of data enables further clinical innovation and complicated implementation questions continue to be resolved. PMID:27757064

  9. Next-generation sequencing and large genome assemblies

    PubMed Central

    Henson, Joseph; Tischler, German; Ning, Zemin

    2012-01-01

    The next-generation sequencing (NGS) revolution has drastically reduced time and cost requirements for sequencing of large genomes, and also qualitatively changed the problem of assembly. This article reviews the state of the art in de novo genome assembly, paying particular attention to mammalian-sized genomes. The strengths and weaknesses of the main sequencing platforms are highlighted, leading to a discussion of assembly and the new challenges associated with NGS data. Current approaches to assembly are outlined and the various software packages available are introduced and compared. The question of whether quality assemblies can be produced using short-read NGS data alone, or whether it must be combined with more expensive sequencing techniques, is considered. Prospects for future assemblers and tests of assembly performance are also discussed. PMID:22676195

  10. [DNA analysis for the post genome-sequencing era].

    PubMed

    Kambara, Hideki

    2002-05-01

    With the completion of human genome sequencing, the post genome-sequencing era has begun. The major tasks are to clarify the functions of genes and to apply this information in medicine and in various industrial fields. Various DNA analysis methods and instruments for gene expression profiling and for analyzing genetic diversity, including SNP typing, are required and have been developed. Here, the history and technologies related to DNA analysis, including the Wada project in the early 1980s and the Human Genome Project from 1990, are described. Various new technologies have been developed in this decade. They include a capillary gel array DNA sequencer, DNA chips, bead probe arrays, a new DNA sequencing method using pyrosequencing and an efficient SNP typing method by BAMPER.

  11. Genomic sequencing in clinical practice: applications, challenges, and opportunities.

    PubMed

    Krier, Joel B; Kalia, Sarah S; Green, Robert C

    2016-09-01

    The development of massively parallel sequencing (or next-generation sequencing) has facilitated a rapid implementation of genomic sequencing in clinical medicine. Genomic sequencing (GS) is now an essential tool for evaluating rare disorders, identifying therapeutic targets in neoplasms, and screening for prenatal aneuploidy. Emerging applications, such as GS for preconception carrier screening and predisposition screening in healthy individuals, are being explored in research settings and utilized by members of the public eager to incorporate genomic information into their health management. The rapid pace of adoption has created challenges for all stakeholders in clinical GS, from standardizing variant interpretation approaches in clinical molecular laboratories to ensuring that nongeneticist clinicians are prepared for new types of clinical information. Clinical GS faces a pivotal moment, as the vast potential of new quantities and types of data enables further clinical innovation and complicated implementation questions continue to be resolved.

  12. Open-Access Cancer Genomics - Office of Cancer Clinical Proteomics Research

    Cancer.gov

    The completion of the Human Genome Project sparked a revolution in high-throughput genomics applied towards deciphering genetically complex diseases, like cancer. Now, almost 10 years later, we have a mountain of genomics data on many different cancer type

  13. Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome

    PubMed Central

    Bergman, Casey M; Pfeiffer, Barret D; Rincón-Limas, Diego E; Hoskins, Roger A; Gnirke, Andreas; Mungall, Chris J; Wang, Adrienne M; Kronmiller, Brent; Pacleb, Joanne; Park, Soo; Stapleton, Mark; Wan, Kenneth; George, Reed A; de Jong, Pieter J; Botas, Juan; Rubin, Gerald M; Celniker, Susan E

    2002-01-01

    Background: It is widely accepted that comparative sequence data can aid the functional annotation of genome sequences; however, the most informative species and features of genome evolution for comparison remain to be determined. Results: We analyzed conservation in eight genomic regions (apterous, even-skipped, fushi tarazu, twist, and Rhodopsins 1, 2, 3 and 4) from four Drosophila species (D. erecta, D. pseudoobscura, D. willistoni, and D. littoralis) covering more than 500 kb of the D. melanogaster genome. All D. melanogaster genes (and 78-82% of coding exons) identified in divergent species such as D. pseudoobscura show evidence of functional constraint. Addition of a third species can reveal functional constraint in otherwise non-significant pairwise exon comparisons. Microsynteny is largely conserved, with rearrangement breakpoints, novel transposable element insertions, and gene transpositions occurring in similar numbers. Rates of amino-acid substitution are higher in uncharacterized genes relative to genes that have previously been studied. Conserved non-coding sequences (CNCSs) tend to be spatially clustered with conserved spacing between CNCSs, and clusters of CNCSs can be used to predict enhancer sequences. Conclusions: Our results provide the basis for choosing species whose genome sequences would be most useful in aiding the functional annotation of coding and cis-regulatory sequences in Drosophila. Furthermore, this work shows how decoding the spatial organization of conserved sequences, such as the clustering of CNCSs, can complement efforts to annotate eukaryotic genomes on the basis of sequence conservation alone. PMID:12537575

  14. Spectral entropy criteria for structural segmentation in genomic DNA sequences

    NASA Astrophysics Data System (ADS)

    Chechetkin, V. R.; Lobzin, V. V.

    2004-07-01

    The spectral entropy is calculated with Fourier structure factors and characterizes the level of structural ordering in a sequence of symbols. It can be applied efficiently to the assessment and reconstruction of the modular structure in genomic DNA sequences. We present the relevant spectral entropy criteria for local and non-local structural segmentation in DNA sequences. The results are illustrated with model examples and an analysis of intervening exon-intron segments in protein-coding regions.
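    As a rough illustration of the idea in this abstract, the sketch below computes a spectral entropy for a DNA string. It is not the authors' published criterion: it assumes one common definition, the Shannon entropy of the normalized Fourier power spectrum, summed over the four 0/1 base-indicator sequences. The function names `structure_factors` and `spectral_entropy` and the near-zero cutoff are hypothetical choices made for this example.

```python
import cmath
import math

BASES = "ACGT"


def structure_factors(seq, base):
    """Power spectrum |rho(q_k)|^2 / N of the 0/1 indicator of one base.

    Only harmonics k = 1 .. N//2 are returned; k = 0 (the mean) is skipped.
    """
    n = len(seq)
    indicator = [1.0 if s == base else 0.0 for s in seq]
    spectrum = []
    for k in range(1, n // 2 + 1):
        # Plain O(N^2) discrete Fourier transform, adequate for short windows.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * m / n)
            for m, x in enumerate(indicator)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def spectral_entropy(seq):
    """Shannon entropy (in nats) of the normalized, base-summed spectrum.

    Assumed definition for illustration only. Lower values mean the
    spectral power sits in few harmonics, i.e. stronger periodic ordering.
    """
    seq = seq.upper()
    total = [0.0] * (len(seq) // 2)
    for base in BASES:
        for i, value in enumerate(structure_factors(seq, base)):
            total[i] += value
    norm = sum(total)
    if norm < 1e-12:  # hypothetical cutoff: no power beyond the mean
        return 0.0
    probs = [v / norm for v in total]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


periodic = "ACGT" * 8                           # strict period-4 repeat
irregular = "ACGGTCATTGCAAGTCCGATAGCTTACGGATC"  # no obvious repeat
print(spectral_entropy(periodic), spectral_entropy(irregular))
```

    Under this assumed definition, the strictly periodic string concentrates its power in two harmonics and so scores lower than the irregular one. A segmentation procedure could evaluate the same quantity over sliding windows, though the window size and decision thresholds would have to come from the paper itself.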

  15. Next Generation DNA Sequencing and the Future of Genomic Medicine

    PubMed Central

    Anderson, Matthew W.; Schrijver, Iris

    2010-01-01

    In the years since the first complete human genome sequence was reported, there has been a rapid development of technologies to facilitate high-throughput sequence analysis of DNA (termed “next-generation” sequencing). These novel approaches to DNA sequencing offer the promise of complete genomic analysis at a cost feasible for routine clinical diagnostics. However, the ability to more thoroughly interrogate genomic sequence raises a number of important issues with regard to result interpretation, laboratory workflow, data storage, and ethical considerations. This review describes the current high-throughput sequencing platforms commercially available, and compares the inherent advantages and disadvantages of each. The potential applications for clinical diagnostics are considered, as well as the need for software and analysis tools to interpret the vast amount of data generated. Finally, we discuss the clinical and ethical implications of the wealth of genetic information generated by these methods. Despite the challenges, we anticipate that the evolution and refinement of high-throughput DNA sequencing technologies will catalyze a new era of personalized medicine based on individualized genomic analysis. PMID:24710010

  16. The Impact of the Cancer Genome Atlas on Lung Cancer

    PubMed Central

    Chang, Jeremy Tzu-Huai; Lee, Yee-Ming; Huang, R. Stephanie

    2015-01-01

    The Cancer Genome Atlas (TCGA) has profiled over 10,000 samples derived from 33 types of cancer to date, with the goal of improving our understanding of the molecular basis of cancer and advancing our ability to diagnose, treat, and prevent cancer. This review focuses on lung cancer as it is the leading cause of cancer-related mortality worldwide in both men and women. Particularly, non-small cell lung cancers (including lung adenocarcinoma and lung squamous cell carcinoma) were evaluated. Our goal is to demonstrate the impact of TCGA on lung cancer research under four themes: namely, diagnostic markers, disease progression markers, novel therapeutic targets, and novel tools. Examples were given related to DNA mutation, copy number variation, mRNA, and microRNA expression along with methylation profiling. PMID:26318634

  17. Overview | Office of Cancer Genomics

    Cancer.gov

    The Therapeutically Applicable Research to Generate Effective Treatments (TARGET) initiative uses comprehensive molecular characterization to determine the genetic changes that drive the initiation and progression of hard-to-treat childhood cancers. TARGET aims to identify therapeutic targets and prognostic markers so that new, more effective treatment strategies can be developed and applied. Novel pediatric cancer treatments are needed because:

  18. Genome Sequence of Mycobacteriophage ErnieJ.

    PubMed

    Robinson, Courtney J; London-Thomas, Laricca Y; Dickson, Leon A; Clinton, Tiffany A; Baig, Hana; Bute, Maude; Fahad, Mohammed; Farrakhan, Kanhai; Grady, Neshaun; Guthrie, Nicholas E; Hafid, Ruoa; Harvey, Jayla; Hunnicutt, Kellie; Larsen, Victoria L; McDuffie, Taashaylaray; McGee, Earyn N; Pailin, Jillian Y; Peacock, Bria; Thomas, Antolice; Anderson, Winston A

    2016-11-23

    ErnieJ, a cluster C mycobacteriophage that infects Mycobacterium smegmatis mc²155, was recovered from soil in Washington, DC. Its genome is 153,243 bp in size and encodes 227 predicted proteins, 30 tRNAs, and one transfer-messenger RNA (tmRNA). Ten percent of the predicted proteins have homologs in phages that infect nonmycobacterial Actinobacteria.

  19. Genome Sequence of Mycobacteriophage ErnieJ

    PubMed Central

    London-Thomas, Laricca Y.; Dickson, Leon A.; Clinton, Tiffany A.; Baig, Hana; Bute, Maude; Fahad, Mohammed; Farrakhan, Kanhai; Grady, Neshaun; Guthrie, Nicholas E.; Hafid, Ruoa; Harvey, Jayla; Hunnicutt, Kellie; Larsen, Victoria L.; McDuffie, Taashaylaray; McGee, Earyn N.; Pailin, Jillian Y.; Peacock, Bria; Thomas, Antolice; Anderson, Winston A.

    2016-01-01

    ErnieJ, a cluster C mycobacteriophage that infects Mycobacterium smegmatis mc²155, was recovered from soil in Washington, DC. Its genome is 153,243 bp in size and encodes 227 predicted proteins, 30 tRNAs, and one transfer-messenger RNA (tmRNA). Ten percent of the predicted proteins have homologs in phages that infect nonmycobacterial Actinobacteria. PMID:27881532

  20. Genome Sequencing Fishes out Longevity Genes.

    PubMed

    Lakhina, Vanisha; Murphy, Coleen T

    2015-12-03

    Understanding the molecular basis underlying aging is critical if we are to fully understand how and why we age, and possibly how to delay the aging process. Up until now, most longevity pathways were discovered in invertebrates because of their short lifespans and availability of genetic tools. Now, Reichwald et al. and Valenzano et al. independently provide a reference genome for the short-lived African turquoise killifish, establishing its role as a vertebrate system for aging research.