Science.gov

Sample records for cancer genome sequences

  1. Genome Sequencing and Cancer

    PubMed Central

    Mardis, Elaine R.

    2012-01-01

    New technologies for DNA sequencing, coupled with advanced analytical approaches, are now providing unprecedented speed and precision in decoding human genomes. This combination of technology and analysis, when applied to the study of cancer genomes, is revealing specific and novel information about the fundamental genetic mechanisms that underlie cancer’s development and progression. This review outlines the history of the past several years of development in this realm, and discusses the current and future applications that will further elucidate cancer’s genomic causes. PMID:22534183

  2. Cancer Genome Sequencing and Its Implications for Personalized Cancer Vaccines

    PubMed Central

    Li, Lijin; Goedegebuure, Peter; Mardis, Elaine R.; Ellis, Matthew J.C.; Zhang, Xiuli; Herndon, John M.; Fleming, Timothy P.; Carreno, Beatriz M.; Hansen, Ted H.; Gillanders, William E.

    2011-01-01

    New DNA sequencing platforms have revolutionized human genome sequencing. The dramatic advances in genome sequencing technologies predict that the $1,000 genome will become a reality within the next few years. Applied to cancer, the availability of cancer genome sequences permits real-time decision-making with the potential to affect diagnosis, prognosis, and treatment, and has opened the door towards personalized medicine. A promising strategy is the identification of mutated tumor antigens, and the design of personalized cancer vaccines. Supporting this notion are preliminary analyses of the epitope landscape in breast cancer suggesting that individual tumors express significant numbers of novel antigens to the immune system that can be specifically targeted through cancer vaccines. PMID:24213133

  3. Reconstructing cancer genomes from paired-end sequencing data

    PubMed Central

    2012-01-01

    Background: A cancer genome is derived from the germline genome through a series of somatic mutations. Somatic structural variants - including duplications, deletions, inversions, translocations, and other rearrangements - result in a cancer genome that is a scrambling of intervals, or "blocks" of the germline genome sequence. We present an efficient algorithm for reconstructing the block organization of a cancer genome from paired-end DNA sequencing data. Results: By aligning paired reads from a cancer genome - and a matched germline genome, if available - to the human reference genome, we derive: (i) a partition of the reference genome into intervals; (ii) adjacencies between these intervals in the cancer genome; (iii) an estimated copy number for each interval. We formulate the Copy Number and Adjacency Genome Reconstruction Problem of determining the cancer genome as a sequence of the derived intervals that is consistent with the measured adjacencies and copy numbers. We design an efficient algorithm, called Paired-end Reconstruction of Genome Organization (PREGO), to solve this problem by reducing it to an optimization problem on an interval-adjacency graph constructed from the data. The solution to the optimization problem results in an Eulerian graph, containing an alternating Eulerian tour that corresponds to a cancer genome that is consistent with the sequencing data. We apply our algorithm to five ovarian cancer genomes that were sequenced as part of The Cancer Genome Atlas. We identify numerous rearrangements, or structural variants, in these genomes, analyze reciprocal vs. non-reciprocal rearrangements, and identify rearrangements consistent with known mechanisms of duplication such as tandem duplications and breakage/fusion/bridge (B/F/B) cycles. Conclusions: We demonstrate that PREGO efficiently identifies complex and biologically relevant rearrangements in cancer genome sequencing data. An implementation of the PREGO algorithm is available at http://compbio.cs.brown.edu/software/. PMID:22537039
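
    The three derived quantities in this abstract (intervals, adjacencies, copy numbers) can be illustrated with a small sketch. It is not the PREGO implementation: the function names, the data layout, and the treatment of telomeres are assumptions made for illustration. It only checks a balance condition that any alternating Eulerian tour must satisfy, namely that at each non-telomeric interval end the adjacency multiplicities sum to the interval's copy number.

```python
from collections import defaultdict


def partition_reference(length, breakpoints):
    """Split [0, length) into consecutive intervals at the given breakpoints."""
    cuts = sorted({b for b in breakpoints if 0 < b < length})
    bounds = [0] + cuts + [length]
    return list(zip(bounds[:-1], bounds[1:]))


def is_balanced(copy_number, adjacencies, telomeres=frozenset()):
    """Check a necessary condition for an alternating Eulerian tour.

    copy_number: {interval_index: copies of that interval in the cancer genome}
    adjacencies: {((i, side_i), (j, side_j)): multiplicity}, side is "start" or "end"
    telomeres:   interval ends allowed to terminate the tour (exempt from the check)
    """
    load = defaultdict(int)
    for (u, v), multiplicity in adjacencies.items():
        load[u] += multiplicity
        load[v] += multiplicity  # a self-adjacency (u == v) counts twice, like a loop
    for index, copies in copy_number.items():
        for side in ("start", "end"):
            end = (index, side)
            if end not in telomeres and load[end] != copies:
                return False
    return True


# Toy example (hypothetical data): a tandem duplication of the middle interval.
intervals = partition_reference(100, [30, 60])
copy_number = {0: 1, 1: 2, 2: 1}
adjacencies = {
    ((0, "end"), (1, "start")): 1,  # reference adjacency
    ((1, "end"), (2, "start")): 1,  # reference adjacency
    ((1, "end"), (1, "start")): 1,  # variant adjacency created by the duplication
}
telomeres = {(0, "start"), (2, "end")}
print(intervals, is_balanced(copy_number, adjacencies, telomeres))
```

    The check is only necessary, not sufficient; the abstract describes the actual reconstruction as an optimization over the interval-adjacency graph.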

  4. Cancer whole-genome sequencing: present and future.

    PubMed

    Nakagawa, H; Wardell, C P; Furuta, M; Taniguchi, H; Fujimoto, A

    2015-12-01

    Recent explosive advances in next-generation sequencing technology and computational approaches to massive data enable us to analyze a number of cancer genome profiles by whole-genome sequencing (WGS). To explore cancer genomic alterations and their diversity comprehensively, global and local cancer genome-sequencing projects, including ICGC and TCGA, have been analyzing many types of cancer genomes mainly by exome sequencing. However, there is limited information on somatic mutations in non-coding regions, including untranslated regions, introns, regulatory elements and non-coding RNAs; rearrangements, which sometimes produce fusion genes, and pathogen detection in cancer genomes also remain largely unexplored. WGS approaches can detect these unexplored mutations, as well as coding mutations and somatic copy number alterations, and help us to better understand the whole landscape of cancer genomes and elucidate functions of these unexplored genomic regions. Analysis of cancer genomes using the present WGS platforms is still primitive and there are substantial improvements to be made in sequencing technologies, informatics and computer resources. Given the extreme diversity of cancer genomes and phenotypes, much more WGS data must also be analyzed and integrated with multi-omics, functional and clinical-pathological data across a large number of sample sets to interpret them more fully and efficiently. PMID:25823020

  5. Genome sequencing and phenotypic analysis of single cells in cancer

    E-print Network

    Adalsteinsson, Viktor Arnarson

    2015-01-01

    Relatively little is known about metastatic cancer. The vast majority of cancer genome profiling (~99%) is done on primary tumors, yet >90% of cancer-related deaths are attributed to metastatic cancer. The underlying ...

  6. The clinical potential and challenges of sequencing cancer genomes for personalized medical genomics.

    PubMed

    Cloonan, Nicole; Waddell, Nic; Grimmond, Sean M

    2010-11-01

    Next-generation sequencing is revolutionizing the way in which genomic-scale biological research is performed, and its effects are beginning to be translated medically. Large-scale international collaborations for the comprehensive sequencing of the genome, epigenome, and transcriptomes of cancers and corresponding 'normal' (germ-line) DNA are heralding the start of personalized medical genomics. The promise of eliminating conjecture when determining treatment approaches is certainly appealing for both patients and clinicians; however, several major issues must be resolved before next-generation sequencing will be adopted as a routine clinical tool for patients. This feature review explores the clinical potential and challenges of studying cancer genomes for personalized medical genomics. PMID:21046525

  7. Genome Sequencing Centers

    Cancer.gov

    The Cancer Genome Atlas (TCGA) Genome Sequencing Centers (GSCs) perform large-scale DNA sequencing using the latest sequencing technologies. Supported by the National Human Genome Research Institute (NHGRI) large-scale sequencing program, the GSCs generate the enormous volume of data required by TCGA, while continually improving existing technologies and methods to expand the frontier of what can be achieved in cancer genome sequencing.

  8. Genome sequencing and analysis of the Tasmanian devil and its transmissible cancer.

    PubMed

    Murchison, Elizabeth P; Schulz-Trieglaff, Ole B; Ning, Zemin; Alexandrov, Ludmil B; Bauer, Markus J; Fu, Beiyuan; Hims, Matthew; Ding, Zhihao; Ivakhno, Sergii; Stewart, Caitlin; Ng, Bee Ling; Wong, Wendy; Aken, Bronwen; White, Simon; Alsop, Amber; Becq, Jennifer; Bignell, Graham R; Cheetham, R Keira; Cheng, William; Connor, Thomas R; Cox, Anthony J; Feng, Zhi-Ping; Gu, Yong; Grocock, Russell J; Harris, Simon R; Khrebtukova, Irina; Kingsbury, Zoya; Kowarsky, Mark; Kreiss, Alexandre; Luo, Shujun; Marshall, John; McBride, David J; Murray, Lisa; Pearse, Anne-Maree; Raine, Keiran; Rasolonjatovo, Isabelle; Shaw, Richard; Tedder, Philip; Tregidgo, Carolyn; Vilella, Albert J; Wedge, David C; Woods, Gregory M; Gormley, Niall; Humphray, Sean; Schroth, Gary; Smith, Geoffrey; Hall, Kevin; Searle, Stephen M J; Carter, Nigel P; Papenfuss, Anthony T; Futreal, P Andrew; Campbell, Peter J; Yang, Fengtang; Bentley, David R; Evers, Dirk J; Stratton, Michael R

    2012-02-17

    The Tasmanian devil (Sarcophilus harrisii), the largest marsupial carnivore, is endangered due to a transmissible facial cancer spread by direct transfer of living cancer cells through biting. Here we describe the sequencing, assembly, and annotation of the Tasmanian devil genome and whole-genome sequences for two geographically distant subclones of the cancer. Genomic analysis suggests that the cancer first arose from a female Tasmanian devil and that the clone has subsequently genetically diverged during its spread across Tasmania. The devil cancer genome contains more than 17,000 somatic base substitution mutations and bears the imprint of a distinct mutational process. Genotyping of somatic mutations in 104 geographically and temporally distributed Tasmanian devil tumors reveals the pattern of evolution and spread of this parasitic clonal lineage, with evidence of a selective sweep in one geographical area and persistence of parallel lineages in other populations. PMID:22341448

  9. Genome Sequencing and Analysis of the Tasmanian Devil and Its Transmissible Cancer

    PubMed Central

    Murchison, Elizabeth P.; Schulz-Trieglaff, Ole B.; Ning, Zemin; Alexandrov, Ludmil B.; Bauer, Markus J.; Fu, Beiyuan; Hims, Matthew; Ding, Zhihao; Ivakhno, Sergii; Stewart, Caitlin; Ng, Bee Ling; Wong, Wendy; Aken, Bronwen; White, Simon; Alsop, Amber; Becq, Jennifer; Bignell, Graham R.; Cheetham, R. Keira; Cheng, William; Connor, Thomas R.; Cox, Anthony J.; Feng, Zhi-Ping; Gu, Yong; Grocock, Russell J.; Harris, Simon R.; Khrebtukova, Irina; Kingsbury, Zoya; Kowarsky, Mark; Kreiss, Alexandre; Luo, Shujun; Marshall, John; McBride, David J.; Murray, Lisa; Pearse, Anne-Maree; Raine, Keiran; Rasolonjatovo, Isabelle; Shaw, Richard; Tedder, Philip; Tregidgo, Carolyn; Vilella, Albert J.; Wedge, David C.; Woods, Gregory M.; Gormley, Niall; Humphray, Sean; Schroth, Gary; Smith, Geoffrey; Hall, Kevin; Searle, Stephen M.J.; Carter, Nigel P.; Papenfuss, Anthony T.; Futreal, P. Andrew; Campbell, Peter J.; Yang, Fengtang; Bentley, David R.; Evers, Dirk J.; Stratton, Michael R.

    2012-01-01

    Summary: The Tasmanian devil (Sarcophilus harrisii), the largest marsupial carnivore, is endangered due to a transmissible facial cancer spread by direct transfer of living cancer cells through biting. Here we describe the sequencing, assembly, and annotation of the Tasmanian devil genome and whole-genome sequences for two geographically distant subclones of the cancer. Genomic analysis suggests that the cancer first arose from a female Tasmanian devil and that the clone has subsequently genetically diverged during its spread across Tasmania. The devil cancer genome contains more than 17,000 somatic base substitution mutations and bears the imprint of a distinct mutational process. Genotyping of somatic mutations in 104 geographically and temporally distributed Tasmanian devil tumors reveals the pattern of evolution and spread of this parasitic clonal lineage, with evidence of a selective sweep in one geographical area and persistence of parallel lineages in other populations. PMID:22341448

  10. Evaluation of Paired-End Sequencing Strategies for Detection of Genome Rearrangements in Cancer

    PubMed Central

    Bashir, Ali; Volik, Stanislav; Collins, Colin; Bafna, Vineet; Raphael, Benjamin J.

    2008-01-01

    Paired-end sequencing is emerging as a key technique for assessing genome rearrangements and structural variation on a genome-wide scale. This technique is particularly useful for detecting copy-neutral rearrangements, such as inversions and translocations, which are common in cancer and can produce novel fusion genes. We address the question of how much sequencing is required to detect rearrangement breakpoints and to localize them precisely using both theoretical models and simulation. We derive a formula for the probability that a fusion gene exists in a cancer genome given a collection of paired-end sequences from this genome. We use this formula to compute fusion gene probabilities in several breast cancer samples, and we find that we are able to accurately predict fusion genes in these samples with a relatively small number of fragments of large size. We further demonstrate how the ability to detect fusion genes depends on the distribution of gene lengths, and we evaluate how different parameters of a sequencing strategy impact breakpoint detection, breakpoint localization, and fusion gene detection, even in the presence of errors that suggest false rearrangements. These results will be useful in calibrating future cancer sequencing efforts, particularly large-scale studies of many cancer genomes that are enabled by next-generation sequencing technologies. PMID:18404202
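
    The question of how much sequencing is needed can be illustrated with a textbook-style coverage calculation. This is not the formula derived in the paper; it is a simplified sketch that assumes fragments land uniformly at random on the genome, ignores chromosome ends and mapping errors, and treats a breakpoint as detected once at least one fragment spans it. The function names and the example numbers are hypothetical.

```python
import math


def prob_breakpoint_spanned(n_fragments, fragment_len, genome_len):
    """Probability that at least one of n uniformly placed fragments spans a fixed breakpoint."""
    if not 0 < fragment_len <= genome_len:
        raise ValueError("fragment_len must be in (0, genome_len]")
    p_single = fragment_len / genome_len  # chance that one fragment covers the breakpoint
    return 1.0 - (1.0 - p_single) ** n_fragments


def fragments_needed(target_prob, fragment_len, genome_len):
    """Smallest number of fragments whose spanning probability reaches target_prob."""
    if not 0.0 < target_prob < 1.0:
        raise ValueError("target_prob must be strictly between 0 and 1")
    if not 0 < fragment_len < genome_len:
        raise ValueError("fragment_len must be in (0, genome_len)")
    p_single = fragment_len / genome_len
    return math.ceil(math.log(1.0 - target_prob) / math.log(1.0 - p_single))


# Hypothetical numbers: 150 kb fragments on a 3 Gb genome.
n = fragments_needed(0.99, 150_000, 3_000_000_000)
print(n, prob_breakpoint_spanned(n, 150_000, 3_000_000_000))
```

    Under this model, longer fragments raise the per-fragment spanning probability, which is in line with the abstract's observation that a relatively small number of large fragments can suffice.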

  11. A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing.

    PubMed

    Alioto, Tyler S; Buchhalter, Ivo; Derdak, Sophia; Hutter, Barbara; Eldridge, Matthew D; Hovig, Eivind; Heisler, Lawrence E; Beck, Timothy A; Simpson, Jared T; Tonon, Laurie; Sertier, Anne-Sophie; Patch, Ann-Marie; Jäger, Natalie; Ginsbach, Philip; Drews, Ruben; Paramasivam, Nagarajan; Kabbe, Rolf; Chotewutmontri, Sasithorn; Diessl, Nicolle; Previti, Christopher; Schmidt, Sabine; Brors, Benedikt; Feuerbach, Lars; Heinold, Michael; Gröbner, Susanne; Korshunov, Andrey; Tarpey, Patrick S; Butler, Adam P; Hinton, Jonathan; Jones, David; Menzies, Andrew; Raine, Keiran; Shepherd, Rebecca; Stebbings, Lucy; Teague, Jon W; Ribeca, Paolo; Giner, Francesc Castro; Beltran, Sergi; Raineri, Emanuele; Dabad, Marc; Heath, Simon C; Gut, Marta; Denroche, Robert E; Harding, Nicholas J; Yamaguchi, Takafumi N; Fujimoto, Akihiro; Nakagawa, Hidewaki; Quesada, Víctor; Valdés-Mas, Rafael; Nakken, Sigve; Vodák, Daniel; Bower, Lawrence; Lynch, Andrew G; Anderson, Charlotte L; Waddell, Nicola; Pearson, John V; Grimmond, Sean M; Peto, Myron; Spellman, Paul; He, Minghui; Kandoth, Cyriac; Lee, Semin; Zhang, John; Létourneau, Louis; Ma, Singer; Seth, Sahil; Torrents, David; Xi, Liu; Wheeler, David A; López-Otín, Carlos; Campo, Elías; Campbell, Peter J; Boutros, Paul C; Puente, Xose S; Gerhard, Daniela S; Pfister, Stefan M; McPherson, John D; Hudson, Thomas J; Schlesner, Matthias; Lichter, Peter; Eils, Roland; Jones, David T W; Gut, Ivo G

    2015-01-01

    As whole-genome sequencing for cancer genome analysis becomes a clinical tool, a full understanding of the variables affecting sequencing analysis output is required. Here using tumour-normal sample pairs from two different types of cancer, chronic lymphocytic leukaemia and medulloblastoma, we conduct a benchmarking exercise within the context of the International Cancer Genome Consortium. We compare sequencing methods, analysis pipelines and validation methods. We show that using PCR-free methods and increasing sequencing depth to ~100× shows benefits, as long as the tumour:control coverage ratio remains balanced. We observe widely varying mutation call rates and low concordance among analysis pipelines, reflecting the artefact-prone nature of the raw data and lack of standards for dealing with the artefacts. However, we show that, using the benchmark mutation set we have created, many issues are in fact easy to remedy and have an immediate positive impact on mutation detection accuracy. PMID:26647970
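
    The depth and coverage-ratio terms used here can be made concrete with a short calculation. This is a sketch under stated assumptions: mean depth is taken as total sequenced bases divided by genome length, and the tolerance used to call a tumour:control ratio "balanced" is a hypothetical value, since the abstract does not give a threshold.

```python
def mean_depth(n_reads, read_len, genome_len):
    """Average sequencing depth: total sequenced bases divided by genome length."""
    return n_reads * read_len / genome_len


def ratio_is_balanced(tumour_depth, control_depth, tolerance=0.25):
    """True if the tumour:control depth ratio is within `tolerance` of 1.

    The default tolerance is an illustrative assumption, not a published cut-off.
    """
    if control_depth <= 0:
        raise ValueError("control_depth must be positive")
    return abs(tumour_depth / control_depth - 1.0) <= tolerance


# Hypothetical run sizes on a 3 Gb genome with 150 bp reads.
tumour = mean_depth(2_000_000_000, 150, 3_000_000_000)
control = mean_depth(600_000_000, 150, 3_000_000_000)
print(tumour, control, ratio_is_balanced(tumour, control))
```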

  12. A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing

    PubMed Central

    Alioto, Tyler S.; Buchhalter, Ivo; Derdak, Sophia; Hutter, Barbara; Eldridge, Matthew D.; Hovig, Eivind; Heisler, Lawrence E.; Beck, Timothy A.; Simpson, Jared T.; Tonon, Laurie; Sertier, Anne-Sophie; Patch, Ann-Marie; Jäger, Natalie; Ginsbach, Philip; Drews, Ruben; Paramasivam, Nagarajan; Kabbe, Rolf; Chotewutmontri, Sasithorn; Diessl, Nicolle; Previti, Christopher; Schmidt, Sabine; Brors, Benedikt; Feuerbach, Lars; Heinold, Michael; Gröbner, Susanne; Korshunov, Andrey; Tarpey, Patrick S.; Butler, Adam P.; Hinton, Jonathan; Jones, David; Menzies, Andrew; Raine, Keiran; Shepherd, Rebecca; Stebbings, Lucy; Teague, Jon W.; Ribeca, Paolo; Giner, Francesc Castro; Beltran, Sergi; Raineri, Emanuele; Dabad, Marc; Heath, Simon C.; Gut, Marta; Denroche, Robert E.; Harding, Nicholas J.; Yamaguchi, Takafumi N.; Fujimoto, Akihiro; Nakagawa, Hidewaki; Quesada, Víctor; Valdés-Mas, Rafael; Nakken, Sigve; Vodák, Daniel; Bower, Lawrence; Lynch, Andrew G.; Anderson, Charlotte L.; Waddell, Nicola; Pearson, John V.; Grimmond, Sean M.; Peto, Myron; Spellman, Paul; He, Minghui; Kandoth, Cyriac; Lee, Semin; Zhang, John; Létourneau, Louis; Ma, Singer; Seth, Sahil; Torrents, David; Xi, Liu; Wheeler, David A.; López-Otín, Carlos; Campo, Elías; Campbell, Peter J.; Boutros, Paul C.; Puente, Xose S.; Gerhard, Daniela S.; Pfister, Stefan M.; McPherson, John D.; Hudson, Thomas J.; Schlesner, Matthias; Lichter, Peter; Eils, Roland; Jones, David T. W.; Gut, Ivo G.

    2015-01-01

    As whole-genome sequencing for cancer genome analysis becomes a clinical tool, a full understanding of the variables affecting sequencing analysis output is required. Here using tumour-normal sample pairs from two different types of cancer, chronic lymphocytic leukaemia and medulloblastoma, we conduct a benchmarking exercise within the context of the International Cancer Genome Consortium. We compare sequencing methods, analysis pipelines and validation methods. We show that using PCR-free methods and increasing sequencing depth to ~100× shows benefits, as long as the tumour:control coverage ratio remains balanced. We observe widely varying mutation call rates and low concordance among analysis pipelines, reflecting the artefact-prone nature of the raw data and lack of standards for dealing with the artefacts. However, we show that, using the benchmark mutation set we have created, many issues are in fact easy to remedy and have an immediate positive impact on mutation detection accuracy. PMID:26647970

  13. Clinical applications of next generation sequencing in cancer: from panels, to exomes, to genomes

    PubMed Central

    Shen, Tony; Pajaro-Van de Stadt, Stefan Hans; Yeat, Nai Chien; Lin, Jimmy C.-H.

    2015-01-01

    This article reviews the recent impact of massively parallel next-generation sequencing (NGS) on our understanding and treatment of cancer. While whole exome sequencing (WES) remains popular and effective as a method of genetically profiling different cancers, advances in sequencing technology have enabled an increasing number of whole-genome based studies. Clinically, NGS has been used or is being developed for genetic screening, diagnostics, and clinical assessment. Though challenges remain, clinicians are in the early stages of using genetic data to make treatment decisions for cancer patients. As the integration of NGS in the study and treatment of cancer continues to mature, we believe that the field of cancer genomics will need to move toward more complete, 100% genome sequencing. Current technologies and methods are largely limited to coding regions of the genome. A number of recent studies have demonstrated that mutations in non-coding regions may have direct tumorigenic effects or lead to genetic instability. Non-coding regions represent an important frontier in cancer genomics. PMID:26136771

  14. Cancer Genomics for Pediatric Cancers

    Cancer.gov

    Javed Khan, M.D., a molecular biologist at the National Cancer Institute (NCI), discusses programs such as TARGET (Therapeutically Applicable Research to Generate Effective Treatments), the Pediatric Cancer Genome Project, and TCGA (The Cancer Genome Atlas) that are sequencing the genomes of tumors from hundreds of children and adults with cancer to discover genetic changes causing or driving the disease. Genomic characterization will help clinicians prescribe appropriate treatments.

  15. Discrepancies in cancer genomic sequencing highlight opportunities for driver mutation discovery.

    PubMed

    Hudson, Andrew M; Yates, Tim; Li, Yaoyong; Trotter, Eleanor W; Fawdar, Shameem; Chapman, Phil; Lorigan, Paul; Biankin, Andrew; Miller, Crispin J; Brognard, John

    2014-11-15

    Cancer genome sequencing is being used at an increasing rate to identify actionable driver mutations that can inform therapeutic intervention strategies. A comparison of two of the most prominent cancer genome sequencing databases from different institutes (Cancer Cell Line Encyclopedia and Catalogue of Somatic Mutations in Cancer) revealed marked discrepancies in the detection of missense mutations in identical cell lines (57.38% conformity). The main reason for this discrepancy is inadequate sequencing of GC-rich areas of the exome. We have therefore mapped over 400 regions of consistent inadequate sequencing (cold-spots) in known cancer-causing genes and kinases, in 368 of which neither institute finds mutations. We demonstrate, using a newly identified PAK4 mutation as proof of principle, that specific targeting and sequencing of these GC-rich cold-spot regions can lead to the identification of novel driver mutations in known tumor suppressors and oncogenes. We highlight that cross-referencing between genomic databases is required to comprehensively assess genomic alterations in commonly used cell lines and that there are still significant opportunities to identify novel drivers of tumorigenesis in poorly sequenced areas of the exome. Finally, we assess other reasons for the observed discrepancy, such as variations in dbSNP filtering and the acquisition/loss of mutations, to give explanations as to why there is a discrepancy in pharmacogenomic studies, given recent concerns with poor reproducibility of data. PMID:25256751
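
    A cross-database comparison of this kind can be sketched as a set comparison. The abstract does not define how its conformity figure was computed, so the definition below (calls shared by both sources divided by calls reported by either) is an assumption, and the record layout and example calls are hypothetical. A GC-fraction helper is included because the abstract attributes the discrepancy to GC-rich regions.

```python
def conformity(calls_a, calls_b):
    """Fraction of all reported calls that appear in both sources (assumed definition)."""
    a, b = set(calls_a), set(calls_b)
    union = a | b
    if not union:
        raise ValueError("no mutation calls to compare")
    return len(a & b) / len(union)


def gc_fraction(sequence):
    """Fraction of G and C bases in a DNA sequence (case-insensitive)."""
    if not sequence:
        raise ValueError("empty sequence")
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical calls keyed by (cell line, gene, protein change).
source_1 = {("LINE_A", "GENE_1", "p.X10Y"), ("LINE_A", "GENE_2", "p.X20Y")}
source_2 = {("LINE_A", "GENE_1", "p.X10Y"), ("LINE_A", "GENE_3", "p.X30Y")}
print(conformity(source_1, source_2), gc_fraction("GGCCAT"))
```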

  16. Massively Parallel Validation of Cancer Mutations and Other Variants Identified by Whole Cancer Genome and Exome Sequencing - Georges Natsoulis, TCGA Scientific Symposium 2011

    Cancer.gov

    Massively Parallel Validation of Cancer Mutations and Other Variants Identified by Whole Cancer Genome and Exome Sequencing (video, Georges Natsoulis)

  17. Mutational and structural analysis of diffuse large B-cell lymphoma using whole genome sequencing | Office of Cancer Genomics

    Cancer.gov

    Abstract: Diffuse large B-cell lymphoma (DLBCL) is a genetically heterogeneous cancer comprising at least two molecular subtypes that differ in gene expression and distribution of mutations. Recently, application of genome/exome sequencing and RNA-seq to DLBCL has revealed numerous genes that are recurrent targets of somatic point mutation in this disease.

  18. The Cancer Genome Atlas (TCGA)

    Cancer.gov

    The Cancer Genome Atlas (TCGA) is a comprehensive and coordinated effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing.

  19. Complete sequence of an ovarian cancer inbred Sprague-Dawley rat model mitochondrial genome.

    PubMed

    Wei, De-Hua; Shi, Hui-Rong; Liao, Yu-Mei

    2016-03-01

    The Sprague-Dawley rat strain is a commonly used model for the study of ovarian cancer. We sequenced the mitochondrial genome of this rat strain for the first time (GenBank Accession No. KM114604). The mitogenome is 16,312 bp long and contains 13 protein-coding genes, 2 ribosomal RNA genes and 22 transfer RNA genes. A total of 96 SNPs were identified in comparison with the reference BN sequence. PMID:25391032

  20. The Tip of the Iceberg: Clinical Implications of Genomic Sequencing Projects in Head and Neck Cancer.

    PubMed

    Birkeland, Andrew C; Ludwig, Megan L; Meraj, Taha S; Brenner, J Chad; Prince, Mark E

    2015-01-01

    Recent genomic sequencing studies have provided valuable insight into genetic aberrations in head and neck squamous cell carcinoma. Despite these great advances, certain hurdles exist in translating genomic findings to clinical care. Further correlation of genetic findings to clinical outcomes, additional analyses of subgroups of head and neck cancers and follow-up investigation into genetic heterogeneity are needed. While the development of targeted therapy trials is of key importance, numerous challenges exist in establishing and optimizing such programs. This review discusses potential upcoming steps for further genetic evaluation of head and neck cancers and implementation of genetic findings into precision medicine trials. PMID:26506389

  1. Return of Results from Genomic Sequencing: A Policy Discussion of Secondary Findings for Cancer Predisposition

    PubMed Central

    Johnson, Kimberly J.; Gehlert, Sarah

    2014-01-01

    Advances in DNA sequencing technology now allow for the rapid genome-wide identification of inherited and acquired genetic variants including those that have been identified as pathogenic alleles for a number of diseases including cancer. Whole genome and exome sequencing are increasingly becoming a part of both clinical practice and research studies. In 2013 the American College of Medical Genetics and Genomics (ACMG) recommended that results of pathogenic genetic variants in 56 genes, nearly half of which comprise cancer genes (including BRCA1, BRCA2, TP53, MLH1, MSH2, MSH6, PMS2, and APC), be returned to patients who have their genome sequenced independent of the purpose for the test. This recommendation has been highly controversial for several reasons, particularly the recommendation that individuals be returned secondary findings of disease causing variants for adult onset conditions regardless of age and without consideration of patient preferences. In addition, the policy regarding returning results of secondary findings from genomic sequencing studies in research settings is currently unclear. In response to these emerging ethical issues, the Washington University Brown School in St. Louis, MO, United States hosted a policy forum entitled “First do no harm: Genetic privacy in the age of genomic sequencing” on February 25th, 2014. The forum included a panel of experts to discuss their views on ethical issues related to return of results in both the clinical and research settings. In this report, we highlight key issues related to return of results from genome sequencing tests that emerged during the forum. PMID:25229012

  2. Center for Cancer Genomics | Office of Cancer Genomics

    Cancer.gov

    The Center for Cancer Genomics (CCG) was established to unify the National Cancer Institute's activities in cancer genomics, with the goal of advancing genomics research and translating findings into the clinic to improve the precise diagnosis and treatment of cancers. In addition to promoting genomic sequencing approach

  3. Identifying driver mutations in sequenced cancer genomes: computational approaches to enable precision medicine

    PubMed Central

    2014-01-01

    High-throughput DNA sequencing is revolutionizing the study of cancer and enabling the measurement of the somatic mutations that drive cancer development. However, the resulting sequencing datasets are large and complex, obscuring the clinically important mutations in a background of errors, noise, and random mutations. Here, we review computational approaches to identify somatic mutations in cancer genome sequences and to distinguish the driver mutations that are responsible for cancer from random, passenger mutations. First, we describe approaches to detect somatic mutations from high-throughput DNA sequencing data, particularly for tumor samples that comprise heterogeneous populations of cells. Next, we review computational approaches that aim to predict driver mutations according to their frequency of occurrence in a cohort of samples, or according to their predicted functional impact on protein sequence or structure. Finally, we review techniques to identify recurrent combinations of somatic mutations, including approaches that examine mutations in known pathways or protein-interaction networks, as well as de novo approaches that identify combinations of mutations according to statistical patterns of mutual exclusivity. These techniques, coupled with advances in high-throughput DNA sequencing, are enabling precision medicine approaches to the diagnosis and treatment of cancer. PMID:24479672
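
    The mutual-exclusivity idea mentioned at the end of this abstract can be illustrated with a coverage-minus-overlap score. This is a sketch of one simple scoring scheme of that general kind, not a method the review prescribes; the function names, the input layout, and the toy data are assumptions. A gene set scores highly when it covers many samples while few samples carry mutations in more than one of its genes.

```python
def coverage_and_overlap(mutations, gene_set):
    """Return (samples covered by the gene set, excess mutations beyond one per covered sample).

    mutations: {gene: set of sample ids in which that gene is mutated}
    """
    per_gene = [mutations.get(gene, set()) for gene in gene_set]
    covered = set().union(*per_gene)
    total_mutations = sum(len(samples) for samples in per_gene)
    return len(covered), total_mutations - len(covered)


def exclusivity_weight(mutations, gene_set):
    """Coverage minus overlap: larger values mean high coverage with little co-occurrence."""
    covered, overlap = coverage_and_overlap(mutations, gene_set)
    return covered - overlap


# Hypothetical cohort of three samples.
mutations = {
    "GENE_1": {"s1", "s2"},
    "GENE_2": {"s3"},
    "GENE_3": {"s1", "s2"},
}
print(exclusivity_weight(mutations, ["GENE_1", "GENE_2"]))
print(exclusivity_weight(mutations, ["GENE_1", "GENE_3"]))
```

    In the toy data, the first pair covers all three samples without co-occurrence, while the second pair is mutated in the same two samples and so scores lower.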

  4. Whole Genome Sequencing

    MedlinePLUS

    ... What is whole genome sequencing? Whole genome sequencing is the mapping out ...

  5. Clinical genomics information management software linking cancer genome sequence and clinical decisions.

    PubMed

    Watt, Stuart; Jiao, Wei; Brown, Andrew M K; Petrocelli, Teresa; Tran, Ben; Zhang, Tong; McPherson, John D; Kamel-Reid, Suzanne; Bedard, Philippe L; Onetto, Nicole; Hudson, Thomas J; Dancey, Janet; Siu, Lillian L; Stein, Lincoln; Ferretti, Vincent

    2013-09-01

    Using sequencing information to guide clinical decision-making requires coordination of a diverse set of people and activities. In clinical genomics, the process typically includes sample acquisition, template preparation, genome data generation, analysis to identify and confirm variant alleles, interpretation of clinical significance, and reporting to clinicians. We describe a software application developed within a clinical genomics study, to support this entire process. The software application tracks patients, samples, genomic results, decisions and reports across the cohort, monitors progress and sends reminders, and works alongside an electronic data capture system for the trial's clinical and genomic data. It incorporates systems to read, store, analyze and consolidate sequencing results from multiple technologies, and provides a curated knowledge base of tumor mutation frequency (from the COSMIC database) annotated with clinical significance and drug sensitivity to generate reports for clinicians. By supporting the entire process, the application provides deep support for clinical decision making, enabling the generation of relevant guidance in reports for verification by an expert panel prior to forwarding to the treating physician. PMID:23603536

  6. Draft Genome Sequences of Helicobacter pylori Strains Isolated from Regions of Low and High Gastric Cancer Risk in Colombia

    E-print Network

    Sheh, Alexander

    The draft genome sequences of six Colombian Helicobacter pylori strains are presented. These strains were isolated from patients from regions of high and low gastric cancer risk in Colombia and were characterized by ...

  7. TCGA's Pan-Cancer Efforts and Expansion to Include Whole Genome Sequence - TCGA

    Cancer.gov

    Carolyn Hutter, Ph.D., Program Director of NHGRI's Division of Genomic Medicine, discusses the expansion of TCGA's Pan-Cancer efforts to include the Pan-Cancer Analysis of Whole Genomes (PAWG) project.

  8. A genome-wide view of microsatellite instability: old stories of cancer mutations revisited with new sequencing technologies

    PubMed Central

    Kim, Tae-Min; Park, Peter J

    2014-01-01

    Microsatellites are simple tandem repeats that are present at millions of loci in the human genome. Microsatellite instability (MSI) refers to DNA slippage events on microsatellites that occur frequently in cancer genomes when there is a defect in the DNA mismatch repair system. These somatic mutations can result in inactivation of tumor suppressor genes or disrupt other non-coding regulatory sequences, thereby playing a role in carcinogenesis. Here, we will discuss the ways in which high-throughput sequencing data can facilitate a genome- or exome-wide discovery and more detailed investigation of MSI events in microsatellite-unstable cancer genomes. We will address the methodological aspects of this approach and highlight insights from recent analyses of colorectal and endometrial cancer genomes from The Cancer Genome Atlas project. These include identification of novel MSI targets within and across tumor types and the relationship between the likelihood of MSI events to chromatin structure. Given the increasing popularity of exome and genome sequencing of cancer genomes, a comprehensive characterization of MSI may serve as a valuable marker of cancer evolution and aid in a search for therapeutic targets. PMID:25371413

  9. The cancer genome

    PubMed Central

    Stratton, Michael R.; Campbell, Peter J.; Futreal, P. Andrew

    2010-01-01

    All cancers arise as a result of changes that have occurred in the DNA sequence of the genomes of cancer cells. Over the past quarter of a century much has been learnt about these mutations and the abnormal genes that operate in human cancers. We are now, however, moving into an era in which it will be possible to obtain the complete DNA sequence of large numbers of cancer genomes. These studies will provide us with a detailed and comprehensive perspective on how individual cancers have developed. PMID:19360079

  10. Use of Whole Genome Sequencing for Diagnosis and Discovery in the Cancer Genetics Clinic

    PubMed Central

    Foley, Samantha B.; Rios, Jonathan J.; Mgbemena, Victoria E.; Robinson, Linda S.; Hampel, Heather L.; Toland, Amanda E.; Durham, Leslie; Ross, Theodora S.

    2014-01-01

    Despite the potential of whole-genome sequencing (WGS) to improve patient diagnosis and care, the empirical value of WGS in the cancer genetics clinic is unknown. We performed WGS on members of two cohorts of cancer genetics patients: those with BRCA1/2 mutations (n = 176) and those without (n = 82). Initial analysis of potentially pathogenic variants (PPVs, defined as nonsynonymous variants with allele frequency < 1% in ESP6500) in 163 clinically-relevant genes suggested that WGS will provide useful clinical results. This is despite the fact that a majority of PPVs were novel missense variants likely to be classified as variants of unknown significance (VUS). Furthermore, previously reported pathogenic missense variants did not always associate with their predicted diseases in our patients. This suggests that the clinical use of WGS will require large-scale efforts to consolidate WGS and patient data to improve accuracy of interpretation of rare variants. While loss-of-function (LoF) variants represented only a small fraction of PPVs, WGS identified additional cancer risk LoF PPVs in patients with known BRCA1/2 mutations and led to cancer risk diagnoses in 21% of non-BRCA cancer genetics patients after expanding our analysis to 3209 ClinVar genes. These data illustrate how WGS can be used to improve our ability to discover patients' cancer genetic risks. PMID:26023681
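
    The PPV definition in this abstract (nonsynonymous variants with allele frequency < 1% in a set of clinically relevant genes) can be read as a simple filter. The sketch below is an assumption-laden illustration: the `Variant` record, the consequence labels, and the gene panel are hypothetical and do not reflect the authors' pipeline or annotation vocabulary.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    gene: str
    consequence: str          # hypothetical labels, e.g. "missense", "synonymous"
    allele_frequency: float   # population frequency as a fraction (0.01 == 1%)

# Assumed set of consequence labels treated as nonsynonymous for this example.
NONSYNONYMOUS = {"missense", "stop_gained", "frameshift"}

def is_ppv(variant, gene_panel, max_af=0.01):
    """True if the variant is in the panel, nonsynonymous, and rarer than max_af."""
    return (
        variant.gene in gene_panel
        and variant.consequence in NONSYNONYMOUS
        and variant.allele_frequency < max_af
    )

panel = {"BRCA1", "TP53"}  # stand-in for the 163-gene list, which is not given here
candidates = [
    Variant("BRCA1", "missense", 0.002),
    Variant("BRCA1", "synonymous", 0.002),
    Variant("TP53", "stop_gained", 0.05),
]
print([is_ppv(v, panel) for v in candidates])
```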

  11. Cancer of the ampulla of Vater: analysis of the whole genome sequence exposes a potential therapeutic vulnerability

    PubMed Central

    2012-01-01

    Background Recent advances in the treatment of cancer have focused on targeting genomic aberrations with selective therapeutic agents. In rare tumors, where large-scale clinical trials are daunting, this targeted genomic approach offers a new perspective and hope for improved treatments. Cancers of the ampulla of Vater are rare tumors that comprise only about 0.2% of gastrointestinal cancers. Consequently, they are often treated as either distal common bile duct or pancreatic cancers. Methods We analyzed DNA from a resected cancer of the ampulla of Vater and whole blood DNA from a 63 year-old man who underwent a pancreaticoduodenectomy by whole genome sequencing, achieving 37× and 40× coverage, respectively. We determined somatic mutations and structural alterations. Results We identified relevant aberrations, including deleterious mutations of KRAS and SMAD4 as well as a homozygous focal deletion of the PTEN tumor suppressor gene. These findings suggest that these tumors have a distinct oncogenesis from either common bile duct cancer or pancreatic cancer. Furthermore, this combination of genomic aberrations suggests a therapeutic context for dual mTOR/PI3K inhibition. Conclusions Whole genome sequencing can elucidate an oncogenic context and expose potential therapeutic vulnerabilities in rare cancers. PMID:22762308

  12. The Cancer Genome Atlas - TCGA - Home Page

    Cancer.gov

    The Cancer Genome Atlas (TCGA) is a comprehensive and coordinated effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing.

  13. Genomic Datasets for Cancer Research

    Cancer.gov

    A variety of datasets from genome-wide association studies of cancer and other genotype-phenotype studies, including sequencing and molecular diagnostic assays, are available to approved investigators through the Extramural National Cancer Institute Data Access Committee.

  14. Flexible positions, managed hopes: the promissory bioeconomy of a whole genome sequencing cancer study.

    PubMed

    Haase, Rachel; Michie, Marsha; Skinner, Debra

    2015-04-01

    Genomic research has rapidly expanded its scope and ambition over the past decade, promoted by both public and private sectors as having the potential to revolutionize clinical medicine. This promissory bioeconomy of genomic research and technology is generated by, and in turn generates, the hopes and expectations shared by investors, researchers and clinicians, patients, and the general public alike. Examinations of such bioeconomies have often focused on the public discourse, media representations, and capital investments that fuel these "regimes of hope," but also crucial are the more intimate contexts of small-scale medical research, and the private hopes, dreams, and disappointments of those involved. Here we examine one local site of production in a university-based clinical research project that sought to identify novel cancer predisposition genes through whole genome sequencing in individuals at high risk for cancer. In-depth interviews with 24 adults who donated samples to the study revealed an ability to shift flexibly between positioning themselves as research participants on the one hand, and as patients or as family members of patients, on the other. Similarly, interviews with members of the research team highlighted the dual nature of their positions as researchers and as clinicians. For both parties, this dual positioning shaped their investment in the project and valuing of its possible outcomes. In their narratives, all parties shifted between these different relational positions as they managed hopes and expectations for the research project. We suggest that this flexibility facilitated study implementation and participation in the face of potential and probable disappointment on one or more fronts, and acted as a key element in the resilience of this local promissory bioeconomy. We conclude that these multiple dimensions of relationality and positionality are inherent and essential in the creation of any complex economy, "bio" or otherwise. PMID:25697637

  15. Local sequence assembly reveals a high-resolution profile of somatic structural variations in 97 cancer genomes.

    PubMed

    Zhuang, Jiali; Weng, Zhiping

    2015-09-30

    Genomic structural variations (SVs) are pervasive in many types of cancers. Characterizing their underlying mechanisms and potential molecular consequences is crucial for understanding the basic biology of tumorigenesis. Here, we engineered a local assembly-based algorithm (laSV) that detects SVs with high accuracy from paired-end high-throughput genomic sequencing data and pinpoints their breakpoints at single base-pair resolution. By applying laSV to 97 tumor-normal paired genomic sequencing datasets across six cancer types produced by The Cancer Genome Atlas Research Network, we discovered that non-allelic homologous recombination is the primary mechanism for generating somatic SVs in acute myeloid leukemia. This finding contrasts with results for the other five types of solid tumors, in which non-homologous end joining and microhomology end joining are the predominant mechanisms. We also found that the genes recursively mutated by single nucleotide alterations differed from the genes recursively mutated by SVs, suggesting that these two types of genetic alterations play different roles during cancer progression. We further characterized how the gene structures of the oncogene JAK1 and the tumor suppressors KDM6A and RB1 are affected by somatic SVs and discussed the potential functional implications of intergenic SVs. PMID:26283183

  16. Cancer Genome Characterization Initiative | Office of Cancer Genomics

    Cancer.gov

    CGCI supports cutting-edge genomics research on adult and pediatric cancers. Researchers develop and apply advanced sequencing and other genome-based methods to identify novel genetic abnormalities in tumors. The extensive genetic profiles generated by CGCI may inform better cancer diagnosis and treatment.

  17. Center for Cancer Genomics Launches New Website

    Cancer.gov

    CCG was established to unify NCI’s activities in cancer genomics, with the goal of advancing genomics research and translating findings into the clinic to improve the precise diagnosis and treatment of cancers. In addition to promoting genomic sequencing approaches, CCG aims to accelerate structural, functional and computational research to explore cancer mechanisms, discover new cancer targets, and develop new therapeutics.

  18. The complete mitochondrial genome sequence and mutations of the Lung cancer model inbred rat strain (Muridae; Rattus).

    PubMed

    Zhou, Da-Ming; Zhu, Dan-Dan; Lu, Qing-Feng; Qiao, Wen-Bo; Zhuang, Yong-Zhi

    2016-03-01

    We reported the complete mitochondrial genome sequencing of an important Lung cancer model inbred rat strain for the first time. The total length of the mitogenome was 16,312 bp. It harbored 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes and 1 non-coding control region. The mutation sites were analyzed by comparing with the reference BN strain. PMID:25391029

  19. Draft Genome Sequence of a Helicobacter pylori Strain Isolated from a Patient with Diffuse Gastritis from a Region of High Cancer Risk in Colombia

    PubMed Central

    Bayona Rojas, Martin; Barragán Vidal, Carlos; Trujillo, Clara Esperanza; Bravo, María Mercedes

    2015-01-01

    The draft genome sequence of one Colombian Helicobacter pylori strain is presented. This strain was isolated from a patient with diffuse gastritis from Tibaná, Boyacá, a region with high gastric cancer risk. PMID:25858838

  20. A study based on whole-genome sequencing yields a rare variant at 8q24 associated with prostate cancer

    PubMed Central

    Gudmundsson, Julius; Sulem, Patrick; Gudbjartsson, Daniel F.; Masson, Gisli; Agnarsson, Bjarni A.; Benediktsdottir, Kristrun R.; Sigurdsson, Asgeir; Magnusson, Olafur Th.; Gudjonsson, Sigurjon A.; Magnusdottir, Droplaug N.; Johannsdottir, Hrefna; Helgadottir, Hafdis Th.; Stacey, Simon N.; Jonasdottir, Adalbjorg; Olafsdottir, Stefania B.; Thorleifsson, Gudmar; Jonasson, Jon G.; Tryggvadottir, Laufey; Navarrete, Sebastian; Fuertes, Fernando; Helfand, Brian T.; Hu, Qiaoyan; Csiki, Irma E.; Mates, Ioan N.; Jinga, Viorel; Aben, Katja K. H.; van Oort, Inge M.; Vermeulen, Sita H.; Donovan, Jenny L.; Hamdy, Freddy C.; Ng, Chi-Fai; Chiu, Peter K.F.; Lau, Kin-Mang; Ng, Maggie C.Y.; Gulcher, Jeffrey R.; Kong, Augustine; Catalona, William J.; Mayordomo, Jose I.; Einarsson, Gudmundur V.; Barkardottir, Rosa B.; Jonsson, Eirikur; Mates, Dana; Neal, David E.; Kiemeney, Lambertus A.; Thorsteinsdottir, Unnur; Rafnar, Thorunn; Stefansson, Kari

    2013-01-01

    In Western countries, prostate cancer is the most prevalent cancer of men, and one of the leading causes of cancer-related death in men. Several genome-wide association studies have yielded numerous common variants conferring risk of prostate cancer. In the present study we analyzed 32.5 million variants discovered by whole-genome sequencing 1,795 Icelanders. One variant was found to be associated with prostate cancer in European populations: rs188140481[A] (OR = 2.90, Pcomb = 6.2×10^-34) located on 8q24, with an average risk allele control frequency of 0.54%. This variant is only very weakly correlated (r^2 ≤ 0.06) with previously reported risk variants on 8q24, and remains significant after adjustment for all of them. Carriers of rs188140481[A] were diagnosed with prostate cancer 1.26 years younger than non-carriers (P = 0.0059). We also report results for the previously described HOXB13 mutation (rs138213197[T]), confirming it as a prostate cancer risk variant in populations from all over Europe. PMID:23104005

  1. A study based on whole-genome sequencing yields a rare variant at 8q24 associated with prostate cancer.

    PubMed

    Gudmundsson, Julius; Sulem, Patrick; Gudbjartsson, Daniel F; Masson, Gisli; Agnarsson, Bjarni A; Benediktsdottir, Kristrun R; Sigurdsson, Asgeir; Magnusson, Olafur Th; Gudjonsson, Sigurjon A; Magnusdottir, Droplaug N; Johannsdottir, Hrefna; Helgadottir, Hafdis Th; Stacey, Simon N; Jonasdottir, Adalbjorg; Olafsdottir, Stefania B; Thorleifsson, Gudmar; Jonasson, Jon G; Tryggvadottir, Laufey; Navarrete, Sebastian; Fuertes, Fernando; Helfand, Brian T; Hu, Qiaoyan; Csiki, Irma E; Mates, Ioan N; Jinga, Viorel; Aben, Katja K H; van Oort, Inge M; Vermeulen, Sita H; Donovan, Jenny L; Hamdy, Freddy C; Ng, Chi-Fai; Chiu, Peter K F; Lau, Kin-Mang; Ng, Maggie C Y; Gulcher, Jeffrey R; Kong, Augustine; Catalona, William J; Mayordomo, Jose I; Einarsson, Gudmundur V; Barkardottir, Rosa B; Jonsson, Eirikur; Mates, Dana; Neal, David E; Kiemeney, Lambertus A; Thorsteinsdottir, Unnur; Rafnar, Thorunn; Stefansson, Kari

    2012-12-01

    In Western countries, prostate cancer is the most prevalent cancer of men and one of the leading causes of cancer-related death in men. Several genome-wide association studies have yielded numerous common variants conferring risk of prostate cancer. Here, we analyzed 32.5 million variants discovered by whole-genome sequencing 1,795 Icelanders. We identified a new low-frequency variant at 8q24 associated with prostate cancer in European populations, rs188140481[A] (odds ratio (OR) = 2.90; P(combined) = 6.2 × 10(-34)), with an average risk allele frequency in controls of 0.54%. This variant is only very weakly correlated (r(2) ≤ 0.06) with previously reported risk variants at 8q24, and its association remains significant after adjustment for all known risk-associated variants. Carriers of rs188140481[A] were diagnosed with prostate cancer 1.26 years younger than non-carriers (P = 0.0059). We also report results for a previously described HOXB13 variant (rs138213197[T]), confirming it as a prostate cancer risk variant in populations from across Europe. PMID:23104005

  2. Computational methods for detecting copy number variations in cancer genome using next generation sequencing: principles and challenges

    PubMed Central

    Liu, Biao; Morrison, Carl D.; Johnson, Candace S.; Trump, Donald L.; Qin, Maochun; Conroy, Jeffrey C.; Wang, Jianmin; Liu, Song

    2013-01-01

    Accurate detection of somatic copy number variations (CNVs) is an essential part of cancer genome analysis, and plays an important role in oncotarget identifications. Next generation sequencing (NGS) holds the promise to revolutionize somatic CNV detection. In this review, we provide an overview of current analytic tools used for CNV detection in NGS-based cancer studies. We summarize the NGS data types used for CNV detection, decipher the principles for data preprocessing, segmentation, and interpretation, and discuss the challenges in somatic CNV detection. This review aims to provide a guide to the analytic tools used in NGS-based cancer CNV studies, and to discuss the important factors that researchers need to consider when analyzing NGS data for somatic CNV detections. PMID:24240121
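
    One common read-depth idea behind the tools this review surveys can be sketched as follows: count reads per genomic bin in tumor and matched normal, normalize for library size, and inspect the log2 ratio. The code is a minimal sketch under stated assumptions, not any specific tool's algorithm; the function names, the pseudocount, and the ±0.3 thresholds are hypothetical, and the segmentation step the review discusses is deliberately left out.

```python
import math

def log2_ratios(tumor_counts, normal_counts, pseudocount=0.5):
    """Per-bin log2(tumor/normal) after scaling each sample by its total read count."""
    tumor_total = sum(tumor_counts)
    normal_total = sum(normal_counts)
    ratios = []
    for t, n in zip(tumor_counts, normal_counts):
        # The pseudocount (an assumption) avoids division by zero in empty bins.
        t_norm = (t + pseudocount) / tumor_total
        n_norm = (n + pseudocount) / normal_total
        ratios.append(math.log2(t_norm / n_norm))
    return ratios

def call_bins(ratios, gain=0.3, loss=-0.3):
    """Label each bin by simple thresholding; real tools segment adjacent bins first."""
    return ["gain" if r >= gain else "loss" if r <= loss else "neutral" for r in ratios]

# Hypothetical read counts for four bins.
tumor = [100, 100, 200, 50]
normal = [100, 100, 100, 100]
print(call_bins(log2_ratios(tumor, normal)))
```

    Thresholding bin by bin is noisy in practice, which is why segmentation and corrections for GC content, mappability, and tumor purity matter in real analyses.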

  3. Genome sequencing conference II

    SciTech Connect

    Not Available

    1990-01-01

    Genome Sequencing Conference 2 was held September 30 to October 30, 1990. 26 speaker abstracts and 33 poster presentations were included in the program report. New and improved methods for DNA sequencing and genetic mapping were presented. Many of the papers were concerned with accuracy and speed of acquisition of data with computers and automation playing an increasing role. Individual papers have been processed separately for inclusion on the database.

  4. Unlocking hidden genomic sequence

    PubMed Central

    Keith, Jonathan M.; Cochran, Duncan A. E.; Lala, Gita H.; Adams, Peter; Bryant, Darryn; Mitchelson, Keith R.

    2004-01-01

    Despite the success of conventional Sanger sequencing, significant regions of many genomes still present major obstacles to sequencing. Here we propose a novel approach with the potential to alleviate a wide range of sequencing difficulties. The technique involves extracting target DNA sequence from variants generated by introduction of random mutations. The introduction of mutations does not destroy original sequence information, but distributes it amongst multiple variants. Some of these variants lack problematic features of the target and are more amenable to conventional sequencing. The technique has been successfully demonstrated with mutation levels up to an average 18% base substitution and has been used to read previously intractable poly(A), AT-rich and GC-rich motifs. PMID:14973330

  5. Prenatal Whole Genome Sequencing

    PubMed Central

    Donley, Greer; Hull, Sara Chandros; Berkman, Benjamin E.

    2014-01-01

    With whole genome sequencing set to become the preferred method of prenatal screening, we need to pay more attention to the massive amount of information it will deliver to parents—and the fact that we don't yet understand what most of it means. PMID:22777977

  6. Office of Cancer Genomics

    Cancer.gov

    The mission of the NCI’s Office of Cancer Genomics (OCG) is to enhance the understanding of the molecular mechanisms of cancer, advance and accelerate genomics science and technology development, and efficiently translate the genomics data to improve cancer prevention, early detection, diagnosis and treatment.

  7. Towards Sequencing Cotton (Gossypium) Genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Despite rapidly decreasing costs and innovative technologies, sequencing of angiosperm genomes is not yet undertaken lightly. Generating larger amounts of sequence data more quickly does not address the difficulties of sequencing and assembling complex genomes de novo. The cotton genomes represent a...

  8. Rice genomics: current status of genome sequencing.

    PubMed

    Matsumoto, T; Wu, J; Baba, T; Katayose, Y; Yamamoto, K; Sakata, K; Yano, M; Sasaki, T

    2001-01-01

    Since its establishment in 1991, the Rice Genome Research Program (RGP) has produced some basic tools for rice genome analysis, including a cDNA catalogue, a genetic linkage map and a yeast artificial chromosome (YAC)-based physical map. For the further development of rice genomics, RGP launched in 1998 an international collaborative project on rice genome sequencing. A P1-derived artificial chromosome (PAC)-based, sequence-ready physical map has been constructed using the PCR markers from cDNA sequences (expressed sequence tag [EST] markers). Selected PAC clones with 100-150 kb inserts from chromosomes 1 and 6 have been subjected to shotgun sequencing. The assembled genomic sequences, after predicting the gene-coding region, have been published both through a public database and through our website. As of January 2000, 1.9 Mb from 13 PAC clones were published. Future prospects for understanding rice genomic information at the nucleotide level are discussed. PMID:11387985

  9. Whole-genome sequencing analysis of phenotypic heterogeneity and anticipation in Li–Fraumeni cancer predisposition syndrome

    PubMed Central

    Ariffin, Hany; Hainaut, Pierre; Puzio-Kuter, Anna; Choong, Soo Sin; Chan, Adelyne Sue Li; Tolkunov, Denis; Rajagopal, Gunaretnam; Kang, Wenfeng; Lim, Leon Li Wen; Krishnan, Shekhar; Chen, Kok-Siong; Achatz, Maria Isabel; Karsa, Mawar; Shamsani, Jannah; Levine, Arnold J.; Chan, Chang S.

    2014-01-01

    The Li–Fraumeni syndrome (LFS) and its variant form (LFL) is a familial predisposition to multiple forms of childhood, adolescent, and adult cancers associated with germ-line mutation in the TP53 tumor suppressor gene. Individual disparities in tumor patterns are compounded by acceleration of cancer onset with successive generations. It has been suggested that this apparent anticipation pattern may result from germ-line genomic instability in TP53 mutation carriers, causing increased DNA copy-number variations (CNVs) with successive generations. To address the genetic basis of phenotypic disparities of LFS/LFL, we performed whole-genome sequencing (WGS) of 13 subjects from two generations of an LFS kindred. Neither de novo CNV nor significant difference in total CNV was detected in relation with successive generations or with age at cancer onset. These observations were consistent with an experimental mouse model system showing that trp53 deficiency in the germ line of father or mother did not increase CNV occurrence in the offspring. On the other hand, individual records on 1,771 TP53 mutation carriers from 294 pedigrees were compiled to assess genetic anticipation patterns (International Agency for Research on Cancer TP53 database). No strictly defined anticipation pattern was observed. Rather, in multigeneration families, cancer onset was delayed in older compared with recent generations. These observations support an alternative model for apparent anticipation in which rare variants from noncarrier parents may attenuate constitutive resistance to tumorigenesis in the offspring of TP53 mutation carriers with late cancer onset. PMID:25313051

  10. Office of Cancer Genomics

    Cancer.gov

    This past July, I started a journey into the fields of communications and cancer research when I joined the Office of Cancer Genomics (OCG) as a fellow in the National Cancer Institute (NCI) Health Communications Internship Program (HCIP).

  11. Resources | Office of Cancer Genomics

    Cancer.gov

    The Center for Cancer Genomics (CCG) was established to unify the National Cancer Institute's activities in cancer genomics, with the goal of advancing genomics research and translating findings into the clinic to improve the precise diagnosis and treatment of cancers.

  12. Advances in plant genome sequencing.

    PubMed

    Hamilton, John P; Buell, C Robin

    2012-04-01

    The study of plant biology in the 21st century is, and will continue to be, vastly different from that in the 20th century. One driver for this has been the use of genomics methods to reveal the genetic blueprints for not one but dozens of plant species, as well as resolving genome differences in thousands of individuals at the population level. Genomics technology has advanced substantially since publication of the first plant genome sequence, that of Arabidopsis thaliana, in 2000. Plant genomics researchers have readily embraced new algorithms, technologies and approaches to generate genome, transcriptome and epigenome datasets for model and crop species that have permitted deep inferences into plant biology. Challenges in sequencing any genome include ploidy, heterozygosity and paralogy, all which are amplified in plant genomes compared to animal genomes due to the large genome sizes, high repetitive sequence content, and rampant whole- or segmental genome duplication. The ability to generate de novo transcriptome assemblies provides an alternative approach to bypass these complex genomes and access the gene space of these recalcitrant species. The field of genomics is driven by technological improvements in sequencing platforms; however, software and algorithm development has lagged behind reductions in sequencing costs, improved throughput, and quality improvements. It is anticipated that sequencing platforms will continue to improve the length and quality of output, and that the complementary algorithms and bioinformatic software needed to handle large, repetitive genomes will improve. The future is bright for an exponential improvement in our understanding of plant biology. PMID:22449051

  13. Office of Cancer Genomics

    Cancer.gov

    My name is Nicholas Griner and I am the Scientific Program Manager for the Cancer Genome Characterization Initiative (CGCI) in the Office of Cancer Genomics (OCG). Until recently, I spent most of my scientific career working in a cancer research laboratory. In my postdoctoral training, my research focused on identifying novel pathways that contribute to both prostate and breast cancers and studying proteins within these pathways that may be targeted with cancer drugs.

  14. Cancer Genomics Overview

    Cancer.gov

    Genomic information about cancer is leading to better diagnoses and treatment strategies that are tailored to patients’ tumors. Precision medicine is the application of genomic insights to a therapeutic approach adapted specifically for each patient.

  15. Complete Genome Sequence of Bacilli bacterium Strain VT-13-104 Isolated from the Intestine of a Patient with Duodenal Cancer

    PubMed Central

    Tetz, Victor

    2015-01-01

    We report the complete genome sequence of Bacilli bacterium strain VT-13-104 isolated from the intestine of a patient with duodenal cancer. The genome is composed of 3,573,421 bp, with a G+C content of 35.7%. It possesses 3,254 predicted protein-coding genes encoding multidrug resistance transporters, resistance to antibiotics, and virulence factors. PMID:26139715

  16. Targeted Sequencing of the Mitochondrial Genome of Women at High Risk of Breast Cancer without Detectable Mutations in BRCA1/2

    PubMed Central

    Blein, Sophie; Barjhoux, Laure; Damiola, Francesca; Dondon, Marie-Gabrielle; Eon-Marchais, Séverine; Marcou, Morgane; Caron, Olivier; Lortholary, Alain; Buecher, Bruno; Berthet, Pascaline; Noguès, Catherine; Lasset, Christine; Gauthier-Villars, Marion; Mazoyer, Sylvie; Stoppa-Lyonnet, Dominique; Andrieu, Nadine; Cox, David G.

    2015-01-01

    Breast Cancer is a complex multifactorial disease for which high-penetrance mutations have been identified. Approaches used to date have identified genomic features explaining about 50% of breast cancer heritability. A number of low- to medium penetrance alleles (per-allele odds ratio < 1.5 and 4.0, respectively) have been identified, suggesting that the remaining heritability is likely to be explained by the cumulative effect of such alleles and/or by rare high-penetrance alleles. Relatively few studies have specifically explored the mitochondrial genome for variants potentially implicated in breast cancer risk. For these reasons, we propose an exploration of the variability of the mitochondrial genome in individuals diagnosed with breast cancer, having a positive breast cancer family history but testing negative for BRCA1/2 pathogenic mutations. We sequenced the mitochondrial genome of 436 index breast cancer cases from the GENESIS study. As expected, no pathogenic genomic pattern common to the 436 women included in our study was observed. The mitochondrial genes MT-ATP6 and MT-CYB were observed to carry the highest number of variants in the study. The proteins encoded by these genes are involved in the structure of the mitochondrial respiration chain, and variants in these genes may impact reactive oxygen species production contributing to carcinogenesis. More functional and epidemiological studies are needed to further investigate to what extent variants identified may influence familial breast cancer risk. PMID:26406445

  17. Genome Sequence Databases (Overview): Sequencing and Assembly

    SciTech Connect

    Lapidus, Alla L.

    2009-01-01

    From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three dimensional structure of the DNA molecule, biologists tried to decode the secrets of life hidden within these long molecules, and technologists invent and improve methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different length (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Despite the fact that automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases, it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents the genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (~1 error/10,000 bp), validated through a number of computer and laboratory experiments.
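
    To make the two error rates quoted above concrete, the following sketch converts them into an expected number of consensus errors for a genome of a given size. The 5 Mb genome size is a hypothetical example (roughly bacterial scale), not a figure from the abstract, and the function name is illustrative.

```python
def expected_errors(genome_size_bp, bp_per_error):
    """Expected number of consensus errors, given one error per bp_per_error bases."""
    return genome_size_bp / bp_per_error

GENOME_SIZE = 5_000_000  # hypothetical 5 Mb genome, assumed for illustration

draft = expected_errors(GENOME_SIZE, 2_000)      # draft assembly: about 1 error per 2,000 bp
finished = expected_errors(GENOME_SIZE, 10_000)  # finished genome: about 1 error per 10,000 bp

print(f"draft: {draft:.0f} errors, finished: {finished:.0f} errors")
```

    Under these assumptions, finishing reduces the expected error count fivefold, which follows directly from the ratio of the two quoted rates.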

  18. Fungal Genome Sequencing and Bioenergy

    SciTech Connect

    Baker, Scott E.; Thykaer, Jette; Adney, William S.; Brettin, T.; Brockman, Fred J.; D'haeseleer, Patrik; Martinez, Antonio D.; Miller, R. M.; Rokhsar, Daniel S.; Schadt, Christopher W.; Torok, Tamas; Tuskan, Gerald; Bennett, Joan W.; Berka, Randy; Briggs, Steve; Heitman, Joseph; Taylor, John; Turgeon, Barbara G.; Werner-Washburne, Maggie; Himmel, Michael E.

    2008-09-30

    To date, the number of ongoing filamentous fungal genome sequencing projects is almost tenfold fewer than those of bacterial and archaeal genome projects. The fungi chosen for sequencing represent narrow kingdom diversity; most are pathogens or models. We advocate an ambitious, forward-looking phylogenetic-based genome sequencing program, designed to capture metabolic diversity within the fungal kingdom, thereby enhancing research into alternative bioenergy sources, bioremediation, and fungal-environment interactions.

  19. Whole-exome sequencing combined with functional genomics reveals novel candidate driver cancer genes in endometrial cancer | Office of Cancer Genomics

    Cancer.gov

    Endometrial cancer is the most common gynecological malignancy, with more than 280,000 cases occurring annually worldwide. Although previous studies have identified important common somatic mutations in endometrial cancer, they have primarily focused on a small set of known cancer genes and have thus provided a limited view of the molecular basis underlying this disease. Here we have developed an integrated systems-biology approach to identifying novel cancer genes contributing to endometrial tumorigenesis.

  20. Office of Cancer Genomics

    Cancer.gov

    Welcome to the first National Cancer Institute (NCI) Office of Cancer Genomics (OCG) electronic newsletter. We are proud to launch this new communication tool to provide updates on ongoing projects, announce new projects, and highlight how OCG's efforts further the NCI mission to improve the lives of cancer patients by advancing the understanding of cancer's mechanisms at the molecular level.

  1. Projects | Office of Cancer Genomics

    Cancer.gov

    The goal of the Burkitt Lymphoma Genome Sequencing Project (BLGSP) is to explore potential genetic changes in patients with Burkitt lymphoma (BL) that could lead to better prevention, detection, and treatment of this rare and aggressive cancer.

  2. NIH Launches Comprehensive Effort to Explore Cancer Genomics | Office of Cancer Genomics

    Cancer.gov

    The National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), both part of the National Institutes of Health (NIH), today launched a comprehensive effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, especially large-scale genome sequencing.

  3. | Office of Cancer Genomics

    Cancer.gov

    Dr. Louis Staudt, a member of the National Academy of Sciences, is a leading expert in lymphoma research within NCI’s intramural research program. He was recently named the Director of the Center for Cancer Genomics (CCG), the organization that encompasses the Office of Cancer Genomics. In this short interview, Dr. Staudt discusses the objectives, challenges, and future directions of the Center.

  4. | Office of Cancer Genomics

    Cancer.gov

    My name is Subhashini Jagu, and I am the Scientific Program Manager for the Cancer Target Discovery and Development (CTD2) Network at the Office of Cancer Genomics (OCG). In my new role, I help CTD2 work toward its mission, which is to develop new scientific approaches to accelerate the translation of genomic discoveries into new treatments. Collaborative efforts that bring together a variety of expertise and infrastructure are needed to understand and successfully treat cancer, a highly complex disease.

  5. Testing personalized medicine: patient and physician expectations of next-generation genomic sequencing in late-stage cancer care

    PubMed Central

    Miller, Fiona A; Hayeems, Robin Z; Bytautas, Jessica P; Bedard, Philippe L; Ernst, Scott; Hirte, Hal; Hotte, Sebastien; Oza, Amit; Razak, Albiruni; Welch, Stephen; Winquist, Eric; Dancey, Janet; Siu, Lillian L

    2014-01-01

    Developments in genomics, including next-generation sequencing technologies, are expected to enable a more personalized approach to clinical care, with improved risk stratification and treatment selection. In oncology, personalized medicine is particularly advanced and increasingly used to identify oncogenic variants in tumor tissue that predict responsiveness to specific drugs. Yet, the translational research needed to validate these technologies will be conducted in patients with late-stage cancer and is expected to produce results of variable clinical significance and incidentally identify genetic risks. To explore the experiential context in which much of personalized cancer care will be developed and evaluated, we conducted a qualitative interview study alongside a pilot feasibility study of targeted DNA sequencing of metastatic tumor biopsies in adult patients with advanced solid malignancies. We recruited 29/73 patients and 14/17 physicians; transcripts from semi-structured interviews were analyzed for thematic patterns using an interpretive descriptive approach. Patient hopes of benefit from research participation were enhanced by the promise of novel and targeted treatment but challenged by non-findings or by limited access to relevant trials. Family obligations informed a willingness to receive genetic information, which was perceived as burdensome given disease stage or as inconsequential given faced challenges. Physicians were optimistic about long-term potential but conservative about immediate benefits and mindful of elevated patient expectations; consent and counseling processes were expected to mitigate challenges from incidental findings. These findings suggest the need for information and decision tools to support physicians in communicating realistic prospects of benefit, and for cautious approaches to the generation of incidental genetic information. PMID:23860039

  6. Whole-Genome Sequences of DA and F344 Rats with Different Susceptibilities to Arthritis, Autoimmunity, Inflammation and Cancer

    PubMed Central

    Guo, Xiaosen; Brenner, Max; Zhang, Xuemei; Laragione, Teresina; Tai, Shuaishuai; Li, Yanhong; Bu, Junjie; Yin, Ye; Shah, Anish A.; Kwan, Kevin; Li, Yingrui; Jun, Wang; Gulko, Pércio S.

    2013-01-01

    DA (D-blood group of Palm and Agouti, also known as Dark Agouti) and F344 (Fischer) are two inbred rat strains with differences in several phenotypes, including susceptibility to autoimmune disease models and inflammatory responses. While these strains have been extensively studied, little information is available about the DA and F344 genomes, as only the Brown Norway (BN) and spontaneously hypertensive rat strains have been sequenced to date. Here we report the sequencing of the DA and F344 genomes using next-generation Illumina paired-end read technology and the first de novo assembly of a rat genome. DA and F344 were sequenced with an average depth of 32-fold, covered 98.9% of the BN reference genome, and included 97.97% of known rat ESTs. New sequences could be assigned to 59 million positions with previously unknown data in the BN reference genome. Differences between DA, F344, and BN included 19 million positions in novel scaffolds, 4.09 million single nucleotide polymorphisms (SNPs) (including 1.37 million new SNPs), 458,224 short insertions and deletions, and 58,174 structural variants. Genetic differences between DA, F344, and BN, including high-impact SNPs and short insertions and deletions affecting >2500 genes, are likely to account for most of the phenotypic variation between these strains. The new DA and F344 genome sequencing data should facilitate gene discovery efforts in rat models of human disease. PMID:23695301

  7. Somatic retrotransposition in the cancer genome

    E-print Network

    Helman, Elena

    2014-01-01

    Cancer is a complex disease of the genome exhibiting myriad somatic mutations, from single nucleotide changes to various chromosomal rearrangements. The technological advances of next-generation sequencing enable high-throughput ...

  8. Genomic Sequencing in Determining Treatment in Patients With Metastatic Cancer or Cancer That Cannot Be Removed by Surgery

    ClinicalTrials.gov

    2015-10-27

    Metastatic Neoplasm; Recurrent Neoplasm; Recurrent Non-Small Cell Lung Carcinoma; Stage IIIA Non-Small Cell Lung Cancer; Stage IIIB Non-Small Cell Lung Cancer; Stage IV Non-Small Cell Lung Cancer; Unresectable Malignant Neoplasm

  9. Data Policies | Office of Cancer Genomics

    Cancer.gov

    OCG accelerates the discovery and development of better cancer diagnosis and treatment strategies by making data and materials from its programs available to the cancer research community. OCG enables researchers to search and download data generated by its active programs in databases that are easily accessible through program-specific data matrices. For the tumor genome characterization initiatives, CGCI and TARGET, the datasets contain clinical information, genomic characterization data, and high-throughput sequencing analysis of tumor genomes.

  10. Whole-genome sequencing of Asian lung cancers: second-hand smoke unlikely to be responsible for higher incidence of lung cancer among Asian never-smokers.

    PubMed

    Krishnan, Vidhya G; Ebert, Philip J; Ting, Jason C; Lim, Elaine; Wong, Swee-Seong; Teo, Audrey S M; Yue, Yong G; Chua, Hui-Hoon; Ma, Xiwen; Loh, Gary S L; Lin, Yuhao; Tan, Joanna H J; Yu, Kun; Zhang, Shenli; Reinhard, Christoph; Tan, Daniel S W; Peters, Brock A; Lincoln, Stephen E; Ballinger, Dennis G; Laramie, Jason M; Nilsen, Geoffrey B; Barber, Thomas D; Tan, Patrick; Hillmer, Axel M; Ng, Pauline C

    2014-11-01

    Asian nonsmoking populations have a higher incidence of lung cancer compared with their European counterparts. There is a long-standing hypothesis that the increase of lung cancer in Asian never-smokers is due to environmental factors such as second-hand smoke. We analyzed whole-genome sequencing of 30 Asian lung cancers. Unsupervised clustering of mutational signatures separated the patients into two categories of either all the never-smokers or all the smokers or ex-smokers. In addition, nearly one third of the ex-smokers and smokers classified with the never-smoker-like cluster. The somatic variant profiles of Asian lung cancers were similar to those of European origin, with G.C>T.A being predominant in smokers. We found EGFR and TP53 to be the most frequently mutated genes, with mutations in 50% and 27% of individuals, respectively. Among the 16 never-smokers, 69% had an EGFR mutation compared with 29% of 14 smokers/ex-smokers. Asian never-smokers had lung cancer signatures distinct from the smoker signature and their mutation profiles were similar to European never-smokers. The profiles of Asian and European smokers are also similar. Taken together, these results suggested that the same mutational mechanisms underlie the etiology for both ethnic groups. Thus, the high incidence of lung cancer in Asian never-smokers seems unlikely to be due to second-hand smoke or other carcinogens that cause oxidative DNA damage, implying that routine EGFR testing is warranted in the Asian population regardless of smoking status. PMID:25189529
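
    The EGFR percentages quoted in this abstract can be checked against each other with a little arithmetic. The sketch below is only a consistency check: the patient counts it derives (11 and 4) are inferred from the rounded percentages and are not stated in the abstract, and the helper name `implied_count` is hypothetical.

```python
def implied_count(percent, group_size):
    """Nearest whole number of patients implied by a rounded percentage.

    Assumption: the abstract's percentages are rounded to whole numbers,
    so the true count is the nearest integer to percent% of the group.
    """
    return round(percent / 100 * group_size)


never_smokers, smokers = 16, 14  # group sizes given in the abstract

egfr_never = implied_count(69, never_smokers)  # inferred, not reported
egfr_smokers = implied_count(29, smokers)      # inferred, not reported

# Overall EGFR mutation frequency across all 30 tumors, in percent
overall = 100 * (egfr_never + egfr_smokers) / (never_smokers + smokers)
print(egfr_never, egfr_smokers, overall)
```

    Under these assumptions the two subgroup figures reproduce the overall EGFR frequency of 50% reported for the 30 tumors.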

  11. Somatic retrotransposition in human cancer revealed by whole-genome and exome sequencing

    E-print Network

    Helman, Elena

    Retrotransposons constitute a major source of genetic variation, and somatic retrotransposon insertions have been reported in cancer. Here, we applied TranspoSeq, a computational framework that identifies retrotransposon ...

  12. | Office of Cancer Genomics

    Cancer.gov

    Ringing in the New Year is always a time for reflection. With several individual projects stirring in the Office and a new Center for Cancer Genomics recently inaugurated, 2011 was a prodigious year for OCG.

  13. Microbial genome sequencing and pathogenesis.

    PubMed

    Tang, C M; Hood, D W; Moxon, E R

    1998-02-01

    The year 1997 saw the publication of the complete nucleotide sequence of Helicobacter pylori and Escherichia coli. It is conceivable that the complete nucleotide sequence for all the major human bacterial pathogens will be available by the end of the century. Database alignments have been used to ascribe the putative functions of open reading frames in the sequenced isolates and to define the differences between bacterial species at the nucleotide level. The most striking finding from all genome projects has been the high proportion of open reading frames that have no known function. Experimental data demonstrating the utility of the genome sequencing projects are only just beginning to emerge. PMID:10066467

  14. Burkitt Lymphoma | Office of Cancer Genomics

    Cancer.gov

    The goal of the Burkitt Lymphoma Genome Sequencing Project (BLGSP) is to explore potential genetic changes in patients with Burkitt lymphoma (BL) that could lead to better prevention, detection, and treatment of this rare and aggressive cancer. The Office of Cancer Genomics (OCG) at the National Cancer Institute (NCI) initiated BLGSP in collaboration with the Foundation for Burkitt Lymphoma Research.

  15. The genomic complexity of primary human prostate cancer

    E-print Network

    Carter, Scott L.

    Prostate cancer is the second most common cause of male cancer deaths in the United States. However, the full range of prostate cancer genomic alterations is incompletely characterized. Here we present the complete sequence ...

  16. Whole-exome sequencing combined with functional genomics reveals novel candidate driver cancer genes in endometrial cancer

    Cancer.gov

    Endometrial cancer is the most common gynecological malignancy, with more than 280,000 cases occurring annually worldwide. Although previous studies have identified important common somatic mutations in endometrial cancer, they have primarily focused on a small set of known cancer genes and have thus provided a limited view of the molecular basis underlying this disease. Here we have developed an integrated systems-biology approach to identifying novel cancer genes contributing to endometrial tumorigenesis.

  17. NCI Community Cancer Centers Program - Related Programs - The Cancer Genome Atlas

    Cancer.gov

    The Cancer Genome Atlas (TCGA) is a large-scale collaborative effort by NCI and the National Human Genome Research Institute (NHGRI) to systematically characterize the genomic changes that occur in cancer through the application of genome analysis technologies, including large-scale genome sequencing.

  18. Screening for Genomic Rearrangements in Families with Breast and Ovarian Cancer Identifies BRCA1 Mutations Previously Missed by Conformation-Sensitive Gel Electrophoresis or Sequencing

    PubMed Central

    Unger, Meredith A.; Nathanson, Katherine L.; Calzone, Kathleen; Antin-Ozerkis, Danielle; Shih, Helen A.; Martin, Anne-Marie; Lenoir, Gilbert M.; Mazoyer, Sylvie; Weber, Barbara L.

    2000-01-01

    The frequency of genomic rearrangements in BRCA1 was assessed in 42 American families with breast and ovarian cancer who were seeking genetic testing and who were subsequently found to be negative for BRCA1 and BRCA2 coding-region mutations. An affected individual from each family was tested by PCR for the exon 13 duplication (Puget et al. 1999a) and by Southern blot analysis for novel genomic rearrangements. The exon 13 duplication was detected in one family, and four families had other genomic rearrangements. A total of 5 (11.9%) of the 42 families with breast/ovarian cancer who did not have BRCA1 and BRCA2 coding-region mutations had mutations in BRCA1 that were missed by conformation-sensitive gel electrophoresis or sequencing. Four of five families with BRCA1 genomic rearrangements included at least one individual with both breast and ovarian cancer; therefore, 4 (30.8%) of 13 families with a case of multiple primary breast and ovarian cancer had a genomic rearrangement in BRCA1. Families with genomic rearrangements had prior probabilities of having a BRCA1 mutation, ranging from 33% to 97% (mean 70%) (Couch et al. 1997). In contrast, in families without rearrangements, prior probabilities of having a BRCA1 mutation ranged from 7% to 92% (mean 37%). Thus, the prior probability of detecting a BRCA1 mutation may be a useful predictor when considering the use of Southern blot analysis for families with breast/ovarian cancer who do not have detectable coding-region mutations. PMID:10978226
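
    The two proportions reported in this abstract follow directly from the family counts it gives. A minimal sketch of the arithmetic, using only the numbers in the abstract (the helper name `pct` is hypothetical):

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


exon13_duplication_families = 1   # detected by PCR
other_rearrangement_families = 4  # detected by Southern blot
rearrangement_families = exon13_duplication_families + other_rearrangement_families

# Share of the 42 mutation-negative families with a BRCA1 rearrangement
share_of_all_families = pct(rearrangement_families, 42)

# Share of the 13 families that include a case of both breast and ovarian cancer
share_of_multiple_primary_families = pct(4, 13)

print(share_of_all_families, share_of_multiple_primary_families)
```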

  19. Bioinformatics and Genomic Technology 354: Detection of Copy Number Variations in Cancer Genomes from High

    E-print Network

    Hochreiter, Sepp

    ... a model across samples for each genomic position, it is not affected by read count variations along ... and the core algorithm of cn.MOPS have been optimized for CNV detection in cancer genomes. We demonstrate the improved performance of the enhanced cn.MOPS algorithm for cancer genomes on whole genome sequencing data ...

  20. Remarkable similarities of chromosomal rearrangements between primary human breast cancers and matched distant metastases as revealed by whole-genome sequencing.

    PubMed

    Tang, Man-Hung Eric; Dahlgren, Malin; Brueffer, Christian; Tjitrowirjo, Tamara; Winter, Christof; Chen, Yilun; Olsson, Eleonor; Wang, Kun; Törngren, Therese; Sjöström, Martin; Grabau, Dorthe; Bendahl, Pär-Ola; Rydén, Lisa; Niméus, Emma; Saal, Lao H; Borg, Åke; Gruvberger-Saal, Sofia K

    2015-11-10

    To better understand and characterize chromosomal structural variation during breast cancer progression, we enumerated chromosomal rearrangements for 11 patients by performing low-coverage whole-genome sequencing of 11 primary breast tumors and their 13 matched distant metastases. The tumor genomes harbored a median of 85 (range 18-404) rearrangements per tumor, with a median of 82 (26-310) in primaries compared to 87 (18-404) in distant metastases. Concordance between paired tumors from the same patient was high with a median of 89% of rearrangements shared (range 61-100%), whereas little overlap was found when comparing all possible pairings of tumors from different patients (median 3%). The tumors exhibited diverse genomic patterns of rearrangements: some carried events distributed throughout the genome while others had events mostly within densely clustered chromothripsis-like foci at a few chromosomal locations. Irrespectively, the patterns were highly conserved between the primary tumor and metastases from the same patient. Rearrangements occurred more frequently in genic areas than expected by chance and among the genes affected there was significant enrichment for cancer-associated genes including disruption of TP53, RB1, PTEN, and ESR1, likely contributing to tumor development. Our findings are most consistent with chromosomal rearrangements being early events in breast cancer progression that remain stable during the development from primary tumor to distant metastasis. PMID:26439695

  1. Ovarian cancer: genomic analysis

    PubMed Central

    Wei, W.; Dizon, D.; Vathipadiekal, V.; Birrer, M. J.

    2013-01-01

    Objectives Despite improvements in the management of ovarian cancer patients over the last 30 years, there has been only a minimal improvement in overall survival. While targeted therapeutic approaches for the treatment of cancer have evolved, major challenges in ovarian cancer research persist, including the identification of predictive biomarkers with clinical relevance, so that empirical drug selection can be avoided. In this article, we review published genomic analysis studies including data generated in our laboratory and how they have been incorporated into modern clinical trials in a rational and effective way. Methods Multiple published genomic analysis studies were collected for review and discussion with emphasis on their potential clinical applicability. Results Genomic analysis has been shown to be a powerful tool to identify dysregulated genes and aberrantly activated pathways, and to uncover the uniqueness of subclasses of ovarian tumors. The application of this technology has provided a solid molecular basis for different clinical behaviors associated with tumor histology and grade. Genomic signatures have been obtained to predict clinical end points for patients with cancer, including response rates, progression-free survival, and overall survival. In addition, genomic analysis has provided opportunities to identify biomarkers, which result either in a modification of existing clinical management or in the stratification of patients to novel therapeutic approaches designed as clinical trials. Conclusions Genomic analyses have accelerated the identification of relevant biomarkers and extended our understanding of the molecular biology of ovarian cancer. This, in turn, will hopefully lead to a paradigm shift from empirical, uniform treatment to a more rational, personalized treatment of ovarian cancers. However, validation of potential biomarkers on both the statistical and biological levels is needed to confirm they are of clinical relevance, in order to increase the likelihood that the desired outcome can be predicted and achieved. PMID:24265410

  2. Sequencing and mapping of the onion genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The cost of DNA sequencing continues to decline and, in the near future, it will become reasonable to undertake sequencing of the enormous nuclear genome of onion. We undertook sequencing of expressed and genomic regions of the onion genome to learn about the structure of the onion genome, as well a...

  3. How the environment shapes cancer genomes

    PubMed Central

    Pfeifer, Gerd P.

    2014-01-01

    Purpose of review The mutational patterns of cancer genomes allow conclusions or generation of hypotheses as to what mechanisms or environmental, dietary or occupational exposures might have created the mutations and therefore will have contributed to the formation of the cancer. The arguments for cancer causation are particularly convincing when epidemiological evidence can support the theory that a particular exposure is linked to the cancer and when the mutational process can be recapitulated in experimental systems. In this review, I will summarize recent evidence from cancer genome sequencing studies to exemplify how the environment can modulate tumor genomes. Recent findings Mutation data from cancer genomes clearly implicate the UVB component of sunlight in melanoma skin cancers, tobacco carcinogen-induced DNA damage in lung cancers and aristolochic acid, a chemical compound found in certain herbal medicines, in urothelial carcinomas of exposed populations. However, large-scale sequencing is beginning to unveil other unique mutational spectra in particular cancers, such as A to C mutations at 5′AA dinucleotides in esophageal adenocarcinomas and complex mutational patterns in liver cancer. These data sets can form the basis for future studies aimed at identifying the carcinogens at work. Summary The findings have substantial implications for our understanding of cancer etiology and cancer prevention. PMID:25402978

  4. Sequencing Complex Genomic Regions

    SciTech Connect

    Eichler, Evan

    2009-05-28

    Evan Eichler, Howard Hughes Medical Investigator at the University of Washington, gives the May 28, 2009 keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM. Part 2 of 2

  5. Sequencing Complex Genomic Regions

    SciTech Connect

    Eichler, Evan

    2009-05-28

    Evan Eichler, Howard Hughes Medical Investigator at the University of Washington, gives the May 28, 2009 keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM. Part 1 of 2

  6. The Cancer Genome Atlas ovarian cancer analysis

    Cancer.gov

    An analysis of genomic changes in ovarian cancer has provided the most comprehensive and integrated view of cancer genes for any cancer type to date. Ovarian serous adenocarcinoma tumors from 500 patients were examined by The Cancer Genome Atlas (TCGA) Re

  7. | Office of Cancer Genomics

    Cancer.gov

    The Office of Cancer Genomics is proud to regularly support internship programs including The Health Communications Internship Program (HCIP). This past July the OCG welcomed a new HCIP intern to a one-year appointment. Gene Gillespie earned his Ph.D. from UCLA in 2011 and is interested in pursuing a career in science and medical writing. He presents a few personal and scientific thoughts on cancer in this month’s eNews perspective.

  8. | Office of Cancer Genomics

    Cancer.gov

    It's April, and that can only mean one thing at the NCI. No, it's not the DC Cherry Blossom Festival, but the Annual Meeting of the American Association for Cancer Research (AACR). This year's gathering of oncology-laden minds was inundated with a plethora of multiple symposia, educational and scientific sessions, workshops, talks and poster presentations that revolved around the theme of cancer genomics. The opening plenary session featured the NCI Director, Dr.

  9. Cancer Target Discovery and Development | Office of Cancer Genomics

    Cancer.gov

    CTD2 bridges the gap between the enormous volumes of data generated by genomic characterization studies and the ability to use these data for the development of human cancer therapeutics. It specializes in computational and functional genomics approaches critical for translating next-generation sequencing data.

  10. Genomic tumor evolution of breast cancer.

    PubMed

    Sato, Fumiaki; Saji, Shigehira; Toi, Masakazu

    2016-01-01

    Owing to recent technical developments in comprehensive genome-wide analysis, such as next-generation sequencing, deep biological insights into breast cancer have been revealed. Information on genomic mutations and rearrangements in patients' tumors is indispensable for understanding the mechanisms of carcinogenesis, progression, metastasis, and resistance to systemic treatment of breast cancer. To date, comprehensive genomic analyses have illustrated not only base substitution patterns and lists of driver mutations and key rearrangements, but also the manner of tumor evolution. The breast cancer genome changes and evolves dynamically over the course of cancer development, from non-invasive disease via invasive primary tumor to metastatic tumor, and during treatment exposure. The accumulation patterns of base substitutions and genomic rearrangements appear gradual and punctuated, respectively, by analogy with two contrasting theories of species evolution: Darwin's phyletic gradualism, and Eldredge and Gould's "punctuated equilibrium". Liquid biopsy is a non-invasive method for detecting the genomic evolution of breast cancer. Genomic mutation patterns in circulating tumor cells and circulating cell-free tumor DNA represent those of the tumors present in the patient's body. Liquid biopsy methods are now under development for future application in clinical cancer treatment. This article summarizes the latest knowledge regarding the breast cancer genome, especially in terms of 'tumor evolution'. PMID:25998191

  11. Collaborators | Office of Cancer Genomics

    Cancer.gov

    The TARGET initiative is jointly managed within the National Cancer Institute (NCI) by the Office of Cancer Genomics (OCG) and the Cancer Therapy Evaluation Program (CTEP).

  12. NIH Launches Comprehensive Effort to Explore Cancer Genomics

    Cancer.gov

    The National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), both part of the National Institutes of Health (NIH), today launched a comprehensive effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, especially large-scale genome sequencing.

  13. Fuzzy Genome Sequence Assembly for Single and Environmental Genomes

    E-print Network

    Nicolescu, Monica

    ... and to the first genome sequence assembly, Bacteriophage φX174 [38]. In 1990 the Human Genome Project ... in 2003, two years before its projected date. ... In 1993 The Institute for Genome ... advancements in technology that led to the complete sequencing of the Human Genome and the H. influenzae ...

  14. Educational Resources | Office of Cancer Genomics

    Cancer.gov

    The Center for Cancer Genomics (CCG) was established to unify the National Cancer Institute's activities in cancer genomics, with the goal of advancing genomics research and translating findings into the clinic to improve the precise diagnosis and treatment of cancers.

  15. Challenges of sequencing human genomes

    PubMed Central

    Ding, Li; Mardis, Elaine R.; Wilson, Richard K.

    2010-01-01

    Massively parallel sequencing technologies continue to alter the study of human genetics. As the cost of sequencing declines, next-generation sequencing (NGS) instruments and datasets will become increasingly accessible to the wider research community. Investigators are understandably eager to harness the power of these new technologies. Sequencing human genomes on these platforms, however, presents numerous production and bioinformatics challenges. Production issues like sample contamination, library chimaeras and variable run quality have become increasingly problematic in the transition from technology development lab to production floor. Analysis of NGS data, too, remains challenging, particularly given the short-read lengths (35–250 bp) and sheer volume of data. The development of streamlined, highly automated pipelines for data analysis is critical for transition from technology adoption to accelerated research and publication. This review aims to describe the state of current NGS technologies, as well as the strategies that enable NGS users to characterize the full spectrum of DNA sequence variation in humans. PMID:20519329

  16. Comparative Sequencing of Plant Genomes: Choices to Make

    E-print Network

    Purugganan, Michael D.

    COMMENTARY Comparative Sequencing of Plant Genomes: Choices to Make. The first sequenced genome of a plant, Arabidopsis thaliana, was published ~6 years ago (Arabidopsis Genome Initiative, 2000). Since ... Information Entrez Genome Projects website reports that sequencing of several more plant genomes is in prog

  17. Sequencing crop genomes: approaches and applications

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Plant genome sequencing methodology parallels the sequencing of the human genome. The first projects were slow and very expensive. BAC-by-BAC approaches were utilized first, and whole-genome shotgun sequencing rapidly replaced that approach. So-called 'next generation' technologies such as short rea...

  18. Programs | Office of Cancer Genomics

    Cancer.gov

    OCG facilitates cancer genomics research through a series of highly focused programs. These programs generate and disseminate genomic data for use by the cancer research community. OCG programs also promote advances in technology-based infrastructure and create valuable experimental reagents and tools. OCG programs encourage collaboration by interconnecting with other genomics and cancer projects in order to accelerate translation of findings into the clinic. Below are OCG’s current, completed, and initiated programs:

  19. Programs | Office of Cancer Genomics

    Cancer.gov

    OCG facilitates cancer genomics research through a series of highly focused programs. These programs generate and disseminate genomic data for use by the cancer research community. OCG programs also promote advances in technology-based infrastructure and create valuable experimental reagents and tools. OCG programs encourage collaboration by interconnecting with other genomics and cancer projects in order to accelerate translation of findings into the clinic.

  20. Dr. Marco Marra: Pioneer and Visionary in Cancer Genomics Research | Office of Cancer Genomics

    Cancer.gov

    Dr. Marco Marra is a highly distinguished genomics and bioinformatics researcher. He is the Director of Canada’s Michael Smith Genome Sciences Centre at the BC Cancer Agency and holds a faculty position at the University of British Columbia. The Centre is a state-of-the-art sequencing facility in Vancouver, Canada, with a major focus on the study of cancers. Many of its research projects are undertaken in collaboration with other Canadian and international institutions.

  1. The genomic evolution of human prostate cancer

    PubMed Central

    Mitchell, T; Neal, D E

    2015-01-01

    Prostate cancers are highly prevalent in the developed world, with inheritable risk contributing appreciably to tumour development. Genomic heterogeneity within individual prostate glands and between patients derives predominantly from structural variants and copy-number aberrations. Subtypes of prostate cancers are being delineated through the increasing use of next-generation sequencing, but these subtypes are yet to be used to guide the prognosis or therapeutic strategy. Herein, we review our current knowledge of the mutational landscape of human prostate cancer, describing what is known of the common mutations underpinning its development. We evaluate recurrent prostate-specific mutations prior to discussing the mutational events that are shared both in prostate cancer and across multiple cancer types. From these data, we construct a putative overview of the genomic evolution of human prostate cancer. PMID:26125442

  2. Plasma DNA tissue mapping by genome-wide methylation sequencing for noninvasive prenatal, cancer, and transplantation assessments

    PubMed Central

    Sun, Kun; Jiang, Peiyong; Chan, K. C. Allen; Wong, John; Cheng, Yvonne K. Y.; Liang, Raymond H. S.; Chan, Wai-kong; Ma, Edmond S. K.; Chan, Stephen L.; Cheng, Suk Hang; Chan, Rebecca W. Y.; Tong, Yu K.; Ng, Simon S. M.; Wong, Raymond S. M.; Hui, David S. C.; Leung, Tse Ngong; Leung, Tak Y.; Lai, Paul B. S.; Chiu, Rossa W. K.; Lo, Yuk Ming Dennis

    2015-01-01

    Plasma consists of DNA released from multiple tissues within the body. Using genome-wide bisulfite sequencing of plasma DNA and deconvolution of the sequencing data with reference to methylation profiles of different tissues, we developed a general approach for studying the major tissue contributors to the circulating DNA pool. We tested this method in pregnant women, patients with hepatocellular carcinoma, and subjects following bone marrow and liver transplantation. In most subjects, white blood cells were the predominant contributors to the circulating DNA pool. The placental contributions in the plasma of pregnant women correlated with the proportional contributions as revealed by fetal-specific genetic markers. The graft-derived contributions to the plasma in the transplant recipients correlated with those determined using donor-specific genetic markers. Patients with hepatocellular carcinoma showed elevated plasma DNA contributions from the liver, which correlated with measurements made using tumor-associated copy number aberrations. In hepatocellular carcinoma patients and in pregnant women exhibiting copy number aberrations in plasma, comparison of methylation deconvolution results using genomic regions with different copy number status pinpointed the tissue type responsible for the aberrations. In a pregnant woman diagnosed as having follicular lymphoma during pregnancy, methylation deconvolution indicated a grossly elevated contribution from B cells into the plasma DNA pool and localized B cells as the origin of the copy number aberrations observed in plasma. This method may serve as a powerful tool for assessing a wide range of physiological and pathological conditions based on the identification of perturbed proportional contributions of different tissues into plasma. PMID:26392541
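
    The deconvolution step described in this abstract can be pictured as fitting the observed plasma methylation levels as a non-negative mixture of tissue reference profiles. The sketch below is an illustration only, not the authors' implementation: the abstract does not name a solver, so a simple projected gradient descent is assumed, the proportions are normalized only at the end rather than constrained during the fit, and the marker values, tissue labels, and the function name `deconvolve` are all made up for the example.

```python
def deconvolve(reference, observed, steps=3000):
    """Estimate tissue proportions p >= 0 so that reference @ p ~ observed.

    reference: list of rows, one per methylation marker; each row holds the
               methylation level of that marker in every reference tissue.
    observed:  methylation level of each marker measured in plasma.
    Returns proportions normalized to sum to 1.
    """
    n_markers, n_tissues = len(reference), len(reference[0])
    # Step size 1 / ||A||_F^2 is a safe bound for gradient descent on
    # 0.5 * ||A p - y||^2, since ||A||_F^2 >= largest eigenvalue of A^T A.
    step = 1.0 / sum(v * v for row in reference for v in row)
    p = [1.0 / n_tissues] * n_tissues
    for _ in range(steps):
        residual = [
            sum(reference[i][j] * p[j] for j in range(n_tissues)) - observed[i]
            for i in range(n_markers)
        ]
        gradient = [
            sum(reference[i][j] * residual[i] for i in range(n_markers))
            for j in range(n_tissues)
        ]
        # Gradient step followed by projection onto the non-negative orthant
        p = [max(0.0, p[j] - step * gradient[j]) for j in range(n_tissues)]
    total = sum(p)
    return [x / total for x in p]


# Hypothetical reference profiles; columns: white blood cells, liver, placenta
reference = [
    [0.9, 0.1, 0.1],
    [0.1, 0.9, 0.1],
    [0.1, 0.1, 0.9],
    [0.8, 0.2, 0.5],
]
# Synthetic plasma measurements built from a 70% / 20% / 10% mixture
observed = [0.66, 0.26, 0.18, 0.65]

proportions = deconvolve(reference, observed)
print([round(x, 3) for x in proportions])
```

    With this synthetic input the fit should recover a mixture dominated by the white blood cell column, in line with the abstract's observation that white blood cells were the predominant contributors in most subjects. Real data would involve many more markers and measurement noise.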

  3. Plasma DNA tissue mapping by genome-wide methylation sequencing for noninvasive prenatal, cancer, and transplantation assessments.

    PubMed

    Sun, Kun; Jiang, Peiyong; Chan, K C Allen; Wong, John; Cheng, Yvonne K Y; Liang, Raymond H S; Chan, Wai-Kong; Ma, Edmond S K; Chan, Stephen L; Cheng, Suk Hang; Chan, Rebecca W Y; Tong, Yu K; Ng, Simon S M; Wong, Raymond S M; Hui, David S C; Leung, Tse Ngong; Leung, Tak Y; Lai, Paul B S; Chiu, Rossa W K; Lo, Yuk Ming Dennis

    2015-10-01

    Plasma consists of DNA released from multiple tissues within the body. Using genome-wide bisulfite sequencing of plasma DNA and deconvolution of the sequencing data with reference to methylation profiles of different tissues, we developed a general approach for studying the major tissue contributors to the circulating DNA pool. We tested this method in pregnant women, patients with hepatocellular carcinoma, and subjects following bone marrow and liver transplantation. In most subjects, white blood cells were the predominant contributors to the circulating DNA pool. The placental contributions in the plasma of pregnant women correlated with the proportional contributions as revealed by fetal-specific genetic markers. The graft-derived contributions to the plasma in the transplant recipients correlated with those determined using donor-specific genetic markers. Patients with hepatocellular carcinoma showed elevated plasma DNA contributions from the liver, which correlated with measurements made using tumor-associated copy number aberrations. In hepatocellular carcinoma patients and in pregnant women exhibiting copy number aberrations in plasma, comparison of methylation deconvolution results using genomic regions with different copy number status pinpointed the tissue type responsible for the aberrations. In a pregnant woman diagnosed as having follicular lymphoma during pregnancy, methylation deconvolution indicated a grossly elevated contribution from B cells into the plasma DNA pool and localized B cells as the origin of the copy number aberrations observed in plasma. This method may serve as a powerful tool for assessing a wide range of physiological and pathological conditions based on the identification of perturbed proportional contributions of different tissues into plasma. PMID:26392541

  4. Cancer Genomics Research Laboratory

    Cancer.gov

    CGR’s high throughput laboratory is equipped with state-of-the-art laboratory equipment and automation systems for a large number of applications. CGR supports DCEG in all stages of cancer research from planning to publishing, including experimental design and project management, sample handling, genotyping and sequencing assay design and execution, development and implementation of bioinformatic pipelines, and downstream scientific research and analytical support.

  5. Value of a newly sequenced bacterial genome

    PubMed Central

    Barbosa, Eudes GV; Aburjaile, Flavia F; Ramos, Rommel TJ; Carneiro, Adriana R; Le Loir, Yves; Baumbach, Jan; Miyoshi, Anderson; Silva, Artur; Azevedo, Vasco

    2014-01-01

    Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the “scientific value” of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information. PMID:24921006

  6. Marsupial Genome Sequences: Providing Insight into Evolution and Disease

    PubMed Central

    Deakin, Janine E.

    2012-01-01

    Marsupials (metatherians), with their position in vertebrate phylogeny and their unique biological features, have been studied for many years by a dedicated group of researchers, but it has only been since the sequencing of the first marsupial genome that their value has been more widely recognised. We now have genome sequences for three distantly related marsupial species (the grey short-tailed opossum, the tammar wallaby, and Tasmanian devil), with the promise of many more genomes to be sequenced in the near future, making this a particularly exciting time in marsupial genomics. The emergence of a transmissible cancer, which is obliterating the Tasmanian devil population, has increased the importance of obtaining and analysing marsupial genome sequence for understanding such diseases as well as for conservation efforts. In addition, these genome sequences have facilitated studies aimed at answering questions regarding gene and genome evolution and provided insight into the evolution of epigenetic mechanisms. Here I highlight the major advances in our understanding of evolution and disease, facilitated by marsupial genome projects, and speculate on the future contributions to be made by such sequences. PMID:24278712

  7. Marsupial genome sequences: providing insight into evolution and disease.

    PubMed

    Deakin, Janine E

    2012-01-01

    Marsupials (metatherians), with their position in vertebrate phylogeny and their unique biological features, have been studied for many years by a dedicated group of researchers, but it has only been since the sequencing of the first marsupial genome that their value has been more widely recognised. We now have genome sequences for three distantly related marsupial species (the grey short-tailed opossum, the tammar wallaby, and Tasmanian devil), with the promise of many more genomes to be sequenced in the near future, making this a particularly exciting time in marsupial genomics. The emergence of a transmissible cancer, which is obliterating the Tasmanian devil population, has increased the importance of obtaining and analysing marsupial genome sequence for understanding such diseases as well as for conservation efforts. In addition, these genome sequences have facilitated studies aimed at answering questions regarding gene and genome evolution and provided insight into the evolution of epigenetic mechanisms. Here I highlight the major advances in our understanding of evolution and disease, facilitated by marsupial genome projects, and speculate on the future contributions to be made by such sequences. PMID:24278712

  8. Expanding the computational toolbox for mining cancer genomes

    PubMed Central

    Ding, Li; Wendl, Michael C.; McMichael, Joshua F.; Raphael, Benjamin J.

    2014-01-01

    High-throughput DNA sequencing has revolutionized cancer genomics with numerous discoveries relevant to cancer diagnosis and treatment. The latest sequencing and analysis methods have successfully identified somatic alterations including single nucleotide variants (SNVs), insertions and deletions (indels), structural aberrations, and gene fusions. Additional computational techniques have proved useful to define those mutations, genes, and molecular networks that drive diverse cancer phenotypes as well as determine clonal architectures in tumour samples. Collectively, these tools have advanced the study of genomic, transcriptomic, epigenomic alterations and their association to clinical properties. Here, we review cancer genomics software and the insights that have been gained from their application. PMID:25001846

  9. The fungal genome initiative and lessons learned from genome sequencing.

    PubMed

    Cuomo, Christina A; Birren, Bruce W

    2010-01-01

    The sequence of Saccharomyces cerevisiae enabled systematic genome-wide experimental approaches, demonstrating the power of having the complete genome of an organism. The rapid impact of these methods on research in yeast mobilized an effort to expand genomic resources for other fungi. The "fungal genome initiative" represents an organized genome sequencing effort to promote comparative and evolutionary studies across the fungal kingdom. Through such an approach, scientists can not only better understand specific organisms but also illuminate the shared and unique aspects of fungal biology that underlie the importance of fungi in biomedical research, health, food production, and industry. To date, assembled genomes for over 100 fungi are available in public databases, and many more sequencing projects are underway. Here, we discuss both examples of findings from comparative analysis of fungal sequences, with a specific emphasis on yeast genomes, and on the analytical approaches taken to mine fungal genomes. New sequencing methods are accelerating comparative studies of fungi by reducing the cost and difficulty of sequencing. This has driven more common use of sequencing applications, such as to study genome-wide variation in populations or to deeply profile RNA transcripts. These and further technological innovations will continue to be piloted in yeasts and other fungi, and will expand the applications of sequencing to study fungal biology. PMID:20946837

  10. Contact | Office of Cancer Genomics

    Cancer.gov

    For more information about the Office of Cancer Genomics, please contact: Office of Cancer Genomics National Cancer Institute 31 Center Drive, 10A07 Bethesda, Maryland 20892-2580 Phone: (301) 451-8027 Fax: (301) 480-4368 Email: ocg@mail.nih.gov

  11. Whole Genome and Transcriptome Sequencing of a B3 Thymoma

    PubMed Central

    Petrini, Iacopo; Rajan, Arun; Pham, Trung; Voeller, Donna; Davis, Sean; Gao, James; Wang, Yisong; Giaccone, Giuseppe

    2013-01-01

    Molecular pathology of thymomas is poorly understood. Genomic aberrations are frequently identified in tumors but no extensive sequencing has been reported in thymomas. Here we present the first comprehensive view of a B3 thymoma at whole genome and transcriptome levels. A 55-year-old Caucasian female underwent complete resection of a stage IVA B3 thymoma. RNA and DNA were extracted from a snap frozen tumor sample with a fraction of cancer cells over 80%. We performed array comparative genomic hybridization using Agilent platform, transcriptome sequencing using HiSeq 2000 (Illumina) and whole genome sequencing using Complete Genomics Inc platform. Whole genome sequencing determined, in tumor and normal, the sequence of both alleles in more than 95% of the reference genome (NCBI Build 37). Copy number (CN) aberrations were comparable with those previously described for B3 thymomas, with CN gain of chromosome 1q, 5, 7 and X and CN loss of 3p, 6, 11q42.2-qter and q13. One translocation t(11;X) was identified by whole genome sequencing and confirmed by PCR and Sanger sequencing. Ten single nucleotide variations (SNVs) and 2 insertion/deletions (INDELs) were identified; these mutations resulted in non-synonymous amino acid changes or affected splicing sites. The lack of common cancer-associated mutations in this patient suggests that thymomas may evolve through mechanisms distinctive from other tumor types, and supports the rationale for additional high-throughput sequencing screens to better understand the somatic genetic architecture of thymoma. PMID:23577124

  12. Patterns of somatic mutation in human cancer genomes

    PubMed Central

    Greenman, Christopher; Stephens, Philip; Smith, Raffaella; Dalgliesh, Gillian L.; Hunter, Christopher; Bignell, Graham; Davies, Helen; Teague, Jon; Butler, Adam; Stevens, Claire; Edkins, Sarah; O'Meara, Sarah; Vastrik, Imre; Schmidt, Esther E.; Avis, Tim; Barthorpe, Syd; Bhamra, Gurpreet; Buck, Gemma; Choudhury, Bhudipa; Clements, Jody; Cole, Jennifer; Dicks, Ed; Forbes, Simon; Gray, Kris; Halliday, Kelly; Harrison, Rachel; Hills, Katy; Hinton, Jon; Jenkinson, Andy; Jones, David; Menzies, Andy; Mironenko, Tatiana; Perry, Janet; Raine, Keiran; Richardson, Dave; Shepherd, Rebecca; Small, Alexandra; Tofts, Calli; Varian, Jennifer; Webb, Tony; West, Sofie; Widaa, Sara; Yates, Andy; Cahill, Daniel P.; Louis, David N.; Goldstraw, Peter; Nicholson, Andrew G.; Brasseur, Francis; Looijenga, Leendert; Weber, Barbara L.; Chiew, Yoke-Eng; deFazio, Anna; Greaves, Mel F.; Green, Anthony R.; Campbell, Peter; Birney, Ewan; Easton, Douglas F.; Chenevix-Trench, Georgia; Tan, Min-Han; Khoo, Sok Kean; Teh, Bin Tean; Yuen, Siu Tsan; Leung, Suet Yi; Wooster, Richard; Futreal, P. Andrew; Stratton, Michael R.

    2009-01-01

    Cancers arise owing to mutations in a subset of genes that confer growth advantage. The availability of the human genome sequence led us to propose that systematic resequencing of cancer genomes for mutations would lead to the discovery of many additional cancer genes. Here we report more than 1,000 somatic mutations found in 274 megabases (Mb) of DNA corresponding to the coding exons of 518 protein kinase genes in 210 diverse human cancers. There was substantial variation in the number and pattern of mutations in individual cancers reflecting different exposures, DNA repair defects and cellular origins. Most somatic mutations are likely to be ‘passengers’ that do not contribute to oncogenesis. However, there was evidence for ‘driver’ mutations contributing to the development of the cancers studied in approximately 120 genes. Systematic sequencing of cancer genomes therefore reveals the evolutionary diversity of cancers and implicates a larger repertoire of cancer genes than previously anticipated. PMID:17344846

  13. Sequencing a Genome by Walking With Clone-end Sequences

    E-print Network

    Sequencing a Genome by Walking With Clone-end Sequences: A Mathematical Analysis Serafim Batzoglou-insert clones (such as bacterial artificial chromosomes (BACs)) and then (ii) to take successive 'walking' steps by selecting and sequencing minimally overlapping clones, using information such as clone-end sequences

  14. Genomics at the Ontario Institute for Cancer Research

    SciTech Connect

    Ali, Johar

    2010-06-02

    Johar Ali of the Ontario Institute for Cancer Research discusses genomics and next-gen applications at the OICR on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM

  15. Towards a reference pecan genome sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The cost of generating DNA sequence data has declined dramatically over the previous 15 years as a result of the Human Genome Project and the potential applications of genome sequencing for human medicine. This cost reduction has generated renewed interest among crop breeding scientists in applying...

  16. Complete genome sequence of ‘Candidatus Liberibacter africanus’

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complete genome sequence of ‘Candidatus Liberibacter africanus’ (Laf), strain ptsapsy, was obtained by an Illumina HiSeq 2000. The Laf genome comprises 1,192,232 nucleotides, 34.5% GC content, 1,141 predicted coding sequences, 44 tRNAs, 3 complete copies of ribosomal RNA genes (16S, 23S and 5S) ...

  17. Human Genome Sequencing in Health and Disease

    PubMed Central

    Gonzaga-Jauregui, Claudia; Lupski, James R.; Gibbs, Richard A.

    2013-01-01

    Following the “finished,” euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges. PMID:22248320

  18. Genomic Resources for Cancer Epidemiology

    Cancer.gov

    This page links to research resources compiled by the Epidemiology and Genomics Research Program that may be of interest to genetic epidemiologists conducting cancer research; the list is not exhaustive.

  19. Twenty years of bacterial genome sequencing.

    PubMed

    Loman, Nicholas J; Pallen, Mark J

    2015-12-01

    Twenty years ago, the publication of the first bacterial genome sequence, from Haemophilus influenzae, shook the world of bacteriology. In this Timeline, we review the first two decades of bacterial genome sequencing, which have been marked by three revolutions: whole-genome shotgun sequencing, high-throughput sequencing and single-molecule long-read sequencing. We summarize the social history of sequencing and its impact on our understanding of the biology, diversity and evolution of bacteria, while also highlighting spin-offs and translational impact in the clinic. We look forward to a 'sequencing singularity', where sequencing becomes the method of choice for as-yet unthinkable applications in bacteriology and beyond. PMID:26548914

  20. Cancer Genome Anatomy Project | Office of Cancer Genomics

    Cancer.gov

    The National Cancer Institute (NCI) Cancer Genome Anatomy Project (CGAP) is an online resource designed to provide the research community access to biological tissue characterization data. Request a free copy of the CGAP Website Virtual Tour CD from ocg@mail.nih.gov.

  1. Genomic alterations in pancreatic cancer and their relevance to therapy

    PubMed Central

    Takai, Erina; Yachida, Shinichi

    2015-01-01

    Pancreatic cancer is a highly lethal cancer type, for which there are few viable therapeutic options. But, with the advance of sequencing technologies for global genomic analysis, the landscape of genomic alterations in pancreatic cancer is becoming increasingly well understood. In this review, we summarize current knowledge of genomic alterations in 12 core signaling pathways or cellular processes in pancreatic ductal adenocarcinoma, which is the most common type of malignancy in the pancreas, including four commonly mutated genes and many other genes that are mutated at low frequencies. We also describe the potential implications of these genomic alterations for development of novel therapeutic approaches in the context of personalized medicine. PMID:26483879

  2. Genomic sequencing of Pleistocene cave bears

    SciTech Connect

    Noonan, James P.; Hofreiter, Michael; Smith, Doug; Priest, James R.; Rohland, Nadin; Rabeder, Gernot; Krause, Johannes; Detter, J. Chris; Paabo, Svante; Rubin, Edward M.

    2005-04-01

    Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of approximately 1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.

  3. The genome sequence of Drosophila melanogaster.

    SciTech Connect

    2000-03-24

    The fly Drosophila melanogaster is one of the most intensively studied organisms in biology and serves as a model system for the investigation of many developmental and cellular processes common to higher eukaryotes, including humans. We have determined the nucleotide sequence of nearly all of the approximately 120-megabase euchromatic portion of the Drosophila genome using a whole-genome shotgun sequencing strategy supported by extensive clone-based sequence and a high-quality bacterial artificial chromosome physical map. Efforts are under way to close the remaining gaps; however, the sequence is of sufficient accuracy and contiguity to be declared substantially complete and to support an initial analysis of genome structure and preliminary gene annotation and interpretation. The genome encodes approximately 13,600 genes, somewhat fewer than the smaller Caenorhabditis elegans genome, but with comparable functional diversity.

  4. A Million Cancer Genome Warehouse David Haussler

    E-print Network

    McAuliffe, Jon

    Architecture Performance Demands for a Million Cancer Genome Warehouse Production Workload Clinical Trial Research and other Special Studies Ad Hoc Research Software Principles for a Million Cancer Genome/Management Statistics/Accuracy Demands for a Million Cancer Genome Warehouse Use of a Reference Genome Why Uncertain

  5. Research | Office of Cancer Genomics

    Cancer.gov

    Continuing advances in high-throughput genomic technologies and tools provide researchers an increasingly more detailed view of the genetic alterations found in cancers. CGCI researchers develop some of these emerging approaches and apply them towards the characterization of certain pediatric and adult cancers.

  6. Genome sequence of Coxiella burnetii strain Namibia

    PubMed Central

    2014-01-01

    We present the whole genome sequence and annotation of the Coxiella burnetii strain Namibia. This strain was isolated from an aborting goat in 1991 in Windhoek, Namibia. The plasmid type QpRS was confirmed in our work. Further genomic typing placed the strain into a unique genomic group. The genome sequence is 2,101,438 bp long and contains 1,979 protein-coding and 51 RNA genes, including one rRNA operon. To overcome the poor yield from cell culture systems, an additional DNA enrichment with whole genome amplification (WGA) methods was applied. We describe a bioinformatics pipeline for improved genome assembly including several filters with a special focus on WGA characteristics. PMID:25593636

  7. Genome sequence of Coxiella burnetii strain Namibia.

    PubMed

    Walter, Mathias C; Öhrman, Caroline; Myrtennäs, Kerstin; Sjödin, Andreas; Byström, Mona; Larsson, Pär; Macellaro, Anna; Forsman, Mats; Frangoulidis, Dimitrios

    2014-01-01

    We present the whole genome sequence and annotation of the Coxiella burnetii strain Namibia. This strain was isolated from an aborting goat in 1991 in Windhoek, Namibia. The plasmid type QpRS was confirmed in our work. Further genomic typing placed the strain into a unique genomic group. The genome sequence is 2,101,438 bp long and contains 1,979 protein-coding and 51 RNA genes, including one rRNA operon. To overcome the poor yield from cell culture systems, an additional DNA enrichment with whole genome amplification (WGA) methods was applied. We describe a bioinformatics pipeline for improved genome assembly including several filters with a special focus on WGA characteristics. PMID:25593636

  8. Streptococcal taxonomy based on genome sequence analyses

    PubMed Central

    2013-01-01

    The identification of the clinically relevant viridans streptococci group, at species level, is still problematic. The aim of this study was to extract taxonomic information from the complete genome sequences of 67 streptococci, comprising 19 species, by means of genomic analyses, multilocus sequence analysis (MLSA), average amino acid identity (AAI), genomic signatures, genome-to-genome distances (GGD) and codon usage bias. We then attempted to determine the usefulness of these genomic tools for species identification in streptococci. Our results showed that MLSA, AAI and GGD analyses are robust markers to identify streptococci at the species level, for instance, S. pneumoniae, S. mitis, and S. oralis. A Streptococcus species can be defined as a group of strains that share ≥ 95% DNA similarity in MLSA and AAI, and > 70% DNA identity in GGD. This approach allows an advanced understanding of bacterial diversity. PMID:24358875
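The species definition quoted in this abstract amounts to three threshold checks, sketched below. The function name `same_species` and the example values are hypothetical, and the sketch assumes the MLSA and AAI cut-off is inclusive (at least 95%) while the GGD cut-off is strict (more than 70%).

```python
def same_species(mlsa_similarity, aai, ggd_identity):
    """Apply the abstract's cut-offs; all inputs are percentages (0-100).

    Assumption: MLSA and AAI must be at least 95%, GGD strictly above 70%.
    """
    return mlsa_similarity >= 95.0 and aai >= 95.0 and ggd_identity > 70.0


# Made-up comparisons between pairs of strains.
print(same_species(97.2, 96.1, 82.5))  # all three criteria met
print(same_species(93.0, 94.0, 55.0))  # no criterion met
```

How the three percentages are computed (gene sets for MLSA, alignment settings for AAI, the GGD formula) is outside this sketch and would follow the cited study.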

  9. NIH researchers complete whole-exome sequencing of skin cancer

    Cancer.gov

    A team led by researchers at NIH is the first to systematically survey the landscape of the melanoma genome, the DNA code of the deadliest form of skin cancer. The researchers have made surprising new discoveries using whole-exome sequencing, an approach that decodes the 1-2 percent of the genome that contains protein-coding genes.

  10. Genome sequence and analysis of Lactobacillus helveticus

    PubMed Central

    Cremonesi, Paola; Chessa, Stefania; Castiglioni, Bianca

    2013-01-01

    The microbiological characterization of lactobacilli is historically well developed, but the genomic analysis is recent. Because of the widespread use of Lactobacillus helveticus in cheese technology, information concerning the heterogeneity in this species is accumulating rapidly. Recently, the genome of five L. helveticus strains was sequenced to completion and compared with other genomically characterized lactobacilli. The genomic analysis of the first sequenced strain, L. helveticus DPC 4571, isolated from cheese and selected for its characteristics of rapid lysis and high proteolytic activity, has revealed a plethora of genes with industrial potential including those responsible for key metabolic functions such as proteolysis, lipolysis, and cell lysis. These genes and their derived enzymes can facilitate the production of cheese and cheese derivatives with potential for use as ingredients in consumer foods. In addition, L. helveticus has the potential to produce peptides with a biological function, such as angiotensin converting enzyme (ACE) inhibitory activity, in fermented dairy products, demonstrating the therapeutic value of this species. A most intriguing feature of the genome of L. helveticus is the remarkable similarity in gene content with many intestinal lactobacilli. Comparative genomics has allowed the identification of key gene sets that facilitate a variety of lifestyles including adaptation to food matrices or the gastrointestinal tract. As genome sequence and functional genomic information continues to explode, key features of the genomes of L. helveticus strains continue to be discovered, answering many questions but also raising many new ones. PMID:23335916

  11. Recurrent somatic mutations in regulatory regions of human cancer genomes.

    PubMed

    Melton, Collin; Reuter, Jason A; Spacek, Damek V; Snyder, Michael

    2015-07-01

    Aberrant regulation of gene expression in cancer can promote survival and proliferation of cancer cells. Here we integrate whole-genome sequencing data from The Cancer Genome Atlas (TCGA) for 436 patients from 8 cancer subtypes with ENCODE and other regulatory annotations to identify point mutations in regulatory regions. We find evidence for positive selection of mutations in transcription factor binding sites, consistent with these sites regulating important cancer cell functions. Using a new method that adjusts for sample- and genomic locus-specific mutation rates, we identify recurrently mutated sites across individuals with cancer. Mutated regulatory sites include known sites in the TERT promoter and many new sites, including a subset in proximity to cancer-related genes. In reporter assays, two new sites display decreased enhancer activity upon mutation. These data demonstrate that many regulatory regions contain mutations under selective pressure and suggest a greater role for regulatory mutations in cancer than previously appreciated. PMID:26053494

  12. Sequencing and comparing whole mitochondrial genomes of animals

    SciTech Connect

    Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

    2005-04-22

    Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning, sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.

  13. Dana-Farber Cancer Institute | Office of Cancer Genomics

    Cancer.gov

    Functional Annotation of Cancer Genomes Principal Investigator: William C. Hahn, M.D., Ph.D. The comprehensive characterization of cancer genomes has and will continue to provide an increasingly complete catalog of genetic alterations in specific cancers. However, most epithelial cancers harbor hundreds of genetic alterations as a consequence of genomic instability. Therefore, the functional consequences of the majority of mutations remain unclear.

  14. [Molecular targeted therapy and genomic evolution of breast cancer].

    PubMed

    Sato, Fumiaki; Toi, Masakazu

    2015-08-01

    Owing to the development of next-generation sequencing (NGS), deep biological insights into breast cancer have been gained. Information on genomic mutations and rearrangements in patients' tumors is required to understand the mechanisms of resistance to molecular targeted drugs. To date, NGS analyses have illustrated not only base substitution patterns and lists of driver mutations and key rearrangements, but also the manner of tumor evolution. The breast cancer genome changes and evolves dynamically during the course of cancer development and treatment. Liquid biopsy is a non-invasive method for detecting the genomic evolution of breast cancer and is now under development for future application in clinical practice. In this article, the latest knowledge regarding the breast cancer genome, especially in terms of 'evolution', is summarized. PMID:26281691

  15. The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge

    PubMed Central

    Czerwińska, Patrycja; Wiznerowicz, Maciej

    2015-01-01

    The Cancer Genome Atlas (TCGA) is a publicly funded project that aims to catalogue and discover major cancer-causing genomic alterations to create a comprehensive “atlas” of cancer genomic profiles. So far, TCGA researchers have analysed large cohorts of over 30 human tumours through large-scale genome sequencing and integrated multi-dimensional analyses. Studies of individual cancer types, as well as comprehensive pan-cancer analyses, have extended current knowledge of tumorigenesis. A major goal of the project was to provide publicly available datasets to help improve diagnostic methods, treatment standards, and finally to prevent cancer. This review discusses the current status of TCGA Research Network structure, purpose, and achievements. PMID:25691825

  16. Pash: Efficient Genome-Scale Sequence Anchoring by Positional Hashing

    E-print Network

    Batzoglou, Serafim

    and Molecular Biophysics, 2 Bioinformatics Research Laboratory, 3 Human Genome Sequencing Center, Department of chimpanzee whole-genome shotgun sequencing reads onto the human genome. The results of these comparisons

  17. Thirteen Camellia chloroplast genome sequences determined by high-throughput sequencing: genome structure and phylogenetic relationships

    PubMed Central

    2014-01-01

    Background Camellia is an economically and phylogenetically important genus in the family Theaceae. Owing to frequent hybridization and polyploidization, it is taxonomically and phylogenetically ranked as one of the most challenging taxa in plants. Sequence comparisons of chloroplast (cp) genomes are of great interest to provide robust evidence for taxonomic studies, species identification and understanding mechanisms that underlie the evolution of the Camellia species. Results The eight complete cp genomes and five draft cp genome sequences of Camellia species were determined using Illumina sequencing technology via a combined strategy of de novo and reference-guided assembly. The Camellia cp genomes exhibited typical circular structure that was rather conserved in genomic structure and the synteny of gene order. Differences of repeat sequences, simple sequence repeats, indels and substitutions were further examined among five complete cp genomes, representing a wide phylogenetic diversity in the genus. A total of fifteen molecular markers were identified with more than 1.5% sequence divergence that may be useful for further phylogenetic analysis and species identification of Camellia. Our results showed that, rather than functional constraints, it is the regional constraints that strongly affect sequence evolution of the cp genomes. In a substantial improvement over prior studies, evolutionary relationships of the section Thea were determined on the basis of phylogenomic analyses of cp genome sequences. Conclusions Despite a high degree of conservation between the Camellia cp genomes, sequence variation among species could still be detected, representing a wide phylogenetic diversity in the genus. Furthermore, phylogenomic analysis was conducted using 18 complete cp genomes and 5 draft cp genome sequences of Camellia species. Our results support Chang’s taxonomical treatment that C. pubicosta may be classified into sect. Thea, and indicate that the taxonomical value of the number of ovaries should be reconsidered when classifying the Camellia species. The availability of these cp genomes provides valuable genetic information for accurately identifying species, clarifying taxonomy and reconstructing the phylogeny of the genus Camellia. PMID:25001059

  18. Using the Potato Genome Sequence! Robin Buell!

    E-print Network

    Douches, David S.

    Using the Potato Genome Sequence! Robin Buell! Michigan State University! Department of Plant; · RHPOTKEY BAC library (78000 clones; 9-10 g.e.) · Library clones fingerprinted Gen Sequencing? 10 Initial Strategy heterozygous clone (RH89-039-16) Contig assembly

  19. Complete genome sequence of trivittatus virus.

    PubMed

    Groseth, Allison; Vine, Veronica; Weisend, Carla; Ebihara, Hideki

    2015-10-01

    Trivittatus virus (family Bunyaviridae, genus Orthobunyavirus) represents an important genetic intermediate between the California encephalitis group and the Bwamba/Pongola and Nyando groups. Here, we report the first complete genome sequence of the prototype (Eklund) strain, isolated in 1948, which, interestingly, shows only a few differences when compared to partial sequences of modern strains. PMID:26212363

  20. Genomics of Cancer and a New Era for Cancer Prevention

    PubMed Central

    Brennan, Paul; Wild, Christopher P.

    2015-01-01

    A primary justification for dedicating substantial amounts of research funding to large-scale cancer genomics projects of both somatic and germline DNA is that the biological insights will lead to new treatment targets and strategies for cancer therapy. While it is too early to judge the success of these projects in terms of clinical breakthroughs, an alternative rationale is that new genomics techniques can be used to reduce the overall burden of cancer by prevention of new cases occurring and also by detecting them earlier. In particular, it is now becoming apparent that studying the genomic profile of tumors can help to identify new carcinogens and may subsequently result in implementing strategies that limit exposure. In parallel, it may be feasible to utilize genomic biomarkers to identify cancers at an earlier and more treatable stage using screening or other early detection approaches based on prediagnostic biospecimens. While the potential for these techniques is large, their successful outcome will depend on international collaboration and planning similar to that of recent sequencing initiatives. PMID:26540230

  1. Cancer Genome Anatomy Project (CGAP) | Office of Cancer Genomics

    Cancer.gov

    CGAP generated a wide range of genomics data on cancerous cells that are accessible through easy-to-use online tools. Researchers, educators, and students can find "in silico" answers to biological questions through the CGAP website. Request a free copy of the CGAP Website Virtual Tour CD from ocg@mail.nih.gov to learn how to navigate the website.

  2. Final progress report, Construction of a genome-wide highly characterized clone resource for genome sequencing

    SciTech Connect

    Nierman, William C.

    2000-02-14

    At TIGR, human Bacterial Artificial Chromosome (BAC) end sequencing and trimming were performed with an overall sequencing success rate of 65%. CalTech human BAC libraries A, B, C and D as well as Roswell Park Cancer Institute's library RPCI-11 were used. To date, we have generated >300,000 end sequences from >186,000 human BAC clones with an average read length of ~460 bp for a total of 141 Mb covering ~4.7% of the genome. Over sixty percent of the clones have BAC end sequences (BESs) from both ends, representing over five-fold coverage of the genome by the paired-end clones. The average phred Q20 length is ~400 bp. This high accuracy makes our BESs match the human finished sequences with an average identity of 99% and a match length of 450 bp, and a frequency of one match per 12.8 kb contig sequence. Our sample tracking has ensured a clone tracking accuracy of >90%, which gives researchers high confidence in (1) retrieving the right clone from the BAC libraries based on the sequence matches; and (2) building a minimum tiling path of sequence-ready clones across the genome and genome assembly scaffolds.
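    The yield figures in this record can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 3,000 Mb human genome size is an assumption that the record does not state.

```python
# Back-of-the-envelope check of the BAC end sequence (BES) yield figures.
# ASSUMPTION: a haploid human genome size of ~3,000 Mb (not stated in the record).

def total_bases_mb(n_reads: int, mean_read_bp: float) -> float:
    """Total sequenced bases in megabases for n_reads of a given mean length."""
    return n_reads * mean_read_bp / 1e6


def genome_fraction(total_mb: float, genome_mb: float) -> float:
    """Fraction of the genome represented by total_mb of sequence (ignores overlap)."""
    if genome_mb <= 0:
        raise ValueError("genome_mb must be positive")
    return total_mb / genome_mb


if __name__ == "__main__":
    # The record reports ">300,000" reads, so 300,000 is a lower bound.
    lower_bound_mb = total_bases_mb(300_000, 460)
    print(f"lower bound on sequence generated: {lower_bound_mb:.0f} Mb")
    # Using the reported 141 Mb total and the assumed 3,000 Mb genome.
    print(f"fraction of genome: {genome_fraction(141, 3000) * 100:.1f}%")
```

    Under these assumptions, the lower bound of 300,000 reads at 460 bp gives 138 Mb, in line with the reported 141 Mb, and 141 Mb over 3,000 Mb reproduces the reported ~4.7%.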

  3. [Genome sequencing and personalized medicine: perspectives and limitations].

    PubMed

    Le Gall, Jean-Yves; Debré, Patrice

    2014-01-01

    DNA sequencing technologies have advanced at an exponential rate in recent years: the first human genome was sequenced in 2001 after many years of effort by dozens of international laboratories at a cost of tens of millions of dollars, while in 2013 a genome can be sequenced within 24 hours for a few hundred dollars (exome sequencing takes only a few hours). More and more hospital laboratories are acquiring new high-throughput sequencing devices ("next-generation sequencers", NGS), allowing them to analyze tens or hundreds of genes, or even the entire exome. This is having a major impact on medical concepts and practices, especially with respect to genetics and oncology. This ability to search for mutations simultaneously in a large number of genes is finding applications in the diagnosis of Mendelian diseases (including at birth), routine screening for heterozygotes, and pre-conception diagnosis. NGS is now sufficiently sensitive to analyze circulating fetal DNA in maternal blood (cell-free fetal DNA, cffDNA), enabling applications such as non invasive diagnosis of fetal sex (and X-linked diseases), fetal rhesus among rhesus-negative women, trisomy and, in the near future, Mendelian mutations. Data on multifactorial diseases are still preliminary, but it should soon be possible to identify "strong" factors of genetic predisposition that have so far been beyond the scope of genome-wide association studies (GWAS). In the field of constitutional oncogenetics, NGS can also be used for simultaneous analysis of genes involved in " hereditary " cancers (21 breast cancer genes, 6 colon cancer genes, etc.). 
    More generally, NGS can identify all genomic abnormalities (deletions, translocations, mutations) in a given malignant tissue (hemopathy or solid tumor), and has the potential to distinguish important mutations (those that drive tumor progression) from "bystander" or accessory mutations, and also to identify "druggable" mutations amenable to targeted therapies (e.g. imatinib and Bcr/Abl rearrangement; vemurafenib and the BRAF V600E mutation). Systematic sequencing of all the genes involved in drug metabolism and responsiveness will lead to individualized pharmacogenetics. Finally, sequencing of the tumoral and constitutional genomes, identification of somatic mutations, and detection of pharmacogenetic variants will open up the era of personalized medicine. The first results of these targeted therapeutic indications show a gain in the duration of remission and survival, although the cost-effectiveness of these approaches remains to be determined. Finally, this huge capacity for genome sequencing raises a number of regulatory and ethical issues. PMID:26259290

  4. Genome-wide analysis of noncoding regulatory mutations in cancer.

    PubMed

    Weinhold, Nils; Jacobsen, Anders; Schultz, Nikolaus; Sander, Chris; Lee, William

    2014-11-01

    Cancer primarily develops because of somatic alterations in the genome. Advances in sequencing have enabled large-scale sequencing studies across many tumor types, emphasizing the discovery of alterations in protein-coding genes. However, the protein-coding exome comprises less than 2% of the human genome. Here we analyze the complete genome sequences of 863 human tumors from The Cancer Genome Atlas and other sources to systematically identify noncoding regions that are recurrently mutated in cancer. We use new frequency- and sequence-based approaches to comprehensively scan the genome for noncoding mutations with potential regulatory impact. These methods identify recurrent mutations in regulatory elements upstream of PLEKHS1, WDR74 and SDHD, as well as previously identified mutations in the TERT promoter. SDHD promoter mutations are frequent in melanoma and are associated with reduced gene expression and poor prognosis. The non-protein-coding cancer genome remains widely unexplored, and our findings represent a step toward targeting the entire genome for clinical purposes. PMID:25261935

  5. Clinical tumor sequencing: opportunities and challenges for precision cancer medicine.

    PubMed

    Damodaran, Senthilkumar; Berger, Michael F; Roychowdhury, Sameek

    2015-01-01

    Advances in tumor genome sequencing have enabled discovery of actionable alterations leading to novel therapies. Currently, there are approved targeted therapies across various tumors that can be matched to genomic alterations, such as point mutations, gene amplification, and translocations. Tools to detect these genomic alterations have emerged as a result of decreasing costs and improved throughput enabled by next-generation sequencing (NGS) technologies. NGS has been successfully utilized for developing biomarkers to assess susceptibility, diagnosis, prognosis, and treatment of cancers. However, clinical application presents some potential challenges in terms of tumor specimen acquisition, analysis, privacy, interpretation, and drug development in rare cancer subsets. Although whole-genome sequencing offers the most complete strategy for tumor analysis, its present utility in clinical care is limited. Consequently, targeted gene capture panels are more commonly employed by academic institutions and commercial vendors for clinical grade cancer genomic testing to assess molecular eligibility for matching therapies, whereas whole-exome and transcriptome (RNASeq) sequencing are being utilized for discovery research. This review discusses the strategies, clinical challenges, and opportunities associated with the application of cancer genomic testing for precision cancer medicine. PMID:25993170

  6. Melanoma genome sequencing reveals frequent PREX2 mutations

    PubMed Central

    Berger, Michael F.; Hodis, Eran; Heffernan, Timothy P.; Deribe, Yonathan Lissanu; Lawrence, Michael S.; Protopopov, Alexei; Ivanova, Elena; Watson, Ian R.; Nickerson, Elizabeth; Ghosh, Papia; Zhang, Hailei; Zeid, Rhamy; Ren, Xiaojia; Cibulskis, Kristian; Sivachenko, Andrey Y.; Wagle, Nikhil; Sucker, Antje; Sougnez, Carrie; Onofrio, Robert; Ambrogio, Lauren; Auclair, Daniel; Fennell, Timothy; Carter, Scott L.; Drier, Yotam; Stojanov, Petar; Singer, Meredith A.; Voet, Douglas; Jing, Rui; Saksena, Gordon; Barretina, Jordi; Ramos, Alex H.; Pugh, Trevor J.; Stransky, Nicolas; Parkin, Melissa; Winckler, Wendy; Mahan, Scott; Ardlie, Kristin; Baldwin, Jennifer; Wargo, Jennifer; Schadendorf, Dirk; Meyerson, Matthew; Gabriel, Stacey B.; Golub, Todd R.; Wagner, Stephan N.; Lander, Eric S.; Getz, Gad; Chin, Lynda; Garraway, Levi A.

    2012-01-01

    Melanoma is notable for its metastatic propensity, lethality in the advanced setting, and association with ultraviolet (UV) exposure early in life. To obtain a comprehensive genomic view of melanoma, we sequenced the genomes of 25 metastatic melanomas and matched germline DNA. A wide range of point mutation rates was observed: lowest in melanomas whose primaries arose on non-UV exposed hairless skin of the extremities (3 and 14 per Mb genome), intermediate in those originating from hair-bearing skin of the trunk (range = 5 to 55 per Mb), and highest in a patient with a documented history of chronic sun exposure (111 per Mb). Analysis of whole-genome sequence data identified PREX2 - a PTEN-interacting protein and negative regulator of PTEN in breast cancer - as a significantly mutated gene with a mutation frequency of approximately 14% in an independent extension cohort of 107 human melanomas. PREX2 mutations are biologically relevant, as ectopic expression of mutant PREX2 accelerated tumor formation of immortalized human melanocytes in vivo. Thus, whole-genome sequencing of human melanoma tumors revealed genomic evidence of UV pathogenesis and discovered a new recurrently mutated gene in melanoma. PMID:22622578
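    The "per Mb" figures in this record are mutation counts normalized by the amount of genome surveyed. A minimal sketch of that normalization follows. The function name and the example inputs are hypothetical and are not taken from the study.

```python
# Point mutation rate expressed as mutations per megabase of callable sequence.
# ASSUMPTION: the counts below are invented for illustration only.

def mutations_per_mb(n_mutations: int, callable_bases: int) -> float:
    """Return somatic point mutations per megabase of callable genome."""
    if callable_bases <= 0:
        raise ValueError("callable_bases must be positive")
    return n_mutations / (callable_bases / 1e6)


if __name__ == "__main__":
    # Hypothetical tumor: 150,000 somatic point mutations over 2.7 Gb callable.
    rate = mutations_per_mb(150_000, 2_700_000_000)
    print(f"{rate:.1f} mutations per Mb")
```

    Normalizing by callable bases rather than by total genome length keeps rates comparable across samples with different sequencing coverage.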

  7. Rhipicephalus (Boophilus) microplus strain Deutsch, whole genome shotgun sequencing project first submission of genome sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The size and repetitive nature of the Rhipicephalus microplus genome makes obtaining a full genome sequence difficult. Cot filtration/selection techniques were used to reduce the repetitive fraction of the tick genome and enrich for the fraction of DNA with gene-containing regions. The Cot-selected ...

  8. Genomic Instability in Cancer

    PubMed Central

    Abbas, Tarek; Keaton, Mignon A.; Dutta, Anindya

    2013-01-01

    One of the fundamental challenges facing the cell is to accurately copy its genetic material to daughter cells. When this process goes awry, genomic instability ensues in which genetic alterations ranging from nucleotide changes to chromosomal translocations and aneuploidy occur. Organisms have developed multiple mechanisms that can be classified into two major classes to ensure the fidelity of DNA replication. The first class includes mechanisms that prevent premature initiation of DNA replication and ensure that the genome is fully replicated once and only once during each division cycle. These include cyclin-dependent kinase (CDK)-dependent mechanisms and CDK-independent mechanisms. Although CDK-dependent mechanisms are largely conserved in eukaryotes, higher eukaryotes have evolved additional mechanisms that seem to play a larger role in preventing aberrant DNA replication and genome instability. The second class ensures that cells are able to respond to various cues that continuously threaten the integrity of the genome by initiating DNA-damage-dependent “checkpoints” and coordinating DNA damage repair mechanisms. Defects in the ability to safeguard against aberrant DNA replication and to respond to DNA damage contribute to genomic instability and the development of human malignancy. In this article, we summarize our current knowledge of how genomic instability arises, with a particular emphasis on how the DNA replication process can give rise to such instability. PMID:23335075

  9. | Office of Cancer Genomics

    Cancer.gov

    The advent of highly active anti-retroviral therapy (HAART) has considerably slowed disease progression from HIV to full-blown AIDS, thereby increasing the number of people living with HIV. It is not known why the incidence of certain cancers, but not others, increases in patients with HIV infection. Among the cancers with elevated prevalence are aggressive B-cell Non-Hodgkin lymphoma (NHL) and late-stage lung cancer.

  10. Genome Sequence of the Palaeopolyploid soybean

    SciTech Connect

    Schmutz, Jeremy; Cannon, Steven B.; Schlueter, Jessica; Ma, Jianxin; Mitros, Therese; Nelson, William; Hyten, David L.; Song, Qijian; Thelen, Jay J.; Cheng, Jianlin; Xu, Dong; Hellsten, Uffe; May, Gregory D.; Yu, Yeisoo; Sakura, Tetsuya; Umezawa, Taishi; Bhattacharyya, Madan K.; Sandhu, Devinder; Valliyodan, Babu; Lindquist, Erika; Peto, Myron; Grant, David; Shu, Shengqiang; Goodstein, David; Barry, Kerrie; Futrell-Griggs, Montona; Abernathy, Brian; Du, Jianchang; Tian, Zhixi; Zhu, Liucun; Gill, Navdeep; Joshi, Trupti; Libault, Marc; Sethuraman, Anand; Zhang, Xue-Cheng; Shinozaki, Kazuo; Nguyen, Henry T.; Wing, Rod A.; Cregan, Perry; Specht, James; Grimwood, Jane; Rokhsar, Dan; Stacey, Gary; Shoemaker, Randy C.; Jackson, Scott A.

    2009-08-03

    Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70% more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78% of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75% of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.

  11. Overview | Office of Cancer Genomics

    Cancer.gov

    The Cancer Target Discovery and Development (CTD2) initiative is a collaborative network of OCG-supported entities, or Centers. The program strives to functionally validate discoveries from large-scale genomic initiatives and advance them toward precision medicine through the efforts of the Centers and open access data sharing.

  12. Using comparative genomics to reorder the human genome sequence into a virtual sheep genome

    PubMed Central

    Dalrymple, Brian P; Kirkness, Ewen F; Nefedov, Mikhail; McWilliam, Sean; Ratnakumar, Abhirami; Barris, Wes; Zhao, Shaying; Shetty, Jyoti; Maddox, Jillian F; O'Grady, Margaret; Nicholas, Frank; Crawford, Allan M; Smith, Tim; de Jong, Pieter J; McEwan, John; Oddy, V Hutton; Cockett, Noelle E

    2007-01-01

    Background Is it possible to construct an accurate and detailed subgene-level map of a genome using bacterial artificial chromosome (BAC) end sequences, a sparse marker map, and the sequences of other genomes? Results A sheep BAC library, CHORI-243, was constructed and the BAC end sequences were determined and mapped with high sensitivity and low specificity onto the frameworks of the human, dog, and cow genomes. To maximize genome coverage, the coordinates of all BAC end sequence hits to the cow and dog genomes were also converted to the equivalent human genome coordinates. The 84,624 sheep BACs (about 5.4-fold genome coverage) with paired ends in the correct orientation (tail-to-tail) and spacing, combined with information from sheep BAC comparative genome contigs (CGCs) built separately on the dog and cow genomes, were used to construct 1,172 sheep BAC-CGCs, covering 91.2% of the human genome. Clustered non-tail-to-tail and outsize BACs located close to the ends of many BAC-CGCs linked BAC-CGCs covering about 70% of the genome to at least one other BAC-CGC on the same chromosome. Using the BAC-CGCs, the intrachromosomal and interchromosomal BAC-CGC linkage information, human/cow and vertebrate synteny, and the sheep marker map, a virtual sheep genome was constructed. To identify BACs potentially located in gaps between BAC-CGCs, an additional set of 55,668 sheep BACs were positioned on the sheep genome with lower confidence. A coordinate conversion process allowed us to transfer human genes and other genome features to the virtual sheep genome to display on a sheep genome browser. Conclusion We demonstrate that limited sequencing of BACs combined with positioning on a well assembled genome and integrating locations from other less well assembled genomes can yield extensive, detailed subgene-level maps of mammalian genomes, for which genomic resources are currently limited. PMID:17663790

  13. Sequencing and analysis of a genomic fragment provide an insight into the Dunaliella viridis genomic sequence.

    PubMed

    Sun, Xiao-Ming; Tang, Yuan-Ping; Meng, Xiang-Zong; Zhang, Wen-Wen; Li, Shan; Deng, Zhi-Rui; Xu, Zheng-Kai; Song, Ren-Tao

    2006-11-01

    Dunaliella is a genus of wall-less unicellular eukaryotic green algae. Its exceptional resistance to salt and various other stresses has made it an ideal model for stress tolerance studies. However, very little is known about its genome and genomic sequences. In this study, we sequenced and analyzed a 29,268 bp genomic fragment from Dunaliella viridis. The fragment showed low sequence homology to sequences in the GenBank database. At the nucleotide level, only a segment with significant sequence homology to 18S rRNA was found. The fragment contained six putative genes, but only one gene showed significant homology at the protein level to the GenBank database. The average GC content of this sequence was 51.1%, which was much lower than that of the closely related green alga Chlamydomonas (65.7%). Significant segmental duplications were found within this fragment. The duplicated sequences accounted for about 35.7% of the entire region. Large amounts of simple sequence repeats (microsatellites) were found, with a strong bias towards the (AC)n type (76%). Analysis of other Dunaliella genomic sequences in the GenBank database (total 25,749 bp) was in agreement with these findings. These sequence features make Dunaliella genomic DNA difficult to sequence. Further investigation should be made to reveal the biological significance of these unique sequence features. PMID:17091199
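    Two of the sequence statistics reported in this record, GC content and simple sequence repeats, are straightforward to compute. The sketch below is not the authors' pipeline. The function names, the minimum of four repeat units, and the toy sequence are all assumptions made for illustration.

```python
import re

# Illustrative helpers for GC content and dinucleotide microsatellites.
# ASSUMPTION: a repeat is reported only if the 2-bp unit occurs at least
# `min_units` times in tandem; the record does not state its own threshold.

def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous A/C/G/T bases."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def find_dinucleotide_repeats(seq: str, min_units: int = 4):
    """Return (start, unit, n_units) for tandem dinucleotide repeats."""
    pattern = re.compile(r"([ACGT]{2})\1{%d,}" % (min_units - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if unit[0] == unit[1]:
            continue  # skip homopolymer runs such as AAAAAAAA
        hits.append((match.start(), unit, len(match.group(0)) // 2))
    return hits


if __name__ == "__main__":
    toy = "GGACACACACACTT"  # invented sequence, not from Dunaliella
    print(f"GC content: {gc_content(toy):.2f}")
    print(find_dinucleotide_repeats(toy))
```

    The share of (AC)n among all repeats found this way would correspond to the kind of bias the record reports, although the original study may have used different repeat criteria.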

  14. Subclonal diversification of primary breast cancer revealed by multiregion sequencing

    PubMed Central

    Yates, Lucy R; Gerstung, Moritz; Knappskog, Stian; Desmedt, Christine; Gundem, Gunes; Loo, Peter Van; Aas, Turid; Alexandrov, Ludmil B; Larsimont, Denis; Davies, Helen; Li, Yilong; Ju, Young Seok; Ramakrishna, Manasa; Haugland, Hans Kristian; Lilleng, Peer Kaare; Nik-Zainal, Serena; McLaren, Stuart; Butler, Adam; Martin, Sancha; Glodzik, Dominic; Menzies, Andrew; Raine, Keiran; Hinton, Jonathan; Jones, David; Mudie, Laura J; Jiang, Bing; Vincent, Delphine; Greene-Colozzi, April; Adnet, Pierre-Yves; Fatima, Aquila; Maetens, Marion; Ignatiadis, Michail; Stratton, Michael R; Sotiriou, Christos; Richardson, Andrea L; Lønning, Per Eystein; Wedge, David C; Campbell, Peter J

    2015-01-01

    Sequencing cancer genomes may enable tailoring of therapeutics to the underlying biological abnormalities driving a particular patient’s tumor. However, sequencing-based strategies rely heavily on representative sampling of tumors. To understand the subclonal structure of primary breast cancer, we applied whole genome and targeted sequencing to multiple samples from each of 50 patients’ tumors (total 303). The extent of subclonal diversification varied among cases and followed spatial patterns. No strict temporal order was evident, with point mutations and rearrangements affecting the most common breast cancer genes, including PIK3CA, TP53, PTEN, BRCA2 and MYC, occurring early in some tumors and late in others. In 13/50 cancers, potentially targetable mutations were subclonal. Landmarks of disease progression, such as resisting chemotherapy and acquiring invasive or metastatic potential, arose within detectable subclones of antecedent lesions. These findings highlight the importance of including analyses of subclonal structure and tumor evolution in clinical trials of primary breast cancer. PMID:26099045

  15. The genome sequence of Schizosaccharomyces pombe.

    PubMed

    Wood, V; Gwilliam, R; Rajandream, M-A; Lyne, M; Lyne, R; Stewart, A; Sgouros, J; Peat, N; Hayles, J; Baker, S; Basham, D; Bowman, S; Brooks, K; Brown, D; Brown, S; Chillingworth, T; Churcher, C; Collins, M; Connor, R; Cronin, A; Davis, P; Feltwell, T; Fraser, A; Gentles, S; Goble, A; Hamlin, N; Harris, D; Hidalgo, J; Hodgson, G; Holroyd, S; Hornsby, T; Howarth, S; Huckle, E J; Hunt, S; Jagels, K; James, K; Jones, L; Jones, M; Leather, S; McDonald, S; McLean, J; Mooney, P; Moule, S; Mungall, K; Murphy, L; Niblett, D; Odell, C; Oliver, K; O'Neil, S; Pearson, D; Quail, M A; Rabbinowitsch, E; Rutherford, K; Rutter, S; Saunders, D; Seeger, K; Sharp, S; Skelton, J; Simmonds, M; Squares, R; Squares, S; Stevens, K; Taylor, K; Taylor, R G; Tivey, A; Walsh, S; Warren, T; Whitehead, S; Woodward, J; Volckaert, G; Aert, R; Robben, J; Grymonprez, B; Weltjens, I; Vanstreels, E; Rieger, M; Schäfer, M; Müller-Auer, S; Gabel, C; Fuchs, M; Düsterhöft, A; Fritzc, C; Holzer, E; Moestl, D; Hilbert, H; Borzym, K; Langer, I; Beck, A; Lehrach, H; Reinhardt, R; Pohl, T M; Eger, P; Zimmermann, W; Wedler, H; Wambutt, R; Purnelle, B; Goffeau, A; Cadieu, E; Dréano, S; Gloux, S; Lelaure, V; Mottier, S; Galibert, F; Aves, S J; Xiang, Z; Hunt, C; Moore, K; Hurst, S M; Lucas, M; Rochet, M; Gaillardin, C; Tallada, V A; Garzon, A; Thode, G; Daga, R R; Cruzado, L; Jimenez, J; Sánchez, M; del Rey, F; Benito, J; Domínguez, A; Revuelta, J L; Moreno, S; Armstrong, J; Forsburg, S L; Cerutti, L; Lowe, T; McCombie, W R; Paulsen, I; Potashkin, J; Shpakovski, G V; Ussery, D; Barrell, B G; Nurse, P; Cerrutti, L

    2002-02-21

    We have sequenced and annotated the genome of fission yeast (Schizosaccharomyces pombe), which contains the smallest number of protein-coding genes yet recorded for a eukaryote: 4,824. The centromeres are between 35 and 110 kilobases (kb) and contain related repeats including a highly conserved 1.8-kb element. Regions upstream of genes are longer than in budding yeast (Saccharomyces cerevisiae), possibly reflecting more-extended control regions. Some 43% of the genes contain introns, of which there are 4,730. Fifty genes have significant similarity with human disease genes; half of these are cancer related. We identify highly conserved genes important for eukaryotic cell organization including those required for the cytoskeleton, compartmentation, cell-cycle control, proteolysis, protein phosphorylation and RNA splicing. These genes may have originated with the appearance of eukaryotic life. Few similarly conserved genes that are important for multicellular organization were identified, suggesting that the transition from prokaryotes to eukaryotes required more new genes than did the transition from unicellular to multicellular organization. PMID:11859360

  16. Finishing the euchromatic sequence of the human genome

    E-print Network

    Brutlag, Doug

    foundation for biomedical research in the decades ahead. The Human Genome Project (HGP) was launched in 1990 Finishing the euchromatic sequence of the human genome International Human Genome Sequencing ... The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich

  17. Genome sequence of the Brown Norway rat yields insights into

    E-print Network

    Payseur, Bret

    Genome sequence of the Brown Norway rat yields insights into mammalian evolution Rat Genome, having made inestimable contributions to human health. We report here the genome sequence of the Brown Norway (BN) rat strain. The sequence represents a high-quality `draft' covering over 90% of the genome

  18. Complete Genome Sequence Analysis of Bacillus subtilis T30.

    PubMed

    Xu, Shuang-Yong; Boitano, Matthew; Clark, Tyson A; Vincze, Tamas; Fomenkov, Alexey; Kumar, Sanjay; Too, Priscilla Hiu-Mei; Gonchar, Danila; Degtyarev, Sergey K; Roberts, Richard J

    2015-01-01

    The complete genome sequence of Bacillus subtilis T30 was determined by SMRT sequencing. The entire genome contains 4,138 predicted genes. The genome carries one intact prophage sequence (37.4 kb) similar to Bacillus phage SPBc2 and one incomplete prophage genome of 39.9 kb similar to Bacillus phage phi105. PMID:25953183

  19. Complete Genome Sequence Analysis of Bacillus subtilis T30

    PubMed Central

    Boitano, Matthew; Clark, Tyson A.; Vincze, Tamas; Fomenkov, Alexey; Kumar, Sanjay; Too, Priscilla Hiu-Mei; Gonchar, Danila; Degtyarev, Sergey K.

    2015-01-01

    The complete genome sequence of Bacillus subtilis T30 was determined by SMRT sequencing. The entire genome contains 4,138 predicted genes. The genome carries one intact prophage sequence (37.4 kb) similar to Bacillus phage SPBc2 and one incomplete prophage genome of 39.9 kb similar to Bacillus phage phi105. PMID:25953183

  20. An International Plan to Sequence the Onion Genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The cost of DNA sequencing continues to decline and, in the near future, it will become reasonable to undertake sequencing of the enormous nuclear genome of onion. We undertook sequencing of expressed and genomic regions of the onion genome to learn about the structure of the onion genome, as well a...

  1. Microbial species delineation using whole genome sequences

    PubMed Central

    Varghese, Neha J.; Mukherjee, Supratim; Ivanova, Natalia; Konstantinidis, Konstantinos T.; Mavrommatis, Kostas; Kyrpides, Nikos C.; Pati, Amrita

    2015-01-01

    Increased sequencing of microbial genomes has revealed that prevailing prokaryotic species assignments can be inconsistent with whole genome information for a significant number of species. The long-standing need for a systematic and scalable species assignment technique can be met by the genome-wide Average Nucleotide Identity (gANI) metric, which is widely acknowledged as a robust measure of genomic relatedness. In this work, we demonstrate that the combination of gANI and the alignment fraction (AF) between two genomes accurately reflects their genomic relatedness. We introduce an efficient implementation of AF,gANI and discuss its successful application to 86.5M genome pairs between 13,151 prokaryotic genomes assigned to 3032 species. Subsequently, by comparing the genome clusters obtained from complete linkage clustering of these pairs to existing taxonomy, we observed that nearly 18% of all prokaryotic species suffer from anomalies in species definition. Our results can be used to explore central questions such as whether microorganisms form a continuum of genetic diversity or distinct species represented by distinct genetic signatures. We propose that this precise and objective AF,gANI-based species definition: the MiSI (Microbial Species Identifier) method, be used to address previous inconsistencies in species classification and as the primary guide for new taxonomic species assignment, supplemented by the traditional polyphasic approach, as required. PMID:26150420

  2. Mapping whole genome shotgun sequence and variant calling in mammalian species without their reference genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genomics research in mammals has produced reference genome sequences that are essential for identifying variation associated with disease. High quality reference genome sequences are now available for humans, model species, and economically important agricultural animals. Comparisons between these s...

  3. Genome Sequence of Phytophthora ramorum: Implications for Management

    E-print Network

    Genome Sequence of Phytophthora ramorum: Implications for Management. Brett Tyler, Sucheta ... A draft genome sequence has been determined for Phytophthora ramorum, together with a draft sequence of the soybean pathogen Phytophthora sojae. The P. ramorum genome was sequenced to a depth of 7-fold coverage

  4. International network of cancer genome projects

    PubMed Central

    2010-01-01

    The International Cancer Genome Consortium (ICGC) was launched to coordinate large-scale cancer genome studies in tumors from 50 different cancer types and/or subtypes that are of clinical and societal importance across the globe. Systematic studies of over 25,000 cancer genomes at the genomic, epigenomic, and transcriptomic levels will reveal the repertoire of oncogenic mutations, uncover traces of the mutagenic influences, define clinically-relevant subtypes for prognosis and therapeutic management, and enable the development of new cancer therapies. PMID:20393554

  5. Mapping and sequencing the human genome

    SciTech Connect

    1988-01-01

    Numerous meetings have been held and a debate has developed in the biological community over the merits of mapping and sequencing the human genome. In response, a committee to examine the desirability and feasibility of mapping and sequencing the human genome was formed to suggest options for implementing the project. The committee asked many questions. Should the analysis of the human genome be left entirely to the traditionally uncoordinated, but highly successful, support systems that fund the vast majority of biomedical research? Or should a more focused and coordinated additional support system be developed that is limited to encouraging and facilitating the mapping and eventual sequencing of the human genome? If so, how can this be done without distorting the broader goals of biological research that are crucial for any understanding of the data generated in such a human genome project? As the committee became better informed on the many relevant issues, the opinions of its members coalesced, producing a shared consensus of what should be done. This report reflects that consensus.

  6. AACR 2015: Pan-Cancer Analysis of Whole Genomes

    Cancer.gov

    The Pan-Cancer analysis of Whole Genomes (PCAWG) project of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA) is co-ordinating analysis of more than 2,000 whole cancer genomes. Each genome is characterized through a suite of centralized algorithms, including alignment to the reference genome, standardized quality assessment and calling of all classes of somatic mutation.

  7. Genomic Sequencing George M. Church, Walter Gilbert

    E-print Network

    Church, George M.

    at deoxycytidines, and nucleic acid-protein interactions at single nucleotide resolution. How can we visualize ... ABSTRACT Unique DNA sequences can be determined directly from mouse genomic DNA. A denaturing gel separates

  8. Complete Genome Sequence of Treponema pallidum, the

    E-print Network

    Salzberg, Steven

    Complete Genome Sequence of Treponema pallidum, the Syphilis Spirochete Claire M. Fraser,* Steven J and substantiates the considerable diversity observed among pathogenic spirochetes. Venereal syphilis was first century with the age of exploration. Syphilis was ubiquitous by the 19th century and has been called

  9. Genomes and evolution From sequence to organism

    E-print Network

    Patel, Nipam H.

    Genomes and evolution From sequence to organism Editorial overview Evan E Eichler and Nipam H Patel. Nipam H Patel Depts. of Integrative Biology and Molecular Cell Biology, University of California, 3060 VLSB #3140, Berkeley, CA 94720-3140, USA e-mail: nipam@uclink.berkeley.edu URL: http

  10. Telomeric repeat-containing RNA/G-quadruplex-forming sequences cause genome-wide alteration of gene expression in human cancer cells in vivo

    PubMed Central

    Hirashima, Kyotaro; Seimiya, Hiroyuki

    2015-01-01

    Telomere erosion causes cell mortality, suggesting that longer telomeres enable more cell divisions. In telomerase-positive human cancer cells, however, telomeres are often kept shorter than those of surrounding normal tissues. Recently, we showed that cancer cell telomere elongation represses innate immune genes and promotes their differentiation in vivo. This implies that short telomeres contribute to cancer malignancy, but it is unclear how such genetic repression is caused by elongated telomeres. Here, we report that telomeric repeat-containing RNA (TERRA) induces a genome-wide alteration of gene expression in telomere-elongated cancer cells. Using three different cell lines, we found that telomere elongation up-regulates TERRA signal and down-regulates innate immune genes such as STAT1, ISG15 and OAS3 in vivo. Ectopic TERRA oligonucleotides repressed these genes even in cells with short telomeres under three-dimensional culture conditions. This appeared to occur from the action of G-quadruplexes (G4) in TERRA, because control oligonucleotides had no effect and a nontelomeric G4-forming oligonucleotide phenocopied the TERRA oligonucleotide. Telomere elongation and G4-forming oligonucleotides showed similar gene expression signatures. Most of the commonly suppressed genes were involved in the innate immune system and were up-regulated in various cancers. We propose that TERRA G4 counteracts cancer malignancy by suppressing innate immune genes. PMID:25653161

  11. Cancer Genomics: Diversity and Disparity Across Ethnicity and Geography.

    PubMed

    Tan, Daniel S W; Mok, Tony S K; Rebbeck, Timothy R

    2016-01-01

    Ethnic and geographic differences in cancer incidence, prognosis, and treatment outcomes can be attributed to diversity in the inherited (germline) and somatic genome. Although international large-scale sequencing efforts are beginning to unravel the genomic underpinnings of cancer traits, much remains to be known about the underlying mechanisms and determinants of genomic diversity. Carcinogenesis is a dynamic, complex phenomenon representing the interplay between genetic and environmental factors that results in divergent phenotypes across ethnicities and geography. For example, compared with whites, there is a higher incidence of prostate cancer among Africans and African Americans, and the disease is generally more aggressive and fatal. Genome-wide association studies have identified germline susceptibility loci that may account for differences between the African and non-African patients, but the lack of availability of appropriate cohorts for replication studies and the incomplete understanding of genomic architecture across populations pose major limitations. We further discuss the transformative potential of routine diagnostic evaluation for actionable somatic alterations, using lung cancer as an example, highlighting implications of population disparities, current hurdles in implementation, and the far-reaching potential of clinical genomics in enhancing cancer prevention, diagnosis, and treatment. As we enter the era of precision cancer medicine, a concerted multinational effort is key to addressing population and genomic diversity as well as overcoming barriers and geographical disparities in research and health care delivery. PMID:26578615

  12. Genome instability, cancer and aging

    PubMed Central

    Maslov, Alexander Y.; Vijg, Jan

    2015-01-01

    DNA damage-driven genome instability underlies the diversity of life forms generated by the evolutionary process but is detrimental to the somatic cells of individual organisms. The cellular response to DNA damage can be roughly divided into two parts. First, when damage is severe, programmed cell death may occur or, alternatively, temporary or permanent cell cycle arrest. This protects against cancer but can have negative effects in the long term, e.g., by depleting stem cell reservoirs. Second, damage can be repaired through one or more of the many sophisticated genome maintenance pathways. However, erroneous DNA repair and incomplete restoration of chromatin after damage is resolved produce mutations and epimutations, respectively, both of which have been shown to accumulate with age. An increased burden of mutations and/or epimutations in aged tissues increases cancer risk and adversely affects gene transcriptional regulation, leading to progressive decline in organ function. Cellular degeneration and uncontrolled cell proliferation are both major hallmarks of aging. Despite the fact that one seems to exclude the other, they both may be driven by a common mechanism. Here, we review age-related changes in the mammalian genome and their possible functional consequences, with special emphasis on genome instability in stem/progenitor cells. PMID:19344750

  13. Gambling on a shortcut to genome sequencing

    SciTech Connect

    Roberts, L.

    1991-06-21

    Almost from the start of the Human Genome Project, a debate has been raging over whether to sequence the entire human genome, all 3 billion bases, or just the genes - a mere 2% or 3% of the genome, and by far the most interesting part. In England, Sydney Brenner convinced the Medical Research Council (MRC) to start with the expressed genes, or complementary DNAs. But the US stance has been that the entire sequence is essential if we are to understand the blueprint of man. Craig Venter of the National Institute of Neurological Disorders and Stroke says that focusing on the expressed genes may be even more useful than expected. His strategy involves randomly selecting clones from cDNA libraries, which theoretically contain all the genes that are switched on at a particular time in a particular tissue. Then the researchers sequence just a short stretch of each clone, about 400 to 500 bases, to create an expressed sequence tag, or EST. The sequences of these ESTs are then stored in a database. Using that information, other researchers can then recreate that EST by using polymerase chain reaction techniques.

  14. Whole-genome sequencing in bacteriology: state of the art

    PubMed Central

    Dark, Michael J

    2013-01-01

    Over the last ten years, genome sequencing capabilities have expanded exponentially. There have been tremendous advances in sequencing technology, DNA sample preparation, genome assembly, and data analysis. This has led to advances in a number of facets of bacterial genomics, including metagenomics, clinical medicine, bacterial archaeology, and bacterial evolution. This review examines the strengths and weaknesses of techniques in bacterial genome sequencing, upcoming technologies, and assembly techniques, as well as highlighting recent studies that highlight new applications for bacterial genomics. PMID:24143115

  15. Whole-genome sequencing in bacteriology: state of the art.

    PubMed

    Dark, Michael J

    2013-01-01

    Over the last ten years, genome sequencing capabilities have expanded exponentially. There have been tremendous advances in sequencing technology, DNA sample preparation, genome assembly, and data analysis. This has led to advances in a number of facets of bacterial genomics, including metagenomics, clinical medicine, bacterial archaeology, and bacterial evolution. This review examines the strengths and weaknesses of techniques in bacterial genome sequencing, upcoming technologies, and assembly techniques, as well as highlighting recent studies that highlight new applications for bacterial genomics. PMID:24143115

  16. Draft Genome Sequence of Mycobacterium arupense Strain GUC1

    PubMed Central

    Greninger, Alexander L.; Cunningham, Gail; Yu, Joanna M.; Hsu, Elaine D.; Chiu, Charles Y.

    2015-01-01

    We report the draft genome sequence of Mycobacterium arupense strain GUC1 from a sputum sample of a patient with bronchiectasis. This is the first draft genome sequence of Mycobacterium arupense, a rapidly growing nonchromogenic mycobacterium. PMID:26067970

  17. Genome sequencing and analysis of the model grass Brachypodium distachyon

    E-print Network

    Green, Pamela

    ARTICLES Genome sequencing and analysis of the model grass Brachypodium distachyon The International Brachypodium Initiative* Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our

  18. Office of Cancer Genomics

    Cancer.gov

    Caused by infection with human immunodeficiency virus (HIV), acquired immunodeficiency syndrome (AIDS) is a complex and devastating disease brought about by the systematic destruction of a person's immune response. A weakened immune system can lead to a variety of opportunistic infections in affected persons, as well as a distinct spectrum of tumors known as AIDS-defining cancers. Some of these malignancies, such as Kaposi sarcoma, are also observed in other immunocompromised populations, while others are seen at increased rates only in AIDS patients.

  19. Genome Sequencing Reveals a Phage in Helicobacter pylori

    PubMed Central

    Lehours, Philippe; Vale, Filipa F.; Bjursell, Magnus K.; Melefors, Ojar; Advani, Reza; Glavas, Steve; Guegueniat, Julia; Gontier, Etienne; Lacomme, Sabrina; Alves Matos, António; Menard, Armelle; Mégraud, Francis; Engstrand, Lars; Andersson, Anders F.

    2011-01-01

    ABSTRACT Helicobacter pylori chronically infects the gastric mucosa in more than half of the human population; in a subset of this population, its presence is associated with development of severe disease, such as gastric cancer. Genomic analysis of several strains has revealed an extensive H. pylori pan-genome, likely to grow as more genomes are sampled. Here we describe the draft genome sequence (63 contigs; 26× mean coverage) of H. pylori strain B45, isolated from a patient with gastric mucosa-associated lymphoid tissue (MALT) lymphoma. The major finding was a 24.6-kb prophage integrated in the bacterial genome. The prophage shares most of its genes (22/27) with prophage region II of Helicobacter acinonychis strain Sheeba. After UV treatment of liquid cultures, circular DNA carrying the prophage integrase gene could be detected, and intracellular tailed phage-like particles were observed in H. pylori cells by transmission electron microscopy, indicating that phage production can be induced from the prophage. PCR amplification and sequencing of the integrase gene from 341 H. pylori strains from different geographic regions revealed a high prevalence of the prophage (21.4%). Phylogenetic reconstruction showed four distinct clusters in the integrase gene, three of which tended to be specific for geographic regions. Our study implies that phages may play important roles in the ecology and evolution of H. pylori. PMID:22086490

  20. Integrative clinical genomics of advanced prostate cancer.

    PubMed

    Robinson, Dan; Van Allen, Eliezer M; Wu, Yi-Mi; Schultz, Nikolaus; Lonigro, Robert J; Mosquera, Juan-Miguel; Montgomery, Bruce; Taplin, Mary-Ellen; Pritchard, Colin C; Attard, Gerhardt; Beltran, Himisha; Abida, Wassim; Bradley, Robert K; Vinson, Jake; Cao, Xuhong; Vats, Pankaj; Kunju, Lakshmi P; Hussain, Maha; Feng, Felix Y; Tomlins, Scott A; Cooney, Kathleen A; Smith, David C; Brennan, Christine; Siddiqui, Javed; Mehra, Rohit; Chen, Yu; Rathkopf, Dana E; Morris, Michael J; Solomon, Stephen B; Durack, Jeremy C; Reuter, Victor E; Gopalan, Anuradha; Gao, Jianjiong; Loda, Massimo; Lis, Rosina T; Bowden, Michaela; Balk, Stephen P; Gaviola, Glenn; Sougnez, Carrie; Gupta, Manaswi; Yu, Evan Y; Mostaghel, Elahe A; Cheng, Heather H; Mulcahy, Hyojeong; True, Lawrence D; Plymate, Stephen R; Dvinge, Heidi; Ferraldeschi, Roberta; Flohr, Penny; Miranda, Susana; Zafeiriou, Zafeiris; Tunariu, Nina; Mateo, Joaquin; Perez-Lopez, Raquel; Demichelis, Francesca; Robinson, Brian D; Schiffman, Marc; Nanus, David M; Tagawa, Scott T; Sigaras, Alexandros; Eng, Kenneth W; Elemento, Olivier; Sboner, Andrea; Heath, Elisabeth I; Scher, Howard I; Pienta, Kenneth J; Kantoff, Philip; de Bono, Johann S; Rubin, Mark A; Nelson, Peter S; Garraway, Levi A; Sawyers, Charles L; Chinnaiyan, Arul M

    2015-05-21

    Toward development of a precision medicine framework for metastatic, castration-resistant prostate cancer (mCRPC), we established a multi-institutional clinical sequencing infrastructure to conduct prospective whole-exome and transcriptome sequencing of bone or soft tissue tumor biopsies from a cohort of 150 mCRPC affected individuals. Aberrations of AR, ETS genes, TP53, and PTEN were frequent (40%-60% of cases), with TP53 and AR alterations enriched in mCRPC compared to primary prostate cancer. We identified new genomic alterations in PIK3CA/B, R-spondin, BRAF/RAF1, APC, β-catenin, and ZBTB16/PLZF. Moreover, aberrations of BRCA2, BRCA1, and ATM were observed at substantially higher frequencies (19.3% overall) compared to those in primary prostate cancers. 89% of affected individuals harbored a clinically actionable aberration, including 62.7% with aberrations in AR, 65% in other cancer-related genes, and 8% with actionable pathogenic germline alterations. This cohort study provides clinically actionable information that could impact treatment decisions for these affected individuals. PMID:26000489

  1. Letter to the Editor Toward Sequencing Cotton (Gossypium) Genomes

    E-print Network

    Chee, Peng W.

    Letter to the Editor Toward Sequencing Cotton (Gossypium) Genomes Despite rapidly decreasing costs complex genomes de novo. The cotton (Gossypium spp.) genomes represent a challenging case. To this end, a coalition of cotton genome scientists has developed a strategy for sequencing the cotton genomes, which

  2. Comparative Analysis of Genome Sequences with VISTA

    DOE Data Explorer

    Dubchak, Inna

    VISTA is a comprehensive suite of programs and databases developed by and hosted at the Genomics Division of Lawrence Berkeley National Laboratory. They provide information and tools designed to facilitate comparative analysis of genomic sequences. Users have two ways to interact with the suite of applications at the VISTA portal. They can submit their own sequences and alignments for analysis (VISTA servers) or examine pre-computed whole-genome alignments of different species. A key menu option is the Enhancer Browser and Database at http://enhancer.lbl.gov/. The VISTA Enhancer Browser is a central resource for experimentally validated human noncoding fragments with gene enhancer activity as assessed in transgenic mice. Most of these noncoding elements were selected for testing based on their extreme conservation with other vertebrates. The results of this enhancer screen are provided through this publicly available website. The browser also features relevant results by external contributors and a large collection of additional genome-wide conserved noncoding elements which are candidate enhancer sequences. The LBL developers invite external groups to submit computational predictions of developmental enhancers. As of 10/19/2009 the database contains information on 1109 in vivo tested elements - 508 elements with enhancer activity.

  3. Analysis of gains and losses of DNA sequences along all human chromosomes by comparative genomic hybridization implicates 6q and several other chromosomal sites as putative tumor suppressor gene loci in prostate cancer

    SciTech Connect

    Visakorpi, T.; Karhu, R.; Kallioniemi, A.

    1994-09-01

    Genetic changes associated with the development of prostate cancer are poorly known. We sought to identify regions that contain important genes for the development of prostate cancer by using comparative genomic hybridization (CGH) for genome-wide screening of gains and losses of DNA sequences. In CGH, differentially labeled tumor and normal DNAs are co-hybridized to normal metaphase spreads to visualize chromosomal regions with losses and gains of DNA sequences. Analysis of 31 uncultured primary prostate cancers showed that deletions predominated over gains with a ratio of 5:1. The most commonly deleted regions were 8p; 32% (minimal common region p12-pter), 13q; 32% (q21-q31), 6q; 22% (cen-q21), 16q; 19% (cen-q23), 18q; 19% (q22-qter) and 9p; 16% (p23-pter). Gain of the entire long arm of chromosome 8 was found in 6% of cases but no high-level amplifications were found in any of the specimens. Of the aberrations found by CGH, 6q represents a previously unreported, major site for deletion in prostate cancer. Analysis of loss of heterozygosity (LOH) was used to confirm the presence of 6q and other deletions found by CGH. LOH and CGH data showed about 75% concordance. The significance of genetic aberrations in prostate cancer is being evaluated by correlating CGH findings with clinical outcome as well as by comparing genetic changes observed in the primary tumor with those found in recurrent lesions and metastases of the same patient.

  4. CGCI Investigators Reveal Comprehensive Landscape of Diffuse Large B-Cell Lymphoma (DLBCL) Genomes | Office of Cancer Genomics

    Cancer.gov

    Researchers from British Columbia Cancer Agency used whole genome sequencing to analyze 40 DLBCL cases and 13 cell lines in order to fill in the gaps of the complex landscape of DLBCL genomes. Their analysis, “Mutational and structural analysis of diffuse large B-cell lymphoma using whole genome sequencing,” was published online in Blood on May 22. The authors are Ryan Morin, Marco Marra, and colleagues.

  6. Revealing the Complexity of Breast Cancer by Next Generation Sequencing.

    PubMed

    Verigos, John; Magklara, Angeliki

    2015-01-01

    Over the last few years the increasing usage of "-omic" platforms, supported by next-generation sequencing, in the analysis of breast cancer samples has tremendously advanced our understanding of the disease. New driver and passenger mutations, rare chromosomal rearrangements and other genomic aberrations identified by whole genome and exome sequencing are providing missing pieces of the genomic architecture of breast cancer. High resolution maps of breast cancer methylomes and sequencing of the miRNA microworld are beginning to paint the epigenomic landscape of the disease. Transcriptomic profiling is giving us a glimpse into the gene regulatory networks that govern the fate of the breast cancer cell. At the same time, integrative analysis of sequencing data confirms an extensive intertumor and intratumor heterogeneity and plasticity in breast cancer arguing for a new approach to the problem. In this review, we report on the latest findings on the molecular characterization of breast cancer using NGS technologies, and we discuss their potential implications for the improvement of existing therapies. PMID:26561834

  7. Cancer Vulnerabilities Unveiled by Genomic Loss

    E-print Network

    Nijhawan, Deepak

    Due to genome instability, most cancers exhibit loss of regions containing tumor suppressor genes and collateral loss of other genes. To identify cancer-specific vulnerabilities that are the result of copy number losses, ...

  8. Whole genome sequence analysis of Mycobacterium suricattae.

    PubMed

    Dippenaar, Anzaan; Parsons, Sven David Charles; Sampson, Samantha Leigh; van der Merwe, Ruben Gerhard; Drewe, Julian Ashley; Abdallah, Abdallah Musa; Siame, Kabengele Keith; Gey van Pittius, Nicolaas Claudius; van Helden, Paul David; Pain, Arnab; Warren, Robin Mark

    2015-12-01

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi. PMID:26542221

  9. Simple sequence repeats in prokaryotic genomes

    PubMed Central

    Mrázek, Jan; Guo, Xiangxue; Shah, Apurva

    2007-01-01

    Simple sequence repeats (SSRs) in DNA sequences are composed of tandem iterations of short oligonucleotides and may have functional and/or structural properties that distinguish them from general DNA sequences. They are variable in length because of slip-strand mutations and may also affect local structure of the DNA molecule or the encoded proteins. Long SSRs (LSSRs) are common in eukaryotes but rare in most prokaryotes. In pathogens, SSRs can enhance antigenic variance of the pathogen population in a strategy that counteracts the host immune response. We analyze representations of SSRs in >300 prokaryotic genomes and report significant differences among different prokaryotes as well as among different types of SSRs. LSSRs composed of short oligonucleotides (1–4 bp length, designated LSSR1–4) are often found in host-adapted pathogens with reduced genomes that are not known to readily survive in a natural environment outside the host. In contrast, LSSRs composed of longer oligonucleotides (5–11 bp length, designated LSSR5–11) are found mostly in nonpathogens and opportunistic pathogens with large genomes. Comparisons among SSRs of different lengths suggest that LSSR1–4 are likely maintained by selection. This is consistent with the established role of some LSSR1–4 in enhancing antigenic variance. By contrast, abundance of LSSR5–11 in some genomes may reflect the SSRs' general tendency to expand rather than their specific role in the organisms' physiology. Differences among genomes in terms of SSR representations and their possible interpretations are discussed. PMID:17485665
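    The definition in the abstract above (tandem iterations of a short oligonucleotide) can be illustrated with a small scan. This is only a rough sketch and not the authors' method: the function name `find_ssrs` and the `min_copies` cutoff are assumptions made for illustration, and the abstract does not state what threshold qualifies a repeat as a "long" SSR. The unit-length range 1 to 11 bp mirrors the LSSR1–4 and LSSR5–11 classes named in the abstract.

```python
import re

def find_ssrs(seq, min_unit=1, max_unit=11, min_copies=3):
    """Return (start, unit, copies) for each tandem repeat found.

    min_copies is a hypothetical cutoff chosen for this sketch; it is
    not taken from the paper.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        # A unit of length k followed by at least (min_copies - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves repeats of a shorter unit
            # (e.g. "ATAT"), since the shorter unit already reports them.
            if any(k % d == 0 and unit == unit[:d] * (k // d)
                   for d in range(1, k)):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // k))
    return hits

print(find_ssrs("GGATATATATCC", max_unit=4))
```

    Grouping the hits by unit length (1–4 versus 5–11) would give counts roughly comparable to the two classes the abstract contrasts, although the real analysis would also need a length threshold and a statistical baseline.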

  10. Whole genome sequencing of matched primary and metastatic acral melanomas

    PubMed Central

    Turajlic, Samra; Furney, Simon J.; Lambros, Maryou B.; Mitsopoulos, Costas; Kozarewa, Iwanka; Geyer, Felipe C.; MacKay, Alan; Hakas, Jarle; Zvelebil, Marketa; Lord, Christopher J.; Ashworth, Alan; Thomas, Meirion; Stamp, Gordon; Larkin, James; Reis-Filho, Jorge S.; Marais, Richard

    2012-01-01

    Next generation sequencing has enabled systematic discovery of mutational spectra in cancer samples. Here, we used whole genome sequencing to characterize somatic mutations and structural variation in a primary acral melanoma and its lymph node metastasis. Our data show that the somatic mutational rates in this acral melanoma sample pair were more comparable to the rates reported in cancer genomes not associated with mutagenic exposure than in the genome of a melanoma cell line or the transcriptome of melanoma short-term cultures. Despite the perception that acral skin is sun-protected, the dominant mutational signature in these samples is compatible with damage due to ultraviolet light exposure. A nonsense mutation in ERCC5 discovered in both the primary and metastatic tumors could also have contributed to the mutational signature through accumulation of unrepaired dipyrimidine lesions. However, evidence of transcription-coupled repair was suggested by the lower mutational rate in the transcribed regions and expressed genes. The primary and the metastasis are highly similar at the level of global gene copy number alterations, loss of heterozygosity and single nucleotide variation (SNV). Furthermore, the majority of the SNVs in the primary tumor were propagated in the metastasis and one nonsynonymous coding SNV and one splice site mutation appeared to arise de novo in the metastatic lesion. PMID:22183965

  11. Assessing the Costs and Cost-Effectiveness of Genomic Sequencing.

    PubMed

    Christensen, Kurt D; Dukhovny, Dmitry; Siebert, Uwe; Green, Robert C

    2015-01-01

    Despite dramatic drops in DNA sequencing costs, concerns are great that the integration of genomic sequencing into clinical settings will drastically increase health care expenditures. This commentary presents an overview of what is known about the costs and cost-effectiveness of genomic sequencing. We discuss the cost of germline genomic sequencing, addressing factors that have facilitated the decrease in sequencing costs to date and anticipating the factors that will drive sequencing costs in the future. We then address the cost-effectiveness of diagnostic and pharmacogenomic applications of genomic sequencing, with an emphasis on the implications for secondary findings disclosure and the integration of genomic sequencing into general patient care. Throughout, we ground the discussion by describing efforts in the MedSeq Project, an ongoing randomized controlled clinical trial, to understand the costs and cost-effectiveness of integrating whole genome sequencing into cardiology and primary care settings. PMID:26690481

  12. Complete Genome Sequences of 138 Mycobacteriophages

    PubMed Central

    2012-01-01

    Bacteriophages are the most numerous biological entities in the biosphere, and although their genetic diversity is high, it remains ill defined. Mycobacteriophages—the viruses of mycobacterial hosts—provide insights into this diversity as well as tools for manipulating Mycobacterium tuberculosis. We report here the complete genome sequences of 138 new mycobacteriophages, which—together with the 83 mycobacteriophages previously reported—represent the largest collection of phages known to infect a single common host, Mycobacterium smegmatis mc2 155. PMID:22282335

  13. Why Assembling Plant Genome Sequences Is So Challenging

    PubMed Central

    Claros, Manuel Gonzalo; Bautista, Rocío; Guerrero-Fernández, Darío; Benzerki, Hicham; Seoane, Pedro; Fernández-Pozo, Noé

    2012-01-01

    In spite of the biological and economic importance of plants, relatively few plant species have been sequenced. Only the genome sequence of plants with relatively small genomes, most of them angiosperms, in particular eudicots, has been determined. The arrival of next-generation sequencing technologies has allowed the rapid and efficient development of new genomic resources for non-model or orphan plant species. But the sequencing pace of plants is far from that of animals and microorganisms. This review focuses on the typical challenges of plant genomes that can explain why plant genomics is less developed than animal genomics. Explanations about the impact of some confounding factors emerging from the nature of plant genomes are given. As a result of these challenges and confounding factors, the correct assembly and annotation of plant genomes is hindered, genome drafts are produced, and advances in plant genomics are delayed. PMID:24832233

  14. The Cancer Genome Atlas Data Portal Now Available

    Cancer.gov

    Published on Office of Cancer Genomics (http://ocg.cancer.gov), October 01, 2007. We provide 3 ways to download data: The Cancer Genome Atlas

  15. The Cancer Genome Atlas Data Portal Now Available

    Cancer.gov

    Published on Office of Cancer Genomics (https://ocg.cancer.gov), October 01, 2007. We provide 3 ways to download data: The Cancer Genome Atlas

  16. Translational genomics in cancer research: converting profiles into personalized cancer medicine

    PubMed Central

    Patel, Lalit; Parker, Brittany; Yang, Da; Zhang, Wei

    2013-01-01

    Cancer genomics is a rapidly growing discipline in which the genetic molecular basis of malignancy is studied at the scale of whole genomes. While the discipline has been successful with respect to identifying specific oncogenes and tumor suppressors involved in oncogenesis, it is also challenging our approach to managing patients suffering from this deadly disease. Specifically cancer genomics is driving clinical oncology to take a more molecular approach to diagnosis, prognostication, and treatment selection. We review here recent work undertaken in cancer genomics with an emphasis on translation of genomic findings. Finally, we discuss scientific challenges and research opportunities emerging from findings derived through analysis of tumors with high-depth sequencing. PMID:24349831

  17. A decision support framework for genomically informed investigational cancer therapy.

    PubMed

    Meric-Bernstam, Funda; Johnson, Amber; Holla, Vijaykumar; Bailey, Ann Marie; Brusco, Lauren; Chen, Ken; Routbort, Mark; Patel, Keyur P; Zeng, Jia; Kopetz, Scott; Davies, Michael A; Piha-Paul, Sarina A; Hong, David S; Eterovic, Agda Karina; Tsimberidou, Apostolia M; Broaddus, Russell; Bernstam, Elmer V; Shaw, Kenna R; Mendelsohn, John; Mills, Gordon B

    2015-07-01

    Rapidly improving understanding of molecular oncology, emerging novel therapeutics, and increasingly available and affordable next-generation sequencing have created an opportunity for delivering genomically informed personalized cancer therapy. However, to implement genomically informed therapy requires that a clinician interpret the patient's molecular profile, including molecular characterization of the tumor and the patient's germline DNA. In this Commentary, we review existing data and tools for precision oncology and present a framework for reviewing the available biomedical literature on therapeutic implications of genomic alterations. Genomic alterations, including mutations, insertions/deletions, fusions, and copy number changes, need to be curated in terms of the likelihood that they alter the function of a "cancer gene" at the level of a specific variant in order to discriminate so-called "drivers" from "passengers." Alterations that are targetable either directly or indirectly with approved or investigational therapies are potentially "actionable." At this time, evidence linking predictive biomarkers to therapies is strong for only a few genomic markers in the context of specific cancer types. For these genomic alterations in other diseases and for other genomic alterations, the clinical data are either absent or insufficient to support routine clinical implementation of biomarker-based therapy. However, there is great interest in optimally matching patients to early-phase clinical trials. Thus, we need accessible, comprehensive, and frequently updated knowledge bases that describe genomic changes and their clinical implications, as well as continued education of clinicians and patients. PMID:25863335

  18. NIH Announces Two Integral Components of The Cancer Genome Atlas Pilot Project | Office of Cancer Genomics

    Cancer.gov

    The National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), both parts of the National Institutes of Health (NIH), today announced another two of the components of The Cancer Genome Atlas (TCGA) Pilot Project, a three-year, $100 million collaboration to test the feasibility of using large-scale genome analysis technologies to identify important genetic changes involved in cancer. Lung, brain (glioblastoma), and ovarian cancers have been chosen as the tumors for study by TCGA Pilot Project.

  19. A sequence-based survey of the complex structural organization of tumor genomes

    PubMed Central

    Raphael, Benjamin J; Volik, Stanislav; Yu, Peng; Wu, Chunxiao; Huang, Guiqing; Linardopoulou, Elena V; Trask, Barbara J; Waldman, Frederic; Costello, Joseph; Pienta, Kenneth J; Mills, Gordon B; Bajsarowicz, Krystyna; Kobayashi, Yasuko; Sridharan, Shivaranjani; Paris, Pamela L; Tao, Quanzhou; Aerni, Sarah J; Brown, Raymond P; Bashir, Ali; Gray, Joe W; Cheng, Jan-Fang; de Jong, Pieter; Nefedov, Mikhail; Ried, Thomas; Padilla-Nash, Hesed M; Collins, Colin C

    2008-01-01

    Background The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using end sequencing profiling, which relies on paired-end sequencing of cloned tumor genomes. Results In the present study brain, breast, ovary, and prostate tumors, along with three breast cancer cell lines, were surveyed using end sequencing profiling, yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization confirmed translocations and complex tumor genome structures that include co-amplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms revealed candidate somatic mutations and an elevated rate of novel single nucleotide polymorphisms in an ovarian tumor. Conclusion These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than was previously appreciated and that genomic fusions, including fusion transcripts and proteins, may be common, possibly yielding tumor-specific biomarkers and therapeutic targets. PMID:18364049
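
    The paired-end idea behind end sequencing profiling can be illustrated with a small classifier: a clone whose two ends map to the reference at an unexpected distance, orientation, or chromosome hints at a rearrangement. This is a simplified sketch under stated assumptions, not the authors' implementation; the `EndHit` type, the category labels, and the size bounds passed by the caller are all hypothetical.

```python
from typing import NamedTuple

class EndHit(NamedTuple):
    chrom: str   # reference chromosome of the mapped clone end
    pos: int     # reference coordinate of the mapped clone end
    strand: str  # "+" or "-"

def classify_end_pair(a, b, min_len, max_len):
    """Label one clone from the mapping of its two end sequences.

    min_len and max_len bound the expected clone insert size and must be
    chosen by the caller for the library in question.
    """
    if a.chrom != b.chrom:
        return "translocation candidate"
    left, right = sorted((a, b), key=lambda hit: hit.pos)
    if left.strand == right.strand:
        return "inversion candidate"
    if left.strand == "-":
        # Ends point away from each other instead of towards each other.
        return "everted pair (possible tandem duplication)"
    span = right.pos - left.pos
    if span > max_len:
        return "deletion candidate"
    if span < min_len:
        return "insertion candidate"
    return "concordant"
```

    For example, with assumed bounds of 100,000 to 250,000 bp, two ends on the same chromosome in convergent orientation but 900,000 bp apart would be reported as a deletion candidate.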

  20. GENOMIC DIVERGENCES AMONG CATTLE, DOG, AND HUMAN ESTIMATED FROM LARGE-SCALE ALIGNMENTS OF GENOMIC SEQUENCES

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We performed a detailed analysis of genomic divergences based on large-scale comparison of 11 Mb of genomic sequence from cattle, human and dog. Using human and dog genome assemblies as references, optimal 3-way global alignments were constructed for 84 cattle large (>50 kb) genomic sequence clones...

  1. Initial sequencing and comparative analysis of the mouse genome

    SciTech Connect

    Waterston, Robert H.; Lindblad-Toh, Kerstin; Birney, Ewan; Rogers, Jane; Abril, Josep F.; Agarwal, Pankaj; Agarwala, Richa; Ainscough, Rachel; Alexandersson, Marina; An, Peter; Antonarakis, Stylianos E.; Attwood, John; Baertsch, Robert; Bailey, Jonathon; Barlow, Karen; Beck, Stephan; Berry, Eric; Birren, Bruce; Bloom, Toby; Bork, Peer; Botcherby, Marc; Bray, Nicolas; Brent, Michael R.; Brown, Daniel G.; Brown, Stephen D.; Bult, Carol; Burton, John; Butler, Jonathan; Campbell, Robert D.; Carninci, Piero; Cawley, Simon; Chiaromonte, Francesca; Chinwalla, Asif T.; Church, Deanna M.; Clamp, Michele; Clee, Christopher; Collins, Francis S.; Cook, Lisa L.; Copley, Richard R.; Coulson, Alan; Couronne, Olivier; Cuff, James; Curwen, Val; Cutts, Tim; Daly, Mark; David, Robert; Davies, Joy; Delehaunty, Kimberly D.; Deri, Justin; Dermitzakis, Emmanouil T.; Dewey, Colin; Dickens, Nicholas J.; Diekhans, Mark; Dodge, Sheila; Dubchak, Inna; Dunn, Diane M.; Eddy, Sean R.; Elnitski, Laura; Emes, Richard D.; Eswara, Pallavi; Eyras, Eduardo; Felsenfeld, Adam; Fewell, Ginger A.; Flicek, Paul; Foley, Karen; Frankel, Wayne N.; Fulton, Lucinda A.; Fulton, Robert S.; Furey, Terrence S.; Gage, Diane; Gibbs, Richard A.; Glusman, Gustavo; Gnerre, Sante; Goldman, Nick; Goodstadt, Leo; Grafham, Darren; Graves, Tina A.; Green, Eric D.; Gregory, Simon; Guigo, Roderic; Guyer, Mark; Hardison, Ross C.; Haussler, David; Hayashizaki, Yoshihide; Hillier, LaDeana W.; Hinrichs, Angela; Hlavina, Wratko; Holzer, Timothy; Hsu, Fan; Hua, Axin; Hubbard, Tim; Hunt, Adrienne; Jackson, Ian; Jaffe, David B.; Johnson, L. Steven; Jones, Matthew; Jones, Thomas A.; Joy, Ann; Kamal, Michael; Karlsson, Elinor K.; Karolchik, Donna; Kasprzyk, Arkadiusz; Kawai, Jun; Keibler, Evan; Kells, Cristyn; Kent, W. James; Kirby, Andrew; Kolbe, Diana L.; Korf, Ian; Kucherlapati, Raju S.; Kulbokas III, Edward J.; Kulp, David; Landers, Tom; Leger, J.P.; Leonard, Steven; Letunic, Ivica; Levine, Rosie; et al.

    2002-12-15

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

  2. Enhancing cancer clonality analysis with integrative genomics

    PubMed Central

    2015-01-01

    Introduction It is understood that cancer is a clonal disease initiated by a single cell, and that metastasis, which is the spread of cancer from the primary site, is also initiated by a single cell. The seemingly natural capability of cancer to adapt dynamically in a Darwinian manner is a primary reason for therapeutic failures. Survival advantages may be induced by cancer therapies and also occur as a result of inherent cell and microenvironmental factors. The selected "more fit" clones outmatch their competition and then become dominant in the tumor via propagation of progeny. This clonal expansion leads to relapse, therapeutic resistance and eventually death. The goal of this study is to develop and demonstrate a more detailed clonality approach by utilizing integrative genomics. Methods Patient tumor samples were profiled by Whole Exome Sequencing (WES) and RNA-seq on an Illumina HiSeq 2500 and methylation profiling was performed on the Illumina Infinium 450K array. STAR and the Haplotype Caller were used for RNA-seq processing. Custom approaches were used for the integration of the multi-omic datasets. Results Reported are major enhancements to CloneViz, which now provides capabilities enabling a formal tumor multi-dimensional clonality analysis by integrating: i) DNA mutations, ii) RNA expressed mutations, and iii) DNA methylation data. RNA and DNA methylation integration was not previously possible with CloneViz (previous version) or any other clonality method to date. This new approach, named iCloneViz (integrated CloneViz), employs visualization and quantitative methods, revealing an integrative genomic mutational dissection and traceability (DNA, RNA, epigenetics) through the different layers of molecular structures. Conclusion The iCloneViz approach can be used for analysis of clonal evolution and mutational dynamics of multi-omic data sets. Revealing tumor clonal complexity in an integrative and quantitative manner facilitates improved mutational characterization, understanding, and therapeutic assignments. PMID:26424171

  3. Optimizing the BAC-End Strategy for Sequencing the Human Genome

    E-print Network

    Shamir, Ron

    University, Tel Aviv, 69978, Israel. ... 1 Introduction. With the Human Genome Project moving from the map ... sequencing has become central. The classical strategy set forth by the founders of the Human Genome Project ... Optimizing the BAC-End Strategy for Sequencing the Human Genome. Richard M. Karp, Ron Shamir

  4. DATABASE Open Access Whole genome sequencing of peach (Prunus

    E-print Network

    Crisosto, Carlos H.

    DATABASE Open Access. Whole genome sequencing of peach (Prunus persica L.) for SNP identification ... high frequency SNPs distributed throughout the peach genome is described. Three peach genomes were ... 'Lovell' peach sequence as well as sufficient depth of coverage for 'in silico' SNP discovery. Description

  5. Genome sequence of the Pea Aphid Acyrthosiphon pisum

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The International aphid genome consortium, IAGC, herein presents the 464 Mb draft genome assembly sequence of the pea aphid Acyrthosiphon pisum. This is the first published whole genome sequence from the diverse assemblage of hemimetabolous insects, providing an outgroup to the multiple published g...

  6. Genomic Sequence Is Highly Predictive of Local Nucleosome Depletion

    E-print Network

    Yuan, Guo-Cheng "GC"

    in human. Citation: Yuan G-C, Liu JS (2008) Genomic sequence is highly predictive of local nucleosome ... with nucleosome binding may offer valuable insight. In addition, as high-resolution mapping of genome ... Genomic Sequence Is Highly Predictive of Local Nucleosome Depletion. Guo-Cheng Yuan, Jun S

  7. Complete Genome Sequence of the Embu Virus Strain SPAn880

    PubMed Central

    Antwerpen, Markus; Georgi, Enrico; Vette, Philipp; Zoeller, Gudrun; Meyer, Hermann

    2014-01-01

    We report the complete genome sequence of the Embu virus. The genome consists of 185,139 bp and is nearly identical to that of the Cotia virus. This is the first report on the Embu virus genome sequence, which has been considered an unclassified poxvirus until now. PMID:25477400

  8. Draft Genome Sequence of the Fungus Trametes hirsuta 072

    PubMed Central

    Tyazhelova, Tatiana V.; Moiseenko, Konstantin V.; Vasina, Daria V.; Mosunova, Olga V.; Fedorova, Tatiana V.; Maloshenok, Lilya G.; Landesman, Elena O.; Bruskin, Sergei A.; Psurtseva, Nadezhda V.; Slesarev, Alexei I.; Kozyavkin, Sergei A.; Koroleva, Olga V.

    2015-01-01

    A standard draft genome sequence of the white rot saprotrophic fungus Trametes hirsuta 072 (Basidiomycota, Polyporales) is presented. The genome sequence contains about 33.6 Mb assembled in 141 scaffolds with a G+C content of ~57.6%. The draft genome annotation predicts 14,598 putative protein-coding open reading frames (ORFs). PMID:26586872
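
    The G+C content quoted above is the share of G and C bases among the called bases of an assembly. A minimal sketch of that calculation follows, assuming the scaffolds are already loaded as plain strings; the function name `gc_content` and the decision to skip ambiguous bases such as N are illustrative assumptions, not the authors' method.

```python
def gc_content(scaffolds):
    """Percent G+C across all scaffolds, ignoring non-ACGT characters."""
    gc = at = 0
    for seq in scaffolds:
        s = seq.upper()
        gc += s.count("G") + s.count("C")
        at += s.count("A") + s.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc / total
```

    Pooling counts over all scaffolds before dividing weights each scaffold by its length, which is what a genome-wide figure normally means.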

  9. Detection of Genomic Structural Variants from Next-Generation Sequencing Data

    PubMed Central

    Tattini, Lorenzo; D’Aurizio, Romina; Magi, Alberto

    2015-01-01

    Structural variants are genomic rearrangements larger than 50 bp accounting for around 1% of the variation among human genomes. They impact on phenotypic diversity and play a role in various diseases including neurological/neurocognitive disorders and cancer development and progression. Dissecting structural variants from next-generation sequencing data presents several challenges and a number of approaches have been proposed in the literature. In this mini review, we describe and summarize the latest tools – and their underlying algorithms – designed for the analysis of whole-genome sequencing, whole-exome sequencing, custom captures, and amplicon sequencing data, pointing out the major advantages/drawbacks. We also report a summary of the most recent applications of third-generation sequencing platforms. This assessment provides a guided indication – with particular emphasis on human genetics and copy number variants – for researchers involved in the investigation of these genomic events. PMID:26161383
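
    The 50 bp size threshold in the definition above can be made concrete with a small helper that sorts a variant by the length difference between its reference and alternate alleles. This is a deliberately simplified sketch: the function name and labels are hypothetical, and balanced events such as inversions and translocations, which change no length, are outside what it can see.

```python
SV_MIN_SIZE = 50  # threshold from the abstract: "larger than 50 bp"

def classify_by_size(ref, alt):
    """Classify a variant from its reference and alternate allele strings."""
    size = abs(len(alt) - len(ref))
    if size > SV_MIN_SIZE:
        return "structural variant"
    if size == 0:
        return "substitution"
    return "small indel"
```

    A 51-base insertion is labelled a structural variant, while a 50-base insertion still counts as a small indel under this strict reading of the threshold.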

  10. Genomic Arrays: Tools for cancer gene discovery

    E-print Network

    Roberts, Ian

    2008-06-26

    Genomic gains (oncogenes). Genomic losses (tumour suppressor genes). Applications: Research (disease gene discovery); Clinical (diagnostic tests). Comparative genomic hybridisation: Tumour DNA (Test), Normal DNA (Reference) + Available probe. GAIN: More test...

  11. Toward a Comprehensive Genomic Analysis of Cancer

    Cancer.gov

    The National Cancer Institute (NCI) and National Human Genome Research Institute (NHGRI) convened a "Toward a Comprehensive Genomic Analysis of Cancer" workshop in Washington, D.C. This workshop brought together physicians, basic scientists and other members of the U.S. and international cancer communities to assist in outlining the most effective strategies for the development of a successful project. Information about this workshop is reported in the Executive Summary.

  12. Complete genome sequence of bacteriophage T5.

    PubMed

    Wang, Jianbin; Jiang, Yan; Vincent, Myriam; Sun, Yongqiao; Yu, Hong; Wang, Jing; Bao, Qiyu; Kong, Huimin; Hu, Songnian

    2005-02-01

    The 121,752-bp genome sequence of bacteriophage T5 was determined; the linear, double-stranded DNA is nicked in one of the strands and has large direct terminal repeats of 10,139 bp (8.3%) at both ends. The genome structure is consistently arranged according to its lytic life cycle. Of the 168 potential open reading frames (ORFs), 61 were annotated; these annotated ORFs mainly encode enzymes involved in phage DNA replication, repair, and nucleotide metabolism. At least five endonucleases believed to help induce nicks in T5 genomic DNA were found, and a DNA ligase gene was found to be split into two separate ORFs. Analysis of T5 early promoters suggests a probable motif AAA{3, 4 T}nTTGCTT{17, 18 n}TATAATA{12, 13 W}{10 R} for strong promoters that may strengthen the step modification of host RNA polymerase, and thus control transcription of phage DNA. The distinct protein domain profile and a mosaic genome structure suggest an origin from the common genetic pool. PMID:15661140
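
    The promoter motif quoted above can be read as a regular expression. The sketch below assumes that `{3, 4 T}` means three or four T residues, that `n` is any base, and that `W` and `R` follow the IUPAC codes (A or T, and A or G); the abstract does not define its notation, so this reading and the function name are assumptions for illustration only.

```python
import re

# Assumed reading of AAA{3, 4 T}nTTGCTT{17, 18 n}TATAATA{12, 13 W}{10 R}
T5_EARLY_PROMOTER = re.compile(
    "AAA"              # fixed AAA
    "T{3,4}"           # three or four T
    "[ACGT]"           # n: any single base
    "TTGCTT"           # fixed hexamer
    "[ACGT]{17,18}"    # spacer of 17 or 18 bases
    "TATAATA"          # fixed heptamer
    "[AT]{12,13}"      # W: 12 or 13 A/T bases
    "[AG]{10}"         # R: 10 purines
)

def find_promoter_candidates(seq):
    """Return (start, end) spans of non-overlapping motif matches."""
    return [m.span() for m in T5_EARLY_PROMOTER.finditer(seq.upper())]
```

    Only the forward strand is scanned here; a fuller search would also scan the reverse complement.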

  13. Nucleotide sequence of human endogenous retrovirus genome related to the mouse mammary tumor virus genome.

    PubMed Central

    Ono, M; Yasunaga, T; Miyata, T; Ushikubo, H

    1986-01-01

    We determined the complete nucleotide sequence of the human endogenous retrovirus genome HERV-K10 isolated as the sequence homologous to the Syrian hamster intracisternal A-particle (type A retrovirus) genome. HERV-K10 is 9,179 base pairs long with long terminal repeats of 968 base pairs at both ends; a sequence 290 base pairs long, however, was found to be deleted. It was concluded that a composite genome having the 290-base-pair fragment is the prototype HERV-K provirus gag (666 codons), protease (334 codons), pol (937 codons), and env (618 codons) genes. The size of the protease gene product of HERV-K is essentially the same as that of A- and D-type oncoviruses but nearly twice that of other retroviruses. A comparison of the deduced amino acid sequences encoded by the pol region showed HERV-K to be closely related to types A and D retroviruses and even more so to type B retrovirus. It was noted that the env gene product of HERV-K structurally resembles the mouse mammary tumor virus (type B retrovirus) env protein, and the possible expression of the HERV-K env gene in human breast cancer cells is discussed. PMID:3021993

  14. Genomic Sequence Comparisons, 1987-2003 Final Report

    SciTech Connect

    George M. Church

    2004-07-29

    This project was to develop new DNA sequencing and RNA and protein quantitation methods and related genome annotation tools. The project began in 1987 with the development of multiplex sequencing (published in Science in 1988), and one of the first automated sequencing methods. This led to the first commercial genome sequence in 1994 and to the establishment of the main commercial participants (GTC then Agencourt) in the public DOE/NIH genome project. In collaboration with GTC we contributed to one of the first complete DOE genome sequences, in 1997, that of Methanobacterium thermoautotrophicum, a species of great relevance to energy-rich gas production.

  15. TAG Sequence Identification of Genomic Regions Using TAGdb.

    PubMed

    Ruperao, Pradeep

    2016-01-01

    Second-generation sequencing (SGS) technology has enabled the sequencing of genomes and identification of genes. However, large complex plant genomes remain particularly difficult for de novo assembly. Access to the vast quantity of raw sequence data may facilitate discoveries; however the volume of this data makes access difficult. This chapter discusses the Web-based tool TAGdb that enables researchers to identify paired read second-generation DNA sequence data that share identity with a submitted query sequence. The identified reads can be used for PCR amplification of genomic regions to identify genes and promoters without the need for genome assembly. PMID:26519409

  16. Complete genome sequence of Methanocorpusculum labreanum type strain Z

    SciTech Connect

    Anderson, Iain; Sieprawska-Lupa, Magdalena; Goltsman, Eugene; Lapidus, Alla L.; Copeland, A; Glavina Del Rio, Tijana; Tice, Hope; Dalin, Eileen; Barry, Kerrie; Pitluck, Sam; Hauser, Loren John; Land, Miriam L; Lucas, Susan; Richardson, P M; Whitman, W. B.; Kyrpides, Nikos C

    2009-01-01

    Methanocorpusculum labreanum is a methanogen belonging to the order Methanomicrobiales within the archaeal phylum Euryarchaeota. The type strain Z was isolated from surface sediments of Tar Pit Lake in the La Brea Tar Pits in Los Angeles, California. M. labreanum is of phylogenetic interest because at the time the sequencing project began only one genome had previously been sequenced from the order Methanomicrobiales. We report here the complete genome sequence of M. labreanum type strain Z and its annotation. This is part of a 2006 Joint Genome Institute Community Sequencing Program project to sequence genomes of diverse Archaea.

  17. Complete genome sequence of Methanoculleus marisnigri type strain JR1

    SciTech Connect

    Anderson, Iain; Sieprawska-Lupa, Magdalena; Goltsman, Eugene; Lapidus, Alla L.; Copeland, A; Glavina Del Rio, Tijana; Tice, Hope; Dalin, Eileen; Barry, Kerrie; Saunders, Elizabeth H; Han, Cliff; Brettin, Tom; Detter, J. Chris; Bruce, David; Mikhailova, Natalia; Pitluck, Sam; Hauser, Loren John; Land, Miriam L; Lucas, Susan; Richardson, P M; Whitman, W. B.; Kyrpides, Nikos C

    2009-01-01

    Methanoculleus marisnigri Romesser et al. 1981 is a methanogen belonging to the order Methanomicrobiales within the archaeal phylum Euryarchaeota. The type strain, JR1, was isolated from anoxic sediments of the Black Sea. M. marisnigri is of phylogenetic interest because at the time the sequencing project began only one genome had previously been sequenced from the order Methanomicrobiales. We report here the complete genome sequence of M. marisnigri type strain JR1 and its annotation. This is part of a Joint Genome Institute 2006 Community Sequencing Program to sequence genomes of diverse Archaea.

  18. Simple sequence repeats in bryophyte mitochondrial genomes.

    PubMed

    Zhao, Chao-Xian; Zhu, Rui-Liang; Liu, Yang

    2016-01-01

    Simple sequence repeats (SSRs) are thought to be common in plant mitochondrial (mt) genomes, but have yet to be fully described for bryophytes. We screened the mt genomes of two liverworts (Marchantia polymorpha and Pleurozia purpurea), two mosses (Physcomitrella patens and Anomodon rugelii) and two hornworts (Phaeoceros laevis and Nothoceros aenigmaticus), and detected 475 SSRs. Some SSRs are conserved during evolution; of these, one is present in both liverworts and mosses, while all others are shared only within a group (the two liverworts, the two mosses, or the two hornworts). SSRs are known as DNA tracts with high mutation rates; however, according to our observations, they can still evolve slowly. The conservation of these SSRs suggests that they are under strong selection and could play critical roles in maintaining gene function. PMID:24491104

  19. Complete mitochondrial genome sequence of Nectogale elegans.

    PubMed

    Huang, Ting; Yan, Chaochao; Tan, Zheng; Tu, Feiyun; Yue, Bisong; Zhang, Xiuyue

    2014-08-01

    The elegant water shrew (Nectogale elegans) belongs to the family Soricidae and is distributed across northern South Asia, central and southern China, and northern Southeast Asia. In this study, the complete mitochondrial genome of N. elegans was sequenced. It was determined to be 17,460 bases long, and includes 13 protein-coding genes (PCGs), 22 tRNA genes, 2 ribosomal RNA genes and one non-coding region, which is similar to other mammalian mitochondrial genomes. Bayesian inference and maximum likelihood methods were used to construct phylogenetic trees based on 12 heavy-strand concatenated PCGs. Phylogenetic analyses further confirmed that Crocidurinae diverged prior to Soricinae, and Sorex unguiculatus differentiated earlier than N. elegans. PMID:23795853

  20. Initial sequencing and comparative analysis of the mouse genome

    E-print Network

    Hardison, Ross C.

    and knockin techniques17–22. For these and other reasons, the Human Genome Project (HGP) recognized from its ... The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome ... comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from

  1. Genome sequencing in microfabricated high-density picolitre reactors

    E-print Network

    using a pyrosequencing protocol optimized for solid support and picolitre-scale volumes. Here we show ... Large-scale sequencing projects, including whole-genome sequencing, have usually required the cloning ... or capillary electrophoresis. Current estimates put the cost of sequencing a human genome between $10 million

  2. Genome Sequence of Stachybotrys chartarum Strain 51-11.

    PubMed

    Betancourt, Doris A; Dean, Timothy R; Kim, Jean; Levy, Josh

    2015-01-01

    The Stachybotrys chartarum strain 51-11 genome was sequenced by shotgun sequencing utilizing Illumina HiSeq 2000 and PacBio technologies. Since S. chartarum has been implicated as having health impacts within water-damaged buildings, any information extracted from the genomic sequence data relating to toxins or the metabolism of the fungus might be useful. PMID:26430036

  3. Next Generation Sequencing at the University of Chicago Genomics Core

    SciTech Connect

    Faber, Pieter

    2013-04-24

    The University of Chicago Genomics Core provides University of Chicago investigators (and external clients) access to State-of-the-Art genomics capabilities: next generation sequencing, Sanger sequencing / genotyping and micro-arrays (gene expression, genotyping, and methylation). The current presentation will highlight our capabilities in the area of ultra-high throughput sequencing analysis.

  4. Identification and annotation of repetitive sequences in fungal genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cheaper and faster sequencing technologies have fundamentally changed the pace of genome sequencing projects and have contributed to the ever-increasing volume of genomic data. This has been paralleled by an increase in computational power and resources to process and translate raw sequence data int...

  5. Whole-Genome Chromatin IP Sequencing (ChIP-Seq)

    E-print Network

    Kopp, Artyom

    ILLUMINA® SEQUENCING. Whole-Genome Chromatin IP Sequencing (ChIP-Seq). Illumina ChIP-Seq combines ... -associated proteins. Illumina ChIP-Seq technology precisely and cost-effectively maps global binding sites. The powerful Illumina Whole-Genome Chromatin IP Sequencing (ChIP-Seq) application allows researchers to easily

  6. Genome Sequence of Stachybotrys chartarum Strain 51-11

    PubMed Central

    Kim, Jean; Levy, Josh

    2015-01-01

    The Stachybotrys chartarum strain 51-11 genome was sequenced by shotgun sequencing utilizing Illumina HiSeq 2000 and PacBio technologies. Since S. chartarum has been implicated as having health impacts within water-damaged buildings, any information extracted from the genomic sequence data relating to toxins or the metabolism of the fungus might be useful. PMID:26430036

  7. Chapter 27 -- Breast Cancer Genomics, Section VI, Pathology and Biological Markers of Invasive Breast Cancer

    SciTech Connect

    Spellman, Paul T.; Heiser, Laura; Gray, Joe W.

    2009-06-18

    Breast cancer is predominantly a disease of the genome with cancers arising and progressing through accumulation of aberrations that alter the genome - by changing DNA sequence, copy number, and structure in ways that contribute to diverse aspects of cancer pathophysiology. Classic examples of genomic events that contribute to breast cancer pathophysiology include inherited mutations in BRCA1, BRCA2, TP53, and CHK2 that contribute to the initiation of breast cancer, amplification of ERBB2 (formerly HER2) and mutations of elements of the PI3-kinase pathway that activate aspects of epidermal growth factor receptor (EGFR) signaling, and deletion of CDKN2A/B that contributes to cell cycle deregulation and genome instability. It is now apparent that accumulation of these aberrations is a time-dependent process that accelerates with age. Although American women living to an age of 85 have a 1 in 8 chance of developing breast cancer, cancer in women younger than 30 years is uncommon. This is consistent with a multistep cancer progression model whereby mutation and selection drive the tumor's development, analogous to traditional Darwinian evolution. In the case of cancer, the driving events are changes in sequence, copy number, and structure of DNA and alterations in chromatin structure or other epigenetic marks. Our understanding of the genetic, genomic, and epigenomic events that influence the development and progression of breast cancer is increasing at a remarkable rate through application of powerful analysis tools that enable genome-wide analysis of DNA sequence and structure, copy number, allelic loss, and epigenomic modification. Application of these techniques to elucidation of the nature and timing of these events is enriching our understanding of mechanisms that increase breast cancer susceptibility, enable tumor initiation and progression to metastatic disease, and determine therapeutic response or resistance. These studies also reveal the molecular differences between cancer and normal tissue that may be exploited to therapeutic benefit or that provide targets for molecular assays that may enable early cancer detection, and predict individual disease progression or response to treatment. This chapter reviews current and future directions in genome analysis and summarizes studies that provide insights into breast cancer pathophysiology or that suggest strategies to improve breast cancer management.

  8. Initial impact of the sequencing of the human genome

    E-print Network

    Massachusetts Institute of Technology. Department of Biology; Broad Institute of MIT and Harvard; Lander, Eric S.

    The sequence of the human genome has dramatically accelerated biomedical research. Here I explore its impact, in the decade since its publication, on our understanding of the biological functions encoded in the genome, on ...

  9. Current challenges in de novo plant genome sequencing and assembly

    PubMed Central

    2012-01-01

    Genome sequencing is now affordable, but assembling plant genomes de novo remains challenging. We assess the state of the art of assembly and review the best practices for the community. PMID:22546054

  10. A sequence-based survey of the complex structural organization of tumor genomes

    SciTech Connect

    Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav; Yu, Peng; Wu, Chunxiao; Huang, Guiqing; Linardopoulou, Elena V.; Trask, Barbara J.; Waldman, Frederic; Costello, Joseph; Pienta, Kenneth J.; Mills, Gordon B.; Bajsarowicz, Krystyna; Kobayashi, Yasuko; Sridharan, Shivaranjani; Paris, Pamela; Tao, Quanzhou; Aerni, Sarah J.; Brown, Raymond P.; Bashir, Ali; Gray, Joe W.; Cheng, Jan-Fang; de Jong, Pieter; Nefedov, Mikhail; Ried, Thomas; Padilla-Nash, Hesed M.; Collins, Colin C.

    2008-04-03

    The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.

  11. CaPSID: A bioinformatics platform for computational pathogen sequence identification in human genomes and transcriptomes

    PubMed Central

    2012-01-01

    Background It is now well established that nearly 20% of human cancers are caused by infectious agents, and the list of human oncogenic pathogens will grow in the future for a variety of cancer types. Whole tumor transcriptome and genome sequencing by next-generation sequencing technologies presents an unparalleled opportunity for pathogen detection and discovery in human tissues but requires development of new genome-wide bioinformatics tools. Results Here we present CaPSID (Computational Pathogen Sequence IDentification), a comprehensive bioinformatics platform for identifying, querying and visualizing both exogenous and endogenous pathogen nucleotide sequences in tumor genomes and transcriptomes. CaPSID includes a scalable, high performance database for data storage and a web application that integrates the genome browser JBrowse. CaPSID also provides useful metrics for sequence analysis of pre-aligned BAM files, such as gene and genome coverage, and is optimized to run efficiently on multiprocessor computers with low memory usage. Conclusions To demonstrate the usefulness and efficiency of CaPSID, we carried out a comprehensive analysis of both a simulated dataset and transcriptome samples from ovarian cancer. CaPSID correctly identified all of the human and pathogen sequences in the simulated dataset, while in the ovarian dataset CaPSID’s predictions were successfully validated in vitro. PMID:22901030

  12. Complete genome sequence of Arcanobacterium haemolyticum type strain (11018T)

    SciTech Connect

    Yasawong, Montri; Teshima, Hazuki; Lapidus, Alla L.; Nolan, Matt; Lucas, Susan; Glavina Del Rio, Tijana; Tice, Hope; Cheng, Jan-Fang; Bruce, David; Detter, J. Chris; Tapia, Roxanne; Han, Cliff; Goodwin, Lynne A.; Pitluck, Sam; Liolios, Konstantinos; Ivanova, N; Mavromatis, K; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam L; Hauser, Loren John; Chang, Yun-Juan; Jeffries, Cynthia; Rohde, Manfred; Sikorski, Johannes; Pukall, Rudiger; Goker, Markus; Woyke, Tanja; Bristow, James; Eisen, Jonathan; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter

    2010-01-01

    Vulcanisaeta distributa Itoh et al. 2002 belongs to the family Thermoproteaceae in the phylum Crenarchaeota. The genus Vulcanisaeta is characterized by a global distribution in hot and acidic springs. This is the first genome sequence from a member of the genus Vulcanisaeta and seventh genome sequence in the family Thermoproteaceae. The 2,374,137 bp long genome with its 2,544 protein-coding and 49 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  13. Draft Genome Sequences of Klebsiella variicola Plant Isolates

    PubMed Central

    Martínez-Romero, Esperanza; Silva-Sanchez, Jesús; Barrios, Humberto; Rodríguez-Medina, Nadia; Martínez-Barnetche, Jesús; Téllez-Sosa, Juan; Gómez-Barreto, Rosa Elena

    2015-01-01

    Three endophytic Klebsiella variicola isolates (T29A, 3, and 6A2, obtained from sugar cane stem, maize shoots, and banana leaves, respectively) were used for whole-genome sequencing. Here, we report the draft genome sequences of circular chromosomes and plasmids. The genomes contain plant colonization and cellulase genes. This study will help toward understanding the genomic basis of K. variicola interaction with plant hosts. PMID:26358599

  14. Multigene amplification and massively parallel sequencing for cancer mutation discovery

    PubMed Central

    Dahl, Fredrik; Stenberg, Johan; Fredriksson, Simon; Welch, Katrina; Zhang, Michael; Nilsson, Mats; Bicknell, David; Bodmer, Walter F.; Davis, Ronald W.; Ji, Hanlee

    2007-01-01

    We have developed a procedure for massively parallel resequencing of multiple human genes by combining a highly multiplexed and target-specific amplification process with a high-throughput parallel sequencing technology. The amplification process is based on oligonucleotide constructs, called selectors, that guide the circularization of specific DNA target regions. Subsequently, the circularized target sequences are amplified in multiplex and analyzed by using a highly parallel sequencing-by-synthesis technology. As a proof-of-concept study, we demonstrate parallel resequencing of 10 cancer genes covering 177 exons with average sequence coverage per sample of 93%. Seven cancer cell lines and one normal genomic DNA sample were studied with multiple mutations and polymorphisms identified among the 10 genes. Mutations and polymorphisms in the TP53 gene were confirmed by traditional sequencing. PMID:17517648

  15. Rapid modelling of cooperating genetic events in cancer through somatic genome editing

    E-print Network

    Papagiannakopoulos, Thales

    Cancer is a multistep process that involves mutations and other alterations in oncogenes and tumour suppressor genes. Genome sequencing studies have identified a large collection of genetic alterations that occur in human ...

  16. Genome sequencing and annotation of Morganella sp. SA36

    PubMed Central

    Selim, Samy; Hassan, Sherif; Hagagy, Nashwa

    2015-01-01

    We report the draft genome sequence of Morganella sp. strain SA36, isolated from a water spring in the Aljouf region, Saudi Arabia. The draft genome size is 2,564,439 bp with a G + C content of 51.1% and contains 6 rRNA sequences (single copies of 5S, 16S & 23S rRNA). The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession no. LDNQ00000000.

  17. Complete Genome Sequence of Corynebacterium pseudotuberculosis Strain 12C

    PubMed Central

    Sousa, Thiago Jesus; Mariano, Diego; Parise, Doglas; Parise, Mariana; Viana, Marcus Vinicius Canário; Guimarães, Luis Carlos; Benevides, Leandro Jesus; Rocha, Flávia; Bagano, Priscilla; Ramos, Rommel; Silva, Artur; Figueiredo, Henrique; Almeida, Sintia

    2015-01-01

    We present here the complete genome sequence of Corynebacterium pseudotuberculosis strain 12C, isolated from a sheep abscess in Brazil. The sequencing was performed with the Ion Torrent Personal Genome Machine (PGM) system, a fragment library, and a coverage of ~48-fold. The genome presented is a circular chromosome 2,337,451 bp in length, with 2,119 coding sequences, 12 rRNAs, 49 tRNAs, and a G+C content of 52.83%. PMID:26184935
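
    The G+C content quoted in records like the one above is the percentage of G and C bases among the called bases of a sequence. A minimal Python sketch of that calculation follows; the function name and the example sequence are illustrative assumptions, not taken from the cited work.

```python
def gc_content(sequence):
    """Return G+C content as a percentage of the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    # Count only called bases so that ambiguity codes such as N do not dilute the result.
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / called


# Hypothetical 10-base sequence: 3 G + 2 C out of 10 called bases.
print(gc_content("ATGCGCGTTA"))
```

    Excluding ambiguous bases from the denominator is a design choice of this sketch; some tools divide by the full sequence length instead.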

  18. Diversity through duplication: whole-genome sequencing reveals novel gene retrocopies in the human population.

    PubMed

    Richardson, Sandra R; Salvador-Palomeque, Carmen; Faulkner, Geoffrey J

    2014-05-01

    Gene retrocopies are generated by reverse transcription and genomic integration of mRNA. As such, retrocopies present an important exception to the central dogma of molecular biology, and have substantially impacted the functional landscape of the metazoan genome. While an estimated 8,000-17,000 retrocopies exist in the human genome reference sequence, the extent of variation between individuals in terms of retrocopy content has remained largely unexplored. Three recent studies by Abyzov et al., Ewing et al. and Schrider et al. have exploited 1,000 Genomes Project Consortium data, as well as other sources of whole-genome sequencing data, to uncover novel gene retrocopies. Here, we compare the methods and results of these three studies, highlight the impact of retrocopies in human diversity and genome evolution, and speculate on the potential for somatic gene retrocopies to impact cancer etiology and genetic diversity among individual neurons in the mammalian brain. PMID:24615986

  19. The genomic complexity of primary human prostate cancer

    PubMed Central

    Berger, Michael F.; Lawrence, Michael S.; Demichelis, Francesca; Drier, Yotam; Cibulskis, Kristian; Sivachenko, Andrey Y.; Sboner, Andrea; Esgueva, Raquel; Pflueger, Dorothee; Sougnez, Carrie; Onofrio, Robert; Carter, Scott L.; Park, Kyung; Habegger, Lukas; Ambrogio, Lauren; Fennell, Timothy; Parkin, Melissa; Saksena, Gordon; Voet, Douglas; Ramos, Alex H.; Pugh, Trevor J.; Wilkinson, Jane; Fisher, Sheila; Winckler, Wendy; Mahan, Scott; Ardlie, Kristin; Baldwin, Jennifer; Simons, Jonathan W.; Kitabayashi, Naoki; MacDonald, Theresa Y.; Kantoff, Philip W.; Chin, Lynda; Gabriel, Stacey B.; Gerstein, Mark B.; Golub, Todd R.; Meyerson, Matthew; Tewari, Ashutosh; Lander, Eric S.; Getz, Gad; Rubin, Mark A.; Garraway, Levi A.

    2010-01-01

    Prostate cancer is the second most common cause of male cancer deaths in the United States. Here we present the complete sequence of seven primary prostate cancers and their paired normal counterparts. Several tumors contained complex chains of balanced rearrangements that occurred within or adjacent to known cancer genes. Rearrangement breakpoints were enriched near open chromatin, androgen receptor and ERG DNA binding sites in the setting of the ETS gene fusion TMPRSS2-ERG, but inversely correlated with these regions in tumors lacking ETS fusions. This observation suggests a link between chromatin or transcriptional regulation and the genesis of genomic aberrations. Three tumors contained rearrangements that disrupted CADM2, and four harbored events disrupting either PTEN (unbalanced events), a prostate tumor suppressor, or MAGI2 (balanced events), a PTEN interacting protein not previously implicated in prostate tumorigenesis. Thus, genomic rearrangements may arise from transcriptional or chromatin aberrancies to engage prostate tumorigenic mechanisms. PMID:21307934

  20. Rapid whole genome sequencing and precision neonatology.

    PubMed

    Petrikin, Joshua E; Willig, Laurel K; Smith, Laurie D; Kingsmore, Stephen F

    2015-12-01

    Traditionally, genetic testing has been too slow, or perceived as too impractical, for the initial management of the critically ill neonate. Technological advances have led to the ability to sequence and interpret the entire genome of a neonate in as little as 26 h. As the cost and turnaround time of testing decrease, the utility of whole genome sequencing (WGS) of neonates for acute and latent genetic illness increases. Analyzing the entire genome allows for concomitant evaluation of the currently identified 5588 single gene diseases. When applied to a select population of ill infants in a level IV neonatal intensive care unit, WGS yielded a diagnosis of a causative genetic disease in 57% of patients. These diagnoses may lead to clinical management changes ranging from transition to palliative care for uniformly lethal conditions to alteration or initiation of medical or surgical therapy to improve outcomes in others. Thus, institution of 2-day WGS at time of acute presentation opens the possibility of early implementation of precision medicine. This implementation may create opportunities for early interventional, frequently novel or off-label therapies that may alter disease trajectory in infants with what would otherwise be fatal disease. Widespread deployment of rapid WGS and precision medicine will raise ethical issues pertaining to interpretation of variants of unknown significance, discovery of incidental findings related to adult onset conditions and carrier status, and implementation of medical therapies for which little is known in terms of risks and benefits. Despite these challenges, precision neonatology has significant potential both to decrease infant mortality related to genetic diseases with onset in newborns and to facilitate parental decision making regarding transition to palliative care. PMID:26521050

  1. Endometrial and acute myeloid leukemia cancer genomes characterized

    Cancer.gov

    Two studies from The Cancer Genome Atlas (TCGA) program reveal details about the genomic landscapes of acute myeloid leukemia (AML) and endometrial cancer. Both provide new insights into the molecular underpinnings of these cancers with the potential to i

  2. AACR 2014: NCI/NIH-Sponsored Session: Large-Scale Genomics Data for the Research Community through the NCI Center for Cancer Genomics

    Cancer.gov

    The NCI’s Center for Cancer Genomics (CCG), which includes the Office of Cancer Genomics and The Cancer Genome Atlas Program Office, provides the research community access to large-scale molecular characterization data, which is largely sequence-based. CCG programs aim to improve patient outcome through identification of valid molecular targets and associated molecular markers (prognostic or diagnostic), in and across diseases investigated, which should ultimately lead to the rapid development of novel, more effective therapies.

  3. Sequencing and Assembly of the 22-Gb Loblolly Pine Genome

    PubMed Central

    Zimin, Aleksey; Stevens, Kristian A.; Crepeau, Marc W.; Holtz-Morris, Ann; Koriabine, Maxim; Marçais, Guillaume; Puiu, Daniela; Roberts, Michael; Wegrzyn, Jill L.; de Jong, Pieter J.; Neale, David B.; Salzberg, Steven L.; Yorke, James A.; Langley, Charles H.

    2014-01-01

    Conifers are the predominant gymnosperm. The size and complexity of their genomes have presented formidable technical challenges for whole-genome shotgun sequencing and assembly. We employed novel strategies that allowed us to determine the loblolly pine (Pinus taeda) reference genome sequence, the largest genome assembled to date. Most of the sequence data were derived from whole-genome shotgun sequencing of a single megagametophyte, the haploid tissue of a single pine seed. Although that constrained the quantity of available DNA, the resulting haploid sequence data were well-suited for assembly. The haploid sequence was augmented with multiple linking long-fragment mate pair libraries from the parental diploid DNA. For the longest fragments, we used novel fosmid DiTag libraries. Sequences from the linking libraries that did not match the megagametophyte were identified and removed. Assembly of the sequence data was aided by condensing the enormous number of paired-end reads into a much smaller set of longer “super-reads,” rendering subsequent assembly with an overlap-based assembly algorithm computationally feasible. To further improve the contiguity and biological utility of the genome sequence, additional scaffolding methods utilizing independent genome and transcriptome assemblies were implemented. The combination of these strategies resulted in a draft genome sequence of 20.15 billion bases, with an N50 scaffold size of 66.9 kbp. PMID:24653210
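
    The N50 scaffold size reported above is a standard contiguity statistic: the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. A minimal Python sketch of the usual definition is given below; the function name and the example lengths are hypothetical and are not data from the loblolly pine assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half of all bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths in kbp: total 290, so half is 145;
# the two longest scaffolds (80 + 70 = 150) reach it, giving an N50 of 70.
print(n50([80, 70, 50, 40, 30, 20]))
```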

  4. Next-generation sequencing technologies have greatly reduced the cost of sequencing genomes. With the current sequencing technology, a genome is broken

    E-print Network

    Campbell, A. Malcolm

    in real time and includes tutorials detailing the complexities of genome assembly. With PHAST, students such as genome assembly. Key Words: Genome assembly; bioinformatics; computational biology; teaching tool. Genome- stand genome sequencing and assembly. Objectives PHAST (Phage Assembly Suite and Tutorial; http

  5. The reference genome sequence of Saccharomyces cerevisiae: then and now.

    PubMed

    Engel, Stacia R; Dietrich, Fred S; Fisk, Dianna G; Binkley, Gail; Balakrishnan, Rama; Costanzo, Maria C; Dwight, Selina S; Hitz, Benjamin C; Karra, Kalpana; Nash, Robert S; Weng, Shuai; Wong, Edith D; Lloyd, Paul; Skrzypek, Marek S; Miyasato, Stuart R; Simison, Matt; Cherry, J Michael

    2014-03-01

    The genome of the budding yeast Saccharomyces cerevisiae was the first completely sequenced from a eukaryote. It was released in 1996 as the work of a worldwide effort of hundreds of researchers. In the time since, the yeast genome has been intensively studied by geneticists, molecular biologists, and computational scientists all over the world. Maintenance and annotation of the genome sequence have long been provided by the Saccharomyces Genome Database, one of the original model organism databases. To deepen our understanding of the eukaryotic genome, the S. cerevisiae strain S288C reference genome sequence was updated recently in its first major update since 1996. The new version, called "S288C 2010," was determined from a single yeast colony using modern sequencing technologies and serves as the anchor for further innovations in yeast genomic science. PMID:24374639

  6. All Resources | Office of Cancer Genomics

    Cancer.gov

    CGAP generated a wide range of genomics data on cancerous cells that are accessible through easy-to-use online tools. Researchers, educators, and students can find "in silico" answers to biological questions through the CGAP website.

  7. A taste of pineapple evolution through genome sequencing.

    PubMed

    Xu, Qing; Liu, Zhong-Jian

    2015-12-01

    The genome sequence assembly of the highly heterozygous Ananas comosus and its varieties is an impressive technical achievement. The sequence opens the door to a greater understanding of pineapple morphology and evolution. PMID:26620110

  8. Genome scanning : an AFM-based DNA sequencing technique

    E-print Network

    Elmouelhi, Ahmed (Ahmed M.), 1979-

    2003-01-01

    Genome Scanning is a powerful new technique for DNA sequencing. The method presented in this thesis uses an atomic force microscope with a functionalized cantilever tip to sequence single stranded DNA immobilized to a mica ...

  9. Insights from twenty years of bacterial genome sequencing

    SciTech Connect

    Land, Miriam L; Hauser, Loren John; Jun, Se Ran; Nookaew, Intawat; Leuze, Michael Rex; Ahn, Tae-Hyuk; Karpinets, Tatiana V; Lund, Ole; Kora, Guruprasad H; Wassenaar, Trudy; Poudel, Suresh; Ussery, David W

    2015-01-01

    Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genomes is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90% of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families. 
Why do we care about bacterial genome sequencing? There are many practical applications, such as genome-scale metabolic modeling, biosurveillance, bioforensics, and infectious disease epidemiology. In the near future, high-throughput sequencing of patient metagenomic samples could revolutionize medicine in terms of speed and accuracy of finding pathogens and knowing how to treat them.
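
    The pan- and core-genome calculation mentioned in this abstract can be illustrated with set operations: the pan-genome is the union of the gene families found in any genome, and the core genome is the intersection of the families found in every genome. The Python sketch below assumes gene families have already been assigned to each genome; the strain names and family labels are hypothetical.

```python
def pan_and_core(genomes):
    """Return (pan_genome, core_genome) as sets of gene-family identifiers.

    `genomes` maps a genome name to the gene families detected in it.
    """
    family_sets = [set(families) for families in genomes.values()]
    if not family_sets:
        raise ValueError("need at least one genome")
    pan = set().union(*family_sets)          # families present in any genome
    core = set.intersection(*family_sets)    # families present in every genome
    return pan, core


# Hypothetical example with three strains.
example = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam2", "fam5"},
}
pan, core = pan_and_core(example)
print(len(pan), sorted(core))
```

    Real analyses first have to cluster genes into families, which is the hard part; this sketch covers only the final counting step.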

  10. Mapping the Human Reference Genome’s Missing Sequence by Three-Way Admixture in Latino Genomes

    PubMed Central

    Genovese, Giulio; Handsaker, Robert E.; Li, Heng; Kenny, Eimear E.; McCarroll, Steven A.

    2013-01-01

    A principal obstacle to completing maps and analyses of the human genome involves the genome’s “inaccessible” regions: sequences (often euchromatic and containing genes) that are isolated from the rest of the euchromatic genome by heterochromatin and other repeat-rich sequence. We describe a way to localize these sequences by using ancestry linkage disequilibrium in populations that derive ancestry from at least three continents, as is the case for Latinos. We used this approach to map the genomic locations of almost 20 megabases of sequence unlocalized or missing from the current human genome reference (NCBI Genome GRCh37)—a substantial fraction of the human genome’s remaining unmapped sequence. We show that the genomic locations of most sequences that originated from fosmids and larger clones can be admixture mapped in this way, by using publicly available whole-genome sequence data. Genome assembly efforts and future builds of the human genome reference will be strongly informed by this localization of genes and other euchromatic sequences that are embedded within highly repetitive pericentromeric regions. PMID:23932108

  11. Toward product attribute control: developments from genome sequencing.

    PubMed

    Baik, Jong Youn; Lee, Kelvin H

    2014-12-01

    Chinese hamster ovary (CHO) cells are important hosts for the production of therapeutic proteins. Recent genome sequencing studies provide an initial baseline of information useful for understanding cell line performance in terms of product quality attributes. However, the lack of a well-established reference genome together with concerns about genome stability have not yet permitted the community to define the detailed relationship between the genome and cell line performance. Emerging efforts to define a new reference genome, together with new data on genome stability, herald an era where cell lines with defined genomes can be combined with defined process parameters to yield product quality attribute control. PMID:24874795

  12. Mechanisms of Base Substitution Mutagenesis in Cancer Genomes

    PubMed Central

    Bacolla, Albino; Cooper, David N.; Vasquez, Karen M.

    2014-01-01

    Cancer genome sequence data provide an invaluable resource for inferring the key mechanisms by which mutations arise in cancer cells, favoring their survival, proliferation and invasiveness. Here we examine recent advances in understanding the molecular mechanisms responsible for the predominant type of genetic alteration found in cancer cells, somatic single base substitutions (SBSs). Cytosine methylation, demethylation and deamination, charge transfer reactions in DNA, DNA replication timing, chromatin status and altered DNA proofreading activities are all now known to contribute to the mechanisms leading to base substitution mutagenesis. We review current hypotheses as to the major processes that give rise to SBSs and evaluate their relative relevance in the light of knowledge acquired from cancer genome sequencing projects and the study of base modifications, DNA repair and lesion bypass. Although gene expression data on APOBEC3B enzymes provide support for a role in cancer mutagenesis through U:G mismatch intermediates, the enzyme preference for single-stranded DNA may limit its activity genome-wide. For SBSs at both CG:CG and YC:GR sites, we outline evidence for a prominent role of damage by charge transfer reactions that follow interactions of the DNA with reactive oxygen species (ROS) and other endogenous or exogenous electron-abstracting molecules. PMID:24705290

  13. Single-Cell Whole-Genome Amplification and Sequencing

    E-print Network

    Xie, Xiaoliang Sunney

    rates, chimera rates, allele dropout rates, false positive rates for calling single- nucleotide evolution of cancer genomes, circulating tumor cells (CTCs), meiotic recombination of germ cells' genomes at particular times can reveal their temporal evolution; and (d ) the genomes of individual cells

  14. Individual Patient Cancer Profiles in The Cancer Genome Atlas - Jianjiong Gao, TCGA Scientific Symposium 2012

    Cancer.gov

  15. Genome Project Standards in a New Era of Sequencing

    SciTech Connect

    GSC Consortia; HMP Jumpstart Consortia; Chain, P. S. G.; Grafham, D. V.; Fulton, R. S.; FitzGerald, M. G.; Hostetler, J.; Muzny, D.; Detter, J. C.; Ali, J.; Birren, B.; Bruce, D. C.; Buhay, C.; Cole, J. R.; Ding, Y.; Dugan, S.; Field, D.; Garrity, G. M.; Gibbs, R.; Graves, T.; Han, C. S.; Harrison, S. H.; Highlander, S.; Hugenholtz, P.; Khouri, H. M.; Kodira, C. D.; Kolker, E.; Kyrpides, N. C.; Lang, D.; Lapidus, A.; Malfatti, S. A.; Markowitz, V.; Metha, T.; Nelson, K. E.; Parkhill, J.; Pitluck, S.; Qin, X.; Read, T. D.; Schmutz, J.; Sozhamannan, S.; Strausberg, R.; Sutton, G.; Thomson, N. R.; Tiedje, J. M.; Weinstock, G.; Wollam, A.

    2009-06-01

    For over a decade, genome sequences have adhered to only two standards that are relied on for purposes of sequence analysis by interested third parties (1, 2). However, ongoing developments in revolutionary sequencing technologies have resulted in a redefinition of traditional whole genome sequencing that requires a careful reevaluation of such standards. With commercially available 454 pyrosequencing (followed by Illumina, SOLiD, and now Helicos), there has been an explosion of genomes sequenced under the moniker 'draft'; however, these can be very poor quality genomes (due to inherent errors in the sequencing technologies, and the inability of assembly programs to fully address these errors). Further, one can only infer that such draft genomes may be of poor quality by navigating through the databases to find the number and type of reads deposited in sequence trace repositories (and not all genomes have this available), or to identify the number of contigs or genome fragments deposited to the database. The difficulty in assessing the quality of such deposited genomes has created some havoc for genome analysis pipelines and contributed to many wasted hours of (mis)interpretation. These same novel sequencing technologies have also brought an exponential leap in raw sequencing capability, and at greatly reduced prices that have further skewed the time- and cost-ratios of draft data generation versus the painstaking process of improving and finishing a genome. The resulting effect is an ever-widening gap between drafted and finished genomes that only promises to continue (Figure 1); hence there is an urgent need to distinguish good and poor datasets. The sequencing institutes in the authorship, along with the NIH's Human Microbiome Project Jumpstart Consortium (3), strongly believe that a new set of standards is required for genome sequences. 
The following represents a set of six community-defined categories of genome sequence standards that better reflect the quality of the genome sequence, based on our collective understanding of the different technologies, available assemblers, and the varied efforts to improve upon drafted genomes. Due to the increasingly rapid pace of genomics we avoided the use of rigid numerical thresholds in our definitions to take into account the types of products achieved by any combination of technology, chemistry, assembler, or improvement/finishing process.

  16. Genome Wide Characterization of Simple Sequence Repeats in Cucumber

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The whole genome of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with the Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs ...
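
    As an illustration of what a computational SSR scan can look like, the Python sketch below finds tandem repeats of short motifs with a regular expression. It is not the method used in the study summarized above, which this record does not describe; the function name, the motif-length range, and the minimum repeat count are all assumptions chosen for the example.

```python
import re


def find_ssrs(sequence, min_unit=1, max_unit=6, min_repeats=4):
    """Return (start, motif, repeat_count) for each simple sequence repeat found.

    The thresholds are illustrative defaults, not values from any published pipeline.
    """
    # A lazily matched motif of min_unit..max_unit bases, followed by at least
    # (min_repeats - 1) further copies of the same motif via a backreference.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


# Hypothetical sequence containing (AT)x5 and (CAG)x4.
example = "TTGC" + "AT" * 5 + "GGC" + "CAG" * 4 + "TT"
print(find_ssrs(example))
```

    Matching the motif lazily makes the scan report the shortest repeating unit, so a run such as ATATATAT is reported with motif AT rather than ATAT.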

  17. Volatiles from nineteen recently genome sequenced actinomycetes.

    PubMed

    Citron, Christian A; Barra, Lena; Wink, Joachim; Dickschat, Jeroen S

    2015-03-01

    The volatiles released by agar plate cultures of nineteen actinomycetes whose genomes were recently sequenced were collected by use of a closed-loop stripping apparatus (CLSA) and analysed by GC/MS. In total, 178 compounds from various classes were identified. The most interesting findings were the detection of the insect pheromone frontalin in Streptomyces varsoviensis, and the emission of the unusual plant metabolite 1-nitro-2-phenylethane. Its biosynthesis from phenylalanine was investigated in isotopic labelling experiments. Furthermore, the identified terpenes were correlated to the information about terpene cyclase homologs encoded in the investigated strains. The analytical data were in line with functionally characterised bacterial terpene cyclases and particularly corroborated the recently suggested function of a terpene cyclase from Streptomyces violaceusniger by the identification of a functional homolog in Streptomyces rapamycinicus. PMID:25585196

  18. Selection to sequence: opportunities in fungal genomics

    SciTech Connect

    Baker, Scott E.

    2009-12-01

    Selection is a biological force, causing genotypic and phenotypic change over time. Whether environmental or human-induced, selective pressures shape the genotypes and the phenotypes of organisms both in nature and in the laboratory. In nature, selective pressure is highly dynamic and the sum of the environment and other organisms. In the laboratory, selection is used in genetic studies and industrial strain development programs to isolate mutants affecting biological processes of interest to researchers. Selective pressures are important considerations for fungal biology. In the laboratory a number of fungi are used as experimental systems to study a wide range of biological processes and in nature fungi are important pathogens of plants and animals and play key roles in carbon and nitrogen cycling. The continued development of high-throughput sequencing technologies makes it possible to characterize, at the genomic level, the effect of selective pressures both in the lab and in nature for filamentous fungi as well as other organisms.

  19. Finishing The Euchromatic Sequence Of The Human Genome

    SciTech Connect

    Rubin, Edward M.; Lucas, Susan; Richardson, Paul; Rokhsar, Daniel; Pennacchio, Len

    2004-09-07

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ~99% of the euchromatic genome and is accurate to an error rate of ~1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

  20. On the sequencing of the human genome Robert H. Waterston*

    E-print Network

    Batzoglou, Serafim

    . The international Human Genome Project (HGP) used the hierarchical shotgun approach, whereas Celera Genomics. One was the product of the international Human Genome Project (HGP), and the other was the product On the sequencing of the human genome Robert H. Waterston*, Eric S. Lander, and John E. Sulston

  1. Complete Genome Sequence of Mycoplasma wenyonii Strain Massachusetts

    PubMed Central

    Guimaraes, Ana M. S.; do Nascimento, Naíla C.; SanMiguel, Phillip J.

    2012-01-01

    Mycoplasma wenyonii is a hemotrophic mycoplasma that causes acute and chronic infections in cattle. Here, we announce the first complete genome sequence of this organism. The genome is a single circular chromosome with 650,228 bp and G+C% of 33.9. Analyses of M. wenyonii genome will provide insights into its biology. PMID:22965086

  2. SEQUENCING THE PIG GENOME USING A BAC BY BAC APPROACH

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We have generated a highly contiguous physical map covering >98% of the pig genome in just 176 contigs. The map is localized to the genome through integration with the UIVC RH map as well as BAC end sequence alignments to the human genome. Over 265k HindIII restriction digest fingerprints totaling 16.2...

  3. Subclonal diversification of primary breast cancer revealed by multiregion sequencing.

    PubMed

    Yates, Lucy R; Gerstung, Moritz; Knappskog, Stian; Desmedt, Christine; Gundem, Gunes; Van Loo, Peter; Aas, Turid; Alexandrov, Ludmil B; Larsimont, Denis; Davies, Helen; Li, Yilong; Ju, Young Seok; Ramakrishna, Manasa; Haugland, Hans Kristian; Lilleng, Peer Kaare; Nik-Zainal, Serena; McLaren, Stuart; Butler, Adam; Martin, Sancha; Glodzik, Dominic; Menzies, Andrew; Raine, Keiran; Hinton, Jonathan; Jones, David; Mudie, Laura J; Jiang, Bing; Vincent, Delphine; Greene-Colozzi, April; Adnet, Pierre-Yves; Fatima, Aquila; Maetens, Marion; Ignatiadis, Michail; Stratton, Michael R; Sotiriou, Christos; Richardson, Andrea L; Lønning, Per Eystein; Wedge, David C; Campbell, Peter J

    2015-07-01

    The sequencing of cancer genomes may enable tailoring of therapeutics to the underlying biological abnormalities driving a particular patient's tumor. However, sequencing-based strategies rely heavily on representative sampling of tumors. To understand the subclonal structure of primary breast cancer, we applied whole-genome and targeted sequencing to multiple samples from each of 50 patients' tumors (303 samples in total). The extent of subclonal diversification varied among cases and followed spatial patterns. No strict temporal order was evident, with point mutations and rearrangements affecting the most common breast cancer genes, including PIK3CA, TP53, PTEN, BRCA2 and MYC, occurring early in some tumors and late in others. In 13 out of 50 cancers, potentially targetable mutations were subclonal. Landmarks of disease progression, such as resistance to chemotherapy and the acquisition of invasive or metastatic potential, arose within detectable subclones of antecedent lesions. These findings highlight the importance of including analyses of subclonal structure and tumor evolution in clinical trials of primary breast cancer. PMID:26099045

  4. On the current status of Phakopsora pachyrhizi genome sequencing.

    PubMed

    Loehrer, Marco; Vogel, Alexander; Huettel, Bruno; Reinhardt, Richard; Benes, Vladimir; Duplessis, Sébastien; Usadel, Björn; Schaffrath, Ulrich

    2014-01-01

    Recent advances in the field of sequencing technologies and bioinformatics allow a more rapid access to genomes of non-model organisms at sinking costs. Accordingly, draft genomes of several economically important cereal rust fungi have been released in the last 3 years. Aside from the very recent flax rust and poplar rust draft assemblies there are no genomic data available for other dicot-infecting rust fungi. In this article we outline rust fungus sequencing efforts and comment on the current status of Phakopsora pachyrhizi (Asian soybean rust) genome sequencing. PMID:25221558

  5. On the current status of Phakopsora pachyrhizi genome sequencing

    PubMed Central

    Loehrer, Marco; Vogel, Alexander; Huettel, Bruno; Reinhardt, Richard; Benes, Vladimir; Duplessis, Sébastien; Usadel, Björn; Schaffrath, Ulrich

    2014-01-01

    Recent advances in the field of sequencing technologies and bioinformatics allow a more rapid access to genomes of non-model organisms at sinking costs. Accordingly, draft genomes of several economically important cereal rust fungi have been released in the last 3 years. Aside from the very recent flax rust and poplar rust draft assemblies there are no genomic data available for other dicot-infecting rust fungi. In this article we outline rust fungus sequencing efforts and comment on the current status of Phakopsora pachyrhizi (Asian soybean rust) genome sequencing. PMID:25221558

  6. Draft Genome Sequence of Tolypothrix boutellei Strain VB521301

    PubMed Central

    Chandrababunaidu, Mathu Malar; Singh, Deeksha; Sen, Diya; Bhan, Sushma; Das, Subhadeep; Gupta, Akash

    2015-01-01

    We report here the draft genome sequence of the filamentous nitrogen-fixing cyanobacterium Tolypothrix boutellei strain VB521301. The organism is lipid rich and hydrophobic and produces polyunsaturated fatty acids which can be harnessed for industrial purpose. The draft genome sequence assembled into 11,572,263 bp with 70 scaffolds and 7,777 protein coding genes. PMID:25700407

  7. The Prospects for Sequencing the Western Corn Rootworm Genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Historically, obtaining the complete sequence of eukaryotic genomes has been an expensive and complex task. For this reason, efforts to sequence insect genomes have largely been confined to model organisms, species that are important to human health, and representative species from a few insect orde...

  8. Nearly Complete Genome Sequence of Lactobacillus plantarum Strain NIZO2877

    PubMed Central

    Bayjanov, Jumamurat R.; Joncour, Pauline; Hughes, Sandrine; Gillet, Benjamin; Kleerebezem, Michiel; Siezen, Roland; van Hijum, Sacha A. F. T.

    2015-01-01

    Lactobacillus plantarum is a versatile bacterial species that is isolated mostly from foods. Here, we present the first genome sequence of L. plantarum strain NIZO2877 isolated from a hot dog in Vietnam. Its two contigs represent a nearly complete genome sequence. PMID:26607887

  9. De Novo Genome Sequence of Yersinia aleksiciae Y159T

    PubMed Central

    Neubauer, Heinrich

    2015-01-01

    We report here on the genome sequence of Yersinia aleksiciae Y159T, isolated in Finland in 1981. The genome has a size of 4 Mb, a G+C content of 49%, and is predicted to contain 3,423 coding sequences. PMID:26383649

  10. Complete Genome Sequence of the Human Gut Symbiont Roseburia hominis

    PubMed Central

    Travis, Anthony J.; Kelly, Denise; Flint, Harry J.

    2015-01-01

    We report here the complete genome sequence of the human gut symbiont Roseburia hominis A2-183T (= DSM 16839T = NCIMB 14029T), isolated from human feces. The genome is represented by a 3,592,125-bp chromosome with 3,405 coding sequences. A number of potential functions contributing to host-microbe interaction are identified. PMID:26543119

  11. Almost finished: the complete genome sequence of Mycosphaerella graminicola

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Mycosphaerella graminicola causes septoria tritici blotch of wheat. An 8.9x shotgun sequence of bread wheat strain IPO323 was generated through the Community Sequencing Program of the U.S. Department of Energy’s Joint Genome Institute (JGI), and was finished at the Stanford Human Genome Center. The ...

  12. Draft genome sequence of Kocuria rhizophila P7-4.

    PubMed

    Kim, Woo-Jin; Kim, Young-Ok; Kim, Dae-Soo; Choi, Sang-Haeng; Kim, Dong-Wook; Lee, Jun-Seo; Kong, Hee Jeong; Nam, Bo-Hye; Kim, Bong-Seok; Lee, Sang-Jun; Park, Hong-Seog; Chae, Sung-Hwa

    2011-08-01

    We report the draft genome sequence of Kocuria rhizophila P7-4, which was isolated from the intestine of Siganus doliatus caught in the Pacific Ocean. The 2.83-Mb genome sequence consists of 75 large contigs (>100 bp in size) and contains 2,462 predicted protein-coding genes. PMID:21685281

  13. Draft Genome Sequence of Kocuria rhizophila P7-4

    PubMed Central

    Kim, Woo-Jin; Kim, Young-Ok; Kim, Dae-Soo; Choi, Sang-Haeng; Kim, Dong-Wook; Lee, Jun-Seo; Kong, Hee Jeong; Nam, Bo-Hye; Kim, Bong-Seok; Lee, Sang-Jun; Park, Hong-Seog; Chae, Sung-Hwa

    2011-01-01

    We report the draft genome sequence of Kocuria rhizophila P7-4, which was isolated from the intestine of Siganus doliatus caught in the Pacific Ocean. The 2.83-Mb genome sequence consists of 75 large contigs (>100 bp in size) and contains 2,462 predicted protein-coding genes. PMID:21685281

  14. Complete genome sequence of Chinese strain of ‘Candidatus Liberibacter asiaticus’

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complete genome sequence of ‘Candidatus Liberibacter asiaticus’ strain (Las) Guangxi-1(GX-1) was obtained by an Illumina HiSeq 2000. The GX-1 genome comprises 1,268,237 nucleotides, 36.5 % GC content, 1,141 predicted coding sequences, 44 tRNAs, 3 complete copies of ribosomal RNA genes (16S, 23S ...

  15. Draft Genome Sequence of Neurospora crassa Strain FGSC 73

    DOE PAGESBeta

    Baker, Scott E.; Schackwitz, Wendy; Lipzen, Anna; Martin, Joel; Haridas, Sajeet; LaButti, Kurt; Grigoriev, Igor V.; Simmons, Blake A.; McCluskey, Kevin

    2015-04-02

    We report the elucidation of the complete genome of the Neurospora crassa (Shear and Dodge) strain FGSC 73, a mat-a, trp-3 mutant strain. The genome sequence around the idiotypic mating type locus represents the only publicly available sequence for a mat-a strain. 40.42 Megabases are assembled into 358 scaffolds carrying 11,978 gene models.

  16. De Novo Assembly of a Bell Pepper Endornavirus Genome Sequence Using RNA Sequencing Data

    PubMed Central

    Jo, Yeonhwa; Choi, Hoseng

    2015-01-01

    The genus Endornavirus is a double-stranded RNA virus that infects a wide range of hosts. In this study, we report on the de novo assembly of a bell pepper endornavirus genome sequence by RNA sequencing (RNA-Seq). Our result demonstrates the successful application of RNA-Seq to obtain a complete viral genome sequence from the transcriptome data. PMID:25792042

  17. De novo assembly of a bell pepper endornavirus genome sequence using RNA sequencing data.

    PubMed

    Jo, Yeonhwa; Choi, Hoseng; Cho, Won Kyong

    2015-01-01

    The genus Endornavirus is a double-stranded RNA virus that infects a wide range of hosts. In this study, we report on the de novo assembly of a bell pepper endornavirus genome sequence by RNA sequencing (RNA-Seq). Our result demonstrates the successful application of RNA-Seq to obtain a complete viral genome sequence from the transcriptome data. PMID:25792042

  18. Unexpected cross-species contamination in genome sequencing projects

    PubMed Central

    Merchant, Samier; Wood, Derrick E.

    2014-01-01

    The raw data from a genome sequencing project sometimes contains DNA from contaminating organisms, which may be introduced during sample collection or sequence preparation. In some instances, these contaminants remain in the sequence even after assembly and deposition of the genome into public databases. As a result, searches of these databases may yield erroneous and confusing results. We used efficient microbiome analysis software to scan the draft assembly of domestic cow, Bos taurus, and identify 173 small contigs that appeared to derive from microbial contaminants. In the course of verifying these findings, we discovered that one genome, Neisseria gonorrhoeae TCDC-NG08107, although putatively a complete genome, contained multiple sequences that actually derived from the cow and sheep genomes. Our findings illustrate the need to carefully validate findings of anomalous DNA that rely on comparisons to either draft or finished genomes. PMID:25426337

  19. Draft sequences of the radish (Raphanus sativus L.) genome.

    PubMed

    Kitashiba, Hiroyasu; Li, Feng; Hirakawa, Hideki; Kawanabe, Takahiro; Zou, Zhongwei; Hasegawa, Yoichi; Tonosaki, Kaoru; Shirasawa, Sachiko; Fukushima, Aki; Yokoi, Shuji; Takahata, Yoshihito; Kakizaki, Tomohiro; Ishida, Masahiko; Okamoto, Shunsuke; Sakamoto, Koji; Shirasawa, Kenta; Tabata, Satoshi; Nishio, Takeshi

    2014-10-01

    Radish (Raphanus sativus L., n = 9) is one of the major vegetables in Asia. Since the genomes of Brassica and related species including radish underwent genome rearrangement, it is quite difficult to perform functional analysis based on the reported genomic sequence of Brassica rapa. Therefore, we performed genome sequencing of radish. Short reads of genomic sequences of 191.1 Gb were obtained by next-generation sequencing (NGS) for a radish inbred line, and 76,592 scaffolds of ≥ 300 bp were constructed along with the bacterial artificial chromosome-end sequences. Finally, the whole draft genomic sequence of 402 Mb spanning 75.9% of the estimated genomic size and containing 61,572 predicted genes was obtained. Subsequently, 221 single nucleotide polymorphism markers and 768 PCR-RFLP markers were used together with the 746 markers produced in our previous study for the construction of a linkage map. The map was combined further with another radish linkage map constructed mainly with expressed sequence tag-simple sequence repeat markers into a high-density integrated map of 1,166 cM with 2,553 DNA markers. A total of 1,345 scaffolds were assigned to the linkage map, spanning 116.0 Mb. Bulked PCR products amplified by 2,880 primer pairs were sequenced by NGS, and SNPs in eight inbred lines were identified. PMID:24848699

  20. Genome sequencing and annotation of Serratia sp. strain TEL.

    PubMed

    Lephoto, Tiisetso E; Gray, Vincent M

    2015-12-01

    We present the annotation of the draft genome sequence of Serratia sp. strain TEL (GenBank accession number KP711410). This organism was isolated from entomopathogenic nematode Oscheius sp. strain TEL (GenBank accession number KM492926) collected from grassland soil and has a genome size of 5,000,541 bp and 542 subsystems. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession number LDEG00000000. PMID:26697332

  1. Markov encoding for detecting signals in genomic sequences.

    PubMed

    Rajapakse, Jagath C; Ho, Loi Sy

    2005-01-01

    We present a technique to encode the inputs to neural networks for the detection of signals in genomic sequences. The encoding is based on lower-order Markov models which incorporate known biological characteristics in genomic sequences. The neural networks then learn intrinsic higher-order dependencies of nucleotides at the signal sites. We demonstrate the efficacy of the Markov encoding method in the detection of three genomic signals, namely, splice sites, transcription start sites, and translation initiation sites. PMID:17044178
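
    The idea of feeding a neural network Markov-based inputs can be illustrated with a minimal sketch. The abstract does not give the exact encoding, so everything below is an assumption for illustration only: a single position-independent first-order model, add-one smoothing, and the hypothetical helper names fit_first_order and markov_encode. Each base is replaced by the estimated probability of seeing it after the preceding base, and the resulting vector is what a downstream classifier would receive.

```python
ALPHABET = "ACGT"  # assumption: sequences contain only these four bases


def fit_first_order(seqs, pseudocount=1.0):
    """Estimate P(next base | previous base) from training sequences.

    A pseudocount is added to every transition so unseen pairs keep a
    non-zero probability (an illustrative choice, not taken from the paper).
    """
    counts = {p: {n: pseudocount for n in ALPHABET} for p in ALPHABET}
    for s in seqs:
        for prev, nxt in zip(s, s[1:]):
            counts[prev][nxt] += 1
    probs = {}
    for p in ALPHABET:
        total = sum(counts[p].values())
        probs[p] = {n: counts[p][n] / total for n in ALPHABET}
    return probs


def markov_encode(seq, probs):
    """Encode a sequence as one number per position.

    The first base has no predecessor, so it gets the uniform value 1/4;
    every later base gets P(base | previous base).
    """
    encoded = [1.0 / len(ALPHABET)]
    for prev, nxt in zip(seq, seq[1:]):
        encoded.append(probs[prev][nxt])
    return encoded


if __name__ == "__main__":
    model = fit_first_order(["ACGT", "ACGA"])
    print(markov_encode("ACG", model))
```

    A position-specific model around the signal site would likely capture more of the known biology, at the cost of needing more aligned training examples.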

  2. Genome sequencing and annotation of Serratia sp. strain TEL

    PubMed Central

    Lephoto, Tiisetso E.; Gray, Vincent M.

    2015-01-01

    We present the annotation of the draft genome sequence of Serratia sp. strain TEL (GenBank accession number KP711410). This organism was isolated from entomopathogenic nematode Oscheius sp. strain TEL (GenBank accession number KM492926) collected from grassland soil and has a genome size of 5,000,541 bp and 542 subsystems. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession number LDEG00000000. PMID:26697332

  3. The Cancer Genome Atlas Pan-Cancer analysis project

    E-print Network

    Lander, Eric S.

    The Cancer Genome Atlas (TCGA) Research Network has profiled and analyzed large numbers of human tumors to discover molecular aberrations at the DNA, RNA, protein and epigenetic levels. The resulting rich data provide a ...

  4. Large-scale profiling of microRNAs for The Cancer Genome Atlas.

    PubMed

    Chu, Andy; Robertson, Gordon; Brooks, Denise; Mungall, Andrew J; Birol, Inanc; Coope, Robin; Ma, Yussanne; Jones, Steven; Marra, Marco A

    2016-01-01

    The comprehensive multiplatform genomics data generated by The Cancer Genome Atlas (TCGA) Research Network is an enabling resource for cancer research. It includes an unprecedented amount of microRNA sequence data: ~11 000 libraries across 33 cancer types. Combined with initiatives like the National Cancer Institute Genomics Cloud Pilots, such data resources will make intensive analysis of large-scale cancer genomics data widely accessible. To support such initiatives, and to enable comparison of TCGA microRNA data to data from other projects, we describe the process that we developed and used to generate the microRNA sequence data, from library construction through to submission of data to repositories. In the context of this process, we describe the computational pipeline that we used to characterize microRNA expression across large patient cohorts. PMID:26271990

  5. Whole-Genome Sequences of Thirteen Isolates of Borrelia burgdorferi

    SciTech Connect

    Schutzer S. E.; Dunn J.; Fraser-Liggett, C. M.; Casjens, S. R.; Qiu, W.-G.; Mongodin, E. F.; Luft, B. J.

    2011-02-01

    Borrelia burgdorferi is a causative agent of Lyme disease in North America and Eurasia. The first complete genome sequence of B. burgdorferi strain 31, available for more than a decade, has assisted research on the pathogenesis of Lyme disease. Because a single genome sequence is not sufficient to understand the relationship between genotypic and geographic variation and disease phenotype, we determined the whole-genome sequences of 13 additional B. burgdorferi isolates that span the range of natural variation. These sequences should allow improved understanding of pathogenesis and provide a foundation for novel detection, diagnosis, and prevention strategies.

  6. Scrutinizing Virus Genome Termini by High-Throughput Sequencing

    PubMed Central

    Fan, Huahao; Jiang, Huanhuan; Chen, Yubao; Tong, Yigang

    2014-01-01

    Analysis of genomic terminal sequences has been a major step in studies on viral DNA replication and packaging mechanisms. However, traditional methods to study genome termini are challenging due to the time-consuming protocols and their inefficiency where critical details are lost easily. Recent advances in next generation sequencing (NGS) have enabled it to be a powerful tool to study genome termini. In this study, using NGS we sequenced one iridovirus genome and twenty phage genomes and confirmed for the first time that the high frequency sequences (HFSs) found in the NGS reads are indeed the terminal sequences of viral genomes. Further, we established a criterion to distinguish the type of termini and the viral packaging mode. We also obtained additional terminal details such as terminal repeats, multi-termini, asymmetric termini. With this approach, we were able to simultaneously detect details of the genome termini as well as obtain the complete sequence of bacteriophage genomes. Theoretically, this application can be further extended to analyze larger and more complicated genomes of plant and animal viruses. This study proposed a novel and efficient method for research on viral replication, packaging, terminase activity, transcription regulation, and metabolism of the host cell. PMID:24465717
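
    The observation that terminal sequences show up as high-frequency sequences at read starts can be sketched in a few lines. This is only an illustration under stated assumptions: reads are plain strings, the prefix length k and the helper name high_frequency_starts are hypothetical, and no threshold from the study is reproduced. The function counts the first k bases of every read and reports the most common prefixes together with their ratio to the mean prefix count.

```python
from collections import Counter


def high_frequency_starts(reads, k=8, top=3):
    """Return the most common k-base read starts.

    Each result is (prefix, count, count / mean count over all prefixes).
    Reads shorter than k are ignored.
    """
    starts = Counter(r[:k] for r in reads if len(r) >= k)
    if not starts:
        return []
    mean = sum(starts.values()) / len(starts)
    return [(seq, n, n / mean) for seq, n in starts.most_common(top)]


if __name__ == "__main__":
    reads = ["ACGTACGTAA"] * 5 + ["TTTTGGGGCC", "GGGGCCCCAA"]
    for prefix, count, ratio in high_frequency_starts(reads):
        print(prefix, count, round(ratio, 2))
```

    A prefix whose count stands far above the mean is a candidate terminus; deciding how far is far enough, and what that implies about the packaging mode, is the criterion the study itself establishes and is not encoded here.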

  7. Ovarian Cancer Biomarker Discovery Based on Genomic Approaches

    PubMed Central

    Lee, Jung-Yun; Kim, Hee Seung; Suh, Dong Hoon; Kim, Mi-Kyung; Chung, Hyun Hoon; Song, Yong-Sang

    2013-01-01

    Ovarian cancer presents at an advanced stage in more than 75% of patients. Early detection has great promise to improve clinical outcomes. Although the advancing proteomic technologies led to the discovery of numerous ovarian cancer biomarkers, no screening method has been recommended for early detection of ovarian cancer. Complexity and heterogeneity of ovarian carcinogenesis is a major obstacle to discover biomarkers. As cancer arises due to accumulation of genetic change, understanding the close connection between genetic changes and ovarian carcinogenesis would provide the opportunity to find novel gene-level ovarian cancer biomarkers. In this review, we summarize the various gene-based biomarkers by genomic technologies, including inherited gene mutations, epigenetic changes, and differential gene expression. In addition, we suggest the strategy to discover novel gene-based biomarkers with recently introduced next generation sequencing. PMID:25337559

  8. Genome Science and Personalized Cancer Treatment

    SciTech Connect

    Gray, Joe

    2009-08-04

    Summer Lecture Series 2009: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

  9. Genome Science and Personalized Cancer Treatment

    SciTech Connect

    Gray, Joe

    2009-08-07

    August 4, 2009 Berkeley Lab lecture: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

  10. CTD² Publication Guidelines | Office of Cancer Genomics

    Cancer.gov

    The Cancer Target Discovery and Development (CTD²) Network is a “community resource project” supported by the National Cancer Institute’s Office of Cancer Genomics. Members of the Network release data to the broader research community by depositing data into NCI-supported or public databases. Data deposition is NOT equivalent to publishing in a peer-reviewed journal. Unless there is a manuscript associated with a dataset, the Network considers data to be formally unpublished.

  11. Genome Science and Personalized Cancer Treatment

    ScienceCinema

    Gray, Joe

    2010-01-08

    August 4, 2009 Berkeley Lab lecture: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

  12. Single-cell paired-end genome sequencing reveals structural variation per cell cycle

    E-print Network

    Single-cell paired-end genome sequencing reveals structural variation per cell cycle Thierry Voet1 are concealed within the bulk signal and per cell cycle mutation rates and mechanisms remain elusive. Although. By applying the methods, we capture DNA copy number changes acquired over one cell cycle in breast cancer

  13. Draft Genome Sequence of Gerbil-Adapted Carcinogenic Helicobacter pylori Strain 7.13.

    PubMed

    Asim, Mohammad; Chikara, Surendra K; Ghosh, Arpita; Vudathala, Srinivas; Romero-Gallo, Judith; Krishna, Uma S; Wilson, Keith T; Israel, Dawn A; Peek, Richard M; Chaturvedi, Rupesh

    2015-01-01

    We report here the draft genome sequence of Helicobacter pylori strain 7.13, a gerbil-adapted strain that causes gastric cancer in gerbils. Strain 7.13 is derived from clinical strain B128, isolated from a patient with a duodenal ulcer. This study reveals genes associated with the virulence of the strain. PMID:26067974

  14. Genome Science: A Video Tour of the Washington University Genome Sequencing Center for High School and Undergraduate Students

    ERIC Educational Resources Information Center

    Flowers, Susan K.; Easter, Carla; Holmes, Andrea; Cohen, Brian; Bednarski, April E.; Mardis, Elaine R.; Wilson, Richard K.; Elgin, Sarah C. R.

    2005-01-01

    Sequencing of the human genome has ushered in a new era of biology. The technologies developed to facilitate the sequencing of the human genome are now being applied to the sequencing of other genomes. In 2004, a partnership was formed between Washington University School of Medicine Genome Sequencing Center's Outreach Program and Washington…

  15. Doug Brutlag 2015 Sequencing the Human Genome

    E-print Network

    Brutlag, Doug

    Project: Should we do it? · Service, R. F. (2001). The human genome: Objection #1: big biology is bad://www.elec-intro.com/m13-cloning © Doug Brutlag 2015 Public Human Genome Project Strategy Published in Nature 15 The Human Genome Project: How should we do it? · Weber, J. L., & Myers, E. W. (1997). Human whole-genome

  16. Genome Sequence of Tumebacillus flagellatus GST4, the First Genome Sequence of a Species in the Genus Tumebacillus

    PubMed Central

    Wang, Qing-Yan; Huang, Yan-Yan; Song, Li-Fu; Du, Qi-Shi; Yu, Bo; Chen, Dong

    2014-01-01

    We present here the first genome sequence of a species in the genus Tumebacillus. The draft genome sequence of Tumebacillus flagellatus GST4 provides a genetic basis for future studies addressing the origins, evolution, and ecological role of Tumebacillus organisms, as well as a source of acid-resistant amylase-encoding genes for further studies. PMID:25395648

  17. Using Partial Genomic Fosmid Libraries for Sequencing Complete Organellar Genomes

    SciTech Connect

    McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.; Kuehl, Jennifer V.; Boore, Jeffrey L.; dePamphilis, Claude W.

    2005-08-26

    Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.

  18. BreCAN-DB: a repository cum browser of personalized DNA breakpoint profiles of cancer genomes.

    PubMed

    Narang, Pankaj; Dhapola, Parashar; Chowdhury, Shantanu

    2016-01-01

    BreCAN-DB (http://brecandb.igib.res.in) is a repository cum browser of whole genome somatic DNA breakpoint profiles of cancer genomes, mapped at single nucleotide resolution using deep sequencing data. These breakpoints are associated with deletions, insertions, inversions, tandem duplications, translocations and a combination of these structural genomic alterations. The current release of BreCAN-DB features breakpoint profiles from 99 cancer-normal pairs, comprising five cancer types. We identified DNA breakpoints across genomes using high-coverage next-generation sequencing data obtained from TCGA and dbGaP. Further, in these cancer genomes, we methodically identified breakpoint hotspots which were significantly enriched with somatic structural alterations. To visualize the breakpoint profiles, a next-generation genome browser was integrated with BreCAN-DB. Moreover, we also included previously reported breakpoint profiles from 138 cancer-normal pairs, spanning 10 cancer types, in the browser. Additionally, BreCAN-DB allows one to identify breakpoint hotspots in user-uploaded data sets. We have also included functionality to query the overlap of any breakpoint profile with regions of the user's interest. Users can download breakpoint profiles from the database or may submit their data to be integrated in BreCAN-DB. We believe that BreCAN-DB will be a useful resource for the genomics scientific community and is a step towards personalized cancer genomics. PMID:26586806

  19. BreCAN-DB: a repository cum browser of personalized DNA breakpoint profiles of cancer genomes

    PubMed Central

    Narang, Pankaj; Dhapola, Parashar; Chowdhury, Shantanu

    2016-01-01

    BreCAN-DB (http://brecandb.igib.res.in) is a repository cum browser of whole genome somatic DNA breakpoint profiles of cancer genomes, mapped at single nucleotide resolution using deep sequencing data. These breakpoints are associated with deletions, insertions, inversions, tandem duplications, translocations and combinations of these structural genomic alterations. The current release of BreCAN-DB features breakpoint profiles from 99 cancer-normal pairs, comprising five cancer types. We identified DNA breakpoints across genomes using high-coverage next-generation sequencing data obtained from TCGA and dbGaP. Further, in these cancer genomes, we methodically identified breakpoint hotspots which were significantly enriched with somatic structural alterations. To visualize the breakpoint profiles, a next-generation genome browser was integrated with BreCAN-DB. We also included previously reported breakpoint profiles from 138 cancer-normal pairs, spanning 10 cancer types, in the browser. Additionally, BreCAN-DB allows one to identify breakpoint hotspots in user-uploaded data sets. We have also included functionality to query the overlap of any breakpoint profile with regions of interest to the user. Users can download breakpoint profiles from the database or may submit their data to be integrated in BreCAN-DB. We believe that BreCAN-DB will be a useful resource for the genomics community and is a step towards personalized cancer genomics. PMID:26586806
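    The overlap query mentioned in the BreCAN-DB abstract can be illustrated with a minimal sketch. This is not the BreCAN-DB implementation or API: the function name `breakpoints_in_regions`, the `(chrom, pos)` and `(chrom, start, end)` tuple layouts, and the use of closed intervals are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def breakpoints_in_regions(breakpoints, regions):
    """Return, for each region, the sorted breakpoint positions it contains.

    breakpoints: iterable of (chrom, pos) tuples (hypothetical layout).
    regions: iterable of (chrom, start, end) tuples, treated as closed
             intervals [start, end] (an assumption, not a BreCAN-DB rule).
    """
    # Group breakpoint positions by chromosome and sort them once,
    # so that each region query is a pair of binary searches.
    by_chrom = defaultdict(list)
    for chrom, pos in breakpoints:
        by_chrom[chrom].append(pos)
    for positions in by_chrom.values():
        positions.sort()

    hits = {}
    for chrom, start, end in regions:
        positions = by_chrom.get(chrom, [])
        lo = bisect_left(positions, start)   # first position >= start
        hi = bisect_right(positions, end)    # first position > end
        hits[(chrom, start, end)] = positions[lo:hi]
    return hits
```

    Sorting once per chromosome keeps each query logarithmic in the number of breakpoints; a real browser would more likely rely on an indexed interval format, which this sketch does not attempt to model.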

  20. Interpretation of personal genome sequencing data in terms of disease ranks based on mutual information

    PubMed Central

    2015-01-01

    Background The rapid advances in genome sequencing technologies have resulted in an unprecedented number of genome variations being discovered in humans. However, there has been very limited coverage of interpretation of the personal genome sequencing data in terms of diseases. Methods In this paper we present the first computational analysis scheme for interpreting personal genome data by simultaneously considering the functional impact of damaging variants and curated disease-gene association data. This method is based on mutual information as a measure of the relative closeness between the personal genome and diseases. We hypothesize that a higher mutual information score implies that the personal genome is more susceptible to a particular disease than other diseases. Results The method was applied to the sequencing data of 50 acute myeloid leukemia (AML) patients in The Cancer Genome Atlas. The utility of associations between a disease and the personal genome was explored using data of healthy (control) people obtained from the 1000 Genomes Project. The ranks of the disease terms in the AML patient group were compared with those in the healthy control group using "Leukemia, Myeloid, Acute" (C04.557.337.539.550) as the corresponding MeSH disease term. The mutual information rank of the disease term was substantially higher in the AML patient group than in the healthy control group, which demonstrates that the proposed methodology can be successfully applied to infer associations between the personal genome and diseases. Conclusions Overall, the area under the receiver operating characteristics curve was significantly larger for the AML patient data than for the healthy controls. This methodology could contribute to consequential discoveries and explanations for mining personal genome sequencing data in terms of diseases, and have versatility with respect to genomic-based knowledge such as drug-gene and environmental-factor-gene interactions. PMID:26045178
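    The abstract above does not give the formula used, so the following is only a sketch of how a mutual-information score between a personal genome and a disease could be computed. It assumes, purely for illustration, that each gene in a fixed universe carries two binary labels: whether the individual has a damaging variant in it, and whether it is associated with the disease. The function name and the set-based inputs are hypothetical.

```python
import math


def mutual_information(variant_genes, disease_genes, all_genes):
    """Mutual information (in bits) between two binary gene labels.

    variant_genes: set of genes with a damaging variant (assumed input).
    disease_genes: set of genes associated with the disease (assumed input).
    all_genes: the gene universe over which probabilities are estimated.
    """
    genes = list(all_genes)
    n = len(genes)
    if n == 0:
        return 0.0

    mi = 0.0
    for x in (True, False):
        n_x = sum(1 for g in genes if (g in variant_genes) == x)
        for y in (True, False):
            n_y = sum(1 for g in genes if (g in disease_genes) == y)
            n_xy = sum(
                1 for g in genes
                if (g in variant_genes) == x and (g in disease_genes) == y
            )
            if n_xy == 0:
                continue  # a zero joint count contributes nothing
            # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), using counts
            mi += (n_xy / n) * math.log2(n_xy * n / (n_x * n_y))
    return mi
```

    Under this assumed setup, ranking diseases for one genome amounts to computing the score for each disease gene set and sorting in descending order.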

  1. Community-wide analysis of microbial genome sequence signatures

    PubMed Central

    Dick, Gregory J; Andersson, Anders F; Baker, Brett J; Simmons, Sheri L; Thomas, Brian C; Yelton, A Pepper; Banfield, Jillian F

    2009-01-01

    Background Analyses of DNA sequences from cultivated microorganisms have revealed genome-wide, taxa-specific nucleotide compositional characteristics, referred to as genome signatures. These signatures have far-reaching implications for understanding genome evolution and potential application in classification of metagenomic sequence fragments. However, little is known regarding the distribution of genome signatures in natural microbial communities or the extent to which environmental factors shape them. Results We analyzed metagenomic sequence data from two acidophilic biofilm communities, including composite genomes reconstructed for nine archaea, three bacteria, and numerous associated viruses, as well as thousands of unassigned fragments from strain variants and low-abundance organisms. Genome signatures, in the form of tetranucleotide frequencies analyzed by emergent self-organizing maps, segregated sequences from all known populations sharing < 50 to 60% average amino acid identity and revealed previously unknown genomic clusters corresponding to low-abundance organisms and a putative plasmid. Signatures were pervasive genome-wide. Clusters were resolved because intra-genome differences resulting from translational selection or protein adaptation to the intracellular (pH ~5) versus extracellular (pH ~1) environment were small relative to inter-genome differences. We found that these genome signatures stem from multiple influences but are primarily manifested through codon composition, which we propose is the result of genome-specific mutational biases. Conclusions An important conclusion is that shared environmental pressures and interactions among coevolving organisms do not obscure genome signatures in acid mine drainage communities. Thus, genome signatures can be used to assign sequence fragments to populations, an essential prerequisite if metagenomics is to provide ecological and biochemical insights into the functioning of microbial communities. 
    PMID:19698104
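    The genome signatures in the abstract above are tetranucleotide frequencies. A minimal sketch of computing such a frequency vector for one sequence follows. It is an illustration only: the function name is hypothetical, windows containing characters other than A, C, G, or T are skipped by assumption, and reverse-complement merging and the emergent self-organizing map step are not reproduced.

```python
from collections import Counter
from itertools import product


def tetranucleotide_frequencies(seq):
    """Return a dict mapping each of the 256 tetranucleotides to its
    relative frequency among the valid overlapping 4-mers in seq."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=4)]

    # Count every overlapping window of length 4.
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))

    # Only windows made purely of A/C/G/T contribute to the total,
    # so ambiguous bases such as N are ignored (an assumption).
    total = sum(counts[k] for k in kmers)
    if total == 0:
        return {k: 0.0 for k in kmers}
    return {k: counts[k] / total for k in kmers}
```

    Vectors of this kind, computed per contig or fragment, are the sort of input that a clustering method can then group by similarity.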

  2. Complete genome sequence of Mycoplasma haemofelis, a hemotropic mycoplasma.

    PubMed

    Barker, Emily N; Helps, Chris R; Peters, Iain R; Darby, Alistair C; Radford, Alan D; Tasker, Séverine

    2011-04-01

    Here, we present the genome sequence of Mycoplasma haemofelis strain Langford 1, representing the first hemotropic mycoplasma (hemoplasma) species to be completely sequenced and annotated. Originally isolated from a cat with hemolytic anemia, this strain induces severe hemolytic anemia when inoculated into specific-pathogen-free-derived cats. The genome sequence has provided insights into the biology of this uncultivatable hemoplasma and has identified potential molecular mechanisms underlying its pathogenicity. PMID:21317334

  3. Complete Genome Sequence of Mycoplasma haemofelis, a Hemotropic Mycoplasma

    PubMed Central

    Barker, Emily N.; Helps, Chris R.; Peters, Iain R.; Darby, Alistair C.; Radford, Alan D.; Tasker, Séverine

    2011-01-01

    Here, we present the genome sequence of Mycoplasma haemofelis strain Langford 1, representing the first hemotropic mycoplasma (hemoplasma) species to be completely sequenced and annotated. Originally isolated from a cat with hemolytic anemia, this strain induces severe hemolytic anemia when inoculated into specific-pathogen-free-derived cats. The genome sequence has provided insights into the biology of this uncultivatable hemoplasma and has identified potential molecular mechanisms underlying its pathogenicity. PMID:21317334

  4. Single-Cell Whole-Genome Amplification and Sequencing: Methodology and Applications.

    PubMed

    Huang, Lei; Ma, Fei; Chapman, Alec; Lu, Sijia; Xie, Xiaoliang Sunney

    2015-01-01

    We present a survey of single-cell whole-genome amplification (WGA) methods, including degenerate oligonucleotide-primed polymerase chain reaction (DOP-PCR), multiple displacement amplification (MDA), and multiple annealing and looping-based amplification cycles (MALBAC). The key parameters to characterize the performance of these methods are defined, including genome coverage, uniformity, reproducibility, unmappable rates, chimera rates, allele dropout rates, false positive rates for calling single-nucleotide variations, and ability to call copy-number variations. Using these parameters, we compare five commercial WGA kits by performing deep sequencing of multiple single cells. We also discuss several major applications of single-cell genomics, including studies of whole-genome de novo mutation rates, the early evolution of cancer genomes, circulating tumor cells (CTCs), meiotic recombination of germ cells, preimplantation genetic diagnosis (PGD), and preimplantation genomic screening (PGS) for in vitro-fertilized embryos. PMID:26077818

  5. Reference genome sequence of the model plant Setaria

    SciTech Connect

    Bennetzen, Jeffrey L; Yang, Xiaohan; Ye, Chuyu; Tuskan, Gerald A

    2012-01-01

    We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

  6. Reference genome sequence of the model plant Setaria

    SciTech Connect

    Bennetzen, Jeffrey L; Schmutz, Jeremy; Wang, Hao; Percifield, Ryan; Hawkins, Jennifer; Pontaroli, Ana C.; Estep, Matt; Feng, Liang; Vaughn, Justin N; Grimwood, Jane; Jenkins, Jerry; Barry, Kerrie; Lindquist, Erika; Hellsten, Uffe; Deshpande, Shweta; Wang, Xuewen; Wu, Xiaomei; Mitros, Therese; Triplett, Jimmy; Yang, Xiaohan; Ye, Chuyu; Mauro-Herrera, Margarita; Wang, Lin; Li, Pinghua; Sharma, Manoj; Sharma, Rita; Ronald, Pamela; Panaud, Olivier; Kellogg, Elizabeth A.; Brutnell, Thomas P.; Doust, Andrew N.; Tuskan, Gerald A; Rokhsar, Daniel; Devos, Katrien M

    2012-01-01

    We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

  7. Open-Access Cancer Genomics - Office of Cancer Clinical Proteomics Research

    Cancer.gov

    The completion of the Human Genome Project sparked a revolution in high-throughput genomics applied towards deciphering genetically complex diseases, like cancer. Now, almost 10 years later, we have a mountain of genomics data on many different cancer type

  8. Microbial genome sequencing using optical mapping and Illumina sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Introduction Optical mapping is a technique in which strands of genomic DNA are digested with one or more restriction enzymes, and a physical map of the genome constructed from the resulting image. In outline, genomic DNA is extracted from a pure culture, linearly arrayed on a specialized glass sli...

  9. Complete genome sequence of Gordonia bronchialis type strain (3410T)

    SciTech Connect

    Ivanova, N; Sikorski, Johannes; Jando, Marlen; Lapidus, Alla L.; Nolan, Matt; Glavina Del Rio, Tijana; Tice, Hope; Copeland, A; Cheng, Jan-Fang; Chen, Feng; Bruce, David; Goodwin, Lynne A.; Pitluck, Sam; Mavromatis, K; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam L; Hauser, Loren John; Chang, Yun-Juan; Jeffries, Cynthia; Chain, Patrick S. G.; Saunders, Elizabeth H; Han, Cliff; Detter, J C; Brettin, Thomas S; Rohde, Manfred; Goker, Markus; Bristow, James; Eisen, Jonathan; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C

    2010-01-01

    Gordonia bronchialis Tsukamura 1971 is the type species of the genus. G. bronchialis is a human-pathogenic organism that has been isolated from a large variety of human tissues. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the family Gordoniaceae. The 5,290,012 bp long genome with its 4,944 protein-coding and 55 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  10. Complete genome sequence of Acidimicrobium ferrooxidans type strain (ICPT)

    SciTech Connect

    Clum, Alicia; Nolan, Matt; Lang, Elke; Glavina Del Rio, Tijana; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavrommatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Goker, Markus; Spring, Stefan; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C.; Chain, Patrick; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter; Lapidus, Alla

    2009-05-20

    Acidimicrobium ferrooxidans (Clark and Norris 1996) is the sole and type species of the genus, which until recently was the only genus within the actinobacterial family Acidimicrobiaceae and in the order Acidomicrobiales. Rapid oxidation of iron pyrite during autotrophic growth in the absence of an enhanced CO2 concentration is characteristic for A. ferrooxidans. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of the order Acidomicrobiales, and the 2,158,157 bp long single replicon genome with its 2038 protein coding and 54 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  11. Complete genome sequence of Sulfurospirillum deleyianum type strain (5175T)

    SciTech Connect

    Sikorski, Johannes; Lapidus, Alla L.; Copeland, A; Glavina Del Rio, Tijana; Nolan, Matt; Lucas, Susan; Chen, Feng; Tice, Hope; Cheng, Jan-Fang; Saunders, Elizabeth H; Bruce, David; Goodwin, Lynne A.; Pitluck, Sam; Ovchinnikova, Galina; Pati, Amrita; Ivanova, N; Mavromatis, K; Chen, Amy; Palaniappan, Krishna; Chain, Patrick S. G.; Land, Miriam L; Hauser, Loren John; Chang, Yun-Juan; Jeffries, Cynthia; Detter, J. Chris; Han, Cliff; Rohde, Manfred; Lang, Elke; Spring, Stefan; Goker, Markus; Bristow, James; Eisen, Jonathan; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter

    2010-01-01

    Sulfurospirillum deleyianum Schumacher et al. 1993 is the type species of the genus Sulfurospirillum. S. deleyianum is a model organism for studying sulfur reduction and dissimilatory nitrate reduction as energy source for growth. Also, it is a prominent model organism for studying the structural and functional characteristics of the cytochrome c nitrite reductase. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the genus Sulfurospirillum. The 2,306,351 bp long genome with its 2291 protein-coding and 52 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  12. Complete genome sequence of Thermomonospora curvata type strain (B9)

    SciTech Connect

    Chertkov, Olga; Sikorski, Johannes; Nolan, Matt; Lapidus, Alla L.; Lucas, Susan; Glavina Del Rio, Tijana; Tice, Hope; Cheng, Jan-Fang; Goodwin, Lynne A.; Pitluck, Sam; Liolios, Konstantinos; Ivanova, N; Mavromatis, K; Mikhailova, Natalia; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Ngatchou, Olivier Duplex; Land, Miriam L; Hauser, Loren John; Chang, Yun-Juan; Jeffries, Cynthia; Brettin, Thomas S; Han, Cliff; Detter, J. Chris; Rohde, Manfred; Goker, Markus; Woyke, Tanja; Bristow, James; Eisen, Jonathan; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C

    2011-01-01

    Thermomonospora curvata Henssen 1957 is the type species of the genus Thermomonospora. This genus is of interest because members of this clade are sources of new antibiotics, enzymes, and products with pharmacological activity. In addition, members of this genus participate in the active degradation of cellulose. This is the first complete genome sequence of a member of the family Thermomonosporaceae. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 5,639,016 bp long genome with its 4,985 protein-coding and 76 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  13. Accurate Whole Genome Sequencing as the Ultimate Genetic Test

    E-print Network

    Church, George M.

    Accurate Whole Genome Sequencing as the Ultimate Genetic Test Radoje Drmanac,1,2* Brock A. Peters,1- assembling DNA nanoarrays. Science 2010;327:78-81.4 Even 30 years ago, it was obvious that Sanger sequenc that started in Serbia in 1987 with a proposal for sequencing by hybridization (SBH) on dot-blot DNA arrays

  14. The Cancer Genome Atlas Pan-Cancer analysis project.

    PubMed

    Weinstein, John N; Collisson, Eric A; Mills, Gordon B; Shaw, Kenna R Mills; Ozenberger, Brad A; Ellrott, Kyle; Shmulevich, Ilya; Sander, Chris; Stuart, Joshua M

    2013-10-01

    The Cancer Genome Atlas (TCGA) Research Network has profiled and analyzed large numbers of human tumors to discover molecular aberrations at the DNA, RNA, protein and epigenetic levels. The resulting rich data provide a major opportunity to develop an integrated picture of commonalities, differences and emergent themes across tumor lineages. The Pan-Cancer initiative compares the first 12 tumor types profiled by TCGA. Analysis of the molecular aberrations and their functional roles across tumor types will teach us how to extend therapies effective in one cancer type to others with a similar genomic profile. PMID:24071849

  15. Childhood Cancer Genomics Gaps and Opportunities - Workshop Summary

    Cancer.gov

    NCI convened a workshop of representative research teams that have been leaders in defining the genomic landscape of childhood cancers to discuss the influence of genomic discoveries on the future of childhood cancer research.

  16. The Cancer Genome Atlas (TCGA): The next stage

    Cancer.gov

    The Cancer Genome Atlas (TCGA), the NIH research program that has helped set the standards for characterizing the genomic underpinnings of dozens of cancers on a large scale, is moving to its next phase.

  17. Complete genome sequence of Staphylothermus hellenicus P8T

    SciTech Connect

    Anderson, Iain; Wirth, Reinhard; Lucas, Susan; Copeland, A; Lapidus, Alla L.; Cheng, Jan-Fang; Goodwin, Lynne A.; Pitluck, Sam; Davenport, Karen W.; Detter, J. Chris; Han, Cliff; Tapia, Roxanne; Land, Miriam L; Hauser, Loren John; Pati, Amrita; Mikhailova, Natalia; Woyke, Tanja; Klenk, Hans-Peter; Kyrpides, Nikos C; Ivanova, N

    2011-01-01

    Staphylothermus hellenicus belongs to the order Desulfurococcales within the archaeal phylum Crenarchaeota. Strain P8T is the type strain of the species and was isolated from a shallow hydrothermal vent system at Palaeochori Bay, Milos, Greece. It is a hyperthermophilic, anaerobic heterotroph. Here we describe the features of this organism together with the complete genome sequence and annotation. The 1,580,347 bp genome with its 1,668 protein-coding and 48 RNA genes was sequenced as part of a DOE Joint Genome Institute (JGI) Laboratory Sequencing Program (LSP) project.

  18. Complete genome sequence of Staphylothermus hellenicus P8T

    PubMed Central

    Anderson, Iain; Wirth, Reinhard; Lucas, Susan; Copeland, Alex; Lapidus, Alla; Cheng, Jan-Fang; Goodwin, Lynne; Pitluck, Samuel; Davenport, Karen; Detter, John C.; Han, Cliff; Tapia, Roxanne; Land, Miriam; Hauser, Loren; Pati, Amrita; Mikhailova, Natalia; Woyke, Tanja; Klenk, Hans-Peter; Kyrpides, Nikos; Ivanova, Natalia

    2011-01-01

    Staphylothermus hellenicus belongs to the order Desulfurococcales within the archaeal phylum Crenarchaeota. Strain P8T is the type strain of the species and was isolated from a shallow hydrothermal vent system at Palaeochori Bay, Milos, Greece. It is a hyperthermophilic, anaerobic heterotroph. Here we describe the features of this organism together with the complete genome sequence and annotation. The 1,580,347 bp genome with its 1,668 protein-coding and 48 RNA genes was sequenced as part of a DOE Joint Genome Institute (JGI) Laboratory Sequencing Program (LSP) project. PMID:22180806

  19. Genomic Treasure Troves: Complete Genome Sequencing of Herbarium and Insect Museum Specimens

    PubMed Central

    Staats, Martijn; Erkens, Roy H. J.; van de Vossenberg, Bart; Wieringa, Jan J.; Kraaijeveld, Ken; Stielow, Benjamin; Geml, József; Richardson, James E.; Bakker, Freek T.

    2013-01-01

    Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22–82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4–97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2–71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. 
    We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more horizontal. Furthermore, NGS of historical DNA enables recovering crucial genetic information from old type specimens that to date have remained mostly unutilized and, thus, opens up a new frontier for taxonomic research as well. PMID:23922691

  20. Computational methods and resources for the interpretation of genomic variants in cancer

    PubMed Central

    2015-01-01

    The recent improvement of the high-throughput sequencing technologies is having a strong impact on the detection of genetic variations associated with cancer. Several institutions worldwide have been sequencing the whole exomes and or genomes of cancer patients in the thousands, thereby providing an invaluable collection of new somatic mutations in different cancer types. These initiatives promoted the development of methods and tools for the analysis of cancer genomes that are aimed at studying the relationship between genotype and phenotype in cancer. In this article we review the online resources and computational tools for the analysis of cancer genome. First, we describe the available repositories of cancer genome data. Next, we provide an overview of the methods for the detection of genetic variation and computational tools for the prioritization of cancer related genes and causative somatic variations. Finally, we discuss the future perspectives in cancer genomics focusing on the impact of computational methods and quantitative approaches for defining personalized strategies to improve the diagnosis and treatment of cancer. PMID:26111056

  1. Cancer vulnerabilities unveiled by genomic loss | Office of Cancer Genomics

    Cancer.gov

    Integrated analysis of RNAi and copy number data across a panel of cancer cell lines revealed the CYCLOPS (copy number alterations yielding cancer liabilities owing to partial loss) genes, which include components of the spliceosome, ribosome and proteasome, as potential candidates for targeted cancer therapies. Partial loss of these genes may make tumor cells more sensitive than normal cells to gene suppression with targeted agents.

  2. Sequencing, assembling, and correcting draft genomes using recombinant populations.

    PubMed

    Hahn, Matthew W; Zhang, Simo V; Moyle, Leonie C

    2014-04-01

    Current de novo whole-genome sequencing approaches often are inadequate for organisms lacking substantial preexisting genetic data. Problems with these methods are manifest as: large numbers of scaffolds that are not ordered within chromosomes or assigned to individual chromosomes, misassembly of allelic sequences as separate loci when the individual(s) being sequenced are heterozygous, and the collapse of recently duplicated sequences into a single locus, regardless of levels of heterozygosity. Here we propose a new approach for producing de novo whole-genome sequences, which we call recombinant population genome construction, that solves many of the problems encountered in standard genome assembly and that can be applied in model and nonmodel organisms. Our approach takes advantage of next-generation sequencing technologies to simultaneously barcode and sequence a large number of individuals from a recombinant population. The sequences of all recombinants can be combined to create an initial de novo assembly, followed by the use of individual recombinant genotypes to correct assembly splitting/collapsing and to order and orient scaffolds within linkage groups. Recombinant population genome construction can rapidly accelerate the transformation of nonmodel species into genome-enabled systems by simultaneously producing a high-quality genome assembly and providing genomic tools (e.g., high-confidence single-nucleotide polymorphisms) for immediate applications. In populations segregating for important functional traits, this approach also enables simultaneous mapping of quantitative trait loci. We demonstrate our method using simulated Illumina data from a recombinant population of Caenorhabditis elegans and show that the method can produce a high-fidelity, high-quality genome assembly for both parents of the cross. PMID:24531727

  3. Genome Sequence of a Novel Iflavirus from mRNA Sequencing of the Butterfly Heliconius erato

    PubMed Central

    Macias-Muñoz, Aide; Briscoe, Adriana D.

    2014-01-01

    Here, we report the genome sequence of a novel iflavirus strain recovered from the neotropical butterfly Heliconius erato. The coding DNA sequence (CDS) of the iflavirus genome was 8,895 nucleotides in length, encoding a polyprotein that was 2,965 amino acids long. PMID:24831145

  4. The impact of the Cancer Genome Atlas on lung cancer.

    PubMed

    Chang, Jeremy T-H; Lee, Yee Ming; Huang, R Stephanie

    2015-12-01

    The Cancer Genome Atlas (TCGA) has profiled more than 10,000 samples derived from 33 types of cancer to date, with the goal of improving our understanding of the molecular basis of cancer and advancing our ability to diagnose, treat, and prevent cancer. This review focuses on lung cancer as it is the leading cause of cancer-related mortality worldwide in both men and women. Particularly, non-small cell lung cancers (including lung adenocarcinoma and lung squamous cell carcinoma) were evaluated. Our goal was to demonstrate the impact of TCGA on lung cancer research under 4 themes: diagnostic markers, disease progression markers, novel therapeutic targets, and novel tools. Examples are given related to DNA mutation, copy number variation, messenger RNA, and microRNA expression along with methylation profiling. PMID:26318634

  5. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

    SciTech Connect

    Hu, Tina T.; Pattyn, Pedro; Bakker, Erica G.; Cao, Jun; Cheng, Jan-Fang; Clark, Richard M.; Fahlgren, Noah; Fawcett, Jeffrey A.; Grimwood, Jane; Gundlach, Heidrun; Haberer, Georg; Hollister, Jesse D.; Ossowski, Stephan; Ottilar, Robert P.; Salamov, Asaf A.; Schneeberger, Korbinian; Spannagl, Manuel; Wang, Xi; Yang, Liang; Nasrallah, Mikhail E.; Bergelson, Joy; Carrington, James C.; Gaut, Brandon S.; Schmutz, Jeremy; Mayer, Klaus F. X.; Van de Peer, Yves; Grigoriev, Igor V.; Nordborg, Magnus; Weigel, Detlef; Guo, Ya-Long

    2011-04-29

    In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies, etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how A. thaliana lost half its genome in a few million years and lived to tell the tale. The rather surprising conclusion is that there is not a single genomic feature that accounts for the reduced genome, but that every aspect (centromeres, intergenic regions, transposable elements, gene family number) is affected through hundreds of thousands of cuts. This strongly suggests that overall genome size in itself is what has been under selection, a suggestion that is strongly supported by our demonstration (using population genetics data from A. thaliana) that new deletions seem to be driven to fixation.

  6. U3 Region in the HIV1 Genome Adopts a GQuadruplex Structure in Its RNA and DNA Sequence

    E-print Network

    Sharma, Gaurav

    this structure in promoters of cancer-related genes. Here, we demonstrate that the G-rich proviral DNA sequence to recombination in U3. Recent cellular research revealed that G-quadruplexes formed in promoter regions of cancer. The genomic regions prone to adopt this structure are rich in G residues, and include telomeres and gene

  7. Complete genome sequence of Cellulomonas flavigena type strain (134T)

    SciTech Connect

    Abt, Birte; Foster, Brian; Lapidus, Alla L.; Clum, Alicia; Sun, Hui; Pukall, Rudiger; Lucas, Susan; Glavina Del Rio, Tijana; Nolan, Matt; Tice, Hope; Cheng, Jan-Fang; Pitluck, Sam; Liolios, Konstantinos; Ivanova, N; Mavromatis, K; Ovchinnikova, Galina; Pati, Amrita; Goodwin, Lynne A.; Chen, Amy; Palaniappan, Krishna; Land, Miriam L; Hauser, Loren John; Chang, Yun-Juan; Jeffries, Cynthia; Rohde, Manfred; Goker, Markus; Woyke, Tanja; Bristow, James; Eisen, Jonathan; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter

    2010-01-01

    Cellulomonas flavigena (Kellerman and McBeth 1912) Bergey et al. 1923 is the type species of the genus Cellulomonas of the actinobacterial family Cellulomonadaceae. Members of the genus Cellulomonas are of special interest for their ability to degrade cellulose and hemicellulose, particularly with regard to the use of biomass as an alternative energy source. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of a member of the genus Cellulomonas, and next to the human pathogen Tropheryma whipplei the second complete genome sequence within the actinobacterial family Cellulomonadaceae. The 4,123,179 bp long single replicon genome with its 3,735 protein-coding and 53 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  8. Genome sequencing and analysis of the model grass Brachypodium distachyon

    SciTech Connect

    Yang, Xiaohan; Kalluri, Udaya C; Tuskan, Gerald A

    2010-01-01

    Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum genomes shows a precise history of genome evolution across a broad diversity of the grasses, and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat. The high-quality genome sequence, coupled with ease of cultivation and transformation, small size and rapid life cycle, will help Brachypodium reach its potential as an important model system for developing new energy and food crops.

  9. Comprehensive genomic profiles of small cell lung cancer.

    PubMed

    George, Julie; Lim, Jing Shan; Jang, Se Jin; Cun, Yupeng; Ozretić, Luka; Kong, Gu; Leenders, Frauke; Lu, Xin; Fernández-Cuesta, Lynnette; Bosco, Graziella; Müller, Christian; Dahmen, Ilona; Jahchan, Nadine S; Park, Kwon-Sik; Yang, Dian; Karnezis, Anthony N; Vaka, Dedeepya; Torres, Angela; Wang, Maia Segura; Korbel, Jan O; Menon, Roopika; Chun, Sung-Min; Kim, Deokhoon; Wilkerson, Matt; Hayes, Neil; Engelmann, David; Pützer, Brigitte; Bos, Marc; Michels, Sebastian; Vlasic, Ignacija; Seidel, Danila; Pinther, Berit; Schaub, Philipp; Becker, Christian; Altmüller, Janine; Yokota, Jun; Kohno, Takashi; Iwakawa, Reika; Tsuta, Koji; Noguchi, Masayuki; Muley, Thomas; Hoffmann, Hans; Schnabel, Philipp A; Petersen, Iver; Chen, Yuan; Soltermann, Alex; Tischler, Verena; Choi, Chang-min; Kim, Yong-Hee; Massion, Pierre P; Zou, Yong; Jovanovic, Dragana; Kontic, Milica; Wright, Gavin M; Russell, Prudence A; Solomon, Benjamin; Koch, Ina; Lindner, Michael; Muscarella, Lucia A; la Torre, Annamaria; Field, John K; Jakopovic, Marko; Knezevic, Jelena; Castaños-Vélez, Esmeralda; Roz, Luca; Pastorino, Ugo; Brustugun, Odd-Terje; Lund-Iversen, Marius; Thunnissen, Erik; Köhler, Jens; Schuler, Martin; Botling, Johan; Sandelin, Martin; Sanchez-Cespedes, Montserrat; Salvesen, Helga B; Achter, Viktor; Lang, Ulrich; Bogus, Magdalena; Schneider, Peter M; Zander, Thomas; Ansén, Sascha; Hallek, Michael; Wolf, Jürgen; Vingron, Martin; Yatabe, Yasushi; Travis, William D; Nürnberg, Peter; Reinhardt, Christian; Perner, Sven; Heukamp, Lukas; Büttner, Reinhard; Haas, Stefan A; Brambilla, Elisabeth; Peifer, Martin; Sage, Julien; Thomas, Roman K

    2015-08-01

    We have sequenced the genomes of 110 small cell lung cancers (SCLC), one of the deadliest human cancers. In nearly all the tumours analysed we found bi-allelic inactivation of TP53 and RB1, sometimes by complex genomic rearrangements. Two tumours with wild-type RB1 had evidence of chromothripsis leading to overexpression of cyclin D1 (encoded by the CCND1 gene), revealing an alternative mechanism of Rb1 deregulation. Thus, loss of the tumour suppressors TP53 and RB1 is obligatory in SCLC. We discovered somatic genomic rearrangements of TP73 that create an oncogenic version of this gene, TP73Δex2/3. In rare cases, SCLC tumours exhibited kinase gene mutations, providing a possible therapeutic opportunity for individual patients. Finally, we observed inactivating mutations in NOTCH family genes in 25% of human SCLC. Accordingly, activation of Notch signalling in a pre-clinical SCLC mouse model strikingly reduced the number of tumours and extended the survival of the mutant mice. Furthermore, neuroendocrine gene expression was abrogated by Notch activity in SCLC cells. This first comprehensive study of somatic genome alterations in SCLC uncovers several key biological processes and identifies candidate therapeutic targets in this highly lethal form of cancer. PMID:26168399

  10. The Release 6 reference sequence of the Drosophila melanogaster genome.

    PubMed

    Hoskins, Roger A; Carlson, Joseph W; Wan, Kenneth H; Park, Soo; Mendez, Ivonne; Galle, Samuel E; Booth, Benjamin W; Pfeiffer, Barret D; George, Reed A; Svirskas, Robert; Krzywinski, Martin; Schein, Jacqueline; Accardo, Maria Carmela; Damia, Elisabetta; Messina, Giovanni; Méndez-Lago, María; de Pablos, Beatriz; Demakova, Olga V; Andreyeva, Evgeniya N; Boldyreva, Lidiya V; Marra, Marco; Carvalho, A Bernardo; Dimitri, Patrizio; Villasante, Alfredo; Zhimulev, Igor F; Rubin, Gerald M; Karpen, Gary H; Celniker, Susan E

    2015-03-01

    Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and completeness of this sequence continues to be important to further progress. We previously described improvement of the 117-Mb sequence in the euchromatic portion of the genome and 21 Mb in the heterochromatic portion, using a whole-genome shotgun assembly, BAC physical mapping, and clone-based finishing. Here, we report an improved reference sequence of the single-copy and middle-repetitive regions of the genome, produced using cytogenetic mapping to mitotic and polytene chromosomes, clone-based finishing and BAC fingerprint verification, ordering of scaffolds by alignment to cDNA sequences, incorporation of other map and sequence data, and validation by whole-genome optical restriction mapping. These data substantially improve the accuracy and completeness of the reference sequence and the order and orientation of sequence scaffolds into chromosome arm assemblies. Representation of the Y chromosome and other heterochromatic regions is particularly improved. The new 143.9-Mb reference sequence, designated Release 6, effectively exhausts clone-based technologies for mapping and sequencing. Highly repeat-rich regions, including large satellite blocks and functional elements such as the ribosomal RNA genes and the centromeres, are largely inaccessible to current sequencing and assembly methods and remain poorly represented. Further significant improvements will require sequencing technologies that do not depend on molecular cloning and that produce very long reads. PMID:25589440

  11. The Release 6 reference sequence of the Drosophila melanogaster genome

    PubMed Central

    Carlson, Joseph W.; Wan, Kenneth H.; Park, Soo; Mendez, Ivonne; Galle, Samuel E.; Booth, Benjamin W.; Pfeiffer, Barret D.; George, Reed A.; Svirskas, Robert; Krzywinski, Martin; Schein, Jacqueline; Accardo, Maria Carmela; Damia, Elisabetta; Messina, Giovanni; Méndez-Lago, María; de Pablos, Beatriz; Demakova, Olga V.; Andreyeva, Evgeniya N.; Boldyreva, Lidiya V.; Marra, Marco; Carvalho, A. Bernardo; Dimitri, Patrizio; Villasante, Alfredo; Zhimulev, Igor F.; Rubin, Gerald M.; Karpen, Gary H.

    2015-01-01

    Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and completeness of this sequence continues to be important to further progress. We previously described improvement of the 117-Mb sequence in the euchromatic portion of the genome and 21 Mb in the heterochromatic portion, using a whole-genome shotgun assembly, BAC physical mapping, and clone-based finishing. Here, we report an improved reference sequence of the single-copy and middle-repetitive regions of the genome, produced using cytogenetic mapping to mitotic and polytene chromosomes, clone-based finishing and BAC fingerprint verification, ordering of scaffolds by alignment to cDNA sequences, incorporation of other map and sequence data, and validation by whole-genome optical restriction mapping. These data substantially improve the accuracy and completeness of the reference sequence and the order and orientation of sequence scaffolds into chromosome arm assemblies. Representation of the Y chromosome and other heterochromatic regions is particularly improved. The new 143.9-Mb reference sequence, designated Release 6, effectively exhausts clone-based technologies for mapping and sequencing. Highly repeat-rich regions, including large satellite blocks and functional elements such as the ribosomal RNA genes and the centromeres, are largely inaccessible to current sequencing and assembly methods and remain poorly represented. Further significant improvements will require sequencing technologies that do not depend on molecular cloning and that produce very long reads. PMID:25589440

  12. A survey of tools for variant analysis of next-generation genome sequencing data

    PubMed Central

    Pabinger, Stephan; Dander, Andreas; Fischer, Maria; Snajder, Rene; Sperk, Michael; Efremova, Mirjana; Krabichler, Birgit; Speicher, Michael R.; Zschocke, Johannes

    2014-01-01

    Recent advances in genome sequencing technologies provide unprecedented opportunities to characterize individual genomic landscapes and identify mutations relevant for diagnosis and therapy. Specifically, whole-exome sequencing using next-generation sequencing (NGS) technologies is gaining popularity in the human genetics community due to the moderate costs, manageable data amounts and straightforward interpretation of analysis results. While whole-exome and, in the near future, whole-genome sequencing are becoming commodities, data analysis still poses significant challenges and led to the development of a plethora of tools supporting specific parts of the analysis workflow or providing a complete solution. Here, we surveyed 205 tools for whole-genome/whole-exome sequencing data analysis supporting five distinct analytical steps: quality assessment, alignment, variant identification, variant annotation and visualization. We report an overview of the functionality, features and specific requirements of the individual tools. We then selected 32 programs for variant identification, variant annotation and visualization, which were subjected to hands-on evaluation using four data sets: one set of exome data from two patients with a rare disease for testing identification of germline mutations, two cancer data sets for testing variant callers for somatic mutations, copy number variations and structural variations, and one semi-synthetic data set for testing identification of copy number variations. Our comprehensive survey and evaluation of NGS tools provides a valuable guideline for human geneticists working on Mendelian disorders, complex diseases and cancers. PMID:23341494

  13. Metastatic tumor evolution and organoid modeling implicate TGFBR2 as a cancer driver in diffuse gastric cancer | Office of Cancer Genomics

    Cancer.gov

    Gastric cancer is the second-leading cause of global cancer deaths, with metastatic disease representing the primary cause of mortality. To identify candidate drivers involved in oncogenesis and tumor evolution, we conduct an extensive genome sequencing analysis of metastatic progression in a diffuse gastric cancer. This involves a comparison between a primary tumor from a hereditary diffuse gastric cancer syndrome proband and its recurrence as an ovarian metastasis.

  14. Complete genome sequences of six strains of the genus Methylobacterium

    SciTech Connect

    Marx, Christopher J; Bringel, Francoise O.; Chistoserdova, Ludmila; Moulin, Lionel; Farhan Ul Haque, Muhammad; Fleischman, Darrell E.; Gruffaz, Christelle; Jourand, Philippe; Knief, Claudia; Lee, Ming-Chun; Muller, Emilie E. L.; Nadalig, Thierry; Peyraud, Remi; Roselli, Sandro; Russ, Lina; Aguero, Fernan; Goodwin, Lynne A.; Ivanova, N; Kyrpides, Nikos C; Lajus, Aurelie; Medigue, Claudine; Nolan, Matt; Woyke, Tanja; Stolyar, Sergey; Vorholt, Julia A.; Vuilleumier, Stephane

    2012-01-01

    The complete and assembled genome sequences were determined for six strains of the alphaproteobacterial genus Methylobacterium, chosen for their key adaptations to different plant-associated niches and environmental constraints.

  15. Complete Genome Sequences of Six Strains of the Genus Methylobacterium

    SciTech Connect

    Marx, Christopher J; Bringel, Francoise O.; Chistoserdova, Ludmila; Moulin, Lionel; Ul Haque, Muhammad Farhan; Fleischman, Darrell E.; Gruffaz, Christelle; Jourand, Philippe; Knief, Claudia; Lee, Ming-Chun; Muller, Emilie E. L.; Nadalig, Thierry; Peyraud, Remi; Roselli, Sandro; Russ, Lina; Goodwin, Lynne A.; Ivanov, Pavel S.; Ivanova, N; Kyrpides, Nikos C; Lajus, Aurelie; Medigue, Claudine; Nolan, Matt; Woyke, Tanja; Stolyar, Sergey; Vorholt, Julia A.; Vuilleumier, Stephane

    2012-01-01

    The complete and assembled genome sequences were determined for six strains of the alphaproteobacterial genus Methylobacterium, chosen for their key adaptations to different plant-associated niches and environmental constraints.

  16. The genome sequence of the filamentous fungus Neurospora crassa 

    E-print Network

    Read, Nick D; et al

    2003-04-24

    Neurospora crassa is a central organism in the history of twentieth-century genetics, biochemistry and molecular biology. Here, we report a high-quality draft sequence of the N. crassa genome. The approximately 40-megabase ...

  17. Draft Genome Sequence of Pseudomonas syringae pv. persicae NCPPB 2254.

    PubMed

    Zhao, Wenjun; Jiang, Hongshan; Tian, Qian; Hu, Jie

    2015-01-01

    Pseudomonas syringae pv. persicae is a pathogen that causes bacterial decline of stone fruit. Here, we report the draft genome sequence for P. syringae pv. persicae, which was isolated from Prunus persica. PMID:26044420

  18. Draft Genome Sequences of Six Different Staphylococcus epidermidis Clones

    E-print Network

    Millar, Andrew J.

    Draft Genome Sequences of Six Different Staphylococcus epidermidis Clones, Isolated Individually from Preterm Neonates Presenting with Sepsis

  19. Melanoma genome sequencing reveals frequent PREX2 mutations

    E-print Network

    Lander, Eric S.

    Melanoma is notable for its metastatic propensity, lethality in the advanced setting and association with ultraviolet exposure early in life. To obtain a comprehensive genomic view of melanoma in humans, we sequenced the ...

  20. Complete Genome Sequence of Rahnella aquatilis CIP 78.65

    SciTech Connect

    Martinez, Robert J; Bruce, David; Detter, J C; Goodwin, Lynne A.; Han, James; Han, Cliff; Held, Brittany; Land, Miriam L; Mikhailova, Natalia; Nolan, Matt; Pennacchio, Len; Pitluck, Sam; Tapia, Roxanne; Woyke, Tanja; Sobecky, Patricia A.

    2012-01-01

    Rahnella aquatilis CIP 78.65 is a gammaproteobacterium isolated from a drinking water source in Lille, France. Here we report the complete genome sequence of Rahnella aquatilis CIP 78.65, the type strain of R. aquatilis.

  1. Draft Genome Sequences of Three Mycobacterium chimaera Respiratory Isolates

    PubMed Central

    Roycroft, Emma; Raftery, Philomena; Mok, Simone; Fitzgibbon, Margaret; Rogers, Thomas R.

    2015-01-01

    Mycobacterium chimaera is an opportunistic human pathogen implicated in both pulmonary and cardiovascular infections. Here, we report the draft genome sequences of three strains isolated from human respiratory specimens. PMID:26634757

  2. Sequence analysis of the complete mitochondrial genome of Youxian sheldrake.

    PubMed

    He, Shao-Ping; Liu, Li-Li; Yu, Qi-Fang; Li, Si; He, Jian-Hua

    2016-03-01

    The Youxian sheldrake is an excellent native breed from Hunan province, China. The complete mitochondrial (mt) genome sequence plays an important role in the accurate determination of phylogenetic relationships among metazoans. This is the first study to determine the complete mitochondrial genome sequence of the Youxian sheldrake, using PCR-based amplification and Sanger sequencing. The characteristics of the entire mitochondrial genome were analyzed in detail. The total length of the mitogenome is 16,605 bp, with a base composition of 29.21% A, 22.18% T, 32.84% C and 15.77% G. It contains 2 ribosomal RNA genes, 13 protein-coding genes, 22 transfer RNA genes and a major non-coding control region (D-loop region). The complete mitochondrial genome sequence of the Youxian sheldrake provides important data for further study of poultry phylogenetics, as well as for genetics and breeding. PMID:25090395
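
    The base composition quoted in this record is the percentage of each nucleotide in the sequence. A minimal sketch of that arithmetic follows, assuming the sequence is available as a plain DNA string; the function name `base_composition` and the toy sequence are illustrative and do not come from the study.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, T, C and G in a DNA string."""
    seq = seq.upper()
    # Count only unambiguous bases; anything else (e.g. N) is ignored.
    counts = Counter(base for base in seq if base in "ATCG")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/T/C/G bases")
    return {base: 100.0 * counts[base] / total for base in "ATCG"}


# Toy sequence for illustration only (not the Youxian sheldrake mitogenome).
toy = "ATGCCCTAAC"
comp = base_composition(toy)
gc_content = comp["G"] + comp["C"]
```

    By the same arithmetic, the figures reported in the abstract give a G+C content of 32.84% + 15.77% = 48.61%.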

  3. Commentary on patents: Full bacterial DNA sequences boost genomics

    SciTech Connect

    Fox, J.L.

    1995-07-01

    Together with recent U.S. federal court decisions on DNA patenting, the sequencing achievement indicates that efforts on the broader genomics front may be moving more rapidly than had been previously thought.

  4. Operational streamlining in a high-throughput genome sequencing center

    E-print Network

    Person, Kerry P. (Kerry Patrick)

    2006-01-01

    Advances in medicine rely on accurate data that is rapidly provided. It is therefore critical for the Genome Sequencing platform of the Broad Institute of MIT and Harvard to continually strive to reduce cost, improve ...

  5. Initial genome sequencing and analysis of multiple myeloma

    E-print Network

    Lander, Eric S.

    Multiple myeloma is an incurable malignancy of plasma cells, and its pathogenesis is poorly understood. Here we report the massively parallel sequencing of 38 tumour genomes and their comparison to matched normal DNAs. ...

  6. Draft Genome Sequence of Coprobacter fastidiosus NSB1T

    PubMed Central

    Chaplin, A. V.; Efimov, B. A.; Khokhlova, E. V.; Kafarskaia, L. I.; Tupikin, A. E.; Kabilov, M. R.

    2014-01-01

    Coprobacter fastidiosus is a Gram-negative obligate anaerobic bacterium belonging to the phylum Bacteroidetes. In this work, we report the draft genome sequence of C. fastidiosus strain NSB1T isolated from human infant feces. PMID:24604645

  7. Fulfilling the Promise of a Sequenced Human Genome – Part II

    SciTech Connect

    Green, Eric

    2009-05-27

    Eric Green, scientific director of the National Human Genome Research Institute (NHGRI), gives the opening keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM on May 27, 2009. Part 2 of 2

  8. Fulfilling the Promise of a Sequenced Human Genome – Part I

    SciTech Connect

    Green, Eric

    2009-05-27

    Eric Green, scientific director of the National Human Genome Research Institute (NHGRI), gives the opening keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM on May 27, 2009. Part 1 of 2

  9. Publications | Office of Cancer Genomics

    Cancer.gov

    Philadelphia chromosome-like acute lymphoblastic leukemia was found to be characterized by a range of genomic alterations that activate a limited number of signaling pathways, all of which may be amenable to inhibition with approved tyrosine kinase inhibitors.

  10. Complete genome sequence of Treponema pallidum strain DAL-1.

    PubMed

    Zobaníková, Marie; Mikolka, Pavol; Cejková, Darina; Pospíšilová, Petra; Chen, Lei; Strouhal, Michal; Qin, Xiang; Weinstock, George M; Smajs, David

    2012-10-10

    Treponema pallidum strain DAL-1 is a human uncultivable pathogen causing the sexually transmitted disease syphilis. Strain DAL-1 was isolated from the amniotic fluid of a pregnant woman in the secondary stage of syphilis. Here we describe the 1,139,971 bp long genome of T. pallidum strain DAL-1 which was sequenced using two independent sequencing methods (454 pyrosequencing and Illumina). In rabbits, strain DAL-1 replicated better than the T. pallidum strain Nichols. The comparison of the complete DAL-1 genome sequence with the Nichols sequence revealed a list of genetic differences that are potentially responsible for the increased rabbit virulence of the DAL-1 strain. PMID:23449808

  11. Genome sequence of vanilla distortion mosaic virus infecting Coriandrum sativum.

    PubMed

    Adams, I P; Rai, S; Deka, M; Harju, V; Hodges, T; Hayward, G; Skelton, A; Fox, A; Boonham, N

    2014-12-01

    The 9573-nucleotide genome of a potyvirus was sequenced from a Coriandrum sativum plant from India with viral symptoms. On analysis, this virus was shown to have greater than 85 % nucleotide sequence identity to vanilla distortion mosaic virus (VDMV). Analysis of the putative coat protein sequence confirmed that this virus was in fact VDMV, with greater than 91 % amino acid sequence identity. The genome appears to encode a 3083-amino-acid polyprotein potentially cleaved into the 10 mature proteins expected in potyviruses. Phylogenetic analysis confirmed that VDMV is a distinct but ungrouped member of the genus Potyvirus. PMID:25252813

  12. Dissection of the Octoploid Strawberry Genome by Deep Sequencing of the Genomes of Fragaria Species

    PubMed Central

    Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mitsuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N.

    2014-01-01

    Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis, by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ~200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb with an N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence, to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species. PMID:24282021
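
    The N50 length quoted in this record is a standard assembly-contiguity statistic: the length L such that sequences of length L or longer account for at least half of the total assembled bases. A minimal sketch of the usual calculation follows; the function name `n50` and the toy lengths are illustrative and are not taken from the FANhybrid_r1.2 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Walk through the lengths from longest to shortest and return the
    length at which the running total first reaches half of the sum.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contig lengths given")


# Toy lengths for illustration only: total 20, so half is 10.
# Longest first: 6 (running 6), then 5 (running 11 >= 10), so N50 is 5.
toy_lengths = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_lengths)
```

    A low N50 relative to total assembly size, as reported here, indicates that the assembly is spread across many short sequences.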

  13. Intra-species sequence comparisons for annotating genomes

    SciTech Connect

    Boffelli, Dario; Weer, Claire V.; Weng, Li; Lewis, Keith D.; Shoukry, Malak I.; Pachter, Lior; Keys, David N.; Rubin, Edward M.

    2004-07-15

    Analysis of sequence variation among members of a single species offers a potential approach to identify functional DNA elements responsible for biological features unique to that species. Due to its high rate of allelic polymorphism and ease of genetic manipulability, we chose the sea squirt, Ciona intestinalis, to explore intra-species sequence comparisons for genome annotation. A large number of C. intestinalis specimens were collected from four continents and a set of genomic intervals amplified, resequenced and analyzed to determine the mutation rates at each nucleotide in the sequence. We found that regions with low mutation rates efficiently demarcated functionally constrained sequences: these include a set of noncoding elements, which we showed in C. intestinalis transgenic assays to act as tissue-specific enhancers, as well as the location of coding sequences. This illustrates that comparisons of multiple members of a species can be used for genome annotation, suggesting a path for the annotation of the sequenced genomes of organisms occupying uncharacterized phylogenetic branches of the animal kingdom, and raises the possibility that the resequencing of a large number of Homo sapiens individuals might be used to annotate the human genome and identify sequences defining traits unique to our species. The sequence data from this study have been submitted to GenBank under accession nos. AY667278-AY667407.

  14. Complete Genome Sequences of Helicobacter pylori Rifampin-Resistant Strains.

    PubMed

    Momynaliev, Kuvat; Chelysheva, Vera; Selezneva, Oksana; Akopian, Tatyana; Alexeev, Dmitry; Govorun, Vadim

    2013-01-01

    Here we present the complete genome sequences of two Helicobacter pylori rifampin-resistant (Rif(r)) strains (Rif1 and Rif2). Rif(r) strains were obtained by in vitro selection of H. pylori 26695 on agar plates with 20 µg/ml rifampin. The genome data provide insights on the genomic diversity of H. pylori under selection by rifampin. PMID:23833139

  15. Mining the coding and non-coding genome for cancer drivers.

    PubMed

    Li, Jia; Drubay, Damien; Michiels, Stefan; Gautheret, Daniel

    2015-12-28

    Progress in next-generation sequencing provides unprecedented opportunities to fully characterize the spectrum of somatic mutations of cancer genomes. Given the large number of somatic mutations identified by such technologies, the prioritization of cancer-driving events is a consistent bottleneck. Most bioinformatics tools concentrate on driver mutations in the coding fraction of the genome, those causing changes in protein products. As more non-coding pathogenic variants are identified and characterized, the development of computational approaches to effectively prioritize cancer-driving variants within the non-coding fraction of human genome is becoming critical. After a short summary of methods for coding variant prioritization, we here review the highly diverse non-coding elements that may act as cancer drivers and describe recent methods that attempt to evaluate the deleteriousness of sequence variation in these elements. With such tools, the prioritization and identification of cancer-implicated regulatory elements and non-coding RNAs is becoming a reality. PMID:26433158

  16. Mulan: multiple-sequence alignment to predict functional elements in genomic sequences.

    PubMed

    Loots, Gabriela G; Ovcharenko, Ivan

    2007-01-01

    Multiple sequence alignment analysis is a powerful approach for translating the evolutionary selective power into phylogenetic relationships to localize functional coding and noncoding genomic elements. The tool Mulan (http://mulan.dcode.org/) has been designed to effectively perform multiple comparisons of genomic sequences necessary to facilitate bioinformatic-driven biological discoveries. The Mulan network server is capable of comparing both closely and distantly related genomes to identify conserved elements over a broad range of evolutionary time. Several novel algorithms are brought together in this tool: the tba multisequence aligner program used to rapidly identify local sequence conservation and the multiTF program to detect evolutionarily conserved transcription factor binding sites in alignments. Mulan is integrated with the ERC Browser, the UCSC Genome Browser for quick uploads of available sequences and supports two-way communication with the GALA database to overlay GALA functional genome annotation with sequence conservation profiles. Local multiple alignments computed by Mulan ensure reliable representation of short- and large-scale genomic rearrangements in distant organisms. Recently, we have also introduced the ability to handle duplications to permit the reliable reconstruction of evolutionary events that underlie the genome sequence data. Here, we describe the main features of the Mulan tool that include the interactive modification of critical conservation parameters, visualization options, and dynamic access to sequence data from visual graphs for flexible and easy-to-perform analysis of differentially evolving genomic regions. PMID:17993678

  17. Complete Genome Sequence of Mycoplasma synoviae Strain WVU 1853T

    PubMed Central

    Kutish, Gerald F.; Barbet, Anthony F.; Michaels, Dina L.

    2015-01-01

    A hybrid sequence assembly of the complete Mycoplasma synoviae type strain WVU 1853T genome was compared to that of strain MS53. The findings support prior conclusions about M. synoviae, based on the genome of that otherwise uncharacterized field strain, and provide the first evidence of epigenetic modifications in M. synoviae. PMID:26021934

  18. Mitochondrial Genome Sequence of the Glass Sponge Oopsacas minuta.

    PubMed

    Jourda, Cyril; Santini, Sébastien; Rocher, Caroline; Le Bivic, André; Claverie, Jean-Michel

    2015-01-01

    We report the complete mitochondrial genome sequence of the Mediterranean glass sponge Oopsacas minuta. This 19-kb mitochondrial genome has 24 noncoding genes (22 tRNAs and 2 rRNAs) and 14 protein-encoding genes coding for 11 subunits of respiratory chain complexes and 3 ATP synthase subunits. PMID:26227597

  19. A snapshot of the emerging tomato genome sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genome of tomato (Solanum lycopersicum) is being sequenced by an international consortium of 10 countries (Korea, China, the United Kingdom, India, the Netherlands, France, Japan, Spain, Italy and the United States) as part of a larger initiative called the ‘International Solanaceae Genome Proje...

  20. Complete Genome Sequence of Campylobacter gracilis ATCC 33236T

    PubMed Central

    Yee, Emma

    2015-01-01

    The human oral pathogen Campylobacter gracilis has been isolated from periodontal and endodontal infections, and also from nonoral head, neck, or lung infections. This study describes the whole-genome sequence of the human periodontal isolate ATCC 33236T (=FDC 1084), which is the first closed genome for C. gracilis. PMID:26383656

  1. Mitochondrial Genome Sequence of the Glass Sponge Oopsacas minuta

    PubMed Central

    Jourda, Cyril; Santini, Sébastien; Rocher, Caroline; Le Bivic, André

    2015-01-01

    We report the complete mitochondrial genome sequence of the Mediterranean glass sponge Oopsacas minuta. This 19-kb mitochondrial genome has 24 noncoding genes (22 tRNAs and 2 rRNAs) and 14 protein-encoding genes coding for 11 subunits of respiratory chain complexes and 3 ATP synthase subunits. PMID:26227597

  2. Draft Genome Sequence of Mycobacterium austroafricanum DSM 44191.

    PubMed

    Croce, Olivier; Robert, Catherine; Raoult, Didier; Drancourt, Michel

    2014-01-01

    We announce the draft genome sequence of Mycobacterium austroafricanum DSM 44191(T) (= E9789-SA12441(T)), a non-tuberculosis species responsible for opportunistic infection. The genome described here has a size of 6,772,357 bp with a G+C content of 66.79% and contains 6,419 protein-coding genes and 112 RNA genes. PMID:24744336

  3. Sequence and comparative analysis of the chicken genome provide unique

    E-print Network

    Hardison, Ross C.

    and contraction of multigene families seem to have been major factors in the independent evolution of mammals evolution International Chicken Genome Sequencing Consortium* *Lists of participants and affiliations appear and an estimated 20,000-23,000 genes--provides a new perspective on vertebrate genome evolution, while also

  4. Genomic regulatory regions: insights from comparative sequence analysis

    E-print Network

    Sidow, Arend

    Genomic regulatory regions: insights from comparative sequence analysis Gregory M Cooper and Arend of genomic regulatory regions with functional roles. It is effective because functionally important regions for the comprehensive discovery of human regulatory elements. Addresses Department of Genetics, Stanford University

  5. RESEARCH Open Access Genomic and small RNA sequencing of

    E-print Network

    Green, Pamela

    of sorghum as a reference genome sequence for Andropogoneae grasses Kankshita Swaminathan, Magdy origins of Mxg, and suggest that while the repeat content of Mxg differs from sorghum, the sorghum genome. Included within the Andropogoneae are major crops such as maize, Sorghum bicolor (sorghum), sugarcane

  6. Draft Genome Sequence of "Candidatus Liberibacter asiaticus" from California.

    PubMed

    Zheng, Z; Deng, X; Chen, J

    2014-01-01

    We report here the draft genome sequence of "Candidatus Liberibacter asiaticus" strain HHCA, collected from a lemon tree in California. The HHCA strain has a genome size of 1,150,620 bp, 36.5% G+C content, 1,119 predicted open reading frames, and 51 RNA genes. PMID:25278540

  7. Draft Genome Sequence of “Candidatus Liberibacter asiaticus” from California

    PubMed Central

    Zheng, Z.

    2014-01-01

    We report here the draft genome sequence of “Candidatus Liberibacter asiaticus” strain HHCA, collected from a lemon tree in California. The HHCA strain has a genome size of 1,150,620 bp, 36.5% G+C content, 1,119 predicted open reading frames, and 51 RNA genes. PMID:25278540

  8. Draft Genome Sequence of Linfuranone Producer Microbispora sp. GMKU 363.

    PubMed

    Komaki, Hisayuki; Ichikawa, Natsuko; Hosoyama, Akira; Fujita, Nobuyuki; Thamchaipenet, Arinthip; Igarashi, Yasuhiro

    2015-01-01

    Here, we report the draft genome sequence of Microbispora sp. GMKU 363, a plant-derived actinomycete that produces linfuranone A, a linear polyketide modified with a furanone ring possessing adipocyte differentiation inducing activity. The biosynthetic gene cluster for linfuranone was identified by analyzing polyketide synthase genes in the genome. PMID:26659694

  9. First Complete Genome Sequence of Felis catus Gammaherpesvirus 1

    PubMed Central

    Lee, Justin S.; Vuyisich, Momchilo; Chain, Patrick; Lo, Chien-Chi; Kronmiller, Brent; Bracha, Shay; Avery, Anne C.; VandeWoude, Sue

    2015-01-01

    We sequenced the complete genome of Felis catus gammaherpesvirus 1 (FcaGHV1) from lymph node DNA of an infected cat. The genome includes a 121,556-nucleotide unique region with 87 predicted open reading frames (61 gammaherpesvirus conserved and 26 unique) flanked by multiple copies of a 966-nucleotide terminal repeat. PMID:26543105

  10. Multiplexed DNA Sequence Capture of Mitochondrial Genomes Using PCR Products

    E-print Network

    Pääbo, Svante

    Multiplexed DNA Sequence Capture of Mitochondrial Genomes Using PCR Products Tomislav Maricic products are used to capture complete human mitochondrial genomes from complex DNA mixtures. We use. It has applications in population genetics and forensics, as well as studies of ancient DNA. Citation

  11. Draft Genome Sequence of Linfuranone Producer Microbispora sp. GMKU 363

    PubMed Central

    Ichikawa, Natsuko; Hosoyama, Akira; Fujita, Nobuyuki; Thamchaipenet, Arinthip; Igarashi, Yasuhiro

    2015-01-01

    Here, we report the draft genome sequence of Microbispora sp. GMKU 363, a plant-derived actinomycete that produces linfuranone A, a linear polyketide modified with a furanone ring possessing adipocyte differentiation inducing activity. The biosynthetic gene cluster for linfuranone was identified by analyzing polyketide synthase genes in the genome. PMID:26659694

  12. Draft Genome Sequence of Entomopathogenic Serratia liquefaciens Strain FK01

    PubMed Central

    Taira, Erika; Mon, Hiroaki; Mori, Kazuki; Akasaka, Taiki; Tashiro, Kousuke; Yasunaga-Aoki, Chisa; Lee, Jae Man; Kusakabe, Takahiro

    2014-01-01

    In the present study, we determined the draft genome sequence of the entomopathogenic bacterium Serratia liquefaciens FK01, which is highly virulent to the silkworm. The draft genome is ~5.28 Mb in size, and the G+C content is 55.8%. PMID:24970828

  13. Draft Genome Sequence of Corynebacterium pseudodiphtheriticum Strain 090104 "Sokolov".

    PubMed

    Karlyshev, Andrey V; Melnikov, Vyacheslav G

    2013-01-01

    This report describes the first draft genome sequence of a Corynebacterium pseudodiphtheriticum strain. The information on the genome organization and putative gene products will assist in better understanding of the molecular mechanisms involved in the beneficial probiotic effects of this bacterium. PMID:24201200

  14. Genomic sequence for the aflatoxigenic filamentous fungus Aspergillus nomius

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genome of the A. nomius type strain was sequenced using a personal genome machine. Annotation of the genes was undertaken, followed by gene ontology and an investigation into the number of secondary metabolite clusters. Comparative studies with other Aspergillus species involved shared/unique ge...

  15. Draft Genome Sequences of 10 Strains of the Genus Exiguobacterium

    PubMed Central

    Chauhan, Archana; Layton, Alice C.; Pfiffner, Susan M.; Huntemann, Marcel; Copeland, Alex; Chen, Amy; Kyrpides, Nikos C.; Markowitz, Victor M.; Palaniappan, Krishna; Ivanova, Natalia; Mikhailova, Natalia; Ovchinnikova, Galina; Andersen, Evan W.; Pati, Amrita; Stamatis, Dimitrios; Reddy, T. B. K.; Shapiro, Nicole; Nordberg, Henrik P.; Cantor, Michael N.; Hua, X. Susan; Woyke, Tanja

    2014-01-01

    High-quality draft genome sequences were determined for 10 Exiguobacterium strains in order to provide insight into their evolutionary strategies for speciation and environmental adaptation. The selected genomes include psychrotrophic and thermophilic species from a range of habitats, which will allow for a comparison of metabolic pathways and stress response genes. PMID:25323723

  16. Draft genome sequences of 10 strains of the genus Exiguobacterium.

    PubMed

    Vishnivetskaya, Tatiana A; Chauhan, Archana; Layton, Alice C; Pfiffner, Susan M; Huntemann, Marcel; Copeland, Alex; Chen, Amy; Kyrpides, Nikos C; Markowitz, Victor M; Palaniappan, Krishna; Ivanova, Natalia; Mikhailova, Natalia; Ovchinnikova, Galina; Andersen, Evan W; Pati, Amrita; Stamatis, Dimitrios; Reddy, T B K; Shapiro, Nicole; Nordberg, Henrik P; Cantor, Michael N; Hua, X Susan; Woyke, Tanja

    2014-01-01

    High-quality draft genome sequences were determined for 10 Exiguobacterium strains in order to provide insight into their evolutionary strategies for speciation and environmental adaptation. The selected genomes include psychrotrophic and thermophilic species from a range of habitats, which will allow for a comparison of metabolic pathways and stress response genes. PMID:25323723

  17. Genome Sequence of Type Strain Lysinibacillus macroides DSM 54T

    PubMed Central

    Liu, Guo-hong; Wang, Jie-ping; Che, Jian-Mei; Chen, Qian-Qian; Chen, Zheng; Ge, Ci-bin

    2015-01-01

    Lysinibacillus macroides DSM 54T is a Gram-positive, spore-forming bacterium. Here, we report the 4,866,035-bp genome sequence of Lysinibacillus macroides DSM 54T, which will accelerate the application of degrading xylan and provide useful information for genomic taxonomy and phylogenomics of Bacillus-like bacteria. PMID:26543111

  18. Complete genome sequence of pronghorn virus, a pestivirus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complete genome sequence of Pronghorn virus, a member of the Pestivirus genus of the Flaviviridae, was determined. The virus, originally isolated from a pronghorn antelope, had a genome of 12,287 nucleotides with a single open reading frame of 11,694 bases encoding 3898 amino acids....
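The lengths quoted for the pronghorn virus genome can be cross-checked with simple arithmetic. The snippet below is only an illustrative sketch: the constant and function names are assumptions introduced here, and the numbers are the ones given in the abstract.

```python
GENOME_NT = 12_287   # genome length reported in the abstract (nucleotides)
ORF_NT = 11_694      # open reading frame length reported in the abstract

def codon_count(orf_nt):
    """Number of complete codons in an ORF; rejects lengths not divisible by 3."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return orf_nt // 3

codons = codon_count(ORF_NT)        # 11,694 / 3
outside_orf = GENOME_NT - ORF_NT    # nucleotides not covered by the ORF
print(codons, outside_orf)
```

The codon count equals the 3898 amino acids stated above, and 593 nucleotides lie outside the ORF; how they are split between the two ends of the genome is not given in the abstract.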

  19. Draft genome sequence of Thermincola potens strain JR

    SciTech Connect

    Byrne-Bailey, K.G.; Wrighton, K.C.; Melnyk, R.A.; Agbo, P.; Hazen, T.C.; Coates, J.D.

    2010-07-01

    'Thermincola potens' strain JR is one of the first Gram-positive dissimilatory metal-reducing bacteria (DMRB) for which there is a complete genome sequence. Consistent with the physiology of this organism, preliminary annotation revealed an abundance of multiheme c-type cytochromes that are putatively associated with the periplasm and cell surface in a Gram-positive bacterium. Here we report the complete genome sequence of strain JR.

  20. Large-Scale Sequencing: The Future of Genomic Sciences Colloquium

    SciTech Connect

    Margaret Riley; Merry Buckley

    2009-01-01

    Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. 
    A coordinated sequencing effort of cultured organisms is an appropriate place to begin, since not only are their genomes available, but they are also accompanied by data on environment and physiology that can be used to understand the resulting data. As single cell isolation methods improve, there should be a shift toward incorporating uncultured organisms and communities into this effort. Efforts to sequence cultivated isolates should target characterized isolates from culture collections for which biochemical data are available, as well as other cultures of lasting value from personal collections. The genomes of type strains should be among the first targets for sequencing, but creative culture methods, novel cell isolation, and sorting methods would all be helpful in obtaining organisms we have not yet been able to cultivate for sequencing. The data that should be provided for strains targeted for sequencing will depend on the phylogenetic context of the organism and the amount of information available about its nearest relatives. Annotation is an important part of transforming genome sequences into useful resources, but it represents the most significant bottleneck to the field of comparative genomics right now and must be addressed. Furthermore, there is a need for more consistency in both annotation and archiving annotation data. As new annotation tools become available over time, re-annotation of genomes should be implemented, taking advantage of advancements in annotation techniques in order to capitalize on the genome sequences and increase both the societal and scientific benefit of genomics work. Given the proper resources, the knowledge and ability exist to be able to select model systems, some simple, some less so, and dissect them so that we may understand the processes and interactions at work in them.
    Colloquium participants suggest a five-pronged, coordinated initiative to exhaustively describe six different microbial ecosystems, designed to describe all the gene diversity, across genomes. In this effort, sequencing should be complemented by other experimental data, particularly transcriptomics and metabolomics data, all of which

  1. Resequencing of the common marmoset genome improves genome assemblies and gene-coding sequence analysis.

    PubMed

    Sato, Kengo; Kuroki, Yoko; Kumita, Wakako; Fujiyama, Asao; Toyoda, Atsushi; Kawai, Jun; Iriki, Atsushi; Sasaki, Erika; Okano, Hideyuki; Sakakibara, Yasubumi

    2015-01-01

    The first draft of the common marmoset (Callithrix jacchus) genome was published by the Marmoset Genome Sequencing and Analysis Consortium. The draft was based on whole-genome shotgun sequencing, and the current assembly version is Callithrix_jacches-3.2.1, but there still exist 187,214 undetermined gap regions and supercontigs and relatively short contigs that are unmapped to chromosomes in the draft genome. We performed resequencing and assembly of the genome of common marmoset by deep sequencing with high-throughput sequencing technology. Several different sequence runs using Illumina sequencing platforms were executed, and 181 Gbp of high-quality bases including mate-pairs with long insert lengths of 3, 8, 20, and 40 Kbp were obtained, that is, approximately 60× coverage. The resequencing significantly improved the MGSAC draft genome sequence. The N50 of the contigs, which is a statistical measure used to evaluate assembly quality, doubled. As a result, 51% of the contigs (total length: 299 Mbp) that were unmapped to chromosomes in the MGSAC draft were merged with chromosomal contigs, and the improved genome sequence helped to detect 5,288 new genes that are homologous to human cDNAs and the gaps in 5,187 transcripts of the Ensembl gene annotations were completely filled. PMID:26586576
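The abstract above uses contig N50 as its headline assembly metric. As a minimal sketch of how that statistic is commonly computed: sort the contig lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy contig lengths are illustrative assumptions, not data from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not contig_lengths:
        raise ValueError("need at least one contig")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length

# Hypothetical toy assembly (total 300 bp), before and after merging
# three short contigs (40 + 30 + 20) into one 90 bp contig.
before = [80, 70, 50, 40, 30, 20, 10]
after = [90, 80, 70, 50, 10]
print(n50(before), n50(after))
```

In this toy case merging short contigs raises the N50 even though the total assembled length is unchanged, which is why the metric is read as a measure of contiguity and not of completeness.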

  2. Resequencing of the common marmoset genome improves genome assemblies and gene-coding sequence analysis

    PubMed Central

    Sato, Kengo; Kuroki, Yoko; Kumita, Wakako; Fujiyama, Asao; Toyoda, Atsushi; Kawai, Jun; Iriki, Atsushi; Sasaki, Erika; Okano, Hideyuki; Sakakibara, Yasubumi

    2015-01-01

    The first draft of the common marmoset (Callithrix jacchus) genome was published by the Marmoset Genome Sequencing and Analysis Consortium. The draft was based on whole-genome shotgun sequencing, and the current assembly version is Callithrix_jacches-3.2.1, but there still exist 187,214 undetermined gap regions and supercontigs and relatively short contigs that are unmapped to chromosomes in the draft genome. We performed resequencing and assembly of the genome of common marmoset by deep sequencing with high-throughput sequencing technology. Several different sequence runs using Illumina sequencing platforms were executed, and 181 Gbp of high-quality bases including mate-pairs with long insert lengths of 3, 8, 20, and 40 Kbp were obtained, that is, approximately 60× coverage. The resequencing significantly improved the MGSAC draft genome sequence. The N50 of the contigs, which is a statistical measure used to evaluate assembly quality, doubled. As a result, 51% of the contigs (total length: 299 Mbp) that were unmapped to chromosomes in the MGSAC draft were merged with chromosomal contigs, and the improved genome sequence helped to detect 5,288 new genes that are homologous to human cDNAs and the gaps in 5,187 transcripts of the Ensembl gene annotations were completely filled. PMID:26586576
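The coverage figure in this abstract follows from the usual relation between sequencing yield and genome size: mean coverage is total sequenced bases divided by genome length. The sketch below rearranges that relation using the two numbers quoted above. The function names are assumptions made here, and the resulting genome size is only implied by the abstract's rounded values, not reported in it.

```python
def mean_coverage(total_bases, genome_size):
    """Mean depth of coverage: sequenced bases per genome position."""
    return total_bases / genome_size

def implied_genome_size(total_bases, coverage):
    """Genome size implied by a sequencing yield and a stated coverage."""
    return total_bases / coverage

TOTAL_BASES = 181e9   # 181 Gbp of high-quality bases (from the abstract)
COVERAGE = 60         # "approximately 60x" (from the abstract)

size = implied_genome_size(TOTAL_BASES, COVERAGE)
print(f"{size / 1e9:.2f} Gbp")
```

Because the abstract gives the coverage only approximately, the implied size of roughly 3 Gbp should be read as an order-of-magnitude consistency check.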

  3. A 454 sequencing approach to dipteran mitochondrial genome research.

    PubMed

    Ramakodi, Meganathan P; Singh, Baneshwar; Wells, Jeffrey D; Guerrero, Felix; Ray, David A

    2015-01-01

    The availability of complete mitochondrial genome (mtgenome) data for Diptera, one of the largest metazoan orders, in public databases is limited. The advent of high throughput sequencing technology provides the potential to generate mtgenomes for many species affordably and quickly. However, these technologies need to be validated for dipterans as the members of this clade play important economic and research roles. Illumina and 454 sequencing platforms are widely used in genomic research involving non-model organisms. The Illumina platform has already been utilized for generating mitochondrial genomes without using conventional long range PCR for insects whereas the power of 454 sequencing for generating mitochondrial genome drafts without PCR has not yet been validated for insects. Thus, this study examines the utility of 454 sequencing approach for dipteran mtgenomic research. We generated complete or nearly complete mitochondrial genomes for Cochliomyia hominivorax, Haematobia irritans, Phormia regina and Sarcophaga crassipalpis using a 454 sequencing approach. Comparisons between newly obtained and existing assemblies for C. hominivorax and H. irritans revealed no major discrepancies and verified the utility of 454 sequencing for dipteran mitochondrial genomes. We also report the complete mitochondrial sequences for two forensically important flies, P. regina and S. crassipalpis, which could be used to provide useful information to legal personnel. Comparative analyses revealed that dipterans follow similar codon usage and nucleotide biases that could be due to mutational and selection pressures. This study illustrates the utility of 454 sequencing to obtain complete mitochondrial genomes for dipterans without the aid of conventional molecular techniques such as PCR and cloning and validates this method of mtgenome sequencing in arthropods. PMID:25451744

  4. Sequencing and comparing whole mitochondrial genomes of animals.

    PubMed

    Boore, Jeffrey L; Macey, J Robert; Medina, Mónica

    2005-01-01

    Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, which can be especially powerful. We describe here the protocols commonly used for physically isolating mitochondrial DNA (mtDNA), for amplifying these by polymerase chain reaction (PCR) or rolling circle amplification (RCA), for cloning, sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences with determining and comparing complete mitochondrial DNA sequences. PMID:15865975

  5. Single Nucleotide Polymorphism Mapping Using Genome-Wide Unique Sequences

    PubMed Central

    Chen, Leslie Y.Y.; Lu, Szu-Hsien; Shih, Edward S.C.; Hwang, Ming-Jing

    2002-01-01

    As more and more genomic DNAs are sequenced to characterize human genetic variations, the demand for a very fast and accurate method to genomically position these DNA sequences is high. We have developed a new mapping method that does not require sequence alignment. In this method, we first identified DNA fragments of 15 bp in length that are unique in the human genome and then used them to position single nucleotide polymorphism (SNP) sequences. By use of four desktop personal computers with AMD K7 (1 GHz) processors, our new method mapped more than 1.6 million SNP sequences in 20 hr and achieved a very good agreement with mapping results from alignment-based methods. PMID:12097348
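The alignment-free idea described above, positioning a sequence by looking up fixed-length fragments that occur only once in the reference, can be sketched as follows. This is a simplified illustration and not the authors' implementation: the function names, the toy reference, and the use of k = 5 in place of 15 are assumptions, and the sketch ignores the reverse strand.

```python
from collections import Counter

def unique_kmer_index(reference, k):
    """Map every k-mer that occurs exactly once in `reference` to its 0-based start."""
    starts = range(len(reference) - k + 1)
    counts = Counter(reference[i:i + k] for i in starts)
    return {reference[i:i + k]: i for i in starts
            if counts[reference[i:i + k]] == 1}

def place_query(query, index, k):
    """Return (reference_offset, supporting_kmers) for the best placement, or None."""
    votes = Counter()
    for j in range(len(query) - k + 1):
        pos = index.get(query[j:j + k])
        if pos is not None:
            votes[pos - j] += 1   # each hit votes for where the query would start
    return votes.most_common(1)[0] if votes else None

# Hypothetical toy reference; "GATTA" occurs twice, so it is not indexed.
reference = "GATTACAGGCTCAAGTCCGATGCAGATTA"
index = unique_kmer_index(reference, 5)

# Query = reference[6:19] with one substitution (A -> G) at query position 6.
query = "AGGCTCGAGTCCG"
print(place_query(query, index, 5))
```

The k-mers that overlap the substituted base find no match, but the flanking unique k-mers all vote for the same offset, so the sequence is still positioned without any alignment step.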

  6. National Institutes of Health to Map Genomic Changes of Lung, Brain, and Ovarian Cancers | Office of Cancer Genomics

    Cancer.gov

    The National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), both part of the National Institutes of Health (NIH), today announced the first three cancers that will be studied in the pilot phase of The Cancer Genome Atlas (TCGA) project. The cancers to be studied in the TCGA Pilot Project are lung, brain (glioblastoma), and ovarian.

  7. Genome sequence of the date palm Phoenix dactylifera L.

    PubMed

    Al-Mssallem, Ibrahim S; Hu, Songnian; Zhang, Xiaowei; Lin, Qiang; Liu, Wanfei; Tan, Jun; Yu, Xiaoguang; Liu, Jiucheng; Pan, Linlin; Zhang, Tongwu; Yin, Yuxin; Xin, Chengqi; Wu, Hao; Zhang, Guangyu; Ba Abdullah, Mohammed M; Huang, Dawei; Fang, Yongjun; Alnakhli, Yasser O; Jia, Shangang; Yin, An; Alhuzimi, Eman M; Alsaihati, Burair A; Al-Owayyed, Saad A; Zhao, Duojun; Zhang, Sun; Al-Otaibi, Noha A; Sun, Gaoyuan; Majrashi, Majed A; Li, Fusen; Tala; Wang, Jixiang; Yun, Quanzheng; Alnassar, Nafla A; Wang, Lei; Yang, Meng; Al-Jelaify, Rasha F; Liu, Kan; Gao, Shenghan; Chen, Kaifu; Alkhaldi, Samiyah R; Liu, Guiming; Zhang, Meng; Guo, Haiyan; Yu, Jun

    2013-01-01

    Date palm (Phoenix dactylifera L.) is a cultivated woody plant species with agricultural and economic importance. Here we report a genome assembly for an elite variety (Khalas), which is 605.4 Mb in size and covers >90% of the genome (~671 Mb) and >96% of its genes (~41,660 genes). Genomic sequence analysis demonstrates that P. dactylifera experienced a clear genome-wide duplication after either ancient whole genome duplications or massive segmental duplications. Genetic diversity analysis indicates that its stress resistance and sugar metabolism-related genes tend to be enriched in the chromosomal regions where the density of single-nucleotide polymorphisms is relatively low. Using transcriptomic data, we also illustrate the date palm's unique sugar metabolism that underlies fruit development and ripening. Our large-scale genomic and transcriptomic data pave the way for further genomic studies not only on P. dactylifera but also other Arecaceae plants. PMID:23917264

  8. MicroRNAs, Genomic Instability and Cancer

    PubMed Central

    Vincent, Kimberly; Pichler, Martin; Lee, Gyeong-Won; Ling, Hui

    2014-01-01

    MicroRNAs (miRNAs) are small non-coding RNA transcripts approximately 20 nucleotides in length that regulate expression of protein-coding genes via complementary binding mechanisms. The last decade has seen an exponential increase of publications on miRNAs, ranging from every aspect of basic cancer biology to diagnostic and therapeutic explorations. In this review, we summarize findings of miRNA involvement in genomic instability, an interesting but largely neglected topic to date. We discuss the potential mechanisms by which miRNAs induce genomic instability, considered to be one of the most important driving forces of cancer initiation and progression, though its precise mechanisms remain elusive. We classify genomic instability mechanisms into defects in cell cycle regulation, DNA damage response, and mitotic separation, and review the findings demonstrating the participation of specific miRNAs in such mechanisms. PMID:25141103

  9. Study reveals genomic similarities between breast and ovarian cancers

    Cancer.gov

    A new study from The Cancer Genome Atlas captured a complete view of genomic alterations in breast cancer and classified them into four intrinsic subtypes, one of which shares many genetic features with high-grade serous ovarian cancer. Depicted are breast cancer cells with the HER2 protein, which can trigger cell growth responses, lit up in bright red. (Photo credit: NIST)

  10. Overview | Office of Cancer Genomics

    Cancer.gov

    The Therapeutically Applicable Research to Generate Effective Treatments (TARGET) initiative uses comprehensive molecular characterization to determine the genetic changes that drive the initiation and progression of hard-to-treat childhood cancers. TARGET aims to identify therapeutic targets and prognostic markers so that new, more effective treatment strategies can be developed and applied. Novel pediatric cancer treatments are needed because:

  11. Genome-wide identification of significant aberrations in cancer genome

    PubMed Central

    2012-01-01

    Background Somatic Copy Number Alterations (CNAs) in human genomes are present in almost all human cancers. Systematic efforts to characterize such structural variants must effectively distinguish significant consensus events from random background aberrations. Here we introduce Significant Aberration in Cancer (SAIC), a new method for characterizing and assessing the statistical significance of recurrent CNA units. Three main features of SAIC include: (1) exploiting the intrinsic correlation among consecutive probes to assign a score to each CNA unit instead of single probes; (2) performing permutations on CNA units that preserve correlations inherent in the copy number data; and (3) iteratively detecting Significant Copy Number Aberrations (SCAs) and estimating an unbiased null distribution by applying an SCA-exclusive permutation scheme. Results We test and compare the performance of SAIC against four peer methods (GISTIC, STAC, KC-SMART, CMDS) on a large number of simulation datasets. Experimental results show that SAIC outperforms peer methods in terms of larger area under the Receiver Operating Characteristics curve and increased detection power. We then apply SAIC to analyze structural genomic aberrations acquired in four real cancer genome-wide copy number data sets (ovarian cancer, metastatic prostate cancer, lung adenocarcinoma, glioblastoma). When compared with previously reported results, SAIC successfully identifies most SCAs known to be of biological significance and associated with oncogenes (e.g., KRAS, CCNE1, and MYC) or tumor suppressor genes (e.g., CDKN2A/B). Furthermore, SAIC identifies a number of novel SCAs in these copy number data that encompass tumor related genes and may warrant further studies. Conclusions Supported by a well-grounded theoretical framework, SAIC has been developed and used to identify SCAs in various cancer copy number data sets, providing useful information to study the landscape of cancer genomes. 
    Open-source and platform-independent SAIC software is implemented using C++, together with R scripts for data formatting and Perl scripts for user interfacing, and it is easy to install and efficient to use. The source code and documentation are freely available at http://www.cbil.ece.vt.edu/software.htm. PMID:22839576

  12. Complete mitochondrial genome sequence of Aoluguya reindeer (Rangifer tarandus).

    PubMed

    Ju, Yan; Liu, Huamiao; Rong, Min; Yang, Yifeng; Wei, Haijun; Shao, Yuanchen; Chen, Xiumin; Xing, Xiumei

    2014-12-01

    The complete mitochondrial genome of the reindeer, Rangifer tarandus, was determined by accurate polymerase chain reaction. The entire genome is 16,357 bp in length and contains 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and a D-loop region, all of which are arranged in a typical vertebrate manner. The overall base composition of the reindeer's mitochondrial genome is 33.7% A, 23.1% C, 30.1% T and 13.2% G. A termination associated sequence and several conserved central sequence block domains were discovered within the control region. PMID:25469816
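Base-composition figures like those quoted above are obtained by counting each nucleotide and dividing by the sequence length. The following is a minimal sketch of that calculation. The function name and the ten-base toy sequence are assumptions for illustration; only the `reported` percentages come from the abstract.

```python
from collections import Counter

def base_composition(seq):
    """Percentage of A, C, G and T in `seq`, ignoring any other symbols."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {b: 100.0 * counts[b] / total for b in "ACGT"}

# Hypothetical toy sequence: 4 A, 2 C, 1 G, 3 T.
toy = base_composition("AATTCCGATA")
print(toy)

# Percentages as reported for the reindeer mitochondrial genome.
reported = {"A": 33.7, "C": 23.1, "T": 30.1, "G": 13.2}
gc_content = round(reported["G"] + reported["C"], 1)
print(gc_content)
```

The reported values sum to 100.1%, a discrepancy consistent with rounding each percentage to one decimal place, and they imply a G+C content of about 36.3%.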

  13. Complete genome sequence of Serratia plymuthica strain AS12

    SciTech Connect

    Neupane, Saraswoti; Finlay, Roger D.; Alstrom, Sadhna; Goodwin, Lynne A.; Kyrpides, Nikos C; Lucas, Susan; Lapidus, Alla L.; Bruce, David; Pitluck, Sam; Peters, Lin; Ovchinnikova, Galina; Chertkov, Olga; Han, James; Han, Cliff; Tapia, Roxanne; Detter, J. Chris; Land, Miriam L; Hauser, Loren John; Cheng, Jan-Fang; Ivanova, N; Pagani, Ioanna; Klenk, Hans-Peter; Woyke, Tanja; Hogberg, Nils

    2012-01-01

    A plant associated member of the family Enterobacteriaceae, Serratia plymuthica strain AS12 was isolated from rapeseed roots. It is of scientific interest due to its plant growth promoting and plant pathogen inhibiting ability. The genome of S. plymuthica AS12 comprises a 5,443,009 bp long circular chromosome, which consists of 4,952 protein-coding genes, 87 tRNA genes and 7 rRNA operons. This genome was sequenced within the 2010 DOE-JGI Community Sequencing Program (CSP2010) as part of the project entitled 'Genomics of four rapeseed plant growth promoting bacteria with antagonistic effect on plant pathogens'.

  14. Complete genome sequence of Ferroglobus placidus AEDII12DO

    SciTech Connect

    Anderson, Iain; Risso, Carla; Holmes, Dawn; Lucas, Susan; Copeland, A; Lapidus, Alla L.; Cheng, Jan-Fang; Bruce, David; Goodwin, Lynne A.; Pitluck, Sam; Saunders, Elizabeth H; Brettin, Thomas S; Detter, J. Chris; Han, Cliff; Tapia, Roxanne; Larimer, Frank W; Land, Miriam L; Hauser, Loren John; Woyke, Tanja; Lovley, Derek; Kyrpides, Nikos C; Ivanova, N

    2011-01-01

    Ferroglobus placidus belongs to the order Archaeoglobales within the archaeal phylum Euryarchaeota. Strain AEDII12DO is the type strain of the species and was isolated from a shallow marine hydrothermal system at Vulcano, Italy. It is a hyperthermophilic, anaerobic chemolithoautotroph, but it can also use a variety of aromatic compounds as electron donors. Here we describe the features of this organism together with the complete genome sequence and annotation. The 2,196,266 bp genome with its 2,567 protein-coding and 55 RNA genes was sequenced as part of a DOE Joint Genome Institute Laboratory Sequencing Program (LSP) project.

  15. RESTseq – Efficient Benchtop Population Genomics with RESTriction Fragment SEQuencing

    PubMed Central

    Stolle, Eckart; Moritz, Robin F. A.

    2013-01-01

    We present RESTseq, an improved approach for a cost efficient, highly flexible and repeatable enrichment of DNA fragments from digested genomic DNA using Next Generation Sequencing platforms including small scale Personal Genome sequencers. Easy adjustments make it suitable for a wide range of studies requiring SNP detection or SNP genotyping from fine-scale linkage mapping to population genomics and population genetics also in non-model organisms. We demonstrate the validity of our approach by comparing two honeybee and several stingless bee samples. PMID:23691128

  16. RESTseq--efficient benchtop population genomics with RESTriction Fragment SEQuencing.

    PubMed

    Stolle, Eckart; Moritz, Robin F A

    2013-01-01

    We present RESTseq, an improved approach for a cost efficient, highly flexible and repeatable enrichment of DNA fragments from digested genomic DNA using Next Generation Sequencing platforms including small scale Personal Genome sequencers. Easy adjustments make it suitable for a wide range of studies requiring SNP detection or SNP genotyping from fine-scale linkage mapping to population genomics and population genetics also in non-model organisms. We demonstrate the validity of our approach by comparing two honeybee and several stingless bee samples. PMID:23691128

  17. Massively parallel sequencing: the new frontier of hematologic genomics

    PubMed Central

    Nickerson, Deborah A.; Reiner, Alex P.

    2013-01-01

    Genomic technologies are becoming a routine part of human genetic analysis. The exponential growth in DNA sequencing capability has brought an unprecedented understanding of human genetic variation and the identification of thousands of variants that impact human health. In this review, we describe the different types of DNA variation and provide an overview of existing DNA sequencing technologies and their applications. As genomic technologies and knowledge continue to advance, they will become integral in clinical practice. To accomplish the goal of personalized genomic medicine for patients, close collaborations between researchers and clinicians will be essential to develop and curate deep databases of genetic variation and their associated phenotypes. PMID:24021669

  18. Genome sequence analysis of the model grass Brachypodium distachyon: insights into grass genome evolution

    SciTech Connect

    Schulman, Al

    2009-08-09

    Three subfamilies of grasses, the Ehrhartoideae (rice), the Panicoideae (maize, sorghum, sugar cane and millet), and the Pooideae (wheat, barley and cool season forage grasses) provide the basis of human nutrition and are poised to become major sources of renewable energy. Here we describe the complete genome sequence of the wild grass Brachypodium distachyon (Brachypodium), the first member of the Pooideae subfamily to be completely sequenced. Comparison of the Brachypodium, rice and sorghum genomes reveals a precise sequence-based history of genome evolution across a broad diversity of the grass family and identifies nested insertions of whole chromosomes into centromeric regions as a predominant mechanism driving chromosome evolution in the grasses. The relatively compact genome of Brachypodium is maintained by a balance of retroelement replication and loss. The complete genome sequence of Brachypodium, coupled to its exceptional promise as a model system for grass research, will support the development of new energy and food crops.

  19. Multiplex sequencing of plant chloroplast genomes using Solexa sequencing-by-synthesis technology

    PubMed Central

    Cronn, Richard; Liston, Aaron; Parks, Matthew; Gernandt, David S.; Shen, Rongkun; Mockler, Todd

    2008-01-01

    Organellar DNA sequences are widely used in evolutionary and population genetic studies, however, the conservative nature of chloroplast gene and genome evolution often limits phylogenetic resolution and statistical power. To gain maximal access to the historical record contained within chloroplast genomes, we have adapted multiplex sequencing-by-synthesis (MSBS) to simultaneously sequence multiple genomes using the Illumina Genome Analyzer. We PCR-amplified approximately 120 kb plastomes from eight species (seven Pinus, one Picea) in 35 reactions. Pooled products were ligated to modified adapters that included 3 bp indexing tags and samples were multiplexed at four genomes per lane. Tagged microreads were assembled by de novo and reference-guided assembly methods, using previously published Pinus plastomes as surrogate references. Assemblies for these eight genomes are estimated at 88–94% complete, with an average sequence depth of 55× to 186×. Mononucleotide repeats interrupt contig assembly with increasing repeat length, and we estimate that the limit for their assembly is 16 bp. Comparisons to 37 kb of Sanger sequence show a validated error rate of 0.056%, and conspicuous errors are evident from the assembly process. This efficient sequencing approach yields high-quality draft genomes and should have immediate applicability to genomes with comparable complexity. PMID:18753151
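The multiplexing step described in this abstract (3 bp indexing tags, four genomes per lane) implies that reads must be sorted back to their source sample before assembly. The following is a minimal sketch of that idea in Python, not the authors' pipeline. It assumes the tag occupies the first 3 bases of each read; the tag sequences, sample names, and the function name `demultiplex` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def demultiplex(reads, tag_to_sample, tag_len=3):
    """Bin reads by a leading index tag and strip the tag from each read.

    Assumption: the tag is the first `tag_len` bases of the read.
    Reads whose tag is not recognised go to the "unassigned" bin.
    """
    bins = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        sample = tag_to_sample.get(tag, "unassigned")
        bins[sample].append(insert)
    return dict(bins)

# Hypothetical 3 bp tags for four samples sharing one lane.
tags = {"AAT": "sample_1", "CCT": "sample_2", "GGT": "sample_3", "TTT": "sample_4"}
reads = ["AATGATTACA", "CCTTTGACCA", "AATCCGGTTA", "NNTACGTACG"]
binned = demultiplex(reads, tags)
```

Exact matching is the simplest policy; with tags this short, a single sequencing error in the tag sends the read to the unassigned bin rather than to the wrong sample, provided the chosen tags differ from each other at more than one position.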

  20. Multiplex sequencing of plant chloroplast genomes using Solexa sequencing-by-synthesis technology.

    PubMed

    Cronn, Richard; Liston, Aaron; Parks, Matthew; Gernandt, David S; Shen, Rongkun; Mockler, Todd

    2008-11-01

    Organellar DNA sequences are widely used in evolutionary and population genetic studies, however, the conservative nature of chloroplast gene and genome evolution often limits phylogenetic resolution and statistical power. To gain maximal access to the historical record contained within chloroplast genomes, we have adapted multiplex sequencing-by-synthesis (MSBS) to simultaneously sequence multiple genomes using the Illumina Genome Analyzer. We PCR-amplified approximately 120 kb plastomes from eight species (seven Pinus, one Picea) in 35 reactions. Pooled products were ligated to modified adapters that included 3 bp indexing tags and samples were multiplexed at four genomes per lane. Tagged microreads were assembled by de novo and reference-guided assembly methods, using previously published Pinus plastomes as surrogate references. Assemblies for these eight genomes are estimated at 88-94% complete, with an average sequence depth of 55x to 186x. Mononucleotide repeats interrupt contig assembly with increasing repeat length, and we estimate that the limit for their assembly is 16 bp. Comparisons to 37 kb of Sanger sequence show a validated error rate of 0.056%, and conspicuous errors are evident from the assembly process. This efficient sequencing approach yields high-quality draft genomes and should have immediate applicability to genomes with comparable complexity. PMID:18753151

  1. Draft genome sequences of two virulent serotypes of avian Pasteurella multocida

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Here we report the draft genome sequences of two virulent avian strains of Pasteurella multocida. Comparative analyses of these genomes were done with the published genome sequence of avirulent Pasteurella multocida strain Pm70....

  2. Sequence Analysis and Organization of the Neodiprion abietis Nucleopolyhedrovirus Genome

    PubMed Central

    Duffy, Simon P.; Young, Aaron M.; Morin, Benoit; Lucarotti, Christopher J.; Koop, Ben F.; Levin, David B.

    2006-01-01

    Of 30 baculovirus genomes that have been sequenced to date, the only nonlepidopteran baculoviruses include the dipteran Culex nigripalpus nucleopolyhedrovirus and two hymenopteran nucleopolyhedroviruses that infect the sawflies Neodiprion lecontei (NeleNPV) and Neodiprion sertifer (NeseNPV). This study provides a complete sequence and genome analysis of the nucleopolyhedrovirus that infects the balsam fir sawfly Neodiprion abietis (Hymenoptera, Symphyta, Diprionidae). The N. abietis nucleopolyhedrovirus (NeabNPV) is 84,264 bp in size, with a G+C content of 33.5%, and contains 93 predicted open reading frames (ORFs). Eleven predicted ORFs are unique to this baculovirus, 10 ORFs have a putative sequence homologue in the NeleNPV genome but not the NeseNPV genome, and 1 ORF (neab53) has a putative sequence homologue in the NeseNPV genome but not the NeleNPV genome. Specific repeat sequences are coincident with major genome rearrangements that distinguish NeabNPV and NeleNPV. Genes associated with these repeat regions encode a common amino acid motif, suggesting that they are a family of repeated contiguous gene clusters. Lepidopteran baculoviruses, similarly, have a family of repeated genes called the bro gene family. However, there is no significant sequence similarity between the NeabNPV and bro genes. Homologues of early-expressed genes such as ie-1 and lef-3 were absent in NeabNPV, as they are in the previously sequenced hymenopteran baculoviruses. Analyses of ORF upstream sequences identified potential temporally distinct genes on the basis of putative promoter elements. PMID:16809301

  3. Decoding the fine-scale structure of a breast cancer genome and transcriptome

    PubMed Central

    Volik, Stanislav; Raphael, Benjamin J.; Huang, Guiqing; Stratton, Michael R.; Bignel, Graham; Murnane, John; Brebner, John H.; Bajsarowicz, Krystyna; Paris, Pamela L.; Tao, Quanzhou; Kowbel, David; Lapuk, Anna; Shagin, Dmitri A.; Shagina, Irina A.; Gray, Joe W.; Cheng, Jan-Fang; de Jong, Pieter J.; Pevzner, Pavel; Collins, Colin

    2006-01-01

    A comprehensive understanding of cancer is predicated upon knowledge of the structure of malignant genomes underlying its many variant forms and the molecular mechanisms giving rise to them. It is well established that solid tumor genomes accumulate a large number of genome rearrangements during tumorigenesis. End Sequence Profiling (ESP) maps and clones genome breakpoints associated with all types of genome rearrangements elucidating the structural organization of tumor genomes. Here we extend the ESP methodology in several directions using the breast cancer cell line MCF-7. First, targeted ESP is applied to multiple amplified loci, revealing a complex process of rearrangement and coamplification in these regions reminiscent of breakage/fusion/bridge cycles. Second, genome breakpoints identified by ESP are confirmed using a combination of DNA sequencing and PCR. Third, in vitro functional studies assign biological function to a rearranged tumor BAC clone, demonstrating that it encodes antiapoptotic activity. Finally, ESP is extended to the transcriptome identifying four novel fusion transcripts and providing evidence that expression of fusion genes may be common in tumors. These results demonstrate the distinct advantages of ESP including: (1) the ability to detect all types of rearrangements and copy number changes; (2) straightforward integration of ESP data with the annotated genome sequence; (3) immortalization of the genome; (4) ability to generate tumor-specific reagents for in vitro and in vivo functional studies. Given these properties, ESP could play an important role in a tumor genome project. PMID:16461635

  4. Genomics of chromophobe renal cell carcinoma: implications from a rare tumor for pan-cancer studies

    PubMed Central

    Rathmell, Kimryn W.; Chen, Fengju; Creighton, Chad J.

    2015-01-01

    Chromophobe Renal Cell Carcinoma (ChRCC) is a rare subtype of the renal cell carcinomas, a heterogeneous group of cancers arising from the nephron. Recently, The Cancer Genome Atlas (TCGA) profiled this understudied disease using multiple data platforms, including whole exome sequencing, whole genome sequencing (WGS), and mitochondrial DNA (mtDNA) sequencing. The insights gained from this study would have implications for other types of kidney cancer as well as for cancer biology in general. Global molecular patterns in ChRCC provided clues as to this cancer's cell of origin, which is distinct from that of the other renal cell carcinomas, illustrating an approach that might be applied towards elucidating the cell of origin of other cancer types. MtDNA sequencing revealed loss-of-function mutations in NADH dehydrogenase subunits, highlighting the role of deregulated metabolism in this and other cancers. Analysis of WGS data led to the discovery of recurrent genomic rearrangements involving the TERT promoter region, which were associated with very high expression levels of TERT, pointing to a potential mechanism for TERT deregulation that might be found in other cancers. WGS data, generated by large scale efforts such as TCGA and the International Cancer Genomics Consortium (ICGC), could be more extensively mined across various cancer types, to uncover structural variants, mtDNA mutations, themes of tumor metabolic properties, as well as noncoding point mutations. TCGA's data on ChRCC should continue to serve as a resource for future pan-cancer as well as kidney cancer studies, and highlight the value of investigations into rare tumor types to globally inform principles of cancer biology. PMID:25859550

  5. Complete genome sequence of equine herpesvirus type 9.

    PubMed

    Fukushi, Hideto; Yamaguchi, Tsuyoshi; Yamada, Souichi

    2012-12-01

    Equine herpesvirus type 9 (EHV-9), which we isolated from a case of epizootic encephalitis in a herd of Thomson's gazelles (Gazella thomsoni) in 1993, has been known to cause fatal encephalitis in Thomson's gazelle, giraffe, and polar bear in natural infections. Our previous report indicated that EHV-9 was similar to the equine pathogen equine herpesvirus type 1 (EHV-1), which mainly causes abortion, respiratory infection, and equine herpesvirus myeloencephalopathy. We determined the genome sequence of EHV-9. The genome has a length of 148,371 bp and all 80 of the open reading frames (ORFs) found in the genome of EHV-1. The nucleotide sequences of the ORFs in EHV-9 were 86 to 95% identical to those in EHV-1. The whole genome sequence should help to reveal the neuropathogenicity of EHV-9. PMID:23166237

  6. Transcriptome and genome sequencing uncovers functional variation in humans

    PubMed Central

    Lappalainen, Tuuli; Sammeth, Michael; Friedländer, Marc R; ‘t Hoen, Peter AC; Monlong, Jean; Rivas, Manuel A; Gonzàlez-Porta, Mar; Kurbatova, Natalja; Griebel, Thasso; Ferreira, Pedro G; Barann, Matthias; Wieland, Thomas; Greger, Liliana; van Iterson, Maarten; Almlöf, Jonas; Ribeca, Paolo; Pulyakhina, Irina; Esser, Daniela; Giger, Thomas; Tikhonov, Andrew; Sultan, Marc; Bertier, Gabrielle; MacArthur, Daniel G; Lek, Monkol; Lizano, Esther; Buermans, Henk PJ; Padioleau, Ismael; Schwarzmayr, Thomas; Karlberg, Olof; Ongen, Halit; Kilpinen, Helena; Beltran, Sergi; Gut, Marta; Kahlem, Katja; Amstislavskiy, Vyacheslav; Stegle, Oliver; Pirinen, Matti; Montgomery, Stephen B; Donnelly, Peter; McCarthy, Mark I; Flicek, Paul; Strom, Tim M; Lehrach, Hans; Schreiber, Stefan; Sudbrak, Ralf; Carracedo, Ángel; Antonarakis, Stylianos E; Häsler, Robert; Syvänen, Ann-Christine; van Ommen, Gert-Jan; Brazma, Alvis; Meitinger, Thomas; Rosenstiel, Philip; Guigó, Roderic; Gut, Ivo G; Estivill, Xavier; Dermitzakis, Emmanouil T

    2013-01-01

    Summary Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of mRNA and miRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project – the first uniformly processed RNA-seq data from multiple human populations with high-quality genome sequences. We discovered extremely widespread genetic variation affecting regulation of the majority of genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on cellular mechanisms of regulatory and loss-of-function variation, and allowed us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome. PMID:24037378

  7. Comparison of mitochondrial genome sequences of pangolins (Mammalia, Pholidota).

    PubMed

    Hassanin, Alexandre; Hugot, Jean-Pierre; van Vuuren, Bettine Jansen

    2015-04-01

    The complete mitochondrial genome was sequenced for three species of pangolins, Manis javanica, Phataginus tricuspis, and Smutsia temminckii, and comparisons were made with two other species, Manis pentadactyla and Phataginus tetradactyla. The genome of Manidae contains the 37 genes found in a typical mammalian genome, and the structure of the control region is highly conserved among species. In Manis, the overall base composition differs from that found in African genera. Phylogenetic analyses support the monophyly of the genera Manis, Phataginus, and Smutsia, as well as the basal division between Maninae and Smutsiinae. Comparisons with GenBank sequences reveal that the reference genomes of M. pentadactyla and P. tetradactyla (accession numbers NC_016008 and NC_004027) were sequenced from misidentified taxa, and that a new species of tree pangolin should be described in Gabon. PMID:25746396

  8. Glossary | Office of Cancer Genomics

    Cancer.gov

    Bioinformatics: The use of computing tools to manage and analyze genomic and molecular biological data.

  9. Sequencing the Genome of the Heirloom Watermelon Cultivar Charleston Gray

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genome of the watermelon cultivar Charleston Gray, a major heirloom which has been used in breeding programs of many watermelon cultivars, was sequenced. Our strategy involved a hybrid approach using the Illumina and 454/Titanium next-generation sequencing technologies. For Illumina, shotgun g...

  10. GENOMIC SEQUENCE ANALYSIS OF LEPTOSPIRA BORGPETERSENII SEROVAR HARDJO

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A genomic library from Leptospira borgpetersenii serovar hardjo strain JB197 was prepared by mechanically shearing the DNA and inserting it into a positive selection vector. DNA was prepared from approximately 22,000 random clones and used as templates for automated sequencing. Sequence data was c...

  11. Environmental Genome Shotgun Sequencing of the Sargasso Sea

    E-print Network

    Bruns, Tom

    Environmental Genome Shotgun Sequencing of the Sargasso Sea J. Craig Venter,1 * Karin Remington,1 collected from the Sargasso Sea near Bermuda. A total of 1.045 billion base pairs of nonredundant sequence characterization. To help ensure a tractable pilot study, we sampled in the Sargasso Sea, a nutrient-limited, open

  12. Draft Genome Sequence of Prosthecomicrobium hirschii ATCC 27832T

    PubMed Central

    Daniel, Jeremy J.; Givan, Scott A.; Brun, Yves V.

    2015-01-01

    We report the draft genome sequence of Prosthecomicrobium hirschii ATCC 27832T, an alphaproteobacterium with remarkable cellular morphologies. The chromosome comprises 6,484,983 bp in six scaffolds with a G+C content of 69%, and 6,066 potential coding sequences. PMID:26586892

  13. Draft Genome Sequence of Lactobacillus fermentum NB-22

    PubMed Central

    Shkoporov, A. N.; Efimov, B. A.; Pikina, A. P.; Borisova, O. Y.; Gladko, I. A.; Postnikova, E. A.; Lordkipanidze, A. E.; Kafarskaia, L. I.

    2015-01-01

    We announce here a draft genome sequence of Lactobacillus fermentum NB-22, a strain isolated from human vaginal microbiota. The assembled sequence consists of 190 contigs, joined into 137 scaffolds, and the total size is 2.01 Mb. PMID:26272572

  14. Rosetta Genomics Announces Next-Generation Sequencing Research Collaboration with

    E-print Network

    Pilpel, Yitzhak

    discoveries and technological applications. Working in collaboration with the Institute provides us identification of microRNA sequences. These advances will allow us to incorporate sequencing in more of our treatment. Rosetta Genomics estimates that, in the U.S. alone, 200,000 patients a year may benefit from

  15. Complete Genome Sequences of Mandrillus leucophaeus and Papio ursinus Cytomegaloviruses.

    PubMed

    Blewett, Earl Linwood; Sherrod, Carly J; Texier, Jordan R; Conrad, Tom M; Dittmer, Dirk P

    2015-01-01

    The complete genome sequences of Mandrillus leucophaeus and Papio ursinus cytomegaloviruses were determined. An isolate from a drill monkey, OCOM6-2, and an isolate from a chacma baboon, OCOM4-52, were subjected to pyrosequencing and assembled. Comparative alignment of published primate cytomegaloviruses (CMVs) showed variable sequence conservation between species. PMID:26251484

  16. Distribution and intensity of constraint in mammalian genomic sequence

    E-print Network

    Sidow, Arend

    sequence conservation to identify regions of functional importance in mammals (Pennacchio et al. 2001 that comparative sequence analysis is a powerful paradigm for the discovery of those functional regions in the human genome whose experimental discovery is difficult (O'Brien et al. 1999; Hardison 2000; Pennacchio

  17. Complete Genome Sequence of the Alfalfa latent virus

    PubMed Central

    Shao, Jonathan; Postnikova, Olga A.

    2015-01-01

    The first complete genome sequence of the Alfalfa latent carlavirus (ALV) was obtained by primer walking and Illumina RNA sequencing. The virus differs substantially from the Czech ALV isolate and the Pea streak virus isolate from Wisconsin. The absence of a clear nucleic acid-binding protein indicates ALV divergence from other carlaviruses. PMID:25883281

  18. Complete Genomic Sequence of Issyk-Kul Virus

    PubMed Central

    Marston, Denise A.; Ellis, Richard J.; Fooks, Anthony R.; Hewson, Roger

    2015-01-01

    Issyk-Kul virus (ISKV) is an ungrouped virus tentatively assigned to the Bunyaviridae family and is associated with an acute febrile illness in several central Asian countries. Using next-generation sequencing technologies, we report here the full-genome sequence for this novel unclassified arboviral pathogen circulating in central Asia. PMID:26139711

  19. Complete Genome Sequence of Kocuria palustris MU14/1

    PubMed Central

    Foecking, Mark F.

    2015-01-01

    Presented here is the first completely assembled genome sequence of Kocuria palustris, an actinobacterial species with broad ecological distribution. The single, circular chromosome of K. palustris MU14/1 comprises 2,854,447 bp, has a G+C content of 70.5%, and contains a deduced gene set of 2,521 coding sequences. PMID:26472837
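Several records in this listing report G+C content as a summary statistic (70.5% for K. palustris here). As a small illustration of how that figure is derived from a sequence, the Python sketch below counts G and C bases as a percentage of sequence length. The function name `gc_content` and the toy input are illustrative assumptions, not taken from any of the cited papers.

```python
def gc_content(sequence):
    """Return the percentage of bases in `sequence` that are G or C.

    Case-insensitive; every character counts toward the length, so
    ambiguous bases such as N lower the reported percentage.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)

# Toy example: 2 of 4 bases are G or C.
toy_gc = gc_content("atGC")
```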

  20. PHYTOPHTHORA GENOME SEQUENCES UNCOVER EVOLUTIONARY ORIGINS AND MECHANISMS OF PATHOGENESIS

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Draft genome sequences of the soybean pathogen Phytophthora sojae and the sudden oak death pathogen Phytophthora ramorum have been determined. Oomycetes such as these Phytophthora species share the kingdom Stramenopiles with photosynthetic algae such as diatoms, and the Phytophthora sequences sugges...

  1. Complete Genome Sequence of Caulobacter crescentus Siphophage Sansa

    PubMed Central

    Vara, Leonardo; Kane, Ashley A.; Cahill, Jesse L.; Rasche, Eric S.

    2015-01-01

    Caulobacter crescentus is a Gram-negative dimorphic model organism used to study cell differentiation. Siphophage Sansa is a newly isolated siphophage with an icosahedral capsid that infects C. crescentus. Sansa shares no sequence similarity to other phages deposited in GenBank. Here, we describe its genome sequence and general features. PMID:26450723

  2. Draft Genome Sequences of Two Toxigenic Corynebacterium ulcerans Strains

    PubMed Central

    Fournier, Eric; Massé, Cynthia; Charest, Hugues; Bernard, Kathryn; Côté, Jean-Charles; Tremblay, Cécile

    2015-01-01

    Here, we present the draft genome sequences of two toxigenic Corynebacterium ulcerans strains isolated from two different patients: one from a blood sample and the other from a scar exudate following surgery. Although these two strains harbor the diphtheria toxin gene tox, no full prophage sequences were found in the flanking regions. PMID:26112794

  3. Complete Genome Sequence of Caulobacter crescentus Siphophage Sansa.

    PubMed

    Vara, Leonardo; Kane, Ashley A; Cahill, Jesse L; Rasche, Eric S; Kuty Everett, Gabriel F

    2015-01-01

    Caulobacter crescentus is a Gram-negative dimorphic model organism used to study cell differentiation. Siphophage Sansa is a newly isolated siphophage with an icosahedral capsid that infects C. crescentus. Sansa shares no sequence similarity to other phages deposited in GenBank. Here, we describe its genome sequence and general features. PMID:26450723

  4. Genome sequence of Stachybotrys chartarum Strain 51-11

    EPA Science Inventory

    Stachybotrys chartarum strain 51-11 genome was sequenced by shotgun sequencing utilizing Illumina Hiseq 2000 and PacBio long read technology. Since Stachybotrys chartarum has been implicated in health impacts within water-damaged buildings, any information extracted from the geno...

  5. Complete Genome Sequence of Southern tomato virus Identified in China Using Next-Generation Sequencing

    PubMed Central

    Padmanabhan, Chellappan; Zheng, Yi; Li, Rugang; Sun, Shu-E; Zhang, Deyong; Liu, Yong; Fei, Zhangjun

    2015-01-01

    The complete genome sequence of Southern tomato virus (STV), a double-stranded RNA virus that affects tomato in China, was determined using small RNA deep sequencing. This Chinese isolate shares 99% sequence identity to other isolates from Mexico, France, Spain, and the United States. This is the first report of STV infecting tomatoes in Asia. PMID:26494671

  6. Complete genome sequence of southern tomato virus identified from China using next generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Complete genome sequence of a double-stranded RNA (dsRNA) virus, southern tomato virus (STV), on tomatoes in China, was elucidated using small RNAs deep sequencing. The identified STV_CN12 shares 99% sequence identity to other isolates from Mexico, France, Spain, and U.S. This is the first report ...

  7. Genomic Sequencing of Single Microbial Cells from Environmental Samples

    SciTech Connect

    Ishoey, Thomas; Woyke, Tanja; Stepanauskas, Ramunas; Novotny, Mark; Lasken, Roger S.

    2008-02-01

    Recently developed techniques allow genomic DNA sequencing from single microbial cells [Lasken RS: Single-cell genomic sequencing using multiple displacement amplification, Curr Opin Microbiol 2007, 10:510-516]. Here, we focus on research strategies for putting these methods into practice in the laboratory setting. An immediate consequence of single-cell sequencing is that it provides an alternative to culturing organisms as a prerequisite for genomic sequencing. The microgram amounts of DNA required as template are amplified from a single bacterium by a method called multiple displacement amplification (MDA) avoiding the need to grow cells. The ability to sequence DNA from individual cells will likely have an immense impact on microbiology considering the vast numbers of novel organisms, which have been inaccessible unless culture-independent methods could be used. However, special approaches have been necessary to work with amplified DNA. MDA may not recover the entire genome from the single copy present in most bacteria. Also, some sequence rearrangements can occur during the DNA amplification reaction. Over the past two years many research groups have begun to use MDA, and some practical approaches to single-cell sequencing have been developed. We review the consensus that is emerging on optimum methods, reliability of amplified template, and the proper interpretation of 'composite' genomes which result from the necessity of combining data from several single-cell MDA reactions in order to complete the assembly. Preferred laboratory methods are considered on the basis of experience at several large sequencing centers where >70% of genomes are now often recovered from single cells. Methods are reviewed for preparation of bacterial fractions from environmental samples, single-cell isolation, DNA amplification by MDA, and DNA sequencing.

  8. Neuroblastoma | Office of Cancer Genomics

    Cancer.gov

    Neuroblastoma (NBL) is a cancer that arises in immature nerve cells of the sympathetic nervous system, primarily affecting infants and children. It can have a devastating impact on patients and their families.

  9. Research | Office of Cancer Genomics

    Cancer.gov

    Each TARGET project team is characterizing a “discovery” cohort of patient cases to identify molecular alterations in primary tumors and relapsed tumors (when available). Whenever possible, they verify the presence of cancer-associated mutations in the same “discovery” cohort. In addition, the teams validate the alterations found in the discovery cohort using a distinct group of patient cases to estimate the population prevalence of the alterations in a given cancer subtype.

  10. Osteosarcoma | Office of Cancer Genomics

    Cancer.gov

    Osteosarcoma (OS) is the most common type of bone cancer in children and adolescents. It is most frequently diagnosed in adolescent patients experiencing periods of rapid growth. As with other childhood cancers being studied by TARGET, improvements in survival outcomes for OS have plateaued despite attempts in refining the standard treatment protocol. Additionally, patients endure rigorous therapy regimens regardless of whether the disease is localized or metastatic.

  11. Research | Office of Cancer Genomics

    Cancer.gov

    The CTD2 initiative seeks novel insights into cancer etiology that can be developed and in the future applied to improve therapeutic strategies. To achieve this goal, each center utilizes a distinct array of advanced computational and functional systems biology approaches. These methods allow reconstruction of cell-context specific gene networks that underlie each cancer subtype. The CTD2 Centers gain power from having both complementary and reinforcing expertise.

  12. CancerGenes: a gene selection resource for cancer genome projects.

    PubMed

    Higgins, Maureen E; Claremont, Martine; Major, John E; Sander, Chris; Lash, Alex E

    2007-01-01

    The genome sequence framework provided by the human genome project allows us to precisely map human genetic variations in order to study their association with disease and their direct effects on gene function. Since the description of tumor suppressor genes and oncogenes several decades ago, both germ-line variations and somatic mutations have been established to be important in cancer, in terms of risk, oncogenesis, prognosis and response to therapy. The Cancer Genome Atlas initiative proposed by the NIH is poised to elucidate the contribution of somatic mutations to cancer development and progression through the re-sequencing of a substantial fraction of the total collection of human genes, in hundreds of individual tumors and spanning several tumor types. We have developed the CancerGenes resource to simplify the process of gene selection and prioritization in large collaborative projects. CancerGenes combines gene lists annotated by experts with information from key public databases. Each gene is annotated with gene name(s), functional description, organism, chromosome number, location, Entrez Gene ID, GO terms, InterPro descriptions, gene structure, protein length, transcript count, and experimentally determined transcript control regions, as well as links to Entrez Gene, COSMIC, and iHOP gene pages and the UCSC and Ensembl genome browsers. The user-friendly interface provides for searching, sorting and intersection of gene lists. Users may view tabulated results through a web browser or may dynamically download them as a spreadsheet table. CancerGenes is available at http://cbio.mskcc.org/cancergenes. PMID:17088289

  13. Sequence Analysis of the Genome of Carnation (Dianthus caryophyllus L.)

    PubMed Central

    Yagi, Masafumi; Kosugi, Shunichi; Hirakawa, Hideki; Ohmiya, Akemi; Tanase, Koji; Harada, Taro; Kishimoto, Kyutaro; Nakayama, Masayoshi; Ichimura, Kazuo; Onozaki, Takashi; Yamaguchi, Hiroyasu; Sasaki, Nobuhiro; Miyahara, Taira; Nishizaki, Yuzo; Ozeki, Yoshihiro; Nakamura, Noriko; Suzuki, Takamasa; Tanaka, Yoshikazu; Sato, Shusei; Shirasawa, Kenta; Isobe, Sachiko; Miyamura, Yoshinori; Watanabe, Akiko; Nakayama, Shinobu; Kishida, Yoshie; Kohara, Mitsuyo; Tabata, Satoshi

    2014-01-01

    The whole-genome sequence of carnation (Dianthus caryophyllus L.) cv. ‘Francesco’ was determined using a combination of different new-generation multiplex sequencing platforms. The total length of the non-redundant sequences was 568 887 315 bp, consisting of 45 088 scaffolds, which covered 91% of the 622 Mb carnation genome estimated by k-mer analysis. The N50 values of contigs and scaffolds were 16 644 bp and 60 737 bp, respectively, and the longest scaffold was 1 287 144 bp. The average GC content of the contig sequences was 36%. A total of 1050, 13, 92 and 143 genes for tRNAs, rRNAs, snoRNA and miRNA, respectively, were identified in the assembled genomic sequences. For protein-encoding genes, 43 266 complete and partial gene structures excluding those in transposable elements were deduced. Gene coverage was approximately 98%, as deduced from the coverage of the core eukaryotic genes. Intensive characterization of the assigned carnation genes and comparison with those of other plant species revealed characteristic features of the carnation genome. The results of this study will serve as a valuable resource for fundamental and applied research of carnation, especially for breeding new carnation varieties. Further information on the genomic sequences is available at http://carnation.kazusa.or.jp. PMID:24344172
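The assembly statistics quoted in this abstract (N50 and the fraction of the estimated genome covered) can be reproduced with a few lines of arithmetic. The Python sketch below uses the common definition of N50: the length of the shortest sequence in the smallest set of longest sequences that together make up at least half of the total assembly length. The function name `n50` and the toy length list are illustrative assumptions; only the two genome-size figures come from the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("no lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is reached.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length

# Toy example: total is 20, and 6 + 5 = 11 first reaches half of it.
toy_n50 = n50([2, 3, 4, 5, 6])

# Coverage figure from the abstract: assembled bases / estimated genome size.
assembled_bp = 568_887_315
estimated_genome_bp = 622_000_000
coverage_percent = round(100 * assembled_bp / estimated_genome_bp)
```

The rounded coverage agrees with the 91% reported in the abstract.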

  14. Sequence analysis of the genome of carnation (Dianthus caryophyllus L.).

    PubMed

    Yagi, Masafumi; Kosugi, Shunichi; Hirakawa, Hideki; Ohmiya, Akemi; Tanase, Koji; Harada, Taro; Kishimoto, Kyutaro; Nakayama, Masayoshi; Ichimura, Kazuo; Onozaki, Takashi; Yamaguchi, Hiroyasu; Sasaki, Nobuhiro; Miyahara, Taira; Nishizaki, Yuzo; Ozeki, Yoshihiro; Nakamura, Noriko; Suzuki, Takamasa; Tanaka, Yoshikazu; Sato, Shusei; Shirasawa, Kenta; Isobe, Sachiko; Miyamura, Yoshinori; Watanabe, Akiko; Nakayama, Shinobu; Kishida, Yoshie; Kohara, Mitsuyo; Tabata, Satoshi

    2014-06-01

    The whole-genome sequence of carnation (Dianthus caryophyllus L.) cv. 'Francesco' was determined using a combination of different new-generation multiplex sequencing platforms. The total length of the non-redundant sequences was 568,887,315 bp, consisting of 45,088 scaffolds, which covered 91% of the 622 Mb carnation genome estimated by k-mer analysis. The N50 values of contigs and scaffolds were 16,644 bp and 60,737 bp, respectively, and the longest scaffold was 1,287,144 bp. The average GC content of the contig sequences was 36%. A total of 1050, 13, 92 and 143 genes for tRNAs, rRNAs, snoRNA and miRNA, respectively, were identified in the assembled genomic sequences. For protein-encoding genes, 43,266 complete and partial gene structures excluding those in transposable elements were deduced. Gene coverage was ~98%, as deduced from the coverage of the core eukaryotic genes. Intensive characterization of the assigned carnation genes and comparison with those of other plant species revealed characteristic features of the carnation genome. The results of this study will serve as a valuable resource for fundamental and applied research of carnation, especially for breeding new carnation varieties. Further information on the genomic sequences is available at http://carnation.kazusa.or.jp. PMID:24344172

  15. Initial sequence and comparative analysis of the cat genome

    PubMed Central

    Pontius, Joan U.; Mullikin, James C.; Smith, Douglas R.; Lindblad-Toh, Kerstin; Gnerre, Sante; Clamp, Michele; Chang, Jean; Stephens, Robert; Neelam, Beena; Volfovsky, Natalia; Schäffer, Alejandro A.; Agarwala, Richa; Narfström, Kristina; Murphy, William J.; Giger, Urs; Roca, Alfred L.; Antunes, Agostinho; Menotti-Raymond, Marilyn; Yuhki, Naoya; Pecon-Slattery, Jill; Johnson, Warren E.; Bourque, Guillaume; Tesler, Glenn; O’Brien, Stephen J.

    2007-01-01

    The genome sequence (1.9-fold coverage) of an inbred Abyssinian domestic cat was assembled, mapped, and annotated with a comparative approach that involved cross-reference to annotated genome assemblies of six mammals (human, chimpanzee, mouse, rat, dog, and cow). The results resolved chromosomal positions for 663,480 contigs, 20,285 putative feline gene orthologs, and 133,499 conserved sequence blocks (CSBs). Additional annotated features include repetitive elements, endogenous retroviral sequences, nuclear mitochondrial (numt) sequences, micro-RNAs, and evolutionary breakpoints that suggest historic balancing of translocation and inversion incidences in distinct mammalian lineages. Large numbers of single nucleotide polymorphisms (SNPs), deletion insertion polymorphisms (DIPs), and short tandem repeats (STRs), suitable for linkage or association studies were characterized in the context of long stretches of chromosome homozygosity. In spite of the light coverage capturing ~65% of euchromatin sequence from the cat genome, these comparative insights shed new light on the tempo and mode of gene/genome evolution in mammals, promise several research applications for the cat, and also illustrate that a comparative approach using more deeply covered mammals provides an informative, preliminary annotation of a light (1.9-fold) coverage mammal genome sequence. PMID:17975172

  16. Legume genomics: understanding biology through DNA and RNA sequencing

    PubMed Central

    O'Rourke, Jamie A.; Bolon, Yung-Tsi; Bucciarelli, Bruna; Vance, Carroll P.

    2014-01-01

    Background The legume family (Leguminosae) consists of approx. 17 000 species. A few of these species, including, but not limited to, Phaseolus vulgaris, Cicer arietinum and Cajanus cajan, are important dietary components, providing protein for approx. 300 million people worldwide. Additional species, including soybean (Glycine max) and alfalfa (Medicago sativa), are important crops utilized mainly in animal feed. In addition, legumes are important contributors to biological nitrogen, forming symbiotic relationships with rhizobia to fix atmospheric N2 and providing up to 30 % of available nitrogen for the next season of crops. The application of high-throughput genomic technologies including genome sequencing projects, genome re-sequencing (DNA-seq) and transcriptome sequencing (RNA-seq) by the legume research community has provided major insights into genome evolution, genomic architecture and domestication. Scope and Conclusions This review presents an overview of the current state of legume genomics and explores the role that next-generation sequencing technologies play in advancing legume genomics. The adoption of next-generation sequencing and implementation of associated bioinformatic tools has allowed researchers to turn each species of interest into their own model organism. To illustrate the power of next-generation sequencing, an in-depth overview of the transcriptomes of both soybean and white lupin (Lupinus albus) is provided. The soybean transcriptome focuses on analysing seed development in two near-isogenic lines, examining the role of transporters, oil biosynthesis and nitrogen utilization. The white lupin transcriptome analysis examines how phosphate deficiency alters gene expression patterns, inducing the formation of cluster roots. Such studies illustrate the power of next-generation sequencing and bioinformatic analyses in elucidating the gene networks underlying biological processes. PMID:24769535

  17. Draft Genome Sequences of Two South African Bacillus anthracis Strains

    PubMed Central

    Lekota, Kgaugelo E.; Mafofo, Joseph; Madoroba, Evelyn; Rees, Jasper; van Heerden, Henriette

    2015-01-01

    Bacillus anthracis is a Gram-positive bacterium that causes anthrax, mainly in herbivores through exotoxins and capsule produced on plasmids, pXO1 and pXO2. This paper compares the whole-genome sequences of two B. anthracis strains from an endemic region and a sporadic outbreak in South Africa. Sequencing was done using next-generation sequencing technologies. PMID:26586878

  18. Draft Genome Sequences of Two South African Bacillus anthracis Strains.

    PubMed

    Lekota, Kgaugelo E; Mafofo, Joseph; Madoroba, Evelyn; Rees, Jasper; van Heerden, Henriette; Muchadeyi, Farai C

    2015-01-01

    Bacillus anthracis is a Gram-positive bacterium that causes anthrax, mainly in herbivores through exotoxins and capsule produced on plasmids, pXO1 and pXO2. This paper compares the whole-genome sequences of two B. anthracis strains from an endemic region and a sporadic outbreak in South Africa. Sequencing was done using next-generation sequencing technologies. PMID:26586878

  19. A rapid whole genome sequencing and analysis system supporting genomic epidemiology (7th Annual SFAF Meeting, 2012)

    ScienceCinema

    FitzGerald, Michael [Broad Institute]

    2013-02-12

    Michael FitzGerald on "A rapid whole genome sequencing and analysis system supporting genomic epidemiology" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  20. A rapid whole genome sequencing and analysis system supporting genomic epidemiology (7th Annual SFAF Meeting, 2012)

    SciTech Connect

    FitzGerald, Michael

    2012-06-01

    Michael FitzGerald on "A rapid whole genome sequencing and analysis system supporting genomic epidemiology" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  1. Sequence-Based Mapping of the Polyploid Wheat Genome

    PubMed Central

    Saintenac, Cyrille; Jiang, Dayou; Wang, Shichen; Akhunov, Eduard

    2013-01-01

    The emergence of new sequencing technologies has provided fast and cost-efficient strategies for high-resolution mapping of complex genomes. Although these approaches hold great promise to accelerate genome analysis, their application in studying genetic variation in wheat has been hindered by the complexity of its polyploid genome. Here, we applied the next-generation sequencing of a wheat doubled-haploid mapping population for high-resolution gene mapping and tested its utility for ordering shotgun sequence contigs of a flow-sorted wheat chromosome. A bioinformatical pipeline was developed for reliable variant analysis of sequence data generated for polyploid wheat mapping populations. The results of variant mapping were consistent with the results obtained using the wheat 9000 SNP iSelect assay. A reference map of the wheat genome integrating 2740 gene-associated single-nucleotide polymorphisms from the wheat iSelect assay, 1351 diversity array technology, 118 simple sequence repeat/sequence-tagged sites, and 416,856 genotyping-by-sequencing markers was developed. By analyzing the sequenced megabase-size regions of the wheat genome we showed that mapped markers are located within 40-100 kb from genes providing a possibility for high-resolution mapping at the level of a single gene. In our population, gene loci controlling a seed color phenotype cosegregated with 2459 markers including one that was located within the red seed color gene. We demonstrate that the high-density reference map presented here is a useful resource for gene mapping and linking physical and genetic maps of the wheat genome. PMID:23665877

  2. Detecting somatic mutations in genomic sequences by means of Kolmogorov–Arnold analysis

    PubMed Central

    Gurzadyan, V. G.; Yan, H.; Vlahovic, G.; Kashin, A.; Killela, P.; Reitman, Z.; Sargsyan, S.; Yegorian, G.; Milledge, G.; Vlahovic, B.

    2015-01-01

    The Kolmogorov–Arnold stochasticity parameter technique is applied for the first time to the study of cancer genome sequencing, to reveal mutations. Using data generated by next-generation sequencing technologies, we have analysed the exome sequences of brain tumour patients with matched tumour and normal blood. We show that mutations contained in sequencing data can be revealed using this technique, thus providing a new methodology for determining subsequences of given length containing mutations, i.e. its value differs from those of subsequences without mutations. A potential application for this technique involves simplifying the procedure of finding segments with mutations, speeding up genomic research and accelerating its implementation in clinical diagnostics. Moreover, the prediction of a mutation associated with a family of frequent mutations in numerous types of cancers based purely on the value of the Kolmogorov function indicates that this applied marker may recognize genomic sequences that are in extremely low abundance and can be used in revealing new types of mutations. PMID:26361546
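
    The abstract does not say how sequence data are converted to numbers, so the following is only a minimal sketch of the two standard quantities behind the technique: the stochasticity parameter, lambda_n = sqrt(n) * sup|F_n(x) - F(x)|, and Kolmogorov's function, Phi(lambda) = sum over all integers k of (-1)^k exp(-2 k^2 lambda^2). The function names, the generic numeric sample and the uniform reference distribution are assumptions made for illustration, not the authors' procedure.

```python
import math


def stochasticity_parameter(sample, cdf):
    """lambda_n = sqrt(n) * sup |F_n(x) - F(x)| for a sample and a reference CDF."""
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must not be empty")
    largest_gap = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        largest_gap = max(largest_gap, abs((i + 1) / n - f), abs(f - i / n))
    return math.sqrt(n) * largest_gap


def kolmogorov_phi(lam):
    """Kolmogorov's limiting distribution Phi(lambda), summed until terms vanish."""
    if lam <= 0:
        return 0.0
    total = 1.0  # k = 0 term
    k = 1
    while True:
        term = math.exp(-2.0 * k * k * lam * lam)
        if term < 1e-17:
            break
        total += 2.0 * (-1) ** k * term  # k and -k contribute equally
        k += 1
    return min(1.0, max(0.0, total))


def uniform_cdf(x):
    """Reference CDF of the uniform distribution on [0, 1] (an assumed choice)."""
    return min(1.0, max(0.0, x))


# Hypothetical values in [0, 1]; real input would come from encoded subsequences.
window = [0.05, 0.21, 0.38, 0.52, 0.67, 0.81, 0.94]
lam = stochasticity_parameter(window, uniform_cdf)
phi = kolmogorov_phi(lam)
```

    Under these assumptions, a window whose `phi` value departs from the values typical of the other windows would be the kind of subsequence the abstract describes as a candidate for containing mutations.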

  3. Medulloblastoma | Office of Cancer Genomics

    Cancer.gov

    CGCI developed the Medulloblastoma Project to apply newly emerging genomic methods towards the discovery of novel genetic alterations in medulloblastoma (MB). MB is the most common malignant brain tumor in children, accounting for approximately 20% of all pediatric brain tumors. Despite significant progress in treatment over the last several decades, about 50% of MB patients do not live more than 5 years after diagnosis.

  4. TARGET Publication Guidelines | Office of Cancer Genomics

    Cancer.gov

    Like other NCI large-scale genomics initiatives, TARGET is a community resource project and data are made available rapidly after validation for use by other researchers. To act in accord with the Fort Lauderdale principles and support the continued prompt public release of large-scale genomic data prior to publication, researchers who plan to prepare manuscripts containing descriptions of TARGET pediatric cancer data that would be of comparable scope to an initial TARGET disease-specific comprehensive, global analysis publication, and journal editors who receive such manuscripts, are stron

  5. The Cancer Genome Atlas: Generating a “Parts List” for Cancer

    Cancer.gov

    When I was working in a genomics laboratory in the 1990s, I sequenced a section of human chromosome 7. The part I focused on coded for a tumor-suppressor gene. At that time, we had to read the DNA sequence letter by letter in short stretches and then figure out where the genes resided. It was tedious and required extreme care and attention to detail, plus a fair amount of educated guesses.

  6. Pan-cancer analysis of the extent and consequences of intratumor heterogeneity | Office of Cancer Genomics

    Cancer.gov

    Intratumor heterogeneity (ITH) drives neoplastic progression and therapeutic resistance. We used the bioinformatics tools 'expanding ploidy and allele frequency on nested subpopulations' (EXPANDS) and PyClone to detect clones that are present at a ≥10% frequency in 1,165 exome sequences from tumors in The Cancer Genome Atlas. 86% of tumors across 12 cancer types had at least two clones. ITH in the morphology of nuclei was associated with genetic ITH (Spearman's correlation coefficient, ρ = 0.24-0.41; P < 0.001).

  7. Cancer genomics: why rare is valuable.

    PubMed

    Jamshidi, Farzad; Nielsen, Torsten O; Huntsman, David G

    2015-04-01

    Rare conditions are sometimes ignored in biomedical research because of difficulties in obtaining specimens and limited interest from fund raisers. However, the study of rare diseases such as unusual cancers has again and again led to breakthroughs in our understanding of more common diseases. It is therefore unsurprising that with the development and accessibility of next-generation sequencing, much has been learnt from studying cancers that are rare and in particular those with uniform biological and clinical behavior. Herein, we describe how shotgun sequencing of cancers such as granulosa cell tumor, endometrial stromal sarcoma, epithelioid hemangioendothelioma, ameloblastoma, small-cell carcinoma of the ovary, clear-cell carcinoma of the ovary, nonepithelial ovarian tumors, chondroblastoma, and giant cell tumor of the bone has led to rapidly translatable discoveries in diagnostics and tumor taxonomies, as well as providing insights into cancer biology. PMID:25676695

  8. Tyrosine kinome sequencing of pediatric acute lymphoblastic leukemia: a report from the Children's Oncology Group TARGET Project | Office of Cancer Genomics

    Cancer.gov

    TARGET researchers sequenced the tyrosine kinome and downstream signaling genes in 45 high-risk pediatric ALL cases with activated kinase signaling, including Ph-like ALL, to establish the incidence of tyrosine kinase mutations in this cohort. The study confirmed previously identified somatic mutations in JAK and FLT3, but did not find novel alterations in any additional tyrosine kinases or downstream genes. The mechanism of kinase signaling activation in this high-risk subgroup of pediatric ALL remains largely unknown.

  9. Genomic Sequence or Signature Tags (GSTs) from the Genome Group at Brookhaven National Laboratory (BNL)

    DOE Data Explorer

    Dunn, John J.; McCorkle, Sean R.; Praissman, Laura A.; Hind, Geoffrey; Van der Lelie, Daniel; Bahou, Wadie F.; Gnatenko, Dmitri V.; Krause, Maureen K.

    Genomic Signature Tags (GSTs) are the products of a method we have developed for identifying and quantitatively analyzing genomic DNAs. The DNA is initially fragmented with a type II restriction enzyme. An oligonucleotide adaptor containing a recognition site for MmeI, a type IIS restriction enzyme, is then used to release 21-bp tags from fixed positions in the DNA relative to the sites recognized by the fragmenting enzyme. These tags are PCR-amplified, purified, concatenated and then cloned and sequenced. The tag sequences and abundances are used to create a high resolution GST sequence profile of the genomic DNA. [Quoted from Genomic Signature Tags (GSTs): A System for Profiling Genomic DNA, Dunn, John J.; McCorkle, Sean R.; Praissman, Laura A.; Hind, Geoffrey; Van der Lelie, Daniel; Bahou, Wadie F.; Gnatenko, Dmitri V.; Krause, Maureen K., Revised 9/13/2002]
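
    The tag-profiling idea can be mimicked in silico. The following is a minimal sketch under stated assumptions rather than the BNL protocol: the recognition site `CATG`, the choice to take the tag immediately downstream of the site, and the function name `extract_tags` are all illustrative. Only the 21-bp tag length and the counting of tag abundances come from the description above.

```python
from collections import Counter


def extract_tags(dna, site="CATG", tag_len=21):
    """Count fixed-length tags found directly downstream of each site occurrence.

    `site` and the zero offset after it are assumptions for illustration; the
    real offset depends on the adaptor design and the enzymes used.
    """
    dna = dna.upper()
    tags = Counter()
    start = dna.find(site)
    while start != -1:
        tag_start = start + len(site)
        tag = dna[tag_start:tag_start + tag_len]
        if len(tag) == tag_len:  # skip sites too close to the end of the sequence
            tags[tag] += 1
        start = dna.find(site, start + 1)
    return tags


# Hypothetical toy sequence with three sites; the last is too close to the end.
toy_dna = "AACATG" + "A" * 21 + "GGCATG" + "A" * 21 + "CATGTT"
profile = extract_tags(toy_dna)
```

    The resulting `Counter` maps each tag sequence to its abundance, which is the kind of table a GST sequence profile summarizes.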

  10. Identification of Variant-Specific Functions of PIK3CA by Rapid Phenotyping of Rare Mutations | Office of Cancer Genomics

    Cancer.gov

    Large-scale sequencing efforts are uncovering the complexity of cancer genomes, which are composed of causal "driver" mutations that promote tumor progression along with many more pathologically neutral "passenger" events. The majority of mutations, both in known cancer drivers and uncharacterized genes, are generally of low occurrence, highlighting the need to functionally annotate the long tail of infrequent mutations present in heterogeneous cancers.

  11. Evolution Analysis of Simple Sequence Repeats in Plant Genome

    PubMed Central

    Qin, Zhen; Wang, Yanping; Wang, Qingmei; Li, Aixian; Hou, Fuyun; Zhang, Liming

    2015-01-01

    Simple sequence repeats (SSRs) are widespread units in genome sequences and play many important roles in plants. To reveal the evolution of plant genomes, we investigated the evolutionary regularities of SSRs during the evolution of plant species and the plant kingdom by analyzing twelve sequenced plant genomes. First, in the twelve plant genomes studied, the main SSRs were those containing repeats of 1–3 nucleotide combinations. Second, in mononucleotide SSRs, the A/T percentage gradually increased along with the evolution of plants (except for P. patens). With increasing SSR repeat number, the percentage of A/T in C. reinhardtii showed no significant change, while the percentage of A/T in terrestrial plant species gradually declined. Third, in dinucleotide SSRs, the percentage of AT/TA increased along with the evolution of the plant kingdom, and the repeat number increased in terrestrial plant species. This trend was more obvious in dicotyledons than in monocotyledons. The percentage of CG/GC showed the opposite pattern to AT/TA. Fourth, in trinucleotide SSRs, the percentages of combinations including two or three A/T showed a rising trend along with the evolution of the plant kingdom; meanwhile, with increasing SSR repeat number, different species chose different combinations as dominant SSRs. SSRs in C. reinhardtii, P. patens, Z. mays and A. thaliana showed specific patterns related to evolutionary position or specific changes of genome sequences. The results showed that SSRs not only followed a general pattern in the evolution of the plant kingdom but were also associated with the evolution of the specific genome sequence. The study of the evolutionary regularities of SSRs provided new insights for the analysis of plant genome evolution. PMID:26630570
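
    Mono-, di- and trinucleotide SSRs of the kind analyzed here can be located with a back-referencing regular expression. The following is a minimal sketch, not the authors' method: the function name `find_ssrs` and the threshold of four repeats are assumptions chosen for illustration, since the abstract does not state its detection criteria.

```python
import re


def find_ssrs(seq, max_unit=3, min_repeats=4):
    """Return (start, unit, repeat_count) for tandem repeats of short units.

    The lazy unit group tries the shortest unit first, so a run such as
    AAAAAA is reported as a mononucleotide repeat rather than as (AA)n.
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        repeats = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, repeats))
    return hits


# Hypothetical toy sequence containing (AT)4 and (A)5.
toy_hits = find_ssrs("GGATATATATCCAAAAAGT")
```

    Tallying the `unit` field across a genome (for example, A/T versus C/G mononucleotide units) would give percentages comparable in spirit to those discussed in the abstract.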

  12. Endoplasmic Reticulum Stress, Genome Damage, and Cancer

    PubMed Central

    Dicks, Naomi; Gutierrez, Karina; Michalak, Marek; Bordignon, Vilceu; Agellon, Luis B.

    2015-01-01

    Endoplasmic reticulum (ER) stress has been linked to many diseases, including cancer. A large body of work has focused on the activation of the ER stress response in cancer cells to facilitate their survival and tumor growth; however, there are some studies suggesting that the ER stress response can also mitigate cancer progression. Despite these contradictions, it is clear that the ER stress response is closely associated with cancer biology. The ER stress response classically encompasses activation of three separate pathways, which are collectively categorized as the unfolded protein response (UPR). The UPR has been extensively studied in various cancers and appears to confer a selective advantage to tumor cells to facilitate their enhanced growth and resistance to anti-cancer agents. It has also been shown that ER stress induces chromatin changes, which can also facilitate cell survival. Chromatin remodeling has been linked with many cancers through repression of tumor suppressor and apoptosis genes. Interplay between the classic UPR and genome damage repair mechanisms may have important implications in the transformation process of normal cells into cancer cells. PMID:25692096

  13. Genome Sequence of the Pea Aphid Acyrthosiphon pisum

    PubMed Central

    2010-01-01

    Aphids are important agricultural pests and also biological models for studies of insect-plant interactions, symbiosis, virus vectoring, and the developmental causes of extreme phenotypic plasticity. Here we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple published genomes of holometabolous insects. Pea aphids are host-plant specialists, they can reproduce both sexually and asexually, and they have coevolved with an obligate bacterial symbiont. Here we highlight findings from whole genome analysis that may be related to these unusual biological features. These findings include discovery of extensive gene duplication in more than 2000 gene families as well as loss of evolutionarily conserved genes. Gene family expansions relative to other published genomes include genes involved in chromatin modification, miRNA synthesis, and sugar transport. Gene losses include genes central to the IMD immune pathway, selenoprotein utilization, purine salvage, and the entire urea cycle. The pea aphid genome reveals that only a limited number of genes have been acquired from bacteria; thus the reduced gene count of Buchnera does not reflect gene transfer to the host genome. The inventory of metabolic genes in the pea aphid genome suggests that there is extensive metabolite exchange between the aphid and Buchnera, including sharing of amino acid biosynthesis between the aphid and Buchnera. The pea aphid genome provides a foundation for post-genomic studies of fundamental biological questions and applied agricultural problems. PMID:20186266

  14. Corruption of genomic databases with anomalous sequence.

    PubMed Central

    Lamperti, E D; Kittelberger, J M; Smith, T F; Villa-Komaroff, L

    1992-01-01

    We describe evidence that DNA sequences from vectors used for cloning and sequencing have been incorporated accidentally into eukaryotic entries in the GenBank database. These incorporations were not restricted to one type of vector or to a single mechanism. Many minor instances may have been the result of simple editing errors, but some entries contained large blocks of vector sequence that had been incorporated by contamination or other accidents during cloning. Some cases involved unusual rearrangements and areas of vector distant from the normal insertion sites. Matches to vector were found in 0.23% of 20,000 sequences analyzed in GenBank Release 63. Although the possibility of anomalous sequence incorporation has been recognized since the inception of GenBank and should be easy to avoid, recent evidence suggests that this problem is increasing more quickly than the database itself. The presence of anomalous sequence may have serious consequences for the interpretation and use of database entries, and will have an impact on issues of database management. The incorporated vector fragments described here may also be useful for a crude estimate of the fidelity of sequence information in the database. In alignments with well-defined ends, the matching sequences showed 96.8% identity to vector; when poorer matches with arbitrary limits were included, the aggregate identity to vector sequence was 94.8%. PMID:1614861

  15. Comparison of methods for genomic localization of gene trap sequences

    PubMed Central

    Harper, Courtney A; Huang, Conrad C; Stryke, Doug; Kawamoto, Michiko; Ferrin, Thomas E; Babbitt, Patricia C

    2006-01-01

    Background Gene knockouts in a model organism such as mouse provide a valuable resource for the study of basic biology and human disease. Determining which gene has been inactivated by an untargeted gene trapping event poses a challenging annotation problem because gene trap sequence tags, which represent sequence near the vector insertion site of a trapped gene, are typically short and often contain unresolved residues. To understand better the localization of these sequences on the mouse genome, we compared stand-alone versions of the alignment programs BLAT, SSAHA, and MegaBLAST. A set of 3,369 sequence tags was aligned to build 34 of the mouse genome using default parameters for each algorithm. Known genome coordinates for the cognate set of full-length genes (1,659 sequences) were used to evaluate localization results. Results In general, all three programs performed well in terms of localizing sequences to a general region of the genome, with only relatively subtle errors identified for a small proportion of the sequence tags. However, large differences in performance were noted with regard to correctly identifying exon boundaries. BLAT correctly identified the vast majority of exon boundaries, while SSAHA and MegaBLAST missed the majority of exon boundaries. SSAHA consistently reported the fewest false positives and is the fastest algorithm. MegaBLAST was comparable to BLAT in speed, but was the most susceptible to localizing sequence tags incorrectly to pseudogenes. Conclusion The differences in performance for sequence tags and full-length reference sequences were surprisingly small. Characteristic variations in localization results for each program were noted that affect the localization of sequence at exon boundaries, in particular. PMID:16982004

  16. Overview | Office of Cancer Genomics

    Cancer.gov

    The Therapeutically Applicable Research to Generate Effective Treatments (TARGET) initiative uses comprehensive molecular characterization to determine the genetic changes that drive the initiation and progression of hard-to-treat childhood cancers. TARGET aims to identify therapeutic targets and prognostic markers so that new, more effective treatment strategies can be developed and applied.

  17. Resources | Office of Cancer Genomics

    Cancer.gov

    OCG provides a variety of scientific and educational resources for both cancer researchers and members of the general public. These resources are divided into the following types: OCG-Supported Resources: Tools, databases, and reagents generated by initiated and completed OCG programs for researchers, educators, and students. (Note: Databases being maintained by current OCG programs are available through program-specific data matrices)

  18. The Cancer Genomics Hub (CGHub): overcoming cancer through the power of torrential data.

    PubMed

    Wilks, Christopher; Cline, Melissa S; Weiler, Erich; Diehkans, Mark; Craft, Brian; Martin, Christy; Murphy, Daniel; Pierce, Howdy; Black, John; Nelson, Donavan; Litzinger, Brian; Hatton, Thomas; Maltbie, Lori; Ainsworth, Michael; Allen, Patrick; Rosewood, Linda; Mitchell, Elizabeth; Smith, Bradley; Warner, Jim; Groboske, John; Telc, Haifang; Wilson, Daniel; Sanford, Brian; Schmidt, Hannes; Haussler, David; Maltbie, Daniel

    2014-01-01

    The Cancer Genomics Hub (CGHub) is the online repository of the sequencing programs of the National Cancer Institute (NCI), including The Cancer Genome Atlas (TCGA), the Cancer Cell Line Encyclopedia (CCLE) and the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) projects, with data from 25 different types of cancer. The CGHub currently contains >1.4 PB of data, has grown at an average rate of 50 TB a month and serves >100 TB per week. The architecture of CGHub is designed to support bulk searching and downloading through a Web-accessible application programming interface, enforce patient genome confidentiality in data storage and transmission and optimize for efficiency in access and transfer. In this article, we describe the design of these three components, present performance results for our transfer protocol, GeneTorrent, and finally report on the growth of the system in terms of data stored and transferred, including estimated limits on the current architecture. Our experience-based estimates suggest that centralizing storage and computational resources is more efficient than wide distribution across many satellite labs. Database URL: https://cghub.ucsc.edu. PMID:25267794

  19. The Cancer Genomics Hub (CGHub): overcoming cancer through the power of torrential data

    PubMed Central

    Wilks, Christopher; Cline, Melissa S.; Weiler, Erich; Diehkans, Mark; Craft, Brian; Martin, Christy; Murphy, Daniel; Pierce, Howdy; Black, John; Nelson, Donavan; Litzinger, Brian; Hatton, Thomas; Maltbie, Lori; Ainsworth, Michael; Allen, Patrick; Rosewood, Linda; Mitchell, Elizabeth; Smith, Bradley; Warner, Jim; Groboske, John; Telc, Haifang; Wilson, Daniel; Sanford, Brian; Schmidt, Hannes; Haussler, David; Maltbie, Daniel

    2014-01-01

    The Cancer Genomics Hub (CGHub) is the online repository of the sequencing programs of the National Cancer Institute (NCI), including The Cancer Genome Atlas (TCGA), the Cancer Cell Line Encyclopedia (CCLE) and the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) projects, with data from 25 different types of cancer. The CGHub currently contains >1.4 PB of data, has grown at an average rate of 50 TB a month and serves >100 TB per week. The architecture of CGHub is designed to support bulk searching and downloading through a Web-accessible application programming interface, enforce patient genome confidentiality in data storage and transmission and optimize for efficiency in access and transfer. In this article, we describe the design of these three components, present performance results for our transfer protocol, GeneTorrent, and finally report on the growth of the system in terms of data stored and transferred, including estimated limits on the current architecture. Our experience-based estimates suggest that centralizing storage and computational resources is more efficient than wide distribution across many satellite labs. Database URL: https://cghub.ucsc.edu PMID:25267794

  20. Frequent somatic transfer of mitochondrial DNA into the nuclear genome of human cancer cells.

    PubMed

    Ju, Young Seok; Tubio, Jose M C; Mifsud, William; Fu, Beiyuan; Davies, Helen R; Ramakrishna, Manasa; Li, Yilong; Yates, Lucy; Gundem, Gunes; Tarpey, Patrick S; Behjati, Sam; Papaemmanuil, Elli; Martin, Sancha; Fullam, Anthony; Gerstung, Moritz; Nangalia, Jyoti; Green, Anthony R; Caldas, Carlos; Borg, Åke; Tutt, Andrew; Lee, Ming Ta Michael; van't Veer, Laura J; Tan, Benita K T; Aparicio, Samuel; Span, Paul N; Martens, John W M; Knappskog, Stian; Vincent-Salomon, Anne; Børresen-Dale, Anne-Lise; Eyfjörd, Jórunn Erla; Flanagan, Adrienne M; Foster, Christopher; Neal, David E; Cooper, Colin; Eeles, Rosalind; Lakhani, Sunil R; Desmedt, Christine; Thomas, Gilles; Richardson, Andrea L; Purdie, Colin A; Thompson, Alastair M; McDermott, Ultan; Yang, Fengtang; Nik-Zainal, Serena; Campbell, Peter J; Stratton, Michael R

    2015-06-01

    Mitochondrial genomes are separated from the nuclear genome for most of the cell cycle by the nuclear double membrane, intervening cytoplasm, and the mitochondrial double membrane. Despite these physical barriers, we show that somatically acquired mitochondrial-nuclear genome fusion sequences are present in cancer cells. Most occur in conjunction with intranuclear genomic rearrangements, and the features of the fusion fragments indicate that nonhomologous end joining and/or replication-dependent DNA double-strand break repair are the dominant mechanisms involved. Remarkably, mitochondrial-nuclear genome fusions occur at a similar rate per base pair of DNA as interchromosomal nuclear rearrangements, indicating the presence of a high frequency of contact between mitochondrial and nuclear DNA in some somatic cells. Transmission of mitochondrial DNA to the nuclear genome occurs in neoplastically transformed cells, but we do not exclude the possibility that some mitochondrial-nuclear DNA fusions observed in cancer occurred years earlier in normal somatic cells. PMID:25963125

  1. Frequent somatic transfer of mitochondrial DNA into the nuclear genome of human cancer cells

    PubMed Central

    Ju, Young Seok; Tubio, Jose M.C.; Mifsud, William; Fu, Beiyuan; Davies, Helen R.; Ramakrishna, Manasa; Li, Yilong; Yates, Lucy; Gundem, Gunes; Tarpey, Patrick S.; Behjati, Sam; Papaemmanuil, Elli; Martin, Sancha; Fullam, Anthony; Gerstung, Moritz; Nangalia, Jyoti; Green, Anthony R.; Caldas, Carlos; Borg, Åke; Tutt, Andrew; Lee, Ming Ta Michael; van't Veer, Laura J.; Tan, Benita K.T.; Aparicio, Samuel; Span, Paul N.; Martens, John W.M.; Knappskog, Stian; Vincent-Salomon, Anne; Børresen-Dale, Anne-Lise; Eyfjörd, Jórunn Erla; Flanagan, Adrienne M.; Foster, Christopher; Neal, David E.; Cooper, Colin; Eeles, Rosalind; Lakhani, Sunil R.; Desmedt, Christine; Thomas, Gilles; Richardson, Andrea L.; Purdie, Colin A.; Thompson, Alastair M.; McDermott, Ultan; Yang, Fengtang; Nik-Zainal, Serena; Campbell, Peter J.; Stratton, Michael R.

    2015-01-01

    Mitochondrial genomes are separated from the nuclear genome for most of the cell cycle by the nuclear double membrane, intervening cytoplasm, and the mitochondrial double membrane. Despite these physical barriers, we show that somatically acquired mitochondrial-nuclear genome fusion sequences are present in cancer cells. Most occur in conjunction with intranuclear genomic rearrangements, and the features of the fusion fragments indicate that nonhomologous end joining and/or replication-dependent DNA double-strand break repair are the dominant mechanisms involved. Remarkably, mitochondrial-nuclear genome fusions occur at a similar rate per base pair of DNA as interchromosomal nuclear rearrangements, indicating the presence of a high frequency of contact between mitochondrial and nuclear DNA in some somatic cells. Transmission of mitochondrial DNA to the nuclear genome occurs in neoplastically transformed cells, but we do not exclude the possibility that some mitochondrial-nuclear DNA fusions observed in cancer occurred years earlier in normal somatic cells. PMID:25963125

  2. The First Myriapod Genome Sequence Reveals Conservative Arthropod Gene Content and Genome

    E-print Network

    Jiggins, Francis

    Andrews, Fife, United Kingdom, 3 Department of Zoology, University of Cambridge, Cambridge, United Kingdom, 4 Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Biochemistry, University of Cambridge, Cambridge, United Kingdom, 13 Evolutionsbiologie, Zoologisches Institut

  3. Widespread mitovirus sequences in plant genomes

    PubMed Central

    Warner, Benjamin E.; Yerramsetty, Pradeep

    2015-01-01

    The exploration of the evolution of RNA viruses has been aided recently by the discovery of copies of fragments or complete genomes of non-retroviral RNA viruses (Non-retroviral Endogenous RNA Viral Elements, or NERVEs) in many eukaryotic nuclear genomes. Among the most prominent NERVEs are partial copies of the RNA dependent RNA polymerase (RdRP) of the mitoviruses in plant mitochondrial genomes. Mitoviruses are in the family Narnaviridae, which are the simplest viruses, encoding only a single protein (the RdRP) in their unencapsidated viral plus strand. Narnaviruses are known only in fungi, and the origin of plant mitochondrial mitovirus NERVEs appears to be horizontal transfer from plant pathogenic fungi. At least one mitochondrial mitovirus NERVE, but not its nuclear copy, is expressed. PMID:25870770

  4. Identification of low abundance microbiome in clinical samples using whole genome sequencing.

    PubMed

    Zhang, Chao; Cleveland, Kyle; Schnoll-Sussman, Felice; McClure, Bridget; Bigg, Michelle; Thakkar, Prashant; Schultz, Nikolaus; Shah, Manish A; Betel, Doron

    2015-01-01

    Identifying the microbiome composition from primary tissues directly affords an opportunity to study the causative relationships between the host microbiome and disease. However, this is challenging due to the low abundance of microbial DNA relative to the host. We present a systematic evaluation of microbiome profiling directly from endoscopic biopsies by whole genome sequencing. We compared our methods with other approaches on datasets with previously identified microbial composition. We applied this approach to identify the microbiome from 27 stomach biopsies, and validated the presence of Helicobacter pylori by quantitative PCR. Finally, we profiled the microbial composition in The Cancer Genome Atlas gastric adenocarcinoma cohort. PMID:26614063

  5. Identification of Medium-Sized Copy Number Alterations in Whole-Genome Sequencing

    PubMed Central

    Ozer, Hatice Gulcin; Usubalieva, Aisulu; Dorrance, Adrienne; Yilmaz, Ayse Selen; Caligiuri, Michael; Marcucci, Guido; Huang, Kun

    2014-01-01

    Genome-wide discoveries, such as the detection of copy number alterations (CNAs) from high-throughput whole-genome sequencing data, have enabled new developments in personalized medicine. CNAs have been reported to be associated with various diseases and cancers, including acute myeloid leukemia. However, there are multiple challenges to the use of current CNA detection tools that lead to high false-positive rates and thus impede widespread use of such tools in cancer research. In this paper, we discuss these issues and propose possible solutions. First, since the entire genome cannot be mapped because some regions lack sequence uniqueness, current methods cannot be appropriately adjusted to handle these regions in the analyses. Thus, detection of medium-sized CNAs is also directly affected by these mappability problems. The requirement for matching control samples is also an important limitation because acquiring matching controls might not be possible or might not be cost efficient. Here we present an approach that addresses these issues and detects medium-sized CNAs in cancer genomes by (1) masking unmappable regions during the initial CNA detection phase, (2) using a pool of a few normal samples as control, and (3) employing median filtering to adjust CNA ratios to their surrounding coverage and eliminate false positives. PMID:25788829
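
    The three steps named in the abstract above (masking unmappable regions, pooling a few normal samples as control, and median filtering) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the per-bin read-count inputs, the log2 ratio, and the window size are all assumptions chosen for illustration.

```python
import math
from statistics import median


def cna_log_ratios(tumor, normals, mappable, window=3):
    """Hypothetical sketch: per-bin log2 tumor/control ratios.

    tumor    -- read counts per genomic bin for the tumor sample
    normals  -- list of read-count lists, one per normal sample
    mappable -- booleans, False where the bin is considered unmappable
    window   -- width (in bins) of the median filter; an assumed value
    """
    n = len(tumor)
    # Step (2): pool the normal samples into one control (per-bin mean).
    pooled = [sum(sample[i] for sample in normals) / len(normals) for i in range(n)]

    ratios = []
    for i in range(n):
        # Step (1): mask unmappable bins (and bins with no usable coverage).
        if not mappable[i] or pooled[i] <= 0 or tumor[i] <= 0:
            ratios.append(None)
        else:
            ratios.append(math.log2(tumor[i] / pooled[i]))

    # Step (3): median filter over the surrounding unmasked bins, so that
    # isolated single-bin spikes are pulled back to their neighbourhood.
    half = window // 2
    smoothed = []
    for i in range(n):
        if ratios[i] is None:
            smoothed.append(None)
            continue
        nearby = [r for r in ratios[max(0, i - half): i + half + 1] if r is not None]
        smoothed.append(median(nearby))
    return smoothed


# Toy data: a three-bin gain (bins 2-4), one masked bin, one isolated spike.
tumor = [100, 100, 200, 200, 200, 100, 100, 900, 100, 100]
normals = [[100] * 10, [100] * 10]
mappable = [True, False, True, True, True, True, True, True, True, True]
result = cna_log_ratios(tumor, normals, mappable)
```

    In this toy run the masked bin stays masked, the three-bin gain survives the filter, and the single-bin spike is removed, which is the kind of false positive the median filter is meant to suppress.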

  6. Publications | Office of Cancer Genomics

    Cancer.gov

    These results establish that the CRISPR system can be used as a modular and flexible DNA-binding platform for the recruitment of proteins to a target DNA sequence, revealing the potential of CRISPRi as a general tool for the precise regulation of gene expression in eukaryotic cells.

  7. The International Pea Genome Sequencing Project: Sequencing and Assembly Progresses Updates

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The International Consortium for the Pea Genome Sequencing (ICPG) includes scientists from six countries around the world. Its aim is to provide a high quality reference of the pea genome to the scientific community as well as to the pea breeder community. The consortium proposed a strategy that int...

  8. The power of EST sequence data: Relation to Acyrthosiphon pisum genome annotation and functional genomics initiatives

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genes important to aphid biology, survival and reproduction were successfully identified by use of a genomics approach. We created and described the sequencing, compilation, and annotation of the approximately 525 Mb nuclear genome of the pea aphid, Acyrthosiphon pisum, which represents an important ...

  9. Preliminary Genomic Characterization of Ten Hardwood Tree Species from Multiplexed Low Coverage Whole Genome Sequencing

    PubMed Central

    Staton, Margaret; Best, Teodora; Khodwekar, Sudhir; Owusu, Sandra; Xu, Tao; Xu, Yi; Jennings, Tara; Cronn, Richard; Arumuganathan, A. Kathiravetpilla; Coggeshall, Mark; Gailing, Oliver; Liang, Haiying; Romero-Severson, Jeanne; Schlarbaum, Scott; Carlson, John E.

    2015-01-01

    Forest health issues are on the rise in the United States, resulting from introduction of alien pests and diseases, coupled with abiotic stresses related to climate change. Increasingly, forest scientists are finding genetic/genomic resources valuable in addressing forest health issues. For a set of ten ecologically and economically important native hardwood tree species representing a broad phylogenetic spectrum, we used low coverage whole genome sequencing from multiplex Illumina paired ends to economically profile their genomic content. For six species, the genome content was further analyzed by flow cytometry in order to determine the nuclear genome size. Sequencing yielded a depth of 0.8X to 7.5X, from which in silico analysis yielded preliminary estimates of gene and repetitive sequence content in the genome for each species. Thousands of genomic SSRs were identified, with a clear predisposition toward dinucleotide repeats and AT-rich repeat motifs. Flanking primers were designed for SSR loci for all ten species, ranging from 891 loci in sugar maple to 18,167 in redbay. In summary, we have demonstrated that useful preliminary genome information including repeat content, gene content and useful SSR markers can be obtained at low cost and time input from a single lane of Illumina multiplex sequence. PMID:26698853
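
    As an illustration of how dinucleotide SSRs like those counted above might be located in sequence data, the following is a minimal regular-expression sketch. It is not the pipeline used in the study: the function name and the minimum repeat count of 5 are assumptions.

```python
import re


def find_dinucleotide_ssrs(seq, min_repeats=5):
    """Hypothetical sketch: return (start, motif, repeat_count) tuples for
    dinucleotide motifs repeated at least min_repeats times in a row."""
    # A two-base motif followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([ACGT]{2})\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if motif[0] == motif[1]:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // 2))
    return hits


# Toy sequence: (AT)x6 is reported, (AG)x3 is too short to count.
toy = "GGC" + "AT" * 6 + "CCGTT" + "AG" * 3 + "C"
ssrs = find_dinucleotide_ssrs(toy)
```

    Flanking primers would then be designed against the sequence on either side of each reported start position; that step is outside the scope of this sketch.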

  10. Locus Reference Genomic: reference sequences for the reporting of clinically relevant sequence variants

    PubMed Central

    MacArthur, Jacqueline A. L.; Morales, Joannella; Tully, Ray E.; Astashyn, Alex; Gil, Laurent; Bruford, Elspeth A.; Larsson, Pontus; Flicek, Paul; Dalgleish, Raymond; Maglott, Donna R.; Cunningham, Fiona

    2014-01-01

    Locus Reference Genomic (LRG; http://www.lrg-sequence.org/) records contain internationally recognized stable reference sequences designed specifically for reporting clinically relevant sequence variants. Each LRG is contained within a single file consisting of a stable ‘fixed’ section and a regularly updated ‘updatable’ section. The fixed section contains stable genomic DNA sequence for a genomic region, essential transcripts and proteins for variant reporting and an exon numbering system. The updatable section contains mapping information, annotation of all transcripts and overlapping genes in the region and legacy exon and amino acid numbering systems. LRGs provide a stable framework that is vital for reporting variants, according to Human Genome Variation Society (HGVS) conventions, in genomic DNA, transcript or protein coordinates. To enable translation of information between LRG and genomic coordinates, LRGs include mapping to the human genome assembly. LRGs are compiled and maintained by the National Center for Biotechnology Information (NCBI) and European Bioinformatics Institute (EBI). LRG reference sequences are selected in collaboration with the diagnostic and research communities, locus-specific database curators and mutation consortia. Currently >700 LRGs have been created, of which >400 are publicly available. The aim is to create an LRG for every locus with clinical implications. PMID:24285302

  11. Sequencing study on familial lung squamous cancer

    PubMed Central

    LI, SHAOMIN; WANG, LINA; MA, ZHENCHUAN; MA, YUEFENG; ZHAO, JIANGMAN; PENG, BO; QIAO, ZHE

    2015-01-01

    Lung cancer is the leading cause of cancer-related mortality worldwide. The majority of lung cancers are sporadic, and familial cases are extremely rare. Previous studies have mainly focused on sporadic lung cancer and identified a large quantity of driver genes. However, familial lung cancers are rarer and studied less. The present study recruited a Chinese family in which multiple members had developed lung squamous carcinoma. To find the causative mutations, whole exome sequencing was conducted using a peripheral blood sample of one lung squamous carcinoma patient, and certain variants were validated in more samples. Whole exome sequencing analysis obtained ~2.0 Gb of data (an average of 60x depth for each targeted base), and further validation experiments identified two functional variants in two cancer-related genes (c.1218delA:p.E406fs in PDE4DIP and C1342A:p.L448I in CLTCL1). This study therefore provides useful sources for the further study of hereditary lung cancer. PMID:26622902

  12. Sequencing and analysis of an Irish human genome

    PubMed Central

    2010-01-01

    Background Recent studies generating complete human sequences from Asian, African and European subgroups have revealed population-specific variation and disease susceptibility loci. Here, choosing a DNA sample from a population of interest due to its relative geographical isolation and genetic impact on further populations, we extend the above studies through the generation of 11-fold coverage of the first Irish human genome sequence. Results Using sequence data from a branch of the European ancestral tree as yet unsequenced, we identify variants that may be specific to this population. Through comparisons with HapMap and previous genetic association studies, we identified novel disease-associated variants, including a novel nonsense variant putatively associated with inflammatory bowel disease. We describe a novel method for improving SNP calling accuracy at low genome coverage using haplotype information. This analysis has implications for future re-sequencing studies and validates the imputation of Irish haplotypes using data from the current Human Genome Diversity Cell Line Panel (HGDP-CEPH). Finally, we identify gene duplication events as constituting significant targets of recent positive selection in the human lineage. Conclusions Our findings show that there remains utility in generating whole genome sequences to illustrate both general principles and reveal specific instances of human biology. With increasing access to low cost sequencing we would predict that even armed with the resources of a small research group a number of similar initiatives geared towards answering specific biological questions will emerge. PMID:20822512

  13. Complete genome sequence of Arthrobacter sp. strain FB24

    SciTech Connect

    Nakatsu, C. H.; Barabote, Ravi; Thompson, Sue; Bruce, David; Detter, Chris; Brettin, T.; Han, Cliff F.; Beasley, Federico; Chen, Weimin; Konopka, Allan; Xie, Gary

    2013-09-30

    Arthrobacter sp. strain FB24 is a species in the genus Arthrobacter Conn and Dimmick 1947, in the family Micrococcaceae and class Actinobacteria. A number of Arthrobacter genome sequences have been completed because of their important role in soil, especially bioremediation. This isolate is of special interest because it is tolerant to multiple metals and it is extremely resistant to elevated concentrations of chromate. The genome consists of a 4,698,945 bp circular chromosome and three plasmids (96,488, 115,507, and 159,536 bp, a total of 5,070,478 bp), coding 4,536 proteins of which 1,257 are without known function. This genome was sequenced as part of the DOE Joint Genome Institute Program.

  14. Draft genome sequence of the rubber tree Hevea brasiliensis

    PubMed Central

    2013-01-01

    Background Hevea brasiliensis, a member of the Euphorbiaceae family, is the major commercial source of natural rubber (NR). NR is a latex polymer with high elasticity, flexibility, and resilience that has played a critical role in the world economy since 1876. Results Here, we report the draft genome sequence of H. brasiliensis. The assembly spans ~1.1 Gb of the estimated 2.15 Gb haploid genome. Overall, ~78% of the genome was identified as repetitive DNA. Gene prediction shows 68,955 gene models, of which 12.7% are unique to Hevea. Most of the key genes associated with rubber biosynthesis, rubberwood formation, disease resistance, and allergenicity have been identified. Conclusions The knowledge gained from this genome sequence will aid in the future development of high-yielding clones to keep up with the ever increasing need for natural rubber. PMID:23375136

  15. Whole genome sequencing in clinical and public health microbiology

    PubMed Central

    Kwong, J. C.; McCallum, N.; Sintchenko, V.; Howden, B. P.

    2015-01-01

    Summary Genomics and whole genome sequencing (WGS) have the capacity to greatly enhance knowledge and understanding of infectious diseases and clinical microbiology. The growth and availability of bench-top WGS analysers has facilitated the feasibility of genomics in clinical and public health microbiology. Given current resource and infrastructure limitations, WGS is most applicable to use in public health laboratories, reference laboratories, and hospital infection control-affiliated laboratories. As WGS represents the pinnacle for strain characterisation and epidemiological analyses, it is likely to replace traditional typing methods, resistance gene detection and other sequence-based investigations (e.g., 16S rDNA PCR) in the near future. Although genomic technologies are rapidly evolving, widespread implementation in clinical and public health microbiology laboratories is limited by the need for effective semi-automated pipelines, standardised quality control and data interpretation, bioinformatics expertise, and infrastructure. PMID:25730631

  16. Complete genome sequence of Meiothermus ruber type strain (21T)

    SciTech Connect

    Tindall, Brian; Sikorski, Johannes; Lucas, Susan; Goltsman, Eugene; Copeland, A; Glavina Del Rio, Tijana; Nolan, Matt; Tice, Hope; Cheng, Jan-Fang; Han, Cliff; Pitluck, Sam; Liolios, Konstantinos; Ivanova, N; Mavromatis, K; Ovchinnikova, Galina; Pati, Amrita; Fahnrich, Regine; Goodwin, Lynne A.; Chen, Amy; Palaniappan, Krishna; Land, Miriam L; Hauser, Loren John; Chang, Yun-Juan; Jeffries, Cynthia; Rohde, Manfred; Goker, Markus; Woyke, Tanja; Bristow, James; Eisen, Jonathan; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Lapidus, Alla L.

    2010-01-01

    Meiothermus ruber (Loginova et al. 1984) Nobre et al. 1996 is the type species of the genus Meiothermus. This thermophilic genus is of special interest, as its members can be affiliated to either low-temperature or high-temperature groups. The temperature related split is in accordance with the chemotaxonomic feature of the polar lipids. M. ruber is a representative of the low-temperature group. This is the first completed genome sequence of the genus Meiothermus and only the third genome sequence to be published from a member of the family Thermaceae. The 3,097,457 bp long genome with its 3,052 protein-coding and 53 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  17. Complete genome sequence of Alicyclobacillus acidocaldarius type strain (104-IAT)

    SciTech Connect

    Mavromatis, K; Sikorski, Johannes; Lapidus, Alla L.; Glavina Del Rio, Tijana; Copeland, A; Tice, Hope; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Nolan, Matt; Bruce, David; Goodwin, Lynne A.; Pitluck, Sam; Ivanova, N; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam L; Hauser, Loren John; Chang, Yun-Juan; Jeffries, Cynthia; Chain, Patrick S. G.; Meincke, Linda; Sims, David; Chertkov, Olga; Han, Cliff; Brettin, Tom; Detter, J C; Wahrenburg, Claudia; Rohde, Manfred; Pukall, Rudiger; Goker, Markus; Bristow, James; Eisen, Jonathan; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C

    2010-01-01

    Alicyclobacillus acidocaldarius (Darland and Brock 1971) is the type species of the larger of the two genera in the bacillal family Alicyclobacillaceae . A. acidocaldarius is a free-living and non-pathogenic organism, but may also be associated with food and fruit spoilage. Due to its acidophilic nature, several enzymes from this species have since long been subjected to detailed molecular and biochemical studies. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the family Alicyclobacillaceae . The 3,205,686 bp long genome (chromosome and three plasmids) with its 3,153 protein-coding and 82 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  18. Complete genome sequence of Arthrobacter sp. strain FB24

    PubMed Central

    Nakatsu, Cindy H.; Barabote, Ravi; Thompson, Sue; Bruce, David; Detter, Chris; Brettin, Thomas; Han, Cliff; Beasley, Federico; Chen, Weimin; Konopka, Allan; Xie, Gary

    2013-01-01

    Arthrobacter sp. strain FB24 is a species in the genus Arthrobacter Conn and Dimmick 1947, in the family Micrococcaceae and class Actinobacteria. A number of Arthrobacter genome sequences have been completed because of their important role in soil, especially bioremediation. This isolate is of special interest because it is tolerant to multiple metals and it is extremely resistant to elevated concentrations of chromate. The genome consists of a 4,698,945 bp circular chromosome and three plasmids (96,488, 115,507, and 159,536 bp, a total of 5,070,478 bp), coding 4,536 proteins of which 1,257 are without known function. This genome was sequenced as part of the DOE Joint Genome Institute Program. PMID:24501649

  19. Complete genome sequence of Desulfotomaculum acetoxidans type strain (5575T)

    SciTech Connect

    Spring, Stefan; Lapidus, Alla L.; Schroder, Maren; Gleim, Dorothea; Sims, David; Meincke, Linda; Glavina Del Rio, Tijana; Tice, Hope; Copeland, A; Cheng, Jan-Fang; Chen, Feng; Lucas, Susan; Nolan, Matt; Bruce, David; Goodwin, Lynne A.; Pitluck, Sam; Ivanova, N; Mavromatis, K; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam L; Hauser, Loren John; Chang, Yun-Juan; Jeffries, Cynthia; Chain, Patrick S. G.; Saunders, Elizabeth H; Brettin, Tom; Detter, J. Chris; Goker, Markus; Bristow, James; Eisen, Jonathan; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Han, Cliff

    2009-01-01

    Desulfotomaculum acetoxidans Widdel and Pfennig 1977 was one of the first sulfate-reducing bacteria known to grow with acetate as sole energy and carbon source. It is able to oxidize substrates completely to carbon dioxide with sulfate as the electron acceptor, which is reduced to hydrogen sulfide. All available data about this species are based on strain 5575T, isolated from piggery waste in Germany. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a Desulfotomaculum species with validly published name. The 4,545,624 bp long single replicon genome with its 4370 protein-coding and 100 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  20. Whole-genome sequencing identifies recurrent mutations in hepatocellular carcinoma

    PubMed Central

    Kan, Zhengyan; Zheng, Hancheng; Liu, Xiao; Li, Shuyu; Barber, Thomas D.; Gong, Zhuolin; Gao, Huan; Hao, Ke; Willard, Melinda D.; Xu, Jiangchun; Hauptschein, Robert; Rejto, Paul A.; Fernandez, Julio; Wang, Guan; Zhang, Qinghui; Wang, Bo; Chen, Ronghua; Wang, Jian; Lee, Nikki P.; Zhou, Wei; Lin, Zhao; Peng, Zhiyu; Yi, Kang; Chen, Shengpei; Li, Lin; Fan, Xiaomei; Yang, Jie; Ye, Rui; Ju, Jia; Wang, Kai; Estrella, Heather; Deng, Shibing; Wei, Ping; Qiu, Ming; Wulur, Isabella H.; Liu, Jiangang; Ehsani, Mariam E.; Zhang, Chunsheng; Loboda, Andrey; Sung, Wing Kin; Aggarwal, Amit; Poon, Ronnie T.; Fan, Sheung Tat; Wang, Jun; Hardwick, James; Reinhard, Christoph; Dai, Hongyue; Li, Yingrui; Luk, John M.; Mao, Mao

    2013-01-01

    Hepatocellular carcinoma (HCC) is one of the most deadly cancers worldwide and has no effective treatment, yet the molecular basis of hepatocarcinogenesis remains largely unknown. Here we report findings from a whole-genome sequencing (WGS) study of 88 matched HCC tumor/normal pairs, 81 of which are Hepatitis B virus (HBV) positive, seeking to identify genetically altered genes and pathways implicated in HBV-associated HCC. We find beta-catenin to be the most frequently mutated oncogene (15.9%) and TP53 the most frequently mutated tumor suppressor (35.2%). The Wnt/beta-catenin and JAK/STAT pathways, altered in 62.5% and 45.5% of cases, respectively, are likely to act as two major oncogenic drivers in HCC. This study also identifies several prevalent and potentially actionable mutations, including activating mutations of Janus kinase 1 (JAK1), in 9.1% of patients and provides a path toward therapeutic intervention of the disease. PMID:23788652

  1. Genome Sequences of Mycobacteriophages Luchador and Nerujay

    PubMed Central

    Ahmed, Taha; Drobitch, Marissa K.; Early, David R.; Eljamri, Soukaina; Kasturiarachi, Naomi S.; Klonicki, Emily F.; Manjooran, Daniel T.; Ní Chochlain, Aífe N.; Puglionesi, Andrew O.; Rajakumar, Vinod; Shindle, Katherine A.; Tran, Mai T.; Brown, Bryony R.; Churilla, Bryce M.; Cohen, Karen L.; Wilkes, Kellyn E.; Grubb, Sarah R.; Warner, Marcie H.; Bowman, Charles A.; Russell, Daniel A.; Hatfull, Graham F.

    2015-01-01

    Luchador and Nerujay are two newly isolated mycobacteriophages recovered from soil samples using Mycobacterium smegmatis. Their genomes are 53,387 bp and 53,455 bp long and have 96 and 97 predicted open reading frames, respectively. Nerujay is related to subcluster A1 phages, and Luchador represents a new subcluster, A14. PMID:26089414

  2. Genome Sequences of Mycobacteriophages Luchador and Nerujay.

    PubMed

    Pope, Welkin H; Ahmed, Taha; Drobitch, Marissa K; Early, David R; Eljamri, Soukaina; Kasturiarachi, Naomi S; Klonicki, Emily F; Manjooran, Daniel T; Ní Chochlain, Aífe N; Puglionesi, Andrew O; Rajakumar, Vinod; Shindle, Katherine A; Tran, Mai T; Brown, Bryony R; Churilla, Bryce M; Cohen, Karen L; Wilkes, Kellyn E; Grubb, Sarah R; Warner, Marcie H; Bowman, Charles A; Russell, Daniel A; Hatfull, Graham F

    2015-01-01

    Luchador and Nerujay are two newly isolated mycobacteriophages recovered from soil samples using Mycobacterium smegmatis. Their genomes are 53,387 bp and 53,455 bp long and have 96 and 97 predicted open reading frames, respectively. Nerujay is related to subcluster A1 phages, and Luchador represents a new subcluster, A14. PMID:26089414

  3. Complete genome sequence of Croceibacter atlanticus HTCC2559T.

    PubMed

    Oh, Hyun-Myung; Kang, Ilnam; Ferriera, Steve; Giovannoni, Stephen J; Cho, Jang-Cheon

    2010-09-01

    Here we announce the complete genome sequence of Croceibacter atlanticus HTCC2559(T), which was isolated by high-throughput dilution-to-extinction culturing from the Bermuda Atlantic Time Series station in the Western Sargasso Sea. Strain HTCC2559(T) contained genes for carotenoid biosynthesis, flavonoid biosynthesis, and several macromolecule-degrading enzymes. The genome confirmed physiological observations of cultivated Croceibacter atlanticus strain HTCC2559(T), which identified it as an obligate chemoheterotroph. PMID:20639333

  4. Genome Sequence of the Urethral Isolate Pseudomonas aeruginosa RN21

    PubMed Central

    Wibberg, Daniel; Tielen, Petra; Narten, Maike; Schobert, Max; Blom, Jochen; Schatschneider, Sarah; Meyer, Ann-Kathrin; Neubauer, Rüdiger; Albersmeier, Andreas; Albaum, Stefan; Jahn, Martina; Goesmann, Alexander; Vorhölter, Frank-Jörg; Pühler, Alfred

    2015-01-01

    Pseudomonas aeruginosa is known to cause complicated urinary tract infections (UTI). The improved 7.0-Mb draft genome sequence of P. aeruginosa RN21, isolated from a patient with an acute UTI, was determined. It carries three (pro)phage genomes, genes for two restriction/modification systems, and a clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) system. PMID:26184943

  5. Genome Sequence of the Urethral Isolate Pseudomonas aeruginosa RN21.

    PubMed

    Wibberg, Daniel; Tielen, Petra; Narten, Maike; Schobert, Max; Blom, Jochen; Schatschneider, Sarah; Meyer, Ann-Kathrin; Neubauer, Rüdiger; Albersmeier, Andreas; Albaum, Stefan; Jahn, Martina; Goesmann, Alexander; Vorhölter, Frank-Jörg; Pühler, Alfred; Jahn, Dieter

    2015-01-01

    Pseudomonas aeruginosa is known to cause complicated urinary tract infections (UTI). The improved 7.0-Mb draft genome sequence of P. aeruginosa RN21, isolated from a patient with an acute UTI, was determined. It carries three (pro)phage genomes, genes for two restriction/modification systems, and a clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) system. PMID:26184943

  6. Contribution to Sequencing of the Deinococcus radiodurans Genome

    SciTech Connect

    Minton, K.W.

    1999-03-11

    The stated goal of this project was to supply The Institute for Genomic Research (TIGR) with pure DNA from the bacterium Deinococcus radiodurans R1 for purposes of complete genomic sequencing by TIGR. We subsequently decided to expand this project to include a second goal; this second goal was the development of a NotI chromosomal map of D. radiodurans R1 using Pulsed Field Gel Electrophoresis (PFGE).

  7. Analysis of Complete Genome Sequences of Human Rhinovirus

    PubMed Central

    Palmenberg, Ann C.; Rathe, Jennifer A.; Liggett, Stephen B.

    2010-01-01

    Human Rhinovirus (HRV) infection is the cause of about one-half of asthma and COPD exacerbations. With >100 serotypes in the HRV reference set, an effort was undertaken to sequence their complete genomes so as to understand diversity, structural variation, and evolution of the virus. Analysis revealed conserved motifs, hypervariable regions, a potential fourth HRV species, within-serotype variation in field isolates, a non-scanning internal ribosome entry site, and evidence for HRV recombination. Techniques have now been developed using next generation sequencing to generate complete genomes from patient isolates with high throughput, deep coverage, and low costs. Thus relationships can now be sought between obstructive lung phenotypes and variation in HRV genomes in infected patients, and potential novel therapeutic strategies can be developed based on HRV sequence. PMID:20471068

  8. The complete genome sequence of Escherichia coli K-12.

    PubMed

    Blattner, F R; Plunkett, G; Bloch, C A; Perna, N T; Burland, V; Riley, M; Collado-Vides, J; Glasner, J D; Rode, C K; Mayhew, G F; Gregor, J; Davis, N W; Kirkpatrick, H A; Goeden, M A; Rose, D J; Mau, B; Shao, Y

    1997-09-01

    The 4,639,221-base pair sequence of Escherichia coli K-12 is presented. Of 4288 protein-coding genes annotated, 38 percent have no attributed function. Comparison with five other sequenced microbes reveals ubiquitous as well as narrowly distributed gene families; many families of similar genes within E. coli are also evident. The largest family of paralogous proteins contains 80 ABC transporters. The genome as a whole is strikingly organized with respect to the local direction of replication; guanines, oligonucleotides possibly related to replication and recombination, and most genes are so oriented. The genome also contains insertion sequence (IS) elements, phage remnants, and many other patches of unusual composition indicating genome plasticity through horizontal transfer. PMID:9278503

  9. The complete mitochondrial genome sequence of the Daweishan Mini chicken.

    PubMed

    Yan, Ming-Li; Ding, Su-Ping; Ye, Shao-Hui; Wang, Chun-Guang; He, Bao-Li; Yuan, Zhi-Dong; Liu, Li-Li

    2016-01-01

    Daweishan Mini chicken is a valuable chicken breed in China. In this study, the complete mitochondrial genome sequence of Daweishan Mini chicken using PCR amplification, sequencing and assembling has been obtained for the first time. The total length of the mitochondrial genome was 16,785 bp, with the base composition of 30.26% A, 23.73% T, 32.51% C, 13.51% G. It contained 37 genes (2 ribosomal RNA genes, 13 protein-coding genes, 22 transfer RNA genes) and a major non-coding control region (D-loop region). The protein start codons are ATG, except for COX1 that begins with GTG. The complete mitochondrial genome sequence of Daweishan Mini chicken provides an important data set for further investigation on the phylogenetic relationships within Gallus gallus. PMID:24450719
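
    A base composition such as the one reported above is a per-base percentage over the assembled sequence. The following minimal sketch shows one way to compute it; the function name and the toy sequence are assumptions, and the actual mitochondrial sequence is not reproduced here.

```python
from collections import Counter


def base_composition(seq):
    """Hypothetical sketch: percentage of A, T, C and G in a DNA sequence,
    rounded to two decimals. Ambiguous symbols (e.g. N) are ignored."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return {base: round(100.0 * counts[base] / total, 2) for base in "ATCG"}


# Toy 10-base sequence: 3 A, 2 T, 3 C, 2 G.
composition = base_composition("AATCGGCCAT")
```

    Because each percentage is rounded independently, the four values need not sum to exactly 100; the figures quoted in the abstract sum to 100.01.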

  10. Whole-Genome Sequencing for Optimized Patient Management

    PubMed Central

    Bainbridge, Matthew N.; Wiszniewski, Wojciech; Murdock, David R.; Friedman, Jennifer; Gonzaga-Jauregui, Claudia; Newsham, Irene; Reid, Jeffrey G.; Fink, John K.; Morgan, Margaret B.; Gingras, Marie-Claude; Muzny, Donna M.; Hoang, Linh D.; Yousaf, Shahed; Lupski, James R.; Gibbs, Richard A.

    2012-01-01

    Whole-genome sequencing of patient DNA can facilitate diagnosis of a disease, but its potential for guiding treatment has been under-realized. We interrogated the complete genome sequences of a 14-year-old fraternal twin pair diagnosed with dopa (3,4-dihydroxyphenylalanine)–responsive dystonia (DRD; Mendelian Inheritance in Man #128230). DRD is a genetically heterogeneous and clinically complex movement disorder that is usually treated with l-dopa, a precursor of the neurotransmitter dopamine. Whole-genome sequencing identified compound heterozygous mutations in the SPR gene encoding sepiapterin reductase. Disruption of SPR causes a decrease in tetrahydrobiopterin, a cofactor required for the hydroxylase enzymes that synthesize the neurotransmitters dopamine and serotonin. Supplementation of l-dopa therapy with 5-hydroxytryptophan, a serotonin precursor, resulted in clinical improvements in both twins. PMID:21677200

  11. The genomics and genetics of endometrial cancer

    PubMed Central

    O’Hara, Andrea J; Bell, Daphne W

    2012-01-01

    Most sporadic endometrial cancers (ECs) can be histologically classified as endometrioid, serous, or clear cell. Each histotype has a distinct natural history, clinical behavior, and genetic etiology. Endometrioid ECs have an overall favorable prognosis. They are typified by high frequency genomic alterations affecting PIK3CA, PIK3R1, PTEN, KRAS, FGFR2, ARID1A (BAF250a), and CTNNB1 (β-catenin), as well as epigenetic silencing of MLH1 resulting in microsatellite instability. Serous and clear cell ECs are clinically aggressive tumors that are rare at presentation but account for a disproportionate fraction of all endometrial cancer deaths. Serous ECs tend to be aneuploid and are typified by frequent genomic alterations affecting TP53 (p53), PPP2R1A, HER-2/ERBB2, PIK3CA, and PTEN; additionally, they display dysregulation of E-cadherin, p16, cyclin E, and BAF250a. The genetic etiology of clear cell ECs resembles that of serous ECs, but it remains relatively poorly defined. The characteristic patterns of genomic alterations that distinguish the three major histotypes of endometrial cancer are reviewed herein. PMID:22888282

  12. The Landscape of Microsatellite Instability in Colorectal and Endometrial Cancer Genomes

    PubMed Central

    Kim, Tae-Min; Laird, Peter W.; Park, Peter J.

    2013-01-01

    Summary Microsatellites - simple tandem repeats present at millions of sites in the human genome - can shorten or lengthen due to a defect in DNA mismatch repair. We present here the first comprehensive genome-wide analysis of the prevalence, mutational spectrum and functional consequences of microsatellite instability (MSI) in cancer genomes. We analyzed MSI in 277 colorectal and endometrial cancer genomes (including 57 microsatellite-unstable ones) using exome and whole-genome sequencing data. Recurrent MSI events in coding sequences showed tumor type-specificity, elevated frameshift-to-inframe ratios, and lower transcript levels than wildtype alleles. Moreover, genome-wide analysis revealed differences in the distribution of MSI versus point mutations, including overrepresentation of MSI in euchromatic and intronic regions compared to heterochromatic and intergenic regions, respectively, and depletion of MSI at nucleosome-occupied sequences. Our results provide a panoramic view of MSI in cancer genomes, highlighting their tumor type-specificity, impact on gene expression, and the role of chromatin organization. PMID:24209623
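
    To make the notion of a microsatellite shortening or lengthening concrete, the sketch below finds simple tandem repeats with a regular expression and compares repeat counts between two toy sequences. It is an illustrative assumption, not the method used in the study; `find_microsatellites`, its thresholds, and the toy "normal" and "tumor" sequences are all hypothetical.

```python
import re


def find_microsatellites(seq, min_repeats=5, max_motif_len=6):
    """Return (start, motif, repeat_count) for each simple tandem repeat.

    The motif group is lazy, so the shortest repeating unit is reported
    (e.g. "CA" rather than "CACA").
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif_len, min_repeats - 1)
    )
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in pattern.finditer(seq.upper())
    ]


# Toy sequences: the tumor copy has lost two CA units relative to normal.
normal = "GGT" + "CA" * 8 + "TT"
tumor = "GGT" + "CA" * 6 + "TT"

start, motif, n_normal = find_microsatellites(normal)[0]
_, _, n_tumor = find_microsatellites(tumor)[0]
print(motif, n_tumor - n_normal)  # a negative difference means a contraction
```

    Real MSI calling works on aligned reads and has to account for sequencing error in repeats, which this sketch ignores.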

  13. The genome sequence of the colonial chordate, Botryllus schlosseri

    PubMed Central

    Voskoboynik, Ayelet; Neff, Norma F; Sahoo, Debashis; Newman, Aaron M; Pushkarev, Dmitry; Koh, Winston; Passarelli, Benedetto; Fan, H Christina; Mantalas, Gary L; Palmeri, Karla J; Ishizuka, Katherine J; Gissi, Carmela; Griggio, Francesca; Ben-Shlomo, Rachel; Corey, Daniel M; Penland, Lolita; White, Richard A; Weissman, Irving L; Quake, Stephen R

    2013-01-01

    Botryllus schlosseri is a colonial urochordate that follows the chordate plan of development following sexual reproduction, but invokes a stem cell-mediated budding program during subsequent rounds of asexual reproduction. As urochordates are considered to be the closest living invertebrate relatives of vertebrates, they are ideal subjects for whole genome sequence analyses. Using a novel method for high-throughput sequencing of eukaryotic genomes, we sequenced and assembled 580 Mbp of the B. schlosseri genome. The genome assembly comprises nearly 14,000 intron-containing predicted genes and 13,500 intron-less predicted genes, 40% of which could be confidently parceled into 13 (of 16 haploid) chromosomes. A comparison of homologous genes between B. schlosseri and other diverse taxonomic groups revealed genomic events underlying the evolution of vertebrates and lymphoid-mediated immunity. The B. schlosseri genome is a community resource for studying alternative modes of reproduction, natural transplantation reactions, and stem cell-mediated regeneration. DOI: http://dx.doi.org/10.7554/eLife.00569.001 PMID:23840927

  14. Standardized Metadata for Human Pathogen/Vector Genomic Sequences

    PubMed Central

    Dugan, Vivien G.; Emrich, Scott J.; Giraldo-Calderón, Gloria I.; Harb, Omar S.; Newman, Ruchi M.; Pickett, Brett E.; Schriml, Lynn M.; Stockwell, Timothy B.; Stoeckert, Christian J.; Sullivan, Dan E.; Singh, Indresh; Ward, Doyle V.; Yao, Alison; Zheng, Jie; Barrett, Tanya; Birren, Bruce; Brinkac, Lauren; Bruno, Vincent M.; Caler, Elizabet; Chapman, Sinéad; Collins, Frank H.; Cuomo, Christina A.; Di Francesco, Valentina; Durkin, Scott; Eppinger, Mark; Feldgarden, Michael; Fraser, Claire; Fricke, W. Florian; Giovanni, Maria; Henn, Matthew R.; Hine, Erin; Hotopp, Julie Dunning; Karsch-Mizrachi, Ilene; Kissinger, Jessica C.; Lee, Eun Mi; Mathur, Punam; Mongodin, Emmanuel F.; Murphy, Cheryl I.; Myers, Garry; Neafsey, Daniel E.; Nelson, Karen E.; Nierman, William C.; Puzak, Julia; Rasko, David; Roos, David S.; Sadzewicz, Lisa; Silva, Joana C.; Sobral, Bruno; Squires, R. Burke; Stevens, Rick L.; Tallon, Luke; Tettelin, Herve; Wentworth, David; White, Owen; Will, Rebecca; Wortman, Jennifer; Zhang, Yun; Scheuermann, Richard H.

    2014-01-01

    High throughput sequencing has accelerated the determination of genome sequences for thousands of human infectious disease pathogens and dozens of their vectors. The scale and scope of these data are enabling genotype-phenotype association studies to identify genetic determinants of pathogen virulence and drug/insecticide resistance, and phylogenetic studies to track the origin and spread of disease outbreaks. To maximize the utility of genomic sequences for these purposes, it is essential that metadata about the pathogen/vector isolate characteristics be collected and made available in organized, clear, and consistent formats. Here we report the development of the GSCID/BRC Project and Sample Application Standard, developed by representatives of the Genome Sequencing Centers for Infectious Diseases (GSCIDs), the Bioinformatics Resource Centers (BRCs) for Infectious Diseases, and the U.S. National Institute of Allergy and Infectious Diseases (NIAID), part of the National Institutes of Health (NIH), informed by interactions with numerous collaborating scientists. It includes mapping to terms from other data standards initiatives, including the Genomic Standards Consortium’s minimal information (MIxS) and NCBI’s BioSample/BioProjects checklists and the Ontology for Biomedical Investigations (OBI). The standard includes data fields about characteristics of the organism or environmental source of the specimen, spatial-temporal information about the specimen isolation event, phenotypic characteristics of the pathogen/vector isolated, and project leadership and support. By modeling metadata fields into an ontology-based semantic framework and reusing existing ontologies and minimum information checklists, the application standard can be extended to support additional project-specific data fields and integrated with other data represented with comparable standards. 
    The use of this metadata standard by all ongoing and future GSCID sequencing projects will provide a consistent representation of these data in the BRC resources and other repositories that leverage these data, allowing investigators to identify relevant genomic sequences and perform comparative genomics analyses that are both statistically meaningful and biologically relevant. PMID:24936976
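
    The four groups of data fields named in the abstract (organism or environmental source, spatial-temporal isolation information, phenotypic characteristics, and project leadership and support) can be pictured as a structured record with a completeness check. The sketch below is only an illustration: the class `SampleMetadata`, its field names, and the sample values are hypothetical and are not taken from the GSCID/BRC standard itself.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class SampleMetadata:
    # All field names are hypothetical stand-ins, not the standard's own terms.
    # Organism or environmental source of the specimen
    organism_name: Optional[str] = None
    specimen_source: Optional[str] = None
    # Spatial-temporal information about the isolation event
    collection_date: Optional[str] = None
    collection_location: Optional[str] = None
    # Phenotypic characteristics of the isolated pathogen/vector
    phenotype: Optional[str] = None
    # Project leadership and support
    project_lead: Optional[str] = None
    funding_source: Optional[str] = None


def missing_fields(record: SampleMetadata) -> List[str]:
    """List the fields that are still empty, in declaration order."""
    return [f.name for f in fields(record) if getattr(record, f.name) in (None, "")]


# Made-up example record with only two fields filled in.
record = SampleMetadata(organism_name="Example pathogen", collection_date="2013-06-01")
print(missing_fields(record))
```

    A check of this kind is one way a repository could flag incomplete submissions before records are integrated with other resources.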

  15. Standardized metadata for human pathogen/vector genomic sequences.

    PubMed

    Dugan, Vivien G; Emrich, Scott J; Giraldo-Calderón, Gloria I; Harb, Omar S; Newman, Ruchi M; Pickett, Brett E; Schriml, Lynn M; Stockwell, Timothy B; Stoeckert, Christian J; Sullivan, Dan E; Singh, Indresh; Ward, Doyle V; Yao, Alison; Zheng, Jie; Barrett, Tanya; Birren, Bruce; Brinkac, Lauren; Bruno, Vincent M; Caler, Elizabet; Chapman, Sinéad; Collins, Frank H; Cuomo, Christina A; Di Francesco, Valentina; Durkin, Scott; Eppinger, Mark; Feldgarden, Michael; Fraser, Claire; Fricke, W Florian; Giovanni, Maria; Henn, Matthew R; Hine, Erin; Hotopp, Julie Dunning; Karsch-Mizrachi, Ilene; Kissinger, Jessica C; Lee, Eun Mi; Mathur, Punam; Mongodin, Emmanuel F; Murphy, Cheryl I; Myers, Garry; Neafsey, Daniel E; Nelson, Karen E; Nierman, William C; Puzak, Julia; Rasko, David; Roos, David S; Sadzewicz, Lisa; Silva, Joana C; Sobral, Bruno; Squires, R Burke; Stevens, Rick L; Tallon, Luke; Tettelin, Herve; Wentworth, David; White, Owen; Will, Rebecca; Wortman, Jennifer; Zhang, Yun; Scheuermann, Richard H

    2014-01-01

    High throughput sequencing has accelerated the determination of genome sequences for thousands of human infectious disease pathogens and dozens of their vectors. The scale and scope of these data are enabling genotype-phenotype association studies to identify genetic determinants of pathogen virulence and drug/insecticide resistance, and phylogenetic studies to track the origin and spread of disease outbreaks. To maximize the utility of genomic sequences for these purposes, it is essential that metadata about the pathogen/vector isolate characteristics be collected and made available in organized, clear, and consistent formats. Here we report the development of the GSCID/BRC Project and Sample Application Standard, developed by representatives of the Genome Sequencing Centers for Infectious Diseases (GSCIDs), the Bioinformatics Resource Centers (BRCs) for Infectious Diseases, and the U.S. National Institute of Allergy and Infectious Diseases (NIAID), part of the National Institutes of Health (NIH), informed by interactions with numerous collaborating scientists. It includes mapping to terms from other data standards initiatives, including the Genomic Standards Consortium's minimal information (MIxS) and NCBI's BioSample/BioProjects checklists and the Ontology for Biomedical Investigations (OBI). The standard includes data fields about characteristics of the organism or environmental source of the specimen, spatial-temporal information about the specimen isolation event, phenotypic characteristics of the pathogen/vector isolated, and project leadership and support. By modeling metadata fields into an ontology-based semantic framework and reusing existing ontologies and minimum information checklists, the application standard can be extended to support additional project-specific data fields and integrated with other data represented with comparable standards. 
    The use of this metadata standard by all ongoing and future GSCID sequencing projects will provide a consistent representation of these data in the BRC resources and other repositories that leverage these data, allowing investigators to identify relevant genomic sequences and perform comparative genomics analyses that are both statistically meaningful and biologically relevant. PMID:24936976

  16. The complete mitochondrial genome sequence of the budgerigar, Melopsittacus undulatus.

    PubMed

    Guan, Xiaojing; Xu, Jun; Smith, Edward J

    2016-01-01

    Here, we describe the budgie's mitochondrial genome sequence, a resource that can facilitate this parrot's use as a model organism as well as the determination of its phylogenetic relatedness to other parrots (Psittaciformes). The estimated total length of the sequence was 18,193 bp. In addition to the 13 protein-coding regions and the tRNA and rRNA coding regions, the sequence also includes a duplicated hypervariable region, a feature found in only a few birds. The two hypervariable regions shared a sequence identity of about 86%. PMID:24660934

  17. Complete genome sequence of Haliscomenobacter hydrossis type strain (OT)

    SciTech Connect

    Daligault, Hajnalka E.; Lapidus, Alla L.; Zeytun, Ahmet; Nolan, Matt; Lucas, Susan; Glavina Del Rio, Tijana; Tice, Hope; Cheng, Jan-Fang; Tapia, Roxanne; Han, Cliff; Goodwin, Lynne A.; Pitluck, Sam; Liolios, Konstantinos; Pagani, Ioanna; Ivanova, N; Huntemann, Marcel; Mavromatis, K; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam L; Hauser, Loren John; Brambilla, Evelyne-Marie; Rohde, Manfred; Verbarg, Susanne; Goker, Markus; Bristow, James; Eisen, Jonathan; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Woyke, Tanja

    2011-01-01

    Haliscomenobacter hydrossis van Veen et al. 1973 is the type species of the genus Haliscomenobacter, which belongs to order 'Sphingobacteriales'. The species is of interest because of its isolated phylogenetic location in the tree of life, especially the so far genomically uncharted part of it, and because the organism grows in a thin, hardly visible hyaline sheath. Members of the species were isolated from fresh water of lakes and from ditch water. The genome of H. hydrossis is the first completed genome sequence reported from a member of the family 'Saprospiraceae'. The 8,771,651 bp long genome with its three plasmids of 92 kbp, 144 kbp and 164 kbp length contains 6,848 protein-coding and 60 RNA genes, and is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  18. The minimum information about a genome sequence (MIGS) specification.

    PubMed

    Field, Dawn; Garrity, George; Gray, Tanya; Morrison, Norman; Selengut, Jeremy; Sterk, Peter; Tatusova, Tatiana; Thomson, Nicholas; Allen, Michael J; Angiuoli, Samuel V; Ashburner, Michael; Axelrod, Nelson; Baldauf, Sandra; Ballard, Stuart; Boore, Jeffrey; Cochrane, Guy; Cole, James; Dawyndt, Peter; De Vos, Paul; DePamphilis, Claude; Edwards, Robert; Faruque, Nadeem; Feldman, Robert; Gilbert, Jack; Gilna, Paul; Glöckner, Frank Oliver; Goldstein, Philip; Guralnick, Robert; Haft, Dan; Hancock, David; Hermjakob, Henning; Hertz-Fowler, Christiane; Hugenholtz, Phil; Joint, Ian; Kagan, Leonid; Kane, Matthew; Kennedy, Jessie; Kowalchuk, George; Kottmann, Renzo; Kolker, Eugene; Kravitz, Saul; Kyrpides, Nikos; Leebens-Mack, Jim; Lewis, Suzanna E; Li, Kelvin; Lister, Allyson L; Lord, Phillip; Maltsev, Natalia; Markowitz, Victor; Martiny, Jennifer; Methe, Barbara; Mizrachi, Ilene; Moxon, Richard; Nelson, Karen; Parkhill, Julian; Proctor, Lita; White, Owen; Sansone, Susanna-Assunta; Spiers, Andrew; Stevens, Robert; Swift, Paul; Taylor, Chris; Tateno, Yoshio; Tett, Adrian; Turner, Sarah; Ussery, David; Vaughan, Bob; Ward, Naomi; Whetzel, Trish; San Gil, Ingio; Wilson, Gareth; Wipat, Anil

    2008-05-01

    With the quantity of genomic data increasing at an exponential rate, it is imperative that these data be captured electronically, in a standard format. Standardization activities must proceed within the auspices of open-access and international working bodies. To tackle the issues surrounding the development of better descriptions of genomic investigations, we have formed the Genomic Standards Consortium (GSC). Here, we introduce the minimum information about a genome sequence (MIGS) specification with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange. As part of its wider goals, the GSC also supports improving the 'transparency' of the information contained in existing genomic databases. PMID:18464787

  19. Castor Bean Organelle Genome Sequencing and Worldwide Genetic Diversity Analysis

    PubMed Central

    Chan, Agnes P.; Williams, Amber L.; Rice, Danny W.; Liu, Xinyue; Melake-Berhan, Admasu; Huot Creasy, Heather; Puiu, Daniela; Rosovitz, M. J.; Khouri, Hoda M.; Beckstrom-Sternberg, Stephen M.; Allan, Gerard J.; Keim, Paul; Ravel, Jacques; Rabinowicz, Pablo D.

    2011-01-01

    Castor bean is an important oil-producing plant in the Euphorbiaceae family. Its high-quality oil contains up to 90% of the unusual fatty acid ricinoleate, which has many industrial and medical applications. Castor bean seeds also contain ricin, a highly toxic Type 2 ribosome-inactivating protein, which has gained relevance in recent years due to biosafety concerns. In order to gain knowledge on global genetic diversity in castor bean and to ultimately help the development of breeding and forensic tools, we carried out an extensive chloroplast sequence diversity analysis. Taking advantage of the recently published genome sequence of castor bean, we assembled the chloroplast and mitochondrion genomes by extracting selected reads from the available whole genome shotgun reads. With the chloroplast genome as a reference, we used the methylation filtration technique to readily obtain draft genome sequences of 7 geographically and genetically diverse castor bean accessions. These sequence data were used to identify single nucleotide polymorphism markers, and phylogenetic analysis identified two major clades that were not apparent in previous population genetic studies using genetic markers derived from nuclear DNA. Two distinct sub-clades could be defined within each major clade, and large-scale genotyping of castor bean populations worldwide confirmed previously observed low levels of genetic diversity and showed a broad geographic distribution of each sub-clade. PMID:21750729

  20. Complete genome sequence of Pyrolobus fumarii type strain (1AT)

    SciTech Connect

    Anderson, Iain; Goker, Markus; Nolan, Matt; Lucas, Susan; Hammon, Nancy; Deshpande, Shweta; Cheng, Jan-Fang; Tapia, Roxanne; Han, Cliff; Goodwin, Lynne A.; Pitluck, Sam; Huntemann, Marcel; Liolios, Konstantinos; Ivanova, N; Pagani, Ioanna; Mavromatis, K; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam L; Hauser, Loren John; Brambilla, Evelyne-Marie; Huber, Harald; Yasawong, Montri; Rohde, Manfred; Spring, Stefan; Abt, Birte; Sikorski, Johannes; Wirth, Reinhard; Detter, J. Chris; Woyke, Tanja; Bristow, James; Eisen, Jonathan; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Lapidus, Alla L.

    2011-01-01

    Pyrolobus fumarii Blöchl et al. 1997 is the type species of the genus Pyrolobus, which belongs to the crenarchaeal family Pyrodictiaceae. The species is a facultatively microaerophilic non-motile crenarchaeon. It is of interest because of its isolated phylogenetic location in the tree of life and because it is a hyperthermophilic chemolithoautotroph known as the primary producer of organic matter at deep-sea hydrothermal vents. P. fumarii currently exhibits the highest optimal growth temperature of all life forms on earth (106 °C). This is the first completed genome sequence of a member of the genus Pyrolobus to be published and only the second genome sequence from a member of the family Pyrodictiaceae. Although Diversa Corporation announced the completion of sequencing of the P. fumarii genome on September 25, 2001, this sequence was never released to the public. The 1,843,267 bp long genome with its 1,986 protein-coding and 52 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.