Note: This page contains sample records for the topic cancer genome sequences from Science.gov.
While these samples are representative of the content of Science.gov,
they are not comprehensive nor are they the most current set.
We encourage you to perform a real-time search of Science.gov
to obtain the most current and comprehensive results.
Last update: August 15, 2014.
1

Clinical relevance of cancer genome sequencing  

PubMed Central

The arrival of both high-throughput and bench-top next-generation sequencing technologies and sequence enrichment methods has revolutionized our approach to dissecting the genetic basis of cancer. These technologies have been almost invariably employed in whole-genome sequencing (WGS) and whole-exome sequencing (WES) studies. Both WGS and WES approaches have been widely applied to interrogate the somatic mutational landscape of sporadic cancers and identify novel germline mutations underlying familial cancer syndromes. The clinical implications of cancer genome sequencing have become increasingly clear, for example in diagnostics. In this editorial, we present these advances in the context of research discovery and discuss both the clinical relevance of cancer genome sequencing and the challenges associated with the adoption of these genomic technologies in a clinical setting.

Ku, Chee Seng; Cooper, David N; Roukos, Dimitrios H

2013-01-01

2

Cancer Genome Sequencing - An Interim Analysis  

PubMed Central

With the publishing of the first complete, whole genome of a human cancer and its paired normal, we have passed a key milestone in the cancer genome sequencing strategy. The generation of such data will, thanks to technical advances, soon become commonplace. As a significant number of proof-of-concept studies have been published, it is important to analyze now the likely implications of this data and how it might frame cancer research in the near future. The diversity of genes mutated within individual tumor-types, the most striking feature of all studies reported to date, challenges gene-centric models of tumorigenesis. While cancer genome sequencing will revolutionize certain aspects of personalized care, the value of these studies in facilitating the development of new therapies, their primary goal, appears less promising. Most significantly, however, the cancer genome sequencing strategy, as currently applied, fails to characterize the most relevant genomic features of cancer – the mutational heterogeneity within individual tumors.

Fox, Edward J.; Salk, Jesse J.; Loeb, Lawrence A.

2009-01-01

3

Science Originals: Sequencing Cancer Genomes: Targeted Cancer Therapies  

NSDL National Science Digital Library

Applying DNA sequencing to cancer genomes is providing insights that have allowed researchers to turn some cancers into chronic diseases rather than deadly ones. Still, the ultimate goal is to kill the cancer.

Robert Frederick (AAAS)

2011-03-25

4

Cancer Genome Sequencing and Its Implications for Personalized Cancer Vaccines  

PubMed Central

New DNA sequencing platforms have revolutionized human genome sequencing. The dramatic advances in genome sequencing technologies predict that the $1,000 genome will become a reality within the next few years. Applied to cancer, the availability of cancer genome sequences permits real-time decision-making with the potential to affect diagnosis, prognosis, and treatment, and has opened the door towards personalized medicine. A promising strategy is the identification of mutated tumor antigens, and the design of personalized cancer vaccines. Supporting this notion are preliminary analyses of the epitope landscape in breast cancer suggesting that individual tumors express significant numbers of novel antigens to the immune system that can be specifically targeted through cancer vaccines.

Li, Lijin; Goedegebuure, Peter; Mardis, Elaine R.; Ellis, Matthew J.C.; Zhang, Xiuli; Herndon, John M.; Fleming, Timothy P.; Carreno, Beatriz M.; Hansen, Ted H.; Gillanders, William E.

2011-01-01

5

Whole genome sequencing for lung cancer  

PubMed Central

Lung cancer is a leading cause of cancer related morbidity and mortality globally, and carries a dismal prognosis. Improved understanding of the biology of cancer is required to improve patient outcomes. Next-generation sequencing (NGS) is a powerful tool for whole genome characterisation, enabling comprehensive examination of somatic mutations that drive oncogenesis. Most NGS methods are based on polymerase chain reaction (PCR) amplification of platform-specific DNA fragment libraries, which are then sequenced. These techniques are well suited to high-throughput sequencing and are able to detect the full spectrum of genomic changes present in cancer. However, they require considerable investments in time, laboratory infrastructure, computational analysis and bioinformatic support. Next-generation sequencing has been applied to studies of the whole genome, exome, transcriptome and epigenome, and is changing the paradigm of lung cancer research and patient care. The results of this new technology will transform current knowledge of oncogenic pathways and provide molecular targets of use in the diagnosis and treatment of cancer. Somatic mutations in lung cancer have already been identified by NGS, and large scale genomic studies are underway. Personalised treatment strategies will improve care for those likely to benefit from available therapies, while sparing others the expense and morbidity of futile intervention. Organisational, computational and bioinformatic challenges of NGS are driving technological advances as well as raising ethical issues relating to informed consent and data release. Differentiation between driver and passenger mutations requires careful interpretation of sequencing data. Challenges in the interpretation of results arise from the types of specimens used for DNA extraction, sample processing techniques and tumour content. Tumour heterogeneity can reduce power to detect mutations implicated in oncogenesis. 
Next-generation sequencing will facilitate investigation of the biological and clinical implications of such variation. These techniques can now be applied to single cells and free circulating DNA, and possibly in the future to DNA obtained from body fluids and from subpopulations of tumour. As costs reduce, and speed and processing accuracy increase, NGS technology will become increasingly accessible to researchers and clinicians, with the ultimate goal of improving the care of patients with lung cancer.

Goh, Felicia; Wright, Casey M; Sriram, Krishna B; Relan, Vandana; Clarke, Belinda E; Duhig, Edwina E; Bowman, Rayleen V; Yang, Ian A; Fong, Kwun M

2012-01-01

6

Advances in understanding cancer genomes through second-generation sequencing  

Microsoft Academic Search

Cancers are caused by the accumulation of genomic alterations. Therefore, analyses of cancer genome sequences and structures provide insights for understanding cancer biology, diagnosis and therapy. The application of second-generation DNA sequencing technologies (also known as next-generation sequencing) — through whole-genome, whole-exome and whole-transcriptome approaches — is allowing substantial advances in cancer genomics. These methods are facilitating an increase in

Stacey Gabriel; Gad Getz; Matthew Meyerson

2010-01-01

7

Reconstructing cancer genomes from paired-end sequencing data  

PubMed Central

Background: A cancer genome is derived from the germline genome through a series of somatic mutations. Somatic structural variants - including duplications, deletions, inversions, translocations, and other rearrangements - result in a cancer genome that is a scrambling of intervals, or "blocks" of the germline genome sequence. We present an efficient algorithm for reconstructing the block organization of a cancer genome from paired-end DNA sequencing data. Results: By aligning paired reads from a cancer genome - and a matched germline genome, if available - to the human reference genome, we derive: (i) a partition of the reference genome into intervals; (ii) adjacencies between these intervals in the cancer genome; (iii) an estimated copy number for each interval. We formulate the Copy Number and Adjacency Genome Reconstruction Problem of determining the cancer genome as a sequence of the derived intervals that is consistent with the measured adjacencies and copy numbers. We design an efficient algorithm, called Paired-end Reconstruction of Genome Organization (PREGO), to solve this problem by reducing it to an optimization problem on an interval-adjacency graph constructed from the data. The solution to the optimization problem results in an Eulerian graph, containing an alternating Eulerian tour that corresponds to a cancer genome that is consistent with the sequencing data. We apply our algorithm to five ovarian cancer genomes that were sequenced as part of The Cancer Genome Atlas. We identify numerous rearrangements, or structural variants, in these genomes, analyze reciprocal vs. non-reciprocal rearrangements, and identify rearrangements consistent with known mechanisms of duplication such as tandem duplications and breakage/fusion/bridge (B/F/B) cycles. Conclusions: We demonstrate that PREGO efficiently identifies complex and biologically relevant rearrangements in cancer genome sequencing data. An implementation of the PREGO algorithm is available at http://compbio.cs.brown.edu/software/.

2012-01-01
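
The abstract above describes an interval-adjacency graph whose solution contains an alternating Eulerian tour. As a rough illustration only (this is not the PREGO implementation), the sketch below checks the degree condition such a tour needs: at each end of an interval, the interval's copy number must equal the total multiplicity of the adjacencies touching that end. It assumes a circular genome with no telomeres; the function name, the "L"/"R" end labels and the toy data are all hypothetical.

```python
from collections import defaultdict


def is_balanced(copy_number, adjacencies):
    """Check the degree condition for an alternating Eulerian tour.

    copy_number: dict mapping interval name -> estimated copy number.
    adjacencies: dict mapping a pair of interval ends, each written as
        (interval name, "L" or "R"), -> number of times the adjacency is used.
    Assumption: circular genome, so every interval end must be fully matched.
    """
    adjacency_degree = defaultdict(int)
    for (end_u, end_v), multiplicity in adjacencies.items():
        adjacency_degree[end_u] += multiplicity
        # A fold-back adjacency (end_u == end_v) is counted twice on purpose.
        adjacency_degree[end_v] += multiplicity

    return all(
        adjacency_degree[(interval, side)] == copies
        for interval, copies in copy_number.items()
        for side in ("L", "R")
    )


# Hypothetical circular genome "A B B": interval B is tandemly duplicated.
copy_number = {"A": 1, "B": 2}
adjacencies = {
    (("A", "R"), ("B", "L")): 1,  # reference adjacency A -> B
    (("B", "R"), ("B", "L")): 1,  # tandem-duplication adjacency B -> B
    (("B", "R"), ("A", "L")): 1,  # closes the circle B -> A
}
print(is_balanced(copy_number, adjacencies))
```

With the toy data, every interval end is used exactly as often as its interval's copy number, so the check passes; lowering the copy number of B to 1 makes it fail.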

8

Perspectives of integrative cancer genomics in next generation sequencing era.  

PubMed

The explosive development of genomics technologies including microarrays and next generation sequencing (NGS) has provided comprehensive maps of cancer genomes, including the expression of mRNAs and microRNAs, DNA copy numbers, sequence variations, and epigenetic changes. These genome-wide profiles of genetic aberrations could reveal candidates for diagnostic and/or prognostic biomarkers as well as mechanistic insights into tumor development and progression. Recent efforts to establish a huge cancer genome compendium and integrative omics analyses, so-called "integromics", have extended our understanding of the cancer genome, showing its daunting complexity and heterogeneity. However, the challenges of the structured integration, sharing, and interpretation of big omics data remain to be resolved. Here, we review several issues raised in cancer omics data analysis, including NGS, focusing particularly on study design and analysis strategies. This might be helpful for understanding the current trends and strategies of rapidly evolving cancer genomics research. PMID:23105932

Kwon, So Mee; Cho, Hyunwoo; Choi, Ji Hye; Jee, Byul A; Jo, Yuna; Woo, Hyun Goo

2012-06-01

9

Comprehensive genome sequence analysis of a breast cancer amplicon.  

PubMed

Gene amplification occurs in most solid tumors and is associated with poor prognosis. Amplification of 20q13.2 is common to several tumor types including breast cancer. The 1 Mb of sequence spanning the 20q13.2 breast cancer amplicon is one of the most exhaustively studied segments of the human genome. These studies have included amplicon mapping by comparative genomic hybridization (CGH), fluorescent in-situ hybridization (FISH), array-CGH, quantitative microsatellite analysis (QUMA), and functional genomic studies. Together these studies revealed a complex amplicon structure suggesting the presence of at least two driver genes in some tumors. One of these, ZNF217, is capable of immortalizing human mammary epithelial cells (HMEC) when overexpressed. In addition, we now report the sequencing of this region in human and mouse, together with quantitative expression studies in tumors. Amplicon localization now is straightforward and the availability of human and mouse genomic sequence facilitates their functional analysis. However, comprehensive annotation of megabase-scale regions requires integration of vast amounts of information. We present a system for integrative analysis and demonstrate its utility on 1.2 Mb of sequence spanning the 20q13.2 breast cancer amplicon and 865 kb of syntenic murine sequence. We integrate tumor genome copy number measurements with exhaustive genome landscape mapping, showing that amplicon boundaries are associated with maxima in repetitive element density and a region of evolutionary instability. This integration of comprehensive sequence annotation, quantitative expression analysis, and tumor amplicon boundaries provides evidence for an additional driver gene, prefoldin 4 (PFDN4), as well as coregulated genes and conserved noncoding regions, and associates repetitive elements with regions of genomic instability at this locus. PMID:11381030

Collins, C; Volik, S; Kowbel, D; Ginzinger, D; Ylstra, B; Cloutier, T; Hawkins, T; Predki, P; Martin, C; Wernick, M; Kuo, W L; Alberts, A; Gray, J W

2001-06-01

10

Genome Sequencing Centers  

Cancer.gov

The Cancer Genome Atlas (TCGA) Genome Sequencing Centers (GSCs) perform large-scale DNA sequencing using the latest sequencing technologies. Supported by the National Human Genome Research Institute (NHGRI) large-scale sequencing program, the GSCs generate the enormous volume of data required by TCGA, while continually improving existing technologies and methods to expand the frontier of what can be achieved in cancer genome sequencing.

11

Exploring Cancer through Genomic Sequence Comparisons: A National Cancer Institute-National Human Genome Research Institute Workshop. Held in Bethesda, Maryland on April 14-15, 2004.  

National Technical Information Service (NTIS)

On April 14-15, 2004, the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI) convened a workshop, 'Exploring Cancer through Genomic Sequence Comparisons.' Participants included leaders from the Nation's cancer centers...

2004-01-01

12

Returning individual research results for genome sequences of pancreatic cancer  

PubMed Central

Background: Disclosure of individual results to participants in genomic research is a complex and contentious issue. There are many existing commentaries and opinion pieces on the topic, but little empirical data concerning actual cases describing how individual results have been returned. Thus, the real life risks and benefits of disclosing individual research results to participants are rarely if ever presented as part of this debate. Methods: The Australian Pancreatic Cancer Genome Initiative (APGI) is an Australian contribution to the International Cancer Genome Consortium (ICGC) that involves prospective sequencing of tumor and normal genomes of study participants with pancreatic cancer in Australia. We present three examples that illustrate different facets of how research results may arise, and how they may be returned to individuals within an ethically defensible and clinically practical framework. This framework includes the necessary elements identified by others including consent, determination of the significance of results and which to return, delineation of the responsibility for communication and the clinical pathway for managing the consequences of returning results. Results: Of 285 recruited patients, we returned results to a total of 25 with no adverse events to date. These included four that were classified as medically actionable, nine as clinically significant and eight that were returned at the request of the treating clinician. Case studies presented depict instances where research results impacted on cancer susceptibility, current treatment and diagnosis, and illustrate key practical challenges of developing an effective framework. Conclusions: We suggest that return of individual results is both feasible and ethically defensible but only within the context of a robust framework that involves a close relationship between researchers and clinicians.

2014-01-01

13

Trastuzumab and beyond: sequencing cancer genomes and predicting molecular networks.  

PubMed

The diversity of life can now be clearly explored with next-generation DNA sequencing technology, allowing the discovery of genetic variants among individuals, patients and tumors. However, beyond completing the catalog of causal mutations, systems medicine is essential to link genotype to phenotypic cancer diversity on the way to personalized medicine. Despite advances from traditional single-gene molecular research, including rare mutations in BRCA1/2 and CDH1 for primary prevention and trastuzumab for treating HER2-overexpressing breast and gastric tumors, overall treatment failure and death rates are still alarmingly high. The revolution in sequencing now reveals both a huge number and widespread variability of driver mutations, including single-nucleotide polymorphisms, genomic rearrangements and copy-number changes, involved in breast cancer development. All these genetic alterations result in a heterogeneous deregulation of signaling pathways, including EGFR, HER2, VEGF, Wnt/Notch, TGF and others. Cancer initiation, progression and metastases are driven by complex molecular networks rather than a linear genotype-phenotype relationship. Therefore, clinical expectations from traditional molecular research strategies targeting single genes and single signaling pathways are likely minimal. This review discusses the necessity of molecular network modeling to understand complex gene-gene, protein-protein and gene-environment interactions. Moreover, the potential of systems clinico-biological approaches to predict networks of intracellular signaling pathway components and heterogeneous cancer cells within an individual tumor is described. A flowchart specific to three separate steps in cancer evolution (tumorigenesis, early-stage and advanced-stage breast cancer) is presented. Reverse engineering that starts by integrating available established clinical, environmental, treatment and oncological outcomes (survival and death) data, and then the still incomplete but progressively accumulating genotypic data, into computational network modeling may lead to bionetworks-based discovery of robust biomarkers and highly effective cancer drug targets. PMID:20975737

Roukos, D H

2011-04-01

14

Genomic sequencing.  

PubMed Central

Unique DNA sequences can be determined directly from mouse genomic DNA. A denaturing gel separates by size mixtures of unlabeled DNA fragments from complete restriction and partial chemical cleavages of the entire genome. These lanes of DNA are transferred and UV-crosslinked to nylon membranes. Hybridization with a short 32P-labeled single-stranded probe produces the image of a DNA sequence "ladder" extending from the 3' or 5' end of one restriction site in the genome. Numerous different sequences can be obtained from a single membrane by reprobing. Each band in these sequences represents 3 fg of DNA complementary to the probe. Sequence data from mouse immunoglobulin heavy chain genes from several cell types are presented. The genomic sequencing procedures are applicable to the analysis of genetic polymorphisms, DNA methylation at deoxycytidines, and nucleic acid-protein interactions at single nucleotide resolution.

Church, G M; Gilbert, W

1984-01-01

15

Genome sequencing and analysis of the Tasmanian devil and its transmissible cancer.  

PubMed

The Tasmanian devil (Sarcophilus harrisii), the largest marsupial carnivore, is endangered due to a transmissible facial cancer spread by direct transfer of living cancer cells through biting. Here we describe the sequencing, assembly, and annotation of the Tasmanian devil genome and whole-genome sequences for two geographically distant subclones of the cancer. Genomic analysis suggests that the cancer first arose from a female Tasmanian devil and that the clone has subsequently genetically diverged during its spread across Tasmania. The devil cancer genome contains more than 17,000 somatic base substitution mutations and bears the imprint of a distinct mutational process. Genotyping of somatic mutations in 104 geographically and temporally distributed Tasmanian devil tumors reveals the pattern of evolution and spread of this parasitic clonal lineage, with evidence of a selective sweep in one geographical area and persistence of parallel lineages in other populations. PMID:22341448

Murchison, Elizabeth P; Schulz-Trieglaff, Ole B; Ning, Zemin; Alexandrov, Ludmil B; Bauer, Markus J; Fu, Beiyuan; Hims, Matthew; Ding, Zhihao; Ivakhno, Sergii; Stewart, Caitlin; Ng, Bee Ling; Wong, Wendy; Aken, Bronwen; White, Simon; Alsop, Amber; Becq, Jennifer; Bignell, Graham R; Cheetham, R Keira; Cheng, William; Connor, Thomas R; Cox, Anthony J; Feng, Zhi-Ping; Gu, Yong; Grocock, Russell J; Harris, Simon R; Khrebtukova, Irina; Kingsbury, Zoya; Kowarsky, Mark; Kreiss, Alexandre; Luo, Shujun; Marshall, John; McBride, David J; Murray, Lisa; Pearse, Anne-Maree; Raine, Keiran; Rasolonjatovo, Isabelle; Shaw, Richard; Tedder, Philip; Tregidgo, Carolyn; Vilella, Albert J; Wedge, David C; Woods, Gregory M; Gormley, Niall; Humphray, Sean; Schroth, Gary; Smith, Geoffrey; Hall, Kevin; Searle, Stephen M J; Carter, Nigel P; Papenfuss, Anthony T; Futreal, P Andrew; Campbell, Peter J; Yang, Fengtang; Bentley, David R; Evers, Dirk J; Stratton, Michael R

2012-02-17

16

Genome Sequencing and Analysis of the Tasmanian Devil and Its Transmissible Cancer  

PubMed Central

The Tasmanian devil (Sarcophilus harrisii), the largest marsupial carnivore, is endangered due to a transmissible facial cancer spread by direct transfer of living cancer cells through biting. Here we describe the sequencing, assembly, and annotation of the Tasmanian devil genome and whole-genome sequences for two geographically distant subclones of the cancer. Genomic analysis suggests that the cancer first arose from a female Tasmanian devil and that the clone has subsequently genetically diverged during its spread across Tasmania. The devil cancer genome contains more than 17,000 somatic base substitution mutations and bears the imprint of a distinct mutational process. Genotyping of somatic mutations in 104 geographically and temporally distributed Tasmanian devil tumors reveals the pattern of evolution and spread of this parasitic clonal lineage, with evidence of a selective sweep in one geographical area and persistence of parallel lineages in other populations.

Murchison, Elizabeth P.; Schulz-Trieglaff, Ole B.; Ning, Zemin; Alexandrov, Ludmil B.; Bauer, Markus J.; Fu, Beiyuan; Hims, Matthew; Ding, Zhihao; Ivakhno, Sergii; Stewart, Caitlin; Ng, Bee Ling; Wong, Wendy; Aken, Bronwen; White, Simon; Alsop, Amber; Becq, Jennifer; Bignell, Graham R.; Cheetham, R. Keira; Cheng, William; Connor, Thomas R.; Cox, Anthony J.; Feng, Zhi-Ping; Gu, Yong; Grocock, Russell J.; Harris, Simon R.; Khrebtukova, Irina; Kingsbury, Zoya; Kowarsky, Mark; Kreiss, Alexandre; Luo, Shujun; Marshall, John; McBride, David J.; Murray, Lisa; Pearse, Anne-Maree; Raine, Keiran; Rasolonjatovo, Isabelle; Shaw, Richard; Tedder, Philip; Tregidgo, Carolyn; Vilella, Albert J.; Wedge, David C.; Woods, Gregory M.; Gormley, Niall; Humphray, Sean; Schroth, Gary; Smith, Geoffrey; Hall, Kevin; Searle, Stephen M.J.; Carter, Nigel P.; Papenfuss, Anthony T.; Futreal, P. Andrew; Campbell, Peter J.; Yang, Fengtang; Bentley, David R.; Evers, Dirk J.; Stratton, Michael R.

2012-01-01

17

Exome Sequencing Reveals Comprehensive Genomic Alterations across Eight Cancer Cell Lines  

PubMed Central

It is well established that genomic alterations play an essential role in oncogenesis, disease progression, and response of tumors to therapeutic intervention. Advances in next-generation sequencing (NGS) technologies provide unprecedented capabilities to scan genomes for changes such as mutations, deletions, and alterations of chromosomal copy number. However, the cost of full-genome sequencing still prevents the routine application of NGS in many areas. Capturing and sequencing the coding exons of genes (the “exome”) can be a cost-effective approach for identifying changes that result in alteration of protein sequences. We applied an exome-sequencing technology (Roche Nimblegen capture paired with 454 sequencing) to identify sequence variation and mutations in eight commonly used cancer cell lines from a variety of tissue origins (A2780, A549, Colo205, GTL16, NCI-H661, MDA-MB468, PC3, and RD). We showed that this technology can accurately identify sequence variation, providing ~95% concordance with Affymetrix SNP Array 6.0 performed on the same cell lines. Furthermore, we detected 19 of the 21 mutations reported in the Sanger COSMIC database for these cell lines. We identified an average of 2,779 potential novel sequence variations/mutations per cell line, of which 1,904 were non-synonymous. Many non-synonymous changes were identified in kinases and known cancer-related genes. In addition, we confirmed that the read-depth of exome sequence data can be used to estimate high-level gene amplifications and identify homologous deletions. In summary, we demonstrate that exome sequencing can be a reliable and cost-effective way of identifying alterations in cancer genomes, and we have generated a comprehensive catalogue of genomic alterations in coding regions of eight cancer cell lines. These findings could provide important insights into cancer pathways and mechanisms of resistance to anti-cancer therapies.

Chang, Han; Jackson, Donald G.; Kayne, Paul S.; Ross-Macdonald, Petra B.; Ryseck, Rolf-Peter; Siemers, Nathan O.

2011-01-01
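
The abstract above notes that exome read depth can be used to flag high-level amplifications and deletions. The sketch below shows one common way such a signal is computed, not the procedure the authors used: a log2 ratio of library-size-normalized depth in a sample against a reference. The use of a reference sample, the pseudocount, the thresholds and all names are assumptions made for illustration.

```python
import math


def log2_depth_ratio(sample_depth, reference_depth, sample_total, reference_total,
                     pseudocount=0.5):
    """Log2 ratio of normalized read depth for one exon or gene.

    Depths are divided by each library's total read count so that libraries
    of different size are comparable. The pseudocount (an assumption) keeps
    the ratio finite when a region has zero reads.
    """
    sample_fraction = (sample_depth + pseudocount) / sample_total
    reference_fraction = (reference_depth + pseudocount) / reference_total
    return math.log2(sample_fraction / reference_fraction)


def classify(ratio, amplification_threshold=2.0, deletion_threshold=-3.0):
    """Label a region using illustrative, uncalibrated thresholds."""
    if ratio >= amplification_threshold:
        return "amplified"
    if ratio <= deletion_threshold:
        return "deleted"
    return "neutral"


# Hypothetical counts for three regions, both libraries with 1,000,000 reads.
for depth in (800, 100, 0):
    ratio = log2_depth_ratio(depth, 100, 1_000_000, 1_000_000)
    print(depth, classify(ratio))
```

In practice thresholds would need calibrating against capture efficiency and GC content, which this sketch ignores.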

18

Detection and Mapping of Amplified DNA Sequences in Breast Cancer by Comparative Genomic Hybridization  

Microsoft Academic Search

Comparative genomic hybridization was applied to 5 breast cancer cell lines and 33 primary tumors to discover and map regions of the genome with increased DNA-sequence copy-number. Two-thirds of primary tumors and almost all cell lines showed increased DNA-sequence copy-number affecting a total of 26 chromosomal subregions. Most of these loci were distinct from those of currently known amplified genes

Anne Kallioniemi; Olli-Pekka Kallioniemi; Jim Piper; Minna Tanner; Trond Stokke; Ling Chen; Helene S. Smith; Dan Pinkel; Joe W. Gray; Frederic M. Waldman

1994-01-01

19

Identification of somatically acquired rearrangements in cancer using genome-wide massively parallel paired-end sequencing  

Microsoft Academic Search

Human cancers often carry many somatically acquired genomic rearrangements, some of which may be implicated in cancer development. However, conventional strategies for characterizing rearrangements are laborious and low-throughput and have low sensitivity or poor resolution. We used massively parallel sequencing to generate sequence reads from both ends of short DNA fragments derived from the genomes of two individuals with lung

Peter J Campbell; Philip J Stephens; Erin D Pleasance; Sarah O'Meara; Heng Li; Thomas Santarius; Lucy A Stebbings; Catherine Leroy; Sarah Edkins; Claire Hardy; Jon W Teague; Andrew Menzies; Ian Goodhead; Daniel J Turner; Christopher M Clee; Michael A Quail; Antony Cox; Clive Brown; Richard Durbin; Matthew E Hurles; Paul A W Edwards; Graham R Bignell; Michael R Stratton; P Andrew Futreal

2008-01-01

20

Detection and mapping of amplified DNA sequences in breast cancer by comparative genomic hybridization.  

PubMed Central

Comparative genomic hybridization was applied to 5 breast cancer cell lines and 33 primary tumors to discover and map regions of the genome with increased DNA-sequence copy-number. Two-thirds of primary tumors and almost all cell lines showed increased DNA-sequence copy-number affecting a total of 26 chromosomal subregions. Most of these loci were distinct from those of currently known amplified genes in breast cancer, with sequences originating from 17q22-q24 and 20q13 showing the highest frequency of amplification. The results indicate that these chromosomal regions may contain previously unknown genes whose increased expression contributes to breast cancer progression. Chromosomal regions with increased copy-number often spanned tens of Mb, suggesting involvement of more than one gene in each region.

Kallioniemi, A; Kallioniemi, O P; Piper, J; Tanner, M; Stokke, T; Chen, L; Smith, H S; Pinkel, D; Gray, J W; Waldman, F M

1994-01-01

21

Detection of inherited mutations for breast and ovarian cancer using genomic capture and massively parallel sequencing  

PubMed Central

Inherited loss-of-function mutations in the tumor suppressor genes BRCA1, BRCA2, and multiple other genes predispose to high risks of breast and/or ovarian cancer. Cancer-associated inherited mutations in these genes are collectively quite common, but individually rare or even private. Genetic testing for BRCA1 and BRCA2 mutations has become an integral part of clinical practice, but testing is generally limited to these two genes and to women with severe family histories of breast or ovarian cancer. To determine whether massively parallel, “next-generation” sequencing would enable accurate, thorough, and cost-effective identification of inherited mutations for breast and ovarian cancer, we developed a genomic assay to capture, sequence, and detect all mutations in 21 genes, including BRCA1 and BRCA2, with inherited mutations that predispose to breast or ovarian cancer. Constitutional genomic DNA from subjects with known inherited mutations, ranging in size from 1 to >100,000 bp, was hybridized to custom oligonucleotides and then sequenced using a genome analyzer. Analysis was carried out blind to the mutation in each sample. Average coverage was >1200 reads per base pair. After filtering sequences for quality and number of reads, all single-nucleotide substitutions, small insertion and deletion mutations, and large genomic duplications and deletions were detected. There were zero false-positive calls of nonsense mutations, frameshift mutations, or genomic rearrangements for any gene in any of the test samples. This approach enables widespread genetic testing and personalized risk assessment for breast and ovarian cancer.

Walsh, Tom; Lee, Ming K.; Casadei, Silvia; Thornton, Anne M.; Stray, Sunday M.; Pennil, Christopher; Nord, Alex S.; Mandell, Jessica B.; Swisher, Elizabeth M.; King, Mary-Claire

2010-01-01

22

Clinical genomics information management software linking cancer genome sequence and clinical decisions.  

PubMed

Using sequencing information to guide clinical decision-making requires coordination of a diverse set of people and activities. In clinical genomics, the process typically includes sample acquisition, template preparation, genome data generation, analysis to identify and confirm variant alleles, interpretation of clinical significance, and reporting to clinicians. We describe a software application developed within a clinical genomics study, to support this entire process. The software application tracks patients, samples, genomic results, decisions and reports across the cohort, monitors progress and sends reminders, and works alongside an electronic data capture system for the trial's clinical and genomic data. It incorporates systems to read, store, analyze and consolidate sequencing results from multiple technologies, and provides a curated knowledge base of tumor mutation frequency (from the COSMIC database) annotated with clinical significance and drug sensitivity to generate reports for clinicians. By supporting the entire process, the application provides deep support for clinical decision making, enabling the generation of relevant guidance in reports for verification by an expert panel prior to forwarding to the treating physician. PMID:23603536

Watt, Stuart; Jiao, Wei; Brown, Andrew M K; Petrocelli, Teresa; Tran, Ben; Zhang, Tong; McPherson, John D; Kamel-Reid, Suzanne; Bedard, Philippe L; Onetto, Nicole; Hudson, Thomas J; Dancey, Janet; Siu, Lillian L; Stein, Lincoln; Ferretti, Vincent

2013-09-01

23

SomatiCA: Identifying, Characterizing and Quantifying Somatic Copy Number Aberrations from Cancer Genome Sequencing Data  

PubMed Central

Whole genome sequencing of matched tumor-normal sample pairs is becoming routine in cancer research. However, analysis of somatic copy-number changes from sequencing data is still challenging because of insufficient sequencing coverage, unknown tumor sample purity and subclonal heterogeneity. Here we describe a computational framework, named SomatiCA, which explicitly accounts for tumor purity and subclonality in the analysis of somatic copy-number profiles. Taking read depths (RD) and lesser allele frequencies (LAF) as input, SomatiCA will output 1) admixture rate for each tumor sample, 2) somatic allelic copy-number for each genomic segment, 3) fraction of tumor cells with subclonal change in each somatic copy number aberration (SCNA), and 4) a list of substantial genomic aberration events including gain, loss and LOH. SomatiCA is available as a Bioconductor R package at http://www.bioconductor.org/packages/2.13/bioc/html/SomatiCA.html.

Chen, Mengjie; Gunel, Murat; Zhao, Hongyu

2013-01-01
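
To illustrate why the SomatiCA record above treats tumor purity as something to be modeled explicitly, the following minimal sketch computes the read-depth ratio and lesser allele frequency one would expect for a clonal segment in a tumor sample mixed with diploid normal cells. It is not SomatiCA's algorithm or API: the function names, the diploid-normal assumption, and the absence of subclonality are all simplifying assumptions made for illustration.

```python
def expected_depth_ratio(purity, total_cn):
    """Expected tumor/normal read-depth ratio for a clonal segment.

    Assumes (hypothetically) that normal cells carry 2 copies and that
    read depth is proportional to the average copy number in the sample.
    """
    return (purity * total_cn + (1.0 - purity) * 2.0) / 2.0


def expected_laf(purity, total_cn, minor_cn):
    """Expected lesser allele frequency at a germline heterozygous site.

    minor_cn is the tumor copy number of one of the two alleles;
    normal cells contribute one copy of each allele.
    """
    allele_copies = purity * minor_cn + (1.0 - purity) * 1.0
    total_copies = purity * total_cn + (1.0 - purity) * 2.0
    if total_copies == 0:
        raise ValueError("no DNA expected (pure tumor, homozygous deletion)")
    fraction = allele_copies / total_copies
    return min(fraction, 1.0 - fraction)


# Hemizygous loss in a sample that is 50% tumor cells.
ratio = expected_depth_ratio(0.5, total_cn=1)
laf = expected_laf(0.5, total_cn=1, minor_cn=0)
```

In this sketch, copy-neutral LOH leaves the depth ratio at 1.0 but lowers the LAF, which is why both signals are useful together.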

24

Identifying driver mutations in sequenced cancer genomes: computational approaches to enable precision medicine  

PubMed Central

High-throughput DNA sequencing is revolutionizing the study of cancer and enabling the measurement of the somatic mutations that drive cancer development. However, the resulting sequencing datasets are large and complex, obscuring the clinically important mutations in a background of errors, noise, and random mutations. Here, we review computational approaches to identify somatic mutations in cancer genome sequences and to distinguish the driver mutations that are responsible for cancer from random, passenger mutations. First, we describe approaches to detect somatic mutations from high-throughput DNA sequencing data, particularly for tumor samples that comprise heterogeneous populations of cells. Next, we review computational approaches that aim to predict driver mutations according to their frequency of occurrence in a cohort of samples, or according to their predicted functional impact on protein sequence or structure. Finally, we review techniques to identify recurrent combinations of somatic mutations, including approaches that examine mutations in known pathways or protein-interaction networks, as well as de novo approaches that identify combinations of mutations according to statistical patterns of mutual exclusivity. These techniques, coupled with advances in high-throughput DNA sequencing, are enabling precision medicine approaches to the diagnosis and treatment of cancer.

2014-01-01
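
The record above mentions de novo approaches that score combinations of mutations by statistical patterns of mutual exclusivity. One simple way to express that idea, sketched here under the assumption of two genes and independent samples, is a one-sided hypergeometric (Fisher's exact) test for fewer co-mutated samples than expected by chance. The function name and the inputs are hypothetical; the methods covered by the review are more elaborate than this.

```python
from math import comb


def mutual_exclusivity_pvalue(n_samples, mutated_a, mutated_b):
    """P(co-mutated samples <= observed) under independence.

    mutated_a and mutated_b are sets of sample identifiers carrying a
    mutation in gene A and gene B. A small value suggests the two genes
    are mutated together less often than chance would predict.
    """
    a = len(mutated_a)
    b = len(mutated_b)
    observed = len(mutated_a & mutated_b)
    lowest = max(0, a + b - n_samples)  # smallest overlap that is possible
    total = comb(n_samples, b)
    tail = sum(
        comb(a, i) * comb(n_samples - a, b - i)
        for i in range(lowest, observed + 1)
    )
    return tail / total


# Ten samples: gene A mutated in five, gene B in the other five.
p_value = mutual_exclusivity_pvalue(10, {0, 1, 2, 3, 4}, {5, 6, 7, 8, 9})
```

With only two genes and no correction for multiple testing or per-sample mutation rate, this is a starting point rather than a driver-detection method.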

25

Molecular pathology of prostate cancer revealed by next-generation sequencing: opportunities for genome-based personalized therapy  

PubMed Central

Purpose of review: This article reviews recently identified genomic mutations in prostate cancer. Recent findings: Advanced sequencing technologies have made it possible to obtain large amounts of data on genomes and transcriptomes of cancers. Such technologies have been used to sequence prostate cancer of different stages, from treatment-naive cancers, to advanced, castration-resistant cancers to the aggressive small cell neuroendocrine carcinomas. For each category of prostate cancer, distinct and overlapping DNA sequence alterations were discovered, including point mutations, small insertions or deletions, copy number changes and chromosomal rearrangements. There appears to be a stepwise increase in genomic alterations from low risk to high risk to advanced cancers. Summary: These novel findings have significantly increased our knowledge of the genetic basis of human prostate cancer and the molecular mechanisms responsible for disease progression and treatment resistance. Some of the lesions are potential therapeutic targets. Studies along this direction will eventually make it possible to design personalized management plans for individual patients.

Huang, Jiaoti; Wang, Jason K.; Sun, Yin

2014-01-01

26

Identification of somatic mutations in cancer through Bayesian-based analysis of sequenced genome pairs  

PubMed Central

Background: The field of cancer genomics has rapidly adopted next-generation sequencing (NGS) in order to study and characterize malignant tumors with unprecedented resolution. In particular for cancer, one is often trying to identify somatic mutations – changes specific to a tumor and not within an individual’s germline. However, false positive and false negative detections often result from lack of sufficient variant evidence, contamination of the biopsy by stromal tissue, sequencing errors, and the erroneous classification of germline variation as tumor-specific. Results: We have developed a generalized Bayesian analysis framework for matched tumor/normal samples with the purpose of identifying tumor-specific alterations such as single nucleotide mutations, small insertions/deletions, and structural variation. We describe our methodology, and discuss its application to other types of paired-tissue analysis such as the detection of loss of heterozygosity as well as allelic imbalance. We also demonstrate the high level of sensitivity and specificity in discovering simulated somatic mutations, for various combinations of a) genomic coverage and b) emulated heterogeneity. Conclusion: We present a Java-based implementation of our methods named Seurat, which is made available for free academic use. We have demonstrated and reported on the discovery of different types of somatic change by applying Seurat to an experimentally-derived cancer dataset using our methods; and have discussed considerations and practices regarding the accurate detection of somatic events in cancer genomes. Seurat is available at https://sites.google.com/site/seuratsomatic.

2013-01-01
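
The general idea behind Bayesian analysis of a matched tumor/normal pair, as described in the record above, can be sketched with a toy model for a single site. This is not Seurat's model: the three hypotheses, the prior values, the error rate, and the allele-fraction grid are all assumptions chosen for illustration, and the function names are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of k alternate-allele reads out of n at alt probability p."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def somatic_posterior(normal_alt, normal_depth, tumor_alt, tumor_depth,
                      error=0.01, prior_somatic=1e-6, prior_germline=1e-3):
    """Posterior probability that a site carries a tumor-specific variant."""
    def alt_prob(fraction):
        # Observed alt probability once sequencing error is mixed in.
        return fraction * (1.0 - error) + (1.0 - fraction) * error

    def lik(alt, depth, fraction):
        return binom_pmf(alt, depth, alt_prob(fraction))

    # Reference hypothesis: no variant in either sample.
    ref = lik(normal_alt, normal_depth, 0.0) * lik(tumor_alt, tumor_depth, 0.0)
    # Germline hypothesis: same het or hom-alt genotype in both samples.
    germ = sum(
        lik(normal_alt, normal_depth, f) * lik(tumor_alt, tumor_depth, f)
        for f in (0.5, 1.0)
    ) / 2.0
    # Somatic hypothesis: normal is reference, tumor fraction unknown.
    grid = [i / 20.0 for i in range(1, 21)]
    som = lik(normal_alt, normal_depth, 0.0) * sum(
        lik(tumor_alt, tumor_depth, f) for f in grid
    ) / len(grid)

    prior_ref = 1.0 - prior_somatic - prior_germline
    weighted = (prior_ref * ref, prior_germline * germ, prior_somatic * som)
    return weighted[2] / sum(weighted)


# 0 of 30 normal reads and 12 of 40 tumor reads support the variant.
posterior = somatic_posterior(0, 30, 12, 40)
```

Averaging the tumor likelihood over a grid of allele fractions is one simple way to allow for stromal contamination and heterogeneity without estimating purity directly.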

27

Somatic retrotransposition in human cancer revealed by whole-genome and exome sequencing.  

PubMed

Retrotransposons constitute a major source of genetic variation, and somatic retrotransposon insertions have been reported in cancer. Here, we applied TranspoSeq, a computational framework that identifies retrotransposon insertions from sequencing data, to whole genomes from 200 tumor/normal pairs across 11 tumor types as part of The Cancer Genome Atlas (TCGA) Pan-Cancer Project. In addition to novel germline polymorphisms, we find 810 somatic retrotransposon insertions primarily in lung squamous, head and neck, colorectal, and endometrial carcinomas. Many somatic retrotransposon insertions occur in known cancer genes. We find that high somatic retrotransposition rates in tumors are associated with high rates of genomic rearrangement and somatic mutation. Finally, we developed TranspoSeq-Exome to interrogate an additional 767 tumor samples with hybrid-capture exome data and discovered 35 novel somatic retrotransposon insertions into exonic regions, including an insertion into an exon of the PTEN tumor suppressor gene. The results of this large-scale, comprehensive analysis of retrotransposon movement across tumor types suggest that somatic retrotransposon insertions may represent an important class of structural variation in cancer. PMID:24823667

Helman, Elena; Lawrence, Michael S; Stewart, Chip; Sougnez, Carrie; Getz, Gad; Meyerson, Matthew

2014-07-01

28

TCGA's Pan-Cancer Efforts and Expansion to Include Whole Genome Sequence  

Cancer.gov

Carolyn Hutter, Ph.D., Program Director of NHGRI's Division of Genomic Medicine, discusses the expansion of TCGA's Pan-Cancer efforts to include the Pan-Cancer Analysis of Whole Genomes (PAWG) project.

29

Whole-genome sequencing and comprehensive molecular profiling identify new driver mutations in gastric cancer.  

PubMed

Gastric cancer is a heterogeneous disease with diverse molecular and histological subtypes. We performed whole-genome sequencing in 100 tumor-normal pairs, along with DNA copy number, gene expression and methylation profiling, for integrative genomic analysis. We found subtype-specific genetic and epigenetic perturbations and unique mutational signatures. We identified previously known (TP53, ARID1A and CDH1) and new (MUC6, CTNNA2, GLI3, RNF43 and others) significantly mutated driver genes. Specifically, we found RHOA mutations in 14.3% of diffuse-type tumors but not in intestinal-type tumors (P < 0.001). The mutations clustered in recurrent hotspots affecting functional domains and caused defective RHOA signaling, promoting escape from anoikis in organoid cultures. The top perturbed pathways in gastric cancer included adherens junction and focal adhesion, in which RHOA and other mutated genes we identified participate as key players. These findings illustrate a multidimensional and comprehensive genomic landscape that highlights the molecular complexity of gastric cancer and provides a road map to facilitate genome-guided personalized therapy. PMID:24816253

Wang, Kai; Yuen, Siu Tsan; Xu, Jiangchun; Lee, Siu Po; Yan, Helen H N; Shi, Stephanie T; Siu, Hoi Cheong; Deng, Shibing; Chu, Kent Man; Law, Simon; Chan, Kok Hoe; Chan, Annie S Y; Tsui, Wai Yin; Ho, Siu Lun; Chan, Anthony K W; Man, Jonathan L K; Foglizzo, Valentina; Ng, Man Kin; Chan, April S; Ching, Yick Pang; Cheng, Grace H W; Xie, Tao; Fernandez, Julio; Li, Vivian S W; Clevers, Hans; Rejto, Paul A; Mao, Mao; Leung, Suet Yi

2014-06-01

30

Detection of Chromosomal Alterations in the Circulation of Cancer Patients with Whole-Genome Sequencing  

PubMed Central

Clinical management of cancer patients could be improved through the development of noninvasive approaches for the detection of incipient, residual, and recurrent tumors. We describe an approach to directly identify tumor-derived chromosomal alterations through analysis of circulating cell-free DNA from cancer patients. Whole-genome analyses of DNA from the plasma of 10 colorectal and breast cancer patients and 10 healthy individuals with massively parallel sequencing identified, in all patients, structural alterations that were not present in plasma DNA from healthy subjects. Detected alterations comprised chromosomal copy number changes and rearrangements, including amplification of cancer driver genes such as ERBB2 and CDK6. The level of circulating tumor DNA in the cancer patients ranged from 1.4 to 47.9%. The sensitivity and specificity of this approach are dependent on the amount of sequence data obtained and are derived from the fact that most cancers harbor multiple chromosomal alterations, each of which is unlikely to be present in normal cells. Given that chromosomal abnormalities are present in nearly all human cancers, this approach represents a useful method for the noninvasive detection of human tumors that is not dependent on the availability of tumor biopsies.

Leary, Rebecca J.; Sausen, Mark; Kinde, Isaac; Papadopoulos, Nickolas; Carpten, John D.; Craig, David; O'Shaughnessy, Joyce; Kinzler, Kenneth W.; Parmigiani, Giovanni; Vogelstein, Bert; Diaz, Luis A.; Velculescu, Victor E.

2013-01-01

31

Integrated genome and transcriptome sequencing identifies a novel form of hybrid and aggressive prostate cancer.  

PubMed

Next-generation sequencing is making sequence-based molecular pathology and personalized oncology viable. We selected an individual initially diagnosed with conventional but aggressive prostate adenocarcinoma and sequenced the genome and transcriptome from primary and metastatic tissues collected prior to hormone therapy. The histology-pathology and copy number profiles were remarkably homogeneous, yet it was possible to propose the quadrant of the prostate tumour that likely seeded the metastatic diaspora. Despite a homogeneous cell type, our transcriptome analysis revealed signatures of both luminal and neuroendocrine cell types. Remarkably, the repertoire of expressed but apparently private gene fusions, including C15orf21:MYC, recapitulated this biology. We hypothesize that the amplification and over-expression of the stem cell gene MSI2 may have contributed to the stable hybrid cellular identity. This hybrid luminal-neuroendocrine tumour appears to represent a novel and highly aggressive case of prostate cancer with unique biological features and, conceivably, a propensity for rapid progression to castrate-resistance. Overall, this work highlights the importance of integrated analyses of genome, exome and transcriptome sequences for basic tumour biology, sequence-based molecular pathology and personalized oncology. PMID:22294438

Wu, Chunxiao; Wyatt, Alexander W; Lapuk, Anna V; McPherson, Andrew; McConeghy, Brian J; Bell, Robert H; Anderson, Shawn; Haegert, Anne; Brahmbhatt, Sonal; Shukin, Robert; Mo, Fan; Li, Estelle; Fazli, Ladan; Hurtado-Coll, Antonio; Jones, Edward C; Butterfield, Yaron S; Hach, Faraz; Hormozdiari, Fereydoun; Hajirasouliha, Iman; Boutros, Paul C; Bristow, Robert G; Jones, Steven Jm; Hirst, Martin; Marra, Marco A; Maher, Christopher A; Chinnaiyan, Arul M; Sahinalp, S Cenk; Gleave, Martin E; Volik, Stanislav V; Collins, Colin C

2012-05-01

32

The cancer genome  

PubMed Central

All cancers arise as a result of changes that have occurred in the DNA sequence of the genomes of cancer cells. Over the past quarter of a century much has been learnt about these mutations and the abnormal genes that operate in human cancers. We are now, however, moving into an era in which it will be possible to obtain the complete DNA sequence of large numbers of cancer genomes. These studies will provide us with a detailed and comprehensive perspective on how individual cancers have developed.

Stratton, Michael R.; Campbell, Peter J.; Futreal, P. Andrew

2010-01-01

33

Draft Genome Sequences of Helicobacter pylori Strains Isolated from Regions of Low and High Gastric Cancer Risk in Colombia  

PubMed Central

The draft genome sequences of six Colombian Helicobacter pylori strains are presented. These strains were isolated from patients from regions of high and low gastric cancer risk in Colombia and were characterized by multilocus sequence typing. The data provide insights into differences between H. pylori strains of different phylogeographic origins.

Sheh, Alexander; Piazuelo, M. Blanca; Wilson, Keith T.; Correa, Pelayo

2013-01-01

34

Cancer of the ampulla of Vater: analysis of the whole genome sequence exposes a potential therapeutic vulnerability  

PubMed Central

Background: Recent advances in the treatment of cancer have focused on targeting genomic aberrations with selective therapeutic agents. In rare tumors, where large-scale clinical trials are daunting, this targeted genomic approach offers a new perspective and hope for improved treatments. Cancers of the ampulla of Vater are rare tumors that comprise only about 0.2% of gastrointestinal cancers. Consequently, they are often treated as either distal common bile duct or pancreatic cancers. Methods: We analyzed DNA from a resected cancer of the ampulla of Vater and whole blood DNA from a 63-year-old man who underwent a pancreaticoduodenectomy by whole genome sequencing, achieving 37× and 40× coverage, respectively. We determined somatic mutations and structural alterations. Results: We identified relevant aberrations, including deleterious mutations of KRAS and SMAD4 as well as a homozygous focal deletion of the PTEN tumor suppressor gene. These findings suggest that these tumors have a distinct oncogenesis from either common bile duct cancer or pancreatic cancer. Furthermore, this combination of genomic aberrations suggests a therapeutic context for dual mTOR/PI3K inhibition. Conclusions: Whole genome sequencing can elucidate an oncogenic context and expose potential therapeutic vulnerabilities in rare cancers.

2012-01-01

35

Sequencing technologies and genome sequencing  

Microsoft Academic Search

High-throughput next-generation sequencing (HT-NGS) technologies are currently the hottest topic in human and animal genomics research; they can produce over 100 times more data than the most sophisticated capillary sequencers based on the Sanger method. With the ongoing development of high-throughput sequencing machines and the advancement of modern bioinformatics tools at an unprecedented pace,

Chandra Shekhar Pareek; Rafal Smoczynski; Andrzej Tretyn

36

Development and validation of a clinical cancer genomic profiling test based on massively parallel DNA sequencing.  

PubMed

As more clinically relevant cancer genes are identified, comprehensive diagnostic approaches are needed to match patients to therapies, raising the challenge of optimization and analytical validation of assays that interrogate millions of bases of cancer genomes altered by multiple mechanisms. Here we describe a test based on massively parallel DNA sequencing to characterize base substitutions, short insertions and deletions (indels), copy number alterations and selected fusions across 287 cancer-related genes from routine formalin-fixed and paraffin-embedded (FFPE) clinical specimens. We implemented a practical validation strategy with reference samples of pooled cell lines that model key determinants of accuracy, including mutant allele frequency, indel length and amplitude of copy change. Test sensitivity achieved was 95-99% across alteration types, with high specificity (positive predictive value >99%). We confirmed accuracy using 249 FFPE cancer specimens characterized by established assays. Application of the test to 2,221 clinical cases revealed clinically actionable alterations in 76% of tumors, three times the number of actionable alterations detected by current diagnostic tests. PMID:24142049

Frampton, Garrett M; Fichtenholtz, Alex; Otto, Geoff A; Wang, Kai; Downing, Sean R; He, Jie; Schnall-Levin, Michael; White, Jared; Sanford, Eric M; An, Peter; Sun, James; Juhn, Frank; Brennan, Kristina; Iwanik, Kiel; Maillet, Ashley; Buell, Jamie; White, Emily; Zhao, Mandy; Balasubramanian, Sohail; Terzic, Selmira; Richards, Tina; Banning, Vera; Garcia, Lazaro; Mahoney, Kristen; Zwirko, Zac; Donahue, Amy; Beltran, Himisha; Mosquera, Juan Miguel; Rubin, Mark A; Dogan, Snjezana; Hedvat, Cyrus V; Berger, Michael F; Pusztai, Lajos; Lechner, Matthias; Boshoff, Chris; Jarosz, Mirna; Vietz, Christine; Parker, Alex; Miller, Vincent A; Ross, Jeffrey S; Curran, John; Cronin, Maureen T; Stephens, Philip J; Lipson, Doron; Yelensky, Roman

2013-11-01
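
The sensitivity and positive predictive value quoted in the record above are both derived from comparing a test's calls with a reference set of known alterations. A minimal sketch of that arithmetic follows; the variant tuples are made-up placeholders, not data from the study, and the function name is hypothetical.

```python
def validation_metrics(truth, called):
    """Return (sensitivity, positive predictive value) for a call set.

    truth  : known alterations in the reference samples
    called : alterations reported by the test
    """
    truth, called = set(truth), set(called)
    true_pos = len(truth & called)
    false_neg = len(truth - called)
    false_pos = len(called - truth)
    sensitivity = true_pos / (true_pos + false_neg) if truth else float("nan")
    ppv = true_pos / (true_pos + false_pos) if called else float("nan")
    return sensitivity, ppv


# Placeholder variants as (chromosome, position, ref, alt) tuples.
truth = {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C"),
         ("chr3", 300, "C", "T"), ("chr4", 400, "T", "G")}
called = {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C"),
          ("chr3", 300, "C", "T"), ("chr9", 900, "A", "G")}
sensitivity, ppv = validation_metrics(truth, called)
```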

37

Whole Genome Sequencing  

MedlinePLUS

... If you do choose to have your whole genome sequenced, it is very important and helpful to review your results with a trained professional. Also, you should make sure the lab is CLIA certified. What do the test results mean? Whole genome sequencing is not your average diagnostic test. A ...

38

The Cancer Genome Atlas - TCGA - Home Page  

Cancer.gov

The Cancer Genome Atlas (TCGA) is a comprehensive and coordinated effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing.

39

Germline variation in cancer-susceptibility genes in a healthy, ancestrally diverse cohort: implications for individual genome sequencing.  

PubMed

Technological advances coupled with decreasing costs are bringing whole genome and whole exome sequencing closer to routine clinical use. One of the hurdles to clinical implementation is the high number of variants of unknown significance. For cancer-susceptibility genes, the difficulty in interpreting the clinical relevance of the genomic variants is compounded by the fact that most of what is known about these variants comes from the study of highly selected populations, such as cancer patients or individuals with a family history of cancer. The genetic variation in known cancer-susceptibility genes in the general population has not been well characterized to date. To address this gap, we profiled the nonsynonymous genomic variation in 158 genes causally implicated in carcinogenesis using high-quality whole genome sequences from an ancestrally diverse cohort of 681 healthy individuals. We found that all individuals carry multiple variants that may impact cancer susceptibility, with an average of 68 variants per individual. Of the 2,688 allelic variants identified within the cohort, most are very rare, with 75% found in only 1 or 2 individuals in our population. Allele frequencies vary between ancestral groups, and there are 21 variants for which the minor allele in one population is the major allele in another. Detailed analysis of a selected subset of 5 clinically important cancer genes, BRCA1, BRCA2, KRAS, TP53, and PTEN, highlights differences between germline variants and reported somatic mutations. The dataset can serve as a resource of genetic variation in cancer-susceptibility genes in 6 ancestry groups, an important foundation for the interpretation of cancer risk from personal genome sequences. PMID:24728327

Bodian, Dale L; McCutcheon, Justine N; Kothiyal, Prachi; Huddleston, Kathi C; Iyer, Ramaswamy K; Vockley, Joseph G; Niederhuber, John E

2014-01-01

40

Malaria Genome Sequencing Project.  

National Technical Information Service (NTIS)

The objectives of this 5-year Cooperative Agreement between TIGR and the Malaria Program, NMRC, were to: Specific Aim 1, sequence 3.5 Mb of P. falciparum genomic DNA; Specific Aim 2, annotate the sequence; Specific Aim 3, release the information to the sc...

M. J. Gardner

2002-01-01

41

Malaria Genome Sequencing Project.  

National Technical Information Service (NTIS)

The objectives of this Cooperative Agreement were: Specific Aim 1, sequence 3.5 Mb of P. falciparum genomic DNA; Specific Aim 2, annotate the sequence; Specific Aim 3, release the information to the scientific community. Two Specific Aims were added to th...

M. J. Gardner

2004-01-01

42

Malaria Genome Sequencing Project.  

National Technical Information Service (NTIS)

The objectives of this 5-year Cooperative Agreement between TIGR and the Malaria Program, NMRC, were to: Specific Aim 1, sequence 3.5 Mb of P. falciparum genomic DNA; Specific Aim 2, annotate the sequence; Specific Aim 3, release the information to the sc...

M. J. Gardner

2000-01-01

43

Malaria Genome Sequencing Project.  

National Technical Information Service (NTIS)

The objectives of this 5-year Cooperative Agreement between TIGR and the Malaria Program, NMRC, were to: (Specific Aim 1) sequence 3.5 Mb of P. falciparum genomic DNA; (Specific Aim 2) annotate the sequence; (Specific Aim 3) release the information to the...

M. J. Gardner

2003-01-01

44

Malaria Genome Sequencing Project.  

National Technical Information Service (NTIS)

The objectives of this 5-year Cooperative Agreement between TIGR and the Malaria Program, NMRC, were to: Specific Aim 1, sequence 3.5 Mb of P. falciparum genomic DNA; Specific Aim 2, annotate the sequence; Specific Aim 3, release the information to the sc...

M. J. Gardner

2001-01-01

45

Bacterial genome sequencing.  

PubMed

For over 30 yr, the Sanger method has been the standard for DNA sequencing. Instruments have been developed and improved over time to increase throughput, but they always relied on the same technology. Today, we are facing a revolution in DNA sequencing with many drastically different platforms that have become or will soon become available on the market. We review a number of sequencing technologies and provide examples of applications. We also discuss the impact genomics and new DNA sequencing approaches have had on various fields of biological research. PMID:19521879

Tettelin, Hervé; Feldblyum, Tamara

2009-01-01

46

Prenatal Whole Genome Sequencing  

PubMed Central

With whole genome sequencing set to become the preferred method of prenatal screening, we need to pay more attention to the massive amount of information it will deliver to parents—and the fact that we don't yet understand what most of it means.

Donley, Greer; Hull, Sara Chandros; Berkman, Benjamin E.

2014-01-01

47

A study based on whole-genome sequencing yields a rare variant at 8q24 associated with prostate cancer  

PubMed Central

In Western countries, prostate cancer is the most prevalent cancer of men, and one of the leading causes of cancer-related death in men. Several genome-wide association studies have yielded numerous common variants conferring risk of prostate cancer. In the present study we analyzed 32.5 million variants discovered by whole-genome sequencing 1,795 Icelanders. One variant was found to be associated with prostate cancer in European populations: rs188140481[A] (OR = 2.90, Pcomb = 6.2×10^-34) located on 8q24, with an average risk allele control frequency of 0.54%. This variant is only very weakly correlated (r² ≤ 0.06) with previously reported risk variants on 8q24, and remains significant after adjustment for all of them. Carriers of rs188140481[A] were diagnosed with prostate cancer 1.26 years younger than non-carriers (P = 0.0059). We also report results for the previously described HOXB13 mutation (rs138213197[T]), confirming it as a prostate cancer risk variant in populations from all over Europe.

Gudmundsson, Julius; Sulem, Patrick; Gudbjartsson, Daniel F.; Masson, Gisli; Agnarsson, Bjarni A.; Benediktsdottir, Kristrun R.; Sigurdsson, Asgeir; Magnusson, Olafur Th.; Gudjonsson, Sigurjon A.; Magnusdottir, Droplaug N.; Johannsdottir, Hrefna; Helgadottir, Hafdis Th.; Stacey, Simon N.; Jonasdottir, Adalbjorg; Olafsdottir, Stefania B.; Thorleifsson, Gudmar; Jonasson, Jon G.; Tryggvadottir, Laufey; Navarrete, Sebastian; Fuertes, Fernando; Helfand, Brian T.; Hu, Qiaoyan; Csiki, Irma E.; Mates, Ioan N.; Jinga, Viorel; Aben, Katja K. H.; van Oort, Inge M.; Vermeulen, Sita H.; Donovan, Jenny L.; Hamdy, Freddy C.; Ng, Chi-Fai; Chiu, Peter K.F.; Lau, Kin-Mang; Ng, Maggie C.Y.; Gulcher, Jeffrey R.; Kong, Augustine; Catalona, William J.; Mayordomo, Jose I.; Einarsson, Gudmundur V.; Barkardottir, Rosa B.; Jonsson, Eirikur; Mates, Dana; Neal, David E.; Kiemeney, Lambertus A.; Thorsteinsdottir, Unnur; Rafnar, Thorunn; Stefansson, Kari

2013-01-01

48

Cancer epigenomics: beyond genomics.  

PubMed

For many years cancer research has focused on genetic defects, but during the last decade epigenetic deregulation has been increasingly recognized as a hallmark of cancer. The advent of genome-scale analysis techniques, including the recently developed next-generation sequencing, has enabled invaluable advances in understanding the molecular mechanisms underlying tumor initiation, progression, and expansion. In this review we describe recent advances in the field of cancer epigenomics concerning DNA methylation, histone modifications, and miRNAs. In the near future, this information will be used to generate novel biomarkers of relevance to diagnosis, prognosis, and chemotherapeutic response. PMID:22402447

Sandoval, Juan; Esteller, Manel

2012-02-01

49

From human genome to cancer genome: The first decade  

PubMed Central

The realization that cancer progression required the participation of cellular genes provided one of several key rationales, in 1986, for embarking on the human genome project. Only with a reference genome sequence could the full spectrum of somatic changes leading to cancer be understood. Since its completion in 2003, the human reference genome sequence has fulfilled its promise as a foundational tool to illuminate the pathogenesis of cancer. Herein, we review the key historical milestones in cancer genomics since the completion of the genome, and some of the novel discoveries that are shaping our current understanding of cancer.

Wheeler, David A.; Wang, Linghua

2013-01-01

50

A whole-genome massively parallel sequencing analysis of BRCA1 mutant oestrogen receptor-negative and -positive breast cancers.  

PubMed

BRCA1 encodes a tumour suppressor protein that plays pivotal roles in homologous recombination (HR) DNA repair, cell-cycle checkpoints, and transcriptional regulation. BRCA1 germline mutations confer a high risk of early-onset breast and ovarian cancer. In more than 80% of cases, tumours arising in BRCA1 germline mutation carriers are oestrogen receptor (ER)-negative; however, up to 15% are ER-positive. It has been suggested that BRCA1 ER-positive breast cancers constitute sporadic cancers arising in the context of a BRCA1 germline mutation rather than being causally related to BRCA1 loss-of-function. Whole-genome massively parallel sequencing of ER-positive and ER-negative BRCA1 breast cancers, and their respective germline DNAs, was used to characterize the genetic landscape of BRCA1 cancers at base-pair resolution. Only BRCA1 germline mutations, somatic loss of the wild-type allele, and TP53 somatic mutations were recurrently found in the index cases. BRCA1 breast cancers displayed a mutational signature consistent with that caused by lack of HR DNA repair in both ER-positive and ER-negative cases. Sequencing analysis of independent cohorts of hereditary BRCA1 and sporadic non-BRCA1 breast cancers for the presence of recurrent pathogenic mutations and/or homozygous deletions found in the index cases revealed that DAPK3, TMEM135, KIAA1797, PDE4D, and GATA4 are potential additional drivers of breast cancers. This study demonstrates that BRCA1 pathogenic germline mutations coupled with somatic loss of the wild-type allele are not sufficient for hereditary breast cancers to display an ER-negative phenotype, and has led to the identification of three potential novel breast cancer genes (ie DAPK3, TMEM135, and GATA4). PMID:22362584

Natrajan, Rachael; Mackay, Alan; Lambros, Maryou B; Weigelt, Britta; Wilkerson, Paul M; Manie, Elodie; Grigoriadis, Anita; A'hern, Roger; van der Groep, Petra; Kozarewa, Iwanka; Popova, Tatiana; Mariani, Odette; Turajlic, Samra; Furney, Simon J; Marais, Richard; Rodruigues, Daniel-Nava; Flora, Adriana C; Wai, Patty; Pawar, Vidya; McDade, Simon; Carroll, Jason; Stoppa-Lyonnet, Dominique; Green, Andrew R; Ellis, Ian O; Swanton, Charles; van Diest, Paul; Delattre, Olivier; Lord, Christopher J; Foulkes, William D; Vincent-Salomon, Anne; Ashworth, Alan; Henri Stern, Marc; Reis-Filho, Jorge S

2012-05-01

51

Computational methods for detecting copy number variations in cancer genome using next generation sequencing: principles and challenges  

PubMed Central

Accurate detection of somatic copy number variations (CNVs) is an essential part of cancer genome analysis, and plays an important role in oncotarget identifications. Next generation sequencing (NGS) holds the promise to revolutionize somatic CNV detection. In this review, we provide an overview of current analytic tools used for CNV detection in NGS-based cancer studies. We summarize the NGS data types used for CNV detection, decipher the principles for data preprocessing, segmentation, and interpretation, and discuss the challenges in somatic CNV detection. This review aims to provide a guide to the analytic tools used in NGS-based cancer CNV studies, and to discuss the important factors that researchers need to consider when analyzing NGS data for somatic CNV detections.

Liu, Biao; Morrison, Carl D.; Johnson, Candace S.; Trump, Donald L.; Qin, Maochun; Conroy, Jeffrey C.; Wang, Jianmin; Liu, Song

2013-01-01

52

The Cancer Genome Atlas completes detailed ovarian cancer analysis  

Cancer.gov

An analysis of genomic changes in ovarian cancer has provided the most comprehensive and integrated view of cancer genes for any cancer type to date. Ovarian serous adenocarcinoma tumors from 500 patients were examined by The Cancer Genome Atlas (TCGA) Research Network. TCGA researchers completed whole-exome sequencing, which examines the protein-coding regions of the genome, on an unprecedented 316 tumors.

53

Understanding Cancer Series: Cancer Genomics  

MedlinePLUS

... Cancer Statistics Research & Funding News About NCI Understanding Cancer Series Posted: 01/28/2005 Reviewed: 09/01/ ... Dictionary Search for Clinical Trials NCI Publications Español Cancer Genomics Slide Number and Title What Is the ...

54

A Streamlined Method for Detecting Structural Variants in Cancer Genomes by Short Read Paired-End Sequencing  

PubMed Central

Defining the architecture of a specific cancer genome, including its structural variants, is essential for understanding tumor biology, mechanisms of oncogenesis, and for designing effective personalized therapies. Short read paired-end sequencing is currently the most sensitive method for detecting somatic mutations that arise during tumor development. However, mapping structural variants using this method leads to a large number of false positive calls, mostly due to the repetitive nature of the genome and the difficulty of assigning correct mapping positions to short reads. This study describes a method to efficiently identify large tumor-specific deletions, inversions, duplications and translocations from low coverage data using SVDetect or BreakDancer software and a set of novel filtering procedures designed to reduce false positive calls. Applying our method to a spontaneous T cell lymphoma arising in a core RAG2/p53-deficient mouse, we identified 40 validated tumor-specific structural rearrangements supported by as few as 2 independent read pairs.

Mijuskovic, Martina; Brown, Stuart M.; Tang, Zuojian; Lindsay, Cory R.; Efstathiadis, Efstratios; Deriano, Ludovic; Roth, David B.

2012-01-01
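
As a rough illustration of the kind of filtering described in the abstract above, the sketch below keeps candidate rearrangements that have at least two supporting read pairs in the tumor and no nearby candidate of the same kind in the matched normal. It is not the authors' procedure: the `Candidate` record, the 500 bp window, and the function names are hypothetical, and the candidates are assumed to come from an upstream caller.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    kind: str      # e.g. "deletion", "inversion", "translocation"
    chrom_a: str
    pos_a: int
    chrom_b: str
    pos_b: int
    support: int   # number of independent discordant read pairs

def near(a, b, window):
    """True if two candidates have the same kind and both breakpoints within `window` bp."""
    return (a.kind == b.kind
            and a.chrom_a == b.chrom_a and a.chrom_b == b.chrom_b
            and abs(a.pos_a - b.pos_a) <= window
            and abs(a.pos_b - b.pos_b) <= window)

def tumor_specific(tumor, normal, min_support=2, window=500):
    """Keep tumor candidates with enough support and no match in the normal sample."""
    return [t for t in tumor
            if t.support >= min_support
            and not any(near(t, n, window) for n in normal)]

tumor_calls = [
    Candidate("deletion", "chr1", 10_000, "chr1", 60_000, 3),
    Candidate("deletion", "chr2", 5_000, "chr2", 9_000, 4),   # also seen in normal
    Candidate("inversion", "chr3", 7_000, "chr3", 8_000, 1),  # too little support
]
normal_calls = [Candidate("deletion", "chr2", 5_100, "chr2", 9_050, 2)]
kept = tumor_specific(tumor_calls, normal_calls)
```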

55

Exploration of liver cancer genomes.  

PubMed

Liver cancer is the third leading cause of cancer-related death worldwide. Advances in sequencing technologies have enabled the examination of liver cancer genomes at high resolution; somatic mutations, structural alterations, HBV integration, RNA editing and retrotransposon changes have been comprehensively identified. Furthermore, integrated analyses of trans-omics data (genome, transcriptome and methylome data) have identified multiple critical genes and pathways implicated in hepatocarcinogenesis. These analyses have uncovered potential therapeutic targets, including growth factor signalling, WNT signalling, the NFE2L2-mediated oxidative pathway and chromatin modifying factors, and paved the way for new molecular classifications for clinical application. The aetiological factors associated with liver cancer are well understood; however, their effects on the accumulation of somatic changes and the influence of ethnic variation in risk factors still remain unknown. The international collaborations of cancer genome sequencing projects are expected to contribute to an improved understanding of risk evaluation, diagnosis and therapy for this cancer. PMID:24473361

Shibata, Tatsuhiro; Aburatani, Hiroyuki

2014-06-01

56

Evolution of the cancer genome  

PubMed Central

The advent of massively parallel sequencing technologies has allowed the characterization of cancer genomes at an unprecedented resolution. Investigation of the mutational landscape of tumours is providing new insights into cancer genome evolution, laying bare the interplay of somatic mutation, adaptation of clones to their environment and natural selection. These studies have demonstrated the extent of the heterogeneity of cancer genomes, have allowed inferences to be made about the forces that act on nascent cancer clones as they evolve and have shown insight into the mutational processes that generate genetic variation. Here we review our emerging understanding of the dynamic evolution of the cancer genome and of the implications for basic cancer biology and the development of antitumour therapy.

Yates, Lucy R.; Campbell, Peter J.

2013-01-01

57

Enabling clinical cancer genomics for rare mutations: COLD-PCR magnifies mutations prior to targeted amplicon re-sequencing  

PubMed Central

Despite widespread interest in the application of next-generation-sequencing (NGS) to the mutation profiling of individual cancer specimens, the onset of personalized clinical genomics is currently stalled due in part to technical hurdles. As tumors are genetically-heterogeneous and often mixed with normal/stromal cells, the resulting low-abundance DNA somatic mutations often produce ambiguous results or fall below the current NGS detection limit, thus hindering mutation calling that abides by clinical sensitivity/specificity standards. Here we examine the feasibility of applying COLD-PCR, a form of PCR that selectively magnifies mutations, to boost the detection of unknown rare somatic mutations prior to applying NGS-based amplicon re-sequencing to clinical samples. We amplified DNA from serially-diluted mutation-containing human cell-lines into wild-type (WT) DNA, as well as lung adenocarcinoma and colorectal cancer specimens using COLD-PCR or conventional PCR for comparison. Following individual amplification of TP53, KRAS, IDH1, and EGFR regions, PCR products were barcoded, pooled for library preparation and sequenced on the Illumina-HiSeq2000 platform. Regardless of sequencing depth, sequencing errors dictated a mutation-detection limit of ~1–2% mutation abundance in conventional PCR amplicons analyzed by NGS. In contrast, COLD-PCR amplicons enabled genuine mutations to exceed the sequence noise levels, thus allowing reliable identification of mutation abundances of ~0.04%. Sequencing depth was not a significant factor in the identification of COLD-PCR-magnified mutations. The analyzed clinical specimens revealed several TP53 and KRAS missense mutations that could not be called following NGS of conventional amplicons, yet were clearly detectable in COLD-PCR amplicons. Extensive tumor heterogeneity in the TP53 gene was revealed in some samples. 
As cancer care shifts toward personalized intervention, based on the unique genetic abnormalities in each patient’s tumor genome, we anticipate that COLD-PCR-NGS will elucidate the role of rare mutations in tumors and enable NGS-based analysis of diverse clinical specimens and the broad interfacing of NGS with clinical practice.

Milbury, Coren A.; Correll, Mick; Quackenbush, John; Rubio, Renee; Makrigiorgos, G. Mike

2012-01-01

58

Genome Sequence Databases (Overview): Sequencing and Assembly  

SciTech Connect

From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three dimensional structure of the DNA molecule, biologists have tried to decode the secrets of life hidden within these long molecules, and technologists have invented and improved methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different length (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Despite the fact that automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases, it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents the genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (~1 error/10,000 bp), validated through a number of computer and laboratory experiments.

Lapidus, Alla L.

2009-01-01
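
To put the quoted draft and finished error rates in perspective, here is a small worked calculation. The 5 Mb genome size is a hypothetical value chosen for illustration and the function names are not from the source; only the rates (1 error per 2,000 bp, about 1 error per 10,000 bp) and the 98% coverage figure come from the abstract above.

```python
def expected_errors(genome_bp, bp_per_error):
    """Expected number of consensus errors at a rate of one error per `bp_per_error` bases."""
    return genome_bp / bp_per_error

def uncovered_bp(genome_bp, covered_percent):
    """Bases left uncovered when `covered_percent` of the genome is assembled."""
    return genome_bp * (100 - covered_percent) // 100

genome_bp = 5_000_000  # hypothetical 5 Mb microbial genome

draft_errors = expected_errors(genome_bp, 2_000)      # draft: 1 error / 2,000 bp
finished_errors = expected_errors(genome_bp, 10_000)  # finished: ~1 error / 10,000 bp
draft_gap_bp = uncovered_bp(genome_bp, 98)            # draft covers up to 98%

print(draft_errors, finished_errors, draft_gap_bp)
```

Under these assumptions a draft of this size would carry about 2,500 consensus errors and at least 100,000 uncovered bases, against roughly 500 errors and no gaps for a finished genome.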

59

Fungal Genome Sequencing and Bioenergy  

SciTech Connect

To date, the number of ongoing filamentous fungal genome sequencing projects is almost tenfold fewer than those of bacterial and archaeal genome projects. The fungi chosen for sequencing represent narrow kingdom diversity; most are pathogens or models. We advocate an ambitious, forward-looking phylogenetic-based genome sequencing program, designed to capture metabolic diversity within the fungal kingdom, thereby enhancing research into alternative bioenergy sources, bioremediation, and fungal-environment interactions.

Schadt, Christopher Warren [ORNL]; Baker, Scott [Pacific Northwest National Laboratory (PNNL)]; Thykaer, Jette [Pacific Northwest National Laboratory (PNNL)]; Adney, William S [National Renewable Energy Laboratory (NREL)]; Brettin, Tom [Los Alamos National Laboratory (LANL)]; Brockman, Fred [Pacific Northwest National Laboratory (PNNL)]; Dhaeseleer, Patrick [Lawrence Livermore National Laboratory (LLNL)]; Martinez, A Diego [Los Alamos National Laboratory (LANL)]; Miller, R Michael [Argonne National Laboratory (ANL)]; Rokhsar, Daniel [U.S. Department of Energy, Joint Genome Institute]; Torok, Tamas [U.S. Department of Energy, Joint Genome Institute]; Tuskan, Gerald A [ORNL]; Bennett, Joan [Rutgers University]; Berka, Randy [Novozymes, Inc]; Briggs, Steven [University of California, San Diego]; Heitman, Joseph [Duke University]; Rizvi, L [Royal Ontario Museum]; Taylor, John [University of California, Berkeley]; Turgeon, Gillian [Cornell University]; Werner-Washburne, Maggie [University of New Mexico, Albuquerque]; Himmel, Michael [ORNL]

2008-01-01

60

Fungal Genome Sequencing and Bioenergy  

SciTech Connect

To date, the number of ongoing filamentous fungal genome sequencing projects is almost tenfold fewer than those of bacterial and archaeal genome projects. The fungi chosen for sequencing represent narrow kingdom diversity; most are pathogens or models. We advocate an ambitious, forward-looking phylogenetic-based genome sequencing program, designed to capture metabolic diversity within the fungal kingdom, thereby enhancing research into alternative bioenergy sources, bioremediation, and fungal-environment interactions. Published by Elsevier Ltd on behalf of The British Mycological Society.

Baker, Scott [Pacific Northwest National Laboratory (PNNL)]; Thykaer, Jette [Pacific Northwest National Laboratory (PNNL)]; Adney, William S [National Renewable Energy Laboratory (NREL)]; Brettin, Tom [Los Alamos National Laboratory (LANL)]; Brockman, Fred [Pacific Northwest National Laboratory (PNNL)]; Dhaeseleer, Patrick [Lawrence Livermore National Laboratory (LLNL)]; Martinez, A Diego [Los Alamos National Laboratory (LANL)]; Miller, R Michael [Argonne National Laboratory (ANL)]; Rokhsar, Daniel [U.S. Department of Energy, Joint Genome Institute]; Schadt, Christopher Warren [ORNL]; Torok, Tamas [U.S. Department of Energy, Joint Genome Institute]; Tuskan, Gerald A [ORNL]; Bennett, Joan [Rutgers University]; Berka, Randy [Novozymes, Inc]; Briggs, Steven [University of California, San Diego]; Heitman, Joseph [Duke University]; Taylor, John [University of California, Berkeley]; Turgeon, Gillian [Cornell University]; Werner-Washburne, Maggie [University of New Mexico, Albuquerque]; Himmel, Michael E [National Renewable Energy Laboratory (NREL)]

2008-01-01

61

Pilot genome-wide study of tandem 3′ UTRs in esophageal cancer using high-throughput sequencing  

PubMed Central

Regulatory regions within the 3′ untranslated region (UTR) influence polyadenylation (polyA), translation efficiency, localization and stability of mRNA. Alternative polyA (APA) has been considered to have a key role in gene regulation since 2008. Esophageal carcinoma is the eighth most common type of cancer worldwide. The association between polyA and disease highlights the requirement for comprehensive characterization of genome-wide polyA profiles. In the present study, global polyA profiles were established using the sequencing APA sites (SAPAS) method in order to elucidate the interrelation between 3′ UTR length and the development of esophageal cancer. PolyA profiles were analyzed in squamous cell carcinoma, with ~903 genes identified to have shortened 3′ UTRs and 917 genes identified to use distal polyA sites. The genes with shortened 3′ UTRs were primarily associated with adherens junctions and the cell cycle. Four differentially expressed genes were also found, among which three genes were observed to be upregulated in cancerous tissue and involved in the positive regulation of cell motion, migration and locomotion. One gene was found to be downregulated in cancerous tissue, and associated with oxidative phosphorylation. These findings suggest that esophagitis may have a key role in the development of esophageal carcinoma. Furthermore, the genes with tandem 3′ UTRs and differential expression identified in the present study may have the potential to be used as biomarkers for the diagnosis and prognosis of esophageal cancer.

Sun, Mingzhong; Ju, Huixiang; Zhou, Zhongwei; Zhu, Rong

2014-01-01

62

The identification of a novel TP53 cancer susceptibility mutation through whole genome sequencing of a patient with therapy-related AML  

PubMed Central

Context: The identification of patients with inherited cancer susceptibility syndromes facilitates early diagnosis, prevention, and treatment. However, in many cases of suspected cancer susceptibility, the family history is unclear and genetic testing of common cancer susceptibility genes is unrevealing. Objective: To apply whole-genome sequencing to a patient with suspected cancer susceptibility (and lacking a clear family history of cancer and no BRCA1 and BRCA2 mutations) to identify rare or novel germline variants in cancer susceptibility genes. Design, Setting, and Participant: Skin (normal) and bone marrow (leukemia) DNA were obtained from a patient with early-onset breast and ovarian cancer and therapy-related acute myeloid leukemia (t-AML), and analyzed with: 1) whole genome sequencing using paired end reads; 2) SNP genotyping; 3) RNA expression profiling; and 4) spectral karyotyping. Main Outcome Measures: Structural variants, copy number alterations, single nucleotide variants and small insertions and deletions (indels) were detected and validated using the above platforms. Results: Whole genome sequencing revealed a novel, heterozygous 3 Kb deletion removing exons 7-9 of TP53 in the patient’s normal skin DNA, which was homozygous in the leukemia DNA as a result of uniparental disomy. In addition, a total of 28 validated somatic single nucleotide variations or indels in coding genes, 8 somatic structural variants, and 12 somatic copy number alterations were detected in the patient’s leukemia genome. Conclusions: Whole genome sequencing can identify novel, cryptic variants in cancer susceptibility genes in addition to providing unbiased information on the spectrum of mutations in a cancer genome.

Link, Daniel C.; Schuettpelz, Laura G.; Shen, Dong; Wang, Jinling; Walter, Matthew J.; Kulkarni, Shashikant; Payton, Jacqueline E.; Ivanovich, Jennifer; Goodfellow, Paul J.; Le Beau, Michelle; Koboldt, Daniel C.; Dooling, David J.; Fulton, Robert S.; Bender, R. Hugh F.; Fulton, Lucinda L.; Delehaunty, Kimberly D.; Fronick, Catrina C.; Appelbaum, Elizabeth L.; Schmidt, Heather; Abbott, Rachel; O'Laughlin, Michelle; Chen, Ken; McLellan, Michael D.; Varghese, Nobish; Nagarajan, Rakesh; Heath, Sharon; Graubert, Timothy A.; Ding, Li; Ley, Timothy J.; Zambetti, Gerard P.; Wilson, Richard K.; Mardis, Elaine R.

2011-01-01

63

Genome-wide analysis of aberrant methylation in human breast cancer cells using methyl-DNA immunoprecipitation combined with high-throughput sequencing  

Microsoft Academic Search

BACKGROUND: Cancer cells undergo massive alterations to their DNA methylation patterns that result in aberrant gene expression and malignant phenotypes. However, the mechanisms that underlie methylome changes are not well understood nor is the genomic distribution of DNA methylation changes well characterized. RESULTS: Here, we performed methylated DNA immunoprecipitation combined with high-throughput sequencing (MeDIP-seq) to obtain whole-genome DNA methylation profiles

Yoshinao Ruike; Yukako Imanaka; Fumiaki Sato; Kazuharu Shimizu; Gozoh Tsujimoto

2010-01-01

64

Whole-genome sequencing of bladder cancers reveals somatic CDKN1A mutations and clinicopathological associations with mutation burden.  

PubMed

Bladder cancers are a leading cause of death from malignancy. Molecular markers might predict disease progression and behaviour more accurately than the available prognostic factors. Here we use whole-genome sequencing to identify somatic mutations and chromosomal changes in 14 bladder cancers of different grades and stages. As well as detecting the known bladder cancer driver mutations, we report the identification of recurrent protein-inactivating mutations in CDKN1A and FAT1. The former are not mutually exclusive with TP53 mutations or MDM2 amplification, showing that CDKN1A dysfunction is not simply an alternative mechanism for p53 pathway inactivation. We find strong positive associations between higher tumour stage/grade and greater clonal diversity, the number of somatic mutations and the burden of copy number changes. In principle, the identification of sub-clones with greater diversity and/or mutation burden within early-stage or low-grade tumours could identify lesions with a high risk of invasive progression. PMID:24777035

Cazier, J-B; Rao, S R; McLean, C M; Walker, A L; Wright, B J; Jaeger, E E M; Kartsonaki, C; Marsden, L; Yau, C; Camps, C; Kaisaki, P; Taylor, J; Catto, J W; Tomlinson, I P M; Kiltie, A E; Hamdy, F C

2014-01-01

65

Whole-genome sequencing of bladder cancers reveals somatic CDKN1A mutations and clinicopathological associations with mutation burden  

PubMed Central

Bladder cancers are a leading cause of death from malignancy. Molecular markers might predict disease progression and behaviour more accurately than the available prognostic factors. Here we use whole-genome sequencing to identify somatic mutations and chromosomal changes in 14 bladder cancers of different grades and stages. As well as detecting the known bladder cancer driver mutations, we report the identification of recurrent protein-inactivating mutations in CDKN1A and FAT1. The former are not mutually exclusive with TP53 mutations or MDM2 amplification, showing that CDKN1A dysfunction is not simply an alternative mechanism for p53 pathway inactivation. We find strong positive associations between higher tumour stage/grade and greater clonal diversity, the number of somatic mutations and the burden of copy number changes. In principle, the identification of sub-clones with greater diversity and/or mutation burden within early-stage or low-grade tumours could identify lesions with a high risk of invasive progression.

Cazier, J.-B.; Rao, S.R.; McLean, C.M.; Walker, A.L.; Wright, B.J.; Jaeger, E.E.M.; Kartsonaki, C.; Marsden, L.; Yau, C.; Camps, C.; Kaisaki, P.; Allan, Christopher; Attar, Moustafa; Bell, John; Bentley, David; Broxholme, John; Buck, David; Cazier, Jean-Baptiste; Copley, Richard; Cornall, Richard; Donnelly, Peter; Fiddy, Simon; Green, Angie; Gregory, Lorna; Grocock, Russell; Hatton, Edouard; Holmes, Chris; Hughes, Linda; Humburg, Peter; Humphray, Sean; Kanapin, Alexander; Kingsbury, Zoya; Knight, Julian; Lamble, Sarah; Lise, Stefano; Lonie, Lorne; Lunter, Gerton; Martin, Hilary; Murray, Lisa; McCarthy, Davis; McVean, Gil; Pagnamenta, Alistair; Piazza, Paolo; Polanco, Guadelupe; Ratcliffe, Peter; Rimmer, Andy; Sahgal, Natasha; Taylor, Jenny; Tomlinson, Ian; Trebes, Amy; Wilkie, Andrew; Wright, Ben; Yau, Chris; Taylor, J.; Catto, J.W.; Tomlinson, I.P.M.; Kiltie, A.E.; Hamdy, F.C.

2014-01-01

66

Whole-exome/genome sequencing and genomics.  

PubMed

As medical genetics has progressed from a descriptive entity to one focused on the functional relationship between genes and clinical disorders, emphasis has been placed on genomics. Genomics, a subelement of genetics, is the study of the genome, the sum total of all the genes of an organism. The human genome, which is contained in the 23 pairs of nuclear chromosomes and in the mitochondrial DNA of each cell, comprises >6 billion nucleotides of genetic code. There are some 23,000 protein-coding genes, a surprisingly small fraction of the total genetic material, with the remainder composed of noncoding DNA, regulatory sequences, and introns. The Human Genome Project, launched in 1990, produced a draft of the genome in 2001 and then a finished sequence in 2003, on the 50th anniversary of the initial publication of Watson and Crick's paper on the double-helical structure of DNA. Since then, this mass of genetic information has been translated at an ever-increasing pace into useable knowledge applicable to clinical medicine. The recent advent of massively parallel DNA sequencing (also known as shotgun, high-throughput, and next-generation sequencing) has brought whole-genome analysis into the clinic for the first time, and most of the current applications are directed at children with congenital conditions that are undiagnosable by using standard genetic tests for single-gene disorders. Thus, pediatricians must become familiar with this technology, what it can and cannot offer, and its technical and ethical challenges. Here, we address the concepts of human genomic analysis and its clinical applicability for primary care providers. PMID:24298129

Grody, Wayne W; Thompson, Barry H; Hudgins, Louanne

2013-12-01

67

Mutation Discovery in Regions of Segmental Cancer Genome Amplifications with CoNAn-SNV: A Mixture Model for Next Generation Sequencing of Tumors  

PubMed Central

Next generation sequencing has now enabled a cost-effective enumeration of the full mutational complement of a tumor genome—in particular single nucleotide variants (SNVs). Most current computational and statistical models for analyzing next generation sequencing data, however, do not account for cancer-specific biological properties, including somatic segmental copy number alterations (CNAs)—which require special treatment of the data. Here we present CoNAn-SNV (Copy Number Annotated SNV): a novel algorithm for the inference of single nucleotide variants (SNVs) that overlap copy number alterations. The method is based on modelling the notion that genomic regions of segmental duplication and amplification induce an extended genotype space where a subset of genotypes will exhibit heavily skewed allelic distributions in SNVs (and therefore render them undetectable by methods that assume diploidy). We introduce the concept of modelling allelic counts from sequencing data using a panel of Binomial mixture models where the number of mixtures for a given locus in the genome is informed by a discrete copy number state given as input. We applied CoNAn-SNV to a previously published whole genome shotgun data set obtained from a lobular breast cancer and show that it is able to discover 21 experimentally revalidated somatic non-synonymous mutations in a lobular breast cancer genome that were not detected using copy number insensitive SNV detection algorithms. Importantly, ROC analysis shows that the increased sensitivity of CoNAn-SNV does not result in disproportionate loss of specificity. This was also supported by analysis of a recently published lymphoma genome with a relatively quiescent karyotype, where CoNAn-SNV showed similar results to other callers except in regions of copy number gain where increased sensitivity was conferred. 
Our results indicate that in genomically unstable tumors, copy number annotation for SNV detection will be critical to fully characterize the mutational landscape of cancer genomes.

Crisan, Anamaria; Goya, Rodrigo; Ha, Gavin; Ding, Jiarui; Prentice, Leah M.; Oloumi, Arusha; Senz, Janine; Zeng, Thomas; Tse, Kane; Delaney, Allen; Marra, Marco A.; Huntsman, David G.; Hirst, Martin; Aparicio, Sam; Shah, Sohrab

2012-01-01
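
The idea of a copy-number-informed binomial mixture, as described in the abstract above, can be sketched as follows. This is not the CoNAn-SNV implementation: the uniform prior over genotypes, the fixed 1% error rate, and the function name are assumptions made for illustration. For a locus with copy number c, the genotype is taken to be the number k of variant copies (0 to c), each giving an expected variant allele fraction near k/c.

```python
from math import comb

def genotype_posteriors(alt_reads, depth, copy_number, error_rate=0.01):
    """Posterior over k = 0..copy_number variant copies, assuming a uniform prior.

    Each genotype k is one binomial component whose success probability is
    k/copy_number, adjusted by an assumed symmetric sequencing error rate.
    """
    if copy_number < 1:
        raise ValueError("copy_number must be at least 1")
    likelihoods = []
    for k in range(copy_number + 1):
        frac = k / copy_number
        # Error mixing keeps k=0 and k=copy_number from having zero likelihood.
        p = frac * (1 - error_rate) + (1 - frac) * error_rate
        likelihoods.append(
            comb(depth, alt_reads) * p ** alt_reads * (1 - p) ** (depth - alt_reads)
        )
    total = sum(likelihoods)
    return [lik / total for lik in likelihoods]

# 5 variant reads out of 40: a skewed allelic fraction of 12.5%.
diploid = genotype_posteriors(5, 40, copy_number=2)
amplified = genotype_posteriors(5, 40, copy_number=4)

prob_variant_diploid = 1 - diploid[0]
prob_variant_amplified = 1 - amplified[0]
```

With these hypothetical counts, a model that assumes two copies attributes the five variant reads mostly to sequencing error, while the same counts at copy number 4 are best explained by one variant copy out of four. This is the kind of skewed-allele case that the abstract says methods assuming diploidy fail to detect.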

68

The Trichomonas vaginalis Genome Sequencing Project  

NSDL National Science Digital Library

The Institute for Genomic Research (TIGR) in 2003 released the first draft assembly of the Trichomonas vaginalis genome, available through this website to the academic and not-for-profit research community for noncommercial use only. TIGR will release more data at regular intervals during the sequencing project, which should help researchers better understand this widespread parasite and its role in HIV infection, neo-natal disorders, predisposition to cervical cancer, and of course, vaginitis. The website also includes background information on T. vaginalis, as well as a link to TIGR's sequencing project for Entamoeba histolytica -- a closely related organism.

69

Whole-genome sequences of DA and F344 rats with different susceptibilities to arthritis, autoimmunity, inflammation and cancer.  

PubMed

DA (D-blood group of Palm and Agouti, also known as Dark Agouti) and F344 (Fischer) are two inbred rat strains with differences in several phenotypes, including susceptibility to autoimmune disease models and inflammatory responses. While these strains have been extensively studied, little information is available about the DA and F344 genomes, as only the Brown Norway (BN) and spontaneously hypertensive rat strains have been sequenced to date. Here we report the sequencing of the DA and F344 genomes using next-generation Illumina paired-end read technology and the first de novo assembly of a rat genome. DA and F344 were sequenced with an average depth of 32-fold, covered 98.9% of the BN reference genome, and included 97.97% of known rat ESTs. New sequences could be assigned to 59 million positions with previously unknown data in the BN reference genome. Differences between DA, F344, and BN included 19 million positions in novel scaffolds, 4.09 million single nucleotide polymorphisms (SNPs) (including 1.37 million new SNPs), 458,224 short insertions and deletions, and 58,174 structural variants. Genetic differences between DA, F344, and BN, including high-impact SNPs and short insertions and deletions affecting >2500 genes, are likely to account for most of the phenotypic variation between these strains. The new DA and F344 genome sequencing data should facilitate gene discovery efforts in rat models of human disease. PMID:23695301

Guo, Xiaosen; Brenner, Max; Zhang, Xuemei; Laragione, Teresina; Tai, Shuaishuai; Li, Yanhong; Bu, Junjie; Yin, Ye; Shah, Anish A; Kwan, Kevin; Li, Yingrui; Jun, Wang; Gulko, Pércio S

2013-08-01

70

Testing personalized medicine: patient and physician expectations of next-generation genomic sequencing in late-stage cancer care.  

PubMed

Developments in genomics, including next-generation sequencing technologies, are expected to enable a more personalized approach to clinical care, with improved risk stratification and treatment selection. In oncology, personalized medicine is particularly advanced and increasingly used to identify oncogenic variants in tumor tissue that predict responsiveness to specific drugs. Yet, the translational research needed to validate these technologies will be conducted in patients with late-stage cancer and is expected to produce results of variable clinical significance and incidentally identify genetic risks. To explore the experiential context in which much of personalized cancer care will be developed and evaluated, we conducted a qualitative interview study alongside a pilot feasibility study of targeted DNA sequencing of metastatic tumor biopsies in adult patients with advanced solid malignancies. We recruited 29/73 patients and 14/17 physicians; transcripts from semi-structured interviews were analyzed for thematic patterns using an interpretive descriptive approach. Patient hopes of benefit from research participation were enhanced by the promise of novel and targeted treatment but challenged by non-findings or by limited access to relevant trials. Family obligations informed a willingness to receive genetic information, which was perceived as burdensome given disease stage or as inconsequential given faced challenges. Physicians were optimistic about long-term potential but conservative about immediate benefits and mindful of elevated patient expectations; consent and counseling processes were expected to mitigate challenges from incidental findings. These findings suggest the need for information and decision tools to support physicians in communicating realistic prospects of benefit, and for cautious approaches to the generation of incidental genetic information. PMID:23860039

Miller, Fiona A; Hayeems, Robin Z; Bytautas, Jessica P; Bedard, Philippe L; Ernst, Scott; Hirte, Hal; Hotte, Sebastien; Oza, Amit; Razak, Albiruni; Welch, Stephen; Winquist, Eric; Dancey, Janet; Siu, Lillian L

2014-03-01

71

Sequencing Complex Genomic Regions  

SciTech Connect

Evan Eichler, Howard Hughes Medical Investigator at the University of Washington, gives the May 28, 2009 keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM. Part 1 of 2

Eichler, Evan [University of Washington]

2009-05-28

72

Sequencing Complex Genomic Regions  

SciTech Connect

Evan Eichler, Howard Hughes Medical Investigator at the University of Washington, gives the May 28, 2009 keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM. Part 2 of 2

Eichler, Evan [University of Washington]

2009-05-28

73

Genome sequences and great expectations.  

PubMed

To assess how automatic function assignment will contribute to genome annotation in the next five years, we have performed an analysis of 31 available genome sequences. An emerging pattern is that function can be predicted for almost two-thirds of the 73,500 genes that were analyzed. Despite progress in computational biology, there will always be a great need for large-scale experimental determination of protein function. PMID:11178275

Iliopoulos, I; Tsoka, S; Andrade, M A; Janssen, P; Audit, B; Tramontano, A; Valencia, A; Leroy, C; Sander, C; Ouzounis, C A

2001-01-01

74

Second Generation Sequencing of the Mesothelioma Tumor Genome  

Microsoft Academic Search

The current paradigm for elucidating the molecular etiology of cancers relies on the interrogation of small numbers of genes, which limits the scope of investigation. Emerging second-generation massively parallel DNA sequencing technologies have enabled more precise definition of the cancer genome on a global scale. We examined the genome of a human primary malignant pleural mesothelioma (MPM) tumor and matched

Raphael Bueno; Assunta de Rienzo; Lingsheng Dong; Gavin J. Gordon; Colin F. Hercus; William G. Richards; Roderick V. Jensen; Arif Anwar; Gautam Maulik; Lucian R. Chirieac; Kim-Fong Ho; Bruce E. Taillon; Cynthia L. Turcotte; Robert G. Hercus; Steven R. Gullans; David J. Sugarbaker; Anita Brandstaetter

2010-01-01

75

NCI Center for Cancer Genomics  

Cancer.gov

NCI’s Center for Cancer Genomics applies genome science to better diagnose and treat cancer patients. The Center supports research to identify the genetic drivers of cancer and to advance the adoption of precise tumor diagnosis and treatment.

76

Development in Rice Genome Research Based on Accurate Genome Sequence  

PubMed Central

Rice is one of the most important crops in the world. Although genetic improvement is a key technology for the acceleration of rice breeding, a lack of genome information had restricted efforts in molecular-based breeding until the completion of the high-quality rice genome sequence, which opened new opportunities for research in various areas of genomics. The syntenic relationship of the rice genome to other cereal genomes makes the rice genome invaluable for understanding how cereal genomes function. Producing an accurate genome sequence is not an easy task, and it is becoming more important as sequence deviations among, and even within, species highlight functional or evolutionary implications for comparative genomics.

Matsumoto, Takashi; Wu, Jianzhong; Antonio, Baltazar A.; Sasaki, Takuji

2008-01-01

77

Genome-wide significant association between a sequence variant at 15q15.2 and lung cancer risk  

PubMed Central

Genome-wide association studies (GWAS) have identified three genomic regions, at 15q24-25.1, 5p15.33 and 6p21.33, which associate with risk of lung cancer. Large meta-analyses of GWA data have failed to find additional associations of genome-wide significance. In this study, we sought to confirm 7 variants with suggestive association to lung cancer (P<10⁻⁵) in a recently published meta-analysis. In a GWA dataset of 1,447 lung cancer cases and 36,256 controls in Iceland, three correlated variants on 15q15.2 (rs504417, rs11853991 and rs748404) showed a significant association with lung cancer whereas rs4254535 on 2p14, rs1530057 on 3p24.1, rs6438347 on 3q13.31 and rs1926203 on 10q23.31 did not. The most significant variant, rs748404, was genotyped in an additional 1,299 lung cancer cases and 4,102 controls from the Netherlands, Spain and the USA and the results combined with published GWAS data. In this analysis, the T allele of rs748404 reached genome-wide significance (OR=1.15, P=1.1×10⁻⁹). Another variant at the same locus, rs12050604, showed association with lung cancer (OR=1.09, P=3.6×10⁻⁶) and remained significant after adjustment for rs748404 and vice versa. rs748404 is located 140 kb centromeric of the TP53BP1 gene that has been implicated in lung cancer risk. Two fully correlated, non-synonymous coding variants in TP53BP1, rs2602141 (Q1136K) and rs560191 (E353D), showed association with lung cancer in our sample set; however, this association did not remain significant after adjustment for rs748404. Our data show that one or more lung cancer risk variants of genome-wide significance and distinct from the coding variants in TP53BP1 are located at 15q15.2.

Rafnar, Thorunn; Sulem, Patrick; Besenbacher, Soren; Gudbjartsson, Daniel F.; Zanon, Carlo; Gudmundsson, Julius; Stacey, Simon N.; Kostic, Jelena P.; Thorgeirsson, Thorgeir E.; Thorleifsson, Gudmar; Bjarnason, Hjordis; Skuladottir, Halla; Gudbjartsson, Tomas; Isaksson, Helgi J.; Isla, Dolores; Murillo, Laura; Garcia-Prats, Maria D.; Panadero, Angeles; Aben, Katja K.H.; Vermeulen, Sita H.; van der Heijden, Henricus F.M.; Feser, William; Miller, York E.; Bunn, Paul A.; Kong, Augustine; Wolf, Holly J.; Franklin, Wilbur A.; Mayordomo, Jose I; Kiemeney, Lambertus A.; Jonsson, Steinn; Thorsteinsdottir, Unnur; Stefansson, Kari

2010-01-01
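
The significance calls in the record above can be checked with a few lines of Python. This is a minimal sketch, not the authors' analysis: the odds ratios and P values are copied from the abstract, while the 5×10⁻⁸ cutoff is the conventional genome-wide threshold and is an assumption here, since the abstract does not state which threshold was used. The names `GENOME_WIDE_ALPHA` and `is_genome_wide_significant` are illustrative.

```python
# Conventional genome-wide significance threshold (an assumption; the
# abstract does not say which cutoff the authors applied).
GENOME_WIDE_ALPHA = 5e-8

# (odds ratio, P value) as reported in the abstract.
variants = {
    "rs748404": (1.15, 1.1e-9),
    "rs12050604": (1.09, 3.6e-6),
}


def is_genome_wide_significant(p_value, alpha=GENOME_WIDE_ALPHA):
    """Return True when the P value falls below the chosen threshold."""
    return p_value < alpha


# Map each variant to its significance call under the assumed threshold.
calls = {
    rsid: is_genome_wide_significant(p) for rsid, (_, p) in variants.items()
}

for rsid, (odds_ratio, p) in variants.items():
    print(rsid, odds_ratio, p, calls[rsid])
```

Under this assumed threshold only rs748404 passes, which matches the abstract's wording that rs748404 "reached genome-wide significance" while rs12050604 merely "showed association".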

78

Draft Genome Sequences of Helicobacter pylori Isolates from Malaysia, Cultured from Patients with Functional Dyspepsia and Gastric Cancer  

PubMed Central

Helicobacter pylori is the main bacterial causative agent of gastroduodenal disorders and a risk factor for gastric adenocarcinoma and mucosa-associated lymphoid tissue (MALT) lymphoma. The draft genomes of 10 closely related H. pylori isolates from the multiracial Malaysian population will provide an insight into the genetic diversity of isolates in Southeast Asia. These isolates were cultured from gastric biopsy samples from patients with functional dyspepsia and gastric cancer. The availability of this genomic information will provide an opportunity for examining the evolution and population structure of H. pylori isolates from Southeast Asia, where the East meets the West.

Gunaletchumy, Selva Perumal; Teh, Xinsheng; Khosravi, Yalda; Ramli, Nur Siti Khadijah; Chua, Eng Guan; Kavitha, Thevakumar; Mason, Joanne N.; Lee, Huey Tyng; Alias, Halimah; Zaidan, Nur Zafirah; Yassin, Norzawani Buang M.; Tay, Liang Chung; Rudd, Stephen; Mitchell, Hazel M.; Kaakoush, Nadeem O.; Loke, Mun Fai; Goh, Khean Lee

2012-01-01

79

Endometrial and acute myeloid leukemia cancer genomes characterized  

Cancer.gov

The characterizations of acute myeloid leukemia and endometrial cancer are the latest results of The Cancer Genome Atlas Research Network’s efforts to sequence the genomes of 20 major cancers. The photo above shows technicians from The Genome Institute at Washington University in St. Louis.

80

The diploid genome sequence of Candida albicans  

Microsoft Academic Search

We present the diploid genome sequence of the fungal pathogen Candida albicans. Because C. albicans has no known haploid or homozygous form, sequencing was performed as a whole-genome shotgun of the heterozygous diploid genome in strain SC5314, a clinical isolate that is the parent of strains widely used for molecular analysis. We developed computational methods to assemble a diploid genome

Ted Jones; Nancy A. Federspiel; Hiroji Chibana; Jan Dungan; Sue Kalman; B. B. Magee; George Newport; Yvonne R. Thorstenson; Nina Agabian; P. T. Magee; Ronald W. Davis; Stewart Scherer

2004-01-01

81

The Sequence of the Human Genome  

Microsoft Academic Search

A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies—a whole-genome

J. Craig Venter; Mark D. Adams; Eugene W. Myers; Peter W. Li; Richard J. Mural; Granger G. Sutton; Hamilton O. Smith; Mark Yandell; Cheryl A. Evans; Robert A. Holt; Jeannine D. Gocayne; Peter Amanatides; Richard M. Ballew; Daniel H. Huson; Jennifer R. Wortman; Qing Zhang; Chinnappa D. Kodira; Xiangqun H. Zheng; Lin Chen; Marian Skupski; Gangadharan Subramanian; Paul D. Thomas; Jinghui Zhang; George L. Gabor Miklos; Catherine Nelson; Samuel Broder; Andrew G. Clark; Joe Nadeau; Victor A. McKusick; Norton Zinder; Arnold J. Levine; Mel Simon; Carolyn Slayman; Michael Hunkapiller; Randall Bolanos; Arthur Delcher; Ian Dew; Daniel Fasulo; Michael Flanigan; Liliana Florea; Aaron Halpern; Sridhar Hannenhalli; Saul Kravitz; Samuel Levy; Clark Mobarry; Knut Reinert; Karin Remington; Jane Abu-Threideh; Ellen Beasley; Kendra Biddick; Vivien Bonazzi; Rhonda Brandon; Michele Cargill; Ishwar Chandramouliswaran; Rosane Charlab; Kabir Chaturvedi; Zuoming Deng; Valentina Di Francesco; Patrick Dunn; Karen Eilbeck; Carlos Evangelista; Andrei E. Gabrielian; Weiniu Gan; Wangmao Ge; Fangcheng Gong; Zhiping Gu; Ping Guan; Thomas J. Heiman; Maureen E. Higgins; Rui-Ru Ji; Zhaoxi Ke; Karen A. Ketchum; Zhongwu Lai; Yiding Lei; Zhenya Li; Jiayin Li; Yong Liang; Xiaoying Lin; Fu Lu; Gennady V. Merkulov; Natalia Milshina; Helen M. Moore; Ashwinikumar K Naik; Vaibhav A. Narayan; Beena Neelam; Deborah Nusskern; Douglas B. Rusch; Steven Salzberg; Wei Shao; Bixiong Shue; Jingtao Sun; Zhen Yuan Wang; Aihui Wang; Xin Wang; Jian Wang; Ming-Hui Wei; Ron Wides; Chunlin Xiao; Chunhua Yan; Alison Yao; Jane Ye; Ming Zhan; Weiqing Zhang; Hongyu Zhang; Qi Zhao; Liansheng Zheng; Fei Zhong; Wenyan Zhong; Shiaoping C. Zhu; Shaying Zhao; Dennis Gilbert; Suzanna Baumhueter; Gene Spier; Christine Carter; Anibal Cravchik; Trevor Woodage; Feroze Ali; Huijin An; Aderonke Awe; Danita Baldwin; Holly Baden; Mary Barnstead; Ian Barrow; Karen Beeson; Dana Busam; Amy Carver; Ming Lai Cheng; Liz Curry; Steve Danaher; Lionel Davenport; Raymond Desilets; Susanne Dietz; Kristina Dodson; Lisa Doup; Steven Ferriera; Neha Garg; Andres Gluecksmann; Brit Hart; Jason Haynes; Charles Haynes; Cheryl Heiner; Suzanne Hladun; Damon Hostin; Jarrett Houck; Timothy Howland; Chinyere Ibegwam; Jeffery Johnson; Francis Kalush; Lesley Kline; Shashi Koduru; Amy Love; Felecia Mann; David May; Steven McCawley; Tina McIntosh; Ivy McMullen; Mee Moy; Linda Moy; Brian Murphy; Keith Nelson; Cynthia Pfannkoch; Eric Pratts; Vinita Puri; Hina Qureshi; Matthew Reardon; Robert Rodriguez; Yu-Hui Rogers; Deanna Romblad; Bob Ruhfel; Richard Scott; Cynthia Sitter; Michelle Smallwood; Erin Stewart; Renee Strong; Ellen Suh; Reginald Thomas; Ni Ni Tint; Sukyee Tse; Claire Vech; Gary Wang; Jeremy Wetter; Sherita Williams; Monica Williams; Sandra Windsor; Emily Winn-Deen; Keriellen Wolfe; Jayshree Zaveri; Karena Zaveri; Josep F. Abril; Roderic Guigo; Michael J. Campbell; Kimmen V. Sjolander; Brian Karlak; Anish Kejariwal; Huaiyu Mi; Betty Lazareva; Thomas Hatton; Apurva Narechania; Karen Diemer; Anushya Muruganujan; Nan Guo; Shinji Sato; Vineet Bafna; Sorin Istrail; Ross Lippert; Russell Schwartz; Brian Walenz; Shibu Yooseph; David Allen; Anand Basu; James Baxendale; Louis Blick; Marcelo Caminha; John Carnes-Stine; Parris Caulk; Yen-Hui Chiang; Carl Dahlke; Anne Deslattes Mays; Maria Dombroski; Michael Donnelly; Dale Ely; Shiva Esparham; Carl Fosler; Harold Gire; Stephen Glanowski; Kenneth Glasser; Anna Glodek; Mark Gorokhov; Ken Graham; Barry Gropman; Michael Harris; Jeremy Heil; Scott Henderson; Jeffrey Hoover; Donald Jennings; John Kasha; Leonid Kagan; Cheryl Kraft; Alexander Levitsky; Mark Lewis; Xiangjun Liu; John Lopez; Daniel Ma; William Majoros; Joe McDaniel; Sean Murphy; Matthew Newman; Trung Nguyen; Ngoc Nguyen; Marc Nodell; Sue Pan; Jim Peck; Marshall Peterson; William Rowe; Robert Sanders; John Scott; Michael Simpson; Thomas Smith; Arlan Sprague; Timothy Stockwell; Russell Turner; Eli Venter; Mei Wang; Meiyuan Wen; David Wu; Mitchell Wu; Ashley Xia; Ali Zandieh; Xiaohong Zhu

2001-01-01
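
The coverage figure quoted in the record above follows from simple arithmetic: fold coverage is the total number of sequenced bases divided by the genome length. The sketch below uses only the rounded numbers given in the abstract (14.8 billion bp, 27,271,853 reads, 2.91 billion bp consensus). Using the consensus length as the denominator is an assumption, since the abstract does not say which genome size the reported 5.11-fold figure is based on. The function name `fold_coverage` is illustrative.

```python
def fold_coverage(total_bases, genome_length):
    """Average number of times each genome position is covered by reads."""
    return total_bases / genome_length


# Rounded figures taken from the abstract.
total_bases = 14.8e9        # sequenced base pairs
read_count = 27_271_853     # high-quality sequence reads
consensus_length = 2.91e9   # euchromatic consensus, used here as an assumed denominator

cov = fold_coverage(total_bases, consensus_length)
mean_read_length = total_bases / read_count

print(f"fold coverage: {cov:.2f}x")
print(f"mean read length: {mean_read_length:.0f} bp")
```

With these rounded inputs the result lands close to, but not exactly on, the reported 5.11-fold coverage. The small difference presumably reflects rounding in the abstract and the choice of genome size.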

82

Personal genome sequencing: current approaches and challenges  

PubMed Central

The revolution in DNA sequencing technologies has now made it feasible to determine the genome sequences of many individuals; i.e., “personal genomes.” Genome sequences of cells and tissues from both normal and disease states have been determined. Using current approaches, whole human genome sequences are not typically assembled and determined de novo, but, instead, variations relative to a reference sequence are identified. We discuss the current state of personal genome sequencing, the main steps involved in determining a genome sequence (i.e., identifying single-nucleotide polymorphisms [SNPs] and structural variations [SVs], assembling new sequences, and phasing haplotypes), and the challenges and performance metrics for evaluating the accuracy of the reconstruction. Finally, we consider the possible individual and societal benefits of personal genome sequences.

Snyder, Michael; Du, Jiang; Gerstein, Mark

2010-01-01

83

Plant genome sequencing - applications for crop improvement.  

PubMed

It is over 10 years since the genome sequence of the first crop was published. Since then, the number of crop genomes sequenced each year has increased steadily. The amazing pace at which genome sequences are becoming available is largely due to the improvement in sequencing technologies both in terms of cost and speed. Modern sequencing technologies allow the sequencing of multiple cultivars of smaller crop genomes at a reasonable cost. Though many of the published genomes are considered incomplete, they nevertheless have proved a valuable tool to understand important crop traits such as fruit ripening, grain traits and flowering time adaptation. PMID:24679255

Bolger, Marie E; Weisshaar, Bernd; Scholz, Uwe; Stein, Nils; Usadel, Björn; Mayer, Klaus F X

2014-04-01

84

Genome sequencing of lymphoid malignancies.  

PubMed

Our understanding of the pathogenesis of lymphoid malignancies has been transformed by next-generation sequencing. The studies in this review have used whole-genome, exome, and transcriptome sequencing to identify recurring structural genetic alterations and sequence mutations that target key cellular pathways in acute lymphoblastic leukemia (ALL) and the lymphomas. Although each tumor type is characterized by a unique genomic landscape, several cellular pathways are mutated in multiple tumor types (transcriptional regulation of differentiation, antigen receptor signaling, tyrosine kinase and Ras signaling, and epigenetic modifications), and individual genes are mutated in multiple tumors, notably TCF3, NOTCH1, MYD88, and BRAF. In addition to providing fundamental insights into tumorigenesis, these studies have also identified potential new markers for diagnosis, risk stratification, and therapeutic intervention. Several genetic alterations are intuitively "druggable" with existing agents, for example, kinase-activating lesions in high-risk B-cell ALL, NOTCH1 in both leukemia and lymphoma, and BRAF in hairy cell leukemia. Future sequencing efforts are required to comprehensively define the genetic basis of all lymphoid malignancies, examine the relative roles of germline and somatic variation, dissect the genetic basis of clonal heterogeneity, and chart a course for clinical sequencing and translation to improved therapeutic outcomes. PMID:24041576

Mullighan, Charles G

2013-12-01

85

Punctuated Evolution of Prostate Cancer Genomes  

PubMed Central

SUMMARY The analysis of exonic DNA from prostate cancers has identified recurrently mutated genes, but the spectrum of genome-wide alterations has not been profiled extensively in this disease. We sequenced the genomes of 57 prostate tumors and matched normal tissues to characterize somatic alterations and to study how they accumulate during oncogenesis and progression. By modeling the genesis of genomic rearrangements, we identified abundant DNA translocations and deletions that arise in a highly interdependent manner. This phenomenon, which we term “chromoplexy”, frequently accounts for the dysregulation of prostate cancer genes and appears to disrupt multiple cancer genes coordinately. Our modeling suggests that chromoplexy may induce considerable genomic derangement over relatively few events in prostate cancer and other neoplasms, supporting a model of punctuated cancer evolution. By characterizing the clonal hierarchy of genomic lesions in prostate tumors, we charted a path of oncogenic events along which chromoplexy may drive prostate carcinogenesis.

Baca, Sylvan C.; Prandi, Davide; Lawrence, Michael S.; Mosquera, Juan Miguel; Romanel, Alessandro; Drier, Yotam; Park, Kyung; Kitabayashi, Naoki; MacDonald, Theresa Y.; Ghandi, Mahmoud; Van Allen, Eliezer; Kryukov, Gregory V.; Sboner, Andrea; Theurillat, Jean-Philippe; Soong, T. David; Nickerson, Elizabeth; Auclair, Daniel; Tewari, Ashutosh; Beltran, Himisha; Onofrio, Robert C.; Boysen, Gunther; Guiducci, Candace; Barbieri, Christopher E.; Cibulskis, Kristian; Sivachenko, Andrey; Carter, Scott L.; Saksena, Gordon; Voet, Douglas; Ramos, Alex H; Winckler, Wendy; Cipicchio, Michelle; Ardlie, Kristin; Kantoff, Philip W.; Berger, Michael F.; Gabriel, Stacey B.; Golub, Todd R.; Meyerson, Matthew; Lander, Eric S.; Elemento, Olivier; Getz, Gad; Demichelis, Francesca; Rubin, Mark A.; Garraway, Levi A.

2013-01-01

86

Comprehensive genomic analyses of a metastatic colon cancer to the lung by whole exome sequencing and gene expression analysis.  

PubMed

We performed whole exome sequencing and gene expression analysis on a metastatic colon cancer to the lung, along with the adjacent normal tissue of the lung. Whole exome sequencing uncovered 71 high-confidence non-synonymous mutations. We selected 16 mutation candidates, and 13 out of 16 mutations were validated by targeted deep sequencing using the Ion Torrent PGM customized AmpliSeq panel. By integrating mutation, copy number and gene expression microarray data, we identified a JAZF1 mutation with a gain-of-copy, suggesting its oncogenic potential for the lung metastasis from colon cancer. Our pathway analyses showed that the identified mutations closely reflected characteristics of the metastatic site (lung) while mRNA gene expression patterns kept genetic information of its primary tumor (colon). The most significant gene expression network was the 'Colorectal Cancer Metastasis Signaling', containing 6 (ADCY2, ADCY9, APC, GNB5, K-ras and LRP6) out of the 71 mutated genes. Some of these mutated genes (ADCY9, ADCY2, GNB5, K-ras, HDAC6 and ARHGEF17) also belong to the 'Phospholipase C Signaling' network, which suggests that this pathway and its mutated genes may contribute to a lung metastasis from colon cancer. PMID:24172857

Fang, Li Tai; Lee, Sharon; Choi, Helen; Kim, Hong Kwan; Jew, Gregory; Kang, Hio Chung; Chen, Lin; Jablons, David; Kim, Il-Jin

2014-01-01

87

Genome sequences of 65 Helicobacter pylori strains isolated from asymptomatic individuals and patients with gastric cancer, peptic ulcer disease, or gastritis.  

PubMed

Helicobacter pylori, an inhabitant of the gastric mucosa of over half of the world population, with decreasing prevalence in the U.S., has been associated with a variety of gastric pathologies. However, the majority of H. pylori-infected individuals remain asymptomatic, and negative correlations between H. pylori and allergic diseases have been reported. Comprehensive genome characterization of H. pylori populations from different human host backgrounds including healthy individuals provides the exciting potential to generate new insights into the open question whether human health outcome is associated with specific H. pylori genotypes or dependent on other environmental factors. We report the genome sequences of 65 H. pylori isolates from individuals with gastric cancer, preneoplastic lesions, peptic ulcer disease, gastritis, and from asymptomatic adults. Isolates were collected from multiple locations in North America (USA and Canada) as well as from Colombia and Japan. The availability of these H. pylori genome sequences from individuals with distinct clinical presentations provides the research community with a resource for detailed investigations into genetic elements that correlate either positively or negatively with the epidemiology, human host adaptation, and gastric pathogenesis and will aid in the characterization of strains that may favor the development of specific pathology, including gastric cancer. PMID:23661595

Blanchard, Thomas G; Czinn, Steven J; Correa, Pelayo; Nakazawa, Teruko; Keelan, Monika; Morningstar, Lindsay; Santana-Cruz, Ivette; Maroo, Ankit; McCracken, Carri; Shefchek, Kent; Daugherty, Sean; Song, Yang; Fraser, Claire M; Fricke, W Florian

2013-07-01

88

Chapter 27 -- Breast Cancer Genomics, Section VI, Pathology and Biological Markers of Invasive Breast Cancer  

Microsoft Academic Search

Breast cancer is predominantly a disease of the genome with cancers arising and progressing through accumulation of aberrations that alter the genome - by changing DNA sequence, copy number, and structure in ways that contribute to diverse aspects of cancer pathophysiology. Classic examples of genomic events that contribute to breast cancer pathophysiology include inherited mutations in BRCA1, BRCA2, TP53,

Paul T. Spellman; Laura Heiser; Joe W. Gray

2009-01-01

89

Complementary DNA Sequencing: Expressed Sequence Tags and Human Genome Project  

Microsoft Academic Search

Automated partial DNA sequencing was conducted on more than 600 randomly selected human brain complementary DNA (cDNA) clones to generate expressed sequence tags (ESTs). ESTs have applications in the discovery of new human genes, mapping of the human genome, and identification of coding regions in genomic sequences. Of the sequences generated, 337 represent new genes, including 48 with significant similarity

Mark D. Adams; Jenny M. Kelley; Jeannine D. Gocayne; Mark Dubnick; Mihael H. Polymeropoulos; Hong Xiao; Carl R. Merril; Andrew Wu; Bjorn Olde; Ruben F. Moreno; Anthony R. Kerlavage; W. Richard McCombie; J. Craig Venter

1991-01-01

90

Genome sequencing and functional genomics approaches in tomato  

Microsoft Academic Search

Tomato genome sequencing has been taking place through an international, 10-year initiative entitled the “International Solanaceae Genome Project” (SOL). The strategy proposed by the SOL consortium is to sequence the approximately 220 Mb of euchromatin that contains the majority of genes, rather than the entire tomato genome. Tomato and other Solanaceae plants have unique developmental aspects, such as the formation of

Daisuke Shibata

2005-01-01

91

The diploid genome sequence of Candida albicans  

PubMed Central

We present the diploid genome sequence of the fungal pathogen Candida albicans. Because C. albicans has no known haploid or homozygous form, sequencing was performed as a whole-genome shotgun of the heterozygous diploid genome in strain SC5314, a clinical isolate that is the parent of strains widely used for molecular analysis. We developed computational methods to assemble a diploid genome sequence in good agreement with available physical mapping data. We provide a whole-genome description of heterozygosity in the organism. Comparative genomic analyses provide important clues about the evolution of the species and its mechanisms of pathogenesis.

Jones, Ted; Federspiel, Nancy A.; Chibana, Hiroji; Dungan, Jan; Kalman, Sue; Magee, B. B.; Newport, George; Thorstenson, Yvonne R.; Agabian, Nina; Magee, P. T.; Davis, Ronald W.; Scherer, Stewart

2004-01-01

92

The UCSC Cancer Genomics Browser: update 2013.  

PubMed

The UCSC Cancer Genomics Browser (https://genome-cancer.ucsc.edu/) is a set of web-based tools to display, investigate and analyse cancer genomics data and its associated clinical information. The browser provides whole-genome to base-pair level views of several different types of genomics data, including some next-generation sequencing platforms. The ability to view multiple datasets together allows users to make comparisons across different data and cancer types. Biological pathways, collections of genes, genomic or clinical information can be used to sort, aggregate and zoom into a group of samples. We currently display an expanding set of data from various sources, including 201 datasets from 22 TCGA (The Cancer Genome Atlas) cancers as well as data from Cancer Cell Line Encyclopedia and Stand Up To Cancer. New features include a completely redesigned user interface with an interactive tutorial and updated documentation. We have also added data downloads, additional clinical heatmap features, and an updated Tumor Image Browser based on Google Maps. New security features allow authenticated users access to private datasets hosted by several different consortia through the public website. PMID:23109555

Goldman, Mary; Craft, Brian; Swatloski, Teresa; Ellrott, Kyle; Cline, Melissa; Diekhans, Mark; Ma, Singer; Wilks, Chris; Stuart, Josh; Haussler, David; Zhu, Jingchun

2013-01-01

93

The UCSC Cancer Genomics Browser: update 2013  

PubMed Central

The UCSC Cancer Genomics Browser (https://genome-cancer.ucsc.edu/) is a set of web-based tools to display, investigate and analyse cancer genomics data and its associated clinical information. The browser provides whole-genome to base-pair level views of several different types of genomics data, including some next-generation sequencing platforms. The ability to view multiple datasets together allows users to make comparisons across different data and cancer types. Biological pathways, collections of genes, genomic or clinical information can be used to sort, aggregate and zoom into a group of samples. We currently display an expanding set of data from various sources, including 201 datasets from 22 TCGA (The Cancer Genome Atlas) cancers as well as data from Cancer Cell Line Encyclopedia and Stand Up To Cancer. New features include a completely redesigned user interface with an interactive tutorial and updated documentation. We have also added data downloads, additional clinical heatmap features, and an updated Tumor Image Browser based on Google Maps. New security features allow authenticated users access to private datasets hosted by several different consortia through the public website.

Goldman, Mary; Craft, Brian; Swatloski, Teresa; Ellrott, Kyle; Cline, Melissa; Diekhans, Mark; Ma, Singer; Wilks, Chris; Stuart, Josh; Haussler, David; Zhu, Jingchun

2013-01-01

94

Sequencing Intractable DNA to Close Microbial Genomes  

SciTech Connect

Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult-to-sequence regions in microbial genomes are ruled intractable, resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such difficult regions in the non-contiguous finished Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. These developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

Hurt, Jr., Richard Ashley [ORNL; Brown, Steven D [ORNL; Podar, Mircea [ORNL; Palumbo, Anthony Vito [ORNL; Elias, Dwayne A [ORNL

2012-01-01

95

Colon cancer-derived oncogenic EGFR G724S mutant identified by whole genome sequence analysis is dependent on asymmetric dimerization and sensitive to cetuximab  

PubMed Central

Background Inhibition of the activated epidermal growth factor receptor (EGFR) with either enzymatic kinase inhibitors or anti-EGFR antibodies such as cetuximab, is an effective modality of treatment for multiple human cancers. Enzymatic EGFR inhibitors are effective for lung adenocarcinomas with somatic kinase domain EGFR mutations while, paradoxically, anti-EGFR antibodies are more effective in colon and head and neck cancers where EGFR mutations occur less frequently. In colorectal cancer, anti-EGFR antibodies are routinely used as second-line therapy of KRAS wild-type tumors. However, detailed mechanisms and genomic predictors for pharmacological response to these antibodies in colon cancer remain unclear. Findings We describe a case of colorectal adenocarcinoma, which was found to harbor a kinase domain mutation, G724S, in EGFR through whole genome sequencing. We show that G724S mutant EGFR is oncogenic and that it differs from classic lung cancer derived EGFR mutants in that it is cetuximab responsive in vitro, yet relatively insensitive to small molecule kinase inhibitors. Through biochemical and cellular pharmacologic studies, we have determined that cells harboring the colon cancer-derived G719S and G724S mutants are responsive to cetuximab therapy in vitro and found that the requirement for asymmetric dimerization of these mutant EGFR to promote cellular transformation may explain their greater inhibition by cetuximab than small-molecule kinase inhibitors. Conclusion The colon-cancer derived G719S and G724S mutants are oncogenic and sensitive in vitro to cetuximab. These data suggest that patients with these mutations may benefit from the use of anti-EGFR antibodies as part of the first-line therapy.

2014-01-01

96

Draft Genome Sequence of Lactobacillus rhamnosus 2166  

PubMed Central

In this report, we present a draft sequence of the genome of Lactobacillus rhamnosus strain 2166, a potential novel probiotic. Genome annotation and read mapping onto a reference genome of L. rhamnosus strain GG allowed for the identification of the differences and similarities in the genomic contents and gene arrangements of these strains.

Melnikov, Vyacheslav G.; Kosarev, Igor V.; Abramov, Vyacheslav M.

2014-01-01

97

Value of a newly sequenced bacterial genome  

PubMed Central

Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the “scientific value” of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information.

Barbosa, Eudes GV; Aburjaile, Flavia F; Ramos, Rommel TJ; Carneiro, Adriana R; Le Loir, Yves; Baumbach, Jan; Miyoshi, Anderson; Silva, Artur; Azevedo, Vasco

2014-01-01

98

Genome Sequence of Pseudomonas mandelii PD30  

PubMed Central

The genome sequence of Pseudomonas mandelii PD30 is reported in this announcement. The genes for the reduction of nitrate to dinitrogen were identified in the genome assembly and subsequently used in gene expression research.

Formusa, Philip A.; Hsiang, Tom; Habash, Marc B.; Lee, Hung

2014-01-01

99

Genome-Wide Small RNA Sequencing and Gene Expression Analysis Reveals a microRNA Profile of Cancer Susceptibility in ATM-Deficient Human Mammary Epithelial Cells  

PubMed Central

Deficiencies in the ATM gene are the underlying cause for ataxia telangiectasia, a syndrome characterized by neurological, motor and immunological defects, and a predisposition to cancer. MicroRNAs (miRNAs) are useful tools for cancer profiling and prediction of therapeutic responses to clinical regimens. We investigated the consequences of ATM deficiency on miRNA expression and associated gene expression in normal human mammary epithelial cells (HME-CCs). We identified 81 significantly differentially expressed miRNAs in ATM-deficient HME-CCs using small RNA sequencing. Many of these have been implicated in tumorigenesis and proliferation and include down-regulated tumor suppressor miRNAs, such as hsa-miR-29c and hsa-miR-16, as well as over-expressed pro-oncogenic miRNAs, such as hsa-miR-93 and hsa-miR-221. MicroRNA changes were integrated with genome wide gene expression profiles to investigate possible miRNA targets. Predicted mRNA targets of the miRNAs significantly regulated after ATM depletion included many genes associated with cancer formation and progression, such as SOCS1 and the proto-oncogene MAF. While a number of miRNAs have been reported as altered in cancerous cells, there is little understanding as to how these small RNAs might be driving cancer formation or how they might be used as biomarkers for cancer susceptibility. This study provides preliminary data for defining miRNA profiles that may be used as prognostic or predictive biomarkers for breast cancer. Our integrated analysis of miRNA and mRNA expression allows us to gain a better understanding of the signaling involved in breast cancer predisposition and suggests a mechanism for the breast cancer-prone phenotype seen in ATM-deficient patients.

Hesse, Jill E.; Liu, Liwen; Innes, Cynthia L.; Cui, Yuxia; Palii, Stela S.; Paules, Richard S.

2013-01-01

100

Parking Strategies for Genome Sequencing  

PubMed Central

The parking strategy is an iterative approach to DNA sequencing. Each iteration consists of sequencing a novel portion of target DNA that does not overlap any previously sequenced region. Subject to the constraint of no overlap, each new region is chosen randomly. A parking strategy is often ideal in the early stages of a project for rapidly generating unique data. As a project progresses, parking becomes progressively more expensive and eventually prohibitive. We present a mathematical model with a generalization to allow for overlaps. This model predicts multiple parameters, including progress, costs, and the distribution of gap sizes left by a parking strategy. The highly fragmented nature of the gaps left after an initial parking strategy may make it difficult to finish a project efficiently. Therefore, in addition to our parking model, we model gap closing by walking. Our gap-closing model is generalizable to many other strategies. Our discussion includes modified parking strategies and hybrids with other strategies. A hybrid parking strategy has been employed for portions of the Human Genome Project.

Roach, Jared C.; Thorsson, Vesteinn; Siegel, Andrew F.

2000-01-01
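
The parking process described in this record (random placement of new reads subject to a no-overlap constraint) can be illustrated with a small simulation. This is a minimal sketch, not the authors' mathematical model: the function names, the integer-coordinate target, and the stopping rule (give up after `max_failures` consecutive rejected placements) are all assumptions made for illustration.

```python
import random

def park_reads(target_len, read_len, max_failures=2000, seed=0):
    """Place fixed-length reads at random start positions, rejecting any
    placement that overlaps a previously sequenced region (hypothetical helper)."""
    rng = random.Random(seed)
    covered = [False] * target_len
    starts = []
    attempts = 0
    failures = 0  # consecutive rejected placements
    while failures < max_failures:
        start = rng.randrange(target_len - read_len + 1)
        attempts += 1
        if any(covered[start:start + read_len]):
            failures += 1  # overlap: this attempt is wasted effort
            continue
        for i in range(start, start + read_len):
            covered[i] = True
        starts.append(start)
        failures = 0
    return sorted(starts), attempts

def gap_sizes(starts, target_len, read_len):
    """Lengths of the unsequenced gaps left between parked reads."""
    gaps = []
    pos = 0
    for s in starts:
        if s > pos:
            gaps.append(s - pos)
        pos = s + read_len
    if pos < target_len:
        gaps.append(target_len - pos)
    return gaps

starts, attempts = park_reads(target_len=2000, read_len=50)
gaps = gap_sizes(starts, 2000, 50)
coverage = len(starts) * 50 / 2000
print(f"reads parked: {len(starts)}, attempts: {attempts}, coverage: {coverage:.2f}")
print(f"gaps left: {len(gaps)}, largest gap: {max(gaps) if gaps else 0}")
```

In runs of this sketch, the number of attempts grows much faster than the number of parked reads toward the end, which mirrors the abstract's point that parking becomes progressively more expensive, and the leftover gaps are numerous and small, which is why a separate gap-closing strategy is modeled.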

101

Genomic strategies to identify mammalian regulatory sequences  

Microsoft Academic Search

With the continuing accomplishments of the human genome project, high-throughput strategies to identify DNA sequences that are important in mammalian gene regulation are becoming increasingly feasible. In contrast to the historic, labour-intensive, wet-laboratory methods for identifying regulatory sequences, many modern approaches are heavily focused on the computational analysis of large genomic data sets. Data from inter-species genomic sequence comparisons and

Len A. Pennacchio; Edward M. Rubin

2001-01-01

102

Whole genome sequencing reveals potential targets for therapy in patients with refractory KRAS mutated metastatic colorectal cancer  

PubMed Central

Background: The outcome of patients with metastatic colorectal carcinoma (mCRC) following first line therapy is poor, with median survival of less than one year. The purpose of this study was to identify candidate therapeutically targetable somatic events in mCRC patient samples by whole genome sequencing (WGS), so as to obtain targeted treatment strategies for individual patients. Methods: Four patients were recruited, all of whom had received >2 prior therapy regimens. Percutaneous needle biopsies of metastases were performed with whole blood collection for the extraction of constitutional DNA. One tumor was not included in this study as the quality of tumor tissue was not sufficient for further analysis. WGS was performed using Illumina paired end chemistry on HiSeq2000 sequencing systems, which yielded coverage of greater than 30X for all samples. NGS data were processed and analyzed to detect somatic genomic alterations including point mutations, indels, copy number alterations, translocations and rearrangements. Results: All 3 tumor samples had KRAS mutations, while 2 tumors contained mutations in the APC gene and the PIK3CA gene. Although we did not identify a TCF7L2-VTI1A translocation, we did detect a TCF7L2 mutation in one tumor. Among the other interesting mutated genes was INPPL1, an important gene involved in PI3 kinase signaling. Functional studies demonstrated that inhibition of INPPL1 reduced growth of CRC cells, suggesting that INPPL1 may promote growth in CRC. Conclusions: Our study further supports potential molecularly defined therapeutic contexts that might provide insights into treatment strategies for refractory mCRC. New insights into the role of INPPL1 in colon tumor cell growth have also been identified. Continued development of appropriate targeted agents towards specific events may be warranted to help improve outcomes in CRC.

2014-01-01

103

NIH Launches Comprehensive Effort to Explore Cancer Genomics: The Cancer Genome Atlas Begins With Three-Year, $100 Million Pilot  

Cancer.gov

The National Cancer Institute and the National Human Genome Research Institute today launched a comprehensive effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, especially large-scale genome sequencing. Questions and Answers

104

DNA secondary structures and epigenetic determinants of cancer genome evolution  

Microsoft Academic Search

An unstable genome is a hallmark of many cancers. It is unclear, however, whether some mutagenic features driving somatic alterations in cancer are encoded in the genome sequence and whether they can operate in a tissue-specific manner. We performed a genome-wide analysis of 663,446 DNA breakpoints associated with somatic copy-number alterations (SCNAs) from 2,792 cancer samples classified into 26 cancer

Subhajyoti De; Franziska Michor

2011-01-01

105

Evolution of the cancer genome  

PubMed Central

Human tumors result from an evolutionary process operating on somatic cells within tissues, whereby natural selection operates on the phenotypic variability generated by the accumulation of genetic, genomic and epigenetic alterations. This somatic evolution leads to adaptations such as increased proliferative, angiogenic, and invasive phenotypes. In this review we outline how cancer genomes are beginning to be investigated from an evolutionary perspective. We describe recent progress in the cataloging of somatic genetic and genomic alterations, and investigate the contributions of germline as well as epigenetic factors to cancer genome evolution. Finally, we outline the challenges facing researchers who investigate the processes driving the evolution of the cancer genome.

Podlaha, Ondrej; Riester, Markus; De, Subhajyoti; Michor, Franziska

2013-01-01

106

Genome-wide analysis of aberrant methylation in human breast cancer cells using methyl-DNA immunoprecipitation combined with high-throughput sequencing  

PubMed Central

Background: Cancer cells undergo massive alterations to their DNA methylation patterns that result in aberrant gene expression and malignant phenotypes. However, the mechanisms that underlie methylome changes are not well understood nor is the genomic distribution of DNA methylation changes well characterized. Results: Here, we performed methylated DNA immunoprecipitation combined with high-throughput sequencing (MeDIP-seq) to obtain whole-genome DNA methylation profiles for eight human breast cancer cell (BCC) lines and for normal human mammary epithelial cells (HMEC). The MeDIP-seq analysis generated non-biased DNA methylation maps by covering almost the entire genome with sufficient depth and resolution. The most prominent feature of the BCC lines compared to HMEC was a massively reduced methylation level particularly in CpG-poor regions. While hypomethylation did not appear to be associated with particular genomic features, hypermethylation preferentially occurred at CpG-rich gene-related regions independently of the distance from transcription start sites. We also investigated methylome alterations during epithelial-to-mesenchymal transition (EMT) in MCF7 cells. EMT induction was associated with specific alterations to the methylation patterns of gene-related CpG-rich regions, although overall methylation levels were not significantly altered. Moreover, approximately 40% of the epithelial cell-specific methylation patterns in gene-related regions were altered to those typical of mesenchymal cells, suggesting a cell-type specific regulation of DNA methylation. Conclusions: This study provides the most comprehensive analysis to date of the methylome of human mammary cell lines and has produced novel insights into the mechanisms of methylome alteration during tumorigenesis and the interdependence between DNA methylome alterations and morphological changes.

2010-01-01

107

Marsupial genome sequences: providing insight into evolution and disease.  

PubMed

Marsupials (metatherians), with their position in vertebrate phylogeny and their unique biological features, have been studied for many years by a dedicated group of researchers, but it has only been since the sequencing of the first marsupial genome that their value has been more widely recognised. We now have genome sequences for three distantly related marsupial species (the grey short-tailed opossum, the tammar wallaby, and Tasmanian devil), with the promise of many more genomes to be sequenced in the near future, making this a particularly exciting time in marsupial genomics. The emergence of a transmissible cancer, which is obliterating the Tasmanian devil population, has increased the importance of obtaining and analysing marsupial genome sequence for understanding such diseases as well as for conservation efforts. In addition, these genome sequences have facilitated studies aimed at answering questions regarding gene and genome evolution and provided insight into the evolution of epigenetic mechanisms. Here I highlight the major advances in our understanding of evolution and disease, facilitated by marsupial genome projects, and speculate on the future contributions to be made by such sequences. PMID:24278712

Deakin, Janine E

2012-01-01

108

The Genome Sequencing Center at NCGR  

SciTech Connect

Faye Schilkey from the National Center for Genome Resources discusses NCGR's research, sequencing and analysis experience on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM

Schilkey, Faye [National Center for Genome Resources]

2010-06-02

109

Recurrent targeted genes of hepatitis B virus in the liver cancer genomes identified by a next-generation sequencing-based approach.  

PubMed

Integration of the viral DNA into host chromosomes was found in most of the hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs). Here we devised a massive anchored parallel sequencing (MAPS) method using next-generation sequencing to isolate and sequence HBV integrants. Applying MAPS to 40 pairs of HBV-related HCC tissues (cancer and adjacent tissues), we identified 296 HBV integration events corresponding to 286 unique integration sites (UISs) with precise HBV-Human DNA junctions. HBV integration favored chromosome 17 and preferentially integrated into human transcript units. HBV targeted genes were enriched in GO terms: cAMP metabolic processes, T cell differentiation and activation, TGF beta receptor pathway, ncRNA catabolic process, and dsRNA fragmentation and cellular response to dsRNA. The HBV targeted genes include 7 genes (PTPRJ, CNTN6, IL12B, MYOM1, FNDC3B, LRFN2, FN1) containing IPR003961 (Fibronectin, type III domain), 7 genes (NRG3, MASP2, NELL1, LRP1B, ADAM21, NRXN1, FN1) containing IPR013032 (EGF-like region, conserved site), and three genes (PDE7A, PDE4B, PDE11A) containing IPR002073 (3', 5'-cyclic-nucleotide phosphodiesterase). Enriched pathways include hsa04512 (ECM-receptor interaction), hsa04510 (Focal adhesion), and hsa04012 (ErbB signaling pathway). Fewer integration events were found in cancers compared to cancer-adjacent tissues, suggesting a clonal expansion model in HCC development. Finally, we identified 8 genes that were recurrent target genes by HBV integration including fibronectin 1 (FN1) and telomerase reverse transcriptase (TERT1), two known recurrent target genes, and additional novel target genes such as SMAD family member 5 (SMAD5), phosphatase and actin regulator 4 (PHACTR4), and RNA binding protein fox-1 homolog (C. elegans) 1 (RBFOX1). 
Integrating analysis with recently published whole-genome sequencing analysis, we identified 14 additional recurrent HBV target genes, greatly expanding the HBV recurrent target list. This global survey of HBV integration events, together with recently published whole-genome sequencing analyses, furthered our understanding of the HBV-related HCC. PMID:23236287

Ding, Dong; Lou, Xiaoyan; Hua, Dasong; Yu, Wei; Li, Lisha; Wang, Jun; Gao, Feng; Zhao, Na; Ren, Guoping; Li, Lanjuan; Lin, Biaoyang

2012-01-01

110

Recurrent Targeted Genes of Hepatitis B Virus in the Liver Cancer Genomes Identified by a Next-Generation Sequencing-Based Approach  

PubMed Central

Integration of the viral DNA into host chromosomes was found in most of the hepatitis B virus (HBV)–related hepatocellular carcinomas (HCCs). Here we devised a massive anchored parallel sequencing (MAPS) method using next-generation sequencing to isolate and sequence HBV integrants. Applying MAPS to 40 pairs of HBV–related HCC tissues (cancer and adjacent tissues), we identified 296 HBV integration events corresponding to 286 unique integration sites (UISs) with precise HBV–Human DNA junctions. HBV integration favored chromosome 17 and preferentially integrated into human transcript units. HBV targeted genes were enriched in GO terms: cAMP metabolic processes, T cell differentiation and activation, TGF beta receptor pathway, ncRNA catabolic process, and dsRNA fragmentation and cellular response to dsRNA. The HBV targeted genes include 7 genes (PTPRJ, CNTN6, IL12B, MYOM1, FNDC3B, LRFN2, FN1) containing IPR003961 (Fibronectin, type III domain), 7 genes (NRG3, MASP2, NELL1, LRP1B, ADAM21, NRXN1, FN1) containing IPR013032 (EGF-like region, conserved site), and three genes (PDE7A, PDE4B, PDE11A) containing IPR002073 (3', 5'-cyclic-nucleotide phosphodiesterase). Enriched pathways include hsa04512 (ECM-receptor interaction), hsa04510 (Focal adhesion), and hsa04012 (ErbB signaling pathway). Fewer integration events were found in cancers compared to cancer-adjacent tissues, suggesting a clonal expansion model in HCC development. Finally, we identified 8 genes that were recurrent target genes by HBV integration including fibronectin 1 (FN1) and telomerase reverse transcriptase (TERT1), two known recurrent target genes, and additional novel target genes such as SMAD family member 5 (SMAD5), phosphatase and actin regulator 4 (PHACTR4), and RNA binding protein fox-1 homolog (C. elegans) 1 (RBFOX1).
Integrating analysis with recently published whole-genome sequencing analysis, we identified 14 additional recurrent HBV target genes, greatly expanding the HBV recurrent target list. This global survey of HBV integration events, together with recently published whole-genome sequencing analyses, furthered our understanding of the HBV–related HCC.

Ding, Dong; Lou, Xiaoyan; Hua, Dasong; Yu, Wei; Li, Lisha; Wang, Jun; Gao, Feng; Zhao, Na; Ren, Guoping; Li, Lanjuan; Lin, Biaoyang

2012-01-01

111

Visualizing multidimensional cancer genomics data  

PubMed Central

Cancer genomics projects employ high-throughput technologies to identify the complete catalog of somatic alterations that characterize the genome, transcriptome and epigenome of cohorts of tumor samples. Examples include projects carried out by the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). A crucial step in the extraction of knowledge from the data is the exploration by experts of the different alterations, as well as the multiple relationships between them. To that end, the use of intuitive visualization tools that can integrate different types of alterations with clinical data is essential to the field of cancer genomics. Here, we review effective and common visualization techniques for exploring oncogenomics data and discuss a selection of tools that allow researchers to effectively visualize multidimensional oncogenomics datasets. The review covers visualization methods employed by tools such as Circos, Gitools, the Integrative Genomics Viewer, Cytoscape, Savant Genome Browser, StratomeX and platforms such as cBio Cancer Genomics Portal, IntOGen, the UCSC Cancer Genomics Browser, the Regulome Explorer and the Cancer Genome Workbench.

2013-01-01

112

Cancer Genomics Research Laboratory  

Cancer.gov

CGR’s high throughput laboratory is equipped with state-of-the-art laboratory equipment and automation systems for a large number of applications. CGR supports DCEG in all stages of cancer research from planning to publishing, including experimental design and project management, sample handling, genotyping and sequencing assay design and execution, development and implementation of bioinformatic pipelines, and downstream scientific research and analytical support.

113

BSMAP: whole genome bisulfite sequence MAPping program  

Microsoft Academic Search

BACKGROUND: Bisulfite sequencing is a powerful technique to study DNA cytosine methylation. Bisulfite treatment followed by PCR amplification specifically converts unmethylated cytosines to thymine. Coupled with next generation sequencing technology, it is able to detect the methylation status of every cytosine in the genome. However, mapping high-throughput bisulfite reads to the reference genome remains a great challenge due to the

Yuanxin Xi; Wei Li

2009-01-01
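
The conversion rule stated in this record (bisulfite treatment followed by PCR turns unmethylated cytosines into thymines, while methylated cytosines stay as C) can be sketched in a few lines. This is an illustration of the chemistry and of why C-to-T mismatches must be tolerated during mapping; it is not BSMAP's algorithm, and the function names and the assumption of a gap-free, same-strand alignment at offset 0 are hypothetical simplifications.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Simulate the read expected after bisulfite treatment and PCR:
    every unmethylated C becomes T; methylated Cs are left unchanged."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )

def call_methylation(reference, read):
    """For a read aligned to the reference without gaps at offset 0,
    report the inferred status of each reference cytosine."""
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[i] = "methylated"      # C survived treatment
        elif read_base == "T":
            calls[i] = "unmethylated"    # C was converted to T
        else:
            calls[i] = "mismatch"        # sequencing error or variant
    return calls

reference = "ACGTCCGA"
read = bisulfite_convert(reference, methylated_positions={1})
print(read)
print(call_methylation(reference, read))
```

Because a T in a read may correspond to either a T or an unmethylated C in the reference, the read no longer matches the reference exactly, which is the mapping difficulty the abstract refers to.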

114

The sequence of the human genome.  

PubMed

A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies (a whole-genome assembly and a regional chromosome assembly) were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional approximately 12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history.
Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge. PMID:11181995

Venter, J C; Adams, M D; Myers, E W; Li, P W; Mural, R J; Sutton, G G; Smith, H O; Yandell, M; Evans, C A; Holt, R A; Gocayne, J D; Amanatides, P; Ballew, R M; Huson, D H; Wortman, J R; Zhang, Q; Kodira, C D; Zheng, X H; Chen, L; Skupski, M; Subramanian, G; Thomas, P D; Zhang, J; Gabor Miklos, G L; Nelson, C; Broder, S; Clark, A G; Nadeau, J; McKusick, V A; Zinder, N; Levine, A J; Roberts, R J; Simon, M; Slayman, C; Hunkapiller, M; Bolanos, R; Delcher, A; Dew, I; Fasulo, D; Flanigan, M; Florea, L; Halpern, A; Hannenhalli, S; Kravitz, S; Levy, S; Mobarry, C; Reinert, K; Remington, K; Abu-Threideh, J; Beasley, E; Biddick, K; Bonazzi, V; Brandon, R; Cargill, M; Chandramouliswaran, I; Charlab, R; Chaturvedi, K; Deng, Z; Di Francesco, V; Dunn, P; Eilbeck, K; Evangelista, C; Gabrielian, A E; Gan, W; Ge, W; Gong, F; Gu, Z; Guan, P; Heiman, T J; Higgins, M E; Ji, R R; Ke, Z; Ketchum, K A; Lai, Z; Lei, Y; Li, Z; Li, J; Liang, Y; Lin, X; Lu, F; Merkulov, G V; Milshina, N; Moore, H M; Naik, A K; Narayan, V A; Neelam, B; Nusskern, D; Rusch, D B; Salzberg, S; Shao, W; Shue, B; Sun, J; Wang, Z; Wang, A; Wang, X; Wang, J; Wei, M; Wides, R; Xiao, C; Yan, C; Yao, A; Ye, J; Zhan, M; Zhang, W; Zhang, H; Zhao, Q; Zheng, L; Zhong, F; Zhong, W; Zhu, S; Zhao, S; Gilbert, D; Baumhueter, S; Spier, G; Carter, C; Cravchik, A; Woodage, T; Ali, F; An, H; Awe, A; Baldwin, D; Baden, H; Barnstead, M; Barrow, I; Beeson, K; Busam, D; Carver, A; Center, A; Cheng, M L; Curry, L; Danaher, S; Davenport, L; Desilets, R; Dietz, S; Dodson, K; Doup, L; Ferriera, S; Garg, N; Gluecksmann, A; Hart, B; Haynes, J; Haynes, C; Heiner, C; Hladun, S; Hostin, D; Houck, J; Howland, T; Ibegwam, C; Johnson, J; Kalush, F; Kline, L; Koduru, S; Love, A; Mann, F; May, D; McCawley, S; McIntosh, T; McMullen, I; Moy, M; Moy, L; Murphy, B; Nelson, K; Pfannkoch, C; Pratts, E; Puri, V; Qureshi, H; Reardon, M; Rodriguez, R; Rogers, Y H; Romblad, D; Ruhfel, B; Scott, R; Sitter, C; Smallwood, M; Stewart, E; Strong, R; Suh, E; 
Thomas, R; Tint, N N; Tse, S; Vech, C; Wang, G; Wetter, J; Williams, S; Williams, M; Windsor, S; Winn-Deen, E; Wolfe, K; Zaveri, J; Zaveri, K; Abril, J F; Guigó, R; Campbell, M J; Sjolander, K V; Karlak, B; Kejariwal, A; Mi, H; Lazareva, B; Hatton, T; Narechania, A; Diemer, K; Muruganujan, A; Guo, N; Sato, S; Bafna, V; Istrail, S; Lippert, R; Schwartz, R; Walenz, B; Yooseph, S; Allen, D; Basu, A; Baxendale, J; Blick, L; Caminha, M; Carnes-Stine, J; Caulk, P; Chiang, Y H; Coyne, M; Dahlke, C; Mays, A; Dombroski, M; Donnelly, M; Ely, D; Esparham, S; Fosler, C; Gire, H; Glanowski, S; Glasser, K; Glodek, A; Gorokhov, M; Graham, K; Gropman, B; Harris, M; Heil, J; Henderson, S; Hoover, J; Jennings, D; Jordan, C; Jordan, J; Kasha, J; Kagan, L; Kraft, C; Levitsky, A; Lewis, M; Liu, X; Lopez, J; Ma, D; Majoros, W; McDaniel, J; Murphy, S; Newman, M; Nguyen, T; Nguyen, N; Nodell, M; Pan, S; Peck, J; Peterson, M; Rowe, W; Sanders, R; Scott, J; Simpson, M; Smith, T; Sprague, A; Stockwell, T; Turner, R; Venter, E; Wang, M; Wen, M; Wu, D; Wu, M; Xia, A; Zandieh, A; Zhu, X

2001-02-16
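
The coverage figures quoted in this record can be checked with simple arithmetic. The sketch below uses only numbers stated in the abstract; the variable names are illustrative, and dividing by the 2.91-billion bp consensus length is an assumption (the abstract's 5.11-fold figure presumably rests on a slightly different genome-size estimate, so the result is only approximately equal).

```python
total_bases = 14.8e9      # bp of shotgun sequence reported in the abstract
num_reads = 27_271_853    # high-quality reads reported in the abstract
consensus_len = 2.91e9    # bp of euchromatic consensus (assumed denominator)

# Mean read length implied by the totals.
mean_read_len = total_bases / num_reads

# Fold coverage = total sequenced bases / length of the target.
fold_coverage = total_bases / consensus_len

# Effective coverage after adding the shredded public data.
effective_coverage = 5.11 + 2.9

print(f"mean read length: about {mean_read_len:.0f} bp")
print(f"whole-genome shotgun coverage: about {fold_coverage:.2f}-fold")
print(f"effective coverage: about {effective_coverage:.2f}-fold")
```

The sum of the two coverage figures comes to roughly eight, consistent with the "eightfold" effective coverage stated in the abstract.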

115

Human Genome Sequencing in Health and Disease  

PubMed Central

Following the “finished,” euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges.

Gonzaga-Jauregui, Claudia; Lupski, James R.; Gibbs, Richard A.

2013-01-01

116

Single cell analysis of cancer genomes.  

PubMed

Genomic studies have provided key insights into how cancers develop, evolve, metastasize and respond to treatment. Cancers result from an interplay between mutation, selection and clonal expansions. In solid tumours, this Darwinian competition between subclones is also influenced by topological factors. Recent advances have made it possible to study cancers at the single cell level. These methods represent important tools to dissect cancer evolution and provide the potential to considerably change both cancer research and clinical practice. Here we discuss state-of-the-art methods for the isolation of a single cell, whole-genome and whole-transcriptome amplification of the cell's nucleic acids, as well as microarray and massively parallel sequencing analysis of such amplification products. We discuss the strengths and the limitations of the techniques, and explore single-cell methodologies for future cancer research, as well as diagnosis and treatment of the disease. PMID:24531336

Van Loo, Peter; Voet, Thierry

2014-02-01

117

The genetics and genomics of cancer  

Microsoft Academic Search

The past decade has seen great strides in our understanding of the genetic basis of human disease. Arguably, the most profound impact has been in the area of cancer genetics, where the explosion of genomic sequence and molecular profiling data has illustrated the complexity of human malignancies. In a tumor cell, dozens of different genes may be aberrant in structure

Joe Gray; Bruce Ponder; Allan Balmain

2003-01-01

118

A framework for sequencing the rice genome.  

PubMed

Rice is an important food crop and a model plant for other cereal genomes. The Clemson University Genomics Institute framework project, begun two years ago in anticipation of the now ongoing international effort to sequence the rice genome, is nearing completion. Two bacterial artificial chromosome (BAC) libraries have been constructed from the Oryza sativa cultivar Nipponbare. Over 100,000 BAC end sequences have been generated from these libraries and, at a current total of 28 Mbp, represent 6.5% of the total rice genome sequence. This sequence information has allowed us to draw first conclusions about unique and redundant rice genomic sequences. In addition, more than 60,000 clones (19 genome equivalents) have been successfully fingerprinted and assembled into contigs using FPC software. Many of these contigs have been anchored to the rice chromosomes using a variety of techniques. Hybridization experiments have shown these contigs to be very robust. Contig assembly and hybridization experiments have revealed some surprising insights into the organization of the rice genome, which will have significant repercussions for the sequencing effort. Integration of BAC end sequence data with anchored contig information has provided unexpected revelations on sequence organization at the chromosomal level. PMID:11387975

Presting, G G; Budiman, M A; Wood, T; Yu, Y; Kim, H R; Goicoechea, J L; Fang, E; Blackman, B; Jiang, J; Woo, S S; Dean, R A; Frisch, D; Wing, R A

2001-01-01

119

The genome sequence of Drosophila melanogaster.  

SciTech Connect

The fly Drosophila melanogaster is one of the most intensively studied organisms in biology and serves as a model system for the investigation of many developmental and cellular processes common to higher eukaryotes, including humans. We have determined the nucleotide sequence of nearly all of the approximately 120-megabase euchromatic portion of the Drosophila genome using a whole-genome shotgun sequencing strategy supported by extensive clone-based sequence and a high-quality bacterial artificial chromosome physical map. Efforts are under way to close the remaining gaps; however, the sequence is of sufficient accuracy and contiguity to be declared substantially complete and to support an initial analysis of genome structure and preliminary gene annotation and interpretation. The genome encodes approximately 13,600 genes, somewhat fewer than the smaller Caenorhabditis elegans genome, but with comparable functional diversity.

NONE

2000-03-24

120

Genome Walking by Next Generation Sequencing Approaches  

PubMed Central

Genome Walking (GW) comprises a number of PCR-based methods for the identification of nucleotide sequences flanking known regions. The different methods have been used for several purposes: from de novo sequencing, useful for the identification of unknown regions, to the characterization of insertion sites for viruses and transposons. In the latter cases Genome Walking methods have been recently boosted by coupling to Next Generation Sequencing technologies. This review will focus on the development of several protocols for the application of Next Generation Sequencing (NGS) technologies to GW, which have been developed in the course of analysis of insertional libraries. These analyses find broad application in protocols for functional genomics and gene therapy. Thanks to the application of NGS technologies, the original vision of GW as a procedure for walking along an unknown genome is now changing into the possibility of observing the parallel marching of hundreds of thousands of primers across the borders of inserted DNA molecules in host genomes.

Volpicella, Mariateresa; Leoni, Claudia; Costanza, Alessandra; Fanizza, Immacolata; Placido, Antonio; Ceci, Luigi R.

2012-01-01

121

Poultry genome sequences: progress and outstanding challenges.  

PubMed

The first build of the chicken genome sequence appeared in March, 2004 - the first genome sequence of any animal agriculture species. That sequence was done primarily by whole genome shotgun Sanger sequencing, along with the use of an extensive BAC contig-based physical map to assemble the sequence contigs and scaffolds and align them to the known chicken chromosomes and linkage groups. Subsequent sequencing and mapping efforts have improved upon that first build, and efforts continue in search of missing and/or unassembled sequence, primarily on the smaller microchromosomes and the sex chromosomes. In the past year, a draft turkey genome sequence of similar quality has been obtained at a much lower cost primarily due to the development of 'next-generation' sequencing techniques. However, assembly and alignment of the sequence contigs and scaffolds still depended on a detailed BAC contig map of the turkey genome that also utilized comparison to the existing chicken sequence. These 2 land fowl (Galliformes) genomes show a remarkable level of similarity, despite an estimated 30-40 million years of separate evolution since their last common ancestor. Among the advantages offered by these sequences are routine re-sequencing of commercial and research lines to identify the genetic correlates of phenotypic change (for example, selective sweeps), a much improved understanding of poultry diversity and linkage disequilibrium, and access to high-density SNP typing and association analysis, detailed transcriptomic and proteomic studies, and the use of genome-wide marker-assisted selection to enhance genetic gain in commercial stocks. PMID:21335957

Dodgson, J B; Delany, M E; Cheng, H H

2011-01-01

122

Next-generation sequencing: applications beyond genomes  

Microsoft Academic Search

The development of DNA sequencing more than 30 years ago has profoundly impacted biological research. In the last couple of years, remarkable technological innovations have emerged that allow the direct and cost-effective sequencing of complex samples at unprecedented scale and speed. These next-generation technologies make it feasible to sequence not only static genomes, but also entire transcriptomes expressed under different

Samuel Marguerat; Jürg Bähler

2008-01-01

123

Multifractal Analysis of Genomic Sequences CGR Images  

Microsoft Academic Search

To describe the fractal feature of chaos game representation (CGR) images of genomic sequences, a multifractal theory is presented in the analysis. With the probability set of CGR images, the general dimension spectrum and the multifractal spectrum are calculated and compared between two sample groups of gene thick sequences and gene black sequences. The experimental result shows that the probability

Weijuan Fu; Yuanyuan Wang; Daru Lu

2005-01-01
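
The chaos game representation (CGR) mentioned in this record maps a DNA sequence to points in the unit square: starting from the center, each base moves the current point halfway toward that base's corner. The sketch below is illustrative only. The corner assignment is one common convention and an assumption here, as is the reading of the abstract's "probability set" as the fraction of points falling in each cell of a 2^k by 2^k grid; the multifractal spectrum itself is not computed.

```python
# Assumed corner assignment (one common convention, not taken from the record).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}

def cgr_points(seq):
    """Return the CGR point visited after each base of the sequence."""
    x, y = 0.5, 0.5  # start at the center of the unit square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2  # move halfway to the base's corner
        points.append((x, y))
    return points

def cgr_probabilities(seq, k=3):
    """Fraction of CGR points in each cell of a 2^k x 2^k grid
    (hypothetical helper; cells with no points are omitted)."""
    n = 2 ** k
    points = cgr_points(seq)
    counts = {}
    for x, y in points:
        cell = (min(int(x * n), n - 1), min(int(y * n), n - 1))
        counts[cell] = counts.get(cell, 0) + 1
    total = len(points)
    return {cell: c / total for cell, c in counts.items()} if total else {}

print(cgr_points("AG"))
print(cgr_probabilities("ACGTTGCAACGT", k=2))
```

A grid of cell probabilities like this is the kind of measure on which generalized dimensions and a multifractal spectrum can then be estimated.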

124

Sequencing the genome of the Atlantic salmon (Salmo salar)  

PubMed Central

The International Collaboration to Sequence the Atlantic Salmon Genome (ICSASG) will produce a genome sequence that identifies and physically maps all genes in the Atlantic salmon genome and acts as a reference sequence for other salmonids.

2010-01-01

125

Decoding the evolution of a breast cancer genome  

PubMed Central

Shah et al have recently reported the first successful sequencing of the entire genome of a solid tumour (Shah et al, 2009). Philippe Bedard and Christos Sotiriou analyse their findings as well as the challenges of applying the study of cancer genomes to clinical cancer care.

Bedard, Philippe L; Sotiriou, Christos

2010-01-01

126

MapToGenome: A Comparative Genomic Tool that Aligns Transcript Maps to Sequenced Genomes  

PubMed Central

Efforts to generate whole genome assemblies and dense genetic maps have provided a wealth of gene positional information for several vertebrate species. Comparing the relative location of orthologous genes among these genomes provides perspective on genome evolution and can aid in translating genetic information between distantly related organisms. However, large-scale comparisons between genetic maps and genome assemblies can prove challenging because genetic markers are commonly derived from transcribed sequences that are incompletely and variably annotated. We developed the program MapToGenome as a tool for comparing transcript maps and genome assemblies. MapToGenome processes sequence alignments between mapped transcripts and whole genome sequence while accounting for the presence of intronic sequences, and assigns orthology based on user-defined parameters. To illustrate the utility of this program, we used MapToGenome to process alignments between vertebrate genetic maps and genome assemblies 1) self/self alignments for maps and assemblies of the rat and zebrafish genome; 2) alignments between vertebrate transcript maps (rat, salamander, zebrafish, and medaka) and the chicken genome; and 3) alignments of the medaka and zebrafish maps to the pufferfish (Tetraodon nigroviridis) genome. Our results show that map-genome alignments can be improved by combining alignments across presumptive intron breaks and ignoring alignments for simple sequence length polymorphism (SSLP) marker sequences. Comparisons between vertebrate maps and genomes reveal broad patterns of conservation among vertebrate genomes and the differential effects of genome rearrangement over time and across lineages.

Putta, Srikrishna; Smith, Jeramiah J.; Staben, Chuck; Voss, S. Randal

2007-01-01

127

Streptococcal taxonomy based on genome sequence analyses.  

PubMed

The identification of the clinically relevant viridans streptococci group, at species level, is still problematic. The aim of this study was to extract taxonomic information from the complete genome sequences of 67 streptococci, comprising 19 species, by means of genomic analyses, multilocus sequence analysis (MLSA), average amino acid identity (AAI), genomic signatures, genome-to-genome distances (GGD) and codon usage bias. We then attempted to determine the usefulness of these genomic tools for species identification in streptococci. Our results showed that MLSA, AAI and GGD analyses are robust markers to identify streptococci at the species level, for instance, S. pneumoniae, S. mitis, and S. oralis. A Streptococcus species can be defined as a group of strains that share ≥ 95% DNA similarity in MLSA and AAI, and > 70% DNA identity in GGD. This approach allows an advanced understanding of bacterial diversity. PMID:24358875

Thompson, Cristiane C; Emmel, Vanessa E; Fonseca, Erica L; Marin, Michel A; Vicente, Ana Carolina P

2013-01-01
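
The species definition quoted in the record above amounts to a three-way threshold test. The sketch below encodes it under two stated assumptions: the MLSA and AAI cutoff is read as "at least 95%", and all inputs are percentages that have already been computed by other tools. The function name `same_species` and its parameters are hypothetical and are not taken from the paper.

```python
def same_species(mlsa_similarity, aai, ggd_identity):
    """Apply the abstract's cutoffs to a pair of strains.

    All arguments are percentages (0-100).
    MLSA and AAI: at least 95% (assumed reading of the cutoff).
    GGD: strictly greater than 70%.
    """
    return mlsa_similarity >= 95.0 and aai >= 95.0 and ggd_identity > 70.0
```

All three criteria must hold at once, so a pair that passes MLSA and AAI but sits exactly at 70% GGD identity is not grouped.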

128

Streptococcal taxonomy based on genome sequence analyses  

PubMed Central

The identification of the clinically relevant viridans streptococci group, at species level, is still problematic. The aim of this study was to extract taxonomic information from the complete genome sequences of 67 streptococci, comprising 19 species, by means of genomic analyses, multilocus sequence analysis (MLSA), average amino acid identity (AAI), genomic signatures, genome-to-genome distances (GGD) and codon usage bias. We then attempted to determine the usefulness of these genomic tools for species identification in streptococci. Our results showed that MLSA, AAI and GGD analyses are robust markers to identify streptococci at the species level, for instance, S. pneumoniae, S. mitis, and S. oralis. A Streptococcus species can be defined as a group of strains that share ≥ 95% DNA similarity in MLSA and AAI, and > 70% DNA identity in GGD. This approach allows an advanced understanding of bacterial diversity.

2013-01-01

129

Estimation of rearrangement phylogeny for cancer genomes  

PubMed Central

Cancer genomes are complex, carrying thousands of somatic mutations including base substitutions, insertions and deletions, rearrangements, and copy number changes that have been acquired over decades. Recently, technologies have been introduced that allow generation of high-resolution, comprehensive catalogs of somatic alterations in cancer genomes. However, analyses of these data sets generally do not indicate the order in which mutations have occurred, or the resulting karyotype. Here, we introduce a mathematical framework that begins to address this problem. By using samples with accurate data sets, we can reconstruct relatively complex temporal sequences of rearrangements and provide an assembly of genomic segments into digital karyotypes. For cancer genes mutated in rearranged regions, this information can provide a chronological examination of the selective events that have taken place.

Greenman, Chris D.; Pleasance, Erin D.; Newman, Scott; Yang, Fengtang; Fu, Beiyuan; Nik-Zainal, Serena; Jones, David; Lau, King Wai; Carter, Nigel; Edwards, Paul A.W.; Futreal, P. Andrew; Stratton, Michael R.; Campbell, Peter J.

2012-01-01

130

High-throughput bisulfite sequencing in mammalian genomes  

Microsoft Academic Search

DNA methylation is a critical epigenetic mark that is essential for mammalian development and aberrant in many diseases including cancer. Over the past decade multiple methods have been developed and applied to characterize its genome-wide distribution. Of these, reduced representation bisulfite sequencing (RRBS) generates nucleotide resolution DNA methylation bisulfite sequencing libraries that enrich for CpG-dense regions by methylation-insensitive restriction digestion.

Zachary D. Smith; Hongcang Gu; Christoph Bock; Andreas Gnirke; Alexander Meissner

2009-01-01

131

Genome Sequence of Serratia plymuthica V4.  

PubMed

Serratia spp. are gammaproteobacteria and members of the family Enterobacteriaceae. Here, we announce the genome sequence of Serratia plymuthica strain V4, which produces the siderophore serratiochelin and antimicrobial compounds. PMID:24831138

Cleto, S; Van der Auwera, G; Almeida, C; Vieira, M J; Vlamakis, H; Kolter, R

2014-01-01

132

Genome Sequence of Serratia plymuthica V4  

PubMed Central

Serratia spp. are gammaproteobacteria and members of the family Enterobacteriaceae. Here, we announce the genome sequence of Serratia plymuthica strain V4, which produces the siderophore serratiochelin and antimicrobial compounds.

Cleto, S.; Van der Auwera, G.; Almeida, C.; Vieira, M. J.; Vlamakis, H.

2014-01-01

133

DNA sequencing, automation, and the human genome  

SciTech Connect

DNA sequencing is one of the key analytical operations of modern molecular biology and a crucial element of biotechnology. The principles of DNA sequencing and details of the technologies of both manual, radioisotope-based and automated, fluorescence-based approaches are described. The goals and rationale of the Human Genome Initiative are discussed along with implications for future sequencing technologies. Finally, a glimpse of emerging DNA sequencing technologies is offered.

Trainor, G.L. (E. I. du Pont de Nemours and Co., Inc., Wilmington, DE (USA))

1990-03-01

134

Complete genome sequences of nine mycobacteriophages.  

PubMed

Genome analyses of a large number of mycobacteriophages, bacterial viruses that infect members of the genus Mycobacterium, yielded novel enzymes and tools for the genetic manipulation of mycobacteria. We report here the complete genome sequences of nine mycobacteriophages, including a new singleton, isolated using Mycobacterium smegmatis mc²155 as a host strain. PMID:24874666

Franceschelli, Jorgelina Judith; Suarez, Cristian Alejandro; Terán, Lucrecia; Raya, Raúl Ricardo; Morbidoni, Héctor Ricardo

2014-01-01

135

Draft Genome Sequence of Lactobacillus crispatus 2029.  

PubMed

This report describes a draft genome sequence of Lactobacillus crispatus 2029. The reads generated by the Ion Torrent PGM were assembled into contigs with a total size of 2.2 Mb. The data were annotated using the NCBI GenBank and RAST servers. A comparison with the reference strain revealed specific features of the genome. PMID:24558253

Karlyshev, Andrey V; Melnikov, Vyacheslav G; Khlebnikov, Valentin C; Abramov, Vyacheslav M

2014-01-01

136

Complete Genome Sequences of Nine Mycobacteriophages  

PubMed Central

Genome analyses of a large number of mycobacteriophages, bacterial viruses that infect members of the genus Mycobacterium, yielded novel enzymes and tools for the genetic manipulation of mycobacteria. We report here the complete genome sequences of nine mycobacteriophages, including a new singleton, isolated using Mycobacterium smegmatis mc²155 as a host strain.

Franceschelli, Jorgelina Judith; Suarez, Cristian Alejandro; Teran, Lucrecia; Raya, Raul Ricardo

2014-01-01

137

Complementary DNA sequencing: Expressed sequence tags and human genome project  

SciTech Connect

Automated partial DNA sequencing was conducted on more than 600 randomly selected human brain complementary DNA (cDNA) clones to generate expressed sequence tags (ESTs). ESTs have applications in the discovery of new human genes, mapping of the human genome, and identification of coding regions in genomic sequences. Of the sequences generated, 337 represent new genes, including 48 with significant similarity to genes from other organisms, such as a yeast RNA polymerase II subunit; Drosophila kinesin, Notch, and Enhancer of split; and a murine tyrosine kinase receptor. Forty-six ESTs were mapped to chromosomes after amplification by the polymerase chain reaction. This fast approach to cDNA characterization will facilitate the tagging of most human genes in a few years at a fraction of the cost of complete genomic sequencing, provide new genetic markers, and serve as a resource in diverse biological research fields.

Adams, M.D.; Kelley, J.M.; Gocayne, J.D.; Dubnick, M.; Wu, A.; Olde, B.; Moreno, R.F.; Kerlavage, A.R.; McCombie, W.R.; Venter, J.C. (National Institutes of Health, Bethesda, MD (United States)); Polymeropoulos, M.H.; Hong Xiao; Merril, C.R. (National Inst. of Mental Health, Washington, DC (United States))

1991-06-21

138

The Genetic Basis of Pancreas Cancer Development and Progression: Insights From Whole-Exome and Whole-Genome Sequencing  

PubMed Central

Pancreatic cancer is caused by inherited and acquired mutations in specific cancer-associated genes. The discovery of the most common genetic alterations in pancreatic cancer has not only provided insight into the fundamental pathways driving the progression from a normal cell, to non-invasive precursor lesions, to widely metastatic disease, but recent genetic discoveries have also opened new opportunities for gene-based approaches to early detection, personalized treatment, and molecular classification of pancreatic neoplasms.

Iacobuzio-Donahue, Christine A.; Velculescu, Victor E.; Wolfgang, Christopher L.; Hruban, Ralph H.

2012-01-01

139

Genomic Instability and Breast Cancer.  

National Technical Information Service (NTIS)

We are continuing our investigation of mechanisms underlying the maintenance of genomic stability and breast cancer development. Our analyses on BRCA1 and DNA damage response have resulted in the identification of several new components involved in DNA da...

J. Chen

2011-01-01

140

COMPLEX LANDSCAPES OF SOMATIC REARRANGEMENT IN HUMAN BREAST CANCER GENOMES  

PubMed Central

SUMMARY Multiple somatic rearrangements are often found in cancer genomes. However, the underlying processes of rearrangement and their contribution to cancer development are poorly characterised. Here, we employed a paired-end sequencing strategy to identify somatic rearrangements in breast cancer genomes. There are more rearrangements in some breast cancers than previously appreciated. Rearrangements are more frequent over gene footprints and most are intrachromosomal. Multiple architectures of rearrangement are present, but tandem duplications are common in some cancers, perhaps reflecting a specific defect in DNA maintenance. Short overlapping sequences at most rearrangement junctions suggest that these have been mediated by non-homologous end-joining DNA repair, although varying sequence patterns indicate that multiple processes of this type are operative. Several expressed in-frame fusion genes were identified but none were recurrent. The study provides a new perspective on cancer genomes, highlighting the diversity of somatic rearrangements and their potential contribution to cancer development.

Stephens, Philip J; McBride, David J; Lin, Meng-Lay; Varela, Ignacio; Pleasance, Erin D; Simpson, Jared T; Stebbings, Lucy A; Leroy, Catherine; Edkins, Sarah; Mudie, Laura J; Greenman, Chris D; Jia, Mingming; Latimer, Calli; Teague, Jon W; Lau, King Wai; Burton, John; Quail, Michael A; Swerdlow, Harold; Churcher, Carol; Natrajan, Rachael; Sieuwerts, Anieta M; Martens, John WM; Silver, Daniel P; Langerod, Anita; Russnes, Hege EG; Foekens, John A; Reis-Filho, Jorge S; van 't Veer, Laura; Richardson, Andrea L; Børresen-Dale, Anne-Lise; Campbell, Peter J; Futreal, P Andrew; Stratton, Michael R

2012-01-01

141

Genomics at the Ontario Institute for Cancer Research  

SciTech Connect

Johar Ali of the Ontario Institute for Cancer Research discusses genomics and next-gen applications at the OICR on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM

Ali, Johar [Ontario Institute for Cancer Research]

2010-06-02

142

Sorghum Genome Sequencing by Methylation Filtration  

Microsoft Academic Search

Sorghum bicolor is a close relative of maize and is a staple crop in Africa and much of the developing world because of its superior tolerance of arid growth conditions. We have generated sequence from the hypomethylated portion of the sorghum genome by applying methylation filtration (MF) technology. The evidence suggests that 96% of the genes have been sequence tagged,

Joseph A. Bedell; Muhammad A. Budiman; Andrew Nunberg; Robert W. Citek; Dan Robbins; Joshua Jones; Elizabeth Flick; Theresa Rohlfing; Jason Fries; Kourtney Bradford; Jennifer McMenamy; Michael Smith; Heather Holeman; Bruce A. Roe; Graham Wiley; Ian F. Korf; Pablo D. Rabinowicz; Nathan Lakey; W. Richard McCombie; Jeffrey A. Jeddeloh; Robert A. Martienssen

2005-01-01

143

Thirteen Camellia chloroplast genome sequences determined by high-throughput sequencing: genome structure and phylogenetic relationships  

PubMed Central

Background Camellia is an economically and phylogenetically important genus in the family Theaceae. Owing to frequent hybridization and polyploidization, it is taxonomically and phylogenetically ranked as one of the most challenging taxa in plants. Sequence comparisons of chloroplast (cp) genomes are of great interest to provide robust evidence for taxonomic studies, species identification and understanding mechanisms that underlie the evolution of the Camellia species. Results The eight complete cp genomes and five draft cp genome sequences of Camellia species were determined using Illumina sequencing technology via a combined strategy of de novo and reference-guided assembly. The Camellia cp genomes exhibited typical circular structure that was rather conserved in genomic structure and the synteny of gene order. Differences of repeat sequences, simple sequence repeats, indels and substitutions were further examined among five complete cp genomes, representing a wide phylogenetic diversity in the genus. A total of fifteen molecular markers were identified with more than 1.5% sequence divergence that may be useful for further phylogenetic analysis and species identification of Camellia. Our results showed that, rather than functional constraints, it is the regional constraints that strongly affect sequence evolution of the cp genomes. In a substantial improvement over prior studies, evolutionary relationships of the section Thea were determined on the basis of phylogenomic analyses of cp genome sequences. Conclusions Despite a high degree of conservation between the Camellia cp genomes, sequence variation among species could still be detected, representing a wide phylogenetic diversity in the genus. Furthermore, phylogenomic analysis was conducted using 18 complete cp genomes and 5 draft cp genome sequences of Camellia species. Our results support Chang’s taxonomical treatment that C. pubicosta may be classified into sect. Thea, and indicate that the taxonomic value of the number of ovaries should be reconsidered when classifying the Camellia species. The availability of these cp genomes provides valuable genetic information for accurately identifying species, clarifying taxonomy and reconstructing the phylogeny of the genus Camellia.

2014-01-01

144

Mauve: Multiple Alignment of Conserved Genomic Sequence With Rearrangements  

Microsoft Academic Search

As genomes evolve, they undergo large-scale evolutionary processes that present a challenge to sequence comparison not posed by short sequences. Recombination causes frequent genome rearrangements, horizontal transfer introduces new sequences into bacterial chromosomes, and deletions remove segments of the genome. Consequently, each genome is a mosaic of unique lineage-specific segments, regions shared with a subset of other genomes and segments

Aaron C. E. Darling; Bob Mau; Frederick R. Blattner; Nicole T. Perna

2004-01-01

145

Whole Genome Sequence of a Turkish Individual  

PubMed Central

Although whole human genome sequencing can be done with readily available technical and financial resources, the need for detailed analyses of genomes of certain populations still exists. Here we present, for the first time, sequencing and analysis of a Turkish human genome. We have performed 35x coverage using paired-end sequencing, where over 95% of sequencing reads are mapped to the reference genome covering more than 99% of the bases. The assembly of unmapped reads rendered 11,654 contigs, 2,168 of which did not reveal any homology to known sequences, resulting in ~1 Mbp of unmapped sequence. Single nucleotide polymorphism (SNP) discovery resulted in 3,537,794 SNP calls with 29,184 SNPs identified in coding regions, where 106 were nonsense and 259 were categorized as having a high-impact effect. The homo/hetero zygosity (1,415,123:2,122,671 or 1:1.5) and transition/transversion ratios (2,383,204:1,154,590 or 2.06:1) were within expected limits. Of the identified SNPs, 480,396 were potentially novel with 2,925 in coding regions, including 48 nonsense and 95 high-impact SNPs. Functional analysis of novel high-impact SNPs revealed various interaction networks, notably involving hereditary and neurological disorders or diseases. Assembly results indicated 713,640 indels (1:1.09 insertion/deletion ratio), ranging from −52 bp to 34 bp in length and causing about 180 codon insertion/deletions and 246 frame shifts. Using paired-end- and read-depth-based methods, we discovered 9,109 structural variants and compared our variant findings with other populations. Our results suggest that whole genome sequencing is a valuable tool for understanding variations in the human genome across different populations. Detailed analyses of genomes of diverse origins greatly benefit research in genetics and medicine and should be conducted on a larger scale.

Dogan, Haluk; Can, Handan; Otu, Hasan H.

2014-01-01
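
The SNP counts quoted in the record above can be checked for internal consistency with simple arithmetic. The sketch below only restates numbers from the abstract; the variable and function names are illustrative and are not part of any published pipeline.

```python
# Counts as quoted in the abstract.
TOTAL_SNPS = 3_537_794
HOMOZYGOUS, HETEROZYGOUS = 1_415_123, 2_122_671
TRANSITIONS, TRANSVERSIONS = 2_383_204, 1_154_590
NOVEL_SNPS = 480_396


def summarize():
    """Derive the ratios that the abstract reports from the raw counts."""
    return {
        # Both partitions should add up to the total number of SNP calls.
        "zygosity_sums_to_total": HOMOZYGOUS + HETEROZYGOUS == TOTAL_SNPS,
        "ti_tv_sums_to_total": TRANSITIONS + TRANSVERSIONS == TOTAL_SNPS,
        # Heterozygous calls per homozygous call (reported as 1:1.5).
        "het_per_hom": HETEROZYGOUS / HOMOZYGOUS,
        # Transition/transversion ratio (reported as 2.06:1).
        "ti_tv": TRANSITIONS / TRANSVERSIONS,
        # Share of SNP calls flagged as potentially novel.
        "novel_fraction": NOVEL_SNPS / TOTAL_SNPS,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(name, value)
```

Both partitions sum exactly to the 3,537,794 SNP calls, and the derived ratios round to the reported 1:1.5 and 2.06:1.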

146

Finishing the euchromatic sequence of the human genome  

Microsoft Academic Search

The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and

2004-01-01

147

Automated design of bacterial genome sequences  

PubMed Central

Background Organisms have evolved ways of regulating transcription to better adapt to varying environments. Could the current functional genomics data and models support the possibility of engineering a genome with completely rearranged gene organization while the cell maintains its behavior under environmental challenges? How would we proceed to design a full nucleotide sequence for such genomes? Results As a first step towards answering such questions, recent work showed that it is possible to design alternative transcriptomic models showing the same behavior under environmental variations as the wild-type model. A second step would require providing evidence that it is possible to provide a nucleotide sequence for a genome encoding such a transcriptional model. We used computational design techniques to design a rewired global transcriptional regulation of Escherichia coli, yet showing a transcriptomic response similar to that of the wild type. Afterwards, we “compiled” the transcriptional networks into nucleotide sequences to obtain the final genome sequence. Our computational evolution procedure ensures that we can maintain the genotype-phenotype mapping during the rewiring of the regulatory network. We found that it is theoretically possible to reorganize the E. coli genome into 86% fewer regulated operons. Such refactored genomes are constituted by operons that contain sets of genes sharing around 60% of their biological functions and, if evolved under highly variable environmental conditions, have regulatory networks that respond more than 20% faster to multiple external perturbations. Conclusions This work provides the first algorithm for producing a genome sequence encoding a rewired transcriptional regulation with wild-type behavior under alternative environments.

2013-01-01

148

International Rice Genome Sequencing Project: the effort to completely sequence the rice genome.  

PubMed

The International Rice Genome Sequencing Project (IRGSP) involves researchers from ten countries who are working to completely and accurately sequence the rice genome within a short period. Sequencing uses a map-based clone-by-clone shotgun strategy; shared bacterial artificial chromosome/P1-derived artificial chromosome libraries have been constructed from Oryza sativa ssp. japonica variety 'Nipponbare'. End-sequencing, fingerprinting and marker-aided PCR screening are being used to make sequence-ready contigs. Annotated sequences are immediately released for public use and are made available with supplemental information at each IRGSP member's website. The IRGSP works to promote the development of rice and cereal genomics in addition to producing genome sequence data. PMID:10712951

Sasaki, T; Burr, B

2000-04-01

149

Chicken genomics charts a path to the genome sequence.  

PubMed

In this paper, the current status of chicken genomics is reviewed. This is timely given the current intense activity centred on sequencing the complete genome of this model species. The genome project is based on a decade of map building by genetic linkage and cytogenetic methods, which are now being replaced by high-resolution radiation hybrid and bacterial artificial chromosome (BAC) contig maps. Markers for map building have generally depended on labour-intensive screening procedures, but in recent years this has changed with the availability of almost 500,000 chicken expressed sequence tags (ESTs). These resources and tools will be critical in the coming months when the chicken genome sequence is being assembled (eg cross-checked with other maps) and annotated (eg gene structures based on ESTs). The future for chicken genome and biological research is an exciting one, through the integration of these resources. For example, through the proposed chicken Ensembl database, it will be possible to solve challenging scientific questions by exploiting the power of a chicken model. One area of interest is the study of developmental mechanisms and the discovery of regulatory networks throughout the genome. Another is the study of the molecular nature of quantitative genetic variation. No other animal species have been phenotyped and selected so intensively as agricultural animals and thus there is much to be learned in basic and medical biology from this research. PMID:15163360

Burt, David W

2004-04-01

150

Analyzing the cancer methylome through targeted bisulfite sequencing.  

PubMed

Bisulfite conversion of genomic DNA combined with next-generation sequencing (NGS) has become a very effective approach for mapping the whole-genome and sub-genome wide DNA methylation landscapes. However, whole methylome shotgun bisulfite sequencing is still expensive and not suitable for analyzing large numbers of human cancer specimens. Recent advances in the development of targeted bisulfite sequencing approaches offer several attractive alternatives. The characteristics and applications of these methods are discussed in this review article. In addition, the bioinformatic tools that can be used for sequence capture probe design as well as downstream sequence analyses are also addressed. PMID:23200671

Lee, Eun-Joon; Luo, Junfeng; Wilson, James M; Shi, Huidong

2013-11-01
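
The principle behind bisulfite sequencing in the record above is that unmethylated cytosines are converted and read as thymine, while methylated cytosines are protected and still read as cytosine. The toy sketch below simulates this for the top strand only and then calls methylation by comparing a converted read with its reference. It assumes a perfectly aligned, error-free read of the same length as the reference; the function names are hypothetical and do not correspond to any tool discussed in the review.

```python
def bisulfite_convert(seq, methylated=frozenset()):
    """Simulate top-strand bisulfite conversion.

    Unmethylated C is read as T; C at an index listed in
    `methylated` (0-based) is left unchanged.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )


def call_methylation(reference, read):
    """Return {position: is_methylated} for every reference C.

    A C observed in the read means the cytosine was protected
    (methylated); a T means it was converted (unmethylated).
    Other observed bases are ignored as uninformative.
    """
    calls = {}
    for i, (ref, obs) in enumerate(zip(reference.upper(), read.upper())):
        if ref == "C" and obs in ("C", "T"):
            calls[i] = obs == "C"
    return calls
```

For example, converting `ACGTCCGA` with position 1 methylated gives `ACGTTTGA`, and comparing that read with the reference recovers position 1 as methylated and positions 4 and 5 as unmethylated.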

151

Genome sequence of Haemophilus parasuis strain 29755  

PubMed Central

Haemophilus parasuis is a member of the family Pasteurellaceae and is the etiologic agent of Glässer’s disease in pigs, a systemic syndrome associated with only a subset of isolates. The genetic basis for virulence and systemic spread of particular H. parasuis isolates is currently unknown. Strain 29755 is an invasive isolate that has long been used in the study of Glässer’s disease. Accordingly, the genome sequence of strain 29755 is of considerable importance to investigators endeavoring to understand the molecular pathogenesis of H. parasuis. Here we describe the features of the 2,224,137 bp draft genome sequence of strain 29755 generated from 454-FLX pyrosequencing. These data comprise the first publicly available genome sequence for this bacterium.

Mullins, Michael A.; Bayles, Darrell O.; Dyer, David W.; Kuehn, Joanna S.; Phillips, Gregory J.

2011-01-01

152

Genome walking by next generation sequencing approaches.  

PubMed

Genome Walking (GW) comprises a number of PCR-based methods for the identification of nucleotide sequences flanking known regions. The different methods have been used for several purposes: from de novo sequencing, useful for the identification of unknown regions, to the characterization of insertion sites for viruses and transposons. In the latter cases Genome Walking methods have been recently boosted by coupling to Next Generation Sequencing technologies. This review will focus on the development of several protocols for the application of Next Generation Sequencing (NGS) technologies to GW, which have been developed in the course of analysis of insertional libraries. These analyses find broad application in protocols for functional genomics and gene therapy. Thanks to the application of NGS technologies, the original vision of GW as a procedure for walking along an unknown genome is now changing into the possibility of observing the parallel marching of hundreds of thousands of primers across the borders of inserted DNA molecules in host genomes. PMID:24832505

Volpicella, Mariateresa; Leoni, Claudia; Costanza, Alessandra; Fanizza, Immacolata; Placido, Antonio; Ceci, Luigi R

2012-01-01

153

Sequencing and comparative analysis of the gorilla MHC genomic sequence  

PubMed Central

Major histocompatibility complex (MHC) genes play a critical role in vertebrate immune response and because the MHC is linked to a significant number of auto-immune and other diseases it is of great medical interest. Here we describe the clone-based sequencing and subsequent annotation of the MHC region of the gorilla genome. Because the MHC is subject to extensive variation, both structural and sequence-wise, it is not readily amenable to study in whole genome shotgun sequence such as the recently published gorilla genome. The variation of the MHC also makes it of evolutionary interest and therefore we analyse the sequence in the context of human and chimpanzee. In our comparisons with human and re-annotated chimpanzee MHC sequence we find that gorilla has a trimodular RCCX cluster, versus the reference human bimodular cluster, and additional copies of Class I (pseudo)genes between Gogo-K and Gogo-A (the orthologues of HLA-K and -A). We also find that Gogo-H (and Patr-H) is coding versus the HLA-H pseudogene and, conversely, there is a Gogo-DQB2 pseudogene versus the HLA-DQB2 coding gene. Our analysis, which is freely available through the VEGA genome browser, provides the research community with a comprehensive dataset for comparative and evolutionary research of the MHC.

Wilming, Laurens G.; Hart, Elizabeth A.; Coggill, Penny C.; Horton, Roger; Gilbert, James G. R.; Clee, Chris; Jones, Matt; Lloyd, Christine; Palmer, Sophie; Sims, Sarah; Whitehead, Siobhan; Wiley, David; Beck, Stephan; Harrow, Jennifer L.

2013-01-01

154

A re-sequence analysis of genomic loci on chromosomes 1q32.1, 5p15.33 and 13q22.1 associated with pancreatic cancer risk  

PubMed Central

Objectives To fine-map common pancreatic cancer susceptibility regions. Methods We conducted targeted Roche-454 re-sequencing across 428 kb in three genomic regions identified in genome-wide association studies (GWAS) of pancreatic cancer, on chromosomes 1q32.1, 5p15.33 and 13q22.1. Results An analytical pipeline for calling genotypes was developed using HapMap samples sequenced on chr5p15.33. Concordance to 1000 Genomes data for chr5p15.33 was >96%. The concordance for chr1q32.1 and chr13q22.1 with pancreatic cancer GWAS data was >99%. Between 9.2% and 19.0% of variants detected were not present in 1000 Genomes for the respective continental population. The majority of completely novel SNPs were less common (MAF ≤ 5%) or rare (MAF ≤ 2%), illustrating the value of enlarging test sets for discovery of less common variants. Using the dataset, we examined haplotype blocks across each region using a tag SNP analysis (r² > 0.8 for MAF ≥ 5%) and determined that at least 196, 243 and 63 SNPs are required for fine-mapping chr1q32.1, chr5p15.33, and chr13q22.1, respectively, in European populations. Conclusions We have characterized germline variation in three regions associated with pancreatic cancer risk and show that targeted re-sequencing leads to the discovery of novel variants and improves the completeness of germline sequence variants for fine-mapping GWAS susceptibility loci.

Parikh, Hemang; Jia, Jinping; Zhang, Xijun; Chung, Charles C.; Jacobs, Kevin B.; Yeager, Meredith; Boland, Joseph; Hutchinson, Amy; Burdett, Laura; Hoskins, Jason; Risch, Harvey A.; Stolzenberg-Solomon, Rachael Z.; Chanock, Stephen J.; Wolpin, Brian M.; Petersen, Gloria M.; Fuchs, Charles S.; Hartge, Patricia; Amundadottir, Laufey

2012-01-01
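
The tag SNP analysis in the record above can be illustrated with a greedy selection over pairwise r² values. This is a sketch under stated assumptions, not the study's method: the abstract does not say which tagging algorithm was used, the greedy set-cover strategy here is a swapped-in, commonly used heuristic, and the SNP identifiers, the `r2` lookup structure and the function name are all hypothetical. Filtering by minor allele frequency is assumed to have been done before calling the function.

```python
def greedy_tag_snps(snps, r2, threshold=0.8):
    """Pick tag SNPs so that every SNP is tagged at r^2 > threshold.

    snps: list of SNP identifiers (already MAF-filtered).
    r2:   dict mapping frozenset({a, b}) to the pairwise r^2 value;
          missing pairs are treated as unlinked (r^2 = 0).
    """
    def covers(a, b):
        return a == b or r2.get(frozenset((a, b)), 0.0) > threshold

    untagged = set(snps)
    tags = []
    while untagged:
        # Choose the SNP that captures the most still-untagged SNPs.
        best = max(snps, key=lambda s: sum(covers(s, u) for u in untagged))
        tags.append(best)
        untagged -= {u for u in untagged if covers(best, u)}
    return tags
```

Each round removes at least one SNP from the untagged set, because every SNP covers itself, so the loop always terminates. The number of tags returned is an upper bound on the minimum and depends on the greedy order.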

155

Chapter 14: Cancer Genome Analysis  

PubMed Central

Although there is great promise in the benefits to be obtained by analyzing cancer genomes, numerous challenges hinder different stages of the process, from the problem of sample preparation and the validation of the experimental techniques, to the interpretation of the results. This chapter specifically focuses on the technical issues associated with the bioinformatics analysis of cancer genome data. The main issues addressed are the use of database and software resources, the use of analysis workflows and the presentation of clinically relevant action items. We attempt to aid new developers in the field by describing the different stages of analysis and discussing current approaches, as well as by providing practical advice on how to access and use resources, and how to implement recommendations. Real cases from cancer genome projects are used as examples.

Vazquez, Miguel; de la Torre, Victor; Valencia, Alfonso

2012-01-01

156

Initial genome sequencing and analysis of multiple myeloma  

PubMed Central

Multiple myeloma is an incurable malignancy of plasma cells, and its pathogenesis is poorly understood. Here we report the massively parallel sequencing of 38 tumor genomes and their comparison to matched normal DNAs. Several new and unexpected oncogenic mechanisms were suggested by the pattern of somatic mutation across the dataset. These include the mutation of genes involved in protein translation (seen in nearly half of the patients), genes involved in histone methylation, and genes involved in blood coagulation. In addition, a broader than anticipated role of NF-κB signaling was suggested by mutations in 11 members of the NF-κB pathway. Of potential immediate clinical relevance, activating mutations of the kinase BRAF were observed in 4% of patients, suggesting the evaluation of BRAF inhibitors in multiple myeloma clinical trials. These results indicate that cancer genome sequencing of large collections of samples will yield new insights into cancer not anticipated by existing knowledge.

Chapman, Michael A.; Lawrence, Michael S.; Keats, Jonathan J.; Cibulskis, Kristian; Sougnez, Carrie; Schinzel, Anna C.; Harview, Christina L.; Brunet, Jean-Philippe; Ahmann, Gregory J.; Adli, Mazhar; Anderson, Kenneth C.; Ardlie, Kristin G.; Auclair, Daniel; Baker, Angela; Bergsagel, P. Leif; Bernstein, Bradley E.; Drier, Yotam; Fonseca, Rafael; Gabriel, Stacey B.; Hofmeister, Craig C.; Jagannath, Sundar; Jakubowiak, Andrzej J.; Krishnan, Amrita; Levy, Joan; Liefeld, Ted; Lonial, Sagar; Mahan, Scott; Mfuko, Bunmi; Monti, Stefano; Perkins, Louise M.; Onofrio, Robb; Pugh, Trevor J.; Vincent Rajkumar, S.; Ramos, Alex H.; Siegel, David S.; Sivachenko, Andrey; Trudel, Suzanne; Vij, Ravi; Voet, Douglas; Winckler, Wendy; Zimmerman, Todd; Carpten, John; Trent, Jeff; Hahn, William C.; Garraway, Levi A.; Meyerson, Matthew; Lander, Eric S.; Getz, Gad; Golub, Todd R.

2013-01-01

157

Genomic instability in cancer.  

PubMed

One of the fundamental challenges facing the cell is to accurately copy its genetic material to daughter cells. When this process goes awry, genomic instability ensues in which genetic alterations ranging from nucleotide changes to chromosomal translocations and aneuploidy occur. Organisms have developed multiple mechanisms that can be classified into two major classes to ensure the fidelity of DNA replication. The first class includes mechanisms that prevent premature initiation of DNA replication and ensure that the genome is fully replicated once and only once during each division cycle. These include cyclin-dependent kinase (CDK)-dependent mechanisms and CDK-independent mechanisms. Although CDK-dependent mechanisms are largely conserved in eukaryotes, higher eukaryotes have evolved additional mechanisms that seem to play a larger role in preventing aberrant DNA replication and genome instability. The second class ensures that cells are able to respond to various cues that continuously threaten the integrity of the genome by initiating DNA-damage-dependent "checkpoints" and coordinating DNA damage repair mechanisms. Defects in the ability to safeguard against aberrant DNA replication and to respond to DNA damage contribute to genomic instability and the development of human malignancy. In this article, we summarize our current knowledge of how genomic instability arises, with a particular emphasis on how the DNA replication process can give rise to such instability. PMID:23335075

Abbas, Tarek; Keaton, Mignon A; Dutta, Anindya

2013-03-01

158

Sequencing Your Genome: What Does It Mean?  

PubMed Central

The human genome contains approximately 3.2 billion nucleotides and about 23,500 genes. Each gene has protein-coding regions that are referred to as exons. The human genome contains about 180,000 exons, which are collectively called an exome. An exome comprises about 1% of the human genome and hence is about 30 million nucleotides in size. Today’s technologies afford the opportunity to sequence all nucleotides in the human exome and even in the human genome. Given that more than three-quarters of the known disease-causing variants are located in the exome, and considering the cost and technical challenges in analyzing the whole genome sequence data, the focus of present research is primarily on whole exome sequencing (WES). While WES at the medical sequencing level is still expensive, it is becoming more affordable. Cost will not likely be a major barrier in the near future, and the data analysis is becoming less tedious. The most difficult challenge at the heart of medical sequencing is interpreting the findings. Each exome contains about 13,500 single nucleotide variants (SNVs) that affect the amino acid sequence, and a large number are expected to be functional variants. The daunting task is to distinguish the variants that are pathogenic from those that have minimal or no discernible clinical effects. While various algorithms exist, none are sufficiently robust. Thus, in-depth knowledge in genetics and medicine is essential for the proper interpretation of the WES findings. This review will discuss the potential applications of the WES data in the practice of cardiovascular medicine.

2014-01-01
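
The size estimates quoted in the abstract above can be checked with a few lines of arithmetic. This is only a sketch that uses the rounded figures from the abstract (3.2 billion nucleotides, about 1% exome, 180,000 exons, 23,500 genes); the variable names are illustrative.

```python
GENOME_NT = 3_200_000_000   # approximate genome size from the abstract
EXOME_PERCENT = 1           # exome share of the genome, in percent
EXON_COUNT = 180_000        # approximate number of exons
GENE_COUNT = 23_500         # approximate number of genes

# 1% of 3.2 billion nucleotides, kept in integer arithmetic.
exome_nt = GENOME_NT * EXOME_PERCENT // 100

# Average exon length and exons per gene implied by the rounded figures.
mean_exon_nt = exome_nt / EXON_COUNT
exons_per_gene = EXON_COUNT / GENE_COUNT

print(f"exome size:     {exome_nt:,} nt")
print(f"mean exon size: {mean_exon_nt:.0f} nt")
print(f"exons per gene: {exons_per_gene:.1f}")
```

With these inputs the exome comes to 32 million nucleotides, which is consistent with the abstract's rounded figure of about 30 million.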

159

Mapping and sequencing the human genome  

SciTech Connect

Numerous meetings have been held and a debate has developed in the biological community over the merits of mapping and sequencing the human genome. In response, a committee was formed to examine the desirability and feasibility of mapping and sequencing the human genome and to suggest options for implementing the project. The committee asked many questions. Should the analysis of the human genome be left entirely to the traditionally uncoordinated, but highly successful, support systems that fund the vast majority of biomedical research? Or should a more focused and coordinated additional support system be developed that is limited to encouraging and facilitating the mapping and eventual sequencing of the human genome? If so, how can this be done without distorting the broader goals of biological research that are crucial for any understanding of the data generated in such a human genome project? As the committee became better informed on the many relevant issues, the opinions of its members coalesced, producing a shared consensus of what should be done. This report reflects that consensus.

1988-01-01

160

The complete genome sequence of Mycobacterium bovis  

PubMed Central

Mycobacterium bovis is the causative agent of tuberculosis in a range of animal species and man, with worldwide annual losses to agriculture of $3 billion. The human burden of tuberculosis caused by the bovine tubercle bacillus is still largely unknown. M. bovis was also the progenitor for the M. bovis bacillus Calmette–Guérin vaccine strain, the most widely used human vaccine. Here we describe the 4,345,492-bp genome sequence of M. bovis AF2122/97 and its comparison with the genomes of Mycobacterium tuberculosis and Mycobacterium leprae. Strikingly, the genome sequence of M. bovis is >99.95% identical to that of M. tuberculosis, but deletion of genetic information has led to a reduced genome size. Comparison with M. leprae reveals a number of common gene losses, suggesting the removal of functional redundancy. Cell wall components and secreted proteins show the greatest variation, indicating their potential role in host–bacillus interactions or immune evasion. Furthermore, there are no genes unique to M. bovis, implying that differential gene expression may be the key to the host tropisms of human and bovine bacilli. The genome sequence therefore offers major insight on the evolution, host preference, and pathobiology of M. bovis.

Garnier, Thierry; Eiglmeier, Karin; Camus, Jean-Christophe; Medina, Nadine; Mansoor, Huma; Pryor, Melinda; Duthoy, Stephanie; Grondin, Sophie; Lacroix, Celine; Monsempe, Christel; Simon, Sylvie; Harris, Barbara; Atkin, Rebecca; Doggett, Jon; Mayes, Rebecca; Keating, Lisa; Wheeler, Paul R.; Parkhill, Julian; Barrell, Bart G.; Cole, Stewart T.; Gordon, Stephen V.; Hewinson, R. Glyn

2003-01-01

161

Assessment of Whole Genome Amplification for Sequence Capture and Massively Parallel Sequencing  

PubMed Central

Exome sequence capture and massively parallel sequencing can be combined to achieve inexpensive and rapid global analyses of the functional sections of the genome. The difficulties of working with relatively small quantities of genetic material, as may be necessary when sharing tumor biopsies between collaborators for instance, can be overcome using whole genome amplification. However, the potential drawbacks of using a whole genome amplification technology based on random primers in combination with sequence capture followed by massively parallel sequencing have not yet been examined in detail, especially in the context of mutation discovery in tumor material. In this work, we compare mutations detected in sequence data for unamplified DNA, whole genome amplified DNA, and RNA originating from the same tumor tissue samples from 16 patients diagnosed with non-small cell lung cancer. The results obtained provide a comprehensive overview of the merits of these techniques for mutation analysis. We evaluated the identified genetic variants, and found that most (74%) of them were observed in both the amplified and the unamplified sequence data. Eighty-nine percent of the variations found by WGA were shared with unamplified DNA. We demonstrate a strategy for avoiding allelic bias by including RNA-sequencing information.

Hasmats, Johanna; Green, Henrik; Orear, Cedric; Validire, Pierre; Huss, Mikael; Kaller, Max; Lundeberg, Joakim

2014-01-01
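
The two overlap figures in the abstract above use different denominators, which a small set-based sketch can make explicit. It assumes the first figure is taken over all variants seen in either data set and the second over variants seen in the amplified (WGA) data only. The variant identifiers and the function name are hypothetical.

```python
def overlap_summary(unamplified, amplified):
    """Summarize how many variant calls two call sets have in common."""
    shared = unamplified & amplified
    either = unamplified | amplified
    return {
        # Share of all observed variants that appear in both call sets.
        "shared_of_all": len(shared) / len(either),
        # Share of the amplified-DNA calls that are also in unamplified DNA.
        "shared_of_amplified": len(shared) / len(amplified),
    }


# Hypothetical call sets, keyed by made-up variant identifiers.
unamplified = {f"var{i}" for i in range(1, 10)}   # var1 .. var9
amplified = {f"var{i}" for i in range(2, 11)}     # var2 .. var10

summary = overlap_summary(unamplified, amplified)
print(summary)
```

In this toy example 8 of 10 variants are shared overall, while 8 of the 9 amplified-DNA calls are shared, so the second fraction is the larger one, as in the abstract.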

162

Multilocus Sequence Typing of Total-Genome-Sequenced Bacteria  

PubMed Central

Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the “gold standard” of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available to scientists and routine diagnostic laboratories. Currently, the cost is below that of traditional MLST. The new challenges will be how to extract the relevant information from the large amount of data so as to allow for comparison over time and between laboratories. Ideally, this information should also allow for comparison to historical data. We developed a Web-based method for MLST of 66 bacterial species based on WGS data. As input, the method uses short sequence reads from four sequencing platforms or preassembled genomes. Updates from the MLST databases are downloaded monthly, and the best-matching MLST alleles of the specified MLST scheme are found using a BLAST-based ranking method. The sequence type is then determined by the combination of alleles identified. The method was tested on preassembled genomes from 336 isolates covering 56 MLST schemes, on short sequence reads from 387 isolates covering 10 schemes, and on a small test set of short sequence reads from 29 isolates for which the sequence type had been determined by traditional methods. The method presented here enables investigators to determine the sequence types of their isolates on the basis of WGS data. This method is publicly available at www.cbs.dtu.dk/services/MLST.

Cosentino, Salvatore; Rasmussen, Simon; Friis, Carsten; Hasman, Henrik; Marvig, Rasmus Lykke; Jelsbak, Lars; Sicheritz-Ponten, Thomas; Ussery, David W.; Aarestrup, Frank M.; Lund, Ole

2012-01-01
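
The final step described in the abstract above, determining the sequence type from the combination of alleles, amounts to a table lookup. The sketch below shows only that step and does not reproduce the BLAST-based allele ranking of the published method. The locus names, allele numbers and profile table are hypothetical.

```python
# Hypothetical three-locus scheme; real MLST schemes typically use more loci.
LOCI = ("locusA", "locusB", "locusC")

# Hypothetical profile table: allele combination -> sequence type (ST).
PROFILES = {
    (1, 1, 1): 1,
    (1, 2, 1): 2,
    (3, 2, 4): 3,
}


def sequence_type(alleles_by_locus):
    """Return the ST for an allele combination, or None if it is not in the table."""
    profile = tuple(alleles_by_locus[locus] for locus in LOCI)
    return PROFILES.get(profile)


# Best-matching allele numbers for one isolate (made-up values).
isolate = {"locusA": 1, "locusB": 2, "locusC": 1}
print(sequence_type(isolate))
```

Returning `None` for an unlisted combination keeps novel profiles distinguishable from known sequence types.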

163

An emerging place for lung cancer genomics in 2013  

PubMed Central

Lung cancer is a disease with a dismal prognosis and is the biggest cause of cancer deaths in many countries. Nonetheless, rapid technological developments in genome science promise more effective prevention and treatment strategies. Since the Human Genome Project, scientific advances have revolutionized the diagnosis and treatment of human cancers, including thoracic cancers. The latest, massively parallel, next generation sequencing (NGS) technologies offer much greater sequencing capacity than traditional, capillary-based Sanger sequencing. These modern but costly technologies have been applied to whole genome-, and whole exome sequencing (WGS and WES) for the discovery of mutations and polymorphisms, transcriptome sequencing for quantification of gene expression, small ribonucleic acid (RNA) sequencing for microRNA profiling, large scale analysis of deoxyribonucleic acid (DNA) methylation and chromatin immunoprecipitation mapping of DNA-protein interaction. With the rise of personalized cancer care, based on the premise of precision medicine, sequencing technologies are constantly changing. To date, the genomic landscape of lung cancer has been captured in several WGS projects. Such work has not only contributed to our understanding of cancer biology, but has also provided impetus for technical advances that may improve our ability to accurately capture the cancer genome. Issues such as short read lengths contribute to sequenced libraries that contain challenging gaps in the aligned genome. Emerging platforms promise longer reads as well as the ability to capture a range of epigenomic signals. In addition, ongoing optimization of bioinformatics strategies for data analysis and interpretation are critical, especially for the differentiation between driver and passenger mutations. 
Moreover, broader deployment of these and future generations of platforms, coupled with an increasing bioinformatics workforce with access to highly sophisticated technologies, could see many of these discoveries translated to the clinic at a rapid pace. We look forward to these advances making a difference for the many patients we treat in the Asia-Pacific region and around the world.

Bowman, Rayleen V.; Yang, Ian A.; Govindan, Ramaswamy; Fong, Kwun M.

2013-01-01

164

An emerging place for lung cancer genomics in 2013.  

PubMed

Lung cancer is a disease with a dismal prognosis and is the biggest cause of cancer deaths in many countries. Nonetheless, rapid technological developments in genome science promise more effective prevention and treatment strategies. Since the Human Genome Project, scientific advances have revolutionized the diagnosis and treatment of human cancers, including thoracic cancers. The latest, massively parallel, next generation sequencing (NGS) technologies offer much greater sequencing capacity than traditional, capillary-based Sanger sequencing. These modern but costly technologies have been applied to whole genome-, and whole exome sequencing (WGS and WES) for the discovery of mutations and polymorphisms, transcriptome sequencing for quantification of gene expression, small ribonucleic acid (RNA) sequencing for microRNA profiling, large scale analysis of deoxyribonucleic acid (DNA) methylation and chromatin immunoprecipitation mapping of DNA-protein interaction. With the rise of personalized cancer care, based on the premise of precision medicine, sequencing technologies are constantly changing. To date, the genomic landscape of lung cancer has been captured in several WGS projects. Such work has not only contributed to our understanding of cancer biology, but has also provided impetus for technical advances that may improve our ability to accurately capture the cancer genome. Issues such as short read lengths contribute to sequenced libraries that contain challenging gaps in the aligned genome. Emerging platforms promise longer reads as well as the ability to capture a range of epigenomic signals. In addition, ongoing optimization of bioinformatics strategies for data analysis and interpretation are critical, especially for the differentiation between driver and passenger mutations. 
Moreover, broader deployment of these and future generations of platforms, coupled with an increasing bioinformatics workforce with access to highly sophisticated technologies, could see many of these discoveries translated to the clinic at a rapid pace. We look forward to these advances making a difference for the many patients we treat in the Asia-Pacific region and around the world. PMID:24163742

Daniels, Marissa G; Bowman, Rayleen V; Yang, Ian A; Govindan, Ramaswamy; Fong, Kwun M

2013-10-01

165

Genome Sequence of Lactobacillus versmoldensis KCTC 3814  

PubMed Central

Lactobacillus versmoldensis KCTC 3814 was isolated from raw fermented poultry salami. The species was present in high numbers and frequently dominated the lactic acid bacteria (LAB) populations of the products. Here, we announce the draft genome sequence of Lactobacillus versmoldensis KCTC 3814, isolated from poultry salami, and describe major findings from its annotation.

Kim, Dae-Soo; Choi, Sang-Haeng; Kim, Dong-Wook; Kim, Ryong Nam; Nam, Seong-Hyeuk; Kang, Aram; Kim, Aeri; Park, Hong-Seog

2011-01-01

166

Genome Sequencing and Analysis Conference IV.  

National Technical Information Service (NTIS)

J. Craig Venter and C. Thomas Caskey co-chaired Genome Sequencing and Analysis Conference IV held at Hilton Head, South Carolina from September 26-30, 1992. Venter opened the conference by noting that approximately 400 researchers from 16 nations were pr...

1993-01-01

167

Draft genome sequence of Bacillus oceanisediminis 2691.  

PubMed

Bacillus oceanisediminis 2691 is an aerobic, Gram-positive, spore-forming, and moderately halophilic bacterium that was isolated from marine sediment of the Yellow Sea coast of South Korea. Here, we report the draft genome sequence of B. oceanisediminis 2691 that may have an important role in the bioremediation of marine sediment. PMID:23105082

Lee, Yong-Jik; Lee, Sang-Jae; Jeong, Haeyoung; Kim, Hyun Ju; Ryu, Naeun; Kim, Byoung-Chan; Lee, Han-Seung; Lee, Dong-Woo; Lee, Sang Jun

2012-11-01

168

Draft Genome Sequence of Bacillus oceanisediminis 2691  

PubMed Central

Bacillus oceanisediminis 2691 is an aerobic, Gram-positive, spore-forming, and moderately halophilic bacterium that was isolated from marine sediment of the Yellow Sea coast of South Korea. Here, we report the draft genome sequence of B. oceanisediminis 2691 that may have an important role in the bioremediation of marine sediment.

Lee, Yong-Jik; Lee, Sang-Jae; Jeong, Haeyoung; Kim, Hyun Ju; Ryu, Naeun; Kim, Byoung-Chan; Lee, Han-Seung

2012-01-01

169

Draft Genome Sequence of Streptomyces iranensis  

PubMed Central

Streptomyces iranensis HM 35 has been shown to exhibit 72.7% DNA-DNA similarity to the important drug rapamycin (sirolimus)-producing Streptomyces rapamycinicus NRRL5491. Here, we report the genome sequence of HM 35, which represents a partially overlapping repertoire of secondary metabolite gene clusters with S. rapamycinicus, including the gene cluster for rapamycin biosynthesis.

Horn, Fabian; Netzker, Tina; Guthke, Reinhard; Brakhage, Axel A.

2014-01-01

170

Genomic Instability in Cancer  

PubMed Central

One of the fundamental challenges facing the cell is to accurately copy its genetic material to daughter cells. When this process goes awry, genomic instability ensues where genetic alterations ranging from nucleotide changes to chromosomal translocations and aneuploidy occur. To ensure the fidelity of DNA replication, organisms have developed multiple and often overlapping mechanisms that can be classified into two major classes. The first includes mechanisms that prevent premature initiation of DNA replication and ensure that the genome is fully replicated once and only once during each division cycle. The second ensures that cells are able to respond to various cues that continuously threaten the integrity of the genome by initiating DNA-damage dependent “checkpoints” and coordinating DNA damage repair mechanisms. Defects in the ability to respond to DNA damage and safeguard against aberrant DNA replication contribute to genomic instability and the development of human malignancy. In this chapter, we summarize our current knowledge of how genomic instability arises, with a particular emphasis on how the DNA replication process can give rise to such instability.

Abbas, Tarek; Keaton, Mignon A.; Dutta, Anindya

2013-01-01

171

The first Korean genome sequence and analysis: Full genome sequencing for a socio-ethnic group  

PubMed Central

We present the first Korean individual genome sequence (SJK) and analysis results. The diploid genome of a Korean male was sequenced to 28.95-fold redundancy using the Illumina paired-end sequencing method. SJK covered 99.9% of the NCBI human reference genome. We identified 420,083 novel single nucleotide polymorphisms (SNPs) that are not in the dbSNP database. Despite a close similarity, significant differences were observed between the Chinese genome (YH), the only other Asian genome available, and SJK: (1) 39.87% (1,371,239 out of 3,439,107) SNPs were SJK-specific (49.51% against Venter's, 46.94% against Watson's, and 44.17% against the Yoruba genomes); (2) 99.5% (22,495 out of 22,605) of short indels (< 4 bp) discovered on the same loci had the same size and type as YH; and (3) 11.3% (331 out of 2920) deletion structural variants were SJK-specific. Even after attempting to map unmapped reads of SJK to unanchored NCBI scaffolds, HGSV, and available personal genomes, there were still 5.77% SJK reads that could not be mapped. All these findings indicate that the overall genetic differences among individuals from closely related ethnic groups may be significant. Hence, constructing reference genomes for minor socio-ethnic groups will be useful for massive individual genome sequencing.

Ahn, Sung-Min; Kim, Tae-Hyung; Lee, Sunghoon; Kim, Deokhoon; Ghang, Ho; Kim, Dae-Soo; Kim, Byoung-Chul; Kim, Sang-Yoon; Kim, Woo-Yeon; Kim, Chulhong; Park, Daeui; Lee, Yong Seok; Kim, Sangsoo; Reja, Rohit; Jho, Sungwoong; Kim, Chang Geun; Cha, Ji-Young; Kim, Kyung-Hee; Lee, Bonghee; Bhak, Jong; Kim, Seong-Jin

2009-01-01
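
The percentages in the abstract above can be recomputed from the counts it reports. The sketch below uses only those counts; the helper name and dictionary labels are illustrative.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


# Counts as reported in the abstract: (numerator, denominator).
reported_counts = {
    "SJK-specific SNPs": (1_371_239, 3_439_107),
    "short indels matching YH": (22_495, 22_605),
    "SJK-specific deletion SVs": (331, 2_920),
}

for label, (part, whole) in reported_counts.items():
    print(f"{label}: {percent(part, whole):.2f}%")
```

Rounded to the precision used in the abstract, the three ratios agree with the stated 39.87%, 99.5% and 11.3%.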

172

Genomic Instability and Breast Cancer.  

National Technical Information Service (NTIS)

We are investigating the regulation of genomic stability and how the disruption of such regulation contributes to breast cancer development. We have performed in-depth studies of BRCA1 and the DNA damage response, which allow us to propose a new model of ...

J. Chen

2009-01-01

173

The Genome Sequence of Drosophila melanogaster  

NSDL National Science Digital Library

On Thursday, March 23, 2000, a historic milestone was marked as researchers announced that they had completed mapping the genome of the fruit fly, Drosophila melanogaster. The achievement, which was announced in a special issue of the journal Science, culminates close to 100 years of research. Drosophila melanogaster is the most complex animal thus far to have its genetic sequence deciphered. The findings have important implications for human medical research and for completing a map of the human genome. Mapping the fruit fly genome has been a broad collaborative effort between academia and industry in several countries. While a foundation was laid by US (Berkeley), European, and Canadian Drosophila Genome Projects, Celera Genomics finished the job over the last year by employing super-computers and state-of-the-art gene-sequencing machines. The techniques learned and used in this last phase of mapping may now be applied to more rapidly decode genes of other organisms, including humans. This week's In The News takes a closer look at this important landmark.

Ramanujan, Krishna.

174

Comparative Analysis of Genome Sequences with VISTA  

DOE Data Explorer

VISTA is a comprehensive suite of programs and databases developed by and hosted at the Genomics Division of Lawrence Berkeley National Laboratory. They provide information and tools designed to facilitate comparative analysis of genomic sequences. Users have two ways to interact with the suite of applications at the VISTA portal. They can submit their own sequences and alignments for analysis (VISTA servers) or examine pre-computed whole-genome alignments of different species. A key menu option is the Enhancer Browser and Database at http://enhancer.lbl.gov/. The VISTA Enhancer Browser is a central resource for experimentally validated human noncoding fragments with gene enhancer activity as assessed in transgenic mice. Most of these noncoding elements were selected for testing based on their extreme conservation with other vertebrates. The results of this enhancer screen are provided through this publicly available website. The browser also features relevant results by external contributors and a large collection of additional genome-wide conserved noncoding elements which are candidate enhancer sequences. The LBL developers invite external groups to submit computational predictions of developmental enhancers. As of 10/19/2009 the database contains information on 1109 in vivo tested elements - 508 elements with enhancer activity.

Dubchak, Inna

175

Comparison of Sample Sequences of the Salmonella typhi Genome to the Sequence of the Complete Escherichia coli K-12 Genome  

Microsoft Academic Search

Raw sequence data representing the majority of a bacterial genome can be obtained at a tiny fraction of the cost of a completed sequence. To demonstrate the utility of such a resource, 870 single-stranded M13 clones were sequenced from a shotgun library of the Salmonella typhi Ty2 genome. The sequence reads averaged over 400 bases and sampled the genome with

MICHAEL MCCLELLAND; RICHARD K. WILSON

1998-01-01

176

Computational approaches to identify functional genetic variants in cancer genomes.  

PubMed

The International Cancer Genome Consortium (ICGC) aims to catalog genomic abnormalities in tumors from 50 different cancer types. Genome sequencing reveals hundreds to thousands of somatic mutations in each tumor but only a minority of these drive tumor progression. We present the result of discussions within the ICGC on how to address the challenge of identifying mutations that contribute to oncogenesis, tumor maintenance or response to therapy, and recommend computational techniques to annotate somatic variants and predict their impact on cancer phenotype. PMID:23900255

Gonzalez-Perez, Abel; Mustonen, Ville; Reva, Boris; Ritchie, Graham R S; Creixell, Pau; Karchin, Rachel; Vazquez, Miguel; Fink, J Lynn; Kassahn, Karin S; Pearson, John V; Bader, Gary D; Boutros, Paul C; Muthuswamy, Lakshmi; Ouellette, B F Francis; Reimand, Jüri; Linding, Rune; Shibata, Tatsuhiro; Valencia, Alfonso; Butler, Adam; Dronov, Serge; Flicek, Paul; Shannon, Nick B; Carter, Hannah; Ding, Li; Sander, Chris; Stuart, Josh M; Stein, Lincoln D; Lopez-Bigas, Nuria

2013-08-01

177

NIH researchers complete whole-exome sequencing of skin cancer

Cancer.gov

A team led by researchers at NIH is the first to systematically survey the landscape of the melanoma genome, the DNA code of the deadliest form of skin cancer. The researchers have made surprising new discoveries using whole-exome sequencing, an approach that decodes the 1-2 percent of the genome that contains protein-coding genes.

178

Sequencing of Seven Haloarchaeal Genomes Reveals Patterns of Genomic Flux  

PubMed Central

We report the sequencing of seven genomes from two haloarchaeal genera, Haloferax and Haloarcula. Ease of cultivation and the existence of well-developed genetic and biochemical tools for several diverse haloarchaeal species make haloarchaea a model group for the study of archaeal biology. The unique physiological properties of these organisms also make them good candidates for novel enzyme discovery for biotechnological applications. Seven genomes were sequenced to ~20× coverage and assembled to an average of 50 contigs (range 5 scaffolds - 168 contigs). Comparisons of protein-coding gene complements revealed large-scale differences in COG functional group enrichment between these genera. Analysis of genes encoding machinery for DNA metabolism reveals genera-specific expansions of the general transcription factor TATA binding protein as well as a history of extensive duplication and horizontal transfer of the proliferating cell nuclear antigen. Insights gained from this study emphasize the importance of haloarchaea for investigation of archaeal biology.

Lynch, Erin A.; Langille, Morgan G. I.; Darling, Aaron; Wilbanks, Elizabeth G.; Haltiner, Caitlin; Shao, Katie S. Y.; Starr, Michael O.; Teiling, Clotilde; Harkins, Timothy T.; Edwards, Robert A.; Eisen, Jonathan A.; Facciotti, Marc T.

2012-01-01
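
The sequencing depth mentioned in the abstract above is conventionally estimated as total sequenced bases divided by genome length. The sketch below shows that calculation with hypothetical inputs; the read count, read length and genome size are illustrative and are not taken from the study.

```python
def mean_coverage(read_count, mean_read_length, genome_length):
    """Estimate average depth as total sequenced bases over genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return read_count * mean_read_length / genome_length


# Hypothetical run: 200,000 reads of 400 bases over a 4 Mb genome.
depth = mean_coverage(read_count=200_000, mean_read_length=400, genome_length=4_000_000)
print(f"{depth:.0f}x coverage")
```

This gives an average only; actual depth varies along the genome, which is one reason assemblies at similar nominal coverage can differ in contig count.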

179

Clinical applications of next-generation sequencing in colorectal cancers  

PubMed Central

Like other solid tumors, colorectal cancer (CRC) is a genomic disorder in which various types of genomic alterations, such as point mutations, genomic rearrangements, gene fusions, or chromosomal copy number alterations, can contribute to the initiation and progression of the disease. The advent of a new DNA sequencing technology known as next-generation sequencing (NGS) has revolutionized the speed and throughput of cataloguing such cancer-related genomic alterations. Now the challenge is how to exploit this advanced technology to better understand the underlying molecular mechanism of colorectal carcinogenesis and to identify clinically relevant genetic biomarkers for diagnosis and personalized therapeutics. In this review, we will introduce NGS-based cancer genomics studies focusing on those of CRC, including a recent large-scale report from the Cancer Genome Atlas. We will mainly discuss how NGS-based exome-, whole genome- and methylome-sequencing have extended our understanding of colorectal carcinogenesis. We will also introduce the unique genomic features of CRC discovered by NGS technologies, such as the relationship with bacterial pathogens and the massive genomic rearrangements of chromothripsis. Finally, we will discuss the necessary steps prior to development of a clinical application of NGS-related findings for the advanced management of patients with CRC.

Kim, Tae-Min; Lee, Sug-Hyung; Chung, Yeun-Jun

2013-01-01

180

A Computer Program for Aligning a cDNA Sequence with a Genomic DNA Sequence  

Microsoft Academic Search

We address the problem of efficiently aligning a transcribed and spliced DNA sequence with a genomic sequence containing that gene, allowing for introns in the genomic sequence and a relatively small number of sequencing errors. A freely available computer program, described herein, solves the problem for a 100-kb genomic sequence in a few seconds on a workstation. With large amounts

Liliana Florea; George Hartzell; Gerald M. Rubin; Webb Miller

1998-01-01

181

Defining Genome Project Standards in a New Era of Sequencing  

SciTech Connect

Patrick Chain of the DOE Joint Genome Institute gives a talk on behalf of the International Genome Sequencing Standards Consortium on the need for intermediate genome classifications between "draft" and "finished"

Chain, Patrick [DOE-JGI]

2009-05-27

182

Toxicogenomics and cancer susceptibility: advances with next-generation sequencing.  

PubMed

The aim of this review is to comprehensively summarize the recent achievements in the field of toxicogenomics and cancer research regarding genetic-environmental interactions in carcinogenesis and detection of genetic aberrations in cancer genomes by next-generation sequencing technology. Cancer is primarily a genetic disease in which genetic factors and environmental stimuli interact to cause genetic and epigenetic aberrations in human cells. Mutations in the germline act as either high-penetrance alleles that strongly increase the risk of cancer development, or as low-penetrance alleles that mildly change an individual's susceptibility to cancer. Somatic mutations, resulting from either DNA damage induced by exposure to environmental mutagens or from spontaneous errors in DNA replication or repair are involved in the development or progression of the cancer. Induced or spontaneous changes in the epigenome may also drive carcinogenesis. Advances in next-generation sequencing technology provide us opportunities to accurately, economically, and rapidly identify genetic variants, somatic mutations, gene expression profiles, and epigenetic alterations with single-base resolution. Whole genome sequencing, whole exome sequencing, and RNA sequencing of paired cancer and adjacent normal tissue present a comprehensive picture of the cancer genome. These new findings should benefit public health by providing insights in understanding cancer biology, and in improving cancer diagnosis and therapy. PMID:24875441

Ning, Baitang; Su, Zhenqiang; Mei, Nan; Hong, Huixiao; Deng, Helen; Shi, Leming; Fuscoe, James C; Tolleson, William H

2014-04-01

183

Whole-genome sequencing in bacteriology: state of the art  

PubMed Central

Over the last ten years, genome sequencing capabilities have expanded exponentially. There have been tremendous advances in sequencing technology, DNA sample preparation, genome assembly, and data analysis. This has led to advances in a number of facets of bacterial genomics, including metagenomics, clinical medicine, bacterial archaeology, and bacterial evolution. This review examines the strengths and weaknesses of techniques in bacterial genome sequencing, upcoming technologies, and assembly techniques, and highlights recent studies that demonstrate new applications for bacterial genomics.

Dark, Michael J

2013-01-01

184

Draft Genome Sequence of Actinomyces massiliensis Strain 4401292T  

PubMed Central

A draft genome sequence of Actinomyces massiliensis, an anaerobic bacterium isolated from a patient's blood culture, is described here. CRISPR-associated proteins, insertion sequences, and toxin-antitoxin loci were found on the genome.

Robert, Catherine; Gimenez, Gregory; Gharbi, Reem; Raoult, Didier

2012-01-01

185

Genome Sequencing Center Tour Videos and Classroom Activities  

NSDL National Science Digital Library

A video tour of the Washington University Genome Sequencing Center, supplemented by additional films and classroom activities, can help advanced high school students and college undergraduates understand the classical techniques of genome sequencing.

Sarah Elgin (Washington University)

2010-05-28

186

Complete Genome Sequence of Lactobacillus helveticus H10  

PubMed Central

Lactobacillus helveticus strain H10 was isolated from traditional fermented milk in Tibet, China. We sequenced the whole genome of strain H10 and compared it to the published genome sequence of Lactobacillus helveticus DPC4571.

Zhao, Wenjing; Chen, Yongfu; Sun, Zhihong; Wang, Jicheng; Zhou, Zhemin; Sun, Tiansong; Wang, Lei; Chen, Wei; Zhang, Heping

2011-01-01

187

The Norway spruce genome sequence and conifer genome evolution.  

PubMed

Conifers have dominated forests for more than 200 million years and are of huge ecological and economic importance. Here we present the draft assembly of the 20-gigabase genome of Norway spruce (Picea abies), the first available for any gymnosperm. The number of well-supported genes (28,354) is similar to the >100 times smaller genome of Arabidopsis thaliana, and there is no evidence of a recent whole-genome duplication in the gymnosperm lineage. Instead, the large genome size seems to result from the slow and steady accumulation of a diverse set of long-terminal repeat transposable elements, possibly owing to the lack of an efficient elimination mechanism. Comparative sequencing of Pinus sylvestris, Abies sibirica, Juniperus communis, Taxus baccata and Gnetum gnemon reveals that the transposable element diversity is shared among extant conifers. Expression of 24-nucleotide small RNAs, previously implicated in transposable element silencing, is tissue-specific and much lower than in other plants. We further identify numerous long (>10,000 base pairs) introns, gene-like fragments, uncharacterized long non-coding RNAs and short RNAs. This opens up new genomic avenues for conifer forestry and breeding. PMID:23698360

Nystedt, Björn; Street, Nathaniel R; Wetterbom, Anna; Zuccolo, Andrea; Lin, Yao-Cheng; Scofield, Douglas G; Vezzi, Francesco; Delhomme, Nicolas; Giacomello, Stefania; Alexeyenko, Andrey; Vicedomini, Riccardo; Sahlin, Kristoffer; Sherwood, Ellen; Elfstrand, Malin; Gramzow, Lydia; Holmberg, Kristina; Hällman, Jimmie; Keech, Olivier; Klasson, Lisa; Koriabine, Maxim; Kucukoglu, Melis; Käller, Max; Luthman, Johannes; Lysholm, Fredrik; Niittylä, Totte; Olson, Ake; Rilakovic, Nemanja; Ritland, Carol; Rosselló, Josep A; Sena, Juliana; Svensson, Thomas; Talavera-López, Carlos; Theißen, Günter; Tuominen, Hannele; Vanneste, Kevin; Wu, Zhi-Qiang; Zhang, Bo; Zerbe, Philipp; Arvestad, Lars; Bhalerao, Rishikesh; Bohlmann, Joerg; Bousquet, Jean; Garcia Gil, Rosario; Hvidsten, Torgeir R; de Jong, Pieter; MacKay, John; Morgante, Michele; Ritland, Kermit; Sundberg, Björn; Thompson, Stacey Lee; Van de Peer, Yves; Andersson, Björn; Nilsson, Ove; Ingvarsson, Pär K; Lundeberg, Joakim; Jansson, Stefan

2013-05-30

188

Clinical Implications of the Cancer Genome  

PubMed Central

Cancer is a disease of the genome. Most tumors harbor a constellation of structural genomic alterations that may dictate their clinical behavior and treatment response. Whereas elucidating the nature and importance of these genomic alterations has been the goal of cancer biologists for several decades, ongoing global genome characterization efforts are revolutionizing both tumor biology and the optimal paradigm for cancer treatment at an unprecedented scope. The pace of advance has been empowered, in large part, through disruptive technological innovations that render complete cancer genome characterization feasible on a large scale. This article highlights cardinal biologic and clinical insights gleaned from systematic cancer genome characterization. We also discuss how the convergence of cancer genome biology, technology, and targeted therapeutics articulates a cohesive framework for the advent of personalized cancer medicine.

MacConaill, Laura E.; Garraway, Levi A.

2010-01-01

190

The Wellcome Trust Sanger Institute: The Cancer Genome Project  

NSDL National Science Digital Library

Supported by the Wellcome Trust Sanger Institute, the Cancer Genome Project (CGP) "is using the human genome sequence and high throughput mutation detection techniques to identify somatically acquired sequence variants/mutations and hence identify genes critical in the development of human cancers. This initiative will ultimately provide the paradigm for the detection of germline mutations in non-neoplastic human genetic diseases through genome-wide mutation detection approaches." The CGP website links to a number of Data Resources including the Cancer Gene Census, Cancer Cell Line Project, Catalogue of Somatic Mutations in Cancer (reported on in the March 4, 2005 NSDL Scout Report for Life Sciences), Somatic Mutations in Protein Kinase Genes, and more. The site also contains an extensive listing of publications from 1998 to 2004 with links to PubMed Abstracts.

2005-11-11

191

Expanding the computational toolbox for mining cancer genomes.  

PubMed

High-throughput DNA sequencing has revolutionized the study of cancer genomics with numerous discoveries that are relevant to cancer diagnosis and treatment. The latest sequencing and analysis methods have successfully identified somatic alterations, including single-nucleotide variants, insertions and deletions, copy-number aberrations, structural variants and gene fusions. Additional computational techniques have proved useful for defining the mutations, genes and molecular networks that drive diverse cancer phenotypes and that determine clonal architectures in tumour samples. Collectively, these tools have advanced the study of genomic, transcriptomic and epigenomic alterations in cancer, and their association to clinical properties. Here, we review cancer genomics software and the insights that have been gained from their application. PMID:25001846

Ding, Li; Wendl, Michael C; McMichael, Joshua F; Raphael, Benjamin J

2014-08-01

192

Initial sequencing and comparative analysis of the mouse genome  

Microsoft Academic Search

The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing

Robert H. Waterston; Kerstin Lindblad-Toh; Ewan Birney; Jane Rogers; Josep F. Abril; Pankaj Agarwal; Richa Agarwala; Rachel Ainscough; Marina Alexandersson; Peter An; Stylianos E. Antonarakis; John Attwood; Robert Baertsch; Jonathon Bailey; Karen Barlow; Stephan Beck; Eric Berry; Bruce Birren; Toby Bloom; Peer Bork; Marc Botcherby; Nicolas Bray; Michael R. Brent; Daniel G. Brown; Stephen D. Brown; Carol Bult; John Burton; Jonathan Butler; Robert D. Campbell; Piero Carninci; Simon Cawley; Francesca Chiaromonte; Asif T. Chinwalla; Deanna M. Church; Michele Clamp; Christopher Clee; Francis S. Collins; Lisa L. Cook; Richard R. Copley; Alan Coulson; Olivier Couronne; James Cuff; Val Curwen; Tim Cutts; Mark Daly; Robert David; Joy Davies; Kimberly D. Delehaunty; Justin Deri; Emmanouil T. Dermitzakis; Colin Dewey; Nicholas J. Dickens; Mark Diekhans; Sheila Dodge; Inna Dubchak; Diane M. Dunn; Sean R. Eddy; Laura Elnitski; Richard D. Emes; Pallavi Eswara; Eduardo Eyras; Adam Felsenfeld; Ginger A. Fewell; Paul Flicek; Karen Foley; Wayne N. Frankel; Lucinda A. Fulton; Robert S. Fulton; Terrence S. Furey; Diane Gage; Richard A. Gibbs; Gustavo Glusman; Sante Gnerre; Nick Goldman; Leo Goodstadt; Darren Grafham; Tina A. Graves; Eric D. Green; Simon Gregory; Roderic Guigó; Mark Guyer; Ross C. Hardison; David Haussler; Yoshihide Hayashizaki; LaDeana W. Hillier; Angela Hinrichs; Wratko Hlavina; Timothy Holzer; Fan Hsu; Axin Hua; Tim Hubbard; Adrienne Hunt; Ian Jackson; David B. Jaffe; L. Steven Johnson; Matthew Jones; Thomas A. Jones; Ann Joy; Michael Kamal; Elinor K. Karlsson; Donna Karolchik; Arkadiusz Kasprzyk; Jun Kawai; Evan Keibler; Cristyn Kells; W. James Kent; Andrew Kirby; Diana L. Kolbe; Ian Korf; Raju S. Kucherlapati; Edward J. Kulbokas; David Kulp; Tom Landers; J. P. Leger; Steven Leonard; Ivica Letunic; Rosie Levine; Jia Li; Ming Li; Christine Lloyd; Susan Lucas; Bin Ma; Donna R. Maglott; Elaine R. Mardis; Lucy Matthews; Evan Mauceli; John H. Mayer; Megan McCarthy; W. Richard McCombie; Stuart McLaren; Kirsten McLay; John D. McPherson; Jim Meldrim; Beverley Meredith; Jill P. Mesirov; Webb Miller; Tracie L. Miner; Emmanuel Mongin; Kate T. Montgomery; Michael Morgan; Richard Mott; James C. Mullikin; Donna M. Muzny; William E. Nash; Joanne O. Nelson; Michael N. Nhan; Robert Nicol; Zemin Ning; Chad Nusbaum; Michael J. O'Connor; Yasushi Okazaki; Karen Oliver; Emma Overton-Larty; Lior Pachter; Genís Parra; Kymberlie H. Pepin; Jane Peterson; Pavel Pevzner; Robert Plumb; Craig S. Pohl; Alex Poliakov; Tracy C. Ponce; Simon Potter; Michael Quail; Alexandre Reymond; Bruce A. Roe; Krishna M. Roskin; Edward M. Rubin; Alistair G. Rust; Victor Sapojnikov; Brian Schultz; Jörg Schultz; Scott Schwartz; Carol Scott; Steven Seaman; Steve Searle; Ted Sharpe; Andrew Sheridan; Ratna Shownkeen; Sarah Sims; Jonathan B. Singer; Guy Slater; Arian Smit; Douglas R. Smith; Brian Spencer; Arne Stabenau; Nicole Stange-Thomann; Charles Sugnet; Mikita Suyama; Glenn Tesler; Johanna Thompson; David Torrents; Evanne Trevaskis; John Tromp; Catherine Ucla; Abel Ureta-Vidal; Jade P. Vinson; Andrew C. von Niederhausern; Claire M. Wade; Melanie Wall; Ryan J. Weber; Robert B. Weiss; Michael C. Wendl; Anthony P. West; Kris Wetterstrand; Raymond Wheeler; Simon Whelan; Jamey Wierzbowski; David Willey; Sophie Williams; Richard K. Wilson; Eitan Winter; Kim C. Worley; Dudley Wyman; Shan Yang; Shiaw-Pyng Yang; Evgeny M. Zdobnov; Michael C. Zody; Eric S. Lander; Chris P. Ponting; Matthias S. Schwartz

2002-01-01

193

Using inversion signatures to generate draft genome sequence scaffolds  

Microsoft Academic Search

We present a linear-time algorithm that can generate a contig scaffold for a draft genome sequence represented in contigs given a reference genome. The algorithm is aimed at prokaryotic genomes and relies on the presence of matching sequence patterns between the query and reference genomes that can be interpreted as the result of large-scale inversions; we call these patterns inversion

Zanoni Dias; Ulisses Dias; João C. Setubal

2011-01-01

194

Complete genome sequence of Pyrobaculum oguniense.  

PubMed

Pyrobaculum oguniense TE7 is an aerobic hyperthermophilic crenarchaeon isolated from a hot spring in Japan. Here we describe its main chromosome of 2,436,033 bp, with three large-scale inversions and an extra-chromosomal element of 16,887 bp. We have annotated 2,800 protein-coding genes and 145 RNA genes in this genome, including nine H/ACA-like small RNA, 83 predicted C/D box small RNA, and 47 transfer RNA genes. Comparative analyses with the closest known relative, the anaerobe Pyrobaculum arsenaticum from Italy, reveal unexpectedly high synteny and nucleotide identity between these two geographically distant species. Deep sequencing of a mixture of genomic DNA from multiple cells has illuminated some of the genome dynamics potentially shared with other species in this genus. PMID:23407329

Bernick, David L; Karplus, Kevin; Lui, Lauren M; Coker, Joanna K C; Murphy, Julie N; Chan, Patricia P; Cozen, Aaron E; Lowe, Todd M

2012-07-30

195

Genome BLAST distance phylogenies inferred from whole plastid and whole mitochondrion genome sequences  

Microsoft Academic Search

BACKGROUND: Phylogenetic methods which do not rely on multiple sequence alignments are important tools in inferring trees directly from completely sequenced genomes. Here, we extend the recently described Genome BLAST Distance Phylogeny (GBDP) strategy to compute phylogenetic trees from all completely sequenced plastid genomes currently available and from a selection of mitochondrial genomes representing the major eukaryotic lineages. BLASTN, TBLASTX,

Alexander F. Auch; Stefan R. Henz; Barbara R. Holland; Markus Göker

2006-01-01

196

Perspective beyond Cancer Genomics: Bioenergetics of Cancer Stem Cells  

PubMed Central

Although the notion that cancer is a disease caused by genetic and epigenetic alterations is now widely accepted, perhaps more emphasis has been given to the fact that cancer is a genetic disease. It should be noted that, in the post-genome-sequencing period of the 21st century, this phenomenon nevertheless cannot be disregarded in the overall strategy toward complete control of cancer, and in-depth investigation of the factors associated with tumorigenesis is required to achieve it. Otto Warburg won a Nobel Prize in 1931 for the discovery of tumor bioenergetics, which is now commonly used as the basis of positron emission tomography (PET), a highly sensitive noninvasive technique used in cancer diagnosis. Furthermore, the importance of the cancer stem cell (CSC) hypothesis in therapy-related resistance and metastasis has been recognized during the past 2 decades. Accumulating evidence suggests that tumor bioenergetics plays a critical role in CSC regulation; this finding has opened up a new era of cancer medicine, which goes beyond cancer genomics.

Ishii, Hideshi; Doki, Yuichiro

2010-01-01

197

Cactus: Algorithms for genome multiple sequence alignment  

PubMed Central

Much attention has been given to the problem of creating reliable multiple sequence alignments in a model incorporating substitutions, insertions, and deletions. Far less attention has been paid to the problem of optimizing alignments in the presence of more general rearrangement and copy number variation. Using Cactus graphs, recently introduced for representing sequence alignments, we describe two complementary algorithms for creating genomic alignments. We have implemented these algorithms in the new “Cactus” alignment program. We test Cactus using the Evolver genome evolution simulator, a comprehensive new tool for simulation, and show using these and existing simulations that Cactus significantly outperforms all of its peers. Finally, we make an empirical assessment of Cactus's ability to properly align genes and find interesting cases of intra-gene duplication within the primates.

Paten, Benedict; Earl, Dent; Nguyen, Ngan; Diekhans, Mark; Zerbino, Daniel; Haussler, David

2011-01-01

198

[Prediction of transcription and genomic sequences].  

PubMed

Technological developments have enhanced DNA sequencing at genomic scale. On the basis of the resulting sequences, computational biologists now attempt to localise the most important functional regions, starting with genes, but also importantly the regulatory motifs and conditions controlling their expression. In a recent paper published in Cell, M.A. Beer and S. Tavazoie report the results obtained by combining statistical classifications (clustering) of transcriptome data (DNA chips), software for the discovery of cis-regulatory patterns, together with a probabilistic learning method to infer regulatory rules tentatively accounting for the observed transcriptional profiles. PMID:15525501

Martin, David; Ghattas, Badih; Thieffry, Denis

2004-11-01

199

Complete genome sequence of Candidatus Ruthia magnifica  

PubMed Central

The hydrothermal vent clam Calyptogena magnifica (Bivalvia: Mollusca) is a member of the Vesicomyidae. Species within this family form symbioses with chemosynthetic Gammaproteobacteria. They exist in environments such as hydrothermal vents and cold seeps and have a rudimentary gut and feeding groove, indicating a large dependence on their endosymbionts for nutrition. The C. magnifica symbiont, Candidatus Ruthia magnifica, was the first intracellular sulfur-oxidizing endosymbiont to have its genome sequenced (Newton et al. 2007). Here we expand upon the original report and provide additional details complying with the emerging MIGS/MIMS standards. The complete genome exposed the genetic blueprint of the metabolic capabilities of the symbiont. Genes which were predicted to encode the proteins required for all the metabolic pathways typical of free-living chemoautotrophs were detected in the symbiont genome. These include major pathways including carbon fixation, sulfur oxidation, nitrogen assimilation, as well as amino acid and cofactor/vitamin biosynthesis. This genome sequence is invaluable in the study of these enigmatic associations and provides insights into the origin and evolution of autotrophic endosymbiosis.

Roeselers, Guus; Newton, Irene L. G.; Woyke, Tanja; Auchtung, Thomas A.; Dilly, Geoffrey F.; Dutton, Rachel J.; Fisher, Meredith C.; Fontanez, Kristina M.; Lau, Evan; Stewart, Frank J.; Richardson, Paul M.; Barry, Kerrie W.; Saunders, Elizabeth; Detter, John C.; Wu, Dongying; Eisen, Jonathan A.; Cavanaugh, Colleen M.

2010-01-01

200

The genome sequence of Schizosaccharomyces pombe  

Microsoft Academic Search

We have sequenced and annotated the genome of fission yeast (Schizosaccharomyces pombe), which contains the smallest number of protein-coding genes yet recorded for a eukaryote: 4,824. The centromeres are between 35 and 110 kilobases (kb) and contain related repeats including a highly conserved 1.8-kb element. Regions upstream of genes are longer than in budding yeast (Saccharomyces cerevisiae), possibly reflecting more-extended

R. Gwilliam; M.-A. Rajandream; M. Lyne; R. Lyne; A. Stewart; J. Sgouros; N. Peat; J. Hayles; S. Baker; D. Basham; S. Bowman; K. Brooks; D. Brown; S. Brown; T. Chillingworth; C. Churcher; M. Collins; R. Connor; A. Cronin; P. Davis; T. Feltwell; A. Fraser; S. Gentles; A. Goble; N. Hamlin; D. Harris; J. Hidalgo; G. Hodgson; S. Holroyd; T. Hornsby; S. Howarth; E. J. Huckle; S. Hunt; K. Jagels; K. James; L. Jones; M. Jones; S. Leather; S. McDonald; J. McLean; P. Mooney; S. Moule; K. Mungall; L. Murphy; D. Niblett; C. Odell; K. Oliver; S. O'Neil; D. Pearson; M. A. Quail; E. Rabbinowitsch; K. Rutherford; S. Rutter; D. Saunders; K. Seeger; S. Sharp; J. Skelton; M. Simmonds; R. Squares; S. Squares; K. Stevens; K. Taylor; R. G. Taylor; A. Tivey; S. Walsh; T. Warren; S. Whitehead; J. Woodward; G. Volckaert; R. Aert; J. Robben; B. Grymonprez; I. Weltjens; E. Vanstreels; M. Rieger; M. Schäfer; S. Müller-Auer; C. Gabel; M. Fuchs; C. Fritzc; E. Holzer; D. Moestl; H. Hilbert; K. Borzym; I. Langer; A. Beck; H. Lehrach; R. Reinhardt; T. M. Pohl; P. Eger; W. Zimmermann; H. Wedler; R. Wambutt; B. Purnelle; A. Goffeau; E. Cadieu; S. Dréano; S. Gloux; V. Lelaure; S. Mottier; F. Galibert; S. J. Aves; Z. Xiang; C. Hunt; K. Moore; S. M. Hurst; M. Lucas; M. Rochet; C. Gaillardin; V. A. Tallada; A. Garzon; G. Thode; R. R. Daga; L. Cruzado; J. Jimenez; M. Sánchez; F. del Rey; J. Benito; A. Domínguez; J. L. Revuelta; S. Moreno; J. Armstrong; S. L. Forsburg; L. Cerrutti; T. Lowe; W. R. McCombie; I. Paulsen; J. Potashkin; G. V. Shpakovski; D. Ussery; B. G. Barrell; P. Nurse

2002-01-01

201

Draft Genome Sequence of Rubrivivax gelatinosus CBS  

PubMed Central

Rubrivivax gelatinosus CBS, a purple nonsulfur photosynthetic bacterium, can grow photosynthetically using CO and N2 as the sole carbon and nitrogen nutrients, respectively. R. gelatinosus CBS is of particular interest due to its ability to metabolize CO and yield H2. We present the 5-Mb draft genome sequence of R. gelatinosus CBS with the goal of providing genetic insight into the metabolic properties of this bacterium.

Hu, Pingsha; Lang, Juan; Wawrousek, Karen; Yu, Jianping; Maness, Pin-Ching

2012-01-01

202

Genome Sequences of Pseudomonas spp. Isolated from Cereal Crops  

PubMed Central

Compared to those of dicot-infecting bacteria, the available genome sequences of bacteria that infect wheat and barley are limited. Herein, we report the draft genome sequences of four pseudomonads originally isolated from these cereals. These genome sequences provide a useful resource for comparative analyses within the genus and for cross-kingdom analyses of plant pathogenesis.

Stiller, Jiri; Covarelli, Lorenzo; Lindeberg, Magdalen; Shivas, Roger G.; Manners, John M.

2013-01-01

203

Plastid DNA sequence homologies in the tobacco nuclear genome  

Microsoft Academic Search

The tobacco (Nicotiana tabacum) nuclear genome contains long tracts of DNA (i.e. in excess of 18 kb) with high sequence homology to the tobacco plastid genome. Five lambda clones containing these nuclear DNA sequences encompass more than one-third of the tobacco plastid genome. The absolute size of these five integrants is unknown but potentially includes uninterrupted sequences that are as

Michael A. Ayliffe; Jeremy N. Timmis

1992-01-01

204

Draft Genome Sequence of Pseudomonas putida Strain MTCC5279  

PubMed Central

Here we report the genome sequence of a plant-growth-promoting rhizobacterium, Pseudomonas putida strain MTCC5279. The length of the draft genome sequence is approximately 5.2 Mb, with a GC content of 62.5%. The draft genome sequence reveals a number of genes whose products are possibly involved in plant growth promotion and abiotic stress tolerance.

Chaudhry, Vasvi; Asif, Mehar H.; Bag, Sumit; Goel, Ridhi; Mantri, Shrikant S.; Singh, Sunil K.; Chauhan, Puneet S.; Sawant, Samir V.

2013-01-01
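
The GC-content figure quoted in the record above (62.5% for a draft genome of roughly 5.2 Mb) is a simple base-composition statistic. A minimal Python sketch of how such a percentage could be computed is given below; the function name `gc_content` and the toy sequence are illustrative assumptions, not taken from the cited work, and ambiguous bases such as N are excluded from the denominator as one common convention.

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of G and C among unambiguous A/C/G/T bases.

    Hypothetical helper for illustration; ambiguous symbols (e.g. N) are ignored.
    """
    seq = sequence.upper()
    # Count only unambiguous nucleotides for the denominator.
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


if __name__ == "__main__":
    toy = "GCGCGATCGN"  # made-up sequence, not real genome data
    print(f"GC content: {gc_content(toy):.1f}%")
```

For a real draft assembly, the same calculation would typically be run over the concatenated contigs read from a FASTA file.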

205

Draft Genome Sequence of Pseudomonas putida Strain MTCC5279.  

PubMed

Here we report the genome sequence of a plant-growth-promoting rhizobacterium, Pseudomonas putida strain MTCC5279. The length of the draft genome sequence is approximately 5.2 Mb, with a GC content of 62.5%. The draft genome sequence reveals a number of genes whose products are possibly involved in plant growth promotion and abiotic stress tolerance. PMID:23908291

Chaudhry, Vasvi; Asif, Mehar H; Bag, Sumit; Goel, Ridhi; Mantri, Shrikant S; Singh, Sunil K; Chauhan, Puneet S; Sawant, Samir V; Nautiyal, Chandra Shekhar

2013-01-01

206

Mitochondrial Genome Sequence Evolution in Chlamydomonas  

PubMed Central

The mitochondrial genomes of the Chlorophyta exhibit significant diversity with respect to gene content and genome compactness; however, quantitative data on the rates of nucleotide substitution in mitochondrial DNA, which might help explain the origin of this diversity, are lacking. To gain insight into the evolutionary forces responsible for mitochondrial genome diversification, we sequenced to near completion the mitochondrial genome of the chlorophyte Chlamydomonas incerta, estimated the evolutionary divergence between Chlamydomonas reinhardtii and C. incerta mitochondrial protein-coding genes and rRNA-coding regions, and compared the relative evolutionary rates in mitochondrial and nuclear genes. Synonymous and nonsynonymous substitution rates do not differ significantly between the mitochondrial and nuclear protein-coding genes. The mitochondrial rRNA-coding regions, however, are evolving much faster than their nuclear counterparts, and this difference might be explained by relaxed functional constraints on the mitochondrial translational apparatus due to the small number of proteins synthesized in Chlamydomonas mitochondria. Substitution rates at synonymous sites in a nonstandard mitochondrial gene (rtl) and at intronic and synonymous sites in nuclear genes expressed at low levels suggest that the mutation rate is similar in these two genetic compartments. Potential evolutionary forces shaping mitochondrial genome evolution in Chlamydomonas are discussed.

Popescu, Cristina E.; Lee, Robert W.

2007-01-01

207

Whole-genome reconstruction and mutational signatures in gastric cancer  

PubMed Central

Background: Gastric cancer is the second highest cause of global cancer mortality. To explore the complete repertoire of somatic alterations in gastric cancer, we combined massively parallel short read and DNA paired-end tag sequencing to present the first whole-genome analysis of two gastric adenocarcinomas, one with chromosomal instability and the other with microsatellite instability. Results: Integrative analysis and de novo assemblies revealed the architecture of a wild-type KRAS amplification, a common driver event in gastric cancer. We discovered three distinct mutational signatures in gastric cancer: against a genome-wide backdrop of oxidative and microsatellite instability-related mutational signatures, we identified the first exome-specific mutational signature. Further characterization of the impact of these signatures by combining sequencing data from 40 complete gastric cancer exomes and targeted screening of an additional 94 independent gastric tumors uncovered ACVR2A, RPL22 and LMAN1 as recurrently mutated genes in microsatellite instability-positive gastric cancer and PAPPA as a recurrently mutated gene in TP53 wild-type gastric cancer. Conclusions: These results highlight how whole-genome cancer sequencing can uncover information relevant to tissue-specific carcinogenesis that would otherwise be missed from exome-sequencing data.

2012-01-01

208

Ovarian cancer genome.  

PubMed

Ovarian cancer (OC) is a relatively frequent malignant disease with a lifetime risk of approximately 1 in 70. As many as 15-25% of OC cases arise due to known heterozygous germ-line mutations in DNA repair genes, such as BRCA1, BRCA2, RAD51C, NBN (NBS1), BRIP, and PALB2. Sporadic ovarian cancers often phenocopy the features of BRCA1-related hereditary disease (so-called BRCAness), i.e., show biallelic somatic inactivation of the BRCA1 gene. Tumor-specific BRCA1 deficiency renders transformed cells selectively sensitive to platinating compounds and several other anticancer drugs, which explains the high response rates of OC to systemic therapies. High-throughput molecular profiling of OC is instrumental for further progress in the identification of novel OC diagnostic markers as well as for the development of new OC-specific treatments. However, interpretation of the huge bulk of incoming data may present a challenge. There is a critical need for the development of bioinformatic tools capable of integrating the multiplicity of available data sets into biologically and medically meaningful pieces of knowledge. PMID:23913204

Imyanitov, Evgeny N

2013-01-01

209

International network of cancer genome projects  

PubMed Central

The International Cancer Genome Consortium (ICGC) was launched to coordinate large-scale cancer genome studies in tumors from 50 different cancer types and/or subtypes that are of clinical and societal importance across the globe. Systematic studies of over 25,000 cancer genomes at the genomic, epigenomic, and transcriptomic levels will reveal the repertoire of oncogenic mutations, uncover traces of the mutagenic influences, define clinically-relevant subtypes for prognosis and therapeutic management, and enable the development of new cancer therapies.

2010-01-01

210

Functional genomics of tomato in a post-genome-sequencing phase.  

PubMed

Completion of the tomato genome sequencing project has had a broad impact on genetic and genomic studies of tomato and Solanaceae plants. The reference genome sequence derived from Solanum lycopersicum cv 'Heinz 1706' serves as the firm basis for sequencing-based approaches to tomato genomics. In this article, we first present a brief summary of the genome sequencing project and a summary of the reference genome sequence. We then focus on recent progress in transcriptome sequencing and small RNA sequencing and show how the reference genome sequence makes these analyses more comprehensive than before. We discuss the potential of in-depth analysis that is based on DNA methylome sequencing and transcription start-site detection. Finally, we describe the current status of efforts to resequence S. lycopersicum cultivars to demonstrate how resequencing can allow the use of intraspecific genomic diversity for detailed phenotyping and breeding. PMID:23641177

Aoki, Koh; Ogata, Yoshiyuki; Igarashi, Kaori; Yano, Kentaro; Nagasaki, Hideki; Kaminuma, Eli; Toyoda, Atsushi

2013-03-01

211

Functional genomics of tomato in a post-genome-sequencing phase  

PubMed Central

Completion of the tomato genome sequencing project has had a broad impact on genetic and genomic studies of tomato and Solanaceae plants. The reference genome sequence derived from Solanum lycopersicum cv ‘Heinz 1706’ serves as the firm basis for sequencing-based approaches to tomato genomics. In this article, we first present a brief summary of the genome sequencing project and a summary of the reference genome sequence. We then focus on recent progress in transcriptome sequencing and small RNA sequencing and show how the reference genome sequence makes these analyses more comprehensive than before. We discuss the potential of in-depth analysis that is based on DNA methylome sequencing and transcription start-site detection. Finally, we describe the current status of efforts to resequence S. lycopersicum cultivars to demonstrate how resequencing can allow the use of intraspecific genomic diversity for detailed phenotyping and breeding.

Aoki, Koh; Ogata, Yoshiyuki; Igarashi, Kaori; Yano, Kentaro; Nagasaki, Hideki; Kaminuma, Eli; Toyoda, Atsushi

2013-01-01

212

Why Assembling Plant Genome Sequences Is So Challenging  

PubMed Central

In spite of the biological and economic importance of plants, relatively few plant species have been sequenced. Only the genome sequences of plants with relatively small genomes, most of them angiosperms, in particular eudicots, have been determined. The arrival of next-generation sequencing technologies has allowed the rapid and efficient development of new genomic resources for non-model or orphan plant species. But the pace of plant sequencing is far behind that of animals and microorganisms. This review focuses on the typical challenges of plant genomes that can explain why plant genomics is less developed than animal genomics. Explanations of the impact of some confounding factors emerging from the nature of plant genomes are given. As a result of these challenges and confounding factors, the correct assembly and annotation of plant genomes is hindered, genome drafts are produced, and advances in plant genomics are delayed.

Claros, Manuel Gonzalo; Bautista, Rocio; Guerrero-Fernandez, Dario; Benzerki, Hicham; Seoane, Pedro; Fernandez-Pozo, Noe

2012-01-01

213

Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence  

Microsoft Academic Search

BACKGROUND: Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS)

Frank M You; Naxin Huo; Karin R Deal; Yong Q Gu; Ming-Cheng Luo; Patrick E McGuire; Jan Dvorak; Olin D Anderson

2011-01-01

214

Making sense of cancer genomic data  

PubMed Central

High-throughput tools for nucleic acid characterization now provide the means to conduct comprehensive analyses of all somatic alterations in the cancer genomes. Both large-scale and focused efforts have identified new targets of translational potential. The deluge of information that emerges from these genome-scale investigations has stimulated a parallel development of new analytical frameworks and tools. The complexity of somatic genomic alterations in cancer genomes also requires the development of robust methods for the interrogation of the function of genes identified by these genomics efforts. Here we provide an overview of the current state of cancer genomics, appraise the current portals and tools for accessing and analyzing cancer genomic data, and discuss emerging approaches to exploring the functions of somatically altered genes in cancer.

Chin, Lynda; Hahn, William C.; Getz, Gad; Meyerson, Matthew

2011-01-01

215

Cancer genetics and genomics: essentials for oncology nurses.  

PubMed

Cancer genetics and genomics are rapidly evolving, with new discoveries emerging in genetic mutations, variants, genomic sequencing, risk-reduction methods, and targeted therapies. To educate patients and families, state-of-the-art care requires nurses to understand terminology, scientific and technological advances, and pharmacogenomics. Clinical application of cancer genetics and genomics involves working in interdisciplinary teams to properly identify patient risk through assessing family history, facilitating genetic testing and counseling services, applying risk-reduction methods, and administering and monitoring targeted therapies. PMID:24867117

Boucher, Jean; Habin, Karleen; Underhill, Meghan

2014-06-01

216

Complete mitochondrial genome sequence of Nectogale elegans.  

PubMed

The elegant water shrew (Nectogale elegans) belongs to the family Soricidae and is distributed in northern South Asia, central and southern China and northern Southeast Asia. In this study, the complete mitochondrial genome of N. elegans was sequenced. It was determined to be 17,460 bases, and included 13 protein-coding genes (PCGs), 22 tRNA genes, 2 ribosomal RNA genes and one non-coding region, which is similar to other mammalian mitochondrial genomes. Bayesian inference and maximum likelihood methods were used to construct phylogenetic trees based on 12 heavy-strand concatenated PCGs. Phylogenetic analyses further confirmed that Crocidurinae diverged prior to Soricinae, and Sorex unguiculatus differentiated earlier than N. elegans. PMID:23795853

Huang, Ting; Yan, Chaochao; Tan, Zheng; Tu, Feiyun; Yue, Bisong; Zhang, Xiuyue

2014-08-01

217

The UCSC cancer genomics browser: update 2011  

PubMed Central

The UCSC Cancer Genomics Browser (https://genome-cancer.ucsc.edu) comprises a suite of web-based tools to integrate, visualize and analyze cancer genomics and clinical data. The browser displays whole-genome views of genome-wide experimental measurements for multiple samples alongside their associated clinical information. Multiple data sets can be viewed simultaneously as coordinated ‘heatmap tracks’ to compare across studies or different data modalities. Users can order, filter, aggregate, classify and display data interactively based on any given feature set including clinical features, annotated biological pathways and user-contributed collections of genes. Integrated standard statistical tools provide dynamic quantitative analysis within all available data sets. The browser hosts a growing body of publicly available cancer genomics data from a variety of cancer types, including data generated from the Cancer Genome Atlas project. Multiple consortiums use the browser on confidential prepublication data enabled by private installations. Many new features have been added, including the hgMicroscope tumor image viewer, hgSignature for real-time genomic signature evaluation on any browser track, and ‘PARADIGM’ pathway tracks to display integrative pathway activities. The browser is integrated with the UCSC Genome Browser; thus inheriting and integrating the Genome Browser’s rich set of human biology and genetics data that enhances the interpretability of the cancer genomics data.

Sanborn, J. Zachary; Benz, Stephen C.; Craft, Brian; Szeto, Christopher; Kober, Kord M.; Meyer, Laurence; Vaske, Charles J.; Goldman, Mary; Smith, Kayla E.; Kuhn, Robert M.; Karolchik, Donna; Kent, W. James; Stuart, Joshua M.; Haussler, David; Zhu, Jingchun

2011-01-01

218

The International Rice Genome Sequencing Project: progress and prospects  

Microsoft Academic Search

The rice genome sequencing project has been pursued as a national project in Japan since 1998. At the same time, a desire to accelerate the sequencing of the entire rice genome led to the formation of the International Rice Genome Sequencing Project (IRGSP), initially comprising five countries. The sequencing strategy is the conventional clone-by-clone shotgun method using P1-derived

T. Sasaki; T. Matsumoto; T. Baba; K. Yamamoto; J. Wu; Y. Katayose; K. Sakata

219

The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data.  

PubMed

The cBio Cancer Genomics Portal (http://cbioportal.org) is an open-access resource for interactive exploration of multidimensional cancer genomics data sets, currently providing access to data from more than 5,000 tumor samples from 20 cancer studies. The cBio Cancer Genomics Portal significantly lowers the barriers between complex genomic data and cancer researchers who want rapid, intuitive, and high-quality access to molecular profiles and clinical attributes from large-scale cancer genomics projects and empowers researchers to translate these rich data sets into biologic insights and clinical applications. PMID:22588877

Cerami, Ethan; Gao, Jianjiong; Dogrusoz, Ugur; Gross, Benjamin E; Sumer, Selcuk Onur; Aksoy, Bülent Arman; Jacobsen, Anders; Byrne, Caitlin J; Heuer, Michael L; Larsson, Erik; Antipin, Yevgeniy; Reva, Boris; Goldberg, Arthur P; Sander, Chris; Schultz, Nikolaus

2012-05-01

220

The Jackson Laboratory: The Mouse Genome Sequence Project  

NSDL National Science Digital Library

Part of the Mouse Genome Informatics program (last reported on in the NSDL Scout Report for the Life Sciences on March 19, 2004) at the Jackson Laboratory, this website presents The Mouse Genome Sequence (MGS) project. MGS is designed "to integrate emerging mouse genomic sequence data with the genetic and biological data available in MGD and GXD." The site links to Eukaryotic Genome Annotation Projects, as well as Sequence Analysis Tools including MouseBlast and Genome Analysis. The site also offers basic background information about the Mouse Genome Sequencing Initiative, and provides site users with access to groups involved in mouse genome sequencing, the BAC clone library, request forms for targeted sequencing, and more.

221

Draft Genome Sequence of Lactobacillus rossiae DSM 15814T  

PubMed Central

The draft genome sequence of Lactobacillus rossiae DSM 15814T (CS1, ATCC BAA-88) was determined by a whole-genome shotgun approach. Reads were assembled to a 2.9-Mb draft version. RAST genome annotation evidenced 2,723 predicted coding sequences. Many carbohydrate, amino acid, and amino acid derivative subsystem features were found.

Di Cagno, Raffaella; Cattonaro, Federica; Gobbetti, Marco

2012-01-01

222

Complete Genome Sequence of Staphylococcus aureus Siphovirus Phage JS01  

PubMed Central

Staphylococcus aureus is the most prevalent and economically significant pathogen causing bovine mastitis. We isolated and characterized one staphylophage from the milk of mastitis-affected cattle and sequenced its genome. Transmission electron microscopy (TEM) observation shows that it belongs to the family Siphovirus. We announce here its complete genome sequence and report major findings from the genomic analysis.

Jia, Hongying; Bai, Qinqin; Yang, Yongchun

2013-01-01

223

Ethical issues raised by whole genome sequencing.  

PubMed

While there is ongoing discussion about the details of implementation of whole genome sequencing (WGS) and whole exome sequencing (WES), there appears to be a consensus amongst geneticists that the widespread use of these approaches is not only inevitable, but will also be beneficial [1]. However, at the present time, we are unable to anticipate the full range of uses, consequences and impact of implementing WGS and WES. Nevertheless, the already known ethical issues, both in research and in clinical practice are diverse and complex and should be addressed properly presently. Herein, we discuss the ethical aspects of WGS and WES by particularly focussing on three overlapping themes: (1) informed consent, (2) data handling, and (3) the return of results. PMID:24810188

Pinxten, Wim; Howard, Heidi Carmen

2014-04-01

224

Complete genome sequence of the alkaliphilic bacterium Bacillus halodurans and genomic sequence comparison with Bacillus subtilis  

Microsoft Academic Search

The 4 202 353 bp genome of the alkaliphilic bacterium Bacillus halodurans C-125 contains 4066 predicted protein coding sequences (CDSs), 2141 (52.7%) of which have functional assignments, 1182 (29%) of which are conserved CDSs with unknown function and 743 (18.3%) of which have no match to any protein database. Among the total CDSs, 8.8% match sequences of proteins found only

Hideto Takami; Kaoru Nakasone; Yoshihiro Takaki; Go Maeno; Rumie Sasaki; Noriaki Masui; Fumie Fuji; Chie Hirama; Yuka Nakamura; Naotake Ogasawara; Satoru Kuhara; Koki Horikoshi

2000-01-01

225

Analysis of gains and losses of DNA sequences along all human chromosomes by comparative genomic hybridization implicates 6q and several other chromosomal sites as putative tumor suppressor gene loci in prostate cancer  

SciTech Connect

Genetic changes associated with the development of prostate cancer are poorly known. We sought to identify regions that contain important genes for the development of prostate cancer by using comparative genomic hybridization (CGH) for genome-wide screening of gains and losses of DNA sequences. In CGH, differentially labeled tumor and normal DNAs are co-hybridized to normal metaphase spreads to visualize chromosomal regions with losses and gains of DNA sequences. Analysis of 31 uncultured primary prostate cancers showed that deletions predominated over gains with a ratio of 5:1. The most commonly deleted regions were 8p; 32% (minimal common region p12-pter), 13q; 32% (q21-q31), 6q; 22% (cen-q21), 16q; 19% (cen-q23), 18q; 19% (q22-qter) and 9p; 16% (p23-pter). Gain of the entire long arm of chromosome 8 was found in 6% of cases but no high-level amplifications were found in any of the specimens. Of the aberrations found by CGH, 6q represents a previously unreported, major site for deletion in prostate cancer. Analysis of loss of heterozygosity (LOH) was used to confirm the presence of 6q and other deletions found by CGH. LOH and CGH data showed about 75% concordance. The significance of genetic aberrations in prostate cancer is being evaluated by correlating CGH findings with clinical outcome as well as by comparing genetic changes observed in the primary tumor with those found in recurrent lesions and metastases of the same patient.

Visakorpi, T.; Karhu, R.; Kallioniemi, A. [Tampere Univ. Hospital (Finland)] [and others]

1994-09-01

226

Evaluation of Genome Sequencing Quality in Selected Plant Species Using Expressed Sequence Tags  

PubMed Central

Background: With the completion of genome sequencing projects for more than 30 plant species, large volumes of genome sequences have been produced and stored in online databases. Advancements in sequencing technologies have reduced the cost and time of whole genome sequencing, enabling more and more plants to be subjected to genome sequencing. Despite this, genome sequence qualities of multiple plants have not been evaluated. Methodology/Principal Finding: Integrity and accuracy were calculated to evaluate the genome sequence quality of 32 plants. The integrity of a genome sequence is represented by the ratio of chromosome size to genome size (or of scaffold size to genome size), which ranged from 55.31% to nearly 100%. The accuracy of the genome sequence was represented by the ratio of matched ESTs to selected ESTs, where 52.93% to 98.28% and 89.02% to 98.85% of the randomly selected clean ESTs could be mapped to chromosome and scaffold sequences, respectively. According to the integrity, accuracy and other analysis of each plant species, thirteen plant species were divided into four levels. Arabidopsis thaliana, Oryza sativa and Zea mays had the highest quality, followed by Brachypodium distachyon, Populus trichocarpa, Vitis vinifera and Glycine max, Sorghum bicolor, Solanum lycopersicum and Fragaria vesca, and Lotus japonicus, Medicago truncatula and Malus × domestica in that order. Assembling the scaffold sequences into chromosome sequences should be the primary task for the remaining nineteen species. Low GC content and repeat DNA influence genome sequence assembly. Conclusion: The quality of plant genome sequences was found to be lower than envisaged and thus the rapid development of genome sequencing projects as well as research on bioinformatics tools and the algorithms of genome sequence assembly should provide increased processing and correction of genome sequences that have already been published.

Shangguan, Lingfei; Han, Jian; Kayesh, Emrul; Sun, Xin; Zhang, Changqing; Pervaiz, Tariq; Wen, Xicheng; Fang, Jinggui

2013-01-01
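The two quality measures described in the abstract above are simple ratios, and a minimal sketch of how they might be computed is given here. The function names and the example figures are hypothetical and are not taken from the paper; only the definitions (assembled size over genome size, matched ESTs over selected ESTs) follow the abstract.

```python
def integrity_percent(assembled_bp: int, genome_bp: int) -> float:
    """Integrity: assembled (chromosome or scaffold) size as a percentage
    of the estimated genome size."""
    if genome_bp <= 0:
        raise ValueError("genome_bp must be positive")
    return 100.0 * assembled_bp / genome_bp


def est_accuracy_percent(matched_ests: int, selected_ests: int) -> float:
    """Accuracy: percentage of the selected ESTs that could be mapped
    to the assembled sequence."""
    if selected_ests <= 0:
        raise ValueError("selected_ests must be positive")
    return 100.0 * matched_ests / selected_ests


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(integrity_percent(assembled_bp=120_000_000, genome_bp=150_000_000))
    print(est_accuracy_percent(matched_ests=950, selected_ests=1000))
```

Both functions return percentages so that the results can be compared directly with the ranges quoted in the abstract.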

227

Genomic Sequence Comparisons, 1987-2003 Final Report  

SciTech Connect

This project was to develop new DNA sequencing and RNA and protein quantitation methods and related genome annotation tools. The project began in 1987 with the development of multiplex sequencing (published in Science in 1988), and one of the first automated sequencing methods. This led to the first commercial genome sequence in 1994 and to the establishment of the main commercial participants (GTC then Agencourt) in the public DOE/NIH genome project. In collaboration with GTC we contributed to one of the first complete DOE genome sequences, in 1997, that of Methanobacterium thermoautotrophicum, a species of great relevance to energy-rich gas production.

George M. Church

2004-07-29

228

Draft Genome Sequence of Bacillus amyloliquefaciens B-1895.  

PubMed

In this report, we present a draft genome sequence of Bacillus amyloliquefaciens strain B-1895. Comparison with the genome of a reference strain demonstrated similar overall organization, as well as differences involving large gene clusters. PMID:24948774

Karlyshev, Andrey V; Melnikov, Vyacheslav G; Chistyakov, Vladimir A

2014-01-01

229

Draft Genome Sequence of Bacillus amyloliquefaciens B-1895  

PubMed Central

In this report, we present a draft genome sequence of Bacillus amyloliquefaciens strain B-1895. Comparison with the genome of a reference strain demonstrated similar overall organization, as well as differences involving large gene clusters.

Melnikov, Vyacheslav G.; Chistyakov, Vladimir A.

2014-01-01

230

Integrated Program in Microbial Genome Sequencing and Analysis.  

National Technical Information Service (NTIS)

The final progress report for this project contains information on nine microbial genome sequencing projects and two functional genomics projects that have been underway since the last report was submitted. The work funded under this award has resulted in...

2005-01-01

231

MIPS: a database for genomes and protein sequences  

Microsoft Academic Search

The Munich Information Center for Protein Sequences (MIPS-GSF), Martinsried, near Munich, Germany, continues its longstanding tradition to develop and maintain high quality curated genome databases. In addition, efforts have been intensified to cover the wealth of complete genome sequences in a systematic, comprehensive form. Bioinformatics, supporting national as well as European sequencing and functional analysis projects, has resulted in several

Hans-werner Mewes; Dmitrij Frishman; Christian Gruber; Birgitta Geier; Dirk Haase; Andreas Kaps; Kai Lemcke; Gertrud Mannhaupt; Friedhelm Pfeiffer; Christine M. Schüller; S. Stocker; B. Weil

2000-01-01

232

Validation of rice genome sequence by optical mapping  

Microsoft Academic Search

BACKGROUND: Rice feeds much of the world, and possesses the simplest genome analyzed to date within the grass family, making it an economically relevant model system for other cereal crops. Although the rice genome is sequenced, validation and gap closing efforts require purely independent means for accurate finishing of sequence build data. RESULTS: To facilitate ongoing sequencing finishing and validation

Shiguo Zhou; Michael C Bechner; Chris P Churas; Louise Pape; Sally A Leong; Rod Runnheim; Dan K Forrest; Steve Goldstein; Miron Livny; David C Schwartz

2007-01-01

233

Next Generation Sequencing at the University of Chicago Genomics Core  

ScienceCinema

The University of Chicago Genomics Core provides University of Chicago investigators (and external clients) access to State-of-the-Art genomics capabilities: next generation sequencing, Sanger sequencing / genotyping and micro-arrays (gene expression, genotyping, and methylation). The current presentation will highlight our capabilities in the area of ultra-high throughput sequencing analysis.

234

Next Generation Sequencing at the University of Chicago Genomics Core  

SciTech Connect

The University of Chicago Genomics Core provides University of Chicago investigators (and external clients) access to State-of-the-Art genomics capabilities: next generation sequencing, Sanger sequencing / genotyping and micro-arrays (gene expression, genotyping, and methylation). The current presentation will highlight our capabilities in the area of ultra-high throughput sequencing analysis.

Faber, Pieter [University of Chicago]

2013-04-24

235

Complete genome sequence of Arcanobacterium haemolyticum type strain (11018T)  

SciTech Connect

Vulcanisaeta distributa Itoh et al. 2002 belongs to the family Thermoproteaceae in the phylum Crenarchaeota. The genus Vulcanisaeta is characterized by a global distribution in hot and acidic springs. This is the first genome sequence from a member of the genus Vulcanisaeta and seventh genome sequence in the family Thermoproteaceae. The 2,374,137 bp long genome with its 2,544 protein-coding and 49 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

Yasawong, Montri [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Teshima, Hazuki [Los Alamos National Laboratory (LANL)]; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute]; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute]; Tice, Hope [U.S. Department of Energy, Joint Genome Institute]; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute]; Bruce, David [Los Alamos National Laboratory (LANL)]; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute]; Tapia, Roxanne [Los Alamos National Laboratory (LANL)]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [U.S. Department of Energy, Joint Genome Institute]; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Chang, Yun-Juan [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Pukall, Rudiger [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute]; Bristow, James [U.S. Department of Energy, Joint Genome Institute]; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute]; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]

2010-01-01

236

Next-generation sequencing strategies for characterizing the turkey genome.  

PubMed

The turkey genome sequencing project was initiated in 2008 and has relied primarily on next-generation sequencing (NGS) technologies. Our first efforts used a synergistic combination of 2 NGS platforms (Roche/454 and Illumina GAII), detailed bacterial artificial chromosome (BAC) maps, and unique assembly tools to sequence and assemble the genome of the domesticated turkey, Meleagris gallopavo. Since the first release in 2010, efforts to improve the genome assembly, gene annotation, and genomic analyses continue. The initial assembly build (2.01) represented about 89% of the genome sequence with 17X coverage depth (931 Mb). Sequence contigs were assigned to 30 of the 40 chromosomes with approximately 10% of the assembled sequence corresponding to unassigned chromosomes (ChrUn). The sequence has been refined through both genome-wide and area-focused sequencing, including shotgun and paired-end sequencing, and targeted sequencing of chromosomal regions with low or incomplete coverage. These additional efforts have improved the sequence assembly resulting in 2 subsequent genome builds of higher genome coverage (25X/Build3.0 and 30X/Build4.0) with a current sequence totaling 1,010 Mb. Further, BAC with end sequences assigned to the Z/W and MG18 (MHC) chromosomes, ChrUn, or not placed in the previous build were isolated, deeply sequenced (Hi-Seq), and incorporated into the latest build (5.0). To aid in the annotation and to generate a gene expression atlas of major tissues, a comprehensive set of RNA samples was collected at various developmental stages of female and male turkeys. Transcriptome sequencing data (using Illumina Hi-Seq) will provide information to enhance the final assembly and ultimately improve sequence annotation. The most current sequence covers more than 95% of the turkey genome and should yield a much improved gene level of annotation, making it a valuable resource for studying genetic variations underlying economically important traits in poultry. PMID:24570472

Dalloul, Rami A; Zimin, Aleksey V; Settlage, Robert E; Kim, Sungwon; Reed, Kent M

2014-02-01

237

Applications of next-generation sequencing technologies in functional genomics  

Microsoft Academic Search

A new generation of sequencing technologies, from Illumina/Solexa, ABI/SOLiD, 454/Roche, and Helicos, has provided unprecedented opportunities for high-throughput functional genomic research. To date, these technologies have been applied in a variety of contexts, including whole-genome sequencing, targeted resequencing, discovery of transcription factor binding sites, and noncoding RNA expression profiling. This review discusses applications of next-generation sequencing technologies in functional genomics

Olena Morozova; Marco A. Marra

2008-01-01

238

Synergy between sequence and size in Large-scale genomics  

Microsoft Academic Search

Until recently the study of individual DNA sequences and of total DNA content (the C-value) sat at opposite ends of the spectrum in genome biology. For gene sequencers, the vast stretches of non-coding DNA found in eukaryotic genomes were largely considered to be an annoyance, whereas genome-size researchers attributed little relevance to specific nucleotide sequences. However, the dawn of comprehensive

T. Ryan Gregory

2005-01-01

239

Multiplex sequencing of plant chloroplast genomes using Solexa sequencing-by-synthesis technology  

Microsoft Academic Search

Organellar DNA sequences are widely used in evolutionary and population genetic studies; however, the conservative nature of chloroplast gene and genome evolution often limits phylogenetic resolution and statistical power. To gain maximal access to the historical record contained within chloroplast genomes, we have adapted multiplex sequencing-by-synthesis (MSBS) to simultaneously sequence multiple genomes using the Illumina Genome Analyzer. We PCR-amplified

Richard Cronn; Aaron Liston; Matthew Parks; David S. Gernandt; Rongkun Shen; Todd Mockler

2008-01-01

240

Sequencing and assembly of the 22-Gb loblolly pine genome.  

PubMed

Conifers are the predominant gymnosperm. The size and complexity of their genomes has presented formidable technical challenges for whole-genome shotgun sequencing and assembly. We employed novel strategies that allowed us to determine the loblolly pine (Pinus taeda) reference genome sequence, the largest genome assembled to date. Most of the sequence data were derived from whole-genome shotgun sequencing of a single megagametophyte, the haploid tissue of a single pine seed. Although that constrained the quantity of available DNA, the resulting haploid sequence data were well-suited for assembly. The haploid sequence was augmented with multiple linking long-fragment mate pair libraries from the parental diploid DNA. For the longest fragments, we used novel fosmid DiTag libraries. Sequences from the linking libraries that did not match the megagametophyte were identified and removed. Assembly of the sequence data was aided by condensing the enormous number of paired-end reads into a much smaller set of longer "super-reads," rendering subsequent assembly with an overlap-based assembly algorithm computationally feasible. To further improve the contiguity and biological utility of the genome sequence, additional scaffolding methods utilizing independent genome and transcriptome assemblies were implemented. The combination of these strategies resulted in a draft genome sequence of 20.15 billion bases, with an N50 scaffold size of 66.9 kbp. PMID:24653210

Zimin, Aleksey; Stevens, Kristian A; Crepeau, Marc W; Holtz-Morris, Ann; Koriabine, Maxim; Marçais, Guillaume; Puiu, Daniela; Roberts, Michael; Wegrzyn, Jill L; de Jong, Pieter J; Neale, David B; Salzberg, Steven L; Yorke, James A; Langley, Charles H

2014-03-01
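The abstract above reports an N50 scaffold size. As a minimal sketch of the usual definition, the function below returns the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The function name and the toy lengths are hypothetical and are not drawn from the loblolly pine assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first scaffold that brings the running sum to at
        # least half of the total assembled length.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    # Toy scaffold lengths in bp, for illustration only.
    print(n50([80, 70, 50, 40, 30, 20]))
```

A larger N50 generally indicates a more contiguous assembly, which is why the statistic is quoted alongside the total assembled size.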

241

Exome sequencing: the sweet spot before whole genomes.  

PubMed

The development of massively parallel sequencing technologies, coupled with new massively parallel DNA enrichment technologies (genomic capture), has allowed the sequencing of targeted regions of the human genome in rapidly increasing numbers of samples. Genomic capture can target specific areas in the genome, including genes of interest and linkage regions, but this limits the study to what is already known. Exome capture allows an unbiased investigation of the complete protein-coding regions in the genome. Researchers can use exome capture to focus on a critical part of the human genome, allowing larger numbers of samples than are currently practical with whole-genome sequencing. In this review, we briefly describe some of the methodologies currently used for genomic and exome capture and highlight recent applications of this technology. PMID:20705737

Teer, Jamie K; Mullikin, James C

2010-10-15

242

A sequence-based survey of the complex structural organization of tumor genomes  

SciTech Connect

The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.

Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav; Yu, Peng; Wu, Chunxiao; Huang, Guiqing; Linardopoulou, Elena V.; Trask, Barbara J.; Waldman, Frederic; Costello, Joseph; Pienta, Kenneth J.; Mills, Gordon B.; Bajsarowicz, Krystyna; Kobayashi, Yasuko; Sridharan, Shivaranjani; Paris, Pamela; Tao, Quanzhou; Aerni, Sarah J.; Brown, Raymond P.; Bashir, Ali; Gray, Joe W.; Cheng, Jan-Fang; de Jong, Pieter; Nefedov, Mikhail; Ried, Thomas; Padilla-Nash, Hesed M.; Collins, Colin C.

2008-04-03

243

Genome Sequence of Lactobacillus plantarum Strain UCMA 3037  

PubMed Central

Nucleic acid of the strain Lactobacillus plantarum UCMA 3037, isolated from raw milk camembert cheese in our laboratory, was sequenced. We present its draft genome sequence with the aim of studying its functional properties and relationship to the cheese ecosystem.

Naz, Saima; Tareb, Raouf; Bernardeau, Marion; Vaisse, Melissa; Lucchetti-Miganeh, Celine; Rechenmann, Mathias

2013-01-01

244

Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome  

Microsoft Academic Search

Background: It is widely accepted that comparative sequence data can aid the functional annotation of genome sequences; however, the most informative species and features of genome evolution for comparison remain to be determined. Results: We analyzed conservation in eight genomic regions (apterous, even-skipped, fushi tarazu, twist, and Rhodopsins 1, 2, 3 and 4) from four Drosophila species (D. erecta, D.

Casey M Bergman; Barret D Pfeiffer; Diego E Rincón-Limas; Roger A Hoskins; Andreas Gnirke; Chris J Mungall; Adrienne M Wang; Brent Kronmiller; Joanne Pacleb; Soo Park; Mark Stapleton; Kenneth Wan; Reed A George; Pieter J de Jong; Juan Botas; Gerald M Rubin; Susan E Celniker

2002-01-01

245

The $1000 Genome: Ethical and Legal Issues in Whole Genome Sequencing of Individuals  

Microsoft Academic Search

Progress in gene sequencing could make rapid whole genome sequencing of individuals affordable to millions of persons and useful for many purposes in a future era of genomic medicine. Using the idea of $1000 genome as a focus, this article reviews the main technical, ethical, and legal issues that must be resolved to make mass genotyping of individuals cost-effective and

John A. Robertson

2003-01-01

246

Insight into the heterogeneity of breast cancer through next-generation sequencing  

PubMed Central

Rapid and sophisticated improvements in molecular analysis have allowed us to sequence whole human genomes as well as cancer genomes, and the findings suggest that we may be approaching the ability to individualize the diagnosis and treatment of cancer. This paradigmatic shift in approach will require clinicians and researchers to overcome several challenges including the huge spectrum of tumor types within a given cancer, as well as the cell-to-cell variations observed within tumors. This review discusses how next-generation sequencing of breast cancer genomes already reveals insight into tumor heterogeneity and how it can contribute to future breast cancer classification and management.

Russnes, Hege G.; Navin, Nicholas; Hicks, James; Borresen-Dale, Anne-Lise

2011-01-01

247

De Novo Next Generation Sequencing of Plant Genomes  

Microsoft Academic Search

The genome sequencing of all major food and bioenergy crops is of critical importance in the race to improve crop production to meet the future food and energy security needs of the world. Next generation sequencing technologies have brought about great improvements in sequencing throughput and cost, but do not yet allow for de novo sequencing of large repetitive genomes

Steve Rounsley; Pradeep Reddy Marri; Yeisoo Yu; Ruifeng He; Nick Sisneros; Jose Luis Goicoechea; So Jeong Lee; Angelina Angelova; Dave Kudrna; Meizhong Luo; Jason Affourtit; Brian Desany; James Knight; Faheem Niazi; Michael Egholm; Rod A. Wing

2009-01-01

248

The Genome Sequencer FLX System--longer reads, more applications, straight forward bioinformatics and more complete data sets.  

PubMed

The Genome Sequencer FLX System (GS FLX), powered by 454 Sequencing, is a next-generation DNA sequencing technology featuring a unique mix of long reads, exceptional accuracy, and ultra-high throughput. It has been proven to be the most versatile of all currently available next-generation sequencing technologies, supporting many high-profile studies in over seven applications categories. GS FLX users have pursued innovative research in de novo sequencing, re-sequencing of whole genomes and target DNA regions, metagenomics, and RNA analysis. 454 Sequencing is a powerful tool for human genetics research, having recently re-sequenced the genome of an individual human, currently re-sequencing the complete human exome and targeted genomic regions using the NimbleGen sequence capture process, and detected low-frequency somatic mutations linked to cancer. PMID:18616967

Droege, Marcus; Hill, Brendon

2008-08-31

249

Advancing Precision Medicine for Prostate Cancer Through Genomics  

PubMed Central

Prostate cancer is the most common type of cancer in men and the second leading cause of cancer death in men in the United States. The recent surge of high-throughput sequencing of cancer genomes has supported an expanding molecular classification of prostate cancer. Translation of these basic science studies into clinically valuable biomarkers for diagnosis and prognosis and biomarkers that are predictive for therapy is critical to the development of precision medicine in prostate cancer. We review potential applications aimed at improving screening specificity in prostate cancer and differentiating aggressive versus indolent prostate cancers. Furthermore, we review predictive biomarker candidates involving ETS gene rearrangements, PTEN inactivation, and androgen receptor signaling. These and other putative biomarkers may signify aberrant oncogene pathway activation and provide a rationale for matching patients with molecularly targeted therapies in clinical trials. Lastly, we advocate innovations for clinical trial design to incorporate tumor biopsy and molecular characterization to develop biomarkers and understand mechanisms of resistance.

Roychowdhury, Sameek; Chinnaiyan, Arul M.

2013-01-01

250

Diversity through duplication: Whole-genome sequencing reveals novel gene retrocopies in the human population.  

PubMed

Gene retrocopies are generated by reverse transcription and genomic integration of mRNA. As such, retrocopies present an important exception to the central dogma of molecular biology, and have substantially impacted the functional landscape of the metazoan genome. While an estimated 8,000-17,000 retrocopies exist in the human genome reference sequence, the extent of variation between individuals in terms of retrocopy content has remained largely unexplored. Three recent studies by Abyzov et al., Ewing et al. and Schrider et al. have exploited 1,000 Genomes Project Consortium data, as well as other sources of whole-genome sequencing data, to uncover novel gene retrocopies. Here, we compare the methods and results of these three studies, highlight the impact of retrocopies in human diversity and genome evolution, and speculate on the potential for somatic gene retrocopies to impact cancer etiology and genetic diversity among individual neurons in the mammalian brain. PMID:24615986

Richardson, Sandra R; Salvador-Palomeque, Carmen; Faulkner, Geoffrey J

2014-05-01

251

Finishing The Euchromatic Sequence Of The Human Genome  

SciTech Connect

The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ~99% of the euchromatic genome and is accurate to an error rate of ~1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

Rubin, Edward M.; Lucas, Susan; Richardson, Paul; Rokhsar, Daniel; Pennacchio, Len

2004-09-07

252

Identification of Candidate Drosophila Olfactory Receptors from Genomic DNA Sequence  

Microsoft Academic Search

We have taken advantage of the availability of a large amount of Drosophila genomic DNA sequence in the Berkeley Drosophila Genome Project database (~1/5 of the genome) to identify a family of novel seven transmembrane domain encoding genes that are putative Drosophila olfactory receptors. Members of the family are expressed in distinct subsets of olfactory neurons, and certain family members

Qian Gao; Andrew Chess

1999-01-01

253

Complete Genome Sequence of the Mesoplasma florum W37 Strain  

PubMed Central

Mesoplasma florum is a small-genome fast-growing mollicute that is an attractive model for systems and synthetic genomics studies. We report the complete 825,824-bp genome sequence of a second representative of this species, M. florum strain W37, which contains 733 predicted open reading frames and 35 stable RNAs.

Baby, Vincent; Matteau, Dominick; Knight, Thomas F.

2013-01-01

254

Complete Genome Sequences of Helicobacter pylori Clarithromycin-Resistant Strains  

PubMed Central

We report the complete genome sequences of two Helicobacter pylori clarithromycin-resistant strains. Clarithromycin (CLR)-resistant strains were obtained by exposing H. pylori strain 26695 to low clarithromycin concentrations on agar plates. The genome data provide insights into the genomic changes of H. pylori under selection by clarithromycin in vitro.

Binh, Tran Thanh; Suzuki, Rumiko; Shiota, Seiji; Kwon, Dong Hyeon

2013-01-01

255

Complete Genome Sequence of Mycoplasma wenyonii Strain Massachusetts  

PubMed Central

Mycoplasma wenyonii is a hemotrophic mycoplasma that causes acute and chronic infections in cattle. Here, we announce the first complete genome sequence of this organism. The genome is a single circular chromosome with 650,228 bp and G+C% of 33.9. Analyses of the M. wenyonii genome will provide insights into its biology.

Guimaraes, Ana M. S.; do Nascimento, Naila C.; SanMiguel, Phillip J.

2012-01-01

256

Genome Sequence of the Rice Pathogen Pseudomonas fuscovaginae CB98818  

PubMed Central

Pseudomonas fuscovaginae is a phytopathogenic bacterium causing bacterial sheath brown rot of cereal crops. Here, we present the draft genome sequence of P. fuscovaginae CB98818, originally isolated from a diseased rice plant in China. The draft genome will aid in epidemiological studies, comparative genomics, and quarantine of this broad-host-range pathogen.

Xie, Guanlin; Cui, Zhouqi; Tao, Zhongyun; Qiu, Hui; Liu, He; Zhu, Bo; Jin, Gulei; Sun, Guochang; Almoneafy, Abdulwareth

2012-01-01

257

Genome Sequence of Aedes aegypti, a Major Arbovirus Vector  

Microsoft Academic Search

We present a draft sequence of the genome of Aedes aegypti, the primary vector for yellow fever and dengue fever, which at ~1376 million base pairs is about 5 times the size of the genome of the malaria vector Anopheles gambiae. Nearly 50% of the Ae. aegypti genome consists of transposable elements. These contribute to a factor of ~4 to

Vishvanath Nene; Jennifer R. Wortman; Daniel Lawson; Brian Haas; Chinnappa Kodira; Z. Tu; Brendan Loftus; Zhiyong Xi; Karyn Megy; Manfred Grabherr; Quinghu Ren; E. M. Zdobnov; N. F. Lobo; K. S. Campbell; S. E. Brown; M. F. Bonaldo; Jingsong Zhu; S. P. Sinkins; D. G. Hogenkamp; Paolo Amedeo; Peter Arensburger; P. W. Atkinson; Shelby Bidwell; Jim Biedler; Ewan Birney; Robert V. Bruggner; Javier Costas; M. R. Coy; Jonathan Crabtree; Matt Crawford; Becky deBruyn; David DeCaprio; Karin Eiglmeier; Eric Eisenstadt; Hamza El-Dorry; W. M. Gelbart; S. L. Gomes; Martin Hammond; Linda I. Hannick; M. H. Holmes; J. R. Hogan; David Jaffe; J. S. Johnston; R. C. Kennedy; Hean Koo; Saul Kravitz; Evgenia V. Kriventseva; David Kulp; Kurt LaButti; Eduardo Lee; Song Li; Diane D. Lovin; Chunhong Mao; Evan Mauceli; C. F. M. Menck; J. R. Miller; Philip Montgomery; Akio Mori; A. L. Nascimento; H. F. Naveira; Chad Nusbaum; S. O'Leary; Joshua Orvis; Mihaela Pertea; Hadi Quesneville; K. R. Reidenbach; Yu-Hui Rogers; C. W. Roth; J. R. Schneider; Michael Schatz; Martin Shumway; Mario Stanke; E. O. Stinson; J. M. C. Tubio; J. P. VanZee; Sergio Verjovski-Almeida; Doreen Werner; Owen White; Stefan Wyder; Qiandong Zeng; Qi Zhao; Yongmei Zhao; C. A. Hill; A. S. Raikhel; M. B. Soares; D. L. Knudson; N. H. Lee; James Galagan; S. L. Salzberg; I. T. Paulsen; George Dimopoulos; F. H. Collins; Bruce Birren; C. M. Fraser-Liggett; D. W. Severson

2007-01-01

258

Complete genome sequence of Mycoplasma wenyonii strain Massachusetts.  

PubMed

Mycoplasma wenyonii is a hemotrophic mycoplasma that causes acute and chronic infections in cattle. Here, we announce the first complete genome sequence of this organism. The genome is a single circular chromosome with 650,228 bp and G+C% of 33.9. Analyses of the M. wenyonii genome will provide insights into its biology. PMID:22965086

dos Santos, Andrea P; Guimaraes, Ana M S; do Nascimento, Naíla C; SanMiguel, Phillip J; Messick, Joanne B

2012-10-01

259

Strong association between cancer and genomic instability.  

PubMed

After a first wave of radiation-induced chromosomal aberrations, a second wave appears 20-30 cell generations after radiation exposure and persists thereafter. This late effect is usually termed "genomic instability". A better term is "increased genomic instability". This effect has been observed in many cell systems in vitro and in vivo for quite a number of biological endpoints. The radiation-induced increase in genomic instability is apparently a general phenomenon. In the development of cancer, several mutations are involved. With increasing genomic instability, the probability for further mutations is enhanced. Several studies show that genomic instability is increased not only in the cancer cells but also in "normal" cells of cancer patients, e.g. peripheral lymphocytes. This has for example been shown in uranium miners with bronchial carcinomas, but also in untreated head and neck cancer patients. The association between cancer and genomic instability is also found in individuals with a genetic predisposition for increased radiosensitivity. Several such syndromes have been found. In all cases, an increased genomic instability, cancer proneness and increased radiosensitivity coincide. In these syndromes, deficiencies in certain DNA-repair pathways occur as well as deregulations of the cell cycle. In particular, mutations are seen in genes encoding proteins that are involved in the G1/S-phase checkpoint. Genomic instability apparently promotes cancer development. In this context, it is interesting that hypoxia, increased genomic instability and cancer are also associated. All these processes are energy dependent. Some strong evidence exists that the structure and length of telomeres are connected to the development of genomic instability. PMID:20033424

Streffer, Christian

2010-05-01

260

Mapping the human reference genome's missing sequence by three-way admixture in Latino genomes.  

PubMed

A principal obstacle to completing maps and analyses of the human genome involves the genome's "inaccessible" regions: sequences (often euchromatic and containing genes) that are isolated from the rest of the euchromatic genome by heterochromatin and other repeat-rich sequence. We describe a way to localize these sequences by using ancestry linkage disequilibrium in populations that derive ancestry from at least three continents, as is the case for Latinos. We used this approach to map the genomic locations of almost 20 megabases of sequence unlocalized or missing from the current human genome reference (NCBI Genome GRCh37), a substantial fraction of the human genome's remaining unmapped sequence. We show that the genomic locations of most sequences that originated from fosmids and larger clones can be admixture mapped in this way, by using publicly available whole-genome sequence data. Genome assembly efforts and future builds of the human genome reference will be strongly informed by this localization of genes and other euchromatic sequences that are embedded within highly repetitive pericentromeric regions. PMID:23932108

Genovese, Giulio; Handsaker, Robert E; Li, Heng; Kenny, Eimear E; McCarroll, Steven A

2013-09-01

261

Gene discovery in Plasmodium chabaudi by genome survey sequencing  

Microsoft Academic Search

The first genome survey sequencing of the rodent malaria parasite Plasmodium chabaudi is presented here. In 766 sequences, 131 putative gene sequences have been identified by sequence similarity database searches. Further, 7 potential gene families, four of which have not previously been described, were discovered. These genes may be important in understanding the biology of malaria, as well as offering

Christoph S. Janssen; Michael P. Barrett; Daniel Lawson; Michael A. Quail; David Harris; Sharen Bowman; R. Stephen Phillips; C. Michael R. Turner

2001-01-01

262

Accurate whole human genome sequencing using reversible terminator chemistry  

Microsoft Academic Search

DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally used long (400-800 base pair) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intraspecies genetic variation.

David R. Bentley; Shankar Balasubramanian; Harold P. Swerdlow; Geoffrey P. Smith; John Milton; Clive G. Brown; Kevin P. Hall; Dirk J. Evers; Colin L. Barnes; Helen R. Bignell; Jonathan M. Boutell; Jason Bryant; Richard J. Carter; R. Keira Cheetham; Anthony J. Cox; Darren J. Ellis; Michael R. Flatbush; Niall A. Gormley; Sean J. Humphray; Leslie J. Irving; Mirian S. Karbelashvili; Scott M. Kirk; Heng Li; Xiaohai Liu; Klaus S. Maisinger; Lisa J. Murray; Bojan Obradovic; Tobias Ost; Michael L. Parkinson; Mark R. Pratt; Isabelle M. J. Rasolonjatovo; Mark T. Reed; Roberto Rigatti; Chiara Rodighiero; Mark T. Ross; Andrea Sabot; Subramanian V. Sankar; Aylwyn Scally; Gary P. Schroth; Mark E. Smith; Vincent P. Smith; Anastassia Spiridou; Peta E. Torrance; Svilen S. Tzonev; Eric H. Vermaas; Klaudia Walter; Xiaolin Wu; Lu Zhang; Mohammed D. Alam; Carole Anastasi; Ify C. Aniebo; David M. D. Bailey; Iain R. Bancarz; Saibal Banerjee; Selena G. Barbour; Primo A. Baybayan; Vincent A. Benoit; Kevin F. Benson; Claire Bevis; Phillip J. Black; Asha Boodhun; Joe S. Brennan; John A. Bridgham; Rob C. Brown; Andrew A. Brown; Dale H. Buermann; Abass A. Bundu; James C. Burrows; Nigel P. Carter; Nestor Castillo; Maria Chiara E. Catenazzi; Simon Chang; R. Neil Cooley; Natasha R. Crake; Olubunmi O. Dada; Konstantinos D. Diakoumakos; Belen Dominguez-Fernandez; David J. Earnshaw; Ugonna C. Egbujor; David W. Elmore; Sergey S. Etchin; Mark R. Ewan; Milan Fedurco; Louise J. Fraser; Karin V. Fuentes Fajardo; W. Scott Furey; David George; Kimberley J. Gietzen; Colin P. Goddard; George S. Golda; Philip A. Granieri; David L. Gustafson; Nancy F. Hansen; Kevin Harnish; Christian D. Haudenschild; Narinder I. Heyer; Matthew M. Hims; Johnny T. Ho; Adrian M. Horgan; Katya Hoschler; Steve Hurwitz; Denis V. Ivanov; Maria Q. Johnson; Terena James; T. A. Huw Jones; Gyoung-Dong Kang; Tzvetana H. Kerelska; Alan D. Kersey; Irina Khrebtukova; Alex P. Kindwall; Zoya Kingsbury; Paula I. 
Kokko-Gonzales; Anil Kumar; Marc A. Laurent; Cynthia T. Lawley; Sarah E. Lee; Xavier Lee; Arnold K. Liao; Jennifer A. Loch; Mitch Lok; Shujun Luo; Radhika M. Mammen; John W. Martin; Patrick G. McCauley; Paul McNitt; Parul Mehta; Keith W. Moon; Joe W. Mullens; Taksina Newington; Zemin Ning; Bee Ling Ng; Sonia M. Novo; Mark A. Osborne; Andrew Osnowski; Omead Ostadan; Lambros L. Paraschos; Lea Pickering; Andrew C. Pike; D. Chris Pinkard; Daniel P. Pliskin; Joe Podhasky; Victor J. Quijano; Come Raczy; Vicki H. Rae; Stephen R. Rawlings; Ana Chiva Rodriguez; Phyllida M. Roe; John Rogers; Maria C. Rogert Bacigalupo; Nikolai Romanov; Anthony Romieu; Rithy K. Roth; Natalie J. Rourke; Silke T. Ruediger; Eli Rusman; Raquel M. Sanches-Kuiper; Martin R. Schenker; Josefina M. Seoane; Richard J. Shaw; Mitch K. Shiver; Steven W. Short; Ning L. Sizto; Johannes P. Sluis; Melanie A. Smith; Jean Ernest Sohna Sohna; Eric J. Spence; Kim Stevens; Neil Sutton; Lukasz Szajkowski; Carolyn L. Tregidgo; Gerardo Turcatti; Stephanie vandeVondele; Yuli Verhovsky; Selene M. Virk; Suzanne Wakelin; Gregory C. Walcott; Jingwen Wang; Graham J. Worsley; Juying Yan; Ling Yau; Mike Zuerlein; Jane Rogers; James C. Mullikin; Matthew E. Hurles; Nick J. McCooke; John S. West; Frank L. Oaks; Peter L. Lundberg; David Klenerman; Richard Durbin; Anthony J. Smith

2008-01-01

263

MIPS: a database for protein sequences and complete genomes  

Microsoft Academic Search

The MIPS group (Munich Information Center for Protein Sequences of the German National Center for Environment and Health (GSF)) at the Max-Planck- Institute for Biochemistry, Martinsried near Munich, Germany, is involved in a number of data collection activities, including a comprehensive database of the yeast genome, a database reflecting the progress in sequencing the Arabidopsis thaliana genome, the systematic analysis

Hans-werner Mewes; Jean Hani; Friedhelm Pfeiffer; Dmitrij Frishman

1998-01-01

264

Genome Sequence of the Nonpathogenic Pseudomonas aeruginosa Strain ATCC 15442.  

PubMed

Pseudomonas aeruginosa ATCC 15442 is an environmental strain of the Pseudomonas genus. Here, we present a 6.77-Mb assembly of its genome sequence. Besides giving insights into characteristics associated with the pathogenicity of P. aeruginosa, such as virulence, drug resistance, and biofilm formation, the genome sequence may provide some information related to biotechnological utilization of the strain. PMID:24786961

Wang, Yujiao; Li, Chao; Gao, Chao; Ma, Cuiqing; Xu, Ping

2014-01-01

265

Initial sequencing and analysis of the human genome  

Microsoft Academic Search

The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

Eric S. Lander; Lauren M. Linton; Bruce Birren; Chad Nusbaum; Michael C. Zody; Jennifer Baldwin; Keri Devon; Ken Dewar; Michael Doyle; William FitzHugh; Roel Funke; Diane Gage; Katrina Harris; Andrew Heaford; John Howland; Lisa Kann; Jessica Lehoczky; Rosie LeVine; Paul McEwan; Kevin McKernan; James Meldrim; Jill P. Mesirov; Cher Miranda; William Morris; Jerome Naylor; Christina Raymond; Mark Rosetti; Ralph Santos; Andrew Sheridan; Carrie Sougnez; Nicole Stange-Thomann; Nikola Stojanovic; Aravind Subramanian; Dudley Wyman; Jane Rogers; John Sulston; Rachael Ainscough; Stephan Beck; David Bentley; John Burton; Christopher Clee; Nigel Carter; Alan Coulson; Rebecca Deadman; Panos Deloukas; Andrew Dunham; Ian Dunham; Richard Durbin; Lisa French; Darren Grafham; Simon Gregory; Tim Hubbard; Sean Humphray; Adrienne Hunt; Matthew Jones; Christine Lloyd; Amanda McMurray; Lucy Matthews; Simon Mercer; Sarah Milne; James C. Mullikin; Andrew Mungall; Robert Plumb; Mark Ross; Ratna Shownkeen; Sarah Sims; Robert H. Waterston; Richard K. Wilson; LaDeana W. Hillier; John D. McPherson; Marco A. Marra; Elaine R. Mardis; Lucinda A. Fulton; Asif T. Chinwalla; Kymberlie H. Pepin; Warren R. Gish; Stephanie L. Chissoe; Michael C. Wendl; Kim D. Delehaunty; Tracie L. Miner; Andrew Delehaunty; Jason B. Kramer; Lisa L. Cook; Robert S. Fulton; Douglas L. Johnson; Patrick J. Minx; Sandra W. Clifton; Trevor Hawkins; Elbert Branscomb; Paul Predki; Paul Richardson; Sarah Wenning; Tom Slezak; Norman Doggett; Jan-Fang Cheng; Anne Olsen; Susan Lucas; Christopher Elkin; Edward Uberbacher; Marvin Frazier; Richard A. Gibbs; Donna M. Muzny; Steven E. Scherer; John B. Bouck; Erica J. Sodergren; Kim C. Worley; Catherine M. Rives; James H. Gorrell; Michael L. Metzker; Susan L. Naylor; Raju S. Kucherlapati; David L. Nelson; George M. 
Weinstock; Yoshiyuki Sakaki; Asao Fujiyama; Masahira Hattori; Tetsushi Yada; Atsushi Toyoda; Takehiko Itoh; Chiharu Kawagoe; Hidemi Watanabe; Yasushi Totoki; Todd Taylor; Jean Weissenbach; Roland Heilig; William Saurin; Francois Artiguenave; Philippe Brottier; Thomas Bruls; Eric Pelletier; Catherine Robert; Patrick Wincker; Douglas R. Smith; Lynn Doucette-Stamm; Marc Rubenfield; Keith Weinstock; Hong Mei Lee; JoAnn Dubois; André Rosenthal; Matthias Platzer; Gerald Nyakatura; Stefan Taudien; Andreas Rump; Huanming Yang; Jun Yu; Jian Wang; Guyang Huang; Jun Gu; Leroy Hood; Lee Rowen; Anup Madan; Shizen Qin; Ronald W. Davis; Nancy A. Federspiel; A. Pia Abola; Michael J. Proctor; Richard M. Myers; Jeremy Schmutz; Mark Dickson; Jane Grimwood; David R. Cox; Maynard V. Olson; Rajinder Kaul; Christopher Raymond; Nobuyoshi Shimizu; Kazuhiko Kawasaki; Shinsei Minoshima; Glen A. Evans; Maria Athanasiou; Roger Schultz; Bruce A. Roe; Feng Chen; Huaqin Pan; Juliane Ramser; Hans Lehrach; Richard Reinhardt; W. Richard McCombie; Melissa de la Bastide; Neilay Dedhia; Helmut Blöcker; Klaus Hornischer; Gabriele Nordsiek; Richa Agarwala; L. Aravind; Jeffrey A. Bailey; Serafim Batzoglou; Ewan Birney; Peer Bork; Daniel G. Brown; Christopher B. Burge; Lorenzo Cerutti; Hsiu-Chuan Chen; Deanna Church; Michele Clamp; Richard R. Copley; Tobias Doerks; Sean R. Eddy; Evan E. Eichler; Terrence S. Furey; James Galagan; James G. R. Gilbert; Cyrus Harmon; Yoshihide Hayashizaki; David Haussler; Henning Hermjakob; Karsten Hokamp; Wonhee Jang; L. Steven Johnson; Thomas A. Jones; Simon Kasif; Arek Kaspryzk; Scot Kennedy; W. James Kent; Paul Kitts; Eugene V. Koonin; Ian Korf; David Kulp; Doron Lancet; Todd M. Lowe; Aoife McLysaght; Tarjei Mikkelsen; John V. Moran; Nicola Mulder; Victor J. Pollara; Chris P. Ponting; Greg Schuler; Jörg Schultz; Guy Slater; Arian F. A. 
Smit; Elia Stupka; Joseph Szustakowki; Danielle Thierry-Mieg; Jean Thierry-Mieg; Lukas Wagner; John Wallis; Raymond Wheeler; Alan Williams; Yuri I. Wolf; Kenneth H. Wolfe; Shiaw-Pyng Yang; Ru-Fang Yeh; Francis Collins; Mark S. Guyer; Jane Peterson; Adam Felsenfeld; Kris A. Wetterstrand; Aristides Patrinos; Michael J. Morgan

2001-01-01

266

Draft Genome Sequence of the Fish Pathogen Piscirickettsia salmonis.  

PubMed

Piscirickettsia salmonis is a Gram-negative intracellular fish pathogen that has a significant impact on the salmon industry. Here, we report the genome sequence of P. salmonis strain LF-89. This is the first draft genome sequence of P. salmonis, and it reveals interesting attributes, including flagellar genes, despite this bacterium being considered nonmotile. PMID:24201203

Eppinger, Mark; McNair, Katelyn; Zogaj, Xhavit; Dinsdale, Elizabeth A; Edwards, Robert A; Klose, Karl E

2013-01-01

267

Genome Sequence of the Nonpathogenic Pseudomonas aeruginosa Strain ATCC 15442  

PubMed Central

Pseudomonas aeruginosa ATCC 15442 is an environmental strain of the Pseudomonas genus. Here, we present a 6.77-Mb assembly of its genome sequence. Besides giving insights into characteristics associated with the pathogenicity of P. aeruginosa, such as virulence, drug resistance, and biofilm formation, the genome sequence may provide some information related to biotechnological utilization of the strain.

Wang, Yujiao; Li, Chao; Ma, Cuiqing; Xu, Ping

2014-01-01

268

Complete Genome Sequence of Aeromonas veronii Strain B565  

PubMed Central

Aeromonas veronii strain B565 was isolated from aquaculture pond sediment in China. We present here the complete genome sequence of B565 and compare it with 2 published genome sequences of pathogenic strains in the Aeromonas genus. The result represents an independent stepwise acquisition of virulence factors of pathogenic strains in this genus.

Li, Yanxia; Liu, Yuchun; Zhou, Zhemin; Huang, Huoqing; Ren, Yan; Zhang, Yuting; Li, Guannan; Zhou, Zhigang; Wang, Lei

2011-01-01

269

Initial sequencing and analysis of the human genome.  

PubMed

The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence. PMID:11237011

Lander, E S; Linton, L M; Birren, B; Nusbaum, C; Zody, M C; Baldwin, J; Devon, K; Dewar, K; Doyle, M; FitzHugh, W; Funke, R; Gage, D; Harris, K; Heaford, A; Howland, J; Kann, L; Lehoczky, J; LeVine, R; McEwan, P; McKernan, K; Meldrim, J; Mesirov, J P; Miranda, C; Morris, W; Naylor, J; Raymond, C; Rosetti, M; Santos, R; Sheridan, A; Sougnez, C; Stange-Thomann, N; Stojanovic, N; Subramanian, A; Wyman, D; Rogers, J; Sulston, J; Ainscough, R; Beck, S; Bentley, D; Burton, J; Clee, C; Carter, N; Coulson, A; Deadman, R; Deloukas, P; Dunham, A; Dunham, I; Durbin, R; French, L; Grafham, D; Gregory, S; Hubbard, T; Humphray, S; Hunt, A; Jones, M; Lloyd, C; McMurray, A; Matthews, L; Mercer, S; Milne, S; Mullikin, J C; Mungall, A; Plumb, R; Ross, M; Shownkeen, R; Sims, S; Waterston, R H; Wilson, R K; Hillier, L W; McPherson, J D; Marra, M A; Mardis, E R; Fulton, L A; Chinwalla, A T; Pepin, K H; Gish, W R; Chissoe, S L; Wendl, M C; Delehaunty, K D; Miner, T L; Delehaunty, A; Kramer, J B; Cook, L L; Fulton, R S; Johnson, D L; Minx, P J; Clifton, S W; Hawkins, T; Branscomb, E; Predki, P; Richardson, P; Wenning, S; Slezak, T; Doggett, N; Cheng, J F; Olsen, A; Lucas, S; Elkin, C; Uberbacher, E; Frazier, M; Gibbs, R A; Muzny, D M; Scherer, S E; Bouck, J B; Sodergren, E J; Worley, K C; Rives, C M; Gorrell, J H; Metzker, M L; Naylor, S L; Kucherlapati, R S; Nelson, D L; Weinstock, G M; Sakaki, Y; Fujiyama, A; Hattori, M; Yada, T; Toyoda, A; Itoh, T; Kawagoe, C; Watanabe, H; Totoki, Y; Taylor, T; Weissenbach, J; Heilig, R; Saurin, W; Artiguenave, F; Brottier, P; Bruls, T; Pelletier, E; Robert, C; Wincker, P; Smith, D R; Doucette-Stamm, L; Rubenfield, M; Weinstock, K; Lee, H M; Dubois, J; Rosenthal, A; Platzer, M; Nyakatura, G; Taudien, S; Rump, A; Yang, H; Yu, J; Wang, J; Huang, G; Gu, J; Hood, L; Rowen, L; Madan, A; Qin, S; Davis, R W; Federspiel, N A; Abola, A P; Proctor, M J; Myers, R M; Schmutz, J; Dickson, M; Grimwood, J; Cox, D R; Olson, M V; Kaul, R; Raymond, C; Shimizu, N; 
Kawasaki, K; Minoshima, S; Evans, G A; Athanasiou, M; Schultz, R; Roe, B A; Chen, F; Pan, H; Ramser, J; Lehrach, H; Reinhardt, R; McCombie, W R; de la Bastide, M; Dedhia, N; Blöcker, H; Hornischer, K; Nordsiek, G; Agarwala, R; Aravind, L; Bailey, J A; Bateman, A; Batzoglou, S; Birney, E; Bork, P; Brown, D G; Burge, C B; Cerutti, L; Chen, H C; Church, D; Clamp, M; Copley, R R; Doerks, T; Eddy, S R; Eichler, E E; Furey, T S; Galagan, J; Gilbert, J G; Harmon, C; Hayashizaki, Y; Haussler, D; Hermjakob, H; Hokamp, K; Jang, W; Johnson, L S; Jones, T A; Kasif, S; Kaspryzk, A; Kennedy, S; Kent, W J; Kitts, P; Koonin, E V; Korf, I; Kulp, D; Lancet, D; Lowe, T M; McLysaght, A; Mikkelsen, T; Moran, J V; Mulder, N; Pollara, V J; Ponting, C P; Schuler, G; Schultz, J; Slater, G; Smit, A F; Stupka, E; Szustakowski, J; Thierry-Mieg, D; Thierry-Mieg, J; Wagner, L; Wallis, J; Wheeler, R; Williams, A; Wolf, Y I; Wolfe, K H; Yang, S P; Yeh, R F; Collins, F; Guyer, M S; Peterson, J; Felsenfeld, A; Wetterstrand, K A; Patrinos, A; Morgan, M J; de Jong, P; Catanese, J J; Osoegawa, K; Shizuya, H; Choi, S; Chen, Y J; Szustakowki, J

2001-02-15

270

Full genome sequence of a bovine enterovirus isolated in China.  

PubMed

We report the full genome sequence of an isolate of bovine enterovirus type B from China. The virus (BEV-BJ001) was isolated from Beijing, China, from fecal swabs of cattle suffering from severe diarrhea. This genome sequence will give useful insight for future molecular epidemiological studies in China. PMID:24970832

Peng, Xiao-Wei; Dong, Hao; Wu, Qing-Min; Lu, Yan-Li

2014-01-01

271

Draft Genome Sequence of Enterococcus mundtii CRL1656  

PubMed Central

We report the draft genome sequence of Enterococcus mundtii CRL1656, which was isolated from the stripping milk of a clinically healthy adult Holstein dairy cow from a dairy farm of the northwestern region of Tucumán (Argentina). The 3.10-Mb genome sequence consists of 450 large contigs and contains 2,741 predicted protein-coding genes.

Magni, Christian; Espeche, Carolina; Repizo, Guillermo D.; Saavedra, Lucila; Suarez, Cristian A.; Blancato, Victor S.; Espariz, Martin; Esteban, Luis; Raya, Raul R.; Font de Valdez, Graciela; Vignolo, Graciela; Mozzi, Fernanda; Taranto, Maria Pia; Hebert, Elvira M.; Nader-Macias, Maria Elena

2012-01-01

272

Draft genome sequence of Enterococcus mundtii CRL1656.  

PubMed

We report the draft genome sequence of Enterococcus mundtii CRL1656, which was isolated from the stripping milk of a clinically healthy adult Holstein dairy cow from a dairy farm of the northwestern region of Tucumán (Argentina). The 3.10-Mb genome sequence consists of 450 large contigs and contains 2,741 predicted protein-coding genes. PMID:22207752

Magni, Christian; Espeche, Carolina; Repizo, Guillermo D; Saavedra, Lucila; Suárez, Cristian A; Blancato, Víctor S; Espariz, Martín; Esteban, Luis; Raya, Raúl R; Font de Valdez, Graciela; Vignolo, Graciela; Mozzi, Fernanda; Taranto, María Pía; Hebert, Elvira M; Nader-Macías, María Elena; Sesma, Fernando

2012-01-01

273

Full Genome Sequence of Giant Panda Rotavirus Strain CH-1  

PubMed Central

We report here the complete genomic sequence of the giant panda rotavirus strain CH-1. This work is the first to document the complete genomic sequence (segments 1 to 11) of the CH-1 strain, which offers an effective platform for providing authentic research experiences to novice scientists.

Guo, Ling; Yang, Shaolin; Wang, Chengdong; Chen, Shijie; Yang, Xiaonong; Hou, Rong; Quan, Zifang; Hao, Zhongxiang

2013-01-01

274

Full Genome Sequence of a Bovine Enterovirus Isolated in China  

PubMed Central

We report the full genome sequence of an isolate of bovine enterovirus type B from China. The virus (BEV-BJ001) was isolated from Beijing, China, from fecal swabs of cattle suffering from severe diarrhea. This genome sequence will give useful insight for future molecular epidemiological studies in China.

Peng, Xiao-wei; Dong, Hao; Wu, Qing-min

2014-01-01

275

Genome Sequence of the Pathogenic Bacterium Vibrio vulnificus Biotype 3  

PubMed Central

We report the first genome sequence of the pathogenic Vibrio vulnificus biotype 3. This draft genome sequence of the environmental strain VVyb1(BT3), isolated in Israel, provides a representation of this newly emerged clonal group, which reveals higher similarity to the clinical strains of biotype 1 than to the environmental ones.

Danin-Poleg, Yael; Elgavish, Sharona; Raz, Nili; Efimov, Vera

2013-01-01

276

Assessing the Drosophila melanogaster and Anopheles gambiae Genome Annotations Using Genome-Wide Sequence Comparisons  

PubMed Central

We performed genome-wide sequence comparisons at the protein coding level between the genome sequences of Drosophila melanogaster and Anopheles gambiae. Such comparisons detect evolutionarily conserved regions (ecores) that can be used for a qualitative and quantitative evaluation of the available annotations of both genomes. They also provide novel candidate features for annotation. The percentage of ecores mapping outside annotations in the A. gambiae genome is about fourfold higher than in D. melanogaster. The A. gambiae genome assembly also contains a high proportion of duplicated ecores, possibly resulting from artefactual sequence duplications in the genome assembly. The occurrence of 4063 ecores in the D. melanogaster genome outside annotations suggests that some genes are not yet or only partially annotated. The present work illustrates the power of comparative genomics approaches towards an exhaustive and accurate establishment of gene models and gene catalogues in insect genomes.

Jaillon, Olivier; Dossat, Carole; Eckenberg, Ralph; Eiglmeier, Karin; Segurens, Beatrice; Aury, Jean-Marc; Roth, Charles W.; Scarpelli, Claude; Brey, Paul T.; Weissenbach, Jean; Wincker, Patrick

2003-01-01

277

Low-pass sequencing for microbial comparative genomics  

Microsoft Academic Search

BACKGROUND: We studied four extremely halophilic archaea by low-pass shotgun sequencing: (1) the metabolically versatile Haloarcula marismortui; (2) the non-pigmented Natrialba asiatica; (3) the psychrophile Halorubrum lacusprofundi and (4) the Dead Sea isolate Halobaculum gomorrense. Approximately one thousand single pass genomic sequences per genome were obtained. The data were analyzed by comparative genomic analyses using the completed Halobacterium sp. NRC-1

Young Ah Goo; Jared Roach; Gustavo Glusman; Nitin S Baliga; Kerry Deutsch; Min Pan; Sean Kennedy; Shiladitya DasSarma; Wailap Victor Ng; Leroy Hood

2004-01-01

278

Genomic sequencing and analysis of Clostera anachoreta granulovirus  

Microsoft Academic Search

The complete genome of Clostera anachoreta granulovirus (ClanGV) from an important pest of poplar, Clostera anachoreta (Lepidoptera: Notodontidae), was sequenced and analyzed. The circular double-stranded genome is 101,487 bp in size with a C+G content of 44.4%. It is predicted to contain 123 open reading frames (ORFs), covering 93% of the whole genome sequence. One hundred eleven ClanGV ORFs are homologues

Zhenpu Liang; Xiaoxia Zhang; Xinming Yin; Sumei Cao; Feng Xu

2011-01-01

279

A physical map of the papaya genome with integrated genetic map and genome sequence  

PubMed Central

Background: Papaya is a major fruit crop in tropical and subtropical regions worldwide and has primitive sex chromosomes controlling sex determination in this trioecious species. The papaya genome was recently sequenced because of its agricultural importance, unique biological features, and successful application of transgenic papaya for resistance to papaya ringspot virus. As a part of the genome sequencing project, we constructed a BAC-based physical map using a high information-content fingerprinting approach to assist whole genome shotgun sequence assembly. Results: The physical map consists of 963 contigs, representing 9.4× genome equivalents, and was integrated with the genetic map and genome sequence using BAC end sequences and a sequence-tagged high-density genetic map. The estimated genome coverage of the physical map is about 95.8%, while 72.4% of the genome was aligned to the genetic map. A total of 1,181 high quality overgo (overlapping oligonucleotide) probes representing conserved sequences in Arabidopsis and genetically mapped loci in Brassica were anchored on the physical map, which provides a foundation for comparative genomics in the Brassicales. The integrated genetic and physical map aligned with the genome sequence revealed recombination hotspots as well as regions suppressed for recombination across the genome, particularly on the recently evolved sex chromosomes. Suppression of recombination spread to the adjacent region of the male specific region of the Y chromosome (MSY), and recombination rates were recovered gradually and then exceeded the genome average. Recombination hotspots were observed at about 10 Mb away on both sides of the MSY, showing a 7-fold increase compared with the genome-wide average, demonstrating the dynamics of recombination of the sex chromosomes. Conclusion: A BAC-based physical map of papaya was constructed and integrated with the genetic map and genome sequence. The integrated map facilitated the draft genome assembly, and is a valuable resource for comparative genomics and map-based cloning of agronomically and economically important genes and for sex chromosome research.

2009-01-01

280

Using BLAT to Find Sequence Similarity in Closely Related Genomes  

PubMed Central

The BLAST-Like Alignment Tool (BLAT) is used to find genomic sequences that match a protein or DNA sequence submitted by the user. BLAT is typically used for searching similar sequences within the same or closely related species. It was developed to align millions of expressed sequence tags and mouse whole-genome random reads to the human genome at a faster speed (Kent, 2002). It is freely available either on the web or as a downloadable stand-alone program. BLAT search results provide a link for visualization in the University of California, Santa Cruz (UCSC) genome browser where associated biological information may be obtained. Three example protocols are given: using an mRNA sequence to identify the exon-intron locations and associated gene in the genomic sequence of the same species, using a protein sequence to identify the coding regions in a genomic sequence and to search for gene family members in the same species, and using a protein sequence to find homologs in another species. A support protocol is given to visualize multiple nearby matches obtained in a search in one view of the UCSC Genome Browser. Discussion of the technical aspects of BLAT is also provided.

Bhagwat, Medha; Young, Lynn; Robison, Rex R.

2014-01-01

281

Progress in Understanding and Sequencing the Genome of Brassica rapa  

PubMed Central

Brassica rapa, which is closely related to Arabidopsis thaliana, is an important crop and a model plant for studying genome evolution via polyploidization. We report the current understanding of the genome structure of B. rapa and efforts for the whole-genome sequencing of the species. The tribe Brassiceae, which comprises ca. 240 species, descended from a common hexaploid ancestor with a basic genome similar to that of Arabidopsis. Chromosome rearrangements, including fusions and/or fissions, resulted in the present-day “diploid” Brassica species with variation in chromosome number and phenotype. Triplicated genomic segments of B. rapa are collinear to those of A. thaliana with InDels. The genome triplication has led to an approximately 1.7-fold increase in the B. rapa gene number compared to that of A. thaliana. Repetitive DNA of B. rapa has also been extensively amplified and has diverged from that of A. thaliana. For its whole-genome sequencing, the Brassica rapa Genome Sequencing Project (BrGSP) consortium has developed suitable genomic resources and constructed genetic and physical maps. Ten chromosomes of B. rapa are being allocated to BrGSP consortium participants, and each chromosome will be sequenced by a BAC-by-BAC approach. Genome sequencing of B. rapa will offer a new perspective for plant biology and evolution in the context of polyploidization.

Hong, Chang Pyo; Kwon, Soo-Jin; Kim, Jung Sun; Yang, Tae-Jin; Park, Beom-Seok; Lim, Yong Pyo

2008-01-01

282

Complete Genome Sequence of Probiotic Strain Lactobacillus acidophilus La-14.  

PubMed

We present the 1,991,830-bp complete genome sequence of Lactobacillus acidophilus strain La-14 (SD-5212). Comparative genomic analysis revealed 99.98% similarity overall to the L. acidophilus NCFM genome. Globally, 111 single nucleotide polymorphisms (SNPs) (95 SNPs, 16 indels) were observed throughout the genome. Also, a 416-bp deletion in the LA14_1146 sugar ABC transporter was identified. PMID:23788546

Stahl, Buffy; Barrangou, Rodolphe

2013-01-01

283

Complete Genome Sequence of Probiotic Strain Lactobacillus acidophilus La-14  

PubMed Central

We present the 1,991,830-bp complete genome sequence of Lactobacillus acidophilus strain La-14 (SD-5212). Comparative genomic analysis revealed 99.98% similarity overall to the L. acidophilus NCFM genome. Globally, 111 single nucleotide polymorphisms (SNPs) (95 SNPs, 16 indels) were observed throughout the genome. Also, a 416-bp deletion in the LA14_1146 sugar ABC transporter was identified.

Stahl, Buffy

2013-01-01

284

Genome Sequence of the Lager Brewing Yeast, an Interspecies Hybrid  

Microsoft Academic Search

This work presents the genome sequencing of the lager brewing yeast (Saccharomyces pastorianus) Weihenstephan 34/70, a strain widely used in lager beer brewing. The 25 Mb genome comprises two nuclear sub-genomes originating from Saccharomyces cerevisiae and Saccharomyces bayanus and one circular mitochondrial genome originating from S. bayanus. Thirty-six different types of chromosomes were found including eight chromosomes with translocations

Yoshihiro Nakao; Takeshi Kanamori; Takehiko Itoh; Y. Kodama; S. Rainieri; N. Nakamura; T. Shimonaga; M. Hattori; T. Ashikari

2009-01-01

285

Complete Chloroplast Genome Sequence of Glycine max and Comparative Analyses with other Legume Genomes  

Microsoft Academic Search

Lack of complete chloroplast genome sequences is still one of the major limitations to extending chloroplast genetic engineering technology to useful crops. Therefore, we sequenced the soybean chloroplast genome and compared it to the other completely sequenced legumes, Lotus and Medicago. The chloroplast genome of Glycine is 152,218 basepairs (bp) in length, including a pair of inverted repeats of 25,574 bp

Christopher Saski; Seung-Bum Lee; Henry Daniell; Todd C. Wood; Jeffrey Tomkins; Hyi-Gyung Kim; Robert K. Jansen

2005-01-01

286

Standards for Sequencing Viral Genomes in the Era of High-Throughput Sequencing  

PubMed Central

Thanks to high-throughput sequencing technologies, genome sequencing has become a common component in nearly all aspects of viral research; thus, we are experiencing an explosion in both the number of available genome sequences and the number of institutions producing such data. However, there are currently no common standards used to convey the quality, and therefore utility, of these various genome sequences. Here, we propose five “standard” categories that encompass all stages of viral genome finishing, and we define them using simple criteria that are agnostic to the technology used for sequencing. We also provide genome finishing recommendations for various downstream applications, keeping in mind the cost-benefit trade-offs associated with different levels of finishing. Our goal is to define a common vocabulary that will allow comparison of genome quality across different research groups, sequencing platforms, and assembly techniques.

Beitzel, Brett; Chain, Patrick S. G.; Davenport, Matthew G.; Donaldson, Eric; Frieman, Matthew; Kugelman, Jeffrey; Kuhn, Jens H.; O'Rear, Jules; Sabeti, Pardis C.; Wentworth, David E.; Wiley, Michael R.; Yu, Guo-Yun; Sozhamannan, Shanmuga; Bradburne, Christopher

2014-01-01

287

Standards for sequencing viral genomes in the era of high-throughput sequencing.  

PubMed

Thanks to high-throughput sequencing technologies, genome sequencing has become a common component in nearly all aspects of viral research; thus, we are experiencing an explosion in both the number of available genome sequences and the number of institutions producing such data. However, there are currently no common standards used to convey the quality, and therefore utility, of these various genome sequences. Here, we propose five "standard" categories that encompass all stages of viral genome finishing, and we define them using simple criteria that are agnostic to the technology used for sequencing. We also provide genome finishing recommendations for various downstream applications, keeping in mind the cost-benefit trade-offs associated with different levels of finishing. Our goal is to define a common vocabulary that will allow comparison of genome quality across different research groups, sequencing platforms, and assembly techniques. PMID:24939889

Ladner, Jason T; Beitzel, Brett; Chain, Patrick S G; Davenport, Matthew G; Donaldson, Eric F; Frieman, Matthew; Kugelman, Jeffrey R; Kuhn, Jens H; O'Rear, Jules; Sabeti, Pardis C; Wentworth, David E; Wiley, Michael R; Yu, Guo-Yun; Sozhamannan, Shanmuga; Bradburne, Christopher; Palacios, Gustavo

2014-01-01

288

Next-Generation Sequencing for Cancer Diagnostics: a Practical Perspective  

PubMed Central

Next-generation sequencing (NGS) is arguably one of the most significant technological advances in the biological sciences of the last 30 years. The second generation sequencing platforms have advanced rapidly to the point that several genomes can now be sequenced simultaneously in a single instrument run in under two weeks. Targeted DNA enrichment methods allow even higher genome throughput at a reduced cost per sample. Medical research has embraced the technology and the cancer field is at the forefront of these efforts given the genetic aspects of the disease. World-wide efforts to catalogue mutations in multiple cancer types are underway and this is likely to lead to new discoveries that will be translated to new diagnostic, prognostic and therapeutic targets. NGS is now maturing to the point where it is being considered by many laboratories for routine diagnostic use. The sensitivity, speed and reduced cost per sample make it a highly attractive platform compared to other sequencing modalities. Moreover, as we identify more genetic determinants of cancer there is a greater need to adopt multi-gene assays that can quickly and reliably sequence complete genes from individual patient samples. Whilst widespread and routine use of whole genome sequencing is likely to be a few years away, there are immediate opportunities to implement NGS for clinical use. Here we review the technology, methods and applications that can be immediately considered and some of the challenges that lie ahead.

Meldrum, Cliff; Doyle, Maria A; Tothill, Richard W

2011-01-01

289

Community-wide analysis of microbial genome sequence signatures  

PubMed Central

Background: Analyses of DNA sequences from cultivated microorganisms have revealed genome-wide, taxa-specific nucleotide compositional characteristics, referred to as genome signatures. These signatures have far-reaching implications for understanding genome evolution and potential application in classification of metagenomic sequence fragments. However, little is known regarding the distribution of genome signatures in natural microbial communities or the extent to which environmental factors shape them. Results: We analyzed metagenomic sequence data from two acidophilic biofilm communities, including composite genomes reconstructed for nine archaea, three bacteria, and numerous associated viruses, as well as thousands of unassigned fragments from strain variants and low-abundance organisms. Genome signatures, in the form of tetranucleotide frequencies analyzed by emergent self-organizing maps, segregated sequences from all known populations sharing < 50 to 60% average amino acid identity and revealed previously unknown genomic clusters corresponding to low-abundance organisms and a putative plasmid. Signatures were pervasive genome-wide. Clusters were resolved because intra-genome differences resulting from translational selection or protein adaptation to the intracellular (pH ~5) versus extracellular (pH ~1) environment were small relative to inter-genome differences. We found that these genome signatures stem from multiple influences but are primarily manifested through codon composition, which we propose is the result of genome-specific mutational biases. Conclusions: An important conclusion is that shared environmental pressures and interactions among coevolving organisms do not obscure genome signatures in acid mine drainage communities. Thus, genome signatures can be used to assign sequence fragments to populations, an essential prerequisite if metagenomics is to provide ecological and biochemical insights into the functioning of microbial communities.

Dick, Gregory J; Andersson, Anders F; Baker, Brett J; Simmons, Sheri L; Thomas, Brian C; Yelton, A Pepper; Banfield, Jillian F

2009-01-01
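
The tetranucleotide-frequency signature described in the record above can be illustrated with a minimal sketch. It assumes a plain DNA string as input and only builds the 256-element frequency vector. The emergent self-organizing map clustering used in the study is not reproduced, and the function name is hypothetical. Strand-symmetric counting (merging each 4-mer with its reverse complement), which signature analyses often apply, is also left out for brevity.

```python
from collections import Counter
from itertools import product


def tetranucleotide_frequencies(seq):
    """Return a dict mapping each of the 256 possible 4-mers to its relative frequency.

    Windows containing anything other than A, C, G or T (e.g. N) are skipped.
    """
    seq = seq.upper()
    alphabet = set("ACGT")
    counts = Counter()
    # Slide a window of width 4 along the sequence, one base at a time.
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if set(kmer) <= alphabet:
            counts[kmer] += 1
    total = sum(counts.values())
    all_kmers = ("".join(p) for p in product("ACGT", repeat=4))
    # Report every 4-mer, including those never observed, so vectors are comparable.
    return {k: (counts[k] / total if total else 0.0) for k in all_kmers}


if __name__ == "__main__":
    signature = tetranucleotide_frequencies("ACGTACGT")
    print(signature["ACGT"])  # 2 of the 5 windows are ACGT
```

Vectors built this way for different fragments can then be compared or clustered with whatever method is preferred.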

290

Complete genomic sequence of bluetongue virus serotype 16 from China.  

PubMed

We report here the complete genomic sequence of the Chinese bluetongue virus serotype 16 (BTV16) strain BN96/16. This work is the first to document the complete genomic sequence (segments 1 to 10) of a BTV16 strain. The sequence information provided herein will help determine the geographic origin of BTV16 and define the phylogenetic relationship of BTV16 to other BTV strains. PMID:22106384

Yang, Tao; Liu, Nihong; Xu, Qingyuan; Sun, Encheng; Qin, Yongli; Zhao, Jin; Wu, Donglai

2011-12-01

291

Sequencing genomes from single cells by polymerase cloning  

Microsoft Academic Search

Genome sequencing currently requires DNA from pools of numerous nearly identical cells (clones), leaving the genome sequences of many difficult-to-culture microorganisms unattainable. We report a sequencing strategy that eliminates culturing of microorganisms by using real-time isothermal amplification to form polymerase clones (plones) from the DNA of single cells. Two Escherichia coli plones, analyzed by Affymetrix chip hybridization, demonstrate that plonal

Adam C Martiny; Nikos B Reppas; Kerrie W Barry; Joel Malek; Sallie W Chisholm; Kun Zhang; George M Church

2006-01-01

292

MIPS: a database for genomes and protein sequences  

Microsoft Academic Search

The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein

Hans-Werner Mewes; Dmitrij Frishman; Ulrich Güldener; Gertrud Mannhaupt; Klaus F. X. Mayer; Martin Mokrejs; Burkhard Morgenstern; Martin Münsterkötter; Stephen Rudd; B. Weil

2002-01-01

293

Genome Sequence of the Trichosporon asahii Environmental Strain CBS 8904  

PubMed Central

This is the first report of the genome sequence of Trichosporon asahii environmental strain CBS 8904, which was isolated from maize cobs. Comparison of the genome sequence with that of clinical strain CBS 2479 revealed that they have >99% chromosomal and mitochondrial sequence identity, yet CBS 8904 has 368 specific genes. Analysis of clusters of orthologous groups predicted that 3,307 genes belong to 23 functional categories and 703 genes were predicted to have a general function.

Li, Hai Tao; Zhu, He; Zhou, Guang Peng; Wang, Meng; Wang, Lei

2012-01-01

294

Complete Genome Sequence of Mycoplasma haemofelis, a Hemotropic Mycoplasma  

PubMed Central

Here, we present the genome sequence of Mycoplasma haemofelis strain Langford 1, representing the first hemotropic mycoplasma (hemoplasma) species to be completely sequenced and annotated. Originally isolated from a cat with hemolytic anemia, this strain induces severe hemolytic anemia when inoculated into specific-pathogen-free-derived cats. The genome sequence has provided insights into the biology of this uncultivatable hemoplasma and has identified potential molecular mechanisms underlying its pathogenicity.

Barker, Emily N.; Helps, Chris R.; Peters, Iain R.; Darby, Alistair C.; Radford, Alan D.; Tasker, Severine

2011-01-01

295

Whole-genome sequencing identifies recurrent mutations in hepatocellular carcinoma  

PubMed Central

Hepatocellular carcinoma (HCC) is one of the most deadly cancers worldwide and has no effective treatment, yet the molecular basis of hepatocarcinogenesis remains largely unknown. Here we report findings from a whole-genome sequencing (WGS) study of 88 matched HCC tumor/normal pairs, 81 of which are Hepatitis B virus (HBV) positive, seeking to identify genetically altered genes and pathways implicated in HBV-associated HCC. We find beta-catenin to be the most frequently mutated oncogene (15.9%) and TP53 the most frequently mutated tumor suppressor (35.2%). The Wnt/beta-catenin and JAK/STAT pathways, altered in 62.5% and 45.5% of cases, respectively, are likely to act as two major oncogenic drivers in HCC. This study also identifies several prevalent and potentially actionable mutations, including activating mutations of Janus kinase 1 (JAK1), in 9.1% of patients and provides a path toward therapeutic intervention of the disease.

Kan, Zhengyan; Zheng, Hancheng; Liu, Xiao; Li, Shuyu; Barber, Thomas D.; Gong, Zhuolin; Gao, Huan; Hao, Ke; Willard, Melinda D.; Xu, Jiangchun; Hauptschein, Robert; Rejto, Paul A.; Fernandez, Julio; Wang, Guan; Zhang, Qinghui; Wang, Bo; Chen, Ronghua; Wang, Jian; Lee, Nikki P.; Zhou, Wei; Lin, Zhao; Peng, Zhiyu; Yi, Kang; Chen, Shengpei; Li, Lin; Fan, Xiaomei; Yang, Jie; Ye, Rui; Ju, Jia; Wang, Kai; Estrella, Heather; Deng, Shibing; Wei, Ping; Qiu, Ming; Wulur, Isabella H.; Liu, Jiangang; Ehsani, Mariam E.; Zhang, Chunsheng; Loboda, Andrey; Sung, Wing Kin; Aggarwal, Amit; Poon, Ronnie T.; Fan, Sheung Tat; Wang, Jun; Hardwick, James; Reinhard, Christoph; Dai, Hongyue; Li, Yingrui; Luk, John M.; Mao, Mao

2013-01-01

296

Development of genomic resources in support of sequencing, assembly, and annotation of the catfish genome.  

PubMed

Major progress has been made in catfish genomics including construction of high-density genetic linkage maps, BAC-based physical maps, and integration of genetic linkage and physical maps. Large numbers of ESTs have been generated from both channel catfish and blue catfish. Microarray platforms have been developed for the analysis of genome expression. Genome repeat structures are studied, laying grounds for whole genome sequencing. USDA recently approved funding of the whole genome sequencing project of catfish using the next generation sequencing technologies. Generation of the whole genome sequence is a historical landmark of catfish research as it opens the real first step of the long march toward genetic enhancement. The research community needs to be focused on aquaculture performance and production traits, take advantage of the unprecedented genome information and technology, and make real progress toward genetic improvements of aquaculture brood stocks. PMID:20430707

Liu, Zhanjiang

2011-03-01

297

Data structures and compression algorithms for genomic sequence data  

PubMed Central

Motivation: The continuing exponential accumulation of full genome data, including full diploid human genomes, creates new challenges not only for understanding genomic structure, function and evolution, but also for the storage, navigation and privacy of genomic data. Here, we develop data structures and algorithms for the efficient storage of genomic and other sequence data that may also facilitate querying and protecting the data. Results: The general idea is to encode only the differences between a genome sequence and a reference sequence, using absolute or relative coordinates for the location of the differences. These locations and the corresponding differential variants can be encoded into binary strings using various entropy coding methods, from fixed codes, such as Golomb and Elias codes, to variable codes, such as Huffman codes. We demonstrate the approach and various tradeoffs using highly variable human mitochondrial genome sequences as a testbed. With only a partial level of optimization, 3615 genome sequences occupying 56 MB in GenBank are compressed down to only 167 KB, achieving a 345-fold compression rate, using the revised Cambridge Reference Sequence as the reference sequence. Using the consensus sequence as the reference sequence, the data can be stored using only 133 KB, corresponding to a 433-fold level of compression, roughly a 23% improvement. Extensions to nuclear genomes and high-throughput sequencing data are discussed. Availability: Data are publicly available from GenBank, the HapMap web site, and the MITOMAP database. Supplementary materials with additional results, statistics, and software implementations are available from http://mammag.web.uci.edu/bin/view/Mitowiki/ProjectDNACompression. Contact: pfbaldi@ics.uci.edu

Brandon, Marty C.; Wallace, Douglas C.; Baldi, Pierre

2009-01-01
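
The idea in the record above (store only the differences from a reference, with relative coordinates written in an Elias code) can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation. It handles substitutions only, so the sequence and reference must have equal length. Each variant base is stored in two fixed bits, and all function names are hypothetical.

```python
BASES = "ACGT"


def elias_gamma(n):
    """Elias gamma code of a positive integer, returned as a string of '0'/'1'."""
    if n < 1:
        raise ValueError("Elias gamma encodes positive integers only")
    binary = bin(n)[2:]
    # (len - 1) zeros announce how many bits follow the leading 1.
    return "0" * (len(binary) - 1) + binary


def encode_against_reference(seq, ref):
    """Encode substitutions in seq relative to ref as gap (gamma) + base (2 bits)."""
    if len(seq) != len(ref):
        raise ValueError("this sketch assumes substitutions only (equal lengths)")
    bits = []
    prev = -1  # position of the previous difference; gaps are therefore >= 1
    for pos, (s, r) in enumerate(zip(seq, ref)):
        if s != r:
            bits.append(elias_gamma(pos - prev))        # relative coordinate
            bits.append(format(BASES.index(s), "02b"))  # variant base, 2 bits
            prev = pos
    return "".join(bits)


def decode_against_reference(bits, ref):
    """Rebuild the original sequence from the bit string and the reference."""
    out = list(ref)
    i, pos = 0, -1
    while i < len(bits):
        zeros = 0
        while bits[i] == "0":  # count the gamma prefix
            zeros += 1
            i += 1
        gap = int(bits[i:i + zeros + 1], 2)
        i += zeros + 1
        pos += gap
        out[pos] = BASES[int(bits[i:i + 2], 2)]
        i += 2
    return "".join(out)


if __name__ == "__main__":
    ref = "ACGTACGTAC"
    seq = "ACGAACGTAT"  # differs from ref at positions 3 and 9
    encoded = encode_against_reference(seq, ref)
    print(encoded, len(encoded), "bits")
    print(decode_against_reference(encoded, ref) == seq)
```

Relative gaps are usually small when differences cluster, which is why a code that spends few bits on small integers suits them. A full scheme would also need to represent insertions and deletions.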

298

Savant: genome browser for high-throughput sequencing data  

PubMed Central

Motivation: The advent of high-throughput sequencing (HTS) technologies has made it affordable to sequence many individuals' genomes. Simultaneously the computational analysis of the large volumes of data generated by the new sequencing machines remains a challenge. While a plethora of tools are available to map the resulting reads to a reference genome, and to conduct primary analysis of the mappings, it is often necessary to visually examine the results and underlying data to confirm predictions and understand the functional effects, especially in the context of other datasets. Results: We introduce Savant, the Sequence Annotation, Visualization and ANalysis Tool, a desktop visualization and analysis browser for genomic data. Savant was developed for visualizing and analyzing HTS data, with special care taken to enable dynamic visualization in the presence of gigabases of genomic reads and references the size of the human genome. Savant supports the visualization of genome-based sequence, point, interval and continuous datasets, and multiple visualization modes that enable easy identification of genomic variants (including single nucleotide polymorphisms, structural and copy number variants), and functional genomic information (e.g. peaks in ChIP-seq data) in the context of genomic annotations. Availability: Savant is freely available at http://compbio.cs.toronto.edu/savant Contact: savant@cs.toronto.edu

Fiume, Marc; Williams, Vanessa; Brook, Andrew; Brudno, Michael

2010-01-01

299

Identification of optimum sequencing depth especially for de novo genome assembly of small genomes using next generation sequencing data.  

PubMed

Next Generation Sequencing (NGS) is a disruptive technology that has found widespread acceptance in the life sciences research community. The high throughput and low cost of sequencing has encouraged researchers to undertake ambitious genomic projects, especially in de novo genome sequencing. Currently, NGS systems generate sequence data as short reads and de novo genome assembly using these short reads is computationally very intensive. Due to lower cost of sequencing and higher throughput, NGS systems now provide the ability to sequence genomes at high depth. However, currently no report is available highlighting the impact of high sequence depth on genome assembly using real data sets and multiple assembly algorithms. Recently, some studies have evaluated the impact of sequence coverage, error rate and average read length on genome assembly using multiple assembly algorithms; however, these evaluations were performed using simulated datasets. One limitation of using simulated datasets is that variables such as error rates, read length and coverage which are known to impact genome assembly are carefully controlled. Hence, this study was undertaken to identify the minimum depth of sequencing required for de novo assembly for different-sized genomes using graph-based assembly algorithms and real datasets. Illumina reads for E. coli (4.6 Mb), S. kudriavzevii (11.18 Mb) and C. elegans (100 Mb) were assembled using SOAPdenovo, Velvet, ABySS, Meraculous and IDBA-UD. Our analysis shows that 50X is the optimum read depth for assembling these genomes using all assemblers except Meraculous which requires 100X read depth. Moreover, our analysis shows that de novo assembly from 50X read data requires only 6-40 GB RAM depending on the genome size and assembly algorithm used. We believe that this information can be extremely valuable for researchers in designing experiments and multiplexing which will enable optimum utilization of sequencing as well as analysis resources. PMID:23593174

Desai, Aarti; Marwah, Veer Singh; Yadav, Akshay; Jha, Vineet; Dhaygude, Kishor; Bangar, Ujwala; Kulkarni, Vivek; Jere, Abhay

2013-01-01

300

Identification of Optimum Sequencing Depth Especially for De Novo Genome Assembly of Small Genomes Using Next Generation Sequencing Data  

PubMed Central

Next Generation Sequencing (NGS) is a disruptive technology that has found widespread acceptance in the life sciences research community. The high throughput and low cost of sequencing has encouraged researchers to undertake ambitious genomic projects, especially in de novo genome sequencing. Currently, NGS systems generate sequence data as short reads and de novo genome assembly using these short reads is computationally very intensive. Due to lower cost of sequencing and higher throughput, NGS systems now provide the ability to sequence genomes at high depth. However, currently no report is available highlighting the impact of high sequence depth on genome assembly using real data sets and multiple assembly algorithms. Recently, some studies have evaluated the impact of sequence coverage, error rate and average read length on genome assembly using multiple assembly algorithms; however, these evaluations were performed using simulated datasets. One limitation of using simulated datasets is that variables such as error rates, read length and coverage which are known to impact genome assembly are carefully controlled. Hence, this study was undertaken to identify the minimum depth of sequencing required for de novo assembly for different-sized genomes using graph-based assembly algorithms and real datasets. Illumina reads for E. coli (4.6 Mb), S. kudriavzevii (11.18 Mb) and C. elegans (100 Mb) were assembled using SOAPdenovo, Velvet, ABySS, Meraculous and IDBA-UD. Our analysis shows that 50X is the optimum read depth for assembling these genomes using all assemblers except Meraculous which requires 100X read depth. Moreover, our analysis shows that de novo assembly from 50X read data requires only 6–40 GB RAM depending on the genome size and assembly algorithm used. We believe that this information can be extremely valuable for researchers in designing experiments and multiplexing which will enable optimum utilization of sequencing as well as analysis resources.

Desai, Aarti; Marwah, Veer Singh; Yadav, Akshay; Jha, Vineet; Dhaygude, Kishor; Bangar, Ujwala; Kulkarni, Vivek; Jere, Abhay

2013-01-01
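
The read depths quoted in the abstract above (50X, 100X) follow the usual definition of sequencing depth: total sequenced bases divided by genome size. The Python sketch below is only an illustration of that arithmetic and is not taken from the study. The 100 bp read length and the function names are assumptions, and the genome sizes are the abstract's figures read as megabases.

```python
import math

def depth_of_coverage(num_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average depth = total sequenced bases / genome size."""
    return num_reads * read_length_bp / genome_size_bp

def reads_needed(genome_size_bp: int, depth: float, read_length_bp: int) -> int:
    """Smallest number of reads whose total bases reach the target depth."""
    return math.ceil(depth * genome_size_bp / read_length_bp)

# Genome sizes as given in the abstract, interpreted as megabases.
GENOMES_BP = {
    "E. coli": 4_600_000,
    "S. kudriavzevii": 11_180_000,
    "C. elegans": 100_000_000,
}

READ_LENGTH_BP = 100  # assumed read length; not stated in the abstract
TARGET_DEPTH = 50     # the 50X depth reported as optimum for most assemblers

if __name__ == "__main__":
    for name, size in GENOMES_BP.items():
        n = reads_needed(size, TARGET_DEPTH, READ_LENGTH_BP)
        print(f"{name}: about {n:,} reads of {READ_LENGTH_BP} bp for {TARGET_DEPTH}X")
```

Changing `READ_LENGTH_BP` or `TARGET_DEPTH` shows how the required read count scales linearly with depth and inversely with read length.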

301

Whole-exome targeted sequencing of the uncharacterized pine genome.  

PubMed

The large genome size of many species hinders the development and application of genomic tools to study them. For instance, loblolly pine (Pinus taeda L.), an ecologically and economically important conifer, has a large and yet uncharacterized genome of 21.7 Gbp. To characterize the pine genome, we performed exome capture and sequencing of 14 729 genes derived from an assembly of expressed sequence tags. Efficiency of sequence capture was evaluated and shown to be similar across samples with increasing levels of complexity, including haploid cDNA, haploid genomic DNA and diploid genomic DNA. However, this efficiency was severely reduced for probes that overlapped multiple exons, presumably because intron sequences hindered probe:exon hybridizations. Such regions could not be entirely avoided during probe design, because of the lack of a reference sequence. To improve the throughput and reduce the cost of sequence capture, a method to multiplex the analysis of up to eight samples was developed. Sequence data showed that multiplexed capture was reproducible among 24 haploid samples, and can be applied for high-throughput analysis of targeted genes in large populations. Captured sequences were de novo assembled, resulting in 11 396 expanded and annotated gene models, significantly improving the knowledge about the pine gene space. Interspecific capture was also evaluated: over 98% of the probes designed from P. taeda that were efficient in sequence capture were also suitable for analysis of the related species Pinus elliottii Engelm. PMID:23551702

Neves, Leandro G; Davis, John M; Barbazuk, William B; Kirst, Matias

2013-07-01

302

Genome-Wide Epigenetic Modifications in Cancer  

Microsoft Academic Search

Epigenetic alterations in cancer include changes in DNA methylation and associated histone modifications that influence the chromatin states and impact gene expression patterns. Due to recent technological advances, the scientific community is now obtaining a better picture of the genome-wide epigenetic changes that occur in a cancer genome. These epigenetic alterations are associated with chromosomal instability and changes in transcriptional

Yoon Jung Park; Rainer Claus; Dieter Weichenhan; Christoph Plass

303

Understanding Cancer Series: Genome-Wide Profiling  

Cancer.gov

A Locally Focused Search. Single-gene tests focus on a specific, known location in a patient’s genome. Using this approach, scientists have looked for single genes linked to cancer. This research has led to some important discoveries, such as gene changes called mutations located within the BRCA1 or BRCA2 genes that may confer a significantly increased risk of breast and ovarian cancer.

304

Nucleotide sequence and genome organization of carnation mottle virus RNA.  

PubMed Central

The complete nucleotide sequence of carnation mottle genomic RNA (4003 nucleotides) is presented. The sequence was determined for cloned cDNA copies of viral RNA containing over 99% of the sequence and was completed by direct sequence analysis of RNA and cDNA transcripts. The sequence contains two long open reading frames which together can account for observed translation products. One translation product would arise by suppression of an amber termination codon and the sequence raises the possibility that a second suppression event could also occur. Sequence homology exists between a portion of the carnation mottle virus sequence and that of putative RNA polymerases from other RNA viruses.

Guilley, H; Carrington, J C; Balazs, E; Jonard, G; Richards, K; Morris, T J

1985-01-01

305

Chapter 27 -- Breast Cancer Genomics, Section VI, Pathology and Biological Markers of Invasive Breast Cancer  

SciTech Connect

Breast cancer is predominantly a disease of the genome, with cancers arising and progressing through accumulation of aberrations that alter the genome by changing DNA sequence, copy number, and structure in ways that contribute to diverse aspects of cancer pathophysiology. Classic examples of genomic events that contribute to breast cancer pathophysiology include inherited mutations in BRCA1, BRCA2, TP53, and CHK2 that contribute to the initiation of breast cancer, amplification of ERBB2 (formerly HER2) and mutations of elements of the PI3-kinase pathway that activate aspects of epidermal growth factor receptor (EGFR) signaling, and deletion of CDKN2A/B that contributes to cell cycle deregulation and genome instability. It is now apparent that accumulation of these aberrations is a time-dependent process that accelerates with age. Although American women living to an age of 85 have a 1 in 8 chance of developing breast cancer, cancer in women younger than 30 years is uncommon. This is consistent with a multistep cancer progression model whereby mutation and selection drive the tumor's development, analogous to traditional Darwinian evolution. In the case of cancer, the driving events are changes in sequence, copy number, and structure of DNA and alterations in chromatin structure or other epigenetic marks. Our understanding of the genetic, genomic, and epigenomic events that influence the development and progression of breast cancer is increasing at a remarkable rate through application of powerful analysis tools that enable genome-wide analysis of DNA sequence and structure, copy number, allelic loss, and epigenomic modification. Application of these techniques to elucidation of the nature and timing of these events is enriching our understanding of mechanisms that increase breast cancer susceptibility, enable tumor initiation and progression to metastatic disease, and determine therapeutic response or resistance.
These studies also reveal the molecular differences between cancer and normal that may be exploited to therapeutic benefit or that provide targets for molecular assays that may enable early cancer detection, and predict individual disease progression or response to treatment. This chapter reviews current and future directions in genome analysis and summarizes studies that provide insights into breast cancer pathophysiology or that suggest strategies to improve breast cancer management.

Spellman, Paul T.; Heiser, Laura; Gray, Joe W.

2009-06-18

306

Complete genome sequence of Spirosoma linguale type strain (1T)  

PubMed Central

Spirosoma linguale Migula 1894 is the type species of the genus. S. linguale is a free-living and non-pathogenic organism, known for its peculiar ringlike and horseshoe-shaped cell morphology. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is only the third completed genome sequence of a member of the family Cytophagaceae. The 8,491,258 bp long genome with its eight plasmids, 7,069 protein-coding and 60 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

Lail, Kathleen; Sikorski, Johannes; Saunders, Elizabeth; Lapidus, Alla; Glavina Del Rio, Tijana; Copeland, Alex; Tice, Hope; Cheng, Jan-Fang; Lucas, Susan; Nolan, Matt; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavromatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D.; Chain, Patrick; Brettin, Thomas; Detter, John C.; Schutze, Andrea; Rohde, Manfred; Tindall, Brian J.; Goker, Markus; Bristow, Jim; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter; Chen, Feng

2010-01-01

307

Genome sequence of the Nocardia bacteriophage NBR1.  

PubMed

We here characterize a novel bacteriophage (NBR1) that is lytic for Nocardia otitidiscaviarum and N. brasiliensis. NBR1 is a member of the family Siphoviridae and appears to have a structurally more complex tail than previously reported Siphoviridae phages. NBR1 has a linear genome of 46,140 bp and a sequence that appears novel when compared to those of other phage sequences in GenBank. Annotation of the genome reveals 68 putative open reading frames. The phage genome organization appears to be similar to other Siphoviridae phage genomes in that it has a modular arrangement. PMID:23913189

Petrovski, Steve; Seviour, Robert J; Tillett, Daniel

2014-01-01

308

Complete genome sequence of Acidimicrobium ferrooxidans type strain (ICPT)  

PubMed Central

Acidimicrobium ferrooxidans (Clark and Norris 1996) is the sole and type species of the genus, which until recently was the only genus within the actinobacterial family Acidimicrobiaceae and in the order Acidomicrobiales. Rapid oxidation of iron pyrite during autotrophic growth in the absence of an enhanced CO2 concentration is characteristic for A. ferrooxidans. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of the order Acidomicrobiales, and the 2,158,157 bp long single replicon genome with its 2038 protein coding and 54 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

Clum, Alicia; Nolan, Matt; Lang, Elke; Glavina Del Rio, Tijana; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavrommatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Goker, Markus; Spring, Stefan; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia C.; Chain, Patrick; Bristow, Jim; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter; Lapidus, Alla

2009-01-01

309

Complete genome sequence of Sulfurospirillum deleyianum type strain (5175T)  

SciTech Connect

Sulfurospirillum deleyianum Schumacher et al. 1993 is the type species of the genus Sulfurospirillum. S. deleyianum is a model organism for studying sulfur reduction and dissimilatory nitrate reduction as energy source for growth. Also, it is a prominent model organism for studying the structural and functional characteristics of the cytochrome c nitrite reductase. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the genus Sulfurospirillum. The 2,306,351 bp long genome with its 2291 protein-coding and 52 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Chen, Feng [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Saunders, Elizabeth H [Los Alamos National Laboratory (LANL); Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL); Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Lang, Elke [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Spring, Stefan [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany

2010-01-01

310

Complete genome sequence of Thermomonospora curvata type strain (B9)  

SciTech Connect

Thermomonospora curvata Henssen 1957 is the type species of the genus Thermomonospora. This genus is of interest because members of this clade are sources of new antibiotics, enzymes, and products with pharmacological activity. In addition, members of this genus participate in the active degradation of cellulose. This is the first complete genome sequence of a member of the family Thermomonosporaceae. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 5,639,016 bp long genome with its 4,985 protein-coding and 76 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

Chertkov, Olga [Los Alamos National Laboratory (LANL); Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Nolan, Matt [Joint Genome Institute, Walnut Creek, California; Lapidus, Alla L. [Joint Genome Institute, Walnut Creek, California; Lucas, Susan [Joint Genome Institute, Walnut Creek, California; Glavina Del Rio, Tijana [Joint Genome Institute, Walnut Creek, California; Tice, Hope [Joint Genome Institute, Walnut Creek, California; Cheng, Jan-Fang [Joint Genome Institute, Walnut Creek, California; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [Joint Genome Institute, Walnut Creek, California; Liolios, Konstantinos [Joint Genome Institute, Walnut Creek, California; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [Joint Genome Institute, Walnut Creek, California; Palaniappan, Krishna [Joint Genome Institute, Walnut Creek, California; Ngatchou, Olivier Duplex [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Brettin, Thomas S [ORNL; Han, Cliff [Los Alamos National Laboratory (LANL); Detter, J. Chris [Joint Genome Institute, Walnut Creek, California; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [Joint Genome Institute, Walnut Creek, California; Bristow, James [Joint Genome Institute, Walnut Creek, California; Eisen, Jonathan [Joint Genome Institute, Walnut Creek, California; Markowitz, Victor [Joint Genome Institute, Walnut Creek, California; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Kyrpides, Nikos C [Joint Genome Institute, Walnut Creek, California

2011-01-01

311

Complete genome sequence of Spirosoma linguale type strain (1T)  

SciTech Connect

Spirosoma linguale Migula 1894 is the type species of the genus. S. linguale is a free-living and non-pathogenic organism, known for its peculiar ringlike and horseshoe-shaped cell morphology. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is only the third completed genome sequence of a member of the family Cytophagaceae. The 8,491,258 bp long genome with its eight plasmids, 7,069 protein-coding and 60 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

Lail, Kathleen [U.S. Department of Energy, Joint Genome Institute; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Saunders, Elizabeth H [Los Alamos National Laboratory (LANL); Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Schutze, Andrea [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Tindall, Brian [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Chen, Feng [U.S. Department of Energy, Joint Genome Institute

2010-01-01

312

Comparative Analysis of Rice Genome Sequence to Understand the Molecular Basis of Genome Evolution  

Microsoft Academic Search

Accurate sequencing of the rice genome has ignited a passion for elucidating the mechanisms of sequence diversity among rice varieties and species, both in protein-coding regions and in genomic regions that are important for chromosome functions. Here, we have shown examples of sequence diversity in genic and non-genic regions. Sequence analysis of chromosome ends has revealed that there is diversity in

Jianzhong Wu; Hiroshi Mizuno; Takuji Sasaki; Takashi Matsumoto

2008-01-01

313

MIPS: a database for genomes and protein sequences  

Microsoft Academic Search

The Munich Information Center for Protein Sequences (MIPS-GSF), Martinsried near Munich, Germany, develops and maintains genome oriented databases. It is commonplace that the amount of sequence data available increases rapidly, but not the capacity of qualified manual annotation at the sequence databases. Therefore, our strategy aims to cope with the data stream by the comprehensive application of

Hans-werner Mewes; Klaus Heumann; Andreas Kaps; Klaus F. X. Mayer; Friedhelm Pfeiffer; S. Stocker; Dmitrij Frishman

1999-01-01

314

Complete Genome Sequence of a Polyomavirus Isolated from Horses  

PubMed Central

A polyomavirus was isolated from the eyes of horses, and the sequence was determined. A nearly identical VP1 sequence was amplified from the kidney of another animal. We report the complete genome sequence of the first polyomavirus to be isolated from a horse. Analysis shows it to be most closely related overall to human and nonhuman primate polyomaviruses.

Wise, Annabel G.; Maes, Roger K.

2012-01-01

315

Genome sequencing: a systematic review of health economic evidence  

PubMed Central

Recently the sequencing of the human genome has become a major biological and clinical research field. However, the public health impact of this new technology, in particular its financial effect, cannot yet be foreseen. To provide an overview of the current health economic evidence for genome sequencing, we conducted a thorough systematic review of the literature from 17 databases. In addition, we conducted a hand search. Starting with 5 520 records we ultimately included five full-text publications and one internet source, all focused on cost calculations. The results were very heterogeneous and, therefore, difficult to compare. Furthermore, because the methodology of the publications was quite poor, the reliability and validity of the results were questionable. The real costs for the whole sequencing workflow, including data management and analysis, remain unknown. Overall, our review indicates that the current health economic evidence for genome sequencing is quite poor. Therefore, we list aspects that need to be considered when conducting health economic analyses of genome sequencing. Specifics regarding the overall aim, technology, population, indication, comparator, alternatives after sequencing, outcomes, probabilities, and costs with respect to genome sequencing are discussed. For further research, a comprehensive cost calculation of genome sequencing is needed at the outset, because all further health economic studies rely on valid cost data. The results will serve as an input parameter for budget-impact analyses or cost-effectiveness analyses.

2013-01-01

316

Exploring Microbial Genome Sequences to Identify Protein Families on the Grid.  

National Technical Information Service (NTIS)

The analysis of microbial genome sequences can identify protein families that provide potential drug targets for new antibiotics. With the rapid accumulation of newly sequenced genomes, the analysis of complete genome sequences has become a computationall...

Y. Sun; A. Wipat; M. Pocock; P. Lee; K. Flanagan; J. Worthington

2005-01-01

317

AluScan: a method for genome-wide scanning of sequence and structure variations in the human genome  

PubMed Central

Background: To complement next-generation sequencing technologies, there is a pressing need for efficient pre-sequencing capture methods with reduced costs and DNA requirement. The Alu family of short interspersed nucleotide elements is the most abundant type of transposable elements in the human genome and a recognized source of genome instability. With over one million Alu elements distributed throughout the genome, they are well positioned to facilitate genome-wide sequence amplification and capture of regions likely to harbor genetic variation hotspots of biological relevance. Results: Here we report on the use of inter-Alu PCR with an enhanced range of amplicons in conjunction with next-generation sequencing to generate an Alu-anchored scan, or 'AluScan', of DNA sequences between Alu transposons, where Alu consensus sequence-based 'H-type' PCR primers that elongate outward from the head of an Alu element are combined with 'T-type' primers elongating from the poly-A containing tail to achieve huge amplicon range. To illustrate the method, glioma DNA was compared with white blood cell control DNA of the same patient by means of AluScan. The over 10 Mb sequences obtained, derived from more than 8,000 genes spread over all the chromosomes, revealed a highly reproducible capture of genomic sequences enriched in genic sequences and cancer candidate gene regions. Requiring only sub-micrograms of sample DNA, the power of AluScan as a discovery tool for genetic variations was demonstrated by the identification of 357 instances of loss of heterozygosity, 341 somatic indels, 274 somatic SNVs, and seven potential somatic SNV hotspots between control and glioma DNA. Conclusions: AluScan, implemented with just a small number of H-type and T-type inter-Alu PCR primers, provides an effective capture of a diversity of genome-wide sequences for analysis.
The method, by enabling an examination of gene-enriched regions containing exons, introns, and intergenic sequences with modest capture and sequencing costs, computation workload and DNA sample requirement is particularly well suited for accelerating the discovery of somatic mutations, as well as analysis of disease-predisposing germline polymorphisms, by making possible the comparative genome-wide scanning of DNA sequences from large human cohorts.

2011-01-01

318

Isolation and bioinformatics analysis of differentially methylated genomic fragments in human gastric cancer  

Microsoft Academic Search

AIM: To isolate and analyze the DNA sequences which are methylated differentially between gastric cancer and normal gastric mucosa. METHODS: The differentially methylated DNA sequences between gastric cancer and normal gastric mucosa were isolated by methylation-sensitive representational difference analysis (MS-RDA). Similarities between the separated fragments and the human genomic DNA were analyzed with Basic Local Alignment Search Tool (BLAST). RESULTS:

Ai-Jun Liao; Qi Su; Xun Wang; Bin Zeng; Wei Shi

319

Complete genome sequence of Staphylothermus hellenicus P8T  

SciTech Connect

Staphylothermus hellenicus belongs to the order Desulfurococcales within the archaeal phylum Crenarchaeota. Strain P8T is the type strain of the species and was isolated from a shallow hydrothermal vent system at Palaeochori Bay, Milos, Greece. It is a hyperthermophilic, anaerobic heterotroph. Here we describe the features of this organism together with the complete genome sequence and annotation. The 1,580,347 bp genome with its 1,668 protein-coding and 48 RNA genes was sequenced as part of a DOE Joint Genome Institute (JGI) Laboratory Sequencing Program (LSP) project.

Anderson, Iain [U.S. Department of Energy, Joint Genome Institute; Wirth, Reinhard [Universitat Regensburg, Regensburg, Germany; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Davenport, Karen W. [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Tapia, Roxanne [Los Alamos National Laboratory (LANL); Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute

2011-01-01

320

Complete genome sequence of Staphylothermus hellenicus P8.  

PubMed

Staphylothermus hellenicus belongs to the order Desulfurococcales within the archaeal phylum Crenarchaeota. Strain P8(T) is the type strain of the species and was isolated from a shallow hydrothermal vent system at Palaeochori Bay, Milos, Greece. It is a hyperthermophilic, anaerobic heterotroph. Here we describe the features of this organism together with the complete genome sequence and annotation. The 1,580,347 bp genome with its 1,668 protein-coding and 48 RNA genes was sequenced as part of a DOE Joint Genome Institute (JGI) Laboratory Sequencing Program (LSP) project. PMID:22180806

Anderson, Iain; Wirth, Reinhard; Lucas, Susan; Copeland, Alex; Lapidus, Alla; Cheng, Jan-Fang; Goodwin, Lynne; Pitluck, Samuel; Davenport, Karen; Detter, John C; Han, Cliff; Tapia, Roxanne; Land, Miriam; Hauser, Loren; Pati, Amrita; Mikhailova, Natalia; Woyke, Tanja; Klenk, Hans-Peter; Kyrpides, Nikos; Ivanova, Natalia

2011-10-15

321

Management of Incidental Findings in Clinical Genomic Sequencing  

PubMed Central

Genomic sequencing is becoming accurate, fast, and inexpensive, and is rapidly being incorporated into clinical practice. Incidental findings, which result in large numbers from genomic sequencing, are a potential barrier to the utility of this new technology due to their high prevalence and the lack of evidence or guidelines available to guide their clinical interpretation. This unit reviews the definition, classification, and management of incidental findings from genomic sequencing. The unit focuses on the clinical aspects of handling incidental findings, with an emphasis on the key role of clinical context in defining incidental findings and determining their clinical relevance and utility.

Krier, Joel B.; Green, Robert C.

2013-01-01

322

Sequencing approach evaluates all 24 genes implicated in breast cancer  

Cancer.gov

Since 1994, many thousands of women with breast cancer from families severely affected with the disease have been tested for inherited mutations in BRCA1 and BRCA2. The vast majority of those patients were told that their gene sequences were normal. With the development of modern genomics sequencing tools, the discovery of additional genes implicated in breast cancer and the change in the legal status of genetic testing for BRCA1 and BRCA2, it is now possible to determine how often families in these circumstances actually do carry cancer-predisposing mutations in BRCA1, BRCA2, or another gene implicated in breast cancer, despite the results of their previous genetic tests. The results were presented Oct. 24, by researchers from the University of Washington (which is affiliated with the Fred Hutchinson Cancer Research Center) at the American Society of Human Genetics 2013 meeting in Boston.

323

Genomics of Squamous Cell Lung Cancer  

PubMed Central

Approximately 30% of patients with non-small cell lung cancer have the squamous cell carcinoma (SQCC) histological subtype. Although targeted therapies have improved outcomes in patients with adenocarcinoma, no agents are currently approved specifically for use in SQCC. The Cancer Genome Atlas (TCGA) recently published the results of comprehensive genomic analyses of tumor samples from 178 patients with SQCC of the lung. In this review, we briefly discuss key molecular aberrations reported by TCGA and other investigators and their potential therapeutic implications. Carefully designed preclinical and clinical studies based on these large-scale genomic analyses are critical to improve the outcomes of patients with SQCC of lung in the near future.

Rooney, Melissa; Devarakonda, Siddhartha

2013-01-01

324

Genomic Treasure Troves: Complete Genome Sequencing of Herbarium and Insect Museum Specimens  

PubMed Central

Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22–82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4–97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2–71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more horizontal. 
Furthermore, NGS of historical DNA enables recovering crucial genetic information from old type specimens that to date have remained mostly unutilized and, thus, opens up a new frontier for taxonomic research as well.

Staats, Martijn; Erkens, Roy H. J.; van de Vossenberg, Bart; Wieringa, Jan J.; Kraaijeveld, Ken; Stielow, Benjamin; Geml, Jozsef; Richardson, James E.; Bakker, Freek T.

2013-01-01

325

Sequencing, Assembling, and Correcting Draft Genomes Using Recombinant Populations  

PubMed Central

Current de novo whole-genome sequencing approaches often are inadequate for organisms lacking substantial preexisting genetic data. Problems with these methods are manifest as: large numbers of scaffolds that are not ordered within chromosomes or assigned to individual chromosomes, misassembly of allelic sequences as separate loci when the individual(s) being sequenced are heterozygous, and the collapse of recently duplicated sequences into a single locus, regardless of levels of heterozygosity. Here we propose a new approach for producing de novo whole-genome sequences—which we call recombinant population genome construction—that solves many of the problems encountered in standard genome assembly and that can be applied in model and nonmodel organisms. Our approach takes advantage of next-generation sequencing technologies to simultaneously barcode and sequence a large number of individuals from a recombinant population. The sequences of all recombinants can be combined to create an initial de novo assembly, followed by the use of individual recombinant genotypes to correct assembly splitting/collapsing and to order and orient scaffolds within linkage groups. Recombinant population genome construction can rapidly accelerate the transformation of nonmodel species into genome-enabled systems by simultaneously producing a high-quality genome assembly and providing genomic tools (e.g., high-confidence single-nucleotide polymorphisms) for immediate applications. In populations segregating for important functional traits, this approach also enables simultaneous mapping of quantitative trait loci. We demonstrate our method using simulated Illumina data from a recombinant population of Caenorhabditis elegans and show that the method can produce a high-fidelity, high-quality genome assembly for both parents of the cross.

Hahn, Matthew W.; Zhang, Simo V.; Moyle, Leonie C.

2014-01-01

326

The Arabidopsis lyrata genome sequence and the basis of rapid genome size change  

SciTech Connect

In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies, etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how A. thaliana lost half its genome in a few million years and lived to tell the tale. The rather surprising conclusion is that there is not a single genomic feature that accounts for the reduced genome, but that every aspect (centromeres, intergenic regions, transposable elements, gene family number) is affected through hundreds of thousands of cuts. This strongly suggests that overall genome size in itself is what has been under selection, a suggestion that is strongly supported by our demonstration (using population genetics data from A. thaliana) that new deletions seem to be driven to fixation.

Hu, Tina T.; Pattyn, Pedro; Bakker, Erica G.; Cao, Jun; Cheng, Jan-Fang; Clark, Richard M.; Fahlgren, Noah; Fawcett, Jeffrey A.; Grimwood, Jane; Gundlach, Heidrun; Haberer, Georg; Hollister, Jesse D.; Ossowski, Stephan; Ottilar, Robert P.; Salamov, Asaf A.; Schneeberger, Korbinian; Spannagl, Manuel; Wang, Xi; Yang, Liang; Nasrallah, Mikhail E.; Bergelson, Joy; Carrington, James C.; Gaut, Brandon S.; Schmutz, Jeremy; Mayer, Klaus F. X.; Van de Peer, Yves; Grigoriev, Igor V.; Nordborg, Magnus; Weigel, Detlef; Guo, Ya-Long

2011-04-29

327

A Genome-Wide Analysis of FRT-Like Sequences in the Human Genome  

PubMed Central

Efficient and precise genome manipulations can be achieved by the Flp/FRT system of site-specific DNA recombination. Applications of this system are limited, however, to cases when target sites for Flp recombinase, FRT sites, are pre-introduced into a genome locale of interest. To expand use of the Flp/FRT system in genome engineering, variants of Flp recombinase can be evolved to recognize pre-existing genomic sequences that resemble FRT and thus can serve as recombination sites. To understand the distribution and sequence properties of genomic FRT-like sites, we performed a genome-wide analysis of FRT-like sites in the human genome using experimentally derived parameters. Out of 642,151 identified FRT-like sequences, 581,157 sequences were unique and 12,452 sequences had at least one exact duplicate. Duplicated FRT-like sequences are located mostly within LINE1, but also within LTRs of endogenous retroviruses, Alu repeats and other repetitive DNA sequences. The unique FRT-like sequences were classified based on the number of matches to FRT within the first four proximal base pairs of the Flp binding elements of FRT and the nature of mismatched base pairs in the same region. The data obtained will be useful for the emerging field of genome engineering.

Shultz, Jeffry L.; Voziyanova, Eugenia; Konieczka, Jay H.; Voziyanov, Yuri

2011-01-01
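
The split between unique and duplicated FRT-like sequences reported in the abstract above can be illustrated with a minimal tallying sketch. It is not the authors' pipeline: the function name `tally_sites` and the toy 4-base strings are hypothetical, and the sketch assumes that "duplicated" means a distinct sequence that occurs more than once among the identified sites.

```python
from collections import Counter

def tally_sites(sites):
    """Count distinct sequences that occur once vs. more than once.

    `sites` is a list of site sequences (strings), one entry per
    genomic occurrence. This is a hypothetical helper for illustration.
    """
    counts = Counter(sites)
    unique = sum(1 for n in counts.values() if n == 1)
    duplicated = sum(1 for n in counts.values() if n > 1)
    return {"total": len(sites), "unique": unique, "duplicated": duplicated}

# Made-up toy input, not real FRT-like sequences
toy_sites = ["ACGT", "ACGT", "TTGA", "GGCC", "GGCC", "GGCC", "CATG"]
summary = tally_sites(toy_sites)
print(summary)
```

On the toy input, two sequences occur once and two occur more than once, so the total number of occurrences (7) exceeds the sum of the two categories (4), in the same way that the abstract's total exceeds its unique plus duplicated counts.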

328

Mechanisms of base substitution mutagenesis in cancer genomes.  

PubMed

Cancer genome sequence data provide an invaluable resource for inferring the key mechanisms by which mutations arise in cancer cells, favoring their survival, proliferation and invasiveness. Here we examine recent advances in understanding the molecular mechanisms responsible for the predominant type of genetic alteration found in cancer cells, somatic single base substitutions (SBSs). Cytosine methylation, demethylation and deamination, charge transfer reactions in DNA, DNA replication timing, chromatin status and altered DNA proofreading activities are all now known to contribute to the mechanisms leading to base substitution mutagenesis. We review current hypotheses as to the major processes that give rise to SBSs and evaluate their relative relevance in the light of knowledge acquired from cancer genome sequencing projects and the study of base modifications, DNA repair and lesion bypass. Although gene expression data on APOBEC3B enzymes provide support for a role in cancer mutagenesis through U:G mismatch intermediates, the enzyme preference for single-stranded DNA may limit its activity genome-wide. For SBSs at both CG:CG and YC:GR sites, we outline evidence for a prominent role of damage by charge transfer reactions that follow interactions of the DNA with reactive oxygen species (ROS) and other endogenous or exogenous electron-abstracting molecules. PMID:24705290

Bacolla, Albino; Cooper, David N; Vasquez, Karen M

2014-01-01

329

Mechanisms of Base Substitution Mutagenesis in Cancer Genomes  

PubMed Central

Cancer genome sequence data provide an invaluable resource for inferring the key mechanisms by which mutations arise in cancer cells, favoring their survival, proliferation and invasiveness. Here we examine recent advances in understanding the molecular mechanisms responsible for the predominant type of genetic alteration found in cancer cells, somatic single base substitutions (SBSs). Cytosine methylation, demethylation and deamination, charge transfer reactions in DNA, DNA replication timing, chromatin status and altered DNA proofreading activities are all now known to contribute to the mechanisms leading to base substitution mutagenesis. We review current hypotheses as to the major processes that give rise to SBSs and evaluate their relative relevance in the light of knowledge acquired from cancer genome sequencing projects and the study of base modifications, DNA repair and lesion bypass. Although gene expression data on APOBEC3B enzymes provide support for a role in cancer mutagenesis through U:G mismatch intermediates, the enzyme preference for single-stranded DNA may limit its activity genome-wide. For SBSs at both CG:CG and YC:GR sites, we outline evidence for a prominent role of damage by charge transfer reactions that follow interactions of the DNA with reactive oxygen species (ROS) and other endogenous or exogenous electron-abstracting molecules.

Bacolla, Albino; Cooper, David N.; Vasquez, Karen M.

2014-01-01

330

Genome Sequence of Pseudomonas brassicacearum DF41  

PubMed Central

Pseudomonas brassicacearum DF41, a Gram-negative soil bacterium, is able to suppress the fungal pathogen Sclerotinia sclerotiorum through a process known as biological control. Here, we present a 6.8-Mb assembly of its genome, which is the second fully assembled genome of a P. brassicacearum strain.

Loewen, Peter C.; Switala, Jack; Fernando, W. G. Dilantha

2014-01-01

331

Research Resources for Cancer Epidemiology and Genomics  

Cancer.gov

The Epidemiology and Genomics Research Program (EGRP) has developed a list with links to a number of cancer-related research resources available through EGRP-supported cohorts, consortia, and initiatives; other research programs in the Division of Cancer Control and Population Sciences and NCI; and partners elsewhere at NIH and other research organizations.

332

Assembly of large genomes using second-generation sequencing  

PubMed Central

Second-generation sequencing technology can now be used to sequence an entire human genome in a matter of days and at low cost. Sequence read lengths, initially very short, have rapidly increased since the technology first appeared, and we now are seeing a growing number of efforts to sequence large genomes de novo from these short reads. In this Perspective, we describe the issues associated with short-read assembly, the different types of data produced by second-gen sequencers, and the latest assembly algorithms designed for these data. We also review the genomes that have been assembled recently from short reads and make recommendations for sequencing strategies that will yield a high-quality assembly.

Schatz, Michael C.; Delcher, Arthur L.; Salzberg, Steven L.

2010-01-01

333

Genome sequencing and analysis of the model grass Brachypodium distachyon  

SciTech Connect

Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum genomes shows a precise history of genome evolution across a broad diversity of the grasses, and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat. The high-quality genome sequence, coupled with ease of cultivation and transformation, small size and rapid life cycle, will help Brachypodium reach its potential as an important model system for developing new energy and food crops.

Yang, Xiaohan [ORNL; Kalluri, Udaya C [ORNL; Tuskan, Gerald A [ORNL

2010-01-01

334

Genome sequencing and analysis of the model grass Brachypodium distachyon.  

PubMed

Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum genomes shows a precise history of genome evolution across a broad diversity of the grasses, and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat. The high-quality genome sequence, coupled with ease of cultivation and transformation, small size and rapid life cycle, will help Brachypodium reach its potential as an important model system for developing new energy and food crops. PMID:20148030

2010-02-11

335

Complete genome sequence of Cellulomonas flavigena type strain (134T)  

SciTech Connect

Cellulomonas flavigena (Kellerman and McBeth 1912) Bergey et al. 1923 is the type species of the genus Cellulomonas of the actinobacterial family Cellulomonadaceae. Members of the genus Cellulomonas are of special interest for their ability to degrade cellulose and hemicellulose, particularly with regard to the use of biomass as an alternative energy source. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of a member of the genus Cellulomonas, and next to the human pathogen Tropheryma whipplei the second complete genome sequence within the actinobacterial family Cellulomonadaceae. The 4,123,179 bp long single replicon genome with its 3,735 protein-coding and 53 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

Abt, Birte [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Foster, Brian [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Clum, Alicia [U.S. Department of Energy, Joint Genome Institute; Sun, Hui [U.S. Department of Energy, Joint Genome Institute; Pukall, Rudiger [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. 
Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany

2010-01-01

336

A comprehensive catalogue of somatic mutations from a human cancer genome.  

PubMed

All cancers carry somatic mutations. A subset of these somatic alterations, termed driver mutations, confer selective growth advantage and are implicated in cancer development, whereas the remainder are passengers. Here we have sequenced the genomes of a malignant melanoma and a lymphoblastoid cell line from the same person, providing the first comprehensive catalogue of somatic mutations from an individual cancer. The catalogue provides remarkable insights into the forces that have shaped this cancer genome. The dominant mutational signature reflects DNA damage due to ultraviolet light exposure, a known risk factor for malignant melanoma, whereas the uneven distribution of mutations across the genome, with a lower prevalence in gene footprints, indicates that DNA repair has been preferentially deployed towards transcribed regions. The results illustrate the power of a cancer genome sequence to reveal traces of the DNA damage, repair, mutation and selection processes that were operative years before the cancer became symptomatic. PMID:20016485

Pleasance, Erin D; Cheetham, R Keira; Stephens, Philip J; McBride, David J; Humphray, Sean J; Greenman, Chris D; Varela, Ignacio; Lin, Meng-Lay; Ordóñez, Gonzalo R; Bignell, Graham R; Ye, Kai; Alipaz, Julie; Bauer, Markus J; Beare, David; Butler, Adam; Carter, Richard J; Chen, Lina; Cox, Anthony J; Edkins, Sarah; Kokko-Gonzales, Paula I; Gormley, Niall A; Grocock, Russell J; Haudenschild, Christian D; Hims, Matthew M; James, Terena; Jia, Mingming; Kingsbury, Zoya; Leroy, Catherine; Marshall, John; Menzies, Andrew; Mudie, Laura J; Ning, Zemin; Royce, Tom; Schulz-Trieglaff, Ole B; Spiridou, Anastassia; Stebbings, Lucy A; Szajkowski, Lukasz; Teague, Jon; Williamson, David; Chin, Lynda; Ross, Mark T; Campbell, Peter J; Bentley, David R; Futreal, P Andrew; Stratton, Michael R

2010-01-14

337

Genome Sequence of a Novel Iflavirus from mRNA Sequencing of the Butterfly Heliconius erato.  

PubMed

Here, we report the genome sequence of a novel iflavirus strain recovered from the neotropical butterfly Heliconius erato. The coding DNA sequence (CDS) of the iflavirus genome was 8,895 nucleotides in length, encoding a polyprotein that was 2,965 amino acids long. PMID:24831145

Smith, Gilbert; Macias-Muñoz, Aide; Briscoe, Adriana D

2014-01-01

338

Genome Sequence of a Novel Iflavirus from mRNA Sequencing of the Butterfly Heliconius erato  

PubMed Central

Here, we report the genome sequence of a novel iflavirus strain recovered from the neotropical butterfly Heliconius erato. The coding DNA sequence (CDS) of the iflavirus genome was 8,895 nucleotides in length, encoding a polyprotein that was 2,965 amino acids long.

Macias-Munoz, Aide; Briscoe, Adriana D.

2014-01-01

339

Sequencing techniques uncover mutations in genes that can increase cancer risk  

Cancer.gov

Now that the findings from the Human Genome Project are widely available, scientists are working to put that data to work to understand the genetic causes of many diseases, including cancer, by using the latest sequencing techniques.

340

Diverse mechanisms of somatic structural variations in human cancer genomes  

PubMed Central

Identification of somatic rearrangements in cancer genomes has accelerated through analysis of high-throughput sequencing data. However, characterization of complex structural alterations and their underlying mechanisms remains inadequate. Here, applying an algorithm to predict structural variations from short reads, we report a comprehensive catalog of somatic structural variations and the mechanisms generating them, using high-coverage whole-genome sequencing data from 140 patients across ten tumor types. We characterize the relative contributions of different types of rearrangements and their mutational mechanisms, find that ~20% of the somatic deletions are complex deletions formed by replication errors, and describe the differences between the mutational mechanisms in somatic and germline alterations. Importantly, we provide detailed reconstructions of the events responsible for loss of CDKN2A/B and gain of EGFR in glioblastoma, revealing that these alterations can result from multiple mechanisms even in a single genome and that both DNA double-strand breaks and replication errors drive somatic rearrangements.

Yang, Lixing; Luquette, Lovelace J.; Gehlenborg, Nils; Xi, Ruibin; Haseley, Psalm S.; Hsieh, Chih-Heng; Zhang, Chengsheng; Ren, Xiaojia; Protopopov, Alexei; Chin, Lynda; Kucherlapati, Raju; Lee, Charles; Park, Peter J.

2013-01-01

341

Complete Genome Sequences of Six Strains of the Genus Methylobacterium  

SciTech Connect

The complete and assembled genome sequences were determined for six strains of the alphaproteobacterial genus Methylobacterium, chosen for their key adaptations to different plant-associated niches and environmental constraints.

Marx, Christopher J [Harvard University; Bringel, Francoise O. [University of Strasbourg; Chistoserdova, Ludmila [University of Washington, Seattle; Moulin, Lionel [UMR, France; Ul Haque, Muhammad Farhan [University of Strasbourg; Fleischman, Darrell E. [Wright State University, Dayton, OH; Gruffaz, Christelle [CNRS, Strasbourg, France; Jourand, Philippe [UMR, France; Knief, Claudia [ETH Zurich, Switzerland; Lee, Ming-Chun [Harvard University; Muller, Emilie E. L. [CNRS, Strasbourg, France; Nadalig, Thierry [CNRS, Strasbourg, France; Peyraud, Remi [ETH Zurich, Switzerland; Roselli, Sandro [CNRS, Strasbourg, France; Russ, Lina [ETH Zurich, Switzerland; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Ivanov, Pavel S. [University of Wyoming, Laramie; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Lajus, Aurelie [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche; Land, Miriam L [ORNL; Medigue, Claudine [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Stolyar, Sergey [University of Washington; Vorholt, Julia A. [ETH Zurich, Switzerland; Vuilleumier, Stephane [University of Strasbourg

2012-01-01

342

Draft Genome Sequence of Lactobacillus animalis 381-IL-28.  

PubMed

Lactobacillus animalis 381-IL-28 is an integral component of a multistrain commercial culture with food biopreservative and pathogen biocontrol functionality. A draft sequence of the L. animalis 381-IL-28 genome is described in this paper. PMID:24874675

Sturino, Joseph M; Rajendran, Mahitha; Altermann, Eric

2014-01-01

343

Draft Genome Sequence of Staphylococcus massiliensis Strain 5402776T  

PubMed Central

A draft genome sequence of Staphylococcus massiliensis, Gram-positive cocci isolated from a human brain abscess sample, is described here. One clustered regularly interspaced short palindromic repeat, three transposases, six putative transposases, and one potential provirus were characterized.

Robert, Catherine; Gimenez, Gregory; Raoult, Didier

2012-01-01

344

Genome Sequence of the Immunomodulatory Strain Bifidobacterium bifidum LMG 13195  

PubMed Central

In this work, we report the genome sequences of Bifidobacterium bifidum strain LMG13195. Results from our research group show that this strain is able to interact with human immune cells, generating functional regulatory T cells.

Gueimonde, Miguel; Ventura, Marco; Margolles, Abelardo

2012-01-01

345

Complete genome sequences of six strains of the genus Methylobacterium.  

PubMed

The complete and assembled genome sequences were determined for six strains of the alphaproteobacterial genus Methylobacterium, chosen for their key adaptations to different plant-associated niches and environmental constraints. PMID:22887658

Marx, Christopher J; Bringel, Françoise; Chistoserdova, Ludmila; Moulin, Lionel; Farhan Ul Haque, Muhammad; Fleischman, Darrell E; Gruffaz, Christelle; Jourand, Philippe; Knief, Claudia; Lee, Ming-Chun; Muller, Emilie E L; Nadalig, Thierry; Peyraud, Rémi; Roselli, Sandro; Russ, Lina; Goodwin, Lynne A; Ivanova, Natalia; Kyrpides, Nikos; Lajus, Aurélie; Land, Miriam L; Médigue, Claudine; Mikhailova, Natalia; Nolan, Matt; Woyke, Tanja; Stolyar, Sergey; Vorholt, Julia A; Vuilleumier, Stéphane

2012-09-01

346

Significant Sequences: Genomics Activities for Advanced Biology Students  

NSDL National Science Digital Library

Significant Sequences, developed by Washington University's Science Outreach Program and written by faculty and high school teachers, is a publication that focuses on the importance of genomic data and how the data are discovered and used.

Kathryn Gail Miller (Washington University)

2010-06-17

347

Fulfilling the Promise of a Sequenced Human Genome – Part II  

SciTech Connect

Eric Green, scientific director of the National Human Genome Research Institute (NHGRI), gives the opening keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM on May 27, 2009. Part 2 of 2

Green, Eric [National Human Genome Research Institute

2009-05-27

348

Fulfilling the Promise of a Sequenced Human Genome – Part I  

SciTech Connect

Eric Green, scientific director of the National Human Genome Research Institute (NHGRI), gives the opening keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM on May 27, 2009. Part 1 of 2

Green, Eric [National Human Genome Research Institute

2009-05-27

349

Draft Genome Sequence of Lactobacillus animalis 381-IL-28  

PubMed Central

Lactobacillus animalis 381-IL-28 is an integral component of a multistrain commercial culture with food biopreservative and pathogen biocontrol functionality. A draft sequence of the L. animalis 381-IL-28 genome is described in this paper.

Rajendran, Mahitha; Altermann, Eric

2014-01-01

350

Complete Genome Sequence of Porcine Encephalomyocarditis Virus Strain BD2  

PubMed Central

Encephalomyocarditis virus (EMCV) causes acute myocarditis in young pigs or reproductive failure in sows, and it is divided into two main groups. Here, we report the complete genome sequence of EMCV strain BD2, which belongs to group I.

Yuan, Wanzhe; Zhang, Xiuyuan

2013-01-01

351

Genome Sequence of Bacillus thuringiensis subsp. kurstaki Strain HD-1  

PubMed Central

We report here the complete genome sequence of Bacillus thuringiensis subsp. kurstaki strain HD-1, which serves as the primary U.S. reference standard for all commercial insecticidal formulations of B. thuringiensis manufactured around the world.

Day, Michael; Ibrahim, Mohamed; Dyer, David

2014-01-01

352

Complete Genome Sequence of Pseudomonas denitrificans ATCC 13867.  

PubMed

Pseudomonas denitrificans ATCC 13867, a Gram-negative facultative anaerobic bacterium, is known to produce vitamin B12 under aerobic conditions. This paper reports the annotated whole-genome sequence of the circular chromosome of this organism. PMID:23723394

Ainala, Satish Kumar; Somasundar, Ashok; Park, Sunghoon

2013-01-01

353

Next-Generation Sequence Analysis of Cancer Xenograft Models  

PubMed Central

Next-generation sequencing (NGS) studies in cancer are limited by the amount, quality and purity of tissue samples. In this situation, primary xenografts have proven useful preclinical models. However, the presence of mouse-derived stromal cells represents a technical challenge to their use in NGS studies. We examined this problem in an established primary xenograft model of small cell lung cancer (SCLC), a malignancy often diagnosed from small biopsy or needle aspirate samples. Using an in silico strategy that assigns reads according to species-of-origin, we prospectively compared NGS data from primary xenograft models with matched cell lines and with published datasets. We show here that low-coverage whole-genome analysis demonstrated remarkable concordance between published genome data and internal controls, despite the presence of mouse genomic DNA. Exome capture sequencing revealed that this enrichment procedure was highly species-specific, with less than 4% of reads aligning to the mouse genome. Human-specific expression profiling with RNA-Seq replicated array-based gene expression experiments, whereas mouse-specific transcript profiles correlated with published datasets from human cancer stroma. We conclude that primary xenografts represent a useful platform for complex NGS analysis in cancer research for tumours with limited sample resources, or those with prominent stromal cell populations.

Rossello, Fernando J.; Tothill, Richard W.; Britt, Kara; Marini, Kieren D.; Falzon, Jeanette; Thomas, David M.; Peacock, Craig D.; Marchionni, Luigi; Li, Jason; Bennett, Samara; Tantoso, Erwin; Brown, Tracey; Chan, Philip; Martelotto, Luciano G.; Watkins, D. Neil

2013-01-01
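
The species-of-origin read assignment mentioned in the xenograft abstract above can be illustrated with a minimal sketch. The abstract does not describe the actual rule, so everything here is an assumption: each read is taken to have been aligned separately to a human and a mouse reference, a numeric alignment score is available for each, and the names `assign_species`, `summarize` and the `margin` threshold are hypothetical.

```python
from collections import Counter

def assign_species(human_score, mouse_score, margin=5):
    """Label one read by comparing its best alignment scores.

    A score of None means the read did not align to that reference.
    The margin-based rule is an illustrative assumption, not the
    method used in the study.
    """
    if human_score is None and mouse_score is None:
        return "unaligned"
    if mouse_score is None:
        return "human"
    if human_score is None:
        return "mouse"
    if human_score - mouse_score >= margin:
        return "human"
    if mouse_score - human_score >= margin:
        return "mouse"
    return "ambiguous"

def summarize(reads, margin=5):
    """Count reads per label; `reads` yields (read_id, human_score, mouse_score)."""
    return Counter(assign_species(h, m, margin) for _, h, m in reads)

# Made-up scores for five hypothetical reads
reads = [
    ("r1", 60, 20),
    ("r2", 15, 58),
    ("r3", 40, 38),
    ("r4", None, None),
    ("r5", 55, None),
]
counts = summarize(reads)
mouse_fraction = counts["mouse"] / len(reads)
print(counts, mouse_fraction)
```

A summary of this kind is how a figure such as the share of reads aligning to the mouse genome could be computed; reads labelled ambiguous would normally be set aside rather than forced into either species.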

354

A comprehensive catalogue of somatic mutations from a human cancer genome  

Microsoft Academic Search

All cancers carry somatic mutations. A subset of these somatic alterations, termed driver mutations, confer selective growth advantage and are implicated in cancer development, whereas the remainder are passengers. Here we have sequenced the genomes of a malignant melanoma and a lymphoblastoid cell line from the same person, providing the first comprehensive catalogue of somatic mutations from an individual cancer.

Erin D. Pleasance; R. Keira Cheetham; Philip J. Stephens; David J. McBride; Sean J. Humphray; Chris D. Greenman; Ignacio Varela; Meng-Lay Lin; Gonzalo R. Ordóñez; Graham R. Bignell; Kai Ye; Julie Alipaz; Markus J. Bauer; David Beare; Adam Butler; Richard J. Carter; Lina Chen; Anthony J. Cox; Sarah Edkins; Paula I. Kokko-Gonzales; Niall A. Gormley; Russell J. Grocock; Christian D. Haudenschild; Matthew M. Hims; Terena James; Mingming Jia; Zoya Kingsbury; Catherine Leroy; John Marshall; Andrew Menzies; Laura J. Mudie; Zemin Ning; Tom Royce; Ole B. Schulz-Trieglaff; Anastassia Spiridou; Lucy A. Stebbings; Lukasz Szajkowski; Jon Teague; David Williamson; Lynda Chin; Mark T. Ross; Peter J. Campbell; David R. Bentley; P. Andrew Futreal; Michael R. Stratton

2010-01-01

355

Complete Genome Sequence of Methanomassiliicoccus luminyensis, the Largest Genome of a Human-Associated Archaea Species  

PubMed Central

The present study describes the complete and annotated genome sequence of Methanomassiliicoccus luminyensis strain B10 (DSM 24529T, CSUR P135), which was isolated from human feces. The 2.6-Mb genome represents the largest genome of a methanogenic euryarchaeon isolated from humans. The genome data of M. luminyensis reveal unique features and horizontal gene transfer events, which might have occurred during its adaptation and/or evolution in the human ecosystem.

Gorlas, Aurore; Robert, Catherine; Gimenez, Gregory; Drancourt, Michel

2012-01-01

356

Genome Science and Personalized Cancer Treatment  

SciTech Connect

August 4, 2009 Berkeley Lab lecture: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

Joe Gray

2009-08-07

357

Genome Science and Personalized Cancer Treatment  

ScienceCinema

August 4, 2009 Berkeley Lab lecture: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

Joe Gray

2010-01-08

358

Genome Sequence of Fusarium graminearum Isolate CS3005.  

PubMed

Fusarium graminearum is one of the most important fungal pathogens of wheat, barley, and maize worldwide. This announcement reports the genome sequence of a highly virulent Australian isolate of this species to supplement the existing genome of the North American F. graminearum isolate Ph1. PMID:24744326

Gardiner, Donald M; Stiller, Jiri; Kazan, Kemal

2014-01-01

359

Complete Genome Sequence of Marinobacter sp. BSs20148.  

PubMed

Marinobacter sp. BSs20148 was isolated from marine sediment collected from the Arctic Ocean at a water depth of 3,800 m. Here we report the complete genome sequence of Marinobacter sp. BSs20148. This genomic information will facilitate the study of the physiological metabolism, ecological roles, and evolution of the Marinobacter species. PMID:23682144

Song, Lai; Ren, Lufeng; Li, Xingang; Yu, Dan; Yu, Yong; Wang, Xumin; Liu, Guiming

2013-01-01

360

Genome Sequence of a Salinibacterium sp. Isolated from Antarctic Soil  

PubMed Central

The draft genome of Salinibacterium sp. PAMC 21357, isolated from permafrost soil of Antarctica, was determined. Here we present a 3.1-Mb draft genome sequence of Salinibacterium sp. that could provide further insight into the genetic determination of its cold-adaptive properties.

Shin, Seung Chul; Kim, Su Jin; Ahn, Do Hwan; Lee, Jong Kyu; Lee, Hyoungseok; Lee, Jungeun; Hong, Soon Gyu; Lee, Yung Mi

2012-01-01

361

Draft Genome Sequence of Mycobacterium mageritense DSM 44476T  

PubMed Central

We report the draft genome sequence of Mycobacterium mageritense strain DSM 44476T (CIP 104973), a nontuberculosis species responsible for various infections. The genome described here is composed of 7,966,608 bp, with a G+C content of 66.95%, and contains 7,675 protein-coding genes and 120 predicted RNA genes.

Croce, Olivier; Robert, Catherine; Raoult, Didier

2014-01-01

362

Complete Genome Sequence of Lactococcus lactis subsp. cremoris A76  

PubMed Central

We report the complete genome sequence of Lactococcus lactis subsp. cremoris A76, a dairy strain isolated from a cheese production outfit. Genome analysis detected two contiguous islands fitting to the L. lactis subsp. lactis rather than to the L. lactis subsp. cremoris lineage. This indicates the existence of genetic exchange between the diverse subspecies, presumably related to the technological process.

Quinquis, Benoit; Ehrlich, Stanislas Dusko; Sorokin, Alexei

2012-01-01

363

The genome sequence of the rice blast fungus Magnaporthe grisea  

Microsoft Academic Search

Magnaporthe grisea is the most destructive pathogen of rice worldwide and the principal model organism for elucidating the molecular basis of fungal disease of plants. Here, we report the draft sequence of the M. grisea genome. Analysis of the gene set provides an insight into the adaptations required by a fungus to cause disease. The genome encodes a large and

Ralph A. Dean; Nicholas J. Talbot; Daniel J. Ebbole; Mark L. Farman; Thomas K. Mitchell; Marc J. Orbach; Michael Thon; Resham Kulkarni; Jin-Rong Xu; Huaqin Pan; Nick D. Read; Yong-Hwan Lee; Ignazio Carbone; Doug Brown; Yeon Yee Oh; Nicole Donofrio; Jun Seop Jeong; Darren M. Soanes; Slavica Djonovic; Elena Kolomiets; Cathryn Rehmeyer; Weixi Li; Michael Harding; Soonok Kim; Marc-Henri Lebrun; Heidi Bohnert; Sean Coughlan; Jonathan Butler; Sarah Calvo; Li-Jun Ma; Robert Nicol; Seth Purcell; Chad Nusbaum; James E. Galagan; Bruce W. Birren

2005-01-01

364

Draft Genome Sequence of Avibacterium paragallinarum Strain 221  

PubMed Central

Avibacterium paragallinarum is the causative agent of infectious coryza. Here we report the draft genome sequence of reference strain 221 of A. paragallinarum serovar A. The genome is composed of 135 contigs for 2,685,568 bp with a 41% G+C content.

Xu, Fuzhou; Miao, Deyuan; Du, Yu; Chen, Xiaoling; Zhang, Peijun

2013-01-01
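
Draft genome announcements such as this one report contig count, total length and G+C content. As a rough illustration of how those three figures relate to the contig sequences, here is a small sketch; the function name and the toy contigs are hypothetical and are not taken from the announced assembly.

```python
def assembly_stats(contigs):
    """Return (number of contigs, total length in bp, G+C content in percent)."""
    total = sum(len(c) for c in contigs)
    gc = sum(c.upper().count("G") + c.upper().count("C") for c in contigs)
    return len(contigs), total, 100.0 * gc / total

# Toy contigs for illustration only.
toy_contigs = ["ATGC", "GGCCAT", "at"]
n_contigs, total_bp, gc_percent = assembly_stats(toy_contigs)
print(n_contigs, total_bp, gc_percent)
```

G+C content is computed over the pooled bases of all contigs rather than as an average of per-contig values, so long contigs weigh more than short ones.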

365

Draft Genome Sequence of Enterobacter cloacae Strain JD6301.  

PubMed

Enterobacter cloacae strain JD6301 was isolated from a mixed culture with wastewater collected from a municipal treatment facility and oleaginous microorganisms. A draft genome sequence of this organism indicates that it has a genome size of 4,772,910 bp, an average G+C content of 53%, and 4,509 protein-coding genes. PMID:24874669

Wilson, Jessica G; French, William T; Lipzen, Anna; Martin, Joel; Schackwitz, Wendy; Woyke, Tanja; Shapiro, Nicole; Bullard, James W; Champlin, Franklin R; Donaldson, Janet R

2014-01-01

366

Complete Genome Sequence of the Soil Actinomycete Kocuria rhizophila  

Microsoft Academic Search

The soil actinomycete Kocuria rhizophila belongs to the suborder Micrococcineae, a divergent bacterial group for which only a limited amount of genomic information is currently available. K. rhizophila is also important in industrial applications; e.g., it is commonly used as a standard quality control strain for antimicrobial susceptibility testing. Sequencing and annotation of the genome of K. rhizophila DC2201 (NBRC

Hiromi Takarada; Mitsuo Sekine; Hiroki Kosugi; Yasunori Matsuo; Takatomo Fujisawa; Seiha Omata; Emi Kishi; Ai Shimizu; Naofumi Tsukatani; Satoshi Tanikawa; Nobuyuki Fujita; Shigeaki Harayama

2008-01-01

367

Complete Genome Sequence of Pediococcus pentosaceus Strain SL4.  

PubMed

Pediococcus pentosaceus SL4 was isolated from a Korean fermented vegetable product, kimchi. We report here the whole-genome sequence (WGS) of P. pentosaceus SL4. The genome consists of a 1.79-Mb circular chromosome (G+C content of 37.3%) and seven distinct plasmids ranging in size from 4 kb to 50 kb. PMID:24371205

Dantoft, Shruti Harnal; Bielak, Eliza Maria; Seo, Jae-Gu; Chung, Myung-Jun; Jensen, Peter Ruhdal

2013-01-01

368

Draft Genome Sequences of Two Clinical Isolates of Streptococcus mutans.  

PubMed

We report the draft genome sequences of PKUSS-HG01 and PKUSS-LG01, two clinical isolates of Streptococcus mutans from human dental plaque. The genomics information will facilitate the study of the mechanisms of pathogenicity and evolution of S. mutans. PMID:24926045

Zheng, Hui; Guo, Lihong; Du, Ning; Lin, Jiuxiang; Song, Lai; Liu, Guiming; Chen, Feng

2014-01-01

369

Complete genome sequence of Riemerella anatipestifer reference strain.  

PubMed

Riemerella anatipestifer is an infectious pathogen causing serositis in ducks. We had the genome of the R. anatipestifer reference strain ATCC 11845 sequenced. The completed draft genome consists of one circular chromosome with 2,164,087 bp. There are 2,101 genes in the draft, and its GC content is 35.01%. PMID:22628503

Wang, Xiaojia; Zhu, DeKang; Wang, MingShu; Cheng, AnChun; Jia, RenYong; Zhou, Yi; Chen, Zhengli; Luo, QiHui; Liu, Fei; Wang, Yin; Chen, Xiao Yue

2012-06-01

370

Complete Genome Sequence of Riemerella anatipestifer Reference Strain  

PubMed Central

Riemerella anatipestifer is an infectious pathogen causing serositis in ducks. We had the genome of the R. anatipestifer reference strain ATCC 11845 sequenced. The completed draft genome consists of one circular chromosome with 2,164,087 bp. There are 2,101 genes in the draft, and its GC content is 35.01%.

Wang, Xiaojia; Zhu, DeKang; Wang, MingShu; Jia, RenYong; Zhou, Yi; Chen, Zhengli; Luo, QiHui; Liu, Fei; Wang, Yin; Chen, Xiao Yue

2012-01-01

371

Complete Genome Sequence of Antarctic Bacterium Psychrobacter sp. Strain G.  

PubMed

Here, we report the complete genome sequence of Psychrobacter sp. strain G, isolated from King George Island, Antarctica, which can produce lipolytic enzymes at low temperatures. The genomics information of this strain will facilitate the study of the physiology, cold adaptation properties, and evolution of this genus. PMID:24051316

Che, Shuai; Song, Lai; Song, Weizhi; Yang, Meng; Liu, Guiming; Lin, Xuezheng

2013-01-01

372

Draft Genome Sequence of Enterobacter cloacae Strain JD6301  

PubMed Central

Enterobacter cloacae strain JD6301 was isolated from a mixed culture with wastewater collected from a municipal treatment facility and oleaginous microorganisms. A draft genome sequence of this organism indicates that it has a genome size of 4,772,910 bp, an average G+C content of 53%, and 4,509 protein-coding genes.

Wilson, Jessica G.; French, William T.; Lipzen, Anna; Martin, Joel; Schackwitz, Wendy; Woyke, Tanja; Shapiro, Nicole; Bullard, James W.; Champlin, Franklin R.

2014-01-01

373

Genome Sequences of Five B1 Subcluster Mycobacteriophages  

PubMed Central

Mycobacteriophages infect members of the Mycobacterium genus in the phylum Actinobacteria and exhibit remarkable diversity. Genome analysis groups the thousands of known mycobacteriophages into clusters, of which the B1 subcluster is currently the third most populous. We report the complete genome sequences of five additional members of the B1 subcluster.

Barrus, E. Zane; Benedict, Alex B.; Brighton, Alicia K.; Fisher, Joshua N. B.; Gardner, Adam V.; Kartchner, Brittany J.; Ladle, Kara C.; Lunt, Bryce L.; Merrill, Bryan D.; Morrell, John D.; Burnett, Sandra H.

2013-01-01

374

Complete genome sequence of pronghorn virus, a pestivirus.  

PubMed

The complete genome sequence of pronghorn virus, a member of the Pestivirus genus of the family Flaviviridae, was determined here. The virus, originally isolated from a pronghorn antelope, has a genome of 12,273 nucleotides, with a single open reading frame of 11,694 bases encoding 3,897 amino acids. PMID:24926058

Neill, John D; Ridpath, Julia F; Fischer, Nicole; Grundhoff, Adam; Postel, Alexander; Becher, Paul

2014-01-01
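
The lengths quoted in this abstract can be checked against each other with simple arithmetic. The sketch below assumes that the 11,694-base open reading frame includes its stop codon; the abstract does not state this, but under that assumption the codon count minus one matches the reported 3,897 amino acids. The function name is hypothetical.

```python
def orf_amino_acids(orf_bases, includes_stop=True):
    """Number of encoded residues for an ORF of the given length in bases."""
    if orf_bases % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    codons = orf_bases // 3
    # The stop codon occupies one codon but encodes no amino acid.
    return codons - 1 if includes_stop else codons

codons = 11_694 // 3               # codons in the ORF
residues = orf_amino_acids(11_694) # amino acids, assuming a stop codon is counted
non_orf_bases = 12_273 - 11_694    # genome bases outside the ORF
print(codons, residues, non_orf_bases)
```

On these figures, the bases outside the single ORF would correspond to the untranslated regions at the two ends of the genome.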

375

Complete genome sequence of Lactococcus lactis subsp. cremoris A76.  

PubMed

We report the complete genome sequence of Lactococcus lactis subsp. cremoris A76, a dairy strain isolated from a cheese production outfit. Genome analysis detected two contiguous islands fitting to the L. lactis subsp. lactis rather than to the L. lactis subsp. cremoris lineage. This indicates the existence of genetic exchange between the diverse subspecies, presumably related to the technological process. PMID:22328746

Bolotin, Alexander; Quinquis, Benoit; Ehrlich, Stanislas Dusko; Sorokin, Alexei

2012-03-01

376

Draft Genome Sequence of Mycobacterium vulneris DSM 45247T  

PubMed Central

We report the draft genome sequence of Mycobacterium vulneris DSM 45247T strain, an emerging, opportunistic pathogen of the Mycobacterium avium complex. The genome described here is composed of 6,981,439 bp (with a G+C content of 67.14%) and has 6,653 protein-coding genes and 84 predicted RNA genes.

Croce, Olivier; Robert, Catherine; Raoult, Didier

2014-01-01

377

Draft Genome Sequence of the Sexually Transmitted Pathogen Trichomonas vaginalis  

Microsoft Academic Search

We describe the genome sequence of the protist Trichomonas vaginalis, a sexually transmitted human pathogen. Repeats and transposable elements comprise about two-thirds of the ~160-megabase genome, reflecting a recent massive expansion of genetic material. This expansion, in conjunction with the shaping of metabolic pathways that likely transpired through lateral gene transfer from bacteria, and amplification of specific gene families implicated

J. M. Carlton; R. P. Hirt; J. C. Silva; A. L. Delcher; Michael Schatz; Qi Zhao; J. R. Wortman; S. L. Bidwell; U. C. M. Alsmark; Sébastien Besteiro; Thomas Sicheritz-Ponten; C. J. Noel; J. B. Dacks; P. G. Foster; Cedric Simillion; Y. Van de Peer; Diego Miranda-Saavedra; G. J. Barton; G. D. Westrop; S. Muller; Daniele Dessi; P. L. Fiori; Qinghu Ren; Ian Paulsen; Hanbang Zhang; F. D. Bastida-Corcuera; Augusto Simoes-Barbosa; M. T. Brown; R. D. Hayes; Mandira Mukherjee; C. Y. Okumura; Rachel Schneider; A. J. Smith; Stepanka Vanacova; Maria Villalvazo; B. J. Haas; Mihaela Pertea; Tamara V. Feldblyum; T. R. Utterback; Chung-Li Shu; Kazutoyo Osoegawa; P. J. de Jong; Ivan Hrdy; Lenka Horvathova; Zuzana Zubacova; Pavel Dolezal; Shehre-Banoo Malik; J. M. Logsdon; Katrin Henze; Arti Gupta; Ching C. Wang; R. L. Dunne; J. A. Upcroft; Peter Upcroft; Owen White; S. L. Salzberg; Petrus Tang; Cheng-Hsun Chiu; Ying-Shiung Lee; T. M. Embley; G. H. Coombs; J. C. Mottram; Jan Tachezy; C. M. Fraser-Liggett; P. J. Johnson

2007-01-01

378

Draft Genome Sequence of Mycobacterium triplex DSM 44626  

PubMed Central

We announce the draft genome sequence of Mycobacterium triplex strain DSM 44626, a nontuberculosis species responsible for opportunistic infections. The genome described here is composed of 6,382,840 bp, with a G+C content of 66.57%, and contains 5,988 protein-coding genes and 81 RNA genes.

Sassi, Mohamed; Croce, Olivier; Robert, Catherine; Raoult, Didier

2014-01-01

379

Sequence Analysis of the Genome of the Neodiprion sertifer Nucleopolyhedrovirus  

Microsoft Academic Search

The genome of the Neodiprion sertifer nucleopolyhedrovirus (NeseNPV), which infects the European pine sawfly, N. sertifer (Hymenoptera: Diprionidae), was sequenced and analyzed. The genome was 86,462 bp in size. The CG content of 34% was lower than that of the majority of baculoviruses. A total of 90 methionine-initiated open reading frames (ORFs) with more than 50 amino acids and

Alejandra Garcia-Maruniak; James E. Maruniak; Paolo M. A. Zanotto; Aissa E. Doumbouya; Jaw-Ching Liu; Thomas M. Merritt; Jennifer S. Lanoie

2004-01-01

380

Draft Genome Sequence of Entomopathogenic Serratia liquefaciens Strain FK01  

PubMed Central

In the present study, we determined the draft genome sequence of the entomopathogenic bacterium Serratia liquefaciens FK01, which is highly virulent to the silkworm. The draft genome is ~5.28 Mb in size, and the G+C content is 55.8%.

Taira, Erika; Mon, Hiroaki; Mori, Kazuki; Akasaka, Taiki; Tashiro, Kousuke; Yasunaga-Aoki, Chisa; Lee, Jae Man; Kusakabe, Takahiro

2014-01-01

381

Draft Genome Sequences of Two Clinical Isolates of Streptococcus mutans  

PubMed Central

We report the draft genome sequences of PKUSS-HG01 and PKUSS-LG01, two clinical isolates of Streptococcus mutans from human dental plaque. The genomics information will facilitate the study of the mechanisms of pathogenicity and evolution of S. mutans.

Zheng, Hui; Guo, Lihong; Du, Ning; Lin, Jiuxiang; Song, Lai

2014-01-01

382

Complete Genome Sequence of Pronghorn Virus, a Pestivirus  

PubMed Central

The complete genome sequence of pronghorn virus, a member of the Pestivirus genus of the family Flaviviridae, was determined here. The virus, originally isolated from a pronghorn antelope, has a genome of 12,273 nucleotides, with a single open reading frame of 11,694 bases encoding 3,897 amino acids.

Ridpath, Julia F.; Fischer, Nicole; Grundhoff, Adam; Postel, Alexander; Becher, Paul

2014-01-01

383

Complete Genome Sequence of Cronobacter sakazakii Strain CMCC 45402.  

PubMed

Cronobacter sakazakii is considered to be an important pathogen involved in life-threatening neonatal infections. Here, we report the annotated complete genome sequence of C. sakazakii strain CMCC 45402, obtained from a milk sample in China. The major findings from the genomic analysis provide a better understanding of the isolates from China. PMID:24435860

Zhao, Zhijing; Wang, Lei; Wang, Bin; Liang, Haoyu; Ye, Qiang; Zeng, Ming

2014-01-01

384

Draft Genome Sequence of Mycobacterium triplex DSM 44626.  

PubMed

We announce the draft genome sequence of Mycobacterium triplex strain DSM 44626, a nontuberculosis species responsible for opportunistic infections. The genome described here is composed of 6,382,840 bp, with a G+C content of 66.57%, and contains 5,988 protein-coding genes and 81 RNA genes. PMID:24874681

Sassi, Mohamed; Croce, Olivier; Robert, Catherine; Raoult, Didier; Drancourt, Michel

2014-01-01

385

Simple sequence repeats in different genome sequences of Shigella and comparison with high GC and AT-rich genomes.  

PubMed

Simple sequence repeats (SSRs) are omnipresent in prokaryotes and eukaryotes and are found throughout the genome, in both protein-coding and noncoding regions. In the present study, the whole genome sequences of seven chromosomes (Shigella flexneri 2a str. 301 and 2457T, Shigella sonnei, Escherichia coli K12, Mycobacterium tuberculosis, Mycobacterium leprae and Staphylococcus saprophyticus) were downloaded from the GenBank database to identify the abundance, distribution and composition of SSRs, and to determine the difference between tandem repeats in the real genomes and in randomized genomes (generated with a sequence shuffling tool) of the organisms included in this study. The data obtained show that: (i) tandem repeats are widely distributed throughout the genomes; (ii) SSRs are differentially distributed among coding and noncoding regions in the investigated Shigella genomes; (iii) the total frequency of SSRs in noncoding regions is higher than in coding regions; (iv) in all investigated chromosomes, the ratio of trinucleotide SSRs in real genomes is much higher than in randomized genomes, whereas that of dinucleotide SSRs is lower; (v) the ratio of total and mononucleotide SSRs in real genomes is higher than in randomized genomes in E. coli K12, S. flexneri str. 301 and S. saprophyticus, lower in S. flexneri str. 2457T, S. sonnei and M. tuberculosis, and approximately the same in M. leprae; (vi) the frequency of codon repetitions varies considerably depending on the type of encoded amino acid. PMID:18464038

Hosseini, Ashraf; Ranade, Suvidya H; Ghosh, Indira; Khandekar, Pramod

2008-06-01
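
The comparison described in this abstract, counting tandem repeats in a real sequence and in a shuffled version of it, can be sketched as follows. This is not the tool used in the study: the regular-expression approach, the function names, the repeat thresholds and the toy sequence are all assumptions made for illustration, and only perfect repeats are detected.

```python
import random
import re

def find_ssrs(seq, unit_len, min_repeats):
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    unit_len is the motif length (1 = mono-, 2 = di-, 3 = trinucleotide);
    min_repeats is an assumed minimum number of consecutive copies.
    """
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        if unit_len > 1 and len(set(unit)) == 1:
            continue  # e.g. "TT" repeats are really a mononucleotide run
        hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return hits

def shuffled(seq, seed=0):
    """Shuffle bases, preserving composition but destroying their order."""
    bases = list(seq)
    random.Random(seed).shuffle(bases)
    return "".join(bases)

# Toy sequence for illustration only.
seq = "GGCACACACATTTTTTGCAGCAGCAGT"
mono = find_ssrs(seq, 1, 5)
di = find_ssrs(seq, 2, 3)
tri = find_ssrs(seq, 3, 3)
control = shuffled(seq)
print(mono, di, tri, len(find_ssrs(control, 3, 3)))
```

Shuffling keeps base composition fixed, so any excess of repeats in the real sequence over the shuffled one cannot be explained by composition alone.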

386

Genome sequence of the biocontrol strain Pseudomonas fluorescens F113.  

PubMed

Pseudomonas fluorescens F113 is a plant growth-promoting rhizobacterium (PGPR) that has biocontrol activity against fungal plant pathogens and is a model for rhizosphere colonization. Here, we present its complete genome sequence, which shows that besides a core genome very similar to those of other strains sequenced within this species, F113 possesses a wide array of genes encoding specialized functions for thriving in the rhizosphere and interacting with eukaryotic organisms. PMID:22328765

Redondo-Nieto, Miguel; Barret, Matthieu; Morrisey, John P; Germaine, Kieran; Martínez-Granero, Francisco; Barahona, Emma; Navazo, Ana; Sánchez-Contreras, María; Moynihan, Jennifer A; Giddens, Stephen R; Coppoolse, Eric R; Muriel, Candela; Stiekema, Willem J; Rainey, Paul B; Dowling, David; O'Gara, Fergal; Martín, Marta; Rivilla, Rafael

2012-03-01

387

Complete genome sequence of Bacillus cereus bacteriophage PBC1.  

PubMed

Bacillus cereus is a ubiquitous, spore-forming bacterium associated with food poisoning cases. To develop an efficient biocontrol agent against B. cereus, we isolated lytic phage PBC1 and sequenced its genome. PBC1 showed a very low degree of homology to previously reported phages, implying that it is novel. Here we report the complete genome sequence of PBC1 and describe major findings from our analysis. PMID:22570248

Kong, Minsuk; Kim, Minsik; Ryu, Sangryeol

2012-06-01

388

Complete Genome Sequence of the Methanogenic Archaeon, Methanococcus jannaschii  

Microsoft Academic Search

The complete 1.66-megabase pair genome sequence of an autotrophic archaeon, Methanococcus jannaschii, and its 58- and 16-kilobase pair extrachromosomal elements have been determined by whole-genome random sequencing. A total of 1738 predicted protein-coding genes were identified; however, only a minority of these (38 percent) could be assigned a putative cellular role with high confidence. Although the majority of genes related

Carol J. Bult; Owen White; Gary J. Olsen; Lixin Zhou; Robert D. Fleischmann; Granger G. Sutton; Judith A. Blake; Lisa M. Fitzgerald; Rebecca A. Clayton; Jeannine D. Gocayne; Anthony R. Kerlavage; Brian A. Dougherty; Jean-Francois Tomb; Mark D. Adams; Claudia I. Reich; Ross Overbeek; Ewen F. Kirkness; Keith G. Weinstock; Joseph M. Merrick; Anna Glodek; John L. Scott; Neil S. M. Geoghagen; Janice F. Weidman; Joyce L. Fuhrmann; Dave Nguyen; Teresa R. Utterback; Jenny M. Kelley; Jeremy D. Peterson; Paul W. Sadow; Michael C. Hanna; Matthew D. Cotton; Kevin M. Roberts; Margaret A. Hurst; Brian P. Kaine; Mark Borodovsky; Hans-Peter Klenk; Claire M. Fraser; Hamilton O. Smith; Carl R. Woese; J. Craig Venter

1996-01-01

389

Complete Sequence and Genomic Analysis of Murine Gammaherpesvirus 68  

Microsoft Academic Search

Murine gammaherpesvirus 68 (gHV68) infects mice, thus providing a tractable small-animal model for analysis of the acute and chronic pathogenesis of gammaherpesviruses. To facilitate molecular analysis of gHV68 pathogenesis, we have sequenced the gHV68 genome. The genome contains 118,237 bp of unique sequence flanked by multiple copies of a 1,213-bp terminal repeat. The GC content of the unique portion of

HERBERT W. VIRGIN; PHILIP LATREILLE; PAMELA WAMSLEY; KYMBERLIE HALLSWORTH; KAREN E. WECK; ALBERT J. DAL CANTO; SAMUEL H. SPECK

1997-01-01

390

Genome Sequence of the Biocontrol Strain Pseudomonas fluorescens F113  

PubMed Central

Pseudomonas fluorescens F113 is a plant growth-promoting rhizobacterium (PGPR) that has biocontrol activity against fungal plant pathogens and is a model for rhizosphere colonization. Here, we present its complete genome sequence, which shows that besides a core genome very similar to those of other strains sequenced within this species, F113 possesses a wide array of genes encoding specialized functions for thriving in the rhizosphere and interacting with eukaryotic organisms.

Redondo-Nieto, Miguel; Barret, Matthieu; Morrisey, John P.; Germaine, Kieran; Martinez-Granero, Francisco; Barahona, Emma; Navazo, Ana; Sanchez-Contreras, Maria; Moynihan, Jennifer A.; Giddens, Stephen R.; Coppoolse, Eric R.; Muriel, Candela; Stiekema, Willem J.; Rainey, Paul B.; Dowling, David; O'Gara, Fergal; Martin, Marta

2012-01-01

391

Non-invasive whole genome sequencing of a human fetus  

PubMed Central

Analysis of cell-free fetal DNA in maternal plasma holds great promise for the development of non-invasive prenatal genetic diagnostics. However, previous studies have been restricted to detection of fetal trisomies (1, 2) or specific, paternally inherited mutations (3), or to genotyping common polymorphisms using invasively sampled material (4). Here, we combine genome sequencing of two parents, genome-wide maternal haplotyping (5), and deep sequencing of maternal plasma to non-invasively determine the genome sequence of a human fetus at 18.5 weeks gestation. Inheritance was predicted at 2.8×10^6 parentally heterozygous sites with 98.1% accuracy. Furthermore, 39 of 44 de novo point mutations in the fetal genome were detected, albeit with limited specificity. Subsampling these data and analyzing a second family trio by the same approach indicate that ~300 kilobase parental haplotype blocks combined with shallow sequencing of maternal plasma are sufficient to substantially determine the inherited complement of a fetal genome. However, ultra-deep sequencing of maternal plasma is necessary for the practical detection of fetal de novo mutations genome-wide. Although technical and analytical challenges remain, we anticipate that non-invasive analysis of inherited variation and de novo mutations in fetal genomes will facilitate the comprehensive prenatal diagnosis of both recessive and dominant Mendelian disorders.

Kitzman, Jacob O.; Snyder, Matthew W.; Ventura, Mario; Lewis, Alexandra P.; Qiu, Ruolan; Simmons, LaVone E.; Gammill, Hilary S.; Rubens, Craig E.; Santillan, Donna A.; Murray, Jeffrey C.; Tabor, Holly K.; Bamshad, Michael J.; Eichler, Evan E.; Shendure, Jay

2012-01-01

392

Use of Whole Genome Sequence Data To Infer Baculovirus Phylogeny  

PubMed Central

Several phylogenetic methods based on whole genome sequence data were evaluated using data from nine complete baculovirus genomes. The utility of three independent character sets was assessed. The first data set comprised the sequences of the 63 genes common to these viruses. The second set of characters was based on gene order, and phylogenies were inferred using both breakpoint distance analysis and a novel method developed here, termed neighbor pair analysis. The third set recorded gene content by scoring gene presence or absence in each genome. All three data sets yielded phylogenies supporting the separation of the Nucleopolyhedrovirus (NPV) and Granulovirus (GV) genera, the division of the NPVs into groups I and II, and species relationships within group I NPVs. Generation of phylogenies based on the combined sequences of all 63 shared genes proved to be the most effective approach to resolving the relationships among the group II NPVs and the GVs. The history of gene acquisitions and losses that have accompanied baculovirus diversification was visualized by mapping the gene content data onto the phylogenetic tree. This analysis highlighted the fluid nature of baculovirus genomes, with evidence of frequent genome rearrangements and multiple gene content changes during their evolution. Of more than 416 genes identified in the genomes analyzed, only 63 are present in all nine genomes, and 200 genes are found only in a single genome. Despite this fluidity, the whole genome-based methods we describe are sufficiently powerful to recover the underlying phylogeny of the viruses.

Herniou, Elisabeth A.; Luque, Teresa; Chen, Xinwen; Vlak, Just M.; Winstanley, Doreen; Cory, Jennifer S.; O'Reilly, David R.

2001-01-01
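
The gene content character set described in this abstract (scoring gene presence or absence in each genome, and counting genes shared by all genomes or found in only one) can be illustrated with a small sketch. This is not the authors' method: neighbor pair analysis and breakpoint distance are not reproduced, the Jaccard distance is a substituted, generic measure, and the genome and gene names are hypothetical placeholders.

```python
from collections import Counter

def gene_content_summary(genomes):
    """Return (core genes, genes unique to one genome, total distinct genes)."""
    n = len(genomes)
    occurrences = Counter(g for genes in genomes.values() for g in genes)
    core = {g for g, c in occurrences.items() if c == n}
    unique = {g for g, c in occurrences.items() if c == 1}
    return core, unique, len(occurrences)

def presence_absence(genomes):
    """Build a 0/1 character matrix: one row per genome, one column per gene."""
    all_genes = sorted(set().union(*genomes.values()))
    matrix = {name: [int(g in genes) for g in all_genes]
              for name, genes in genomes.items()}
    return all_genes, matrix

def jaccard_distance(a, b):
    """Generic gene-content distance between two gene sets (an assumption)."""
    return 1 - len(a & b) / len(a | b)

# Toy data for illustration only.
genomes = {
    "virusA": {"gene1", "gene2", "gene3", "gene4"},
    "virusB": {"gene1", "gene2", "gene3", "gene5"},
    "virusC": {"gene1", "gene2", "gene6"},
}
core, unique, total = gene_content_summary(genomes)
columns, matrix = presence_absence(genomes)
d_ab = jaccard_distance(genomes["virusA"], genomes["virusB"])
print(sorted(core), sorted(unique), total, matrix["virusC"], d_ab)
```

A presence/absence matrix of this kind can be passed to standard parsimony or distance-based tree-building software, and the core set corresponds to the genes whose sequences could be concatenated for a sequence-based phylogeny.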

393

Genomic Rearrangements of PTEN in Prostate Cancer  

PubMed Central

The phosphatase and tensin homolog gene (PTEN) on chromosome 10q23.3 is a negative regulator of the PI3K/Akt survival pathway and is the most frequently deleted tumor suppressor gene in prostate cancer. Monoallelic loss of PTEN is present in up to 60% of localized prostate cancers and complete loss of PTEN in prostate cancer is linked to metastasis and androgen-independent progression. Studies on the genomic status of PTEN in prostate cancer initially used a two-color fluorescence in situ hybridization (FISH) assay for PTEN copy number detection in formalin-fixed paraffin-embedded tissue preparations. More recently, a four-color FISH assay containing two additional control probes flanking the PTEN locus with a lower false-positive rate was reported. Combined with the detection of other critical genomic biomarkers for prostate cancer such as ERG, androgen receptor, and MYC, the evaluation of PTEN genomic status has proven to be invaluable for patient stratification and management. Although less frequent than allelic deletions, point mutations in the gene and epigenetic silencing are also known to contribute to loss of PTEN function, and ultimately to prostate cancer initiation. Overall, it is clear that PTEN is a powerful biomarker for prostate cancer. Used as a companion diagnostic for emerging therapeutic drugs, FISH analysis of PTEN is promisingly moving human prostate cancer closer to more effective cancer management and therapies.

Phin, Sopheap; Moore, Mathew W.; Cotter, Philip D.

2013-01-01

394

Sequencing viral genomes from a single isolated plaque  

PubMed Central

Background: Whole genome sequencing of viruses and bacteriophages is often hindered because of the need for large quantities of genomic material. A method is described that combines single plaque sequencing with an optimization of Sequence Independent Single Primer Amplification (SISPA). This method can be used for de novo whole genome next-generation sequencing of any cultivable virus without the need for large-scale production of viral stocks or viral purification using centrifugal techniques. Methods: A single viral plaque of a variant of the 2009 pandemic H1N1 human Influenza A virus was isolated and amplified using the optimized SISPA protocol. The sensitivity of the SISPA protocol presented here was tested with bacteriophage F_HA0480sp/Pa1651 DNA. The amplified products were sequenced with 454 and Illumina HiSeq platforms. Mapping and de novo assemblies were performed to analyze the quality of data produced from this optimized method. Results: Analysis of the sequence data demonstrated that from a single viral plaque of Influenza A, a mapping assembly with 3590-fold average coverage representing 100% of the genome could be produced. The de novo assembled data produced contigs with 30-fold average sequence coverage, representing 96.5% of the genome. Using only 10 pg of starting DNA from bacteriophage F_HA0480sp/Pa1651 in the SISPA protocol resulted in sequencing data that gave a mapping assembly with 3488-fold average sequence coverage, representing 99.9% of the reference and a de novo assembly with 45-fold average sequence coverage, representing 98.1% of the genome. Conclusions: The optimized SISPA protocol presented here produces amplified product that when sequenced will give high quality data that can be used for de novo assembly. The protocol requires only a single viral plaque or as little as 10 pg of DNA template, which will facilitate rapid identification of viruses during an outbreak and viruses that are difficult to propagate.

2013-01-01
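
The abstract reports two quantities for each assembly: average fold coverage and the percentage of the genome represented. A minimal sketch of how such figures are commonly derived from read alignments follows. The interval representation, the function names and the toy numbers are assumptions, and this is not the analysis pipeline used in the study.

```python
def depth_profile(alignments, genome_length):
    """Per-base depth from half-open (start, end) alignment intervals."""
    depth = [0] * genome_length
    for start, end in alignments:
        for pos in range(max(start, 0), min(end, genome_length)):
            depth[pos] += 1
    return depth

def coverage_stats(depth):
    """Return (average fold coverage, fraction of positions with depth > 0)."""
    mean_depth = sum(depth) / len(depth)
    breadth = sum(1 for d in depth if d > 0) / len(depth)
    return mean_depth, breadth

# Toy 10-base genome with three aligned reads, for illustration only.
depth = depth_profile([(0, 5), (3, 8), (4, 9)], genome_length=10)
mean_depth, breadth = coverage_stats(depth)
print(depth, mean_depth, breadth)
```

The two numbers answer different questions: a very high average depth can coexist with gaps, which is why the abstract quotes both fold coverage and percentage of the genome.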

395

Large-Scale Sequencing: The Future of Genomic Sciences Colloquium  

SciTech Connect

Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. 
A coordinated sequencing effort of cultured organisms is an appropriate place to begin, since not only are their genomes available, but they are also accompanied by data on environment and physiology that can be used to understand the resulting data. As single cell isolation methods improve, there should be a shift toward incorporating uncultured organisms and communities into this effort. Efforts to sequence cultivated isolates should target characterized isolates from culture collections for which biochemical data are available, as well as other cultures of lasting value from personal collections. The genomes of type strains should be among the first targets for sequencing, but creative culture methods, novel cell isolation, and sorting methods would all be helpful in obtaining organisms we have not yet been able to cultivate for sequencing. The data that should be provided for strains targeted for sequencing will depend on the phylogenetic context of the organism and the amount of information available about its nearest relatives. Annotation is an important part of transforming genome sequences into useful resources, but it represents the most significant bottleneck to the field of comparative genomics right now and must be addressed. Furthermore, there is a need for more consistency in both annotation and archiving annotation data. As new annotation tools become available over time, re-annotation of genomes should be implemented, taking advantage of advancements in annotation techniques in order to capitalize on the genome sequences and increase both the societal and scientific benefit of genomics work. Given the proper resources, the knowledge and ability exist to select model systems, some simple, some less so, and dissect them so that we may understand the processes and interactions at work in them. 
Colloquium participants suggest a five-pronged, coordinated initiative to exhaustively describe six different microbial ecosystems, designed to describe all the gene diversity, across genomes. In this effort, sequencing should be complemented by other experimental data, particularly transcriptomics and metabolomics data, all of which

Margaret Riley; Merry Buckley

2009-01-01

396

Genome sequence of the date palm Phoenix dactylifera L  

PubMed Central

Date palm (Phoenix dactylifera L.) is a cultivated woody plant species with agricultural and economic importance. Here we report a genome assembly for an elite variety (Khalas), which is 605.4 Mb in size and covers >90% of the genome (~671 Mb) and >96% of its genes (~41,660 genes). Genomic sequence analysis demonstrates that P. dactylifera experienced a clear genome-wide duplication after either ancient whole genome duplications or massive segmental duplications. Genetic diversity analysis indicates that its stress resistance and sugar metabolism-related genes tend to be enriched in the chromosomal regions where the density of single-nucleotide polymorphisms is relatively low. Using transcriptomic data, we also illustrate the date palm’s unique sugar metabolism that underlies fruit development and ripening. Our large-scale genomic and transcriptomic data pave the way for further genomic studies not only on P. dactylifera but also other Arecaceae plants.

Al-Mssallem, Ibrahim S.; Hu, Songnian; Zhang, Xiaowei; Lin, Qiang; Liu, Wanfei; Tan, Jun; Yu, Xiaoguang; Liu, Jiucheng; Pan, Linlin; Zhang, Tongwu; Yin, Yuxin; Xin, Chengqi; Wu, Hao; Zhang, Guangyu; Ba Abdullah, Mohammed M.; Huang, Dawei; Fang, Yongjun; Alnakhli, Yasser O.; Jia, Shangang; Yin, An; Alhuzimi, Eman M.; Alsaihati, Burair A.; Al-Owayyed, Saad A.; Zhao, Duojun; Zhang, Sun; Al-Otaibi, Noha A.; Sun, Gaoyuan; Majrashi, Majed A.; Li, Fusen; Tala; Wang, Jixiang; Yun, Quanzheng; Alnassar, Nafla A.; Wang, Lei; Yang, Meng; Al-Jelaify, Rasha F.; Liu, Kan; Gao, Shenghan; Chen, Kaifu; Alkhaldi, Samiyah R.; Liu, Guiming; Zhang, Meng; Guo, Haiyan; Yu, Jun

2013-01-01
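
The coverage figure quoted in the abstract above can be checked with simple arithmetic. The following is a minimal sketch in Python, not part of the original record; the variable and function names are hypothetical, and the genome size is the approximate value ("~671 Mb") given in the abstract.

```python
# Figures quoted in the date palm abstract; all names here are illustrative.
assembly_mb = 605.4   # reported assembly size, in Mb
genome_mb = 671.0     # approximate genome size ("~671 Mb")


def coverage_fraction(assembled_mb: float, total_mb: float) -> float:
    """Return the fraction of the estimated genome covered by the assembly."""
    return assembled_mb / total_mb


assembly_coverage = coverage_fraction(assembly_mb, genome_mb)
print(f"Assembly covers {assembly_coverage:.1%} of the estimated genome")
```

The result is consistent with the ">90%" stated in the abstract, bearing in mind that the genome size itself is an estimate.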

397

Genome sequence of the date palm Phoenix dactylifera L.  

PubMed

Date palm (Phoenix dactylifera L.) is a cultivated woody plant species with agricultural and economic importance. Here we report a genome assembly for an elite variety (Khalas), which is 605.4 Mb in size and covers >90% of the genome (~671 Mb) and >96% of its genes (~41,660 genes). Genomic sequence analysis demonstrates that P. dactylifera experienced a clear genome-wide duplication after either ancient whole genome duplications or massive segmental duplications. Genetic diversity analysis indicates that its stress resistance and sugar metabolism-related genes tend to be enriched in the chromosomal regions where the density of single-nucleotide polymorphisms is relatively low. Using transcriptomic data, we also illustrate the date palm's unique sugar metabolism that underlies fruit development and ripening. Our large-scale genomic and transcriptomic data pave the way for further genomic studies not only on P. dactylifera but also other Arecaceae plants. PMID:23917264

Al-Mssallem, Ibrahim S; Hu, Songnian; Zhang, Xiaowei; Lin, Qiang; Liu, Wanfei; Tan, Jun; Yu, Xiaoguang; Liu, Jiucheng; Pan, Linlin; Zhang, Tongwu; Yin, Yuxin; Xin, Chengqi; Wu, Hao; Zhang, Guangyu; Ba Abdullah, Mohammed M; Huang, Dawei; Fang, Yongjun; Alnakhli, Yasser O; Jia, Shangang; Yin, An; Alhuzimi, Eman M; Alsaihati, Burair A; Al-Owayyed, Saad A; Zhao, Duojun; Zhang, Sun; Al-Otaibi, Noha A; Sun, Gaoyuan; Majrashi, Majed A; Li, Fusen; Tala; Wang, Jixiang; Yun, Quanzheng; Alnassar, Nafla A; Wang, Lei; Yang, Meng; Al-Jelaify, Rasha F; Liu, Kan; Gao, Shenghan; Chen, Kaifu; Alkhaldi, Samiyah R; Liu, Guiming; Zhang, Meng; Guo, Haiyan; Yu, Jun

2013-01-01

398

Identification of genomic alterations in oesophageal squamous cell cancer.  

PubMed

Oesophageal cancer is one of the most aggressive cancers and is the sixth leading cause of cancer death worldwide. Approximately 70% of global oesophageal cancer cases occur in China, with oesophageal squamous cell carcinoma (ESCC) being the histopathological form in the vast majority of cases (>90%). Currently, there are limited clinical approaches for the early diagnosis and treatment of ESCC, resulting in a 10% five-year survival rate for patients. However, the full repertoire of genomic events leading to the pathogenesis of ESCC remains unclear. Here we describe a comprehensive genomic analysis of 158 ESCC cases, as part of the International Cancer Genome Consortium research project. We conducted whole-genome sequencing in 17 ESCC cases and whole-exome sequencing in 71 cases, of which 53 cases, plus an additional 70 ESCC cases not used in the whole-genome and whole-exome sequencing, were subjected to array comparative genomic hybridization analysis. We identified eight significantly mutated genes, of which six are well known tumour-associated genes (TP53, RB1, CDKN2A, PIK3CA, NOTCH1, NFE2L2), and two have not previously been described in ESCC (ADAM29 and FAM135B). Notably, FAM135B is identified as a novel cancer-implicated gene as assayed for its ability to promote malignancy of ESCC cells. Additionally, MIR548K, a microRNA encoded in the amplified 11q13.3-13.4 region, is characterized as a novel oncogene, and functional assays demonstrate that MIR548K enhances malignant phenotypes of ESCC cells. Moreover, we have found that several important histone regulator genes (MLL2 (also called KMT2D), ASH1L, MLL3 (KMT2C), SETD1B, CREBBP and EP300) are frequently altered in ESCC. Pathway assessment reveals that somatic aberrations are mainly involved in the Wnt, cell cycle and Notch pathways. Genomic analyses suggest that ESCC and head and neck squamous cell carcinoma share some common pathogenic mechanisms, and ESCC development is associated with alcohol drinking. 
This study has explored novel biological markers and tumorigenic pathways that would greatly improve therapeutic strategies for ESCC. PMID:24670651

Song, Yongmei; Li, Lin; Ou, Yunwei; Gao, Zhibo; Li, Enmin; Li, Xiangchun; Zhang, Weimin; Wang, Jiaqian; Xu, Liyan; Zhou, Yong; Ma, Xiaojuan; Liu, Lingyan; Zhao, Zitong; Huang, Xuanlin; Fan, Jing; Dong, Lijia; Chen, Gang; Ma, Liying; Yang, Jie; Chen, Longyun; He, Minghui; Li, Miao; Zhuang, Xuehan; Huang, Kai; Qiu, Kunlong; Yin, Guangliang; Guo, Guangwu; Feng, Qiang; Chen, Peishan; Wu, Zhiyong; Wu, Jianyi; Ma, Ling; Zhao, Jinyang; Luo, Longhai; Fu, Ming; Xu, Bainan; Chen, Bo; Li, Yingrui; Tong, Tong; Wang, Mingrong; Liu, Zhihua; Lin, Dongxin; Zhang, Xiuqing; Yang, Huanming; Wang, Jun; Zhan, Qimin

2014-05-01
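
The cohort breakdown in the abstract above can be reconciled with the stated total of 158 cases. The sketch below is an illustrative tally in Python, not taken from the paper; the variable names are hypothetical, and it assumes the whole-genome and whole-exome sets do not overlap, which is what the reported total implies.

```python
# Case counts quoted in the ESCC abstract; names are illustrative only.
wgs_cases = 17             # whole-genome sequenced
wes_cases = 71             # whole-exome sequenced
acgh_from_sequenced = 53   # sequenced cases also run on array CGH
acgh_only_cases = 70       # additional cases analysed by array CGH only
reported_total = 158       # total stated in the abstract

# Assumption: the WGS and WES sets are disjoint.
sequenced_cases = wgs_cases + wes_cases
total_cases = sequenced_cases + acgh_only_cases
acgh_cases = acgh_from_sequenced + acgh_only_cases

print(f"Sequenced cases: {sequenced_cases}")
print(f"Array CGH cases: {acgh_cases}")
print(f"Total cases: {total_cases} (reported: {reported_total})")
```

Under that assumption the counts add up to the reported 158, with 123 cases contributing array CGH data.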

399

Complete chloroplast genome sequences of Solanum bulbocastanum, Solanum lycopersicum and comparative analyses with other Solanaceae genomes  

Microsoft Academic Search

Despite the agricultural importance of both potato and tomato, very little is known about their chloroplast genomes. Analysis of the complete sequences of tomato, potato, tobacco, and Atropa chloroplast genomes reveals significant insertions and deletions within certain coding regions or regulatory sequences (e.g., deletion of repeated sequences within 16S rRNA, ycf2 or ribosomal binding sites in ycf2). RNA, photosynthesis, and

Henry Daniell; Seung-Bum Lee; Justin Grevich; Christopher Saski; Tania Quesada-Vargas; Chittibabu Guda; Jeffrey Tomkins; Robert K. Jansen

2006-01-01

400

Decoding the genome beyond sequencing: the new phase of genomic research.  

PubMed

While our understanding of gene-based biology has greatly improved, it is clear that the function of the genome and most diseases cannot be fully explained by genes and other regulatory elements. Genes and the genome represent distinct levels of genetic organization with their own coding systems; genes code parts like protein and RNA, but the genome codes the structure of genetic networks, which are defined by the whole set of genes, chromosomes and their topological interactions within a cell. Accordingly, the genetic code of DNA offers limited understanding of genome functions. In this perspective, we introduce the genome theory, which calls for a departure from gene-centric genomic research. To make this transition for the next phase of genomic research, it is essential to acknowledge the importance of new genome-based biological concepts and to establish new technology platforms to decode the genome beyond sequencing. PMID:21640814

Heng, Henry H Q; Liu, Guo; Stevens, Joshua B; Bremer, Steven W; Ye, Karen J; Abdallah, Batoul Y; Horne, Steven D; Ye, Christine J

2011-10-01

401

Characterizing the walnut genome through analyses of BAC end sequences.  

PubMed

Persian walnut (Juglans regia L.) is an economically important tree for its nut crop and timber. To gain insight into the structure and evolution of the walnut genome, we constructed two bacterial artificial chromosome (BAC) libraries, containing a total of 129,024 clones, from in vitro-grown shoots of J. regia cv. Chandler using the HindIII and MboI cloning sites. A total of 48,218 high-quality BAC end sequences (BESs) were generated, with an accumulated sequence length of 31.2 Mb, representing approximately 5.1% of the walnut genome. Analysis of repeat DNA content in BESs revealed that approximately 15.42% of the genome consists of known repetitive DNA, while walnut-unique repetitive DNA identified in this study constitutes 13.5% of the genome. Among the walnut-unique repetitive DNA, Julia SINE and JrTRIM elements represent the first identified walnut short interspersed element (SINE) and terminal-repeat retrotransposon in miniature (TRIM) element, respectively; both types of elements are abundant in the genome. As in other species, these SINEs and TRIM elements could be exploited for developing repeat DNA-based molecular markers in walnut. Simple sequence repeats (SSR) from BESs were analyzed and found to be more abundant in BESs than in expressed sequence tags. The density of SSR in the walnut genome analyzed was also slightly higher than that in poplar and papaya. Sequence analysis of BESs indicated that approximately 11.5% of the walnut genome represents a coding sequence. This study is an initial characterization of the walnut genome and provides the largest genomic resource currently available; as such, it will be a valuable tool in studies aimed at genetically improving walnut. PMID:22101470

Wu, Jiajie; Gu, Yong Q; Hu, Yuqin; You, Frank M; Dandekar, Abhaya M; Leslie, Charles A; Aradhya, Mallikarjuna; Dvorak, Jan; Luo, Ming-Cheng

2012-01-01
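
Several quantities follow directly from the figures in the walnut abstract above. The sketch below is a back-of-the-envelope calculation in Python, not part of the original study; the names are hypothetical, it takes 1 Mb as 10^6 bp, and it assumes the two repeat categories do not overlap when summing them.

```python
# Figures quoted in the walnut BES abstract; names are illustrative only.
bes_count = 48_218               # high-quality BAC end sequences
bes_total_mb = 31.2              # accumulated BES length, in Mb
genome_fraction = 0.051          # BESs represent ~5.1% of the genome
known_repeat_pct = 15.42         # known repetitive DNA, % of genome
walnut_unique_repeat_pct = 13.5  # walnut-unique repetitive DNA, % of genome


def mean_read_length_bp(total_mb: float, n_reads: int) -> float:
    """Mean sequence length in bp, taking 1 Mb as 1,000,000 bp."""
    return total_mb * 1_000_000 / n_reads


def implied_genome_size_mb(sampled_mb: float, fraction: float) -> float:
    """Genome size implied by a sample covering `fraction` of the genome."""
    return sampled_mb / fraction


mean_len = mean_read_length_bp(bes_total_mb, bes_count)
genome_mb = implied_genome_size_mb(bes_total_mb, genome_fraction)
# Assumption: the two repeat categories are non-overlapping.
total_repeat_pct = known_repeat_pct + walnut_unique_repeat_pct

print(f"Mean BES length: {mean_len:.0f} bp")
print(f"Implied genome size: {genome_mb:.0f} Mb")
print(f"Total repetitive DNA: {total_repeat_pct:.2f}% of the genome")
```

These are rough estimates derived only from the rounded values in the abstract; the implied genome size in particular depends on the "approximately 5.1%" figure.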

402

Draft Genome Sequence of Campylobacter ureolyticus Strain CIT007, the First Whole-Genome Sequence of a Clinical Isolate.  

PubMed

Herein, we present the draft genome sequence of Campylobacter ureolyticus. Strain CIT007 was isolated from a stool sample from an elderly female presenting with diarrheal illness and end-stage chronic renal disease. PMID:24723712

Lucid, Alan; Bullman, Susan; Koziel, Monika; Corcoran, Gerard D; Cotter, Paul D; Sleator, Roy D; Lucey, Brigid

2014-01-01