Sample records for cancer genome sequences

  1. Cancer genome-sequencing study design.

    PubMed

    Mwenifumbo, Jill C; Marra, Marco A

    2013-05-01

    Discoveries from cancer genome sequencing have the potential to translate into advances in cancer prevention, diagnostics, prognostics, treatment and basic biology. Given the diversity of downstream applications, cancer genome-sequencing studies need to be designed to best fulfil specific aims. Knowledge of second-generation cancer genome-sequencing study design also facilitates assessment of the validity and importance of the rapidly growing number of published studies. In this Review, we focus on the practical application of second-generation sequencing technology (also known as next-generation sequencing) to cancer genomics and discuss how aspects of study design and methodological considerations - such as the size and composition of the discovery cohort - can be tailored to serve specific research aims. PMID:23594910

  2. Science Originals: Sequencing Cancer Genomes: Targeted Cancer Therapies

    NSDL National Science Digital Library

    Robert Frederick (AAAS; )

    2011-03-25

    Applying DNA sequencing to cancer genomes is providing insights that have allowed researchers to turn some cancers into chronic diseases rather than deadly ones. Still, the ultimate goal is to kill the cancer.

  3. Advances in understanding cancer genomes through second-generation sequencing

    Microsoft Academic Search

    Stacey Gabriel; Gad Getz; Matthew Meyerson

    2010-01-01

    Cancers are caused by the accumulation of genomic alterations. Therefore, analyses of cancer genome sequences and structures provide insights for understanding cancer biology, diagnosis and therapy. The application of second-generation DNA sequencing technologies (also known as next-generation sequencing) — through whole-genome, whole-exome and whole-transcriptome approaches — is allowing substantial advances in cancer genomics. These methods are facilitating an increase in

  4. Comprehensive Genome Sequence Analysis of a Breast Cancer Amplicon

    PubMed Central

    Collins, Colin; Volik, Stanislav; Kowbel, David; Ginzinger, David; Ylstra, Bauke; Cloutier, Thomas; Hawkins, Trevor; Predki, Paul; Martin, Christopher; Wernick, Meredith; Kuo, Wen-Lin; Alberts, Arthur; Gray, Joe W.

    2001-01-01

    Gene amplification occurs in most solid tumors and is associated with poor prognosis. Amplification of 20q13.2 is common to several tumor types including breast cancer. The 1 Mb of sequence spanning the 20q13.2 breast cancer amplicon is one of the most exhaustively studied segments of the human genome. These studies have included amplicon mapping by comparative genomic hybridization (CGH), fluorescent in-situ hybridization (FISH), array-CGH, quantitative microsatellite analysis (QUMA), and functional genomic studies. Together these studies revealed a complex amplicon structure suggesting the presence of at least two driver genes in some tumors. One of these, ZNF217, is capable of immortalizing human mammary epithelial cells (HMEC) when overexpressed. In addition, we now report on the sequencing of this region in human and mouse, and on quantitative expression studies in tumors. Amplicon localization now is straightforward and the availability of human and mouse genomic sequence facilitates their functional analysis. However, comprehensive annotation of megabase-scale regions requires integration of vast amounts of information. We present a system for integrative analysis and demonstrate its utility on 1.2 Mb of sequence spanning the 20q13.2 breast cancer amplicon and 865 kb of syntenic murine sequence. We integrate tumor genome copy number measurements with exhaustive genome landscape mapping, showing that amplicon boundaries are associated with maxima in repetitive element density and a region of evolutionary instability. This integration of comprehensive sequence annotation, quantitative expression analysis, and tumor amplicon boundaries provides evidence for an additional driver gene, prefoldin 4 (PFDN4), coregulated genes, and conserved noncoding regions, and associates repetitive elements with regions of genomic instability at this locus. PMID:11381030

  5. Genome Sequencing Centers

    Cancer.gov

    The Cancer Genome Atlas (TCGA) Genome Sequencing Centers (GSCs) perform large-scale DNA sequencing using the latest sequencing technologies. Supported by the National Human Genome Research Institute (NHGRI) large-scale sequencing program, the GSCs generate the enormous volume of data required by TCGA, while continually improving existing technologies and methods to expand the frontier of what can be achieved in cancer genome sequencing.

  6. Returning individual research results for genome sequences of pancreatic cancer

    PubMed Central

    2014-01-01

    Background: Disclosure of individual results to participants in genomic research is a complex and contentious issue. There are many existing commentaries and opinion pieces on the topic, but little empirical data concerning actual cases describing how individual results have been returned. Thus, the real-life risks and benefits of disclosing individual research results to participants are rarely if ever presented as part of this debate. Methods: The Australian Pancreatic Cancer Genome Initiative (APGI) is an Australian contribution to the International Cancer Genome Consortium (ICGC) that involves prospective sequencing of tumor and normal genomes of study participants with pancreatic cancer in Australia. We present three examples that illustrate different facets of how research results may arise, and how they may be returned to individuals within an ethically defensible and clinically practical framework. This framework includes the necessary elements identified by others including consent, determination of the significance of results and which to return, delineation of the responsibility for communication and the clinical pathway for managing the consequences of returning results. Results: Of 285 recruited patients, we returned results to a total of 25 with no adverse events to date. These included four that were classified as medically actionable, nine as clinically significant and eight that were returned at the request of the treating clinician. Case studies presented depict instances where research results impacted on cancer susceptibility, current treatment and diagnosis, and illustrate key practical challenges of developing an effective framework. Conclusions: We suggest that return of individual results is both feasible and ethically defensible but only within the context of a robust framework that involves a close relationship between researchers and clinicians. PMID:24963353

  7. Identifying driver mutations in sequenced cancer genomes: computational approaches to enable

    E-print Network

    Raphael, Ben J.

    Review. Identifying driver mutations in sequenced cancer genomes: computational approaches to enable precision medicine. Benjamin J Raphael, Jason R Dobson, Layla Oesper and Fabio Vandin. Abstract: [...] mutations. Here, we review computational approaches to identify somatic mutations in cancer genome sequences

  8. Will we cure cancer by sequencing thousands of genomes?

    PubMed Central

    2013-01-01

    The promise to understand cancer and develop efficacious therapies by sequencing thousands of cancers has not been fulfilled. Mutations in specific genes termed oncogenes and tumor suppressor genes are extremely heterogeneous amongst the same type of cancer as well as between cancers. They provide little selective advantage to the cancer and in functional tests have yet to be shown to be sufficient for transformation. Here I discuss the karyotypic theory of cancer and ask if it is time for a new approach to understanding and ultimately treating cancer. PMID:24330806

  9. Integrated Analysis of Whole Genome and Transcriptome Sequencing Reveals Diverse Transcriptomic Aberrations Driven by Somatic Genomic Changes in Liver Cancers

    PubMed Central

    Shiraishi, Yuichi; Fujimoto, Akihiro; Furuta, Mayuko; Tanaka, Hiroko; Chiba, Ken-ichi; Boroevich, Keith A.; Abe, Tetsuo; Kawakami, Yoshiiku; Ueno, Masaki; Gotoh, Kunihito; Ariizumi, Shun-ichi; Shibuya, Tetsuo; Nakano, Kaoru; Sasaki, Aya; Maejima, Kazuhiro; Kitada, Rina; Hayami, Shinya; Shigekawa, Yoshinobu; Marubashi, Shigeru; Yamada, Terumasa; Kubo, Michiaki; Ishikawa, Osamu; Aikata, Hiroshi; Arihiro, Koji; Ohdan, Hideki; Yamamoto, Masakazu; Yamaue, Hiroki; Chayama, Kazuaki; Tsunoda, Tatsuhiko; Miyano, Satoru; Nakagawa, Hidewaki

    2014-01-01

    Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, the transcriptional consequences of these genomic alterations in cancer genomes remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on the affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with/without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate that genomic alterations in cancer genomes have diverse transcriptomic effects, and that integrated analysis of WGS and RNA-Seq can facilitate the interpretation of the large number of genomic alterations detected in cancer genomes. PMID:25526364

  10. Whole genome sequencing as a means to assess pathogenic mutations in medical genetics and cancer.

    PubMed

    Royer-Bertrand, Beryl; Rivolta, Carlo

    2015-04-01

    The past decade has seen the emergence of next-generation sequencing (NGS) technologies, which have revolutionized the field of human molecular genetics. With NGS, significant portions of the human genome can now be assessed by direct sequence analysis, highlighting normal and pathological variants of our DNA. Recent advances have also allowed the sequencing of complete genomes, by a method referred to as whole genome sequencing (WGS). In this work, we review the use of WGS in medical genetics, with specific emphasis on the benefits and the disadvantages of this technique for detecting genomic alterations leading to Mendelian human diseases and to cancer. PMID:25548800

  11. U87MG Decoded: The Genomic Sequence of a Cytogenetically Aberrant Human Cancer Cell Line

    Microsoft Academic Search

    Michael James Clark; Nils Homer; Brian D. O'Connor; Zugen Chen; Ascia Eskin; Hane Lee; Barry Merriman; Stanley F. Nelson

    2010-01-01

    U87MG is a commonly studied grade IV glioma cell line that has been analyzed in at least 1,700 publications over four decades. In order to comprehensively characterize the genome of this cell line and to serve as a model of broad cancer genome sequencing, we have generated greater than 30× genomic sequence coverage using a novel 50-base mate paired strategy

  12. Genome Sequencing and Analysis of the Tasmanian Devil and Its Transmissible Cancer

    PubMed Central

    Murchison, Elizabeth P.; Schulz-Trieglaff, Ole B.; Ning, Zemin; Alexandrov, Ludmil B.; Bauer, Markus J.; Fu, Beiyuan; Hims, Matthew; Ding, Zhihao; Ivakhno, Sergii; Stewart, Caitlin; Ng, Bee Ling; Wong, Wendy; Aken, Bronwen; White, Simon; Alsop, Amber; Becq, Jennifer; Bignell, Graham R.; Cheetham, R. Keira; Cheng, William; Connor, Thomas R.; Cox, Anthony J.; Feng, Zhi-Ping; Gu, Yong; Grocock, Russell J.; Harris, Simon R.; Khrebtukova, Irina; Kingsbury, Zoya; Kowarsky, Mark; Kreiss, Alexandre; Luo, Shujun; Marshall, John; McBride, David J.; Murray, Lisa; Pearse, Anne-Maree; Raine, Keiran; Rasolonjatovo, Isabelle; Shaw, Richard; Tedder, Philip; Tregidgo, Carolyn; Vilella, Albert J.; Wedge, David C.; Woods, Gregory M.; Gormley, Niall; Humphray, Sean; Schroth, Gary; Smith, Geoffrey; Hall, Kevin; Searle, Stephen M.J.; Carter, Nigel P.; Papenfuss, Anthony T.; Futreal, P. Andrew; Campbell, Peter J.; Yang, Fengtang; Bentley, David R.; Evers, Dirk J.; Stratton, Michael R.

    2012-01-01

    Summary The Tasmanian devil (Sarcophilus harrisii), the largest marsupial carnivore, is endangered due to a transmissible facial cancer spread by direct transfer of living cancer cells through biting. Here we describe the sequencing, assembly, and annotation of the Tasmanian devil genome and whole-genome sequences for two geographically distant subclones of the cancer. Genomic analysis suggests that the cancer first arose from a female Tasmanian devil and that the clone has subsequently genetically diverged during its spread across Tasmania. The devil cancer genome contains more than 17,000 somatic base substitution mutations and bears the imprint of a distinct mutational process. Genotyping of somatic mutations in 104 geographically and temporally distributed Tasmanian devil tumors reveals the pattern of evolution and spread of this parasitic clonal lineage, with evidence of a selective sweep in one geographical area and persistence of parallel lineages in other populations. PMID:22341448

  13. Comprehensive characterization of complex structural variations in cancer by directly comparing genome sequence reads.

    PubMed

    Moncunill, Valentí; Gonzalez, Santi; Beà, Sílvia; Andrieux, Lise O; Salaverria, Itziar; Royo, Cristina; Martinez, Laura; Puiggròs, Montserrat; Segura-Wang, Maia; Stütz, Adrian M; Navarro, Alba; Royo, Romina; Gelpí, Josep L; Gut, Ivo G; López-Otín, Carlos; Orozco, Modesto; Korbel, Jan O; Campo, Elias; Puente, Xose S; Torrents, David

    2014-11-01

    The development of high-throughput sequencing technologies has advanced our understanding of cancer. However, characterizing somatic structural variants in tumor genomes is still challenging because current strategies depend on the initial alignment of reads to a reference genome. Here, we describe SMUFIN (somatic mutation finder), a single program that directly compares sequence reads from normal and tumor genomes to accurately identify and characterize a range of somatic sequence variation, from single-nucleotide variants (SNV) to large structural variants at base pair resolution. Performance tests on modeled tumor genomes showed average sensitivity of 92% and 74% for SNVs and structural variants, with specificities of 95% and 91%, respectively. Analyses of aggressive forms of solid and hematological tumors revealed that SMUFIN identifies breakpoints associated with chromothripsis and chromoplexy with high specificity. SMUFIN provides an integrated solution for the accurate, fast and comprehensive characterization of somatic sequence variation in cancer. PMID:25344728
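
    The sensitivity and specificity figures quoted in the record above follow from standard confusion-matrix arithmetic. A minimal sketch in Python, assuming the textbook definitions (the SMUFIN authors may define specificity differently, for example as precision); the class name, function names and counts are hypothetical and chosen only to illustrate the calculation:

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from comparing variant calls against a known truth set."""
    true_pos: int
    false_pos: int
    true_neg: int
    false_neg: int


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of real variants that were called: TP / (TP + FN)."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of non-variant sites left uncalled: TN / (TN + FP)."""
    return c.true_neg / (c.true_neg + c.false_pos)


# Hypothetical counts, not taken from the paper.
example = ConfusionCounts(true_pos=92, false_pos=5, true_neg=95, false_neg=8)
print(f"sensitivity = {sensitivity(example):.2f}")
print(f"specificity = {specificity(example):.2f}")
```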

  14. Detection and Mapping of Amplified DNA Sequences in Breast Cancer by Comparative Genomic Hybridization

    Microsoft Academic Search

    Anne Kallioniemi; Olli-Pekka Kallioniemi; Jim Piper; Minna Tanner; Trond Stokke; Ling Chen; Helene S. Smith; Dan Pinkel; Joe W. Gray; Frederic M. Waldman

    1994-01-01

    Comparative genomic hybridization was applied to 5 breast cancer cell lines and 33 primary tumors to discover and map regions of the genome with increased DNA-sequence copy-number. Two-thirds of primary tumors and almost all cell lines showed increased DNA-sequence copy-number affecting a total of 26 chromosomal subregions. Most of these loci were distinct from those of currently known amplified genes

  15. Discrepancies in cancer genomic sequencing highlight opportunities for driver mutation discovery.

    PubMed

    Hudson, Andrew M; Yates, Tim; Li, Yaoyong; Trotter, Eleanor W; Fawdar, Shameem; Chapman, Phil; Lorigan, Paul; Biankin, Andrew; Miller, Crispin J; Brognard, John

    2014-11-15

    Cancer genome sequencing is being used at an increasing rate to identify actionable driver mutations that can inform therapeutic intervention strategies. A comparison of two of the most prominent cancer genome sequencing databases from different institutes (Cancer Cell Line Encyclopedia and Catalogue of Somatic Mutations in Cancer) revealed marked discrepancies in the detection of missense mutations in identical cell lines (57.38% conformity). The main reason for this discrepancy is inadequate sequencing of GC-rich areas of the exome. We have therefore mapped over 400 regions of consistent inadequate sequencing (cold-spots) in known cancer-causing genes and kinases, in 368 of which neither institute finds mutations. We demonstrate, using a newly identified PAK4 mutation as proof of principle, that specific targeting and sequencing of these GC-rich cold-spot regions can lead to the identification of novel driver mutations in known tumor suppressors and oncogenes. We highlight that cross-referencing between genomic databases is required to comprehensively assess genomic alterations in commonly used cell lines and that there are still significant opportunities to identify novel drivers of tumorigenesis in poorly sequenced areas of the exome. Finally, we assess other reasons for the observed discrepancy, such as variations in dbSNP filtering and the acquisition/loss of mutations, to give explanations as to why there is a discrepancy in pharmacogenomic studies, given recent concerns with poor reproducibility of data. PMID:25256751
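
    One way to quantify agreement between two mutation databases for the same cell line is sketched below in Python. The abstract above does not state how its conformity percentage was computed, so the definition used here (shared calls divided by all distinct calls) is an assumption, and the gene labels are placeholders rather than real data:

```python
def conformity(calls_a: set[str], calls_b: set[str]) -> float:
    """Percentage of distinct mutation calls that both sources report.

    Assumed definition: 100 * |A intersect B| / |A union B|.
    """
    union = calls_a | calls_b
    if not union:
        # Two empty call sets agree trivially.
        return 100.0
    return 100.0 * len(calls_a & calls_b) / len(union)


# Hypothetical missense calls for one cell line in two databases.
db_one = {"GENE1:p.A10V", "GENE2:p.R50H", "GENE3:p.G12D"}
db_two = {"GENE1:p.A10V", "GENE3:p.G12D", "GENE4:p.L8P"}
print(round(conformity(db_one, db_two), 2))
```

    Calls found in only one source (here the GENE2 and GENE4 placeholders) are the ones a cross-referencing step would flag for closer inspection.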

  16. Whole-genome sequencing identifies genomic heterogeneity at a nucleotide and chromosomal level in bladder cancer

    PubMed Central

    Morrison, Carl D.; Liu, Pengyuan; Woloszynska-Read, Anna; Zhang, Jianmin; Luo, Wei; Qin, Maochun; Bshara, Wiam; Conroy, Jeffrey M.; Sabatini, Linda; Vedell, Peter; Xiong, Donghai; Liu, Song; Wang, Jianmin; Shen, He; Li, Yinwei; Omilian, Angela R.; Hill, Annette; Head, Karen; Guru, Khurshid; Kunnev, Dimiter; Leach, Robert; Eng, Kevin H.; Darlak, Christopher; Hoeflich, Christopher; Veeranki, Srividya; Glenn, Sean; You, Ming; Pruitt, Steven C.; Johnson, Candace S.; Trump, Donald L.

    2014-01-01

    Using complete genome analysis, we sequenced five bladder tumors accrued from patients with muscle-invasive transitional cell carcinoma of the urinary bladder (TCC-UB) and identified a spectrum of genomic aberrations. In three tumors, complex genotype changes were noted. All three had tumor protein p53 mutations and a relatively large number of single-nucleotide variants (SNVs; average of 11.2 per megabase), structural variants (SVs; average of 46), or both. This group was best characterized by chromothripsis and the presence of subclonal populations of neoplastic cells or intratumoral mutational heterogeneity. Here, we provide evidence that the process of chromothripsis in TCC-UB is mediated by nonhomologous end-joining using kilobase, rather than megabase, fragments of DNA, which we refer to as “stitchers,” to repair this process. We postulate that a potential unifying theme among tumors with the more complex genotype group is a defective replication–licensing complex. A second group (two bladder tumors) had no chromothripsis, and a simpler genotype, WT tumor protein p53, had relatively few SNVs (average of 5.9 per megabase) and only a single SV. There was no evidence of a subclonal population of neoplastic cells. In this group, we used a preclinical model of bladder carcinoma cell lines to study a unique SV (translocation and amplification) of the gene glutamate receptor ionotropic N-methyl D-aspartate as a potential new therapeutic target in bladder cancer. PMID:24469795

  17. Massively Parallel Validation of Cancer Mutations and Other Variants Identified by Whole Cancer Genome and Exome Sequencing - Georges Natsoulis, TCGA Scientific Symposium 2011

    Cancer.gov

    Massively Parallel Validation of Cancer Mutations and Other Variants Identified by Whole Cancer Genome and Exome Sequencing - Georges Natsoulis

  18. Identification of somatically acquired rearrangements in cancer using genome-wide massively parallel paired-end sequencing

    Microsoft Academic Search

    Peter J Campbell; Philip J Stephens; Erin D Pleasance; Sarah O'Meara; Heng Li; Thomas Santarius; Lucy A Stebbings; Catherine Leroy; Sarah Edkins; Claire Hardy; Jon W Teague; Andrew Menzies; Ian Goodhead; Daniel J Turner; Christopher M Clee; Michael A Quail; Antony Cox; Clive Brown; Richard Durbin; Matthew E Hurles; Paul A W Edwards; Graham R Bignell; Michael R Stratton; P Andrew Futreal

    2008-01-01

    Human cancers often carry many somatically acquired genomic rearrangements, some of which may be implicated in cancer development. However, conventional strategies for characterizing rearrangements are laborious and low-throughput and have low sensitivity or poor resolution. We used massively parallel sequencing to generate sequence reads from both ends of short DNA fragments derived from the genomes of two individuals with lung

  19. The Cancer Genome Atlas (TCGA)

    Cancer.gov

    The Cancer Genome Atlas (TCGA) is a comprehensive and coordinated effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing.

  20. Mate Pair Sequencing of Whole-Genome-Amplified DNA Following Laser Capture Microdissection of Prostate Cancer

    PubMed Central

    Murphy, Stephen J.; Cheville, John C.; Zarei, Shabnam; Johnson, Sarah H.; Sikkink, Robert A.; Kosari, Farhad; Feldman, Andrew L.; Eckloff, Bruce W.; Karnes, R. Jeffrey; Vasmatzis, George

    2012-01-01

    High-throughput next-generation sequencing provides a revolutionary platform to unravel the precise DNA aberrations concealed within subgroups of tumour cells. However, in many instances, the limited number of cells makes the application of this technology in tumour heterogeneity studies a challenge. In order to address these limitations, we present a novel methodology to partner laser capture microdissection (LCM) with sequencing platforms, through a whole-genome amplification (WGA) protocol performed in situ directly on LCM engrafted cells. We further adapted current Illumina mate pair (MP) sequencing protocols to the input of WGA DNA and used this technology to investigate large genomic rearrangements in adjacent Gleason Pattern 3 and 4 prostate tumours separately collected by LCM. Sequencing data predicted genome coverage and depths similar to unamplified genomic DNA, with limited repetition and bias predicted in WGA protocols. Mapping algorithms developed in our laboratory predicted high-confidence rearrangements and selected events each demonstrated the predicted fusion junctions upon validation. Rearrangements were additionally confirmed in unamplified tissue and evaluated in adjacent benign-appearing tissues. A detailed understanding of gene fusions that characterize cancer will be critical in the development of biomarkers to predict the clinical outcome. The described methodology provides a mechanism of efficiently defining these events in limited pure populations of tumour tissue, aiding in the derivation of genomic aberrations that initiate cancer and drive cancer progression. PMID:22991452

  1. Detection of inherited mutations for breast and ovarian cancer using genomic capture and massively parallel sequencing

    PubMed Central

    Walsh, Tom; Lee, Ming K.; Casadei, Silvia; Thornton, Anne M.; Stray, Sunday M.; Pennil, Christopher; Nord, Alex S.; Mandell, Jessica B.; Swisher, Elizabeth M.; King, Mary-Claire

    2010-01-01

    Inherited loss-of-function mutations in the tumor suppressor genes BRCA1, BRCA2, and multiple other genes predispose to high risks of breast and/or ovarian cancer. Cancer-associated inherited mutations in these genes are collectively quite common, but individually rare or even private. Genetic testing for BRCA1 and BRCA2 mutations has become an integral part of clinical practice, but testing is generally limited to these two genes and to women with severe family histories of breast or ovarian cancer. To determine whether massively parallel, “next-generation” sequencing would enable accurate, thorough, and cost-effective identification of inherited mutations for breast and ovarian cancer, we developed a genomic assay to capture, sequence, and detect all mutations in 21 genes, including BRCA1 and BRCA2, with inherited mutations that predispose to breast or ovarian cancer. Constitutional genomic DNA from subjects with known inherited mutations, ranging in size from 1 to >100,000 bp, was hybridized to custom oligonucleotides and then sequenced using a genome analyzer. Analysis was carried out blind to the mutation in each sample. Average coverage was >1200 reads per base pair. After filtering sequences for quality and number of reads, all single-nucleotide substitutions, small insertion and deletion mutations, and large genomic duplications and deletions were detected. There were zero false-positive calls of nonsense mutations, frameshift mutations, or genomic rearrangements for any gene in any of the test samples. This approach enables widespread genetic testing and personalized risk assessment for breast and ovarian cancer. PMID:20616022

  2. Return of Results from Genomic Sequencing: A Policy Discussion of Secondary Findings for Cancer Predisposition.

    PubMed

    Johnson, Kimberly J; Gehlert, Sarah

    2014-09-01

    Advances in DNA sequencing technology now allow for the rapid genome-wide identification of inherited and acquired genetic variants including those that have been identified as pathogenic alleles for a number of diseases including cancer. Whole genome and exome sequencing are increasingly becoming a part of both clinical practice and research studies. In 2013 the American College of Medical Genetics and Genomics (ACMG) recommended that results of pathogenic genetic variants in 56 genes, nearly half of which comprise cancer genes (including BRCA1, BRCA2, TP53, MLH1, MSH2, MSH6, PMS2, and APC), be returned to patients who have their genome sequenced independent of the purpose for the test. This recommendation has been highly controversial for several reasons, particularly the recommendation that individuals be returned secondary findings of disease causing variants for adult onset conditions regardless of age and without consideration of patient preferences. In addition, the policy regarding returning results of secondary findings from genomic sequencing studies in research settings is currently unclear. In response to these emerging ethical issues, the Washington University Brown School in St. Louis, MO, United States hosted a policy forum entitled "First do no harm: Genetic privacy in the age of genomic sequencing" on February 25, 2014. The forum included a panel of experts to discuss their views on ethical issues related to return of results in both the clinical and research settings. In this report, we highlight key issues related to return of results from genome sequencing tests that emerged during the forum. PMID:25229012

  3. Identifying driver mutations in sequenced cancer genomes: computational approaches to enable precision medicine

    PubMed Central

    2014-01-01

    High-throughput DNA sequencing is revolutionizing the study of cancer and enabling the measurement of the somatic mutations that drive cancer development. However, the resulting sequencing datasets are large and complex, obscuring the clinically important mutations in a background of errors, noise, and random mutations. Here, we review computational approaches to identify somatic mutations in cancer genome sequences and to distinguish the driver mutations that are responsible for cancer from random, passenger mutations. First, we describe approaches to detect somatic mutations from high-throughput DNA sequencing data, particularly for tumor samples that comprise heterogeneous populations of cells. Next, we review computational approaches that aim to predict driver mutations according to their frequency of occurrence in a cohort of samples, or according to their predicted functional impact on protein sequence or structure. Finally, we review techniques to identify recurrent combinations of somatic mutations, including approaches that examine mutations in known pathways or protein-interaction networks, as well as de novo approaches that identify combinations of mutations according to statistical patterns of mutual exclusivity. These techniques, coupled with advances in high-throughput DNA sequencing, are enabling precision medicine approaches to the diagnosis and treatment of cancer. PMID:24479672
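
    The mutual-exclusivity idea mentioned in the record above can be illustrated with a one-sided hypergeometric (Fisher-type) test on a pair of genes. This Python sketch is a generic statistical illustration, not the method of any specific tool reviewed in the paper; the function name and the cohort numbers are hypothetical:

```python
from math import comb


def exclusivity_p_value(n: int, a: int, b: int, both: int) -> float:
    """Probability of seeing `both` or fewer co-mutated samples by chance.

    n    -- number of tumor samples in the cohort
    a    -- samples with a mutation in gene A
    b    -- samples with a mutation in gene B
    both -- samples mutated in both genes

    Under independence the overlap follows a hypergeometric distribution,
    so a small value suggests the two genes are mutated in a mutually
    exclusive pattern.
    """
    lowest = max(0, a + b - n)  # smallest overlap that is possible
    favourable = sum(
        comb(a, i) * comb(n - a, b - i) for i in range(lowest, both + 1)
    )
    return favourable / comb(n, b)


# Hypothetical cohort: 20 samples, 10 mutated in each gene, no overlap.
print(exclusivity_p_value(n=20, a=10, b=10, both=0))
```

    A pairwise test like this ignores differences in mutation rate between samples and does not correct for testing many gene pairs, both of which matter in practice.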

  4. Clinical genomics information management software linking cancer genome sequence and clinical decisions.

    PubMed

    Watt, Stuart; Jiao, Wei; Brown, Andrew M K; Petrocelli, Teresa; Tran, Ben; Zhang, Tong; McPherson, John D; Kamel-Reid, Suzanne; Bedard, Philippe L; Onetto, Nicole; Hudson, Thomas J; Dancey, Janet; Siu, Lillian L; Stein, Lincoln; Ferretti, Vincent

    2013-09-01

    Using sequencing information to guide clinical decision-making requires coordination of a diverse set of people and activities. In clinical genomics, the process typically includes sample acquisition, template preparation, genome data generation, analysis to identify and confirm variant alleles, interpretation of clinical significance, and reporting to clinicians. We describe a software application developed within a clinical genomics study, to support this entire process. The software application tracks patients, samples, genomic results, decisions and reports across the cohort, monitors progress and sends reminders, and works alongside an electronic data capture system for the trial's clinical and genomic data. It incorporates systems to read, store, analyze and consolidate sequencing results from multiple technologies, and provides a curated knowledge base of tumor mutation frequency (from the COSMIC database) annotated with clinical significance and drug sensitivity to generate reports for clinicians. By supporting the entire process, the application provides deep support for clinical decision making, enabling the generation of relevant guidance in reports for verification by an expert panel prior to forwarding to the treating physician. PMID:23603536

  5. Identification of somatic mutations in cancer through Bayesian-based analysis of sequenced genome pairs

    PubMed Central

    2013-01-01

    Background: The field of cancer genomics has rapidly adopted next-generation sequencing (NGS) in order to study and characterize malignant tumors with unprecedented resolution. In particular for cancer, one is often trying to identify somatic mutations – changes specific to a tumor and not within an individual’s germline. However, false positive and false negative detections often result from lack of sufficient variant evidence, contamination of the biopsy by stromal tissue, sequencing errors, and the erroneous classification of germline variation as tumor-specific. Results: We have developed a generalized Bayesian analysis framework for matched tumor/normal samples with the purpose of identifying tumor-specific alterations such as single nucleotide mutations, small insertions/deletions, and structural variation. We describe our methodology, and discuss its application to other types of paired-tissue analysis such as the detection of loss of heterozygosity as well as allelic imbalance. We also demonstrate the high level of sensitivity and specificity in discovering simulated somatic mutations, for various combinations of a) genomic coverage and b) emulated heterogeneity. Conclusion: We present a Java-based implementation of our methods named Seurat, which is made available for free academic use. We have demonstrated and reported on the discovery of different types of somatic change by applying Seurat to an experimentally-derived cancer dataset using our methods; and have discussed considerations and practices regarding the accurate detection of somatic events in cancer genomes. Seurat is available at https://sites.google.com/site/seuratsomatic. PMID:23642077
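
    The general idea of Bayesian somatic calling on a matched tumor/normal pair can be sketched as a comparison of three hypotheses at a single site. The Python example below is a toy model and is not Seurat's actual algorithm; the binomial read model, the error rate, the assumed tumor allele fraction, the priors and all names are assumptions made for illustration:

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of k alternate-allele reads out of n at allele fraction p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def somatic_posterior(t_alt: int, t_depth: int, n_alt: int, n_depth: int,
                      error: float = 0.01, tumor_af: float = 0.3) -> float:
    """Posterior probability that a site carries a somatic mutation.

    Each hypothesis fixes the expected alternate-allele fraction in the
    (tumor, normal) samples. All numbers below are illustrative assumptions.
    """
    priors = {"reference": 0.999, "germline": 0.0009, "somatic": 0.0001}
    alt_fraction = {
        "reference": (error, error),    # only sequencing errors
        "germline": (0.5, 0.5),         # heterozygous in both samples
        "somatic": (tumor_af, error),   # present in tumor only
    }
    joint = {
        name: priors[name]
        * binom_pmf(t_alt, t_depth, f_tumor)
        * binom_pmf(n_alt, n_depth, f_normal)
        for name, (f_tumor, f_normal) in alt_fraction.items()
    }
    return joint["somatic"] / sum(joint.values())


# Hypothetical read counts: variant reads in the tumor, none in the normal.
print(somatic_posterior(t_alt=12, t_depth=40, n_alt=0, n_depth=40))
# Hypothetical read counts: variant reads at similar levels in both samples.
print(somatic_posterior(t_alt=20, t_depth=40, n_alt=20, n_depth=40))
```

    Fixing the tumor allele fraction keeps the sketch short; a fuller model would integrate over it to account for stromal contamination and tumor heterogeneity, the error sources named in the abstract.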

  6. Somatic retrotransposition in human cancer revealed by whole-genome and exome sequencing

    PubMed Central

    Helman, Elena; Lawrence, Michael S.; Stewart, Chip; Sougnez, Carrie; Getz, Gad; Meyerson, Matthew

    2014-01-01

    Retrotransposons constitute a major source of genetic variation, and somatic retrotransposon insertions have been reported in cancer. Here, we applied TranspoSeq, a computational framework that identifies retrotransposon insertions from sequencing data, to whole genomes from 200 tumor/normal pairs across 11 tumor types as part of The Cancer Genome Atlas (TCGA) Pan-Cancer Project. In addition to novel germline polymorphisms, we find 810 somatic retrotransposon insertions primarily in lung squamous, head and neck, colorectal, and endometrial carcinomas. Many somatic retrotransposon insertions occur in known cancer genes. We find that high somatic retrotransposition rates in tumors are associated with high rates of genomic rearrangement and somatic mutation. Finally, we developed TranspoSeq-Exome to interrogate an additional 767 tumor samples with hybrid-capture exome data and discovered 35 novel somatic retrotransposon insertions into exonic regions, including an insertion into an exon of the PTEN tumor suppressor gene. The results of this large-scale, comprehensive analysis of retrotransposon movement across tumor types suggest that somatic retrotransposon insertions may represent an important class of structural variation in cancer. PMID:24823667

  7. TCGA's Pan-Cancer Efforts and Expansion to Include Whole Genome Sequence

    Cancer.gov

    Carolyn Hutter, Ph.D., Program Director of NHGRI's Division of Genomic Medicine, discusses the expansion of TCGA's Pan-Cancer efforts to include the Pan-Cancer Analysis of Whole Genomes (PAWG) project.

  8. Breast cancer genomics from microarrays to massively parallel sequencing: paradigms and new insights.

    PubMed

    Ng, Charlotte K Y; Schultheis, Anne M; Bidard, Francois-Clement; Weigelt, Britta; Reis-Filho, Jorge S

    2015-01-01

    Rapid advancements in massively parallel sequencing methods have enabled the analysis of breast cancer genomes at an unprecedented resolution, which have revealed the remarkable heterogeneity of the disease. As a result, we now accept that despite originating in the breast, estrogen receptor (ER)-positive and ER-negative breast cancers are completely different diseases at the molecular level. It has become apparent that there are very few highly recurrently mutated genes such as TP53, PIK3CA, and GATA3, that no two breast cancers display an identical repertoire of somatic genetic alterations at base-pair resolution and that there might not be a single highly recurrently mutated gene that defines each of the "intrinsic" subtypes of breast cancer (ie, basal-like, HER2-enriched, luminal A, and luminal B). Breast cancer heterogeneity, however, extends beyond the diversity between tumors. There is burgeoning evidence to demonstrate that at least some primary breast cancers are composed of multiple, genetically diverse clones at diagnosis and that metastatic lesions may differ in their repertoire of somatic genetic alterations when compared with their respective primary tumors. Several biological phenomena may shape the reported intratumor genetic heterogeneity observed in breast cancers, including the different mutational processes and multiple types of genomic instability. Harnessing the emerging concepts of the diversity of breast cancer genomes and the phenomenon of intratumor genetic heterogeneity will be essential for the development of optimal methods for diagnosis, disease monitoring, and the matching of patients to the drugs that would benefit them the most. PMID:25713166

  9. The cancer genome

    PubMed Central

    Stratton, Michael R.; Campbell, Peter J.; Futreal, P. Andrew

    2010-01-01

    All cancers arise as a result of changes that have occurred in the DNA sequence of the genomes of cancer cells. Over the past quarter of a century much has been learnt about these mutations and the abnormal genes that operate in human cancers. We are now, however, moving into an era in which it will be possible to obtain the complete DNA sequence of large numbers of cancer genomes. These studies will provide us with a detailed and comprehensive perspective on how individual cancers have developed. PMID:19360079

  10. Whole Genome Sequence Analysis Suggests Intratumoral Heterogeneity in Dissemination of Breast Cancer to Lymph Nodes

    PubMed Central

    Blighe, Kevin; Kenny, Laura; Patel, Naina; Guttery, David S.; Page, Karen; Gronau, Julian H.; Golshani, Cyrus; Stebbing, Justin; Coombes, R. Charles; Shaw, Jacqueline A.

    2014-01-01

    Background Intratumoral heterogeneity may help drive resistance to targeted therapies in cancer. In breast cancer, the presence of nodal metastases is a key indicator of poorer overall survival. The aim of this study was to identify somatic genetic alterations in early dissemination of breast cancer by whole genome next generation sequencing (NGS) of a primary breast tumor, a matched locally-involved axillary lymph node and healthy normal DNA from blood. Methods Whole genome NGS was performed on 12 µg (range 11.1–13.3 µg) of DNA isolated from fresh-frozen primary breast tumor, axillary lymph node and peripheral blood following the DNA nanoball sequencing protocol. Single nucleotide variants, insertions, deletions, and substitutions were identified through a bioinformatic pipeline and compared to CIN25, a key set of genes associated with tumor metastasis. Results Whole genome sequencing revealed overlapping variants between the tumor and node, but also variants that were unique to each. Novel mutations unique to the node included those found in two CIN25 targets, TGIF2 and CCNB2, which are related to transcription cyclin activity and chromosomal stability, respectively, and a unique frameshift in PDS5B, which is required for accurate sister chromatid segregation during cell division. We also identified dominant clonal variants that progressed from tumor to node, including SNVs in TP53 and ARAP3, which mediates rearrangements to the cytoskeleton and cell shape, and an insertion in TOP2A, the expression of which is significantly associated with tumor proliferation and can segregate breast cancers by outcome. Conclusion This case study provides preliminary evidence that primary tumor and early nodal metastasis have largely overlapping somatic genetic alterations. There were very few mutations unique to the involved node. However, significant conclusions regarding early dissemination need analysis of a larger number of patient samples. PMID:25546409

  11. Integrated genome and transcriptome sequencing identifies a novel form of hybrid and aggressive prostate cancer.

    PubMed

    Wu, Chunxiao; Wyatt, Alexander W; Lapuk, Anna V; McPherson, Andrew; McConeghy, Brian J; Bell, Robert H; Anderson, Shawn; Haegert, Anne; Brahmbhatt, Sonal; Shukin, Robert; Mo, Fan; Li, Estelle; Fazli, Ladan; Hurtado-Coll, Antonio; Jones, Edward C; Butterfield, Yaron S; Hach, Faraz; Hormozdiari, Fereydoun; Hajirasouliha, Iman; Boutros, Paul C; Bristow, Robert G; Jones, Steven Jm; Hirst, Martin; Marra, Marco A; Maher, Christopher A; Chinnaiyan, Arul M; Sahinalp, S Cenk; Gleave, Martin E; Volik, Stanislav V; Collins, Colin C

    2012-05-01

    Next-generation sequencing is making sequence-based molecular pathology and personalized oncology viable. We selected an individual initially diagnosed with conventional but aggressive prostate adenocarcinoma and sequenced the genome and transcriptome from primary and metastatic tissues collected prior to hormone therapy. The histology-pathology and copy number profiles were remarkably homogeneous, yet it was possible to propose the quadrant of the prostate tumour that likely seeded the metastatic diaspora. Despite a homogeneous cell type, our transcriptome analysis revealed signatures of both luminal and neuroendocrine cell types. Remarkably, the repertoire of expressed but apparently private gene fusions, including C15orf21:MYC, recapitulated this biology. We hypothesize that the amplification and over-expression of the stem cell gene MSI2 may have contributed to the stable hybrid cellular identity. This hybrid luminal-neuroendocrine tumour appears to represent a novel and highly aggressive case of prostate cancer with unique biological features and, conceivably, a propensity for rapid progression to castrate-resistance. Overall, this work highlights the importance of integrated analyses of genome, exome and transcriptome sequences for basic tumour biology, sequence-based molecular pathology and personalized oncology. PMID:22294438

  12. The next steps in next-gen sequencing of cancer genomes.

    PubMed

    Hayes, D Neil; Kim, William Y

    2015-02-01

    The necessary infrastructure to carry out genomics-driven oncology is now widely available and has resulted in the exponential increase in characterized cancer genomes. While a subset of genomic alterations is clinically actionable, the majority of somatic events remain classified as variants of unknown significance and will require functional characterization. A careful cataloging of the genomic alterations and their response to therapeutic intervention should allow the compilation of an "actionability atlas" and the creation of a genomic taxonomy stratified by tumor type and oncogenic pathway activation. The next phase of genomic medicine will therefore require talented bioinformaticians, genomic navigators, and multidisciplinary approaches to decode complex cancer genomes and guide potential therapy. Equally important will be the ethical and interpretable return of results to practicing oncologists. Finally, the integration of genomics into clinical trials is likely to speed the development of predictive biomarkers of response to targeted therapy as well as define pathways to acquired resistance. PMID:25642706

  13. The Cancer Genome Atlas - TCGA - Home Page

    Cancer.gov

    The Cancer Genome Atlas (TCGA) is a comprehensive and coordinated effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing.

  14. Genomic Datasets for Cancer Research

    Cancer.gov

    A variety of datasets from genome-wide association studies of cancer and other genotype-phenotype studies, including sequencing and molecular diagnostic assays, are available to approved investigators through the Extramural National Cancer Institute (NCI) Data Access Committee (DAC).

  15. Cancer Genome Anatomy Project

    NSDL National Science Digital Library

    The National Cancer Institute has launched the Cancer Genome Anatomy Project to "achieve a comprehensive molecular characterization of normal, precancerous, and malignant cells." Sequenced genes are held as library entries in a database and are available for downloading (fasta format). Each cDNA library entry may include biological source, number of sequences, and library construction detail information. Thousands of gene sequences are available for over 15 cancers, including breast, colon, and prostate. Contact information for donating or obtaining tissue samples for research purposes is provided.

  16. Flexible positions, managed hopes: The promissory bioeconomy of a whole genome sequencing cancer study.

    PubMed

    Haase, Rachel; Michie, Marsha; Skinner, Debra

    2015-04-01

    Genomic research has rapidly expanded its scope and ambition over the past decade, promoted by both public and private sectors as having the potential to revolutionize clinical medicine. This promissory bioeconomy of genomic research and technology is generated by, and in turn generates, the hopes and expectations shared by investors, researchers and clinicians, patients, and the general public alike. Examinations of such bioeconomies have often focused on the public discourse, media representations, and capital investments that fuel these "regimes of hope," but also crucial are the more intimate contexts of small-scale medical research, and the private hopes, dreams, and disappointments of those involved. Here we examine one local site of production in a university-based clinical research project that sought to identify novel cancer predisposition genes through whole genome sequencing in individuals at high risk for cancer. In-depth interviews with 24 adults who donated samples to the study revealed an ability to shift flexibly between positioning themselves as research participants on the one hand, and as patients or as family members of patients, on the other. Similarly, interviews with members of the research team highlighted the dual nature of their positions as researchers and as clinicians. For both parties, this dual positioning shaped their investment in the project and valuing of its possible outcomes. In their narratives, all parties shifted between these different relational positions as they managed hopes and expectations for the research project. We suggest that this flexibility facilitated study implementation and participation in the face of potential and probable disappointment on one or more fronts, and acted as a key element in the resilience of this local promissory bioeconomy. We conclude that these multiple dimensions of relationality and positionality are inherent and essential in the creation of any complex economy, "bio" or otherwise. PMID:25697637

  17. Germline Variation in Cancer-Susceptibility Genes in a Healthy, Ancestrally Diverse Cohort: Implications for Individual Genome Sequencing

    PubMed Central

    Bodian, Dale L.; McCutcheon, Justine N.; Kothiyal, Prachi; Huddleston, Kathi C.; Iyer, Ramaswamy K.; Vockley, Joseph G.; Niederhuber, John E.

    2014-01-01

    Technological advances coupled with decreasing costs are bringing whole genome and whole exome sequencing closer to routine clinical use. One of the hurdles to clinical implementation is the high number of variants of unknown significance. For cancer-susceptibility genes, the difficulty in interpreting the clinical relevance of the genomic variants is compounded by the fact that most of what is known about these variants comes from the study of highly selected populations, such as cancer patients or individuals with a family history of cancer. The genetic variation in known cancer-susceptibility genes in the general population has not been well characterized to date. To address this gap, we profiled the nonsynonymous genomic variation in 158 genes causally implicated in carcinogenesis using high-quality whole genome sequences from an ancestrally diverse cohort of 681 healthy individuals. We found that all individuals carry multiple variants that may impact cancer susceptibility, with an average of 68 variants per individual. Of the 2,688 allelic variants identified within the cohort, most are very rare, with 75% found in only 1 or 2 individuals in our population. Allele frequencies vary between ancestral groups, and there are 21 variants for which the minor allele in one population is the major allele in another. Detailed analysis of a selected subset of 5 clinically important cancer genes, BRCA1, BRCA2, KRAS, TP53, and PTEN, highlights differences between germline variants and reported somatic mutations. The dataset can serve as a resource of genetic variation in cancer-susceptibility genes in 6 ancestry groups, an important foundation for the interpretation of cancer risk from personal genome sequences. PMID:24728327

  18. Whole Genome Sequencing

    MedlinePLUS

    ... Most of the information you get from a genomic test tells you about your risk for disease, ... A health forecast: Understanding disease risk from whole genomic sequencing Weather forecasting tries to predict what the ...

  19. Sequencing technologies and genome sequencing

    Microsoft Academic Search

    Chandra Shekhar Pareek; Rafal Smoczynski; Andrzej Tretyn

    High-throughput next-generation sequencing (HT-NGS) technologies are currently the hottest topic in the field of human and animal genomics research, and can produce over 100 times more data than the most sophisticated capillary sequencers based on the Sanger method. With the ongoing developments of high throughput sequencing machines and advancement of modern bioinformatics tools at unprecedented pace,

  20. POSTDOC POSITION IN COMPUTATIONAL CANCER GENOMICS Barcelona Biomedical Genomics Lab

    E-print Network

    Pompeu Fabra, Universitat

    POSTDOC POSITION IN COMPUTATIONAL CANCER GENOMICS Barcelona Biomedical Genomics Lab Job description of pan-cancer genomics data, including tumour genome and transcriptome sequencing data, focused postdocs, 2 PhD students and 2 software engineers, with complementary expertise in biology, medicine

  1. Draft Genome Sequence of a Helicobacter pylori Strain Isolated from a Patient with Diffuse Gastritis from a Region of High Cancer Risk in Colombia

    PubMed Central

    Bayona Rojas, Martin; Barragán Vidal, Carlos; Trujillo, Clara Esperanza; Bravo, María Mercedes

    2015-01-01

    The draft genome sequence of one Colombian Helicobacter pylori strain is presented. This strain was isolated from a patient with diffuse gastritis from Tibaná, Boyacá, a region with high gastric cancer risk. PMID:25858838

  2. Draft Genome Sequence of a Helicobacter pylori Strain Isolated from a Patient with Diffuse Gastritis from a Region of High Cancer Risk in Colombia.

    PubMed

    Gutiérrez-Escobar, Andrés J; Bayona Rojas, Martin; Barragán Vidal, Carlos; Trujillo, Clara Esperanza; Bravo, María Mercedes

    2015-01-01

    The draft genome sequence of one Colombian Helicobacter pylori strain is presented. This strain was isolated from a patient with diffuse gastritis from Tibaná, Boyacá, a region with high gastric cancer risk. PMID:25858838

  3. The Cancer Genome Atlas completes detailed ovarian cancer analysis:

    Cancer.gov

    An analysis of genomic changes in ovarian cancer has provided the most comprehensive and integrated view of cancer genes for any cancer type to date. Ovarian serous adenocarcinoma tumors from 500 patients were examined by The Cancer Genome Atlas (TCGA) Research Network. TCGA researchers completed whole-exome sequencing, which examines the protein-coding regions of the genome, on an unprecedented 316 tumors.

  4. From human genome to cancer genome: The first decade

    PubMed Central

    Wheeler, David A.; Wang, Linghua

    2013-01-01

    The realization that cancer progression required the participation of cellular genes provided one of several key rationales, in 1986, for embarking on the human genome project. Only with a reference genome sequence could the full spectrum of somatic changes leading to cancer be understood. Since its completion in 2003, the human reference genome sequence has fulfilled its promise as a foundational tool to illuminate the pathogenesis of cancer. Herein, we review the key historical milestones in cancer genomics since the completion of the genome, and some of the novel discoveries that are shaping our current understanding of cancer. PMID:23817046

  5. Porcine Genomic Sequencing Initiative

    Microsoft Academic Search

    Gary Rohrer; Jonathan E. Beever; Max F. Rothschild; Lawrence Schook; Richard Gibbs; George Weinstock; W. Gregory

    A. Specific biological rationales for the utility of the porcine sequence information Rationale and Objectives. Completion of the human genome sequence provides the starting point for understanding the genetic complexity of humans and how genetic variation contributes to diverse phenotypes and disease. It is clear that model organisms have played an invaluable role in the synthesis of this understanding. It

  6. Cancer Genomics Overview

    Cancer.gov

    Cancer Genomics Overview Cancer is a group of diseases caused by changes in a person's genome that allow a tumor to form. These changes can be inherited from one's parents, caused by environmental factors, or occur during natural processes such as cell division. The field of cancer genomics studies these changes. Once the cancer-causing changes are identified, scientists may be able to develop drugs to target these changes, resulting in a better understanding of cancer as well as improved treatments.

  7. Draft Genome Sequence of Erythromycin-Resistant Streptococcus gallolyticus subsp. gallolyticus NTS 31106099 Isolated from a Patient with Infective Endocarditis and Colorectal Cancer.

    PubMed

    Kambarev, Stanimir; Caté, Clément; Corvec, Stéphane; Pecorari, Frédéric

    2015-01-01

    Streptococcus gallolyticus subsp. gallolyticus is known for its close association with infective endocarditis and colorectal cancer in humans. Here, we report the draft genome sequence of highly erythromycin-resistant strain NTS 31106099 isolated from a patient with infective endocarditis and colorectal cancer. PMID:25908147

  8. Prenatal Whole Genome Sequencing

    PubMed Central

    Donley, Greer; Hull, Sara Chandros; Berkman, Benjamin E.

    2014-01-01

    With whole genome sequencing set to become the preferred method of prenatal screening, we need to pay more attention to the massive amount of information it will deliver to parents—and the fact that we don't yet understand what most of it means. PMID:22777977

  9. Genome Sequence of \\

    Microsoft Academic Search

    Thomas Persson; David R Benson; Philippe Normand; Brian Vanden Heuvel; Petar Pujic; Olga Chertkov; Hazuki Teshima; David Bruce; J. Chris Detter; Roxanne Tapia; Cliff Han; James Han; Tanja Woyke; Sam Pitluck; Len Pennacchio; Matt Nolan; N Ivanova; Amrita Pati; Miriam L Land; Katharina Pawlowski; Alison M Berry

    2011-01-01

    Members of the noncultured clade of Frankia enter into root nodule symbioses with actinorhizal species from the orders Cucurbitales and Rosales. We report the genome sequence of a member of this clade originally from Pakistan but obtained from root nodules of the American plant Datisca glomerata without isolation in culture.

  10. Wheat and Barley Genome Sequencing

    Microsoft Academic Search

    Kellye Eversole; Andreas Graner; Nils Stein

    A high quality reference genome sequence is a prerequisite resource for accessing any gene, driving genomics-based approaches to systems biology, and for efficient exploitation of natural and induced genetic diversity of an organism. Wheat and barley possess genomes of a size that was long presumed to be not amenable for whole genome sequencing. So far, only limited genomic sequencing of

  11. Towards Sequencing Cotton (Gossypium) Genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Despite rapidly decreasing costs and innovative technologies, sequencing of angiosperm genomes is not yet undertaken lightly. Generating larger amounts of sequence data more quickly does not address the difficulties of sequencing and assembling complex genomes de novo. The cotton genomes represent a...

  12. Whole-genome sequencing of bladder cancers reveals somatic CDKN1A mutations and clinicopathological associations with mutation burden

    PubMed Central

    Cazier, J.-B.; Rao, S.R.; McLean, C.M.; Walker, A.L.; Wright, B.J.; Jaeger, E.E.M.; Kartsonaki, C.; Marsden, L.; Yau, C.; Camps, C.; Kaisaki, P.; Allan, Christopher; Attar, Moustafa; Bell, John; Bentley, David; Broxholme, John; Buck, David; Cazier, Jean-Baptiste; Copley, Richard; Cornall, Richard; Donnelly, Peter; Fiddy, Simon; Green, Angie; Gregory, Lorna; Grocock, Russell; Hatton, Edouard; Holmes, Chris; Hughes, Linda; Humburg, Peter; Humphray, Sean; Kanapin, Alexander; Kingsbury, Zoya; Knight, Julian; Lamble, Sarah; Lise, Stefano; Lonie, Lorne; Lunter, Gerton; Martin, Hilary; Murray, Lisa; McCarthy, Davis; McVean, Gil; Pagnamenta, Alistair; Piazza, Paolo; Polanco, Guadelupe; Ratcliffe, Peter; Rimmer, Andy; Sahgal, Natasha; Taylor, Jenny; Tomlinson, Ian; Trebes, Amy; Wilkie, Andrew; Wright, Ben; Yau, Chris; Taylor, J.; Catto, J.W.; Tomlinson, I.P.M.; Kiltie, A.E.; Hamdy, F.C.

    2014-01-01

    Bladder cancers are a leading cause of death from malignancy. Molecular markers might predict disease progression and behaviour more accurately than the available prognostic factors. Here we use whole-genome sequencing to identify somatic mutations and chromosomal changes in 14 bladder cancers of different grades and stages. As well as detecting the known bladder cancer driver mutations, we report the identification of recurrent protein-inactivating mutations in CDKN1A and FAT1. The former are not mutually exclusive with TP53 mutations or MDM2 amplification, showing that CDKN1A dysfunction is not simply an alternative mechanism for p53 pathway inactivation. We find strong positive associations between higher tumour stage/grade and greater clonal diversity, the number of somatic mutations and the burden of copy number changes. In principle, the identification of sub-clones with greater diversity and/or mutation burden within early-stage or low-grade tumours could identify lesions with a high risk of invasive progression. PMID:24777035

  13. Genome Sequence Databases (Overview): Sequencing and Assembly

    SciTech Connect

    Lapidus, Alla L.

    2009-01-01

    From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three dimensional structure of the DNA molecule, biologists tried to decode the secrets of life hidden within these long molecules, and technologists invent and improve methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different length (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Despite the fact that automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases, it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents the genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (~1 error/10,000 bp), validated through a number of computer and laboratory experiments.
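
    To put the quoted error rates in concrete terms, the short calculation below applies them to a hypothetical 5 Mb microbial genome. The genome size is an assumption chosen for illustration, and the 98% figure is the upper bound the abstract gives for draft coverage, so the uncovered length shown is a best case.

```python
def expected_errors(genome_bp, bp_per_error):
    """Expected number of consensus errors at a rate of one error per bp_per_error bases."""
    return genome_bp / bp_per_error


def uncovered_bp(genome_bp, percent_covered):
    """Bases left uncovered when the assembly spans percent_covered of the genome."""
    return genome_bp * (100 - percent_covered) // 100


GENOME_BP = 5_000_000       # hypothetical genome size, not taken from the abstract
DRAFT_BP_PER_ERROR = 2_000  # draft consensus: about 1 error per 2,000 bp
FINISHED_BP_PER_ERROR = 10_000  # finished genome: about 1 error per 10,000 bp

print("draft errors:   ", expected_errors(GENOME_BP, DRAFT_BP_PER_ERROR))     # 2500.0
print("finished errors:", expected_errors(GENOME_BP, FINISHED_BP_PER_ERROR))  # 500.0
print("uncovered bp at 98% coverage:", uncovered_bp(GENOME_BP, 98))           # 100000
```

    Under these assumptions, finishing cuts the expected number of consensus errors fivefold, from about 2,500 to about 500, and also closes the roughly 100 kb that a draft covering 98% would leave unassembled.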

  14. Comparative effectiveness of next generation genomic sequencing for disease diagnosis: design of a randomized controlled trial in patients with colorectal cancer/polyposis syndromes.

    PubMed

    Gallego, Carlos J; Bennette, Caroline S; Heagerty, Patrick; Comstock, Bryan; Horike-Pyne, Martha; Hisama, Fuki; Amendola, Laura M; Bennett, Robin L; Dorschner, Michael O; Tarczy-Hornoch, Peter; Grady, William M; Fullerton, S Malia; Trinidad, Susan B; Regier, Dean A; Nickerson, Deborah A; Burke, Wylie; Patrick, Donald L; Jarvik, Gail P; Veenstra, David L

    2014-09-01

    Whole exome and whole genome sequencing are applications of next generation sequencing transforming clinical care, but there is little evidence whether these tests improve patient outcomes or if they are cost effective compared to current standard of care. These gaps in knowledge can be addressed by comparative effectiveness and patient-centered outcomes research. We designed a randomized controlled trial that incorporates these research methods to evaluate whole exome sequencing compared to usual care in patients being evaluated for hereditary colorectal cancer and polyposis syndromes. Approximately 220 patients will be randomized and followed for 12 months after return of genomic findings. Patients will receive findings associated with colorectal cancer in a first return of results visit, and findings not associated with colorectal cancer (incidental findings) during a second return of results visit. The primary outcome is efficacy to detect mutations associated with these syndromes; secondary outcomes include psychosocial impact, cost-effectiveness and comparative costs. The secondary outcomes will be obtained via surveys before and after each return visit. The expected challenges in conducting this randomized controlled trial include the relatively low prevalence of genetic disease, difficult interpretation of some genetic variants, and uncertainty about which incidental findings should be returned to patients. The approaches utilized in this study may help guide other investigators in clinical genomics to identify useful outcome measures and strategies to address comparative effectiveness questions about the clinical implementation of genomic sequencing in clinical care. PMID:24997220

  15. Comparative effectiveness of next generation genomic sequencing for disease diagnosis: Design of a randomized controlled trial in patients with colorectal cancer/polyposis syndromes

    PubMed Central

    Gallego, Carlos J.; Bennette, Caroline S.; Heagerty, Patrick; Comstock, Bryan; Horike-Pyne, Martha; Hisama, Fuki; Amendola, Laura M.; Bennett, Robin L.; Dorschner, Michael O.; Tarczy-Hornoch, Peter; Grady, William M.; Fullerton, S. Malia; Trinidad, Susan B.; Regier, Dean A.; Nickerson, Deborah A.; Burke, Wylie; Patrick, Donald L.; Jarvik, Gail P.; Veenstra, David L.

    2014-01-01

    Whole exome and whole genome sequencing are applications of next generation sequencing transforming clinical care, but there is little evidence whether these tests improve patient outcomes or if they are cost effective compared to current standard of care. These gaps in knowledge can be addressed by comparative effectiveness and patient-centered outcomes research. We designed a randomized controlled trial that incorporates these research methods to evaluate whole exome sequencing compared to usual care in patients being evaluated for hereditary colorectal cancer and polyposis syndromes. Approximately 220 patients will be randomized and followed for 12 months after return of genomic findings. Patients will receive findings associated with colorectal cancer in a first return of result visit, and findings not associated with colorectal cancer (incidental findings) during a second return of result visit. The primary outcome is efficacy to detect mutations associated with these syndromes; secondary outcomes include psychosocial impact, cost-effectiveness and comparative costs. The secondary outcomes will be obtained via surveys before and after each return visit. The expected challenges in conducting this randomized controlled trial include the relatively low prevalence of genetic disease, difficult interpretation of some genetic variants, and uncertainty about which incidental findings should be returned to patients. The approaches utilized in this study may help guide other investigators in clinical genomics to identify useful outcome measures and strategies to address comparative effectiveness questions about the clinical implementation of genomic sequencing in clinical care. PMID:24997220

  16. Next-generation sequencing for the diagnosis of hereditary breast and ovarian cancer using genomic capture targeting multiple candidate genes.

    PubMed

    Castéra, Laurent; Krieger, Sophie; Rousselin, Antoine; Legros, Angélina; Baumann, Jean-Jacques; Bruet, Olivia; Brault, Baptiste; Fouillet, Robin; Goardon, Nicolas; Letac, Olivier; Baert-Desurmont, Stéphanie; Tinat, Julie; Bera, Odile; Dugast, Catherine; Berthet, Pascaline; Polycarpe, Florence; Layet, Valérie; Hardouin, Agnes; Frébourg, Thierry; Vaur, Dominique

    2014-11-01

    To optimize the molecular diagnosis of hereditary breast and ovarian cancer (HBOC), we developed a next-generation sequencing (NGS)-based screening based on the capture of a panel of genes involved, or suspected to be involved in HBOC, on pooling of indexed DNA and on paired-end sequencing in an Illumina GAIIx platform, followed by confirmation by Sanger sequencing or MLPA/QMPSF. The bioinformatic pipeline included CASAVA, NextGENe, CNVseq and Alamut-HT. We validated this procedure by the analysis of 59 patients' DNAs harbouring SNVs, indels or large genomic rearrangements of BRCA1 or BRCA2. We also conducted a blind study in 168 patients comparing NGS versus Sanger sequencing or MLPA analyses of BRCA1 and BRCA2. All mutations detected by conventional procedures were detected by NGS. We then screened, using three different versions of the capture set, a large series of 708 consecutive patients. We detected in these patients 69 germline deleterious alterations within BRCA1 and BRCA2, and 4 TP53 mutations in 468 patients also tested for this gene. We also found 36 variations inducing either a premature codon stop or a splicing defect among other genes: 5/708 in CHEK2, 3/708 in RAD51C, 1/708 in RAD50, 7/708 in PALB2, 3/708 in MRE11A, 5/708 in ATM, 3/708 in NBS1, 1/708 in CDH1, 3/468 in MSH2, 2/468 in PMS2, 1/708 in BARD1, 1/468 in PMS1 and 1/468 in MLH3. These results demonstrate the efficiency of NGS in performing molecular diagnosis of HBOC. Detection of mutations within other genes than BRCA1 and BRCA2 highlights the genetic heterogeneity of HBOC. PMID:24549055

  17. PERSPECTIVE Personal genome sequencing: current

    E-print Network

    Gerstein, Mark

    of this information for discovery and medicine is enormous. Fourteen genome sequences have been reported to date ... PERSPECTIVE Personal genome sequencing: current approaches and challenges Michael Snyder, Jiang Du, and Mark Gerstein; Department of Genetics, Stanford University School of Medicine

  18. Testing personalized medicine: patient and physician expectations of next-generation genomic sequencing in late-stage cancer care.

    PubMed

    Miller, Fiona A; Hayeems, Robin Z; Bytautas, Jessica P; Bedard, Philippe L; Ernst, Scott; Hirte, Hal; Hotte, Sebastien; Oza, Amit; Razak, Albiruni; Welch, Stephen; Winquist, Eric; Dancey, Janet; Siu, Lillian L

    2014-03-01

    Developments in genomics, including next-generation sequencing technologies, are expected to enable a more personalized approach to clinical care, with improved risk stratification and treatment selection. In oncology, personalized medicine is particularly advanced and increasingly used to identify oncogenic variants in tumor tissue that predict responsiveness to specific drugs. Yet, the translational research needed to validate these technologies will be conducted in patients with late-stage cancer and is expected to produce results of variable clinical significance and incidentally identify genetic risks. To explore the experiential context in which much of personalized cancer care will be developed and evaluated, we conducted a qualitative interview study alongside a pilot feasibility study of targeted DNA sequencing of metastatic tumor biopsies in adult patients with advanced solid malignancies. We recruited 29/73 patients and 14/17 physicians; transcripts from semi-structured interviews were analyzed for thematic patterns using an interpretive descriptive approach. Patient hopes of benefit from research participation were enhanced by the promise of novel and targeted treatment but challenged by non-findings or by limited access to relevant trials. Family obligations informed a willingness to receive genetic information, which was perceived as burdensome given disease stage or as inconsequential given faced challenges. Physicians were optimistic about long-term potential but conservative about immediate benefits and mindful of elevated patient expectations; consent and counseling processes were expected to mitigate challenges from incidental findings. These findings suggest the need for information and decision tools to support physicians in communicating realistic prospects of benefit, and for cautious approaches to the generation of incidental genetic information. PMID:23860039

  19. Somatic retrotransposition in the cancer genome

    E-print Network

    Helman, Elena

    2014-01-01

    Cancer is a complex disease of the genome exhibiting myriad somatic mutations, from single nucleotide changes to various chromosomal rearrangements. The technological advances of next-generation sequencing enable high-throughput ...

  20. Whole-Genome Sequences of DA and F344 Rats with Different Susceptibilities to Arthritis, Autoimmunity, Inflammation and Cancer

    PubMed Central

    Guo, Xiaosen; Brenner, Max; Zhang, Xuemei; Laragione, Teresina; Tai, Shuaishuai; Li, Yanhong; Bu, Junjie; Yin, Ye; Shah, Anish A.; Kwan, Kevin; Li, Yingrui; Jun, Wang; Gulko, Pércio S.

    2013-01-01

    DA (D-blood group of Palm and Agouti, also known as Dark Agouti) and F344 (Fischer) are two inbred rat strains with differences in several phenotypes, including susceptibility to autoimmune disease models and inflammatory responses. While these strains have been extensively studied, little information is available about the DA and F344 genomes, as only the Brown Norway (BN) and spontaneously hypertensive rat strains have been sequenced to date. Here we report the sequencing of the DA and F344 genomes using next-generation Illumina paired-end read technology and the first de novo assembly of a rat genome. DA and F344 were sequenced with an average depth of 32-fold, covered 98.9% of the BN reference genome, and included 97.97% of known rat ESTs. New sequences could be assigned to 59 million positions with previously unknown data in the BN reference genome. Differences between DA, F344, and BN included 19 million positions in novel scaffolds, 4.09 million single nucleotide polymorphisms (SNPs) (including 1.37 million new SNPs), 458,224 short insertions and deletions, and 58,174 structural variants. Genetic differences between DA, F344, and BN, including high-impact SNPs and short insertions and deletions affecting >2500 genes, are likely to account for most of the phenotypic variation between these strains. The new DA and F344 genome sequencing data should facilitate gene discovery efforts in rat models of human disease. PMID:23695301

  1. Pig genome sequence - analysis and publication strategy

    Microsoft Academic Search

    Alan L Archibald; Lars Bolund; Carol Churcher; Merete Fredholm; Martien AM Groenen; Barbara Harlizius; Kyung-Tai Lee; Denis Milan; Jane Rogers; Max F Rothschild; Hirohide Uenishi; Jun Wang; Lawrence B Schook

    2010-01-01

    BACKGROUND: The pig genome is being sequenced and characterised under the auspices of the Swine Genome Sequencing Consortium. The sequencing strategy followed a hybrid approach combining hierarchical shotgun sequencing of BAC clones and whole genome shotgun sequencing. RESULTS: Assemblies of the BAC clone derived genome sequence have been annotated using the Pre-Ensembl and Ensembl automated pipelines and made accessible through

  2. Whole-genome sequencing of asian lung cancers: second-hand smoke unlikely to be responsible for higher incidence of lung cancer among Asian never-smokers.

    PubMed

    Krishnan, Vidhya G; Ebert, Philip J; Ting, Jason C; Lim, Elaine; Wong, Swee-Seong; Teo, Audrey S M; Yue, Yong G; Chua, Hui-Hoon; Ma, Xiwen; Loh, Gary S L; Lin, Yuhao; Tan, Joanna H J; Yu, Kun; Zhang, Shenli; Reinhard, Christoph; Tan, Daniel S W; Peters, Brock A; Lincoln, Stephen E; Ballinger, Dennis G; Laramie, Jason M; Nilsen, Geoffrey B; Barber, Thomas D; Tan, Patrick; Hillmer, Axel M; Ng, Pauline C

    2014-11-01

    Asian nonsmoking populations have a higher incidence of lung cancer compared with their European counterparts. There is a long-standing hypothesis that the increase of lung cancer in Asian never-smokers is due to environmental factors such as second-hand smoke. We analyzed whole-genome sequencing of 30 Asian lung cancers. Unsupervised clustering of mutational signatures separated the patients into two categories of either all the never-smokers or all the smokers or ex-smokers. In addition, nearly one third of the ex-smokers and smokers classified with the never-smoker-like cluster. The somatic variant profiles of Asian lung cancers were similar to that of European origin with G.C>T.A being predominant in smokers. We found EGFR and TP53 to be the most frequently mutated genes with mutations in 50% and 27% of individuals, respectively. Among the 16 never-smokers, 69% had an EGFR mutation compared with 29% of 14 smokers/ex-smokers. Asian never-smokers had lung cancer signatures distinct from the smoker signature and their mutation profiles were similar to European never-smokers. The profiles of Asian and European smokers are also similar. Taken together, these results suggested that the same mutational mechanisms underlie the etiology for both ethnic groups. Thus, the high incidence of lung cancer in Asian never-smokers seems unlikely to be due to second-hand smoke or other carcinogens that cause oxidative DNA damage, implying that routine EGFR testing is warranted in the Asian population regardless of smoking status. PMID:25189529

  3. Somatic retrotransposition in human cancer revealed by whole-genome and exome sequencing

    E-print Network

    Helman, Elena

    Retrotransposons constitute a major source of genetic variation, and somatic retrotransposon insertions have been reported in cancer. Here, we applied TranspoSeq, a computational framework that identifies retrotransposon ...

  4. A sequence-level map of chromosomal breakpoints in the MCF-7 breast cancer cell line yields insights into the evolution of a cancer genome.

    PubMed

    Hampton, Oliver A; Den Hollander, Petra; Miller, Christopher A; Delgado, David A; Li, Jian; Coarfa, Cristian; Harris, Ronald A; Richards, Stephen; Scherer, Steven E; Muzny, Donna M; Gibbs, Richard A; Lee, Adrian V; Milosavljevic, Aleksandar

    2009-02-01

    By applying a method that combines end-sequence profiling and massively parallel sequencing, we obtained a sequence-level map of chromosomal aberrations in the genome of the MCF-7 breast cancer cell line. A total of 157 distinct somatic breakpoints of two distinct types, dispersed and clustered, were identified. A total of 89 breakpoints are evenly dispersed across the genome. A majority of dispersed breakpoints are in regions of low copy repeats (LCRs), indicating a possible role for LCRs in chromosome breakage. The remaining 68 breakpoints form four distinct clusters of closely spaced breakpoints that coincide with the four highly amplified regions in MCF-7 detected by array CGH located in the 1p13.1-p21.1, 3p14.1-p14.2, 17q22-q24.3, and 20q12-q13.33 chromosomal cytobands. The clustered breakpoints are not significantly associated with LCRs. Sequences flanking most (95%) breakpoint junctions are consistent with double-stranded DNA break repair by nonhomologous end-joining or template switching. A total of 79 known or predicted genes are involved in rearrangement events, including 10 fusions of coding exons from different genes and 77 other rearrangements. Four fusions result in novel expressed chimeric mRNA transcripts. One of the four expressed fusion products (RAD51C-ATXN7) and one gene truncation (BRIP1 or BACH1) involve genes coding for members of protein complexes responsible for homology-driven repair of double-stranded DNA breaks. Another one of the four expressed fusion products (ARFGEF2-SULF2) involves SULF2, a regulator of cell growth and angiogenesis. We show that knock-down of SULF2 in cell lines causes tumorigenic phenotypes, including increased proliferation, enhanced survival, and increased anchorage-independent growth. PMID:19056696

  5. The Trichomonas vaginalis Genome Sequencing Project

    NSDL National Science Digital Library

    The Institute for Genomic Research (TIGR) in 2003 released the first draft assembly of the Trichomonas vaginalis genome, available through this website to the academic and not-for-profit research community for noncommercial use only. TIGR will release more data at regular intervals during the sequencing project, which should help researchers better understand this widespread parasite and its role in HIV infection, neo-natal disorders, predisposition to cervical cancer, and of course, vaginitis. The website also includes background information on T. vaginalis, as well as a link to TIGR's sequencing project for Entamoeba histolytica -- a closely related organism.

  6. The genomic complexity of primary human prostate cancer

    E-print Network

    Carter, Scott Lambert

    Prostate cancer is the second most common cause of male cancer deaths in the United States. However, the full range of prostate cancer genomic alterations is incompletely characterized. Here we present the complete sequence ...

  7. Genomic Instability and Cancer

    PubMed Central

    Yao, Yixin; Dai, Wei

    2014-01-01

    Genomic instability is a characteristic of most cancer cells. It is an increased tendency of genome alteration during cell division. Cancer frequently results from damage to multiple genes controlling cell division and tumor suppressors. It is known that genomic integrity is closely monitored by several surveillance mechanisms, DNA damage checkpoint, DNA repair machinery and mitotic checkpoint. A defect in the regulation of any of these mechanisms often results in genomic instability, which predisposes the cell to malignant transformation. Posttranslational modifications of the histone tails are closely associated with regulation of the cell cycle as well as chromatin structure. Nevertheless, DNA methylation status is also related to genomic integrity. We attempt to summarize recent developments in this field and discuss the debate of driving force of tumor initiation and progression. PMID:25541596

  8. NCI Community Cancer Centers Program - Related Programs - The Cancer Genome Atlas

    Cancer.gov

    The Cancer Genome Atlas (TCGA) is a large-scale collaborative effort by NCI and the National Human Genome Research Institute (NHGRI) to systematically characterize the genomic changes that occur in cancer through the application of genome analysis technologies, including large-scale genome sequencing.

  9. Endometrial and acute myeloid leukemia cancer genomes characterized

    Cancer.gov

    The characterization of acute myeloid leukemia and endometrial cancer are the latest results of The Cancer Genome Atlas Research Network’s efforts to sequence the genomes of 20 major cancers. The photo above shows technicians from The Genome Institute at Washington University in St. Louis.

  10. Bacterial genome sequence bagged

    SciTech Connect

    Nowak, R.

    1995-07-28

    This is a summary of the research that produced the first complete genome of a free-living organism, the bacterium Haemophilus influenzae. Also included are practical information and the future possibilities of this type of research. The work was done partly under the auspices of the Human Genome Program.

  11. Genomic sequencing in clinical trials

    PubMed Central

    2011-01-01

    Human genome sequencing is the process by which the exact order of nucleic acid base pairs in the 24 human chromosomes is determined. Since the completion of the Human Genome Project in 2003, genomic sequencing is rapidly becoming a major part of our translational research efforts to understand and improve human health and disease. This article reviews the current and future directions of clinical research with respect to genomic sequencing, a technology that is just beginning to find its way into clinical trials both nationally and worldwide. We highlight the currently available types of genomic sequencing platforms, outline the advantages and disadvantages of each, and compare first- and next-generation techniques with respect to capabilities, quality, and cost. We describe the current geographical distributions and types of disease conditions in which these technologies are used, and how next-generation sequencing is strategically being incorporated into new and existing studies. Lastly, recent major breakthroughs and the ongoing challenges of using genomic sequencing in clinical research are discussed. PMID:22206293

  12. Punctuated Evolution of Prostate Cancer Genomes

    PubMed Central

    Baca, Sylvan C.; Prandi, Davide; Lawrence, Michael S.; Mosquera, Juan Miguel; Romanel, Alessandro; Drier, Yotam; Park, Kyung; Kitabayashi, Naoki; MacDonald, Theresa Y.; Ghandi, Mahmoud; Van Allen, Eliezer; Kryukov, Gregory V.; Sboner, Andrea; Theurillat, Jean-Philippe; Soong, T. David; Nickerson, Elizabeth; Auclair, Daniel; Tewari, Ashutosh; Beltran, Himisha; Onofrio, Robert C.; Boysen, Gunther; Guiducci, Candace; Barbieri, Christopher E.; Cibulskis, Kristian; Sivachenko, Andrey; Carter, Scott L.; Saksena, Gordon; Voet, Douglas; Ramos, Alex H; Winckler, Wendy; Cipicchio, Michelle; Ardlie, Kristin; Kantoff, Philip W.; Berger, Michael F.; Gabriel, Stacey B.; Golub, Todd R.; Meyerson, Matthew; Lander, Eric S.; Elemento, Olivier; Getz, Gad; Demichelis, Francesca; Rubin, Mark A.; Garraway, Levi A.

    2013-01-01

    SUMMARY The analysis of exonic DNA from prostate cancers has identified recurrently mutated genes, but the spectrum of genome-wide alterations has not been profiled extensively in this disease. We sequenced the genomes of 57 prostate tumors and matched normal tissues to characterize somatic alterations and to study how they accumulate during oncogenesis and progression. By modeling the genesis of genomic rearrangements, we identified abundant DNA translocations and deletions that arise in a highly interdependent manner. This phenomenon, which we term “chromoplexy”, frequently accounts for the dysregulation of prostate cancer genes and appears to disrupt multiple cancer genes coordinately. Our modeling suggests that chromoplexy may induce considerable genomic derangement over relatively few events in prostate cancer and other neoplasms, supporting a model of punctuated cancer evolution. By characterizing the clonal hierarchy of genomic lesions in prostate tumors, we charted a path of oncogenic events along which chromoplexy may drive prostate carcinogenesis. PMID:23622249

  13. Using the Potato Genome Sequence. Robin Buell

    E-print Network

    Douches, David S.

    Using the Potato Genome Sequence. Robin Buell, Michigan State University, Department of Plant Biology. August 15, 2010. buell@msu.edu. Whole Genome Shotgun Sequencing. New genomics & post-genomic biology genomes genera 2002 2010. So, you say you can sequence-Now what

  14. Integrating sequence, evolution and functional genomics in regulatory genomics

    PubMed Central

    Vingron, Martin; Brazma, Alvis; Coulson, Richard; van Helden, Jacques; Manke, Thomas; Palin, Kimmo; Sand, Olivier; Ukkonen, Esko

    2009-01-01

    With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome. PMID:19226437

  15. Comparative Sequencing of Plant Genomes: Choices to Make The first sequenced genome of a plant,

    E-print Network

    Purugganan, Michael D.

    COMMENTARY Comparative Sequencing of Plant Genomes: Choices to Make The first sequenced genome Information Entrez Genome Projects website reports that sequencing of several more plant genomes is in prog in plant genomics research. Many of the obvious candidates for genome sequencing, model species

  16. Poultry Genome Sequences: Progress and Outstanding Challenges

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The first build of the chicken genome sequence appeared in March 2004 – the first genome sequence of any animal agriculture species. That sequence was done primarily by whole genome shotgun Sanger sequencing, along with the use of an extensive BAC contig-based physical map to assemble the sequence ...

  17. DNA Methylation of Cancer Genome

    PubMed Central

    Cheung, Hoi-Hung; Lee, Tin-Lap; Rennert, Owen M.; Chan, Wai-Yee

    2010-01-01

    DNA methylation plays an important role in regulating normal development and carcinogenesis. Current understanding of the biological roles of DNA methylation is limited to its role in the regulation of gene transcription, genomic imprinting, genomic stability, and X chromosome inactivation. In the past 2 decades, a large number of changes have been identified in cancer epigenomes when compared with normals. These alterations fall into two main categories, namely, hypermethylation of tumor suppressor genes and hypomethylation of oncogenes or heterochromatin, respectively. Aberrant methylation of genes controlling the cell cycle, proliferation, apoptosis, metastasis, drug resistance, and intracellular signaling has been identified in multiple cancer types. Recent advancements in whole-genome analysis of methylome have yielded numerous differentially methylated regions, the functions of which are largely unknown. With the development of high resolution tiling microarrays and high throughput DNA sequencing, more cancer methylomes will be profiled, facilitating the identification of new candidate genes or ncRNAs that are related to oncogenesis, new prognostic markers, and the discovery of new target genes for cancer therapy. PMID:19960550

  18. Translating genomics in cancer care.

    PubMed

    Bombard, Yvonne; Bach, Peter B; Offit, Kenneth

    2013-11-01

    There is increasing enthusiasm for genomics and its promise in advancing personalized medicine. Genomic information has been used to personalize health care for decades, spanning the fields of cardiovascular disease, infectious disease, endocrinology, metabolic medicine, and hematology. However, oncology has often been the first test bed for the clinical translation of genomics for diagnostic, prognostic, and therapeutic applications. Notable hereditary cancer examples include testing for mutations in BRCA1 or BRCA2 in unaffected women to identify those at significantly elevated risk for developing breast and ovarian cancers, and screening patients with newly diagnosed colorectal cancer for mutations in 4 mismatch repair genes to reduce morbidity and mortality in their relatives. Somatic genomic testing is also increasingly used in oncology, with gene expression profiling of breast tumors and EGFR testing to predict treatment response representing commonly used examples. Health technology assessment provides a rigorous means to inform clinical and policy decision-making through systematic assessment of the evidentiary base, along with precepts of clinical effectiveness, cost-effectiveness, and consideration of risks and benefits for health care delivery and society. Although this evaluation is a fundamental step in the translation of any new therapeutic, procedure, or diagnostic test into clinical care, emerging developments may threaten this standard. These include "direct to consumer" genomic risk assessment services and the challenges posed by incidental results generated from next-generation sequencing (NGS) technologies. 
    This article presents a review of the evidentiary standards and knowledge base supporting the translation of key cancer genomic technologies along the continuum of validity, utility, cost-effectiveness, health service impacts, and ethical and societal issues, and offers future research considerations to guide the responsible introduction of NGS technologies into health care. It concludes that significant evidentiary gaps remain in translating genomic technologies into routine clinical practice, particularly in efficacy, health outcomes, cost-effectiveness, and health services research. These caveats are especially germane in the context of NGS, wherein efforts are underway to translate NGS results despite their limited accuracy, lack of proven efficacy, and significant computational and counseling challenges. Further research across these domains is critical to inform the effective, efficient, and equitable translation of genomics into cancer care. PMID:24225968

  19. The UCSC Cancer Genomics Browser: update 2013.

    PubMed

    Goldman, Mary; Craft, Brian; Swatloski, Teresa; Ellrott, Kyle; Cline, Melissa; Diekhans, Mark; Ma, Singer; Wilks, Chris; Stuart, Josh; Haussler, David; Zhu, Jingchun

    2013-01-01

    The UCSC Cancer Genomics Browser (https://genome-cancer.ucsc.edu/) is a set of web-based tools to display, investigate and analyse cancer genomics data and its associated clinical information. The browser provides whole-genome to base-pair level views of several different types of genomics data, including some next-generation sequencing platforms. The ability to view multiple datasets together allows users to make comparisons across different data and cancer types. Biological pathways, collections of genes, genomic or clinical information can be used to sort, aggregate and zoom into a group of samples. We currently display an expanding set of data from various sources, including 201 datasets from 22 TCGA (The Cancer Genome Atlas) cancers as well as data from Cancer Cell Line Encyclopedia and Stand Up To Cancer. New features include a completely redesigned user interface with an interactive tutorial and updated documentation. We have also added data downloads, additional clinical heatmap features, and an updated Tumor Image Browser based on Google Maps. New security features allow authenticated users access to private datasets hosted by several different consortia through the public website. PMID:23109555

  20. Whole genome sequencing in pharmacogenomics

    PubMed Central

    Katsila, Theodora

    2015-01-01

    Pharmacogenomics aims to shed light on the role of genes and genomic variants in clinical treatment response. Although several drug–gene relationships have been characterized to date, many challenges remain in applying pharmacogenomics in the clinic: clinical guidelines for pharmacogenomic testing are still in their infancy, whereas the emerging high-throughput genotyping technologies produce a tsunami of new findings. Herein, the potential of whole genome sequencing for pharmacogenomics research and clinical application is highlighted.

  1. Sequencing Complex Genomic Regions

    SciTech Connect

    Eichler, Evan [University of Washington]

    2009-05-28

    Evan Eichler, Howard Hughes Medical Investigator at the University of Washington, gives the May 28, 2009 keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM. Part 2 of 2

  2. Sequencing Complex Genomic Regions

    SciTech Connect

    Eichler, Evan [University of Washington]

    2009-05-28

    Evan Eichler, Howard Hughes Medical Investigator at the University of Washington, gives the May 28, 2009 keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM. Part 1 of 2

  3. Pairwise Comparison Between Genomic Sequences and

    E-print Network

    Mohri, Mehryar

    in comparative genomics and comparative optical-map study, respectively. A complete genome sequence ... Pairwise Comparison Between Genomic Sequences and Optical-maps by Bing Sun A dissertation submitted experimental technologies, massive amount of biological data including genomic sequences and optical-maps

  4. TCGA Announces Launch of the Cancer Genomics Hub

    Cancer.gov

    The Cancer Genome Atlas (TCGA) announces the beta launch of the Cancer Genomics Hub (CGHub) as the new secure repository for storing, cataloging, and accessing cancer genome sequences and alignments from TCGA. CGHub is managed by the University of California, Santa Cruz (UCSC), under a subcontract from SAIC-Frederick and will replace the function of the NCBI Sequence Read Archive for the TCGA program.

  5. Next-generation sequencing for lung cancer.

    PubMed

    Wu, Kehua; Huang, R Stephanie; House, Larry; Cho, William Chi

    2013-09-01

    Lung cancer is biologically aggressive and is the leading cause of cancer-related deaths. The development of lung cancer is unique in each patient according to clinical characterizations, prognosis, response and tolerance to treatment. Traditional capillary-based single-gene sequencing by a first-generation technique (known as Sanger sequencing) has been replaced by next-generation sequencing (NGS) since it allows massive parallel sequencing with lower cost and higher throughput. The NGS approach has made remarkable advances compared with traditional methods. We expect these methodologies to comprehensively interpret the global landscape of cancer and provide more information to fulfill the needs of personalized medicine. This review covers a brief introduction and summary on various NGS technologies, applications and important findings by NGS in lung cancer advances, including further discoveries in previously known target genes (EGFR, ALK and KRAS), the identification of additional lung cancer mutations and the global coordination of cancer genome studies. PMID:23980680

  6. NIH Launches Comprehensive Effort to Explore Cancer Genomics

    Cancer.gov

    The National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), both part of the National Institutes of Health (NIH), today launched a comprehensive effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, especially large-scale genome sequencing.

  7. Sequencing the Sinorhizobium meliloti genome.

    PubMed

    Galibert, F; Barloy-Hubler, F; Capela, D; Gouzy, J

    2000-01-01

    The Sinorhizobium meliloti genome consists of three replicons. This bacterium forms an intricate symbiotic relationship with the roots of certain legumes and is considered as an agriculturally important nitrogen-fixer. A consortium of 6 European laboratories was organized to sequence its single chromosome (3.7 Mb), whereas the other two elements (pSyma 1.4 Mb and pSymb 1.7 Mb) will be sequenced by other groups. PMID:11092731

  8. The breast cancer genome - a key for better oncology

    E-print Network

    Vollan, Hans Kristian Moen; Caldas, Carlos

    2011-11-30

    Abstract Molecular classification has added important knowledge to breast cancer biology, but has yet to be implemented as a clinical standard. Full sequencing of breast cancer genomes could potentially refine classification and give a more complete...

  9. Utilization of the Human Genome Sequence Localizes Human Papillomavirus Type 16 DNA Integrated into the TNFAIP2 Gene in a Fatal Cervical Cancer from a 39-Year-Old Woman

    Microsoft Academic Search

    Mark H. Einstein; Yvette Cruz; Mustafa K. El-Awady; Nicolas C. Popescu; Joseph A. DiPaolo; Marc van Ranst; Anna S. Kadish; Seymour Romney; Carolyn D. Runowicz; Robert D. Burk

    Purpose: The purpose of our study was to characterize a human papillomavirus (HPV) 16 DNA integration in the genome of a rapidly progressive, lethal cervical cancer in a 39-year-old woman. Experimental Design: An HPV 16 integration site from cervical cancer tissue was cloned and analyzed using Southern blot hybridization, nucleotide sequencing, fluorescence in situ hybridization analysis for chromosomal localization

  10. SPECIAL FEATURES Genomic Sequence Databases

    E-print Network

    Waterman, Michael S.

    SPECIAL FEATURES COMMENTARY Genomic Sequence Databases MICHAEL S. WATERMAN Departments at the molecular level are now being felt in other areas such as cell biology and medicine. The quantity a smaller number of people from chemistry, physics, medicine, the mathematical sciences, and other fields

  11. Genome Sequence of Salmonella Phage χ

    PubMed Central

    Ko, Ching-Chung; Jacobs-Sera, Deborah; Hatfull, Graham F.; Erhardt, Marc; Hughes, Kelly T.; Casjens, Sherwood R.

    2015-01-01

    Salmonella bacteriophage χ is a member of the Siphoviridae family that gains entry into its host cells by adsorbing to their flagella. We report the complete 59,578-bp sequence of the genome of phage χ, which together with its relatives, exemplifies a largely unexplored type of tailed bacteriophage. PMID:25720684

  12. Colon cancer-derived oncogenic EGFR G724S mutant identified by whole genome sequence analysis is dependent on asymmetric dimerization and sensitive to cetuximab

    PubMed Central

    2014-01-01

    Background Inhibition of the activated epidermal growth factor receptor (EGFR) with either enzymatic kinase inhibitors or anti-EGFR antibodies such as cetuximab, is an effective modality of treatment for multiple human cancers. Enzymatic EGFR inhibitors are effective for lung adenocarcinomas with somatic kinase domain EGFR mutations while, paradoxically, anti-EGFR antibodies are more effective in colon and head and neck cancers where EGFR mutations occur less frequently. In colorectal cancer, anti-EGFR antibodies are routinely used as second-line therapy of KRAS wild-type tumors. However, detailed mechanisms and genomic predictors for pharmacological response to these antibodies in colon cancer remain unclear. Findings We describe a case of colorectal adenocarcinoma, which was found to harbor a kinase domain mutation, G724S, in EGFR through whole genome sequencing. We show that G724S mutant EGFR is oncogenic and that it differs from classic lung cancer derived EGFR mutants in that it is cetuximab responsive in vitro, yet relatively insensitive to small molecule kinase inhibitors. Through biochemical and cellular pharmacologic studies, we have determined that cells harboring the colon cancer-derived G719S and G724S mutants are responsive to cetuximab therapy in vitro and found that the requirement for asymmetric dimerization of these mutant EGFR to promote cellular transformation may explain their greater inhibition by cetuximab than small-molecule kinase inhibitors. Conclusion The colon-cancer derived G719S and G724S mutants are oncogenic and sensitive in vitro to cetuximab. These data suggest that patients with these mutations may benefit from the use of anti-EGFR antibodies as part of the first-line therapy. PMID:24894453

  13. Patterns of somatic mutation in human cancer genomes

    Microsoft Academic Search

    Christopher Greenman; Philip Stephens; Raffaella Smith; Gillian L. Dalgliesh; Christopher Hunter; Graham Bignell; Helen Davies; Jon Teague; Adam Butler; Claire Stevens; Sarah Edkins; Sarah O'Meara; Imre Vastrik; Esther E. Schmidt; Tim Avis; Syd Barthorpe; Gurpreet Bhamra; Gemma Buck; Bhudipa Choudhury; Jody Clements; Jennifer Cole; Ed Dicks; Simon Forbes; Kris Gray; Kelly Halliday; Rachel Harrison; Katy Hills; Jon Hinton; Andy Jenkinson; David Jones; Andy Menzies; Tatiana Mironenko; Janet Perry; Keiran Raine; Dave Richardson; Rebecca Shepherd; Alexandra Small; Calli Tofts; Jennifer Varian; Tony Webb; Sofie West; Sara Widaa; Andy Yates; Daniel P. Cahill; David N. Louis; Peter Goldstraw; Andrew G. Nicholson; Francis Brasseur; Leendert Looijenga; Barbara L. Weber; Yoke-Eng Chiew; Anna Defazio; Mel F. Greaves; Anthony R. Green; Peter Campbell; Ewan Birney; Douglas F. Easton; Georgia Chenevix-Trench; Min-Han Tan; Sok Kean Khoo; Bin Tean Teh; Siu Tsan Yuen; Suet Yi Leung; Richard Wooster; P. Andrew Futreal; Michael R. Stratton

    2007-01-01

    Cancers arise owing to mutations in a subset of genes that confer growth advantage. The availability of the human genome sequence led us to propose that systematic resequencing of cancer genomes for mutations would lead to the discovery of many additional cancer genes. Here we report more than 1,000 somatic mutations found in 274 megabases (Mb) of DNA corresponding to the

  14. Fuzzy Genome Sequence Assembly for Single and Environmental Genomes

    E-print Network

    Nicolescu, Monica

    and to the first genome sequence assembly, Bacteriophage φX174 [38]. In 1990 the Human Genome Project in 2003, two years before its projected date. In 1993 The Institute for Genome advancements in technology that led to the complete sequencing of the Human Genome and the H. influenzae

  15. Genomic medicine for cancer diagnosis.

    PubMed

    Gordon, Benjamin L; Finnerty, Brendan M; Aronova, Anna; Fahey, Thomas J

    2015-01-01

    Genomic diagnostics in cancer has evolved since the completion of the Human Genome Project and the advancements made in diagnosis and therapy in chronic myelogenous leukemia. Among the diseases to achieve limited success or potentially benefit from diagnostic genetic testing are thyroid cancer, Burkitt's lymphoma, gastrointestinal stromal tumors, adrenocortical carcinoma, and colorectal cancer. With increased understanding of genomics, genetic tests should improve diagnosis and help guide medical and surgical management. PMID:25346009

  16. The Sequence of the Human Genome

    Microsoft Academic Search

    J. Craig Venter; Mark D. Adams; Eugene W. Myers; Peter W. Li; Richard J. Mural; Granger G. Sutton; Hamilton O. Smith; Mark Yandell; Cheryl A. Evans; Robert A. Holt; Jeannine D. Gocayne; Peter Amanatides; Richard M. Ballew; Daniel H. Huson; Jennifer R. Wortman; Qing Zhang; Chinnappa D. Kodira; Xiangqun H. Zheng; Lin Chen; Marian Skupski; Gangadharan Subramanian; Paul D. Thomas; Jinghui Zhang; George L. Gabor Miklos; Catherine Nelson; Samuel Broder; Andrew G. Clark; Joe Nadeau; Victor A. McKusick; Norton Zinder; Arnold J. Levine; Mel Simon; Carolyn Slayman; Michael Hunkapiller; Randall Bolanos; Arthur Delcher; Ian Dew; Daniel Fasulo; Michael Flanigan; Liliana Florea; Aaron Halpern; Sridhar Hannenhalli; Saul Kravitz; Samuel Levy; Clark Mobarry; Knut Reinert; Karin Remington; Jane Abu-Threideh; Ellen Beasley; Kendra Biddick; Vivien Bonazzi; Rhonda Brandon; Michele Cargill; Ishwar Chandramouliswaran; Rosane Charlab; Kabir Chaturvedi; Zuoming Deng; Valentina Di Francesco; Patrick Dunn; Karen Eilbeck; Carlos Evangelista; Andrei E. Gabrielian; Weiniu Gan; Wangmao Ge; Fangcheng Gong; Zhiping Gu; Ping Guan; Thomas J. Heiman; Maureen E. Higgins; Rui-Ru Ji; Zhaoxi Ke; Karen A. Ketchum; Zhongwu Lai; Yiding Lei; Zhenya Li; Jiayin Li; Yong Liang; Xiaoying Lin; Fu Lu; Gennady V. Merkulov; Natalia Milshina; Helen M. Moore; Ashwinikumar K Naik; Vaibhav A. Narayan; Beena Neelam; Deborah Nusskern; Douglas B. Rusch; Steven Salzberg; Wei Shao; Bixiong Shue; Jingtao Sun; Zhen Yuan Wang; Aihui Wang; Xin Wang; Jian Wang; Ming-Hui Wei; Ron Wides; Chunlin Xiao; Chunhua Yan; Alison Yao; Jane Ye; Ming Zhan; Weiqing Zhang; Hongyu Zhang; Qi Zhao; Liansheng Zheng; Fei Zhong; Wenyan Zhong; Shiaoping C. 
Zhu; Shaying Zhao; Dennis Gilbert; Suzanna Baumhueter; Gene Spier; Christine Carter; Anibal Cravchik; Trevor Woodage; Feroze Ali; Huijin An; Aderonke Awe; Danita Baldwin; Holly Baden; Mary Barnstead; Ian Barrow; Karen Beeson; Dana Busam; Amy Carver; Ming Lai Cheng; Liz Curry; Steve Danaher; Lionel Davenport; Raymond Desilets; Susanne Dietz; Kristina Dodson; Lisa Doup; Steven Ferriera; Neha Garg; Andres Gluecksmann; Brit Hart; Jason Haynes; Charles Haynes; Cheryl Heiner; Suzanne Hladun; Damon Hostin; Jarrett Houck; Timothy Howland; Chinyere Ibegwam; Jeffery Johnson; Francis Kalush; Lesley Kline; Shashi Koduru; Amy Love; Felecia Mann; David May; Steven McCawley; Tina McIntosh; Ivy McMullen; Mee Moy; Linda Moy; Brian Murphy; Keith Nelson; Cynthia Pfannkoch; Eric Pratts; Vinita Puri; Hina Qureshi; Matthew Reardon; Robert Rodriguez; Yu-Hui Rogers; Deanna Romblad; Bob Ruhfel; Richard Scott; Cynthia Sitter; Michelle Smallwood; Erin Stewart; Renee Strong; Ellen Suh; Reginald Thomas; Ni Ni Tint; Sukyee Tse; Claire Vech; Gary Wang; Jeremy Wetter; Sherita Williams; Monica Williams; Sandra Windsor; Emily Winn-Deen; Keriellen Wolfe; Jayshree Zaveri; Karena Zaveri; Josep F. Abril; Roderic Guigo; Michael J. Campbell; Kimmen V. 
Sjolander; Brian Karlak; Anish Kejariwal; Huaiyu Mi; Betty Lazareva; Thomas Hatton; Apurva Narechania; Karen Diemer; Anushya Muruganujan; Nan Guo; Shinji Sato; Vineet Bafna; Sorin Istrail; Ross Lippert; Russell Schwartz; Brian Walenz; Shibu Yooseph; David Allen; Anand Basu; James Baxendale; Louis Blick; Marcelo Caminha; John Carnes-Stine; Parris Caulk; Yen-Hui Chiang; Carl Dahlke; Anne Deslattes Mays; Maria Dombroski; Michael Donnelly; Dale Ely; Shiva Esparham; Carl Fosler; Harold Gire; Stephen Glanowski; Kenneth Glasser; Anna Glodek; Mark Gorokhov; Ken Graham; Barry Gropman; Michael Harris; Jeremy Heil; Scott Henderson; Jeffrey Hoover; Donald Jennings; John Kasha; Leonid Kagan; Cheryl Kraft; Alexander Levitsky; Mark Lewis; Xiangjun Liu; John Lopez; Daniel Ma; William Majoros; Joe McDaniel; Sean Murphy; Matthew Newman; Trung Nguyen; Ngoc Nguyen; Marc Nodell; Sue Pan; Jim Peck; Marshall Peterson; William Rowe; Robert Sanders; John Scott; Michael Simpson; Thomas Smith; Arlan Sprague; Timothy Stockwell; Russell Turner; Eli Venter; Mei Wang; Meiyuan Wen; David Wu; Mitchell Wu; Ashley Xia; Ali Zandieh; Xiaohong Zhu

    2001-01-01

    A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies—a whole-genome

  17. Update on the Maize Genome Sequencing Project The Maize Genome Sequencing Project

    E-print Network

    Brendel, Volker

    Update on the Maize Genome Sequencing Project The Maize Genome Sequencing Project Vicki L. Chandler Genome Sequencing Project. The momentum for this endeavor has been building within the maize (Zea mays and human genomes (Gregory et al., 2002). Our current picture of the maize genome is largely derived from

  18. Sequence and analysis of the Arabidopsis genome

    Microsoft Academic Search

    Michael Bevan; Klaus Mayer; Owen White; Jonathan A Eisen; Daphne Preuss; Thomas Bureau; Steven L Salzberg; Hans-Werner Mewes

    2001-01-01

    The comprehensive analysis of the genome sequence of the plant Arabidopsis thaliana has been completed recently. The genome sequence and associated analyses provide the foundations for rapid progress in many fields of plant research, such as the exploitation of genetic variation in Arabidopsis ecotypes, the assessment of the transcriptome and proteome, and the association of genome changes at the sequence

  19. Visualizing multidimensional cancer genomics data.

    PubMed

    Schroeder, Michael P; Gonzalez-Perez, Abel; Lopez-Bigas, Nuria

    2013-01-01

    Cancer genomics projects employ high-throughput technologies to identify the complete catalog of somatic alterations that characterize the genome, transcriptome and epigenome of cohorts of tumor samples. Examples include projects carried out by the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). A crucial step in the extraction of knowledge from the data is the exploration by experts of the different alterations, as well as the multiple relationships between them. To that end, the use of intuitive visualization tools that can integrate different types of alterations with clinical data is essential to the field of cancer genomics. Here, we review effective and common visualization techniques for exploring oncogenomics data and discuss a selection of tools that allow researchers to effectively visualize multidimensional oncogenomics datasets. The review covers visualization methods employed by tools such as Circos, Gitools, the Integrative Genomics Viewer, Cytoscape, Savant Genome Browser, StratomeX and platforms such as cBio Cancer Genomics Portal, IntOGen, the UCSC Cancer Genomics Browser, the Regulome Explorer and the Cancer Genome Workbench. PMID:23363777

  20. The interactive online SKY/M-FISH & CGH database and the Entrez cancer chromosomes search database: linkage of chromosomal aberrations with the genome sequence.

    PubMed

    Knutsen, Turid; Gobu, Vasuki; Knaus, Rodger; Padilla-Nash, Hesed; Augustus, Meena; Strausberg, Robert L; Kirsch, Ilan R; Sirotkin, Karl; Ried, Thomas

    2005-09-01

    To catalog data on chromosomal aberrations in cancer derived from emerging molecular cytogenetic techniques and to integrate these data with genome maps, we have established two resources, the NCI and NCBI SKY/M-FISH & CGH Database and the Cancer Chromosomes database. The goal of the former is to allow investigators to submit and analyze clinical and research cytogenetic data. It contains a karyotype parser tool, which automatically converts the ISCN short-form karyotype into an internal representation displayed in detailed form and as a colored ideogram with band overlay, and also has a tool to compare CGH profiles from multiple cases. The Cancer Chromosomes database integrates the SKY/M-FISH & CGH Database with the Mitelman Database of Chromosome Aberrations in Cancer and the Recurrent Chromosome Aberrations in Cancer database. These three datasets can now be searched seamlessly by use of the Entrez search and retrieval system for chromosome aberrations, clinical data, and reference citations. Common diagnoses, anatomic sites, chromosome breakpoints, junctions, numerical and structural abnormalities, and bands gained and lost among selected cases can be compared by use of the "similarity" report. Because the model used for CGH data is a subset of the karyotype data, it is now possible to examine the similarities between CGH results and karyotypes directly. All chromosomal bands are directly linked to the Entrez Map Viewer database, providing integration of cytogenetic data with the sequence assembly. These resources, developed as a part of the Cancer Chromosome Aberration Project (CCAP) initiative, aid the search for new cancer-associated genes and foster insights into the causes and consequences of genetic alterations in cancer. PMID:15934046

  1. The cancer genome: from structure to function.

    PubMed

    Geurts van Kessel, Ad

    2014-06-01

    The 2014 joint meeting of the International Society for Cellular Oncology (ISCO) and the European Workshop on Cytogenetics and Molecular Genetics of Solid Tumors (EWCMST), organized by Nick Gilbert, Juan Cigudosa and Bauke Ylstra, was held from 11 to 14 May in Malaga, Spain. Since the previous meeting in 2012, the ever increasing availability of new sequencing technologies has enabled the analysis of cancer genomes at an increasingly greater detail. In addition to structural changes in the genome (i.e., translocations, deletions, amplifications), frequent mutations in important regulatory genes have been found to occur, as also frequent alterations in a large number of epigenetic factors. The challenge now is to relate structural changes in cancer genomes to the underlying disease mechanisms and to reveal opportunities for the design of novel (targeted) therapies. During the meeting, various topics related to these challenges and opportunities were addressed, including those dealing with functional genomics, genome instability, biomarkers and diagnostics, cancer genetics and epigenomics. Special attention was paid to therapy-driven cancer evolution (keynote lecture) and relationships between DNA repair, cancer and ageing (Prof. Ploem lecture). Based on the information presented at the meeting, several aspects of the cancer genome and its functional implications are provided in this report. PMID:24980027

  2. Complementary DNA Sequencing: Expressed Sequence Tags and Human Genome Project

    Microsoft Academic Search

    Mark D. Adams; Jenny M. Kelley; Jeannine D. Gocayne; Mark Dubnick; Mihael H. Polymeropoulos; Hong Xiao; Carl R. Merril; Andrew Wu; Bjorn Olde; Ruben F. Moreno; Anthony R. Kerlavage; W. Richard McCombie; J. Craig Venter

    1991-01-01

    Automated partial DNA sequencing was conducted on more than 600 randomly selected human brain complementary DNA (cDNA) clones to generate expressed sequence tags (ESTs). ESTs have applications in the discovery of new human genes, mapping of the human genome, and identification of coding regions in genomic sequences. Of the sequences generated, 337 represent new genes, including 48 with significant similarity

  3. Breast cancer genome heterogeneity: a challenge to personalised medicine?

    PubMed

    Swanton, Charles; Burrell, Rebecca A; Futreal, P Andrew

    2011-01-01

    Implementation of high-throughput genomics sequencing approaches into routine laboratory practice has raised the potential for the identification of multiple breast cancer targets suitable for future therapeutic intervention in order to improve cancer outcomes. Results from these studies have revealed bewildering breast cancer genome complexity with very few aberrations occurring in common between breast cancers. In addition, such complexity is compounded by evidence of genomic heterogeneity occurring within individual breast cancers. Such inter-tumoural and intratumoural heterogeneity is likely to present a challenge to personalised therapeutic approaches that might be circumvented through the definition of genome instability mechanisms governing such diversity and their exploitation using synthetic lethal approaches. PMID:21345264

  4. Cancer Genomics Research Laboratory

    Cancer.gov

    CGR’s high throughput laboratory is equipped with state-of-the-art laboratory equipment and automation systems for a large number of applications. CGR supports DCEG in all stages of cancer research from planning to publishing, including experimental design and project management, sample handling, genotyping and sequencing assay design and execution, development and implementation of bioinformatic pipelines, and downstream scientific research and analytical support.

  5. A sequence-based survey of the complex structural organization of tumor genomes

    Microsoft Academic Search

    Benjamin J Raphael; Stanislav Volik; Peng Yu; Chunxiao Wu; Guiqing Huang; Elena V Linardopoulou; Barbara J Trask; Frederic Waldman; Joseph Costello; Kenneth J Pienta; Gordon B Mills; Krystyna Bajsarowicz; Yasuko Kobayashi; Shivaranjani Sridharan; Pamela L Paris; Quanzhou Tao; Sarah J Aerni; Raymond P Brown; Ali Bashir; Joe W Gray; Jan-Fang Cheng; Pieter de Jong; Mikhail Nefedov; Thomas Ried; Hesed M Padilla-Nash; Colin C Collins

    2008-01-01

    BACKGROUND: The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using end sequencing profiling, which relies on paired-end sequencing of cloned tumor genomes. RESULTS: In the present study brain, breast, ovary, and prostate tumors, along with three breast cancer cell lines, were surveyed using end sequencing profiling, yielding the largest available

  6. Human Whole-Genome Shotgun Sequencing

    Microsoft Academic Search

    James L. Weber; Eugene W. Myers

    1997-01-01

    Large-scale sequencing of the human genome is now under way (Boguski et al. 1996; Marshall and Pennisi 1996). Although at the beginning of the Genome Project, many doubted the scientific value of sequencing the entire human genome, these doubts have evaporated almost entirely (Gibbs 1995; Olson 1995). Primary reasons for generating the human genomic sequence are listed in Table 1. The

  7. Genome sequencing and functional genomics approaches in tomato

    Microsoft Academic Search

    Daisuke Shibata

    2005-01-01

    Tomato genome sequencing has been taking place through an international, 10-year initiative entitled the “International Solanaceae Genome Project” (SOL). The strategy proposed by the SOL consortium is to sequence the approximately 220 Mb of euchromatin that contains the majority of genes, rather than the entire tomato genome. Tomato and other Solanaceae plants have unique developmental aspects, such as the formation of

  8. Whole genome sequencing reveals potential targets for therapy in patients with refractory KRAS mutated metastatic colorectal cancer

    PubMed Central

    2014-01-01

    Background The outcome of patients with metastatic colorectal carcinoma (mCRC) following first line therapy is poor, with median survival of less than one year. The purpose of this study was to identify candidate therapeutically targetable somatic events in mCRC patient samples by whole genome sequencing (WGS), so as to obtain targeted treatment strategies for individual patients. Methods Four patients were recruited, all of whom had received > 2 prior therapy regimens. Percutaneous needle biopsies of metastases were performed with whole blood collection for the extraction of constitutional DNA. One tumor was not included in this study as the quality of tumor tissue was not sufficient for further analysis. WGS was performed using Illumina paired end chemistry on HiSeq2000 sequencing systems, which yielded coverage of greater than 30X for all samples. NGS data were processed and analyzed to detect somatic genomic alterations including point mutations, indels, copy number alterations, translocations and rearrangements. Results All 3 tumor samples had KRAS mutations, while 2 tumors contained mutations in the APC gene and the PIK3CA gene. Although we did not identify a TCF7L2-VTI1A translocation, we did detect a TCF7L2 mutation in one tumor. Among the other interesting mutated genes was INPPL1, an important gene involved in PI3 kinase signaling. Functional studies demonstrated that inhibition of INPPL1 reduced growth of CRC cells, suggesting that INPPL1 may promote growth in CRC. Conclusions Our study further supports potential molecularly defined therapeutic contexts that might provide insights into treatment strategies for refractory mCRC. New insights into the role of INPPL1 in colon tumor cell growth have also been identified. Continued development of appropriate targeted agents towards specific events may be warranted to help improve outcomes in CRC. PMID:24943349

  9. TUMOR HAPLOTYPE ASSEMBLY ALGORITHMS FOR CANCER GENOMICS

    PubMed Central

    AGUIAR, DEREK; WONG, WENDY S.W.; ISTRAIL, SORIN

    2014-01-01

    The growing availability of inexpensive high-throughput sequence data is enabling researchers to sequence tumor populations within a single individual at high coverage. But, cancer genome sequence evolution and mutational phenomena like driver mutations and gene fusions are difficult to investigate without first reconstructing tumor haplotype sequences. Haplotype assembly of single individual tumor populations is an exceedingly difficult task complicated by tumor haplotype heterogeneity, tumor or normal cell sequence contamination, polyploidy, and complex patterns of variation. While computational and experimental haplotype phasing of diploid genomes has seen much progress in recent years, haplotype assembly in cancer genomes remains uncharted territory. In this work, we describe HapCompass-Tumor a computational modeling and algorithmic framework for haplotype assembly of copy number variable cancer genomes containing haplotypes at different frequencies and complex variation. We extend our polyploid haplotype assembly model and present novel algorithms for (1) complex variations, including copy number changes, as varying numbers of disjoint paths in an associated graph, (2) variable haplotype frequencies and contamination, and (3) computation of tumor haplotypes using simple cycles of the compass graph which constrain the space of haplotype assembly solutions. The model and algorithm are implemented in the software package HapCompass-Tumor which is available for download from http://www.brown.edu/Research/Istrail_Lab/. PMID:24297529

  10. The genomic complexity of primary human prostate cancer

    Microsoft Academic Search

    Michael F. Berger; Michael S. Lawrence; Francesca Demichelis; Yotam Drier; Kristian Cibulskis; Andrey Y. Sivachenko; Andrea Sboner; Raquel Esgueva; Dorothee Pflueger; Carrie Sougnez; Robert Onofrio; Scott L. Carter; Kyung Park; Lukas Habegger; Lauren Ambrogio; Timothy Fennell; Melissa Parkin; Gordon Saksena; Douglas Voet; Alex H. Ramos; Trevor J. Pugh; Jane Wilkinson; Sheila Fisher; Wendy Winckler; Scott Mahan; Kristin Ardlie; Jennifer Baldwin; Jonathan W. Simons; Naoki Kitabayashi; Theresa Y. MacDonald; Philip W. Kantoff; Lynda Chin; Stacey B. Gabriel; Mark B. Gerstein; Todd R. Golub; Matthew Meyerson; Ashutosh Tewari; Eric S. Lander; Gad Getz; Mark A. Rubin; Levi A. Garraway

    2011-01-01

    Prostate cancer is the second most common cause of male cancer deaths in the United States. However, the full range of prostate cancer genomic alterations is incompletely characterized. Here we present the complete sequence of seven primary human prostate cancers and their paired normal counterparts. Several tumours contained complex chains of balanced (that is, `copy-neutral') rearrangements that occurred within or

  11. Genome-wide small nucleolar RNA expression analysis of lung cancer by next-generation deep sequencing.

    PubMed

    Gao, Lu; Ma, Jie; Mannoor, Kaiissar; Guarnera, Maria A; Shetty, Amol; Zhan, Min; Xing, Lingxiao; Stass, Sanford A; Jiang, Feng

    2015-03-15

    Emerging evidence indicates that small nucleolar RNAs (snoRNAs), a class of small noncoding RNAs, may play important function in tumorigenesis. Nonsmall-cell lung cancer (NSCLC) is the number one cancer killer for men and women. Systematically characterizing snoRNAs in NSCLC will develop biomarkers for its early detection and prognostication. We used next-generation deep sequencing to comprehensively characterize snoRNA profiles in 12 NSCLC tissues. We used quantitative reverse transcription polymerase chain reaction (qRT-PCR) to verify the findings in 40 surgical Stage I NSCLC specimens and 126 frozen NSCLC tissues of different stages. The 126 NSCLC tissues were divided into a training set and a testing set. Deep sequencing identified 458 snoRNAs, of which, 29 had a ≥3.0-fold expression level change in Stage I NSCLC tissues versus normal tissues. qRT-PCR analysis showed that 16 of 29 snoRNAs exhibited consistent changes with deep sequencing data. The 16 snoRNAs exhibited 0.75-0.94 area under receiver-operator characteristic curve values in distinguishing lung tumor from normal lung tissues (all ≤0.0001) with 70.0-95.0% sensitivity and 70.0-95.0% specificity. Six genes (snoRA47, snoRA68, snoRA78, snoRA21, snoRD28 and snoRD66) were identified whose expressions were associated with overall survival of the NSCLC patients. A prediction model consisting of three genes (snoRA47, snoRA68 and snoRA78) was developed in the training set of 77 cases, which could significantly predict overall survival of the NSCLC patients (p?

  12. Sequencing Intractable DNA to Close Microbial Genomes

    SciTech Connect

    Hurt, Jr., Richard Ashley [ORNL; Brown, Steven D [ORNL; Podar, Mircea [ORNL; Palumbo, Anthony Vito [ORNL; Elias, Dwayne A [ORNL

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled intractable resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such difficult regions in the non-contiguous finished Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. These developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  13. Differential genome-wide profiling of tandem 3′ UTRs among human breast cancer and normal cells by high-throughput sequencing

    PubMed Central

    Fu, Yonggui; Sun, Yu; Li, Yuxin; Li, Jie; Rao, Xingqiang; Chen, Chong; Xu, Anlong

    2011-01-01

    Tandem 3′ UTRs produced by alternative polyadenylation (APA) play an important role in gene expression by impacting mRNA stability, translation, and translocation in cells. Several studies have investigated APA site switching in various physiological states; nevertheless, they only focused on either the genes with two known APA sites or several candidate genes. Here, we developed a strategy to study APA sites in a genome-wide fashion with second-generation sequencing technology which could not only identify new polyadenylation sites but also analyze the APA site switching of all genes, especially those with more than two APA sites. We used this strategy to explore the profiling of APA sites in two human breast cancer cell lines, MCF7 and MB231, and one cultured mammary epithelial cell line, MCF10A. More than half of the identified polyadenylation sites are not included in human poly(A) databases. While MCF7 showed shortening 3′ UTRs, more genes in MB231 switched to distal poly(A) sites. Several gene ontology (GO) terms and pathways were enriched in the list of genes with switched APA sites, including cell cycle, apoptosis, and metabolism. These results suggest a more complex regulation of APA sites in cancer cells than previously thought. In short, our novel unbiased method can be a powerful approach to cost-effectively investigate the complex mechanism of 3′ UTR switching in a genome-wide fashion among various physiological processes and diseases. PMID:21474764

  14. The Human Genome Project: Sequencing the Future

    E-print Network

    The Human Genome Project: Sequencing the Future I n 1986, the U.S. Department of Energy (DOE and unilateral step by announcing its Human Genome Initiative--forerunner of the Human Genome Project critical areas, including those important to DOE missions. The Human Genome Project and DOE's complementary

  15. Expanding the computational toolbox for mining cancer genomes

    PubMed Central

    Ding, Li; Wendl, Michael C.; McMichael, Joshua F.; Raphael, Benjamin J.

    2014-01-01

    High-throughput DNA sequencing has revolutionized cancer genomics with numerous discoveries relevant to cancer diagnosis and treatment. The latest sequencing and analysis methods have successfully identified somatic alterations including single nucleotide variants (SNVs), insertions and deletions (indels), structural aberrations, and gene fusions. Additional computational techniques have proved useful to define those mutations, genes, and molecular networks that drive diverse cancer phenotypes as well as determine clonal architectures in tumour samples. Collectively, these tools have advanced the study of genomic, transcriptomic, epigenomic alterations and their association to clinical properties. Here, we review cancer genomics software and the insights that have been gained from their application. PMID:25001846

  16. Value of a newly sequenced bacterial genome

    PubMed Central

    Barbosa, Eudes GV; Aburjaile, Flavia F; Ramos, Rommel TJ; Carneiro, Adriana R; Le Loir, Yves; Baumbach, Jan; Miyoshi, Anderson; Silva, Artur; Azevedo, Vasco

    2014-01-01

    Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the “scientific value” of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information. PMID:24921006

  17. Marsupial Genome Sequences: Providing Insight into Evolution and Disease

    PubMed Central

    Deakin, Janine E.

    2012-01-01

    Marsupials (metatherians), with their position in vertebrate phylogeny and their unique biological features, have been studied for many years by a dedicated group of researchers, but it has only been since the sequencing of the first marsupial genome that their value has been more widely recognised. We now have genome sequences for three distantly related marsupial species (the grey short-tailed opossum, the tammar wallaby, and Tasmanian devil), with the promise of many more genomes to be sequenced in the near future, making this a particularly exciting time in marsupial genomics. The emergence of a transmissible cancer, which is obliterating the Tasmanian devil population, has increased the importance of obtaining and analysing marsupial genome sequence for understanding such diseases as well as for conservation efforts. In addition, these genome sequences have facilitated studies aimed at answering questions regarding gene and genome evolution and provided insight into the evolution of epigenetic mechanisms. Here I highlight the major advances in our understanding of evolution and disease, facilitated by marsupial genome projects, and speculate on the future contributions to be made by such sequences. PMID:24278712

  18. Genomics at the Ontario Institute for Cancer Research

    SciTech Connect

    Ali, Johar [Ontario Institute for Cancer Research]

    2010-06-02

    Johar Ali of the Ontario Institute for Cancer Research discusses genomics and next-gen applications at the OICR on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM.

  19. The Genome Sequencing Center at NCGR

    SciTech Connect

    Schilkey, Faye [National Center for Genome Resources]

    2010-06-02

    Faye Schilkey from the National Center for Genome Resources discusses NCGR's research, sequencing and analysis experience on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM.

  20. Expressed sequence tags: alternative or complement to whole genome sequences?

    Microsoft Academic Search

    Stephen Rudd

    2003-01-01

    Over three million sequences from approximately 200 plant species have been deposited in the publicly available plant expressed sequence tag (EST) sequence databases. Many of the ESTs have been sequenced as an alternative to complete genome sequencing or as a substrate for cDNA array-based expression analyses. This creates a formidable resource from both biodiversity and gene-discovery standpoints. Bioinformatics-based sequence analysis

  1. Atypical regions in large genomic DNA sequences

    SciTech Connect

    Scherer, S. [Lawrence Berkeley Lab., CA (United States); Univ. of Minnesota, Minneapolis, MN (United States)]; McPeek, M.S.; Speed, T.P. [Univ. of California, Berkeley, CA (United States)]

    1994-07-19

    Large genomic DNA sequences contain regions with distinctive patterns of sequence organization. The authors describe a method using logarithms of probabilities based on seventh-order Markov chains to rapidly identify genomic sequences that do not resemble models of genome organization built from compilations of octanucleotide usage. Data bases have been constructed from Escherichia coli and Saccharomyces cerevisiae DNA sequences of >1000 nt and human sequences of >10,000 nt. Atypical genes and clusters of genes have been located in bacteriophage, yeast, and primate DNA sequences. The authors consider criteria for statistical significance of the results, offer possible explanations for the observed variation in genome organization, and give additional applications of these methods in DNA sequence analysis.
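
    The scoring idea in this abstract (log-probabilities under a high-order Markov chain, with low-scoring regions flagged as atypical) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the pseudocount smoothing, and the window and step sizes are all assumptions introduced here. A seventh-order chain conditions each base on the preceding seven, which corresponds to counting octanucleotides.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"

def train_markov(seqs, order=7, pseudocount=1.0):
    """Count (order+1)-mers; return a function giving log P(base | previous `order` bases).

    Pseudocount smoothing is an assumption of this sketch, not taken from the paper.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in seqs:
        for i in range(order, len(seq)):
            context, base = seq[i - order:i], seq[i]
            if base in ALPHABET and all(c in ALPHABET for c in context):
                counts[context][base] += 1.0

    def log_prob(context, base):
        ctx = counts.get(context, {})
        total = sum(ctx.values()) + pseudocount * len(ALPHABET)
        return math.log((ctx.get(base, 0.0) + pseudocount) / total)

    return log_prob

def window_scores(seq, log_prob, order=7, window=100, step=50):
    """Mean per-base log-probability of each window; lower values are less typical."""
    scores = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        total = sum(log_prob(chunk[i - order:i], chunk[i])
                    for i in range(order, len(chunk)))
        scores.append((start, total / (len(chunk) - order)))
    return scores
```

    In use, one would train on a large compilation of sequence from the organism of interest and then look for windows whose score falls well below the bulk of the distribution; choosing that threshold is the statistical-significance question the abstract mentions.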

  2. Complete Genome Sequence of Mycobacterium massiliense

    PubMed Central

    Raiol, Tainá; Ribeiro, Guilherme Menegói; Maranhão, Andréa Queiroz; Bocca, Anamélia Lorenzetti; Silva-Pereira, Ildinete; Junqueira-Kipnis, Ana Paula; Brigido, Marcelo de Macedo

    2012-01-01

    Mycobacterium massiliense is a rapidly growing bacterium associated with opportunistic infections. The genome of a representative isolate (strain GO 06) recovered from wound samples from patients who underwent arthroscopic or laparoscopic surgery was sequenced. To the best of our knowledge, this is the first announcement of the complete genome sequence of an M. massiliense strain. PMID:22965084

  3. BSMAP: whole genome bisulfite sequence MAPping program

    Microsoft Academic Search

    Yuanxin Xi; Wei Li

    2009-01-01

    BACKGROUND: Bisulfite sequencing is a powerful technique to study DNA cytosine methylation. Bisulfite treatment followed by PCR amplification specifically converts unmethylated cytosines to thymine. Coupled with next generation sequencing technology, it is able to detect the methylation status of every cytosine in the genome. However, mapping high-throughput bisulfite reads to the reference genome remains a great challenge due to the
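
    The chemistry described above can be mimicked in silico. The sketch below is illustrative only and is not BSMAP's mapping algorithm: `bisulfite_convert` and `bs_compatible` are hypothetical helper names, and only the forward strand is modelled. It shows why a bisulfite read no longer matches the reference exactly, and how an asymmetric comparison (a T in the read may pair with a C in the reference) restores the match.

```python
def bisulfite_convert(seq, methylated=frozenset()):
    """Simulate bisulfite treatment plus PCR on the forward strand.

    Unmethylated C is read as T; C at a position listed in `methylated` stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )

def bs_compatible(read, ref):
    """True if `read` could derive from `ref` after bisulfite conversion.

    A position matches when the bases are equal, or when the reference
    has C and the read has T (an unmethylated cytosine).
    """
    if len(read) != len(ref):
        return False
    return all(r == g or (g == "C" and r == "T") for r, g in zip(read, ref))

ref = "ACGTCCGA"
read = bisulfite_convert(ref, methylated={1})  # only the C at index 1 is methylated
print(read)                      # ACGTTTGA
print(read == ref)               # False: exact matching fails
print(bs_compatible(read, ref))  # True: asymmetric matching succeeds
```

    Positions where the read keeps a C are the ones that report methylation.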

  4. Next-generation sequencing, cancer and molecular diagnostics: an interview with Elaine Mardis.

    PubMed

    Mardis, Elaine

    2015-04-01

    Interview with Professor Elaine R Mardis, PhD, by Claire Raison (Commissioning Editor) Elaine Mardis, co-director of the Genome Institute at Washington University (St Louis, MO, USA), is an expert in genome sequencing technologies, having been involved in developing and automating the methods employed in sequencing the human genome. Professor Mardis has made key contributions to the Human Genome Project and more recently, to the field of cancer genomics, including work in The Cancer Genome Atlas. Her current research interests lie in next-generation sequencing and analysis of cancer genomes and the translation of these findings to support therapeutic decision making. PMID:25795041

  5. Genomic Resources for Cancer Epidemiology

    Cancer.gov

    The goal of the 1000 genomes project is to provide a comprehensive resource on human genetic variation. The Project is sequencing the genomes of approximately 2,500 samples at 4x coverage, to provide data on genetic variants with frequencies of at least 1% in the populations studied.

  6. BAC as tools for genome sequencing

    Microsoft Academic Search

    Hong-Bin Zhang; Chengcang Wu

    2001-01-01

    Genome sequencing represents the state-of-the-art technology for large-scale gene discovery, cloning and decoding. Bacteria-based large-insert clones, including bacterial artificial chromosome (BAC), bacteriophage P1-derived artificial chromosome (PAC) and large-insert conventional plasmid-based clone (PBC), are desirable resources and have offered numerous potentials for accelerated sequencing of large, complex genomes. They are not only capable of cloning large DNA fragments of complex genomes

  7. Sequence Imputation of HPV16 Genomes for Genetic Association Studies

    E-print Network

    DeSalle, Rob

    Sequence Imputation of HPV16 Genomes for Genetic Association Studies Benjamin Smith1 , Zigui Chen1 type 16 (HPV16) causes over half of all cervical cancer and some HPV16 variants are more oncogenic than others. The genetic basis for the extraordinary oncogenic properties of HPV16 compared to other HPVs

  8. Human Genome Sequencing in Health and Disease

    PubMed Central

    Gonzaga-Jauregui, Claudia; Lupski, James R.; Gibbs, Richard A.

    2013-01-01

    Following the “finished,” euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges. PMID:22248320

  9. Analyzing the cancer methylome through targeted bisulfite sequencing.

    PubMed

    Lee, Eun-Joon; Luo, Junfeng; Wilson, James M; Shi, Huidong

    2013-11-01

    Bisulfite conversion of genomic DNA combined with next-generation sequencing (NGS) has become a very effective approach for mapping the whole-genome and sub-genome wide DNA methylation landscapes. However, whole methylome shotgun bisulfite sequencing is still expensive and not suitable for analyzing large numbers of human cancer specimens. Recent advances in the development of targeted bisulfite sequencing approaches offer several attractive alternatives. The characteristics and applications of these methods are discussed in this review article. In addition, the bioinformatic tools that can be used for sequence capture probe design as well as downstream sequence analyses are also addressed. PMID:23200671

  10. Genome-wide analysis of oral cancer—early results from the Cancer Genome Anatomy Project

    Microsoft Academic Search

    E. J. Shillitoe; M. May; V. Patel; C. Lethanakul; J. F. Ensley; R. L. Strausberg; J. S. Gutkind

    2000-01-01

    The Cancer Genome Anatomy Project (CGAP) is a large cooperative effort sponsored by the US National Institutes of Health designed to find, catalog and annotate genes that are expressed during cancer development. In the past 2 years, the CGAP has sequenced over 700,000 clones from approximately 140 cDNA libraries, resulting in the identification of over 30,000 new human genes. As

  11. Genotyping-by-Sequencing for Populus Population Genomics: An Assessment of Genome Sampling Patterns

    E-print Network

    Genotyping-by-Sequencing for Populus Population Genomics: An Assessment of Genome Sampling Patterns Abstract Continuing advances in nucleotide sequencing technology are inspiring a suite of genomic, recent advances in sequencing chemistry, sequencing platforms, data storage, and computational processing

  12. Genome Sequencing and Analysis Conference IV

    SciTech Connect

    Not Available

    1993-12-31

    J. Craig Venter and C. Thomas Caskey co-chaired Genome Sequencing and Analysis Conference IV, held at Hilton Head, South Carolina, from September 26-30, 1992. Venter opened the conference by noting that approximately 400 researchers from 16 nations were present, four times as many participants as at Genome Sequencing Conference I in 1989. Venter also introduced the Data Fair, a new component of the conference allowing exchange and on-site computer analysis of unpublished sequence data.

  13. Genomic sequencing of Pleistocene cave bears

    SciTech Connect

    Noonan, James P.; Hofreiter, Michael; Smith, Doug; Priest, James R.; Rohland, Nadin; Rabeder, Gernot; Krause, Johannes; Detter, J. Chris; Paabo, Svante; Rubin, Edward M.

    2005-04-01

    Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of ~1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.

  14. The genome sequence of Drosophila melanogaster.

    SciTech Connect

    NONE

    2000-03-24

    The fly Drosophila melanogaster is one of the most intensively studied organisms in biology and serves as a model system for the investigation of many developmental and cellular processes common to higher eukaryotes, including humans. We have determined the nucleotide sequence of nearly all of the ~120-megabase euchromatic portion of the Drosophila genome using a whole-genome shotgun sequencing strategy supported by extensive clone-based sequence and a high-quality bacterial artificial chromosome physical map. Efforts are under way to close the remaining gaps; however, the sequence is of sufficient accuracy and contiguity to be declared substantially complete and to support an initial analysis of genome structure and preliminary gene annotation and interpretation. The genome encodes ~13,600 genes, somewhat fewer than the smaller Caenorhabditis elegans genome, but with comparable functional diversity.

  15. Genome sequence of the palaeopolyploid soybean

    E-print Network

    Bhattacharyya, Madan Kumar

    ). The soybean genome is the largest whole-genome shotgun- sequenced plant genome so far and compares favourably to all other high-quality draft whole-genome shotgun-sequenced plant genomes (Supplementary Table 4ARTICLES Genome sequence of the palaeopolyploid soybean Jeremy Schmutz1,2 , Steven B. Cannon3

  16. Screens, maps & networks: from genome sequences to personalized medicine.

    PubMed

    Sandmann, Thomas; Boutros, Michael

    2012-02-01

    Genome sequencing of tumors provides a wealth of information on mutations and structural variations, instilling hope that this data can be used to predict individual tumor progression and response to treatment. Yet currently, our ability to predict the functional consequences of these aberrations remains poor. How do cancer-associated mutations give rise to the hallmark phenotypes of cancer? Recently, information about the genetic makeup of cancer cells has been combined with novel functional genomics approaches to identify novel targets, exploit synthetic lethality and explore the rewiring of cellular pathways. Here, we highlight recent developments revealing the hidden landscape of genetic interactions in model organisms and cancer cells, a key step toward personalized cancer diagnostics and therapy. PMID:22366531

  17. Plantagora: Modeling Whole Genome Sequencing and Assembly of Plant Genomes

    PubMed Central

    Barthelson, Roger; McFarlin, Adam J.; Rounsley, Steven D.; Young, Sarah

    2011-01-01

    Background: Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. Methodology/Principal Findings: For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. Conclusions/Significance: Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly further. PMID:22174807
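
    As an example of a "statistic of the assembly sequences", the sketch below computes N50, a widely used contiguity measure. Whether Plantagora reports N50 specifically is an assumption here, since the abstract does not name its metrics; the function name `n50` is likewise illustrative.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    if not contig_lengths:
        return 0
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length

# Six contigs totalling 290 bases: 80 + 70 = 150 covers half, so N50 is 70.
print(n50([80, 70, 50, 40, 30, 20]))
```

    N50 says nothing about correctness, which is why fidelity measures based on alignment back to the source genome, as the abstract describes, are needed alongside it.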

  18. Next-generation sequencing: applications beyond genomes

    Microsoft Academic Search

    Samuel Marguerat; Jürg Bähler

    2008-01-01

    The development of DNA sequencing more than 30 years ago has profoundly impacted biological research. In the last couple of years, remarkable technological innovations have emerged that allow the direct and cost-effective sequencing of complex samples at unprecedented scale and speed. These next-generation technologies make it feasible to sequence not only static genomes, but also entire transcriptomes expressed under different

  19. Quality assessment of the human genome sequence

    Microsoft Academic Search

    Jeremy Schmutz; Jeremy Wheeler; Jane Grimwood; Mark Dickson; Joan Yang; Chenier Caoile; Eva Bajorek; Stacey Black; Yee Man Chan; Mirian Denys; Julio Escobar; Dave Flowers; Dea Fotopulos; Carmen Garcia; Maria Gomez; Eidelyn Gonzales; Lauren Haydu; Frederick Lopez; Lucia Ramirez; James Retterer; Alex Rodriguez; Stephanie Rogers; Angelica Salazar; Ming Tsai; Richard M. Myers

    2004-01-01

    As the final sequencing of the human genome has now been completed, we present the results of the largest examination of the quality of the finished DNA sequence. The completed study covers the major contributing sequencing centres and is based on a rigorous combination of laboratory experiments and computational analysis.

  20. An emerging place for lung cancer genomics in 2013

    PubMed Central

    Bowman, Rayleen V.; Yang, Ian A.; Govindan, Ramaswamy; Fong, Kwun M.

    2013-01-01

    Lung cancer is a disease with a dismal prognosis and is the biggest cause of cancer deaths in many countries. Nonetheless, rapid technological developments in genome science promise more effective prevention and treatment strategies. Since the Human Genome Project, scientific advances have revolutionized the diagnosis and treatment of human cancers, including thoracic cancers. The latest, massively parallel, next generation sequencing (NGS) technologies offer much greater sequencing capacity than traditional, capillary-based Sanger sequencing. These modern but costly technologies have been applied to whole-genome and whole-exome sequencing (WGS and WES) for the discovery of mutations and polymorphisms, transcriptome sequencing for quantification of gene expression, small ribonucleic acid (RNA) sequencing for microRNA profiling, large scale analysis of deoxyribonucleic acid (DNA) methylation and chromatin immunoprecipitation mapping of DNA-protein interaction. With the rise of personalized cancer care, based on the premise of precision medicine, sequencing technologies are constantly changing. To date, the genomic landscape of lung cancer has been captured in several WGS projects. Such work has not only contributed to our understanding of cancer biology, but has also provided impetus for technical advances that may improve our ability to accurately capture the cancer genome. Issues such as short read lengths contribute to sequenced libraries that contain challenging gaps in the aligned genome. Emerging platforms promise longer reads as well as the ability to capture a range of epigenomic signals. In addition, ongoing optimization of bioinformatics strategies for data analysis and interpretation is critical, especially for the differentiation between driver and passenger mutations. Moreover, broader deployment of these and future generations of platforms, coupled with an increasing bioinformatics workforce with access to highly sophisticated technologies, could see many of these discoveries translated to the clinic at a rapid pace. We look forward to these advances making a difference for the many patients we treat in the Asia-Pacific region and around the world. PMID:24163742

  1. Solvable Sequence Evolution Models and Genomic Correlations

    NASA Astrophysics Data System (ADS)

    Messer, Philipp W.; Arndt, Peter F.; Lässig, Michael

    2005-04-01

    We study a minimal model for genome evolution whose elementary processes are single site mutation, duplication and deletion of sequence regions, and insertion of random segments. These processes are found to generate long-range correlations in the composition of letters as long as the sequence length is growing; i.e., the combined rates of duplications and insertions are higher than the deletion rate. For constant sequence length, on the other hand, all initial correlations decay exponentially. These results are obtained analytically and by simulations. They are compared with the long-range correlations observed in genomic DNA, and the implications for genome evolution are discussed.
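
    The four elementary processes named in this abstract can be simulated with a toy script. This is a sketch under stated assumptions, not the authors' model: the fixed segment length `seg_len`, the per-step event probabilities, tandem placement of duplicates, and the function name `evolve` are all choices made here for illustration. The sequence grows on average when duplication plus insertion outweighs deletion, which is the regime in which the abstract reports long-range correlations.

```python
import random

def evolve(seq, steps, p_mut, p_dup, p_del, p_ins, seg_len=5, seed=0):
    """Apply `steps` random events to `seq` and return the evolved sequence.

    Each step picks one event with the given relative weights:
      mut - replace one site with a random letter
      dup - copy a segment of `seg_len` bases in tandem
      del - remove a segment of `seg_len` bases
      ins - insert a random segment of `seg_len` bases
    """
    rng = random.Random(seed)
    s = list(seq)
    events = ["mut", "dup", "del", "ins"]
    weights = [p_mut, p_dup, p_del, p_ins]
    for _ in range(steps):
        event = rng.choices(events, weights=weights)[0]
        if event == "mut" and s:
            s[rng.randrange(len(s))] = rng.choice("ACGT")
        elif event == "dup" and len(s) >= seg_len:
            i = rng.randrange(len(s) - seg_len + 1)
            s[i:i] = s[i:i + seg_len]  # tandem duplicate of the chosen segment
        elif event == "del" and len(s) > seg_len:
            i = rng.randrange(len(s) - seg_len + 1)
            del s[i:i + seg_len]
        elif event == "ins":
            i = rng.randrange(len(s) + 1)
            s[i:i] = [rng.choice("ACGT") for _ in range(seg_len)]
    return "".join(s)
```

    Measuring the correlations themselves (for example, the autocorrelation of GC content as a function of distance) would be a separate step applied to the output of `evolve`.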

  2. The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge

    PubMed Central

    Czerwińska, Patrycja; Wiznerowicz, Maciej

    2015-01-01

    The Cancer Genome Atlas (TCGA) is a public funded project that aims to catalogue and discover major cancer-causing genomic alterations to create a comprehensive “atlas” of cancer genomic profiles. So far, TCGA researchers have analysed large cohorts of over 30 human tumours through large-scale genome sequencing and integrated multi-dimensional analyses. Studies of individual cancer types, as well as comprehensive pan-cancer analyses have extended current knowledge of tumorigenesis. A major goal of the project was to provide publicly available datasets to help improve diagnostic methods, treatment standards, and finally to prevent cancer. This review discusses the current status of TCGA Research Network structure, purpose, and achievements. PMID:25691825

  3. The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge.

    PubMed

    Tomczak, Katarzyna; Czerwińska, Patrycja; Wiznerowicz, Maciej

    2015-01-01

    The Cancer Genome Atlas (TCGA) is a public funded project that aims to catalogue and discover major cancer-causing genomic alterations to create a comprehensive "atlas" of cancer genomic profiles. So far, TCGA researchers have analysed large cohorts of over 30 human tumours through large-scale genome sequencing and integrated multi-dimensional analyses. Studies of individual cancer types, as well as comprehensive pan-cancer analyses have extended current knowledge of tumorigenesis. A major goal of the project was to provide publicly available datasets to help improve diagnostic methods, treatment standards, and finally to prevent cancer. This review discusses the current status of TCGA Research Network structure, purpose, and achievements. PMID:25691825

  4. About The Center for Cancer Genomics (CCG)

    Cancer.gov

    Recognizing the power of genomics, the National Cancer Institute (NCI) established the Center for Cancer Genomics (CCG) to develop and apply genome science to better diagnose and treat cancer patients. NCI is supporting research to identify the genetic drivers of cancer, to advance adoption of precise tumor diagnosis and treatment, and to prepare patients and their doctors for the changes in medical care influenced by genomics. Throughout these efforts, NCI protects patients’ privacy without hindering treatment or research.

  5. International network of cancer genome projects

    Microsoft Academic Search

    Thomas J. Hudson; Warwick Anderson; Axel Aretz; Anna D. Barker; Cindy Bell; Rosa R. Bernabé; M. K. Bhan; Iiro Eerola; Daniela S. Gerhard; Alan Guttmacher; Mark Guyer; Fiona M. Hemsley; Jennifer L. Jennings; David Kerr; Peter Klatt; Patrik Kolar; Jun Kusuda; Frank Laplace; Youyong Lu; Gerd Nettekoven; Brad Ozenberger; Jane Peterson; T. S. Rao; Jacques Remacle; Alan J. Schafer; Tatsuhiro Shibata; Michael R. Stratton; Joseph G. Vockley; Koichi Watanabe; Huanming Yang; Martin Bobrow; Anne Cambon-Thomsen; Lynn G. Dressler; Stephanie O. M. Dyke; Yann Joly; Kazuto Kato; Karen L. Kennedy; Pilar Nicolás; Michael J. Parker; Emmanuelle Rial-Sebbag; Carlos M. Romeo-Casabona; Kenna M. Shaw; Susan Wallace; Georgia L. Wiesner; Andrew V. Biankin; Christian Chabannon; Lynda Chin; Bruno Clément; Enrique de Alava; Françoise Degos; Martin L. Ferguson; Peter Geary; D. Neil Hayes; Amber L. Johns; Arek Kasprzyk; Hidewaki Nakagawa; Robert Penny; Miguel A. Piris; Rajiv Sarin; Aldo Scarpa; Hiroyuki Aburatani; Mónica Bayés; David D. L. Bowtell; Peter J. Campbell; Xavier Estivill; Ivo Gut; Martin Hirst; Carlos López-Otín; Partha Majumder; Marco Marra; John D. McPherson; Zemin Ning; Xose S. Puente; Yijun Ruan; Hendrik G. Stunnenberg; Harold Swerdlow; Victor E. Velculescu; Richard K. Wilson; Hong H. Xue; Paul T. Spellman; Gary D. Bader; Paul C. Boutros; Paul Flicek; Gad Getz; Roderic Guigó; Guangwu Guo; David Haussler; Simon Heath; Tim J. Hubbard; Tao Jiang; Steven M. Jones; Qibin Li; Nuria López-Bigas; Ruibang Luo; Lakshmi Muthuswamy; B. F. Francis Ouellette; John V. Pearson; Victor Quesada; Benjamin J. Raphael; Chris Sander; Terence P. Speed; Joshua M. Stuart; Jon W. Teague; Yasushi Totoki; Tatsuhiko Tsunoda; Alfonso Valencia; David A. Wheeler; Honglong Wu; Shancen Zhao; Mark Lathrop; Gilles Thomas; Myles Axton; Chris Gunter; Linda J. Miller; Junjun Zhang; Syed A. Haider; Jianxin Wang; Christina K. Yung; Anthony Cross; Yong Liang; Saravanamuttu Gnaneshan; Jonathan Guberman; Don R. C. 
Chalmers; Karl W. Hasel; Terry S. H. Kaan; William W. Lowrance; Tohru Masui; Laura Lyman Rodriguez; Catherine Vergely; Nicole Cloonan; Anna Defazio; James R. Eshleman; Dariush Etemadmoghadam; Brooke A. Gardiner; James G. Kench; Robert L. Sutherland; Margaret A. Tempero; Nicola J. Waddell; Steve Gallinger; Ming-Sound Tsao; Patricia A. Shaw; Gloria M. Petersen; Debabrata Mukhopadhyay; Ronald A. Depinho; Sarah Thayer; Kamran Shazand; Timothy Beck; Michelle Sam; Lee Timms; Jiafu Ji; Xiuqing Zhang; Feng Chen; Xueda Hu; Guangyu Zhou; Qi Yang; Geng Tian; Lianhai Zhang; Xiaofang Xing; Xianghong Li; Zhenggang Zhu; Yingyan Yu; Jun Yu; Jörg Tost; Paul Brennan; Ivana Holcatova; David Zaridze; Alvis Brazma; Lars Egevad; Egor Prokhortchouk; Rosamonde Elizabeth Banks; Mathias Uhlén; Juris Viksna; Fredrik Ponten; Ewan Birney; Ake Borg; Anne-Lise Børresen-Dale; Carlos Caldas; John A. Foekens; Sancha Martin; Jorge S. Reis-Filho; Andrea L. Richardson; Christos Sotiriou; Marc van de Vijver; Daniel Birnbaum; Hélène Blanche; Pascal Boucher; Sandrine Boyault; Jocelyne D. Masson-Jacquemier; Iris Pauporté; Xavier Pivot; Anne Vincent-Salomon; Eric Tabone; Charles Theillet; Paulette Bioulac-Sage; Thomas Decaens; Dominique Franco; Marta Gut; Didier Samuel; Benedikt Brors; Jan O. Korbel; Andrey Korshunov; Pablo Landgraf; Hans Lehrach; Stefan Pfister; Bernhard Radlwimmer; Guido Reifenberger; Michael D. Taylor; Paolo Pederzoli; Rita T. Lawlor; Massimo Delledonne; Alberto Bardelli; Thomas Gress; David Klimstra; Yusuke Nakamura; Satoru Miyano; Akihiro Fujimoto; Silvia de Sanjosé; Emili Montserrat; Marcos González-Díaz; Pedro Jares; Heinz Himmelbaue; Samuel Aparicio; Laura van't Veer; Douglas F. Easton; Francis S. Collins; Carolyn C. Compton; Eric S. Lander; Wylie Burke; Anthony R. Green; Olli P. Kallioniemi; Timothy J. Ley; Edison T. Liu; Brandon J. Wainwright

    2010-01-01

    The International Cancer Genome Consortium (ICGC) was launched to coordinate large-scale cancer genome studies in tumours from 50 different cancer types and/or subtypes that are of clinical and societal importance across the globe. Systematic studies of more than 25,000 cancer genomes at the genomic, epigenomic and transcriptomic levels will reveal the repertoire of oncogenic mutations, uncover traces of the mutagenic

  6. Clinical cancer genome and precision medicine.

    PubMed

    Roukos, Dimitrios H; Ku, Chee-Seng

    2012-11-01

    Revolutionary sequencing technologies have changed biomedical research and life science exponentially. Revealing the whole landscape of causal somatic and inherited mutations underlying an individual patient's cancer sample by whole-genome sequencing (WGS) and whole-exome sequencing (WES) can not only lead to a new mutation-based taxonomy of solid tumors (Stratton, Science 331:1553-1558, 2011) but also shape a roadmap for precision medicine (Roychowdhury et al., Sci Transl Med 3:111ra121, 2011; Roukos, Expert Rev Mol Diagn 12:215-218, 2012; Mirnezami et al., N Engl J Med 366:489-491, 2012). This inevitable approach for personalized diagnostics, in concert with free-falling genome sequencing costs, now raises the question of applying next-generation sequencing (NGS) technology in the clinic. In the pragmatic clinical world and in contrast to innovative research, is NGS-based clinical evidence sufficient for decision-making on tailoring the best available treatment to the individual cancer patient? PMID:22851046

  7. Genome sequence of Coxiella burnetii strain Namibia

    PubMed Central

    2014-01-01

    We present the whole genome sequence and annotation of the Coxiella burnetii strain Namibia. This strain was isolated from an aborting goat in 1991 in Windhoek, Namibia. The plasmid type QpRS was confirmed in our work. Further genomic typing placed the strain into a unique genomic group. The genome sequence is 2,101,438 bp long and contains 1,979 protein-coding and 51 RNA genes, including one rRNA operon. To overcome the poor yield from cell culture systems, an additional DNA enrichment with whole genome amplification (WGA) methods was applied. We describe a bioinformatics pipeline for improved genome assembly including several filters with a special focus on WGA characteristics. PMID:25593636

  8. Genome sequence of Coxiella burnetii strain Namibia.

    PubMed

    Walter, Mathias C; Öhrman, Caroline; Myrtennäs, Kerstin; Sjödin, Andreas; Byström, Mona; Larsson, Pär; Macellaro, Anna; Forsman, Mats; Frangoulidis, Dimitrios

    2014-01-01

    We present the whole genome sequence and annotation of the Coxiella burnetii strain Namibia. This strain was isolated from an aborting goat in 1991 in Windhoek, Namibia. The plasmid type QpRS was confirmed in our work. Further genomic typing placed the strain into a unique genomic group. The genome sequence is 2,101,438 bp long and contains 1,979 protein-coding and 51 RNA genes, including one rRNA operon. To overcome the poor yield from cell culture systems, an additional DNA enrichment with whole genome amplification (WGA) methods was applied. We describe a bioinformatics pipeline for improved genome assembly including several filters with a special focus on WGA characteristics. PMID:25593636

  9. First Complete Sequence of the Human Genome

    NSDL National Science Digital Library

    de Nie, Michael Willem.

    On April 6, Celera Genomics announced that it had completed the sequencing phase of one person's genome. It will now begin the process of assembling the sequenced fragments into their proper order with the aid of powerful computers. Work on this project began in September 1999 using a method called "whole genome shotgun sequencing," a quicker method than that used by the international Human Genome Project, which has completed about two-thirds of its own, more thorough, sequence of the human genome. Although talks between Celera and the Human Genome Project over the sharing of data broke down earlier this year, they have since resumed and the company has stated that it will cooperate. This is just the first step towards understanding the human genome, since it reveals only the order of the nucleotides, not what the genes do, but it is certainly an important milestone, with broad implications for biology and medicine. Users can begin with the company's press release and then read reports from the BBC, the New York Times (free registration required), CNN, National Public Radio's All Things Considered, and the Times of India. Additional related resources are available from the Human Genome Project site and Doubletwist.com.

  10. Complete genome sequence of Streptomyces fulvissimus.

    PubMed

    Myronovskyi, M; Tokovenko, B; Manderscheid, N; Petzke, L; Luzhetskyy, A

    2013-10-10

    The complete genome sequence of Streptomyces fulvissimus (DSM 40593), consisting of a linear chromosome with a size of 7.9 Mbp, is reported. Preliminary data indicate that the chromosome of S. fulvissimus contains 32 putative gene clusters involved in the biosynthesis of secondary metabolites, two of them showing very high similarity to the valinomycin and nonactin biosynthetic clusters. The availability of the genome sequence of S. fulvissimus will contribute to the evaluation of the full biosynthetic potential of streptomycetes. PMID:23965270

  11. Genome sequence and analysis of Lactobacillus helveticus

    PubMed Central

    Cremonesi, Paola; Chessa, Stefania; Castiglioni, Bianca

    2013-01-01

    The microbiological characterization of lactobacilli is historically well developed, but the genomic analysis is recent. Because of the widespread use of Lactobacillus helveticus in cheese technology, information concerning the heterogeneity in this species is accumulating rapidly. Recently, the genome of five L. helveticus strains was sequenced to completion and compared with other genomically characterized lactobacilli. The genomic analysis of the first sequenced strain, L. helveticus DPC 4571, isolated from cheese and selected for its characteristics of rapid lysis and high proteolytic activity, has revealed a plethora of genes with industrial potential including those responsible for key metabolic functions such as proteolysis, lipolysis, and cell lysis. These genes and their derived enzymes can facilitate the production of cheese and cheese derivatives with potential for use as ingredients in consumer foods. In addition, L. helveticus has the potential to produce peptides with a biological function, such as angiotensin converting enzyme (ACE) inhibitory activity, in fermented dairy products, demonstrating the therapeutic value of this species. A most intriguing feature of the genome of L. helveticus is the remarkable similarity in gene content with many intestinal lactobacilli. Comparative genomics has allowed the identification of key gene sets that facilitate a variety of lifestyles including adaptation to food matrices or the gastrointestinal tract. As genome sequence and functional genomic information continues to explode, key features of the genomes of L. helveticus strains continue to be discovered, answering many questions but also raising many new ones. PMID:23335916

  12. NIH researchers complete whole-exome sequencing of skin cancer

    Cancer.gov

    A team led by researchers at NIH is the first to systematically survey the landscape of the melanoma genome, the DNA code of the deadliest form of skin cancer. The researchers have made surprising new discoveries using whole-exome sequencing, an approach that decodes the 1-2 percent of the genome that contains protein-coding genes.

  13. Genomic sequencing of Pleistocene cave bears.

    PubMed

    Noonan, James P; Hofreiter, Michael; Smith, Doug; Priest, James R; Rohland, Nadin; Rabeder, Gernot; Krause, Johannes; Detter, J Chris; Pääbo, Svante; Rubin, Edward M

    2005-07-22

    Despite the greater information content of genomic DNA, ancient DNA studies have largely been limited to the amplification of mitochondrial sequences. Here we describe metagenomic libraries constructed with unamplified DNA extracted from skeletal remains of two 40,000-year-old extinct cave bears. Analysis of approximately 1 megabase of sequence from each library showed that despite significant microbial contamination, 5.8 and 1.1% of clones contained cave bear inserts, yielding 26,861 base pairs of cave bear genome sequence. Comparison of cave bear and modern bear sequences revealed the evolutionary relationship of these lineages. The metagenomic approach used here establishes the feasibility of ancient DNA genome sequencing programs. PMID:15933159

  14. Sequencing and comparing whole mitochondrial genomes of animals

    SciTech Connect

    Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

    2005-04-22

    Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning, sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.

  15. From sequence mapping to genome assemblies.

    PubMed

    Otto, Thomas D

    2015-01-01

    The development of "next-generation" high-throughput sequencing technologies has made it possible for many labs to undertake sequencing-based research projects that were unthinkable just a few years ago. Although the scientific applications are diverse (e.g., new genome projects, gene expression analysis, genome-wide functional screens, or epigenetics), the sequence data are usually processed in one of two ways: sequence reads are either mapped to an existing reference sequence, or they are built into a new sequence ("de novo assembly"). In this chapter, we first discuss some limitations of the mapping process and how these may be overcome through local sequence assembly. We then introduce the concept of de novo assembly and describe essential assembly improvement procedures such as scaffolding, contig ordering, gap closure, error evaluation, gene annotation transfer and ab initio gene annotation. The results are high-quality draft assemblies that will facilitate informative downstream analyses. PMID:25388106

  16. Genetic variation in the genome-wide predicted estrogen response element-related sequences is associated with breast cancer development

    Microsoft Academic Search

    Jyh-Cherng Yu; Chia-Ni Hsiung; Huan-Ming Hsu; Bo-Ying Bao; Shou-Tung Chen; Giu-Cheng Hsu; Wen-Cheng Chou; Ling-Yueh Hu; Shian-Ling Ding; Chun-Wen Cheng; Pei-Ei Wu; Chen-Yang Shen

    2011-01-01

    Introduction  Estrogen forms a complex with the estrogen receptor (ER) that binds to estrogen response elements (EREs) in the promoter region of estrogen-responsive genes, regulates their transcription, and consequently mediates physiological or tumorigenic effects. Thus, sequence variants in EREs have the potential to affect the estrogen-ER-ERE interaction. In this study, we examined the hypothesis that genetic variations of EREs are associated

  17. The Wellcome Trust Sanger Institute: The Cancer Genome Project

    NSDL National Science Digital Library

    Supported by the Wellcome Trust Sanger Institute, the Cancer Genome Project (CGP) "is using the human genome sequence and high throughput mutation detection techniques to identify somatically acquired sequence variants/mutations and hence identify genes critical in the development of human cancers. This initiative will ultimately provide the paradigm for the detection of germline mutations in non-neoplastic human genetic diseases through genome-wide mutation detection approaches." The CGP website links to a number of Data Resources including the Cancer Gene Census, Cancer Cell Line Project, Catalogue of Somatic Mutations in Cancer (reported on in the March 4, 2005 NSDL Scout Report for Life Sciences), Somatic Mutations in Protein Kinase Genes, and more. The site also contains an extensive listing of publications from 1998 to 2004 with links to PubMed Abstracts.

  18. Genome Sequencing, Assembly and Gene Prediction in Fungi

    Microsoft Academic Search

    Brendan Loftus

    2003-01-01

    Genome sequencing and the science of genomics is now being applied to the study of fungi. Although resources have been slow in coming, a number of fungi are now being sequenced and an increasingly diverse array of these organisms are being considered as candidates for whole genome sequencing. Currently there are only two complete fungal genome sequences available, those of

  19. Thirteen Camellia chloroplast genome sequences determined by high-throughput sequencing: genome structure and phylogenetic relationships

    PubMed Central

    2014-01-01

    Background Camellia is an economically and phylogenetically important genus in the family Theaceae. Owing to numerous hybridization and polyploidization, it is taxonomically and phylogenetically ranked as one of the most challengingly difficult taxa in plants. Sequence comparisons of chloroplast (cp) genomes are of great interest to provide a robust evidence for taxonomic studies, species identification and understanding mechanisms that underlie the evolution of the Camellia species. Results The eight complete cp genomes and five draft cp genome sequences of Camellia species were determined using Illumina sequencing technology via a combined strategy of de novo and reference-guided assembly. The Camellia cp genomes exhibited typical circular structure that was rather conserved in genomic structure and the synteny of gene order. Differences of repeat sequences, simple sequence repeats, indels and substitutions were further examined among five complete cp genomes, representing a wide phylogenetic diversity in the genus. A total of fifteen molecular markers were identified with more than 1.5% sequence divergence that may be useful for further phylogenetic analysis and species identification of Camellia. Our results showed that, rather than functional constrains, it is the regional constraints that strongly affect sequence evolution of the cp genomes. In a substantial improvement over prior studies, evolutionary relationships of the section Thea were determined on basis of phylogenomic analyses of cp genome sequences. Conclusions Despite a high degree of conservation between the Camellia cp genomes, sequence variation among species could still be detected, representing a wide phylogenetic diversity in the genus. Furthermore, phylogenomic analysis was conducted using 18 complete cp genomes and 5 draft cp genome sequences of Camellia species. Our results support Chang’s taxonomical treatment that C. pubicosta may be classified into sect. Thea, and indicate that taxonomical value of the number of ovaries should be reconsidered when classifying the Camellia species. The availability of these cp genomes provides valuable genetic information for accurately identifying species, clarifying taxonomy and reconstructing the phylogeny of the genus Camellia. PMID:25001059

  20. NIH-funded study uncovers range of molecular alterations in head and neck cancers, new potential drug targets; TCGA tumor genome sequencing analyses offer new insights into the effects of HPV and smoking

    Cancer.gov

    Investigators with The Cancer Genome Atlas (TCGA) Research Network have discovered genomic differences – with potentially important clinical implications – in head and neck cancers caused by infection with the human papillomavirus (HPV).

  1. International Rice Genome Sequencing Project: the effort to completely sequence the rice genome

    Microsoft Academic Search

    Takuji Sasaki; Benjamin Burr

    2000-01-01

    The International Rice Genome Sequencing Project (IRGSP) involves researchers from ten countries who are working to completely and accurately sequence the rice genome within a short period. Sequencing uses a map-based clone-by-clone shotgun strategy; shared bacterial artificial chromosome/P1-derived artificial chromosome libraries have been constructed from Oryza sativa ssp. japonica variety ‘Nipponbare’. End-sequencing, fingerprinting and marker-aided PCR screening are being

  2. Comparison of 61 sequenced Escherichia coli genomes.

    PubMed

    Lukjancenko, Oksana; Wassenaar, Trudy M; Ussery, David W

    2010-11-01

    Escherichia coli is an important component of the biosphere and is an ideal model for studies of processes involved in bacterial genome evolution. Sixty-one publically available E. coli and Shigella spp. sequenced genomes are compared, using basic methods to produce phylogenetic and proteomics trees, and to identify the pan- and core genomes of this set of sequenced strains. A hierarchical clustering of variable genes allowed clear separation of the strains into clusters, including known pathotypes; clinically relevant serotypes can also be resolved in this way. In contrast, when in silico MLST was performed, many of the various strains appear jumbled and less well resolved. The predicted pan-genome comprises 15,741 gene families, and only 993 (6%) of the families are represented in every genome, comprising the core genome. The variable or 'accessory' genes thus make up more than 90% of the pan-genome and about 80% of a typical genome; some of these variable genes tend to be co-localized on genomic islands. The diversity within the species E. coli, and the overlap in gene content between this and related species, suggests a continuum rather than sharp species borders in this group of Enterobacteriaceae. PMID:20623278
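
    The pan- and core-genome figures quoted above can be illustrated with plain set operations. The following is a minimal sketch, not the authors' pipeline: it assumes each genome has already been reduced to a set of gene-family identifiers (the clustering step that produces those families is the hard part and is not shown), and the function names and toy strain data are hypothetical.

```python
from typing import Dict, Set, Tuple


def pan_and_core(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (pan-genome, core genome) for a mapping of strain -> gene families.

    The pan-genome is the union of all families seen in any strain;
    the core genome is the intersection, i.e. families present in every strain.
    """
    families = list(genomes.values())
    pan = set().union(*families)
    core = set.intersection(*families) if families else set()
    return pan, core


def core_percent(pan_size: int, core_size: int) -> float:
    """Share of the pan-genome that belongs to the core genome, in percent."""
    return 100.0 * core_size / pan_size


if __name__ == "__main__":
    # Hypothetical toy data: three strains, five gene families in total.
    toy = {
        "strain_A": {"fam1", "fam2", "fam3"},
        "strain_B": {"fam1", "fam3", "fam4"},
        "strain_C": {"fam1", "fam5"},
    }
    pan, core = pan_and_core(toy)
    print(sorted(pan), sorted(core))
    # Figures from the abstract: 993 core families out of 15,741 in the pan-genome.
    print(round(core_percent(15_741, 993), 1))
```

    With the abstract's own numbers, 993 of 15,741 families is about 6%, which matches the reported core-genome share.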

  3. Comparison of 61 Sequenced Escherichia coli Genomes

    PubMed Central

    Lukjancenko, Oksana; Wassenaar, Trudy M.

    2010-01-01

    Escherichia coli is an important component of the biosphere and is an ideal model for studies of processes involved in bacterial genome evolution. Sixty-one publically available E. coli and Shigella spp. sequenced genomes are compared, using basic methods to produce phylogenetic and proteomics trees, and to identify the pan- and core genomes of this set of sequenced strains. A hierarchical clustering of variable genes allowed clear separation of the strains into clusters, including known pathotypes; clinically relevant serotypes can also be resolved in this way. In contrast, when in silico MLST was performed, many of the various strains appear jumbled and less well resolved. The predicted pan-genome comprises 15,741 gene families, and only 993 (6%) of the families are represented in every genome, comprising the core genome. The variable or ‘accessory’ genes thus make up more than 90% of the pan-genome and about 80% of a typical genome; some of these variable genes tend to be co-localized on genomic islands. The diversity within the species E. coli, and the overlap in gene content between this and related species, suggests a continuum rather than sharp species borders in this group of Enterobacteriaceae. PMID:20623278

  4. Whole Genome Sequence of a Turkish Individual

    PubMed Central

    Dogan, Haluk; Can, Handan; Otu, Hasan H.

    2014-01-01

    Although whole human genome sequencing can be done with readily available technical and financial resources, the need for detailed analyses of genomes of certain populations still exists. Here we present, for the first time, sequencing and analysis of a Turkish human genome. We have performed 35x coverage using paired-end sequencing, where over 95% of sequencing reads are mapped to the reference genome covering more than 99% of the bases. The assembly of unmapped reads rendered 11,654 contigs, 2,168 of which did not reveal any homology to known sequences, resulting in ~1 Mbp of unmapped sequence. Single nucleotide polymorphism (SNP) discovery resulted in 3,537,794 SNP calls with 29,184 SNPs identified in coding regions, where 106 were nonsense and 259 were categorized as having a high-impact effect. The homo/hetero zygosity (1,415,123:2,122,671 or 1:1.5) and transition/transversion ratios (2,383,204:1,154,590 or 2.06:1) were within expected limits. Of the identified SNPs, 480,396 were potentially novel with 2,925 in coding regions, including 48 nonsense and 95 high-impact SNPs. Functional analysis of novel high-impact SNPs revealed various interaction networks, notably involving hereditary and neurological disorders or diseases. Assembly results indicated 713,640 indels (1:1.09 insertion/deletion ratio), ranging from -52 bp to 34 bp in length and causing about 180 codon insertion/deletions and 246 frame shifts. Using paired-end- and read-depth-based methods, we discovered 9,109 structural variants and compared our variant findings with other populations. Our results suggest that whole genome sequencing is a valuable tool for understanding variations in the human genome across different populations. Detailed analyses of genomes of diverse origins greatly benefit research in genetics and medicine and should be conducted on a larger scale. PMID:24416366
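
    The SNP summary statistics in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted above; the constant and function names are hypothetical, and it is a consistency check rather than part of the authors' analysis.

```python
# Counts quoted in the abstract above.
TOTAL_SNPS = 3_537_794
HOMOZYGOUS, HETEROZYGOUS = 1_415_123, 2_122_671
TRANSITIONS, TRANSVERSIONS = 2_383_204, 1_154_590


def ratio(numerator: int, denominator: int, digits: int = 2) -> float:
    """Return numerator / denominator rounded to `digits` decimal places."""
    return round(numerator / denominator, digits)


def partitions_total(total: int, *parts: int) -> bool:
    """True when the parts add up exactly to the total."""
    return sum(parts) == total


if __name__ == "__main__":
    # Both breakdowns should account for every SNP call.
    print(partitions_total(TOTAL_SNPS, HOMOZYGOUS, HETEROZYGOUS))
    print(partitions_total(TOTAL_SNPS, TRANSITIONS, TRANSVERSIONS))
    # Heterozygous calls per homozygous call, and transitions per transversion.
    print(ratio(HETEROZYGOUS, HOMOZYGOUS), ratio(TRANSITIONS, TRANSVERSIONS))
```

    Both breakdowns sum to the reported 3,537,794 SNP calls, and the two ratios come out at roughly 1.5 heterozygous calls per homozygous call and 2.06 transitions per transversion.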

  5. Rhipicephalus (Boophilus) microplus strain Deutsch, whole genome shotgun sequencing project first submission of genome sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The size and repetitive nature of the Rhipicephalus microplus genome makes obtaining a full genome sequence difficult. Cot filtration/selection techniques were used to reduce the repetitive fraction of the tick genome and enrich for the fraction of DNA with gene-containing regions. The Cot-selected ...

  6. SWINE GENOME SEQUENCING CONSORTIUM (SGSC): A STRATEGIC ROADMAP FOR SEQUENCING THE PIG GENOME

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The Swine Genome Sequencing Consortium (SGSC) was formed in September 2003 by academic, government and industry representatives to provide international coordination for sequencing the pig genome. The SGSC's mission is to advance biomedical research for animal production and health by the developmen...

  7. Genome Sequence of the Palaeopolyploid soybean

    SciTech Connect

    Schmutz, Jeremy; Cannon, Steven B.; Schlueter, Jessica; Ma, Jianxin; Mitros, Therese; Nelson, William; Hyten, David L.; Song, Qijian; Thelen, Jay J.; Cheng, Jianlin; Xu, Dong; Hellsten, Uffe; May, Gregory D.; Yu, Yeisoo; Sakura, Tetsuya; Umezawa, Taishi; Bhattacharyya, Madan K.; Sandhu, Devinder; Valliyodan, Babu; Lindquist, Erika; Peto, Myron; Grant, David; Shu, Shengqiang; Goodstein, David; Barry, Kerrie; Futrell-Griggs, Montona; Abernathy, Brian; Du, Jianchang; Tian, Zhixi; Zhu, Liucun; Gill, Navdeep; Joshi, Trupti; Libault, Marc; Sethuraman, Anand; Zhang, Xue-Cheng; Shinozaki, Kazuo; Nguyen, Henry T.; Wing, Rod A.; Cregan, Perry; Specht, James; Grimwood, Jane; Rokhsar, Dan; Stacey, Gary; Shoemaker, Randy C.; Jackson, Scott A.

    2009-08-03

    Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70 percent more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78 percent of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75 percent of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.

  8. ORIGINAL PAPER Microsatellite DNA in genomic survey sequences

    E-print Network

    ORIGINAL PAPER Microsatellite DNA in genomic survey sequences and UniGenes of loblolly pine Craig S) 2011 Abstract Genomic DNA sequence databases are a potential and growing resource for simple sequence densities in genome survey sequences (GSSs) to those in non-redundant EST and cDNA sequences (Uni

  9. Using comparative genomics to reorder the human genome sequence into a virtual sheep genome

    Microsoft Academic Search

    Brian P Dalrymple; Ewen F Kirkness; Mikhail Nefedov; Sean McWilliam; Abhirami Ratnakumar; Wes Barris; Shaying Zhao; Jyoti Shetty; Jillian F Maddox; Margaret O'Grady; Frank Nicholas; Allan M Crawford; Tim Smith; Pieter J de Jong; John McEwan; V Hutton Oddy; Noelle E Cockett

    2007-01-01

    BACKGROUND: Is it possible to construct an accurate and detailed subgene-level map of a genome using bacterial artificial chromosome (BAC) end sequences, a sparse marker map, and the sequences of other genomes? RESULTS: A sheep BAC library, CHORI-243, was constructed and the BAC end sequences were determined and mapped with high sensitivity and low specificity onto the frameworks of the

  10. Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

    E-print Network

    2011-01-01

    plants have large and complex genomes with an abundance of repeated sequences.plants have large and complex genomes with a great abundance of repeated sequences.Sequence composition, organization, and evolution of the core Triticeae genome. Plant

  11. Accelerating Genome Sequencing 100X with FPGAs

    SciTech Connect

    Storaasli, Olaf O [ORNL; Strenski, Dave [Cray, Inc.

    2007-01-01

    The performance of two Cray XD1 systems with Virtex-II Pro 50 and Virtex-4 LX160 FPGAs was evaluated using the FASTA computational biology program for human genome (DNA and protein) sequence comparisons. FPGA speedups of 50X (Virtex-II Pro 50) and 100X (Virtex-4 LX160) over a 2.2 GHz Opteron were obtained. FPGA coding issues for human genome data are described.

  12. Complete genome sequence of Borrelia crocidurae.

    PubMed

    Elbir, Haitham; Gimenez, Grégory; Robert, Catherine; Bergström, Sven; Cutler, Sally; Raoult, Didier; Drancourt, Michel

    2012-07-01

    We announce the draft genome sequence of Borrelia crocidurae (strain Achema). The 1,557,560-bp genome (27% GC content) comprises one 919,477-bp linear chromosome and 638,083-bp plasmids that together carry 1,472 open reading frames, 32 tRNAs, and three complete rRNAs, with almost complete colinearity between B. crocidurae and Borrelia duttonii chromosomes. PMID:22740657

  13. Automated correction of genome sequence errors

    Microsoft Academic Search

    Pawel Gajer; Michael Schatz; Steven L. Salzberg

    2004-01-01

    ABSTRACT By using information from an assembly of a genome, a new program called AutoEditor significantly improves base calling accuracy over that achieved by previous algorithms. This in turn improves the overall accuracy of genome sequences and facilitates the use of these sequences for polymorphism discovery. We describe the algorithm and its application in a large set of recent genome sequencing projects. The number of erroneous base calls in these projects was reduced by 80%. In an analysis of over one million corrections, we found that AutoEditor made just one error per 8828 corrections. By substantially increasing the accuracy of base calling, AutoEditor can dramatically accelerate the process of finishing
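
    The general idea of using an assembly to correct individual base calls can be sketched with a simple column-wise majority vote. This is explicitly not AutoEditor's algorithm, which the abstract does not detail; it is a simplified stand-in that assumes the reads are already aligned to equal length with "-" marking gaps, and the function name and the 0.75 support threshold are hypothetical.

```python
from collections import Counter
from typing import List


def consensus_correct(aligned_reads: List[str], min_support: float = 0.75) -> List[str]:
    """Overwrite minority base calls in columns where one base clearly dominates.

    A column is corrected only when the most common base accounts for at
    least `min_support` of the non-gap calls; gaps are never modified.
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    corrected = [list(read) for read in aligned_reads]
    for col in range(length):
        calls = [read[col] for read in aligned_reads if read[col] != "-"]
        if not calls:
            continue
        base, count = Counter(calls).most_common(1)[0]
        if count / len(calls) >= min_support:
            for row in corrected:
                if row[col] != "-":
                    row[col] = base
    return ["".join(row) for row in corrected]


if __name__ == "__main__":
    # The fourth read disagrees at the third column and is corrected.
    print(consensus_correct(["ACGT", "ACGT", "ACGT", "ACTT"]))
```

    Columns without a clear majority are left untouched, so genuine polymorphisms in a mixed sample are less likely to be overwritten than with an unconditional majority vote.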

  14. Genome sequence and assembly of Bos indicus.

    PubMed

    Canavez, Flavio C; Luche, Douglas D; Stothard, Paul; Leite, Katia R M; Sousa-Canavez, Juliana M; Plastow, Graham; Meidanis, João; Souza, Maria Angélica; Feijao, Pedro; Moore, Steve S; Camara-Lopes, Luiz H

    2012-01-01

    Cattle are divided into 2 groups referred to as taurine and indicine, both of which have been under strong artificial selection due to their importance for human nutrition. A side effect of this domestication includes a loss of genetic diversity within each specialized breed. Recently, the first taurine genome was sequenced and assembled, allowing for a better understanding of this ruminant species. However, genetic information from indicine breeds has been limited. Here, we present the first genome sequence of an indicine breed (Nellore) generated with 52X coverage by SOLiD sequencing platform. As expected, both genomes share high similarity at the nucleotide level for all autosomes and the X chromosome. Regarding the Y chromosome, the homology was considerably lower, most likely due to uncompleted assembly of the taurine Y chromosome. We were also able to cover 97% of the annotated taurine protein-coding genes. PMID:22315242

  15. The Consensus Coding Sequences of Human Breast and Colorectal Cancers

    Microsoft Academic Search

    Tobias Sjöblom; Siân Jones; Laura D. Wood; D. Williams Parsons; Jimmy Lin; Thomas D. Barber; Diana Mandelker; Rebecca J. Leary; Janine Ptak; Natalie Silliman; Steve Szabo; Phillip Buckhaults; Christopher Farrell; Paul Meeh; Sanford D. Markowitz; Joseph Willis; Dawn Dawson; James K. V. Willson; Adi F. Gazdar; James Hartigan; Leo Wu; Changsheng Liu; Giovanni Parmigiani; Ben Ho Park; Kurtis E. Bachman; Nickolas Papadopoulos; Bert Vogelstein; Kenneth W. Kinzler; Victor E. Velculescu

    2006-01-01

    The elucidation of the human genome sequence has made it possible to identify genetic alterations in cancers in unprecedented detail. To begin a systematic analysis of such alterations, we determined the sequence of well-annotated human protein-coding genes in two common tumor types. Analysis of 13,023 genes in 11 breast and 11 colorectal cancers revealed that individual tumors accumulate an average

  16. Genome instability, cancer and aging

    PubMed Central

    Maslov, Alexander Y.; Vijg, Jan

    2015-01-01

    DNA damage-driven genome instability underlies the diversity of life forms generated by the evolutionary process but is detrimental to the somatic cells of individual organisms. The cellular response to DNA damage can be roughly divided in two parts. First, when damage is severe, programmed cell death may occur or, alternatively, temporary or permanent cell cycle arrest. This protects against cancer but can have negative effects on the long term, e.g., by depleting stem cell reservoirs. Second, damage can be repaired through one or more of the many sophisticated genome maintenance pathways. However, erroneous DNA repair and incomplete restoration of chromatin after damage is resolved, produce mutations and epimutations, respectively, both of which have been shown to accumulate with age. An increased burden of mutations and/or epimutations in aged tissues increases cancer risk and adversely affects gene transcriptional regulation, leading to progressive decline in organ function. Cellular degeneration and uncontrolled cell proliferation are both major hallmarks of aging. Despite the fact that one seems to exclude the other, they both may be driven by a common mechanism. Here, we review age related changes in the mammalian genome and their possible functional consequences, with special emphasis on genome instability in stem/progenitor cells. PMID:19344750

  17. AACR 2015: Pan-Cancer Analysis of Whole Genomes

    Cancer.gov

    The Pan-Cancer analysis of Whole Genomes (PCAWG) project of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA) is co-ordinating analysis of more than 2,000 whole cancer genomes. Each genome is characterized through a suite of centralized algorithms, including alignment to the reference genome, standardized quality assessment and calling of all classes of somatic mutation.

  18. An International Plan to Sequence the Onion Genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The cost of DNA sequencing continues to decline and, in the near future, it will become reasonable to undertake sequencing of the enormous nuclear genome of onion. We undertook sequencing of expressed and genomic regions of the onion genome to learn about the structure of the onion genome, as well a...

  19. Cancer genetics and genomics: essentials for oncology nurses.

    PubMed

    Boucher, Jean; Habin, Karleen; Underhill, Meghan

    2014-06-01

    Cancer genetics and genomics are rapidly evolving, with new discoveries emerging in genetic mutations, variants, genomic sequencing, risk-reduction methods, and targeted therapies. To educate patients and families, state-of-the-art care requires nurses to understand terminology, scientific and technological advances, and pharmacogenomics. Clinical application of cancer genetics and genomics involves working in interdisciplinary teams to properly identify patient risk through assessing family history, facilitating genetic testing and counseling services, applying risk-reduction methods, and administering and monitoring targeted therapies. PMID:24867117

  20. Making sense of cancer genomic data

    PubMed Central

    Chin, Lynda; Hahn, William C.; Getz, Gad; Meyerson, Matthew

    2011-01-01

    High-throughput tools for nucleic acid characterization now provide the means to conduct comprehensive analyses of all somatic alterations in the cancer genomes. Both large-scale and focused efforts have identified new targets of translational potential. The deluge of information that emerges from these genome-scale investigations has stimulated a parallel development of new analytical frameworks and tools. The complexity of somatic genomic alterations in cancer genomes also requires the development of robust methods for the interrogation of the function of genes identified by these genomics efforts. Here we provide an overview of the current state of cancer genomics, appraise the current portals and tools for accessing and analyzing cancer genomic data, and discuss emerging approaches to exploring the functions of somatically altered genes in cancer. PMID:21406553

  1. Making sense of cancer genomic data.

    PubMed

    Chin, Lynda; Hahn, William C; Getz, Gad; Meyerson, Matthew

    2011-03-15

    High-throughput tools for nucleic acid characterization now provide the means to conduct comprehensive analyses of all somatic alterations in the cancer genomes. Both large-scale and focused efforts have identified new targets of translational potential. The deluge of information that emerges from these genome-scale investigations has stimulated a parallel development of new analytical frameworks and tools. The complexity of somatic genomic alterations in cancer genomes also requires the development of robust methods for the interrogation of the function of genes identified by these genomics efforts. Here we provide an overview of the current state of cancer genomics, appraise the current portals and tools for accessing and analyzing cancer genomic data, and discuss emerging approaches to exploring the functions of somatically altered genes in cancer. PMID:21406553

  2. Mapping whole genome shotgun sequence and variant calling in mammalian species without their reference genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genomics research in mammals has produced reference genome sequences that are essential for identifying variation associated with disease. High quality reference genome sequences are now available for humans, model species, and economically important agricultural animals. Comparisons between these s...

  3. Telomeric repeat-containing RNA/G-quadruplex-forming sequences cause genome-wide alteration of gene expression in human cancer cells in vivo.

    PubMed

    Hirashima, Kyotaro; Seimiya, Hiroyuki

    2015-02-27

    Telomere erosion causes cell mortality, suggesting that longer telomeres enable more cell divisions. In telomerase-positive human cancer cells, however, telomeres are often kept shorter than those of surrounding normal tissues. Recently, we showed that cancer cell telomere elongation represses innate immune genes and promotes their differentiation in vivo. This implies that short telomeres contribute to cancer malignancy, but it is unclear how such genetic repression is caused by elongated telomeres. Here, we report that telomeric repeat-containing RNA (TERRA) induces a genome-wide alteration of gene expression in telomere-elongated cancer cells. Using three different cell lines, we found that telomere elongation up-regulates TERRA signal and down-regulates innate immune genes such as STAT1, ISG15 and OAS3 in vivo. Ectopic TERRA oligonucleotides repressed these genes even in cells with short telomeres under three-dimensional culture conditions. This appeared to occur from the action of G-quadruplexes (G4) in TERRA, because control oligonucleotides had no effect and a nontelomeric G4-forming oligonucleotide phenocopied the TERRA oligonucleotide. Telomere elongation and G4-forming oligonucleotides showed similar gene expression signatures. Most of the commonly suppressed genes were involved in the innate immune system and were up-regulated in various cancers. We propose that TERRA G4 counteracts cancer malignancy by suppressing innate immune genes. PMID:25653161

  4. Telomeric repeat-containing RNA/G-quadruplex-forming sequences cause genome-wide alteration of gene expression in human cancer cells in vivo

    PubMed Central

    Hirashima, Kyotaro; Seimiya, Hiroyuki

    2015-01-01

    Telomere erosion causes cell mortality, suggesting that longer telomeres enable more cell divisions. In telomerase-positive human cancer cells, however, telomeres are often kept shorter than those of surrounding normal tissues. Recently, we showed that cancer cell telomere elongation represses innate immune genes and promotes their differentiation in vivo. This implies that short telomeres contribute to cancer malignancy, but it is unclear how such genetic repression is caused by elongated telomeres. Here, we report that telomeric repeat-containing RNA (TERRA) induces a genome-wide alteration of gene expression in telomere-elongated cancer cells. Using three different cell lines, we found that telomere elongation up-regulates TERRA signal and down-regulates innate immune genes such as STAT1, ISG15 and OAS3 in vivo. Ectopic TERRA oligonucleotides repressed these genes even in cells with short telomeres under three-dimensional culture conditions. This appeared to occur from the action of G-quadruplexes (G4) in TERRA, because control oligonucleotides had no effect and a nontelomeric G4-forming oligonucleotide phenocopied the TERRA oligonucleotide. Telomere elongation and G4-forming oligonucleotides showed similar gene expression signatures. Most of the commonly suppressed genes were involved in the innate immune system and were up-regulated in various cancers. We propose that TERRA G4 counteracts cancer malignancy by suppressing innate immune genes. PMID:25653161

  5. Genome size and the accumulation of simple sequence repeats: implications of new data from genome sequencing projects

    Microsoft Academic Search

    John M. Hancock

    2002-01-01

    The relationship between the level of repetitiveness in genomic sequences and genome size has been re-investigated making use of the rapidly growing database of complete eubacterial and archaeal genome sequences combined with the fragmentary but now large amount of data from eukaryotic genomes. Relative simplicity factors (RSFs), which measure the repetitiveness of sequences, were calculated and significantly simple motifs (SSMs),

  6. Sequencing Your Genome: What Does It Mean?

    PubMed Central

    2014-01-01

    The human genome contains approximately 3.2 billion nucleotides and about 23,500 genes. Each gene has protein-coding regions that are referred to as exons. The human genome contains about 180,000 exons, which are collectively called an exome. An exome comprises about 1% of the human genome and hence is about 30 million nucleotides in size. Today’s technologies afford the opportunity to sequence all nucleotides in the human exome and even in the human genome. Given that more than three-quarters of the known disease-causing variants are located in the exome, and considering the cost and technical challenges in analyzing the whole genome sequence data, the focus of present research is primarily on whole exome sequencing (WES). While WES at the medical sequencing level is still expensive, it is becoming more affordable. Cost will not likely be a major barrier in the near future, and the data analysis is becoming less tedious. The most difficult challenge at the heart of medical sequencing is interpreting the findings. Each exome contains about 13,500 single nucleotide variants (SNVs) that affect the amino acid sequence, and a large number are expected to be functional variants. The daunting task is to distinguish the variants that are pathogenic from those that have minimal or no discernible clinical effects. While various algorithms exist, none are sufficiently robust. Thus, in-depth knowledge in genetics and medicine is essential for the proper interpretation of the WES findings. This review will discuss the potential applications of the WES data in the practice of cardiovascular medicine. PMID:24932355
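
    The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is an illustration added to this record, not part of the cited review. It uses only the approximate values stated above (3.2 billion nucleotides, about 1%, about 180,000 exons, about 13,500 amino-acid-changing SNVs); all variable names are hypothetical.

```python
# Back-of-the-envelope check of the approximate figures quoted in the abstract.
# All inputs are the rounded values from the text; names are illustrative only.
GENOME_NT = 3_200_000_000   # ~3.2 billion nucleotides in the human genome
EXOME_PERCENT = 1           # the exome is "about 1%" of the genome
EXON_COUNT = 180_000        # ~180,000 exons
CODING_SNVS = 13_500        # ~13,500 amino-acid-changing SNVs per exome

# Exome size implied by the 1% figure (integer arithmetic avoids float noise).
exome_nt = GENOME_NT * EXOME_PERCENT // 100

# Mean exon length implied by the exome size and exon count.
mean_exon_nt = exome_nt / EXON_COUNT

# Density of amino-acid-changing SNVs per kilobase of exome.
snvs_per_kb = CODING_SNVS / (exome_nt / 1_000)

print(f"exome size:       {exome_nt:,} nt")
print(f"mean exon length: {mean_exon_nt:.0f} nt")
print(f"coding SNVs/kb:   {snvs_per_kb:.2f}")
```

    Under these rounded inputs, 1% of the genome works out to 32 million nucleotides, consistent with the abstract's "about 30 million", and the implied mean exon length is a little under 180 nucleotides.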

  7. Mapping and sequencing the human genome

    SciTech Connect

    none,

    1988-01-01

    Numerous meetings have been held and a debate has developed in the biological community over the merits of mapping and sequencing the human genome. In response a committee to examine the desirability and feasibility of mapping and sequencing the human genome was formed to suggest options for implementing the project. The committee asked many questions. Should the analysis of the human genome be left entirely to the traditionally uncoordinated, but highly successful, support systems that fund the vast majority of biomedical research? Or should a more focused and coordinated additional support system be developed that is limited to encouraging and facilitating the mapping and eventual sequencing of the human genome? If so, how can this be done without distorting the broader goals of biological research that are crucial for any understanding of the data generated in such a human genome project? As the committee became better informed on the many relevant issues, the opinions of its members coalesced, producing a shared consensus of what should be done. This report reflects that consensus.

  8. Multilocus Sequence Typing of Total-Genome-Sequenced Bacteria

    PubMed Central

    Larsen, Mette V.; Cosentino, Salvatore; Rasmussen, Simon; Friis, Carsten; Hasman, Henrik; Marvig, Rasmus Lykke; Jelsbak, Lars; Sicheritz-Pontén, Thomas; Ussery, David W.; Aarestrup, Frank M.; Lund, Ole

    2012-01-01

    Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the “gold standard” of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available to scientists and routine diagnostic laboratories. Currently, the cost is below that of traditional MLST. The new challenges will be how to extract the relevant information from the large amount of data so as to allow for comparison over time and between laboratories. Ideally, this information should also allow for comparison to historical data. We developed a Web-based method for MLST of 66 bacterial species based on WGS data. As input, the method uses short sequence reads from four sequencing platforms or preassembled genomes. Updates from the MLST databases are downloaded monthly, and the best-matching MLST alleles of the specified MLST scheme are found using a BLAST-based ranking method. The sequence type is then determined by the combination of alleles identified. The method was tested on preassembled genomes from 336 isolates covering 56 MLST schemes, on short sequence reads from 387 isolates covering 10 schemes, and on a small test set of short sequence reads from 29 isolates for which the sequence type had been determined by traditional methods. The method presented here enables investigators to determine the sequence types of their isolates on the basis of WGS data. This method is publicly available at www.cbs.dtu.dk/services/MLST. PMID:22238442

  9. Multilocus sequence typing of total-genome-sequenced bacteria.

    PubMed

    Larsen, Mette V; Cosentino, Salvatore; Rasmussen, Simon; Friis, Carsten; Hasman, Henrik; Marvig, Rasmus Lykke; Jelsbak, Lars; Sicheritz-Pontén, Thomas; Ussery, David W; Aarestrup, Frank M; Lund, Ole

    2012-04-01

    Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the "gold standard" of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available to scientists and routine diagnostic laboratories. Currently, the cost is below that of traditional MLST. The new challenges will be how to extract the relevant information from the large amount of data so as to allow for comparison over time and between laboratories. Ideally, this information should also allow for comparison to historical data. We developed a Web-based method for MLST of 66 bacterial species based on WGS data. As input, the method uses short sequence reads from four sequencing platforms or preassembled genomes. Updates from the MLST databases are downloaded monthly, and the best-matching MLST alleles of the specified MLST scheme are found using a BLAST-based ranking method. The sequence type is then determined by the combination of alleles identified. The method was tested on preassembled genomes from 336 isolates covering 56 MLST schemes, on short sequence reads from 387 isolates covering 10 schemes, and on a small test set of short sequence reads from 29 isolates for which the sequence type had been determined by traditional methods. The method presented here enables investigators to determine the sequence types of their isolates on the basis of WGS data. This method is publicly available at www.cbs.dtu.dk/services/MLST. PMID:22238442
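
    The last step described in this abstract, determining the sequence type from the combination of best-matching alleles, can be sketched as a profile lookup. The code below is a minimal illustration added to this record, not the authors' implementation: the loci, the profile table, the hit tuples and the ranking rule (highest coverage first, then highest identity) are all hypothetical assumptions, since the abstract does not give the details of its BLAST-based ranking.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy MLST scheme: three loci and a table mapping an allele
# profile (one allele number per locus, in LOCI order) to a sequence type.
LOCI: Tuple[str, ...] = ("locusA", "locusB", "locusC")
PROFILES: Dict[Tuple[int, ...], int] = {
    (1, 1, 1): 1,
    (1, 2, 1): 2,
    (3, 2, 4): 7,
}

# A hit is (allele_number, percent_identity, coverage_fraction).
Hit = Tuple[int, float, float]


def best_allele(hits: List[Hit]) -> Optional[int]:
    """Pick the best-matching allele for one locus.

    Assumed ranking (not taken from the cited paper): prefer the hit with
    the highest coverage, breaking ties by highest identity.
    """
    if not hits:
        return None
    return max(hits, key=lambda h: (h[2], h[1]))[0]


def call_sequence_type(hits_by_locus: Dict[str, List[Hit]]) -> Optional[int]:
    """Combine the best allele at each locus and look up the sequence type.

    Returns None if any locus has no hit or the profile is not in the table.
    """
    profile = []
    for locus in LOCI:
        allele = best_allele(hits_by_locus.get(locus, []))
        if allele is None:
            return None
        profile.append(allele)
    return PROFILES.get(tuple(profile))


example_hits = {
    "locusA": [(1, 99.0, 1.0), (3, 100.0, 1.0)],
    "locusB": [(2, 100.0, 1.0)],
    "locusC": [(4, 100.0, 1.0), (1, 100.0, 0.9)],
}
print(call_sequence_type(example_hits))
```

    A real scheme would load its loci and profile table from the MLST databases mentioned in the abstract; the dictionary here only stands in for that table.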

  10. Assigning genomic sequences to CATH

    Microsoft Academic Search

    Frances M. G. Pearl; David Lee; James E. Bray; Ian Sillitoe; Annabel E. Todd; Andrew P. Harrison; Janet M. Thornton; Christine A. Orengo

    2000-01-01

    We report the latest release (version 1.6) of the CATH protein domains database (http://www.biochem.ucl.ac.uk/bsm/cath). This is a hierarchical classification of 18 577 domains into evolutionary families and structural groupings. We have identified 1028 homologous superfamilies in which the proteins have both structural, and sequence or functional similarity. These can be further clustered into 672 fold groups and

  11. Whole genome sequences of four Brucella strains.

    PubMed

    Ding, Jiabo; Pan, Yuanlong; Jiang, Hai; Cheng, Junsheng; Liu, Taotao; Qin, Nan; Yang, Yi; Cui, Buyun; Chen, Chen; Liu, Cuihua; Mao, Kairong; Zhu, Baoli

    2011-07-01

    Brucella melitensis and Brucella suis are intracellular pathogens of livestock and humans. Here we report four genome sequences, those of the virulent strain B. melitensis M28-12 and vaccine strains B. melitensis M5 and M111 and B. suis S2, which show different virulences and pathogenicities, which will help to design a more effective brucellosis vaccine. PMID:21602346

  12. Whole Genome Sequences of Four Brucella Strains

    PubMed Central

    Ding, Jiabo; Pan, Yuanlong; Jiang, Hai; Cheng, Junsheng; Liu, Taotao; Qin, Nan; Yang, Yi; Cui, Buyun; Chen, Chen; Liu, Cuihua; Mao, Kairong; Zhu, Baoli

    2011-01-01

    Brucella melitensis and Brucella suis are intracellular pathogens of livestock and humans. Here we report four genome sequences, those of the virulent strain B. melitensis M28-12 and vaccine strains B. melitensis M5 and M111 and B. suis S2, which show different virulences and pathogenicities, which will help to design a more effective brucellosis vaccine. PMID:21602346

  13. Genome Sequence of Corynebacterium ulcerans Strain 210932

    PubMed Central

    Viana, Marcus Vinicius Canário; de Jesus Benevides, Leandro; Batista Mariano, Diego Cesar; de Souza Rocha, Flávia; Bagano Vilas Boas, Priscilla Carolinne; Folador, Edson Luiz; Pereira, Felipe Luiz; Alves Dorella, Fernanda; Gomes Leal, Carlos Augusto; Fiorini de Carvalho, Alex; Silva, Artur; de Castro Soares, Siomar; Pereira Figueiredo, Henrique Cesar; Guimarães, Luis Carlos

    2014-01-01

    In this work, we present the complete genome sequence of Corynebacterium ulcerans strain 210932, isolated from a human. The species is an emergent pathogen that infects a variety of wild and domesticated animals and humans. It is associated with a growing number of cases of a diphtheria-like disease around the world. PMID:25428977

  14. Draft Genome Sequence of Virgibacillus halodenitrificans 1806

    PubMed Central

    Lee, Sang-Jae; Lee, Yong-Jik; Jeong, Haeyoung; Lee, Sang Jun; Lee, Han-Seung; Pan, Jae-Gu

    2012-01-01

    Virgibacillus halodenitrificans 1806 is an endospore-forming halophilic bacterium isolated from salterns in Korea. Here, we report the draft genome sequence of V. halodenitrificans 1806, which may reveal the molecular basis of osmoadaptation and insights into carbon and anaerobic metabolism in moderate halophiles. PMID:23105070

  15. DNA secondary structures and epigenetic determinants of cancer genome evolution

    PubMed Central

    De, Subhajyoti; Michor, Franziska

    2014-01-01

    An unstable genome is a hallmark of many cancers. It is unclear, however, whether some mutagenic features driving somatic alterations in cancer are encoded in the genome sequence and whether they can operate in a tissue-specific manner. We performed a genome-wide analysis of 663,446 DNA breakpoints associated with somatic copy-number alterations (SCNAs) from 2,792 cancer samples classified into 26 cancer types. Many SCNA breakpoints are spatially clustered in cancer genomes. We observed a significant enrichment for G-quadruplex sequences (G4s) in the vicinity of SCNA breakpoints and established that SCNAs show a strand bias consistent with G4-mediated structural alterations. Notably, abnormal hypomethylation near G4s-rich regions is a common signature for many SCNA breakpoint hotspots. We propose a mechanistic hypothesis that abnormal hypomethylation in genomic regions enriched for G4s acts as a mutagenic factor driving tissue-specific mutational landscapes in cancer. PMID:21725294

  16. A Comparison of the First Two Sequenced Chloroplast Genomes in Asteraceae: Lettuce and Sunflower

    E-print Network

    Timme, Ruth E.

    2009-01-01

    genomes are crop plants, their complete genome sequence will ... chloroplast genome sequence for any plant within the larger ... sequence of Glycine max and comparative analyses with other legume genomes. Plant

  17. Cancer Vulnerabilities Unveiled by Genomic Loss

    E-print Network

    Nijhawan, Deepak

    Due to genome instability, most cancers exhibit loss of regions containing tumor suppressor genes and collateral loss of other genes. To identify cancer-specific vulnerabilities that are the result of copy number losses, ...

  18. Next-generation sequencing technologies have greatly reduced the cost of sequencing genomes. With the current sequencing technology, a genome is broken

    E-print Network

    Campbell, A. Malcolm

    -scale DNA sequencing has transformed biological research. Scientists can sequence whole genomes of microbes ... ABSTRACT Next-generation sequencing technologies have greatly reduced the cost of sequencing genomes. With the current sequencing technology, a genome is broken into fragments and sequenced

  19. A Computer Program for Aligning a cDNA Sequence with a Genomic DNA Sequence

    Microsoft Academic Search

    Liliana Florea; George Hartzell; Gerald M. Rubin; Webb Miller

    1998-01-01

    We address the problem of efficiently aligning a transcribed and spliced DNA sequence with a genomic sequence containing that gene, allowing for introns in the genomic sequence and a relatively small number of sequencing errors. A freely available computer program, described herein, solves the problem for a 100-kb genomic sequence in a few seconds on a workstation. With large amounts

  20. Comprehensive characterization of the genomic alterations in human gastric cancer.

    PubMed

    Cui, Juan; Yin, Yanbin; Ma, Qin; Wang, Guoqing; Olman, Victor; Zhang, Yu; Chou, Wen-Chi; Hong, Celine S; Zhang, Chi; Cao, Sha; Mao, Xizeng; Li, Ying; Qin, Steve; Zhao, Shaying; Jiang, Jing; Hastings, Phil; Li, Fan; Xu, Ying

    2015-07-01

    Gastric cancer is one of the most prevalent and aggressive cancers worldwide, and its molecular mechanism remains largely elusive. Here we report the genomic landscape in primary gastric adenocarcinoma of human, based on the complete genome sequences of five pairs of cancer and matching normal samples. In total, 103,464 somatic point mutations, including 407 nonsynonymous ones, were identified and the most recurrent mutations were harbored by Mucins (MUC3A and MUC12) and transcription factors (ZNF717, ZNF595 and TP53). 679 genomic rearrangements were detected, which affect 355 protein-coding genes; and 76 genes show copy number changes. Through mapping the boundaries of the rearranged regions to the folded three-dimensional structure of human chromosomes, we determined that 79.6% of the chromosomal rearrangements happen among DNA fragments in close spatial proximity, especially when two endpoints stay in a similar replication phase. We demonstrated evidences that microhomology-mediated break-induced replication was utilized as a mechanism in inducing ~40.9% of the identified genomic changes in gastric tumor. Our data analyses revealed potential integrations of Helicobacter pylori DNA into the gastric cancer genomes. Overall a large set of novel genomic variations were detected in these gastric cancer genomes, which may be essential to the study of the genetic basis and molecular mechanism of the gastric tumorigenesis. PMID:25422082

  1. Defining Genome Project Standards in a New Era of Sequencing

    SciTech Connect

    Chain, Patrick [DOE-JGI]

    2009-05-27

    Patrick Chain of the DOE Joint Genome Institute gives a talk on behalf of the International Genome Sequencing Standards Consortium on the need for intermediate genome classifications between "draft" and "finished".

  2. Whole-genome sequencing in bacteriology: state of the art

    PubMed Central

    Dark, Michael J

    2013-01-01

    Over the last ten years, genome sequencing capabilities have expanded exponentially. There have been tremendous advances in sequencing technology, DNA sample preparation, genome assembly, and data analysis. This has led to advances in a number of facets of bacterial genomics, including metagenomics, clinical medicine, bacterial archaeology, and bacterial evolution. This review examines the strengths and weaknesses of techniques in bacterial genome sequencing, upcoming technologies, and assembly techniques, as well as highlighting recent studies that highlight new applications for bacterial genomics. PMID:24143115

  3. Genome Sequencing Reveals a Phage in Helicobacter pylori

    PubMed Central

    Lehours, Philippe; Vale, Filipa F.; Bjursell, Magnus K.; Melefors, Ojar; Advani, Reza; Glavas, Steve; Guegueniat, Julia; Gontier, Etienne; Lacomme, Sabrina; Alves Matos, António; Menard, Armelle; Mégraud, Francis; Engstrand, Lars; Andersson, Anders F.

    2011-01-01

    Helicobacter pylori chronically infects the gastric mucosa in more than half of the human population; in a subset of this population, its presence is associated with development of severe disease, such as gastric cancer. Genomic analysis of several strains has revealed an extensive H. pylori pan-genome, likely to grow as more genomes are sampled. Here we describe the draft genome sequence (63 contigs; 26× mean coverage) of H. pylori strain B45, isolated from a patient with gastric mucosa-associated lymphoid tissue (MALT) lymphoma. The major finding was a 24.6-kb prophage integrated in the bacterial genome. The prophage shares most of its genes (22/27) with prophage region II of Helicobacter acinonychis strain Sheeba. After UV treatment of liquid cultures, circular DNA carrying the prophage integrase gene could be detected, and intracellular tailed phage-like particles were observed in H. pylori cells by transmission electron microscopy, indicating that phage production can be induced from the prophage. PCR amplification and sequencing of the integrase gene from 341 H. pylori strains from different geographic regions revealed a high prevalence of the prophage (21.4%). Phylogenetic reconstruction showed four distinct clusters in the integrase gene, three of which tended to be specific for geographic regions. Our study implies that phages may play important roles in the ecology and evolution of H. pylori. PMID:22086490

  4. The first Irish genome and ways of improving sequence accuracy.

    PubMed

    Ju, Young Seok; Yoo, Yun Joo; Kim, Jong-Il; Seo, Jeong-Sun

    2010-01-01

    Whole-genome sequencing of an Irish person reveals hundreds of thousands of novel genomic variants. Imputation using previous known information improves the accuracy of low-read-depth sequencing. PMID:20815917

  5. The Genome Sequence of Drosophila melanogaster

    NSDL National Science Digital Library

    Ramanujan, Krishna.

    On Thursday, March 23, 2000, a historic milestone was marked as researchers announced that they had completed mapping the genome of the fruit fly, Drosophila melanogaster. The achievement, which was announced in a special issue of the journal Science, culminates close to 100 years of research. Drosophila melanogaster is the most complex animal thus far to have its genetic sequence deciphered. The findings have important implications for human medical research and for completing a map of the human genome. Mapping the fruit fly genome has been a broad collaborative effort between academia and industry in several countries. While a foundation was laid by US (Berkeley), European, and Canadian Drosophila Genome Projects, Celera Genomics finished the job over the last year by employing super-computers and state-of-the-art gene-sequencing machines. The techniques learned and used in this last phase of mapping may now be applied to more rapidly decode genes of other organisms, including humans. This week's In The News takes a closer look at this important landmark.

  6. Agaricus bisporus genome sequence: a commentary.

    PubMed

    Kerrigan, Richard W; Challen, Michael P; Burton, Kerry S

    2013-06-01

    The genomes of two isolates of Agaricus bisporus have been sequenced recently. This soil-inhabiting fungus has a wide geographical distribution in nature and it is also cultivated in an industrialized indoor process ($4.7bn annual worldwide value) to produce edible mushrooms. Previously this lignocellulosic fungus has resisted precise econutritional classification, i.e. into white- or brown-rot decomposers. The generation of the genome sequence and transcriptomic analyses has revealed a new classification, 'humicolous', for species adapted to grow in humic-rich, partially decomposed leaf material. The Agaricus bisporus genomes contain a collection of polysaccharide and lignin-degrading genes and more interestingly an expanded number of genes (relative to other lignocellulosic fungi) that enhance degradation of lignin derivatives, i.e. heme-thiolate peroxidases and β-etherases. A motif that is hypothesized to be a promoter element in the humicolous adaptation suite is present in a large number of genes specifically up-regulated when the mycelium is grown on humic-rich substrate. The genome sequence of A. bisporus offers a platform to explore fungal biology in carbon-rich soil environments and terrestrial cycling of carbon, nitrogen, phosphorus and potassium. PMID:23558250

  7. Comparative Analysis of Genome Sequences with VISTA

    DOE Data Explorer

    Dubchak, Inna

    VISTA is a comprehensive suite of programs and databases developed by and hosted at the Genomics Division of Lawrence Berkeley National Laboratory. They provide information and tools designed to facilitate comparative analysis of genomic sequences. Users have two ways to interact with the suite of applications at the VISTA portal. They can submit their own sequences and alignments for analysis (VISTA servers) or examine pre-computed whole-genome alignments of different species. A key menu option is the Enhancer Browser and Database at http://enhancer.lbl.gov/. The VISTA Enhancer Browser is a central resource for experimentally validated human noncoding fragments with gene enhancer activity as assessed in transgenic mice. Most of these noncoding elements were selected for testing based on their extreme conservation with other vertebrates. The results of this enhancer screen are provided through this publicly available website. The browser also features relevant results by external contributors and a large collection of additional genome-wide conserved noncoding elements which are candidate enhancer sequences. The LBL developers invite external groups to submit computational predictions of developmental enhancers. As of 10/19/2009 the database contains information on 1109 in vivo tested elements - 508 elements with enhancer activity.

  8. Genome BLAST distance phylogenies inferred from whole plastid and whole mitochondrion genome sequences

    Microsoft Academic Search

    Alexander F. Auch; Stefan R. Henz; Barbara R. Holland; Markus Göker

    2006-01-01

    BACKGROUND: Phylogenetic methods which do not rely on multiple sequence alignments are important tools in inferring trees directly from completely sequenced genomes. Here, we extend the recently described Genome BLAST Distance Phylogeny (GBDP) strategy to compute phylogenetic trees from all completely sequenced plastid genomes currently available and from a selection of mitochondrial genomes representing the major eukaryotic lineages. BLASTN, TBLASTX,

  9. Finishing a whole-genome shotgun: Release 3 of the Drosophila melanogaster euchromatic genome sequence

    Microsoft Academic Search

    Susan E Celniker; David A Wheeler; Brent Kronmiller; Joseph W Carlson; Aaron Halpern; Sandeep Patel; Mark Adams; Mark Champe; Shannon P Dugan; Erwin Frise; Ann Hodgson; Reed A George; Roger A Hoskins; Todd Laverty; Donna M Muzny; Catherine R Nelson; Joanne M Pacleb; Soo Park; Barret D Pfeiffer; Stephen Richards; Erica J Sodergren; Robert Svirskas; Paul E Tabor; Kenneth Wan; Mark Stapleton; Granger G Sutton; Craig Venter; George Weinstock; Steven E Scherer; Eugene W Myers; Richard A Gibbs; Gerald M Rubin

    2002-01-01

    BACKGROUND: The Drosophila melanogaster genome was the first metazoan genome to have been sequenced by the whole-genome shotgun (WGS) method. Two issues relating to this achievement were widely debated in the genomics community: how correct is the sequence with respect to base-pair (bp) accuracy and frequency of assembly errors? And, how difficult is it to bring a WGS sequence to

  10. Initial sequencing and comparative analysis of the mouse genome

    Microsoft Academic Search

    Robert H. Waterston; Kerstin Lindblad-Toh; Ewan Birney; Jane Rogers; Josep F. Abril; Pankaj Agarwal; Richa Agarwala; Rachel Ainscough; Marina Alexandersson; Peter An; Stylianos E. Antonarakis; John Attwood; Robert Baertsch; Jonathon Bailey; Karen Barlow; Stephan Beck; Eric Berry; Bruce Birren; Toby Bloom; Peer Bork; Marc Botcherby; Nicolas Bray; Michael R. Brent; Daniel G. Brown; Stephen D. Brown; Carol Bult; John Burton; Jonathan Butler; Robert D. Campbell; Piero Carninci; Simon Cawley; Francesca Chiaromonte; Asif T. Chinwalla; Deanna M. Church; Michele Clamp; Christopher Clee; Francis S. Collins; Lisa L. Cook; Richard R. Copley; Alan Coulson; Olivier Couronne; James Cuff; Val Curwen; Tim Cutts; Mark Daly; Robert David; Joy Davies; Kimberly D. Delehaunty; Justin Deri; Emmanouil T. Dermitzakis; Colin Dewey; Nicholas J. Dickens; Mark Diekhans; Sheila Dodge; Inna Dubchak; Diane M. Dunn; Sean R. Eddy; Laura Elnitski; Richard D. Emes; Pallavi Eswara; Eduardo Eyras; Adam Felsenfeld; Ginger A. Fewell; Paul Flicek; Karen Foley; Wayne N. Frankel; Lucinda A. Fulton; Robert S. Fulton; Terrence S. Furey; Diane Gage; Richard A. Gibbs; Gustavo Glusman; Sante Gnerre; Nick Goldman; Leo Goodstadt; Darren Grafham; Tina A. Graves; Eric D. Green; Simon Gregory; Roderic Guigó; Mark Guyer; Ross C. Hardison; David Haussler; Yoshihide Hayashizaki; LaDeana W. Hillier; Angela Hinrichs; Wratko Hlavina; Timothy Holzer; Fan Hsu; Axin Hua; Tim Hubbard; Adrienne Hunt; Ian Jackson; David B. Jaffe; L. Steven Johnson; Matthew Jones; Thomas A. Jones; Ann Joy; Michael Kamal; Elinor K. Karlsson; Donna Karolchik; Arkadiusz Kasprzyk; Jun Kawai; Evan Keibler; Cristyn Kells; W. James Kent; Andrew Kirby; Diana L. Kolbe; Ian Korf; Raju S. Kucherlapati; Edward J. Kulbokas; David Kulp; Tom Landers; J. P. Leger; Steven Leonard; Ivica Letunic; Rosie Levine; Jia Li; Ming Li; Christine Lloyd; Susan Lucas; Bin Ma; Donna R. Maglott; Elaine R. Mardis; Lucy Matthews; Evan Mauceli; John H. Mayer; Megan McCarthy; W. Richard McCombie; Stuart McLaren; Kirsten McLay; John D. McPherson; Jim Meldrim; Beverley Meredith; Jill P. Mesirov; Webb Miller; Tracie L. Miner; Emmanuel Mongin; Kate T. Montgomery; Michael Morgan; Richard Mott; James C. Mullikin; Donna M. Muzny; William E. Nash; Joanne O. Nelson; Michael N. Nhan; Robert Nicol; Zemin Ning; Chad Nusbaum; Michael J. O'Connor; Yasushi Okazaki; Karen Oliver; Emma Overton-Larty; Lior Pachter; Genís Parra; Kymberlie H. Pepin; Jane Peterson; Pavel Pevzner; Robert Plumb; Craig S. Pohl; Alex Poliakov; Tracy C. Ponce; Simon Potter; Michael Quail; Alexandre Reymond; Bruce A. Roe; Krishna M. Roskin; Edward M. Rubin; Alistair G. Rust; Victor Sapojnikov; Brian Schultz; Jörg Schultz; Scott Schwartz; Carol Scott; Steven Seaman; Steve Searle; Ted Sharpe; Andrew Sheridan; Ratna Shownkeen; Sarah Sims; Jonathan B. Singer; Guy Slater; Arian Smit; Douglas R. Smith; Brian Spencer; Arne Stabenau; Nicole Stange-Thomann; Charles Sugnet; Mikita Suyama; Glenn Tesler; Johanna Thompson; David Torrents; Evanne Trevaskis; John Tromp; Catherine Ucla; Abel Ureta-Vidal; Jade P. Vinson; Andrew C. von Niederhausern; Claire M. Wade; Melanie Wall; Ryan J. Weber; Robert B. Weiss; Michael C. Wendl; Anthony P. West; Kris Wetterstrand; Raymond Wheeler; Simon Whelan; Jamey Wierzbowski; David Willey; Sophie Williams; Richard K. Wilson; Eitan Winter; Kim C. Worley; Dudley Wyman; Shan Yang; Shiaw-Pyng Yang; Evgeny M. Zdobnov; Michael C. Zody; Eric S. Lander; Chris P. Ponting; Matthias S. Schwartz

    2002-01-01

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing

  11. The diploid genome sequence of an Asian individual

    Microsoft Academic Search

    Jun Wang; Wei Wang; Ruiqiang Li; Yingrui Li; Geng Tian; Laurie Goodman; Wei Fan; Junqing Zhang; Jun Li; Juanbin Zhang; Yiran Guo; Binxiao Feng; Heng Li; Yao Lu; Xiaodong Fang; Huiqing Liang; Zhenglin Du; Dong Li; Yiqing Zhao; Yujie Hu; Zhenzhen Yang; Hancheng Zheng; Ines Hellmann; Michael Inouye; John Pool; Xin Yi; Jing Zhao; Jinjie Duan; Yan Zhou; Junjie Qin; Lijia Ma; Guoqing Li; Zhentao Yang; Guojie Zhang; Bin Yang; Chang Yu; Fang Liang; Wenjie Li; Shaochuan Li; Dawei Li; Peixiang Ni; Jue Ruan; Qibin Li; Hongmei Zhu; Dongyuan Liu; Zhike Lu; Ning Li; Guangwu Guo; Jianguo Zhang; Jia Ye; Lin Fang; Qin Hao; Quan Chen; Yu Liang; Yeyang Su; A. San; Cuo Ping; Shuang Yang; Fang Chen; Li Li; Ke Zhou; Hongkun Zheng; Yuanyuan Ren; Ling Yang; Guohua Yang; Zhuo Li; Xiaoli Feng; Karsten Kristiansen; Gane Ka-Shu Wong; Rasmus Nielsen; Richard Durbin; Lars Bolund; Xiuqing Zhang; Songgang Li; Huanming Yang; Jian Wang

    2008-01-01

    Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the

  12. Insight into the heterogeneity of breast cancer through next-generation sequencing

    PubMed Central

    Russnes, Hege G.; Navin, Nicholas; Hicks, James; Borresen-Dale, Anne-Lise

    2011-01-01

    Rapid and sophisticated improvements in molecular analysis have allowed us to sequence whole human genomes as well as cancer genomes, and the findings suggest that we may be approaching the ability to individualize the diagnosis and treatment of cancer. This paradigmatic shift in approach will require clinicians and researchers to overcome several challenges including the huge spectrum of tumor types within a given cancer, as well as the cell-to-cell variations observed within tumors. This review discusses how next-generation sequencing of breast cancer genomes already reveals insight into tumor heterogeneity and how it can contribute to future breast cancer classification and management. PMID:21965338

  13. Discovery and Annotation of Repeats, Signatures, and Patterns in Genomic Sequences

    E-print Network

    Robinson, Michael

    in all solid tumors, aiding in staging of cancer and treatments strategies, 4 - Tracking HIV to extract the relevant information. This difficulty is present in a setting of ever increasing production-sequences found in the source genome and not in the target genomes. We first extract DNA, RNA or amino acid from

  14. Sequencing of Seven Haloarchaeal Genomes Reveals Patterns of Genomic Flux

    PubMed Central

    Lynch, Erin A.; Langille, Morgan G. I.; Darling, Aaron; Wilbanks, Elizabeth G.; Haltiner, Caitlin; Shao, Katie S. Y.; Starr, Michael O.; Teiling, Clotilde; Harkins, Timothy T.; Edwards, Robert A.; Eisen, Jonathan A.; Facciotti, Marc T.

    2012-01-01

    We report the sequencing of seven genomes from two haloarchaeal genera, Haloferax and Haloarcula. Ease of cultivation and the existence of well-developed genetic and biochemical tools for several diverse haloarchaeal species make haloarchaea a model group for the study of archaeal biology. The unique physiological properties of these organisms also make them good candidates for novel enzyme discovery for biotechnological applications. Seven genomes were sequenced to ~20× coverage and assembled to an average of 50 contigs (range 5 scaffolds - 168 contigs). Comparisons of protein-coding gene complements revealed large-scale differences in COG functional group enrichment between these genera. Analysis of genes encoding machinery for DNA metabolism reveals genera-specific expansions of the general transcription factor TATA binding protein as well as a history of extensive duplication and horizontal transfer of the proliferating cell nuclear antigen. Insights gained from this study emphasize the importance of haloarchaea for investigation of archaeal biology. PMID:22848480

  15. Ancient human genome sequence of an extinct Palaeo-Eskimo

    E-print Network

    Nielsen, Rasmus

    ARTICLES Ancient human genome sequence of an extinct Palaeo-Eskimo Morten Rasmussen, Yingrui the genome sequence of an ancient human. Obtained from ~4,000-year-old permafrost-preserved hair, the genome, independent of that giving rise to the modern Native Americans and Inuit. Recent advances in DNA sequencing

  16. Sequence and comparative analysis of the chicken genome provide unique

    E-print Network

    Edwards, Scott

    Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution International Chicken Genome Sequencing Consortium* *Lists of participants and affiliations appear ... We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken

  17. Data structures and compression algorithms for genomic sequence data

    Microsoft Academic Search

    Marty C. Brandon; Douglas C. Wallace; Pierre Baldi

    2009-01-01

    Motivation: The continuing exponential accumulation of full genome data, including full diploid human genomes, creates new challenges not only for understanding genomic structure, function, and evolution, but also for the storage, navigation, and privacy of genomic data. Here we develop data structures and algorithms for the efficient storage of genomic and other sequence data that may also facilitate querying and

  18. The Z curve database: a graphic representation of genome sequences

    Microsoft Academic Search

    Chun-ting Zhang; Ren Zhang; Hong-yu Ou

    2003-01-01

    Motivation: Genome projects for many prokaryotic and eukaryotic species have been completed and more new genome projects are being underway currently. The availability of a large number of genomic sequences for researchers creates a need to find graphic tools to study genomes in a perceivable form. The Z curve is one of such tools available for visualizing genomes.

  19. The Norway spruce genome sequence and conifer genome evolution.

    PubMed

    Nystedt, Björn; Street, Nathaniel R; Wetterbom, Anna; Zuccolo, Andrea; Lin, Yao-Cheng; Scofield, Douglas G; Vezzi, Francesco; Delhomme, Nicolas; Giacomello, Stefania; Alexeyenko, Andrey; Vicedomini, Riccardo; Sahlin, Kristoffer; Sherwood, Ellen; Elfstrand, Malin; Gramzow, Lydia; Holmberg, Kristina; Hällman, Jimmie; Keech, Olivier; Klasson, Lisa; Koriabine, Maxim; Kucukoglu, Melis; Käller, Max; Luthman, Johannes; Lysholm, Fredrik; Niittylä, Totte; Olson, Ake; Rilakovic, Nemanja; Ritland, Carol; Rosselló, Josep A; Sena, Juliana; Svensson, Thomas; Talavera-López, Carlos; Theißen, Günter; Tuominen, Hannele; Vanneste, Kevin; Wu, Zhi-Qiang; Zhang, Bo; Zerbe, Philipp; Arvestad, Lars; Bhalerao, Rishikesh; Bohlmann, Joerg; Bousquet, Jean; Garcia Gil, Rosario; Hvidsten, Torgeir R; de Jong, Pieter; MacKay, John; Morgante, Michele; Ritland, Kermit; Sundberg, Björn; Thompson, Stacey Lee; Van de Peer, Yves; Andersson, Björn; Nilsson, Ove; Ingvarsson, Pär K; Lundeberg, Joakim; Jansson, Stefan

    2013-05-30

    Conifers have dominated forests for more than 200 million years and are of huge ecological and economic importance. Here we present the draft assembly of the 20-gigabase genome of Norway spruce (Picea abies), the first available for any gymnosperm. The number of well-supported genes (28,354) is similar to the >100 times smaller genome of Arabidopsis thaliana, and there is no evidence of a recent whole-genome duplication in the gymnosperm lineage. Instead, the large genome size seems to result from the slow and steady accumulation of a diverse set of long-terminal repeat transposable elements, possibly owing to the lack of an efficient elimination mechanism. Comparative sequencing of Pinus sylvestris, Abies sibirica, Juniperus communis, Taxus baccata and Gnetum gnemon reveals that the transposable element diversity is shared among extant conifers. Expression of 24-nucleotide small RNAs, previously implicated in transposable element silencing, is tissue-specific and much lower than in other plants. We further identify numerous long (>10,000 base pairs) introns, gene-like fragments, uncharacterized long non-coding RNAs and short RNAs. This opens up new genomic avenues for conifer forestry and breeding. PMID:23698360

  20. Genome-wide analysis of oral cancer--early results from the Cancer Genome Anatomy Project.

    PubMed

    Shillitoe, E J; May, M; Patel, V; Lethanakul, C; Ensley, J F; Strausberg, R L; Gutkind, J S

    2000-01-01

    The Cancer Genome Anatomy Project (CGAP) is a large cooperative effort sponsored by the US National Institutes of Health designed to find, catalog and annotate genes that are expressed during cancer development. In the past 2 years, the CGAP has sequenced over 700,000 clones from approximately 140 cDNA libraries, resulting in the identification of over 30,000 new human genes. As a first step in applying this project to oral cancer we entered four cell lines--two from oral cancer, one from primary oral keratinocytes, and one from oral keratinocytes which had been immortalized by human papillomavirus. Libraries of cDNA were made and sequenced and the data were deposited in GenBank. The expressed genes were then identified where possible. The cell lines, and the total number of expressed genes that were cloned from each were: HN3 (oral cancer), 263 genes; HN4 (oral cancer), 550 genes; HN5 (primary keratinocytes), 237 genes; HN6 (immortalized keratinocytes), 408 genes. The total number of different genes that were found was 1160. A total of 38 new genes, of unknown function, were discovered. The data presented here represent a beginning of the application of the CGAP technology to oral cancer. Even though the data are still quite incomplete, they already represent a large quantity of new information and clones of potential utility to the oral cancer community, and provide a glimpse of the data sets to be forthcoming from the Project. It must therefore be expected that there will soon be a large expansion in the volume of data regarding the genetics of oral cancer. Those who study this disease must be prepared to develop new methods of analysis and storage for handling the oncoming volumes of information. PMID:10889913

  1. Complete genome sequence of Pyrobaculum oguniense

    PubMed Central

    Bernick, David L.; Karplus, Kevin; Lui, Lauren M.; Coker, Joanna K. C.; Murphy, Julie N.; Chan, Patricia P.; Cozen, Aaron E.

    2012-01-01

    Pyrobaculum oguniense TE7 is an aerobic hyperthermophilic crenarchaeon isolated from a hot spring in Japan. Here we describe its main chromosome of 2,436,033 bp, with three large-scale inversions and an extra-chromosomal element of 16,887 bp. We have annotated 2,800 protein-coding genes and 145 RNA genes in this genome, including nine H/ACA-like small RNA, 83 predicted C/D box small RNA, and 47 transfer RNA genes. Comparative analyses with the closest known relative, the anaerobe Pyrobaculum arsenaticum from Italy, reveals unexpectedly high synteny and nucleotide identity between these two geographically distant species. Deep sequencing of a mixture of genomic DNA from multiple cells has illuminated some of the genome dynamics potentially shared with other species in this genus. PMID:23407329

  2. Whole genome sequencing of matched primary and metastatic acral melanomas

    PubMed Central

    Turajlic, Samra; Furney, Simon J.; Lambros, Maryou B.; Mitsopoulos, Costas; Kozarewa, Iwanka; Geyer, Felipe C.; MacKay, Alan; Hakas, Jarle; Zvelebil, Marketa; Lord, Christopher J.; Ashworth, Alan; Thomas, Meirion; Stamp, Gordon; Larkin, James; Reis-Filho, Jorge S.; Marais, Richard

    2012-01-01

    Next generation sequencing has enabled systematic discovery of mutational spectra in cancer samples. Here, we used whole genome sequencing to characterize somatic mutations and structural variation in a primary acral melanoma and its lymph node metastasis. Our data show that the somatic mutational rates in this acral melanoma sample pair were more comparable to the rates reported in cancer genomes not associated with mutagenic exposure than in the genome of a melanoma cell line or the transcriptome of melanoma short-term cultures. Despite the perception that acral skin is sun-protected, the dominant mutational signature in these samples is compatible with damage due to ultraviolet light exposure. A nonsense mutation in ERCC5 discovered in both the primary and metastatic tumors could also have contributed to the mutational signature through accumulation of unrepaired dipyrimidine lesions. However, evidence of transcription-coupled repair was suggested by the lower mutational rate in the transcribed regions and expressed genes. The primary and the metastasis are highly similar at the level of global gene copy number alterations, loss of heterozygosity and single nucleotide variation (SNV). Furthermore, the majority of the SNVs in the primary tumor were propagated in the metastasis and one nonsynonymous coding SNV and one splice site mutation appeared to arise de novo in the metastatic lesion. PMID:22183965

  3. The genome sequence of Schizosaccharomyces pombe

    Microsoft Academic Search

    R. Gwilliam; M.-A. Rajandream; M. Lyne; R. Lyne; A. Stewart; J. Sgouros; N. Peat; J. Hayles; S. Baker; D. Basham; S. Bowman; K. Brooks; D. Brown; S. Brown; T. Chillingworth; C. Churcher; M. Collins; R. Connor; A. Cronin; P. Davis; T. Feltwell; A. Fraser; S. Gentles; A. Goble; N. Hamlin; D. Harris; J. Hidalgo; G. Hodgson; S. Holroyd; T. Hornsby; S. Howarth; E. J. Huckle; S. Hunt; K. Jagels; K. James; L. Jones; M. Jones; S. Leather; S. McDonald; J. McLean; P. Mooney; S. Moule; K. Mungall; L. Murphy; D. Niblett; C. Odell; K. Oliver; S. O'Neil; D. Pearson; M. A. Quail; E. Rabbinowitsch; K. Rutherford; S. Rutter; D. Saunders; K. Seeger; S. Sharp; J. Skelton; M. Simmonds; R. Squares; S. Squares; K. Stevens; K. Taylor; R. G. Taylor; A. Tivey; S. Walsh; T. Warren; S. Whitehead; J. Woodward; G. Volckaert; R. Aert; J. Robben; B. Grymonprez; I. Weltjens; E. Vanstreels; M. Rieger; M. Schäfer; S. Müller-Auer; C. Gabel; M. Fuchs; C. Fritzc; E. Holzer; D. Moestl; H. Hilbert; K. Borzym; I. Langer; A. Beck; H. Lehrach; R. Reinhardt; T. M. Pohl; P. Eger; W. Zimmermann; H. Wedler; R. Wambutt; B. Purnelle; A. Goffeau; E. Cadieu; S. Dréano; S. Gloux; V. Lelaure; S. Mottier; F. Galibert; S. J. Aves; Z. Xiang; C. Hunt; K. Moore; S. M. Hurst; M. Lucas; M. Rochet; C. Gaillardin; V. A. Tallada; A. Garzon; G. Thode; R. R. Daga; L. Cruzado; J. Jimenez; M. Sánchez; F. del Rey; J. Benito; A. Domínguez; J. L. Revuelta; S. Moreno; J. Armstrong; S. L. Forsburg; L. Cerrutti; T. Lowe; W. R. McCombie; I. Paulsen; J. Potashkin; G. V. Shpakovski; D. Ussery; B. G. Barrell; P. Nurse

    2002-01-01

    We have sequenced and annotated the genome of fission yeast (Schizosaccharomyces pombe), which contains the smallest number of protein-coding genes yet recorded for a eukaryote: 4,824. The centromeres are between 35 and 110 kilobases (kb) and contain related repeats including a highly conserved 1.8-kb element. Regions upstream of genes are longer than in budding yeast (Saccharomyces cerevisiae), possibly reflecting more-extended

  4. Modeling alternate RNA structures in genomic sequences.

    PubMed

    Saffarian, Azadeh; Giraud, Mathieu; Touzet, Hélène

    2015-03-01

    We introduce the concept of RNA multistructures, which is a formal grammar-based framework specifically designed to model a set of alternate RNA secondary structures. Such alternate structures can either be a set of suboptimal foldings, or distinct stable folding states, or variants within an RNA family. We provide several such examples and propose an efficient algorithm to search for RNA multistructures within a genomic sequence. PMID:25768235

  5. Draft Genome Sequence of Rubrivivax gelatinosus CBS

    SciTech Connect

    Hu, P. S.; Lang, J.; Wawrousek, K.; Yu, J. P.; Maness, P. C.; Chen, J.

    2012-06-01

    Rubrivivax gelatinosus CBS, a purple nonsulfur photosynthetic bacterium, can grow photosynthetically using CO and N₂ as the sole carbon and nitrogen nutrients, respectively. R. gelatinosus CBS is of particular interest due to its ability to metabolize CO and yield H₂. We present the 5-Mb draft genome sequence of R. gelatinosus CBS with the goal of providing genetic insight into the metabolic properties of this bacterium.

  6. Toward a Comprehensive Genomic Analysis of Cancer

    Cancer.gov

    The National Cancer Institute (NCI) and National Human Genome Research Institute (NHGRI) convened a "Toward a Comprehensive Genomic Analysis of Cancer" workshop in Washington, D.C. This workshop brought together physicians, basic scientists and other members of the U.S. and international cancer communities to assist in outlining the most effective strategies for the development of a successful project. Information about this workshop is reported in the Executive Summary.

  7. Genome sequence of Halobacterium species NRC-1

    PubMed Central

    Ng, Wailap Victor; Kennedy, Sean P.; Mahairas, Gregory G.; Berquist, Brian; Pan, Min; Shukla, Hem Dutt; Lasky, Stephen R.; Baliga, Nitin S.; Thorsson, Vesteinn; Sbrogna, Jennifer; Swartzell, Steven; Weir, Douglas; Hall, John; Dahl, Timothy A.; Welti, Russell; Goo, Young Ah; Leithauser, Brent; Keller, Kim; Cruz, Randy; Danson, Michael J.; Hough, David W.; Maddocks, Deborah G.; Jablonski, Peter E.; Krebs, Mark P.; Angevine, Christine M.; Dale, Heather; Isenbarger, Thomas A.; Peck, Ronald F.; Pohlschroder, Mechthild; Spudich, John L.; Jung, Kwang-Hwan; Alam, Maqsudul; Freitas, Tracey; Hou, Shaobin; Daniels, Charles J.; Dennis, Patrick P.; Omer, Arina D.; Ebhardt, Holger; Lowe, Todd M.; Liang, Ping; Riley, Monica; Hood, Leroy; DasSarma, Shiladitya

    2000-01-01

    We report the complete sequence of an extreme halophile, Halobacterium sp. NRC-1, harboring a dynamic 2,571,010-bp genome containing 91 insertion sequences representing 12 families and organized into a large chromosome and 2 related minichromosomes. The Halobacterium NRC-1 genome codes for 2,630 predicted proteins, 36% of which are unrelated to any previously reported. Analysis of the genome sequence shows the presence of pathways for uptake and utilization of amino acids, active sodium-proton antiporter and potassium uptake systems, sophisticated photosensory and signal transduction pathways, and DNA replication, transcription, and translation systems resembling more complex eukaryotic organisms. Whole proteome comparisons show the definite archaeal nature of this halophile with additional similarities to the Gram-positive Bacillus subtilis and other bacteria. The ease of culturing Halobacterium and the availability of methods for its genetic manipulation in the laboratory, including construction of gene knockouts and replacements, indicate this halophile can serve as an excellent model system among the archaea. PMID:11016950

  8. Open-Access Cancer Genomics Tools: the UCSC Cancer Genomics Browser

    Cancer.gov

    The completion of the Human Genome Project sparked a revolution in high-throughput genomics applied towards deciphering genetically complex diseases, like cancer. Now, almost 10 years later, we have a mountain of genomics data on many different cancer types and subtypes that is rapidly expanding.

  9. Precision medicine in breast cancer: genes, genomes, and the future of genomically driven treatments.

    PubMed

    Stover, Daniel G; Wagle, Nikhil

    2015-04-01

    Remarkable progress in sequencing technology over the past 20 years has made it possible to comprehensively profile tumors and identify clinically relevant genomic alterations. In breast cancer, the most common malignancy affecting women, we are now increasingly able to use this technology to help specify the use of therapies that target key molecular and genetic dependencies. Large sequencing studies have confirmed the role of well-known cancer-related genes and have also revealed numerous other genes that are recurrently mutated in breast cancer. This growing understanding of patient-to-patient variability at the genomic level in breast cancer is advancing our ability to direct the appropriate treatment to the appropriate patient at the appropriate time, a hallmark of "precision cancer medicine." This review focuses on the technological advances that have catalyzed these developments, the landscape of mutations in breast cancer, the clinical impact of genomic profiling, and the incorporation of genomic information into clinical care and clinical trials. PMID:25708799

  10. Why Assembling Plant Genome Sequences Is So Challenging

    PubMed Central

    Claros, Manuel Gonzalo; Bautista, Rocío; Guerrero-Fernández, Darío; Benzerki, Hicham; Seoane, Pedro; Fernández-Pozo, Noé

    2012-01-01

    In spite of the biological and economic importance of plants, relatively few plant species have been sequenced. Only the genome sequence of plants with relatively small genomes, most of them angiosperms, in particular eudicots, has been determined. The arrival of next-generation sequencing technologies has allowed the rapid and efficient development of new genomic resources for non-model or orphan plant species. But the sequencing pace of plants is far from that of animals and microorganisms. This review focuses on the typical challenges of plant genomes that can explain why plant genomics is less developed than animal genomics. Explanations about the impact of some confounding factors emerging from the nature of plant genomes are given. As a result of these challenges and confounding factors, the correct assembly and annotation of plant genomes is hindered, genome drafts are produced, and advances in plant genomics are delayed. PMID:24832233

  11. Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

    Microsoft Academic Search

    Frank M You; Naxin Huo; Karin R Deal; Yong Q Gu; Ming-Cheng Luo; Patrick E McGuire; Jan Dvorak; Olin D Anderson

    2011-01-01

    BACKGROUND: Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS)

  12. Whole genome sequence (WGS) analysis for exploring plant relationships

    Microsoft Academic Search

    Nicole F Rice; Giovanni M Cordeiro; Catherine J Nock; Daniel LE Waters; Stirling Bowen; Robert J Henry

    2010-01-01

    Shotgun sequencing plant genomic DNA preparations generates large quantities of sequence data in a single run. Using the Illumina GAII, whole genome shot-gun sequence (WGS) data was generated for Oryza sativa cv Nipponbarre, and the rice wild relatives Oryza meridionalis and Oryza australiensis. Two other grass species were also sequenced, Potamophila parviflora, from the Oryzeae tribe and Microlaena stipoides from

  13. The UCSC Cancer Genomics Browser: update 2015

    PubMed Central

    Goldman, Mary; Craft, Brian; Swatloski, Teresa; Cline, Melissa; Morozova, Olena; Diekhans, Mark; Haussler, David; Zhu, Jingchun

    2015-01-01

    The UCSC Cancer Genomics Browser (https://genome-cancer.ucsc.edu/) is a web-based application that integrates relevant data, analysis and visualization, allowing users to easily discover and share their research observations. Users can explore the relationship between genomic alterations and phenotypes by visualizing various -omic data alongside clinical and phenotypic features, such as age, subtype classifications and genomic biomarkers. The Cancer Genomics Browser currently hosts 575 public datasets from genome-wide analyses of over 227 000 samples, including datasets from TCGA, CCLE, Connectivity Map and TARGET. Users can download and upload clinical data, generate Kaplan–Meier plots dynamically, export data directly to Galaxy for analysis, plus generate URL bookmarks of specific views of the data to share with others. PMID:25392408

  14. The UCSC Cancer Genomics Browser: update 2015.

    PubMed

    Goldman, Mary; Craft, Brian; Swatloski, Teresa; Cline, Melissa; Morozova, Olena; Diekhans, Mark; Haussler, David; Zhu, Jingchun

    2015-01-01

    The UCSC Cancer Genomics Browser (https://genome-cancer.ucsc.edu/) is a web-based application that integrates relevant data, analysis and visualization, allowing users to easily discover and share their research observations. Users can explore the relationship between genomic alterations and phenotypes by visualizing various -omic data alongside clinical and phenotypic features, such as age, subtype classifications and genomic biomarkers. The Cancer Genomics Browser currently hosts 575 public datasets from genome-wide analyses of over 227,000 samples, including datasets from TCGA, CCLE, Connectivity Map and TARGET. Users can download and upload clinical data, generate Kaplan-Meier plots dynamically, export data directly to Galaxy for analysis, plus generate URL bookmarks of specific views of the data to share with others. PMID:25392408

  15. Second Generation Sequencing of the Mesothelioma Tumor Genome

    PubMed Central

    Bueno, Raphael; De Rienzo, Assunta; Dong, Lingsheng; Gordon, Gavin J.; Hercus, Colin F.; Richards, William G.; Jensen, Roderick V.; Anwar, Arif; Maulik, Gautam; Chirieac, Lucian R.; Ho, Kim-Fong; Taillon, Bruce E.; Turcotte, Cynthia L.; Hercus, Robert G.; Gullans, Steven R.; Sugarbaker, David J.

    2010-01-01

    The current paradigm for elucidating the molecular etiology of cancers relies on the interrogation of small numbers of genes, which limits the scope of investigation. Emerging second-generation massively parallel DNA sequencing technologies have enabled more precise definition of the cancer genome on a global scale. We examined the genome of a human primary malignant pleural mesothelioma (MPM) tumor and matched normal tissue by using a combination of sequencing-by-synthesis and pyrosequencing methodologies to a 9.6X depth of coverage. Read density analysis uncovered significant aneuploidy and numerous rearrangements. Method-dependent informatics rules, which combined the results of different sequencing platforms, were developed to identify and validate candidate mutations of multiple types. Many more tumor-specific rearrangements than point mutations were uncovered at this depth of sequencing, resulting in novel, large-scale, inter- and intra-chromosomal deletions, inversions, and translocations. Nearly all candidate point mutations appeared to be previously unknown SNPs. Thirty tumor-specific fusions/translocations were independently validated with PCR and Sanger sequencing. Of these, 15 represented disrupted gene-encoding regions, including kinases, transcription factors, and growth factors. One large deletion in DPP10 resulted in altered transcription and expression of DPP10 transcripts in a set of 53 additional MPM tumors correlated with survival. Additionally, three point mutations were observed in the coding regions of NKX6-2, a transcription regulator, and NFRKB, a DNA-binding protein involved in modulating NFKB1. Several regions containing genes such as PCBD2 and DHFR, which are involved in growth factor signaling and nucleotide synthesis, respectively, were selectively amplified in the tumor. Second-generation sequencing uncovered all types of mutations in this MPM tumor, with DNA rearrangements representing the dominant type. PMID:20485525

  16. Whole Chloroplast Genome Sequencing in Fragaria Using Deep Sequencing: A Comparison of Three Methods

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Chloroplast sequences previously investigated in Fragaria revealed low amounts of variation. Deep sequencing technologies enable economical sequencing of complete chloroplast genomes. These sequences can potentially provide robust phylogenetic resolution, even at low taxonomic levels within plant gr...

  17. Ten years of bacterial genome sequencing: comparative-genomics-based discoveries

    Microsoft Academic Search

    Tim T. Binnewies; Yair Motro; Peter F. Hallin; Ole Lund; David Dunn; Tom La; David J. Hampson; Matthew Bellgard; Trudy M. Wassenaar; David W. Ussery

    2006-01-01

    It has been more than 10 years since the first bacterial genome sequence was published. Hundreds of bacterial genome sequences are now available for comparative genomics, and searching a given protein against more than a thousand genomes will soon be possible. The subject of this review will address a relatively straightforward question: “What have we learned from this vast amount of

  18. Genome Sequence of the Pea Aphid Acyrthosiphon pisum

    E-print Network

    Paris-Sud XI, Université de

    Genome Sequence of the Pea Aphid Acyrthosiphon pisum The International Aphid Genomics Consortium we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple

  19. Chapter 27 -- Breast Cancer Genomics, Section VI, Pathology and Biological Markers of Invasive Breast Cancer

    SciTech Connect

    Spellman, Paul T.; Heiser, Laura; Gray, Joe W.

    2009-06-18

    Breast cancer is predominantly a disease of the genome with cancers arising and progressing through accumulation of aberrations that alter the genome - by changing DNA sequence, copy number, and structure in ways that contribute to diverse aspects of cancer pathophysiology. Classic examples of genomic events that contribute to breast cancer pathophysiology include inherited mutations in BRCA1, BRCA2, TP53, and CHK2 that contribute to the initiation of breast cancer, amplification of ERBB2 (formerly HER2) and mutations of elements of the PI3-kinase pathway that activate aspects of epidermal growth factor receptor (EGFR) signaling and deletion of CDKN2A/B that contributes to cell cycle deregulation and genome instability. It is now apparent that accumulation of these aberrations is a time-dependent process that accelerates with age. Although American women living to an age of 85 have a 1 in 8 chance of developing breast cancer, the incidence of cancer in women younger than 30 years is low. This is consistent with a multistep cancer progression model whereby mutation and selection drive the tumor's development, analogous to traditional Darwinian evolution. In the case of cancer, the driving events are changes in sequence, copy number, and structure of DNA and alterations in chromatin structure or other epigenetic marks. Our understanding of the genetic, genomic, and epigenomic events that influence the development and progression of breast cancer is increasing at a remarkable rate through application of powerful analysis tools that enable genome-wide analysis of DNA sequence and structure, copy number, allelic loss, and epigenomic modification. Application of these techniques to elucidation of the nature and timing of these events is enriching our understanding of mechanisms that increase breast cancer susceptibility, enable tumor initiation and progression to metastatic disease, and determine therapeutic response or resistance. These studies also reveal the molecular differences between cancer and normal that may be exploited to therapeutic benefit or that provide targets for molecular assays that may enable early cancer detection, and predict individual disease progression or response to treatment. This chapter reviews current and future directions in genome analysis and summarizes studies that provide insights into breast cancer pathophysiology or that suggest strategies to improve breast cancer management.

  20. The International Rice Genome Sequencing Project: progress and prospects

    Microsoft Academic Search

    T. Sasaki; T. Matsumoto; T. Baba; K. Yamamoto; J. Wu; Y. Katayose; K. Sakata

    The rice genome sequencing project has been pursued as a national project in Japan since 1998. At the same time, a desire to accelerate the sequencing of the entire rice genome led to the formation of the International Rice Genome Sequencing Project (IRGSP), initially comprising five countries. The sequencing strategy is the conventional clone-by-clone shotgun method using P1-derived

  1. Initial sequencing and comparative analysis of the mouse genome

    SciTech Connect

    Waterston, Robert H.; Lindblad-Toh, Kerstin; Birney, Ewan; Rogers, Jane; Abril, Josep F.; Agarwal, Pankaj; Agarwala, Richa; Ainscough, Rachel; Alexandersson, Marina; An, Peter; Antonarakis, Stylianos E.; Attwood, John; Baertsch, Robert; Bailey, Jonathon; Barlow, Karen; Beck, Stephan; Berry, Eric; Birren, Bruce; Bloom, Toby; Bork, Peer; Botcherby, Marc; Bray, Nicolas; Brent, Michael R.; Brown, Daniel G.; Brown, Stephen D.; Bult, Carol; Burton, John; Butler, Jonathan; Campbell, Robert D.; Carninci, Piero; Cawley, Simon; Chiaromonte, Francesca; Chinwalla, Asif T.; Church, Deanna M.; Clamp, Michele; Clee, Christopher; Collins, Francis S.; Cook, Lisa L.; Copley, Richard R.; Coulson, Alan; Couronne, Olivier; Cuff, James; Curwen, Val; Cutts, Tim; Daly, Mark; David, Robert; Davies, Joy; Delehaunty, Kimberly D.; Deri, Justin; Dermitzakis, Emmanouil T.; Dewey, Colin; Dickens, Nicholas J.; Diekhans, Mark; Dodge, Sheila; Dubchak, Inna; Dunn, Diane M.; Eddy, Sean R.; Elnitski, Laura; Emes, Richard D.; Eswara, Pallavi; Eyras, Eduardo; Felsenfeld, Adam; Fewell, Ginger A.; Flicek, Paul; Foley, Karen; Frankel, Wayne N.; Fulton, Lucinda A.; Fulton, Robert S.; Furey, Terrence S.; Gage, Diane; Gibbs, Richard A.; Glusman, Gustavo; Gnerre, Sante; Goldman, Nick; Goodstadt, Leo; Grafham, Darren; Graves, Tina A.; Green, Eric D.; Gregory, Simon; Guigo, Roderic; Guyer, Mark; Hardison, Ross C.; Haussler, David; Hayashizaki, Yoshihide; Hillier, LaDeana W.; Hinrichs, Angela; Hlavina, Wratko; Holzer, Timothy; Hsu, Fan; Hua, Axin; Hubbard, Tim; Hunt, Adrienne; Jackson, Ian; Jaffe, David B.; Johnson, L. Steven; Jones, Matthew; Jones, Thomas A.; Joy, Ann; Kamal, Michael; Karlsson, Elinor K.; Karolchik, Donna; Kasprzyk, Arkadiusz; Kawai, Jun; Keibler, Evan; Kells, Cristyn; Kent, W. James; Kirby, Andrew; Kolbe, Diana L.; Korf, Ian; Kucherlapati, Raju S.; Kulbokas III, Edward J.; Kulp, David; Landers, Tom; Leger, J.P.; Leonard, Steven; Letunic, Ivica; Levine, Rosie; et al.

    2002-12-15

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

  2. A comparison of whole genome sequencing with exome sequencing for family-based association studies

    PubMed Central

    2014-01-01

    As the cost of DNA sequencing decreases, association studies based on whole genome sequencing are now becoming feasible. It is still unclear, however, how much more we could gain from whole genome sequencing compared to exome sequencing, which has been widely used to study a variety of diseases. In this project, we performed a comparison between whole genome sequencing and exome sequencing for family-based association analysis using data from Genetic Analysis Workshop 18. Whole genome sequencing was able to identify several significant hits within intergenic regions. However, the increased cost of multiple testing counteracted the benefits and resulted in a higher false discovery rate. Our results suggest that exome sequencing is a cost-effective way to identify disease-related variants. With the decreasing sequencing cost and accumulating knowledge of the human genome, whole genome sequencing has the potential to identify important variants in regulatory regions typically inaccessible for exome sequencing. PMID:25519383

  3. [Science, communication and policy: sequencing the rice genome].

    PubMed

    Delseny, Michel

    2003-04-01

    Nearly 4 years after launching the International Rice Genome Sequencing Project (IRGSP), the rice genome sequence is almost completed. This is the second plant genome after Arabidopsis thaliana, and one expects that it is more representative of other cereal genomes. Indeed, no more than 4 sequences have been independently reported as a result of a tough competition between economy, politics and media. The efficiency and impact of this way of managing a large-scale project is questionable. This paper reports the various phases in sequencing the rice genome as well as what we are starting to learn. PMID:12836228

  4. Statistical Properties of Open Reading Frames in Complete Genome Sequences

    Microsoft Academic Search

    Wentian Li

    1999-01-01

    Some statistical properties of open reading frames in all currently available complete genome sequences are analyzed (seventeen prokaryotic genomes, and 16 chromosome sequences from the yeast genome). The size distribution of open reading frames is characterized by various techniques, such as quantile tables, QQ-plots, rank-size plots (Zipf's plots), and spatial densities. The issue of the influence of CG% on

  5. Draft Genome Sequence of Geotrichum candidum Strain 3C

    PubMed Central

    Bobrov, Kirill S.; Eneyskaya, Elena V.; Kulminskaya, Anna A.

    2014-01-01

    We report here the draft genome sequence of Geotrichum candidum strain 3C, which is a filamentous yeast-like fungus that holds great promise for biotechnology. The genome was sequenced using Ion Torrent and 454 platforms. The estimated genome size was 41.4 Mb, and 14,579 protein-coding genes were predicted ab initio. PMID:25278525

  6. Combined Evidence Annotation of Transposable Elements in Genome Sequences

    Microsoft Academic Search

    Hadi Quesneville; Olivier Andrieu; Delphine Autard; Danielle Nouaud; Michael Ashburner; Dominique Anxolabehere

    2005-01-01

    Transposable elements (TEs) are mobile, repetitive sequences that make up significant fractions of metazoan genomes. Despite their near ubiquity and importance in genome and chromosome biology, most efforts to annotate TEs in genome sequences rely on the results of a single computational program, RepeatMasker. In contrast, recent advances in gene annotation indicate that high-quality gene models can be produced from

  7. Genome Sequence of Pediococcus pentosaceus Strain IE-3

    PubMed Central

    Midha, Samriti; Ranjan, Manish; Sharma, Vikas; Kumari, Annu; Singh, Pradip Kumar

    2012-01-01

    We report the 1.8-Mb genome sequence of Pediococcus pentosaceus strain IE-3, isolated from a dairy effluent sample. The whole-genome sequence of this strain will aid in comparative genomics of Pediococcus pentosaceus strains of diverse ecological origins and their biotechnological applications. PMID:22843596

  8. Draft Genome Sequence of Geotrichum candidum Strain 3C.

    PubMed

    Polev, Dmitrii E; Bobrov, Kirill S; Eneyskaya, Elena V; Kulminskaya, Anna A

    2014-01-01

    We report here the draft genome sequence of Geotrichum candidum strain 3C, which is a filamentous yeast-like fungus that holds great promise for biotechnology. The genome was sequenced using Ion Torrent and 454 platforms. The estimated genome size was 41.4 Mb, and 14,579 protein-coding genes were predicted ab initio. PMID:25278525

  9. Complete Genome Sequence of the Embu Virus Strain SPAn880

    PubMed Central

    Antwerpen, Markus; Georgi, Enrico; Vette, Philipp; Zoeller, Gudrun; Meyer, Hermann

    2014-01-01

    We report the complete genome sequence of the Embu virus. The genome consists of 185,139 bp and is nearly identical to that of the Cotia virus. This is the first report on the Embu virus genome sequence, which has been considered an unclassified poxvirus until now. PMID:25477400

  10. Simple sequence repeats in bryophyte mitochondrial genomes.

    PubMed

    Zhao, Chao-Xian; Zhu, Rui-Liang; Liu, Yang

    2014-02-01

    Simple sequence repeats (SSRs) are thought to be common in plant mitochondrial (mt) genomes, but have yet to be fully described for bryophytes. We screened the mt genomes of two liverworts (Marchantia polymorpha and Pleurozia purpurea), two mosses (Physcomitrella patens and Anomodon rugelii) and two hornworts (Phaeoceros laevis and Nothoceros aenigmaticus), and detected 475 SSRs. Some SSRs were found to be conserved during evolution; all but one, which occurs in both liverworts and mosses, are shared only by the two liverworts, the two mosses or the two hornworts. SSRs are known as DNA tracts having high mutation rates; however, according to our observations, they can still evolve slowly. The conservation of these SSRs suggests that they are under strong selection and could play critical roles in maintaining gene functions. PMID:24491104

  11. Methods for Obtaining and Analyzing Whole Chloroplast Genome Sequences

    Microsoft Academic Search

    Robert K. Jansen; Linda A. Raubeson; Jeffrey L. Boore; Claude W. dePamphilis; Timothy W. Chumley; Rosemarie C. Haberle; Stacia K. Wyman; Andrew J. Alverson; Rhiannon Peery; Sallie J. Herman; H. Matthew Fourcade; Jennifer V. Kuehl; Joel R. McNeal; James Leebens-Mack; Liying Cui

    2005-01-01

    During the past decade, there has been a rapid increase in our understanding of plastid genome organization and evolution due to the availability of many new completely sequenced genomes. There are 45 complete genomes published and ongoing projects are likely to increase this sampling to nearly 200 genomes during the next 5 years. Several groups of researchers including ours have

  12. Initial sequencing and comparative analysis of the mouse genome

    E-print Network

    Eddy, Sean

    and knockin techniques. For these and other reasons, the Human Genome Project (HGP) recognized from its ... The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome ... comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from

  13. Genomic Sequence Comparisons, 1987-2003 Final Report

    SciTech Connect

    George M. Church

    2004-07-29

    This project was to develop new DNA sequencing and RNA and protein quantitation methods and related genome annotation tools. The project began in 1987 with the development of multiplex sequencing (published in Science in 1988), and one of the first automated sequencing methods. This led to the first commercial genome sequence in 1994 and to the establishment of the main commercial participants (GTC then Agencourt) in the public DOE/NIH genome project. In collaboration with GTC we contributed to one of the first complete DOE genome sequences, in 1997, that of Methanobacterium thermoautotrophicum, a species of great relevance to energy-rich gas production.

  14. Sequencing approach evaluates all 24 genes implicated in breast cancer

    Cancer.gov

    Since 1994, many thousands of women with breast cancer from families severely affected with the disease have been tested for inherited mutations in BRCA1 and BRCA2. The vast majority of those patients were told that their gene sequences were normal. With the development of modern genomics sequencing tools, the discovery of additional genes implicated in breast cancer and the change in the legal status of genetic testing for BRCA1 and BRCA2, it is now possible to determine how often families in these circumstances actually do carry cancer-predisposing mutations in BRCA1, BRCA2, or another gene implicated in breast cancer, despite the results of their previous genetic tests. The results were presented Oct. 24, by researchers from the University of Washington (which is affiliated with the Fred Hutchinson Cancer Research Center) at the American Society of Human Genetics 2013 meeting in Boston.

  15. Building on Discoveries in Cancer Genomics

    Cancer.gov

    Deciphering the genomes of many cancers is necessary to understand the extent of their complexity and diversity. These molecular analyses are leading to a new classification of tumors, which may have therapeutic implications.

  16. Two genome sequences of the same bacterial strain, Gluconacetobacter diazotrophicus PAl 5, suggest a new standard in genome sequence submission

    PubMed Central

    Giongo, Adriana; Tyler, Heather L.; Zipperer, Ursula N.; Triplett, Eric W.

    2010-01-01

    Gluconacetobacter diazotrophicus PAl 5 is of agricultural significance due to its ability to provide fixed nitrogen to plants. Consequently, its genome sequence has been eagerly anticipated to enhance understanding of endophytic nitrogen fixation. Two groups have sequenced the PAl 5 genome from the same source (ATCC 49037), though the resulting sequences contain a surprisingly high number of differences. Therefore, an optical map of PAl 5 was constructed in order to determine which genome assembly more closely resembles the chromosomal DNA by aligning each sequence against a physical map of the genome. While one sequence aligned very well, over 98% of the second sequence contained numerous rearrangements. The many differences observed between these two genome sequences could be owing to either assembly errors or rapid evolutionary divergence. The extent of the differences derived from sequence assembly errors could be assessed if the raw sequencing reads were provided by both genome centers at the time of genome sequence submission. Hence, a new genome sequence standard is proposed whereby the investigator supplies the raw reads along with the closed sequence so that the community can make more accurate judgments on whether differences observed in a single strain may be of biological origin or are simply caused by differences in genome assembly procedures. PMID:21304715

  17. Characterizing the cancer genome in lung adenocarcinoma

    Microsoft Academic Search

    Barbara A. Weir; Michele S. Woo; Gad Getz; Sven Perner; Li Ding; Rameen Beroukhim; William M. Lin; Michael A. Province; Aldi Kraja; Laura A. Johnson; Kinjal Shah; Mitsuo Sato; Roman K. Thomas; Justine A. Barletta; Ingrid B. Borecki; Stephen Broderick; Andrew C. Chang; Derek Y. Chiang; Lucian R. Chirieac; Jeonghee Cho; Yoshitaka Fujii; Adi F. Gazdar; Thomas Giordano; Heidi Greulich; Megan Hanna; Bruce E. Johnson; Mark G. Kris; Alex Lash; Ling Lin; Neal Lindeman; Elaine R. Mardis; John D. McPherson; John D. Minna; Margaret B. Morgan; Mark Nadel; Mark B. Orringer; John R. Osborne; Brad Ozenberger; Alex H. Ramos; James Robinson; Jack A. Roth; Valerie Rusch; Hidefumi Sasaki; Frances Shepherd; Carrie Sougnez; Margaret R. Spitz; Ming-Sound Tsao; David Twomey; Roel G. W. Verhaak; George M. Weinstock; David A. Wheeler; Wendy Winckler; Akihiko Yoshizawa; Soyoung Yu; Maureen F. Zakowski; Qunyuan Zhang; David G. Beer; Ignacio I. Wistuba; Mark A. Watson; Levi A. Garraway; Marc Ladanyi; William D. Travis; William Pao; Mark A. Rubin; Stacey B. Gabriel; Richard A. Gibbs; Harold E. Varmus; Richard K. Wilson; Eric S. Lander; Matthew Meyerson

    2007-01-01

    Somatic alterations in cellular DNA underlie almost all human cancers. The prospect of targeted therapies and the development of high-resolution, genome-wide approaches are now spurring systematic efforts to characterize cancer genomes. Here we report a large-scale project to characterize copy-number alterations in primary lung adenocarcinomas. By analysis of a large collection of tumours (n = 371) using dense single nucleotide polymorphism arrays, we identify a total of 57

  18. Characterizing the cancer genome in lung adenocarcinoma

    Microsoft Academic Search

    Barbara A. Weir; Michele S. Woo; Gad Getz; Sven Perner; Li Ding; Rameen Beroukhim; William M. Lin; Michael A. Province; Aldi Kraja; Laura A. Johnson; Kinjal Shah; Mitsuo Sato; Roman K. Thomas; Justine A. Barletta; Ingrid B. Borecki; Stephen Broderick; Andrew C. Chang; Derek Y. Chiang; Lucian R. Chirieac; Jeonghee Cho; Yoshitaka Fujii; Adi F. Gazdar; Thomas Giordano; Heidi Greulich; Megan Hanna; Bruce E. Johnson; Mark G. Kris; Alex Lash; Ling Lin; Neal Lindeman; Elaine R. Mardis; John D. McPherson; John D. Minna; Margaret B. Morgan; Mark Nadel; Mark B. Orringer; John R. Osborne; Brad Ozenberger; Alex H. Ramos; James Robinson; Jack A. Roth; Valerie Rusch; Hidefumi Sasaki; Frances Shepherd; Carrie Sougnez; Margaret R. Spitz; Ming-Sound Tsao; David Twomey; Roel G. W. Verhaak; George M. Weinstock; David A. Wheeler; Wendy Winckler; Akihiko Yoshizawa; Soyoung Yu; Maureen F. Zakowski; Qunyuan Zhang; David G. Beer; Ignacio I. Wistuba; Mark A. Watson; Levi A. Garraway; Marc Ladanyi; William D. Travis; William Pao; Mark A. Rubin; Stacey B. Gabriel; Richard A. Gibbs; Harold E. Varmus; Richard K. Wilson; Eric S. Lander; Matthew Meyerson

    2007-01-01

    Somatic alterations in cellular DNA underlie almost all human cancers. The prospect of targeted therapies and the development of high-resolution, genome-wide approaches are now spurring systematic efforts to characterize cancer genomes. Here we report a large-scale project to characterize copy-number alterations in primary lung adenocarcinomas. By analysis of a large collection of tumours (n = 371) using dense single nucleotide

  19. The Genome Sequencer FLX System--longer reads, more applications, straight forward bioinformatics and more complete data sets.

    PubMed

    Droege, Marcus; Hill, Brendon

    2008-08-31

    The Genome Sequencer FLX System (GS FLX), powered by 454 Sequencing, is a next-generation DNA sequencing technology featuring a unique mix of long reads, exceptional accuracy, and ultra-high throughput. It has been proven to be the most versatile of all currently available next-generation sequencing technologies, supporting many high-profile studies in over seven application categories. GS FLX users have pursued innovative research in de novo sequencing, re-sequencing of whole genomes and target DNA regions, metagenomics, and RNA analysis. 454 Sequencing is a powerful tool for human genetics research: it has recently re-sequenced the genome of an individual human, is currently re-sequencing the complete human exome and targeted genomic regions using the NimbleGen sequence capture process, and has detected low-frequency somatic mutations linked to cancer. PMID:18616967

  20. Current challenges in de novo plant genome sequencing and assembly

    PubMed Central

    2012-01-01

    Genome sequencing is now affordable, but assembling plant genomes de novo remains challenging. We assess the state of the art of assembly and review the best practices for the community. PMID:22546054

  1. Initial impact of the sequencing of the human genome

    E-print Network

    Massachusetts Institute of Technology. Department of Biology; Broad Institute of MIT and Harvard; Lander, Eric S.

    The sequence of the human genome has dramatically accelerated biomedical research. Here I explore its impact, in the decade since its publication, on our understanding of the biological functions encoded in the genome, on ...

  2. Genome sequencing of the important oilseed crop Sesamum indicum L

    PubMed Central

    2013-01-01

    The Sesame Genome Working Group (SGWG) has been formed to sequence and assemble the sesame (Sesamum indicum L.) genome. The status of this project and our planned analyses are described. PMID:23369264

  3. Fast and Sensitive Alignment of Large Genomic Sequences

    Microsoft Academic Search

    Michael Brudno; Burkhard Morgenstern

    2002-01-01

    Comparative analysis of syntenic genome sequences can be used to identify functional sites such as exons and regulatory elements. Here, the first step is to align two or several evolutionarily related sequences and, in recent years, a number of computer programs have been developed for alignment of large genomic sequences. Some of these programs are extremely fast but often time-efficiency

  4. Digital Signal Processing in the Analysis of Genomic Sequences

    Microsoft Academic Search

    Juan V. Lorenzo-Ginori; Aníbal Rodríguez-Fuentes; Ricardo Grau Ábalo; Robersy Sánchez

    2009-01-01

    Digital Signal Processing (DSP) applications in Bioinformatics have received great attention in recent years, where new effective methods for genomic sequence analysis, such as the detection of coding regions, have been developed. The use of DSP principles to analyze genomic sequences requires defining an adequate representation of the nucleotide bases by numerical values, converting the nucleotide sequences into

  5. A large genome center's improvements to the Illumina sequencing system

    Microsoft Academic Search

    Michael A Quail; Iwanka Kozarewa; Frances Smith; Aylwyn Scally; Philip J Stephens; Richard Durbin; Harold Swerdlow; Daniel J Turner

    2008-01-01

    The Wellcome Trust Sanger Institute is one of the world's largest genome centers, and a substantial amount of our sequencing is performed with 'next-generation' massively parallel sequencing technologies: in June 2008 the quantity of purity-filtered sequence data generated by our Genome Analyzer (Illumina) platforms reached 1 terabase, and our average weekly Illumina production output is currently 64 gigabases. Here we

  6. Next Generation Sequencing at the University of Chicago Genomics Core

    SciTech Connect

    Faber, Pieter [University of Chicago

    2013-04-24

    The University of Chicago Genomics Core provides University of Chicago investigators (and external clients) access to state-of-the-art genomics capabilities: next generation sequencing, Sanger sequencing / genotyping and microarrays (gene expression, genotyping, and methylation). The current presentation will highlight our capabilities in the area of ultra-high throughput sequencing analysis.

  7. MIPS: a database for genomes and protein sequences

    Microsoft Academic Search

    Hans-werner Mewes; Dmitrij Frishman; Christian Gruber; Birgitta Geier; Dirk Haase; Andreas Kaps; Kai Lemcke; Gertrud Mannhaupt; Friedhelm Pfeiffer; Christine M. Schüller; S. Stocker; B. Weil

    2000-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF), Martinsried, near Munich, Germany, continues its longstanding tradition to develop and maintain high quality curated genome databases. In addition, efforts have been intensified to cover the wealth of complete genome sequences in a systematic, comprehensive form. Bioinformatics, supporting national as well as European sequencing and functional analysis projects, has resulted in several

  8. Validation of rice genome sequence by optical mapping

    Microsoft Academic Search

    Shiguo Zhou; Michael C Bechner; Chris P Churas; Louise Pape; Sally A Leong; Rod Runnheim; Dan K Forrest; Steve Goldstein; Miron Livny; David C Schwartz

    2007-01-01

    BACKGROUND: Rice feeds much of the world, and possesses the simplest genome analyzed to date within the grass family, making it an economically relevant model system for other cereal crops. Although the rice genome is sequenced, validation and gap closing efforts require purely independent means for accurate finishing of sequence build data. RESULTS: To facilitate ongoing sequencing finishing and validation

  9. Genome variation discovery with high-throughput sequencing data

    E-print Network

    Toronto, University of

    require extensive computational analysis in order to identify genomic variants present in the sequenced ... and to understand the regulation of genes by sequencing chromatin immunoprecipitation products (ChIP-Seq) ... representative of the species (the reference), while an HTS technology is used to sequence reads from the genome

  10. A sequence-based survey of the complex structural organization of tumor genomes

    SciTech Connect

    Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav; Yu, Peng; Wu, Chunxiao; Huang, Guiqing; Linardopoulou, Elena V.; Trask, Barbara J.; Waldman, Frederic; Costello, Joseph; Pienta, Kenneth J.; Mills, Gordon B.; Bajsarowicz, Krystyna; Kobayashi, Yasuko; Sridharan, Shivaranjani; Paris, Pamela; Tao, Quanzhou; Aerni, Sarah J.; Brown, Raymond P.; Bashir, Ali; Gray, Joe W.; Cheng, Jan-Fang; de Jong, Pieter; Nefedov, Mikhail; Ried, Thomas; Padilla-Nash, Hesed M.; Collins, Colin C.

    2008-04-03

    The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.

  11. CaPSID: A bioinformatics platform for computational pathogen sequence identification in human genomes and transcriptomes

    PubMed Central

    2012-01-01

    Background: It is now well established that nearly 20% of human cancers are caused by infectious agents, and the list of human oncogenic pathogens will grow in the future for a variety of cancer types. Whole tumor transcriptome and genome sequencing by next-generation sequencing technologies presents an unparalleled opportunity for pathogen detection and discovery in human tissues but requires development of new genome-wide bioinformatics tools. Results: Here we present CaPSID (Computational Pathogen Sequence IDentification), a comprehensive bioinformatics platform for identifying, querying and visualizing both exogenous and endogenous pathogen nucleotide sequences in tumor genomes and transcriptomes. CaPSID includes a scalable, high performance database for data storage and a web application that integrates the genome browser JBrowse. CaPSID also provides useful metrics for sequence analysis of pre-aligned BAM files, such as gene and genome coverage, and is optimized to run efficiently on multiprocessor computers with low memory usage. Conclusions: To demonstrate the usefulness and efficiency of CaPSID, we carried out a comprehensive analysis of both a simulated dataset and transcriptome samples from ovarian cancer. CaPSID correctly identified all of the human and pathogen sequences in the simulated dataset, while in the ovarian dataset CaPSID’s predictions were successfully validated in vitro. PMID:22901030

  12. AACR 2014: NCI/NIH-Sponsored Session: Large-Scale Genomics Data for the Research Community through the NCI Center for Cancer Genomics

    Cancer.gov

    The NCI’s Center for Cancer Genomics (CCG), which includes the Office of Cancer Genomics and The Cancer Genome Atlas Program Office, provides the research community access to large-scale molecular characterization data, which is largely sequence-based. CCG programs aim to improve patient outcome through identification of valid molecular targets and associated molecular markers (prognostic or diagnostic), in and across diseases investigated, which should ultimately lead to the rapid development of novel, more effective therapies.

  13. Complete genome sequence of Arcanobacterium haemolyticum type strain (11018T)

    SciTech Connect

    Yasawong, Montri [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Teshima, Hazuki [Los Alamos National Laboratory (LANL); Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Pukall, Rudiger [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. 
Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany

    2010-01-01

    Vulcanisaeta distributa Itoh et al. 2002 belongs to the family Thermoproteaceae in the phylum Crenarchaeota. The genus Vulcanisaeta is characterized by a global distribution in hot and acidic springs. This is the first genome sequence from a member of the genus Vulcanisaeta and the seventh genome sequence in the family Thermoproteaceae. The 2,374,137 bp long genome with its 2,544 protein-coding and 49 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  14. Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions fr...

  15. Sequencing Strategies for Population and Cancer Epidemiology Studies (SeqSPACE) Webinar Series

    Cancer.gov

    The Epidemiology and Genomics Research Program is launching a new webinar series entitled "Sequencing Strategies for Population and Cancer Epidemiology Studies (SeqSPACE)." The purpose of this forum is to provide an opportunity for our grantees and other interested individuals to share lessons learned and practical information regarding the application of next generation sequencing to cancer epidemiology studies.

  16. Sequencing techniques uncover mutations in genes that can increase cancer risk

    Cancer.gov

    Now that the findings from the Human Genome Project are widely available, scientists are working to put that data to work to understand the genetic causes of many diseases, including cancer, by using the latest sequencing techniques.

  17. Toolbox for mobile-element insertion detection on cancer genomes.

    PubMed

    Lee, Wan-Ping; Wu, Jiantao; Marth, Gabor T

    2014-01-01

    Mobile elements constitute greater than 45% of the human genome as a result of repeated insertion events during human genome evolution. Although most mobile elements are fixed within the human population, some elements (including ALU, long interspersed elements (LINE) 1 (L1), and SVA) are still actively duplicating and may result in life-threatening human diseases such as cancer, motivating the need for accurate mobile-element insertion (MEI) detection tools. We developed a software package, TANGRAM, for MEI detection in next-generation sequencing data, currently serving as the primary MEI detection tool in the 1000 Genomes Project. TANGRAM takes advantage of valuable mapping information provided by our own MOSAIK mapper, and until recently required MOSAIK mappings as its input. In this study, we report a new feature that enables TANGRAM to be used on alignments generated by any mainstream short-read mapper, making it accessible for many genomic users. To demonstrate its utility for cancer genome analysis, we have applied TANGRAM to the TCGA (The Cancer Genome Atlas) mutation calling benchmark 4 dataset. TANGRAM is fast, accurate, easy to use, and open source at https://github.com/jiantao/Tangram. PMID:25452688

  18. Toolbox for Mobile-Element Insertion Detection on Cancer Genomes

    PubMed Central

    Lee, Wan-Ping; Wu, Jiantao; Marth, Gabor T

    2014-01-01

    Mobile elements constitute greater than 45% of the human genome as a result of repeated insertion events during human genome evolution. Although most mobile elements are fixed within the human population, some elements (including ALU, long interspersed elements (LINE) 1 (L1), and SVA) are still actively duplicating and may result in life-threatening human diseases such as cancer, motivating the need for accurate mobile-element insertion (MEI) detection tools. We developed a software package, TANGRAM, for MEI detection in next-generation sequencing data, currently serving as the primary MEI detection tool in the 1000 Genomes Project. TANGRAM takes advantage of valuable mapping information provided by our own MOSAIK mapper, and until recently required MOSAIK mappings as its input. In this study, we report a new feature that enables TANGRAM to be used on alignments generated by any mainstream short-read mapper, making it accessible for many genomic users. To demonstrate its utility for cancer genome analysis, we have applied TANGRAM to the TCGA (The Cancer Genome Atlas) mutation calling benchmark 4 dataset. TANGRAM is fast, accurate, easy to use, and open source at https://github.com/jiantao/Tangram. PMID:25452688

  19. Mechanisms of Base Substitution Mutagenesis in Cancer Genomes

    PubMed Central

    Bacolla, Albino; Cooper, David N.; Vasquez, Karen M.

    2014-01-01

    Cancer genome sequence data provide an invaluable resource for inferring the key mechanisms by which mutations arise in cancer cells, favoring their survival, proliferation and invasiveness. Here we examine recent advances in understanding the molecular mechanisms responsible for the predominant type of genetic alteration found in cancer cells, somatic single base substitutions (SBSs). Cytosine methylation, demethylation and deamination, charge transfer reactions in DNA, DNA replication timing, chromatin status and altered DNA proofreading activities are all now known to contribute to the mechanisms leading to base substitution mutagenesis. We review current hypotheses as to the major processes that give rise to SBSs and evaluate their relative relevance in the light of knowledge acquired from cancer genome sequencing projects and the study of base modifications, DNA repair and lesion bypass. Although gene expression data on APOBEC3B enzymes provide support for a role in cancer mutagenesis through U:G mismatch intermediates, the enzyme preference for single-stranded DNA may limit its activity genome-wide. For SBSs at both CG:CG and YC:GR sites, we outline evidence for a prominent role of damage by charge transfer reactions that follow interactions of the DNA with reactive oxygen species (ROS) and other endogenous or exogenous electron-abstracting molecules. PMID:24705290

  20. Individual Patient Cancer Profiles in The Cancer Genome Atlas - Jianjiong Gao, TCGA Scientific Symposium 2012

    Cancer.gov

  1. Next-generation sequencing strategies for characterizing the turkey genome.

    PubMed

    Dalloul, Rami A; Zimin, Aleksey V; Settlage, Robert E; Kim, Sungwon; Reed, Kent M

    2014-02-01

    The turkey genome sequencing project was initiated in 2008 and has relied primarily on next-generation sequencing (NGS) technologies. Our first efforts used a synergistic combination of 2 NGS platforms (Roche/454 and Illumina GAII), detailed bacterial artificial chromosome (BAC) maps, and unique assembly tools to sequence and assemble the genome of the domesticated turkey, Meleagris gallopavo. Since the first release in 2010, efforts to improve the genome assembly, gene annotation, and genomic analyses continue. The initial assembly build (2.01) represented about 89% of the genome sequence with 17X coverage depth (931 Mb). Sequence contigs were assigned to 30 of the 40 chromosomes with approximately 10% of the assembled sequence corresponding to unassigned chromosomes (ChrUn). The sequence has been refined through both genome-wide and area-focused sequencing, including shotgun and paired-end sequencing, and targeted sequencing of chromosomal regions with low or incomplete coverage. These additional efforts have improved the sequence assembly resulting in 2 subsequent genome builds of higher genome coverage (25X/Build3.0 and 30X/Build4.0) with a current sequence totaling 1,010 Mb. Further, BAC with end sequences assigned to the Z/W and MG18 (MHC) chromosomes, ChrUn, or not placed in the previous build were isolated, deeply sequenced (Hi-Seq), and incorporated into the latest build (5.0). To aid in the annotation and to generate a gene expression atlas of major tissues, a comprehensive set of RNA samples was collected at various developmental stages of female and male turkeys. Transcriptome sequencing data (using Illumina Hi-Seq) will provide information to enhance the final assembly and ultimately improve sequence annotation. 
The most current sequence covers more than 95% of the turkey genome and should yield a much improved gene level of annotation, making it a valuable resource for studying genetic variations underlying economically important traits in poultry. PMID:24570472

  2. Applications of next-generation sequencing technologies in functional genomics

    Microsoft Academic Search

    Olena Morozova; Marco A. Marra

    2008-01-01

    A new generation of sequencing technologies, from Illumina/Solexa, ABI/SOLiD, 454/Roche, and Helicos, has provided unprecedented opportunities for high-throughput functional genomic research. To date, these technologies have been applied in a variety of contexts, including whole-genome sequencing, targeted resequencing, discovery of transcription factor binding sites, and noncoding RNA expression profiling. This review discusses applications of next-generation sequencing technologies in functional genomics

  3. Diversity through duplication: Whole-genome sequencing reveals novel gene retrocopies in the human population

    PubMed Central

    Richardson, Sandra R; Salvador-Palomeque, Carmen; Faulkner, Geoffrey J

    2014-01-01

    Gene retrocopies are generated by reverse transcription and genomic integration of mRNA. As such, retrocopies present an important exception to the central dogma of molecular biology, and have substantially impacted the functional landscape of the metazoan genome. While an estimated 8,000–17,000 retrocopies exist in the human genome reference sequence, the extent of variation between individuals in terms of retrocopy content has remained largely unexplored. Three recent studies by Abyzov et al., Ewing et al. and Schrider et al. have exploited 1,000 Genomes Project Consortium data, as well as other sources of whole-genome sequencing data, to uncover novel gene retrocopies. Here, we compare the methods and results of these three studies, highlight the impact of retrocopies in human diversity and genome evolution, and speculate on the potential for somatic gene retrocopies to impact cancer etiology and genetic diversity among individual neurons in the mammalian brain. PMID:24615986

  4. Research Resources for Cancer Epidemiology and Genomics

    Cancer.gov

    The Epidemiology and Genomics Research Program (EGRP) has developed a list with links to a number of cancer-related research resources available through EGRP-supported cohorts, consortia, and initiatives; other research programs in the Division of Cancer Control and Population Sciences and NCI; and partners elsewhere at NIH and other research organizations.

  5. Comparative DNA Sequence Analysis of Wheat and Rice Genomes

    PubMed Central

    Sorrells, Mark E.; La Rota, Mauricio; Bermudez-Kandianis, Catherine E.; Greene, Robert A.; Kantety, Ramesh; Munkvold, Jesse D.; Miftahudin; Mahmoud, Ahmed; Ma, Xuefeng; Gustafson, Perry J.; Qi, Lili L.; Echalier, Benjamin; Gill, Bikram S.; Matthews, David E.; Lazo, Gerard R.; Chao, Shiaoman; Anderson, Olin D.; Edwards, Hugh; Linkiewicz, Anna M.; Dubcovsky, Jorge; Akhunov, Eduard D.; Dvorak, Jan; Zhang, Deshui; Nguyen, Henry T.; Peng, Junhua; Lapitan, Nora L.V.; Gonzalez-Hernandez, Jose L.; Anderson, James A.; Hossain, Khwaja; Kalavacharla, Venu; Kianian, Shahryar F.; Choi, Dong-Woog; Close, Timothy J.; Dilbirligi, Muharrem; Gill, Kulvinder S.; Steber, Camille; Walker-Simmons, Mary K.; McGuire, Patrick E.; Qualset, Calvin O.

    2003-01-01

    The use of DNA sequence-based comparative genomics for evolutionary studies and for transferring information from model species to crop species has revolutionized molecular genetics and crop improvement strategies. This study compared 4485 expressed sequence tags (ESTs) that were physically mapped in wheat chromosome bins, to the public rice genome sequence data from 2251 ordered BAC/PAC clones using BLAST. A rice genome view of homologous wheat genome locations based on comparative sequence analysis revealed numerous chromosomal rearrangements that will significantly complicate the use of rice as a model for cross-species transfer of information in nonconserved regions. PMID:12902377

  6. Comparative chloroplast genomics: Analyses including new sequences from the angiosperms Nuphar advena and Ranunculus macranthus

    E-print Network

    2007-01-01

    ... genome sequences and make comparisons (within angiosperms, seed plants, ... genome sequence from Korean Ginseng (Panax schinseng Nees) and comparative analysis of sequence evolution among 17 vascular plants. ... genomes of all other vascular plant taxa examined, a similar sequence ...

  7. The Cancer Genome Atlas Pan-Cancer analysis project

    E-print Network

    Lander, Eric S.

    The Cancer Genome Atlas (TCGA) Research Network has profiled and analyzed large numbers of human tumors to discover molecular aberrations at the DNA, RNA, protein and epigenetic levels. The resulting rich data provide a ...

  8. Sequencing and Assembly of the 22-Gb Loblolly Pine Genome

    PubMed Central

    Zimin, Aleksey; Stevens, Kristian A.; Crepeau, Marc W.; Holtz-Morris, Ann; Koriabine, Maxim; Marçais, Guillaume; Puiu, Daniela; Roberts, Michael; Wegrzyn, Jill L.; de Jong, Pieter J.; Neale, David B.; Salzberg, Steven L.; Yorke, James A.; Langley, Charles H.

    2014-01-01

    Conifers are the predominant gymnosperm. The size and complexity of their genomes have presented formidable technical challenges for whole-genome shotgun sequencing and assembly. We employed novel strategies that allowed us to determine the loblolly pine (Pinus taeda) reference genome sequence, the largest genome assembled to date. Most of the sequence data were derived from whole-genome shotgun sequencing of a single megagametophyte, the haploid tissue of a single pine seed. Although that constrained the quantity of available DNA, the resulting haploid sequence data were well-suited for assembly. The haploid sequence was augmented with multiple linking long-fragment mate pair libraries from the parental diploid DNA. For the longest fragments, we used novel fosmid DiTag libraries. Sequences from the linking libraries that did not match the megagametophyte were identified and removed. Assembly of the sequence data was aided by condensing the enormous number of paired-end reads into a much smaller set of longer “super-reads,” rendering subsequent assembly with an overlap-based assembly algorithm computationally feasible. To further improve the contiguity and biological utility of the genome sequence, additional scaffolding methods utilizing independent genome and transcriptome assemblies were implemented. The combination of these strategies resulted in a draft genome sequence of 20.15 billion bases, with an N50 scaffold size of 66.9 kbp. PMID:24653210
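
    The N50 scaffold size quoted in this abstract is a standard contiguity statistic: the length L such that scaffolds of length L or longer together hold at least half of the assembled bases. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name and the example lengths are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (not taken from the loblolly pine assembly).
scaffold_lengths = [100, 80, 60, 40, 20]
print(n50(scaffold_lengths))
```

    With these made-up lengths the total is 300, so the running sum first reaches 150 at the second-longest scaffold and the function returns 80.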

  9. Genome Science and Personalized Cancer Treatment

    ScienceCinema

    Joe Gray

    2010-01-08

    August 4, 2009 Berkeley Lab lecture: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

  10. Genome Science and Personalized Cancer Treatment

    SciTech Connect

    Gray, Joe

    2009-08-04

    Summer Lecture Series 2009: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

  11. Genome Science and Personalized Cancer Treatment

    SciTech Connect

    Joe Gray

    2009-08-07

    August 4, 2009 Berkeley Lab lecture: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

  12. Mapping the Human Reference Genome’s Missing Sequence by Three-Way Admixture in Latino Genomes

    PubMed Central

    Genovese, Giulio; Handsaker, Robert E.; Li, Heng; Kenny, Eimear E.; McCarroll, Steven A.

    2013-01-01

    A principal obstacle to completing maps and analyses of the human genome involves the genome’s “inaccessible” regions: sequences (often euchromatic and containing genes) that are isolated from the rest of the euchromatic genome by heterochromatin and other repeat-rich sequence. We describe a way to localize these sequences by using ancestry linkage disequilibrium in populations that derive ancestry from at least three continents, as is the case for Latinos. We used this approach to map the genomic locations of almost 20 megabases of sequence unlocalized or missing from the current human genome reference (NCBI Genome GRCh37)—a substantial fraction of the human genome’s remaining unmapped sequence. We show that the genomic locations of most sequences that originated from fosmids and larger clones can be admixture mapped in this way, by using publicly available whole-genome sequence data. Genome assembly efforts and future builds of the human genome reference will be strongly informed by this localization of genes and other euchromatic sequences that are embedded within highly repetitive pericentromeric regions. PMID:23932108

  13. Genome scanning : an AFM-based DNA sequencing technique

    E-print Network

    Elmouelhi, Ahmed (Ahmed M.), 1979-

    2003-01-01

    Genome Scanning is a powerful new technique for DNA sequencing. The method presented in this thesis uses an atomic force microscope with a functionalized cantilever tip to sequence single stranded DNA immobilized to a mica ...

  14. Insights from 20 years of bacterial genome sequencing.

    PubMed

    Land, Miriam; Hauser, Loren; Jun, Se-Ran; Nookaew, Intawat; Leuze, Michael R; Ahn, Tae-Hyuk; Karpinets, Tatiana; Lund, Ole; Kora, Guruprased; Wassenaar, Trudy; Poudel, Suresh; Ussery, David W

    2015-03-01

    Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genomes is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90% of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families. 
Why do we care about bacterial genome sequencing? There are many practical applications, such as genome-scale metabolic modeling, biosurveillance, bioforensics, and infectious disease epidemiology. In the near future, high-throughput sequencing of patient metagenomic samples could revolutionize medicine in terms of speed and accuracy of finding pathogens and knowing how to treat them. PMID:25722247
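
    The pan- and core-genome figures mentioned in this abstract can be illustrated with plain set operations: the pan-genome is the union of the gene families found in any genome of a species, and the core genome is their intersection. The sketch below assumes gene families have already been assigned to each genome; the function name, strain labels, and family identifiers are hypothetical and are not data from the review.

```python
def pan_and_core(genomes):
    """Return (pan_genome, core_genome) as sets of gene-family identifiers.

    `genomes` maps a genome name to an iterable of the gene families it carries.
    """
    families = [set(members) for members in genomes.values()]
    if not families:
        raise ValueError("need at least one genome")
    pan = set().union(*families)          # families present in any genome
    core = set.intersection(*families)    # families present in every genome
    return pan, core


# Hypothetical toy input: three strains with made-up gene-family labels.
example = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam2", "fam5"},
}
pan, core = pan_and_core(example)
print(len(pan), len(core))
```

    In this toy input the pan-genome has five families and the core genome has two, mirroring on a small scale how the pan-genome grows as genomes are added while the core can only stay the same or shrink.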

  15. Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome

    Microsoft Academic Search

    Casey M Bergman; Barret D Pfeiffer; Diego E Rincón-Limas; Roger A Hoskins; Andreas Gnirke; Chris J Mungall; Adrienne M Wang; Brent Kronmiller; Joanne Pacleb; Soo Park; Mark Stapleton; Kenneth Wan; Reed A George; Pieter J de Jong; Juan Botas; Gerald M Rubin; Susan E Celniker

    2002-01-01

    Background: It is widely accepted that comparative sequence data can aid the functional annotation of genome sequences; however, the most informative species and features of genome evolution for comparison remain to be determined. Results: We analyzed conservation in eight genomic regions (apterous, even-skipped, fushi tarazu, twist, and Rhodopsins 1, 2, 3 and 4) from four Drosophila species (D. erecta, D.

  16. Ovarian Cancer Biomarker Discovery Based on Genomic Approaches

    PubMed Central

    Lee, Jung-Yun; Kim, Hee Seung; Suh, Dong Hoon; Kim, Mi-Kyung; Chung, Hyun Hoon; Song, Yong-Sang

    2013-01-01

    Ovarian cancer presents at an advanced stage in more than 75% of patients. Early detection has great promise to improve clinical outcomes. Although advancing proteomic technologies have led to the discovery of numerous ovarian cancer biomarkers, no screening method has been recommended for early detection of ovarian cancer. The complexity and heterogeneity of ovarian carcinogenesis are a major obstacle to discovering biomarkers. As cancer arises due to accumulation of genetic changes, understanding the close connection between genetic changes and ovarian carcinogenesis would provide the opportunity to find novel gene-level ovarian cancer biomarkers. In this review, we summarize the various gene-based biomarkers identified by genomic technologies, including inherited gene mutations, epigenetic changes, and differential gene expression. In addition, we suggest a strategy to discover novel gene-based biomarkers with recently introduced next generation sequencing. PMID:25337559

  17. De Novo Next Generation Sequencing of Plant Genomes

    Microsoft Academic Search

    Steve Rounsley; Pradeep Reddy Marri; Yeisoo Yu; Ruifeng He; Nick Sisneros; Jose Luis Goicoechea; So Jeong Lee; Angelina Angelova; Dave Kudrna; Meizhong Luo; Jason Affourtit; Brian Desany; James Knight; Faheem Niazi; Michael Egholm; Rod A. Wing

    2009-01-01

    The genome sequencing of all major food and bioenergy crops is of critical importance in the race to improve crop production to meet the future food and energy security needs of the world. Next generation sequencing technologies have brought about great improvements in sequencing throughput and cost, but do not yet allow for de novo sequencing of large repetitive genomes

  18. Identification of genomic alterations in oesophageal squamous cell cancer.

    PubMed

    Song, Yongmei; Li, Lin; Ou, Yunwei; Gao, Zhibo; Li, Enmin; Li, Xiangchun; Zhang, Weimin; Wang, Jiaqian; Xu, Liyan; Zhou, Yong; Ma, Xiaojuan; Liu, Lingyan; Zhao, Zitong; Huang, Xuanlin; Fan, Jing; Dong, Lijia; Chen, Gang; Ma, Liying; Yang, Jie; Chen, Longyun; He, Minghui; Li, Miao; Zhuang, Xuehan; Huang, Kai; Qiu, Kunlong; Yin, Guangliang; Guo, Guangwu; Feng, Qiang; Chen, Peishan; Wu, Zhiyong; Wu, Jianyi; Ma, Ling; Zhao, Jinyang; Luo, Longhai; Fu, Ming; Xu, Bainan; Chen, Bo; Li, Yingrui; Tong, Tong; Wang, Mingrong; Liu, Zhihua; Lin, Dongxin; Zhang, Xiuqing; Yang, Huanming; Wang, Jun; Zhan, Qimin

    2014-05-01

    Oesophageal cancer is one of the most aggressive cancers and is the sixth leading cause of cancer death worldwide. Approximately 70% of global oesophageal cancer cases occur in China, with oesophageal squamous cell carcinoma (ESCC) being the histopathological form in the vast majority of cases (>90%). Currently, there are limited clinical approaches for the early diagnosis and treatment of ESCC, resulting in a 10% five-year survival rate for patients. However, the full repertoire of genomic events leading to the pathogenesis of ESCC remains unclear. Here we describe a comprehensive genomic analysis of 158 ESCC cases, as part of the International Cancer Genome Consortium research project. We conducted whole-genome sequencing in 17 ESCC cases and whole-exome sequencing in 71 cases, of which 53 cases, plus an additional 70 ESCC cases not used in the whole-genome and whole-exome sequencing, were subjected to array comparative genomic hybridization analysis. We identified eight significantly mutated genes, of which six are well known tumour-associated genes (TP53, RB1, CDKN2A, PIK3CA, NOTCH1, NFE2L2), and two have not previously been described in ESCC (ADAM29 and FAM135B). Notably, FAM135B is identified as a novel cancer-implicated gene as assayed for its ability to promote malignancy of ESCC cells. Additionally, MIR548K, a microRNA encoded in the amplified 11q13.3-13.4 region, is characterized as a novel oncogene, and functional assays demonstrate that MIR548K enhances malignant phenotypes of ESCC cells. Moreover, we have found that several important histone regulator genes (MLL2 (also called KMT2D), ASH1L, MLL3 (KMT2C), SETD1B, CREBBP and EP300) are frequently altered in ESCC. Pathway assessment reveals that somatic aberrations are mainly involved in the Wnt, cell cycle and Notch pathways. Genomic analyses suggest that ESCC and head and neck squamous cell carcinoma share some common pathogenic mechanisms, and ESCC development is associated with alcohol drinking. 
This study has explored novel biological markers and tumorigenic pathways that would greatly improve therapeutic strategies for ESCC. PMID:24670651

  19. Integrative genomic and functional profiling of the pancreatic cancer genome

    PubMed Central

    2013-01-01

    Background: Pancreatic cancer is a deadly disease with a five-year survival of less than 5%. A better understanding of the underlying biology may suggest novel therapeutic targets. Recent surveys of the pancreatic cancer genome have uncovered numerous new alterations; yet systematic functional characterization of candidate cancer genes has lagged behind. To address this challenge, here we have devised a highly-parallel RNA interference-based functional screen to evaluate many genomically-nominated candidate pancreatic cancer genes simultaneously. Results: For 185 candidate pancreatic cancer genes, selected from recurrently altered genomic loci, we performed a pooled shRNA library screen of cell growth/viability across 10 different cell lines. Knockdown-associated effects on cell growth were assessed by enrichment or depletion of shRNA hairpins, by hybridization to barcode microarrays. A novel analytical approach (COrrelated Phenotypes for On-Target Effects; COPOTE) was used to discern probable on-target knockdown, based on identifying different shRNAs targeting the same gene and displaying concordant phenotypes across cell lines. Knockdown data were integrated with genomic architecture and gene-expression profiles, and selected findings validated using individual shRNAs and/or independent siRNAs. The pooled shRNA library design delivered reproducible data. In all, COPOTE analysis identified 52 probable on-target gene-knockdowns. Knockdown of known oncogenes (KRAS, MYC, SMURF1 and CCNE1) and a tumor suppressor (CDKN2A) showed the expected contrasting effects on cell growth. In addition, the screen corroborated purported roles of PLEKHG2 and MED29 as 19q13 amplicon drivers. Most notably, the analysis also revealed novel possible oncogenic functions of nucleoporin NUP153 (ostensibly by modulating TGFβ signaling) and Kruppel-like transcription factor KLF5 in pancreatic cancer. 
Conclusions: By integrating physical and functional genomic data, we were able to simultaneously evaluate many candidate pancreatic cancer genes. Our findings uncover new facets of pancreatic cancer biology, with possible therapeutic implications. More broadly, our study provides a general strategy for the efficient characterization of candidate genes emerging from cancer genome studies. PMID:24041470

  20. Volatiles from nineteen recently genome sequenced actinomycetes.

    PubMed

    Citron, Christian A; Barra, Lena; Wink, Joachim; Dickschat, Jeroen S

    2015-02-18

    The volatiles released by agar plate cultures of nineteen actinomycetes whose genomes were recently sequenced were collected by use of a closed-loop stripping apparatus (CLSA) and analysed by GC/MS. In total, 178 compounds from various classes were identified. The most interesting findings were the detection of the insect pheromone frontalin in Streptomyces varsoviensis, and the emission of the unusual plant metabolite 1-nitro-2-phenylethane. Its biosynthesis from phenylalanine was investigated in isotopic labelling experiments. Furthermore, the identified terpenes were correlated to the information about terpene cyclase homologs encoded in the investigated strains. The analytical data were in line with functionally characterised bacterial terpene cyclases and particularly corroborated the recently suggested function of a terpene cyclase from Streptomyces violaceusniger by the identification of a functional homolog in Streptomyces rapamycinicus. PMID:25585196

  1. Genome Project Standards in a New Era of Sequencing

    SciTech Connect

    GSC Consortia; HMP Jumpstart Consortia; Chain, P. S. G.; Grafham, D. V.; Fulton, R. S.; FitzGerald, M. G.; Hostetler, J.; Muzny, D.; Detter, J. C.; Ali, J.; Birren, B.; Bruce, D. C.; Buhay, C.; Cole, J. R.; Ding, Y.; Dugan, S.; Field, D.; Garrity, G. M.; Gibbs, R.; Graves, T.; Han, C. S.; Harrison, S. H.; Highlander, S.; Hugenholtz, P.; Khouri, H. M.; Kodira, C. D.; Kolker, E.; Kyrpides, N. C.; Lang, D.; Lapidus, A.; Malfatti, S. A.; Markowitz, V.; Metha, T.; Nelson, K. E.; Parkhill, J.; Pitluck, S.; Qin, X.; Read, T. D.; Schmutz, J.; Sozhamannan, S.; Strausberg, R.; Sutton, G.; Thomson, N. R.; Tiedje, J. M.; Weinstock, G.; Wollam, A.

    2009-06-01

    For over a decade, genome sequences have adhered to only two standards that are relied on for purposes of sequence analysis by interested third parties (1, 2). However, ongoing developments in revolutionary sequencing technologies have resulted in a redefinition of traditional whole genome sequencing that requires a careful reevaluation of such standards. With commercially available 454 pyrosequencing (followed by Illumina, SOLiD, and now Helicos), there has been an explosion of genomes sequenced under the moniker 'draft'; however, these can be very poor quality genomes (due to inherent errors in the sequencing technologies, and the inability of assembly programs to fully address these errors). Further, one can only infer that such draft genomes may be of poor quality by navigating through the databases to find the number and type of reads deposited in sequence trace repositories (and not all genomes have this available), or to identify the number of contigs or genome fragments deposited to the database. The difficulty in assessing the quality of such deposited genomes has created some havoc for genome analysis pipelines and contributed to many wasted hours of (mis)interpretation. These same novel sequencing technologies have also brought an exponential leap in raw sequencing capability, and at greatly reduced prices that have further skewed the time- and cost-ratios of draft data generation versus the painstaking process of improving and finishing a genome. The resulting effect is an ever-widening gap between drafted and finished genomes that only promises to continue (Figure 1), hence there is an urgent need to distinguish good and poor datasets. The sequencing institutes in the authorship, along with the NIH's Human Microbiome Project Jumpstart Consortium (3), strongly believe that a new set of standards is required for genome sequences. 
The following represents a set of six community-defined categories of genome sequence standards that better reflect the quality of the genome sequence, based on our collective understanding of the different technologies, available assemblers, and the varied efforts to improve upon drafted genomes. Due to the increasingly rapid pace of genomics we avoided the use of rigid numerical thresholds in our definitions to take into account the types of products achieved by any combination of technology, chemistry, assembler, or improvement/finishing process.

  2. Ten years of bacterial genome sequencing: comparative-genomics-based discoveries.

    PubMed

    Binnewies, Tim T; Motro, Yair; Hallin, Peter F; Lund, Ole; Dunn, David; La, Tom; Hampson, David J; Bellgard, Matthew; Wassenaar, Trudy M; Ussery, David W

    2006-07-01

    It has been more than 10 years since the first bacterial genome sequence was published. Hundreds of bacterial genome sequences are now available for comparative genomics, and searching a given protein against more than a thousand genomes will soon be possible. The subject of this review will address a relatively straightforward question: "What have we learned from this vast amount of new genomic data?" Perhaps one of the most important lessons has been that genetic diversity, at the level of large-scale variation amongst even genomes of the same species, is far greater than was thought. The classical textbook view of evolution relying on the relatively slow accumulation of mutational events at the level of individual bases scattered throughout the genome has changed. One of the most obvious conclusions from examining the sequences from several hundred bacterial genomes is the enormous amount of diversity--even in different genomes from the same bacterial species. This diversity is generated by a variety of mechanisms, including mobile genetic elements and bacteriophages. An examination of the 20 Escherichia coli genomes sequenced so far dramatically illustrates this, with the genome size ranging from 4.6 to 5.5 Mbp; much of the variation appears to be of phage origin. This review also addresses mobile genetic elements, including pathogenicity islands and the structure of transposable elements. There are at least 20 different methods available to compare bacterial genomes. Metagenomics offers the chance to study genomic sequences found in ecosystems, including genomes of species that are difficult to culture. It has become clear that a genome sequence represents more than just a collection of gene sequences for an organism and that information concerning the environment and growth conditions for the organism are important for interpretation of the genomic data. 
The newly proposed Minimal Information about a Genome Sequence standard has been developed to obtain this information. PMID:16773396

  3. Pattern discovery and cancer gene identification in integrated cancer genomic data

    PubMed Central

    Mo, Qianxing; Wang, Sijian; Seshan, Venkatraman E.; Olshen, Adam B.; Schultz, Nikolaus; Sander, Chris; Powers, R. Scott; Ladanyi, Marc; Shen, Ronglai

    2013-01-01

    Large-scale integrated cancer genome characterization efforts including the cancer genome atlas and the cancer cell line encyclopedia have created unprecedented opportunities to study cancer biology in the context of knowing the entire catalog of genetic alterations. A clinically important challenge is to discover cancer subtypes and their molecular drivers in a comprehensive genetic context. Curtis et al. [Nature (2012) 486(7403):346–352] has recently shown that integrative clustering of copy number and gene expression in 2,000 breast tumors reveals novel subgroups beyond the classic expression subtypes that show distinct clinical outcomes. To extend the scope of integrative analysis for the inclusion of somatic mutation data by massively parallel sequencing, we propose a framework for joint modeling of discrete and continuous variables that arise from integrated genomic, epigenomic, and transcriptomic profiling. The core idea is motivated by the hypothesis that diverse molecular phenotypes can be predicted by a set of orthogonal latent variables that represent distinct molecular drivers, and thus can reveal tumor subgroups of biological and clinical importance. Using the cancer cell line encyclopedia dataset, we demonstrate our method can accurately group cell lines by their cell-of-origin for several cancer types, and precisely pinpoint their known and potential cancer driver genes. Our integrative analysis also demonstrates the power for revealing subgroups that are not lineage-dependent, but consist of different cancer types driven by a common genetic alteration. Application of the cancer genome atlas colorectal cancer data reveals distinct integrated tumor subtypes, suggesting different genetic pathways in colon cancer progression. PMID:23431203

  4. Finishing The Euchromatic Sequence Of The Human Genome

    SciTech Connect

    Rubin, Edward M.; Lucas, Susan; Richardson, Paul; Rokhsar, Daniel; Pennacchio, Len

    2004-09-07

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ~99% of the euchromatic genome and is accurate to an error rate of ~1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.
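
    As a rough illustration of the accuracy figures quoted in this abstract, the sketch below converts the stated error rate into an expected number of erroneous bases and into a Phred-style quality score. The function names are hypothetical, and the calculation assumes errors are spread uniformly across the sequence, which the abstract does not claim.

```python
import math


def expected_errors(total_bases: int, error_rate: float) -> float:
    """Expected number of erroneous bases, assuming a uniform error rate."""
    return total_bases * error_rate


def phred_quality(error_rate: float) -> float:
    """Phred-scaled quality: Q = -10 * log10(error probability)."""
    return -10.0 * math.log10(error_rate)


# Figures taken from the abstract: 2.85 billion nucleotides,
# roughly 1 error event per 100,000 bases.
BUILD35_BASES = 2_850_000_000
ERROR_RATE = 1 / 100_000

if __name__ == "__main__":
    print(f"expected errors: {expected_errors(BUILD35_BASES, ERROR_RATE):,.0f}")
    print(f"Phred quality:   Q{phred_quality(ERROR_RATE):.0f}")
```

    Under that uniformity assumption, the stated rate corresponds to roughly 28,500 erroneous bases genome-wide, or a quality of about Q50.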

  5. Mapping the human reference genome's missing sequence by three-way admixture in Latino genomes.

    PubMed

    Genovese, Giulio; Handsaker, Robert E; Li, Heng; Kenny, Eimear E; McCarroll, Steven A

    2013-09-01

    A principal obstacle to completing maps and analyses of the human genome involves the genome's "inaccessible" regions: sequences (often euchromatic and containing genes) that are isolated from the rest of the euchromatic genome by heterochromatin and other repeat-rich sequence. We describe a way to localize these sequences by using ancestry linkage disequilibrium in populations that derive ancestry from at least three continents, as is the case for Latinos. We used this approach to map the genomic locations of almost 20 megabases of sequence unlocalized or missing from the current human genome reference (NCBI Genome GRCh37), a substantial fraction of the human genome's remaining unmapped sequence. We show that the genomic locations of most sequences that originated from fosmids and larger clones can be admixture mapped in this way, by using publicly available whole-genome sequence data. Genome assembly efforts and future builds of the human genome reference will be strongly informed by this localization of genes and other euchromatic sequences that are embedded within highly repetitive pericentromeric regions. PMID:23932108

  6. Research ethics and the challenge of whole-genome sequencing

    Microsoft Academic Search

    Amy L. McGuire; Mildred K. Cho; Timothy Caulfield

    2007-01-01

    The recent completion of the first two individual whole-genome sequences is a research milestone. As personal genome research advances, investigators and international research bodies must ensure ethical research conduct. We identify three major ethical considerations that have been implicated in whole-genome research: the return of research results to participants; the obligations, if any, that are owed to participants' relatives; and

  7. Genome Sequence of Mushroom Soft-Rot Pathogen Janthinobacterium agaricidamnosum.

    PubMed

    Graupner, Katharina; Lackner, Gerald; Hertweck, Christian

    2015-01-01

    Janthinobacterium agaricidamnosum causes soft-rot disease of the cultured button mushroom Agaricus bisporus and is thus responsible for agricultural losses. Here, we present the genome sequence of J. agaricidamnosum DSM 9628. The 5.9-Mb genome harbors several secondary metabolite biosynthesis gene clusters, which renders this neglected bacterium a promising source for genome mining approaches. PMID:25883287

  8. Genome Sequence of Aedes aegypti, a Major Arbovirus Vector

    Microsoft Academic Search

    Vishvanath Nene; Jennifer R. Wortman; Daniel Lawson; Brian Haas; Chinnappa Kodira; Z. Tu; Brendan Loftus; Zhiyong Xi; Karyn Megy; Manfred Grabherr; Quinghu Ren; E. M. Zdobnov; N. F. Lobo; K. S. Campbell; S. E. Brown; M. F. Bonaldo; Jingsong Zhu; S. P. Sinkins; D. G. Hogenkamp; Paolo Amedeo; Peter Arensburger; P. W. Atkinson; Shelby Bidwell; Jim Biedler; Ewan Birney; Robert V. Bruggner; Javier Costas; M. R. Coy; Jonathan Crabtree; Matt Crawford; Becky deBruyn; David DeCaprio; Karin Eiglmeier; Eric Eisenstadt; Hamza El-Dorry; W. M. Gelbart; S. L. Gomes; Martin Hammond; Linda I. Hannick; M. H. Holmes; J. R. Hogan; David Jaffe; J. S. Johnston; R. C. Kennedy; Hean Koo; Saul Kravitz; Evgenia V. Kriventseva; David Kulp; Kurt LaButti; Eduardo Lee; Song Li; Diane D. Lovin; Chunhong Mao; Evan Mauceli; C. F. M. Menck; J. R. Miller; Philip Montgomery; Akio Mori; A. L. Nascimento; H. F. Naveira; Chad Nusbaum; S. O'Leary; Joshua Orvis; Mihaela Pertea; Hadi Quesneville; K. R. Reidenbach; Yu-Hui Rogers; C. W. Roth; J. R. Schneider; Michael Schatz; Martin Shumway; Mario Stanke; E. O. Stinson; J. M. C. Tubio; J. P. VanZee; Sergio Verjovski-Almeida; Doreen Werner; Owen White; Stefan Wyder; Qiandong Zeng; Qi Zhao; Yongmei Zhao; C. A. Hill; A. S. Raikhel; M. B. Soares; D. L. Knudson; N. H. Lee; James Galagan; S. L. Salzberg; I. T. Paulsen; George Dimopoulos; F. H. Collins; Bruce Birren; C. M. Fraser-Liggett; D. W. Severson

    2007-01-01

    We present a draft sequence of the genome of Aedes aegypti, the primary vector for yellow fever and dengue fever, which at ~1376 million base pairs is about 5 times the size of the genome of the malaria vector Anopheles gambiae. Nearly 50% of the Ae. aegypti genome consists of transposable elements. These contribute to a factor of ~4 to

  9. Complete Genome Sequence of the Mesoplasma florum W37 Strain

    PubMed Central

    Baby, Vincent; Matteau, Dominick; Knight, Thomas F.

    2013-01-01

    Mesoplasma florum is a small-genome fast-growing mollicute that is an attractive model for systems and synthetic genomics studies. We report the complete 825,824-bp genome sequence of a second representative of this species, M. florum strain W37, which contains 733 predicted open reading frames and 35 stable RNAs. PMID:24285658

  10. Genome Sequence of Brevibacillus laterosporus Strain GI-9

    PubMed Central

    Sharma, Vikas; Singh, Pradip K.; Midha, Samriti; Ranjan, Manish

    2012-01-01

    We report the 5.18-Mb genome sequence of Brevibacillus laterosporus strain GI-9, isolated from a subsurface soil sample during a screen for novel strains producing antimicrobial compounds. The draft genome of this strain will aid in biotechnological exploitation and comparative genomics of Brevibacillus laterosporus strains. PMID:22328768

  11. Genome Sequence of Xanthomonas axonopodis pv. punicae Strain LMG 859

    PubMed Central

    Sharma, Vikas; Midha, Samriti; Ranjan, Manish; Pinnaka, Anil Kumar

    2012-01-01

    We report the 4.94-Mb genome sequence of Xanthomonas axonopodis pv. punicae strain LMG 859, the causal agent of bacterial leaf blight disease in pomegranate. The draft genome will aid in comparative genomics, epidemiological studies, and quarantine of this devastating phytopathogen. PMID:22493202

  12. Genome Sequence of the Rice Pathogen Pseudomonas fuscovaginae CB98818

    PubMed Central

    Xie, Guanlin; Cui, Zhouqi; Tao, Zhongyun; Qiu, Hui; Liu, He; Zhu, Bo; Jin, Gulei; Sun, Guochang; Almoneafy, Abdulwareth

    2012-01-01

    Pseudomonas fuscovaginae is a phytopathogenic bacterium causing bacterial sheath brown rot of cereal crops. Here, we present the draft genome sequence of P. fuscovaginae CB98818, originally isolated from a diseased rice plant in China. The draft genome will aid in epidemiological studies, comparative genomics, and quarantine of this broad-host-range pathogen. PMID:22965098

  13. Genome sequence of the rice pathogen Pseudomonas fuscovaginae CB98818.

    PubMed

    Xie, Guanlin; Cui, Zhouqi; Tao, Zhongyun; Qiu, Hui; Liu, He; Ibrahim, Muhammad; Zhu, Bo; Jin, Gulei; Sun, Guochang; Almoneafy, Abdulwareth; Li, Bin

    2012-10-01

    Pseudomonas fuscovaginae is a phytopathogenic bacterium causing bacterial sheath brown rot of cereal crops. Here, we present the draft genome sequence of P. fuscovaginae CB98818, originally isolated from a diseased rice plant in China. The draft genome will aid in epidemiological studies, comparative genomics, and quarantine of this broad-host-range pathogen. PMID:22965098

  14. Identification of Candidate Drosophila Olfactory Receptors from Genomic DNA Sequence

    Microsoft Academic Search

    Qian Gao; Andrew Chess

    1999-01-01

    We have taken advantage of the availability of a large amount of Drosophila genomic DNA sequence in the Berkeley Drosophila Genome Project database (~1/5 of the genome) to identify a family of novel seven transmembrane domain encoding genes that are putative Drosophila olfactory receptors. Members of the family are expressed in distinct subsets of olfactory neurons, and certain family members

  15. Finished Genome Sequence of Collimonas arenae Cal35.

    PubMed

    Wu, Je-Jia; de Jager, Victor C L; Deng, Wen-Ling; Leveau, Johan H J

    2015-01-01

    We announce the finished genome sequence of soil forest isolate Collimonas arenae Cal35, which comprises a 5.6-Mbp chromosome and 41-kb plasmid. The Cal35 genome is the second one published for the bacterial genus Collimonas and represents the first opportunity for high-resolution comparison of genome content and synteny among collimonads. PMID:25573943

  16. SEQUENCING THE PIG GENOME USING A BAC BY BAC APPROACH

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We have generated a highly contiguous physical map covering >98% of the pig genome in just 176 contigs. The map is localized to the genome through integration with the UIVC RH map as well as BAC end sequence alignments to the human genome. Over 265k HindIII restriction digest fingerprints totaling 16.2...

  17. On the sequencing of the human genome Robert H. Waterston*

    E-print Network

    Batzoglou, Serafim

    . The international Human Genome Project (HGP) used the hierarchical shotgun approach, whereas Celera Genomics. One was the product of the international Human Genome Project (HGP), and the other was the product On the sequencing of the human genome Robert H. Waterston*, Eric S. Lander, and John E. Sulston

  18. Finished Genome Sequence of Collimonas arenae Cal35

    PubMed Central

    Wu, Je-Jia; de Jager, Victor C. L.; Deng, Wen-Ling

    2015-01-01

    We announce the finished genome sequence of soil forest isolate Collimonas arenae Cal35, which comprises a 5.6-Mbp chromosome and 41-kb plasmid. The Cal35 genome is the second one published for the bacterial genus Collimonas and represents the first opportunity for high-resolution comparison of genome content and synteny among collimonads. PMID:25573943

  19. Whole-genome sequencing and variant discovery in C. elegans

    Microsoft Academic Search

    LaDeana W Hillier; Gabor T Marth; Aaron R Quinlan; David Dooling; Ginger Fewell; Derek Barnett; Paul Fox; Jarret I Glasscock; Matthew Hickenbotham; Weichun Huang; Vincent J Magrini; Ryan J Richt; Sacha N Sander; Donald A Stewart; Michael Stromberg; Eric F Tsung; Todd Wylie; Tim Schedl; Richard K Wilson; Elaine R Mardis

    2008-01-01

    Massively parallel sequencing instruments enable rapid and inexpensive DNA sequence data production. Because these instruments are new, their data require characterization with respect to accuracy and utility. To address this, we sequenced a Caenorhabditis elegans N2 Bristol strain isolate using the Solexa Sequence Analyzer, and compared the reads to the reference genome to characterize the data and to evaluate coverage

  20. Accurate whole human genome sequencing using reversible terminator chemistry

    Microsoft Academic Search

    David R. Bentley; Shankar Balasubramanian; Harold P. Swerdlow; Geoffrey P. Smith; John Milton; Clive G. Brown; Kevin P. Hall; Dirk J. Evers; Colin L. Barnes; Helen R. Bignell; Jonathan M. Boutell; Jason Bryant; Richard J. Carter; R. Keira Cheetham; Anthony J. Cox; Darren J. Ellis; Michael R. Flatbush; Niall A. Gormley; Sean J. Humphray; Leslie J. Irving; Mirian S. Karbelashvili; Scott M. Kirk; Heng Li; Xiaohai Liu; Klaus S. Maisinger; Lisa J. Murray; Bojan Obradovic; Tobias Ost; Michael L. Parkinson; Mark R. Pratt; Isabelle M. J. Rasolonjatovo; Mark T. Reed; Roberto Rigatti; Chiara Rodighiero; Mark T. Ross; Andrea Sabot; Subramanian V. Sankar; Aylwyn Scally; Gary P. Schroth; Mark E. Smith; Vincent P. Smith; Anastassia Spiridou; Peta E. Torrance; Svilen S. Tzonev; Eric H. Vermaas; Klaudia Walter; Xiaolin Wu; Lu Zhang; Mohammed D. Alam; Carole Anastasi; Ify C. Aniebo; David M. D. Bailey; Iain R. Bancarz; Saibal Banerjee; Selena G. Barbour; Primo A. Baybayan; Vincent A. Benoit; Kevin F. Benson; Claire Bevis; Phillip J. Black; Asha Boodhun; Joe S. Brennan; John A. Bridgham; Rob C. Brown; Andrew A. Brown; Dale H. Buermann; Abass A. Bundu; James C. Burrows; Nigel P. Carter; Nestor Castillo; Maria Chiara E. Catenazzi; Simon Chang; R. Neil Cooley; Natasha R. Crake; Olubunmi O. Dada; Konstantinos D. Diakoumakos; Belen Dominguez-Fernandez; David J. Earnshaw; Ugonna C. Egbujor; David W. Elmore; Sergey S. Etchin; Mark R. Ewan; Milan Fedurco; Louise J. Fraser; Karin V. Fuentes Fajardo; W. Scott Furey; David George; Kimberley J. Gietzen; Colin P. Goddard; George S. Golda; Philip A. Granieri; David L. Gustafson; Nancy F. Hansen; Kevin Harnish; Christian D. Haudenschild; Narinder I. Heyer; Matthew M. Hims; Johnny T. Ho; Adrian M. Horgan; Katya Hoschler; Steve Hurwitz; Denis V. Ivanov; Maria Q. Johnson; Terena James; T. A. Huw Jones; Gyoung-Dong Kang; Tzvetana H. Kerelska; Alan D. Kersey; Irina Khrebtukova; Alex P. Kindwall; Zoya Kingsbury; Paula I. 
Kokko-Gonzales; Anil Kumar; Marc A. Laurent; Cynthia T. Lawley; Sarah E. Lee; Xavier Lee; Arnold K. Liao; Jennifer A. Loch; Mitch Lok; Shujun Luo; Radhika M. Mammen; John W. Martin; Patrick G. McCauley; Paul McNitt; Parul Mehta; Keith W. Moon; Joe W. Mullens; Taksina Newington; Zemin Ning; Bee Ling Ng; Sonia M. Novo; Mark A. Osborne; Andrew Osnowski; Omead Ostadan; Lambros L. Paraschos; Lea Pickering; Andrew C. Pike; D. Chris Pinkard; Daniel P. Pliskin; Joe Podhasky; Victor J. Quijano; Come Raczy; Vicki H. Rae; Stephen R. Rawlings; Ana Chiva Rodriguez; Phyllida M. Roe; John Rogers; Maria C. Rogert Bacigalupo; Nikolai Romanov; Anthony Romieu; Rithy K. Roth; Natalie J. Rourke; Silke T. Ruediger; Eli Rusman; Raquel M. Sanches-Kuiper; Martin R. Schenker; Josefina M. Seoane; Richard J. Shaw; Mitch K. Shiver; Steven W. Short; Ning L. Sizto; Johannes P. Sluis; Melanie A. Smith; Jean Ernest Sohna Sohna; Eric J. Spence; Kim Stevens; Neil Sutton; Lukasz Szajkowski; Carolyn L. Tregidgo; Gerardo Turcatti; Stephanie vandeVondele; Yuli Verhovsky; Selene M. Virk; Suzanne Wakelin; Gregory C. Walcott; Jingwen Wang; Graham J. Worsley; Juying Yan; Ling Yau; Mike Zuerlein; Jane Rogers; James C. Mullikin; Matthew E. Hurles; Nick J. McCooke; John S. West; Frank L. Oaks; Peter L. Lundberg; David Klenerman; Richard Durbin; Anthony J. Smith

    2008-01-01

    DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally used long (400-800 base pair) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intraspecies genetic variation.
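
    The re-sequencing idea described here, comparing short reads against a known reference to find differences, can be sketched with a toy example. This is not the authors' pipeline: it is a brute-force, ungapped placement by Hamming distance on a made-up reference, and every name and sequence in it is hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def best_placement(reference: str, read: str) -> tuple[int, int]:
    """Slide the read along the reference; return (position, mismatches)
    for the first placement with the fewest mismatches (no gaps allowed)."""
    best_pos, best_dist = 0, len(read) + 1
    for pos in range(len(reference) - len(read) + 1):
        dist = hamming(reference[pos:pos + len(read)], read)
        if dist < best_dist:
            best_pos, best_dist = pos, dist
    return best_pos, best_dist


def candidate_variants(reference, reads, max_mismatches=1):
    """Collect (position, reference_base, read_base) for mismatches in
    reads that place with at most max_mismatches differences."""
    variants = set()
    for read in reads:
        pos, dist = best_placement(reference, read)
        if dist > max_mismatches:
            continue  # read does not place confidently; ignore it
        for offset, base in enumerate(read):
            ref_base = reference[pos + offset]
            if ref_base != base:
                variants.add((pos + offset, ref_base, base))
    return sorted(variants)


if __name__ == "__main__":
    toy_reference = "ACGTTGCAATCGGATCCTAG"   # invented sequence
    toy_reads = ["TGCAATCG", "ATCGAATC"]     # second read differs at one base
    print(candidate_variants(toy_reference, toy_reads))
```

    Real short-read aligners use indexed search and base-quality-aware variant calling rather than this exhaustive scan; the sketch only shows the compare-to-reference principle.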

  1. Genome Wide Characterization of Simple Sequence Repeats in Cucumber

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The whole genome sequence of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with the Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs ...

  2. PASQUAL: Parallel Techniques for Next Generation Genome Sequence Assembly

    E-print Network

    Bader, David A.

    PASQUAL: Parallel Techniques for Next Generation Genome Sequence Assembly Xing Liu, Student Member subsequences from the reads. With Next Generation Sequencing (NGS) technologies, assembly software needs sequence. We focus here on de novo assembly, where no reference sequence aids the reconstruction. Next

  3. Methods, challenges, and promise of next-generation sequencing in cancer biology.

    PubMed

    Haimovich, Adrian D

    2011-12-01

    It is generally accepted that cancers result from the aggregation of somatic mutations. The emergence of next-generation sequencing (NGS) technologies during the past half-decade has enabled studies of cancer genomes with high sensitivity and resolution through whole-genome and whole-exome sequencing approaches, among others. This saltatory advance introduces the possibility of assembling multiple cancer genomes for analysis in a cost-effective manner. Analytical approaches are now applied to the detection of a number of somatic genome alterations, including nucleotide substitutions, insertions/deletions, copy number variations, and chromosomal rearrangements. This review provides a thorough introduction to the cancer genomics pipeline as well as a case study of these methods put into practice. PMID:22180681

  4. On the current status of Phakopsora pachyrhizi genome sequencing

    PubMed Central

    Loehrer, Marco; Vogel, Alexander; Huettel, Bruno; Reinhardt, Richard; Benes, Vladimir; Duplessis, Sébastien; Usadel, Björn; Schaffrath, Ulrich

    2014-01-01

    Recent advances in the field of sequencing technologies and bioinformatics allow a more rapid access to genomes of non-model organisms at sinking costs. Accordingly, draft genomes of several economically important cereal rust fungi have been released in the last 3 years. Aside from the very recent flax rust and poplar rust draft assemblies there are no genomic data available for other dicot-infecting rust fungi. In this article we outline rust fungus sequencing efforts and comment on the current status of Phakopsora pachyrhizi (Asian soybean rust) genome sequencing. PMID:25221558

  5. Draft Genome Sequence of Bordetella trematum Strain HR18

    PubMed Central

    Chang, Dong-Ho; Jin, Tae-Eun; Rhee, Moon-Soo; Jeong, Haeyoung; Kim, Seil

    2015-01-01

    The genus Bordetella is reportedly a human or animal pathogen and environmental microbe. We report the draft genome sequence of Bordetella trematum strain HR18, which was isolated from the rumen of Korean native cattle (Hanwoo; Bos taurus coreanae). It is the first genome sequence of a Bordetella sp. isolated from the rumen of cattle. PMID:25573930

  6. Draft Genome Sequence of Kocuria rhizophila P7-4

    PubMed Central

    Kim, Woo-Jin; Kim, Young-Ok; Kim, Dae-Soo; Choi, Sang-Haeng; Kim, Dong-Wook; Lee, Jun-Seo; Kong, Hee Jeong; Nam, Bo-Hye; Kim, Bong-Seok; Lee, Sang-Jun; Park, Hong-Seog; Chae, Sung-Hwa

    2011-01-01

    We report the draft genome sequence of Kocuria rhizophila P7-4, which was isolated from the intestine of Siganus doliatus caught in the Pacific Ocean. The 2.83-Mb genome sequence consists of 75 large contigs (>100 bp in size) and contains 2,462 predicted protein-coding genes. PMID:21685281

  7. Complete Genome Sequence of Magnetospirillum gryphiswaldense MSR-1

    PubMed Central

    Wang, Xu; Wang, Qing; Zhang, Weijia; Wang, Yinjia; Li, Li; Wen, Tong; Zhang, Tongwei; Zhang, Yang; Xu, Jun; Hu, Junying; Li, Shuqi; Liu, Lingzi; Liu, Jinxin; Jiang, Wei; Tian, Jiesheng; Wang, Lei; Li, Jilun

    2014-01-01

    We report the complete genomic sequence of Magnetospirillum gryphiswaldense MSR-1 (DSM 6361), a type strain of the genus Magnetospirillum belonging to the Alphaproteobacteria. Compared to the reported draft sequence, extensive rearrangements and differences were found, indicating high genomic flexibility and “domestication” by accelerated evolution of the strain upon repeated passaging. PMID:24625872

  8. Draft Genome Sequence of Neurospora crassa Strain FGSC 73

    PubMed Central

    Schackwitz, Wendy; Lipzen, Anna; Martin, Joel; Haridas, Sajeet; LaButti, Kurt; Grigoriev, Igor V.; Simmons, Blake A.

    2015-01-01

    We report the elucidation of the complete genome of the Neurospora crassa (Shear and Dodge) strain FGSC 73, a mat-a, trp-3 mutant strain. The genome sequence around the idiotypic mating type locus represents the only publicly available sequence for a mat-a strain. 40.42 Megabases are assembled into 358 scaffolds carrying 11,978 gene models. PMID:25838471

  9. Almost finished: the complete genome sequence of Mycosphaerella graminicola

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Mycosphaerella graminicola causes septoria tritici blotch of wheat. An 8.9x shotgun sequence of bread wheat strain IPO323 was generated through the Community Sequencing Program of the U.S. Department of Energy’s Joint Genome Institute (JGI), and was finished at the Stanford Human Genome Center. The ...

  10. Draft Genome Sequence of the Fish Pathogen Piscirickettsia salmonis

    PubMed Central

    Eppinger, Mark; McNair, Katelyn; Zogaj, Xhavit; Dinsdale, Elizabeth A.; Edwards, Robert A.

    2013-01-01

    Piscirickettsia salmonis is a Gram-negative intracellular fish pathogen that has a significant impact on the salmon industry. Here, we report the genome sequence of P. salmonis strain LF-89. This is the first draft genome sequence of P. salmonis, and it reveals interesting attributes, including flagellar genes, despite this bacterium being considered nonmotile. PMID:24201203

  11. Draft Genome Sequence of Raoultella planticola, Isolated from River Water

    PubMed Central

    Kahler, Amy; Strockbine, Nancy; Gladney, Lori; Hill, Vincent R.

    2014-01-01

    We isolated Raoultella planticola from a river water sample, which was phenotypically indistinguishable from Escherichia coli on MI agar. The genome sequence of R. planticola was determined to gain information about its metabolic functions contributing to its false positive appearance of E. coli on MI agar. We report the first whole genome sequence of Raoultella planticola. PMID:25323725

  12. Draft Genome Sequence of Xanthomonas sacchari Strain LMG 476

    PubMed Central

    Pieretti, Isabelle; Bolot, Stéphanie; Carrère, Sébastien; Barbe, Valérie; Cociancich, Stéphane; Rott, Philippe

    2015-01-01

    We report the high-quality draft genome sequence of Xanthomonas sacchari strain LMG 476, isolated from sugarcane. The genome comparison of this strain with a previously sequenced X. sacchari strain isolated from a distinct environmental source should provide further insights into the adaptation of this species to different habitats and its evolution. PMID:25792064

  13. Complete Genome Sequences of Five Paenibacillus larvae Bacteriophages

    PubMed Central

    Sheflo, Michael A.; Gardner, Adam V.; Merrill, Bryan D.; Fisher, Joshua N. B.; Lunt, Bryce L.; Breakwell, Donald P.; Grose, Julianne H.

    2013-01-01

    Paenibacillus larvae is a pathogen of honeybees that causes American foulbrood (AFB). We isolated bacteriophages from soil containing bee debris collected near beehives in Utah. We announce five high-quality complete genome sequences, which represent the first completed genome sequences submitted to GenBank for any P. larvae bacteriophage. PMID:24233582

  14. Complete Genome Sequences of Five Paenibacillus larvae Bacteriophages.

    PubMed

    Sheflo, Michael A; Gardner, Adam V; Merrill, Bryan D; Fisher, Joshua N B; Lunt, Bryce L; Breakwell, Donald P; Grose, Julianne H; Burnett, Sandra H

    2013-01-01

    Paenibacillus larvae is a pathogen of honeybees that causes American foulbrood (AFB). We isolated bacteriophages from soil containing bee debris collected near beehives in Utah. We announce five high-quality complete genome sequences, which represent the first completed genome sequences submitted to GenBank for any P. larvae bacteriophage. PMID:24233582

  15. Complete genome sequence of chinese strain of ‘Candidatus Liberibacter asiaticus’

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complete genome sequence of ‘Candidatus Liberibacter asiaticus’ strain (Las) Guangxi-1(GX-1) was obtained by an Illumina HiSeq 2000. The GX-1 genome comprises 1,268,237 nucleotides, 36.5 % GC content, 1,141 predicted coding sequences, 44 tRNAs, 3 complete copies of ribosomal RNA genes (16S, 23S ...
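
    GC content, one of the summary statistics quoted in this record, is the percentage of bases in a sequence that are G or C. A minimal sketch of that calculation follows; the function name and the choice to skip ambiguous bases such as N are assumptions, not something stated in the record.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(base in "GC" for base in counted)
    return 100.0 * gc / len(counted)


if __name__ == "__main__":
    # Invented toy sequence, not taken from the GX-1 genome.
    print(f"{gc_content('ATGCGCATNN'):.1f} % GC")
```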

  16. Draft Genome Sequence of the Wolbachia Endosymbiont of Drosophila suzukii

    PubMed Central

    Cestaro, Alessandro; Kaur, Rupinder; Pertot, Ilaria; Rota-Stabelli, Omar; Anfora, Gianfranco

    2013-01-01

    Wolbachia is one of the most successful and abundant symbiotic bacteria in nature, infecting more than 40% of the terrestrial arthropod species. Here we report the draft genome sequence of a novel Wolbachia strain named “wSuzi” that was retrieved from the genome sequencing of its host, the invasive pest Drosophila suzukii. PMID:23472225

  17. Draft Genome Sequence of the Wolbachia Endosymbiont of Drosophila suzukii.

    PubMed

    Siozios, Stefanos; Cestaro, Alessandro; Kaur, Rupinder; Pertot, Ilaria; Rota-Stabelli, Omar; Anfora, Gianfranco

    2013-01-01

    Wolbachia is one of the most successful and abundant symbiotic bacteria in nature, infecting more than 40% of the terrestrial arthropod species. Here we report the draft genome sequence of a novel Wolbachia strain named "wSuzi" that was retrieved from the genome sequencing of its host, the invasive pest Drosophila suzukii. PMID:23472225

  18. Draft genome sequence of Kocuria rhizophila P7-4.

    PubMed

    Kim, Woo-Jin; Kim, Young-Ok; Kim, Dae-Soo; Choi, Sang-Haeng; Kim, Dong-Wook; Lee, Jun-Seo; Kong, Hee Jeong; Nam, Bo-Hye; Kim, Bong-Seok; Lee, Sang-Jun; Park, Hong-Seog; Chae, Sung-Hwa

    2011-08-01

    We report the draft genome sequence of Kocuria rhizophila P7-4, which was isolated from the intestine of Siganus doliatus caught in the Pacific Ocean. The 2.83-Mb genome sequence consists of 75 large contigs (>100 bp in size) and contains 2,462 predicted protein-coding genes. PMID:21685281

  19. Pash: Efficient Genome-Scale Sequence Anchoring by Positional Hashing

    E-print Network

    Batzoglou, Serafim

    Pash: Efficient Genome-Scale Sequence Anchoring by Positional Hashing Ken J. Kalafus,1,2 Andrew R and Molecular Biophysics, 2 Bioinformatics Research Laboratory, 3 Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, 77030, USA Pash is a computer

  20. The Prospects for Sequencing the Western Corn Rootworm Genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Historically, obtaining the complete sequence of eukaryotic genomes has been an expensive and complex task. For this reason, efforts to sequence insect genomes have largely been confined to model organisms, species that are important to human health, and representative species from a few insect orde...

  1. Initial sequencing and analysis of the human genome

    Microsoft Academic Search

    Eric S. Lander; Lauren M. Linton; Bruce Birren; Chad Nusbaum; Michael C. Zody; Jennifer Baldwin; Keri Devon; Ken Dewar; Michael Doyle; William FitzHugh; Roel Funke; Diane Gage; Katrina Harris; Andrew Heaford; John Howland; Lisa Kann; Jessica Lehoczky; Rosie LeVine; Paul McEwan; Kevin McKernan; James Meldrim; Jill P. Mesirov; Cher Miranda; William Morris; Jerome Naylor; Christina Raymond; Mark Rosetti; Ralph Santos; Andrew Sheridan; Carrie Sougnez; Nicole Stange-Thomann; Nikola Stojanovic; Aravind Subramanian; Dudley Wyman; Jane Rogers; John Sulston; Rachael Ainscough; Stephan Beck; David Bentley; John Burton; Christopher Clee; Nigel Carter; Alan Coulson; Rebecca Deadman; Panos Deloukas; Andrew Dunham; Ian Dunham; Richard Durbin; Lisa French; Darren Grafham; Simon Gregory; Tim Hubbard; Sean Humphray; Adrienne Hunt; Matthew Jones; Christine Lloyd; Amanda McMurray; Lucy Matthews; Simon Mercer; Sarah Milne; James C. Mullikin; Andrew Mungall; Robert Plumb; Mark Ross; Ratna Shownkeen; Sarah Sims; Robert H. Waterston; Richard K. Wilson; LaDeana W. Hillier; John D. McPherson; Marco A. Marra; Elaine R. Mardis; Lucinda A. Fulton; Asif T. Chinwalla; Kymberlie H. Pepin; Warren R. Gish; Stephanie L. Chissoe; Michael C. Wendl; Kim D. Delehaunty; Tracie L. Miner; Andrew Delehaunty; Jason B. Kramer; Lisa L. Cook; Robert S. Fulton; Douglas L. Johnson; Patrick J. Minx; Sandra W. Clifton; Trevor Hawkins; Elbert Branscomb; Paul Predki; Paul Richardson; Sarah Wenning; Tom Slezak; Norman Doggett; Jan-Fang Cheng; Anne Olsen; Susan Lucas; Christopher Elkin; Edward Uberbacher; Marvin Frazier; Richard A. Gibbs; Donna M. Muzny; Steven E. Scherer; John B. Bouck; Erica J. Sodergren; Kim C. Worley; Catherine M. Rives; James H. Gorrell; Michael L. Metzker; Susan L. Naylor; Raju S. Kucherlapati; David L. Nelson; George M. 
Weinstock; Yoshiyuki Sakaki; Asao Fujiyama; Masahira Hattori; Tetsushi Yada; Atsushi Toyoda; Takehiko Itoh; Chiharu Kawagoe; Hidemi Watanabe; Yasushi Totoki; Todd Taylor; Jean Weissenbach; Roland Heilig; William Saurin; Francois Artiguenave; Philippe Brottier; Thomas Bruls; Eric Pelletier; Catherine Robert; Patrick Wincker; Douglas R. Smith; Lynn Doucette-Stamm; Marc Rubenfield; Keith Weinstock; Hong Mei Lee; JoAnn Dubois; André Rosenthal; Matthias Platzer; Gerald Nyakatura; Stefan Taudien; Andreas Rump; Huanming Yang; Jun Yu; Jian Wang; Guyang Huang; Jun Gu; Leroy Hood; Lee Rowen; Anup Madan; Shizen Qin; Ronald W. Davis; Nancy A. Federspiel; A. Pia Abola; Michael J. Proctor; Richard M. Myers; Jeremy Schmutz; Mark Dickson; Jane Grimwood; David R. Cox; Maynard V. Olson; Rajinder Kaul; Christopher Raymond; Nobuyoshi Shimizu; Kazuhiko Kawasaki; Shinsei Minoshima; Glen A. Evans; Maria Athanasiou; Roger Schultz; Bruce A. Roe; Feng Chen; Huaqin Pan; Juliane Ramser; Hans Lehrach; Richard Reinhardt; W. Richard McCombie; Melissa de la Bastide; Neilay Dedhia; Helmut Blöcker; Klaus Hornischer; Gabriele Nordsiek; Richa Agarwala; L. Aravind; Jeffrey A. Bailey; Serafim Batzoglou; Ewan Birney; Peer Bork; Daniel G. Brown; Christopher B. Burge; Lorenzo Cerutti; Hsiu-Chuan Chen; Deanna Church; Michele Clamp; Richard R. Copley; Tobias Doerks; Sean R. Eddy; Evan E. Eichler; Terrence S. Furey; James Galagan; James G. R. Gilbert; Cyrus Harmon; Yoshihide Hayashizaki; David Haussler; Henning Hermjakob; Karsten Hokamp; Wonhee Jang; L. Steven Johnson; Thomas A. Jones; Simon Kasif; Arek Kaspryzk; Scot Kennedy; W. James Kent; Paul Kitts; Eugene V. Koonin; Ian Korf; David Kulp; Doron Lancet; Todd M. Lowe; Aoife McLysaght; Tarjei Mikkelsen; John V. Moran; Nicola Mulder; Victor J. Pollara; Chris P. Ponting; Greg Schuler; Jörg Schultz; Guy Slater; Arian F. A. 
Smit; Elia Stupka; Joseph Szustakowki; Danielle Thierry-Mieg; Jean Thierry-Mieg; Lukas Wagner; John Wallis; Raymond Wheeler; Alan Williams; Yuri I. Wolf; Kenneth H. Wolfe; Shiaw-Pyng Yang; Ru-Fang Yeh; Francis Collins; Mark S. Guyer; Jane Peterson; Adam Felsenfeld; Kris A. Wetterstrand; Aristides Patrinos; Michael J. Morgan

    2001-01-01

    The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

  2. Draft Genome Sequence of Tolypothrix boutellei Strain VB521301

    PubMed Central

    Chandrababunaidu, Mathu Malar; Singh, Deeksha; Sen, Diya; Bhan, Sushma; Das, Subhadeep; Gupta, Akash

    2015-01-01

    We report here the draft genome sequence of the filamentous nitrogen-fixing cyanobacterium Tolypothrix boutellei strain VB521301. The organism is lipid rich and hydrophobic and produces polyunsaturated fatty acids which can be harnessed for industrial purpose. The draft genome sequence assembled into 11,572,263 bp with 70 scaffolds and 7,777 protein coding genes. PMID:25700407

  3. Genome sequence of Pasteurella multocida subsp. gallicida Anand1_poultry.

    PubMed

    Ahir, V B; Roy, A; Jhala, M K; Bhanderi, B B; Mathakiya, R A; Bhatt, V D; Padiya, K B; Jakhesara, S J; Koringa, P G; Joshi, C G

    2011-10-01

    We report the finished and annotated genome sequence of Pasteurella multocida gallicida strain Anand1_poultry, which was isolated from the liver of a diseased adult female chicken. The strain causes a disease called "fowl cholera," which is a contagious disease in birds. We compared it with the published genome sequence of Pasteurella multocida Pm70. PMID:21914901

  4. Genome Sequence of Pasteurella multocida subsp. gallicida Anand1_poultry

    PubMed Central

    Ahir, V. B.; Roy, A.; Jhala, M. K.; Bhanderi, B. B.; Mathakiya, R. A.; Bhatt, V. D.; Padiya, K. B.; Jakhesara, S. J.; Koringa, P. G.; Joshi, C. G.

    2011-01-01

    We report the finished and annotated genome sequence of Pasteurella multocida gallicida strain Anand1_poultry, which was isolated from the liver of a diseased adult female chicken. The strain causes a disease called “fowl cholera,” which is a contagious disease in birds. We compared it with the published genome sequence of Pasteurella multocida Pm70. PMID:21914901

  5. Unexpected cross-species contamination in genome sequencing projects

    PubMed Central

    Merchant, Samier; Wood, Derrick E.

    2014-01-01

    The raw data from a genome sequencing project sometimes contains DNA from contaminating organisms, which may be introduced during sample collection or sequence preparation. In some instances, these contaminants remain in the sequence even after assembly and deposition of the genome into public databases. As a result, searches of these databases may yield erroneous and confusing results. We used efficient microbiome analysis software to scan the draft assembly of domestic cow, Bos taurus, and identify 173 small contigs that appeared to derive from microbial contaminants. In the course of verifying these findings, we discovered that one genome, Neisseria gonorrhoeae TCDC-NG08107, although putatively a complete genome, contained multiple sequences that actually derived from the cow and sheep genomes. Our findings illustrate the need to carefully validate findings of anomalous DNA that rely on comparisons to either draft or finished genomes. PMID:25426337

  6. Implications of the Plastid Genome Sequence of Typha (Typhaceae, Poales) for Understanding Genome Evolution in Poaceae

    Microsoft Academic Search

    Mary M. Guisinger; Timothy W. Chumley; Jennifer V. Kuehl; Jeffrey L. Boore; Robert K. Jansen

    2010-01-01

    Plastid genomes of the grasses (Poaceae) are unusual in their organization and rates of sequence evolution. There has been a recent surge in the availability of grass plastid genome sequences, but a comprehensive comparative analysis of genome evolution has not been performed that includes any related families in the Poales. We report on the plastid genome of Typha latifolia, the

  7. The Cancer Genome Atlas Pan-Cancer analysis project.

    PubMed

    Weinstein, John N; Collisson, Eric A; Mills, Gordon B; Shaw, Kenna R Mills; Ozenberger, Brad A; Ellrott, Kyle; Shmulevich, Ilya; Sander, Chris; Stuart, Joshua M

    2013-10-01

    The Cancer Genome Atlas (TCGA) Research Network has profiled and analyzed large numbers of human tumors to discover molecular aberrations at the DNA, RNA, protein and epigenetic levels. The resulting rich data provide a major opportunity to develop an integrated picture of commonalities, differences and emergent themes across tumor lineages. The Pan-Cancer initiative compares the first 12 tumor types profiled by TCGA. Analysis of the molecular aberrations and their functional roles across tumor types will teach us how to extend therapies effective in one cancer type to others with a similar genomic profile. PMID:24071849

  8. De novo assembly of a bell pepper endornavirus genome sequence using RNA sequencing data.

    PubMed

    Jo, Yeonhwa; Choi, Hoseng; Cho, Won Kyong

    2015-01-01

    The genus Endornavirus is a double-stranded RNA virus that infects a wide range of hosts. In this study, we report on the de novo assembly of a bell pepper endornavirus genome sequence by RNA sequencing (RNA-Seq). Our result demonstrates the successful application of RNA-Seq to obtain a complete viral genome sequence from the transcriptome data. PMID:25792042

  9. De Novo Assembly of a Bell Pepper Endornavirus Genome Sequence Using RNA Sequencing Data

    PubMed Central

    Jo, Yeonhwa; Choi, Hoseng

    2015-01-01

    The genus Endornavirus is a double-stranded RNA virus that infects a wide range of hosts. In this study, we report on the de novo assembly of a bell pepper endornavirus genome sequence by RNA sequencing (RNA-Seq). Our result demonstrates the successful application of RNA-Seq to obtain a complete viral genome sequence from the transcriptome data. PMID:25792042

  10. Genome sequence of the human malaria parasite Plasmodium falciparum

    Microsoft Academic Search

    Malcolm J. Gardner; Neil Hall; Eula Fung; Owen White; Matthew Berriman; Richard W. Hyman; Jane M. Carlton; Arnab Pain; Sharen Bowman; Ian T. Paulsen; Keith James; Kim Rutherford; Steven L. Salzberg; Alister Craig; Sue Kyes; Man-Suen Chan; Vishvanath Nene; Shamira J. Shallom; Bernard Suh; Jeremy Peterson; Sam Angiuoli; Mihaela Pertea; Jonathan Allen; Jeremy Selengut; Daniel Haft; Michael W. Mather; Akhil B. Vaidya; Alan H. Fairlamb; Martin J. Fraunholz; David S. Roos; Stuart A. Ralph; Geoffrey I. McFadden; Leda M. Cummings; G. Mani Subramanian; Chris Mungall; J. Craig Venter; Daniel J. Carucci; Stephen L. Hoffman; Chris Newbold; Ronald W. Davis; Claire M. Fraser; Bart Barrell

    2002-01-01

    The parasite Plasmodium falciparum is responsible for hundreds of millions of cases of malaria, and kills more than one million African children annually. Here we report an analysis of the genome sequence of P. falciparum clone 3D7. The 23-megabase nuclear genome consists of 14 chromosomes, encodes about 5,300 genes, and is the most (A + T)-rich genome sequenced to date.

  11. The human genome sequence: impact on health care

    Microsoft Academic Search

    M. D. Bashyam; S. E. Hasnain

    2003-01-01

    The recent sequencing of the human genome, resulting from two independent global efforts, is poised to revolutionize all aspects of human health. This landmark achievement has also vindicated two different methodologies that can now be used to target other important large genomes. The human genome sequence has revealed several novel/surprising features, notably the probable presence of a mere 30-35,000 genes.

  12. Draft sequences of the radish (Raphanus sativus L.) genome.

    PubMed

    Kitashiba, Hiroyasu; Li, Feng; Hirakawa, Hideki; Kawanabe, Takahiro; Zou, Zhongwei; Hasegawa, Yoichi; Tonosaki, Kaoru; Shirasawa, Sachiko; Fukushima, Aki; Yokoi, Shuji; Takahata, Yoshihito; Kakizaki, Tomohiro; Ishida, Masahiko; Okamoto, Shunsuke; Sakamoto, Koji; Shirasawa, Kenta; Tabata, Satoshi; Nishio, Takeshi

    2014-10-01

    Radish (Raphanus sativus L., n = 9) is one of the major vegetables in Asia. Since the genomes of Brassica and related species including radish underwent genome rearrangement, it is quite difficult to perform functional analysis based on the reported genomic sequence of Brassica rapa. Therefore, we performed genome sequencing of radish. Short reads of genomic sequences of 191.1 Gb were obtained by next-generation sequencing (NGS) for a radish inbred line, and 76,592 scaffolds of ≥300 bp were constructed along with the bacterial artificial chromosome-end sequences. Finally, the whole draft genomic sequence of 402 Mb spanning 75.9% of the estimated genomic size and containing 61,572 predicted genes was obtained. Subsequently, 221 single nucleotide polymorphism markers and 768 PCR-RFLP markers were used together with the 746 markers produced in our previous study for the construction of a linkage map. The map was combined further with another radish linkage map constructed mainly with expressed sequence tag-simple sequence repeat markers into a high-density integrated map of 1,166 cM with 2,553 DNA markers. A total of 1,345 scaffolds were assigned to the linkage map, spanning 116.0 Mb. Bulked PCR products amplified by 2,880 primer pairs were sequenced by NGS, and SNPs in eight inbred lines were identified. PMID:24848699

  13. Clinical efficacy and possible applications of genomics in lung cancer.

    PubMed

    Alharbi, Khalid Khalaf

    2015-01-01

    The heterogeneous nature of lung cancer has become increasingly apparent since the introduction of molecular classification. In general, advanced lung cancer is an aggressive malignancy with a poor prognosis. Activating alterations in several potential driver oncogenes have been identified, including EGFR, ROS1 and ALK, and understanding of the molecular mechanisms by which they underlie the development, progression, and survival of lung cancer has led to the design of personalized treatments that have produced superior clinical outcomes in tumours harbouring these mutations. In light of the tsunami of new biomarkers and targeted agents, next-generation sequencing testing strategies will be more appropriate for identifying the patients for each therapy and enabling personalized patient care. The challenge now is how best to interpret the results of these genomic tests, in the context of other clinical data, to optimize treatment choices. In the genomic era of cancer treatment, the traditional one-size-fits-all paradigm is being replaced with more effective, personalized oncologic care. This review provides an overview of lung cancer genomics and personalized treatment. PMID:25773789

  14. Complete Genome Sequence of Corynebacterium minutissimum, an Opportunistic Pathogen and the Causative Agent of Erythrasma.

    PubMed

    Penton, Patricia K; Tyagi, Eishita; Humrighouse, Ben W; McQuiston, John R

    2015-01-01

    Corynebacterium minutissimum was first isolated in 1961 from infection sites of patients presenting with erythrasma, a common cutaneous infection characterized by a rash. Since its discovery, C. minutissimum has been identified as an opportunistic pathogen in immunosuppressed cancer and HIV patients. Here, we report the whole-genome sequence of C. minutissimum. PMID:25792058

  15. The Cancer Genome Atlas (TCGA): The next stage

    Cancer.gov

    The Cancer Genome Atlas (TCGA), the NIH research program that has helped set the standards for characterizing the genomic underpinnings of dozens of cancers on a large scale, is moving to its next phase.

  16. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

    Microsoft Academic Search

    Tina T. Hu; Pedro Pattyn; Erica G. Bakker; Jun Cao; Jan-Fang Cheng; Richard M. Clark; Noah Fahlgren; Jeffrey A. Fawcett; Jane Grimwood; Heidrun Gundlach; Georg Haberer; Jesse D. Hollister; Stephan Ossowski; Robert P. Ottilar; Asaf A. Salamov; Korbinian Schneeberger; Manuel Spannagl; Xi Wang; Liang Yang; Mikhail E. Nasrallah; Joy Bergelson; James C. Carrington; Brandon S. Gaut; Jeremy Schmutz; Klaus F. X. Mayer; Yves Van de Peer; Igor V. Grigoriev; Magnus Nordborg; Detlef Weigel; Ya-Long Guo

    2011-01-01

    In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies, etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how

  17. Whole-Genome Sequences of Thirteen Isolates of Borrelia burgdorferi

    SciTech Connect

    Schutzer S. E.; Dunn J.; Fraser-Liggett, C. M.; Casjens, S. R.; Qiu, W.-G.; Mongodin, E. F.; Luft, B. J.

    2011-02-01

    Borrelia burgdorferi is a causative agent of Lyme disease in North America and Eurasia. The first complete genome sequence of B. burgdorferi strain B31, available for more than a decade, has assisted research on the pathogenesis of Lyme disease. Because a single genome sequence is not sufficient to understand the relationship between genotypic and geographic variation and disease phenotype, we determined the whole-genome sequences of 13 additional B. burgdorferi isolates that span the range of natural variation. These sequences should allow improved understanding of pathogenesis and provide a foundation for novel detection, diagnosis, and prevention strategies.

  18. Scrutinizing Virus Genome Termini by High-Throughput Sequencing

    PubMed Central

    Fan, Huahao; Jiang, Huanhuan; Chen, Yubao; Tong, Yigang

    2014-01-01

    Analysis of genomic terminal sequences has been a major step in studies on viral DNA replication and packaging mechanisms. However, traditional methods to study genome termini are challenging due to the time-consuming protocols and their inefficiency where critical details are lost easily. Recent advances in next generation sequencing (NGS) have enabled it to be a powerful tool to study genome termini. In this study, using NGS we sequenced one iridovirus genome and twenty phage genomes and confirmed for the first time that the high frequency sequences (HFSs) found in the NGS reads are indeed the terminal sequences of viral genomes. Further, we established a criterion to distinguish the type of termini and the viral packaging mode. We also obtained additional terminal details such as terminal repeats, multi-termini, asymmetric termini. With this approach, we were able to simultaneously detect details of the genome termini as well as obtain the complete sequence of bacteriophage genomes. Theoretically, this application can be further extended to analyze larger and more complicated genomes of plant and animal viruses. This study proposed a novel and efficient method for research on viral replication, packaging, terminase activity, transcription regulation, and metabolism of the host cell. PMID:24465717

  19. Standards for sequencing viral genomes in the era of high-throughput sequencing.

    PubMed

    Ladner, Jason T; Beitzel, Brett; Chain, Patrick S G; Davenport, Matthew G; Donaldson, Eric F; Frieman, Matthew; Kugelman, Jeffrey R; Kuhn, Jens H; O'Rear, Jules; Sabeti, Pardis C; Wentworth, David E; Wiley, Michael R; Yu, Guo-Yun; Sozhamannan, Shanmuga; Bradburne, Christopher; Palacios, Gustavo

    2014-01-01

    Thanks to high-throughput sequencing technologies, genome sequencing has become a common component in nearly all aspects of viral research; thus, we are experiencing an explosion in both the number of available genome sequences and the number of institutions producing such data. However, there are currently no common standards used to convey the quality, and therefore utility, of these various genome sequences. Here, we propose five "standard" categories that encompass all stages of viral genome finishing, and we define them using simple criteria that are agnostic to the technology used for sequencing. We also provide genome finishing recommendations for various downstream applications, keeping in mind the cost-benefit trade-offs associated with different levels of finishing. Our goal is to define a common vocabulary that will allow comparison of genome quality across different research groups, sequencing platforms, and assembly techniques. PMID:24939889

  20. Standards for Sequencing Viral Genomes in the Era of High-Throughput Sequencing

    PubMed Central

    Beitzel, Brett; Chain, Patrick S. G.; Davenport, Matthew G.; Donaldson, Eric; Frieman, Matthew; Kugelman, Jeffrey; Kuhn, Jens H.; O’Rear, Jules; Sabeti, Pardis C.; Wentworth, David E.; Wiley, Michael R.; Yu, Guo-Yun; Sozhamannan, Shanmuga; Bradburne, Christopher

    2014-01-01

    Thanks to high-throughput sequencing technologies, genome sequencing has become a common component in nearly all aspects of viral research; thus, we are experiencing an explosion in both the number of available genome sequences and the number of institutions producing such data. However, there are currently no common standards used to convey the quality, and therefore utility, of these various genome sequences. Here, we propose five “standard” categories that encompass all stages of viral genome finishing, and we define them using simple criteria that are agnostic to the technology used for sequencing. We also provide genome finishing recommendations for various downstream applications, keeping in mind the cost-benefit trade-offs associated with different levels of finishing. Our goal is to define a common vocabulary that will allow comparison of genome quality across different research groups, sequencing platforms, and assembly techniques. PMID:24939889

  1. Emerging Knowledge from Genome Sequencing of Crop Species

    Microsoft Academic Search

    Delfina Barabaschi; Davide Guerra; Katia Lacrima; Paolo Laino; Vania Michelotti; Simona Urso; Giampiero Valè; Luigi Cattivelli

    Extensive insights into genome composition, organization, and evolution have been gained from ongoing plant genome sequencing and annotation projects. The analysis of crop genomes has provided surprising evidence with important implications for plant origin and evolution: genome duplication, ancestral rearrangements and unexpected polyploidization events have opened new doors to address fundamental questions related to species proliferation, adaptation, and functional modulations.

  2. Using Partial Genomic Fosmid Libraries for Sequencing Complete Organellar Genomes

    SciTech Connect

    McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.; Kuehl, Jennifer V.; Boore, Jeffrey L.; dePamphilis, Claude W.

    2005-08-26

    Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.

  3. Generation of Physical Map Contig-Specific Sequences Useful for Whole Genome Sequence Scaffolding

    PubMed Central

    Jiang, Yanliang; Ninwichian, Parichart; Liu, Shikai; Zhang, Jiaren; Kucuktas, Huseyin; Sun, Fanyue; Kaltenboeck, Ludmilla; Sun, Luyang; Bao, Lisui; Liu, Zhanjiang

    2013-01-01

    Along with the rapid advances of the nextgen sequencing technologies, more and more species are added to the list of organisms whose whole genomes are sequenced. However, the assembled draft genome of many organisms consists of numerous small contigs, due to the short length of the reads generated by nextgen sequencing platforms. In order to improve the assembly and bring the genome contigs together, more genome resources are needed. In this study, we developed a strategy to generate a valuable genome resource, physical map contig-specific sequences, which are randomly distributed genome sequences in each physical contig. Two-dimensional tagging method was used to create specific tags for 1,824 physical contigs, in which the cost was dramatically reduced. A total of 94,111,841 100-bp reads and 315,277 assembled contigs are identified containing physical map contig-specific tags. The physical map contig-specific sequences along with the currently available BAC end sequences were then used to anchor the catfish draft genome contigs. A total of 156,457 genome contigs (~79% of whole genome sequencing assembly) were anchored and grouped into 1,824 pools, in which 16,680 unique genes were annotated. The physical map contig-specific sequences are valuable resources to link physical map, genetic linkage map and draft whole genome sequences, consequently have the capability to improve the whole genome sequences assembly and scaffolding, and improve the genome-wide comparative analysis as well. The strategy developed in this study could also be adopted in other species whose whole genome assembly is still facing a challenge. PMID:24205335

  4. Genome remodelling in a basal-like breast cancer metastasis and xenograft

    Microsoft Academic Search

    Li Ding; Matthew J. Ellis; Shunqiang Li; David E. Larson; Ken Chen; John W. Wallis; Christopher C. Harris; Michael D. McLellan; Robert S. Fulton; Lucinda L. Fulton; Rachel M. Abbott; Jeremy Hoog; David J. Dooling; Daniel C. Koboldt; Heather Schmidt; Joelle Kalicki; Qunyuan Zhang; Lei Chen; Ling Lin; Michael C. Wendl; Joshua F. McMichael; Vincent J. Magrini; Lisa Cook; Sean D. McGrath; Tammi L. Vickery; Elizabeth Appelbaum; Katherine Deschryver; Sherri Davies; Therese Guintoli; Li Lin; Robert Crowder; Yu Tao; Jacqueline E. Snider; Scott M. Smith; Adam F. Dukes; Gabriel E. Sanderson; Craig S. Pohl; Kim D. Delehaunty; Catrina C. Fronick; Kimberley A. Pape; Jerry S. Reed; Jody S. Robinson; Jennifer S. Hodges; William Schierding; Nathan D. Dees; Dong Shen; Devin P. Locke; Madeline E. Wiechert; James M. Eldred; Josh B. Peck; Benjamin J. Oberkfell; Justin T. Lolofie; Feiyu Du; Amy E. Hawkins; Michelle D. O'Laughlin; Kelly E. Bernard; Mark Cunningham; Glendoria Elliott; Mark D. Mason; Dominic M. Thompson Jr.; Jennifer L. Ivanovich; Paul J. Goodfellow; Charles M. Perou; George M. Weinstock; Rebecca Aft; Mark Watson; Timothy J. Ley; Richard K. Wilson; Elaine R. Mardis

    2010-01-01

    Massively parallel DNA sequencing technologies provide an unprecedented ability to screen entire genomes for genetic changes associated with tumour progression. Here we describe the genomic analyses of four DNA samples from an African-American patient with basal-like breast cancer: peripheral blood, the primary tumour, a brain metastasis and a xenograft derived from the primary tumour. The metastasis contained two de novo

  5. Genome Science: A Video Tour of the Washington University Genome Sequencing Center for High School and Undergraduate Students

    ERIC Educational Resources Information Center

    Flowers, Susan K.; Easter, Carla; Holmes, Andrea; Cohen, Brian; Bednarski, April E.; Mardis, Elaine R.; Wilson, Richard K.; Elgin, Sarah C. R.

    2005-01-01

    Sequencing of the human genome has ushered in a new era of biology. The technologies developed to facilitate the sequencing of the human genome are now being applied to the sequencing of other genomes. In 2004, a partnership was formed between Washington University School of Medicine Genome Sequencing Center's Outreach Program and Washington…

  6. Genome Sequence of Tumebacillus flagellatus GST4, the First Genome Sequence of a Species in the Genus Tumebacillus

    PubMed Central

    Wang, Qing-Yan; Huang, Yan-Yan; Song, Li-Fu; Du, Qi-Shi; Yu, Bo; Chen, Dong

    2014-01-01

    We present here the first genome sequence of a species in the genus Tumebacillus. The draft genome sequence of Tumebacillus flagellatus GST4 provides a genetic basis for future studies addressing the origins, evolution, and ecological role of Tumebacillus organisms, as well as a source of acid-resistant amylase-encoding genes for further studies. PMID:25395648

  7. Interplay Between the Cancer Genome and Epigenome

    PubMed Central

    Shen, Hui; Laird, Peter W.

    2013-01-01

    Cancer arises as a consequence of cumulative disruptions to cellular growth control, with Darwinian selection for those heritable changes which provide the greatest clonal advantage. These traits can be acquired and stably maintained by either genetic or epigenetic means. Here we explore the ways in which alterations in the genome and epigenome influence each other and cooperate to promote oncogenic transformation. Disruption of epigenomic control is pervasive in malignancy, and can be classified as an enabling characteristic of cancer cells, akin to genome instability and mutation. PMID:23540689

  8. The Brachypodium genome sequence: a resource for oat genomics research

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Oat (Avena sativa) is an important cereal crop used as both an animal feed and for human consumption. Genetic and genomic research on oat is hindered because it is hexaploid and possesses a large (13 Gb) genome. Diploid Avena relatives have been employed for genetic and genomic studies, but only mod...

  9. Single-Molecule DNA Sequencing of a Viral Genome

    Microsoft Academic Search

    Timothy D. Harris; Phillip R. Buzby; Hazen Babcock; Eric Beer; Jayson Bowers; Ido Braslavsky; Marie Causey; Jennifer Colonell; James DiMeo; J. William Efcavitch; Eldar Giladi; Jaime Gill; John Healy; Mirna Jarosz; Dan Lapen; Keith Moulton; Stephen R. Quake; Kathleen Steinmann; Edward Thayer; Anastasia Tyurina; Rebecca Ward; Howard Weiss; Zheng Xie

    2008-01-01

    The full promise of human genomics will be realized only when the genomes of thousands of individuals can be sequenced for comparative analysis. A reference sequence enables the use of short read length. We report an amplification-free method for determining the nucleotide sequence of more than 280,000 individual DNA molecules simultaneously. A DNA polymerase adds labeled nucleotides to surface-immobilized primer-template

  10. MIPS: a database for genomes and protein sequences

    Microsoft Academic Search

    Hans-werner Mewes; Dmitrij Frishman; Ulrich Güldener; Gertrud Mannhaupt; Klaus F. X. Mayer; Martin Mokrejs; Burkhard Morgenstern; Martin Münsterkötter; Stephen Rudd; B. Weil

    2002-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein

  11. Automated de novo identification of repeat sequence families in sequenced genomes.

    PubMed

    Bao, Zhirong; Eddy, Sean R

    2002-08-01

    Repetitive sequences make up a major part of eukaryotic genomes. We have developed an approach for the de novo identification and classification of repeat sequence families that is based on extensions to the usual approach of single linkage clustering of local pairwise alignments between genomic sequences. Our extensions use multiple alignment information to define the boundaries of individual copies of the repeats and to distinguish homologous but distinct repeat element families. When tested on the human genome, our approach was able to properly identify and group known transposable elements. The program should be useful for first-pass automatic classification of repeats in newly sequenced genomes. PMID:12176934
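    The baseline step named in this abstract, single linkage clustering of pairwise alignments, can be illustrated with a minimal sketch. It assumes the alignments have already been reduced to pairs of sequence indices; the function name and input format are hypothetical and are not taken from the authors' program, and the multiple-alignment extensions the abstract describes are not modelled here.

```python
def single_linkage_clusters(n_items, aligned_pairs):
    """Group items so that any two items linked by a chain of
    pairwise alignments end up in the same cluster (single linkage)."""
    parent = list(range(n_items))  # union-find forest, one root per cluster

    def find(x):
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in aligned_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = {}
    for item in range(n_items):
        clusters.setdefault(find(item), []).append(item)
    return sorted(clusters.values())


# Hypothetical input: six sequence fragments, three pairwise alignments.
families = single_linkage_clusters(6, [(0, 1), (1, 2), (3, 4)])
```

    In this toy input, fragments 0, 1 and 2 fall into one family because 0 aligns to 1 and 1 aligns to 2, even though 0 and 2 were never aligned directly. That chaining behaviour is why single linkage alone can merge related but distinct repeat families.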

  12. Complete genome sequence of Anaerococcus prevotii type strain (PC1).

    PubMed

    Labutti, Kurt; Pukall, Rudiger; Steenblock, Katja; Glavina Del Rio, Tijana; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Nolan, Matt; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavromatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Chain, Patrick; Saunders, Elizabeth; Brettin, Thomas; Detter, John C; Han, Cliff; Göker, Markus; Bristow, Jim; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Lapidus, Alla

    2009-01-01

    Anaerococcus prevotii (Foubert and Douglas 1948) Ezaki et al. 2001 is the type species of the genus, and is of phylogenetic interest because of its arguable assignment to the provisionally arranged family 'Peptostreptococcaceae'. A. prevotii is an obligate anaerobic coccus, usually arranged in clumps or tetrads. The strain, whose genome is described here, was originally isolated from human plasma; other strains of the species were also isolated from clinical specimens. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the genus. Next to Finegoldia magna, A. prevotii is only the second species from the family 'Peptostreptococcaceae' for which a complete genome sequence is described. The 1,998,633 bp long genome (chromosome and one plasmid) with its 1852 protein-coding and 61 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304652

  13. Nervous system regulation of the cancer genome

    PubMed Central

    Cole, Steven W.

    2012-01-01

    Genomics-based analyses have provided deep insight into the basic biology of cancer and are now clarifying the molecular pathways by which psychological and social factors can regulate tumor cell gene expression and genome evolution. This review summarizes basic and clinical research on neural and endocrine regulation of the cancer genome and its interactions with the surrounding tumor microenvironment, including the specific types of genes subject to neural and endocrine regulation, the signal transduction pathways that mediate such effects, and therapeutic approaches that might be deployed to mitigate their impact. Beta-adrenergic signaling from the sympathetic nervous system has been found to up-regulate a diverse array of genes that contribute to tumor progression and metastasis, whereas glucocorticoid-regulated genes can inhibit DNA repair and promote cancer cell survival and resistance to chemotherapy. Relationships between socio-environmental risk factors, neural and endocrine signaling to the tumor microenvironment, and transcriptional responses by cancer cells and surrounding stromal cells are providing new mechanistic insights into the social epidemiology of cancer, new therapeutic approaches for protecting the health of cancer patients, and new molecular biomarkers for assessing the impact of behavioral and pharmacologic interventions. PMID:23207104

  14. Somatic Mutation Profiles of MSI and MSS Colorectal Cancer Identified by Whole Exome Next Generation Sequencing and Bioinformatics Analysis

    Microsoft Academic Search

    Bernd Timmermann; Martin Kerick; Christina Roehr; Axel Fischer; Melanie Isau; Stefan T. Boerno; Andrea Wunderlich; Christian Barmeyer; Petra Seemann; Jana Koenig; Michael Lappe; Andreas W. Kuss; Masoud Garshasbi; Lars Bertram; Kathrin Trappe; Martin Werber; Bernhard G. Herrmann; Kurt Zatloukal; Hans Lehrach; Michal R. Schweiger; Amanda Ewart Toland

    2010-01-01

    Background: Colorectal cancer (CRC) is, with approximately 1 million cases, the third most common cancer worldwide. Extensive research is ongoing to decipher the underlying genetic patterns with the hope of improving early cancer diagnosis and treatment. In this direction, the recent progress in next generation sequencing technologies has revolutionized the field of cancer genomics. However, one caveat of these studies remains

  15. Looking to future of genome mapping, sequencing

    SciTech Connect

    Kangilaski, J.

    1989-07-21

    The human genome mapping and sequencing project is perhaps the prime example of an international project in medicine today. The project director, Nobelist James D. Watson, PhD, noted at the bicentennial conference that it may be possible to bring the cost down to as low as 50 cents a base pair without any enormous technological breakthroughs in the 10-nation effort. Another speaker, George Poste, PhD, DVM, DSc, head of research and development, Smith Kline & French Laboratories, Philadelphia, PA, predicted that completion of the genetic dictionary will lead to compilation of a protein dictionary for each cell type for use against disease. Anti-trust legislation, he said, is overtly ignored all the time in the defense industry because it is deemed to be in the national interest. However, Poste went on, the legislative bodies of the world do not yet understand the implications of the directions in which we are going in terms of Big Biology and the requirements for companies to be able to work together.

  16. Complete genome sequence of Thermomonospora curvata type strain (B9)

    SciTech Connect

    Chertkov, Olga [Los Alamos National Laboratory (LANL)]; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Nolan, Matt [Joint Genome Institute, Walnut Creek, California]; Lapidus, Alla L. [Joint Genome Institute, Walnut Creek, California]; Lucas, Susan [Joint Genome Institute, Walnut Creek, California]; Glavina Del Rio, Tijana [Joint Genome Institute, Walnut Creek, California]; Tice, Hope [Joint Genome Institute, Walnut Creek, California]; Cheng, Jan-Fang [Joint Genome Institute, Walnut Creek, California]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [Joint Genome Institute, Walnut Creek, California]; Liolios, Konstantinos [Joint Genome Institute, Walnut Creek, California]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute]; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [Joint Genome Institute, Walnut Creek, California]; Palaniappan, Krishna [Joint Genome Institute, Walnut Creek, California]; Ngatchou, Olivier Duplex [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Chang, Yun-Juan [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Brettin, Thomas S [ORNL]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Detter, J. Chris [Joint Genome Institute, Walnut Creek, California]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Woyke, Tanja [Joint Genome Institute, Walnut Creek, California]; Bristow, James [Joint Genome Institute, Walnut Creek, California]; Eisen, Jonathan [Joint Genome Institute, Walnut Creek, California]; Markowitz, Victor [Joint Genome Institute, Walnut Creek, California]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Kyrpides, Nikos C [Joint Genome Institute, Walnut Creek, California]

    2011-01-01

    Thermomonospora curvata Henssen 1957 is the type species of the genus Thermomonospora. This genus is of interest because members of this clade are sources of new antibiotics, enzymes, and products with pharmacological activity. In addition, members of this genus participate in the active degradation of cellulose. This is the first complete genome sequence of a member of the family Thermomonosporaceae. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 5,639,016 bp long genome with its 4,985 protein-coding and 76 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  17. Complete genome sequence of Gordonia bronchialis type strain (3410T)

    SciTech Connect

    Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Jando, Marlen [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute]; Tice, Hope [U.S. Department of Energy, Joint Genome Institute]; Copeland, A [U.S. Department of Energy, Joint Genome Institute]; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute]; Chen, Feng [U.S. Department of Energy, Joint Genome Institute]; Bruce, David [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [U.S. Department of Energy, Joint Genome Institute]; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Chang, Yun-Juan [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL)]; Saunders, Elizabeth H [Los Alamos National Laboratory (LANL)]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Detter, J C [U.S. Department of Energy, Joint Genome Institute]; Brettin, Thomas S [ORNL]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Bristow, James [U.S. Department of Energy, Joint Genome Institute]; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute]; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]

    2010-01-01

    Gordonia bronchialis Tsukamura 1971 is the type species of the genus. G. bronchialis is a human-pathogenic organism that has been isolated from a large variety of human tissues. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the family Gordoniaceae. The 5,290,012 bp long genome with its 4,944 protein-coding and 55 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  18. Complete genome sequence of Acidimicrobium ferrooxidans type strain (ICPT)

    SciTech Connect

    Clum, Alicia; Nolan, Matt; Lang, Elke; Glavina Del Rio, Tijana; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavrommatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Goker, Markus; Spring, Stefan; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C.; Chain, Patrick; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter; Lapidus, Alla

    2009-05-20

    Acidimicrobium ferrooxidans (Clark and Norris 1996) is the sole and type species of the genus, which until recently was the only genus within the actinobacterial family Acidimicrobiaceae and in the order Acidimicrobiales. Rapid oxidation of iron pyrite during autotrophic growth in the absence of an enhanced CO2 concentration is characteristic of A. ferrooxidans. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of the order Acidimicrobiales, and the 2,158,157 bp long single replicon genome with its 2038 protein-coding and 54 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  19. STATUS OF THE RB51 GENOME SEQUENCING PROJECT

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Shotgun sequencing of the genome of the B. abortus vaccine strain RB51 is nearly complete. Thus far, approximately 49,000 recombinant clones have been sequenced, generating approximately 34,300,000 bp of raw DNA sequence data. The resulting data have been compiled and aligned using the B. abortus st...

  20. Recurrence time statistics: Versatile tools for genomic DNA sequence analysis

    E-print Network

    Gao, Jianbo

    Recurrence time statistics: Versatile tools for genomic DNA sequence analysis Yinhe Cao1, Wen from DNA sequences. One of the more important structures in a DNA sequence is repeat-related. Often they have to be masked before protein coding regions along a DNA sequence are to be identified or redundant

  1. INVESTIGATION Whole-Genome Sequencing of Sordaria macrospora

    E-print Network

    Kück, Ulrich

    prior mapping information. KEYWORDS next-generation sequencing developmental mutants Sordaria macrospora. In recent years, so-called next-generation sequencing techniques were developed that allow a massively INVESTIGATION Whole-Genome Sequencing of Sordaria macrospora Mutants Identifies Developmental Genes

  2. Cancer Vulnerabilities Unveiled by Genomic Loss

    E-print Network

    Bhatia, Sangeeta

    Cancer Vulnerabilities Unveiled by Genomic Loss Deepak Nijhawan,1,2,7,9,10 Travis I. Zack,1 02139, USA 7Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA *Correspondence: rameen

  3. Genetic and Clonal Dissection of Murine Small Cell Lung Carcinoma Progression by Genome Sequencing

    PubMed Central

    McFadden, David G.; Papagiannakopoulos, Thales; Taylor-Weiner, Amaro; Stewart, Chip; Carter, Scott L.; Cibulskis, Kristian; Bhutkar, Arjun; McKenna, Aaron; Dooley, Alison; Vernon, Amanda; Sougnez, Carrie; Malstrom, Scott; Heimann, Megan; Park, Jennifer; Chen, Frances; Farago, Anna F.; Dayton, Talya; Shefler, Erica; Gabriel, Stacey; Getz, Gad; Jacks, Tyler

    2014-01-01

    Summary: Small cell lung carcinoma (SCLC) is a highly lethal, smoking-associated cancer with few known targetable genetic alterations. Using genome sequencing, we characterized the somatic evolution of a genetically engineered mouse model (GEMM) of SCLC initiated by loss of Trp53 and Rb1. We identified alterations in DNA copy number and complex genomic rearrangements and demonstrated a low somatic point mutation frequency in the absence of tobacco mutagens. Alterations targeting the tumor suppressor Pten occurred in the majority of murine SCLC studied, and engineered Pten deletion accelerated murine SCLC and abrogated loss of Chr19 in Trp53; Rb1; Pten compound mutant tumors. Finally, we found evidence for polyclonal and sequential metastatic spread of murine SCLC by comparative sequencing of families of related primary tumors and metastases. We propose a temporal model of SCLC tumorigenesis with implications for human SCLC therapeutics and the nature of cancer-genome evolution in GEMMs. PMID:24630729

  4. MicroRNAs, Genomic Instability and Cancer

    PubMed Central

    Vincent, Kimberly; Pichler, Martin; Lee, Gyeong-Won; Ling, Hui

    2014-01-01

    MicroRNAs (miRNAs) are small non-coding RNA transcripts approximately 20 nucleotides in length that regulate expression of protein-coding genes via complementary binding mechanisms. The last decade has seen an exponential increase of publications on miRNAs, ranging from every aspect of basic cancer biology to diagnostic and therapeutic explorations. In this review, we summarize findings of miRNA involvement in genomic instability, an interesting but largely neglected topic to date. We discuss the potential mechanisms by which miRNAs induce genomic instability, considered to be one of the most important driving forces of cancer initiation and progression, though its precise mechanisms remain elusive. We classify genomic instability mechanisms into defects in cell cycle regulation, DNA damage response, and mitotic separation, and review the findings demonstrating the participation of specific miRNAs in such mechanisms. PMID:25141103

  5. Transcriptome profiling of the cancer and normal tissues from gastric cancer patients by deep sequencing.

    PubMed

    Zhang, Fei-Gong; He, Zhi-Ying; Wang, Qiang

    2014-08-01

    Gastric cancer is the second highest cause of global cancer mortality. Genome-wide screening of transcriptome dysregulation between cancer and normal tissues would provide insights into the molecular basis of gastric cancer initiation and progression. Recently, next-generation sequencing techniques have started to revolutionize biomedical studies. The RNA-seq method has become a superior approach in cancer studies, enabling accurate measurement of gene expression levels. In this work, we used RNA-seq data from tumor and matched normal samples to investigate their transcriptional changes. In total, we identified 114 significantly differentially expressed genes, and these genes are highly enriched in some gene ontology (GO) categories, such as "digestive system process," "regulation of body fluid levels," "secretion," "digestion," etc. This study provides a preliminary survey of the transcriptome of Chinese gastric cancer patients, which offers better insights into the complexity of regulatory changes during tumorigenesis. PMID:24777338

  6. The genus Burkholderia: analysis of 56 genomic sequences.

    PubMed

    Ussery, D W; Kiil, K; Lagesen, K; Sicheritz-Pontén, T; Bohlin, J; Wassenaar, T M

    2009-01-01

    The genus Burkholderia consists of a number of very diverse species, both in terms of lifestyle (which varies from category B pathogens to apathogenic soil bacteria and plant colonizers) and their genetic contents. We have used 56 publicly available genomes to explore the genomic diversity within this genus, including genome sequences that are not completely finished, but are available from the NCBI database. Defining the pan- and core genomes of species results in insights in the conserved and variable fraction of genomes, and can verify (or question) historic, taxonomic groupings. We find only several hundred genes that are conserved across all Burkholderia genomes, whilst there are more than 40,000 gene families in the Burkholderia pan-genome. A BLAST matrix visualizes the fraction of conserved genes in pairwise comparisons. A BLAST atlas shows which genes are actually conserved in a number of genomes, located and visualized with reference to a chosen genome. Genomic islands are common in many Burkholderia genomes, and most of these can be readily visualized by DNA structural properties of the chromosome. Trees that are based on relatedness of gene family content yield different results depending on what genes are analyzed. Some of the differences can be explained by errors in incomplete genome sequences, but, as our data illustrate, the outcome of phylogenetic trees depends on the type of genes that are analyzed. PMID:19696499
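    The pan- and core-genome definitions used in this abstract reduce to a set union and a set intersection over per-genome gene family content. The sketch below is only an illustration under stated assumptions: the function name, genome labels and family identifiers are hypothetical, and it presumes that genes have already been assigned to families (for example from pairwise BLAST hits), which is the hard part in practice.

```python
def pan_and_core(genome_families):
    """Return (pan_genome, core_genome) for a mapping of
    genome name -> iterable of gene family identifiers."""
    family_sets = [set(families) for families in genome_families.values()]
    if not family_sets:
        return set(), set()
    pan = set().union(*family_sets)          # families seen in any genome
    core = set.intersection(*family_sets)    # families seen in every genome
    return pan, core


# Hypothetical toy input: three genomes, five gene families.
toy_genomes = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam1", "fam2", "fam4"},
    "genome_C": {"fam1", "fam5"},
}
pan, core = pan_and_core(toy_genomes)
```

    Adding a genome can only grow the pan-genome and shrink the core genome, which is consistent with the abstract's contrast between a few hundred conserved genes and more than 40,000 gene families across the genus. An incomplete genome that is missing a gene removes that family from the core, one way in which draft assemblies can distort such comparisons.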

  7. Complete genome sequence of Thioalkalivibrio sp. K90mix

    PubMed Central

    Muyzer, Gerard; Sorokin, Dimitry Y.; Mavromatis, Konstantinos; Lapidus, Alla; Foster, Brian; Sun, Hui; Ivanova, Natalia; Pati, Amrita; D'haeseleer, Patrik; Woyke, Tanja; Kyrpides, Nikos C.

    2011-01-01

    Thioalkalivibrio sp. K90mix is an obligately chemolithoautotrophic, natronophilic sulfur-oxidizing bacterium (SOxB) belonging to the family Ectothiorhodospiraceae within the Gammaproteobacteria. The strain was isolated from a mixture of sediment samples obtained from different soda lakes located in the Kulunda Steppe (Altai, Russia) based on its extreme potassium carbonate tolerance as an enrichment method. Here we report the complete genome sequence of strain K90mix and its annotation. The genome was sequenced within the Joint Genome Institute Community Sequencing Program, because of its relevance to the sustainable removal of sulfide from wastewater and gas streams. PMID:22675584

  8. Genomic Treasure Troves: Complete Genome Sequencing of Herbarium and Insect Museum Specimens

    PubMed Central

    Staats, Martijn; Erkens, Roy H. J.; van de Vossenberg, Bart; Wieringa, Jan J.; Kraaijeveld, Ken; Stielow, Benjamin; Geml, József; Richardson, James E.; Bakker, Freek T.

    2013-01-01

    Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22–82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4–97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2–71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more horizontal. Furthermore, NGS of historical DNA enables recovering crucial genetic information from old type specimens that to date have remained mostly unutilized and, thus, opens up a new frontier for taxonomic research as well. PMID:23922691

  9. US-Mexico sequence-analysis collaboration illuminates breast cancer's drivers

    Cancer.gov

    Breast cancer is not a single disease, but a collection of diseases with dozens of different mutations that crop up with varying frequency across different breast cancer subtypes. Deeper exploration of the genetic changes that drive breast cancer is revealing new complexity in the leading cause of cancer death in women worldwide. In one of the largest breast cancer sequencing efforts to date, scientists from the Broad Institute, Dana-Farber Cancer Institute, the National Institute of Genomic Medicine in Mexico City, and Beth Israel Deaconess Medical Center have discovered surprising alterations in genes that were not previously associated with breast cancer. They report their results in the June 21 issue of Nature.

  10. Study reveals genomic similarities between breast and ovarian cancers

    Cancer.gov

    A new study from The Cancer Genome Atlas captured a complete view of genomic alterations in breast cancer and classified them into four intrinsic subtypes, one of which shares many genetic features with high-grade serous ovarian cancer. Depicted are breast cancer cells with the HER2 protein, which can trigger cell growth responses, lit up in bright red. (Photo credit: NIST)

  11. Application of massive parallel sequencing to whole genome SNP discovery in the porcine genome

    Microsoft Academic Search

    Andreia J Amaral; Hendrik-Jan Megens; Hindrik HD Kerstens; Henri CM Heuven; Bert Dibbits; Richard PMA Crooijmans; Johan T den Dunnen; Martien AM Groenen

    2009-01-01

    BACKGROUND: Although the Illumina 1 G Genome Analyzer generates billions of base pairs of sequence data, challenges arise in sequence selection due to the varying sequence quality. Therefore, in the framework of the International Porcine SNP Chip Consortium, this pilot study aimed to evaluate the impact of the quality level of the sequenced bases on mapping quality and identification of

  12. A physical map of the papaya genome with integrated genetic map and genome sequence

    Microsoft Academic Search

    Qingyi Yu; Eric Tong; Rachel L Skelton; John E Bowers; Meghan R Jones; Jan E Murray; Shaobin Hou; Peizhu Guan; Ricelle A Acob; Ming-Cheng Luo; Paul H Moore; Maqsudul Alam; Andrew H Paterson; Ray Ming

    2009-01-01

    BACKGROUND: Papaya is a major fruit crop in tropical and subtropical regions worldwide and has primitive sex chromosomes controlling sex determination in this trioecious species. The papaya genome was recently sequenced because of its agricultural importance, unique biological features, and successful application of transgenic papaya for resistance to papaya ringspot virus. As a part of the genome sequencing project, we

  13. New complete genome sequences of human rhinoviruses shed light on their phylogeny and genomic features

    Microsoft Academic Search

    Caroline Tapparel; Thomas Junier; Daniel Gerlach; Samuel Cordey; Sandra Van Belle; Luc Perrin; Evgeny M Zdobnov; Laurent Kaiser

    2007-01-01

    BACKGROUND: Human rhinoviruses (HRV), the most frequent cause of respiratory infections, include 99 different serotypes segregating into two species, A and B. Rhinoviruses share extensive genomic sequence similarity with enteroviruses and both are part of the picornavirus family. Nevertheless they differ significantly at the phenotypic level. The lack of HRV full-length genome sequences and the absence of analysis comparing picornaviruses

  14. De Novo Whole-Genome Sequence and Genome Annotation of Lichtheimia ramosa

    PubMed Central

    Linde, Jörg; Schwartze, Volker; Binder, Ulrike; Lass-Flörl, Cornelia

    2014-01-01

    We report the annotated draft genome sequence of Lichtheimia ramosa (JMRC FSU:6197). It has been reported to be a causative organism of mucormycosis, a rare but rapidly progressive infection in immunocompromised humans. The functionally annotated genomic sequence consists of 74 scaffolds with a total number of 11,510 genes. PMID:25212617

  15. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

    SciTech Connect

    Hu, Tina T.; Pattyn, Pedro; Bakker, Erica G.; Cao, Jun; Cheng, Jan-Fang; Clark, Richard M.; Fahlgren, Noah; Fawcett, Jeffrey A.; Grimwood, Jane; Gundlach, Heidrun; Haberer, Georg; Hollister, Jesse D.; Ossowski, Stephan; Ottilar, Robert P.; Salamov, Asaf A.; Schneeberger, Korbinian; Spannagl, Manuel; Wang, Xi; Yang, Liang; Nasrallah, Mikhail E.; Bergelson, Joy; Carrington, James C.; Gaut, Brandon S.; Schmutz, Jeremy; Mayer, Klaus F. X.; Van de Peer, Yves; Grigoriev, Igor V.; Nordborg, Magnus; Weigel, Detlef; Guo, Ya-Long

    2011-04-29

    In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies, etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how A. thaliana lost half its genome in a few million years and lived to tell the tale. The rather surprising conclusion is that there is not a single genomic feature that accounts for the reduced genome, but that every aspect (centromeres, intergenic regions, transposable elements, gene family number) is affected through hundreds of thousands of cuts. This strongly suggests that overall genome size in itself is what has been under selection, a suggestion that is strongly supported by our demonstration (using population genetics data from A. thaliana) that new deletions seem to be driven to fixation.

  16. Sequencing, Assembling, and Correcting Draft Genomes Using Recombinant Populations

    PubMed Central

    Hahn, Matthew W.; Zhang, Simo V.; Moyle, Leonie C.

    2014-01-01

    Current de novo whole-genome sequencing approaches often are inadequate for organisms lacking substantial preexisting genetic data. Problems with these methods are manifest as: large numbers of scaffolds that are not ordered within chromosomes or assigned to individual chromosomes, misassembly of allelic sequences as separate loci when the individual(s) being sequenced are heterozygous, and the collapse of recently duplicated sequences into a single locus, regardless of levels of heterozygosity. Here we propose a new approach for producing de novo whole-genome sequences—which we call recombinant population genome construction—that solves many of the problems encountered in standard genome assembly and that can be applied in model and nonmodel organisms. Our approach takes advantage of next-generation sequencing technologies to simultaneously barcode and sequence a large number of individuals from a recombinant population. The sequences of all recombinants can be combined to create an initial de novo assembly, followed by the use of individual recombinant genotypes to correct assembly splitting/collapsing and to order and orient scaffolds within linkage groups. Recombinant population genome construction can rapidly accelerate the transformation of nonmodel species into genome-enabled systems by simultaneously producing a high-quality genome assembly and providing genomic tools (e.g., high-confidence single-nucleotide polymorphisms) for immediate applications. In populations segregating for important functional traits, this approach also enables simultaneous mapping of quantitative trait loci. We demonstrate our method using simulated Illumina data from a recombinant population of Caenorhabditis elegans and show that the method can produce a high-fidelity, high-quality genome assembly for both parents of the cross. PMID:24531727

  17. The Genomics of Colorectal Cancer: State of the Art

    PubMed Central

    Beggs, Andrew D; Hodgson, Shirley V

    2008-01-01

    The concept of the adenoma-carcinoma sequence, as first espoused by Morson et al., whereby the development of colorectal cancer depends on a stepwise progression from adenomatous polyp to carcinoma, is well documented. Initial studies of the genetics of inherited colorectal cancer susceptibility concentrated on the inherited colorectal cancer syndromes, such as Familial Adenomatous Polyposis (FAP) and Lynch Syndrome (also known as HNPCC). These syndromes, whilst easily characterisable, have a well understood sequence of genetic mutations that predispose the sufferer to developing colorectal cancer, initiated for example in FAP by the loss of the second, normal allele of the tumour suppressor APC gene. Later research has identified other inherited variants such as MUTYH (MYH) polyposis and Hyperplastic Polyposis Syndrome. Recent research has concentrated on the pathways by which colorectal adenomatous polyps not due to one of these known inherited susceptibilities undergo malignant transformation, and on determining the types of polyps most likely to do so. A further question is why individuals in certain families have a predisposition to colorectal cancer. In this article, we will discuss briefly the current state of knowledge of the genomics of the classical inherited colorectal cancer syndromes. We will also discuss in detail the genetic changes in polyps that undergo malignant transformation as well as current knowledge with regard to the epigenomic changes found in colorectal polyps. PMID:19424478

  18. Accurate whole human genome sequencing using reversible terminator chemistry.

    PubMed

    Bentley, David R; Balasubramanian, Shankar; Swerdlow, Harold P; Smith, Geoffrey P; Milton, John; Brown, Clive G; Hall, Kevin P; Evers, Dirk J; Barnes, Colin L; Bignell, Helen R; Boutell, Jonathan M; Bryant, Jason; Carter, Richard J; Keira Cheetham, R; Cox, Anthony J; Ellis, Darren J; Flatbush, Michael R; Gormley, Niall A; Humphray, Sean J; Irving, Leslie J; Karbelashvili, Mirian S; Kirk, Scott M; Li, Heng; Liu, Xiaohai; Maisinger, Klaus S; Murray, Lisa J; Obradovic, Bojan; Ost, Tobias; Parkinson, Michael L; Pratt, Mark R; Rasolonjatovo, Isabelle M J; Reed, Mark T; Rigatti, Roberto; Rodighiero, Chiara; Ross, Mark T; Sabot, Andrea; Sankar, Subramanian V; Scally, Aylwyn; Schroth, Gary P; Smith, Mark E; Smith, Vincent P; Spiridou, Anastassia; Torrance, Peta E; Tzonev, Svilen S; Vermaas, Eric H; Walter, Klaudia; Wu, Xiaolin; Zhang, Lu; Alam, Mohammed D; Anastasi, Carole; Aniebo, Ify C; Bailey, David M D; Bancarz, Iain R; Banerjee, Saibal; Barbour, Selena G; Baybayan, Primo A; Benoit, Vincent A; Benson, Kevin F; Bevis, Claire; Black, Phillip J; Boodhun, Asha; Brennan, Joe S; Bridgham, John A; Brown, Rob C; Brown, Andrew A; Buermann, Dale H; Bundu, Abass A; Burrows, James C; Carter, Nigel P; Castillo, Nestor; Chiara E Catenazzi, Maria; Chang, Simon; Neil Cooley, R; Crake, Natasha R; Dada, Olubunmi O; Diakoumakos, Konstantinos D; Dominguez-Fernandez, Belen; Earnshaw, David J; Egbujor, Ugonna C; Elmore, David W; Etchin, Sergey S; Ewan, Mark R; Fedurco, Milan; Fraser, Louise J; Fuentes Fajardo, Karin V; Scott Furey, W; George, David; Gietzen, Kimberley J; Goddard, Colin P; Golda, George S; Granieri, Philip A; Green, David E; Gustafson, David L; Hansen, Nancy F; Harnish, Kevin; Haudenschild, Christian D; Heyer, Narinder I; Hims, Matthew M; Ho, Johnny T; Horgan, Adrian M; Hoschler, Katya; Hurwitz, Steve; Ivanov, Denis V; Johnson, Maria Q; James, Terena; Huw Jones, T A; Kang, Gyoung-Dong; Kerelska, Tzvetana H; Kersey, Alan D; Khrebtukova, Irina; Kindwall, Alex P; Kingsbury, 
Zoya; Kokko-Gonzales, Paula I; Kumar, Anil; Laurent, Marc A; Lawley, Cynthia T; Lee, Sarah E; Lee, Xavier; Liao, Arnold K; Loch, Jennifer A; Lok, Mitch; Luo, Shujun; Mammen, Radhika M; Martin, John W; McCauley, Patrick G; McNitt, Paul; Mehta, Parul; Moon, Keith W; Mullens, Joe W; Newington, Taksina; Ning, Zemin; Ling Ng, Bee; Novo, Sonia M; O'Neill, Michael J; Osborne, Mark A; Osnowski, Andrew; Ostadan, Omead; Paraschos, Lambros L; Pickering, Lea; Pike, Andrew C; Pike, Alger C; Chris Pinkard, D; Pliskin, Daniel P; Podhasky, Joe; Quijano, Victor J; Raczy, Come; Rae, Vicki H; Rawlings, Stephen R; Chiva Rodriguez, Ana; Roe, Phyllida M; Rogers, John; Rogert Bacigalupo, Maria C; Romanov, Nikolai; Romieu, Anthony; Roth, Rithy K; Rourke, Natalie J; Ruediger, Silke T; Rusman, Eli; Sanches-Kuiper, Raquel M; Schenker, Martin R; Seoane, Josefina M; Shaw, Richard J; Shiver, Mitch K; Short, Steven W; Sizto, Ning L; Sluis, Johannes P; Smith, Melanie A; Ernest Sohna Sohna, Jean; Spence, Eric J; Stevens, Kim; Sutton, Neil; Szajkowski, Lukasz; Tregidgo, Carolyn L; Turcatti, Gerardo; Vandevondele, Stephanie; Verhovsky, Yuli; Virk, Selene M; Wakelin, Suzanne; Walcott, Gregory C; Wang, Jingwen; Worsley, Graham J; Yan, Juying; Yau, Ling; Zuerlein, Mike; Rogers, Jane; Mullikin, James C; Hurles, Matthew E; McCooke, Nick J; West, John S; Oaks, Frank L; Lundberg, Peter L; Klenerman, David; Durbin, Richard; Smith, Anthony J

    2008-11-01

    DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally used long (400-800 base pair) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intraspecies genetic variation. Here we report an approach that generates several billion bases of accurate nucleotide sequence per experiment at low cost. Single molecules of DNA are attached to a flat surface, amplified in situ and used as templates for synthetic sequencing with fluorescent reversible terminator deoxyribonucleotides. Images of the surface are analysed to generate high-quality sequence. We demonstrate application of this approach to human genome sequencing on flow-sorted X chromosomes and then scale the approach to determine the genome sequence of a male Yoruba from Ibadan, Nigeria. We build an accurate consensus sequence from >30x average depth of paired 35-base reads. We characterize four million single-nucleotide polymorphisms and four hundred thousand structural variants, many of which were previously unknown. Our approach is effective for accurate, rapid and economical whole-genome re-sequencing and many other biomedical applications. PMID:18987734
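    The ">30x average depth" figure in the abstract above follows from simple arithmetic: average depth is the total number of sequenced bases divided by the genome size. The sketch below illustrates that relationship only. The 3,000,000,000-base genome size is a round-number assumption for illustration, not a value taken from the paper, and the function names are hypothetical.

```python
def mean_depth(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average depth = total sequenced bases / genome size."""
    return num_reads * read_length / genome_size


def reads_needed(depth: int, read_length: int, genome_size: int) -> int:
    """Smallest number of reads giving at least `depth`-fold average coverage."""
    total_bases = depth * genome_size
    # Integer ceiling division avoids floating-point rounding on large counts.
    return (total_bases + read_length - 1) // read_length


# Assumed round-number genome size (illustrative only, not from the abstract).
GENOME_SIZE = 3_000_000_000

n = reads_needed(depth=30, read_length=35, genome_size=GENOME_SIZE)
print(n, mean_depth(n, 35, GENOME_SIZE))
```

    Under these assumptions, 30-fold average depth with 35-base reads needs on the order of a few billion reads. Real experiments lose some reads to filtering and duplicates, which this estimate ignores.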

  19. Massive Genomic Rearrangement Acquired in a Single Catastrophic Event during Cancer Development

    PubMed Central

    Stephens, Philip J.; Greenman, Chris D.; Fu, Beiyuan; Yang, Fengtang; Bignell, Graham R.; Mudie, Laura J.; Pleasance, Erin D.; Lau, King Wai; Beare, David; Stebbings, Lucy A.; McLaren, Stuart; Lin, Meng-Lay; McBride, David J.; Varela, Ignacio; Nik-Zainal, Serena; Leroy, Catherine; Jia, Mingming; Menzies, Andrew; Butler, Adam P.; Teague, Jon W.; Quail, Michael A.; Burton, John; Swerdlow, Harold; Carter, Nigel P.; Morsberger, Laura A.; Iacobuzio-Donahue, Christine; Follows, George A.; Green, Anthony R.; Flanagan, Adrienne M.; Stratton, Michael R.; Futreal, P. Andrew; Campbell, Peter J.

    2011-01-01

    Cancer is driven by somatically acquired point mutations and chromosomal rearrangements, conventionally thought to accumulate gradually over time. Using next-generation sequencing, we characterize a phenomenon, which we term chromothripsis, whereby tens to hundreds of genomic rearrangements occur in a one-off cellular crisis. Rearrangements involving one or a few chromosomes crisscross back and forth across involved regions, generating frequent oscillations between two copy number states. These genomic hallmarks are highly improbable if rearrangements accumulate over time and instead imply that nearly all occur during a single cellular catastrophe. The stamp of chromothripsis can be seen in at least 2%–3% of all cancers, across many subtypes, and is present in ~25% of bone cancers. We find that one, or indeed more than one, cancer-causing lesion can emerge out of the genomic crisis. This phenomenon has important implications for the origins of genomic remodeling and temporal emergence of cancer. PMID:21215367

  20. Tandem repeats in complete bacterial genome sequences: sequence and structural analyses for comparative studies

    Microsoft Academic Search

    Edouard Yeramian; Henri Buc

    1999-01-01

    A series of complete bacterial genome sequences have recently become available and powerful methods have been developed for the identification of tandem repeats on a very large scale. It is thus possible to derive extensive comparative descriptions of such repeats at the level of complete genomes, as illustrated here for three different bacterial genomes: Escherichia coli, Haemophilus influenzae, and Mycobacterium

  1. Multiplex Sequencing of Seven Ocular Herpes Simplex Virus Type-1 Genomes: Phylogeny, Sequence Variability,

    E-print Network

    Craven, Mark

    . Brandt1,4 PURPOSE. Little is known about the role of sequence variation in the pathology of HSV-1 is feasible for simultaneously sequencing seven HSV-1 ocular strains. METHODS. A genome sequencer was used to sequence the HSV-1 ocular isolates TFT401, 134, CJ311, CJ360, CJ394, CJ970, and OD4, in a single lane

  2. Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae

    Microsoft Academic Search

    Hervé Tettelin; Vega Masignani; Michael J. Cieslewicz; Jonathan A. Eisen; Scott Peterson; Michael R. Wessels; Ian T. Paulsen; Karen E. Nelson; Immaculada Margarit; Timothy D. Read; Lawrence C. Madoff; Alex M. Wolf; Maureen J. Beanan; Lauren M. Brinkac; Sean C. Daugherty; Robert T. Deboy; A. Scott Durkin; James F. Kolonay; Ramana Madupu; Matthew R. Lewis; Diana Radune; Nadezhda B. Fedorova; David Scanlan; Hoda Khouri; Stephanie Mulligan; Heather A. Carty; Robin T. Cline; Susan E. van Aken; John Gill; Maria Scarselli; Marirosa Mora; Emilia T. Iacobini; Cecilia Brettoni; Giuliano Galli; Massimo Mariani; Filippo Vegni; Domenico Maione; Daniela Rinaudo; Rino Rappuoli; John L. Telford; Dennis L. Kasper; Guido Grandi; Claire M. Fraser

    2002-01-01

    The 2,160,267 bp genome sequence of Streptococcus agalactiae, the leading cause of bacterial sepsis, pneumonia, and meningitis in neonates in the U.S. and Europe, is predicted to encode 2,175 genes. Genome comparisons among S. agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes, and the other completely sequenced genomes identified genes specific to the streptococci and to S. agalactiae. These in silico analyses, combined

  3. BAC-pool 454-sequencing: A rapid and efficient approach to sequence complex tetraploid cotton genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    New and emerging next generation sequencing technologies have been promising in reducing sequencing costs, but not significantly for complex polyploid plant genomes such as cotton. Large and highly repetitive genome of G. hirsutum (~2.5GB) is less amenable and cost-intensive with traditional BAC-by...

  4. Complete genome sequence of Cellulomonas flavigena type strain (134T)

    PubMed Central

    Abt, Birte; Foster, Brian; Lapidus, Alla; Clum, Alicia; Sun, Hui; Pukall, Rüdiger; Lucas, Susan; Glavina Del Rio, Tijana; Nolan, Matt; Tice, Hope; Cheng, Jan-Fang; Pitluck, Sam; Liolios, Konstantinos; Ivanova, Natalia; Mavromatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Goodwin, Lynne; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D.; Rohde, Manfred; Göker, Markus; Woyke, Tanja; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter

    2010-01-01

    Cellulomonas flavigena (Kellerman and McBeth 1912) Bergey et al. 1923 is the type species of the genus Cellulomonas of the actinobacterial family Cellulomonadaceae. Members of the genus Cellulomonas are of special interest for their ability to degrade cellulose and hemicellulose, particularly with regard to the use of biomass as an alternative energy source. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of a member of the genus Cellulomonas, and next to the human pathogen Tropheryma whipplei the second complete genome sequence within the actinobacterial family Cellulomonadaceae. The 4,123,179 bp long single replicon genome with its 3,735 protein-coding and 53 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304688

  5. Genome sequencing and analysis of the model grass Brachypodium distachyon

    SciTech Connect

    Yang, Xiaohan [ORNL; Kalluri, Udaya C [ORNL; Tuskan, Gerald A [ORNL

    2010-01-01

    Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum genomes shows a precise history of genome evolution across a broad diversity of the grasses, and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat. The high-quality genome sequence, coupled with ease of cultivation and transformation, small size and rapid life cycle, will help Brachypodium reach its potential as an important model system for developing new energy and food crops.

  6. Complete genome sequence of Cellulomonas flavigena type strain (134T)

    SciTech Connect

    Abt, Birte [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Foster, Brian [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Clum, Alicia [U.S. Department of Energy, Joint Genome Institute; Sun, Hui [U.S. Department of Energy, Joint Genome Institute; Pukall, Rudiger [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. 
Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany

    2010-01-01

    Cellulomonas flavigena (Kellerman and McBeth 1912) Bergey et al. 1923 is the type species of the genus Cellulomonas of the actinobacterial family Cellulomonadaceae. Members of the genus Cellulomonas are of special interest for their ability to degrade cellulose and hemicellulose, particularly with regard to the use of biomass as an alternative energy source. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of a member of the genus Cellulomonas, and next to the human pathogen Tropheryma whipplei the second complete genome sequence within the actinobacterial family Cellulomonadaceae. The 4,123,179 bp long single replicon genome with its 3,735 protein-coding and 53 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  7. The Release 6 reference sequence of the Drosophila melanogaster genome.

    PubMed

    Hoskins, Roger A; Carlson, Joseph W; Wan, Kenneth H; Park, Soo; Mendez, Ivonne; Galle, Samuel E; Booth, Benjamin W; Pfeiffer, Barret D; George, Reed A; Svirskas, Robert; Krzywinski, Martin; Schein, Jacqueline; Accardo, Maria Carmela; Damia, Elisabetta; Messina, Giovanni; Méndez-Lago, María; de Pablos, Beatriz; Demakova, Olga V; Andreyeva, Evgeniya N; Boldyreva, Lidiya V; Marra, Marco; Carvalho, A Bernardo; Dimitri, Patrizio; Villasante, Alfredo; Zhimulev, Igor F; Rubin, Gerald M; Karpen, Gary H; Celniker, Susan E

    2015-03-01

    Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and completeness of this sequence continues to be important to further progress. We previously described improvement of the 117-Mb sequence in the euchromatic portion of the genome and 21 Mb in the heterochromatic portion, using a whole-genome shotgun assembly, BAC physical mapping, and clone-based finishing. Here, we report an improved reference sequence of the single-copy and middle-repetitive regions of the genome, produced using cytogenetic mapping to mitotic and polytene chromosomes, clone-based finishing and BAC fingerprint verification, ordering of scaffolds by alignment to cDNA sequences, incorporation of other map and sequence data, and validation by whole-genome optical restriction mapping. These data substantially improve the accuracy and completeness of the reference sequence and the order and orientation of sequence scaffolds into chromosome arm assemblies. Representation of the Y chromosome and other heterochromatic regions is particularly improved. The new 143.9-Mb reference sequence, designated Release 6, effectively exhausts clone-based technologies for mapping and sequencing. Highly repeat-rich regions, including large satellite blocks and functional elements such as the ribosomal RNA genes and the centromeres, are largely inaccessible to current sequencing and assembly methods and remain poorly represented. Further significant improvements will require sequencing technologies that do not depend on molecular cloning and that produce very long reads. PMID:25589440

  8. The Release 6 reference sequence of the Drosophila melanogaster genome

    PubMed Central

    Carlson, Joseph W.; Wan, Kenneth H.; Park, Soo; Mendez, Ivonne; Galle, Samuel E.; Booth, Benjamin W.; Pfeiffer, Barret D.; George, Reed A.; Svirskas, Robert; Krzywinski, Martin; Schein, Jacqueline; Accardo, Maria Carmela; Damia, Elisabetta; Messina, Giovanni; Méndez-Lago, María; de Pablos, Beatriz; Demakova, Olga V.; Andreyeva, Evgeniya N.; Boldyreva, Lidiya V.; Marra, Marco; Carvalho, A. Bernardo; Dimitri, Patrizio; Villasante, Alfredo; Zhimulev, Igor F.; Rubin, Gerald M.; Karpen, Gary H.

    2015-01-01

    Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and completeness of this sequence continues to be important to further progress. We previously described improvement of the 117-Mb sequence in the euchromatic portion of the genome and 21 Mb in the heterochromatic portion, using a whole-genome shotgun assembly, BAC physical mapping, and clone-based finishing. Here, we report an improved reference sequence of the single-copy and middle-repetitive regions of the genome, produced using cytogenetic mapping to mitotic and polytene chromosomes, clone-based finishing and BAC fingerprint verification, ordering of scaffolds by alignment to cDNA sequences, incorporation of other map and sequence data, and validation by whole-genome optical restriction mapping. These data substantially improve the accuracy and completeness of the reference sequence and the order and orientation of sequence scaffolds into chromosome arm assemblies. Representation of the Y chromosome and other heterochromatic regions is particularly improved. The new 143.9-Mb reference sequence, designated Release 6, effectively exhausts clone-based technologies for mapping and sequencing. Highly repeat-rich regions, including large satellite blocks and functional elements such as the ribosomal RNA genes and the centromeres, are largely inaccessible to current sequencing and assembly methods and remain poorly represented. Further significant improvements will require sequencing technologies that do not depend on molecular cloning and that produce very long reads. PMID:25589440

  9. A survey of tools for variant analysis of next-generation genome sequencing data

    PubMed Central

    Pabinger, Stephan; Dander, Andreas; Fischer, Maria; Snajder, Rene; Sperk, Michael; Efremova, Mirjana; Krabichler, Birgit; Speicher, Michael R.; Zschocke, Johannes

    2014-01-01

    Recent advances in genome sequencing technologies provide unprecedented opportunities to characterize individual genomic landscapes and identify mutations relevant for diagnosis and therapy. Specifically, whole-exome sequencing using next-generation sequencing (NGS) technologies is gaining popularity in the human genetics community due to the moderate costs, manageable data amounts and straightforward interpretation of analysis results. While whole-exome and, in the near future, whole-genome sequencing are becoming commodities, data analysis still poses significant challenges and led to the development of a plethora of tools supporting specific parts of the analysis workflow or providing a complete solution. Here, we surveyed 205 tools for whole-genome/whole-exome sequencing data analysis supporting five distinct analytical steps: quality assessment, alignment, variant identification, variant annotation and visualization. We report an overview of the functionality, features and specific requirements of the individual tools. We then selected 32 programs for variant identification, variant annotation and visualization, which were subjected to hands-on evaluation using four data sets: one set of exome data from two patients with a rare disease for testing identification of germline mutations, two cancer data sets for testing variant callers for somatic mutations, copy number variations and structural variations, and one semi-synthetic data set for testing identification of copy number variations. Our comprehensive survey and evaluation of NGS tools provides a valuable guideline for human geneticists working on Mendelian disorders, complex diseases and cancers. PMID:23341494

  10. Dissection of the octoploid strawberry genome by deep sequencing of the genomes of Fragaria species.

    PubMed

    Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mistuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N

    2014-01-01

    Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ~200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb with an N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species. PMID:24282021
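    The N50 length quoted in the abstract above is a standard assembly-contiguity statistic: the length L such that sequences of length L or longer together contain at least half of the total assembled bases. A minimal sketch of the usual calculation follows. The function name and the example lengths are illustrative assumptions and are not taken from the study.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first length where the cumulative sum reaches half the total.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: the running sum always reaches the total")


# Made-up lengths for illustration: total is 32, and 8 + 8 = 16 reaches half.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

    Because N50 depends only on the length distribution, it says nothing about correctness: a misassembled genome can still report a large N50.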

  11. Dissection of the Octoploid Strawberry Genome by Deep Sequencing of the Genomes of Fragaria Species

    PubMed Central

    Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mistuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N.

    2014-01-01

    Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ~200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb with an N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species. PMID:24282021

  12. Draft Genome Sequence of Aneurinibacillus migulanus Strain Nagano.

    PubMed

    Alenezi, Faizah N; Weitz, Hedda J; Belbahri, Lassaad; Ben Rebah, Hassen; Luptakova, Lenka; Jaspars, Marcel; Woodward, Stephen

    2015-01-01

    Aneurinibacillus migulanus is characterized by inhibition of growth of a range of plant-pathogenic bacteria and fungi. Here, we report the high-quality draft genome sequences of A. migulanus Nagano. PMID:25838487

  13. Genome sequence of the fish pathogen Flavobacterium columnare ATCC 49512

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Flavobacterium columnare is a Gram-negative, rod shaped, motile, and highly prevalent fish pathogen causing columnaris disease in freshwater fish worldwide. Here, we present the complete genome sequence of F. columnare strain ATCC 49512. ...

  14. Genome Sequence of Mycoplasma hyorhinis Strain DBS 1050

    PubMed Central

    Soika, Valerii; Volokhov, Dmitriy; Simonyan, Vahan; Chizhikov, Vladimir

    2014-01-01

    Mycoplasma hyorhinis is known as one of the most prevalent contaminants of mammalian cell and tissue cultures worldwide. Here, we present the complete genome sequence of the fastidious M. hyorhinis strain DBS 1050. PMID:24604646

  15. Genome Sequence of the Fish Pathogen Flavobacterium columnare ATCC 49512

    PubMed Central

    Tekedar, Hasan C.; Karsi, Attila; Gillaspy, Allison F.; Dyer, David W.; Benton, Nicole R.; Zaitshik, Jeremy; Vamenta, Stefanie; Banes, Michelle M.; Gülsoy, Nagihan; Aboko-Cole, Mary; Waldbieser, Geoffrey C.

    2012-01-01

    Flavobacterium columnare is a Gram-negative, rod-shaped, motile, and highly prevalent fish pathogen causing columnaris disease in freshwater fish worldwide. Here, we present the complete genome sequence of F. columnare strain ATCC 49512. PMID:22535941

  16. Draft Genome Sequence of Aneurinibacillus migulanus Strain Nagano

    PubMed Central

    Alenezi, Faizah N.; Weitz, Hedda J.; Ben Rebah, Hassen; Luptakova, Lenka; Jaspars, Marcel; Woodward, Stephen

    2015-01-01

    Aneurinibacillus migulanus is characterized by inhibition of growth of a range of plant-pathogenic bacteria and fungi. Here, we report the high-quality draft genome sequences of A. migulanus Nagano. PMID:25838487

  17. Operational streamlining in a high-throughput genome sequencing center

    E-print Network

    Person, Kerry P. (Kerry Patrick)

    2006-01-01

    Advances in medicine rely on accurate data that is rapidly provided. It is therefore critical for the Genome Sequencing platform of the Broad Institute of MIT and Harvard to continually strive to reduce cost, improve ...

  18. Genome Sequence of Microcystis aeruginosa Strain NIES-44.

    PubMed

    Okano, Kunihiro; Miyata, Naoyuki; Ozaki, Yasuo

    2015-01-01

    Microcystis aeruginosa is a typical algal bloom-forming cyanobacterium. This report describes the whole-genome sequence of a non-microcystin-producing strain of Microcystis aeruginosa, NIES-44, which was isolated from a Japanese lake. PMID:25792056

  19. Complete Genome Sequence of Rahnella aquatilis CIP 78.65

    SciTech Connect

    Martinez, Robert J [University of Alabama, Tuscaloosa; Bruce, David [Los Alamos National Laboratory (LANL); Detter, J C [U.S. Department of Energy, Joint Genome Institute; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Han, James [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Held, Brittany [Los Alamos National Laboratory (LANL); Land, Miriam L [ORNL; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Pennacchio, Len [U.S. Department of Energy, Joint Genome Institute; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Sobeckya, Patricia A. [University of Alabama, Tuscaloosa

    2012-01-01

    Rahnella aquatilis CIP 78.65 is a gammaproteobacterium isolated from a drinking water source in Lille, France. Here we report the complete genome sequence of Rahnella aquatilis CIP 78.65, the type strain of R. aquatilis.

  20. The genome sequence of the filamentous fungus Neurospora crassa 

    E-print Network

    Read, Nick D; et al

    2003-04-24

    Neurospora crassa is a central organism in the history of twentieth-century genetics, biochemistry and molecular biology. Here, we report a high-quality draft sequence of the N. crassa genome. The approximately 40-megabase ...

  1. The genome sequence of the filamentous fungus Neurospora crassa

    E-print Network

    Kellis, Manolis

    The genome sequence of the filamentous fungus Neurospora crassa. James E. Galagan, Sarah E. Calvo ... is a multicellular filamentous fungus, it has also provided a system to study cellular differentiation ...

  2. Fulfilling the Promise of a Sequenced Human Genome – Part II

    SciTech Connect

    Green, Eric [National Human Genome Research Institute]

    2009-05-27

    Eric Green, scientific director of the National Human Genome Research Institute (NHGRI), gives the opening keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM on May 27, 2009. Part 2 of 2

  3. Fulfilling the Promise of a Sequenced Human Genome – Part I

    SciTech Connect

    Green, Eric [National Human Genome Research Institute]

    2009-05-27

    Eric Green, scientific director of the National Human Genome Research Institute (NHGRI), gives the opening keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM on May 27, 2009. Part 1 of 2

  4. Initial genome sequencing and analysis of multiple myeloma

    E-print Network

    Lander, Eric S.

    Multiple myeloma is an incurable malignancy of plasma cells, and its pathogenesis is poorly understood. Here we report the massively parallel sequencing of 38 tumour genomes and their comparison to matched normal DNAs. ...

  5. Genome sequence of vanilla distortion mosaic virus infecting Coriandrum sativum.

    PubMed

    Adams, I P; Rai, S; Deka, M; Harju, V; Hodges, T; Hayward, G; Skelton, A; Fox, A; Boonham, N

    2014-12-01

    The 9573-nucleotide genome of a potyvirus was sequenced from a Coriandrum sativum plant from India with viral symptoms. On analysis, this virus was shown to have greater than 85 % nucleotide sequence identity to vanilla distortion mosaic virus (VDMV). Analysis of the putative coat protein sequence confirmed that this virus was in fact VDMV, with greater than 91 % amino acid sequence identity. The genome appears to encode a 3083-amino-acid polyprotein potentially cleaved into the 10 mature proteins expected in potyviruses. Phylogenetic analysis confirmed that VDMV is a distinct but ungrouped member of the genus Potyvirus. PMID:25252813

  6. A non-radioactive multiprime sequencing method for HIV genomes

    Microsoft Academic Search

    Jutta Huber; Wolfgang Hell; Hans Wolf

    1995-01-01

    A manual non-radioactive DNA sequencing protocol was developed for rapid analysis of variable HIV-1 genomes. Sets of up to ten primers were used in one sequencing reaction. After polyacrylamide gel electrophoresis and blotting onto nylon membranes the individual sequences were detected by hybridization with digoxigenin-labelled oligonucleotides and chemiluminescence. The method is applicable to any sequencing project where numerous variants of

  7. Intra-species sequence comparisons for annotating genomes

    SciTech Connect

    Boffelli, Dario; Weer, Claire V.; Weng, Li; Lewis, Keith D.; Shoukry, Malak I.; Pachter, Lior; Keys, David N.; Rubin, Edward M.

    2004-07-15

    Analysis of sequence variation among members of a single species offers a potential approach to identify functional DNA elements responsible for biological features unique to that species. Due to its high rate of allelic polymorphism and ease of genetic manipulability, we chose the sea squirt, Ciona intestinalis, to explore intra-species sequence comparisons for genome annotation. A large number of C. intestinalis specimens were collected from four continents and a set of genomic intervals amplified, resequenced and analyzed to determine the mutation rates at each nucleotide in the sequence. We found that regions with low mutation rates efficiently demarcated functionally constrained sequences: these include a set of noncoding elements, which we showed in C. intestinalis transgenic assays to act as tissue-specific enhancers, as well as the location of coding sequences. This illustrates that comparisons of multiple members of a species can be used for genome annotation, suggesting a path for the annotation of the sequenced genomes of organisms occupying uncharacterized phylogenetic branches of the animal kingdom and raises the possibility that the resequencing of a large number of Homo sapiens individuals might be used to annotate the human genome and identify sequences defining traits unique to our species. The sequence data from this study has been submitted to GenBank under accession nos. AY667278-AY667407.

  8. Genomic sequence analysis and characterization of Sneathia amnii sp. nov

    PubMed Central

    2012-01-01

    Background: Bacteria of the genus Sneathia are emerging as potential pathogens of the female reproductive tract. Species of Sneathia, which were formerly grouped with Leptotrichia, can be part of the normal microbiota of the genitourinary tracts of men and women, but they are also associated with a variety of clinical conditions including bacterial vaginosis, preeclampsia, preterm labor, spontaneous abortion, post-partum bacteremia and other invasive infections. Sneathia species also exhibit a significant correlation with sexually transmitted diseases and cervical cancer. Because Sneathia species are fastidious and rarely cultured successfully in vitro, and the genomes of members of the genus had until now not been characterized, very little is known about the physiology or the virulence of these organisms. Results: Here, we describe a novel species, Sneathia amnii sp. nov., which closely resembles bacteria previously designated "Leptotrichia amnionii". As part of the Vaginal Human Microbiome Project at VCU, a vaginal isolate of S. amnii sp. nov. was identified, successfully cultured and bacteriologically cloned. The biochemical characteristics and virulence properties of the organism were examined in vitro, and the genome of the organism was sequenced, annotated and analyzed. The analysis revealed a reduced circular genome of ~1.34 Mbp, containing ~1,282 protein-coding genes. Metabolic reconstruction of the bacterium reflected its biochemical phenotype, and several genes potentially associated with pathogenicity were identified. Conclusions: Bacteria with complex growth requirements frequently remain poorly characterized and, as a consequence, their roles in health and disease are unclear. Elucidation of the physiology and identification of genes putatively involved in the metabolism and virulence of S. amnii may lead to a better understanding of the role of this potential pathogen in bacterial vaginosis, preterm birth, and other issues associated with vaginal and reproductive health. PMID:23281612

  9. Targeted deep resequencing of the human cancer genome using next-generation technologies

    PubMed Central

    MYLLYKANGAS, SAMUEL; JI, HANLEE P.

    2015-01-01

    Next generation sequencing technologies have revolutionized our ability to identify genetic variants, either germline or somatic point mutations, that occur in cancer. Parallelization and miniaturization of DNA sequencing enables massive data throughput and for the first time, large-scale, base pair resolution views of cancer genomes can be achieved. Systematic, large-scale sequencing surveys have revealed that the genetic spectrum of mutations in cancers appears to be highly complex with numerous low frequency bystander somatic variations, and a limited number of common, frequently mutated genes. Large sample sizes and deeper resequencing are much needed in resolving clinical and biological relevance of the mutations as well as in detecting somatic variants in heterogeneous samples and cancer cell sub-populations. However, even with the next generation sequencing technologies, the overwhelming size of the human genome and need for very high fold coverage represents a major challenge for up-scaling cancer genome sequencing projects. Assays to target, capture, enrich or partition disease-specific regions of the genome offer immediate solutions for reducing the complexity of the sequencing libraries. Integration of targeted DNA capture assays and next-generation deep resequencing improves the ability to identify clinically and biologically relevant mutations. PMID:21415896

  10. Alfresco---A Workbench for Comparative Genomic Sequence Analysis

    Microsoft Academic Search

    Niclas Jareborg; Richard Durbin

    2000-01-01

    Comparative analysis of genomic sequences provides a powerful tool for identifying regions of potential biologic function; by comparing corresponding regions of genomes from suitable species, protein coding or regulatory regions can be identified by their homology. This requires the use of several specific types of computational analysis tools. Many programs exist for these types of analysis; not many exist for

  11. A Cryptographic Approach to Securely Share and Query Genomic Sequences

    Microsoft Academic Search

    Murat Kantarcioglu; Ying Liu; Bradley Malin

    2008-01-01

    To support large-scale biomedical research projects, organizations need to share person-specific genomic sequences without violating the privacy of their data subjects. In the past, organizations protected subjects' identities by removing identifiers, such as name and social security number; however, recent investigations illustrate that deidentified genomic data can be "reidentified" to named individuals using simple automated methods. In this paper, we

  12. Draft Genome Sequence of the Sexually Transmitted Pathogen Trichomonas vaginalis

    Microsoft Academic Search

    J. M. Carlton; R. P. Hirt; J. C. Silva; A. L. Delcher; Michael Schatz; Qi Zhao; J. R. Wortman; S. L. Bidwell; U. C. M. Alsmark; Sébastien Besteiro; Thomas Sicheritz-Ponten; C. J. Noel; J. B. Dacks; P. G. Foster; Cedric Simillion; Y. Van de Peer; Diego Miranda-Saavedra; G. J. Barton; G. D. Westrop; S. Muller; Daniele Dessi; P. L. Fiori; Qinghu Ren; Ian Paulsen; Hanbang Zhang; F. D. Bastida-Corcuera; Augusto Simoes-Barbosa; M. T. Brown; R. D. Hayes; Mandira Mukherjee; C. Y. Okumura; Rachel Schneider; A. J. Smith; Stepanka Vanacova; Maria Villalvazo; B. J. Haas; Mihaela Pertea; Tamara V. Feldblyum; T. R. Utterback; Chung-Li Shu; Kazutoyo Osoegawa; P. J. de Jong; Ivan Hrdy; Lenka Horvathova; Zuzana Zubacova; Pavel Dolezal; Shehre-Banoo Malik; J. M. Logsdon; Katrin Henze; Arti Gupta; Ching C. Wang; R. L. Dunne; J. A. Upcroft; Peter Upcroft; Owen White; S. L. Salzberg; Petrus Tang; Cheng-Hsun Chiu; Ying-Shiung Lee; T. M. Embley; G. H. Coombs; J. C. Mottram; Jan Tachezy; C. M. Fraser-Liggett; P. J. Johnson

    2007-01-01

    We describe the genome sequence of the protist Trichomonas vaginalis, a sexually transmitted human pathogen. Repeats and transposable elements comprise about two-thirds of the ~160-megabase genome, reflecting a recent massive expansion of genetic material. This expansion, in conjunction with the shaping of metabolic pathways that likely transpired through lateral gene transfer from bacteria, and amplification of specific gene families implicated

  13. Genomic sequence for the aflatoxigenic filamentous fungus Aspergillus nomius

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genome of the A. nomius type strain was sequenced using a personal genome machine. Annotation of the genes was undertaken, followed by gene ontology and an investigation into the number of secondary metabolite clusters. Comparative studies with other Aspergillus species involved shared/unique ge...

  14. Draft Genome Sequence of Mycobacterium austroafricanum DSM 44191

    PubMed Central

    Croce, Olivier; Robert, Catherine; Raoult, Didier

    2014-01-01

    We announce the draft genome sequence of Mycobacterium austroafricanum DSM 44191T (= E9789-SA12441T), a non-tuberculosis species responsible for opportunistic infection. The genome described here has a size of 6,772,357 bp with a G+C content of 66.79% and contains 6,419 protein-coding genes and 112 RNA genes. PMID:24744336

  15. Draft Genome Sequence of Enterobacter cloacae Strain JD6301

    PubMed Central

    Wilson, Jessica G.; French, William T.; Lipzen, Anna; Martin, Joel; Schackwitz, Wendy; Woyke, Tanja; Shapiro, Nicole; Bullard, James W.; Champlin, Franklin R.

    2014-01-01

    Enterobacter cloacae strain JD6301 was isolated from a mixed culture with wastewater collected from a municipal treatment facility and oleaginous microorganisms. A draft genome sequence of this organism indicates that it has a genome size of 4,772,910 bp, an average G+C content of 53%, and 4,509 protein-coding genes. PMID:24874669

  16. Draft genome sequences of 10 strains of the genus Exiguobacterium.

    PubMed

    Vishnivetskaya, Tatiana A; Chauhan, Archana; Layton, Alice C; Pfiffner, Susan M; Huntemann, Marcel; Copeland, Alex; Chen, Amy; Kyrpides, Nikos C; Markowitz, Victor M; Palaniappan, Krishna; Ivanova, Natalia; Mikhailova, Natalia; Ovchinnikova, Galina; Andersen, Evan W; Pati, Amrita; Stamatis, Dimitrios; Reddy, T B K; Shapiro, Nicole; Nordberg, Henrik P; Cantor, Michael N; Hua, X Susan; Woyke, Tanja

    2014-01-01

    High-quality draft genome sequences were determined for 10 Exiguobacterium strains in order to provide insight into their evolutionary strategies for speciation and environmental adaptation. The selected genomes include psychrotrophic and thermophilic species from a range of habitats, which will allow for a comparison of metabolic pathways and stress response genes. PMID:25323723

  17. Draft Genome Sequences of 10 Strains of the Genus Exiguobacterium

    PubMed Central

    Chauhan, Archana; Layton, Alice C.; Pfiffner, Susan M.; Huntemann, Marcel; Copeland, Alex; Chen, Amy; Kyrpides, Nikos C.; Markowitz, Victor M.; Palaniappan, Krishna; Ivanova, Natalia; Mikhailova, Natalia; Ovchinnikova, Galina; Andersen, Evan W.; Pati, Amrita; Stamatis, Dimitrios; Reddy, T. B. K.; Shapiro, Nicole; Nordberg, Henrik P.; Cantor, Michael N.; Hua, X. Susan; Woyke, Tanja

    2014-01-01

    High-quality draft genome sequences were determined for 10 Exiguobacterium strains in order to provide insight into their evolutionary strategies for speciation and environmental adaptation. The selected genomes include psychrotrophic and thermophilic species from a range of habitats, which will allow for a comparison of metabolic pathways and stress response genes. PMID:25323723

  18. Draft Genome Sequence of Entomopathogenic Serratia liquefaciens Strain FK01

    PubMed Central

    Taira, Erika; Mon, Hiroaki; Mori, Kazuki; Akasaka, Taiki; Tashiro, Kousuke; Yasunaga-Aoki, Chisa; Lee, Jae Man; Kusakabe, Takahiro

    2014-01-01

    In the present study, we determined the draft genome sequence of the entomopathogenic bacterium Serratia liquefaciens FK01, which is highly virulent to the silkworm. The draft genome is ~5.28 Mb in size, and the G+C content is 55.8%. PMID:24970828

  19. Genome Sequence of Xanthomonas citri pv. mangiferaeindicae Strain LMG 941

    PubMed Central

    Midha, Samriti; Ranjan, Manish; Sharma, Vikas; Pinnaka, Anil Kumar

    2012-01-01

    We report the 5.1-Mb genome sequence of Xanthomonas citri pv. mangiferaeindicae strain LMG 941, the causal agent of bacterial black spot in mango. Apart from evolutionary studies, the draft genome will be a valuable resource for the epidemiological studies and quarantine of this phytopathogen. PMID:22582385

  20. The genome sequence and structure of rice chromosome 1

    Microsoft Academic Search

    Takuji Sasaki; Takashi Matsumoto; Kimiko Yamamoto; Katsumi Sakata; Tomoya Baba; Yuichi Katayose; Jianzhong Wu; Yoshihito Niimura; Zhukuan Cheng; Yoshiaki Nagamura; Baltazar A. Antonio; Hiroyuki Kanamori; Satomi Hosokawa; Masatoshi Masukawa; Koji Arikawa; Yoshino Chiden; Mika Hayashi; Masako Okamoto; Tsuyu Ando; Hiroyoshi Aoki; Kohei Arita; Masao Hamada; Chizuko Harada; Saori Hijishita; Mikiko Honda; Yoko Ichikawa; Atsuko Idonuma; Masumi Iijima; Michiko Ikeda; Maiko Ikeno; Sachie Ito; Tomoko Ito; Yuichi Ito; Yukiyo Ito; Aki Iwabuchi; Kozue Kamiya; Wataru Karasawa; Satoshi Katagiri; Ari Kikuta; Noriko Kobayashi; Izumi Kono; Kayo Machita; Tomoko Maehara; Hiroshi Mizuno; Tatsumi Mizubayashi; Yoshiyuki Mukai; Hideki Nagasaki; Marina Nakashima; Yuko Nakama; Yumi Nakamichi; Mari Nakamura; Nobukazu Namiki; Manami Negishi; Isamu Ohta; Nozomi Ono; Shoko Saji; Kumiko Sakai; Michie Shibata; Takanori Shimokawa; Ayahiko Shomura; Jianyu Song; Yuka Takazaki; Kimihiro Terasawa; Kumiko Tsuji; Kazunori Waki; Harumi Yamagata; Hiroko Yamane; Shoji Yoshiki; Rie Yoshihara; Kazuko Yukawa; Huisun Zhong; Hisakazu Iwama; Toshinori Endo; Hidetaka Ito; Jang Ho Hahn; Ho-Il Kim; Moo-Young Eun; Masahiro Yano; Jiming Jiang; Takashi Gojobori

    2002-01-01

    The rice species Oryza sativa is considered to be a model plant because of its small genome size, extensive genetic map, relative ease of transformation and synteny with other cereal crops. Here we report the essentially complete sequence of chromosome 1, the longest chromosome in the rice genome. We summarize characteristics of the chromosome structure and the biological insight gained

  1. Draft Genome Sequence of Necropsobacter rosorum Strain P709T

    PubMed Central

    Padmanabhan, Roshan; Robert, Catherine; Fenollar, Florence; Raoult, Didier

    2014-01-01

    Necropsobacter is a recently described genus that contains a single species, N. rosorum, and belongs to the family Pasteurellaceae. Here, we present the draft genome of N. rosorum strain P709T, which is the first genome sequence from this species. PMID:25301642

  2. Draft Genome Sequence of "Candidatus Liberibacter asiaticus" from California.

    PubMed

    Zheng, Z; Deng, X; Chen, J

    2014-01-01

    We report here the draft genome sequence of "Candidatus Liberibacter asiaticus" strain HHCA, collected from a lemon tree in California. The HHCA strain has a genome size of 1,150,620 bp, 36.5% G+C content, 1,119 predicted open reading frames, and 51 RNA genes. PMID:25278540

  3. Draft Genome Sequence of “Candidatus Liberibacter asiaticus” from California

    PubMed Central

    Zheng, Z.

    2014-01-01

    We report here the draft genome sequence of “Candidatus Liberibacter asiaticus” strain HHCA, collected from a lemon tree in California. The HHCA strain has a genome size of 1,150,620 bp, 36.5% G+C content, 1,119 predicted open reading frames, and 51 RNA genes. PMID:25278540

  4. Genome sequence of the cultivated cotton Gossypium arboreum

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cotton is one of the most economically important natural fiber crops in the world, and the complex tetraploid nature of its genome (AADD, 2n = 52) makes genetic, genomic and functional analyses extremely challenging. Here we sequenced and assembled 98.3% of the 1.7-gigabase G. arboreum (AA, 2n = 26...

  5. The tomato genome sequence provides insight into fleshy fruit evolution

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genome of the inbred tomato cultivar ‘Heinz 1706’ was sequenced and assembled using a combination of Sanger and “next generation” technologies. The predicted genome size is ~900 Mb, consistent with prior estimates, of which 760 Mb were assembled in 91 scaffolds aligned to the 12 tomato chromosom...

  6. Complete Genome Sequence of Marinobacter sp. BSs20148.

    PubMed

    Song, Lai; Ren, Lufeng; Li, Xingang; Yu, Dan; Yu, Yong; Wang, Xumin; Liu, Guiming

    2013-01-01

    Marinobacter sp. BSs20148 was isolated from marine sediment collected from the Arctic Ocean at a water depth of 3,800 m. Here we report the complete genome sequence of Marinobacter sp. BSs20148. This genomic information will facilitate the study of the physiological metabolism, ecological roles, and evolution of the Marinobacter species. PMID:23682144

  7. Tandem Clusters of Membrane Proteins in Complete Genome Sequences

    E-print Network

    Kihara, Daisuke

    ... of genes coding for membrane proteins was investigated in 16 complete genomes: 4 archaea, 11 bacteria ... of isolated ATP-binding protein components in the ABC transporters. Possible implications of tandem cluster ... Tandem Clusters of Membrane Proteins in Complete Genome Sequences. Daisuke Kihara and Minoru ...

  8. Draft Genome Sequence of Highly Nematicidal Bacillus thuringiensis DB27

    PubMed Central

    Corton, Craig; Pickard, Derek J.; Dougan, Gordon

    2014-01-01

    Here, we report the genome sequence of nematicidal Bacillus thuringiensis DB27, which provides first insights into the genetic determinants of its pathogenicity to nematodes. The genome consists of a 5.7-Mb chromosome and seven plasmids, three of which contain genes encoding nematicidal proteins. PMID:24558243

  9. Draft Genome Sequence of Highly Nematicidal Bacillus thuringiensis DB27.

    PubMed

    Iatsenko, Igor; Corton, Craig; Pickard, Derek J; Dougan, Gordon; Sommer, Ralf J

    2014-01-01

    Here, we report the genome sequence of nematicidal Bacillus thuringiensis DB27, which provides first insights into the genetic determinants of its pathogenicity to nematodes. The genome consists of a 5.7-Mb chromosome and seven plasmids, three of which contain genes encoding nematicidal proteins. PMID:24558243

  10. Genome Sequence of a Thermophilic Bacillus, Geobacillus thermodenitrificans DSM465

    PubMed Central

    Yao, Nana; Ren, Yi

    2013-01-01

    Geobacillus thermodenitrificans NG80-2 encodes a LadA-mediated alkane degradation pathway, while G. thermodenitrificans DSM465 cannot utilize alkanes. Here, we report the draft genome sequence of G. thermodenitrificans DSM465, which may help reveal the genomic differences between these two strains with regard to the biodegradation of alkanes. PMID:24336381

  11. Complete Genome Sequence of Pronghorn Virus, a Pestivirus

    PubMed Central

    Ridpath, Julia F.; Fischer, Nicole; Grundhoff, Adam; Postel, Alexander; Becher, Paul

    2014-01-01

    The complete genome sequence of pronghorn virus, a member of the Pestivirus genus of the family Flaviviridae, was determined here. The virus, originally isolated from a pronghorn antelope, has a genome of 12,273 nucleotides, with a single open reading frame of 11,694 bases encoding 3,897 amino acids. PMID:24926058

  12. Complete Genome Sequence of the Soil Actinomycete Kocuria rhizophila

    Microsoft Academic Search

    Hiromi Takarada; Mitsuo Sekine; Hiroki Kosugi; Yasunori Matsuo; Takatomo Fujisawa; Seiha Omata; Emi Kishi; Ai Shimizu; Naofumi Tsukatani; Satoshi Tanikawa; Nobuyuki Fujita; Shigeaki Harayama

    2008-01-01

    The soil actinomycete Kocuria rhizophila belongs to the suborder Micrococcineae, a divergent bacterial group for which only a limited amount of genomic information is currently available. K. rhizophila is also important in industrial applications; e.g., it is commonly used as a standard quality control strain for antimicrobial susceptibility testing. Sequencing and annotation of the genome of K. rhizophila DC2201 (NBRC

  13. Complete genome sequence of pronghorn virus, a pestivirus.

    PubMed

    Neill, John D; Ridpath, Julia F; Fischer, Nicole; Grundhoff, Adam; Postel, Alexander; Becher, Paul

    2014-01-01

    The complete genome sequence of pronghorn virus, a member of the Pestivirus genus of the family Flaviviridae, was determined here. The virus, originally isolated from a pronghorn antelope, has a genome of 12,273 nucleotides, with a single open reading frame of 11,694 bases encoding 3,897 amino acids. PMID:24926058

  14. Complete genome sequence of Pronghorn Virus, a Pestivirus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complete genome sequence of Pronghorn virus, a member of the Pestivirus genus of the Flaviviridae, was determined. The virus, originally isolated from a pronghorn antelope, had a genome of 12,287 nucleotides with a single open reading frame of 11,694 bases encoding 3898 amino acids....

  15. The genome sequence of the rice blast fungus Magnaporthe grisea

    Microsoft Academic Search

    Ralph A. Dean; Nicholas J. Talbot; Daniel J. Ebbole; Mark L. Farman; Thomas K. Mitchell; Marc J. Orbach; Michael Thon; Resham Kulkarni; Jin-Rong Xu; Huaqin Pan; Nick D. Read; Yong-Hwan Lee; Ignazio Carbone; Doug Brown; Yeon Yee Oh; Nicole Donofrio; Jun Seop Jeong; Darren M. Soanes; Slavica Djonovic; Elena Kolomiets; Cathryn Rehmeyer; Weixi Li; Michael Harding; Soonok Kim; Marc-Henri Lebrun; Heidi Bohnert; Sean Coughlan; Jonathan Butler; Sarah Calvo; Li-Jun Ma; Robert Nicol; Seth Purcell; Chad Nusbaum; James E. Galagan; Bruce W. Birren

    2005-01-01

    Magnaporthe grisea is the most destructive pathogen of rice worldwide and the principal model organism for elucidating the molecular basis of fungal disease of plants. Here, we report the draft sequence of the M. grisea genome. Analysis of the gene set provides an insight into the adaptations required by a fungus to cause disease. The genome encodes a large and

  16. Draft Genome Sequence of Pseudomonas sp. nov. H2

    PubMed Central

    Loftie-Eaton, Wesley; Suzuki, Haruo; Bashford, Kelsie; Heuer, Holger; Stragier, Pieter; De Vos, Paul; Settles, Matthew L.

    2015-01-01

    We report the draft genome sequence of Pseudomonas sp. nov. H2, isolated from creek sediment in Moscow, ID, USA. The strain is most closely related to Pseudomonas putida. However, it has a slightly smaller genome that appears to have been impacted by horizontal gene transfer and poorly maintains IncP-1 plasmids. PMID:25838493

  17. Genome Sequence of the Asiatic Species Borrelia persica

    PubMed Central

    Elbir, Haitham; Larsson, Pär; Normark, Johan; Upreti, Mukunda; Korenberg, Edward; Larsson, Christer

    2014-01-01

    We report the complete genome sequence of Borrelia persica, the causative agent of tick-borne relapsing fever borreliosis on the Asian continent. Its genome of 1,784,979 bp contains 1,850 open reading frames, three ribosomal RNAs, and 32 tRNAs. One clustered regularly interspaced short palindromic repeat (CRISPR) was detected. PMID:24407639

  18. Whole genome sequence of “Candidatus Liberibacter asiaticus” from Guangdong, China

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The draft genome sequence of “Candidatus Liberibacter asiaticus” strain A4, isolated from a mandarin citrus in Guangdong, P. R. China, is reported. The A4 strain has a genome size of 1,208,625 bp, G+C content of 36.4%, 1,107 predicted open reading frames, and 53 RNA genes....

  19. Genomic and small RNA sequencing of ...

    E-print Network

    Green, Pamela

    ... of sorghum as a reference genome sequence for Andropogoneae grasses. Kankshita Swaminathan, Magdy ... origins of Mxg, and suggest that while the repeat content of Mxg differs from sorghum, the sorghum genome ... Included within the Andropogoneae are major crops such as maize, Sorghum bicolor (sorghum), sugarcane ...

  20. A snapshot of the emerging tomato genome sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genome of tomato (Solanum lycopersicum) is being sequenced by an international consortium of 10 countries (Korea, China, the United Kingdom, India, the Netherlands, France, Japan, Spain, Italy and the United States) as part of a larger initiative called the ‘International Solanaceae Genome Proje...

  1. MAIZE CHLOROTIC DWARF VIRUS GENOME SEQUENCE AND POLYPROTEIN CLEAVAGE

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genomic sequence (11.8 kb) of the severe Ohio Maize chlorotic dwarf virus isolate (MCDV-S, genus Waikavirus) was determined from overlapping cDNA clones. The approximately 400-kDa polyprotein encoded by the viral genome is post-translationally cleaved into several smaller functional proteins. Wher...

  2. Genome Sequence of the Biocontrol Strain Pseudomonas fluorescens F113

    PubMed Central

    Redondo-Nieto, Miguel; Barret, Matthieu; Morrisey, John P.; Germaine, Kieran; Martínez-Granero, Francisco; Barahona, Emma; Navazo, Ana; Sánchez-Contreras, María; Moynihan, Jennifer A.; Giddens, Stephen R.; Coppoolse, Eric R.; Muriel, Candela; Stiekema, Willem J.; Rainey, Paul B.; Dowling, David; O'Gara, Fergal; Martín, Marta

    2012-01-01

    Pseudomonas fluorescens F113 is a plant growth-promoting rhizobacterium (PGPR) that has biocontrol activity against fungal plant pathogens and is a model for rhizosphere colonization. Here, we present its complete genome sequence, which shows that besides a core genome very similar to those of other strains sequenced within this species, F113 possesses a wide array of genes encoding specialized functions for thriving in the rhizosphere and interacting with eukaryotic organisms. PMID:22328765

  3. Completion of the Porcine Epidemic Diarrhoea Coronavirus (PEDV) Genome Sequence

    Microsoft Academic Search

    Rolf Kocherhans; Anne Bridgen; Mathias Ackermann; Kurt Tobler

    2001-01-01

    The sequence of the replicase gene of porcine epidemic diarrhoea virus (PEDV) has been determined. This completes the sequence of the entire genome of strain CV777, which was found to be 28,033 nucleotides (nt) in length (excluding the poly A-tail). A cloning strategy, which involves primers based on conserved regions in the predicted ORF1 products from other coronaviruses whose genome

  4. The Genome Sequence of the SARS-Associated Coronavirus

    Microsoft Academic Search

    Marco A. Marra; Steven J. M. Jones; Caroline R. Astell; Robert A. Holt; Angela Brooks-Wilson; Yaron S. N. Butterfield; Jaswinder Khattra; Jennifer K. Asano; Sarah A. Barber; Susanna Y. Chan; Alison Cloutier; Shaun M. Coughlin; Doug Freeman; Noreen Girn; Obi L. Griffith; Stephen R. Leach; Michael Mayo; Helen McDonald; Stephen B. Montgomery; Pawan K. Pandoh; Anca S. Petrescu; A. Gordon Robertson; Jacqueline E. Schein; Asim Siddiqui; Duane E. Smailus; Jeff M. Stott; George S. Yang; Francis Plummer; Anton Andonov; Harvey Artsob; Nathalie Bastien; Kathy Bernard; Timothy F. Booth; Donnie Bowness; Michael Drebot; Lisa Fernando; Ramon Flick; Michael Garbutt; Michael Garbutt; Allen Grolla; Heinz Feldmann; Adrienne Meyers; Amin Kabani; Yan Li; Susan Normand; Ute Stroher; Graham A. Tipples; Shaun Tyler; Robert Vogrig; Diane Ward; Robert C. Brunham; Mel Krajden; Martin Petric; Danuta M. Skowronski; Chris Upton; Rachel L. Roper

    2003-01-01

    We sequenced the 29,751-base genome of the severe acute respiratory syndrome (SARS)-associated coronavirus known as the Tor2 isolate. The genome sequence reveals that this coronavirus is only moderately related to other known coronaviruses, including two human coronaviruses, HCoV-OC43 and HCoV-229E. Phylogenetic analysis of the predicted viral proteins indicates that the virus does not closely resemble any of the three previously

  5. Complete Genome Sequence of the Methanogenic Archaeon, Methanococcus jannaschii

    Microsoft Academic Search

    Carol J. Bult; Owen White; Gary J. Olsen; Lixin Zhou; Robert D. Fleischmann; Granger G. Sutton; Judith A. Blake; Lisa M. Fitzgerald; Rebecca A. Clayton; Jeannine D. Gocayne; Anthony R. Kerlavage; Brian A. Dougherty; Jean-Francois Tomb; Mark D. Adams; Claudia I. Reich; Ross Overbeek; Ewen F. Kirkness; Keith G. Weinstock; Joseph M. Merrick; Anna Glodek; John L. Scott; Neil S. M. Geoghagen; Janice F. Weidman; Joyce L. Fuhrmann; Dave Nguyen; Teresa R. Utterback; Jenny M. Kelley; Jeremy D. Peterson; Paul W. Sadow; Michael C. Hanna; Matthew D. Cotton; Kevin M. Roberts; Margaret A. Hurst; Brian P. Kaine; Mark Borodovsky; Hans-Peter Klenk; Claire M. Fraser; Hamilton O. Smith; Carl R. Woese; J. Craig Venter

    1996-01-01

    The complete 1.66-megabase pair genome sequence of an autotrophic archaeon, Methanococcus jannaschii, and its 58- and 16-kilobase pair extrachromosomal elements have been determined by whole-genome random sequencing. A total of 1738 predicted protein-coding genes were identified; however, only a minority of these (38 percent) could be assigned a putative cellular role with high confidence. Although the majority of genes related

  6. Use of information theory to study genome sequences

    NASA Astrophysics Data System (ADS)

    Ohya, Masanori; Sato, Keiko

    2000-12-01

    The genome sequence carries information about life as an order of four bases. It is considered that this order indicates a special code structure. In this paper we discuss how the mutual entropy, the main concept in Shannon's communication theory, can be used to study genome sequences, and how a measure introduced in our previous paper [10] for the analysis of similarities of code structures is applied for examining the coding structure of several species, in particular, HIV-1.
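
    As a rough illustration of the quantity mentioned above, the sketch below computes the Shannon mutual information between each base and the base that follows it. The abstract does not specify the authors' exact measure, so the function name, the choice of adjacent-base pairs, and the toy sequences are all assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(seq: str) -> float:
    """Mutual information (in bits) between each base and its successor.

    Hypothetical helper for illustration only; it is not the measure
    defined in the paper, which the abstract does not spell out.
    """
    pairs = list(zip(seq, seq[1:]))
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)                 # counts of (base, next base)
    left = Counter(a for a, _ in pairs)    # marginal counts of the first base
    right = Counter(b for _, b in pairs)   # marginal counts of the second base
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p(a,b) / (p(a) * p(b)) simplifies to count * n / (left * right)
        mi += p_ab * math.log2(count * n / (left[a] * right[b]))
    return mi


if __name__ == "__main__":
    # A strictly alternating sequence: the next base is fully determined.
    print(mutual_information("ACACACAC"))
    # A homopolymer carries no information about order.
    print(mutual_information("AAAAAA"))
```

    In this sketch a value near zero means that knowing one base says little about the next, while larger values indicate stronger dependence between neighbouring positions.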

  7. Reference genome sequence of the model plant Setaria

    Microsoft Academic Search

    Tuskan, Gerald A

    2012-01-01

    We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We

  8. Salmonella serotype determination utilizing high-throughput genome sequencing data.

    PubMed

    Zhang, Shaokang; Yin, Yanlong; Jones, Marcus B; Zhang, Zhenzhen; Deatherage Kaiser, Brooke L; Dinsmore, Blake A; Fitzgerald, Collette; Fields, Patricia I; Deng, Xiangyu

    2015-05-01

    Serotyping forms the basis of national and international surveillance networks for Salmonella, one of the most prevalent foodborne pathogens worldwide (1-3). Public health microbiology is currently being transformed by whole-genome sequencing (WGS), which opens the door to serotype determination using WGS data. SeqSero (www.denglab.info/SeqSero) is a novel Web-based tool for determining Salmonella serotypes using high-throughput genome sequencing data. SeqSero is based on curated databases of Salmonella serotype determinants (rfb gene cluster, fliC and fljB alleles) and is predicted to determine serotype rapidly and accurately for nearly the full spectrum of Salmonella serotypes (more than 2,300 serotypes), from both raw sequencing reads and genome assemblies. The performance of SeqSero was evaluated by testing (i) raw reads from genomes of 308 Salmonella isolates of known serotype; (ii) raw reads from genomes of 3,306 Salmonella isolates sequenced and made publicly available by GenomeTrakr, a U.S. national monitoring network operated by the Food and Drug Administration; and (iii) 354 other publicly available draft or complete Salmonella genomes. We also demonstrated Salmonella serotype determination from raw sequencing reads of fecal metagenomes from mice orally infected with this pathogen. SeqSero can help to maintain the well-established utility of Salmonella serotyping when integrated into a platform of WGS-based pathogen subtyping and characterization. PMID:25762776

  9. Genomics of chromophobe renal cell carcinoma: implications from a rare tumor for pan-cancer studies

    PubMed Central

    Rathmell, Kimryn W.; Chen, Fengju; Creighton, Chad J.

    2015-01-01

    Chromophobe Renal Cell Carcinoma (ChRCC) is a rare subtype of the renal cell carcinomas, a heterogeneous group of cancers arising from the nephron. Recently, The Cancer Genome Atlas (TCGA) profiled this understudied disease using multiple data platforms, including whole exome sequencing, whole genome sequencing (WGS), and mitochondrial DNA (mtDNA) sequencing. The insights gained from this study would have implications for other types of kidney cancer as well as for cancer biology in general. Global molecular patterns in ChRCC provided clues as to this cancer's cell of origin, which is distinct from that of the other renal cell carcinomas, illustrating an approach that might be applied towards elucidating the cell of origin of other cancer types. MtDNA sequencing revealed loss-of-function mutations in NADH dehydrogenase subunits, highlighting the role of deregulated metabolism in this and other cancers. Analysis of WGS data led to the discovery of recurrent genomic rearrangements involving the TERT promoter region, which were associated with very high expression levels of TERT, pointing to a potential mechanism for TERT deregulation that might be found in other cancers. WGS data, generated by large-scale efforts such as TCGA and the International Cancer Genome Consortium (ICGC), could be more extensively mined across various cancer types, to uncover structural variants, mtDNA mutations, themes of tumor metabolic properties, as well as noncoding point mutations. TCGA's data on ChRCC should continue to serve as a resource for future pan-cancer as well as kidney cancer studies, and highlight the value of investigations into rare tumor types to globally inform principles of cancer biology. PMID:25859550

  10. Large-Scale Sequencing: The Future of Genomic Sciences Colloquium

    SciTech Connect

    Margaret Riley; Merry Buckley

    2009-01-01

    Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. 
    A coordinated sequencing effort of cultured organisms is an appropriate place to begin, since not only are their genomes available, but they are also accompanied by data on environment and physiology that can be used to understand the resulting data. As single cell isolation methods improve, there should be a shift toward incorporating uncultured organisms and communities into this effort. Efforts to sequence cultivated isolates should target characterized isolates from culture collections for which biochemical data are available, as well as other cultures of lasting value from personal collections. The genomes of type strains should be among the first targets for sequencing, but creative culture methods, novel cell isolation, and sorting methods would all be helpful in obtaining organisms we have not yet been able to cultivate for sequencing. The data that should be provided for strains targeted for sequencing will depend on the phylogenetic context of the organism and the amount of information available about its nearest relatives. Annotation is an important part of transforming genome sequences into useful resources, but it represents the most significant bottleneck to the field of comparative genomics right now and must be addressed. Furthermore, there is a need for more consistency in both annotation and achieving annotation data. As new annotation tools become available over time, re-annotation of genomes should be implemented, taking advantage of advancements in annotation techniques in order to capitalize on the genome sequences and increase both the societal and scientific benefit of genomics work. Given the proper resources, the knowledge and ability exist to be able to select model systems, some simple, some less so, and dissect them so that we may understand the processes and interactions at work in them.
    Colloquium participants suggest a five-pronged, coordinated initiative to exhaustively describe six different microbial ecosystems, designed to describe all the gene diversity, across genomes. In this effort, sequencing should be complemented by other experimental data, particularly transcriptomics and metabolomics data, all of which

  11. Mulan: multiple-sequence alignment to predict functional elements in genomic sequences.

    PubMed

    Loots, Gabriela G; Ovcharenko, Ivan

    2007-01-01

    Multiple sequence alignment analysis is a powerful approach for translating the evolutionary selective power into phylogenetic relationships to localize functional coding and noncoding genomic elements. The tool Mulan (http://mulan.dcode.org/) has been designed to effectively perform multiple comparisons of genomic sequences necessary to facilitate bioinformatic-driven biological discoveries. The Mulan network server is capable of comparing both closely and distantly related genomes to identify conserved elements over a broad range of evolutionary time. Several novel algorithms are brought together in this tool: the tba multisequence aligner program used to rapidly identify local sequence conservation and the multiTF program to detect evolutionarily conserved transcription factor binding sites in alignments. Mulan is integrated with the ECR Browser and the UCSC Genome Browser for quick uploads of available sequences, and supports two-way communication with the GALA database to overlay GALA functional genome annotation with sequence conservation profiles. Local multiple alignments computed by Mulan ensure reliable representation of short- and large-scale genomic rearrangements in distant organisms. Recently, we have also introduced the ability to handle duplications to permit the reliable reconstruction of evolutionary events that underlie the genome sequence data. Here, we describe the main features of the Mulan tool that include the interactive modification of critical conservation parameters, visualization options, and dynamic access to sequence data from visual graphs for flexible and easy-to-perform analysis of differentially evolving genomic regions. PMID:17993678

  12. Concordance of genomic alterations between primary and recurrent breast cancer.

    PubMed

    Meric-Bernstam, Funda; Frampton, Garrett M; Ferrer-Lozano, Jaime; Yelensky, Roman; Pérez-Fidalgo, Jose A; Wang, Ying; Palmer, Gary A; Ross, Jeffrey S; Miller, Vincent A; Su, Xiaoping; Eroles, Pilar; Barrera, Juan Antonio; Burgues, Octavio; Lluch, Ana M; Zheng, Xiaofeng; Sahin, Aysegul; Stephens, Philip J; Mills, Gordon B; Cronin, Maureen T; Gonzalez-Angulo, Ana M

    2014-05-01

    There is growing interest in delivering genomically informed cancer therapy. Our aim was to determine the concordance of genomic alterations between primary and recurrent breast cancer. Targeted next-generation sequencing was performed on formalin-fixed paraffin-embedded (FFPE) samples, profiling 3,320 exons of 182 cancer-related genes plus 37 introns from 14 genes often rearranged in cancer. Point mutations, indels, copy-number alterations (CNA), and select rearrangements were assessed in 74 tumors from 43 patients (36 primary and 38 recurrence/metastases). Alterations potentially targetable with established or investigational therapeutics were considered "actionable." Alterations were detected in 55 genes (mean 3.95 alterations/sample, range 1-12), including mutations in PIK3CA, TP53, ARID1A, PTEN, AKT1, NF1, FBXW7, and FGFR3 and amplifications in MCL1, CCND1, FGFR1, MYC, IGF1R, MDM2, MDM4, AKT3, CDK4, and AKT2. In 33 matched primary and recurrent tumors, 97 of 112 (86.6%) somatic mutations were concordant. Of identified CNAs, 136 of 159 (85.5%) were concordant: 37 (23.3%) were concordant, but below the reporting threshold in one of the matched samples, and 23 (14.5%) discordant. There was an increased frequency of CDK4/MDM2 amplifications in recurrences, as well as gains and losses of other actionable alterations. Forty of 43 (93%) patients had actionable alterations that could inform targeted treatment options. In conclusion, deep genomic profiling of cancer-related genes reveals potentially actionable alterations in most patients with breast cancer. Overall there was high concordance between primary and recurrent tumors. Analysis of recurrent tumors before treatment may provide additional insights, as both gains and losses of targets are observed. PMID:24608573
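
    The concordance figures quoted above can be re-derived from the reported counts. The short check below is only an arithmetic sketch; the helper name `pct` is an assumption, and the counts are copied from the abstract.

```python
def pct(part: int, whole: int, digits: int = 1) -> float:
    """Percentage of `part` in `whole`, rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


if __name__ == "__main__":
    # Somatic mutations concordant in matched primary/recurrent tumors.
    print(pct(97, 112))      # abstract reports 86.6%
    # Copy-number alterations: concordant, below-threshold, discordant.
    print(pct(136, 159))     # abstract reports 85.5%
    print(pct(37, 159))      # abstract reports 23.3%
    print(pct(23, 159))      # abstract reports 14.5%
    # Patients with actionable alterations.
    print(pct(40, 43, 0))    # abstract reports 93%
```

    Note that the concordant and discordant CNA counts (136 and 23) sum to the 159 CNAs identified, so the 37 below-threshold CNAs are a subset of the concordant ones rather than a third group.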

  13. Mitochondrial Genome Sequence of the Legume Vicia faba

    PubMed Central

    Negruk, Valentine

    2013-01-01

    The number of plant mitochondrial genomes sequenced exceeds two dozen. However, for a detailed comparative study of different phylogenetic branches more plant mitochondrial genomes should be sequenced. This article presents sequencing data and comparative analysis of mitochondrial DNA (mtDNA) of the legume Vicia faba. The size of the V. faba circular mitochondrial master chromosome of cultivar Broad Windsor was estimated as 588,000 bp with a genome complexity of 387,745 bp and 52 conservative mitochondrial genes; 32 of them encoding proteins, 3 rRNA, and 17 tRNA genes. Six tRNA genes were highly homologous to chloroplast genome sequences. In addition to the 52 conservative genes, 114 unique open reading frames (ORFs) were found, 36 without significant homology to any known proteins and 29 with homology to the Medicago truncatula nuclear genome and to other plant mitochondrial ORFs, 49 ORFs were not homologous to M. truncatula but possessed sequences with significant homology to other plant mitochondrial or nuclear ORFs. In general, the unique ORFs revealed very low homology to known closely related legumes, but several sequence homologies were found between V. faba, Beta vulgaris, Nicotiana tabacum, Vitis vinifera, and even the monocots Oryza sativa and Zea mays. Most likely these ORFs arose independently during angiosperm evolution (Kubo and Mikami, 2007; Kubo and Newton, 2008). Computational analysis revealed in total about 45% of V. faba mtDNA sequence being homologous to the Medicago truncatula nuclear genome (more than to any sequenced plant mitochondrial genome), and 35% of this homology ranging from a few dozen to 12,806 bp are located on chromosome 1. Apparently, mitochondrial rrn5, rrn18, rps10, ATP synthase subunit alpha, cox2, and tRNA sequences are part of transcribed nuclear mosaic ORFs. PMID:23675376

  14. Cancer genomics: why rare is valuable.

    PubMed

    Jamshidi, Farzad; Nielsen, Torsten O; Huntsman, David G

    2015-04-01

    Rare conditions are sometimes ignored in biomedical research because of difficulties in obtaining specimens and limited interest from fund raisers. However, the study of rare diseases such as unusual cancers has again and again led to breakthroughs in our understanding of more common diseases. It is therefore unsurprising that with the development and accessibility of next-generation sequencing, much has been learnt from studying cancers that are rare and in particular those with uniform biological and clinical behavior. Herein, we describe how shotgun sequencing of cancers such as granulosa cell tumor, endometrial stromal sarcoma, epithelioid hemangioendothelioma, ameloblastoma, small-cell carcinoma of the ovary, clear-cell carcinoma of the ovary, nonepithelial ovarian tumors, chondroblastoma, and giant cell tumor of the bone has led to rapidly translatable discoveries in diagnostics and tumor taxonomies, as well as providing insights into cancer biology. PMID:25676695

  15. Genome sequence of the date palm Phoenix dactylifera L

    PubMed Central

    Al-Mssallem, Ibrahim S.; Hu, Songnian; Zhang, Xiaowei; Lin, Qiang; Liu, Wanfei; Tan, Jun; Yu, Xiaoguang; Liu, Jiucheng; Pan, Linlin; Zhang, Tongwu; Yin, Yuxin; Xin, Chengqi; Wu, Hao; Zhang, Guangyu; Ba Abdullah, Mohammed M.; Huang, Dawei; Fang, Yongjun; Alnakhli, Yasser O.; Jia, Shangang; Yin, An; Alhuzimi, Eman M.; Alsaihati, Burair A.; Al-Owayyed, Saad A.; Zhao, Duojun; Zhang, Sun; Al-Otaibi, Noha A.; Sun, Gaoyuan; Majrashi, Majed A.; Li, Fusen; Tala; Wang, Jixiang; Yun, Quanzheng; Alnassar, Nafla A.; Wang, Lei; Yang, Meng; Al-Jelaify, Rasha F.; Liu, Kan; Gao, Shenghan; Chen, Kaifu; Alkhaldi, Samiyah R.; Liu, Guiming; Zhang, Meng; Guo, Haiyan; Yu, Jun

    2013-01-01

    Date palm (Phoenix dactylifera L.) is a cultivated woody plant species with agricultural and economic importance. Here we report a genome assembly for an elite variety (Khalas), which is 605.4 Mb in size and covers >90% of the genome (~671 Mb) and >96% of its genes (~41,660 genes). Genomic sequence analysis demonstrates that P. dactylifera experienced a clear genome-wide duplication after either ancient whole genome duplications or massive segmental duplications. Genetic diversity analysis indicates that its stress resistance and sugar metabolism-related genes tend to be enriched in the chromosomal regions where the density of single-nucleotide polymorphisms is relatively low. Using transcriptomic data, we also illustrate the date palm’s unique sugar metabolism that underlies fruit development and ripening. Our large-scale genomic and transcriptomic data pave the way for further genomic studies not only on P. dactylifera but also other Arecaceae plants. PMID:23917264

  16. DNA Replication Timing, Genome Stability and Cancer

    PubMed Central

    Donley, Nathan

    2013-01-01

    Normal cellular division requires that the genome be faithfully replicated to ensure that unaltered genomic information is passed from one generation to the next. DNA replication initiates from thousands of origins scattered throughout the genome every cell cycle; however, not all origins initiate replication at the same time. A vast amount of work over the years indicates that different origins along each eukaryotic chromosome are activated in early, middle or late S phase. This temporal control of DNA replication is referred to as the replication-timing program. The replication-timing program represents a very stable epigenetic feature of chromosomes. Recent evidence has indicated that the replication-timing program can influence the spatial distribution of mutagenic events such that certain regions of the genome experience increased spontaneous mutagenesis compared to surrounding regions. This influence has helped shape the genomes of humans and other multicellular organisms and can affect the distribution of mutations in somatic cells. It is also becoming clear that the replication-timing program is deregulated in many disease states, including cancer. Aberrant DNA replication timing is associated with changes in gene expression, changes in epigenetic modifications and an increased frequency of structural rearrangements. Furthermore, certain replication timing changes can directly lead to overt genomic instability and may explain unique mutational signatures that are present in cells that have undergone the recently described processes of “chromothripsis” and “kataegis”. In this review, we will discuss how the normal replication timing program, as well as how alterations to this program, can contribute to the evolution of the genomic landscape in normal and cancerous cells. PMID:23327985

  17. Single Nucleotide Polymorphism Mapping Using Genome-Wide Unique Sequences

    PubMed Central

    Chen, Leslie Y.Y.; Lu, Szu-Hsien; Shih, Edward S.C.; Hwang, Ming-Jing

    2002-01-01

    As more and more genomic DNAs are sequenced to characterize human genetic variations, the demand for a very fast and accurate method to genomically position these DNA sequences is high. We have developed a new mapping method that does not require sequence alignment. In this method, we first identified DNA fragments of 15 bp in length that are unique in the human genome and then used them to position single nucleotide polymorphism (SNP) sequences. By use of four desktop personal computers with AMD K7 (1 GHz) processors, our new method mapped more than 1.6 million SNP sequences in 20 hr and achieved a very good agreement with mapping results from alignment-based methods. PMID:12097348
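
    The alignment-free idea described above (index the fragments that occur exactly once in the genome, then look up a query's fragments in that index) can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, only the forward strand is considered, matching is exact, and the whole index is held in memory, which would not scale to the human genome as written.

```python
from collections import defaultdict
from typing import Dict, Optional


def unique_kmer_index(genome: str, k: int = 15) -> Dict[str, int]:
    """Map every k-mer that occurs exactly once in `genome` to its offset."""
    positions = defaultdict(list)
    for i in range(len(genome) - k + 1):
        positions[genome[i:i + k]].append(i)
    return {kmer: pos[0] for kmer, pos in positions.items() if len(pos) == 1}


def map_sequence(query: str, index: Dict[str, int], k: int = 15) -> Optional[int]:
    """Return the inferred genome offset of `query`, or None if no unique k-mer hits.

    The first query k-mer found in the index anchors the whole query, so a
    single variant base only defeats the k-mers that overlap it.
    """
    for j in range(len(query) - k + 1):
        hit = index.get(query[j:j + k])
        if hit is not None:
            return hit - j
    return None


if __name__ == "__main__":
    # Toy genome with k=4 so the example stays readable (the paper uses 15 bp).
    genome = "ACGTACGGTTCAGA"
    index = unique_kmer_index(genome, k=4)
    print(map_sequence("GGTTCA", index, k=4))    # exact substring of the genome
    print(map_sequence("GGATCAGA", index, k=4))  # same locus with one variant base
```

    Restricting the index to unique fragments is what removes the need for alignment in this sketch: any hit identifies a single genomic position, so no scoring between candidate locations is required.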

  18. Endoplasmic Reticulum Stress, Genome Damage, and Cancer

    PubMed Central

    Dicks, Naomi; Gutierrez, Karina; Michalak, Marek; Bordignon, Vilceu; Agellon, Luis B.

    2015-01-01

    Endoplasmic reticulum (ER) stress has been linked to many diseases, including cancer. A large body of work has focused on the activation of the ER stress response in cancer cells to facilitate their survival and tumor growth; however, there are some studies suggesting that the ER stress response can also mitigate cancer progression. Despite these contradictions, it is clear that the ER stress response is closely associated with cancer biology. The ER stress response classically encompasses activation of three separate pathways, which are collectively categorized as the unfolded protein response (UPR). The UPR has been extensively studied in various cancers and appears to confer a selective advantage to tumor cells to facilitate their enhanced growth and resistance to anti-cancer agents. It has also been shown that ER stress induces chromatin changes, which can also facilitate cell survival. Chromatin remodeling has been linked with many cancers through repression of tumor suppressor and apoptosis genes. Interplay between the classic UPR and genome damage repair mechanisms may have important implications in the transformation process of normal cells into cancer cells. PMID:25692096

  19. Cancer genomics: from discovery science to personalized medicine.

    PubMed

    Chin, Lynda; Andersen, Jannik N; Futreal, P Andrew

    2011-03-01

    Recent advances in genome technologies and the ensuing outpouring of genomic information related to cancer have accelerated the convergence of discovery science and clinical medicine. Successful examples of translating cancer genomics into therapeutics and diagnostics reinforce its potential to make possible personalized cancer medicine. However, the bottlenecks along the path of converting a genome discovery into a tangible clinical endpoint are numerous and formidable. In this Perspective, we emphasize the importance of establishing the biological relevance of a cancer genomic discovery in realizing its clinical potential and discuss some of the major obstacles to moving from the bench to the bedside. PMID:21383744

  20. The Cancer Genome Atlas: Generating a “Parts List” for Cancer

    Cancer.gov

    When I was working in a genomics laboratory in the 1990s, I sequenced a section of human chromosome 7. The part I focused on coded for a tumor-suppressor gene. At that time, we had to read the DNA sequence letter by letter in short stretches and then figure out where the genes resided. It was tedious and required extreme care and attention to detail, plus a fair amount of educated guesses.

  1. Insights in metabolism and toxin production from the complete genome sequence of Clostridium tetani

    Microsoft Academic Search

    Holger Brüggemann; Gerhard Gottschalk

    The decryption of prokaryotic genome sequences progresses rapidly and provides the scientific community with an enormous amount of information. Clostridial genome sequencing projects have been finished only recently, starting with the genome of the solvent-producing Clostridium acetobutylicum in 2001. A lot of attention has been devoted to the genomes of pathogenic clostridia. In 2002, the genome sequence of C. perfringens,

  2. Insights in metabolism and toxin production from the complete genome sequence of Clostridium tetani

    Microsoft Academic Search

    Holger Brüggemann; Gerhard Gottschalk

    2004-01-01

    The decryption of prokaryotic genome sequences progresses rapidly and provides the scientific community with an enormous amount of information. Clostridial genome sequencing projects have been finished only recently, starting with the genome of the solvent-producing Clostridium acetobutylicum in 2001. A lot of attention has been devoted to the genomes of pathogenic clostridia. In 2002, the genome sequence of C. perfringens,

  3. Genome sequence analysis of the model grass Brachypodium distachyon: insights into grass genome evolution

    SciTech Connect

    Schulman, Al

    2009-08-09

    Three subfamilies of grasses, the Ehrhartoideae (rice), the Panicoideae (maize, sorghum, sugar cane and millet), and the Pooideae (wheat, barley and cool season forage grasses) provide the basis of human nutrition and are poised to become major sources of renewable energy. Here we describe the complete genome sequence of the wild grass Brachypodium distachyon (Brachypodium), the first member of the Pooideae subfamily to be completely sequenced. Comparison of the Brachypodium, rice and sorghum genomes reveals a precise sequence-based history of genome evolution across a broad diversity of the grass family and identifies nested insertions of whole chromosomes into centromeric regions as a predominant mechanism driving chromosome evolution in the grasses. The relatively compact genome of Brachypodium is maintained by a balance of retroelement replication and loss. The complete genome sequence of Brachypodium, coupled to its exceptional promise as a model system for grass research, will support the development of new energy and food crops.

  4. Complete genome sequence of Serratia plymuthica strain AS12

    SciTech Connect

    Neupane, Saraswoti [Uppsala University, Uppsala, Sweden; Finlay, Roger D. [Uppsala University, Uppsala, Sweden; Alstrom, Sadhna [Uppsala University, Uppsala, Sweden; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Peters, Lin [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Chertkov, Olga [Los Alamos National Laboratory (LANL); Han, James [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Tapia, Roxanne [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Hogberg, Nils [Uppsala University, Uppsala, Sweden

    2012-01-01

    A plant-associated member of the family Enterobacteriaceae, Serratia plymuthica strain AS12 was isolated from rapeseed roots. It is of scientific interest due to its plant-growth-promoting and plant-pathogen-inhibiting ability. The genome of S. plymuthica AS12 comprises a 5,443,009 bp long circular chromosome, which consists of 4,952 protein-coding genes, 87 tRNA genes and 7 rRNA operons. This genome was sequenced within the 2010 DOE-JGI Community Sequencing Program (CSP2010) as part of the project entitled 'Genomics of four rapeseed plant growth promoting bacteria with antagonistic effect on plant pathogens'.

  5. Bioinformatics for Whole-Genome Shotgun Sequencing of Microbial Communities

    Microsoft Academic Search

    Kevin Chen; Lior Pachter

    2005-01-01

    The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of

  6. Massively parallel sequencing: the new frontier of hematologic genomics

    PubMed Central

    Nickerson, Deborah A.; Reiner, Alex P.

    2013-01-01

    Genomic technologies are becoming a routine part of human genetic analysis. The exponential growth in DNA sequencing capability has brought an unprecedented understanding of human genetic variation and the identification of thousands of variants that impact human health. In this review, we describe the different types of DNA variation and provide an overview of existing DNA sequencing technologies and their applications. As genomic technologies and knowledge continue to advance, they will become integral in clinical practice. To accomplish the goal of personalized genomic medicine for patients, close collaborations between researchers and clinicians will be essential to develop and curate deep databases of genetic variation and their associated phenotypes. PMID:24021669

  7. Chromosomally unstable mouse tumours have genomic alterations similar to diverse human cancers

    Microsoft Academic Search

    Richard S. Maser; Bhudipa Choudhury; Peter J. Campbell; Bin Feng; Kwok-Kin Wong; Alexei Protopopov; Jennifer O'Neil; Alejandro Gutierrez; Elena Ivanova; Ilana Perna; Eric Lin; Vidya Mani; Shan Jiang; Kate McNamara; Sara Zaghlul; Sarah Edkins; Claire Stevens; Cameron Brennan; Eric S. Martin; Ruprecht Wiedemeyer; Omar Kabbarah; Cristina Nogueira; Gavin Histen; Jon Aster; Marc Mansour; Veronique Duke; Letizia Foroni; Adele K. Fielding; Anthony H. Goldstone; Jacob M. Rowe; Yaoqi A. Wang; A. Thomas Look; Michael R. Stratton; Lynda Chin; P. Andrew Futreal; Ronald A. Depinho

    2007-01-01

    Highly rearranged and mutated cancer genomes present major challenges in the identification of pathogenetic events driving the neoplastic transformation process. Here we engineered lymphoma-prone mice with chromosomal instability to assess the usefulness of mouse models in cancer gene discovery and the extent of cross-species overlap in cancer-associated copy number aberrations. Along with targeted re-sequencing, our comparative oncogenomic studies identified FBXW7

  8. Ultradeep Sequencing of a Human Ultraconserved Region Reveals Somatic and Constitutional Genomic Instability

    PubMed Central

    De Grassi, Anna; Segala, Cinzia; Iannelli, Fabio; Volorio, Sara; Bertario, Lucio; Radice, Paolo; Bernard, Loris; Ciccarelli, Francesca D.

    2010-01-01

    Early detection of cancer-associated genomic instability is crucial, particularly in tumour types in which this instability represents the essential underlying mechanism of tumourigenesis. Currently used methods require the presence of already established neoplastic cells because they only detect clonal mutations. In principle, parallel sequencing of single DNA filaments could reveal the early phases of tumour initiation by detecting low-frequency mutations, provided an adequate depth of coverage and an effective control of the experimental error. We applied ultradeep sequencing to estimate the genomic instability of individuals with hereditary non-polyposis colorectal cancer (HNPCC). To overcome the experimental error, we used an ultraconserved region (UCR) of the human genome as an internal control. By comparing the mutability outside and inside the UCR, we observed a tendency of the ultraconserved element to accumulate significantly fewer mutations than the flanking segments in both neoplastic and nonneoplastic HNPCC samples. No difference between the two regions was detectable in cells from healthy donors, indicating that all three HNPCC samples have mutation rates higher than the healthy genome. This is the first, to our knowledge, direct evidence of an intrinsic genomic instability of individuals with heterozygous mutations in mismatch repair genes, and constitutes the proof of principle for the development of a more sensitive molecular assay of genomic instability. PMID:20052272

  9. Johns Hopkins scientists pair blood test and gene sequencing to detect cancer

    Cancer.gov

    Scientists at the Johns Hopkins Kimmel Cancer Center have combined the ability to detect cancer DNA in the blood with genome sequencing technology in a test that could be used to screen for cancers, monitor cancer patients for recurrence and find residual cancer left after surgery. A report describing the new approach appears in the Nov. 28 issue of Science Translational Medicine. To develop the test, the scientists took blood samples from late-stage colorectal and breast cancer patients and healthy individuals and looked for DNA that had been shed into the blood.

  10. Genome sequence of the cultivated cotton Gossypium arboreum.

    PubMed

    Li, Fuguang; Fan, Guangyi; Wang, Kunbo; Sun, Fengming; Yuan, Youlu; Song, Guoli; Li, Qin; Ma, Zhiying; Lu, Cairui; Zou, Changsong; Chen, Wenbin; Liang, Xinming; Shang, Haihong; Liu, Weiqing; Shi, Chengcheng; Xiao, Guanghui; Gou, Caiyun; Ye, Wuwei; Xu, Xun; Zhang, Xueyan; Wei, Hengling; Li, Zhifang; Zhang, Guiyin; Wang, Junyi; Liu, Kun; Kohel, Russell J; Percy, Richard G; Yu, John Z; Zhu, Yu-Xian; Wang, Jun; Yu, Shuxun

    2014-06-01

    The complex allotetraploid nature of the cotton genome (AADD; 2n = 52) makes genetic, genomic and functional analyses extremely challenging. Here we sequenced and assembled the Gossypium arboreum (AA; 2n = 26) genome, a putative contributor of the A subgenome. A total of 193.6 Gb of clean sequence covering the genome by 112.6-fold was obtained by paired-end sequencing. We further anchored and oriented 90.4% of the assembly on 13 pseudochromosomes and found that 68.5% of the genome is occupied by repetitive DNA sequences. We predicted 41,330 protein-coding genes in G. arboreum. Two whole-genome duplications were shared by G. arboreum and Gossypium raimondii before speciation. Insertions of long terminal repeats in the past 5 million years are responsible for the twofold difference in the sizes of these genomes. Comparative transcriptome studies showed the key role of the nucleotide binding site (NBS)-encoding gene family in resistance to Verticillium dahliae and the involvement of ethylene in the development of cotton fiber cells. PMID:24836287
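
    The abstract does not state the genome size, but the sequencing figures imply one. The sketch below is a back-calculation only, assuming the 112.6-fold figure was computed as total clean sequence divided by estimated genome size; the function name is hypothetical.

```python
def implied_genome_size_gb(total_sequence_gb: float, fold_coverage: float) -> float:
    """Genome size implied by total sequence yield and fold coverage."""
    if fold_coverage <= 0:
        raise ValueError("fold_coverage must be positive")
    return total_sequence_gb / fold_coverage


# Figures quoted in the abstract: 193.6 Gb of clean sequence at 112.6-fold.
implied_size = implied_genome_size_gb(193.6, 112.6)

# A diploid with 2n = 26 has 13 chromosomes per haploid set, which matches
# the 13 pseudochromosomes reported in the abstract.
haploid_chromosomes = 26 // 2

print(f"implied genome size: {implied_size:.2f} Gb")
print(f"haploid chromosome number: {haploid_chromosomes}")
```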

  11. Genome instability in blood cells of a BRCA1+ breast cancer family

    PubMed Central

    2014-01-01

    Background: BRCA1 plays an essential role in maintaining genome stability. An inherited BRCA1 germline mutation (BRCA1+) is an established genetic predisposition conferring a high risk of breast cancer. Although BRCA1+ induces breast cancer by causing genome instability, most current knowledge concerns somatic genome instability in breast cancer cells rather than germline genome instability. Methods: Using exome sequencing, we analyzed the genomes of blood cells in a typical BRCA1+ breast cancer family with an exon 13-duplicated founder mutation, including six breast cancer-affected and two unaffected members. Results: We identified 23 deleterious mutations in the breast cancer-affected family members that are absent in the unaffected members. Multiple mutations damaged functionally important and breast cancer-related genes, including the transcription factors BPTF and FOXP1, ubiquitin ligase CUL4B, phosphorylase kinase PHKG2, and nuclear receptor activator SRA1. Comparison of the mutations between mothers and daughters shows that most were germline mutations inherited from the ancestor(s), while only a few were somatic mutations generated de novo. Conclusion: Our study indicates that BRCA1+ can cause genome instability with both germline and somatic mutations in non-breast cells. PMID:24884718

  12. A DRAFT SEQUENCE OF THE RICE GENOME (ORYZA SATIVA L. SSP. INDICA)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genome of the japonica subspecies of rice, an important cereal and model monocot, was sequenced and assembled by whole-genome shotgun sequencing. The assembled sequence covers 93% of the 420-megabase genome. Gene predictions on the assembled sequence suggest that the genome contains 32,000 to 50...

  13. Draft genome sequence of adzuki bean, Vigna angularis.

    PubMed

    Kang, Yang Jae; Satyawan, Dani; Shim, Sangrea; Lee, Taeyoung; Lee, Jayern; Hwang, Won Joo; Kim, Sue K; Lestari, Puji; Laosatit, Kularb; Kim, Kil Hyun; Ha, Tae Joung; Chitikineni, Annapurna; Kim, Moon Young; Ko, Jong-Min; Gwag, Jae-Gyun; Moon, Jung-Kyung; Lee, Yeong-Ho; Park, Beom-Seok; Varshney, Rajeev K; Lee, Suk-Ha

    2015-01-01

    Adzuki bean (Vigna angularis var. angularis) is a dietary legume crop in East Asia. The presumed progenitor (Vigna angularis var. nipponensis) is widely found in East Asia, suggesting speciation and domestication in these temperate climate regions. Here, we report a draft genome sequence of adzuki bean. The genome assembly covers 75% of the estimated genome and was mapped to 11 pseudo-chromosomes. Gene prediction revealed 26,857 high confidence protein-coding genes evidenced by RNAseq of different tissues. Comparative gene expression analysis with V. radiata showed that the tissue specificity of orthologous genes was highly conserved. Additional re-sequencing of wild adzuki bean, V. angularis var. nipponensis, and V. nepalensis, was performed to analyze the variations between cultivated and wild adzuki bean. The determined divergence time of adzuki bean and the wild species predated archaeology-based domestication time. The present genome assembly will accelerate the genomics-assisted breeding of adzuki bean. PMID:25626881

  14. Draft Genome Sequences of Two Virulent Serotypes of Avian Pasteurella multocida

    PubMed Central

    Abrahante, Juan E.; Johnson, Timothy J.; Hunter, Samuel S.; Maheswaran, Samuel K.; Hauglund, Melissa J.; Bayles, Darrell O.; Tatum, Fred M.

    2013-01-01

    Here we report the draft genome sequences of two virulent avian strains of Pasteurella multocida. Comparative analyses of these genomes were done with the published genome sequence of avirulent P. multocida strain Pm70. PMID:23405337

  15. The Cancer Genomics Hub (CGHub): overcoming cancer through the power of torrential data.

    PubMed

    Wilks, Christopher; Cline, Melissa S; Weiler, Erich; Diehkans, Mark; Craft, Brian; Martin, Christy; Murphy, Daniel; Pierce, Howdy; Black, John; Nelson, Donavan; Litzinger, Brian; Hatton, Thomas; Maltbie, Lori; Ainsworth, Michael; Allen, Patrick; Rosewood, Linda; Mitchell, Elizabeth; Smith, Bradley; Warner, Jim; Groboske, John; Telc, Haifang; Wilson, Daniel; Sanford, Brian; Schmidt, Hannes; Haussler, David; Maltbie, Daniel

    2014-01-01

    The Cancer Genomics Hub (CGHub) is the online repository of the sequencing programs of the National Cancer Institute (NCI), including The Cancer Genome Atlas (TCGA), the Cancer Cell Line Encyclopedia (CCLE) and the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) projects, with data from 25 different types of cancer. The CGHub currently contains >1.4 PB of data, has grown at an average rate of 50 TB a month and serves >100 TB per week. The architecture of CGHub is designed to support bulk searching and downloading through a Web-accessible application programming interface, enforce patient genome confidentiality in data storage and transmission and optimize for efficiency in access and transfer. In this article, we describe the design of these three components, present performance results for our transfer protocol, GeneTorrent, and finally report on the growth of the system in terms of data stored and transferred, including estimated limits on the current architecture. Our experience-based estimates suggest that centralizing storage and computational resources is more efficient than wide distribution across many satellite labs. Database URL: https://cghub.ucsc.edu. PMID:25267794
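
    The storage and transfer figures above lend themselves to a quick back-of-envelope calculation. The sketch below assumes decimal units (1 PB = 1000 TB, 1 TB = 10^12 bytes), treats the quoted lower bounds as exact values, and assumes a constant growth rate; the 5 PB capacity is a hypothetical figure chosen for illustration and does not come from the article.

```python
TB_PER_PB = 1000
SECONDS_PER_WEEK = 7 * 24 * 3600


def months_until_capacity(capacity_pb: float,
                          stored_pb: float = 1.4,
                          growth_tb_per_month: float = 50.0) -> float:
    """Months of linear growth before stored data reaches capacity_pb."""
    remaining_tb = max((capacity_pb - stored_pb) * TB_PER_PB, 0.0)
    return remaining_tb / growth_tb_per_month


def mean_throughput_gbit_per_s(tb_per_week: float = 100.0) -> float:
    """Average sustained rate needed to serve tb_per_week terabytes."""
    bits_per_week = tb_per_week * 1e12 * 8
    return bits_per_week / SECONDS_PER_WEEK / 1e9


# Hypothetical 5 PB ceiling, for illustration only.
months_to_5pb = months_until_capacity(5.0)
serving_rate = mean_throughput_gbit_per_s()

print(f"months to reach 5 PB: {months_to_5pb:.0f}")
print(f"mean serving rate: {serving_rate:.2f} Gbit/s")
```

    The averaged rate hides peak demand, so a real deployment would need headroom well above this mean.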

  16. Complete Genome Sequence of Equine Herpesvirus Type 9

    PubMed Central

    Yamaguchi, Tsuyoshi; Yamada, Souichi

    2012-01-01

    Equine herpesvirus type 9 (EHV-9), which we isolated from a case of epizootic encephalitis in a herd of Thomson's gazelles (Gazella thomsoni) in 1993, has been known to cause fatal encephalitis in Thomson's gazelle, giraffe, and polar bear in natural infections. Our previous report indicated that EHV-9 was similar to the equine pathogen equine herpesvirus type 1 (EHV-1), which mainly causes abortion, respiratory infection, and equine herpesvirus myeloencephalopathy. We determined the genome sequence of EHV-9. The genome has a length of 148,371 bp and all 80 of the open reading frames (ORFs) found in the genome of EHV-1. The nucleotide sequences of the ORFs in EHV-9 were 86 to 95% identical to those in EHV-1. The whole genome sequence should help to reveal the neuropathogenicity of EHV-9. PMID:23166237
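
    The 86 to 95% figures are per-ORF nucleotide identities. The abstract does not say how identity was computed; the sketch below shows one common convention, assuming the two sequences are already aligned and that columns containing a gap are excluded. The function name and the toy sequences are hypothetical and are not EHV-9 or EHV-1 data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped alignment columns where both bases match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy 20-column alignment with two mismatches.
toy_identity = percent_identity("ATGGCGTACGTTAGCCTGAA",
                                "ATGGCATACGTTAGCTTGAA")
print(f"identity: {toy_identity:.1f}%")
```

    Other conventions count gap columns in the denominator, which gives lower values for the same alignment.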

  17. Transcriptome and genome sequencing uncovers functional variation in humans

    PubMed Central

    Lappalainen, Tuuli; Sammeth, Michael; Friedländer, Marc R; ‘t Hoen, Peter AC; Monlong, Jean; Rivas, Manuel A; Gonzàlez-Porta, Mar; Kurbatova, Natalja; Griebel, Thasso; Ferreira, Pedro G; Barann, Matthias; Wieland, Thomas; Greger, Liliana; van Iterson, Maarten; Almlöf, Jonas; Ribeca, Paolo; Pulyakhina, Irina; Esser, Daniela; Giger, Thomas; Tikhonov, Andrew; Sultan, Marc; Bertier, Gabrielle; MacArthur, Daniel G; Lek, Monkol; Lizano, Esther; Buermans, Henk PJ; Padioleau, Ismael; Schwarzmayr, Thomas; Karlberg, Olof; Ongen, Halit; Kilpinen, Helena; Beltran, Sergi; Gut, Marta; Kahlem, Katja; Amstislavskiy, Vyacheslav; Stegle, Oliver; Pirinen, Matti; Montgomery, Stephen B; Donnelly, Peter; McCarthy, Mark I; Flicek, Paul; Strom, Tim M; Lehrach, Hans; Schreiber, Stefan; Sudbrak, Ralf; Carracedo, Ángel; Antonarakis, Stylianos E; Häsler, Robert; Syvänen, Ann-Christine; van Ommen, Gert-Jan; Brazma, Alvis; Meitinger, Thomas; Rosenstiel, Philip; Guigó, Roderic; Gut, Ivo G; Estivill, Xavier; Dermitzakis, Emmanouil T

    2013-01-01

    Summary: Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of mRNA and miRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project – the first uniformly processed RNA-seq data from multiple human populations with high-quality genome sequences. We discovered extremely widespread genetic variation affecting regulation of the majority of genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on cellular mechanisms of regulatory and loss-of-function variation, and allowed us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome. PMID:24037378

  18. Finding diagnostic phenotypic features of Photobacterium in the genome sequences.

    PubMed

    Amaral, Gilda Rose S; Campeão, Mariana E; Swings, Jean; Thompson, Fabiano L; Thompson, Cristiane C

    2015-05-01

    Photobacterium species are ubiquitous in the aquatic environment and can be found in association with animal hosts including pathogenic and mutualistic associations. The traditional phenotypic characterization of Photobacterium is expensive, time-consuming and restricted to a limited number of features. An alternative is to infer phenotypic information directly from whole genome sequences. The present study evaluates the usefulness of whole genome sequences as a source of phenotypic information and compares diagnostic phenotypes of the Photobacterium species from the literature with the predicted phenotypes obtained from whole genome sequences. All genes coding for the specific proteins involved in metabolic pathways responsible for positive phenotypes of the seventeen diagnostic features were found in the majority of the Photobacterium genomes. In the Photobacterium species that were negative for a given phenotype, at least one or several genes involved in the respective biochemical pathways were absent. PMID:25724129

  19. Complete genome sequence of equine herpesvirus type 9.

    PubMed

    Fukushi, Hideto; Yamaguchi, Tsuyoshi; Yamada, Souichi

    2012-12-01

    Equine herpesvirus type 9 (EHV-9), which we isolated from a case of epizootic encephalitis in a herd of Thomson's gazelles (Gazella thomsoni) in 1993, has been known to cause fatal encephalitis in Thomson's gazelle, giraffe, and polar bear in natural infections. Our previous report indicated that EHV-9 was similar to the equine pathogen equine herpesvirus type 1 (EHV-1), which mainly causes abortion, respiratory infection, and equine herpesvirus myeloencephalopathy. We determined the genome sequence of EHV-9. The genome has a length of 148,371 bp and all 80 of the open reading frames (ORFs) found in the genome of EHV-1. The nucleotide sequences of the ORFs in EHV-9 were 86 to 95% identical to those in EHV-1. The whole genome sequence should help to reveal the neuropathogenicity of EHV-9. PMID:23166237

  20. Complete genome sequence of Streptobacillus moniliformis type strain (9901T)

    SciTech Connect

    Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Gronow, Sabine [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Copeland, A [U.S. Department of Energy, Joint Genome Institute]; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute]; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute]; Chen, Feng [U.S. Department of Energy, Joint Genome Institute]; Sims, David [Los Alamos National Laboratory (LANL)]; Meincke, Linda [Los Alamos National Laboratory (LANL)]; Bruce, David [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute]; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [U.S. Department of Energy, Joint Genome Institute]; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Chang, Yun-Juan [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Sproer, Cathrin [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Bristow, James [U.S. Department of Energy, Joint Genome Institute]; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute]; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL)]

    2009-01-01

    Streptobacillus moniliformis Levaditi et al. 1925 is the sole and type species of the genus, and is of phylogenetic interest because of its isolated location in the sparsely populated and neither taxonomically nor genomically much accessed family 'Leptotrichiaceae' within the phylum 'Fusobacteria'. S. moniliformis, a Gram-negative, non-motile and pleomorphic bacterium, is the etiologic agent of rat bite fever and Haverhill fever. Strain 9901T, the type strain of the species, was isolated from a patient with rat bite fever. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is only the second completed genome sequence of the order 'Fusobacteriales' and no more than the third sequence from the phylum 'Fusobacteria'. The 1,662,578 bp long chromosome and the 10,702 bp plasmid with a total of 1511 protein-coding and 55 RNA genes are part of the Genomic Encyclopedia of Bacteria and Archaea project.
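
    The replicon and gene counts in the last sentence can be combined into a few summary figures. The sketch below uses only numbers quoted in the abstract; the derived totals and the density measure are simple arithmetic under the assumption that the 1511 protein-coding and 55 RNA genes cover both replicons, and the variable names are hypothetical.

```python
CHROMOSOME_BP = 1_662_578
PLASMID_BP = 10_702
PROTEIN_CODING_GENES = 1511
RNA_GENES = 55

# Combined size of the chromosome and the plasmid.
total_genome_bp = CHROMOSOME_BP + PLASMID_BP

# All annotated genes, protein-coding plus RNA.
total_genes = PROTEIN_CODING_GENES + RNA_GENES

# Protein-coding genes per kilobase of genome.
coding_genes_per_kb = PROTEIN_CODING_GENES / (total_genome_bp / 1000)

# Share of the genome carried on the plasmid, in percent.
plasmid_share_percent = 100.0 * PLASMID_BP / total_genome_bp

print(f"total genome: {total_genome_bp:,} bp")
print(f"total genes: {total_genes}")
print(f"protein-coding genes per kb: {coding_genes_per_kb:.2f}")
print(f"plasmid share: {plasmid_share_percent:.2f}%")
```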