Sample records for cancer genome sequences

  1. An improved understanding of cancer genomics through massively parallel sequencing

    PubMed Central

    Teer, Jamie K.; Lee, H.

    2015-01-01

    DNA sequencing technology advances have enabled genetic investigation of more samples in a shorter time than has previously been possible. Furthermore, the ability to analyze and understand large sequencing datasets has improved due to concurrent advances in sequence data analysis methods and software tools. Constant improvements to both technology and analytic approaches in this fast moving field are evidenced by many recent publications of computational methods, as well as biological results linking genetic events to human disease. Cancer in particular has been the subject of intense investigation, owing to the genetic underpinnings of this complex collection of diseases. New massively-parallel sequencing (MPS) technologies have enabled the investigation of thousands of samples, divided across tens of different tumor types, resulting in new driver gene identification, mutagenic pattern characterization, and other newly uncovered features of tumor biology. This review will focus both on methods and recent results: current analytical approaches to DNA and RNA sequencing will be presented followed by a review of recent pan-cancer sequencing studies. This overview of methods and results will not only highlight the recent advances in cancer genomics, but also the methods and tools used to accomplish these advancements in a constantly and rapidly improving field.

  2. Returning individual research results for genome sequences of pancreatic cancer

    PubMed Central

    2014-01-01

    Background Disclosure of individual results to participants in genomic research is a complex and contentious issue. There are many existing commentaries and opinion pieces on the topic, but little empirical data concerning actual cases describing how individual results have been returned. Thus, the real life risks and benefits of disclosing individual research results to participants are rarely if ever presented as part of this debate. Methods The Australian Pancreatic Cancer Genome Initiative (APGI) is an Australian contribution to the International Cancer Genome Consortium (ICGC), that involves prospective sequencing of tumor and normal genomes of study participants with pancreatic cancer in Australia. We present three examples that illustrate different facets of how research results may arise, and how they may be returned to individuals within an ethically defensible and clinically practical framework. This framework includes the necessary elements identified by others including consent, determination of the significance of results and which to return, delineation of the responsibility for communication and the clinical pathway for managing the consequences of returning results. Results Of 285 recruited patients, we returned results to a total of 25 with no adverse events to date. These included four that were classified as medically actionable, nine as clinically significant and eight that were returned at the request of the treating clinician. Case studies presented depict instances where research results impacted on cancer susceptibility, current treatment and diagnosis, and illustrate key practical challenges of developing an effective framework. Conclusions We suggest that return of individual results is both feasible and ethically defensible but only within the context of a robust framework that involves a close relationship between researchers and clinicians. PMID:24963353

  3. Identification of Novel Cancer Target Antigens Utilizing EST and Genome Sequence Databases

    Microsoft Academic Search

    Tapan K. Bera; Kristi A. Egland; B. K. Lee; Ira Pastan

    Completion of the human genome sequence has opened up an enormous opportunity to researchers all over the world. The Human Genome Project, which includes the expressed sequence tags (ESTs) database and the genome sequence database, provides a huge source of data that can be used to study and identify molecular targets for a wide range of diseases, including cancer. Major

  4. Genome sequence analysis of Helicobacter pylori strains associated with gastric ulceration and gastric cancer

    Microsoft Academic Search

    Mark S McClain; Carrie L Shaffer; Dawn A Israel; Richard M Peek

    2009-01-01

    BACKGROUND: Persistent colonization of the human stomach by Helicobacter pylori is associated with asymptomatic gastric inflammation (gastritis) and an increased risk of duodenal ulceration, gastric ulceration, and non-cardia gastric cancer. In previous studies, the genome sequences of H. pylori strains from patients with gastritis or duodenal ulcer disease have been analyzed. In this study, we analyzed the genome sequences of

  5. Architectures of somatic genomic rearrangement in human cancer amplicons at sequence-level resolution.

    PubMed

    Bignell, Graham R; Santarius, Thomas; Pole, Jessica C M; Butler, Adam P; Perry, Janet; Pleasance, Erin; Greenman, Chris; Menzies, Andrew; Taylor, Sheila; Edkins, Sarah; Campbell, Peter; Quail, Michael; Plumb, Bob; Matthews, Lucy; McLay, Kirsten; Edwards, Paul A W; Rogers, Jane; Wooster, Richard; Futreal, P Andrew; Stratton, Michael R

    2007-09-01

    For decades, cytogenetic studies have demonstrated that somatically acquired structural rearrangements of the genome are a common feature of most classes of human cancer. However, the characteristics of these rearrangements at sequence-level resolution have thus far been subject to very limited description. One process that is dependent upon somatic genome rearrangement is gene amplification, a mechanism often exploited by cancer cells to increase copy number and hence expression of dominantly acting cancer genes. The mechanisms underlying gene amplification are complex but must involve chromosome breakage and rejoining. We sequenced 133 different genomic rearrangements identified within four cancer amplicons involving the frequently amplified cancer genes MYC, MYCN, and ERBB2. The observed architectures of rearrangement were diverse and highly distinctive, with evidence for sister chromatid breakage-fusion-bridge cycles, formation and reinsertion of double minutes, and the presence of bizarre clusters of small genomic fragments. There were characteristic features of sequences at the breakage-fusion junctions, indicating roles for nonhomologous end joining and homologous recombination-mediated repair mechanisms together with nontemplated DNA synthesis. Evidence was also found for sequence-dependent variation in susceptibility of the genome to somatic rearrangement. The results therefore provide insights into the DNA breakage and repair processes operative in somatic genome rearrangement and illustrate how the evolutionary histories of individual cancers can be reconstructed from large-scale cancer genome sequencing. PMID:17675364

  6. Whole genome sequencing as a means to assess pathogenic mutations in medical genetics and cancer.

    PubMed

    Royer-Bertrand, Beryl; Rivolta, Carlo

    2015-04-01

    The past decade has seen the emergence of next-generation sequencing (NGS) technologies, which have revolutionized the field of human molecular genetics. With NGS, significant portions of the human genome can now be assessed by direct sequence analysis, highlighting normal and pathological variants of our DNA. Recent advances have also allowed the sequencing of complete genomes, by a method referred to as whole genome sequencing (WGS). In this work, we review the use of WGS in medical genetics, with specific emphasis on the benefits and the disadvantages of this technique for detecting genomic alterations leading to Mendelian human diseases and to cancer. PMID:25548800

  7. Genome Sequencing and Analysis of the Tasmanian Devil and Its Transmissible Cancer

    PubMed Central

    Murchison, Elizabeth P.; Schulz-Trieglaff, Ole B.; Ning, Zemin; Alexandrov, Ludmil B.; Bauer, Markus J.; Fu, Beiyuan; Hims, Matthew; Ding, Zhihao; Ivakhno, Sergii; Stewart, Caitlin; Ng, Bee Ling; Wong, Wendy; Aken, Bronwen; White, Simon; Alsop, Amber; Becq, Jennifer; Bignell, Graham R.; Cheetham, R. Keira; Cheng, William; Connor, Thomas R.; Cox, Anthony J.; Feng, Zhi-Ping; Gu, Yong; Grocock, Russell J.; Harris, Simon R.; Khrebtukova, Irina; Kingsbury, Zoya; Kowarsky, Mark; Kreiss, Alexandre; Luo, Shujun; Marshall, John; McBride, David J.; Murray, Lisa; Pearse, Anne-Maree; Raine, Keiran; Rasolonjatovo, Isabelle; Shaw, Richard; Tedder, Philip; Tregidgo, Carolyn; Vilella, Albert J.; Wedge, David C.; Woods, Gregory M.; Gormley, Niall; Humphray, Sean; Schroth, Gary; Smith, Geoffrey; Hall, Kevin; Searle, Stephen M.J.; Carter, Nigel P.; Papenfuss, Anthony T.; Futreal, P. Andrew; Campbell, Peter J.; Yang, Fengtang; Bentley, David R.; Evers, Dirk J.; Stratton, Michael R.

    2012-01-01

    Summary The Tasmanian devil (Sarcophilus harrisii), the largest marsupial carnivore, is endangered due to a transmissible facial cancer spread by direct transfer of living cancer cells through biting. Here we describe the sequencing, assembly, and annotation of the Tasmanian devil genome and whole-genome sequences for two geographically distant subclones of the cancer. Genomic analysis suggests that the cancer first arose from a female Tasmanian devil and that the clone has subsequently genetically diverged during its spread across Tasmania. The devil cancer genome contains more than 17,000 somatic base substitution mutations and bears the imprint of a distinct mutational process. Genotyping of somatic mutations in 104 geographically and temporally distributed Tasmanian devil tumors reveals the pattern of evolution and spread of this parasitic clonal lineage, with evidence of a selective sweep in one geographical area and persistence of parallel lineages in other populations. PMID:22341448

  8. Cancer Genomics for Pediatric Cancers

    Cancer.gov

    Javed Khan, M.D., a molecular biologist at the National Cancer Institute (NCI) discusses programs such as TARGET (Therapeutically Applicable Research to Generate Effective Treatments), the Pediatric Cancer Genome Project, and TCGA (The Cancer Genome Atlas) that are sequencing the genomes of tumors from hundreds of children and adults with cancer to discover genetic changes causing or driving the disease. Genomic characterization will help clinicians prescribe appropriate treatments.

  9. Clinical applications of next generation sequencing in cancer: from panels, to exomes, to genomes

    PubMed Central

    Shen, Tony; Pajaro-Van de Stadt, Stefan Hans; Yeat, Nai Chien; Lin, Jimmy C.-H.

    2015-01-01

    This article will review the recent impact of massively parallel next-generation sequencing (NGS) on our understanding and treatment of cancer. While whole exome sequencing (WES) remains popular and effective as a method of genetically profiling different cancers, advances in sequencing technology have enabled an increasing number of whole-genome based studies. Clinically, NGS has been used or is being developed for genetic screening, diagnostics, and clinical assessment. Though challenges remain, clinicians are in the early stages of using genetic data to make treatment decisions for cancer patients. As the integration of NGS in the study and treatment of cancer continues to mature, we believe that the field of cancer genomics will need to move toward more complete 100% genome sequencing. Current technologies and methods are largely limited to coding regions of the genome. A number of recent studies have demonstrated that mutations in non-coding regions may have direct tumorigenic effects or lead to genetic instability. Non-coding regions represent an important frontier in cancer genomics. PMID:26136771

  10. Discrepancies in cancer genomic sequencing highlight opportunities for driver mutation discovery.

    PubMed

    Hudson, Andrew M; Yates, Tim; Li, Yaoyong; Trotter, Eleanor W; Fawdar, Shameem; Chapman, Phil; Lorigan, Paul; Biankin, Andrew; Miller, Crispin J; Brognard, John

    2014-11-15

    Cancer genome sequencing is being used at an increasing rate to identify actionable driver mutations that can inform therapeutic intervention strategies. A comparison of two of the most prominent cancer genome sequencing databases from different institutes (Cancer Cell Line Encyclopedia and Catalogue of Somatic Mutations in Cancer) revealed marked discrepancies in the detection of missense mutations in identical cell lines (57.38% conformity). The main reason for this discrepancy is inadequate sequencing of GC-rich areas of the exome. We have therefore mapped over 400 regions of consistent inadequate sequencing (cold-spots) in known cancer-causing genes and kinases, in 368 of which neither institute finds mutations. We demonstrate, using a newly identified PAK4 mutation as proof of principle, that specific targeting and sequencing of these GC-rich cold-spot regions can lead to the identification of novel driver mutations in known tumor suppressors and oncogenes. We highlight that cross-referencing between genomic databases is required to comprehensively assess genomic alterations in commonly used cell lines and that there are still significant opportunities to identify novel drivers of tumorigenesis in poorly sequenced areas of the exome. Finally, we assess other reasons for the observed discrepancy, such as variations in dbSNP filtering and the acquisition/loss of mutations, to give explanations as to why there is a discrepancy in pharmacogenomic studies, given recent concerns with poor reproducibility of data. PMID:25256751
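    A conformity percentage like the one quoted in the abstract above can be illustrated with a small sketch. The abstract does not say how the 57.38% figure was defined, so the code below assumes one plausible definition (calls shared by both databases divided by calls reported by either). The record layout (cell line, gene, protein change) and the function name are hypothetical and are not taken from the paper, CCLE, or COSMIC.

```python
from typing import Iterable, Tuple

# Hypothetical key for one missense call: (cell_line, gene, protein_change).
Call = Tuple[str, str, str]


def conformity_percent(calls_a: Iterable[Call], calls_b: Iterable[Call]) -> float:
    """Percentage of calls found in both sets, relative to calls found in either.

    Assumption: conformity = |A intersect B| / |A union B| * 100.
    The source abstract does not state the exact definition.
    """
    set_a, set_b = set(calls_a), set(calls_b)
    union = set_a | set_b
    if not union:
        raise ValueError("both call sets are empty; conformity is undefined")
    return 100.0 * len(set_a & set_b) / len(union)


if __name__ == "__main__":
    # Toy data only; these are not real database entries.
    database_1 = [
        ("LINE1", "GENE_A", "p.V1M"),
        ("LINE1", "GENE_B", "p.G2D"),
        ("LINE2", "GENE_C", "p.R3H"),
    ]
    database_2 = [
        ("LINE1", "GENE_A", "p.V1M"),
        ("LINE2", "GENE_C", "p.R3H"),
        ("LINE2", "GENE_D", "p.E4K"),
    ]
    print(f"{conformity_percent(database_1, database_2):.2f}% conformity")
```

    Keying each call on the cell line as well as the mutation keeps the comparison restricted to identical cell lines, which is the setting the abstract describes.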

  11. Genome and transcriptome sequencing of lung cancers reveal diverse mutational and splicing events

    PubMed Central

    Liu, Jinfeng; Lee, William; Jiang, Zhaoshi; Chen, Zhongqiang; Jhunjhunwala, Suchit; Haverty, Peter M.; Gnad, Florian; Guan, Yinghui; Gilbert, Houston N.; Stinson, Jeremy; Klijn, Christiaan; Guillory, Joseph; Bhatt, Deepali; Vartanian, Steffan; Walter, Kimberly; Chan, Jocelyn; Holcomb, Thomas; Dijkgraaf, Peter; Johnson, Stephanie; Koeman, Julie; Minna, John D.; Gazdar, Adi F.; Stern, Howard M.; Hoeflich, Klaus P.; Wu, Thomas D.; Settleman, Jeff; de Sauvage, Frederic J.; Gentleman, Robert C.; Neve, Richard M.; Stokoe, David; Modrusan, Zora; Seshagiri, Somasekar; Shames, David S.; Zhang, Zemin

    2012-01-01

    Lung cancer is a highly heterogeneous disease in terms of both underlying genetic lesions and response to therapeutic treatments. We performed deep whole-genome sequencing and transcriptome sequencing on 19 lung cancer cell lines and three lung tumor/normal pairs. Overall, our data show that cell line models exhibit similar mutation spectra to human tumor samples. Smoker and never-smoker cancer samples exhibit distinguishable patterns of mutations. A number of epigenetic regulators, including KDM6A, ASH1L, SMARCA4, and ATAD2, are frequently altered by mutations or copy number changes. A systematic survey of splice-site mutations identified 106 splice site mutations associated with cancer specific aberrant splicing, including mutations in several known cancer-related genes. RAC1b, an isoform of the RAC1 GTPase that includes one additional exon, was found to be preferentially up-regulated in lung cancer. We further show that its expression is significantly associated with sensitivity to a MAP2K (MEK) inhibitor PD-0325901. Taken together, these data present a comprehensive genomic landscape of a large number of lung cancer samples and further demonstrate that cancer-specific alternative splicing is a widespread phenomenon that has potential utility as therapeutic biomarkers. The detailed characterizations of the lung cancer cell lines also provide genomic context to the vast amount of experimental data gathered for these lines over the decades, and represent highly valuable resources for cancer biology. PMID:23033341

  12. Discrepancies in Cancer Genomic Sequencing Highlight Opportunities for Driver Mutation Discovery

    PubMed Central

    Hudson, Andrew M.; Yates, Tim; Fawdar, Shameem; Chapman, Phil; Lorigan, Paul; Biankin, Andrew; Miller, Crispin J.; Brognard, John

    2014-01-01

    Cancer genome sequencing is being employed at an increasing rate to identify actionable driver mutations that can inform therapeutic intervention strategies. A comparison of two of the most prominent cancer genome sequencing databases from different institutes (CCLE and COSMIC) revealed marked discrepancies in the detection of missense mutations in identical cell lines (57.38% conformity). The main reason for this discrepancy is inadequate sequencing of GC-rich areas of the exome. We have therefore mapped over 400 regions of consistent inadequate sequencing (cold-spots) in known cancer-causing genes and kinases, in 368 of which neither institute finds mutations. We demonstrate, using a newly identified PAK4 mutation as proof of principle, that specific targeting and sequencing of these GC-rich cold-spot regions can lead to the identification of novel driver mutations in known tumor suppressors and oncogenes. We highlight that cross-referencing between genomic databases is required to comprehensively assess genomic alterations in commonly used cell lines and that there are still significant opportunities to identify novel drivers of tumorigenesis in poorly sequenced areas of the exome. Finally we assess other reasons for the observed discrepancy, such as variations in dbSNP filtering and the acquisition/loss of mutations, to give explanations as to why there is discrepancy in pharmacogenomic studies given recent concerns with poor reproducibility of data. PMID:25256751

  14. The complete mitochondrial genome sequence of an endometrial cancer inbred Donryu rat model.

    PubMed

    Liu, Xue-Mei; Huang, Xian-Xia; Wang, Hong-Mei; Wang, Jin-Yun; Wang, Xiao-Hong

    2014-09-18

    Abstract The Donryu rat strain is a commonly used model for the study of endometrial cancer. We sequenced the mitochondrial genome of this rat strain for the first time (GenBank Accession No. KM114605). The mitogenome was 16,307 bp long and encoded 13 protein-coding genes, 2 ribosomal RNA genes, and 22 transfer RNA genes. A total of 96 SNPs were examined when compared with the reference BN sequence. PMID:25230703

  15. Genome sequence analysis of Helicobacter pylori strains associated with gastric ulceration and gastric cancer

    PubMed Central

    McClain, Mark S; Shaffer, Carrie L; Israel, Dawn A; Peek, Richard M; Cover, Timothy L

    2009-01-01

    Background Persistent colonization of the human stomach by Helicobacter pylori is associated with asymptomatic gastric inflammation (gastritis) and an increased risk of duodenal ulceration, gastric ulceration, and non-cardia gastric cancer. In previous studies, the genome sequences of H. pylori strains from patients with gastritis or duodenal ulcer disease have been analyzed. In this study, we analyzed the genome sequences of an H. pylori strain (98-10) isolated from a patient with gastric cancer and an H. pylori strain (B128) isolated from a patient with gastric ulcer disease. Results Based on multilocus sequence typing, strain 98-10 was most closely related to H. pylori strains of East Asian origin and strain B128 was most closely related to strains of European origin. Strain 98-10 contained multiple features characteristic of East Asian strains, including a type s1c vacA allele and a cagA allele encoding an EPIYA-D tyrosine phosphorylation motif. A core genome of 1237 genes was present in all five strains for which genome sequences were available. Among the 1237 core genes, a subset of alleles was highly divergent in the East Asian strain 98-10, encoding proteins that exhibited <90% amino acid sequence identity compared to corresponding proteins in the other four strains. Unique strain-specific genes were identified in each of the newly sequenced strains, and a set of strain-specific genes was shared among H. pylori strains associated with gastric cancer or premalignant gastric lesions. Conclusion These data provide insight into the diversity that exists among H. pylori strains from diverse clinical and geographic origins. Highly divergent alleles and strain-specific genes identified in this study may represent useful biomarkers for analyzing geographic partitioning of H. pylori and for identifying strains capable of inducing malignant or premalignant gastric lesions. PMID:19123947

  16. Draft Genome Sequences of Helicobacter pylori Strains Isolated from Regions of Low and High Gastric Cancer Risk in Colombia

    E-print Network

    Sheh, Alexander

    The draft genome sequences of six Colombian Helicobacter pylori strains are presented. These strains were isolated from patients from regions of high and low gastric cancer risk in Colombia and were characterized by ...

  17. Somatic retrotransposition in human cancer revealed by whole-genome and exome sequencing.

    PubMed

    Helman, Elena; Lawrence, Michael S; Stewart, Chip; Sougnez, Carrie; Getz, Gad; Meyerson, Matthew

    2014-07-01

    Retrotransposons constitute a major source of genetic variation, and somatic retrotransposon insertions have been reported in cancer. Here, we applied TranspoSeq, a computational framework that identifies retrotransposon insertions from sequencing data, to whole genomes from 200 tumor/normal pairs across 11 tumor types as part of The Cancer Genome Atlas (TCGA) Pan-Cancer Project. In addition to novel germline polymorphisms, we find 810 somatic retrotransposon insertions primarily in lung squamous, head and neck, colorectal, and endometrial carcinomas. Many somatic retrotransposon insertions occur in known cancer genes. We find that high somatic retrotransposition rates in tumors are associated with high rates of genomic rearrangement and somatic mutation. Finally, we developed TranspoSeq-Exome to interrogate an additional 767 tumor samples with hybrid-capture exome data and discovered 35 novel somatic retrotransposon insertions into exonic regions, including an insertion into an exon of the PTEN tumor suppressor gene. The results of this large-scale, comprehensive analysis of retrotransposon movement across tumor types suggest that somatic retrotransposon insertions may represent an important class of structural variation in cancer. PMID:24823667

  18. Breast cancer genomics from microarrays to massively parallel sequencing: paradigms and new insights.

    PubMed

    Ng, Charlotte K Y; Schultheis, Anne M; Bidard, Francois-Clement; Weigelt, Britta; Reis-Filho, Jorge S

    2015-05-01

    Rapid advancements in massively parallel sequencing methods have enabled the analysis of breast cancer genomes at an unprecedented resolution, which has revealed the remarkable heterogeneity of the disease. As a result, we now accept that despite originating in the breast, estrogen receptor (ER)-positive and ER-negative breast cancers are completely different diseases at the molecular level. It has become apparent that there are very few highly recurrently mutated genes such as TP53, PIK3CA, and GATA3, that no two breast cancers display an identical repertoire of somatic genetic alterations at base-pair resolution and that there might not be a single highly recurrently mutated gene that defines each of the "intrinsic" subtypes of breast cancer (ie, basal-like, HER2-enriched, luminal A, and luminal B). Breast cancer heterogeneity, however, extends beyond the diversity between tumors. There is burgeoning evidence to demonstrate that at least some primary breast cancers are composed of multiple, genetically diverse clones at diagnosis and that metastatic lesions may differ in their repertoire of somatic genetic alterations when compared with their respective primary tumors. Several biological phenomena may shape the reported intratumor genetic heterogeneity observed in breast cancers, including the different mutational processes and multiple types of genomic instability. Harnessing the emerging concepts of the diversity of breast cancer genomes and the phenomenon of intratumor genetic heterogeneity will be essential for the development of optimal methods for diagnosis, disease monitoring, and the matching of patients to the drugs that would benefit them the most. PMID:25713166

  19. Detecting somatic point mutations in cancer genome sequencing data: a comparison of mutation callers

    PubMed Central

    2013-01-01

    Background Driven by high throughput next generation sequencing technologies and the pressing need to decipher cancer genomes, computational approaches for detecting somatic single nucleotide variants (sSNVs) have undergone dramatic improvements during the past 2 years. The recently developed tools typically compare a tumor sample directly with a matched normal sample at each variant locus in order to increase the accuracy of sSNV calling. These programs also address the detection of sSNVs at low allele frequencies, allowing for the study of tumor heterogeneity, cancer subclones, and mutation evolution in cancer development. Methods We used whole genome sequencing (Illumina Genome Analyzer IIx platform) of a melanoma sample and matched blood, whole exome sequencing (Illumina HiSeq 2000 platform) of 18 lung tumor-normal pairs and seven lung cancer cell lines to evaluate six tools for sSNV detection: EBCall, JointSNVMix, MuTect, SomaticSniper, Strelka, and VarScan 2, with a focus on MuTect and VarScan 2, two widely used publicly available software tools. Default/suggested parameters were used to run these tools. The missense sSNVs detected in these samples were validated through PCR and direct sequencing of genomic DNA from the samples. We also simulated 10 tumor-normal pairs to explore the ability of these programs to detect low allelic-frequency sSNVs. Results Out of the 237 sSNVs successfully validated in our cancer samples, VarScan 2 and MuTect detected the most of any tools (that is, 204 and 192, respectively). MuTect identified 11 more low-coverage validated sSNVs than VarScan 2, but missed 11 more sSNVs with alternate alleles in normal samples than VarScan 2. When examining the false calls of each tool using 169 invalidated sSNVs, we observed >63% false calls detected in the lung cancer cell lines had alternate alleles in normal samples. Additionally, from our simulation data, VarScan 2 identified more sSNVs than other tools, while MuTect characterized most low allelic-fraction sSNVs. Conclusions Our study explored the typical false-positive and false-negative detections that arise from the use of sSNV-calling tools. Our results suggest that despite recent progress, these tools have significant room for improvement, especially in the discrimination of low coverage/allelic-frequency sSNVs and sSNVs with alternate alleles in normal samples. PMID:24112718
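    The detection counts reported in the abstract above (204 and 192 of 237 validated sSNVs) can be turned into per-tool sensitivities with a few lines of arithmetic. This is only a sketch using the three numbers given in the abstract. The function name is hypothetical, and the abstract itself does not report these percentages.

```python
def sensitivity_percent(detected: int, validated: int) -> float:
    """Share of validated variants that a caller detected, as a percentage."""
    if validated <= 0:
        raise ValueError("validated must be a positive count")
    if not 0 <= detected <= validated:
        raise ValueError("detected must lie between 0 and validated")
    return 100.0 * detected / validated


# Counts taken from the abstract: 237 validated sSNVs in total.
VALIDATED_SSNVS = 237
DETECTED_BY_TOOL = {"VarScan 2": 204, "MuTect": 192}

if __name__ == "__main__":
    for tool, detected in DETECTED_BY_TOOL.items():
        pct = sensitivity_percent(detected, VALIDATED_SSNVS)
        print(f"{tool}: {detected}/{VALIDATED_SSNVS} = {pct:.1f}%")
```

    Sensitivity alone says nothing about false calls, which the abstract assesses separately using the 169 invalidated sSNVs.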

  20. Whole Genome Sequence Analysis Suggests Intratumoral Heterogeneity in Dissemination of Breast Cancer to Lymph Nodes

    PubMed Central

    Blighe, Kevin; Kenny, Laura; Patel, Naina; Guttery, David S.; Page, Karen; Gronau, Julian H.; Golshani, Cyrus; Stebbing, Justin; Coombes, R. Charles; Shaw, Jacqueline A.

    2014-01-01

    Background Intratumoral heterogeneity may help drive resistance to targeted therapies in cancer. In breast cancer, the presence of nodal metastases is a key indicator of poorer overall survival. The aim of this study was to identify somatic genetic alterations in early dissemination of breast cancer by whole genome next generation sequencing (NGS) of a primary breast tumor, a matched locally-involved axillary lymph node and healthy normal DNA from blood. Methods Whole genome NGS was performed on 12 µg (range 11.1–13.3 µg) of DNA isolated from fresh-frozen primary breast tumor, axillary lymph node and peripheral blood following the DNA nanoball sequencing protocol. Single nucleotide variants, insertions, deletions, and substitutions were identified through a bioinformatic pipeline and compared to CIN25, a key set of genes associated with tumor metastasis. Results Whole genome sequencing revealed overlapping variants between the tumor and node, but also variants that were unique to each. Novel mutations unique to the node included those found in two CIN25 targets, TGIF2 and CCNB2, which are related to transcription cyclin activity and chromosomal stability, respectively, and a unique frameshift in PDS5B, which is required for accurate sister chromatid segregation during cell division. We also identified dominant clonal variants that progressed from tumor to node, including SNVs in TP53 and ARAP3, which mediates rearrangements to the cytoskeleton and cell shape, and an insertion in TOP2A, the expression of which is significantly associated with tumor proliferation and can segregate breast cancers by outcome. Conclusion This case study provides preliminary evidence that primary tumor and early nodal metastasis have largely overlapping somatic genetic alterations. There were very few mutations unique to the involved node. However, significant conclusions regarding early dissemination need analysis of a larger number of patient samples. PMID:25546409

  1. Clonal Evolution in Breast Cancer Revealed by Single Nucleus Genome Sequencing

    PubMed Central

    Wang, Yong; Waters, Jill; Leung, Marco L.; Unruh, Anna; Roh, Whijae; Shi, Xiuqing; Chen, Ken; Scheet, Paul; Vattathil, Selina; Liang, Han; Multani, Asha; Zhang, Hong; Zhao, Rui; Michor, Franziska; Meric-Bernstam, Funda; Navin, Nicholas E.

    2014-01-01

    SUMMARY Sequencing studies of breast tumor cohorts have identified many prevalent mutations, but provide limited insight into the genomic diversity within tumors. Here, we developed a whole-genome and exome single cell sequencing approach called Nuc-Seq that utilizes G2/M nuclei to achieve 91% mean coverage breadth. We applied this method to sequence single normal and tumor nuclei from an estrogen-receptor positive breast cancer and a triple-negative ductal carcinoma. In parallel, we performed single nuclei copy number profiling. Our data show that aneuploid rearrangements occurred early in tumor evolution and remained highly stable as the tumor masses clonally expanded. In contrast, point mutations evolved gradually, generating extensive clonal diversity. Many of the diverse mutations were shown to occur at low frequencies (<10%) in the tumor mass by targeted single-molecule sequencing. Using mathematical modeling we found that the triple-negative tumor cells had an increased mutation rate (13.3X) while the ER+ tumor cells did not. These findings have important implications for the diagnosis, therapeutic treatment and evolution of chemoresistance in breast cancer. PMID:25079324

  2. Integrated genome and transcriptome sequencing identifies a novel form of hybrid and aggressive prostate cancer.

    PubMed

    Wu, Chunxiao; Wyatt, Alexander W; Lapuk, Anna V; McPherson, Andrew; McConeghy, Brian J; Bell, Robert H; Anderson, Shawn; Haegert, Anne; Brahmbhatt, Sonal; Shukin, Robert; Mo, Fan; Li, Estelle; Fazli, Ladan; Hurtado-Coll, Antonio; Jones, Edward C; Butterfield, Yaron S; Hach, Faraz; Hormozdiari, Fereydoun; Hajirasouliha, Iman; Boutros, Paul C; Bristow, Robert G; Jones, Steven Jm; Hirst, Martin; Marra, Marco A; Maher, Christopher A; Chinnaiyan, Arul M; Sahinalp, S Cenk; Gleave, Martin E; Volik, Stanislav V; Collins, Colin C

    2012-05-01

    Next-generation sequencing is making sequence-based molecular pathology and personalized oncology viable. We selected an individual initially diagnosed with conventional but aggressive prostate adenocarcinoma and sequenced the genome and transcriptome from primary and metastatic tissues collected prior to hormone therapy. The histology-pathology and copy number profiles were remarkably homogeneous, yet it was possible to propose the quadrant of the prostate tumour that likely seeded the metastatic diaspora. Despite a homogeneous cell type, our transcriptome analysis revealed signatures of both luminal and neuroendocrine cell types. Remarkably, the repertoire of expressed but apparently private gene fusions, including C15orf21:MYC, recapitulated this biology. We hypothesize that the amplification and over-expression of the stem cell gene MSI2 may have contributed to the stable hybrid cellular identity. This hybrid luminal-neuroendocrine tumour appears to represent a novel and highly aggressive case of prostate cancer with unique biological features and, conceivably, a propensity for rapid progression to castrate-resistance. Overall, this work highlights the importance of integrated analyses of genome, exome and transcriptome sequences for basic tumour biology, sequence-based molecular pathology and personalized oncology. PMID:22294438

  3. Use of Whole Genome Sequencing for Diagnosis and Discovery in the Cancer Genetics Clinic

    PubMed Central

    Foley, Samantha B.; Rios, Jonathan J.; Mgbemena, Victoria E.; Robinson, Linda S.; Hampel, Heather L.; Toland, Amanda E.; Durham, Leslie; Ross, Theodora S.

    2014-01-01

    Despite the potential of whole-genome sequencing (WGS) to improve patient diagnosis and care, the empirical value of WGS in the cancer genetics clinic is unknown. We performed WGS on members of two cohorts of cancer genetics patients: those with BRCA1/2 mutations (n = 176) and those without (n = 82). Initial analysis of potentially pathogenic variants (PPVs, defined as nonsynonymous variants with allele frequency < 1% in ESP6500) in 163 clinically-relevant genes suggested that WGS will provide useful clinical results. This is despite the fact that a majority of PPVs were novel missense variants likely to be classified as variants of unknown significance (VUS). Furthermore, previously reported pathogenic missense variants did not always associate with their predicted diseases in our patients. This suggests that the clinical use of WGS will require large-scale efforts to consolidate WGS and patient data to improve accuracy of interpretation of rare variants. While loss-of-function (LoF) variants represented only a small fraction of PPVs, WGS identified additional cancer risk LoF PPVs in patients with known BRCA1/2 mutations and led to cancer risk diagnoses in 21% of non-BRCA cancer genetics patients after expanding our analysis to 3209 ClinVar genes. These data illustrate how WGS can be used to improve our ability to discover patients' cancer genetic risks. PMID:26023681
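
    The PPV definition quoted in this abstract (nonsynonymous, allele frequency < 1% in ESP6500, restricted to a clinically relevant gene list) amounts to a three-condition filter. The following is a minimal sketch of such a filter, not the authors' pipeline: the `Variant` record, the consequence labels in `NONSYNONYMOUS`, and the three-gene stand-in panel are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical consequence labels counted as nonsynonymous in this sketch;
# the abstract does not list the exact annotation terms used.
NONSYNONYMOUS = {"missense", "nonsense", "frameshift", "splice_site"}


@dataclass(frozen=True)
class Variant:
    gene: str
    consequence: str
    esp_af: float  # allele frequency in the reference population (0.0 if unseen)


def is_ppv(variant, gene_panel, max_af=0.01):
    """Return True if the variant meets all three PPV conditions."""
    return (
        variant.gene in gene_panel
        and variant.consequence in NONSYNONYMOUS
        and variant.esp_af < max_af
    )


# Stand-in for the 163-gene list, which the abstract does not enumerate.
panel = {"BRCA1", "BRCA2", "TP53"}

variants = [
    Variant("BRCA2", "missense", 0.0004),    # passes all three conditions
    Variant("BRCA2", "synonymous", 0.0004),  # fails: not nonsynonymous
    Variant("TP53", "missense", 0.03),       # fails: too common
    Variant("TTN", "missense", 0.0001),      # fails: gene not in panel
]

ppvs = [v for v in variants if is_ppv(v, panel)]
```

    Widening the analysis, as the abstract describes for the 3209 ClinVar genes, would correspond here to passing a larger `gene_panel`.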

  4. Overview | Office of Cancer Genomics

    Cancer.gov

    The Cancer Genome Characterization Initiative (CGCI) supports cutting-edge genomics research on adult and pediatric cancers. CGCI investigators develop and apply advanced sequencing methods that examine genomes, exomes, and transcriptomes of tumors. From the resulting molecular data, they can identify novel genetic abnormalities, which may contribute to cancer pathogenesis. Revealing the underlying causes of cancer will lead to better cancer detection, diagnosis, and treatment for patients in the US and different parts of the world.

  5. Structural variation discovery in the cancer genome using next generation sequencing: Computational solutions and perspectives

    PubMed Central

    Liu, Biao; Conroy, Jeffrey M.; Morrison, Carl D.; Odunsi, Adekunle O.; Qin, Maochun; Wei, Lei; Trump, Donald L.; Johnson, Candace S.; Liu, Song; Wang, Jianmin

    2015-01-01

    Somatic Structural Variations (SVs) are a complex collection of chromosomal mutations that could directly contribute to carcinogenesis. Next Generation Sequencing (NGS) technology has emerged as the primary means of interrogating the SVs of the cancer genome in recent investigations. Sophisticated computational methods are required to accurately identify the SV events and delineate their breakpoints from the massive amounts of reads generated by a NGS experiment. In this review, we provide an overview of current analytic tools used for SV detection in NGS-based cancer studies. We summarize the features of common SV groups and the primary types of NGS signatures that can be used in SV detection methods. We discuss the principles and key similarities and differences of existing computational programs and comment on unresolved issues related to this research field. The aim of this article is to provide a practical guide of relevant concepts, computational methods, software tools and important factors for analyzing and interpreting NGS data for the detection of SVs in the cancer genome. PMID:25849937

  6. The Cancer Genome Atlas - TCGA - Home Page

    Cancer.gov

    The Cancer Genome Atlas (TCGA) is a comprehensive and coordinated effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing.

  7. Genomic Datasets for Cancer Research

    Cancer.gov

    A variety of datasets from genome-wide association studies of cancer and other genotype-phenotype studies, including sequencing and molecular diagnostic assays, are available to approved investigators through the Extramural National Cancer Institute (NCI) Data Access Committee (DAC).

  8. Cancer Genome Anatomy Project

    NSDL National Science Digital Library

    The National Cancer Institute has launched the Cancer Genome Anatomy Project to "achieve a comprehensive molecular characterization of normal, precancerous, and malignant cells." Sequenced genes are held as library entries in a database and are available for downloading (FASTA format). Each cDNA library entry may include biological source, number of sequences, and library construction detail information. Thousands of gene sequences are available for over 15 cancers, including breast, colon, and prostate. Contact information for donating or obtaining tissue samples for research purposes is provided.

  9. Cancer Genome Characterization Initiative | Office of Cancer Genomics

    Cancer.gov

    CGCI supports cutting-edge genomics research on adult and pediatric cancers. Researchers develop and apply advanced sequencing and other genome-based methods to identify novel genetic abnormalities in tumors. The extensive genetic profiles generated by CGCI may inform better cancer diagnosis and treatment.

  10. Tumor-associated copy number changes in the circulation of patients with prostate cancer identified through whole-genome sequencing

    PubMed Central

    2013-01-01

    Background Patients with prostate cancer may present with metastatic or recurrent disease despite initial curative treatment. The propensity of metastatic prostate cancer to spread to the bone has limited repeated sampling of tumor deposits. Hence, considerably less is understood about this lethal metastatic disease, as it is not commonly studied. Here we explored whole-genome sequencing of plasma DNA to scan the tumor genomes of these patients non-invasively. Methods We wanted to make whole-genome analysis from plasma DNA amenable to clinical routine applications and developed an approach based on a benchtop high-throughput platform, that is, Illumina's MiSeq instrument. We performed whole-genome sequencing from plasma at a shallow sequencing depth to establish a genome-wide copy number profile of the tumor at low costs within 2 days. In parallel, we sequenced a panel of 55 high-interest genes and 38 introns with frequent fusion breakpoints such as the TMPRSS2-ERG fusion with high coverage. After intensive testing of our approach with samples from 25 individuals without cancer we analyzed 13 plasma samples derived from five patients with castration resistant (CRPC) and four patients with castration sensitive prostate cancer (CSPC). Results The genome-wide profiling in the plasma of our patients revealed multiple copy number aberrations including those previously reported in prostate tumors, such as losses in 8p and gains in 8q. High-level copy number gains in the AR locus were observed in patients with CRPC but not with CSPC disease. We identified the TMPRSS2-ERG rearrangement associated 3-Mbp deletion on chromosome 21 and found corresponding fusion plasma fragments in these cases. In an index case multiregional sequencing of the primary tumor identified different copy number changes in each sector, suggesting multifocal disease. Our plasma analyses of this index case, performed 13 years after resection of the primary tumor, revealed novel chromosomal rearrangements, which were stable in serial plasma analyses over a 9-month period, which is consistent with the presence of one metastatic clone. Conclusions The genomic landscape of prostate cancer can be established by non-invasive means from plasma DNA. Our approach provides specific genomic signatures within 2 days which may therefore serve as 'liquid biopsy'. PMID:23561577
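
    A genome-wide copy number profile from shallow sequencing is commonly derived by counting reads in fixed genomic bins and comparing each bin with the same bin in cancer-free controls. The abstract does not describe the authors' actual computation, so the following is only a generic sketch of that idea; the function names, the toy bin counts, and the use of a per-bin control median are assumptions.

```python
import math
from statistics import median


def normalize(counts):
    """Convert raw per-bin read counts to fractions of the sample's total reads."""
    total = sum(counts)
    return [c / total for c in counts]


def log2_ratios(sample_counts, control_counts_list):
    """Per-bin log2 ratio of a sample against the median of control samples.

    Assumes every bin has a non-zero count in the sample and in the controls;
    a real pipeline would mask or pad empty bins first.
    """
    sample = normalize(sample_counts)
    controls = [normalize(c) for c in control_counts_list]
    ratios = []
    for i, fraction in enumerate(sample):
        reference = median(ctrl[i] for ctrl in controls)
        ratios.append(math.log2(fraction / reference))
    return ratios


# Toy data: four bins, two controls with flat coverage,
# and a plasma sample with extra reads in the third bin.
controls = [[100, 100, 100, 100], [110, 110, 110, 110]]
plasma = [100, 100, 200, 100]

ratios = log2_ratios(plasma, controls)
gained_bins = [i for i, r in enumerate(ratios) if r > 0.5]
```

    Because counts are normalized to the sample total, a gain in one bin slightly depresses the ratios of the others; segmentation and GC correction, which real analyses typically apply, are omitted here.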

  11. Genomic instability — an evolving hallmark of cancer

    Microsoft Academic Search

    Simona Negrini; Vassilis G. Gorgoulis; Thanos D. Halazonetis

    2010-01-01

    Genomic instability is a characteristic of most cancers. In hereditary cancers, genomic instability results from mutations in DNA repair genes and drives cancer development, as predicted by the mutator hypothesis. In sporadic (non-hereditary) cancers the molecular basis of genomic instability remains unclear, but recent high-throughput sequencing studies suggest that mutations in DNA repair genes are infrequent before therapy, arguing against

  12. Porcine Genomic Sequencing Initiative

    Microsoft Academic Search

    Gary Rohrer; Jonathan E. Beever; Max F. Rothschild; Lawrence Schook; Richard Gibbs; George Weinstock; W. Gregory

    A. Specific biological rationales for the utility of the porcine sequence information Rationale and Objectives. Completion of the human genome sequence provides the starting point for understanding the genetic complexity of humans and how genetic variation contributes to diverse phenotypes and disease. It is clear that model organisms have played an invaluable role in the synthesis of this understanding. It

  13. Microbial genome sequencing

    Microsoft Academic Search

    Claire M. Fraser; Jonathan A. Eisen; Steven L. Salzberg

    2000-01-01

    Complete genome sequences of 30 microbial species have been determined during the past five years, and work in progress indicates that the complete sequences of more than 100 further microbial species will be available in the next two to four years. These results have revealed a tremendous amount of information on the physiology and evolution of microbial species, and should

  14. Identifying driver mutations in sequenced cancer genomes: computational approaches to enable precision medicine

    E-print Network

    Raphael, Ben J.

    Benjamin J Raphael, Jason R Dobson, Layla Oesper and Fabio Vandin. Abstract ... with advances in high-throughput DNA sequencing, are enabling precision medicine approaches to the diagnosis ... These advances hold promise for precision medicine, or precision oncology, where a cancer treatment could ...

  15. Office of Cancer Genomics

    Cancer.gov

    The mission of the NCI’s Office of Cancer Genomics (OCG) is to enhance the understanding of the molecular mechanisms of cancer, advance and accelerate genomics science and technology development, and efficiently translate the genomics data to improve cancer prevention, early detection, diagnosis and treatment.

  16. Office of Cancer Genomics

    Cancer.gov

    This past July, I started a journey into the fields of communications and cancer research when I joined the Office of Cancer Genomics (OCG) as a fellow in the National Cancer Institute (NCI) Health Communications Internship Program (HCIP).

  17. Resources | Office of Cancer Genomics

    Cancer.gov

    The Center for Cancer Genomics (CCG) was established to unify the National Cancer Institute's activities in cancer genomics, with the goal of advancing genomics research and translating findings into the clinic to improve the precise diagnosis and treatment of cancers.

  18. Wheat and Barley Genome Sequencing

    Microsoft Academic Search

    Kellye Eversole; Andreas Graner; Nils Stein

    A high quality reference genome sequence is a prerequisite resource for accessing any gene, driving genomics-based approaches to systems biology, and for efficient exploitation of natural and induced genetic diversity of an organism. Wheat and barley possess genomes of a size that was long presumed to be not amenable for whole genome sequencing. So far, only limited genomic sequencing of

  19. Whole-genome sequencing analysis of phenotypic heterogeneity and anticipation in Li–Fraumeni cancer predisposition syndrome

    PubMed Central

    Ariffin, Hany; Hainaut, Pierre; Puzio-Kuter, Anna; Choong, Soo Sin; Chan, Adelyne Sue Li; Tolkunov, Denis; Rajagopal, Gunaretnam; Kang, Wenfeng; Lim, Leon Li Wen; Krishnan, Shekhar; Chen, Kok-Siong; Achatz, Maria Isabel; Karsa, Mawar; Shamsani, Jannah; Levine, Arnold J.; Chan, Chang S.

    2014-01-01

    The Li–Fraumeni syndrome (LFS) and its variant form (LFL) is a familial predisposition to multiple forms of childhood, adolescent, and adult cancers associated with germ-line mutation in the TP53 tumor suppressor gene. Individual disparities in tumor patterns are compounded by acceleration of cancer onset with successive generations. It has been suggested that this apparent anticipation pattern may result from germ-line genomic instability in TP53 mutation carriers, causing increased DNA copy-number variations (CNVs) with successive generations. To address the genetic basis of phenotypic disparities of LFS/LFL, we performed whole-genome sequencing (WGS) of 13 subjects from two generations of an LFS kindred. Neither de novo CNV nor significant difference in total CNV was detected in relation with successive generations or with age at cancer onset. These observations were consistent with an experimental mouse model system showing that trp53 deficiency in the germ line of father or mother did not increase CNV occurrence in the offspring. On the other hand, individual records on 1,771 TP53 mutation carriers from 294 pedigrees were compiled to assess genetic anticipation patterns (International Agency for Research on Cancer TP53 database). No strictly defined anticipation pattern was observed. Rather, in multigeneration families, cancer onset was delayed in older compared with recent generations. These observations support an alternative model for apparent anticipation in which rare variants from noncarrier parents may attenuate constitutive resistance to tumorigenesis in the offspring of TP53 mutation carriers with late cancer onset. PMID:25313051

  20. Office of Cancer Genomics

    Cancer.gov

    My name is Nicholas Griner and I am the Scientific Program Manager for the Cancer Genome Characterization Initiative (CGCI) in the Office of Cancer Genomics (OCG). Until recently, I spent most of my scientific career working in a cancer research laboratory. In my postdoctoral training, my research focused on identifying novel pathways that contribute to both prostate and breast cancers and studying proteins within these pathways that may be targeted with cancer drugs.

  1. Whole-exome sequencing combined with functional genomics reveals novel candidate driver cancer genes in endometrial cancer

    PubMed Central

    Liang, Han; Cheung, Lydia W.T.; Li, Jie; Ju, Zhenlin; Yu, Shuangxing; Stemke-Hale, Katherine; Dogruluk, Turgut; Lu, Yiling; Liu, Xiuping; Gu, Chao; Guo, Wei; Scherer, Steven E.; Carter, Hannah; Westin, Shannon N.; Dyer, Mary D.; Verhaak, Roeland G.W.; Zhang, Fan; Karchin, Rachel; Liu, Chang-Gong; Lu, Karen H.; Broaddus, Russell R.; Scott, Kenneth L.; Hennessy, Bryan T.; Mills, Gordon B.

    2012-01-01

    Endometrial cancer is the most common gynecological malignancy, with more than 280,000 cases occurring annually worldwide. Although previous studies have identified important common somatic mutations in endometrial cancer, they have primarily focused on a small set of known cancer genes and have thus provided a limited view of the molecular basis underlying this disease. Here we have developed an integrated systems-biology approach to identifying novel cancer genes contributing to endometrial tumorigenesis. We first performed whole-exome sequencing on 13 endometrial cancers and matched normal samples, systematically identifying somatic alterations with high precision and sensitivity. We then combined bioinformatics prioritization with high-throughput screening (including both shRNA-mediated knockdown and expression of wild-type and mutant constructs) in a highly sensitive cell viability assay. Our results revealed 12 potential driver cancer genes including 10 tumor-suppressor candidates (ARID1A, INHBA, KMO, TTLL5, GRM8, IGFBP3, AKTIP, PHKA2, TRPS1, and WNT11) and two oncogene candidates (ERBB3 and RPS6KC1). The results in the “sensor” cell line were recapitulated by siRNA-mediated knockdown in endometrial cancer cell lines. Focusing on ARID1A, we integrated mutation profiles with functional proteomics in 222 endometrial cancer samples, demonstrating that ARID1A mutations frequently co-occur with mutations in the phosphatidylinositol 3-kinase (PI3K) pathway and are associated with PI3K pathway activation. siRNA knockdown in endometrial cancer cell lines increased AKT phosphorylation supporting ARID1A as a novel regulator of PI3K pathway activity. Our study presents the first unbiased view of somatic coding mutations in endometrial cancer and provides functional evidence for diverse driver genes and mutations in this disease. PMID:23028188

  2. Cancer Genomics Overview

    Cancer.gov

    Genomic information about cancer is leading to better diagnoses and treatment strategies that are tailored to patients’ tumors. Precision medicine is the application of genomic insights to a therapeutic approach adapted specifically for each patient.

  3. Complete Genome Sequence of Bacilli bacterium Strain VT-13-104 Isolated from the Intestine of a Patient with Duodenal Cancer

    PubMed Central

    Tetz, Victor

    2015-01-01

    We report the complete genome sequence of Bacilli bacterium strain VT-13-104 isolated from the intestine of a patient with duodenal cancer. The genome is composed of 3,573,421 bp, with a G+C content of 35.7%. It possesses 3,254 predicted protein-coding genes encoding multidrug resistance transporters, resistance to antibiotics, and virulence factors. PMID:26139715
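
    The G+C content reported here is the fraction of called bases that are G or C. A minimal sketch of that calculation follows; the function name and the short test sequence are illustrative assumptions, and the final line only converts the reported 35.7% back into an approximate base count for the reported genome length.

```python
def gc_content(sequence):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = sequence.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return (seq.count("G") + seq.count("C")) / called


# Toy sequence: 2 of 8 called bases are G or C.
toy_fraction = gc_content("ATGCATAT")

# Approximate number of G+C bases implied by the reported figures
# (3,573,421 bp at 35.7% G+C).
implied_gc_bases = round(3_573_421 * 0.357)
```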

  4. Office of Cancer Genomics

    Cancer.gov

    Welcome to the first National Cancer Institute (NCI) Office of Cancer Genomics (OCG) electronic newsletter. We are proud to launch this new communication tool to provide updates on ongoing projects, announce new projects, and highlight how OCG's efforts further the NCI mission to improve the lives of cancer patients by advancing the understanding of cancer's mechanisms at the molecular level.

  5. Towards systematic functional characterization of cancer genomes

    Cancer.gov

    Although in some cases this knowledge immediately illuminates a path towards diagnostic or therapeutic implementation, the bewildering lists of mutations in each tumour make it clear that systematic functional approaches are also necessary to obtain a comprehensive molecular understanding of cancer. Here we review the current range of methods, assays and approaches for genome-scale interrogation of gene function in cancer. We also discuss the integration of functional-genomics approaches with the outputs from cancer genome sequencing efforts.

  6. Projects | Office of Cancer Genomics

    Cancer.gov

    The goal of the Burkitt Lymphoma Genome Sequencing Project (BLGSP) is to explore potential genetic changes in patients with Burkitt lymphoma (BL) that could lead to better prevention, detection, and treatment of this rare and aggressive cancer.

  7. Whole-genome sequencing of bladder cancers reveals somatic CDKN1A mutations and clinicopathological associations with mutation burden

    PubMed Central

    Cazier, J.-B.; Rao, S.R.; McLean, C.M.; Walker, A.L.; Wright, B.J.; Jaeger, E.E.M.; Kartsonaki, C.; Marsden, L.; Yau, C.; Camps, C.; Kaisaki, P.; Allan, Christopher; Attar, Moustafa; Bell, John; Bentley, David; Broxholme, John; Buck, David; Cazier, Jean-Baptiste; Copley, Richard; Cornall, Richard; Donnelly, Peter; Fiddy, Simon; Green, Angie; Gregory, Lorna; Grocock, Russell; Hatton, Edouard; Holmes, Chris; Hughes, Linda; Humburg, Peter; Humphray, Sean; Kanapin, Alexander; Kingsbury, Zoya; Knight, Julian; Lamble, Sarah; Lise, Stefano; Lonie, Lorne; Lunter, Gerton; Martin, Hilary; Murray, Lisa; McCarthy, Davis; McVean, Gil; Pagnamenta, Alistair; Piazza, Paolo; Polanco, Guadelupe; Ratcliffe, Peter; Rimmer, Andy; Sahgal, Natasha; Taylor, Jenny; Tomlinson, Ian; Trebes, Amy; Wilkie, Andrew; Wright, Ben; Yau, Chris; Taylor, J.; Catto, J.W.; Tomlinson, I.P.M.; Kiltie, A.E.; Hamdy, F.C.

    2014-01-01

    Bladder cancers are a leading cause of death from malignancy. Molecular markers might predict disease progression and behaviour more accurately than the available prognostic factors. Here we use whole-genome sequencing to identify somatic mutations and chromosomal changes in 14 bladder cancers of different grades and stages. As well as detecting the known bladder cancer driver mutations, we report the identification of recurrent protein-inactivating mutations in CDKN1A and FAT1. The former are not mutually exclusive with TP53 mutations or MDM2 amplification, showing that CDKN1A dysfunction is not simply an alternative mechanism for p53 pathway inactivation. We find strong positive associations between higher tumour stage/grade and greater clonal diversity, the number of somatic mutations and the burden of copy number changes. In principle, the identification of sub-clones with greater diversity and/or mutation burden within early-stage or low-grade tumours could identify lesions with a high risk of invasive progression. PMID:24777035

  8. Office of Cancer Genomics

    Cancer.gov

    Dr. Louis Staudt, a member of the National Academy of Sciences, is a leading expert in lymphoma research within NCI’s intramural research program. He was recently named the Director of the Center for Cancer Genomics (CCG), the organization that encompasses the Office of Cancer Genomics. In this short interview, Dr. Staudt discusses the objectives, challenges, and future directions of the Center.

  9. NIH Launches Comprehensive Effort to Explore Cancer Genomics | Office of Cancer Genomics

    Cancer.gov

    The National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), both part of the National Institutes of Health (NIH), today launched a comprehensive effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, especially large-scale genome sequencing.

  10. Office of Cancer Genomics

    Cancer.gov

    My name is Subhashini Jagu, and I am the Scientific Program Manager for the Cancer Target Discovery and Development (CTD2) Network at the Office of Cancer Genomics (OCG). In my new role, I help CTD2 work toward its mission, which is to develop new scientific approaches to accelerate the translation of genomic discoveries into new treatments. Collaborative efforts that bring together a variety of expertise and infrastructure are needed to understand and successfully treat cancer, a highly complex disease.

  11. Comparative effectiveness of next generation genomic sequencing for disease diagnosis: Design of a randomized controlled trial in patients with colorectal cancer/polyposis syndromes

    PubMed Central

    Gallego, Carlos J.; Bennette, Caroline S.; Heagerty, Patrick; Comstock, Bryan; Horike-Pyne, Martha; Hisama, Fuki; Amendola, Laura M.; Bennett, Robin L.; Dorschner, Michael O.; Tarczy-Hornoch, Peter; Grady, William M.; Fullerton, S. Malia; Trinidad, Susan B.; Regier, Dean A.; Nickerson, Deborah A.; Burke, Wylie; Patrick, Donald L.; Jarvik, Gail P.; Veenstra, David L.

    2014-01-01

    Whole exome and whole genome sequencing are applications of next generation sequencing transforming clinical care, but there is little evidence whether these tests improve patient outcomes or if they are cost effective compared to current standard of care. These gaps in knowledge can be addressed by comparative effectiveness and patient-centered outcomes research. We designed a randomized controlled trial that incorporates these research methods to evaluate whole exome sequencing compared to usual care in patients being evaluated for hereditary colorectal cancer and polyposis syndromes. Approximately 220 patients will be randomized and followed for 12 months after return of genomic findings. Patients will receive findings associated with colorectal cancer in a first return of result visit, and findings not associated with colorectal cancer (incidental findings) during a second return of result visit. The primary outcome is efficacy to detect mutations associated with these syndromes; secondary outcomes include psychosocial impact, cost-effectiveness and comparative costs. The secondary outcomes will be obtained via surveys before and after each return visit. The expected challenges in conducting this randomized controlled trial include the relatively low prevalence of genetic disease, difficult interpretation of some genetic variants, and uncertainty about which incidental findings should be returned to patients. The approaches utilized in this study may help guide other investigators in clinical genomics to identify useful outcome measures and strategies to address comparative effectiveness questions about the clinical implementation of genomic sequencing in clinical care. PMID:24997220

  12. Comparative effectiveness of next generation genomic sequencing for disease diagnosis: design of a randomized controlled trial in patients with colorectal cancer/polyposis syndromes.

    PubMed

    Gallego, Carlos J; Bennette, Caroline S; Heagerty, Patrick; Comstock, Bryan; Horike-Pyne, Martha; Hisama, Fuki; Amendola, Laura M; Bennett, Robin L; Dorschner, Michael O; Tarczy-Hornoch, Peter; Grady, William M; Fullerton, S Malia; Trinidad, Susan B; Regier, Dean A; Nickerson, Deborah A; Burke, Wylie; Patrick, Donald L; Jarvik, Gail P; Veenstra, David L

    2014-09-01

    Whole exome and whole genome sequencing are applications of next generation sequencing transforming clinical care, but there is little evidence whether these tests improve patient outcomes or if they are cost effective compared to current standard of care. These gaps in knowledge can be addressed by comparative effectiveness and patient-centered outcomes research. We designed a randomized controlled trial that incorporates these research methods to evaluate whole exome sequencing compared to usual care in patients being evaluated for hereditary colorectal cancer and polyposis syndromes. Approximately 220 patients will be randomized and followed for 12 months after return of genomic findings. Patients will receive findings associated with colorectal cancer in a first return of results visit, and findings not associated with colorectal cancer (incidental findings) during a second return of results visit. The primary outcome is efficacy to detect mutations associated with these syndromes; secondary outcomes include psychosocial impact, cost-effectiveness and comparative costs. The secondary outcomes will be obtained via surveys before and after each return visit. The expected challenges in conducting this randomized controlled trial include the relatively low prevalence of genetic disease, difficult interpretation of some genetic variants, and uncertainty about which incidental findings should be returned to patients. The approaches utilized in this study may help guide other investigators in clinical genomics to identify useful outcome measures and strategies to address comparative effectiveness questions about the clinical implementation of genomic sequencing in clinical care. PMID:24997220

  13. Somatic retrotransposition in the cancer genome

    E-print Network

    Helman, Elena

    2014-01-01

    Cancer is a complex disease of the genome exhibiting myriad somatic mutations, from single nucleotide changes to various chromosomal rearrangements. The technological advances of next-generation sequencing enable high-throughput ...

  14. Pig genome sequence - analysis and publication strategy

    Microsoft Academic Search

    Alan L Archibald; Lars Bolund; Carol Churcher; Merete Fredholm; Martien AM Groenen; Barbara Harlizius; Kyung-Tai Lee; Denis Milan; Jane Rogers; Max F Rothschild; Hirohide Uenishi; Jun Wang; Lawrence B Schook

    2010-01-01

    BACKGROUND: The pig genome is being sequenced and characterised under the auspices of the Swine Genome Sequencing Consortium. The sequencing strategy followed a hybrid approach combining hierarchical shotgun sequencing of BAC clones and whole genome shotgun sequencing. RESULTS: Assemblies of the BAC clone derived genome sequence have been annotated using the Pre-Ensembl and Ensembl automated pipelines and made accessible through

  15. Data Policies | Office of Cancer Genomics

    Cancer.gov

    OCG accelerates the discovery and development of better cancer diagnosis and treatment strategies by making data and materials from its programs available to the cancer research community. OCG enables researchers to search and download data generated by its active programs in databases that are easily accessible through program-specific data matrices. For the tumor genome characterization initiatives, CGCI and TARGET, the datasets contain clinical information, genomic characterization data, and high-throughput sequencing analysis of tumor genomes.

  16. Whole-genome sequences of DA and F344 rats with different susceptibilities to arthritis, autoimmunity, inflammation and cancer.

    PubMed

    Guo, Xiaosen; Brenner, Max; Zhang, Xuemei; Laragione, Teresina; Tai, Shuaishuai; Li, Yanhong; Bu, Junjie; Yin, Ye; Shah, Anish A; Kwan, Kevin; Li, Yingrui; Jun, Wang; Gulko, Pércio S

    2013-08-01

    DA (D-blood group of Palm and Agouti, also known as Dark Agouti) and F344 (Fischer) are two inbred rat strains with differences in several phenotypes, including susceptibility to autoimmune disease models and inflammatory responses. While these strains have been extensively studied, little information is available about the DA and F344 genomes, as only the Brown Norway (BN) and spontaneously hypertensive rat strains have been sequenced to date. Here we report the sequencing of the DA and F344 genomes using next-generation Illumina paired-end read technology and the first de novo assembly of a rat genome. DA and F344 were sequenced with an average depth of 32-fold, covered 98.9% of the BN reference genome, and included 97.97% of known rat ESTs. New sequences could be assigned to 59 million positions with previously unknown data in the BN reference genome. Differences between DA, F344, and BN included 19 million positions in novel scaffolds, 4.09 million single nucleotide polymorphisms (SNPs) (including 1.37 million new SNPs), 458,224 short insertions and deletions, and 58,174 structural variants. Genetic differences between DA, F344, and BN, including high-impact SNPs and short insertions and deletions affecting >2500 genes, are likely to account for most of the phenotypic variation between these strains. The new DA and F344 genome sequencing data should facilitate gene discovery efforts in rat models of human disease. PMID:23695301

  17. Office of Cancer Genomics

    Cancer.gov

    Ringing in the New Year is always a time for reflection. With several individual projects stirring in the Office and a new Center for Cancer Genomics recently inaugurated, 2011 was a prodigious year for OCG.

  18. Whole-genome sequencing of Asian lung cancers: second-hand smoke unlikely to be responsible for higher incidence of lung cancer among Asian never-smokers.

    PubMed

    Krishnan, Vidhya G; Ebert, Philip J; Ting, Jason C; Lim, Elaine; Wong, Swee-Seong; Teo, Audrey S M; Yue, Yong G; Chua, Hui-Hoon; Ma, Xiwen; Loh, Gary S L; Lin, Yuhao; Tan, Joanna H J; Yu, Kun; Zhang, Shenli; Reinhard, Christoph; Tan, Daniel S W; Peters, Brock A; Lincoln, Stephen E; Ballinger, Dennis G; Laramie, Jason M; Nilsen, Geoffrey B; Barber, Thomas D; Tan, Patrick; Hillmer, Axel M; Ng, Pauline C

    2014-11-01

    Asian nonsmoking populations have a higher incidence of lung cancer compared with their European counterparts. There is a long-standing hypothesis that the increase of lung cancer in Asian never-smokers is due to environmental factors such as second-hand smoke. We analyzed whole-genome sequencing of 30 Asian lung cancers. Unsupervised clustering of mutational signatures separated the patients into two categories of either all the never-smokers or all the smokers or ex-smokers. In addition, nearly one third of the ex-smokers and smokers classified with the never-smoker-like cluster. The somatic variant profiles of Asian lung cancers were similar to those of European origin, with G·C>T·A being predominant in smokers. We found EGFR and TP53 to be the most frequently mutated genes with mutations in 50% and 27% of individuals, respectively. Among the 16 never-smokers, 69% had an EGFR mutation compared with 29% of 14 smokers/ex-smokers. Asian never-smokers had lung cancer signatures distinct from the smoker signature and their mutation profiles were similar to European never-smokers. The profiles of Asian and European smokers are also similar. Taken together, these results suggested that the same mutational mechanisms underlie the etiology for both ethnic groups. Thus, the high incidence of lung cancer in Asian never-smokers seems unlikely to be due to second-hand smoke or other carcinogens that cause oxidative DNA damage, implying that routine EGFR testing is warranted in the Asian population regardless of smoking status. PMID:25189529
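
    The EGFR percentages quoted in the abstract above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the per-group counts are not given in the abstract and are obtained here by rounding the stated percentages, which is an assumption.

```python
# Group sizes as stated in the abstract.
never_smokers = 16
smokers_or_ex = 14

# ASSUMPTION: mutated-case counts are recovered by rounding the
# reported percentages (69% and 29%); the abstract does not list them.
egfr_never = round(0.69 * never_smokers)
egfr_smokers = round(0.29 * smokers_or_ex)

# Overall EGFR mutation frequency implied by the two groups.
overall = (egfr_never + egfr_smokers) / (never_smokers + smokers_or_ex)

print(egfr_never, egfr_smokers, f"{overall:.0%}")
```

    Under that rounding assumption, the two subgroup figures agree with the overall EGFR frequency reported for the 30 tumors.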

  19. Burkitt Lymphoma | Office of Cancer Genomics

    Cancer.gov

    The goal of the Burkitt Lymphoma Genome Sequencing Project (BLGSP) is to explore potential genetic changes in patients with Burkitt lymphoma (BL) that could lead to better prevention, detection, and treatment of this rare and aggressive cancer. The Office of Cancer Genomics (OCG) at the National Cancer Institute (NCI) initiated BLGSP in collaboration with the Foundation for Burkitt Lymphoma Research.

  20. Somatic retrotransposition in human cancer revealed by whole-genome and exome sequencing

    E-print Network

    Helman, Elena

    Retrotransposons constitute a major source of genetic variation, and somatic retrotransposon insertions have been reported in cancer. Here, we applied TranspoSeq, a computational framework that identifies retrotransposon ...

  1. Whole-exome sequencing combined with functional genomics reveals novel candidate driver cancer genes in endometrial cancer

    Cancer.gov

    Endometrial cancer is the most common gynecological malignancy, with more than 280,000 cases occurring annually worldwide. Although previous studies have identified important common somatic mutations in endometrial cancer, they have primarily focused on a small set of known cancer genes and have thus provided a limited view of the molecular basis underlying this disease. Here we have developed an integrated systems-biology approach to identifying novel cancer genes contributing to endometrial tumorigenesis.

  2. Punctuated Evolution of Prostate Cancer Genomes

    PubMed Central

    Baca, Sylvan C.; Prandi, Davide; Lawrence, Michael S.; Mosquera, Juan Miguel; Romanel, Alessandro; Drier, Yotam; Park, Kyung; Kitabayashi, Naoki; MacDonald, Theresa Y.; Ghandi, Mahmoud; Van Allen, Eliezer; Kryukov, Gregory V.; Sboner, Andrea; Theurillat, Jean-Philippe; Soong, T. David; Nickerson, Elizabeth; Auclair, Daniel; Tewari, Ashutosh; Beltran, Himisha; Onofrio, Robert C.; Boysen, Gunther; Guiducci, Candace; Barbieri, Christopher E.; Cibulskis, Kristian; Sivachenko, Andrey; Carter, Scott L.; Saksena, Gordon; Voet, Douglas; Ramos, Alex H; Winckler, Wendy; Cipicchio, Michelle; Ardlie, Kristin; Kantoff, Philip W.; Berger, Michael F.; Gabriel, Stacey B.; Golub, Todd R.; Meyerson, Matthew; Lander, Eric S.; Elemento, Olivier; Getz, Gad; Demichelis, Francesca; Rubin, Mark A.; Garraway, Levi A.

    2013-01-01

    SUMMARY The analysis of exonic DNA from prostate cancers has identified recurrently mutated genes, but the spectrum of genome-wide alterations has not been profiled extensively in this disease. We sequenced the genomes of 57 prostate tumors and matched normal tissues to characterize somatic alterations and to study how they accumulate during oncogenesis and progression. By modeling the genesis of genomic rearrangements, we identified abundant DNA translocations and deletions that arise in a highly interdependent manner. This phenomenon, which we term “chromoplexy”, frequently accounts for the dysregulation of prostate cancer genes and appears to disrupt multiple cancer genes coordinately. Our modeling suggests that chromoplexy may induce considerable genomic derangement over relatively few events in prostate cancer and other neoplasms, supporting a model of punctuated cancer evolution. By characterizing the clonal hierarchy of genomic lesions in prostate tumors, we charted a path of oncogenic events along which chromoplexy may drive prostate carcinogenesis. PMID:23622249

  3. Using the Potato Genome Sequence. Robin Buell

    E-print Network

    Douches, David S.

    Using the Potato Genome Sequence. Robin Buell, Michigan State University, Department of Plant Biology, August 15, 2010, buell@msu.edu. Whole Genome Shotgun Sequencing. New genomics & post-genomic biology genomes genera 2002 2010. So, you say you can sequence-Now what

  4. Integrating sequence, evolution and functional genomics in regulatory genomics

    PubMed Central

    Vingron, Martin; Brazma, Alvis; Coulson, Richard; van Helden, Jacques; Manke, Thomas; Palin, Kimmo; Sand, Olivier; Ukkonen, Esko

    2009-01-01

    With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome. PMID:19226437

  5. Office of Cancer Genomics

    Cancer.gov

    It's April, and that can only mean one thing at the NCI. No, it's not the DC Cherry Blossom Festival, but the Annual Meeting of the American Association for Cancer Research (AACR). This year's gathering of oncology-laden minds was inundated with a plethora of multiple symposia, educational and scientific sessions, workshops, talks and poster presentations that revolved around the theme of cancer genomics. The opening plenary session featured the NCI Director, Dr.

  6. Office of Cancer Genomics

    Cancer.gov

    The Office of Cancer Genomics is proud to regularly support internship programs including The Health Communications Internship Program (HCIP). This past July the OCG welcomed a new HCIP intern to a one-year appointment. Gene Gillespie earned his Ph.D. from UCLA in 2011 and is interested in pursuing a career in science and medical writing. He presents a few personal and scientific thoughts on cancer in this month’s eNews perspective.

  7. The UCSC Cancer Genomics Browser: update 2013.

    PubMed

    Goldman, Mary; Craft, Brian; Swatloski, Teresa; Ellrott, Kyle; Cline, Melissa; Diekhans, Mark; Ma, Singer; Wilks, Chris; Stuart, Josh; Haussler, David; Zhu, Jingchun

    2013-01-01

    The UCSC Cancer Genomics Browser (https://genome-cancer.ucsc.edu/) is a set of web-based tools to display, investigate and analyse cancer genomics data and its associated clinical information. The browser provides whole-genome to base-pair level views of several different types of genomics data, including some next-generation sequencing platforms. The ability to view multiple datasets together allows users to make comparisons across different data and cancer types. Biological pathways, collections of genes, genomic or clinical information can be used to sort, aggregate and zoom into a group of samples. We currently display an expanding set of data from various sources, including 201 datasets from 22 TCGA (The Cancer Genome Atlas) cancers as well as data from Cancer Cell Line Encyclopedia and Stand Up To Cancer. New features include a completely redesigned user interface with an interactive tutorial and updated documentation. We have also added data downloads, additional clinical heatmap features, and an updated Tumor Image Browser based on Google Maps. New security features allow authenticated users access to private datasets hosted by several different consortia through the public website. PMID:23109555

  8. Whole genome sequencing in pharmacogenomics.

    PubMed

    Katsila, Theodora; Patrinos, George P

    2015-01-01

    Pharmacogenomics aims to shed light on the role of genes and genomic variants in clinical treatment response. Although several drug-gene relationships have been characterized to date, many challenges still remain toward the application of pharmacogenomics in the clinic; clinical guidelines for pharmacogenomic testing are still in their infancy, whereas the emerging high throughput genotyping technologies produce a tsunami of new findings. Herein, the potential of whole genome sequencing in pharmacogenomics research and clinical application is highlighted. PMID:25859217

  9. Draft Genome Sequences of 24 Microbial Strains Assembled from Direct Sequencing from 4 Stool Samples

    PubMed Central

    Hernández, Álvaro; White, Bryan A.; O’Brien, Daniel; Ahlquist, David; Boardman, Lisa

    2015-01-01

    The ability to assemble genomes from metagenomic sequencing avoids the need for culture and any associated culture biases. We assembled 24 essentially complete draft genomes from metagenomic pair-end and size-selected mate pair sequencing from 4 stool samples, 2 from subjects diagnosed with colorectal cancer and 2 from healthy controls. PMID:26021920

  10. Draft genome sequences of 24 microbial strains assembled from direct sequencing from 4 stool samples.

    PubMed

    Jeraldo, Patricio; Hernández, Álvaro; White, Bryan A; O'Brien, Daniel; Ahlquist, David; Boardman, Lisa; Chia, Nicholas

    2015-01-01

    The ability to assemble genomes from metagenomic sequencing avoids the need for culture and any associated culture biases. We assembled 24 essentially complete draft genomes from metagenomic pair-end and size-selected mate pair sequencing from 4 stool samples, 2 from subjects diagnosed with colorectal cancer and 2 from healthy controls. PMID:26021920

  11. Sequencing Complex Genomic Regions

    SciTech Connect

    Eichler, Evan [University of Washington]

    2009-05-28

    Evan Eichler, Howard Hughes Medical Investigator at the University of Washington, gives the May 28, 2009 keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM. Part 1 of 2

  12. Sequencing Complex Genomic Regions

    SciTech Connect

    Eichler, Evan [University of Washington]

    2009-05-28

    Evan Eichler, Howard Hughes Medical Investigator at the University of Washington, gives the May 28, 2009 keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM. Part 2 of 2

  13. Educational Resources | Office of Cancer Genomics

    Cancer.gov

    The Center for Cancer Genomics (CCG) was established to unify the National Cancer Institute's activities in cancer genomics, with the goal of advancing genomics research and translating findings into the clinic to improve the precise diagnosis and treatment of cancers.

  14. Next generation sequencing in cancer research and clinical application

    PubMed Central

    2013-01-01

    The wide application of next-generation sequencing (NGS), mainly through whole genome, exome and transcriptome sequencing, provides a high-resolution and global view of the cancer genome. Coupled with powerful bioinformatics tools, NGS promises to revolutionize cancer research, diagnosis and therapy. In this paper, we review the recent advances in NGS-based cancer genomic research as well as clinical application, summarize the current integrative oncogenomic projects, resources and computational algorithms, and discuss the challenge and future directions in the research and clinical application of cancer genomic sequencing. PMID:23406336

  15. Genome Sequence of Salmonella Phage χ

    PubMed Central

    Ko, Ching-Chung; Jacobs-Sera, Deborah; Hatfull, Graham F.; Erhardt, Marc; Hughes, Kelly T.; Casjens, Sherwood R.

    2015-01-01

    Salmonella bacteriophage χ is a member of the Siphoviridae family that gains entry into its host cells by adsorbing to their flagella. We report the complete 59,578-bp sequence of the genome of phage χ, which, together with its relatives, exemplifies a largely unexplored type of tailed bacteriophage. PMID:25720684

  16. Genome Sequence of Mycobacteriophage Momo

    PubMed Central

    Bina, Elizabeth A.; Brahme, Indraneel S.; Hill, Amy B.; Himmelstein, Philip H.; Hunsicker, Sara M.; Ish, Amanda R.; Le, Tinh S.; Martin, Mary M.; Moscinski, Catherine N.; Shetty, Sameer A.; Swierzewski, Tomasz; Iyengar, Varun B.; Kim, Hannah; Schafer, Claire E.; Grubb, Sarah R.; Warner, Marcie H.; Bowman, Charles A.; Russell, Daniel A.; Hatfull, Graham F.

    2015-01-01

    Momo is a newly discovered phage of Mycobacterium smegmatis mc²155. Momo has a double-stranded DNA genome 154,553 bp in length, with 233 predicted protein-encoding genes, 34 tRNA genes, and one transfer-messenger RNA (tmRNA) gene. Momo has a myoviral morphology and shares extensive nucleotide sequence similarity with subcluster C1 mycobacteriophages.

  17. Genome Sequence of Mycobacteriophage Phayonce

    PubMed Central

    Jacobetz, Emily; Johnson, Courtney A.; Kihle, Brooke L.; Sobeski, Margaret A.; Werner, Madison B.; Adkins, Nancy L.; Kramer, Zachary J.; Montgomery, Matthew T.; Grubb, Sarah R.; Warner, Marcie H.; Bowman, Charles A.; Russell, Daniel A.; Hatfull, Graham F.

    2015-01-01

    Mycobacteriophage Phayonce is a newly isolated phage recovered from a soil sample in Pittsburgh, PA, using Mycobacterium smegmatis mc²155 as a host. Phayonce’s genome is 49,203 bp long and contains 77 protein-coding genes, 23 of them having predicted functions. Phayonce shares a strong similarity in nucleotide sequence with phages of cluster P.

  18. Colon cancer-derived oncogenic EGFR G724S mutant identified by whole genome sequence analysis is dependent on asymmetric dimerization and sensitive to cetuximab

    PubMed Central

    2014-01-01

    Background Inhibition of the activated epidermal growth factor receptor (EGFR) with either enzymatic kinase inhibitors or anti-EGFR antibodies such as cetuximab, is an effective modality of treatment for multiple human cancers. Enzymatic EGFR inhibitors are effective for lung adenocarcinomas with somatic kinase domain EGFR mutations while, paradoxically, anti-EGFR antibodies are more effective in colon and head and neck cancers where EGFR mutations occur less frequently. In colorectal cancer, anti-EGFR antibodies are routinely used as second-line therapy of KRAS wild-type tumors. However, detailed mechanisms and genomic predictors for pharmacological response to these antibodies in colon cancer remain unclear. Findings We describe a case of colorectal adenocarcinoma, which was found to harbor a kinase domain mutation, G724S, in EGFR through whole genome sequencing. We show that G724S mutant EGFR is oncogenic and that it differs from classic lung cancer derived EGFR mutants in that it is cetuximab responsive in vitro, yet relatively insensitive to small molecule kinase inhibitors. Through biochemical and cellular pharmacologic studies, we have determined that cells harboring the colon cancer-derived G719S and G724S mutants are responsive to cetuximab therapy in vitro and found that the requirement for asymmetric dimerization of these mutant EGFR to promote cellular transformation may explain their greater inhibition by cetuximab than small-molecule kinase inhibitors. Conclusion The colon-cancer derived G719S and G724S mutants are oncogenic and sensitive in vitro to cetuximab. These data suggest that patients with these mutations may benefit from the use of anti-EGFR antibodies as part of the first-line therapy. PMID:24894453

  19. Genome Sequences of Eight Morphologically Diverse Alphaproteobacteria

    PubMed Central

    Brown, Pamela J. B.; Kysela, David T.; Buechlein, Aaron; Hemmerich, Chris; Brun, Yves V.

    2011-01-01

    The Alphaproteobacteria comprise morphologically diverse bacteria, including many species of stalked bacteria. Here we announce the genome sequences of eight alphaproteobacteria, including the first genome sequences of species belonging to the genera Asticcacaulis, Hirschia, Hyphomicrobium, and Rhodomicrobium. PMID:21705585

  20. The Sequence of the Human Genome

    Microsoft Academic Search

    J. Craig Venter; Mark D. Adams; Eugene W. Myers; Peter W. Li; Richard J. Mural; Granger G. Sutton; Hamilton O. Smith; Mark Yandell; Cheryl A. Evans; Robert A. Holt; Jeannine D. Gocayne; Peter Amanatides; Richard M. Ballew; Daniel H. Huson; Jennifer R. Wortman; Qing Zhang; Chinnappa D. Kodira; Xiangqun H. Zheng; Lin Chen; Marian Skupski; Gangadharan Subramanian; Paul D. Thomas; Jinghui Zhang; George L. Gabor Miklos; Catherine Nelson; Samuel Broder; Andrew G. Clark; Joe Nadeau; Victor A. McKusick; Norton Zinder; Arnold J. Levine; Mel Simon; Carolyn Slayman; Michael Hunkapiller; Randall Bolanos; Arthur Delcher; Ian Dew; Daniel Fasulo; Michael Flanigan; Liliana Florea; Aaron Halpern; Sridhar Hannenhalli; Saul Kravitz; Samuel Levy; Clark Mobarry; Knut Reinert; Karin Remington; Jane Abu-Threideh; Ellen Beasley; Kendra Biddick; Vivien Bonazzi; Rhonda Brandon; Michele Cargill; Ishwar Chandramouliswaran; Rosane Charlab; Kabir Chaturvedi; Zuoming Deng; Valentina Di Francesco; Patrick Dunn; Karen Eilbeck; Carlos Evangelista; Andrei E. Gabrielian; Weiniu Gan; Wangmao Ge; Fangcheng Gong; Zhiping Gu; Ping Guan; Thomas J. Heiman; Maureen E. Higgins; Rui-Ru Ji; Zhaoxi Ke; Karen A. Ketchum; Zhongwu Lai; Yiding Lei; Zhenya Li; Jiayin Li; Yong Liang; Xiaoying Lin; Fu Lu; Gennady V. Merkulov; Natalia Milshina; Helen M. Moore; Ashwinikumar K Naik; Vaibhav A. Narayan; Beena Neelam; Deborah Nusskern; Douglas B. Rusch; Steven Salzberg; Wei Shao; Bixiong Shue; Jingtao Sun; Zhen Yuan Wang; Aihui Wang; Xin Wang; Jian Wang; Ming-Hui Wei; Ron Wides; Chunlin Xiao; Chunhua Yan; Alison Yao; Jane Ye; Ming Zhan; Weiqing Zhang; Hongyu Zhang; Qi Zhao; Liansheng Zheng; Fei Zhong; Wenyan Zhong; Shiaoping C. 
Zhu; Shaying Zhao; Dennis Gilbert; Suzanna Baumhueter; Gene Spier; Christine Carter; Anibal Cravchik; Trevor Woodage; Feroze Ali; Huijin An; Aderonke Awe; Danita Baldwin; Holly Baden; Mary Barnstead; Ian Barrow; Karen Beeson; Dana Busam; Amy Carver; Ming Lai Cheng; Liz Curry; Steve Danaher; Lionel Davenport; Raymond Desilets; Susanne Dietz; Kristina Dodson; Lisa Doup; Steven Ferriera; Neha Garg; Andres Gluecksmann; Brit Hart; Jason Haynes; Charles Haynes; Cheryl Heiner; Suzanne Hladun; Damon Hostin; Jarrett Houck; Timothy Howland; Chinyere Ibegwam; Jeffery Johnson; Francis Kalush; Lesley Kline; Shashi Koduru; Amy Love; Felecia Mann; David May; Steven McCawley; Tina McIntosh; Ivy McMullen; Mee Moy; Linda Moy; Brian Murphy; Keith Nelson; Cynthia Pfannkoch; Eric Pratts; Vinita Puri; Hina Qureshi; Matthew Reardon; Robert Rodriguez; Yu-Hui Rogers; Deanna Romblad; Bob Ruhfel; Richard Scott; Cynthia Sitter; Michelle Smallwood; Erin Stewart; Renee Strong; Ellen Suh; Reginald Thomas; Ni Ni Tint; Sukyee Tse; Claire Vech; Gary Wang; Jeremy Wetter; Sherita Williams; Monica Williams; Sandra Windsor; Emily Winn-Deen; Keriellen Wolfe; Jayshree Zaveri; Karena Zaveri; Josep F. Abril; Roderic Guigo; Michael J. Campbell; Kimmen V. 
Sjolander; Brian Karlak; Anish Kejariwal; Huaiyu Mi; Betty Lazareva; Thomas Hatton; Apurva Narechania; Karen Diemer; Anushya Muruganujan; Nan Guo; Shinji Sato; Vineet Bafna; Sorin Istrail; Ross Lippert; Russell Schwartz; Brian Walenz; Shibu Yooseph; David Allen; Anand Basu; James Baxendale; Louis Blick; Marcelo Caminha; John Carnes-Stine; Parris Caulk; Yen-Hui Chiang; Carl Dahlke; Anne Deslattes Mays; Maria Dombroski; Michael Donnelly; Dale Ely; Shiva Esparham; Carl Fosler; Harold Gire; Stephen Glanowski; Kenneth Glasser; Anna Glodek; Mark Gorokhov; Ken Graham; Barry Gropman; Michael Harris; Jeremy Heil; Scott Henderson; Jeffrey Hoover; Donald Jennings; John Kasha; Leonid Kagan; Cheryl Kraft; Alexander Levitsky; Mark Lewis; Xiangjun Liu; John Lopez; Daniel Ma; William Majoros; Joe McDaniel; Sean Murphy; Matthew Newman; Trung Nguyen; Ngoc Nguyen; Marc Nodell; Sue Pan; Jim Peck; Marshall Peterson; William Rowe; Robert Sanders; John Scott; Michael Simpson; Thomas Smith; Arlan Sprague; Timothy Stockwell; Russell Turner; Eli Venter; Mei Wang; Meiyuan Wen; David Wu; Mitchell Wu; Ashley Xia; Ali Zandieh; Xiaohong Zhu

    2001-01-01

    A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies—a whole-genome
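
    As a rough sanity check on the figures quoted in this abstract, the mean read length and fold coverage can be derived from the totals. This is a back-of-the-envelope sketch, not the authors' calculation; in particular, dividing by the 2.91-billion bp consensus length is an assumption, and the abstract's own 5.11-fold figure presumably uses a slightly different genome-size denominator.

```python
# Totals as stated in the abstract.
total_bases = 14.8e9          # bp of sequence generated
num_reads = 27_271_853        # high-quality sequence reads
consensus_length = 2.91e9     # bp in the consensus sequence

# Mean read length: total sequence divided by number of reads.
mean_read_length = total_bases / num_reads

# ASSUMPTION: fold coverage is approximated here as total sequence
# divided by the consensus length; the abstract reports 5.11-fold.
fold_coverage = total_bases / consensus_length

print(f"mean read length ~ {mean_read_length:.0f} bp")
print(f"fold coverage ~ {fold_coverage:.2f}x")
```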

  1. Update on the Maize Genome Sequencing Project

    E-print Network

    Brendel, Volker

    Update on the Maize Genome Sequencing Project The Maize Genome Sequencing Project Vicki L. Chandler Genome Sequencing Project. The momentum for this endeavor has been building within the maize (Zea mays and human genomes (Gregory et al., 2002). Our current picture of the maize genome is largely derived from

  2. Comparative Sequencing of Plant Genomes: Choices to Make

    E-print Network

    Purugganan, Michael D.

    COMMENTARY Comparative Sequencing of Plant Genomes: Choices to Make The first sequenced genome of a plant, Arabidopsis thaliana, was published ~6 years ago (Arabidopsis Genome Initiative, 2000). Since Information Entrez Genome Projects website reports that sequencing of several more plant genomes is in prog

  3. Programs | Office of Cancer Genomics

    Cancer.gov

    OCG facilitates cancer genomics research through a series of highly-focused programs. These programs generate and disseminate genomic data for use by the cancer research community. OCG programs also promote advances in technology-based infrastructure and create valuable experimental reagents and tools. OCG programs encourage collaboration by interconnecting with other genomics and cancer projects in order to accelerate translation of findings into the clinic.

  4. Genomic medicine for cancer diagnosis.

    PubMed

    Gordon, Benjamin L; Finnerty, Brendan M; Aronova, Anna; Fahey, Thomas J

    2015-01-01

    Genomic diagnostics in cancer has evolved since the completion of the Human Genome Project and the advancements made in diagnosis and therapy in chronic myelogenous leukemia. Among the diseases to achieve limited success or potentially benefit from diagnostic genetic testing are thyroid cancer, Burkitt's lymphoma, gastrointestinal stromal tumors, adrenocortical carcinoma, and colorectal cancer. With increased understanding of genomics, genetic tests should improve diagnosis and help guide medical and surgical management. PMID:25346009

  5. A sequence-based survey of the complex structural organization of tumor genomes

    Microsoft Academic Search

    Benjamin J Raphael; Stanislav Volik; Peng Yu; Chunxiao Wu; Guiqing Huang; Elena V Linardopoulou; Barbara J Trask; Frederic Waldman; Joseph Costello; Kenneth J Pienta; Gordon B Mills; Krystyna Bajsarowicz; Yasuko Kobayashi; Shivaranjani Sridharan; Pamela L Paris; Quanzhou Tao; Sarah J Aerni; Raymond P Brown; Ali Bashir; Joe W Gray; Jan-Fang Cheng; Pieter de Jong; Mikhail Nefedov; Thomas Ried; Hesed M Padilla-Nash; Colin C Collins

    2008-01-01

    BACKGROUND: The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using end sequencing profiling, which relies on paired-end sequencing of cloned tumor genomes. RESULTS: In the present study brain, breast, ovary, and prostate tumors, along with three breast cancer cell lines, were surveyed using end sequencing profiling, yielding the largest available

  6. The genomic evolution of human prostate cancer.

    PubMed

    Mitchell, T; Neal, D E

    2015-07-14

    Prostate cancers are highly prevalent in the developed world, with inheritable risk contributing appreciably to tumour development. Genomic heterogeneity within individual prostate glands and between patients derives predominantly from structural variants and copy-number aberrations. Subtypes of prostate cancers are being delineated through the increasing use of next-generation sequencing, but these subtypes are yet to be used to guide the prognosis or therapeutic strategy. Herein, we review our current knowledge of the mutational landscape of human prostate cancer, describing what is known of the common mutations underpinning its development. We evaluate recurrent prostate-specific mutations prior to discussing the mutational events that are shared both in prostate cancer and across multiple cancer types. From these data, we construct a putative overview of the genomic evolution of human prostate cancer. PMID:26125442

  7. Complementary DNA Sequencing: Expressed Sequence Tags and Human Genome Project

    Microsoft Academic Search

    Mark D. Adams; Jenny M. Kelley; Jeannine D. Gocayne; Mark Dubnick; Mihael H. Polymeropoulos; Hong Xiao; Carl R. Merril; Andrew Wu; Bjorn Olde; Ruben F. Moreno; Anthony R. Kerlavage; W. Richard McCombie; J. Craig Venter

    1991-01-01

    Automated partial DNA sequencing was conducted on more than 600 randomly selected human brain complementary DNA (cDNA) clones to generate expressed sequence tags (ESTs). ESTs have applications in the discovery of new human genes, mapping of the human genome, and identification of coding regions in genomic sequences. Of the sequences generated, 337 represent new genes, including 48 with significant similarity

  8. Cancer Genomics Research Laboratory

    Cancer.gov

    CGR’s high throughput laboratory is equipped with state-of-the-art laboratory equipment and automation systems for a large number of applications. CGR supports DCEG in all stages of cancer research from planning to publishing, including experimental design and project management, sample handling, genotyping and sequencing assay design and execution, development and implementation of bioinformatic pipelines, and downstream scientific research and analytical support.

  9. Genome-wide analysis of aberrant methylation in human breast cancer cells using methyl-DNA immunoprecipitation combined with high-throughput sequencing

    PubMed Central

    2010-01-01

    Background Cancer cells undergo massive alterations to their DNA methylation patterns that result in aberrant gene expression and malignant phenotypes. However, the mechanisms that underlie methylome changes are not well understood nor is the genomic distribution of DNA methylation changes well characterized. Results Here, we performed methylated DNA immunoprecipitation combined with high-throughput sequencing (MeDIP-seq) to obtain whole-genome DNA methylation profiles for eight human breast cancer cell (BCC) lines and for normal human mammary epithelial cells (HMEC). The MeDIP-seq analysis generated non-biased DNA methylation maps by covering almost the entire genome with sufficient depth and resolution. The most prominent feature of the BCC lines compared to HMEC was a massively reduced methylation level particularly in CpG-poor regions. While hypomethylation did not appear to be associated with particular genomic features, hypermethylation preferentially occurred at CpG-rich gene-related regions independently of the distance from transcription start sites. We also investigated methylome alterations during epithelial-to-mesenchymal transition (EMT) in MCF7 cells. EMT induction was associated with specific alterations to the methylation patterns of gene-related CpG-rich regions, although overall methylation levels were not significantly altered. Moreover, approximately 40% of the epithelial cell-specific methylation patterns in gene-related regions were altered to those typical of mesenchymal cells, suggesting a cell-type specific regulation of DNA methylation. Conclusions This study provides the most comprehensive analysis to date of the methylome of human mammary cell lines and has produced novel insights into the mechanisms of methylome alteration during tumorigenesis and the interdependence between DNA methylome alterations and morphological changes. PMID:20181289

  10. Genome sequencing and functional genomics approaches in tomato

    Microsoft Academic Search

    Daisuke Shibata

    2005-01-01

    Tomato genome sequencing has been taking place through an international, 10-year initiative entitled the “International Solanaceae Genome Project” (SOL). The strategy proposed by the SOL consortium is to sequence the approximately 220 Mb of euchromatin that contains the majority of genes, rather than the entire tomato genome. Tomato and other Solanaceae plants have unique developmental aspects, such as the formation of

  11. Genomic Sequence Analysis Using Gap Sequences and Pattern Filtering

    Microsoft Academic Search

    Shih-chieh Su; Chia H. Yeh; C.-C. Jay Kuo

    2003-01-01

    A new pattern filtering technique is developed to analyze the genomic sequence in this research based on gap sequences, in which the distance of the same symbol is recorded consecutively as a sequence of integers. Sequence alignment and similarity testing can be performed on a family of gap sequences over selected patterns. The gap sequence offers a new
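
    The gap-sequence idea described in this abstract can be illustrated with a short sketch. It is a minimal reading of the one-sentence definition (distances between consecutive occurrences of the same symbol), not the authors' implementation; the function name `gap_sequence` and the example string are hypothetical, and only single symbols rather than longer patterns are handled.

```python
def gap_sequence(seq, symbol):
    """Return the distances between consecutive occurrences of `symbol`.

    Hypothetical helper: a direct reading of the abstract's definition,
    not the published method.
    """
    # Positions at which the symbol occurs, in order.
    positions = [i for i, ch in enumerate(seq) if ch == symbol]
    # Distance from each occurrence to the next one.
    return [b - a for a, b in zip(positions, positions[1:])]


example = "ACGATTACA"
gaps_a = gap_sequence(example, "A")
print(gaps_a)  # A occurs at 0, 3, 6, 8 -> [3, 3, 2]
```

    A symbol that occurs fewer than twice yields an empty gap sequence, so any comparison built on top of this sketch would need to handle that case.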

  12. Making sense of cancer genomic data | Office of Cancer Genomics

    Cancer.gov

    Both large-scale and focused efforts have identified new targets of translational potential. The deluge of information that emerges from these genome-scale investigations has stimulated a parallel development of new analytical frameworks and tools. The complexity of somatic genomic alterations in cancer genomes also requires the development of robust methods for the interrogation of the function of genes identified by these genomics efforts.

  13. Whole genome sequencing reveals potential targets for therapy in patients with refractory KRAS mutated metastatic colorectal cancer

    PubMed Central

    2014-01-01

    Background The outcome of patients with metastatic colorectal carcinoma (mCRC) following first line therapy is poor, with median survival of less than one year. The purpose of this study was to identify candidate therapeutically targetable somatic events in mCRC patient samples by whole genome sequencing (WGS), so as to obtain targeted treatment strategies for individual patients. Methods Four patients were recruited, all of whom had received >2 prior therapy regimens. Percutaneous needle biopsies of metastases were performed with whole blood collection for the extraction of constitutional DNA. One tumor was not included in this study as the quality of tumor tissue was not sufficient for further analysis. WGS was performed using Illumina paired end chemistry on HiSeq2000 sequencing systems, which yielded coverage of greater than 30X for all samples. NGS data were processed and analyzed to detect somatic genomic alterations including point mutations, indels, copy number alterations, translocations and rearrangements. Results All 3 tumor samples had KRAS mutations, while 2 tumors contained mutations in the APC gene and the PIK3CA gene. Although we did not identify a TCF7L2-VTI1A translocation, we did detect a TCF7L2 mutation in one tumor. Among the other interesting mutated genes was INPPL1, an important gene involved in PI3 kinase signaling. Functional studies demonstrated that inhibition of INPPL1 reduced growth of CRC cells, suggesting that INPPL1 may promote growth in CRC. Conclusions Our study further supports potential molecularly defined therapeutic contexts that might provide insights into treatment strategies for refractory mCRC. New insights into the role of INPPL1 in colon tumor cell growth have also been identified. Continued development of appropriate targeted agents towards specific events may be warranted to help improve outcomes in CRC. PMID:24943349

  14. Fuzzy Genome Sequence Assembly for Single and Environmental Genomes

    E-print Network

    Nicolescu, Monica

    Fuzzy Genome Sequence Assembly for Single and Environmental Genomes Sara Nasser, Adrienne Breland. Traditional methods obtain a microorganism's DNA by culturing it individually. Recent advances in genomics microbial communities are often very complex with tens and hundreds of species. Assembling these genomes

  15. Recurrent Targeted Genes of Hepatitis B Virus in the Liver Cancer Genomes Identified by a Next-Generation Sequencing–Based Approach

    PubMed Central

    Ding, Dong; Lou, Xiaoyan; Hua, Dasong; Yu, Wei; Li, Lisha; Wang, Jun; Gao, Feng; Zhao, Na; Ren, Guoping; Li, Lanjuan; Lin, Biaoyang

    2012-01-01

    Integration of the viral DNA into host chromosomes was found in most of the hepatitis B virus (HBV)–related hepatocellular carcinomas (HCCs). Here we devised a massive anchored parallel sequencing (MAPS) method using next-generation sequencing to isolate and sequence HBV integrants. Applying MAPS to 40 pairs of HBV–related HCC tissues (cancer and adjacent tissues), we identified 296 HBV integration events corresponding to 286 unique integration sites (UISs) with precise HBV–Human DNA junctions. HBV integration favored chromosome 17 and preferentially integrated into human transcript units. HBV targeted genes were enriched in GO terms: cAMP metabolic processes, T cell differentiation and activation, TGF beta receptor pathway, ncRNA catabolic process, and dsRNA fragmentation and cellular response to dsRNA. The HBV targeted genes include 7 genes (PTPRJ, CNTN6, IL12B, MYOM1, FNDC3B, LRFN2, FN1) containing IPR003961 (Fibronectin, type III domain), 7 genes (NRG3, MASP2, NELL1, LRP1B, ADAM21, NRXN1, FN1) containing IPR013032 (EGF-like region, conserved site), and three genes (PDE7A, PDE4B, PDE11A) containing IPR002073 (3′,5′-cyclic-nucleotide phosphodiesterase). Enriched pathways include hsa04512 (ECM-receptor interaction), hsa04510 (Focal adhesion), and hsa04012 (ErbB signaling pathway). Fewer integration events were found in cancers compared to cancer-adjacent tissues, suggesting a clonal expansion model in HCC development. Finally, we identified 8 genes that were recurrent target genes by HBV integration including fibronectin 1 (FN1) and telomerase reverse transcriptase (TERT1), two known recurrent target genes, and additional novel target genes such as SMAD family member 5 (SMAD5), phosphatase and actin regulator 4 (PHACTR4), and RNA binding protein fox-1 homolog (C. elegans) 1 (RBFOX1). Integrating analysis with recently published whole-genome sequencing analysis, we identified 14 additional recurrent HBV target genes, greatly expanding the HBV recurrent target list. This global survey of HBV integration events, together with recently published whole-genome sequencing analyses, furthered our understanding of the HBV–related HCC. PMID:23236287

  16. Sequencing Intractable DNA to Close Microbial Genomes

    SciTech Connect

    Hurt, Jr., Richard Ashley [ORNL]; Brown, Steven D [ORNL]; Podar, Mircea [ORNL]; Palumbo, Anthony Vito [ORNL]; Elias, Dwayne A [ORNL]

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled intractable, resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such difficult regions in the non-contiguous finished Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures, making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. These developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  17. Sequencing and analysis of bacterial genomes

    Microsoft Academic Search

    Eugene V. Koonin; Arcady R. Mushegian; Kenneth E. Rudd

    1996-01-01

    The complete sequences of two small bacterial genomes have recently become available, and those of several more species should follow within the next two years. Sequence comparisons show that most bacterial proteins are highly conserved in evolution, allowing predictions to be made about the functions of most products of an uncharacterized genome. Bacterial genomes differ vastly in their gene

  18. Expanding the computational toolbox for mining cancer genomes

    PubMed Central

    Ding, Li; Wendl, Michael C.; McMichael, Joshua F.; Raphael, Benjamin J.

    2014-01-01

    High-throughput DNA sequencing has revolutionized cancer genomics with numerous discoveries relevant to cancer diagnosis and treatment. The latest sequencing and analysis methods have successfully identified somatic alterations including single nucleotide variants (SNVs), insertions and deletions (indels), structural aberrations, and gene fusions. Additional computational techniques have proved useful to define those mutations, genes, and molecular networks that drive diverse cancer phenotypes as well as determine clonal architectures in tumour samples. Collectively, these tools have advanced the study of genomic, transcriptomic, epigenomic alterations and their association to clinical properties. Here, we review cancer genomics software and the insights that have been gained from their application. PMID:25001846

  19. The Human Genome Project: Sequencing the Future

    E-print Network

    The Human Genome Project: Sequencing the Future In 1986, the U.S. Department of Energy (DOE and unilateral step by announcing its Human Genome Initiative--forerunner of the Human Genome Project critical areas, including those important to DOE missions. The Human Genome Project and DOE's complementary

  20. Genomics at the Ontario Institute for Cancer Research

    SciTech Connect

    Ali, Johar [Ontario Institute for Cancer Research]

    2010-06-02

    Johar Ali of the Ontario Institute for Cancer Research discusses genomics and next-gen applications at the OICR on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM

  1. The Genome Sequencing Center at NCGR

    SciTech Connect

    Schilkey, Faye [National Center for Genome Resources]

    2010-06-02

    Faye Schilkey from the National Center for Genome Resources discusses NCGR's research, sequencing and analysis experience on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM

  2. Genomics of Lung Cancer

    Microsoft Academic Search

    Alain C. Borczuk; Rebecca L. Toonkel; Charles A. Powell

    2009-01-01

    Lung cancer is the leading cause of cancer death in both men and women in the United States, despite its incidence being less than that of prostate cancer in men and breast cancer in women. With 166,000 deaths expected in 2008, the sum total of lung cancer deaths exceeds those of prostate, breast, and colon cancer combined (1). Prostate,

  3. Sequencing a Genome by Walking With Clone-end Sequences

    E-print Network

    Batzoglou, Serafim

    genome is (i) to sequence a collection of non-overlapping 'seeds' chosen from a genomic library of large of seed clones and the depth of the genomic library used for walking, affect the cost and time, Massachusetts Institute of Technology, Cambridge MA 02139. * To whom correspondence should be addressed.

  4. Next-generation sequencing, cancer and molecular diagnostics: an interview with Elaine Mardis.

    PubMed

    Mardis, Elaine; Raison, Claire

    2015-04-01

    Elaine Mardis, co-director of the Genome Institute at Washington University (St Louis, MO, USA), is an expert in genome sequencing technologies, having been involved in developing and automating the methods employed in sequencing the human genome. Professor Mardis has made key contributions to the Human Genome Project and more recently, to the field of cancer genomics, including work in The Cancer Genome Atlas. Her current research interests lie in next-generation sequencing and analysis of cancer genomes and the translation of these findings to support therapeutic decision making. PMID:25795041

  5. Progress in Arabidopsis genome sequencing and functional genomics

    Microsoft Academic Search

    R. Wambutt; G. Murphy; G. Volckaert; T. Pohl; A Düsterhöft; W Stiekema; K.-D Entian; N Terryn; B Harris; W Ansorge; P Brandt; L Grivell; M Rieger; M Weichselgartner; V de Simone; B Obermaier; R Mache; M Müller; M Kreis; M Delseny; P Puigdomenech; M Watson; T Schmidtheini; B Reichert; D Portatelle; M Perez-Alonso; M Boutry; I Bancroft; P Vos; J Hoheisel; W Zimmermann; H Wedler; P Ridley; S.-A Langham; B McCullagh; L Bilham; J Robben; J Van der Schueren; B Grymonprez; Y.-J Chuang; F Vandenbussche; M Braeken; I Weltjens; M Voet; I Bastiaens; R Aert; E Defoor; T Weitzenegger; G Bothe; U Ramsperger; H Hilbert; M Braun; E Holzer; A Brandt; S Peters; M van Staveren; W Dirkse; P Mooijman; R Klein Lankhorst; M Rose; J Hauf; P Kötter; S Berneiser; S Hempel; M Feldpausch; S Lamberth; H Van den Daele; A De Keyser; C Buysshaert; J Gielen; R Villarroel; R De Clercq; M Van Montagu; J Rogers; A Cronin; M Quail; S Bray-Allen; L Clark; J Doggett; S Hall; M Kay; N Lennard; K McLay; R Mayes; A Pettett; M.-A Rajandream; M Lyne; V Benes; S Rechmann; D Borkova; H Blöcker; M Scharfe; M Grimm; T.-H Löhnert; S Dose; M de Haan; A Maarse; M Schäfer; S Müller-Auer; C Gabel; M Fuchs; B Fartmann; K Granderath; D Dauner; A Herzl; S Neumann; A Argiriou; D Vitale; R Liguori; E Piravandi; O Massenet; F Quigley; G Clabauld; A Mündlein; R Felber; S Schnabl; R Hiller; W Schmidt; A Lecharny; S Aubourg; I Gy; R Cooke; C Berger; A Monfort; E Casacuberta; T Gibbons; N Weber; M Vandenbol; M Bargues; J Terol; A Torres; A Perez-Perez; B Purnelle; E Bent; S Johnson; D Tacon; T Jesse; L Heijnen; S Schwarz; P Scholler; S Heber; C Bielke; D Frishmann; D Haase; K Lemcke; H. W Mewes; S Stocker; P Zaccaria; K Mayer; C Schüller; M Bevan

    2000-01-01

    Arabidopsis thaliana has a relatively small genome of approximately 130 Mb containing about 10% repetitive DNA. Genome sequencing studies reveal a gene-rich genome, predicted to contain approximately 25,000 genes spaced on average every 4.5 kb. Between 10 and 20% of the predicted genes occur as clusters of related genes, indicating that local sequence duplication and subsequent divergence generates a significant

  6. Expressed sequence tags: alternative or complement to whole genome sequences?

    Microsoft Academic Search

    Stephen Rudd

    2003-01-01

    Over three million sequences from approximately 200 plant species have been deposited in the publicly available plant expressed sequence tag (EST) sequence databases. Many of the ESTs have been sequenced as an alternative to complete genome sequencing or as a substrate for cDNA array-based expression analyses. This creates a formidable resource from both biodiversity and gene-discovery standpoints. Bioinformatics-based sequence analysis

  7. Genomes and evolution From sequence to organism

    E-print Network

    Patel, Nipam H.

    Genomes and evolution From sequence to organism Editorial overview Evan E Eichler and Nipam H Patel, Center for Computational Genomics, Case Western Reserve University School of Medicine and University research is to understand the evolution, pathology and mechanisms of recent genome duplication in human

  8. Genome Sequence of Lactobacillus rhamnosus ATCC 8530

    PubMed Central

    Pittet, Vanessa; Ewen, Emily; Bushell, Barry R.

    2012-01-01

    Lactobacillus rhamnosus is found in the human gastrointestinal tract and is important for probiotics. We became interested in L. rhamnosus isolate ATCC 8530 in relation to beer spoilage and hops resistance. We report here the genome sequence of this isolate, along with a brief comparison to other available L. rhamnosus genome sequences. PMID:22247527

  9. BSMAP: whole genome bisulfite sequence MAPping program

    Microsoft Academic Search

    Yuanxin Xi; Wei Li

    2009-01-01

    BACKGROUND: Bisulfite sequencing is a powerful technique to study DNA cytosine methylation. Bisulfite treatment followed by PCR amplification specifically converts unmethylated cytosines to thymine. Coupled with next generation sequencing technology, it is able to detect the methylation status of every cytosine in the genome. However, mapping high-throughput bisulfite reads to the reference genome remains a great challenge due to the
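
    The conversion rule described in this abstract (unmethylated C reads as T after bisulfite treatment and PCR, methylated C stays C) can be illustrated with a small in-silico sketch. This is not BSMAP's mapping algorithm; the function names and the toy reference are hypothetical, and the read is assumed to be already aligned to the same strand of the reference with no mismatches or indels.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Simulate bisulfite treatment plus PCR on one strand.

    Unmethylated cytosines are read out as thymine; cytosines whose
    0-based index is in `methylated_positions` are protected and stay C.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def methylation_calls(reference, read):
    """Call methylation at each reference cytosine from an aligned read.

    Assumes `read` is the same length as `reference` and gap-free.
    Returns {position: True if methylated, False if unmethylated}.
    """
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper())):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[i] = True      # protected from conversion
        elif read_base == "T":
            calls[i] = False     # converted, so unmethylated
    return calls


if __name__ == "__main__":
    reference = "ACGTCCGA"                      # toy sequence, not real data
    read = bisulfite_convert(reference, {1})    # only the C at index 1 is methylated
    print(read)
    print(methylation_calls(reference, read))
```

    The sketch also hints at why mapping is hard: after conversion a T in a read may correspond to either a T or a C in the reference, so exact matching against the unconverted reference no longer works.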

  10. Genomic Resources for Cancer Epidemiology

    Cancer.gov

    The goal of the 1000 Genomes Project is to provide a comprehensive resource on human genetic variation. The Project is sequencing the genomes of approximately 2,500 samples at 4x coverage, to provide data on genetic variants with frequencies of at least 1% in the populations studied.
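
    To give a feel for what 4x coverage means per sample, the sketch below uses the common idealization that read depth at a site follows a Poisson distribution with mean equal to the average coverage. That model is an assumption introduced here, not something stated in the record, and real depth distributions are typically more dispersed; the function name is hypothetical.

```python
import math


def prob_depth_at_least(mean_coverage, k):
    """P(depth >= k) at a site, assuming depth ~ Poisson(mean_coverage)."""
    below = sum(
        math.exp(-mean_coverage) * mean_coverage ** i / math.factorial(i)
        for i in range(k)
    )
    return 1.0 - below


if __name__ == "__main__":
    for k in (1, 2, 4, 8):
        p = prob_depth_at_least(4.0, k)
        print(f"P(depth >= {k}) at 4x: {p:.3f}")
```

    Under this model a single 4x sample leaves many sites with only a few reads, which is consistent with the design choice of sequencing many samples and targeting variants at frequencies of at least 1%.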

  11. Research | Office of Cancer Genomics

    Cancer.gov

    Continuing advances in high-throughput genomic technologies and tools provide researchers an increasingly more detailed view of the genetic alterations found in cancers. CGCI researchers develop some of these emerging approaches and apply them towards the characterization of certain pediatric and adult cancers.

  12. BAC as tools for genome sequencing

    Microsoft Academic Search

    Hong-Bin Zhang; Chengcang Wu

    2001-01-01

    Genome sequencing represents the state-of-the-art technology for large-scale gene discovery, cloning and decoding. Bacteria-based large-insert clones, including bacterial artificial chromosome (BAC), bacteriophage P1-derived artificial chromosome (PAC) and large-insert conventional plasmid-based clone (PBC), are desirable resources and have offered numerous potentials for accelerated sequencing of large, complex genomes. They are not only capable of cloning large DNA fragments of complex genomes

  13. Comparison of 61 Sequenced Escherichia coli Genomes

    Microsoft Academic Search

    Oksana Lukjancenko; Trudy M. Wassenaar; David W. Ussery

    2010-01-01

    Escherichia coli is an important component of the biosphere and is an ideal model for studies of processes involved in bacterial genome evolution. Sixty-one publicly available E. coli and Shigella spp. sequenced genomes are compared, using basic methods to produce phylogenetic and proteomics trees, and to identify the pan- and core genomes of this set of sequenced strains. A hierarchical

  14. Next generation sequencing of viral RNA genomes

    PubMed Central

    2013-01-01

    Background With the advent of Next Generation Sequencing (NGS) technologies, the ability to generate large amounts of sequence data has revolutionized the genomics field. Most RNA viruses have relatively small genomes in comparison to other organisms and as such, would appear to be an obvious success story for the use of NGS technologies. However, due to the relatively low abundance of viral RNA in relation to host RNA, RNA viruses have proved relatively difficult to sequence using NGS technologies. Here we detail a simple, robust methodology, without the use of ultra-centrifugation, filtration or viral enrichment protocols, to prepare RNA from diagnostic clinical tissue samples, cell monolayers and tissue culture supernatant, for subsequent sequencing on the Roche 454 platform. Results As representative RNA viruses, full genome sequence was successfully obtained from known lyssaviruses belonging to recognized species and a novel lyssavirus species using these protocols and assembling the reads using de novo algorithms. Furthermore, genome sequences were generated from considerably less than 200 ng RNA, indicating that manufacturers’ minimum template guidance is conservative. In addition to obtaining genome consensus sequence, a high proportion of SNPs (Single Nucleotide Polymorphisms) were identified in the majority of samples analyzed. Conclusions The approaches reported clearly facilitate successful full genome lyssavirus sequencing and can be universally applied to discovering and obtaining consensus genome sequences of RNA viruses from a variety of sources. PMID:23822119

  15. Human Genome Sequencing in Health and Disease

    PubMed Central

    Gonzaga-Jauregui, Claudia; Lupski, James R.; Gibbs, Richard A.

    2013-01-01

    Following the “finished,” euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges. PMID:22248320

  16. The Genetic Basis of Pancreas Cancer Development and Progression: Insights From Whole-Exome and Whole-Genome Sequencing

    PubMed Central

    Iacobuzio-Donahue, Christine A.; Velculescu, Victor E.; Wolfgang, Christopher L.; Hruban, Ralph H.

    2012-01-01

    Pancreatic cancer is caused by inherited and acquired mutations in specific cancer-associated genes. The discovery of the most common genetic alterations in pancreatic cancer has not only provided insight into the fundamental pathways driving the progression from a normal cell, to non-invasive precursor lesions, to widely metastatic disease, but recent genetic discoveries have also opened new opportunities for gene-based approaches to early detection, personalized treatment, and molecular classification of pancreatic neoplasms. PMID:22896692

  17. Genome sequence of the palaeopolyploid soybean

    E-print Network

    Bhattacharyya, Madan Kumar

    The soybean genome is the largest whole-genome shotgun-sequenced plant genome so far and compares favourably to all other high-quality draft whole-genome shotgun-sequenced plant genomes (Supplementary Table 4 ARTICLES Genome sequence of the palaeopolyploid soybean Jeremy Schmutz1,2 , Steven B. Cannon3

  18. Genomic sequencing of Pleistocene cave bears

    SciTech Connect

    Noonan, James P.; Hofreiter, Michael; Smith, Doug; Priest, James R.; Rohland, Nadin; Rabeder, Gernot; Krause, Johannes; Detter, J. Chris; Paabo, Svante; Rubin, Edward M.

    2005-04-01

    Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of approximately 1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.

  19. Plantagora: Modeling Whole Genome Sequencing and Assembly of Plant Genomes

    PubMed Central

    Barthelson, Roger; McFarlin, Adam J.; Rounsley, Steven D.; Young, Sarah

    2011-01-01

    Background Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. Methodology/Principal Findings For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. Conclusions/Significance Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly further. PMID:22174807

  20. Reconstruction of Ancestral Genomic Sequences Using Likelihood

    Microsoft Academic Search

    Isaac Elias; Tamir Tuller

    2007-01-01

    A challenging task in computational biology is the reconstruction of genomic sequences of extinct ancestors, given the phylogenetic tree and the sequences at the leaves. This task is best solved by calculating the most likely estimate of the ancestral sequences, along with the most likely edge lengths. We deal with this problem and also the variant in which the phylogenetic

  1. Solvable Sequence Evolution Models and Genomic Correlations

    NASA Astrophysics Data System (ADS)

    Messer, Philipp W.; Arndt, Peter F.; Lässig, Michael

    2005-04-01

    We study a minimal model for genome evolution whose elementary processes are single site mutation, duplication and deletion of sequence regions, and insertion of random segments. These processes are found to generate long-range correlations in the composition of letters as long as the sequence length is growing; i.e., the combined rates of duplications and insertions are higher than the deletion rate. For constant sequence length, on the other hand, all initial correlations decay exponentially. These results are obtained analytically and by simulations. They are compared with the long-range correlations observed in genomic DNA, and the implications for genome evolution are discussed.
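
    The four elementary processes named in this abstract can be sketched as a toy simulation. This is only an illustration, not the authors' model: the relative rates, the cap on segment length, the choice of tandem duplication (the copy is placed directly after the original), and all names are assumptions made here.

```python
import random

ALPHABET = "ACGT"


def evolve(seq, steps, rates, rng, max_segment=5):
    """Apply `steps` random events to `seq` and return the new sequence.

    `rates` maps event names ("mutation", "duplication", "deletion",
    "insertion") to relative weights; missing events never occur.
    Segment lengths are drawn uniformly from 1..max_segment (an assumption).
    """
    seq = list(seq)
    events = sorted(rates)
    weights = [rates[e] for e in events]
    for _ in range(steps):
        event = rng.choices(events, weights=weights)[0]
        length = rng.randint(1, max_segment)
        if event == "insertion":
            # Insert a random segment at a random position.
            pos = rng.randint(0, len(seq))
            seq[pos:pos] = [rng.choice(ALPHABET) for _ in range(length)]
            continue
        if not seq:
            continue  # nothing left to mutate, duplicate, or delete
        start = rng.randrange(len(seq))
        if event == "mutation":
            # Single-site change to a different letter.
            seq[start] = rng.choice([b for b in ALPHABET if b != seq[start]])
        elif event == "duplication":
            # Tandem duplication: copy a region and place it right after itself.
            segment = seq[start:start + length]
            seq[start + length:start + length] = segment
        elif event == "deletion":
            del seq[start:start + length]
    return "".join(seq)


if __name__ == "__main__":
    rng = random.Random(0)
    start = "ACGT" * 25
    growing = {"mutation": 1.0, "duplication": 0.3, "insertion": 0.2, "deletion": 0.1}
    result = evolve(start, 2000, growing, rng)
    print(len(start), "->", len(result))
```

    With duplication and insertion weights together exceeding the deletion weight, the sequence grows on average, which is the regime in which the abstract reports long-range correlations; measuring those correlations would need a separate analysis step not shown here.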

  2. The complete genome sequence of the carcinogenic bacterium Helicobacter hepaticus.

    PubMed

    Suerbaum, Sebastian; Josenhans, Christine; Sterzenbach, Torsten; Drescher, Bernd; Brandt, Petra; Bell, Monica; Droge, Marcus; Fartmann, Berthold; Fischer, Hans-Peter; Ge, Zhongming; Horster, Andrea; Holland, Rudi; Klein, Kerstin; Konig, Jochen; Macko, Ludwig; Mendz, George L; Nyakatura, Gerald; Schauer, David B; Shen, Zeli; Weber, Jacqueline; Frosch, Matthias; Fox, James G

    2003-06-24

    Helicobacter hepaticus causes chronic hepatitis and liver cancer in mice. It is the prototype enterohepatic Helicobacter species and a close relative of Helicobacter pylori, also a recognized carcinogen. Here we report the complete genome sequence of H. hepaticus ATCC51449. H. hepaticus has a circular chromosome of 1,799,146 base pairs, predicted to encode 1,875 proteins. A total of 938, 953, and 821 proteins have orthologs in H. pylori, Campylobacter jejuni, and both pathogens, respectively. H. hepaticus lacks orthologs of most known H. pylori virulence factors, including adhesins, the VacA cytotoxin, and almost all cag pathogenicity island proteins, but has orthologs of the C. jejuni adhesin PEB1 and the cytolethal distending toxin (CDT). The genome contains a 71-kb genomic island (HHGI1) and several genomic islets whose G+C content differs from the rest of the genome. HHGI1 encodes three basic components of a type IV secretion system and other virulence protein homologs, suggesting a role of HHGI1 in pathogenicity. The genomic variability of H. hepaticus was assessed by comparing the genomes of 12 H. hepaticus strains with the sequenced genome by microarray hybridization. Although five strains, including all those known to have caused liver disease, were indistinguishable from ATCC51449, other strains lacked between 85 and 229 genes, including large parts of HHGI1, demonstrating extensive variation of genome content within the species. PMID:12810954
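
    The ortholog counts in this abstract can be combined by inclusion-exclusion to see how the 1,875 predicted proteins split up. The sketch below assumes that all three counts (938, 953, 821) refer to the same set of 1,875 predicted proteins, which the abstract implies but does not state outright; the function name is hypothetical.

```python
def ortholog_breakdown(total, in_a, in_b, in_both):
    """Split a protein set by ortholog presence in two other species."""
    either = in_a + in_b - in_both   # inclusion-exclusion
    return {
        "only_a": in_a - in_both,
        "only_b": in_b - in_both,
        "both": in_both,
        "either": either,
        "neither": total - either,
    }


if __name__ == "__main__":
    # a = H. pylori, b = Campylobacter jejuni (counts taken from the abstract)
    for key, value in ortholog_breakdown(1875, 938, 953, 821).items():
        print(f"{key}: {value}")
```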

  3. International network of cancer genome projects

    Microsoft Academic Search

    Thomas J. Hudson; Warwick Anderson; Axel Aretz; Anna D. Barker; Cindy Bell; Rosa R. Bernabé; M. K. Bhan; Iiro Eerola; Daniela S. Gerhard; Alan Guttmacher; Mark Guyer; Fiona M. Hemsley; Jennifer L. Jennings; David Kerr; Peter Klatt; Patrik Kolar; Jun Kusuda; Frank Laplace; Youyong Lu; Gerd Nettekoven; Brad Ozenberger; Jane Peterson; T. S. Rao; Jacques Remacle; Alan J. Schafer; Tatsuhiro Shibata; Michael R. Stratton; Joseph G. Vockley; Koichi Watanabe; Huanming Yang; Martin Bobrow; Anne Cambon-Thomsen; Lynn G. Dressler; Stephanie O. M. Dyke; Yann Joly; Kazuto Kato; Karen L. Kennedy; Pilar Nicolás; Michael J. Parker; Emmanuelle Rial-Sebbag; Carlos M. Romeo-Casabona; Kenna M. Shaw; Susan Wallace; Georgia L. Wiesner; Andrew V. Biankin; Christian Chabannon; Lynda Chin; Bruno Clément; Enrique de Alava; Françoise Degos; Martin L. Ferguson; Peter Geary; D. Neil Hayes; Amber L. Johns; Arek Kasprzyk; Hidewaki Nakagawa; Robert Penny; Miguel A. Piris; Rajiv Sarin; Aldo Scarpa; Hiroyuki Aburatani; Mónica Bayés; David D. L. Bowtell; Peter J. Campbell; Xavier Estivill; Ivo Gut; Martin Hirst; Carlos López-Otín; Partha Majumder; Marco Marra; John D. McPherson; Zemin Ning; Xose S. Puente; Yijun Ruan; Hendrik G. Stunnenberg; Harold Swerdlow; Victor E. Velculescu; Richard K. Wilson; Hong H. Xue; Paul T. Spellman; Gary D. Bader; Paul C. Boutros; Paul Flicek; Gad Getz; Roderic Guigó; Guangwu Guo; David Haussler; Simon Heath; Tim J. Hubbard; Tao Jiang; Steven M. Jones; Qibin Li; Nuria López-Bigas; Ruibang Luo; Lakshmi Muthuswamy; B. F. Francis Ouellette; John V. Pearson; Victor Quesada; Benjamin J. Raphael; Chris Sander; Terence P. Speed; Joshua M. Stuart; Jon W. Teague; Yasushi Totoki; Tatsuhiko Tsunoda; Alfonso Valencia; David A. Wheeler; Honglong Wu; Shancen Zhao; Mark Lathrop; Gilles Thomas; Myles Axton; Chris Gunter; Linda J. Miller; Junjun Zhang; Syed A. Haider; Jianxin Wang; Christina K. Yung; Anthony Cross; Yong Liang; Saravanamuttu Gnaneshan; Jonathan Guberman; Don R. C. Chalmers; Karl W. Hasel; Terry S. H. Kaan; William W. Lowrance; Tohru Masui; Laura Lyman Rodriguez; Catherine Vergely; Nicole Cloonan; Anna Defazio; James R. Eshleman; Dariush Etemadmoghadam; Brooke A. Gardiner; James G. Kench; Robert L. Sutherland; Margaret A. Tempero; Nicola J. Waddell; Steve Gallinger; Ming-Sound Tsao; Patricia A. Shaw; Gloria M. Petersen; Debabrata Mukhopadhyay; Ronald A. Depinho; Sarah Thayer; Kamran Shazand; Timothy Beck; Michelle Sam; Lee Timms; Jiafu Ji; Xiuqing Zhang; Feng Chen; Xueda Hu; Guangyu Zhou; Qi Yang; Geng Tian; Lianhai Zhang; Xiaofang Xing; Xianghong Li; Zhenggang Zhu; Yingyan Yu; Jun Yu; Jörg Tost; Paul Brennan; Ivana Holcatova; David Zaridze; Alvis Brazma; Lars Egevad; Egor Prokhortchouk; Rosamonde Elizabeth Banks; Mathias Uhlén; Juris Viksna; Fredrik Ponten; Ewan Birney; Ake Borg; Anne-Lise Børresen-Dale; Carlos Caldas; John A. Foekens; Sancha Martin; Jorge S. Reis-Filho; Andrea L. Richardson; Christos Sotiriou; Marc van de Vijver; Daniel Birnbaum; Hélène Blanche; Pascal Boucher; Sandrine Boyault; Jocelyne D. Masson-Jacquemier; Iris Pauporté; Xavier Pivot; Anne Vincent-Salomon; Eric Tabone; Charles Theillet; Paulette Bioulac-Sage; Thomas Decaens; Dominique Franco; Marta Gut; Didier Samuel; Benedikt Brors; Jan O. Korbel; Andrey Korshunov; Pablo Landgraf; Hans Lehrach; Stefan Pfister; Bernhard Radlwimmer; Guido Reifenberger; Michael D. Taylor; Paolo Pederzoli; Rita T. Lawlor; Massimo Delledonne; Alberto Bardelli; Thomas Gress; David Klimstra; Yusuke Nakamura; Satoru Miyano; Akihiro Fujimoto; Silvia de Sanjosé; Emili Montserrat; Marcos González-Díaz; Pedro Jares; Heinz Himmelbauer; Samuel Aparicio; Laura van't Veer; Douglas F. Easton; Francis S. Collins; Carolyn C. Compton; Eric S. Lander; Wylie Burke; Anthony R. Green; Olli P. Kallioniemi; Timothy J. Ley; Edison T. Liu; Brandon J. Wainwright

    2010-01-01

    The International Cancer Genome Consortium (ICGC) was launched to coordinate large-scale cancer genome studies in tumours from 50 different cancer types and/or subtypes that are of clinical and societal importance across the globe. Systematic studies of more than 25,000 cancer genomes at the genomic, epigenomic and transcriptomic levels will reveal the repertoire of oncogenic mutations, uncover traces of the mutagenic

  4. POSTDOCTORAL POSITION IN BIOINFORMATICS AND EVOLUTIONARY GENOMICS: Next generation sequencing and analysis of complex polyploid genomes

    E-print Network

    Rennes, Université de

    POSTDOCTORAL POSITION IN BIOINFORMATICS AND EVOLUTIONARY GENOMICS: Next generation sequencing and analysis of complex polyploid genomes The research group Genome Evolution and Speciation (Team) to work on the analysis of genome and transcriptome sequence data (generated using 454 Roche

  5. Genome sequence of Coxiella burnetii strain Namibia

    PubMed Central

    2014-01-01

    We present the whole genome sequence and annotation of the Coxiella burnetii strain Namibia. This strain was isolated from an aborting goat in 1991 in Windhoek, Namibia. The plasmid type QpRS was confirmed in our work. Further genomic typing placed the strain into a unique genomic group. The genome sequence is 2,101,438 bp long and contains 1,979 protein-coding and 51 RNA genes, including one rRNA operon. To overcome the poor yield from cell culture systems, an additional DNA enrichment with whole genome amplification (WGA) methods was applied. We describe a bioinformatics pipeline for improved genome assembly including several filters with a special focus on WGA characteristics. PMID:25593636

  6. NIH researchers complete whole-exome sequencing of skin cancer

    Cancer.gov

    A team led by researchers at NIH is the first to systematically survey the landscape of the melanoma genome, the DNA code of the deadliest form of skin cancer. The researchers have made surprising new discoveries using whole-exome sequencing, an approach that decodes the 1-2 percent of the genome that contains protein-coding genes.

  7. Pairwise Comparison Between Genomic Sequences and

    E-print Network

    Mohri, Mehryar

    similar translated genomic sequences using the stable-marriage algorithm (SM) as an alignment filter learned from him how to ask questions and express my ideas. He showed me different ways to approach

  8. First Complete Sequence of the Human Genome

    NSDL National Science Digital Library

    de Nie, Michael Willem.

    On April 6, Celera Genomics announced that it had completed the sequencing phase of one person's genome. It will now begin the process of assembling the sequenced fragments into their proper order with the aid of powerful computers. Work on this project began in September 1999 using a method called "whole genome shotgun sequencing," a quicker method than that used by the international Human Genome Project, which has completed about two-thirds of its own, more thorough, sequence of the human genome. Although talks between Celera and the Human Genome Project over the sharing of data broke down earlier this year, they have since resumed and the company has stated that it will cooperate. This is just the first step towards understanding the human genome, since it reveals only the order of the nucleotides and not what the genes do, but it is certainly an important milestone, with broad implications for biology and medicine. Users can begin with the company's press release and then read reports from the BBC, the New York Times (free registration required), CNN, National Public Radio's All Things Considered, and the Times of India. Additional related resources are available from the Human Genome Project site and Doubletwist.com.

  9. Clinical implications of genomics for cancer risk genetics.

    PubMed

    Thomas, David M; James, Paul A; Ballinger, Mandy L

    2015-06-01

    The study of human genetics has provided substantial insight into cancer biology. With an increase in sequencing capacity and a reduction in sequencing costs, genomics will probably transform clinical cancer genetics. A heritable basis for many cancers is accepted, but so far less than half the genetic drivers have been identified. Genomics will increasingly be applied to populations irrespective of family history, which will change the framework of phenotype-directed genetic testing. Panel testing and whole genome sequencing will identify novel, polygenic, and de-novo determinants of cancer risk, often with lower penetrance, which will challenge present binary clinical classification systems and management algorithms. In the future, genotype-stratified public screening and prevention programmes could form part of tailored population risk management. The integration of research with clinical practice will result in so-called discovery cohorts that will help identify clinically significant genetic variation. PMID:26065615

  10. Genome sequence and analysis of Lactobacillus helveticus

    PubMed Central

    Cremonesi, Paola; Chessa, Stefania; Castiglioni, Bianca

    2013-01-01

    The microbiological characterization of lactobacilli is historically well developed, but the genomic analysis is recent. Because of the widespread use of Lactobacillus helveticus in cheese technology, information concerning the heterogeneity in this species is accumulating rapidly. Recently, the genome of five L. helveticus strains was sequenced to completion and compared with other genomically characterized lactobacilli. The genomic analysis of the first sequenced strain, L. helveticus DPC 4571, isolated from cheese and selected for its characteristics of rapid lysis and high proteolytic activity, has revealed a plethora of genes with industrial potential including those responsible for key metabolic functions such as proteolysis, lipolysis, and cell lysis. These genes and their derived enzymes can facilitate the production of cheese and cheese derivatives with potential for use as ingredients in consumer foods. In addition, L. helveticus has the potential to produce peptides with a biological function, such as angiotensin converting enzyme (ACE) inhibitory activity, in fermented dairy products, demonstrating the therapeutic value of this species. A most intriguing feature of the genome of L. helveticus is the remarkable similarity in gene content with many intestinal lactobacilli. Comparative genomics has allowed the identification of key gene sets that facilitate a variety of lifestyles including adaptation to food matrices or the gastrointestinal tract. As genome sequence and functional genomic information continues to explode, key features of the genomes of L. helveticus strains continue to be discovered, answering many questions but also raising many new ones. PMID:23335916

  11. Office of Cancer Genomics

    Cancer.gov

    The advent of highly active anti-retroviral therapy (HAART) has considerably slowed disease progression from HIV to full-blown AIDS, thereby increasing the number of people living with HIV. It is not known why the incidence of certain cancers, but not others, increases in patients with HIV infection. Among the cancers with elevated prevalence are aggressive B-cell Non-Hodgkin lymphoma (NHL) and late-stage lung cancer.

  12. Mauve: Multiple Alignment of Conserved Genomic Sequence With Rearrangements

    Microsoft Academic Search

    Aaron C. E. Darling; Bob Mau; Frederick R. Blattner; Nicole T. Perna

    2004-01-01

    As genomes evolve, they undergo large-scale evolutionary processes that present a challenge to sequence comparison not posed by short sequences. Recombination causes frequent genome rearrangements, horizontal transfer introduces new sequences into bacterial chromosomes, and deletions remove segments of the genome. Consequently, each genome is a mosaic of unique lineage-specific segments, regions shared with a subset of other genomes and segments

  13. The Wellcome Trust Sanger Institute: The Cancer Genome Project

    NSDL National Science Digital Library

    Supported by the Wellcome Trust Sanger Institute, the Cancer Genome Project (CGP) "is using the human genome sequence and high throughput mutation detection techniques to identify somatically acquired sequence variants/mutations and hence identify genes critical in the development of human cancers. This initiative will ultimately provide the paradigm for the detection of germline mutations in non-neoplastic human genetic diseases through genome-wide mutation detection approaches." The CGP website links to a number of Data Resources including the Cancer Gene Census, Cancer Cell Line Project, Catalogue of Somatic Mutations in Cancer (reported on in the March 4, 2005 NSDL Scout Report for Life Sciences), Somatic Mutations in Protein Kinase Genes, and more. The site also contains an extensive listing of publications from 1998 to 2004 with links to PubMed Abstracts.

  14. Overview | Office of Cancer Genomics

    Cancer.gov

    The Cancer Target Discovery and Development (CTD2) initiative is a collaborative network of OCG-supported entities, or Centers. The program strives to functionally validate discoveries from large-scale genomic initiatives and advance them toward precision medicine through the efforts of the Centers and open access data sharing.

  15. What are we learning from the cancer genome?

    PubMed Central

    Collisson, Eric A.; Cho, Raymond J.; Gray, Joe W.

    2013-01-01

    Massively parallel approaches to nucleic acid sequencing have matured from proof-of-concept to commercial products during the past 5 years. These technologies are now widely accessible, increasingly affordable, and have already exerted a transformative influence on the study of human cancer. Here, we review new features of cancer genomes that are being revealed by large-scale applications of these technologies. We focus on those insights most likely to affect future clinical practice. Foremost among these lessons, we summarize the formidable genetic heterogeneity within given cancer types that is appreciable with higher resolution profiling and larger sample sets. We discuss the inherent challenges of defining driving genomic events in a given cancer genome amidst thousands of other somatic events. Finally, we explore the organizational, regulatory and societal challenges impeding precision cancer medicine based on genomic profiling from assuming its place as standard-of-care. PMID:22965149

  16. Genetic variation in the genome-wide predicted estrogen response element-related sequences is associated with breast cancer development

    Microsoft Academic Search

    Jyh-Cherng Yu; Chia-Ni Hsiung; Huan-Ming Hsu; Bo-Ying Bao; Shou-Tung Chen; Giu-Cheng Hsu; Wen-Cheng Chou; Ling-Yueh Hu; Shian-Ling Ding; Chun-Wen Cheng; Pei-Ei Wu; Chen-Yang Shen

    2011-01-01

    Introduction Estrogen forms a complex with the estrogen receptor (ER) that binds to estrogen response elements (EREs) in the promoter region of estrogen-responsive genes, regulates their transcription, and consequently mediates physiological or tumorigenic effects. Thus, sequence variants in EREs have the potential to affect the estrogen-ER-ERE interaction. In this study, we examined the hypothesis that genetic variations of EREs are associated

  17. Thirteen Camellia chloroplast genome sequences determined by high-throughput sequencing: genome structure and phylogenetic relationships

    PubMed Central

    2014-01-01

    Background Camellia is an economically and phylogenetically important genus in the family Theaceae. Owing to numerous hybridization and polyploidization events, it is taxonomically and phylogenetically ranked as one of the most challenging taxa in plants. Sequence comparisons of chloroplast (cp) genomes are of great interest because they provide robust evidence for taxonomic studies, species identification and understanding the mechanisms that underlie the evolution of the Camellia species. Results The eight complete cp genomes and five draft cp genome sequences of Camellia species were determined using Illumina sequencing technology via a combined strategy of de novo and reference-guided assembly. The Camellia cp genomes exhibited a typical circular structure that was rather conserved in genomic structure and the synteny of gene order. Differences of repeat sequences, simple sequence repeats, indels and substitutions were further examined among five complete cp genomes, representing a wide phylogenetic diversity in the genus. A total of fifteen molecular markers were identified with more than 1.5% sequence divergence that may be useful for further phylogenetic analysis and species identification of Camellia. Our results showed that, rather than functional constraints, it is the regional constraints that strongly affect sequence evolution of the cp genomes. In a substantial improvement over prior studies, evolutionary relationships of the section Thea were determined on the basis of phylogenomic analyses of cp genome sequences. Conclusions Despite a high degree of conservation between the Camellia cp genomes, sequence variation among species could still be detected, representing a wide phylogenetic diversity in the genus. Furthermore, phylogenomic analysis was conducted using 18 complete cp genomes and 5 draft cp genome sequences of Camellia species. Our results support Chang’s taxonomical treatment that C. pubicosta may be classified into sect. Thea, and indicate that the taxonomical value of the number of ovaries should be reconsidered when classifying the Camellia species. The availability of these cp genomes provides valuable genetic information for accurately identifying species, clarifying taxonomy and reconstructing the phylogeny of the genus Camellia. PMID:25001059

  18. A Workshop Report on Wheat Genome Sequencing

    PubMed Central

    Gill, Bikram S.; Appels, Rudi; Botha-Oberholster, Anna-Maria; Buell, C. Robin; Bennetzen, Jeffrey L.; Chalhoub, Boulos; Chumley, Forrest; Dvořák, Jan; Iwanaga, Masaru; Keller, Beat; Li, Wanlong; McCombie, W. Richard; Ogihara, Yasunari; Quetier, Francis; Sasaki, Takuji

    2004-01-01

    Sponsored by the National Science Foundation and the U.S. Department of Agriculture, a wheat genome sequencing workshop was held November 10–11, 2003, in Washington, DC. It brought together 63 scientists of diverse research interests and institutions, including 45 from the United States and 18 from a dozen foreign countries (see list of participants at http://www.ksu.edu/igrow). The objectives of the workshop were to discuss the status of wheat genomics, obtain feedback from ongoing genome sequencing projects, and develop strategies for sequencing the wheat genome. The purpose of this report is to convey the information discussed at the workshop and provide the basis for an ongoing dialogue, bringing forth comments and suggestions from the genetics community. PMID:15514080

  19. Final progress report, Construction of a genome-wide highly characterized clone resource for genome sequencing

    SciTech Connect

    Nierman, William C.

    2000-02-14

    At TIGR, the human Bacterial Artificial Chromosome (BAC) end sequencing and trimming were carried out with an overall sequencing success rate of 65%. CalTech human BAC libraries A, B, C and D as well as Roswell Park Cancer Institute's library RPCI-11 were used. To date, we have generated >300,000 end sequences from >186,000 human BAC clones with an average read length of ~460 bp, for a total of 141 Mb covering ~4.7% of the genome. Over sixty percent of the clones have BAC end sequences (BESs) from both ends, representing over five-fold coverage of the genome by the paired-end clones. The average phred Q20 length is ~400 bp. This high accuracy makes our BESs match the human finished sequences with an average identity of 99% and a match length of 450 bp, and a frequency of one match per 12.8 kb contig sequence. Our sample tracking has ensured a clone tracking accuracy of >90%, which gives researchers high confidence in (1) retrieving the right clone from the BAC libraries based on the sequence matches; and (2) building a minimum tiling path of sequence-ready clones across the genome and genome assembly scaffolds.
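
    The figures quoted in this report can be cross-checked with simple arithmetic. The sketch below is illustrative only: the 3,000 Mb genome size is an assumption (it is not stated in the report, but it is consistent with the quoted 4.7% figure), and all function names are hypothetical.

```python
# Rough consistency check of the BAC end sequencing figures quoted in the report.
# ASSUMPTION: a haploid human genome of ~3,000 Mb (not stated in the report).

ASSUMED_GENOME_MB = 3000.0


def total_megabases(n_reads: int, mean_read_bp: float) -> float:
    """Total sequence in Mb for n_reads reads of a given mean length."""
    return n_reads * mean_read_bp / 1e6


def percent_of_genome(total_mb: float, genome_mb: float = ASSUMED_GENOME_MB) -> float:
    """Raw sequence as a percentage of the genome (ignores overlap between reads)."""
    return 100.0 * total_mb / genome_mb


def phred_error_probability(q: float) -> float:
    """A phred quality Q corresponds to a base-call error probability of 10^(-Q/10)."""
    return 10 ** (-q / 10)


if __name__ == "__main__":
    # 300,000 reads is the lower bound quoted in the report (">300,000").
    print(total_megabases(300_000, 460))   # lower bound on total Mb
    print(percent_of_genome(141.0))        # share of the assumed genome for 141 Mb
    print(phred_error_probability(20))     # error probability at Q20
```

    With the lower bound of 300,000 reads the total is at least 138 Mb, which is in line with the quoted 141 Mb given that the report states more than 300,000 end sequences.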

  20. Complete Genome Sequences of 63 Mycobacteriophages

    PubMed Central

    2013-01-01

    Mycobacteriophages are viruses that infect mycobacterial hosts. The current collection of sequenced mycobacteriophages, all isolated on a single host strain, Mycobacterium smegmatis mc²155, reveals substantial genetic diversity. The complete genome sequences of 63 newly isolated mycobacteriophages expand the resolution of our understanding of phage diversity. PMID:24285655

  1. Draft Genome Sequence of Tombunodavirus UC1

    PubMed Central

    DeRisi, Joseph L.

    2015-01-01

    We report here the draft genome sequence of tombunodavirus UC1 assembled from metagenomic sequencing of organisms in San Francisco wastewater. This virus shares hallmarks of members of the Tombusviridae and the nodavirus-like Plasmopara halstedii and Sclerophthora macrospora viruses. PMID:26139709

  2. Rhipicephalus (Boophilus) microplus strain Deutsch, whole genome shotgun sequencing project first submission of genome sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The size and repetitive nature of the Rhipicephalus microplus genome makes obtaining a full genome sequence difficult. Cot filtration/selection techniques were used to reduce the repetitive fraction of the tick genome and enrich for the fraction of DNA with gene-containing regions. The Cot-selected ...

  3. Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

    E-print Network

    2011-01-01

    plants have large and complex genomes with an abundance of repeated sequences. ... plants have large and complex genomes with a great abundance of repeated sequences. ... Sequence composition, organization, and evolution of the core Triticeae genome. Plant

  4. Initial genome sequencing and analysis of multiple myeloma.

    PubMed

    Chapman, Michael A; Lawrence, Michael S; Keats, Jonathan J; Cibulskis, Kristian; Sougnez, Carrie; Schinzel, Anna C; Harview, Christina L; Brunet, Jean-Philippe; Ahmann, Gregory J; Adli, Mazhar; Anderson, Kenneth C; Ardlie, Kristin G; Auclair, Daniel; Baker, Angela; Bergsagel, P Leif; Bernstein, Bradley E; Drier, Yotam; Fonseca, Rafael; Gabriel, Stacey B; Hofmeister, Craig C; Jagannath, Sundar; Jakubowiak, Andrzej J; Krishnan, Amrita; Levy, Joan; Liefeld, Ted; Lonial, Sagar; Mahan, Scott; Mfuko, Bunmi; Monti, Stefano; Perkins, Louise M; Onofrio, Robb; Pugh, Trevor J; Rajkumar, S Vincent; Ramos, Alex H; Siegel, David S; Sivachenko, Andrey; Stewart, A Keith; Trudel, Suzanne; Vij, Ravi; Voet, Douglas; Winckler, Wendy; Zimmerman, Todd; Carpten, John; Trent, Jeff; Hahn, William C; Garraway, Levi A; Meyerson, Matthew; Lander, Eric S; Getz, Gad; Golub, Todd R

    2011-03-24

    Multiple myeloma is an incurable malignancy of plasma cells, and its pathogenesis is poorly understood. Here we report the massively parallel sequencing of 38 tumour genomes and their comparison to matched normal DNAs. Several new and unexpected oncogenic mechanisms were suggested by the pattern of somatic mutation across the data set. These include the mutation of genes involved in protein translation (seen in nearly half of the patients), genes involved in histone methylation, and genes involved in blood coagulation. In addition, a broader than anticipated role of NF-κB signalling was indicated by mutations in 11 members of the NF-κB pathway. Of potential immediate clinical relevance, activating mutations of the kinase BRAF were observed in 4% of patients, suggesting the evaluation of BRAF inhibitors in multiple myeloma clinical trials. These results indicate that cancer genome sequencing of large collections of samples will yield new insights into cancer not anticipated by existing knowledge. PMID:21430775

  5. Using comparative genomics to reorder the human genome sequence into a virtual sheep genome

    Microsoft Academic Search

    Brian P Dalrymple; Ewen F Kirkness; Mikhail Nefedov; Sean McWilliam; Abhirami Ratnakumar; Wes Barris; Shaying Zhao; Jyoti Shetty; Jillian F Maddox; Margaret O'Grady; Frank Nicholas; Allan M Crawford; Tim Smith; Pieter J de Jong; John McEwan; V Hutton Oddy; Noelle E Cockett

    2007-01-01

    BACKGROUND: Is it possible to construct an accurate and detailed subgene-level map of a genome using bacterial artificial chromosome (BAC) end sequences, a sparse marker map, and the sequences of other genomes? RESULTS: A sheep BAC library, CHORI-243, was constructed and the BAC end sequences were determined and mapped with high sensitivity and low specificity onto the frameworks of the

  6. Genome Sequence of Mycobacteriophage Mindy

    PubMed Central

    Bernstein, Nicholas I.; Fasolas, Christina S.; Mezghani, Nadia; Pressimone, Catherine A.; Selvakumar, Priyanga; Stanton, Ann-Catherine J.; Lapin, Jonathan S.; Prout, Ashley K.; Grubb, Sarah R.; Warner, Marcie H.; Bowman, Charles A.; Russell, Daniel A.; Hatfull, Graham F.

    2015-01-01

    Mycobacteriophage Mindy is a newly isolated phage of Mycobacterium smegmatis, recovered from a soil sample in Pittsburgh, Pennsylvania, USA. Mindy has a genome length of 75,796 bp, encodes 147 predicted proteins and two tRNAs, and is closely related to mycobacteriophages in cluster E.

  7. Complete genome sequence of Borrelia crocidurae.

    PubMed

    Elbir, Haitham; Gimenez, Grégory; Robert, Catherine; Bergström, Sven; Cutler, Sally; Raoult, Didier; Drancourt, Michel

    2012-07-01

    We announce the draft genome sequence of Borrelia crocidurae (strain Achema). The 1,557,560-bp genome (27% GC content) comprises one 919,477-bp linear chromosome and 638,083-bp plasmids that together carry 1,472 open reading frames, 32 tRNAs, and three complete rRNAs, with almost complete colinearity between B. crocidurae and Borrelia duttonii chromosomes. PMID:22740657
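
    As a quick consistency check, the chromosome and plasmid sizes quoted in this abstract should add up to the reported genome size. A minimal sketch follows; the variable and function names are illustrative and do not come from the paper.

```python
# Replicon sizes quoted in the abstract, in base pairs.
CHROMOSOME_BP = 919_477
PLASMIDS_BP = 638_083          # combined size of all plasmids, as quoted
REPORTED_TOTAL_BP = 1_557_560
REPORTED_ORFS = 1_472


def replicon_total(sizes_bp):
    """Sum the sizes of all replicons (chromosome plus plasmids)."""
    return sum(sizes_bp)


def orfs_per_kb(n_orfs: int, genome_bp: int) -> float:
    """Average number of open reading frames per kilobase of genome."""
    return n_orfs / (genome_bp / 1000)


if __name__ == "__main__":
    total = replicon_total([CHROMOSOME_BP, PLASMIDS_BP])
    print(total == REPORTED_TOTAL_BP)
    print(f"{orfs_per_kb(REPORTED_ORFS, total):.2f} ORFs per kb")
```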

  8. Accelerating Genome Sequencing 100X with FPGAs

    SciTech Connect

    Storaasli, Olaf O [ORNL; Strenski, Dave [Cray, Inc.

    2007-01-01

    The performance of two Cray XD1 systems with Virtex-II Pro 50 and Virtex-4 LX160 FPGAs was evaluated using the FASTA computational biology program for human genome (DNA and protein) sequence comparisons. FPGA speedups of 50X (Virtex-II Pro 50) and 100X (Virtex-4 LX160) over a 2.2 GHz Opteron were obtained. FPGA coding issues for human genome data are described.

  9. Genome instability, cancer and aging

    PubMed Central

    Maslov, Alexander Y.; Vijg, Jan

    2015-01-01

    DNA damage-driven genome instability underlies the diversity of life forms generated by the evolutionary process but is detrimental to the somatic cells of individual organisms. The cellular response to DNA damage can be roughly divided into two parts. First, when damage is severe, programmed cell death may occur or, alternatively, temporary or permanent cell cycle arrest. This protects against cancer but can have negative effects in the long term, e.g., by depleting stem cell reservoirs. Second, damage can be repaired through one or more of the many sophisticated genome maintenance pathways. However, erroneous DNA repair and incomplete restoration of chromatin after damage is resolved produce mutations and epimutations, respectively, both of which have been shown to accumulate with age. An increased burden of mutations and/or epimutations in aged tissues increases cancer risk and adversely affects gene transcriptional regulation, leading to progressive decline in organ function. Cellular degeneration and uncontrolled cell proliferation are both major hallmarks of aging. Despite the fact that one seems to exclude the other, they both may be driven by a common mechanism. Here, we review age-related changes in the mammalian genome and their possible functional consequences, with special emphasis on genome instability in stem/progenitor cells. PMID:19344750

  10. Noninvasive fetal genome sequencing: a primer.

    PubMed

    Snyder, Matthew W; Simmons, LaVone E; Kitzman, Jacob O; Santillan, Donna A; Santillan, Mark K; Gammill, Hilary S; Shendure, Jay

    2013-06-01

    We recently demonstrated whole genome sequencing of a human fetus using only parental DNA samples and plasma from the pregnant mother. This proof-of-concept study demonstrated how samples obtained noninvasively in the first or second trimester can be analyzed to yield a highly accurate and substantially complete genetic profile of the fetus, including both inherited and de novo variation. Here, we revisit our original study from a clinical standpoint, provide an overview of the scientific approach, and describe opportunities and challenges along the path toward clinical adoption of noninvasive fetal whole genome sequencing. PMID:23553552

  11. Noninvasive fetal genome sequencing: a primer

    PubMed Central

    Snyder, Matthew W.; Simmons, LaVone E.; Kitzman, Jacob O.; Santillan, Donna A.; Santillan, Mark K.; Gammill, Hilary S.; Shendure, Jay

    2013-01-01

    We recently demonstrated whole genome sequencing of a human fetus using only parental DNA samples and plasma from the pregnant mother. This proof-of-concept study demonstrated how samples obtained noninvasively in the first or second trimester can be analyzed to yield a highly accurate and substantially complete genetic profile of the fetus, including both inherited and de novo variation. Here, we revisit our original study from a clinical standpoint, provide an overview of the scientific approach, and describe opportunities and challenges along the path towards clinical adoption of noninvasive fetal whole genome sequencing (NIFWGS). PMID:23553552

  12. Office of Cancer Genomics

    Cancer.gov

    Caused by infection with human immunodeficiency virus (HIV), acquired immunodeficiency syndrome (AIDS) is a complex and devastating disease brought about by the systematic destruction of a person's immune response. A weakened immune system can lead to a variety of opportunistic infections in affected persons, as well as a distinct spectrum of tumors known as AIDS-defining cancers. Some of these malignancies, such as Kaposi sarcoma, are also observed in other immunocompromised populations, while others are seen at increased rates only in AIDS patients.

  13. Controlling Size When Aligning Multiple Genomic Sequences with Duplications

    E-print Network

    Miller, Webb

    - ments in 1% of the human genome. As part of the project, genomic sequence data from a number of mammals; ... relationship among aligned sequences to be the same as the phylogenetic tree relating the species for those sequences. A main (and probably the main

  14. Genome Sequence of Mercury-Methylating and Pleomorphic Desulfovibrio africanus

    E-print Network

    Genome Sequence of Mercury-Methylating and Pleomorphic Desulfovibrio africanus Contact: Steven D. africanus genome sequence to allow us to gain insights into the physiological states genomics using the sequence information for D. africanus and the previously sequenced mercury methylator D

  15. Integrative clinical genomics of advanced prostate cancer.

    PubMed

    Robinson, Dan; Van Allen, Eliezer M; Wu, Yi-Mi; Schultz, Nikolaus; Lonigro, Robert J; Mosquera, Juan-Miguel; Montgomery, Bruce; Taplin, Mary-Ellen; Pritchard, Colin C; Attard, Gerhardt; Beltran, Himisha; Abida, Wassim; Bradley, Robert K; Vinson, Jake; Cao, Xuhong; Vats, Pankaj; Kunju, Lakshmi P; Hussain, Maha; Feng, Felix Y; Tomlins, Scott A; Cooney, Kathleen A; Smith, David C; Brennan, Christine; Siddiqui, Javed; Mehra, Rohit; Chen, Yu; Rathkopf, Dana E; Morris, Michael J; Solomon, Stephen B; Durack, Jeremy C; Reuter, Victor E; Gopalan, Anuradha; Gao, Jianjiong; Loda, Massimo; Lis, Rosina T; Bowden, Michaela; Balk, Stephen P; Gaviola, Glenn; Sougnez, Carrie; Gupta, Manaswi; Yu, Evan Y; Mostaghel, Elahe A; Cheng, Heather H; Mulcahy, Hyojeong; True, Lawrence D; Plymate, Stephen R; Dvinge, Heidi; Ferraldeschi, Roberta; Flohr, Penny; Miranda, Susana; Zafeiriou, Zafeiris; Tunariu, Nina; Mateo, Joaquin; Perez-Lopez, Raquel; Demichelis, Francesca; Robinson, Brian D; Schiffman, Marc; Nanus, David M; Tagawa, Scott T; Sigaras, Alexandros; Eng, Kenneth W; Elemento, Olivier; Sboner, Andrea; Heath, Elisabeth I; Scher, Howard I; Pienta, Kenneth J; Kantoff, Philip; de Bono, Johann S; Rubin, Mark A; Nelson, Peter S; Garraway, Levi A; Sawyers, Charles L; Chinnaiyan, Arul M

    2015-05-21

    Toward development of a precision medicine framework for metastatic, castration-resistant prostate cancer (mCRPC), we established a multi-institutional clinical sequencing infrastructure to conduct prospective whole-exome and transcriptome sequencing of bone or soft tissue tumor biopsies from a cohort of 150 mCRPC affected individuals. Aberrations of AR, ETS genes, TP53, and PTEN were frequent (40%-60% of cases), with TP53 and AR alterations enriched in mCRPC compared to primary prostate cancer. We identified new genomic alterations in PIK3CA/B, R-spondin, BRAF/RAF1, APC, β-catenin, and ZBTB16/PLZF. Moreover, aberrations of BRCA2, BRCA1, and ATM were observed at substantially higher frequencies (19.3% overall) compared to those in primary prostate cancers. 89% of affected individuals harbored a clinically actionable aberration, including 62.7% with aberrations in AR, 65% in other cancer-related genes, and 8% with actionable pathogenic germline alterations. This cohort study provides clinically actionable information that could impact treatment decisions for these affected individuals. PMID:26000489

  16. Telomeric repeat-containing RNA/G-quadruplex-forming sequences cause genome-wide alteration of gene expression in human cancer cells in vivo.

    PubMed

    Hirashima, Kyotaro; Seimiya, Hiroyuki

    2015-02-27

    Telomere erosion causes cell mortality, suggesting that longer telomeres enable more cell divisions. In telomerase-positive human cancer cells, however, telomeres are often kept shorter than those of surrounding normal tissues. Recently, we showed that cancer cell telomere elongation represses innate immune genes and promotes their differentiation in vivo. This implies that short telomeres contribute to cancer malignancy, but it is unclear how such genetic repression is caused by elongated telomeres. Here, we report that telomeric repeat-containing RNA (TERRA) induces a genome-wide alteration of gene expression in telomere-elongated cancer cells. Using three different cell lines, we found that telomere elongation up-regulates TERRA signal and down-regulates innate immune genes such as STAT1, ISG15 and OAS3 in vivo. Ectopic TERRA oligonucleotides repressed these genes even in cells with short telomeres under three-dimensional culture conditions. This appeared to occur from the action of G-quadruplexes (G4) in TERRA, because control oligonucleotides had no effect and a nontelomeric G4-forming oligonucleotide phenocopied the TERRA oligonucleotide. Telomere elongation and G4-forming oligonucleotides showed similar gene expression signatures. Most of the commonly suppressed genes were involved in the innate immune system and were up-regulated in various cancers. We propose that TERRA G4 counteracts cancer malignancy by suppressing innate immune genes. PMID:25653161

  17. The UCSC cancer genomics browser: update 2011

    PubMed Central

    Sanborn, J. Zachary; Benz, Stephen C.; Craft, Brian; Szeto, Christopher; Kober, Kord M.; Meyer, Laurence; Vaske, Charles J.; Goldman, Mary; Smith, Kayla E.; Kuhn, Robert M.; Karolchik, Donna; Kent, W. James; Stuart, Joshua M.; Haussler, David; Zhu, Jingchun

    2011-01-01

    The UCSC Cancer Genomics Browser (https://genome-cancer.ucsc.edu) comprises a suite of web-based tools to integrate, visualize and analyze cancer genomics and clinical data. The browser displays whole-genome views of genome-wide experimental measurements for multiple samples alongside their associated clinical information. Multiple data sets can be viewed simultaneously as coordinated ‘heatmap tracks’ to compare across studies or different data modalities. Users can order, filter, aggregate, classify and display data interactively based on any given feature set including clinical features, annotated biological pathways and user-contributed collections of genes. Integrated standard statistical tools provide dynamic quantitative analysis within all available data sets. The browser hosts a growing body of publicly available cancer genomics data from a variety of cancer types, including data generated from the Cancer Genome Atlas project. Multiple consortiums use the browser on confidential prepublication data enabled by private installations. Many new features have been added, including the hgMicroscope tumor image viewer, hgSignature for real-time genomic signature evaluation on any browser track, and ‘PARADIGM’ pathway tracks to display integrative pathway activities. The browser is integrated with the UCSC Genome Browser; thus inheriting and integrating the Genome Browser’s rich set of human biology and genetics data that enhances the interpretability of the cancer genomics data. PMID:21059681

  18. A Comparison of the First Two Sequenced Chloroplast Genomes in Asteraceae: Lettuce and Sunflower

    E-print Network

    Timme, Ruth E.

    2009-01-01

    genomes are crop plants, their complete genome sequence will ... chloroplast genome sequence for any plant within the larger ... sequence of Glycine max and comparative analyses with other legume genomes. Plant

  19. Genome Sequence Assembly Using Trace Signals and Additional Sequence Information

    Microsoft Academic Search

    Bastien Chevreux; Thomas Wetter; Sándor Suhai

    1999-01-01

    Motivation: This article presents a method for assembling shotgun sequences which primarily uses high confidence regions whilst taking advantage of additional available information such as low confidence regions, quality values or repetitive region tags. Conflict situations are resolved with routines for analysing trace signals. Results: Initial tests with different human and mouse genome projects showed promising results but

  20. The first Korean genome sequence and analysis: Full genome sequencing for a socio-ethnic group

    PubMed Central

    Ahn, Sung-Min; Kim, Tae-Hyung; Lee, Sunghoon; Kim, Deokhoon; Ghang, Ho; Kim, Dae-Soo; Kim, Byoung-Chul; Kim, Sang-Yoon; Kim, Woo-Yeon; Kim, Chulhong; Park, Daeui; Lee, Yong Seok; Kim, Sangsoo; Reja, Rohit; Jho, Sungwoong; Kim, Chang Geun; Cha, Ji-Young; Kim, Kyung-Hee; Lee, Bonghee; Bhak, Jong; Kim, Seong-Jin

    2009-01-01

    We present the first Korean individual genome sequence (SJK) and analysis results. The diploid genome of a Korean male was sequenced to 28.95-fold redundancy using the Illumina paired-end sequencing method. SJK covered 99.9% of the NCBI human reference genome. We identified 420,083 novel single nucleotide polymorphisms (SNPs) that are not in the dbSNP database. Despite a close similarity, significant differences were observed between the Chinese genome (YH), the only other Asian genome available, and SJK: (1) 39.87% (1,371,239 out of 3,439,107) SNPs were SJK-specific (49.51% against Venter's, 46.94% against Watson's, and 44.17% against the Yoruba genomes); (2) 99.5% (22,495 out of 22,605) of short indels (< 4 bp) discovered on the same loci had the same size and type as YH; and (3) 11.3% (331 out of 2920) deletion structural variants were SJK-specific. Even after attempting to map unmapped reads of SJK to unanchored NCBI scaffolds, HGSV, and available personal genomes, there were still 5.77% SJK reads that could not be mapped. All these findings indicate that the overall genetic differences among individuals from closely related ethnic groups may be significant. Hence, constructing reference genomes for minor socio-ethnic groups will be useful for massive individual genome sequencing. PMID:19470904
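    The percentages quoted in this abstract follow directly from the counts given alongside them. A minimal sketch that recomputes them is shown here; the dictionary labels are illustrative names chosen for this example, not terms from the paper.

```python
# Counts quoted in the abstract above, as (numerator, denominator).
# The labels are illustrative, not taken from the paper.
reported = {
    "SJK-specific SNPs vs YH": (1_371_239, 3_439_107),
    "short indels matching YH": (22_495, 22_605),
    "SJK-specific deletion SVs": (331, 2_920),
}


def percent(numerator, denominator):
    """Return numerator / denominator expressed as a percentage."""
    return 100.0 * numerator / denominator


percentages = {label: percent(n, d) for label, (n, d) in reported.items()}

for label, value in percentages.items():
    print(f"{label}: {value:.2f}%")
```

    Rounded to the precision used in the abstract, the three ratios match the stated 39.87%, 99.5% and 11.3%.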

  1. DNA secondary structures and epigenetic determinants of cancer genome evolution

    PubMed Central

    De, Subhajyoti; Michor, Franziska

    2014-01-01

    An unstable genome is a hallmark of many cancers. It is unclear, however, whether some mutagenic features driving somatic alterations in cancer are encoded in the genome sequence and whether they can operate in a tissue-specific manner. We performed a genome-wide analysis of 663,446 DNA breakpoints associated with somatic copy-number alterations (SCNAs) from 2,792 cancer samples classified into 26 cancer types. Many SCNA breakpoints are spatially clustered in cancer genomes. We observed a significant enrichment for G-quadruplex sequences (G4s) in the vicinity of SCNA breakpoints and established that SCNAs show a strand bias consistent with G4-mediated structural alterations. Notably, abnormal hypomethylation near G4s-rich regions is a common signature for many SCNA breakpoint hotspots. We propose a mechanistic hypothesis that abnormal hypomethylation in genomic regions enriched for G4s acts as a mutagenic factor driving tissue-specific mutational landscapes in cancer. PMID:21725294

  2. Multilocus Sequence Typing of Total-Genome-Sequenced Bacteria

    PubMed Central

    Cosentino, Salvatore; Rasmussen, Simon; Friis, Carsten; Hasman, Henrik; Marvig, Rasmus Lykke; Jelsbak, Lars; Sicheritz-Pontén, Thomas; Ussery, David W.; Aarestrup, Frank M.; Lund, Ole

    2012-01-01

    Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the “gold standard” of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available to scientists and routine diagnostic laboratories. Currently, the cost is below that of traditional MLST. The new challenges will be how to extract the relevant information from the large amount of data so as to allow for comparison over time and between laboratories. Ideally, this information should also allow for comparison to historical data. We developed a Web-based method for MLST of 66 bacterial species based on WGS data. As input, the method uses short sequence reads from four sequencing platforms or preassembled genomes. Updates from the MLST databases are downloaded monthly, and the best-matching MLST alleles of the specified MLST scheme are found using a BLAST-based ranking method. The sequence type is then determined by the combination of alleles identified. The method was tested on preassembled genomes from 336 isolates covering 56 MLST schemes, on short sequence reads from 387 isolates covering 10 schemes, and on a small test set of short sequence reads from 29 isolates for which the sequence type had been determined by traditional methods. The method presented here enables investigators to determine the sequence types of their isolates on the basis of WGS data. This method is publicly available at www.cbs.dtu.dk/services/MLST. PMID:22238442
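    The final step described above, in which the sequence type is determined by the combination of alleles identified, amounts to a table lookup. The sketch below illustrates only that step, under stated assumptions: the locus names, allele numbers and ST labels are hypothetical, and the BLAST-based ranking that selects the best-matching alleles is not reproduced.

```python
# Hypothetical MLST scheme: ordered loci and a profile table mapping
# one allele number per locus to a sequence type (ST).
LOCI = ("locusA", "locusB", "locusC")
PROFILES = {
    (1, 1, 1): "ST-1",
    (1, 2, 1): "ST-2",
    (3, 2, 5): "ST-7",
}


def sequence_type(best_alleles):
    """Look up the ST for a dict of locus -> best-matching allele number.

    Returns None when a locus has no allele call or when the allele
    combination is not in the profile table (a possible novel type).
    """
    try:
        key = tuple(best_alleles[locus] for locus in LOCI)
    except KeyError:
        return None  # at least one locus is missing from the input
    return PROFILES.get(key)


print(sequence_type({"locusA": 1, "locusB": 2, "locusC": 1}))
```

    In a real scheme the profile table would come from the downloaded MLST database rather than being hard-coded.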

  3. Comparison of Sample Sequences of the Salmonella typhi Genome to the Sequence of the Complete Escherichia coli K-12 Genome

    Microsoft Academic Search

    MICHAEL MCCLELLAND; RICHARD K. WILSON

    1998-01-01

    Raw sequence data representing the majority of a bacterial genome can be obtained at a tiny fraction of the cost of a completed sequence. To demonstrate the utility of such a resource, 870 single-stranded M13 clones were sequenced from a shotgun library of the Salmonella typhi Ty2 genome. The sequence reads averaged over 400 bases and sampled the genome with

  4. Draft Genome Sequence of Virgibacillus halodenitrificans 1806

    PubMed Central

    Lee, Sang-Jae; Lee, Yong-Jik; Jeong, Haeyoung; Lee, Sang Jun; Lee, Han-Seung; Pan, Jae-Gu

    2012-01-01

    Virgibacillus halodenitrificans 1806 is an endospore-forming halophilic bacterium isolated from salterns in Korea. Here, we report the draft genome sequence of V. halodenitrificans 1806, which may reveal the molecular basis of osmoadaptation and insights into carbon and anaerobic metabolism in moderate halophiles. PMID:23105070

  5. Feature Opinion From complete genome sequence to

    E-print Network

    Levin, Judith G.

    bacteria, 61 archaea, and 23 eukaryotes) were completely sequenced, deposited in the public nucleotide prokaryotic lineage (the Genomic Encyclopedia of Bacteria and Archaea: www.jgi.doe.gov/programs/GEBA/, [4]. Similarly, in structural genomics projects, the chances of discovering a new protein fold or even a new

  6. Hidden ribozymes in eukaryotic genome sequence

    PubMed Central

    2010-01-01

    The small self-cleaving ribozymes fold into complex tertiary structures to promote autocatalytic cleavage or ligation at a precise position within their sequence. Until recently, relatively few examples had been identified. Two papers now reveal that self-cleaving ribozymes are prevalent in eukaryotic genomes and, in some cases, might play a role in regulating gene expression. PMID:20948783

  7. Genome Sequence of Lactobacillus amylovorus GRL1112

    PubMed Central

    Kant, Ravi; Paulin, Lars; Alatalo, Edward; de Vos, Willem M.; Palva, Airi

    2011-01-01

    Lactobacillus amylovorus is a common member of the normal gastrointestinal tract (GIT) microbiota in pigs. Here, we report the genome sequence of L. amylovorus GRL1112, a porcine feces isolate displaying strong adherence to the pig intestinal epithelial cells. The strain is of interest, as it is a potential probiotic bacterium. PMID:21131492

  8. Genome sequence of Lactobacillus amylovorus GRL1112.

    PubMed

    Kant, Ravi; Paulin, Lars; Alatalo, Edward; de Vos, Willem M; Palva, Airi

    2011-02-01

    Lactobacillus amylovorus is a common member of the normal gastrointestinal tract (GIT) microbiota in pigs. Here, we report the genome sequence of L. amylovorus GRL1112, a porcine feces isolate displaying strong adherence to the pig intestinal epithelial cells. The strain is of interest, as it is a potential probiotic bacterium. PMID:21131492

  9. Cancer Vulnerabilities Unveiled by Genomic Loss

    E-print Network

    Nijhawan, Deepak

    Due to genome instability, most cancers exhibit loss of regions containing tumor suppressor genes and collateral loss of other genes. To identify cancer-specific vulnerabilities that are the result of copy number losses, ...

  10. Assigning genomic sequences to CATH

    Microsoft Academic Search

    Frances M. G. Pearl; David Lee; James E. Bray; Ian Sillitoe; Annabel E. Todd; Andrew P. Harrison; Janet M. Thornton; Christine A. Orengo

    2000-01-01

    We report the latest release (version 1.6) of the CATH protein domains database (http://www.biochem.ucl.ac.uk/bsm/cath). This is a hierarchical classification of 18 577 domains into evolutionary families and structural groupings. We have identified 1028 homologous superfamilies in which the proteins have both structural, and sequence or functional similarity. These can be further clustered into 672 fold groups and

  11. Matrix factorization methods for integrative cancer genomics.

    PubMed

    Zhang, Shihua; Zhou, Xianghong Jasmine

    2014-01-01

    With the rapid development of high-throughput sequencing technologies, many groups are generating multi-platform genomic profiles (e.g., DNA methylation and gene expression) for their biological samples. This activity has generated a huge number of so-called "multidimensional genomic datasets," providing unique opportunities and challenges to study coordination among different regulatory levels and discover underlying combinatorial patterns of cellular systems. We summarize a matrix factorization framework to address the challenge of integrating multiple genomic datasets, as well as a semi-supervised variant of the method that can incorporate prior knowledge. The basic idea is to project the different kinds of genomic data onto a common coordinate system, wherein genetic variables that are strongly correlated in a subset of samples form a multidimensional module. In the context of cancer biology, such modules reveal perturbed pathways and clinically distinct patient subgroups that would have been overlooked with only a single type of data. In summary, the matrix factorization framework can uncover associations between distinct layers of cellular activity and explain their biological implications in multidimensional data. PMID:25030932
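    The idea of projecting several data types onto a common coordinate system can be illustrated with a small joint non-negative matrix factorization. This is a sketch under stated assumptions rather than the authors' implementation: each data type X_i (samples by features) is approximated as W H_i with one shared sample factor W, fitted with standard multiplicative updates for squared error. The function names and the toy methylation and expression matrices are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def scale(m, num, den, eps=1e-9):
    """Multiplicative update: m * num / den, element by element."""
    return [[v * n / (d + eps) for v, n, d in zip(mr, nr, dr)]
            for mr, nr, dr in zip(m, num, den)]


def error(xs, w, hs):
    """Total squared reconstruction error over all data types."""
    total = 0.0
    for x, h in zip(xs, hs):
        for x_row, a_row in zip(x, matmul(w, h)):
            total += sum((xv - av) ** 2 for xv, av in zip(x_row, a_row))
    return total


def joint_nmf(xs, rank, iters=100, seed=0):
    """Factor every X_i in xs as W @ H_i with one shared non-negative W."""
    rng = random.Random(seed)
    n_samples = len(xs[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_samples)]
    hs = [[[rng.random() + 0.1 for _ in x[0]] for _ in range(rank)] for x in xs]
    # Sharing W is equivalent to factoring the column-wise concatenation.
    x_all = [sum((x[i] for x in xs), []) for i in range(n_samples)]
    history = [error(xs, w, hs)]
    for _ in range(iters):
        wt = transpose(w)
        wtw = matmul(wt, w)
        # Update each feature factor H_i with W held fixed.
        hs = [scale(h, matmul(wt, x), matmul(wtw, h)) for x, h in zip(xs, hs)]
        # Update the shared W using all data types at once.
        h_all = [sum((h[k] for h in hs), []) for k in range(rank)]
        h_all_t = transpose(h_all)
        w = scale(w, matmul(x_all, h_all_t), matmul(w, matmul(h_all, h_all_t)))
        history.append(error(xs, w, hs))
    return w, hs, history


# Toy data: 4 samples, two hypothetical data types sharing a 2-group structure.
methylation = [[0.9, 0.8, 0.1], [0.8, 0.9, 0.2], [0.1, 0.2, 0.9], [0.2, 0.1, 0.8]]
expression = [[5.0, 1.0], [4.0, 1.5], [1.0, 6.0], [0.5, 5.0]]

w, hs, history = joint_nmf([methylation, expression], rank=2)
print(f"error before: {history[0]:.3f}, after: {history[-1]:.3f}")
```

    Rows of W place each sample in the shared coordinate system, and features with large weights in the same row of the H_i matrices play the role of a multidimensional module. The semi-supervised variant mentioned in the abstract is not covered here.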

  12. Defining Genome Project Standards in a New Era of Sequencing

    SciTech Connect

    Chain, Patrick [DOE-JGI

    2009-05-27

    Patrick Chain of the DOE Joint Genome Institute gives a talk on behalf of the International Genome Sequencing Standards Consortium on the need for intermediate genome classifications between "draft" and "finished"

  13. CGCI Investigators Reveal Comprehensive Landscape of Diffuse Large B-Cell Lymphoma (DLBCL) Genomes | Office of Cancer Genomics

    Cancer.gov

    Researchers from British Columbia Cancer Agency used whole genome sequencing to analyze 40 DLBCL cases and 13 cell lines in order to fill in the gaps of the complex landscape of DLBCL genomes. Their analysis, “Mutational and structural analysis of diffuse large B-cell lymphoma using whole genome sequencing,” was published online in Blood on May 22. The authors are Ryan Morin, Marco Marra, and colleagues.

  14. Dominant short repeated sequences in bacterial genomes.

    PubMed

    Avershina, Ekaterina; Rudi, Knut

    2015-03-01

    We use a novel multidimensional searching approach to present the first exhaustive search for all possible repeated sequences in 166 genomes selected to cover the bacterial domain. We found an overrepresentation of repeated sequences in all but one of the genomes. The most prevalent repeats by far were related to interspaced short palindromic repeats (CRISPRs)—conferring bacterial adaptive immunity. We identified a deep branching clade of thermophilic Firmicutes containing the highest number of CRISPR repeats. We also identified a high prevalence of tandem repeated heptamers. In addition, we identified GC-rich repeats that could potentially be involved in recombination events. Finally, we identified repeats in a 16322 amino acid mega protein (involved in biofilm formation) and inverted repeats flanking miniature transposable elements (MITEs). In conclusion, the exhaustive search for repeated sequences identified new elements and distribution of these, which has implications for understanding both the ecology and evolution of bacteria. PMID:25561351

  15. Draft Genome Sequence of Mycobacterium elephantis Strain Lipa

    PubMed Central

    Greninger, Alexander L.; Cunningham, Gail; Yu, Joanna M.; Hsu, Elaine D.; Chiu, Charles Y.

    2015-01-01

    We report the draft genome sequence of Mycobacterium elephantis strain Lipa from a sputum sample of a patient with pulmonary disease. This is the first draft genome sequence of M. elephantis, a rapidly growing mycobacterium. PMID:26112791

  16. Draft Genome Sequence of Mycobacterium arupense Strain GUC1

    PubMed Central

    Greninger, Alexander L.; Cunningham, Gail; Yu, Joanna M.; Hsu, Elaine D.; Chiu, Charles Y.

    2015-01-01

    We report the draft genome sequence of Mycobacterium arupense strain GUC1 from a sputum sample of a patient with bronchiectasis. This is the first draft genome sequence of Mycobacterium arupense, a rapidly growing nonchromogenic mycobacterium. PMID:26067970

  17. The Genome Sequence DataBase (GSDB): meeting the challenge of genomic sequencing

    Microsoft Academic Search

    Gifford Keen; Jillian Burton; David Crowley; Emily Dickinson; Ada Espinosa-lujan; Ed Franks; Carol Harger; Mo Manning; Shelley March; Mia Mcleod; John O'neill; Alicia Power; Maria Pumilia; Rhonda Reinert; David Rider; John Rohrlich; Jolene Schwertfeger; Linda Smyth; Nina Thayer; Charles Troup; Chris A. Fields

    1996-01-01

    The genome sequence database (GSDB) is a complete, publicly available relational database of DNA sequences and annotation maintained by the National Center for Genome Resources (NCGR) under a Cooperative Agreement with the US Department of Energy (DOE). GSDB provides direct, client-server access to the database for data contributions, community annotation and SQL queries. The GSDB Annotator, a

  18. Genlight: Interactive high-throughput sequence analysis and comparative genomics

    Microsoft Academic Search

    Michael Beckstette; Jens T. Mailänder; Richard J. Marhöfer; Alexander Sczyrba; Enno Ohlebusch; Robert Giegerich; Paul M. Selzer

    2004-01-01

    With rising numbers of fully sequenced genomes the importance of comparative genomics is constantly increasing. Although several software systems for genome comparison analyses do exist, their functionality and flexibility is still limited, compared to the manifold possible applications. Therefore, we developed Genlight, a Client/Server based program suite for large scale sequence analysis and comparative genomics. Genlight uses

  19. Genome sequencing and analysis of the model grass Brachypodium distachyon

    E-print Network

    Green, Pamela

    ARTICLES Genome sequencing and analysis of the model grass Brachypodium distachyon) and contains three independent genomes8 . This has prohibited genome-scale comparisons spanning the three most describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our

  20. Initial sequencing and comparative analysis of the mouse genome

    Microsoft Academic Search

    Robert H. Waterston; Kerstin Lindblad-Toh; Ewan Birney; Jane Rogers; Josep F. Abril; Pankaj Agarwal; Richa Agarwala; Rachel Ainscough; Marina Alexandersson; Peter An; Stylianos E. Antonarakis; John Attwood; Robert Baertsch; Jonathon Bailey; Karen Barlow; Stephan Beck; Eric Berry; Bruce Birren; Toby Bloom; Peer Bork; Marc Botcherby; Nicolas Bray; Michael R. Brent; Daniel G. Brown; Stephen D. Brown; Carol Bult; John Burton; Jonathan Butler; Robert D. Campbell; Piero Carninci; Simon Cawley; Francesca Chiaromonte; Asif T. Chinwalla; Deanna M. Church; Michele Clamp; Christopher Clee; Francis S. Collins; Lisa L. Cook; Richard R. Copley; Alan Coulson; Olivier Couronne; James Cuff; Val Curwen; Tim Cutts; Mark Daly; Robert David; Joy Davies; Kimberly D. Delehaunty; Justin Deri; Emmanouil T. Dermitzakis; Colin Dewey; Nicholas J. Dickens; Mark Diekhans; Sheila Dodge; Inna Dubchak; Diane M. Dunn; Sean R. Eddy; Laura Elnitski; Richard D. Emes; Pallavi Eswara; Eduardo Eyras; Adam Felsenfeld; Ginger A. Fewell; Paul Flicek; Karen Foley; Wayne N. Frankel; Lucinda A. Fulton; Robert S. Fulton; Terrence S. Furey; Diane Gage; Richard A. Gibbs; Gustavo Glusman; Sante Gnerre; Nick Goldman; Leo Goodstadt; Darren Grafham; Tina A. Graves; Eric D. Green; Simon Gregory; Roderic Guigó; Mark Guyer; Ross C. Hardison; David Haussler; Yoshihide Hayashizaki; LaDeana W. Hillier; Angela Hinrichs; Wratko Hlavina; Timothy Holzer; Fan Hsu; Axin Hua; Tim Hubbard; Adrienne Hunt; Ian Jackson; David B. Jaffe; L. Steven Johnson; Matthew Jones; Thomas A. Jones; Ann Joy; Michael Kamal; Elinor K. Karlsson; Donna Karolchik; Arkadiusz Kasprzyk; Jun Kawai; Evan Keibler; Cristyn Kells; W. James Kent; Andrew Kirby; Diana L. Kolbe; Ian Korf; Raju S. Kucherlapati; Edward J. Kulbokas; David Kulp; Tom Landers; J. P. Leger; Steven Leonard; Ivica Letunic; Rosie Levine; Jia Li; Ming Li; Christine Lloyd; Susan Lucas; Bin Ma; Donna R. Maglott; Elaine R. Mardis; Lucy Matthews; Evan Mauceli; John H. 
Mayer; Megan McCarthy; W. Richard McCombie; Stuart McLaren; Kirsten McLay; John D. McPherson; Jim Meldrim; Beverley Meredith; Jill P. Mesirov; Webb Miller; Tracie L. Miner; Emmanuel Mongin; Kate T. Montgomery; Michael Morgan; Richard Mott; James C. Mullikin; Donna M. Muzny; William E. Nash; Joanne O. Nelson; Michael N. Nhan; Robert Nicol; Zemin Ning; Chad Nusbaum; Michael J. O'Connor; Yasushi Okazaki; Karen Oliver; Emma Overton-Larty; Lior Pachter; Genís Parra; Kymberlie H. Pepin; Jane Peterson; Pavel Pevzner; Robert Plumb; Craig S. Pohl; Alex Poliakov; Tracy C. Ponce; Simon Potter; Michael Quail; Alexandre Reymond; Bruce A. Roe; Krishna M. Roskin; Edward M. Rubin; Alistair G. Rust; Victor Sapojnikov; Brian Schultz; Jörg Schultz; Scott Schwartz; Carol Scott; Steven Seaman; Steve Searle; Ted Sharpe; Andrew Sheridan; Ratna Shownkeen; Sarah Sims; Jonathan B. Singer; Guy Slater; Arian Smit; Douglas R. Smith; Brian Spencer; Arne Stabenau; Nicole Stange-Thomann; Charles Sugnet; Mikita Suyama; Glenn Tesler; Johanna Thompson; David Torrents; Evanne Trevaskis; John Tromp; Catherine Ucla; Abel Ureta-Vidal; Jade P. Vinson; Andrew C. von Niederhausern; Claire M. Wade; Melanie Wall; Ryan J. Weber; Robert B. Weiss; Michael C. Wendl; Anthony P. West; Kris Wetterstrand; Raymond Wheeler; Simon Whelan; Jamey Wierzbowski; David Willey; Sophie Williams; Richard K. Wilson; Eitan Winter; Kim C. Worley; Dudley Wyman; Shan Yang; Shiaw-Pyng Yang; Evgeny M. Zdobnov; Michael C. Zody; Eric S. Lander; Chris P. Ponting; Matthias S. Schwartz

    2002-01-01

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing

  1. The diploid genome sequence of an Asian individual

    Microsoft Academic Search

    Jun Wang; Wei Wang; Ruiqiang Li; Yingrui Li; Geng Tian; Laurie Goodman; Wei Fan; Junqing Zhang; Jun Li; Juanbin Zhang; Yiran Guo; Binxiao Feng; Heng Li; Yao Lu; Xiaodong Fang; Huiqing Liang; Zhenglin Du; Dong Li; Yiqing Zhao; Yujie Hu; Zhenzhen Yang; Hancheng Zheng; Ines Hellmann; Michael Inouye; John Pool; Xin Yi; Jing Zhao; Jinjie Duan; Yan Zhou; Junjie Qin; Lijia Ma; Guoqing Li; Zhentao Yang; Guojie Zhang; Bin Yang; Chang Yu; Fang Liang; Wenjie Li; Shaochuan Li; Dawei Li; Peixiang Ni; Jue Ruan; Qibin Li; Hongmei Zhu; Dongyuan Liu; Zhike Lu; Ning Li; Guangwu Guo; Jianguo Zhang; Jia Ye; Lin Fang; Qin Hao; Quan Chen; Yu Liang; Yeyang Su; A. San; Cuo Ping; Shuang Yang; Fang Chen; Li Li; Ke Zhou; Hongkun Zheng; Yuanyuan Ren; Ling Yang; Guohua Yang; Zhuo Li; Xiaoli Feng; Karsten Kristiansen; Gane Ka-Shu Wong; Rasmus Nielsen; Richard Durbin; Lars Bolund; Xiuqing Zhang; Songgang Li; Huanming Yang; Jian Wang

    2008-01-01

    Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the

  2. The Genome Sequence of Drosophila melanogaster

    NSDL National Science Digital Library

    Ramanujan, Krishna.

    On Thursday March 23, 2000, a historic milestone was marked as researchers announced they have completed mapping the genome of the fruit fly, Drosophila melanogaster. The achievement, which was announced in a special issue of the journal Science, culminates close to 100 years of research. Drosophila melanogaster is the most complex animal thus far to have its genetic sequence deciphered. The findings have important implications for human medical research and for completing a map of the human genome. Mapping the fruit fly genome has been a broad collaborative effort between academia and industry in several countries. While a foundation was laid by US (Berkeley), European, and Canadian Drosophila Genome Projects, Celera Genomics finished the job over the last year by employing super-computers and state-of-the-art gene-sequencing machines. The techniques learned and used in this last phase of mapping may now be applied to more rapidly decode genes of other organisms, including humans. This week's In The News takes a closer look at this important landmark.

  3. Comparative Analysis of Genome Sequences with VISTA

    DOE Data Explorer

    Dubchak, Inna

    VISTA is a comprehensive suite of programs and databases developed by and hosted at the Genomics Division of Lawrence Berkeley National Laboratory. They provide information and tools designed to facilitate comparative analysis of genomic sequences. Users have two ways to interact with the suite of applications at the VISTA portal. They can submit their own sequences and alignments for analysis (VISTA servers) or examine pre-computed whole-genome alignments of different species. A key menu option is the Enhancer Browser and Database at http://enhancer.lbl.gov/. The VISTA Enhancer Browser is a central resource for experimentally validated human noncoding fragments with gene enhancer activity as assessed in transgenic mice. Most of these noncoding elements were selected for testing based on their extreme conservation with other vertebrates. The results of this enhancer screen are provided through this publicly available website. The browser also features relevant results by external contributors and a large collection of additional genome-wide conserved noncoding elements which are candidate enhancer sequences. The LBL developers invite external groups to submit computational predictions of developmental enhancers. As of 10/19/2009 the database contains information on 1109 in vivo tested elements - 508 elements with enhancer activity.

  4. The generation and utilization of a cancer-oriented representation of the human transcriptome by using expressed sequence tags

    Microsoft Academic Search

    Helena Brentani; Otávia L. Caballero; Anamaria A. Camargo; Aline M. da Silva; Wilson Araújo da Silva Jr.; Emmanuel Dias Neto; Marco Grivet; Arthur Gruber; Pedro Edson Moreira Guimaraes; Winston Hide; Christian Iseli; C. Victor Jongeneel; Janet Kelso; Maria Aparecida Nagai; Elida Paula Benquique Ojopi; Elisson C. Osorio; Eduardo M. R. Reis; Gregory J. Riggins; Andrew John George Simpson; Sandro de Souza; Brian J. Stevenson; Robert L. Strausberg; Eloiza H. Tajara; Sergio Verjovski-Almeida; Marcio Luis Acencio; Mário Henrique Bengtson; Fabiana Bettoni; Walter F. Bodmer; Marcelo R. S. Briones; Luiz Paulo Camargo; Webster Cavenee; Janete M. Cerutti; Luís Eduardo Coelho Andrade; Paulo César Costa Dos Santos; Maria Cristina Ramos Costa; Israel Tojal da Silva; Marcos Roberto H. Estécio; Karine Sa Ferreira; Frank B. Furnari; Milton Faria Jr.; Pedro A. F. Galante; Gustavo S. Guimaraes; Adriano Jesus Holanda; Edna Teruko Kimura; Maarten R. Leerkes; Xin Lu; Rui M. B. Maciel; Elizabeth A. L. Martins; Katlin Brauer Massirer; Analy S. A. Melo; Carlos Alberto Mestriner; Elisabete Cristina Miracca; Leandro Lorenco Miranda; Francisco G. Nobrega; Paulo S. Oliveira; Apuã C. M. Paquola; José Rodrigo C. Pandolfi; Maria Inês de Moura Campos Pardini; Fabio Passetti; John Quackenbush; Beatriz Schnabel; Mari Cleide Sogayar; Jorge E. Souza; Sandro R. Valentini; Andre C. Zaiats; Elisabete Jorge Amaral; Liliane A. T. Arnaldi; Amélia Goes de Araújo; Simone Aparecida de Bessa; David C. Bicknell; Maria Eugenia Ribeiro de Camaro; Dirce Maria Carraro; Helaine Carrer; Alex F. Carvalho; Christian Colin; Fernando Costa; Cyntia Curcio; Ismael Dale Cotrim Guerreiro da Silva; Neusa Pereira da Silva; Márcia Dellamano; Hamza El-Dorry; Enilza Maria Espreafico; Ari José Scattone Ferreira; Cristiane Ayres Ferreira; Maria Angela H. Z. Fortes; Angelita Habr Gama; Daniel Giannella-Neto; Maria Lúcia C. C. Giannella; Ricardo R. Giorgi; Gustavo Henrique Goldman; Maria Helena S. 
Goldman; Christine Hackel; Paulo Lee Ho; Elza Myiuki Kimura; Luiz Paulo Kowalski; Jose E. Krieger; Luciana C. C. Leite; Ademar Lopes; Ana Mercedes S. C. Luna; Alan Mackay; Suely Kazue Nagahashi Mari; Adriana Aparecida Marques; Waleska K. Martins; André Montagnini; Mario Mourão Neto; Ana Lucia T. O. Nascimento; A. Munro Neville; Marina P. Nobrega; Mike J. O'Hare; Audrey Yumi Otsuka; Anna Izabel Ruas de Melo; Maria Luisa Paçó-Larson; Gonçalo Guimarães Pereira; João Bosco Pesquero; Juliana Gilbert Pessoa; Paula Rahal; Claudia Aparecida Rainho; Vanderlei Rodrigues; Silvia Regina Rogatto; Camila Malta Romano; Janaína Gusmão Romeiro; Benedito Mauro Rossi; Monica Rusticci; Renata Guerra de Sá; Simone Cristina Sant' Anna; Míriam L. Sarmazo; Teresa Cristina De Lima E. Silva; Fernando Augusto Soares; Maria de Fátima Sonati; Josane de Freitas Sousa; Diana Queiroz; Valéria Valente; André Luiz Vettore; Fabiola Elizabeth Villanova; Marco Antonio Zago; Heloisa Zalcberg

    2003-01-01

    Whereas genome sequencing defines the genetic potential of an organism, transcript sequencing defines the utilization of this potential and links the genome with most areas of biology. To exploit the information within the human genome in the fight against cancer, we have deposited some two million expressed sequence tags (ESTs) from human tumors and their corresponding normal tissues in the

  5. The Cancer Genome Atlas Data Portal Now Available

    Cancer.gov

    Published on Office of Cancer Genomics (http://ocg.cancer.gov), October 01, 2007. We provide 3 ways to download data: The Cancer Genome Atlas

  6. The Cancer Genome Atlas Data Portal Now Available

    Cancer.gov

    Published on Office of Cancer Genomics (https://ocg.cancer.gov), October 01, 2007. We provide 3 ways to download data: The Cancer Genome Atlas

  7. Next-Generation Sequencing and De Novo Assembly, Genome Organization, and Comparative Genomic Analyses of the Genomes of Two Helicobacter pylori Isolates from Duodenal Ulcer Patients in India

    PubMed Central

    Kumar, Narender; Mukhopadhyay, Asish K.; Patra, Rajashree; De, Ronita; Baddam, Ramani; Shaik, Sabiha; Alam, Jawed; Tiruvayipati, Suma

    2012-01-01

    The prevalence of different H. pylori genotypes in various geographical regions indicates region-specific adaptations during the course of evolution. Complete genomes of H. pylori from countries with high infection burdens, such as India, have not yet been described. Herein we present genome sequences of two H. pylori strains, NAB47 and NAD1, from India. In this report, we briefly mention the sequencing and finishing approaches, genome assembly with downstream statistics, and important features of the two draft genomes, including their phylogenetic status. We believe that these genome sequences and the comparative genomics emanating thereupon will help us to clearly understand the ancestry and biology of the Indian H. pylori genotypes, and this will be helpful in solving the so-called Indian enigma, by which high infection rates do not corroborate the minuscule number of serious outcomes observed, including gastric cancer. PMID:23045484

  8. PASQUAL: Parallel Techniques for Next Generation Genome Sequence Assembly

    E-print Network

    Bader, David A.

    An organism's genome consists of base pairs (bp) from two strands of complementary bases. Reading a sequence ... PASQUAL: Parallel Techniques for Next Generation Genome Sequence Assembly Xing Liu, Student Member ... of genomes has been revolutionized by sequencing machines that output many short overlapping substrings

  9. The dynamics of cancer chromosomes and genomes.

    PubMed

    Ye, C J; Liu, G; Bremer, S W; Heng, H H Q

    2007-01-01

    A key feature of cancer chromosomes and genomes is their high level of dynamics and the ability to constantly evolve. This unique characteristic forms the basis of genetic heterogeneity necessary for cancer formation, which presents major obstacles to current cancer diagnosis and treatment. It has been difficult to integrate such dynamics into traditional models of cancer progression. In this conceptual piece, we briefly discuss some of the recent exciting progress in the field of cancer genomics and genome research. In particular, a re-evaluation of the previously disregarded non-clonal chromosome aberrations (NCCAs) is reviewed, coupled with the progress of the detection of sub-chromosomal aberrations with array technologies. Clearly, the high level of genetic heterogeneity is directly caused by genome instability that is mediated by stochastic genomic changes, and genome variations defined by chromosome aberrations are the driving force of cancer progression. In addition to listing various types of non-recurrent chromosomal aberrations, we discuss the likely mechanism underlying cancer chromosome dynamics. Finally, we call for further examination of the features of dynamic genome diseases including cancer in the context of systems biology and the need to integrate this new knowledge into basic research and clinical applications. This genome centric concept will have a profound impact on the future of biological and medical research. PMID:18000376

  10. Data structures and compression algorithms for genomic sequence data

    Microsoft Academic Search

    Marty C. Brandon; Douglas C. Wallace; Pierre Baldi

    2009-01-01

    Motivation: The continuing exponential accumulation of full genome data, including full diploid human genomes, creates new challenges not only for understanding genomic structure, function, and evolution, but also for the storage, navigation, and privacy of genomic data. Here we develop data structures and algorithms for the efficient storage of genomic and other sequence data that may also facilitate querying and

  11. Transcriptome sequencing in prostate cancer identifies inter-tumor heterogeneity

    PubMed Central

    Mendonca, Janet; Sharma, Anup; Kachhap, Sushant

    2015-01-01

    Given the dearth of gene mutations in prostate cancer [1,2], it is likely that genomic rearrangements play a significant role in the evolution of prostate cancer. However, in the search for recurrent genomic alterations, “private alterations” have received less attention. Such alterations may provide insights into the evolution, behavior, and clinical outcome of an individual tumor. In a recent report in “Genome Biology”, Wyatt et al. [3] define unique alterations in a cohort of high-risk prostate cancer patients with a lethal phenotype. Utilizing a transcriptome sequencing approach, they observe high inter-tumor heterogeneity; however, the genes altered distill into three distinct cancer-relevant pathways. Their analysis reveals the presence of several non-ETS fusions, which may contribute to the phenotype of individual tumors, and have significance for disease progression. PMID:25532579

  12. NIH Announces Two Integral Components of The Cancer Genome Atlas Pilot Project | Office of Cancer Genomics

    Cancer.gov

    The National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), both parts of the National Institutes of Health (NIH), today announced two more components of The Cancer Genome Atlas (TCGA) Pilot Project, a three-year, $100 million collaboration to test the feasibility of using large-scale genome analysis technologies to identify important genetic changes involved in cancer. Lung, brain (glioblastoma), and ovarian cancers have been chosen as the tumors for study by the TCGA Pilot Project.

  13. The Norway spruce genome sequence and conifer genome evolution.

    PubMed

    Nystedt, Björn; Street, Nathaniel R; Wetterbom, Anna; Zuccolo, Andrea; Lin, Yao-Cheng; Scofield, Douglas G; Vezzi, Francesco; Delhomme, Nicolas; Giacomello, Stefania; Alexeyenko, Andrey; Vicedomini, Riccardo; Sahlin, Kristoffer; Sherwood, Ellen; Elfstrand, Malin; Gramzow, Lydia; Holmberg, Kristina; Hällman, Jimmie; Keech, Olivier; Klasson, Lisa; Koriabine, Maxim; Kucukoglu, Melis; Käller, Max; Luthman, Johannes; Lysholm, Fredrik; Niittylä, Totte; Olson, Ake; Rilakovic, Nemanja; Ritland, Carol; Rosselló, Josep A; Sena, Juliana; Svensson, Thomas; Talavera-López, Carlos; Theißen, Günter; Tuominen, Hannele; Vanneste, Kevin; Wu, Zhi-Qiang; Zhang, Bo; Zerbe, Philipp; Arvestad, Lars; Bhalerao, Rishikesh; Bohlmann, Joerg; Bousquet, Jean; Garcia Gil, Rosario; Hvidsten, Torgeir R; de Jong, Pieter; MacKay, John; Morgante, Michele; Ritland, Kermit; Sundberg, Björn; Thompson, Stacey Lee; Van de Peer, Yves; Andersson, Björn; Nilsson, Ove; Ingvarsson, Pär K; Lundeberg, Joakim; Jansson, Stefan

    2013-05-30

    Conifers have dominated forests for more than 200 million years and are of huge ecological and economic importance. Here we present the draft assembly of the 20-gigabase genome of Norway spruce (Picea abies), the first available for any gymnosperm. The number of well-supported genes (28,354) is similar to the >100 times smaller genome of Arabidopsis thaliana, and there is no evidence of a recent whole-genome duplication in the gymnosperm lineage. Instead, the large genome size seems to result from the slow and steady accumulation of a diverse set of long-terminal repeat transposable elements, possibly owing to the lack of an efficient elimination mechanism. Comparative sequencing of Pinus sylvestris, Abies sibirica, Juniperus communis, Taxus baccata and Gnetum gnemon reveals that the transposable element diversity is shared among extant conifers. Expression of 24-nucleotide small RNAs, previously implicated in transposable element silencing, is tissue-specific and much lower than in other plants. We further identify numerous long (>10,000 base pairs) introns, gene-like fragments, uncharacterized long non-coding RNAs and short RNAs. This opens up new genomic avenues for conifer forestry and breeding. PMID:23698360

  14. Functional genomics of tomato in a post-genome-sequencing phase

    PubMed Central

    Aoki, Koh; Ogata, Yoshiyuki; Igarashi, Kaori; Yano, Kentaro; Nagasaki, Hideki; Kaminuma, Eli; Toyoda, Atsushi

    2013-01-01

    Completion of the tomato genome sequencing project has had a broad impact on genetic and genomic studies of tomato and other Solanaceae plants. The reference genome sequence derived from Solanum lycopersicum cv ‘Heinz 1706’ serves as a firm basis for sequencing-based approaches to tomato genomics. In this article, we first present a brief summary of the genome sequencing project and a summary of the reference genome sequence. We then focus on recent progress in transcriptome sequencing and small RNA sequencing and show how the reference genome sequence makes these analyses more comprehensive than before. We discuss the potential of in-depth analysis that is based on DNA methylome sequencing and transcription start-site detection. Finally, we describe the current status of efforts to resequence S. lycopersicum cultivars to demonstrate how resequencing can allow the use of intraspecific genomic diversity for detailed phenotyping and breeding. PMID:23641177

  15. Complete genome sequence of Candidatus Ruthia magnifica.

    PubMed

    Roeselers, Guus; Newton, Irene L G; Woyke, Tanja; Auchtung, Thomas A; Dilly, Geoffrey F; Dutton, Rachel J; Fisher, Meredith C; Fontanez, Kristina M; Lau, Evan; Stewart, Frank J; Richardson, Paul M; Barry, Kerrie W; Saunders, Elizabeth; Detter, John C; Wu, Dongying; Eisen, Jonathan A; Cavanaugh, Colleen M

    2010-01-01

    The hydrothermal vent clam Calyptogena magnifica (Bivalvia: Mollusca) is a member of the Vesicomyidae. Species within this family form symbioses with chemosynthetic Gammaproteobacteria. They exist in environments such as hydrothermal vents and cold seeps and have a rudimentary gut and feeding groove, indicating a large dependence on their endosymbionts for nutrition. The C. magnifica symbiont, Candidatus Ruthia magnifica, was the first intracellular sulfur-oxidizing endosymbiont to have its genome sequenced (Newton et al. 2007). Here we expand upon the original report and provide additional details complying with the emerging MIGS/MIMS standards. The complete genome exposed the genetic blueprint of the metabolic capabilities of the symbiont. Genes which were predicted to encode the proteins required for all the metabolic pathways typical of free-living chemoautotrophs were detected in the symbiont genome. These include major pathways including carbon fixation, sulfur oxidation, nitrogen assimilation, as well as amino acid and cofactor/vitamin biosynthesis. This genome sequence is invaluable in the study of these enigmatic associations and provides insights into the origin and evolution of autotrophic endosymbiosis. PMID:21304746

  16. Genome sequence of Leuconostoc pseudomesenteroides KCTC 3652.

    PubMed

    Kim, Dong-Wook; Choi, Sang-Haeng; Kang, Aram; Nam, Seong-Hyeuk; Kim, Ryong Nam; Kim, Aeri; Kim, Dae-Soo; Park, Hong-Seog

    2011-08-01

    We announce the genome sequence of one of the most prevalent lactic acid bacteria present during the manufacturing process of cane juice, the type strain Leuconostoc pseudomesenteroides KCTC 3652 (3,244,985 bp, with a G+C content of 38.3%), which consists of 1,160 large contigs (>100 bp in size). All of the contigs were assembled by the Newbler Assembler 2.3 software program (454 Life Sciences). PMID:21705609
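
    The G+C content quoted above is a simple base-composition statistic. As a minimal sketch of how such a figure can be computed over a set of assembled contigs (the function name `gc_content` and the toy contigs are illustrative assumptions, not the authors' pipeline):

```python
def gc_content(contigs):
    """Return the percent G+C over all contigs, ignoring non-ACGT symbols."""
    gc = at = 0
    for seq in contigs:
        s = seq.upper()
        gc += s.count("G") + s.count("C")
        at += s.count("A") + s.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("no A/C/G/T bases found")
    return 100.0 * gc / total


# Hypothetical toy contigs; a real assembly would be read from a FASTA file.
print(gc_content(["GGCC", "AATT", "ACGTNN"]))
```

    Ambiguous bases such as N are left out of the denominator here; whether to count them is a design choice that different tools handle differently.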

  17. The genome sequence of Schizosaccharomyces pombe

    Microsoft Academic Search

    R. Gwilliam; M.-A. Rajandream; M. Lyne; R. Lyne; A. Stewart; J. Sgouros; N. Peat; J. Hayles; S. Baker; D. Basham; S. Bowman; K. Brooks; D. Brown; S. Brown; T. Chillingworth; C. Churcher; M. Collins; R. Connor; A. Cronin; P. Davis; T. Feltwell; A. Fraser; S. Gentles; A. Goble; N. Hamlin; D. Harris; J. Hidalgo; G. Hodgson; S. Holroyd; T. Hornsby; S. Howarth; E. J. Huckle; S. Hunt; K. Jagels; K. James; L. Jones; M. Jones; S. Leather; S. McDonald; J. McLean; P. Mooney; S. Moule; K. Mungall; L. Murphy; D. Niblett; C. Odell; K. Oliver; S. O'Neil; D. Pearson; M. A. Quail; E. Rabbinowitsch; K. Rutherford; S. Rutter; D. Saunders; K. Seeger; S. Sharp; J. Skelton; M. Simmonds; R. Squares; S. Squares; K. Stevens; K. Taylor; R. G. Taylor; A. Tivey; S. Walsh; T. Warren; S. Whitehead; J. Woodward; G. Volckaert; R. Aert; J. Robben; B. Grymonprez; I. Weltjens; E. Vanstreels; M. Rieger; M. Schäfer; S. Müller-Auer; C. Gabel; M. Fuchs; C. Fritzc; E. Holzer; D. Moestl; H. Hilbert; K. Borzym; I. Langer; A. Beck; H. Lehrach; R. Reinhardt; T. M. Pohl; P. Eger; W. Zimmermann; H. Wedler; R. Wambutt; B. Purnelle; A. Goffeau; E. Cadieu; S. Dréano; S. Gloux; V. Lelaure; S. Mottier; F. Galibert; S. J. Aves; Z. Xiang; C. Hunt; K. Moore; S. M. Hurst; M. Lucas; M. Rochet; C. Gaillardin; V. A. Tallada; A. Garzon; G. Thode; R. R. Daga; L. Cruzado; J. Jimenez; M. Sánchez; F. del Rey; J. Benito; A. Domínguez; J. L. Revuelta; S. Moreno; J. Armstrong; S. L. Forsburg; L. Cerrutti; T. Lowe; W. R. McCombie; I. Paulsen; J. Potashkin; G. V. Shpakovski; D. Ussery; B. G. Barrell; P. Nurse

    2002-01-01

    We have sequenced and annotated the genome of fission yeast (Schizosaccharomyces pombe), which contains the smallest number of protein-coding genes yet recorded for a eukaryote: 4,824. The centromeres are between 35 and 110 kilobases (kb) and contain related repeats including a highly conserved 1.8-kb element. Regions upstream of genes are longer than in budding yeast (Saccharomyces cerevisiae), possibly reflecting more-extended

  18. Why Assembling Plant Genome Sequences Is So Challenging

    PubMed Central

    Claros, Manuel Gonzalo; Bautista, Rocío; Guerrero-Fernández, Darío; Benzerki, Hicham; Seoane, Pedro; Fernández-Pozo, Noé

    2012-01-01

    In spite of the biological and economic importance of plants, relatively few plant species have been sequenced. Only the genome sequence of plants with relatively small genomes, most of them angiosperms, in particular eudicots, has been determined. The arrival of next-generation sequencing technologies has allowed the rapid and efficient development of new genomic resources for non-model or orphan plant species. But the sequencing pace of plants is far from that of animals and microorganisms. This review focuses on the typical challenges of plant genomes that can explain why plant genomics is less developed than animal genomics. Explanations about the impact of some confounding factors emerging from the nature of plant genomes are given. As a result of these challenges and confounding factors, the correct assembly and annotation of plant genomes is hindered, genome drafts are produced, and advances in plant genomics are delayed. PMID:24832233

  19. Genome sequence of Halobacterium species NRC-1

    PubMed Central

    Ng, Wailap Victor; Kennedy, Sean P.; Mahairas, Gregory G.; Berquist, Brian; Pan, Min; Shukla, Hem Dutt; Lasky, Stephen R.; Baliga, Nitin S.; Thorsson, Vesteinn; Sbrogna, Jennifer; Swartzell, Steven; Weir, Douglas; Hall, John; Dahl, Timothy A.; Welti, Russell; Goo, Young Ah; Leithauser, Brent; Keller, Kim; Cruz, Randy; Danson, Michael J.; Hough, David W.; Maddocks, Deborah G.; Jablonski, Peter E.; Krebs, Mark P.; Angevine, Christine M.; Dale, Heather; Isenbarger, Thomas A.; Peck, Ronald F.; Pohlschroder, Mechthild; Spudich, John L.; Jung, Kwang-Hwan; Alam, Maqsudul; Freitas, Tracey; Hou, Shaobin; Daniels, Charles J.; Dennis, Patrick P.; Omer, Arina D.; Ebhardt, Holger; Lowe, Todd M.; Liang, Ping; Riley, Monica; Hood, Leroy; DasSarma, Shiladitya

    2000-01-01

    We report the complete sequence of an extreme halophile, Halobacterium sp. NRC-1, harboring a dynamic 2,571,010-bp genome containing 91 insertion sequences representing 12 families and organized into a large chromosome and 2 related minichromosomes. The Halobacterium NRC-1 genome codes for 2,630 predicted proteins, 36% of which are unrelated to any previously reported. Analysis of the genome sequence shows the presence of pathways for uptake and utilization of amino acids, active sodium-proton antiporter and potassium uptake systems, sophisticated photosensory and signal transduction pathways, and DNA replication, transcription, and translation systems resembling more complex eukaryotic organisms. Whole proteome comparisons show the definite archaeal nature of this halophile with additional similarities to the Gram-positive Bacillus subtilis and other bacteria. The ease of culturing Halobacterium and the availability of methods for its genetic manipulation in the laboratory, including construction of gene knockouts and replacements, indicate this halophile can serve as an excellent model system among the archaea. PMID:11016950

  20. Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

    Microsoft Academic Search

    Frank M You; Naxin Huo; Karin R Deal; Yong Q Gu; Ming-Cheng Luo; Patrick E McGuire; Jan Dvorak; Olin D Anderson

    2011-01-01

    BACKGROUND: Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS)

  1. Whole genome sequence (WGS) analysis for exploring plant relationships

    Microsoft Academic Search

    Nicole F Rice; Giovanni M Cordeiro; Catherine J Nock; Daniel LE Waters; Stirling Bowen; Robert J Henry

    2010-01-01

    Shotgun sequencing of plant genomic DNA preparations generates large quantities of sequence data in a single run. Using the Illumina GAII, whole genome shotgun sequence (WGS) data was generated for Oryza sativa cv Nipponbare, and the rice wild relatives Oryza meridionalis and Oryza australiensis. Two other grass species were also sequenced, Potamophila parviflora, from the Oryzeae tribe and Microlaena stipoides from

  2. The UCSC Cancer Genomics Browser: update 2015

    PubMed Central

    Goldman, Mary; Craft, Brian; Swatloski, Teresa; Cline, Melissa; Morozova, Olena; Diekhans, Mark; Haussler, David; Zhu, Jingchun

    2015-01-01

    The UCSC Cancer Genomics Browser (https://genome-cancer.ucsc.edu/) is a web-based application that integrates relevant data, analysis and visualization, allowing users to easily discover and share their research observations. Users can explore the relationship between genomic alterations and phenotypes by visualizing various -omic data alongside clinical and phenotypic features, such as age, subtype classifications and genomic biomarkers. The Cancer Genomics Browser currently hosts 575 public datasets from genome-wide analyses of over 227 000 samples, including datasets from TCGA, CCLE, Connectivity Map and TARGET. Users can download and upload clinical data, generate Kaplan–Meier plots dynamically, export data directly to Galaxy for analysis, plus generate URL bookmarks of specific views of the data to share with others. PMID:25392408

  3. Detecting somatic mutations in genomic sequences by means of Kolmogorov-Arnold analysis

    E-print Network

    Gurzadyan, V G; Vlahovic, G; Kashin, A; Killela, P; Reitman, Z; Sargsyan, S; Yegorian, G; Milledge, G; Vlahovic, B

    2015-01-01

    The Kolmogorov-Arnold stochasticity parameter technique is applied for the first time to the study of cancer genome sequencing, to reveal mutations. Using data generated by next generation sequencing technologies, we have analyzed the exome sequences of brain tumor patients with matched tumor and normal blood. We show that mutations contained in sequencing data can be revealed using this technique, thus providing a new methodology for identifying subsequences of a given length that contain mutations, i.e. subsequences for which the parameter's value differs from that of subsequences without mutations. A potential application for this technique involves simplifying the procedure of finding segments with mutations, speeding up genomic research, and accelerating its implementation in clinical diagnostics. Moreover, the prediction of a mutation associated with a family of frequent mutations in numerous types of cancers based purely on the value of the Kolmogorov function, indicates that this applied marker may recognize genomic sequences that are in extre...

  4. Next-Generation Sequencing for Cancer Diagnostics: a Practical Perspective

    PubMed Central

    Meldrum, Cliff; Doyle, Maria A; Tothill, Richard W

    2011-01-01

    Next-generation sequencing (NGS) is arguably one of the most significant technological advances in the biological sciences of the last 30 years. The second generation sequencing platforms have advanced rapidly to the point that several genomes can now be sequenced simultaneously in a single instrument run in under two weeks. Targeted DNA enrichment methods allow even higher genome throughput at a reduced cost per sample. Medical research has embraced the technology and the cancer field is at the forefront of these efforts given the genetic aspects of the disease. World-wide efforts to catalogue mutations in multiple cancer types are underway and this is likely to lead to new discoveries that will be translated to new diagnostic, prognostic and therapeutic targets. NGS is now maturing to the point where it is being considered by many laboratories for routine diagnostic use. The sensitivity, speed and reduced cost per sample make it a highly attractive platform compared to other sequencing modalities. Moreover, as we identify more genetic determinants of cancer there is a greater need to adopt multi-gene assays that can quickly and reliably sequence complete genes from individual patient samples. Whilst widespread and routine use of whole genome sequencing is likely to be a few years away, there are immediate opportunities to implement NGS for clinical use. Here we review the technology, methods and applications that can be immediately considered and some of the challenges that lie ahead. PMID:22147957

  5. Genome Sequence of the Pea Aphid Acyrthosiphon pisum

    E-print Network

    Paris-Sud XI, Université de

    The International Aphid Genomics Consortium. ... we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple

  6. Ten years of bacterial genome sequencing: comparative-genomics-based discoveries

    Microsoft Academic Search

    Tim T. Binnewies; Yair Motro; Peter F. Hallin; Ole Lund; David Dunn; Tom La; David J. Hampson; Matthew Bellgard; Trudy M. Wassenaar; David W. Ussery

    2006-01-01

    It has been more than 10 years since the first bacterial genome sequence was published. Hundreds of bacterial genome sequences are now available for comparative genomics, and searching a given protein against more than a thousand genomes will soon be possible. The subject of this review will address a relatively straightforward question: “What have we learned from this vast amount of

  7. The International Rice Genome Sequencing Project: progress and prospects

    Microsoft Academic Search

    T. Sasaki; T. Matsumoto; T. Baba; K. Yamamoto; J. Wu; Y. Katayose; K. Sakata

    The rice genome sequencing project has been pursued as a national project in Japan since 1998. At the same time, a desire to accelerate the sequencing of the entire rice genome led to the formation of the International Rice Genome Sequencing Project (IRGSP), initially comprising five countries. The sequencing strategy is the conventional clone-by-clone shotgun method using P1-derived

  8. Detection of Genomic Structural Variants from Next-Generation Sequencing Data

    PubMed Central

    Tattini, Lorenzo; D’Aurizio, Romina; Magi, Alberto

    2015-01-01

    Structural variants are genomic rearrangements larger than 50 bp accounting for around 1% of the variation among human genomes. They impact on phenotypic diversity and play a role in various diseases including neurological/neurocognitive disorders and cancer development and progression. Dissecting structural variants from next-generation sequencing data presents several challenges and a number of approaches have been proposed in the literature. In this mini review, we describe and summarize the latest tools – and their underlying algorithms – designed for the analysis of whole-genome sequencing, whole-exome sequencing, custom captures, and amplicon sequencing data, pointing out the major advantages/drawbacks. We also report a summary of the most recent applications of third-generation sequencing platforms. This assessment provides a guided indication – with particular emphasis on human genetics and copy number variants – for researchers involved in the investigation of these genomic events.

  9. The Jackson Laboratory: The Mouse Genome Sequence Project

    NSDL National Science Digital Library

    Part of the Mouse Genome Informatics program (last reported on in the NSDL Scout Report for the Life Sciences on March 19, 2004) at the Jackson Laboratory, this website presents The Mouse Genome Sequence (MGS) project. MGS is designed "to integrate emerging mouse genomic sequence data with the genetic and biological data available in MGD and GXD." The site links to Eukaryotic Genome Annotation Projects, as well as Sequence Analysis Tools including MouseBlast and Genome Analysis. The site also offers basic background information about the Mouse Genome Sequencing Initiative, and provides site users with access to groups involved in mouse genome sequencing, the BAC clone library, request forms for targeted sequencing, and more.

  10. Genome sequence of the Brown Norway rat yields insights into mammalian evolution

    E-print Network

    Pachter, Lior

    Rat Genome ... The laboratory rat (Rattus norvegicus) is an indispensable tool in experimental medicine and drug development ... Norway (BN) rat strain. The sequence represents a high-quality 'draft' covering over 90% of the genome

  11. Statistical Properties of Open Reading Frames in Complete Genome Sequences

    Microsoft Academic Search

    Wentian Li

    1999-01-01

    Some statistical properties of open reading frames in all currently available complete genome sequences are analyzed (seventeen prokaryotic genomes, and 16 chromosome sequences from the yeast genome). The size distribution of open reading frames is characterized by various techniques, such as quantile tables, QQ-plots, rank-size plots (Zipf's plots), and spatial densities. The issue of the influence of CG% on

  12. Analysis of Singleton ORFans in Fully Sequenced Microbial Genomes

    E-print Network

    Fischer, Daniel

    Naomi Siew and Daniel Fischer. ... analysis of singleton ORFans in the first 60 fully sequenced microbial genomes. We show that although ... as more genomes of closely related organisms become available. To better address the questions about

  13. Combined Evidence Annotation of Transposable Elements in Genome Sequences

    Microsoft Academic Search

    Hadi Quesneville; Olivier Andrieu; Delphine Autard; Danielle Nouaud; Michael Ashburner; Dominique Anxolabehere

    2005-01-01

    Transposable elements (TEs) are mobile, repetitive sequences that make up significant fractions of metazoan genomes. Despite their near ubiquity and importance in genome and chromosome biology, most efforts to annotate TEs in genome sequences rely on the results of a single computational program, RepeatMasker. In contrast, recent advances in gene annotation indicate that high-quality gene models can be produced from

  14. Sequencing by Hybridization - A Simulation Study of Performance on Genomic Sequences

    E-print Network

    Shamir, Ron

    Doron Lipson, Ziv Nevo, Ari Frank, Dolev Dotan, Zohar Yakhini. ... were randomly generated or randomly selected from the genomic databases of: a) S. cerevisiae, b) E. coli ...

  15. The Genome Database: Organism-centered listing of available genomic sequence records and projects

    E-print Network

    Levin, Judith G.

    http://www.ncbi.nlm.nih.gov/genome | National Center for Biotechnology Information · National Library | NCBI Genome | Last Update: August 19, 2013 | Contact: info@ncbi.nlm.nih.gov | Scope: Since 2011, the Genome

  16. Rapid modelling of cooperating genetic events in cancer through somatic genome editing

    E-print Network

    Papagiannakopoulos, Thales

    Cancer is a multistep process that involves mutations and other alterations in oncogenes and tumour suppressor genes. Genome sequencing studies have identified a large collection of genetic alterations that occur in human ...

  17. Insights into cancer biology through next-generation sequencing.

    PubMed

    Nik-Zainal, Serena

    2014-12-01

    Cancer is the ultimate disorder of the genome, characterised not by just one or two mutations, but by hundreds to thousands of acquired mutations that have been accrued through the development of a tumour. Thanks to the recent increase in the speed of sequencing offered by modern sequencing technologies, we are no longer restricted to exploring tiny fragments of protein-coding portions of the human genome. We can now read all the genetic material in human cells. Here, the framework of a next-generation sequencing experiment is explained, giving insight into the advances and difficulties posed by processing the enormous datasets generated through these methods. Some of the recent insights into tumour biology, that exploit the extraordinary surge in scale and the digital nature of next-generation sequencing, are highlighted, including cancer gene discovery, the detection of mutation signatures and cancer evolution. Technological and intellectual developments are starting to shape the personalized cancer genomic profiles of tomorrow. Let's train the next-generation of clinicians to be able to read them from today. PMID:25468925

  18. Simple sequence repeats in bryophyte mitochondrial genomes.

    PubMed

    Zhao, Chao-Xian; Zhu, Rui-Liang; Liu, Yang

    2014-02-01

    Simple sequence repeats (SSRs) are thought to be common in plant mitochondrial (mt) genomes, but have yet to be fully described for bryophytes. We screened the mt genomes of two liverworts (Marchantia polymorpha and Pleurozia purpurea), two mosses (Physcomitrella patens and Anomodon rugelii) and two hornworts (Phaeoceros laevis and Nothoceros aenigmaticus), and detected 475 SSRs. Some SSRs are conserved during evolution; of these, all but one (which occurs in both liverworts and mosses) are shared only by the two liverworts, the two mosses or the two hornworts. SSRs are known as DNA tracts having high mutation rates; however, according to our observations, they can still evolve slowly. The conservation of these SSRs suggests that they are under strong selection and could play critical roles in maintaining gene function. PMID:24491104
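
    The abstract does not say how the SSRs were detected. As a minimal sketch of one common approach, perfect tandem repeats can be found with a back-referencing regular expression. The function name `find_ssrs` and the minimum copy numbers (10 mono-, 6 di-, 5 trinucleotide repeats) are illustrative assumptions, not the thresholds used in the study:

```python
import re


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AA')."""
    return (motif + motif).find(motif, 1) == len(motif)


def find_ssrs(seq, min_copies=None):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats.

    min_copies maps motif length to the minimum number of copies; the
    default values are hypothetical and chosen only for illustration.
    """
    if min_copies is None:
        min_copies = {1: 10, 2: 6, 3: 5}
    seq = seq.upper()
    hits = []
    for unit, copies in sorted(min_copies.items()):
        # One motif of length `unit`, then at least copies - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if not is_primitive(motif):
                continue  # already reported under its shorter motif
            hits.append((m.start(), motif, len(m.group(0)) // unit))
    return sorted(hits)


# Hypothetical toy sequence with three embedded repeats.
toy = "GGC" + "A" * 12 + "CGC" + "AT" * 7 + "GCC" + "CAG" * 5 + "T"
for start, motif, copies in find_ssrs(toy):
    print(start, motif, copies)
```

    This sketch reports only perfect repeats and scans each motif length independently, so a repeat that begins inside a skipped run can be missed; dedicated SSR finders also handle imperfect and compound repeats.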

  19. Porcine parvovirus: DNA sequence and genome organization.

    PubMed

    Ranz, A I; Manclús, J J; Díaz-Aroca, E; Casal, J I

    1989-10-01

    We have determined the nucleotide sequence of an almost full-length clone of porcine parvovirus (PPV). The sequence is 4973 nucleotides (nt) long. The 3' end of virion DNA shows a Y-shaped configuration homologous to rodent parvoviruses. The 5' end of virion DNA shows a repetition of 127 nt at the carboxy terminus of the capsid proteins. The overall organization of the PPV genome is similar to those of other autonomous parvoviruses. There are two large open reading frames (ORFs) that almost entirely cover the genome, both located in the same frame of the complementary strand. The left ORF encodes the non-structural protein NS1 and the right ORF encodes the capsid proteins (VP1, VP2 and VP3). Promoter analysis, location of splicing sites and putative amino acid sequences for the viral proteins show a high homology of PPV with feline panleukopenia virus and canine parvoviruses (FPV and CPV) and rodent parvovirus. Therefore we conclude that PPV is related to the Kilham rat virus (KRV) group of autonomous parvoviruses formed by KRV, minute virus of mice, Lu III, H-1, FPV and CPV. PMID:2794971

  20. Methods for Obtaining and Analyzing Whole Chloroplast Genome Sequences

    Microsoft Academic Search

    Robert K. Jansen; Linda A. Raubeson; Jeffrey L. Boore; Claude W. dePamphilis; Timothy W. Chumley; Rosemarie C. Haberle; Stacia K. Wyman; Andrew J. Alverson; Rhiannon Peery; Sallie J. Herman; H. Matthew Fourcade; Jennifer V. Kuehl; Joel R. McNeal; James Leebens-Mack; Liying Cui

    2005-01-01

    During the past decade, there has been a rapid increase in our understanding of plastid genome organization and evolution due to the availability of many new completely sequenced genomes. There are 45 complete genomes published and ongoing projects are likely to increase this sampling to nearly 200 genomes during the next 5 years. Several groups of researchers including ours have

  1. Initial sequencing and comparative analysis of the mouse genome

    E-print Network

    Eddy, Sean

    ... and knockin techniques [17-22]. For these and other reasons, the Human Genome Project (HGP) recognized from its ... The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome ... comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from

  2. Genomic Sequence Comparisons, 1987-2003 Final Report

    SciTech Connect

    George M. Church

    2004-07-29

    This project was to develop new DNA sequencing and RNA and protein quantitation methods and related genome annotation tools. The project began in 1987 with the development of multiplex sequencing (published in Science in 1988), and one of the first automated sequencing methods. This led to the first commercial genome sequence in 1994 and to the establishment of the main commercial participants (GTC then Agencourt) in the public DOE/NIH genome project. In collaboration with GTC we contributed to one of the first complete DOE genome sequences, in 1997, that of Methanobacterium thermoautotrophicum, a species of great relevance to energy-rich gas production.

  3. Two genome sequences of the same bacterial strain, Gluconacetobacter diazotrophicus PAl 5, suggest a new standard in genome sequence submission.

    PubMed

    Giongo, Adriana; Tyler, Heather L; Zipperer, Ursula N; Triplett, Eric W

    2010-01-01

    Gluconacetobacter diazotrophicus PAl 5 is of agricultural significance due to its ability to provide fixed nitrogen to plants. Consequently, its genome sequence has been eagerly anticipated to enhance understanding of endophytic nitrogen fixation. Two groups have sequenced the PAl 5 genome from the same source (ATCC 49037), though the resulting sequences contain a surprisingly high number of differences. Therefore, an optical map of PAl 5 was constructed in order to determine which genome assembly more closely resembles the chromosomal DNA by aligning each sequence against a physical map of the genome. While one sequence aligned very well, over 98% of the second sequence contained numerous rearrangements. The many differences observed between these two genome sequences could be owing to either assembly errors or rapid evolutionary divergence. The extent of the differences derived from sequence assembly errors could be assessed if the raw sequencing reads were provided by both genome centers at the time of genome sequence submission. Hence, a new genome sequence standard is proposed whereby the investigator supplies the raw reads along with the closed sequence so that the community can make more accurate judgments on whether differences observed in a single strain may be of biological origin or are simply caused by differences in genome assembly procedures. PMID:21304715

  4. All Resources | Office of Cancer Genomics

    Cancer.gov

    CGAP generated a wide range of genomics data on cancerous cells that are accessible through easy-to-use online tools. Researchers, educators, and students can find "in silico" answers to biological questions through the CGAP website.

  5. Building on Discoveries in Cancer Genomics

    Cancer.gov

    Deciphering the genomes of many cancers is necessary to understand the extent of their complexity and diversity. These molecular analyses are leading to a new classification of tumors, which may have therapeutic implications.

  6. AACR 2014: NCI/NIH-Sponsored Session: Large-Scale Genomics Data for the Research Community through the NCI Center for Cancer Genomics

    Cancer.gov

    The NCI’s Center for Cancer Genomics (CCG), which includes the Office of Cancer Genomics and The Cancer Genome Atlas Program Office, provides the research community access to large-scale molecular characterization data, which is largely sequence-based. CCG programs aim to improve patient outcome through identification of valid molecular targets and associated molecular markers (prognostic or diagnostic), in and across diseases investigated, which should ultimately lead to the rapid development of novel, more effective therapies.

  7. Draft Genome Sequence of Bacillus amyloliquefaciens B-1895

    PubMed Central

    Melnikov, Vyacheslav G.; Chistyakov, Vladimir A.

    2014-01-01

    In this report, we present a draft genome sequence of Bacillus amyloliquefaciens strain B-1895. Comparison with the genome of a reference strain demonstrated similar overall organization, as well as differences involving large gene clusters. PMID:24948774

  8. Draft Genome Sequence of Bacillus amyloliquefaciens B-1895.

    PubMed

    Karlyshev, Andrey V; Melnikov, Vyacheslav G; Chistyakov, Vladimir A

    2014-01-01

    In this report, we present a draft genome sequence of Bacillus amyloliquefaciens strain B-1895. Comparison with the genome of a reference strain demonstrated similar overall organization, as well as differences involving large gene clusters. PMID:24948774

  9. Genome sequencing of the important oilseed crop Sesamum indicum L

    PubMed Central

    2013-01-01

    The Sesame Genome Working Group (SGWG) has been formed to sequence and assemble the sesame (Sesamum indicum L.) genome. The status of this project and our planned analyses are described. PMID:23369264

  10. Initial impact of the sequencing of the human genome

    E-print Network

    Massachusetts Institute of Technology. Department of Biology; Broad Institute of MIT and Harvard; Lander, Eric S.

    The sequence of the human genome has dramatically accelerated biomedical research. Here I explore its impact, in the decade since its publication, on our understanding of the biological functions encoded in the genome, on ...

  11. Next Generation Sequencing at the University of Chicago Genomics Core

    SciTech Connect

    Faber, Pieter [University of Chicago]

    2013-04-24

    The University of Chicago Genomics Core provides University of Chicago investigators (and external clients) access to State-of-the-Art genomics capabilities: next generation sequencing, Sanger sequencing / genotyping and micro-arrays (gene expression, genotyping, and methylation). The current presentation will highlight our capabilities in the area of ultra-high throughput sequencing analysis.

  12. Validation of rice genome sequence by optical mapping

    Microsoft Academic Search

    Shiguo Zhou; Michael C Bechner; Chris P Churas; Louise Pape; Sally A Leong; Rod Runnheim; Dan K Forrest; Steve Goldstein; Miron Livny; David C Schwartz

    2007-01-01

    BACKGROUND: Rice feeds much of the world, and possesses the simplest genome analyzed to date within the grass family, making it an economically relevant model system for other cereal crops. Although the rice genome is sequenced, validation and gap closing efforts require purely independent means for accurate finishing of sequence build data. RESULTS: To facilitate ongoing sequencing finishing and validation

  13. Draft Genome Sequence of the Archiascomycetous Yeast Saitoella complicata.

    PubMed

    Yamauchi, Kenta; Kondo, Shinji; Hamamoto, Makiko; Takahashi, Yurika; Ogura, Yoshitoshi; Hayashi, Tetsuya; Nishida, Hiromi

    2015-01-01

    The draft genome sequence of the archiascomycetous yeast Saitoella complicata was determined. The assembly of newly and previously sequenced data sets resulted in 104 contigs (total of 14.1 Mbp; N50, 239 kbp). On the newly assembled genome, a total of 6,933 protein-coding sequences (7,119 transcripts, including alternative splicing forms) were identified. PMID:26021914

  14. Draft Genome Sequence of the Archiascomycetous Yeast Saitoella complicata

    PubMed Central

    Yamauchi, Kenta; Hamamoto, Makiko; Takahashi, Yurika; Ogura, Yoshitoshi; Hayashi, Tetsuya

    2015-01-01

    The draft genome sequence of the archiascomycetous yeast Saitoella complicata was determined. The assembly of newly and previously sequenced data sets resulted in 104 contigs (total of 14.1 Mbp; N50, 239 kbp). On the newly assembled genome, a total of 6,933 protein-coding sequences (7,119 transcripts, including alternative splicing forms) were identified. PMID:26021914
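
    The N50 statistic quoted for this assembly is the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the assembly size. A minimal sketch of that calculation follows; the function name and the toy contig lengths are illustrative assumptions and are not taken from the Saitoella complicata data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    lengths = list(lengths)
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy contig lengths (hypothetical values, not a real assembly).
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))
```

    With the toy lengths above the assembly totals 300 bp, and the two longest contigs together cover 180 bp, so the reported N50 is the length of the second contig.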

  15. MIPS: a database for genomes and protein sequences

    Microsoft Academic Search

    Hans-werner Mewes; Dmitrij Frishman; Christian Gruber; Birgitta Geier; Dirk Haase; Andreas Kaps; Kai Lemcke; Gertrud Mannhaupt; Friedhelm Pfeiffer; Christine M. Schüller; S. Stocker; B. Weil

    2000-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF), Martinsried, near Munich, Germany, continues its longstanding tradition to develop and maintain high quality curated genome databases. In addition, efforts have been intensified to cover the wealth of complete genome sequences in a systematic, comprehensive form. Bioinformatics, supporting national as well as European sequencing and functional analysis projects, has resulted in several

  16. Genome-Wide Epigenetic Modifications in Cancer

    Microsoft Academic Search

    Yoon Jung Park; Rainer Claus; Dieter Weichenhan; Christoph Plass

    Epigenetic alterations in cancer include changes in DNA methylation and associated histone modifications that influence the chromatin states and impact gene expression patterns. Due to recent technological advances, the scientific community is now obtaining a better picture of the genome-wide epigenetic changes that occur in a cancer genome. These epigenetic alterations are associated with chromosomal instability and changes in transcriptional

  17. Exploring the Mechanisms of Gastrointestinal Cancer Development Using Deep Sequencing Analysis

    PubMed Central

    Matsumoto, Tomonori; Shimizu, Takahiro; Takai, Atsushi; Marusawa, Hiroyuki

    2015-01-01

    Next-generation sequencing (NGS) technologies have revolutionized cancer genomics due to their high throughput sequencing capacity. Reports of the gene mutation profiles of various cancers by many researchers, including international cancer genome research consortia, have increased over recent years. In addition to detecting somatic mutations in tumor cells, NGS technologies enable us to approach the subject of carcinogenic mechanisms from new perspectives. Deep sequencing, a method of optimizing the high throughput capacity of NGS technologies, allows for the detection of genetic aberrations in small subsets of premalignant and/or tumor cells in noncancerous chronically inflamed tissues. Genome-wide NGS data also make it possible to clarify the mutational signatures of each cancer tissue by identifying the precise pattern of nucleotide alterations in the cancer genome, providing new information regarding the mechanisms of tumorigenesis. In this review, we highlight these new methods taking advantage of NGS technologies, and discuss our current understanding of carcinogenic mechanisms elucidated from such approaches. PMID:26083936

  18. Complete genome sequence of Arcanobacterium haemolyticum type strain (11018T)

    SciTech Connect

    Yasawong, Montri [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Teshima, Hazuki [Los Alamos National Laboratory (LANL)]; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute]; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute]; Tice, Hope [U.S. Department of Energy, Joint Genome Institute]; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute]; Bruce, David [Los Alamos National Laboratory (LANL)]; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute]; Tapia, Roxanne [Los Alamos National Laboratory (LANL)]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [U.S. Department of Energy, Joint Genome Institute]; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Chang, Yun-Juan [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Pukall, Rudiger [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute]; Bristow, James [U.S. Department of Energy, Joint Genome Institute]; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute]; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]

    2010-01-01

    Vulcanisaeta distributa Itoh et al. 2002 belongs to the family Thermoproteaceae in the phylum Crenarchaeota. The genus Vulcanisaeta is characterized by a global distribution in hot and acidic springs. This is the first genome sequence from a member of the genus Vulcanisaeta and seventh genome sequence in the family Thermoproteaceae. The 2,374,137 bp long genome with its 2,544 protein-coding and 49 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  19. Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions fr...

  20. Toolbox for Mobile-Element Insertion Detection on Cancer Genomes

    PubMed Central

    Lee, Wan-Ping; Wu, Jiantao; Marth, Gabor T

    2014-01-01

    Mobile elements constitute greater than 45% of the human genome as a result of repeated insertion events during human genome evolution. Although most mobile elements are fixed within the human population, some elements (including ALU, long interspersed elements (LINE) 1 (L1), and SVA) are still actively duplicating and may result in life-threatening human diseases such as cancer, motivating the need for accurate mobile-element insertion (MEI) detection tools. We developed a software package, TANGRAM, for MEI detection in next-generation sequencing data, currently serving as the primary MEI detection tool in the 1000 Genomes Project. TANGRAM takes advantage of valuable mapping information provided by our own MOSAIK mapper, and until recently required MOSAIK mappings as its input. In this study, we report a new feature that enables TANGRAM to be used on alignments generated by any mainstream short-read mapper, making it accessible for many genomic users. To demonstrate its utility for cancer genome analysis, we have applied TANGRAM to the TCGA (The Cancer Genome Atlas) mutation calling benchmark 4 dataset. TANGRAM is fast, accurate, easy to use, and open source on https://github.com/jiantao/Tangram. PMID:25452688

  1. A new workflow for whole-genome sequencing of single human cells.

    PubMed

    Binder, Vera; Bartenhagen, Christoph; Okpanyi, Vera; Gombert, Michael; Moehlendick, Birte; Behrens, Bianca; Klein, Hans-Ulrich; Rieder, Harald; Ida Krell, Pina Fanny; Dugas, Martin; Stoecklein, Nikolas Hendrik; Borkhardt, Arndt

    2014-10-01

    Unbiased whole-genome amplification (WGA) of single cells is crucial to study cancer evolution and genetic heterogeneity, but is challenging due to the high complexity of the human genome. Here, we present a new workflow combining an efficient adapter-linker PCR-based WGA method with second-generation sequencing. This approach allows comparison of single cells at base pair resolution. Amplification recovered up to 74% of the human genome. Copy-number variants and loss of heterozygosity detected in single cell genomes showed concordance of up to 99% to pooled genomic DNA. Allele frequencies of mutations could be determined accurately due to an allele dropout rate of only 2%, clearly demonstrating the low bias of our PCR-based WGA approach. Sequencing with paired-end reads allowed genome-wide analysis of structural variants. By direct comparison to other WGA methods, we further endorse its suitability to analyze genetic heterogeneity. PMID:25066732

  2. Next-Generation Sequence Analysis of Cancer Xenograft Models

    PubMed Central

    Rossello, Fernando J.; Tothill, Richard W.; Britt, Kara; Marini, Kieren D.; Falzon, Jeanette; Thomas, David M.; Peacock, Craig D.; Marchionni, Luigi; Li, Jason; Bennett, Samara; Tantoso, Erwin; Brown, Tracey; Chan, Philip; Martelotto, Luciano G.; Watkins, D. Neil

    2013-01-01

    Next-generation sequencing (NGS) studies in cancer are limited by the amount, quality and purity of tissue samples. In this situation, primary xenografts have proven useful preclinical models. However, the presence of mouse-derived stromal cells represents a technical challenge to their use in NGS studies. We examined this problem in an established primary xenograft model of small cell lung cancer (SCLC), a malignancy often diagnosed from small biopsy or needle aspirate samples. Using an in silico strategy that assigns reads according to species-of-origin, we prospectively compared NGS data from primary xenograft models with matched cell lines and with published datasets. We show here that low-coverage whole-genome analysis demonstrated remarkable concordance between published genome data and internal controls, despite the presence of mouse genomic DNA. Exome capture sequencing revealed that this enrichment procedure was highly species-specific, with less than 4% of reads aligning to the mouse genome. Human-specific expression profiling with RNA-Seq replicated array-based gene expression experiments, whereas mouse-specific transcript profiles correlated with published datasets from human cancer stroma. We conclude that primary xenografts represent a useful platform for complex NGS analysis in cancer research for tumours with limited sample resources, or those with prominent stromal cell populations. PMID:24086345
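
    The abstract does not describe how reads were assigned to a species of origin. One common way to approach the problem, shown here purely as an assumption-laden sketch and not as the authors' method, is to align each read to both the human and the mouse reference and keep the species whose alignment scores clearly better. The function name, the score margin, and the toy scores below are all hypothetical.

```python
from collections import Counter


def assign_species(human_score, mouse_score, min_margin=5):
    """Classify one read from its best alignment score against each reference.

    A score of None means the read did not align to that reference.
    min_margin is a hypothetical threshold: the winning score must exceed
    the other by at least this much, otherwise the read is ambiguous.
    """
    if human_score is None and mouse_score is None:
        return "unaligned"
    if mouse_score is None:
        return "human"
    if human_score is None:
        return "mouse"
    if human_score - mouse_score >= min_margin:
        return "human"
    if mouse_score - human_score >= min_margin:
        return "mouse"
    return "ambiguous"


# Toy reads: (read id, human alignment score, mouse alignment score).
toy_reads = [
    ("read1", 60, 20),      # clearly human
    ("read2", 18, 55),      # clearly mouse
    ("read3", 50, 48),      # conserved region, too close to call
    ("read4", None, 40),    # aligns to mouse only
    ("read5", None, None),  # aligns to neither reference
]

summary = Counter(assign_species(h, m) for _, h, m in toy_reads)
print(dict(summary))
```

    Keeping an explicit "ambiguous" class is a design choice: reads from regions conserved between the two species are set aside rather than forced into one genome.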

  3. Comparative chloroplast genomics: Analyses including new sequences from the angiosperms Nuphar advena and Ranunculus macranthus

    E-print Network

    2007-01-01

    ... genome sequences and make comparisons (within angiosperms, seed plants, ... genome sequence from Korean Ginseng (Panax schiseng Nees) and comparative analysis of sequence evolution among 17 vascular plants. ... genomes of all other vascular plant taxa examined, a similar sequence ...

  4. Comparative DNA Sequence Analysis of Wheat and Rice Genomes

    PubMed Central

    Sorrells, Mark E.; La Rota, Mauricio; Bermudez-Kandianis, Catherine E.; Greene, Robert A.; Kantety, Ramesh; Munkvold, Jesse D.; Miftahudin; Mahmoud, Ahmed; Ma, Xuefeng; Gustafson, Perry J.; Qi, Lili L.; Echalier, Benjamin; Gill, Bikram S.; Matthews, David E.; Lazo, Gerard R.; Chao, Shiaoman; Anderson, Olin D.; Edwards, Hugh; Linkiewicz, Anna M.; Dubcovsky, Jorge; Akhunov, Eduard D.; Dvorak, Jan; Zhang, Deshui; Nguyen, Henry T.; Peng, Junhua; Lapitan, Nora L.V.; Gonzalez-Hernandez, Jose L.; Anderson, James A.; Hossain, Khwaja; Kalavacharla, Venu; Kianian, Shahryar F.; Choi, Dong-Woog; Close, Timothy J.; Dilbirligi, Muharrem; Gill, Kulvinder S.; Steber, Camille; Walker-Simmons, Mary K.; McGuire, Patrick E.; Qualset, Calvin O.

    2003-01-01

    The use of DNA sequence-based comparative genomics for evolutionary studies and for transferring information from model species to crop species has revolutionized molecular genetics and crop improvement strategies. This study compared 4485 expressed sequence tags (ESTs) that were physically mapped in wheat chromosome bins, to the public rice genome sequence data from 2251 ordered BAC/PAC clones using BLAST. A rice genome view of homologous wheat genome locations based on comparative sequence analysis revealed numerous chromosomal rearrangements that will significantly complicate the use of rice as a model for cross-species transfer of information in nonconserved regions. PMID:12902377

  5. Comparative DNA sequence analysis of wheat and rice genomes.

    PubMed

    Sorrells, Mark E; La Rota, Mauricio; Bermudez-Kandianis, Catherine E; Greene, Robert A; Kantety, Ramesh; Munkvold, Jesse D; Miftahudin; Mahmoud, Ahmed; Ma, Xuefeng; Gustafson, Perry J; Qi, Lili L; Echalier, Benjamin; Gill, Bikram S; Matthews, David E; Lazo, Gerard R; Chao, Shiaoman; Anderson, Olin D; Edwards, Hugh; Linkiewicz, Anna M; Dubcovsky, Jorge; Akhunov, Eduard D; Dvorak, Jan; Zhang, Deshui; Nguyen, Henry T; Peng, Junhua; Lapitan, Nora L V; Gonzalez-Hernandez, Jose L; Anderson, James A; Hossain, Khwaja; Kalavacharla, Venu; Kianian, Shahryar F; Choi, Dong-Woog; Close, Timothy J; Dilbirligi, Muharrem; Gill, Kulvinder S; Steber, Camille; Walker-Simmons, Mary K; McGuire, Patrick E; Qualset, Calvin O

    2003-08-01

    The use of DNA sequence-based comparative genomics for evolutionary studies and for transferring information from model species to crop species has revolutionized molecular genetics and crop improvement strategies. This study compared 4485 expressed sequence tags (ESTs) that were physically mapped in wheat chromosome bins, to the public rice genome sequence data from 2251 ordered BAC/PAC clones using BLAST. A rice genome view of homologous wheat genome locations based on comparative sequence analysis revealed numerous chromosomal rearrangements that will significantly complicate the use of rice as a model for cross-species transfer of information in nonconserved regions. PMID:12902377

  6. Sequencing and Assembly of the 22-Gb Loblolly Pine Genome

    PubMed Central

    Zimin, Aleksey; Stevens, Kristian A.; Crepeau, Marc W.; Holtz-Morris, Ann; Koriabine, Maxim; Marçais, Guillaume; Puiu, Daniela; Roberts, Michael; Wegrzyn, Jill L.; de Jong, Pieter J.; Neale, David B.; Salzberg, Steven L.; Yorke, James A.; Langley, Charles H.

    2014-01-01

    Conifers are the predominant gymnosperm. The size and complexity of their genomes have presented formidable technical challenges for whole-genome shotgun sequencing and assembly. We employed novel strategies that allowed us to determine the loblolly pine (Pinus taeda) reference genome sequence, the largest genome assembled to date. Most of the sequence data were derived from whole-genome shotgun sequencing of a single megagametophyte, the haploid tissue of a single pine seed. Although that constrained the quantity of available DNA, the resulting haploid sequence data were well-suited for assembly. The haploid sequence was augmented with multiple linking long-fragment mate pair libraries from the parental diploid DNA. For the longest fragments, we used novel fosmid DiTag libraries. Sequences from the linking libraries that did not match the megagametophyte were identified and removed. Assembly of the sequence data was aided by condensing the enormous number of paired-end reads into a much smaller set of longer “super-reads,” rendering subsequent assembly with an overlap-based assembly algorithm computationally feasible. To further improve the contiguity and biological utility of the genome sequence, additional scaffolding methods utilizing independent genome and transcriptome assemblies were implemented. The combination of these strategies resulted in a draft genome sequence of 20.15 billion bases, with an N50 scaffold size of 66.9 kbp. PMID:24653210

  7. The reference genome sequence of Saccharomyces cerevisiae: then and now.

    PubMed

    Engel, Stacia R; Dietrich, Fred S; Fisk, Dianna G; Binkley, Gail; Balakrishnan, Rama; Costanzo, Maria C; Dwight, Selina S; Hitz, Benjamin C; Karra, Kalpana; Nash, Robert S; Weng, Shuai; Wong, Edith D; Lloyd, Paul; Skrzypek, Marek S; Miyasato, Stuart R; Simison, Matt; Cherry, J Michael

    2014-03-01

    The genome of the budding yeast Saccharomyces cerevisiae was the first completely sequenced from a eukaryote. It was released in 1996 as the work of a worldwide effort of hundreds of researchers. In the time since, the yeast genome has been intensively studied by geneticists, molecular biologists, and computational scientists all over the world. Maintenance and annotation of the genome sequence have long been provided by the Saccharomyces Genome Database, one of the original model organism databases. To deepen our understanding of the eukaryotic genome, the S. cerevisiae strain S288C reference genome sequence was updated recently in its first major update since 1996. The new version, called "S288C 2010," was determined from a single yeast colony using modern sequencing technologies and serves as the anchor for further innovations in yeast genomic science. PMID:24374639

  8. The Cancer Genome Atlas Pan-Cancer analysis project

    E-print Network

    Lander, Eric S.

    The Cancer Genome Atlas (TCGA) Research Network has profiled and analyzed large numbers of human tumors to discover molecular aberrations at the DNA, RNA, protein and epigenetic levels. The resulting rich data provide a ...

  9. Mapping the Human Reference Genome’s Missing Sequence by Three-Way Admixture in Latino Genomes

    PubMed Central

    Genovese, Giulio; Handsaker, Robert E.; Li, Heng; Kenny, Eimear E.; McCarroll, Steven A.

    2013-01-01

    A principal obstacle to completing maps and analyses of the human genome involves the genome’s “inaccessible” regions: sequences (often euchromatic and containing genes) that are isolated from the rest of the euchromatic genome by heterochromatin and other repeat-rich sequence. We describe a way to localize these sequences by using ancestry linkage disequilibrium in populations that derive ancestry from at least three continents, as is the case for Latinos. We used this approach to map the genomic locations of almost 20 megabases of sequence unlocalized or missing from the current human genome reference (NCBI Genome GRCh37)—a substantial fraction of the human genome’s remaining unmapped sequence. We show that the genomic locations of most sequences that originated from fosmids and larger clones can be admixture mapped in this way, by using publicly available whole-genome sequence data. Genome assembly efforts and future builds of the human genome reference will be strongly informed by this localization of genes and other euchromatic sequences that are embedded within highly repetitive pericentromeric regions. PMID:23932108

  10. Complete genome sequences of cellular life forms: glimpses of theoretical evolutionary genomics

    Microsoft Academic Search

    Eugene V Koonin; Arcady R Mushegian

    1996-01-01

    The availability of complete genome sequences of cellular life forms creates the opportunity to explore the functional content of the genomes and evolutionary relationships between them at a new qualitative level. With the advent of these sequences, the construction of a minimal gene set sufficient for sustaining cellular life and reconstruction of the genome of the last common ancestor of

  11. Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome

    Microsoft Academic Search

    Casey M Bergman; Barret D Pfeiffer; Diego E Rincón-Limas; Roger A Hoskins; Andreas Gnirke; Chris J Mungall; Adrienne M Wang; Brent Kronmiller; Joanne Pacleb; Soo Park; Mark Stapleton; Kenneth Wan; Reed A George; Pieter J de Jong; Juan Botas; Gerald M Rubin; Susan E Celniker

    2002-01-01

    Background: It is widely accepted that comparative sequence data can aid the functional annotation of genome sequences; however, the most informative species and features of genome evolution for comparison remain to be determined. Results: We analyzed conservation in eight genomic regions (apterous, even-skipped, fushi tarazu, twist, and Rhodopsins 1, 2, 3 and 4) from four Drosophila species (D. erecta, D.

  12. Insights from twenty years of bacterial genome sequencing

    SciTech Connect

    Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Jun, Se Ran [ORNL]; Nookaew, Intawat [ORNL]; Leuze, Michael Rex [ORNL]; Ahn, Tae-Hyuk [ORNL]; Karpinets, Tatiana V [ORNL]; Lund, Ole [Technical University of Denmark]; Kora, Guruprasad H [ORNL]; Wassenaar, Trudy [Molecular Microbiology & Genomics Consultants, Zotzenheim, Germany]; Poudel, Suresh [ORNL]; Ussery, David W [ORNL]

    2015-01-01

    Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genomes is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90% of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families. Why do we care about bacterial genome sequencing? There are many practical applications, such as genome-scale metabolic modeling, biosurveillance, bioforensics, and infectious disease epidemiology. In the near future, high-throughput sequencing of patient metagenomic samples could revolutionize medicine in terms of speed and accuracy of finding pathogens and knowing how to treat them.
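
    The pan- and core genomes mentioned in this abstract can be pictured with set operations: once each genome has been reduced to the set of gene families it contains, the pan-genome is the union of those sets and the core genome is their intersection. The sketch below assumes that clustering into gene families has already been done; the function name, strain names, and family identifiers are hypothetical toy values, not E. coli data.

```python
def pan_and_core(genomes):
    """Return (pan, core) gene-family sets.

    genomes: dict mapping a genome name to an iterable of gene-family
    identifiers found in that genome (assumed to be precomputed).
    """
    families = [set(members) for members in genomes.values()]
    if not families:
        return set(), set()
    pan = set().union(*families)         # families seen in at least one genome
    core = set.intersection(*families)   # families shared by every genome
    return pan, core


# Toy input: three hypothetical strains and their gene families.
toy_genomes = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam3", "fam5"},
}

pan, core = pan_and_core(toy_genomes)
print(len(pan), sorted(core))
```

    As more genomes are added, the union can only grow and the intersection can only shrink, which is consistent with the abstract reporting a total gene-family count far larger than the core.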

  13. Genome Science and Personalized Cancer Treatment

    SciTech Connect

    Gray, Joe

    2009-08-04

    Summer Lecture Series 2009: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

  14. Genome Science and Personalized Cancer Treatment

    SciTech Connect

    Joe Gray

    2009-08-07

    August 4, 2009 Berkeley Lab lecture: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

  15. Genome Science and Personalized Cancer Treatment

    ScienceCinema

    Joe Gray

    2010-01-08

    August 4, 2009 Berkeley Lab lecture: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

  16. CTD² Publication Guidelines | Office of Cancer Genomics

    Cancer.gov

    The Cancer Target Discovery and Development (CTD2) Network is a “community resource project” supported by the National Cancer Institute’s Office of Cancer Genomics. Members of the Network release data to the broader research community by depositing data into NCI-supported or public databases. Data deposition is NOT equivalent to publishing in a peer-reviewed journal. Unless there is a manuscript associated with a dataset, the Network considers data to be formally unpublished.

  17. Genome-Wide Association Studies of Cancer

    PubMed Central

    Stadler, Zsofia K.; Thom, Peter; Robson, Mark E.; Weitzel, Jeffrey N.; Kauff, Noah D.; Hurley, Karen E.; Devlin, Vincent; Gold, Bert; Klein, Robert J.; Offit, Kenneth

    2010-01-01

    Knowledge of the inherited risk for cancer is an important component of preventive oncology. In addition to well-established syndromes of cancer predisposition, much remains to be discovered about the genetic variation underlying susceptibility to common malignancies. Increased knowledge about the human genome and advances in genotyping technology have made possible genome-wide association studies (GWAS) of human diseases. These studies have identified many important regions of genetic variation associated with an increased risk for human traits and diseases including cancer. Understanding the principles, major findings, and limitations of GWAS is becoming increasingly important for oncologists as dissemination of genomic risk tests directly to consumers is already occurring through commercial companies. GWAS have contributed to our understanding of the genetic basis of cancer and will shed light on biologic pathways and possible new strategies for targeted prevention. To date, however, the clinical utility of GWAS-derived risk markers remains limited. PMID:20585100

  18. [Cancer Genome Atlas Pan-cancer Analysis Project].

    PubMed

    Zhang, Kun; Wang, Hong

    2015-03-20

    Cancer takes different forms depending on the site of origin, the cell type, and the genetic mutations involved, all of which also affect the response to therapy. Although many genes have been shown to directly cause changes in phenotype, the complex molecular mechanisms of many cancer lineages are still not fully elucidated. The Cancer Genome Atlas (TCGA) Research Network therefore analyzed a large number of human tumors to find molecular changes at the DNA, RNA, protein, and epigenetic levels. The resulting wealth of data provides an opportunity to describe the commonalities, individual differences, and emerging themes across cancer lineages as a whole. The Pan-Cancer project first compared 12 cancer types. Analysis of the molecular changes in different tumors and their functions will indicate how an effective treatment can be applied to tumors with a similar phenotype. PMID:25936886

  19. First complete genome sequence of infectious laryngotracheitis virus

    Microsoft Academic Search

    Sang-Won Lee; Philip F Markham; John F Markham; Ivonne Petermann; Amir H Noormohammadi; Glenn F Browning; Nino P Ficorilli; Carol A Hartley; Joanne M Devlin

    2011-01-01

    Background Infectious laryngotracheitis virus (ILTV) is an alphaherpesvirus that causes acute respiratory disease in chickens worldwide. To date, only one complete genomic sequence of ILTV has been reported. This sequence was generated by concatenating partial sequences from six different ILTV strains. Thus, the full genomic sequence of a single (individual) strain of ILTV has not been determined previously. This study aimed

  20. Compressing Genomic Sequence Fragments Using SlimGene

    Microsoft Academic Search

    Christos Kozanitis; Chris Saunders; Semyon Kruglyak; Vineet Bafna; George Varghese

    2010-01-01

    With the advent of next generation sequencing technologies, the cost of sequencing whole genomes is poised to go below $1000 per human individual in a few years. As more and more genomes are sequenced, analysis methods are undergoing rapid development, making it tempting to store sequencing data for long periods of time so that the data can be re-analyzed with

  1. Genome Project Standards in a New Era of Sequencing

    SciTech Connect

    GSC Consortia; HMP Jumpstart Consortia; Chain, P. S. G.; Grafham, D. V.; Fulton, R. S.; FitzGerald, M. G.; Hostetler, J.; Muzny, D.; Detter, J. C.; Ali, J.; Birren, B.; Bruce, D. C.; Buhay, C.; Cole, J. R.; Ding, Y.; Dugan, S.; Field, D.; Garrity, G. M.; Gibbs, R.; Graves, T.; Han, C. S.; Harrison, S. H.; Highlander, S.; Hugenholtz, P.; Khouri, H. M.; Kodira, C. D.; Kolker, E.; Kyrpides, N. C.; Lang, D.; Lapidus, A.; Malfatti, S. A.; Markowitz, V.; Metha, T.; Nelson, K. E.; Parkhill, J.; Pitluck, S.; Qin, X.; Read, T. D.; Schmutz, J.; Sozhamannan, S.; Strausberg, R.; Sutton, G.; Thomson, N. R.; Tiedje, J. M.; Weinstock, G.; Wollam, A.

    2009-06-01

    For over a decade, genome sequences have adhered to only two standards that are relied on for purposes of sequence analysis by interested third parties (1, 2). However, ongoing developments in revolutionary sequencing technologies have resulted in a redefinition of traditional whole genome sequencing that requires a careful reevaluation of such standards. With commercially available 454 pyrosequencing (followed by Illumina, SOLiD, and now Helicos), there has been an explosion of genomes sequenced under the moniker 'draft', however these can be very poor quality genomes (due to inherent errors in the sequencing technologies, and the inability of assembly programs to fully address these errors). Further, one can only infer that such draft genomes may be of poor quality by navigating through the databases to find the number and type of reads deposited in sequence trace repositories (and not all genomes have this available), or to identify the number of contigs or genome fragments deposited to the database. The difficulty in assessing the quality of such deposited genomes has created some havoc for genome analysis pipelines and contributed to many wasted hours of (mis)interpretation. These same novel sequencing technologies have also brought an exponential leap in raw sequencing capability, and at greatly reduced prices that have further skewed the time- and cost-ratios of draft data generation versus the painstaking process of improving and finishing a genome. The resulting effect is an ever-widening gap between drafted and finished genomes that only promises to continue (Figure 1), hence there is an urgent need to distinguish good and poor datasets. The sequencing institutes in the authorship, along with the NIH's Human Microbiome Project Jumpstart Consortium (3), strongly believe that a new set of standards is required for genome sequences.
The following represents a set of six community-defined categories of genome sequence standards that better reflect the quality of the genome sequence, based on our collective understanding of the different technologies, available assemblers, and the varied efforts to improve upon drafted genomes. Due to the increasingly rapid pace of genomics we avoided the use of rigid numerical thresholds in our definitions to take into account the types of products achieved by any combination of technology, chemistry, assembler, or improvement/finishing process.

  2. Finishing The Euchromatic Sequence Of The Human Genome

    SciTech Connect

    Rubin, Edward M.; Lucas, Susan; Richardson, Paul; Rokhsar, Daniel; Pennacchio, Len

    2004-09-07

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ~99% of the euchromatic genome and is accurate to an error rate of ~1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

  3. Accurate whole human genome sequencing using reversible terminator chemistry

    Microsoft Academic Search

    David R. Bentley; Shankar Balasubramanian; Harold P. Swerdlow; Geoffrey P. Smith; John Milton; Clive G. Brown; Kevin P. Hall; Dirk J. Evers; Colin L. Barnes; Helen R. Bignell; Jonathan M. Boutell; Jason Bryant; Richard J. Carter; R. Keira Cheetham; Anthony J. Cox; Darren J. Ellis; Michael R. Flatbush; Niall A. Gormley; Sean J. Humphray; Leslie J. Irving; Mirian S. Karbelashvili; Scott M. Kirk; Heng Li; Xiaohai Liu; Klaus S. Maisinger; Lisa J. Murray; Bojan Obradovic; Tobias Ost; Michael L. Parkinson; Mark R. Pratt; Isabelle M. J. Rasolonjatovo; Mark T. Reed; Roberto Rigatti; Chiara Rodighiero; Mark T. Ross; Andrea Sabot; Subramanian V. Sankar; Aylwyn Scally; Gary P. Schroth; Mark E. Smith; Vincent P. Smith; Anastassia Spiridou; Peta E. Torrance; Svilen S. Tzonev; Eric H. Vermaas; Klaudia Walter; Xiaolin Wu; Lu Zhang; Mohammed D. Alam; Carole Anastasi; Ify C. Aniebo; David M. D. Bailey; Iain R. Bancarz; Saibal Banerjee; Selena G. Barbour; Primo A. Baybayan; Vincent A. Benoit; Kevin F. Benson; Claire Bevis; Phillip J. Black; Asha Boodhun; Joe S. Brennan; John A. Bridgham; Rob C. Brown; Andrew A. Brown; Dale H. Buermann; Abass A. Bundu; James C. Burrows; Nigel P. Carter; Nestor Castillo; Maria Chiara E. Catenazzi; Simon Chang; R. Neil Cooley; Natasha R. Crake; Olubunmi O. Dada; Konstantinos D. Diakoumakos; Belen Dominguez-Fernandez; David J. Earnshaw; Ugonna C. Egbujor; David W. Elmore; Sergey S. Etchin; Mark R. Ewan; Milan Fedurco; Louise J. Fraser; Karin V. Fuentes Fajardo; W. Scott Furey; David George; Kimberley J. Gietzen; Colin P. Goddard; George S. Golda; Philip A. Granieri; David L. Gustafson; Nancy F. Hansen; Kevin Harnish; Christian D. Haudenschild; Narinder I. Heyer; Matthew M. Hims; Johnny T. Ho; Adrian M. Horgan; Katya Hoschler; Steve Hurwitz; Denis V. Ivanov; Maria Q. Johnson; Terena James; T. A. Huw Jones; Gyoung-Dong Kang; Tzvetana H. Kerelska; Alan D. Kersey; Irina Khrebtukova; Alex P. Kindwall; Zoya Kingsbury; Paula I. 
Kokko-Gonzales; Anil Kumar; Marc A. Laurent; Cynthia T. Lawley; Sarah E. Lee; Xavier Lee; Arnold K. Liao; Jennifer A. Loch; Mitch Lok; Shujun Luo; Radhika M. Mammen; John W. Martin; Patrick G. McCauley; Paul McNitt; Parul Mehta; Keith W. Moon; Joe W. Mullens; Taksina Newington; Zemin Ning; Bee Ling Ng; Sonia M. Novo; Mark A. Osborne; Andrew Osnowski; Omead Ostadan; Lambros L. Paraschos; Lea Pickering; Andrew C. Pike; D. Chris Pinkard; Daniel P. Pliskin; Joe Podhasky; Victor J. Quijano; Come Raczy; Vicki H. Rae; Stephen R. Rawlings; Ana Chiva Rodriguez; Phyllida M. Roe; John Rogers; Maria C. Rogert Bacigalupo; Nikolai Romanov; Anthony Romieu; Rithy K. Roth; Natalie J. Rourke; Silke T. Ruediger; Eli Rusman; Raquel M. Sanches-Kuiper; Martin R. Schenker; Josefina M. Seoane; Richard J. Shaw; Mitch K. Shiver; Steven W. Short; Ning L. Sizto; Johannes P. Sluis; Melanie A. Smith; Jean Ernest Sohna Sohna; Eric J. Spence; Kim Stevens; Neil Sutton; Lukasz Szajkowski; Carolyn L. Tregidgo; Gerardo Turcatti; Stephanie vandeVondele; Yuli Verhovsky; Selene M. Virk; Suzanne Wakelin; Gregory C. Walcott; Jingwen Wang; Graham J. Worsley; Juying Yan; Ling Yau; Mike Zuerlein; Jane Rogers; James C. Mullikin; Matthew E. Hurles; Nick J. McCooke; John S. West; Frank L. Oaks; Peter L. Lundberg; David Klenerman; Richard Durbin; Anthony J. Smith

    2008-01-01

    DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally used long (400-800 base pair) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intraspecies genetic variation.

  4. Complete Genome Sequences of Helicobacter pylori Clarithromycin-Resistant Strains

    PubMed Central

    Binh, Tran Thanh; Suzuki, Rumiko; Shiota, Seiji; Kwon, Dong Hyeon

    2013-01-01

    We report the complete genome sequences of two Helicobacter pylori clarithromycin-resistant strains. Clarithromycin (CLR)-resistant strains were obtained under the exposure of H. pylori strain 26695 on agar plates with low clarithromycin concentrations. The genome data provide insights into the genomic changes of H. pylori under selection by clarithromycin in vitro. PMID:24233587

  5. Complete Genome Sequences of Helicobacter pylori Clarithromycin-Resistant Strains.

    PubMed

    Binh, Tran Thanh; Suzuki, Rumiko; Shiota, Seiji; Kwon, Dong Hyeon; Yamaoka, Yoshio

    2013-01-01

    We report the complete genome sequences of two Helicobacter pylori clarithromycin-resistant strains. Clarithromycin (CLR)-resistant strains were obtained under the exposure of H. pylori strain 26695 on agar plates with low clarithromycin concentrations. The genome data provide insights into the genomic changes of H. pylori under selection by clarithromycin in vitro. PMID:24233587

  6. Complete Genome Sequences of Helicobacter pylori Rifampin-Resistant Strains

    PubMed Central

    Chelysheva, Vera; Selezneva, Oksana; Akopian, Tatyana; Alexeev, Dmitry; Govorun, Vadim

    2013-01-01

    Here we present the complete genome sequences of two Helicobacter pylori rifampin-resistant (Rifr) strains (Rif1 and Rif2). Rifr strains were obtained by in vitro selection of H. pylori 26695 on agar plates with 20 µg/ml rifampin. The genome data provide insights on the genomic diversity of H. pylori under selection by rifampin. PMID:23833139

  7. On the sequencing of the human genome Robert H. Waterston*

    E-print Network

    Batzoglou, Serafim

    . The international Human Genome Project (HGP) used the hierarchical shotgun approach, whereas Celera Genomics. One was the product of the international Human Genome Project (HGP), and the other was the product On the sequencing of the human genome Robert H. Waterston*, Eric S. Lander, and John E. Sulston

  8. SEQUENCING THE PIG GENOME USING A BAC BY BAC APPROACH

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We have generated a highly contiguous physical map covering >98% of the pig genome in just 176 contigs. The map is localized to the genome through integration with the UIVC RH map as well as BAC end sequence alignments to the human genome. Over 265k HindIII restriction digest fingerprints totaling 16.2...

  9. GENOMIC TECHNOLOGIES FACILITY: Ion Torrent Sequencing USER/BILLING AGREEMENT

    E-print Network

    Wurtele, Eve Syrkin

    GENOMIC TECHNOLOGIES FACILITY: Ion Torrent Sequencing USER/BILLING AGREEMENT FOR ON-CAMPUS USERS Please fill out completely, and email, fax or mail to: Genomic Technologies Facility Manager 2025 Roy J. Carver Co-Laboratory Center for Plant Genomics Iowa State University Ames, Iowa 50011-3650 515

  10. GENOMIC TECHNOLOGIES FACILITY: Ion Torrent Sequencing USER/BILLING AGREEMENT

    E-print Network

    Wurtele, Eve Syrkin

    GENOMIC TECHNOLOGIES FACILITY: Ion Torrent Sequencing USER/BILLING AGREEMENT FOR OFF-CAMPUS USERS Please fill out completely, and email, fax or mail to: Genomic Technologies Facility Manager 2025 Roy J. Carver Co-Laboratory Center for Plant Genomics Iowa State University Ames, Iowa 50011-3650 515

  11. Genome-level homology and phylogeny of Vibrionaceae (Gammaproteobacteria: Vibrionales) with three new complete genome sequences

    E-print Network

    Dikow, R. B.; Smith, William Leo

    2013-04-11

    Background Phylogenetic hypotheses based on complete genome data are presented for the Gammaproteobacteria family Vibrionaceae. Two taxon samplings are presented: one including all those taxa for which the genome sequences are complete in terms...

  12. On the current status of Phakopsora pachyrhizi genome sequencing

    PubMed Central

    Loehrer, Marco; Vogel, Alexander; Huettel, Bruno; Reinhardt, Richard; Benes, Vladimir; Duplessis, Sébastien; Usadel, Björn; Schaffrath, Ulrich

    2014-01-01

    Recent advances in the field of sequencing technologies and bioinformatics allow a more rapid access to genomes of non-model organisms at sinking costs. Accordingly, draft genomes of several economically important cereal rust fungi have been released in the last 3 years. Aside from the very recent flax rust and poplar rust draft assemblies there are no genomic data available for other dicot-infecting rust fungi. In this article we outline rust fungus sequencing efforts and comment on the current status of Phakopsora pachyrhizi (Asian soybean rust) genome sequencing. PMID:25221558

  13. Draft Genome Sequence of Mycobacterium heraklionense Strain Davo.

    PubMed

    Greninger, Alexander L; Cunningham, Gail; Chiu, Charles Y; Miller, Steve

    2015-01-01

    We report the draft genome sequence of Mycobacterium heraklionense strain Davo, isolated from a fine-needle aspirate of a right-ankle soft-tissue mass. This is the first draft genome sequence of Mycobacterium heraklionense, a nonpigmented rapidly growing mycobacterium. PMID:26205863

  14. Draft Genome Sequence of Tannerella forsythia Type Strain ATCC 43037.

    PubMed

    Friedrich, Valentin; Pabinger, Stephan; Chen, Tsute; Messner, Paul; Dewhirst, Floyd E; Schäffer, Christina

    2015-01-01

    Tannerella forsythia is an oral pathogen implicated in the development of periodontitis. Here, we report the draft genome sequence of the Tannerella forsythia strain ATCC 43037. The previously available genome of this designation (NCBI reference sequence NC_016610.1) was discovered to be derived from a different strain, FDC 92A2 (= ATCC BAA-2717). PMID:26067981

  15. Complete genome sequence of Chinese strain of ‘Candidatus Liberibacter asiaticus’

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complete genome sequence of ‘Candidatus Liberibacter asiaticus’ strain (Las) Guangxi-1(GX-1) was obtained by an Illumina HiSeq 2000. The GX-1 genome comprises 1,268,237 nucleotides, 36.5 % GC content, 1,141 predicted coding sequences, 44 tRNAs, 3 complete copies of ribosomal RNA genes (16S, 23S ...

  16. Draft Genome Sequence of Neurospora crassa Strain FGSC 73.

    PubMed

    Baker, Scott E; Schackwitz, Wendy; Lipzen, Anna; Martin, Joel; Haridas, Sajeet; LaButti, Kurt; Grigoriev, Igor V; Simmons, Blake A; McCluskey, Kevin

    2015-01-01

    We report the elucidation of the complete genome of the Neurospora crassa (Shear and Dodge) strain FGSC 73, a mat-a, trp-3 mutant strain. The genome sequence around the idiotypic mating type locus represents the only publicly available sequence for a mat-a strain. 40.42 Megabases are assembled into 358 scaffolds carrying 11,978 gene models. PMID:25838471

  17. Almost finished: the complete genome sequence of Mycosphaerella graminicola

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Mycosphaerella graminicola causes septoria tritici blotch of wheat. An 8.9x shotgun sequence of bread wheat strain IPO323 was generated through the Community Sequencing Program of the U.S. Department of Energy’s Joint Genome Institute (JGI), and was finished at the Stanford Human Genome Center. The ...

  18. Draft Genome Sequence of Xanthomonas sacchari Strain LMG 476.

    PubMed

    Pieretti, Isabelle; Bolot, Stéphanie; Carrère, Sébastien; Barbe, Valérie; Cociancich, Stéphane; Rott, Philippe; Royer, Monique

    2015-01-01

    We report the high-quality draft genome sequence of Xanthomonas sacchari strain LMG 476, isolated from sugarcane. The genome comparison of this strain with a previously sequenced X. sacchari strain isolated from a distinct environmental source should provide further insights into the adaptation of this species to different habitats and its evolution. PMID:25792064

  19. Draft Genome Sequence of Aspergillus oryzae Strain 3.042

    PubMed Central

    Zhao, Guozhong; Yao, Yunping; Qi, Wei; Wang, Chunling; Hou, Lihua; Zeng, Bin

    2012-01-01

    Aspergillus oryzae is the most important fungus for the traditional fermentation in China and is particularly important in soy sauce fermentation. We report the 36,547,279-bp draft genome sequence of A. oryzae 3.042 and compared it to the published genome sequence of A. oryzae RIB40. PMID:22933657

  20. Sequence and comparative analysis of the chicken genome provide unique

    E-print Network

    Edwards, Scott

    evolution International Chicken Genome Sequencing Consortium* *Lists of participants and affiliations appear is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced the Aves, their Mesozoic dinosaur predecessors, and Crocodilia; the Lepidosauria (lizards, snakes

  1. Draft Genome Sequence of Tannerella forsythia Type Strain ATCC 43037

    PubMed Central

    Friedrich, Valentin; Pabinger, Stephan; Chen, Tsute; Messner, Paul; Dewhirst, Floyd E.

    2015-01-01

    Tannerella forsythia is an oral pathogen implicated in the development of periodontitis. Here, we report the draft genome sequence of the Tannerella forsythia strain ATCC 43037. The previously available genome of this designation (NCBI reference sequence NC_016610.1) was discovered to be derived from a different strain, FDC 92A2 (= ATCC BAA-2717). PMID:26067981

  2. Draft Genome Sequence of the Wolbachia Endosymbiont of Drosophila suzukii

    PubMed Central

    Cestaro, Alessandro; Kaur, Rupinder; Pertot, Ilaria; Rota-Stabelli, Omar; Anfora, Gianfranco

    2013-01-01

    Wolbachia is one of the most successful and abundant symbiotic bacteria in nature, infecting more than 40% of the terrestrial arthropod species. Here we report the draft genome sequence of a novel Wolbachia strain named “wSuzi” that was retrieved from the genome sequencing of its host, the invasive pest Drosophila suzukii. PMID:23472225

  3. Draft Genome Sequence of the Wolbachia Endosymbiont of Drosophila suzukii.

    PubMed

    Siozios, Stefanos; Cestaro, Alessandro; Kaur, Rupinder; Pertot, Ilaria; Rota-Stabelli, Omar; Anfora, Gianfranco

    2013-01-01

    Wolbachia is one of the most successful and abundant symbiotic bacteria in nature, infecting more than 40% of the terrestrial arthropod species. Here we report the draft genome sequence of a novel Wolbachia strain named "wSuzi" that was retrieved from the genome sequencing of its host, the invasive pest Drosophila suzukii. PMID:23472225

  4. Complete Genome Sequence of Melissococcus plutonius ATCC 35311

    PubMed Central

    Okumura, Kayo; Arai, Rie; Okura, Masatoshi; Kirikae, Teruo; Takamatsu, Daisuke; Osaki, Makoto; Miyoshi-Akiyama, Tohru

    2011-01-01

    We report the first completely annotated genome sequence of Melissococcus plutonius ATCC 35311. M. plutonius is a one-genus, one-species bacterium and the etiological agent of European foulbrood of the honeybee. The genome sequence will provide new insights into the molecular mechanisms underlying its pathogenicity. PMID:21622755

  5. Complete Genome Sequence of Burkholderia cepacia Strain LO6.

    PubMed

    Belcaid, Mahdi; Kang, Yun; Tuanyok, Apichai; Hoang, Tung T

    2015-01-01

    Burkholderia cepacia strain LO6 is a betaproteobacterium that was isolated from a cystic fibrosis patient. Here we report the 6.4 Mb draft genome sequence assembled into 2 contigs. This genome sequence will aid the transcriptomic profiling of this bacterium and help us to better understand the mechanisms specific to pulmonary infections. PMID:26067955

  6. Complete Genome Sequence of Burkholderia cepacia Strain LO6

    PubMed Central

    Belcaid, Mahdi; Kang, Yun; Tuanyok, Apichai

    2015-01-01

    Burkholderia cepacia strain LO6 is a betaproteobacterium that was isolated from a cystic fibrosis patient. Here we report the 6.4 Mb draft genome sequence assembled into 2 contigs. This genome sequence will aid the transcriptomic profiling of this bacterium and help us to better understand the mechanisms specific to pulmonary infections. PMID:26067955

  7. Use of Whole Genome Sequence Data To Infer Baculovirus Phylogeny

    Microsoft Academic Search

    ELISABETH A. HERNIOU; TERESA LUQUE; XINWEN CHEN; JUST M. VLAK; DOREEN WINSTANLEY; JENNIFER S. CORY; D. R. O'Reilly

    2001-01-01

    Several phylogenetic methods based on whole genome sequence data were evaluated using data from nine complete baculovirus genomes. The utility of three independent character sets was assessed. The first data set comprised the sequences of the 63 genes common to these viruses. The second set of characters was based on gene order, and phylogenies were inferred using both breakpoint distance

  8. Initial sequencing and analysis of the human genome

    Microsoft Academic Search

    Eric S. Lander; Lauren M. Linton; Bruce Birren; Chad Nusbaum; Michael C. Zody; Jennifer Baldwin; Keri Devon; Ken Dewar; Michael Doyle; William FitzHugh; Roel Funke; Diane Gage; Katrina Harris; Andrew Heaford; John Howland; Lisa Kann; Jessica Lehoczky; Rosie LeVine; Paul McEwan; Kevin McKernan; James Meldrim; Jill P. Mesirov; Cher Miranda; William Morris; Jerome Naylor; Christina Raymond; Mark Rosetti; Ralph Santos; Andrew Sheridan; Carrie Sougnez; Nicole Stange-Thomann; Nikola Stojanovic; Aravind Subramanian; Dudley Wyman; Jane Rogers; John Sulston; Rachael Ainscough; Stephan Beck; David Bentley; John Burton; Christopher Clee; Nigel Carter; Alan Coulson; Rebecca Deadman; Panos Deloukas; Andrew Dunham; Ian Dunham; Richard Durbin; Lisa French; Darren Grafham; Simon Gregory; Tim Hubbard; Sean Humphray; Adrienne Hunt; Matthew Jones; Christine Lloyd; Amanda McMurray; Lucy Matthews; Simon Mercer; Sarah Milne; James C. Mullikin; Andrew Mungall; Robert Plumb; Mark Ross; Ratna Shownkeen; Sarah Sims; Robert H. Waterston; Richard K. Wilson; LaDeana W. Hillier; John D. McPherson; Marco A. Marra; Elaine R. Mardis; Lucinda A. Fulton; Asif T. Chinwalla; Kymberlie H. Pepin; Warren R. Gish; Stephanie L. Chissoe; Michael C. Wendl; Kim D. Delehaunty; Tracie L. Miner; Andrew Delehaunty; Jason B. Kramer; Lisa L. Cook; Robert S. Fulton; Douglas L. Johnson; Patrick J. Minx; Sandra W. Clifton; Trevor Hawkins; Elbert Branscomb; Paul Predki; Paul Richardson; Sarah Wenning; Tom Slezak; Norman Doggett; Jan-Fang Cheng; Anne Olsen; Susan Lucas; Christopher Elkin; Edward Uberbacher; Marvin Frazier; Richard A. Gibbs; Donna M. Muzny; Steven E. Scherer; John B. Bouck; Erica J. Sodergren; Kim C. Worley; Catherine M. Rives; James H. Gorrell; Michael L. Metzker; Susan L. Naylor; Raju S. Kucherlapati; David L. Nelson; George M. 
Weinstock; Yoshiyuki Sakaki; Asao Fujiyama; Masahira Hattori; Tetsushi Yada; Atsushi Toyoda; Takehiko Itoh; Chiharu Kawagoe; Hidemi Watanabe; Yasushi Totoki; Todd Taylor; Jean Weissenbach; Roland Heilig; William Saurin; Francois Artiguenave; Philippe Brottier; Thomas Bruls; Eric Pelletier; Catherine Robert; Patrick Wincker; Douglas R. Smith; Lynn Doucette-Stamm; Marc Rubenfield; Keith Weinstock; Hong Mei Lee; JoAnn Dubois; André Rosenthal; Matthias Platzer; Gerald Nyakatura; Stefan Taudien; Andreas Rump; Huanming Yang; Jun Yu; Jian Wang; Guyang Huang; Jun Gu; Leroy Hood; Lee Rowen; Anup Madan; Shizen Qin; Ronald W. Davis; Nancy A. Federspiel; A. Pia Abola; Michael J. Proctor; Richard M. Myers; Jeremy Schmutz; Mark Dickson; Jane Grimwood; David R. Cox; Maynard V. Olson; Rajinder Kaul; Christopher Raymond; Nobuyoshi Shimizu; Kazuhiko Kawasaki; Shinsei Minoshima; Glen A. Evans; Maria Athanasiou; Roger Schultz; Bruce A. Roe; Feng Chen; Huaqin Pan; Juliane Ramser; Hans Lehrach; Richard Reinhardt; W. Richard McCombie; Melissa de la Bastide; Neilay Dedhia; Helmut Blöcker; Klaus Hornischer; Gabriele Nordsiek; Richa Agarwala; L. Aravind; Jeffrey A. Bailey; Serafim Batzoglou; Ewan Birney; Peer Bork; Daniel G. Brown; Christopher B. Burge; Lorenzo Cerutti; Hsiu-Chuan Chen; Deanna Church; Michele Clamp; Richard R. Copley; Tobias Doerks; Sean R. Eddy; Evan E. Eichler; Terrence S. Furey; James Galagan; James G. R. Gilbert; Cyrus Harmon; Yoshihide Hayashizaki; David Haussler; Henning Hermjakob; Karsten Hokamp; Wonhee Jang; L. Steven Johnson; Thomas A. Jones; Simon Kasif; Arek Kaspryzk; Scot Kennedy; W. James Kent; Paul Kitts; Eugene V. Koonin; Ian Korf; David Kulp; Doron Lancet; Todd M. Lowe; Aoife McLysaght; Tarjei Mikkelsen; John V. Moran; Nicola Mulder; Victor J. Pollara; Chris P. Ponting; Greg Schuler; Jörg Schultz; Guy Slater; Arian F. A. 
Smit; Elia Stupka; Joseph Szustakowki; Danielle Thierry-Mieg; Jean Thierry-Mieg; Lukas Wagner; John Wallis; Raymond Wheeler; Alan Williams; Yuri I. Wolf; Kenneth H. Wolfe; Shiaw-Pyng Yang; Ru-Fang Yeh; Francis Collins; Mark S. Guyer; Jane Peterson; Adam Felsenfeld; Kris A. Wetterstrand; Aristides Patrinos; Michael J. Morgan

    2001-01-01

    The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

  9. Draft Genome Sequence of Tolypothrix boutellei Strain VB521301

    PubMed Central

    Chandrababunaidu, Mathu Malar; Singh, Deeksha; Sen, Diya; Bhan, Sushma; Das, Subhadeep; Gupta, Akash

    2015-01-01

    We report here the draft genome sequence of the filamentous nitrogen-fixing cyanobacterium Tolypothrix boutellei strain VB521301. The organism is lipid rich and hydrophobic and produces polyunsaturated fatty acids which can be harnessed for industrial purpose. The draft genome sequence assembled into 11,572,263 bp with 70 scaffolds and 7,777 protein coding genes. PMID:25700407

  10. De novo assembly of a bell pepper endornavirus genome sequence using RNA sequencing data.

    PubMed

    Jo, Yeonhwa; Choi, Hoseng; Cho, Won Kyong

    2015-01-01

    The genus Endornavirus is a double-stranded RNA virus that infects a wide range of hosts. In this study, we report on the de novo assembly of a bell pepper endornavirus genome sequence by RNA sequencing (RNA-Seq). Our result demonstrates the successful application of RNA-Seq to obtain a complete viral genome sequence from the transcriptome data. PMID:25792042

  11. Enhancing genome assemblies by integrating non-sequence based data

    Microsoft Academic Search

    Thomas N Heider; James Lindsay; Chenwei Wang; Rachel J O’Neill; Andrew J Pask

    2011-01-01

    Introduction Many genome projects were underway before the advent of high-throughput sequencing and have thus been supported by a wealth of genome information from other technologies. Such information frequently takes the form of linkage and physical maps, both of which can provide a substantial amount of data useful in de novo sequencing projects. Furthermore, the recent abundance of genome resources enables

  12. The human genome sequence: impact on health care

    Microsoft Academic Search

    M. D. Bashyam; S. E. Hasnain

    2003-01-01

    The recent sequencing of the human genome, resulting from two independent global efforts, is poised to revolutionize all aspects of human health. This landmark achievement has also vindicated two different methodologies that can now be used to target other important large genomes. The human genome sequence has revealed several novel/surprising features, notably the probable presence of a mere 30-35,000 genes.

  13. Genome sequence of the human malaria parasite Plasmodium falciparum

    Microsoft Academic Search

    Malcolm J. Gardner; Neil Hall; Eula Fung; Owen White; Matthew Berriman; Richard W. Hyman; Jane M. Carlton; Arnab Pain; Sharen Bowman; Ian T. Paulsen; Keith James; Kim Rutherford; Steven L. Salzberg; Alister Craig; Sue Kyes; Man-Suen Chan; Vishvanath Nene; Shamira J. Shallom; Bernard Suh; Jeremy Peterson; Sam Angiuoli; Mihaela Pertea; Jonathan Allen; Jeremy Selengut; Daniel Haft; Michael W. Mather; Akhil B. Vaidya; Alan H. Fairlamb; Martin J. Fraunholz; David S. Roos; Stuart A. Ralph; Geoffrey I. McFadden; Leda M. Cummings; G. Mani Subramanian; Chris Mungall; J. Craig Venter; Daniel J. Carucci; Stephen L. Hoffman; Chris Newbold; Ronald W. Davis; Claire M. Fraser; Bart Barrell

    2002-01-01

    The parasite Plasmodium falciparum is responsible for hundreds of millions of cases of malaria, and kills more than one million African children annually. Here we report an analysis of the genome sequence of P. falciparum clone 3D7. The 23-megabase nuclear genome consists of 14 chromosomes, encodes about 5,300 genes, and is the most (A + T)-rich genome sequenced to date.

  14. Single-molecule DNA sequencing technologies for future genomics research.

    PubMed

    Gupta, Pushpendra K

    2008-11-01

    During the current genomics revolution, the genomes of a large number of living organisms have been fully sequenced. However, with the advent of new sequencing technologies, genomics research is now at the threshold of a second revolution. Several second-generation sequencing platforms became available in 2007, but a further revolution in DNA resequencing technologies is being witnessed in 2008, with the launch of the first single-molecule DNA sequencer (Helicos Biosciences), which has already been used to resequence the genome of the M13 virus. This review discusses several single-molecule sequencing technologies that are expected to become available during the next few years and explains how they might impact on genomics research. PMID:18722683

  15. Draft Genome Sequence of Gerbil-Adapted Carcinogenic Helicobacter pylori Strain 7.13

    PubMed Central

    Asim, Mohammad; Chikara, Surendra K.; Ghosh, Arpita; Vudathala, Srinivas; Romero-Gallo, Judith; Krishna, Uma S.; Wilson, Keith T.; Israel, Dawn A.; Peek, Richard M.

    2015-01-01

    We report here the draft genome sequence of Helicobacter pylori strain 7.13, a gerbil-adapted strain that causes gastric cancer in gerbils. Strain 7.13 is derived from clinical strain B128, isolated from a patient with a duodenal ulcer. This study reveals genes associated with the virulence of the strain. PMID:26067974

  16. Clinical efficacy and possible applications of genomics in lung cancer.

    PubMed

    Alharbi, Khalid Khalaf

    2015-01-01

    The heterogeneous nature of lung cancer has become increasingly apparent since introduction of molecular classification. In general, advanced lung cancer is an aggressive malignancy with a poor prognosis. Activating alterations in several potential driver oncogenic genes have been identified, including EGFR, ROS1 and ALK and understanding of their molecular mechanisms underlying development, progression, and survival of lung cancer has led to the design of personalized treatments that have produced superior clinical outcomes in tumours harbouring these mutations. In light of the tsunami of new biomarkers and targeted agents, next generation sequencing testing strategies will be more appropriate in identifying the patients for each therapy and enabling personalized patients care. The challenge now is how best to interpret the results of these genomic tests, in the context of other clinical data, to optimize treatment choices. In genomic era of cancer treatment, the traditional one-size-fits-all paradigm is being replaced with more effective, personalized oncologic care. This review provides an overview of lung cancer genomics and personalized treatment. PMID:25773789

  17. Whole-genome sequencing in outbreak analysis.

    PubMed

    Gilchrist, Carol A; Turner, Stephen D; Riley, Margaret F; Petri, William A; Hewlett, Erik L

    2015-07-01

    In addition to the ever-present concern of medical professionals about epidemics of infectious diseases, the relative ease of access and low cost of obtaining, producing, and disseminating pathogenic organisms or biological toxins mean that bioterrorism activity should also be considered when facing a disease outbreak. Utilization of whole-genome sequencing (WGS) in outbreak analysis facilitates the rapid and accurate identification of virulence factors of the pathogen and can be used to identify the path of disease transmission within a population and provide information on the probable source. Molecular tools such as WGS are being refined and advanced at a rapid pace to provide robust and higher-resolution methods for identifying, comparing, and classifying pathogenic organisms. If these methods of pathogen characterization are properly applied, they will enable an improved public health response whether a disease outbreak was initiated by natural events or by accidental or deliberate human activity. The current application of next-generation sequencing (NGS) technology to microbial WGS and microbial forensics is reviewed. PMID:25876885

  18. Genome remodelling in a basal-like breast cancer metastasis and xenograft

    Microsoft Academic Search

    Li Ding; Matthew J. Ellis; Shunqiang Li; David E. Larson; Ken Chen; John W. Wallis; Christopher C. Harris; Michael D. McLellan; Robert S. Fulton; Lucinda L. Fulton; Rachel M. Abbott; Jeremy Hoog; David J. Dooling; Daniel C. Koboldt; Heather Schmidt; Joelle Kalicki; Qunyuan Zhang; Lei Chen; Ling Lin; Michael C. Wendl; Joshua F. McMichael; Vincent J. Magrini; Lisa Cook; Sean D. McGrath; Tammi L. Vickery; Elizabeth Appelbaum; Katherine Deschryver; Sherri Davies; Therese Guintoli; Li Lin; Robert Crowder; Yu Tao; Jacqueline E. Snider; Scott M. Smith; Adam F. Dukes; Gabriel E. Sanderson; Craig S. Pohl; Kim D. Delehaunty; Catrina C. Fronick; Kimberley A. Pape; Jerry S. Reed; Jody S. Robinson; Jennifer S. Hodges; William Schierding; Nathan D. Dees; Dong Shen; Devin P. Locke; Madeline E. Wiechert; James M. Eldred; Josh B. Peck; Benjamin J. Oberkfell; Justin T. Lolofie; Feiyu Du; Amy E. Hawkins; Michelle D. O'Laughlin; Kelly E. Bernard; Mark Cunningham; Glendoria Elliott; Mark D. Mason; Dominic M. Thompson Jr.; Jennifer L. Ivanovich; Paul J. Goodfellow; Charles M. Perou; George M. Weinstock; Rebecca Aft; Mark Watson; Timothy J. Ley; Richard K. Wilson; Elaine R. Mardis

    2010-01-01

    Massively parallel DNA sequencing technologies provide an unprecedented ability to screen entire genomes for genetic changes associated with tumour progression. Here we describe the genomic analyses of four DNA samples from an African-American patient with basal-like breast cancer: peripheral blood, the primary tumour, a brain metastasis and a xenograft derived from the primary tumour. The metastasis contained two de novo

  19. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

    Microsoft Academic Search

    Tina T. Hu; Pedro Pattyn; Erica G. Bakker; Jun Cao; Jan-Fang Cheng; Richard M. Clark; Noah Fahlgren; Jeffrey A. Fawcett; Jane Grimwood; Heidrun Gundlach; Georg Haberer; Jesse D. Hollister; Stephan Ossowski; Robert P. Ottilar; Asaf A. Salamov; Korbinian Schneeberger; Manuel Spannagl; Xi Wang; Liang Yang; Mikhail E. Nasrallah; Joy Bergelson; James C. Carrington; Brandon S. Gaut; Jeremy Schmutz; Klaus F. X. Mayer; Yves Van de Peer; Igor V. Grigoriev; Magnus Nordborg; Detlef Weigel; Ya-Long Guo

    2011-01-01

    In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies, etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how

  20. Standards for Sequencing Viral Genomes in the Era of High-Throughput Sequencing

    PubMed Central

    Beitzel, Brett; Chain, Patrick S. G.; Davenport, Matthew G.; Donaldson, Eric; Frieman, Matthew; Kugelman, Jeffrey; Kuhn, Jens H.; O’Rear, Jules; Sabeti, Pardis C.; Wentworth, David E.; Wiley, Michael R.; Yu, Guo-Yun; Sozhamannan, Shanmuga; Bradburne, Christopher

    2014-01-01

    Thanks to high-throughput sequencing technologies, genome sequencing has become a common component in nearly all aspects of viral research; thus, we are experiencing an explosion in both the number of available genome sequences and the number of institutions producing such data. However, there are currently no common standards used to convey the quality, and therefore utility, of these various genome sequences. Here, we propose five “standard” categories that encompass all stages of viral genome finishing, and we define them using simple criteria that are agnostic to the technology used for sequencing. We also provide genome finishing recommendations for various downstream applications, keeping in mind the cost-benefit trade-offs associated with different levels of finishing. Our goal is to define a common vocabulary that will allow comparison of genome quality across different research groups, sequencing platforms, and assembly techniques. PMID:24939889

  1. Standards for sequencing viral genomes in the era of high-throughput sequencing.

    PubMed

    Ladner, Jason T; Beitzel, Brett; Chain, Patrick S G; Davenport, Matthew G; Donaldson, Eric F; Frieman, Matthew; Kugelman, Jeffrey R; Kuhn, Jens H; O'Rear, Jules; Sabeti, Pardis C; Wentworth, David E; Wiley, Michael R; Yu, Guo-Yun; Sozhamannan, Shanmuga; Bradburne, Christopher; Palacios, Gustavo

    2014-01-01

    Thanks to high-throughput sequencing technologies, genome sequencing has become a common component in nearly all aspects of viral research; thus, we are experiencing an explosion in both the number of available genome sequences and the number of institutions producing such data. However, there are currently no common standards used to convey the quality, and therefore utility, of these various genome sequences. Here, we propose five "standard" categories that encompass all stages of viral genome finishing, and we define them using simple criteria that are agnostic to the technology used for sequencing. We also provide genome finishing recommendations for various downstream applications, keeping in mind the cost-benefit trade-offs associated with different levels of finishing. Our goal is to define a common vocabulary that will allow comparison of genome quality across different research groups, sequencing platforms, and assembly techniques. PMID:24939889

  2. Genome Science: A Video Tour of the Washington University Genome Sequencing Center for High School and Undergraduate Students

    ERIC Educational Resources Information Center

    Flowers, Susan K.; Easter, Carla; Holmes, Andrea; Cohen, Brian; Bednarski, April E.; Mardis, Elaine R.; Wilson, Richard K.; Elgin, Sarah C. R.

    2005-01-01

    Sequencing of the human genome has ushered in a new era of biology. The technologies developed to facilitate the sequencing of the human genome are now being applied to the sequencing of other genomes. In 2004, a partnership was formed between Washington University School of Medicine Genome Sequencing Center's Outreach Program and Washington…

  3. Using Partial Genomic Fosmid Libraries for Sequencing Complete Organellar Genomes

    SciTech Connect

    McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.; Kuehl, Jennifer V.; Boore, Jeffrey L.; dePamphilis, Claude W.

    2005-08-26

    Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.
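
    The abstract above does not say how the minimum tiling path of clones was chosen. One common way to pick a small covering set of overlapping intervals is a greedy sweep, sketched below under loud assumptions: clones are modelled as hypothetical half-open (start, end) coordinates on a linearized genome (plastid genomes are circular, which this sketch ignores), and the function name `minimal_tiling_path` is illustrative only.

```python
def minimal_tiling_path(clones, genome_length):
    """Greedy interval cover (illustrative, not the authors' procedure).

    clones: list of (start, end) half-open intervals on a linearized genome.
    Returns the chosen clones in order, or None if coverage has a gap.
    """
    clones = sorted(clones)
    path = []
    covered = 0  # everything in [0, covered) is already tiled
    i = 0
    while covered < genome_length:
        best = None
        # Among clones starting inside the covered region,
        # keep the one that reaches furthest to the right.
        while i < len(clones) and clones[i][0] <= covered:
            if best is None or clones[i][1] > best[1]:
                best = clones[i]
            i += 1
        if best is None or best[1] <= covered:
            return None  # no clone extends coverage: a gap remains
        path.append(best)
        covered = best[1]
    return path


if __name__ == "__main__":
    # Hypothetical clone coordinates, for illustration only.
    demo = [(0, 40), (10, 35), (30, 80), (35, 70), (75, 120), (60, 100), (95, 150)]
    print(minimal_tiling_path(demo, 150))
```

    The greedy choice yields a cover with the fewest intervals for a linear target; a circular genome would need an extra step to choose the starting clone.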

  4. Emerging Knowledge from Genome Sequencing of Crop Species

    Microsoft Academic Search

    Delfina Barabaschi; Davide Guerra; Katia Lacrima; Paolo Laino; Vania Michelotti; Simona Urso; Giampiero Valè; Luigi Cattivelli

    Extensive insights into genome composition, organization, and evolution have been gained from ongoing plant genome sequencing and annotation projects. The analysis of crop genomes provided surprising evidence with important implications for plant origin and evolution: genome duplication, ancestral re-arrangements and unexpected polyploidization events opened new doors to address fundamental questions related to species proliferation, adaptation, and functional modulations.

  5. Genome Sequence of Tumebacillus flagellatus GST4, the First Genome Sequence of a Species in the Genus Tumebacillus

    PubMed Central

    Wang, Qing-Yan; Huang, Yan-Yan; Song, Li-Fu; Du, Qi-Shi; Yu, Bo; Chen, Dong

    2014-01-01

    We present here the first genome sequence of a species in the genus Tumebacillus. The draft genome sequence of Tumebacillus flagellatus GST4 provides a genetic basis for future studies addressing the origins, evolution, and ecological role of Tumebacillus organisms, as well as a source of acid-resistant amylase-encoding genes for further studies. PMID:25395648

  6. Cancer vulnerabilities unveiled by genomic loss | Office of Cancer Genomics

    Cancer.gov

    Integrated analysis of RNAi and copy number data across a panel of cancer cell lines revealed the CYCLOPS (copy number alterations yielding cancer liabilities owing to partial loss) genes, which include components of the spliceosome, ribosome and proteasome, as potential candidates for targeted cancer therapies. Partial loss of these genes may make tumor cells more sensitive than normal cells to gene suppression with targeted agents.

  7. Interpretation of personal genome sequencing data in terms of disease ranks based on mutual information

    PubMed Central

    2015-01-01

    Background The rapid advances in genome sequencing technologies have resulted in an unprecedented number of genome variations being discovered in humans. However, there has been very limited coverage of interpretation of the personal genome sequencing data in terms of diseases. Methods In this paper we present the first computational analysis scheme for interpreting personal genome data by simultaneously considering the functional impact of damaging variants and curated disease-gene association data. This method is based on mutual information as a measure of the relative closeness between the personal genome and diseases. We hypothesize that a higher mutual information score implies that the personal genome is more susceptible to a particular disease than other diseases. Results The method was applied to the sequencing data of 50 acute myeloid leukemia (AML) patients in The Cancer Genome Atlas. The utility of associations between a disease and the personal genome was explored using data of healthy (control) people obtained from the 1000 Genomes Project. The ranks of the disease terms in the AML patient group were compared with those in the healthy control group using "Leukemia, Myeloid, Acute" (C04.557.337.539.550) as the corresponding MeSH disease term. The mutual information rank of the disease term was substantially higher in the AML patient group than in the healthy control group, which demonstrates that the proposed methodology can be successfully applied to infer associations between the personal genome and diseases. Conclusions Overall, the area under the receiver operating characteristics curve was significantly larger for the AML patient data than for the healthy controls. This methodology could contribute to consequential discoveries and explanations for mining personal genome sequencing data in terms of diseases, and have versatility with respect to genomic-based knowledge such as drug-gene and environmental-factor-gene interactions. PMID:26045178

  8. Evolution and comparative genomics of subcellular specializations: EST sequencing of Torpedo electric organ

    E-print Network

    Vertes, Akos

    Evolution and comparative genomics of subcellular specializations: EST sequencing of Torpedo discovery Open reading frame (ORF) Uncharacterized open reading frames (ORFs) in human genomic sequence Elsevier B.V. All rights reserved. 1. Introduction The availability of complete genomic sequences

  9. Basics of Genome Sequence Analysis in Bioinformatics -- its Fundamental Ideas and Problems

    Microsoft Academic Search

    Tomonori Suzuki; Satoru Miyazaki

    2009-01-01

    Genome sequences are among the most fundamental data in the various omics analyses. So far, basic bioinformatics tools have been developed to handle genome sequences. The first step of genome sequence analysis is to predict or assign

  10. Community-wide analysis of microbial genome sequence signatures

    PubMed Central

    Dick, Gregory J; Andersson, Anders F; Baker, Brett J; Simmons, Sheri L; Thomas, Brian C; Yelton, A Pepper; Banfield, Jillian F

    2009-01-01

    Background Analyses of DNA sequences from cultivated microorganisms have revealed genome-wide, taxa-specific nucleotide compositional characteristics, referred to as genome signatures. These signatures have far-reaching implications for understanding genome evolution and potential application in classification of metagenomic sequence fragments. However, little is known regarding the distribution of genome signatures in natural microbial communities or the extent to which environmental factors shape them. Results We analyzed metagenomic sequence data from two acidophilic biofilm communities, including composite genomes reconstructed for nine archaea, three bacteria, and numerous associated viruses, as well as thousands of unassigned fragments from strain variants and low-abundance organisms. Genome signatures, in the form of tetranucleotide frequencies analyzed by emergent self-organizing maps, segregated sequences from all known populations sharing < 50 to 60% average amino acid identity and revealed previously unknown genomic clusters corresponding to low-abundance organisms and a putative plasmid. Signatures were pervasive genome-wide. Clusters were resolved because intra-genome differences resulting from translational selection or protein adaptation to the intracellular (pH ~5) versus extracellular (pH ~1) environment were small relative to inter-genome differences. We found that these genome signatures stem from multiple influences but are primarily manifested through codon composition, which we propose is the result of genome-specific mutational biases. Conclusions An important conclusion is that shared environmental pressures and interactions among coevolving organisms do not obscure genome signatures in acid mine drainage communities. Thus, genome signatures can be used to assign sequence fragments to populations, an essential prerequisite if metagenomics is to provide ecological and biochemical insights into the functioning of microbial communities. 
    PMID:19698104
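
    The genome signatures described above take the form of tetranucleotide frequencies. A minimal sketch of how such a frequency vector could be computed for one sequence fragment follows. It covers only the counting step, not the emergent self-organizing maps used in the study, and the function name and the choice to skip windows containing ambiguous bases are assumptions made here for illustration.

```python
from collections import Counter
from itertools import product

# All 256 tetranucleotides in a fixed (lexicographic) order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(seq):
    """Return relative frequencies of the 256 tetranucleotides in seq.

    Windows containing anything other than A, C, G or T are skipped
    (an assumption of this sketch).
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[k] / total for k in TETRAMERS]


if __name__ == "__main__":
    freqs = tetranucleotide_frequencies("ACGTACGT")
    print(freqs[TETRAMERS.index("ACGT")])
```

    Vectors like this, one per fragment, are the kind of input a clustering method can then group. Whether to merge each tetranucleotide with its reverse complement is a separate design choice not made here.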

  11. Multiple alignment of genomic sequences using CHAOS, DIALIGN and ABC

    Microsoft Academic Search

    Dirk Pöhler; Nadine Werner; Rasmus Steinkamp; Burkhard Morgenstern

    2005-01-01

    Comparative analysis of genomic sequences is a powerful approach to discover functional sites in these sequences. Herein, we present a WWW-based software system for multiple alignment of genomic sequences. We use the local alignment tool CHAOS to rapidly identify chains of pairwise similarities. These similarities are used as anchor points to speed up the DIALIGN multiple-alignment program. Finally, the visualization tool ABC is used for interactive graphical

  12. MIPS: a database for genomes and protein sequences

    Microsoft Academic Search

    Hans-werner Mewes; Dmitrij Frishman; Ulrich Güldener; Gertrud Mannhaupt; Klaus F. X. Mayer; Martin Mokrejs; Burkhard Morgenstern; Martin Münsterkötter; Stephen Rudd; B. Weil

    2002-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein

  13. Computational methods and resources for the interpretation of genomic variants in cancer

    PubMed Central

    2015-01-01

    The recent improvement of the high-throughput sequencing technologies is having a strong impact on the detection of genetic variations associated with cancer. Several institutions worldwide have been sequencing the whole exomes and or genomes of cancer patients in the thousands, thereby providing an invaluable collection of new somatic mutations in different cancer types. These initiatives promoted the development of methods and tools for the analysis of cancer genomes that are aimed at studying the relationship between genotype and phenotype in cancer. In this article we review the online resources and computational tools for the analysis of cancer genome. First, we describe the available repositories of cancer genome data. Next, we provide an overview of the methods for the detection of genetic variation and computational tools for the prioritization of cancer related genes and causative somatic variations. Finally, we discuss the future perspectives in cancer genomics focusing on the impact of computational methods and quantitative approaches for defining personalized strategies to improve the diagnosis and treatment of cancer. PMID:26111056

  14. Cancer Proliferation Gene Discovery Through Functional Genomics

    PubMed Central

    Schlabach, Michael R.; Luo, Ji; Solimini, Nicole L.; Hu, Guang; Xu, Qikai; Li, Mamie Z.; Zhao, Zhenming; Smogorzewska, Agata; Sowa, Mathew E.; Ang, Xiaolu L.; Westbrook, Thomas F.; Liang, Anthony C.; Chang, Kenneth; Hackett, Jennifer A.; Harper, J. Wade; Hannon, Gregory J.; Elledge, Stephen J.

    2010-01-01

    Retroviral short hairpin RNA (shRNA)–mediated genetic screens in mammalian cells are powerful tools for discovering loss-of-function phenotypes. We describe a highly parallel multiplex methodology for screening large pools of shRNAs using half-hairpin barcodes for microarray deconvolution. We carried out dropout screens for shRNAs that affect cell proliferation and viability in cancer cells and normal cells. We identified many shRNAs to be antiproliferative that target core cellular processes, such as the cell cycle and protein translation, in all cells examined. Moreover, we identified genes that are selectively required for proliferation and survival in different cell lines. Our platform enables rapid and cost-effective genome-wide screens to identify cancer proliferation and survival genes for target discovery. Such efforts are complementary to the Cancer Genome Atlas and provide an alternative functional view of cancer cells. PMID:18239126

  15. Identification of Optimum Sequencing Depth Especially for De Novo Genome Assembly of Small Genomes Using Next Generation Sequencing Data

    PubMed Central

    Desai, Aarti; Marwah, Veer Singh; Yadav, Akshay; Jha, Vineet; Dhaygude, Kishor; Bangar, Ujwala; Kulkarni, Vivek; Jere, Abhay

    2013-01-01

    Next Generation Sequencing (NGS) is a disruptive technology that has found widespread acceptance in the life sciences research community. The high throughput and low cost of sequencing has encouraged researchers to undertake ambitious genomic projects, especially in de novo genome sequencing. Currently, NGS systems generate sequence data as short reads and de novo genome assembly using these short reads is computationally very intensive. Due to lower cost of sequencing and higher throughput, NGS systems now provide the ability to sequence genomes at high depth. However, currently no report is available highlighting the impact of high sequence depth on genome assembly using real data sets and multiple assembly algorithms. Recently, some studies have evaluated the impact of sequence coverage, error rate and average read length on genome assembly using multiple assembly algorithms, however, these evaluations were performed using simulated datasets. One limitation of using simulated datasets is that variables such as error rates, read length and coverage which are known to impact genome assembly are carefully controlled. Hence, this study was undertaken to identify the minimum depth of sequencing required for de novo assembly for different sized genomes using graph based assembly algorithms and real datasets. Illumina reads for E. coli (4.6 MB), S. kudriavzevii (11.18 MB) and C. elegans (100 MB) were assembled using SOAPdenovo, Velvet, ABySS, Meraculous and IDBA-UD. Our analysis shows that 50X is the optimum read depth for assembling these genomes using all assemblers except Meraculous which requires 100X read depth. Moreover, our analysis shows that de novo assembly from 50X read data requires only 6–40 GB RAM depending on the genome size and assembly algorithm used. We believe that this information can be extremely valuable for researchers in designing experiments and multiplexing which will enable optimum utilization of sequencing as well as analysis resources. PMID:23593174
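
    Read depth in the sense used above is conventionally estimated as (number of reads × read length) / genome size. The sketch below applies that relation to the genome sizes quoted in the abstract. The 100 bp read length is an assumption for illustration only (the abstract does not state it), and the function names are hypothetical.

```python
import math


def depth_from_reads(n_reads, read_length_bp, genome_size_bp):
    """Average depth = total sequenced bases / genome size."""
    return n_reads * read_length_bp / genome_size_bp


def reads_for_depth(genome_size_bp, target_depth, read_length_bp):
    """Smallest number of reads that reaches the target average depth."""
    return math.ceil(target_depth * genome_size_bp / read_length_bp)


if __name__ == "__main__":
    # Genome sizes as quoted in the abstract; read length is assumed.
    genomes = {
        "E. coli": 4_600_000,
        "S. kudriavzevii": 11_180_000,
        "C. elegans": 100_000_000,
    }
    assumed_read_length = 100  # bp, illustrative assumption
    for name, size in genomes.items():
        print(name, reads_for_depth(size, 50, assumed_read_length))
```

    This gives only the average depth; real coverage is uneven, which is one reason the study evaluated assemblies on real rather than simulated data.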

  16. Data structures and compression algorithms for genomic sequence data

    PubMed Central

    Brandon, Marty C.; Wallace, Douglas C.; Baldi, Pierre

    2009-01-01

    Motivation: The continuing exponential accumulation of full genome data, including full diploid human genomes, creates new challenges not only for understanding genomic structure, function and evolution, but also for the storage, navigation and privacy of genomic data. Here, we develop data structures and algorithms for the efficient storage of genomic and other sequence data that may also facilitate querying and protecting the data. Results: The general idea is to encode only the differences between a genome sequence and a reference sequence, using absolute or relative coordinates for the location of the differences. These locations and the corresponding differential variants can be encoded into binary strings using various entropy coding methods, from fixed codes such as Golomb and Elias codes, to variable codes, such as Huffman codes. We demonstrate the approach and various tradeoffs using highly variable human mitochondrial genome sequences as a testbed. With only a partial level of optimization, 3615 genome sequences occupying 56 MB in GenBank are compressed down to only 167 KB, achieving a 345-fold compression rate, using the revised Cambridge Reference Sequence as the reference sequence. Using the consensus sequence as the reference sequence, the data can be stored using only 133 KB, corresponding to a 433-fold level of compression, roughly a 23% improvement. Extensions to nuclear genomes and high-throughput sequencing data are discussed. Availability: Data are publicly available from GenBank, the HapMap web site, and the MITOMAP database. Supplementary materials with additional results, statistics, and software implementations are available from http://mammag.web.uci.edu/bin/view/Mitowiki/ProjectDNACompression. Contact: pfbaldi@ics.uci.edu PMID:19447783
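
    The general idea in the abstract above (store only differences from a reference, with relative coordinates written in an Elias code) can be illustrated with a toy sketch. It is not the authors' implementation: it assumes equal-length sequences that differ only by substitutions, uses the Elias gamma code for the gap between consecutive differences, and spends a fixed 2 bits on each substituted base. All function names are hypothetical.

```python
BASES = "ACGT"


def elias_gamma(n):
    """Elias gamma code of a positive integer, as a '0'/'1' string."""
    if n < 1:
        raise ValueError("Elias gamma is defined for positive integers only")
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary


def encode_diff(reference, sample):
    """Encode substitutions of sample relative to reference.

    Each difference is stored as gamma(gap to previous difference)
    followed by 2 bits for the new base (sample must use A/C/G/T only).
    """
    if len(reference) != len(sample):
        raise ValueError("this sketch handles substitutions only")
    bits = []
    prev = -1
    for pos, (r, s) in enumerate(zip(reference, sample)):
        if r != s:
            bits.append(elias_gamma(pos - prev))  # relative coordinate, >= 1
            bits.append(format(BASES.index(s), "02b"))
            prev = pos
    return "".join(bits)


def decode_diff(reference, bits):
    """Rebuild the sample from the reference and the encoded bit string."""
    seq = list(reference)
    i, pos = 0, -1
    while i < len(bits):
        zeros = 0
        while bits[i] == "0":  # unary length prefix of the gamma code
            zeros += 1
            i += 1
        pos += int(bits[i:i + zeros + 1], 2)
        i += zeros + 1
        seq[pos] = BASES[int(bits[i:i + 2], 2)]
        i += 2
    return "".join(seq)


if __name__ == "__main__":
    ref, smp = "ACGTACGTAC", "ACGTTCGTAA"
    code = encode_diff(ref, smp)
    print(code, decode_diff(ref, code) == smp)
```

    Relative coordinates keep the encoded integers small when differences cluster, which is what makes short-codeword schemes such as Elias or Golomb codes attractive here. Insertions and deletions would need an extra operation field that this sketch omits.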

  17. The Brachypodium genome sequence: a resource for oat genomics research

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Oat (Avena sativa) is an important cereal crop used as both an animal feed and for human consumption. Genetic and genomic research on oat is hindered because it is hexaploid and possesses a large (13 Gb) genome. Diploid Avena relatives have been employed for genetic and genomic studies, but only mod...

  18. Reference genome sequence of the model plant Setaria

    SciTech Connect

    Bennetzen, Jeffrey L [ORNL; Yang, Xiaohan [ORNL; Ye, Chuyu [ORNL; Tuskan, Gerald A [ORNL

    2012-01-01

    We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

  19. Publications | Office of Cancer Genomics

    Cancer.gov

    Philadelphia chromosome-like acute lymphoblastic leukemia was found to be characterized by a range of genomic alterations that activate a limited number of signaling pathways, all of which may be amenable to inhibition with approved tyrosine kinase inhibitors.

  20. Metastatic tumor evolution and organoid modeling implicate TGFBR2 as a cancer driver in diffuse gastric cancer | Office of Cancer Genomics

    Cancer.gov

    Gastric cancer is the second-leading cause of global cancer deaths, with metastatic disease representing the primary cause of mortality. To identify candidate drivers involved in oncogenesis and tumor evolution, we conduct an extensive genome sequencing analysis of metastatic progression in a diffuse gastric cancer. This involves a comparison between a primary tumor from a hereditary diffuse gastric cancer syndrome proband and its recurrence as an ovarian metastasis.

  1. Complete genome sequence of Gordonia bronchialis type strain (3410T)

    SciTech Connect

    Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Jando, Marlen [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Chen, Feng [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL); Saunders, Elizabeth H [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Detter, J C [U.S. Department of Energy, Joint Genome Institute; Brettin, Thomas S [ORNL; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute

    2010-01-01

    Gordonia bronchialis Tsukamura 1971 is the type species of the genus. G. bronchialis is a human-pathogenic organism that has been isolated from a large variety of human tissues. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the family Gordoniaceae. The 5,290,012 bp long genome with its 4,944 protein-coding and 55 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  2. Community-wide analysis of microbial genome sequence signatures

    Microsoft Academic Search

    Gregory J Dick; Anders F Andersson; Brett J Baker; Sheri L Simmons; Brian C Thomas; A Pepper Yelton; Jillian F Banfield

    2009-01-01

    Background: Analyses of DNA sequences from cultivated microorganisms have revealed genome-wide, taxa-specific nucleotide compositional characteristics, referred to as genome signatures. These signatures have far-reaching implications for understanding genome evolution and potential application in classification of metagenomic sequence fragments. However, little is known regarding the distribution of genome signatures in natural microbial communities or the extent to which environmental factors shape them.

  3. Complete genome sequence of Spirosoma linguale type strain (1T)

    SciTech Connect

    Lail, Kathleen [U.S. Department of Energy, Joint Genome Institute]; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Saunders, Elizabeth H [Los Alamos National Laboratory (LANL)]; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute]; Copeland, A [U.S. Department of Energy, Joint Genome Institute]; Tice, Hope [U.S. Department of Energy, Joint Genome Institute]; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute]; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Bruce, David [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [U.S. Department of Energy, Joint Genome Institute]; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Chang, Yun-Juan [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL)]; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute]; Schutze, Andrea [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Tindall, Brian [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Bristow, James [U.S. Department of Energy, Joint Genome Institute]; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute]; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Chen, Feng [U.S. Department of Energy, Joint Genome Institute]

    2010-01-01

    Spirosoma linguale Migula 1894 is the type species of the genus. S. linguale is a free-living and non-pathogenic organism, known for its peculiar ringlike and horseshoe-shaped cell morphology. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is only the third completed genome sequence of a member of the family Cytophagaceae. The 8,491,258 bp long genome with its eight plasmids, 7,069 protein-coding and 60 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  4. Genetic and Clonal Dissection of Murine Small Cell Lung Carcinoma Progression by Genome Sequencing

    PubMed Central

    McFadden, David G.; Papagiannakopoulos, Thales; Taylor-Weiner, Amaro; Stewart, Chip; Carter, Scott L.; Cibulskis, Kristian; Bhutkar, Arjun; McKenna, Aaron; Dooley, Alison; Vernon, Amanda; Sougnez, Carrie; Malstrom, Scott; Heimann, Megan; Park, Jennifer; Chen, Frances; Farago, Anna F.; Dayton, Talya; Shefler, Erica; Gabriel, Stacey; Getz, Gad; Jacks, Tyler

    2014-01-01

    Summary Small cell lung carcinoma (SCLC) is a highly lethal, smoking-associated cancer with few known targetable genetic alterations. Using genome sequencing, we characterized the somatic evolution of a genetically engineered mouse model (GEMM) of SCLC initiated by loss of Trp53 and Rb1. We identified alterations in DNA copy number and complex genomic rearrangements and demonstrated a low somatic point mutation frequency in the absence of tobacco mutagens. Alterations targeting the tumor suppressor Pten occurred in the majority of murine SCLC studied, and engineered Pten deletion accelerated murine SCLC and abrogated loss of Chr19 in Trp53; Rb1; Pten compound mutant tumors. Finally, we found evidence for polyclonal and sequential metastatic spread of murine SCLC by comparative sequencing of families of related primary tumors and metastases. We propose a temporal model of SCLC tumorigenesis with implications for human SCLC therapeutics and the nature of cancer-genome evolution in GEMMs. PMID:24630729

  5. STATUS OF THE RB51 GENOME SEQUENCING PROJECT

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The shotgun sequencing of the B. abortus vaccine strain, RB51 genome is nearly complete. Thus far, approximately 49,000 recombinant clones have been sequenced, generating approximately 34,300,000-bp of raw DNA sequence data. The resulting data has been compiled and aligned using the B. abortus st...

  6. Discrete-Length Repeated Sequences in Eukaryotic Genomes

    Microsoft Academic Search

    William R. Pearson; John F. Morrow

    1981-01-01

    Two of the four repeated DNA sequences near the 5' end of the silk fibroin gene hybridize with discrete-length families of repeated DNA. These two families comprise 0.5% of the animal's genome. A repeated sequence with a conserved length has also been found in the short class of moderately repeated sequences in the sea urchin. The discrete length, interspersion, and

  7. prot4EST: Translating Expressed Sequence Tags from neglected genomes

    Microsoft Academic Search

    James D Wasmuth; Mark L Blaxter

    2004-01-01

    Background: The genomes of an increasing number of species are being investigated through generation of expressed sequence tags (ESTs). However, ESTs are prone to sequencing errors and typically define incomplete transcripts, making downstream annotation difficult. Annotation would be greatly improved with robust polypeptide translations. Many current solutions for EST translation require a large number of full-length gene sequences for training

  8. Lacunarity Analysis of Genomic Sequences: A Potential Bio-Sequence Analysis Method

    Microsoft Academic Search

    Gopakumar G; Achuthsankar S. Nair

    2011-01-01

    This paper proposes the use of lacunarity analysis of genomic sequences as a potential bio-sequence analysis method. In the present work the fractal property of DNA sequences is confirmed using the lacunarity analysis of their Chaos Game Representation matrices. In another study, the distribution of various n-mers in a genomic sequence is investigated based on the lacunarity analysis of one-dimensional

  9. Genomic Treasure Troves: Complete Genome Sequencing of Herbarium and Insect Museum Specimens

    PubMed Central

    Staats, Martijn; Erkens, Roy H. J.; van de Vossenberg, Bart; Wieringa, Jan J.; Kraaijeveld, Ken; Stielow, Benjamin; Geml, József; Richardson, James E.; Bakker, Freek T.

    2013-01-01

    Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22–82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4–97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2–71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. 
We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more horizontal. Furthermore, NGS of historical DNA enables recovering crucial genetic information from old type specimens that to date have remained mostly unutilized and, thus, opens up a new frontier for taxonomic research as well. PMID:23922691

  10. Complete genome sequence of Ferroglobus placidus AEDII12DO

    PubMed Central

    Anderson, Iain; Risso, Carla; Holmes, Dawn; Lucas, Susan; Copeland, Alex; Lapidus, Alla; Cheng, Jan-Fang; Bruce, David; Goodwin, Lynne; Pitluck, Samuel; Saunders, Elizabeth; Brettin, Thomas; Detter, John C.; Han, Cliff; Tapia, Roxanne; Larimer, Frank; Land, Miriam; Hauser, Loren; Woyke, Tanja; Lovley, Derek; Kyrpides, Nikos; Ivanova, Natalia

    2011-01-01

    Ferroglobus placidus belongs to the order Archaeoglobales within the archaeal phylum Euryarchaeota. Strain AEDII12DO is the type strain of the species and was isolated from a shallow marine hydrothermal system at Vulcano, Italy. It is a hyperthermophilic, anaerobic chemolithoautotroph, but it can also use a variety of aromatic compounds as electron donors. Here we describe the features of this organism together with the complete genome sequence and annotation. The 2,196,266 bp genome with its 2,567 protein-coding and 55 RNA genes was sequenced as part of a DOE Joint Genome Institute Laboratory Sequencing Program (LSP) project. PMID:22180810

  11. Application of massive parallel sequencing to whole genome SNP discovery in the porcine genome

    Microsoft Academic Search

    Andreia J Amaral; Hendrik-Jan Megens; Hindrik HD Kerstens; Henri CM Heuven; Bert Dibbits; Richard PMA Crooijmans; Johan T den Dunnen; Martien AM Groenen

    2009-01-01

    BACKGROUND: Although the Illumina 1 G Genome Analyzer generates billions of base pairs of sequence data, challenges arise in sequence selection due to the varying sequence quality. Therefore, in the framework of the International Porcine SNP Chip Consortium, this pilot study aimed to evaluate the impact of the quality level of the sequenced bases on mapping quality and identification of

  12. National Institutes of Health to Map Genomic Changes of Lung, Brain, and Ovarian Cancers | Office of Cancer Genomics

    Cancer.gov

    National Institutes of Health to Map Genomic Changes of Lung, Brain, and Ovarian Cancers The National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), both part of the National Institutes of Health (NIH), today announced the first three cancers that will be studied in the pilot phase of The Cancer Genome Atlas (TCGA) project. The cancers to be studied in the TCGA Pilot Project are lung, brain (glioblastoma), and ovarian.

  13. Characterization of Three Mycobacterium spp. with Potential Use in Bioremediation by Genome Sequencing and Comparative Genomics.

    PubMed

    Das, Sarbashis; Pettersson, B M Fredrik; Behra, Phani Rama Krishna; Ramesh, Malavika; Dasgupta, Santanu; Bhattacharya, Alok; Kirsebom, Leif A

    2015-01-01

    We provide the genome sequences of the type strains of the polychlorophenol-degrading Mycobacterium chlorophenolicum (DSM43826), the degrader of chlorinated aliphatics Mycobacterium chubuense (DSM44219) and Mycobacterium obuense (DSM44075) that has been tested for use in cancer immunotherapy. The genome sizes of M. chlorophenolicum, M. chubuense, and M. obuense are 6.93, 5.95, and 5.58 Mb with GC-contents of 68.4%, 69.2%, and 67.9%, respectively. Comparative genomic analysis revealed that 3,254 genes are common and we predicted approximately 250 genes acquired through horizontal gene transfer from different sources including proteobacteria. The data also showed that the biodegrading Mycobacterium spp. NBB4, also referred to as M. chubuense NBB4, is distantly related to the M. chubuense type strain and should be considered as a separate species, we suggest it to be named Mycobacterium ethylenense NBB4. Among different categories we identified genes with potential roles in: biodegradation of aromatic compounds and copper homeostasis. These are the first nonpathogenic Mycobacterium spp. found harboring genes involved in copper homeostasis. These findings would therefore provide insight into the role of this group of Mycobacterium spp. in bioremediation as well as the evolution of copper homeostasis within the Mycobacterium genus. PMID:26079817

  14. Draft Genome Sequence of Stenotrophomonas maltophilia Strain UV74 Reveals Extensive Variability within Its Genomic Group

    PubMed Central

    Conchillo-Solé, Oscar; Yero, Daniel; Coves, Xavier; Huedo, Pol; Martínez-Servat, Sònia

    2015-01-01

    We report the draft genome sequence of Stenotrophomonas maltophilia UV74, isolated from a vascular ulcer. This draft genome sequence shall contribute to the understanding of the evolution and pathogenicity of this species, particularly regarding isolates of clinical origin. PMID:26067959

  15. A physical map of the papaya genome with integrated genetic map and genome sequence

    Microsoft Academic Search

    Qingyi Yu; Eric Tong; Rachel L Skelton; John E Bowers; Meghan R Jones; Jan E Murray; Shaobin Hou; Peizhu Guan; Ricelle A Acob; Ming-Cheng Luo; Paul H Moore; Maqsudul Alam; Andrew H Paterson; Ray Ming

    2009-01-01

    BACKGROUND: Papaya is a major fruit crop in tropical and subtropical regions worldwide and has primitive sex chromosomes controlling sex determination in this trioecious species. The papaya genome was recently sequenced because of its agricultural importance, unique biological features, and successful application of transgenic papaya for resistance to papaya ringspot virus. As a part of the genome sequencing project, we

  16. Draft Genome Sequence of Stenotrophomonas maltophilia Strain UV74 Reveals Extensive Variability within Its Genomic Group.

    PubMed

    Conchillo-Solé, Oscar; Yero, Daniel; Coves, Xavier; Huedo, Pol; Martínez-Servat, Sònia; Daura, Xavier; Gibert, Isidre

    2015-01-01

    We report the draft genome sequence of Stenotrophomonas maltophilia UV74, isolated from a vascular ulcer. This draft genome sequence shall contribute to the understanding of the evolution and pathogenicity of this species, particularly regarding isolates of clinical origin. PMID:26067959

  17. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

    SciTech Connect

    Hu, Tina T.; Pattyn, Pedro; Bakker, Erica G.; Cao, Jun; Cheng, Jan-Fang; Clark, Richard M.; Fahlgren, Noah; Fawcett, Jeffrey A.; Grimwood, Jane; Gundlach, Heidrun; Haberer, Georg; Hollister, Jesse D.; Ossowski, Stephan; Ottilar, Robert P.; Salamov, Asaf A.; Schneeberger, Korbinian; Spannagl, Manuel; Wang, Xi; Yang, Liang; Nasrallah, Mikhail E.; Bergelson, Joy; Carrington, James C.; Gaut, Brandon S.; Schmutz, Jeremy; Mayer, Klaus F. X.; Van de Peer, Yves; Grigoriev, Igor V.; Nordborg, Magnus; Weigel, Detlef; Guo, Ya-Long

    2011-04-29

    In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies, etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how A. thaliana lost half its genome in a few million years and lived to tell the tale. The rather surprising conclusion is that there is not a single genomic feature that accounts for the reduced genome, but that every aspect (centromeres, intergenic regions, transposable elements, gene family number) is affected through hundreds of thousands of cuts. This strongly suggests that overall genome size in itself is what has been under selection, a suggestion that is strongly supported by our demonstration (using population genetics data from A. thaliana) that new deletions seem to be driven to fixation.

  18. Accurate whole human genome sequencing using reversible terminator chemistry.

    PubMed

    Bentley, David R; Balasubramanian, Shankar; Swerdlow, Harold P; Smith, Geoffrey P; Milton, John; Brown, Clive G; Hall, Kevin P; Evers, Dirk J; Barnes, Colin L; Bignell, Helen R; Boutell, Jonathan M; Bryant, Jason; Carter, Richard J; Keira Cheetham, R; Cox, Anthony J; Ellis, Darren J; Flatbush, Michael R; Gormley, Niall A; Humphray, Sean J; Irving, Leslie J; Karbelashvili, Mirian S; Kirk, Scott M; Li, Heng; Liu, Xiaohai; Maisinger, Klaus S; Murray, Lisa J; Obradovic, Bojan; Ost, Tobias; Parkinson, Michael L; Pratt, Mark R; Rasolonjatovo, Isabelle M J; Reed, Mark T; Rigatti, Roberto; Rodighiero, Chiara; Ross, Mark T; Sabot, Andrea; Sankar, Subramanian V; Scally, Aylwyn; Schroth, Gary P; Smith, Mark E; Smith, Vincent P; Spiridou, Anastassia; Torrance, Peta E; Tzonev, Svilen S; Vermaas, Eric H; Walter, Klaudia; Wu, Xiaolin; Zhang, Lu; Alam, Mohammed D; Anastasi, Carole; Aniebo, Ify C; Bailey, David M D; Bancarz, Iain R; Banerjee, Saibal; Barbour, Selena G; Baybayan, Primo A; Benoit, Vincent A; Benson, Kevin F; Bevis, Claire; Black, Phillip J; Boodhun, Asha; Brennan, Joe S; Bridgham, John A; Brown, Rob C; Brown, Andrew A; Buermann, Dale H; Bundu, Abass A; Burrows, James C; Carter, Nigel P; Castillo, Nestor; Chiara E Catenazzi, Maria; Chang, Simon; Neil Cooley, R; Crake, Natasha R; Dada, Olubunmi O; Diakoumakos, Konstantinos D; Dominguez-Fernandez, Belen; Earnshaw, David J; Egbujor, Ugonna C; Elmore, David W; Etchin, Sergey S; Ewan, Mark R; Fedurco, Milan; Fraser, Louise J; Fuentes Fajardo, Karin V; Scott Furey, W; George, David; Gietzen, Kimberley J; Goddard, Colin P; Golda, George S; Granieri, Philip A; Green, David E; Gustafson, David L; Hansen, Nancy F; Harnish, Kevin; Haudenschild, Christian D; Heyer, Narinder I; Hims, Matthew M; Ho, Johnny T; Horgan, Adrian M; Hoschler, Katya; Hurwitz, Steve; Ivanov, Denis V; Johnson, Maria Q; James, Terena; Huw Jones, T A; Kang, Gyoung-Dong; Kerelska, Tzvetana H; Kersey, Alan D; Khrebtukova, Irina; Kindwall, Alex P; Kingsbury, 
Zoya; Kokko-Gonzales, Paula I; Kumar, Anil; Laurent, Marc A; Lawley, Cynthia T; Lee, Sarah E; Lee, Xavier; Liao, Arnold K; Loch, Jennifer A; Lok, Mitch; Luo, Shujun; Mammen, Radhika M; Martin, John W; McCauley, Patrick G; McNitt, Paul; Mehta, Parul; Moon, Keith W; Mullens, Joe W; Newington, Taksina; Ning, Zemin; Ling Ng, Bee; Novo, Sonia M; O'Neill, Michael J; Osborne, Mark A; Osnowski, Andrew; Ostadan, Omead; Paraschos, Lambros L; Pickering, Lea; Pike, Andrew C; Pike, Alger C; Chris Pinkard, D; Pliskin, Daniel P; Podhasky, Joe; Quijano, Victor J; Raczy, Come; Rae, Vicki H; Rawlings, Stephen R; Chiva Rodriguez, Ana; Roe, Phyllida M; Rogers, John; Rogert Bacigalupo, Maria C; Romanov, Nikolai; Romieu, Anthony; Roth, Rithy K; Rourke, Natalie J; Ruediger, Silke T; Rusman, Eli; Sanches-Kuiper, Raquel M; Schenker, Martin R; Seoane, Josefina M; Shaw, Richard J; Shiver, Mitch K; Short, Steven W; Sizto, Ning L; Sluis, Johannes P; Smith, Melanie A; Ernest Sohna Sohna, Jean; Spence, Eric J; Stevens, Kim; Sutton, Neil; Szajkowski, Lukasz; Tregidgo, Carolyn L; Turcatti, Gerardo; Vandevondele, Stephanie; Verhovsky, Yuli; Virk, Selene M; Wakelin, Suzanne; Walcott, Gregory C; Wang, Jingwen; Worsley, Graham J; Yan, Juying; Yau, Ling; Zuerlein, Mike; Rogers, Jane; Mullikin, James C; Hurles, Matthew E; McCooke, Nick J; West, John S; Oaks, Frank L; Lundberg, Peter L; Klenerman, David; Durbin, Richard; Smith, Anthony J

    2008-11-01

    DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally used long (400-800 base pair) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intraspecies genetic variation. Here we report an approach that generates several billion bases of accurate nucleotide sequence per experiment at low cost. Single molecules of DNA are attached to a flat surface, amplified in situ and used as templates for synthetic sequencing with fluorescent reversible terminator deoxyribonucleotides. Images of the surface are analysed to generate high-quality sequence. We demonstrate application of this approach to human genome sequencing on flow-sorted X chromosomes and then scale the approach to determine the genome sequence of a male Yoruba from Ibadan, Nigeria. We build an accurate consensus sequence from >30x average depth of paired 35-base reads. We characterize four million single-nucleotide polymorphisms and four hundred thousand structural variants, many of which were previously unknown. Our approach is effective for accurate, rapid and economical whole-genome re-sequencing and many other biomedical applications. PMID:18987734

  19. Complete Genome Sequence of Staphylococcus aureus Tager 104, a Sequence Type 49 Ancestor

    PubMed Central

    Davis, Richard; Hossain, Mohammad J.; Liles, Mark R.

    2013-01-01

    We report here the complete genome sequence of Staphylococcus aureus Tager 104, originally isolated from a cutaneous abscess in 1947 by Morris Tager. Sequence typing of the strain revealed its membership in sequence type 49 (ST49), a previously unknown multilocus sequence type (MLST) in clinical samples. PMID:24029757

  20. Transforming activity of human papillomavirus type 16 DNA sequence in a cervical cancer.

    PubMed Central

    Tsunokawa, Y; Takebe, N; Kasamatsu, T; Terada, M; Sugimura, T

    1986-01-01

    A genomic DNA sample from cervical cancer tissue, containing human papillomavirus (HPV) type 16, was found to induce malignant transformation of NIH 3T3 cells when it was tested by transfection assays using the calcium phosphate coprecipitation technique. The primary and secondary transformants contained the HPV type 16 DNA sequences and human specific Alu family sequences. To the best of our knowledge, it has not been reported previously that HPV type 16 DNA sequences in total genomic DNA from a cervical cancer have transforming activity. PMID:3008153

  1. Assembly of large genomes using second-generation sequencing

    PubMed Central

    Schatz, Michael C.; Delcher, Arthur L.; Salzberg, Steven L.

    2010-01-01

    Second-generation sequencing technology can now be used to sequence an entire human genome in a matter of days and at low cost. Sequence read lengths, initially very short, have rapidly increased since the technology first appeared, and we now are seeing a growing number of efforts to sequence large genomes de novo from these short reads. In this Perspective, we describe the issues associated with short-read assembly, the different types of data produced by second-gen sequencers, and the latest assembly algorithms designed for these data. We also review the genomes that have been assembled recently from short reads and make recommendations for sequencing strategies that will yield a high-quality assembly. PMID:20508146

  2. Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae

    Microsoft Academic Search

    Hervé Tettelin; Vega Masignani; Michael J. Cieslewicz; Jonathan A. Eisen; Scott Peterson; Michael R. Wessels; Ian T. Paulsen; Karen E. Nelson; Immaculada Margarit; Timothy D. Read; Lawrence C. Madoff; Alex M. Wolf; Maureen J. Beanan; Lauren M. Brinkac; Sean C. Daugherty; Robert T. Deboy; A. Scott Durkin; James F. Kolonay; Ramana Madupu; Matthew R. Lewis; Diana Radune; Nadezhda B. Fedorova; David Scanlan; Hoda Khouri; Stephanie Mulligan; Heather A. Carty; Robin T. Cline; Susan E. van Aken; John Gill; Maria Scarselli; Marirosa Mora; Emilia T. Iacobini; Cecilia Brettoni; Giuliano Galli; Massimo Mariani; Filippo Vegni; Domenico Maione; Daniela Rinaudo; Rino Rappuoli; John L. Telford; Dennis L. Kasper; Guido Grandi; Claire M. Fraser

    2002-01-01

    The 2,160,267 bp genome sequence of Streptococcus agalactiae, the leading cause of bacterial sepsis, pneumonia, and meningitis in neonates in the U.S. and Europe, is predicted to encode 2,175 genes. Genome comparisons among S. agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes, and the other completely sequenced genomes identified genes specific to the streptococci and to S. agalactiae. These in silico analyses, combined

  3. A survey of tools for variant analysis of next-generation genome sequencing data

    PubMed Central

    Pabinger, Stephan; Dander, Andreas; Fischer, Maria; Snajder, Rene; Sperk, Michael; Efremova, Mirjana; Krabichler, Birgit; Speicher, Michael R.; Zschocke, Johannes

    2014-01-01

    Recent advances in genome sequencing technologies provide unprecedented opportunities to characterize individual genomic landscapes and identify mutations relevant for diagnosis and therapy. Specifically, whole-exome sequencing using next-generation sequencing (NGS) technologies is gaining popularity in the human genetics community due to the moderate costs, manageable data amounts and straightforward interpretation of analysis results. While whole-exome and, in the near future, whole-genome sequencing are becoming commodities, data analysis still poses significant challenges and led to the development of a plethora of tools supporting specific parts of the analysis workflow or providing a complete solution. Here, we surveyed 205 tools for whole-genome/whole-exome sequencing data analysis supporting five distinct analytical steps: quality assessment, alignment, variant identification, variant annotation and visualization. We report an overview of the functionality, features and specific requirements of the individual tools. We then selected 32 programs for variant identification, variant annotation and visualization, which were subjected to hands-on evaluation using four data sets: one set of exome data from two patients with a rare disease for testing identification of germline mutations, two cancer data sets for testing variant callers for somatic mutations, copy number variations and structural variations, and one semi-synthetic data set for testing identification of copy number variations. Our comprehensive survey and evaluation of NGS tools provides a valuable guideline for human geneticists working on Mendelian disorders, complex diseases and cancers. PMID:23341494

  4. Genome sequencing and analysis of the model grass Brachypodium distachyon

    SciTech Connect

    Yang, Xiaohan [ORNL]; Kalluri, Udaya C [ORNL]; Tuskan, Gerald A [ORNL]

    2010-01-01

    Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum genomes shows a precise history of genome evolution across a broad diversity of the grasses, and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat. The high-quality genome sequence, coupled with ease of cultivation and transformation, small size and rapid life cycle, will help Brachypodium reach its potential as an important model system for developing new energy and food crops.

  5. Complete genome sequence of Cellulomonas flavigena type strain (134T)

    SciTech Connect

    Abt, Birte [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Foster, Brian [U.S. Department of Energy, Joint Genome Institute]; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]; Clum, Alicia [U.S. Department of Energy, Joint Genome Institute]; Sun, Hui [U.S. Department of Energy, Joint Genome Institute]; Pukall, Rudiger [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute]; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Tice, Hope [U.S. Department of Energy, Joint Genome Institute]; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Chen, Amy [U.S. Department of Energy, Joint Genome Institute]; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Chang, Yun-Juan [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute]; Bristow, James [U.S. Department of Energy, Joint Genome Institute]; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute]; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]

    2010-01-01

    Cellulomonas flavigena (Kellerman and McBeth 1912) Bergey et al. 1923 is the type species of the genus Cellulomonas of the actinobacterial family Cellulomonadaceae. Members of the genus Cellulomonas are of special interest for their ability to degrade cellulose and hemicellulose, particularly with regard to the use of biomass as an alternative energy source. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of a member of the genus Cellulomonas, and next to the human pathogen Tropheryma whipplei the second complete genome sequence within the actinobacterial family Cellulomonadaceae. The 4,123,179 bp long single replicon genome with its 3,735 protein-coding and 53 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  6. Genome sequencing and analysis of the model grass Brachypodium distachyon.

    PubMed

    2010-02-11

    Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum genomes shows a precise history of genome evolution across a broad diversity of the grasses, and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat. The high-quality genome sequence, coupled with ease of cultivation and transformation, small size and rapid life cycle, will help Brachypodium reach its potential as an important model system for developing new energy and food crops. PMID:20148030

  7. The Release 6 reference sequence of the Drosophila melanogaster genome

    PubMed Central

    Carlson, Joseph W.; Wan, Kenneth H.; Park, Soo; Mendez, Ivonne; Galle, Samuel E.; Booth, Benjamin W.; Pfeiffer, Barret D.; George, Reed A.; Svirskas, Robert; Krzywinski, Martin; Schein, Jacqueline; Accardo, Maria Carmela; Damia, Elisabetta; Messina, Giovanni; Méndez-Lago, María; de Pablos, Beatriz; Demakova, Olga V.; Andreyeva, Evgeniya N.; Boldyreva, Lidiya V.; Marra, Marco; Carvalho, A. Bernardo; Dimitri, Patrizio; Villasante, Alfredo; Zhimulev, Igor F.; Rubin, Gerald M.; Karpen, Gary H.

    2015-01-01

    Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and completeness of this sequence continues to be important to further progress. We previously described improvement of the 117-Mb sequence in the euchromatic portion of the genome and 21 Mb in the heterochromatic portion, using a whole-genome shotgun assembly, BAC physical mapping, and clone-based finishing. Here, we report an improved reference sequence of the single-copy and middle-repetitive regions of the genome, produced using cytogenetic mapping to mitotic and polytene chromosomes, clone-based finishing and BAC fingerprint verification, ordering of scaffolds by alignment to cDNA sequences, incorporation of other map and sequence data, and validation by whole-genome optical restriction mapping. These data substantially improve the accuracy and completeness of the reference sequence and the order and orientation of sequence scaffolds into chromosome arm assemblies. Representation of the Y chromosome and other heterochromatic regions is particularly improved. The new 143.9-Mb reference sequence, designated Release 6, effectively exhausts clone-based technologies for mapping and sequencing. Highly repeat-rich regions, including large satellite blocks and functional elements such as the ribosomal RNA genes and the centromeres, are largely inaccessible to current sequencing and assembly methods and remain poorly represented. Further significant improvements will require sequencing technologies that do not depend on molecular cloning and that produce very long reads. PMID:25589440

  8. Neuroblastoma | Office of Cancer Genomics

    Cancer.gov

    Neuroblastoma (NBL) is a cancer that arises in immature nerve cells of the sympathetic nervous system, primarily affecting infants and children. It can have a devastating impact on patients and their families.

  9. Targeted deep resequencing of the human cancer genome using next-generation technologies

    PubMed Central

    Myllykangas, Samuel; Ji, Hanlee P.

    2015-01-01

    Next generation sequencing technologies have revolutionized our ability to identify genetic variants, either germline or somatic point mutations, that occur in cancer. Parallelization and miniaturization of DNA sequencing enables massive data throughput and for the first time, large-scale, base pair resolution views of cancer genomes can be achieved. Systematic, large-scale sequencing surveys have revealed that the genetic spectrum of mutations in cancers appears to be highly complex with numerous low frequency bystander somatic variations, and a limited number of common, frequently mutated genes. Large sample sizes and deeper resequencing are much needed in resolving clinical and biological relevance of the mutations as well as in detecting somatic variants in heterogeneous samples and cancer cell sub-populations. However, even with the next generation sequencing technologies, the overwhelming size of the human genome and need for very high fold coverage represents a major challenge for up-scaling cancer genome sequencing projects. Assays to target, capture, enrich or partition disease-specific regions of the genome offer immediate solutions for reducing the complexity of the sequencing libraries. Integration of targeted DNA capture assays and next-generation deep resequencing improves the ability to identify clinically and biologically relevant mutations. PMID:21415896

  10. Research | Office of Cancer Genomics

    Cancer.gov

    The CTD2 initiative seeks novel insights into cancer etiology that can be developed and in the future applied to improve therapeutic strategies. To achieve this goal, each center utilizes a distinct array of advanced computational and functional systems biology approaches. These methods allow reconstruction of cell-context specific gene networks that underlie each cancer subtype. The CTD2 Centers gain power from having both complementary and reinforcing expertise.

  11. Dissection of the octoploid strawberry genome by deep sequencing of the genomes of Fragaria species.

    PubMed

    Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mistuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N

    2014-01-01

    Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis, by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ~200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb with the N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species. PMID:24282021

  12. Mutational hotspots in the mitochondrial genome of lung cancer.

    PubMed

    Choi, So-Jung; Kim, Sung-Hyun; Kang, Ho Y; Lee, Jinseon; Bhak, Jong H; Sohn, Insuk; Jung, Sin-Ho; Choi, Yong Soo; Kim, Hong Kwan; Han, Jungho; Huh, Nam; Lee, Gyusang; Kim, Byung C; Kim, Jhingook

    2011-04-01

    We determined the somatic mutations in the mitochondrial genomes of 70 lung cancer patients by pair-wise comparative analyses of the normal- and tumor-genome sequences acquired using Affymetrix Mitochondrial Resequencing Array 2.0. The overall mutation rates in lung cancers were approximately 100-fold higher than those in normal cells, with significant statistical correlation with smoking (p=0.00088). A total of 532 somatic mutations were evenly distributed in 499 positions with very low overall frequency (1.07/bp), but the non-synonymous mutations causing amino acid substitution occurred more frequently (1.83/bp), particularly at two positions, 8701 and 10398 (10.5/bp), that code for ATPase6 and NADH dehydrogenase 3, respectively. Despite the randomness or even distribution of the mutations, these two mutations occurred together in 86% of the cases. The linkage between the two most frequent mutations suggests that they were selected together, possibly due to their cooperative role during cancer development. Indeed, the mutation at 10398 was shown by Canter, Pezzotti, and their colleagues in 2009 to be a risk factor for breast cancer. In this study, we identified two potential biomarkers that might be functionally linked together during the development of cancer. PMID:21334307

  13. Draft Genome Sequence of Pseudomonas syringae pv. persicae NCPPB 2254.

    PubMed

    Zhao, Wenjun; Jiang, Hongshan; Tian, Qian; Hu, Jie

    2015-01-01

    Pseudomonas syringae pv. persicae is a pathogen that causes bacterial decline of stone fruit. Here, we report the draft genome sequence for P. syringae pv. persicae, which was isolated from Prunus persica. PMID:26044420

  14. Complete Genome Sequence of Staphylococcus aureus Phage GRCS.

    PubMed

    Swift, Steven M; Nelson, Daniel C

    2014-01-01

    The Staphylococcus aureus phage GRCS was isolated from a sewage treatment facility in India and has shown potential for phage therapy in a mouse model of bacteremia. Here, we report the complete genome sequence of this bacteriophage. PMID:24723702

  15. Complete genome sequence of Rahnella aquatilis CIP 78.65.

    PubMed

    Martinez, Robert J; Bruce, David; Detter, Chris; Goodwin, Lynne A; Han, James; Han, Cliff S; Held, Brittany; Land, Miriam L; Mikhailova, Natalia; Nolan, Matt; Pennacchio, Len; Pitluck, Sam; Tapia, Roxanne; Woyke, Tanja; Sobecky, Patricia A

    2012-06-01

    Rahnella aquatilis CIP 78.65 is a gammaproteobacterium isolated from a drinking water source in Lille, France. Here we report the complete genome sequence of Rahnella aquatilis CIP 78.65, the type strain of R. aquatilis. PMID:22582378

  16. Genome Sequence of Mycoplasma hyorhinis Strain DBS 1050

    PubMed Central

    Soika, Valerii; Volokhov, Dmitriy; Simonyan, Vahan; Chizhikov, Vladimir

    2014-01-01

    Mycoplasma hyorhinis is known as one of the most prevalent contaminants of mammalian cell and tissue cultures worldwide. Here, we present the complete genome sequence of the fastidious M. hyorhinis strain DBS 1050. PMID:24604646

  17. Genome Sequence of Mycoplasma hyorhinis Strain DBS 1050.

    PubMed

    Dabrazhynetskaya, Alena; Soika, Valerii; Volokhov, Dmitriy; Simonyan, Vahan; Chizhikov, Vladimir

    2014-01-01

    Mycoplasma hyorhinis is known as one of the most prevalent contaminants of mammalian cell and tissue cultures worldwide. Here, we present the complete genome sequence of the fastidious M. hyorhinis strain DBS 1050. PMID:24604646

  18. Initial genome sequencing and analysis of multiple myeloma

    E-print Network

    Lander, Eric S.

    Multiple myeloma is an incurable malignancy of plasma cells, and its pathogenesis is poorly understood. Here we report the massively parallel sequencing of 38 tumour genomes and their comparison to matched normal DNAs. ...

  19. Draft Genome Sequence of Aneurinibacillus migulanus Strain Nagano

    PubMed Central

    Alenezi, Faizah N.; Weitz, Hedda J.; Ben Rebah, Hassen; Luptakova, Lenka; Jaspars, Marcel; Woodward, Stephen

    2015-01-01

    Aneurinibacillus migulanus is characterized by inhibition of growth of a range of plant-pathogenic bacteria and fungi. Here, we report the high-quality draft genome sequences of A. migulanus Nagano. PMID:25838487

  20. Genome sequence of the fish pathogen Flavobacterium columnare ATCC 49512

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Flavobacterium columnare is a Gram-negative, rod shaped, motile, and highly prevalent fish pathogen causing columnaris disease in freshwater fish worldwide. Here, we present the complete genome sequence of F. columnare strain ATCC 49512. ...

  1. Operational streamlining in a high-throughput genome sequencing center

    E-print Network

    Person, Kerry P. (Kerry Patrick)

    2006-01-01

    Advances in medicine rely on accurate data that is rapidly provided. It is therefore critical for the Genome Sequencing platform of the Broad Institute of MIT and Harvard to continually strive to reduce cost, improve ...

  2. Fulfilling the Promise of a Sequenced Human Genome – Part I

    SciTech Connect

    Green, Eric [National Human Genome Research Institute]

    2009-05-27

    Eric Green, scientific director of the National Human Genome Research Institute (NHGRI), gives the opening keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM on May 27, 2009. Part 1 of 2

  3. Fulfilling the Promise of a Sequenced Human Genome – Part II

    SciTech Connect

    Green, Eric [National Human Genome Research Institute]

    2009-05-27

    Eric Green, scientific director of the National Human Genome Research Institute (NHGRI), gives the opening keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM on May 27, 2009. Part 2 of 2

  4. Draft Genome Sequence of Pseudomonas syringae pv. persicae NCPPB 2254

    PubMed Central

    Zhao, Wenjun; Tian, Qian; Hu, Jie

    2015-01-01

    Pseudomonas syringae pv. persicae is a pathogen that causes bacterial decline of stone fruit. Here, we report the draft genome sequence for P. syringae pv. persicae, which was isolated from Prunus persica. PMID:26044420

  5. Compressing Genomic Sequence Fragments Using SlimGene

    NASA Astrophysics Data System (ADS)

    Kozanitis, Christos; Saunders, Chris; Kruglyak, Semyon; Bafna, Vineet; Varghese, George

    With the advent of next generation sequencing technologies, the cost of sequencing whole genomes is poised to go below $1000 per human individual in a few years. As more and more genomes are sequenced, analysis methods are undergoing rapid development, making it tempting to store sequencing data for long periods of time so that the data can be re-analyzed with the latest techniques. The challenging open research problems, huge influx of data, and rapidly improving analysis techniques have created the need to store and transfer very large volumes of data.

  6. Complete genome sequence of a novel vitivirus isolated from grapevine.

    PubMed

    Al Rwahnih, Maher; Sudarshana, Mysore R; Uyemoto, Jerry K; Rowhani, Adib

    2012-09-01

    A novel virus-like sequence from grapevine was identified by Illumina sequencing. The complete genome is 7,551 nucleotides in length, with polyadenylation at the 3' end. Translation of the sequence revealed five open reading frames (ORFs). The genomic organization was most similar to those of vitiviruses. The polymerase (ORF1) and coat protein (ORF4) genes shared 31 to 49% nucleotide and 40 to 70% amino acid sequence identities, respectively, with other grapevine vitiviruses. The virus was tentatively named grapevine virus F (GVF). PMID:22879616

  7. A compressing method for genome sequence cluster using sequence alignment

    Microsoft Academic Search

    Kwang Su Jung; Nam Hee Yu; Seung Jung Shin; Keun Ho Ryu

    2008-01-01

    After identifying the function of a protein, biologists produce new useful proteins by substituting some residues of the identified protein. These new proteins have high sequence homology (similarity). We define a sequence cluster as a cluster that is constituted of similar sequences. As another example of a sequence cluster, we consider a SNP (single nucleotide polymorphism) cluster. A SNP is

  8. Complete Genome Sequences of Helicobacter pylori Rifampin-Resistant Strains.

    PubMed

    Momynaliev, Kuvat; Chelysheva, Vera; Selezneva, Oksana; Akopian, Tatyana; Alexeev, Dmitry; Govorun, Vadim

    2013-01-01

    Here we present the complete genome sequences of two Helicobacter pylori rifampin-resistant (Rif(r)) strains (Rif1 and Rif2). Rif(r) strains were obtained by in vitro selection of H. pylori 26695 on agar plates with 20 µg/ml rifampin. The genome data provide insights on the genomic diversity of H. pylori under selection by rifampin. PMID:23833139

  9. Intra-species sequence comparisons for annotating genomes

    SciTech Connect

    Boffelli, Dario; Weer, Claire V.; Weng, Li; Lewis, Keith D.; Shoukry, Malak I.; Pachter, Lior; Keys, David N.; Rubin, Edward M.

    2004-07-15

    Analysis of sequence variation among members of a single species offers a potential approach to identify functional DNA elements responsible for biological features unique to that species. Due to its high rate of allelic polymorphism and ease of genetic manipulability, we chose the sea squirt, Ciona intestinalis, to explore intra-species sequence comparisons for genome annotation. A large number of C. intestinalis specimens were collected from four continents and a set of genomic intervals amplified, resequenced and analyzed to determine the mutation rates at each nucleotide in the sequence. We found that regions with low mutation rates efficiently demarcated functionally constrained sequences: these include a set of noncoding elements, which we showed in C. intestinalis transgenic assays to act as tissue-specific enhancers, as well as the location of coding sequences. This illustrates that comparisons of multiple members of a species can be used for genome annotation, suggesting a path for the annotation of the sequenced genomes of organisms occupying uncharacterized phylogenetic branches of the animal kingdom, and raises the possibility that the resequencing of a large number of Homo sapiens individuals might be used to annotate the human genome and identify sequences defining traits unique to our species. The sequence data from this study have been submitted to GenBank under accession nos. AY667278-AY667407.

  10. Complete Genome Sequence of Mycoplasma synoviae Strain WVU 1853T.

    PubMed

    May, Meghan A; Kutish, Gerald F; Barbet, Anthony F; Michaels, Dina L; Brown, Daniel R

    2015-01-01

    A hybrid sequence assembly of the complete Mycoplasma synoviae type strain WVU 1853(T) genome was compared to that of strain MS53. The findings support prior conclusions about M. synoviae, based on the genome of that otherwise uncharacterized field strain, and provide the first evidence of epigenetic modifications in M. synoviae. PMID:26021934

  11. Draft Genome Sequence of Rhodococcus sp. Strain 311R

    PubMed Central

    Ehsani, Elham; Jauregui, Ruy; Geffers, Robert; Jareck, Michael; Boon, Nico; Pieper, Dietmar H.

    2015-01-01

    Here, we report the draft genome sequence of Rhodococcus sp. strain 311R, which was isolated from a site contaminated with alkanes and aromatic compounds. Strain 311R shares 90% of the genome of Rhodococcus erythropolis SK121, which is the most closely related bacterium. PMID:25999565

  12. Genome sequence of the cultivated cotton Gossypium arboreum

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cotton is one of the most economically important natural fiber crops in the world, and the complex tetraploid nature of its genome (AADD, 2n = 52) makes genetic, genomic and functional analyses extremely challenging. Here we sequenced and assembled 98.3% of the 1.7-gigabase G. arboreum (AA, 2n = 26...

  13. MAIZE CHLOROTIC DWARF VIRUS GENOME SEQUENCE AND POLYPROTEIN CLEAVAGE

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genomic sequence (11.8 kb) of the severe Ohio Maize chlorotic dwarf virus isolate (MCDV-S, genus Waikavirus) was determined from overlapping cDNA clones. An approximately 400 kDa polyprotein encoded by the viral genome is post-translationally cleaved into several smaller functional proteins. Wher...

  14. Draft Genome Sequence of Entomopathogenic Serratia liquefaciens Strain FK01

    PubMed Central

    Taira, Erika; Mon, Hiroaki; Mori, Kazuki; Akasaka, Taiki; Tashiro, Kousuke; Yasunaga-Aoki, Chisa; Lee, Jae Man; Kusakabe, Takahiro

    2014-01-01

    In the present study, we determined the draft genome sequence of the entomopathogenic bacterium Serratia liquefaciens FK01, which is highly virulent to the silkworm. The draft genome is ~5.28 Mb in size, and the G+C content is 55.8%. PMID:24970828

  15. Complete Genome Sequence of Antarctic Bacterium Psychrobacter sp. Strain G

    PubMed Central

    Che, Shuai; Song, Lai; Song, Weizhi; Yang, Meng

    2013-01-01

    Here, we report the complete genome sequence of Psychrobacter sp. strain G, isolated from King George Island, Antarctica, which can produce lipolytic enzymes at low temperatures. The genomics information of this strain will facilitate the study of the physiology, cold adaptation properties, and evolution of this genus. PMID:24051316

  16. Response to ‘pervasive sequence patents cover the entire human genome’

    PubMed Central

    2014-01-01

    A response to Pervasive sequence patents cover the entire human genome by J Rosenfeld and C Mason. Genome Med 2013, 5:27. See related Correspondence by Rosenfeld and Mason, http://genomemedicine.com/content/5/3/27 and related letter by Rosenfeld and Mason, http://genomemedicine.com/content/6/2/15 PMID:25031614

  17. Draft Genome Sequence of Pseudomonas sp. nov. H2.

    PubMed

    Loftie-Eaton, Wesley; Suzuki, Haruo; Bashford, Kelsie; Heuer, Holger; Stragier, Pieter; De Vos, Paul; Settles, Matthew L; Top, Eva M

    2015-01-01

    We report the draft genome sequence of Pseudomonas sp. nov. H2, isolated from creek sediment in Moscow, ID, USA. The strain is most closely related to Pseudomonas putida. However, it has a slightly smaller genome that appears to have been impacted by horizontal gene transfer and poorly maintains IncP-1 plasmids. PMID:25838493

  18. Complete Genome Sequence of Mycoplasma synoviae Strain WVU 1853T

    PubMed Central

    Kutish, Gerald F.; Barbet, Anthony F.; Michaels, Dina L.

    2015-01-01

    A hybrid sequence assembly of the complete Mycoplasma synoviae type strain WVU 1853T genome was compared to that of strain MS53. The findings support prior conclusions about M. synoviae, based on the genome of that otherwise uncharacterized field strain, and provide the first evidence of epigenetic modifications in M. synoviae. PMID:26021934

  19. Genome Sequence of a Salinibacterium sp. Isolated from Antarctic Soil

    PubMed Central

    Shin, Seung Chul; Kim, Su Jin; Ahn, Do Hwan; Lee, Jong Kyu; Lee, Hyoungseok; Lee, Jungeun; Hong, Soon Gyu; Lee, Yung Mi

    2012-01-01

    The draft genome of Salinibacterium sp. PAMC 21357, isolated from permafrost soil of Antarctica, was determined. Here we present a 3.1-Mb draft genome sequence of Salinibacterium sp. that could provide further insight into the genetic determination of its cold-adaptive properties. PMID:22493208

  20. Draft Genome Sequences of 10 Strains of the Genus Exiguobacterium

    PubMed Central

    Chauhan, Archana; Layton, Alice C.; Pfiffner, Susan M.; Huntemann, Marcel; Copeland, Alex; Chen, Amy; Kyrpides, Nikos C.; Markowitz, Victor M.; Palaniappan, Krishna; Ivanova, Natalia; Mikhailova, Natalia; Ovchinnikova, Galina; Andersen, Evan W.; Pati, Amrita; Stamatis, Dimitrios; Reddy, T. B. K.; Shapiro, Nicole; Nordberg, Henrik P.; Cantor, Michael N.; Hua, X. Susan; Woyke, Tanja

    2014-01-01

    High-quality draft genome sequences were determined for 10 Exiguobacterium strains in order to provide insight into their evolutionary strategies for speciation and environmental adaptation. The selected genomes include psychrotrophic and thermophilic species from a range of habitats, which will allow for a comparison of metabolic pathways and stress response genes. PMID:25323723

  1. Alfresco---A Workbench for Comparative Genomic Sequence Analysis

    Microsoft Academic Search

    Niclas Jareborg; Richard Durbin

    2000-01-01

    Comparative analysis of genomic sequences provides a powerful tool for identifying regions of potential biologic function; by comparing corresponding regions of genomes from suitable species, protein coding or regulatory regions can be identified by their homology. This requires the use of several specific types of computational analysis tools. Many programs exist for these types of analysis; not many exist for

  2. Complete Genome Sequence of Lactococcus lactis subsp. cremoris A76

    PubMed Central

    Quinquis, Benoit; Ehrlich, Stanislas Dusko; Sorokin, Alexei

    2012-01-01

    We report the complete genome sequence of Lactococcus lactis subsp. cremoris A76, a dairy strain isolated from a cheese production outfit. Genome analysis detected two contiguous islands fitting to the L. lactis subsp. lactis rather than to the L. lactis subsp. cremoris lineage. This indicates the existence of genetic exchange between the diverse subspecies, presumably related to the technological process. PMID:22328746

  3. Complete genome sequence of Lactococcus lactis subsp. cremoris A76.

    PubMed

    Bolotin, Alexander; Quinquis, Benoit; Ehrlich, Stanislas Dusko; Sorokin, Alexei

    2012-03-01

    We report the complete genome sequence of Lactococcus lactis subsp. cremoris A76, a dairy strain isolated from a cheese production outfit. Genome analysis detected two contiguous islands fitting to the L. lactis subsp. lactis rather than to the L. lactis subsp. cremoris lineage. This indicates the existence of genetic exchange between the diverse subspecies, presumably related to the technological process. PMID:22328746

  4. Complete genome sequence of Pronghorn Virus, a Pestivirus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complete genome sequence of Pronghorn virus, a member of the Pestivirus genus of the Flaviviridae, was determined. The virus, originally isolated from a pronghorn antelope, had a genome of 12,287 nucleotides with a single open reading frame of 11,694 bases encoding 3898 amino acids....

  5. Complete genome sequence of pronghorn virus, a pestivirus.

    PubMed

    Neill, John D; Ridpath, Julia F; Fischer, Nicole; Grundhoff, Adam; Postel, Alexander; Becher, Paul

    2014-01-01

    The complete genome sequence of pronghorn virus, a member of the Pestivirus genus of the family Flaviviridae, was determined here. The virus, originally isolated from a pronghorn antelope, has a genome of 12,273 nucleotides, with a single open reading frame of 11,694 bases encoding 3,897 amino acids. PMID:24926058

  6. RESEARCH Open Access Genomic and small RNA sequencing of

    E-print Network

    Green, Pamela

    ... Included within the Andropogoneae are major crops such as maize, Sorghum bicolor (sorghum), sugarcane ... of sorghum as a reference genome sequence for Andropogoneae grasses Kankshita Swaminathan1,2, Magdy ... origins of Mxg, and suggest that while the repeat content of Mxg differs from sorghum, the sorghum genome ...

  7. Draft Genome Sequence of Rhodococcus sp. Strain 311R.

    PubMed

    Ehsani, Elham; Jauregui, Ruy; Geffers, Robert; Jareck, Michael; Boon, Nico; Pieper, Dietmar H; Vilchez-Vargas, Ramiro

    2015-01-01

    Here, we report the draft genome sequence of Rhodococcus sp. strain 311R, which was isolated from a site contaminated with alkanes and aromatic compounds. Strain 311R shares 90% of the genome of Rhodococcus erythropolis SK121, which is the most closely related bacterium. PMID:25999565

  8. Complete Genome Sequence of the Soil Actinomycete Kocuria rhizophila

    Microsoft Academic Search

    Hiromi Takarada; Mitsuo Sekine; Hiroki Kosugi; Yasunori Matsuo; Takatomo Fujisawa; Seiha Omata; Emi Kishi; Ai Shimizu; Naofumi Tsukatani; Satoshi Tanikawa; Nobuyuki Fujita; Shigeaki Harayama

    2008-01-01

    The soil actinomycete Kocuria rhizophila belongs to the suborder Micrococcineae, a divergent bacterial group for which only a limited amount of genomic information is currently available. K. rhizophila is also important in industrial applications; e.g., it is commonly used as a standard quality control strain for antimicrobial susceptibility testing. Sequencing and annotation of the genome of K. rhizophila DC2201 (NBRC

  9. Fractals related to long DNA sequences and complete genomes

    Microsoft Academic Search

    Bai-Lin Hao; H. C. Lee; Shu-Yu Zhang

    2000-01-01

    In visualizing very long DNA sequences, including the complete genomes of several bacteria, yeast and segments of human genes, we encounter fractal-like patterns underlying these biological objects of prominent importance. The method used here to visualize genomes of organisms may well be used as a convenient tool to trace, e.g., evolutionary relatedness of species. We describe the method and explain

  10. The Complete Genome Sequence of Mycoplasma bovis Strain Hubei-1

    Microsoft Academic Search

    Yuan Li; Huajun Zheng; Yang Liu; Yanwei Jiang; Jiuqing Xin; Wei Chen; Zhiqiang Song; Herman Tse

    2011-01-01

    Infection by Mycoplasma bovis (M. bovis) can induce diseases, such as pneumonia and otitis media in young calves and mastitis and arthritis in older animals. Here, we report the finished and annotated genome sequence of M. bovis strain Hubei-1, a strain isolated in 2008 that caused calf pneumonia on a Chinese farm. The genome of M. bovis strain Hubei-1 contains

  11. A snapshot of the emerging tomato genome sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genome of tomato (Solanum lycopersicum) is being sequenced by an international consortium of 10 countries (Korea, China, the United Kingdom, India, the Netherlands, France, Japan, Spain, Italy and the United States) as part of a larger initiative called the ‘International Solanaceae Genome Proje...

  12. Genomic sequence analysis and characterization of Sneathia amnii sp. nov

    PubMed Central

    2012-01-01

    Background Bacteria of the genus Sneathia are emerging as potential pathogens of the female reproductive tract. Species of Sneathia, which were formerly grouped with Leptotrichia, can be part of the normal microbiota of the genitourinary tracts of men and women, but they are also associated with a variety of clinical conditions including bacterial vaginosis, preeclampsia, preterm labor, spontaneous abortion, post-partum bacteremia and other invasive infections. Sneathia species also exhibit a significant correlation with sexually transmitted diseases and cervical cancer. Because Sneathia species are fastidious and rarely cultured successfully in vitro, and the genomes of members of the genus had until now not been characterized, very little is known about the physiology or the virulence of these organisms. Results Here, we describe a novel species, Sneathia amnii sp. nov, which closely resembles bacteria previously designated "Leptotrichia amnionii". As part of the Vaginal Human Microbiome Project at VCU, a vaginal isolate of S. amnii sp. nov. was identified, successfully cultured and bacteriologically cloned. The biochemical characteristics and virulence properties of the organism were examined in vitro, and the genome of the organism was sequenced, annotated and analyzed. The analysis revealed a reduced circular genome of ~1.34 Mbp, containing ~1,282 protein-coding genes. Metabolic reconstruction of the bacterium reflected its biochemical phenotype, and several genes potentially associated with pathogenicity were identified. Conclusions Bacteria with complex growth requirements frequently remain poorly characterized and, as a consequence, their roles in health and disease are unclear. Elucidation of the physiology and identification of genes putatively involved in the metabolism and virulence of S. amnii may lead to a better understanding of the role of this potential pathogen in bacterial vaginosis, preterm birth, and other issues associated with vaginal and reproductive health. PMID:23281612

  13. Draft genome sequence of Thermincola potens strain JR

    SciTech Connect

    Byrne-Bailey, K.G.; Wrighton, K.C.; Melnyk, R.A.; Agbo, P.; Hazen, T.C.; Coates, J.D.

    2010-07-01

    'Thermincola potens' strain JR is one of the first Gram-positive dissimilatory metal-reducing bacteria (DMRB) for which there is a complete genome sequence. Consistent with the physiology of this organism, preliminary annotation revealed an abundance of multiheme c-type cytochromes that are putatively associated with the periplasm and cell surface in a Gram-positive bacterium. Here we report the complete genome sequence of strain JR.

  14. Sequence and Organization of the Neodiprion lecontei Nucleopolyhedrovirus Genome

    Microsoft Academic Search

    Hilary A. M. Lauzon; Christopher J. Lucarotti; Peter J. Krell; Qili Feng; Arthur Retnakaran; Basil M. Arif

    2004-01-01

    All fully sequenced baculovirus genomes, with the exception of the dipteran Culex nigripalpus nucleopolyhedrovirus (CuniNPV), have previously been from Lepidoptera. This study reports the sequencing and characterization of a hymenopteran baculovirus, Neodiprion lecontei nucleopolyhedrovirus (NeleNPV), from the redheaded pine sawfly. NeleNPV has the smallest genome so far published (81,755 bp) and has a GC content of only

  15. The Genome Sequence of the SARS-Associated Coronavirus

    Microsoft Academic Search

    Marco A. Marra; Steven J. M. Jones; Caroline R. Astell; Robert A. Holt; Angela Brooks-Wilson; Yaron S. N. Butterfield; Jaswinder Khattra; Jennifer K. Asano; Sarah A. Barber; Susanna Y. Chan; Alison Cloutier; Shaun M. Coughlin; Doug Freeman; Noreen Girn; Obi L. Griffith; Stephen R. Leach; Michael Mayo; Helen McDonald; Stephen B. Montgomery; Pawan K. Pandoh; Anca S. Petrescu; A. Gordon Robertson; Jacqueline E. Schein; Asim Siddiqui; Duane E. Smailus; Jeff M. Stott; George S. Yang; Francis Plummer; Anton Andonov; Harvey Artsob; Nathalie Bastien; Kathy Bernard; Timothy F. Booth; Donnie Bowness; Michael Drebot; Lisa Fernando; Ramon Flick; Michael Garbutt; Allen Grolla; Heinz Feldmann; Adrienne Meyers; Amin Kabani; Yan Li; Susan Normand; Ute Stroher; Graham A. Tipples; Shaun Tyler; Robert Vogrig; Diane Ward; Robert C. Brunham; Mel Krajden; Martin Petric; Danuta M. Skowronski; Chris Upton; Rachel L. Roper

    2003-01-01

    We sequenced the 29,751-base genome of the severe acute respiratory syndrome (SARS)-associated coronavirus known as the Tor2 isolate. The genome sequence reveals that this coronavirus is only moderately related to other known coronaviruses, including two human coronaviruses, HCoV-OC43 and HCoV-229E. Phylogenetic analysis of the predicted viral proteins indicates that the virus does not closely resemble any of the three previously

  16. Draft genome sequence of Gluconobacter thailandicus NBRC 3257

    PubMed Central

    Matsutani, Minenosuke; Yakushi, Toshiharu

    2014-01-01

    Gluconobacter thailandicus strain NBRC 3257, isolated from downy cherry (Prunus tomentosa), is a strict aerobic rod-shaped Gram-negative bacterium. Here, we report the features of this organism, together with the draft genome sequence and annotation. The draft genome sequence is composed of 107 contigs for 3,446,046 bp with 56.17% G+C content and contains 3,360 protein-coding genes and 54 RNA genes. PMID:25197448

  17. Genome sequence of the biocontrol strain Pseudomonas fluorescens F113.

    PubMed

    Redondo-Nieto, Miguel; Barret, Matthieu; Morrisey, John P; Germaine, Kieran; Martínez-Granero, Francisco; Barahona, Emma; Navazo, Ana; Sánchez-Contreras, María; Moynihan, Jennifer A; Giddens, Stephen R; Coppoolse, Eric R; Muriel, Candela; Stiekema, Willem J; Rainey, Paul B; Dowling, David; O'Gara, Fergal; Martín, Marta; Rivilla, Rafael

    2012-03-01

    Pseudomonas fluorescens F113 is a plant growth-promoting rhizobacterium (PGPR) that has biocontrol activity against fungal plant pathogens and is a model for rhizosphere colonization. Here, we present its complete genome sequence, which shows that besides a core genome very similar to those of other strains sequenced within this species, F113 possesses a wide array of genes encoding specialized functions for thriving in the rhizosphere and interacting with eukaryotic organisms. PMID:22328765

  18. Genome Sequence of the Biocontrol Strain Pseudomonas fluorescens F113

    PubMed Central

    Redondo-Nieto, Miguel; Barret, Matthieu; Morrisey, John P.; Germaine, Kieran; Martínez-Granero, Francisco; Barahona, Emma; Navazo, Ana; Sánchez-Contreras, María; Moynihan, Jennifer A.; Giddens, Stephen R.; Coppoolse, Eric R.; Muriel, Candela; Stiekema, Willem J.; Rainey, Paul B.; Dowling, David; O'Gara, Fergal; Martín, Marta

    2012-01-01

    Pseudomonas fluorescens F113 is a plant growth-promoting rhizobacterium (PGPR) that has biocontrol activity against fungal plant pathogens and is a model for rhizosphere colonization. Here, we present its complete genome sequence, which shows that besides a core genome very similar to those of other strains sequenced within this species, F113 possesses a wide array of genes encoding specialized functions for thriving in the rhizosphere and interacting with eukaryotic organisms. PMID:22328765

  19. Complete Genome Sequence of the Methanogenic Archaeon, Methanococcus jannaschii

    Microsoft Academic Search

    Carol J. Bult; Owen White; Gary J. Olsen; Lixin Zhou; Robert D. Fleischmann; Granger G. Sutton; Judith A. Blake; Lisa M. Fitzgerald; Rebecca A. Clayton; Jeannine D. Gocayne; Anthony R. Kerlavage; Brian A. Dougherty; Jean-Francois Tomb; Mark D. Adams; Claudia I. Reich; Ross Overbeek; Ewen F. Kirkness; Keith G. Weinstock; Joseph M. Merrick; Anna Glodek; John L. Scott; Neil S. M. Geoghagen; Janice F. Weidman; Joyce L. Fuhrmann; Dave Nguyen; Teresa R. Utterback; Jenny M. Kelley; Jeremy D. Peterson; Paul W. Sadow; Michael C. Hanna; Matthew D. Cotton; Kevin M. Roberts; Margaret A. Hurst; Brian P. Kaine; Mark Borodovsky; Hans-Peter Klenk; Claire M. Fraser; Hamilton O. Smith; Carl R. Woese; J. Craig Venter

    1996-01-01

    The complete 1.66-megabase pair genome sequence of an autotrophic archaeon, Methanococcus jannaschii, and its 58- and 16-kilobase pair extrachromosomal elements have been determined by whole-genome random sequencing. A total of 1738 predicted protein-coding genes were identified; however, only a minority of these (38 percent) could be assigned a putative cellular role with high confidence. Although the majority of genes related

  20. Comparative Genome Analysis at the Sequence Level in the Brassicaceae

    Microsoft Academic Search

    Chris Town; Renate Schmidt; Ian Bancroft

    In the world of plant genome sequencing, the cultivated Brassica species have been relatively under-resourced compared with other crop species largely due to their position in the economic hierarchy of perceived importance. Thus, with the completion of the Arabidopsis thaliana genome in the year 2000, the limited sequencing efforts undertaken in the Brassica crops and other species of the Brassicaceae

  1. Complete chloroplast genome sequences of Solanum bulbocastanum , Solanum lycopersicum and comparative analyses with other Solanaceae genomes

    Microsoft Academic Search

    Henry Daniell; Seung-Bum Lee; Justin Grevich; Christopher Saski; Tania Quesada-Vargas; Chittibabu Guda; Jeffrey Tomkins; Robert K. Jansen

    2006-01-01

    Despite the agricultural importance of both potato and tomato, very little is known about their chloroplast genomes. Analysis of the complete sequences of tomato, potato, tobacco, and Atropa chloroplast genomes reveals significant insertions and deletions within certain coding regions or regulatory sequences (e.g., deletion of repeated sequences within 16S rRNA, ycf2 or ribosomal binding sites in ycf2). RNA, photosynthesis, and

  2. Insights in metabolism and toxin production from the complete genome sequence of Clostridium tetani

    Microsoft Academic Search

    Holger Brüggemann; Gerhard Gottschalk

    The decryption of prokaryotic genome sequences progresses rapidly and provides the scientific community with an enormous amount of information. Clostridial genome sequencing projects have been finished only recently, starting with the genome of the solvent-producing Clostridium acetobutylicum in 2001. A lot of attention has been devoted to the genomes of pathogenic clostridia. In 2002, the genome sequence of C. perfringens,

  3. Insights in metabolism and toxin production from the complete genome sequence of Clostridium tetani

    Microsoft Academic Search

    Holger Brüggemann; Gerhard Gottschalk

    2004-01-01

    The decryption of prokaryotic genome sequences progresses rapidly and provides the scientific community with an enormous amount of information. Clostridial genome sequencing projects have been finished only recently, starting with the genome of the solvent-producing Clostridium acetobutylicum in 2001. A lot of attention has been devoted to the genomes of pathogenic clostridia. In 2002, the genome sequence of C. perfringens,

  4. Spatial genomic heterogeneity within localized, multifocal prostate cancer.

    PubMed

    Boutros, Paul C; Fraser, Michael; Harding, Nicholas J; de Borja, Richard; Trudel, Dominique; Lalonde, Emilie; Meng, Alice; Hennings-Yeomans, Pablo H; McPherson, Andrew; Sabelnykova, Veronica Y; Zia, Amin; Fox, Natalie S; Livingstone, Julie; Shiah, Yu-Jia; Wang, Jianxin; Beck, Timothy A; Have, Cherry L; Chong, Taryne; Sam, Michelle; Johns, Jeremy; Timms, Lee; Buchner, Nicholas; Wong, Ada; Watson, John D; Simmons, Trent T; P'ng, Christine; Zafarana, Gaetano; Nguyen, Francis; Luo, Xuemei; Chu, Kenneth C; Prokopec, Stephenie D; Sykes, Jenna; Dal Pra, Alan; Berlin, Alejandro; Brown, Andrew; Chan-Seng-Yue, Michelle A; Yousif, Fouad; Denroche, Robert E; Chong, Lauren C; Chen, Gregory M; Jung, Esther; Fung, Clement; Starmans, Maud H W; Chen, Hanbo; Govind, Shaylan K; Hawley, James; D'Costa, Alister; Pintilie, Melania; Waggott, Daryl; Hach, Faraz; Lambin, Philippe; Muthuswamy, Lakshmi B; Cooper, Colin; Eeles, Rosalind; Neal, David; Tetu, Bernard; Sahinalp, Cenk; Stein, Lincoln D; Fleshner, Neil; Shah, Sohrab P; Collins, Colin C; Hudson, Thomas J; McPherson, John D; van der Kwast, Theodorus; Bristow, Robert G

    2015-07-01

    Herein we provide a detailed molecular analysis of the spatial heterogeneity of clinically localized, multifocal prostate cancer to delineate new oncogenes or tumor suppressors. We initially determined the copy number aberration (CNA) profiles of 74 patients with index tumors of Gleason score 7. Of these, 5 patients were subjected to whole-genome sequencing using DNA quantities achievable in diagnostic biopsies, with detailed spatial sampling of 23 distinct tumor regions to assess intraprostatic heterogeneity in focal genomics. Multifocal tumors are highly heterogeneous for single-nucleotide variants (SNVs), CNAs and genomic rearrangements. We identified and validated a new recurrent amplification of MYCL, which is associated with TP53 deletion and unique profiles of DNA damage and transcriptional dysregulation. Moreover, we demonstrate divergent tumor evolution in multifocal cancer and, in some cases, tumors of independent clonal origin. These data represent the first systematic relation of intraprostatic genomic heterogeneity to predicted clinical outcome and inform the development of novel biomarkers that reflect individual prognosis. PMID:26005866

  5. Medulloblastoma | Office of Cancer Genomics

    Cancer.gov

    CGCI developed the Medulloblastoma Project to apply newly emerging genomic methods towards the discovery of novel genetic alterations in medulloblastoma (MB). MB is the most common malignant brain tumor in children, accounting for approximately 20% of all pediatric brain tumors. Despite significant progress in treatment over the last several decades, about 50% of MB patients do not live more than 5 years after diagnosis.

  6. Genome sequence of the date palm Phoenix dactylifera L

    PubMed Central

    Al-Mssallem, Ibrahim S.; Hu, Songnian; Zhang, Xiaowei; Lin, Qiang; Liu, Wanfei; Tan, Jun; Yu, Xiaoguang; Liu, Jiucheng; Pan, Linlin; Zhang, Tongwu; Yin, Yuxin; Xin, Chengqi; Wu, Hao; Zhang, Guangyu; Ba Abdullah, Mohammed M.; Huang, Dawei; Fang, Yongjun; Alnakhli, Yasser O.; Jia, Shangang; Yin, An; Alhuzimi, Eman M.; Alsaihati, Burair A.; Al-Owayyed, Saad A.; Zhao, Duojun; Zhang, Sun; Al-Otaibi, Noha A.; Sun, Gaoyuan; Majrashi, Majed A.; Li, Fusen; Tala; Wang, Jixiang; Yun, Quanzheng; Alnassar, Nafla A.; Wang, Lei; Yang, Meng; Al-Jelaify, Rasha F.; Liu, Kan; Gao, Shenghan; Chen, Kaifu; Alkhaldi, Samiyah R.; Liu, Guiming; Zhang, Meng; Guo, Haiyan; Yu, Jun

    2013-01-01

    Date palm (Phoenix dactylifera L.) is a cultivated woody plant species with agricultural and economic importance. Here we report a genome assembly for an elite variety (Khalas), which is 605.4 Mb in size and covers >90% of the genome (~671 Mb) and >96% of its genes (~41,660 genes). Genomic sequence analysis demonstrates that P. dactylifera experienced a clear genome-wide duplication after either ancient whole genome duplications or massive segmental duplications. Genetic diversity analysis indicates that its stress resistance and sugar metabolism-related genes tend to be enriched in the chromosomal regions where the density of single-nucleotide polymorphisms is relatively low. Using transcriptomic data, we also illustrate the date palm’s unique sugar metabolism that underlies fruit development and ripening. Our large-scale genomic and transcriptomic data pave the way for further genomic studies not only on P. dactylifera but also other Arecaceae plants. PMID:23917264

  7. Single Nucleotide Polymorphism Mapping Using Genome-Wide Unique Sequences

    PubMed Central

    Chen, Leslie Y.Y.; Lu, Szu-Hsien; Shih, Edward S.C.; Hwang, Ming-Jing

    2002-01-01

    As more and more genomic DNAs are sequenced to characterize human genetic variations, the demand for a very fast and accurate method to genomically position these DNA sequences is high. We have developed a new mapping method that does not require sequence alignment. In this method, we first identified DNA fragments of 15 bp in length that are unique in the human genome and then used them to position single nucleotide polymorphism (SNP) sequences. By use of four desktop personal computers with AMD K7 (1 GHz) processors, our new method mapped more than 1.6 million SNP sequences in 20 hr and achieved a very good agreement with mapping results from alignment-based methods. PMID:12097348
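
    The alignment-free idea described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a minimal toy version, assuming a genome small enough to index in memory and considering the forward strand only. The function names (unique_kmer_index, map_sequence) and the toy sequences are hypothetical; only the fragment length of 15 bp comes from the abstract.

```python
from collections import defaultdict
from typing import Dict, Optional, Tuple

K = 15  # fragment length reported in the abstract


def unique_kmer_index(genome: Dict[str, str], k: int = K) -> Dict[str, Tuple[str, int]]:
    """Return every k-mer that occurs exactly once in the genome,
    mapped to its (sequence name, 0-based offset)."""
    seen: Dict[str, Tuple[str, int]] = {}
    duplicated = set()
    for name, seq in genome.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if kmer in duplicated:
                continue
            if kmer in seen:
                # Second occurrence: the k-mer is no longer unique.
                del seen[kmer]
                duplicated.add(kmer)
            else:
                seen[kmer] = (name, i)
    return seen


def map_sequence(query: str,
                 index: Dict[str, Tuple[str, int]],
                 k: int = K) -> Optional[Tuple[str, int]]:
    """Place a query by exact lookup of its k-mers in the unique index.
    Each hit votes for the implied start position; the most voted wins."""
    votes: Dict[Tuple[str, int], int] = defaultdict(int)
    for j in range(len(query) - k + 1):
        hit = index.get(query[j:j + k])
        if hit is not None:
            name, pos = hit
            votes[(name, pos - j)] += 1
    if not votes:
        return None  # no unique k-mer found in the query
    return max(votes, key=votes.get)


if __name__ == "__main__":
    # Toy genome with k=5 so the example stays readable.
    toy = {"chrA": "ACGTACGTTTGACCA", "chrB": "GGGCCCATATAGGCT"}
    idx = unique_kmer_index(toy, k=5)
    print(map_sequence("TTGACC", idx, k=5))
```

    Because the lookup is an exact dictionary match, no alignment step is needed; a sequence with no unique fragment simply remains unmapped in this sketch.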

  8. Characterizing the walnut genome through analyses of BAC end sequences.

    PubMed

    Wu, Jiajie; Gu, Yong Q; Hu, Yuqin; You, Frank M; Dandekar, Abhaya M; Leslie, Charles A; Aradhya, Mallikarjuna; Dvorak, Jan; Luo, Ming-Cheng

    2012-01-01

    Persian walnut (Juglans regia L.) is an economically important tree for its nut crop and timber. To gain insight into the structure and evolution of the walnut genome, we constructed two bacterial artificial chromosome (BAC) libraries, containing a total of 129,024 clones, from in vitro-grown shoots of J. regia cv. Chandler using the HindIII and MboI cloning sites. A total of 48,218 high-quality BAC end sequences (BESs) were generated, with an accumulated sequence length of 31.2 Mb, representing approximately 5.1% of the walnut genome. Analysis of repeat DNA content in BESs revealed that approximately 15.42% of the genome consists of known repetitive DNA, while walnut-unique repetitive DNA identified in this study constitutes 13.5% of the genome. Among the walnut-unique repetitive DNA, Julia SINE and JrTRIM elements represent the first identified walnut short interspersed element (SINE) and terminal-repeat retrotransposon in miniature (TRIM) element, respectively; both types of elements are abundant in the genome. As in other species, these SINEs and TRIM elements could be exploited for developing repeat DNA-based molecular markers in walnut. Simple sequence repeats (SSR) from BESs were analyzed and found to be more abundant in BESs than in expressed sequence tags. The density of SSR in the walnut genome analyzed was also slightly higher than that in poplar and papaya. Sequence analysis of BESs indicated that approximately 11.5% of the walnut genome represents a coding sequence. This study is an initial characterization of the walnut genome and provides the largest genomic resource currently available; as such, it will be a valuable tool in studies aimed at genetically improving walnut. PMID:22101470

  9. Cancer genomics: why rare is valuable.

    PubMed

    Jamshidi, Farzad; Nielsen, Torsten O; Huntsman, David G

    2015-04-01

    Rare conditions are sometimes ignored in biomedical research because of difficulties in obtaining specimens and limited interest from fund raisers. However, the study of rare diseases such as unusual cancers has again and again led to breakthroughs in our understanding of more common diseases. It is therefore unsurprising that with the development and accessibility of next-generation sequencing, much has been learnt from studying cancers that are rare and in particular those with uniform biological and clinical behavior. Herein, we describe how shotgun sequencing of cancers such as granulosa cell tumor, endometrial stromal sarcoma, epithelioid hemangioendothelioma, ameloblastoma, small-cell carcinoma of the ovary, clear-cell carcinoma of the ovary, nonepithelial ovarian tumors, chondroblastoma, and giant cell tumor of the bone has led to rapidly translatable discoveries in diagnostics and tumor taxonomies, as well as providing insights into cancer biology. PMID:25676695

  10. Genomic Sequence around Butterfly Wing Development Genes: Annotation and Comparative Analysis

    Microsoft Academic Search

    Inês C. Conceição; Anthony D. Long; Jonathan D. Gruber; Patrícia Beldade

    2011-01-01

    Background Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available

  11. Choosing a benchtop sequencing machine to characterise Helicobacter pylori genomes.

    PubMed

    Perkins, Timothy T; Tay, Chin Yen; Thirriot, Fanny; Marshall, Barry

    2013-01-01

    The fully annotated genome sequence of the European strain, 26695 was first published in 1997 and, in 1999, it was directly compared to the USA isolate J99, promoting two standard laboratory isolates for Helicobacter pylori (H. pylori) research. With the genomic scaffolds available from these important genomes and the advent of benchtop high-throughput sequencing technology, a bacterial genome can now be sequenced within a few days. We sequenced and analysed strains J99 and 26695 using the benchtop-sequencing machines Ion Torrent PGM and the Illumina MiSeq Nextera and Nextera XT methodologies. Using publically available algorithms, we analysed the raw data and interrogated both genomes by mapping the data and by de novo assembly. We compared the accuracy of the coding sequence assemblies to the originally published sequences. With the Ion Torrent PGM, we found an inherently high-error rate in the raw sequence data. Using the Illumina MiSeq, we found significantly more non-covered nucleotides when using the less expensive Illumina Nextera XT compared with the Illumina Nextera library creation method. We found the most accurate de novo assemblies using the Nextera technology, however, extracting an accurate multi-locus sequence type was inconsistent compared to the Ion Torrent PGM. We found the cagPAI failed to assemble onto a single contig in all technologies but was more accurate using the Nextera. Our results indicate the Illumina MiSeq Nextera method is the most accurate for de novo whole genome sequencing of H. pylori. PMID:23840736

  12. Choosing a Benchtop Sequencing Machine to Characterise Helicobacter pylori Genomes

    PubMed Central

    Perkins, Timothy T.; Tay, Chin Yen; Thirriot, Fanny; Marshall, Barry

    2013-01-01

    The fully annotated genome sequence of the European strain, 26695 was first published in 1997 and, in 1999, it was directly compared to the USA isolate J99, promoting two standard laboratory isolates for Helicobacter pylori (H. pylori) research. With the genomic scaffolds available from these important genomes and the advent of benchtop high-throughput sequencing technology, a bacterial genome can now be sequenced within a few days. We sequenced and analysed strains J99 and 26695 using the benchtop-sequencing machines Ion Torrent PGM and the Illumina MiSeq Nextera and Nextera XT methodologies. Using publically available algorithms, we analysed the raw data and interrogated both genomes by mapping the data and by de novo assembly. We compared the accuracy of the coding sequence assemblies to the originally published sequences. With the Ion Torrent PGM, we found an inherently high-error rate in the raw sequence data. Using the Illumina MiSeq, we found significantly more non-covered nucleotides when using the less expensive Illumina Nextera XT compared with the Illumina Nextera library creation method. We found the most accurate de novo assemblies using the Nextera technology, however, extracting an accurate multi-locus sequence type was inconsistent compared to the Ion Torrent PGM. We found the cagPAI failed to assemble onto a single contig in all technologies but was more accurate using the Nextera. Our results indicate the Illumina MiSeq Nextera method is the most accurate for de novo whole genome sequencing of H. pylori. PMID:23840736

  13. Endoplasmic Reticulum Stress, Genome Damage, and Cancer

    PubMed Central

    Dicks, Naomi; Gutierrez, Karina; Michalak, Marek; Bordignon, Vilceu; Agellon, Luis B.

    2015-01-01

    Endoplasmic reticulum (ER) stress has been linked to many diseases, including cancer. A large body of work has focused on the activation of the ER stress response in cancer cells to facilitate their survival and tumor growth; however, there are some studies suggesting that the ER stress response can also mitigate cancer progression. Despite these contradictions, it is clear that the ER stress response is closely associated with cancer biology. The ER stress response classically encompasses activation of three separate pathways, which are collectively categorized the unfolded protein response (UPR). The UPR has been extensively studied in various cancers and appears to confer a selective advantage to tumor cells to facilitate their enhanced growth and resistance to anti-cancer agents. It has also been shown that ER stress induces chromatin changes, which can also facilitate cell survival. Chromatin remodeling has been linked with many cancers through repression of tumor suppressor and apoptosis genes. Interplay between the classic UPR and genome damage repair mechanisms may have important implications in the transformation process of normal cells into cancer cells. PMID:25692096

  14. Endoplasmic reticulum stress, genome damage, and cancer.

    PubMed

    Dicks, Naomi; Gutierrez, Karina; Michalak, Marek; Bordignon, Vilceu; Agellon, Luis B

    2015-01-01

    Endoplasmic reticulum (ER) stress has been linked to many diseases, including cancer. A large body of work has focused on the activation of the ER stress response in cancer cells to facilitate their survival and tumor growth; however, there are some studies suggesting that the ER stress response can also mitigate cancer progression. Despite these contradictions, it is clear that the ER stress response is closely associated with cancer biology. The ER stress response classically encompasses activation of three separate pathways, which are collectively categorized the unfolded protein response (UPR). The UPR has been extensively studied in various cancers and appears to confer a selective advantage to tumor cells to facilitate their enhanced growth and resistance to anti-cancer agents. It has also been shown that ER stress induces chromatin changes, which can also facilitate cell survival. Chromatin remodeling has been linked with many cancers through repression of tumor suppressor and apoptosis genes. Interplay between the classic UPR and genome damage repair mechanisms may have important implications in the transformation process of normal cells into cancer cells. PMID:25692096

  15. Overview | Office of Cancer Genomics

    Cancer.gov

    The Therapeutically Applicable Research to Generate Effective Treatments (TARGET) initiative uses comprehensive molecular characterization to determine the genetic changes that drive the initiation and progression of hard-to-treat childhood cancers. TARGET aims to identify therapeutic targets and prognostic markers so that new, more effective treatment strategies can be developed and applied.

  16. RESTseq – Efficient Benchtop Population Genomics with RESTriction Fragment SEQuencing

    PubMed Central

    Stolle, Eckart; Moritz, Robin F. A.

    2013-01-01

    We present RESTseq, an improved approach for a cost efficient, highly flexible and repeatable enrichment of DNA fragments from digested genomic DNA using Next Generation Sequencing platforms including small scale Personal Genome sequencers. Easy adjustments make it suitable for a wide range of studies requiring SNP detection or SNP genotyping from fine-scale linkage mapping to population genomics and population genetics also in non-model organisms. We demonstrate the validity of our approach by comparing two honeybee and several stingless bee samples. PMID:23691128

  17. Complete genome sequence of Serratia plymuthica strain AS12

    SciTech Connect

    Neupane, Saraswoti [Uppsala University, Uppsala, Sweden; Finlay, Roger D. [Uppsala University, Uppsala, Sweden; Alstrom, Sadhna [Uppsala University, Uppsala, Sweden; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Peters, Lin [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Chertkov, Olga [Los Alamos National Laboratory (LANL); Han, James [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Tapia, Roxanne [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Hogberg, Nils [Uppsala University, Uppsala, Sweden

    2012-01-01

    A plant associated member of the family Enterobacteriaceae, Serratia plymuthica strain AS12 was isolated from rapeseed roots. It is of scientific interest due to its plant growth promoting and plant pathogen inhibiting ability. The genome of S. plymuthica AS12 comprises a 5,443,009 bp long circular chromosome, which consists of 4,952 protein-coding genes, 87 tRNA genes and 7 rRNA operons. This genome was sequenced within the 2010 DOE-JGI Community Sequencing Program (CSP2010) as part of the project entitled 'Genomics of four rapeseed plant growth promoting bacteria with antagonistic effect on plant pathogens'.

  18. A standard variation file format for human genome sequences.

    PubMed

    Reese, Martin G; Moore, Barry; Batchelor, Colin; Salas, Fidel; Cunningham, Fiona; Marth, Gabor T; Stein, Lincoln; Flicek, Paul; Yandell, Mark; Eilbeck, Karen

    2010-01-01

    Here we describe the Genome Variation Format (GVF) and the 10Gen dataset. GVF, an extension of Generic Feature Format version 3 (GFF3), is a simple tab-delimited format for DNA variant files, which uses Sequence Ontology to describe genome variation data. The 10Gen dataset, ten human genomes in GVF format, is freely available for community analysis from the Sequence Ontology website and from an Amazon elastic block storage (EBS) snapshot for use in Amazon's EC2 cloud computing environment. PMID:20796305
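
    Since GVF is described here as a tab-delimited extension of GFF3, a record can be read with a few lines of code. The sketch below is illustrative only: it assumes the standard nine GFF3 columns with semicolon-separated tag=value attributes, skips percent-decoding and pragma handling, and uses a made-up sample record. The names Feature and parse_gvf_line are hypothetical and not part of any GVF tooling.

```python
from typing import Dict, List, NamedTuple, Optional


class Feature(NamedTuple):
    """One GFF3-style record (nine tab-delimited columns)."""
    seqid: str
    source: str
    type: str
    start: int
    end: int
    score: str
    strand: str
    phase: str
    attributes: Dict[str, List[str]]


def parse_gvf_line(line: str) -> Optional[Feature]:
    """Parse one line; return None for blank lines, comments and pragmas."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-delimited columns, got {len(cols)}")
    attributes: Dict[str, List[str]] = {}
    for pair in cols[8].split(";"):
        if not pair:
            continue
        tag, _, value = pair.partition("=")
        # Multiple values for one tag are comma-separated.
        attributes[tag] = value.split(",")
    return Feature(cols[0], cols[1], cols[2], int(cols[3]), int(cols[4]),
                   cols[5], cols[6], cols[7], attributes)


if __name__ == "__main__":
    # Made-up record for illustration only.
    sample = "chr16\texample\tSNV\t49291141\t49291141\t.\t+\t.\tID=ID_1;Variant_seq=A,G;Reference_seq=G"
    print(parse_gvf_line(sample))
```

    Keeping attribute values as lists means tags that may carry several values, such as the variant sequences in the sample record, need no special casing.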

  19. Complete genome sequence of Ferroglobus placidus AEDII12DO

    SciTech Connect

    Anderson, Iain [U.S. Department of Energy, Joint Genome Institute; Risso, Carla [University of Massachusetts, Amherst; Holmes, Dawn [University of Massachusetts, Amherst; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Saunders, Elizabeth H [Los Alamos National Laboratory (LANL); Brettin, Thomas S [ORNL; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Tapia, Roxanne [Los Alamos National Laboratory (LANL); Larimer, Frank W [ORNL; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Lovley, Derek [University of Massachusetts, Amherst; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute

    2011-01-01

    Ferroglobus placidus belongs to the order Archaeoglobales within the archaeal phylum Euryarchaeota. Strain AEDII12DO is the type strain of the species and was isolated from a shallow marine hydrothermal system at Vulcano, Italy. It is a hyperthermophilic, anaerobic chemolithoautotroph, but it can also use a variety of aromatic compounds as electron donors. Here we describe the features of this organism together with the complete genome sequence and annotation. The 2,196,266 bp genome with its 2,567 protein-coding and 55 RNA genes was sequenced as part of a DOE Joint Genome Institute Laboratory Sequencing Program (LSP) project.

  20. Complete mitochondrial genome sequence of Aoluguya reindeer (Rangifer tarandus).

    PubMed

    Ju, Yan; Liu, Huamiao; Rong, Min; Yang, Yifeng; Wei, Haijun; Shao, Yuanchen; Chen, Xiumin; Xing, Xiumei

    2014-12-01

    Abstract The complete mitochondria genome of the reindeer, Rangifer tarandus, was determined by accurate polymerase chain reaction. The entire genome is 16,357 bp in length and contains 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and a D-loop region, all of which are arranged in a typical vertebrate manner. The overall base composition of the reindeer's mitochondrial genome is 33.7% of A, 23.1% of C, 30.1% of T and 13.2% of G. A termination associated sequence and several conserved central sequence block domains were discovered within the control region. PMID:25469816

  1. Analysis of Common k-mers for Whole Genome Sequences Using SSB-Tree

    Microsoft Academic Search

    Jeong-Hyeon Choi Hwan-Gue Cho

    2002-01-01

    As sequenced genomes become larger and sequencing process becomes faster, there is a need to develop a tool to analyze sequences in the whole genomic scale. However, on-memory algorithms such as suffix tree and suffix array are not applicable to the analysis of whole genome sequence set, since the size of individual whole genome ranges from several million base pairs
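
    To make the notion of "common k-mers" concrete, the following is a minimal in-memory sketch, not the SSB-Tree method of the paper (which the abstract does not describe). It uses plain hash-based counting, exactly the kind of on-memory approach the abstract says does not scale to whole-genome sets; the function names `kmer_counts` and `common_kmers` are hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping substring of length k in seq (case-insensitive)."""
    seq = seq.upper()
    # range() is empty when the sequence is shorter than k, giving an empty Counter.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def common_kmers(seq_a, seq_b, k):
    """Return {kmer: (count_in_a, count_in_b)} for k-mers present in both sequences."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    shared = counts_a.keys() & counts_b.keys()
    return {kmer: (counts_a[kmer], counts_b[kmer]) for kmer in shared}


if __name__ == "__main__":
    # Toy sequences chosen for illustration only.
    print(common_kmers("ACGTAC", "TTACGG", 3))
```

    Memory use grows with the number of distinct k-mers, which is why disk-based index structures become attractive at genome scale.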

  2. Improved Yield and Diverse Finished Bacterial Genomes using Pacific Biosciences RS II SMRT Sequencing

    E-print Network

    Weber, David J.

    genome sequencing in our center. Further, using comparative Illumina sequencing, we found a median of one Evaluation As one measure of genome consensus sequence quality, we used Illumina MiSeq 250bp PE data to align to complete genomes sequenced using PacBio data alone and assembled using one of three genome assemblers. We

  3. GS-Aligner: A Novel Tool for Aligning Genomic Sequences Using Bit-Level Operations

    Microsoft Academic Search

    Arthur Chun-Chieh Shih; Wen-Hsiung Li

    2003-01-01

    A novel algorithm, GS-Aligner, that uses bit-level operations was developed for aligning genomic sequences. GS-Aligner is efficient in terms of both time and space for aligning two very long genomic sequences and for identifying genomic rearrangements such as translocations and inversions. It is suitable for aligning fairly divergent sequences such as human and mouse genomic sequences. It consists of
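
    The abstract does not say which bit-level operations GS-Aligner uses. As a general illustration only, one common bit-level representation packs each base into 2 bits so that two equal-length sequences can be compared with a single XOR. The encoding table and the names `pack` and `mismatches` below are assumptions for this sketch, not part of GS-Aligner.

```python
# Assumed 2-bit encoding; any fixed assignment of the four bases would work.
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def pack(seq):
    """Pack a DNA string into one integer, 2 bits per base, first base most significant."""
    value = 0
    for base in seq.upper():
        value = (value << 2) | CODE[base]
    return value


def mismatches(seq_a, seq_b):
    """Count differing positions between two equal-length sequences using XOR."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    if not seq_a:
        return 0
    diff = pack(seq_a) ^ pack(seq_b)
    # A position differs if either bit of its 2-bit pair is set in diff.
    # OR the high bit onto the low bit, then keep only the low bit of each pair.
    low_bits = int("01" * len(seq_a), 2)
    collapsed = (diff | (diff >> 1)) & low_bits
    return bin(collapsed).count("1")


if __name__ == "__main__":
    print(mismatches("ACGT", "ACGA"))
```

    Packing lets one machine-word operation compare many bases at once, which is the usual motivation for bit-level sequence comparison.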

  4. Sequence analysis and organization of the Neodiprion abietis nucleopolyhedrovirus genome.

    PubMed

    Duffy, Simon P; Young, Aaron M; Morin, Benoit; Lucarotti, Christopher J; Koop, Ben F; Levin, David B

    2006-07-01

    Of 30 baculovirus genomes that have been sequenced to date, the only nonlepidopteran baculoviruses include the dipteran Culex nigripalpus nucleopolyhedrovirus and two hymenopteran nucleopolyhedroviruses that infect the sawflies Neodiprion lecontei (NeleNPV) and Neodiprion sertifer (NeseNPV). This study provides a complete sequence and genome analysis of the nucleopolyhedrovirus that infects the balsam fir sawfly Neodiprion abietis (Hymenoptera, Symphyta, Diprionidae). The N. abietis nucleopolyhedrovirus (NeabNPV) is 84,264 bp in size, with a G+C content of 33.5%, and contains 93 predicted open reading frames (ORFs). Eleven predicted ORFs are unique to this baculovirus, 10 ORFs have a putative sequence homologue in the NeleNPV genome but not the NeseNPV genome, and 1 ORF (neab53) has a putative sequence homologue in the NeseNPV genome but not the NeleNPV genome. Specific repeat sequences are coincident with major genome rearrangements that distinguish NeabNPV and NeleNPV. Genes associated with these repeat regions encode a common amino acid motif, suggesting that they are a family of repeated contiguous gene clusters. Lepidopteran baculoviruses, similarly, have a family of repeated genes called the bro gene family. However, there is no significant sequence similarity between the NeabNPV and bro genes. Homologues of early-expressed genes such as ie-1 and lef-3 were absent in NeabNPV, as they are in the previously sequenced hymenopteran baculoviruses. Analyses of ORF upstream sequences identified potential temporally distinct genes on the basis of putative promoter elements. PMID:16809301

  5. The Cancer Genomics Hub (CGHub): overcoming cancer through the power of torrential data.

    PubMed

    Wilks, Christopher; Cline, Melissa S; Weiler, Erich; Diehkans, Mark; Craft, Brian; Martin, Christy; Murphy, Daniel; Pierce, Howdy; Black, John; Nelson, Donavan; Litzinger, Brian; Hatton, Thomas; Maltbie, Lori; Ainsworth, Michael; Allen, Patrick; Rosewood, Linda; Mitchell, Elizabeth; Smith, Bradley; Warner, Jim; Groboske, John; Telc, Haifang; Wilson, Daniel; Sanford, Brian; Schmidt, Hannes; Haussler, David; Maltbie, Daniel

    2014-01-01

    The Cancer Genomics Hub (CGHub) is the online repository of the sequencing programs of the National Cancer Institute (NCI), including The Cancer Genome Atlas (TCGA), the Cancer Cell Line Encyclopedia (CCLE) and the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) projects, with data from 25 different types of cancer. The CGHub currently contains >1.4 PB of data, has grown at an average rate of 50 TB a month and serves >100 TB per week. The architecture of CGHub is designed to support bulk searching and downloading through a Web-accessible application programming interface, enforce patient genome confidentiality in data storage and transmission and optimize for efficiency in access and transfer. In this article, we describe the design of these three components, present performance results for our transfer protocol, GeneTorrent, and finally report on the growth of the system in terms of data stored and transferred, including estimated limits on the current architecture. Our experience-based estimates suggest that centralizing storage and computational resources is more efficient than wide distribution across many satellite labs. Database URL: https://cghub.ucsc.edu. PMID:25267794

  6. The Cancer Genomics Hub (CGHub): overcoming cancer through the power of torrential data

    PubMed Central

    Wilks, Christopher; Cline, Melissa S.; Weiler, Erich; Diehkans, Mark; Craft, Brian; Martin, Christy; Murphy, Daniel; Pierce, Howdy; Black, John; Nelson, Donavan; Litzinger, Brian; Hatton, Thomas; Maltbie, Lori; Ainsworth, Michael; Allen, Patrick; Rosewood, Linda; Mitchell, Elizabeth; Smith, Bradley; Warner, Jim; Groboske, John; Telc, Haifang; Wilson, Daniel; Sanford, Brian; Schmidt, Hannes; Haussler, David; Maltbie, Daniel

    2014-01-01

    The Cancer Genomics Hub (CGHub) is the online repository of the sequencing programs of the National Cancer Institute (NCI), including The Cancer Genome Atlas (TCGA), the Cancer Cell Line Encyclopedia (CCLE) and the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) projects, with data from 25 different types of cancer. The CGHub currently contains >1.4 PB of data, has grown at an average rate of 50 TB a month and serves >100 TB per week. The architecture of CGHub is designed to support bulk searching and downloading through a Web-accessible application programming interface, enforce patient genome confidentiality in data storage and transmission and optimize for efficiency in access and transfer. In this article, we describe the design of these three components, present performance results for our transfer protocol, GeneTorrent, and finally report on the growth of the system in terms of data stored and transferred, including estimated limits on the current architecture. Our experience-based estimates suggest that centralizing storage and computational resources is more efficient than wide distribution across many satellite labs. Database URL: https://cghub.ucsc.edu PMID:25267794

  7. A nucleotide composition constraint of genome sequences.

    PubMed

    Zhang, Chun-Ting; Zhang, Ren

    2004-04-01

    Let a, c, g and t denote the occurrence frequencies of A, C, G and T, respectively, in a genome. We calculated the statistical quantity S = a^2 + c^2 + g^2 + t^2 for each of 809 genomes (11 archaea, 42 bacteria, 3 eukaryota, 90 phages, 36 viroids and 627 viruses) and 236 plasmids. We found that S < 1/3 is strictly valid for almost all of the above genomes or plasmids. As a direct deduction of the above observation, it is shown that (i) the statistical quantity S is a kind of genome order index, which is negatively correlated with the Shannon H function; (ii) S < 1/3 suggests that a minimal value of the Shannon H function is required for each genome; (iii) S defined above would be a new biological statistical quantity, useful to describe the composition features of genomes; (iv) By jointly considering the Chargaff Parity Rule 2, it is shown that the genomic G + C content should be in between 0.211 and 0.789. PMID:15130543
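
    The quantities in this abstract can be reproduced with a short sketch. It computes S (the sum of squared base frequencies) and the Shannon entropy H for a sequence, and recovers the stated G + C bounds under the assumption that Chargaff Parity Rule 2 means a = t and c = g: with x = G + C, S becomes ((1 - x)^2 + x^2) / 2, and solving S = 1/3 gives x = (1 ± sqrt(1/3)) / 2. The function names are hypothetical, and measuring H in bits (log base 2) is a choice made here, not something the abstract specifies.

```python
import math
from collections import Counter


def composition_stats(seq):
    """Return (S, H): S is the sum of squared A/C/G/T frequencies, H the Shannon entropy in bits."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    freqs = [counts[base] / total for base in "ACGT"]
    s = sum(f * f for f in freqs)
    h = -sum(f * math.log2(f) for f in freqs if f > 0)
    return s, h


def gc_bounds(s_max=1 / 3):
    """G+C range implied by S < s_max when a = t and c = g (assumed reading of Parity Rule 2).

    With x = G+C: S = ((1 - x)**2 + x**2) / 2, so S = s_max at x = (1 +/- sqrt(4*s_max - 1)) / 2.
    """
    half_width = math.sqrt(4 * s_max - 1) / 2
    return 0.5 - half_width, 0.5 + half_width


if __name__ == "__main__":
    print(composition_stats("ACGT"))  # uniform composition: S is smallest, H is largest
    print(gc_bounds())
```

    A uniform composition minimizes S (0.25) and maximizes H (2 bits), while a single-base sequence gives S = 1 and H = 0, which illustrates the negative relationship between S and H noted in point (i).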

  8. Genome sequence and comparative genome analysis of Pseudomonas syringae pv. syringae type strain ATCC 19310.

    PubMed

    Park, Yong-Soon; Jeong, Haeyoung; Sim, Young Mi; Yi, Hwe-Su; Ryu, Choong-Min

    2014-04-01

    Pseudomonas syringae pv. syringae (Psy) is a major bacterial pathogen of many economically important plant species. Despite the severity of its impact, the genome sequence of the type strain has not been reported. Here, we present the draft genome sequence of Psy ATCC 19310. Comparative genomic analysis revealed that Psy ATCC 19310 is closely related to Psy B728a. However, only a few type III effectors, which are key virulence factors, are shared by the two strains, indicating the possibility of host-pathogen specificity and genome dynamics, even under the pathovar level. PMID:24444998

  9. Quantifying Genome Editing Outcomes at Endogenous Loci using SMRT Sequencing

    PubMed Central

    Clark, Joseph; Punjya, Niraj; Sebastiano, Vittorio; Bao, Gang; Porteus, Matthew H

    2014-01-01

    SUMMARY Targeted genome editing with engineered nucleases has transformed the ability to introduce precise sequence modifications at almost any site within the genome. A major obstacle to probing the efficiency and consequences of genome editing is that no existing method enables the frequency of different editing events to be simultaneously measured across a cell population at any endogenous genomic locus. We have developed a novel method for quantifying individual genome editing outcomes at any site of interest using single molecule real time (SMRT) DNA sequencing. We show that this approach can be applied at various loci, using multiple engineered nuclease platforms including TALENs, RNA guided endonucleases (CRISPR/Cas9), and ZFNs, and in different cell lines to identify conditions and strategies in which the desired engineering outcome has occurred. This approach facilitates the evaluation of new gene editing technologies and permits sensitive quantification of editing outcomes in almost every experimental system used. PMID:24685129

  10. Widespread Endogenization of Genome Sequences of Non-Retroviral RNA Viruses into Plant Genomes

    Microsoft Academic Search

    Sotaro Chiba; Hideki Kondo; Akio Tani; Daisuke Saisho; Wataru Sakamoto; Satoko Kanematsu; Nobuhiro Suzuki

    2011-01-01

    Non-retroviral RNA virus sequences (NRVSs) have been found in the chromosomes of vertebrates and fungi, but not plants. Here we report similarly endogenized NRVSs derived from plus-, negative-, and double-stranded RNA viruses in plant chromosomes. These sequences were found by searching public genomic sequence databases, and, importantly, most NRVSs were subsequently detected by direct molecular analyses of plant DNAs. The

  11. Transcriptome and genome sequencing uncovers functional variation in humans

    PubMed Central

    Lappalainen, Tuuli; Sammeth, Michael; Friedländer, Marc R; ‘t Hoen, Peter AC; Monlong, Jean; Rivas, Manuel A; Gonzàlez-Porta, Mar; Kurbatova, Natalja; Griebel, Thasso; Ferreira, Pedro G; Barann, Matthias; Wieland, Thomas; Greger, Liliana; van Iterson, Maarten; Almlöf, Jonas; Ribeca, Paolo; Pulyakhina, Irina; Esser, Daniela; Giger, Thomas; Tikhonov, Andrew; Sultan, Marc; Bertier, Gabrielle; MacArthur, Daniel G; Lek, Monkol; Lizano, Esther; Buermans, Henk PJ; Padioleau, Ismael; Schwarzmayr, Thomas; Karlberg, Olof; Ongen, Halit; Kilpinen, Helena; Beltran, Sergi; Gut, Marta; Kahlem, Katja; Amstislavskiy, Vyacheslav; Stegle, Oliver; Pirinen, Matti; Montgomery, Stephen B; Donnelly, Peter; McCarthy, Mark I; Flicek, Paul; Strom, Tim M; Lehrach, Hans; Schreiber, Stefan; Sudbrak, Ralf; Carracedo, Ángel; Antonarakis, Stylianos E; Häsler, Robert; Syvänen, Ann-Christine; van Ommen, Gert-Jan; Brazma, Alvis; Meitinger, Thomas; Rosenstiel, Philip; Guigó, Roderic; Gut, Ivo G; Estivill, Xavier; Dermitzakis, Emmanouil T

    2013-01-01

    Summary Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of mRNA and miRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project – the first uniformly processed RNA-seq data from multiple human populations with high-quality genome sequences. We discovered extremely widespread genetic variation affecting regulation of the majority of genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on cellular mechanisms of regulatory and loss-of-function variation, and allowed us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome. PMID:24037378

  12. Complete genome sequence of Streptobacillus moniliformis type strain (9901T)

    SciTech Connect

    Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Gronow, Sabine [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Chen, Feng [U.S. Department of Energy, Joint Genome Institute; Sims, David [Los Alamos National Laboratory (LANL); Meincke, Linda [Los Alamos National Laboratory (LANL); Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Sproer, Cathrin [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL)

    2009-01-01

    Streptobacillus moniliformis Levaditi et al. 1925 is the sole and type species of the genus, and is of phylogenetic interest because of its isolated location in the sparsely populated and neither taxonomically nor genomically much accessed family 'Leptotrichiaceae' within the phylum 'Fusobacteria'. S. moniliformis, a Gram-negative, non-motile and pleomorphic bacterium, is the etiologic agent of rat bite fever and Haverhill fever. Strain 9901T, the type strain of the species, was isolated from a patient with rat bite fever. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is only the second completed genome sequence of the order 'Fusobacteriales' and no more than the third sequence from the phylum 'Fusobacteria'. The 1,662,578 bp long chromosome and the 10,702 bp plasmid with a total of 1511 protein-coding and 55 RNA genes are part of the Genomic Encyclopedia of Bacteria and Archaea project.

  13. Publications | Office of Cancer Genomics

    Cancer.gov

    These results establish that the CRISPR system can be used as a modular and flexible DNA-binding platform for the recruitment of proteins to a target DNA sequence, revealing the potential of CRISPRi as a general tool for the precise regulation of gene expression in eukaryotic cells.

  14. Detecting selection using a single genome sequence of

    E-print Network

    Plotkin, Joshua B.

    on different stages of an organism's life cycle: genes expressed in the ring stage4 of P. falciparum are under differential selective pressures on genes by inspecting a single genome sequence for a footprint of non-synonymous substitutions. Our method rests on a simple observation: if a protein coding region of a nucleotide sequence has

  15. GENOMIC SEQUENCE ANALYSIS OF LEPTOSPIRA BORGPETERSENII SEROVAR HARDJO

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A genomic library from Leptospira borgpetersenii serovar hardjo strain JB197 was prepared by mechanically shearing the DNA and inserting it into a positive selection vector. DNA was prepared from approximately 22,000 random clones and used as templates for automated sequencing. Sequence data was c...

  16. Nucleotide Sequence of Potato Virus Y (N Strain) Genomic RNA

    Microsoft Academic Search

    CHRISTOPHE ROBAGLIA; M. Durand-Tardif; M. Tronchet; G. Boudazin; S. Astier-Manifacier; F. Casse-Delbart

    1989-01-01

    SUMMARY The complete nucleotide sequence of the genomic RNA of the potyvirus potato virus Y strain N (PVYn) was obtained from cloned cDNAs. This sequence is 9704 nucleotides long and can encode a polyprotein of 3063 amino acids. The positions of the cleavage sites at the N terminus of the capsid and cytoplasmic inclusion proteins have been determined. Other putative

  17. Interpreting the Human Genome Sequence, Using Stochastic Grammars

    Microsoft Academic Search

    Richard Durbin

    2001-01-01

    The 3 billion base pair sequence of the human genome is now available, and attention is focusing on annotating it to extract biological meaning. I will discuss what we have obtained, and the methods that are being used to analyse biological sequences. In particular I will discuss approaches using stochastic grammars analogous to those used in computational linguistics, both for

  18. Draft Genome Sequences of Two Toxigenic Corynebacterium ulcerans Strains

    PubMed Central

    Fournier, Eric; Massé, Cynthia; Charest, Hugues; Bernard, Kathryn; Côté, Jean-Charles; Tremblay, Cécile

    2015-01-01

    Here, we present the draft genome sequences of two toxigenic Corynebacterium ulcerans strains isolated from two different patients: one from a blood sample and the other from a scar exudate following surgery. Although these two strains harbor the diphtheria toxin gene tox, no full prophage sequences were found in the flanking regions. PMID:26112794

  19. Characterization of microsatellites revealed by genomic sequencing of Populus trichocarpa

    Microsoft Academic Search

    Gerald A. Tuskan; Lee E. Gunter; Zamin K. Yang; TongMing Yin; Mitchell M. Sewell; Stephen P. DiFazio

    2004-01-01

    Microsatellites or simple sequence repeats (SSRs) are highly polymorphic, codominant markers that have great value for the construction of genetic maps, comparative mapping, population genetic surveys, and paternity analyses. Here, we report the development and testing of a set of SSR markers derived from shotgun sequencing from Populus trichocarpa Torr. & A. Gray, a nonenriched genomic DNA library, and

  20. Targeted enrichment of genomic DNA regions for next generation sequencing

    Microsoft Academic Search

    F. Mertens; A. El-Sharawy; S. Sauer; J. Van Helvoort; P. J. Van der Zaag; A. Franke; M. Nilsson; H. Lehrach; A. Brookes

    2011-01-01

    In this review we discuss the latest targeted enrichment methods, and aspects of their utilization along with second generation sequencing for complex genome analysis. In doing so we provide an overview of issues involved in detecting genetic variation, for which targeted enrichment has become a powerful tool. We explain how targeted enrichment for next generation sequencing has made great progress

  1. Complete Genome Sequence of the Alfalfa latent virus

    PubMed Central

    Shao, Jonathan; Postnikova, Olga A.

    2015-01-01

    The first complete genome sequence of the Alfalfa latent carlavirus (ALV) was obtained by primer walking and Illumina RNA sequencing. The virus differs substantially from the Czech ALV isolate and the Pea streak virus isolate from Wisconsin. The absence of a clear nucleic acid-binding protein indicates ALV divergence from other carlaviruses. PMID:25883281

  2. Complete Genomic Sequence of Issyk-Kul Virus.

    PubMed

    Atkinson, Barry; Marston, Denise A; Ellis, Richard J; Fooks, Anthony R; Hewson, Roger

    2015-01-01

    Issyk-Kul virus (ISKV) is an ungrouped virus tentatively assigned to the Bunyaviridae family and is associated with an acute febrile illness in several central Asian countries. Using next-generation sequencing technologies, we report here the full-genome sequence for this novel unclassified arboviral pathogen circulating in central Asia. PMID:26139711

  3. Environmental Genome Shotgun Sequencing of the Sargasso Sea

    Microsoft Academic Search

    J. Craig Venter; Karin Remington; John F. Heidelberg; Aaron L. Halpern; Doug Rusch; Dongying Wu; Ian Paulsen; Karen E. Nelson; William Nelson; Derrick E. Fouts; Samuel Levy; Anthony H. Knap; Michael W. Lomas; Ken Nealson; Owen White; Jeremy Peterson; Jeff Hoffman; Rachel Parsons; Holly Baden-Tillson; Cynthia Pfannkoch; Yu-Hui Rogers; Hamilton O. Smith

    2004-01-01

    We have applied "whole-genome shotgun sequencing" to microbial populations collected en masse on tangential flow and impact filters from seawater samples collected from the Sargasso Sea near Bermuda. A total of 1.045 billion base pairs of nonredundant sequence was generated, annotated, and analyzed to elucidate the gene content, diversity, and relative abundance of the organisms within these environmental samples. These

  4. Complete Genomic Sequence of Issyk-Kul Virus

    PubMed Central

    Marston, Denise A.; Ellis, Richard J.; Fooks, Anthony R.; Hewson, Roger

    2015-01-01

    Issyk-Kul virus (ISKV) is an ungrouped virus tentatively assigned to the Bunyaviridae family and is associated with an acute febrile illness in several central Asian countries. Using next-generation sequencing technologies, we report here the full-genome sequence for this novel unclassified arboviral pathogen circulating in central Asia. PMID:26139711

  5. Genomic sequencing of single microbial cells from environmental samples.

    PubMed

    Ishoey, Thomas; Woyke, Tanja; Stepanauskas, Ramunas; Novotny, Mark; Lasken, Roger S

    2008-06-01

    Recently developed techniques allow genomic DNA sequencing from single microbial cells [Lasken RS: Single-cell genomic sequencing using multiple displacement amplification. Curr Opin Microbiol 2007, 10:510-516]. Here, we focus on research strategies for putting these methods into practice in the laboratory setting. An immediate consequence of single-cell sequencing is that it provides an alternative to culturing organisms as a prerequisite for genomic sequencing. The microgram amounts of DNA required as template are amplified from a single bacterium by a method called multiple displacement amplification (MDA) avoiding the need to grow cells. The ability to sequence DNA from individual cells will likely have an immense impact on microbiology considering the vast numbers of novel organisms, which have been inaccessible unless culture-independent methods could be used. However, special approaches have been necessary to work with amplified DNA. MDA may not recover the entire genome from the single copy present in most bacteria. Also, some sequence rearrangements can occur during the DNA amplification reaction. Over the past two years many research groups have begun to use MDA, and some practical approaches to single-cell sequencing have been developed. We review the consensus that is emerging on optimum methods, reliability of amplified template, and the proper interpretation of 'composite' genomes which result from the necessity of combining data from several single-cell MDA reactions in order to complete the assembly. Preferred laboratory methods are considered on the basis of experience at several large sequencing centers where >70% of genomes are now often recovered from single cells. Methods are reviewed for preparation of bacterial fractions from environmental samples, single-cell isolation, DNA amplification by MDA, and DNA sequencing. PMID:18550420

  6. Complete genome sequencing and variant analysis of a Pakistani individual.

    PubMed

    Azim, Muhammad Kamran; Yang, Chuanchun; Yan, Zhixiang; Choudhary, Muhammad Iqbal; Khan, Asifullah; Sun, Xiao; Li, Ran; Asif, Huma; Sharif, Sana; Zhang, Yong

    2013-09-01

    We sequenced the genome of a Pakistani male at 25.5x coverage using massively parallel sequencing technology. More than 90% of the sequence reads were mapped to the human reference genome. In subsequent analysis, we identified 3,224,311 single-nucleotide polymorphisms (SNPs), of which 388,532 (12% of the total SNPs) had not been previously recorded in the single nucleotide polymorphism database (dbSNP) or the 1000 Genomes Project database. The 5991 non-synonymous coding variants were screened for deleterious or disease-associated SNPs. Analysis of genes with deleterious SNPs identified 'retinoic acid signaling' and 'regulation of transcription' as the enriched Gene Ontology terms. Scanning of non-synonymous SNPs against OMIM revealed several disease- and phenotype-associated variants in the Pakistani genome. Comparative analysis with an Indian genome sequence revealed >1.8 million shared SNPs, 32% of which were annotated in ~14,000 genes. Gene Ontology (GO) term analysis of these genes identified 'response to jasmonic acid stimulus', 'aminoglycoside antibiotic metabolic process' and 'glycoside metabolic process' with considerable enrichment. A total of 59,558 small indels (1-5 bp) and 16,063 large structural variations were found, 54% of which were novel. The substantial number of novel structural variations discovered in the Pakistani genome reinforced previous inferences that (a) structural variations are a major type of variation in the genome and (b) compared with SNPs, they putatively exhibit equivalent or superior functional roles. This genome sequence information will be an important reference for population-wide genomics studies of the ethnically diverse South Asian subcontinent. PMID:23842039

  7. Frequent somatic transfer of mitochondrial DNA into the nuclear genome of human cancer cells.

    PubMed

    Ju, Young Seok; Tubio, Jose M C; Mifsud, William; Fu, Beiyuan; Davies, Helen R; Ramakrishna, Manasa; Li, Yilong; Yates, Lucy; Gundem, Gunes; Tarpey, Patrick S; Behjati, Sam; Papaemmanuil, Elli; Martin, Sancha; Fullam, Anthony; Gerstung, Moritz; Nangalia, Jyoti; Green, Anthony R; Caldas, Carlos; Borg, Åke; Tutt, Andrew; Lee, Ming Ta Michael; Van't Veer, Laura J; Tan, Benita K T; Aparicio, Samuel; Span, Paul N; Martens, John W M; Knappskog, Stian; Vincent-Salomon, Anne; Børresen-Dale, Anne-Lise; Eyfjörd, Jórunn Erla; Flanagan, Adrienne M; Foster, Christopher; Neal, David E; Cooper, Colin; Eeles, Rosalind; Lakhani, Sunil R; Desmedt, Christine; Thomas, Gilles; Richardson, Andrea L; Purdie, Colin A; Thompson, Alastair M; McDermott, Ultan; Yang, Fengtang; Nik-Zainal, Serena; Campbell, Peter J; Stratton, Michael R

    2015-06-01

    Mitochondrial genomes are separated from the nuclear genome for most of the cell cycle by the nuclear double membrane, intervening cytoplasm, and the mitochondrial double membrane. Despite these physical barriers, we show that somatically acquired mitochondrial-nuclear genome fusion sequences are present in cancer cells. Most occur in conjunction with intranuclear genomic rearrangements, and the features of the fusion fragments indicate that nonhomologous end joining and/or replication-dependent DNA double-strand break repair are the dominant mechanisms involved. Remarkably, mitochondrial-nuclear genome fusions occur at a similar rate per base pair of DNA as interchromosomal nuclear rearrangements, indicating the presence of a high frequency of contact between mitochondrial and nuclear DNA in some somatic cells. Transmission of mitochondrial DNA to the nuclear genome occurs in neoplastically transformed cells, but we do not exclude the possibility that some mitochondrial-nuclear DNA fusions observed in cancer occurred years earlier in normal somatic cells. PMID:25963125

  8. Frequent somatic transfer of mitochondrial DNA into the nuclear genome of human cancer cells

    PubMed Central

    Ju, Young Seok; Tubio, Jose M.C.; Mifsud, William; Fu, Beiyuan; Davies, Helen R.; Ramakrishna, Manasa; Li, Yilong; Yates, Lucy; Gundem, Gunes; Tarpey, Patrick S.; Behjati, Sam; Papaemmanuil, Elli; Martin, Sancha; Fullam, Anthony; Gerstung, Moritz; Nangalia, Jyoti; Green, Anthony R.; Caldas, Carlos; Borg, Åke; Tutt, Andrew; Lee, Ming Ta Michael; van't Veer, Laura J.; Tan, Benita K.T.; Aparicio, Samuel; Span, Paul N.; Martens, John W.M.; Knappskog, Stian; Vincent-Salomon, Anne; Børresen-Dale, Anne-Lise; Eyfjörd, Jórunn Erla; Flanagan, Adrienne M.; Foster, Christopher; Neal, David E.; Cooper, Colin; Eeles, Rosalind; Lakhani, Sunil R.; Desmedt, Christine; Thomas, Gilles; Richardson, Andrea L.; Purdie, Colin A.; Thompson, Alastair M.; McDermott, Ultan; Yang, Fengtang; Nik-Zainal, Serena; Campbell, Peter J.; Stratton, Michael R.

    2015-01-01

    Mitochondrial genomes are separated from the nuclear genome for most of the cell cycle by the nuclear double membrane, intervening cytoplasm, and the mitochondrial double membrane. Despite these physical barriers, we show that somatically acquired mitochondrial-nuclear genome fusion sequences are present in cancer cells. Most occur in conjunction with intranuclear genomic rearrangements, and the features of the fusion fragments indicate that nonhomologous end joining and/or replication-dependent DNA double-strand break repair are the dominant mechanisms involved. Remarkably, mitochondrial-nuclear genome fusions occur at a similar rate per base pair of DNA as interchromosomal nuclear rearrangements, indicating the presence of a high frequency of contact between mitochondrial and nuclear DNA in some somatic cells. Transmission of mitochondrial DNA to the nuclear genome occurs in neoplastically transformed cells, but we do not exclude the possibility that some mitochondrial-nuclear DNA fusions observed in cancer occurred years earlier in normal somatic cells. PMID:25963125

  9. The Genomic Signature of Breast Cancer Prevention

    Microsoft Academic Search

    Jose Russo; Gabriela Balogh; Daniel Mailo; Patricia A. Russo; Rebecca Heulings; Irma H. Russo

    Early pregnancy imprints in the breast permanent genomic changes or a signature that reduces the susceptibility of this organ to cancer. The breast attains its maximum development during pregnancy and lactation. After menopause, the breast regresses in both nulliparous and parous women containing lobular structures designated Lob.1. The Lob 1 found in the breast of nulliparous women and of parous

  10. Genome Sequence of Luminous Piezophile Photobacterium phosphoreum ANT-2200

    PubMed Central

    Zhang, Sheng-Da; Barbe, Valérie; Garel, Marc; Zhang, Wei-Jia; Chen, Haitao; Santini, Claire-Lise; Murat, Dorothée; Jing, Hongmei; Zhao, Yuan; Lajus, Aurélie; Martini, Séverine; Pradel, Nathalie; Tamburini, Christian

    2014-01-01

    Bacteria of the genus Photobacterium thrive worldwide in oceans and show substantially varied lifestyles, including free-living, commensal, pathogenic, symbiotic, and piezophilic. Here, we present the genome sequence of a luminous, piezophilic Photobacterium phosphoreum strain, ANT-2200, isolated from a water column at 2,200 m depth in the Mediterranean Sea. It is the first genomic sequence of the P. phosphoreum group. An analysis of the sequence provides insight into the adaptation of bacteria to the deep-sea habitat. PMID:24744322

  11. Identification of genes in genomic and EST sequences

    SciTech Connect

    Fields, C.; Adams, M.D.; Kerlavage, A.R.; Dubnick, M.; McCombie, W.R.; Martin-Gallardo, A.; Venter, J.C. [National Inst. of Neurological Disorders and Stroke, Bethesda, MD (United States). Receptor Biochemistry and Molecular Biology Section]; White, O. [New Mexico State Univ., Las Cruces, NM (United States). Computing Research Lab.]

    1993-12-31

    Currently available software tools are capable of predicting the locations of most protein-coding genes in anonymous genomic DNA sequences. The use of predicted exons to select primers for PCR amplification from cDNA libraries allows the complete structures of novel genes to be determined efficiently. As the number of expressed sequence tag (EST) sequences increases, the fraction of genes that can be localized in genomic sequences by searching EST databases will rapidly approach unity. The challenge for automated DNA sequence analysis is now to develop methods for accurately predicting gene structure and alternative splicing patterns. Substantially improving current accuracies in gene structure prediction will require retrospective comparative analysis of sequences from different organisms and gene families.

  12. Zebrafish genomic instability mutants and cancer susceptibility.

    PubMed

    Moore, Jessica L; Rush, Lindsay M; Breneman, Carol; Mohideen, Manzoor-Ali P K; Cheng, Keith C

    2006-10-01

    Somatic loss of tumor suppressor gene function comprising the second hit of Knudson's two-hit hypothesis is important in human cancer. A genetic screen was performed in zebrafish (Danio rerio) to find mutations that cause genomic instability (gin), as scored by Streisinger's mosaic-eye assay that models this second hit. The assay, based on a visible test for loss of wild-type gene function at a single locus, golden, is representative of genomewide events. Twelve ENU-induced genomic instability (gin) mutations were isolated. Most mutations showed weak dominance in heterozygotes and all showed a stronger phenotype in homozygotes. Trans-heterozygosity for 7 of these mutations showed greatly enhanced instability. A variety of spontaneous tumors were found in heterozygous adults from all gin lines, consistent with the expectation that genomic instability (mutator) mutations can accelerate carcinogenesis. The incidence of spontaneous cancer at 30-34 months was increased 9.6-fold in heterozygotes for the mutant with the strongest phenotype, gin-10. Tumors were seen in skin, colon, kidney, liver, pancreas, ovary, testis, and neuronal tissues, with multiple tumors in some fish. The study of these mutants will add to our understanding of the mechanisms of somatic loss of gene function and how those mechanisms contribute to cancer susceptibility. PMID:16888336

  13. Legume genomics: understanding biology through DNA and RNA sequencing

    PubMed Central

    O'Rourke, Jamie A.; Bolon, Yung-Tsi; Bucciarelli, Bruna; Vance, Carroll P.

    2014-01-01

    Background The legume family (Leguminosae) consists of approx. 17 000 species. A few of these species, including, but not limited to, Phaseolus vulgaris, Cicer arietinum and Cajanus cajan, are important dietary components, providing protein for approx. 300 million people worldwide. Additional species, including soybean (Glycine max) and alfalfa (Medicago sativa), are important crops utilized mainly in animal feed. In addition, legumes are important contributors to biological nitrogen, forming symbiotic relationships with rhizobia to fix atmospheric N2 and providing up to 30 % of available nitrogen for the next season of crops. The application of high-throughput genomic technologies including genome sequencing projects, genome re-sequencing (DNA-seq) and transcriptome sequencing (RNA-seq) by the legume research community has provided major insights into genome evolution, genomic architecture and domestication. Scope and Conclusions This review presents an overview of the current state of legume genomics and explores the role that next-generation sequencing technologies play in advancing legume genomics. The adoption of next-generation sequencing and implementation of associated bioinformatic tools has allowed researchers to turn each species of interest into their own model organism. To illustrate the power of next-generation sequencing, an in-depth overview of the transcriptomes of both soybean and white lupin (Lupinus albus) is provided. The soybean transcriptome focuses on analysing seed development in two near-isogenic lines, examining the role of transporters, oil biosynthesis and nitrogen utilization. The white lupin transcriptome analysis examines how phosphate deficiency alters gene expression patterns, inducing the formation of cluster roots. Such studies illustrate the power of next-generation sequencing and bioinformatic analyses in elucidating the gene networks underlying biological processes. PMID:24769535

  14. A rapid whole genome sequencing and analysis system supporting genomic epidemiology (7th Annual SFAF Meeting, 2012)

    ScienceCinema

    FitzGerald, Michael [Broad Institute]

    2013-02-12

    Michael FitzGerald on "A rapid whole genome sequencing and analysis system supporting genomic epidemiology" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  15. Genomic multiple sequence alignments: refinement using a genetic algorithm

    PubMed Central

    Wang, Chunlin; Lefkowitz, Elliot J

    2005-01-01

    Background Genomic sequence data cannot be fully appreciated in isolation. Comparative genomics – the practice of comparing genomic sequences from different species – plays an increasingly important role in understanding the genotypic differences between species that result in phenotypic differences as well as in revealing patterns of evolutionary relationships. One of the major challenges in comparative genomics is producing a high-quality alignment between two or more related genomic sequences. In recent years, a number of tools have been developed for aligning large genomic sequences. Most utilize heuristic strategies to identify a series of strong sequence similarities, which are then used as anchors to align the regions between the anchor points. The resulting alignment is globally correct, but in many cases is suboptimal locally. We describe a new program, GenAlignRefine, which improves the overall quality of global multiple alignments by using a genetic algorithm to improve local regions of alignment. Regions of low quality are identified, realigned using the program T-Coffee, and then refined using a genetic algorithm. Because a better COFFEE (Consistency based Objective Function For alignmEnt Evaluation) score generally reflects greater alignment quality, the algorithm searches for an alignment that yields a better COFFEE score. To improve the intrinsic slowness of the genetic algorithm, GenAlignRefine was implemented as a parallel, cluster-based program. Results We tested the GenAlignRefine algorithm by running it on a Linux cluster to refine sequences from a simulation, as well as refine a multiple alignment of 15 Orthopoxvirus genomic sequences approximately 260,000 nucleotides in length that initially had been aligned by Multi-LAGAN. It took approximately 150 minutes for a 40-processor Linux cluster to optimize some 200 fuzzy (poorly aligned) regions of the orthopoxvirus alignment. 
Overall sequence identity increased only slightly; significantly, however, this occurred while the overall alignment length decreased through the removal of approximately 200 gapped regions representing roughly 1,300 gaps. Conclusion We have implemented a genetic algorithm in parallel mode to optimize multiple genomic sequence alignments initially generated by various alignment tools. Benchmarking experiments showed that the refinement algorithm improved genomic sequence alignments within a reasonable period of time. PMID:16086841
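
    The refinement idea in this record can be illustrated with a minimal sketch. It is not GenAlignRefine: it swaps in a simple sum-of-pairs score in place of the COFFEE score, uses a mutation-only evolutionary loop without crossover, and runs serially rather than on a cluster. All function names and parameters (sp_score, mutate, refine, pop_size, generations) are hypothetical.

```python
import random
from itertools import combinations

GAP = "-"

def sp_score(alignment):
    """Sum-of-pairs score: +1 for every identical residue pair in a column.

    This is a stand-in objective, not the COFFEE score used in the abstract.
    """
    score = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            if a == b and a != GAP:
                score += 1
    return score

def mutate(alignment, rng):
    """Swap one gap with a neighbouring character in a randomly chosen row.

    The ungapped sequence of every row is left unchanged.
    """
    rows = [list(r) for r in alignment]
    row = rng.choice(rows)
    gaps = [i for i, c in enumerate(row) if c == GAP]
    if gaps:
        i = rng.choice(gaps)
        j = i + rng.choice((-1, 1))
        if 0 <= j < len(row):
            row[i], row[j] = row[j], row[i]
    return ["".join(r) for r in rows]

def drop_empty_columns(alignment):
    """Remove columns that contain only gaps, shortening the alignment."""
    cols = [c for c in zip(*alignment) if any(x != GAP for x in c)]
    if not cols:
        return list(alignment)
    return ["".join(r) for r in zip(*cols)]

def refine(alignment, generations=200, pop_size=20, seed=0):
    """Evolve a population of alignments, keeping the best scorers each round.

    Parents compete with their children, so the best score never decreases.
    """
    rng = random.Random(seed)
    population = [list(alignment) for _ in range(pop_size)]
    for _ in range(generations):
        children = [mutate(rng.choice(population), rng) for _ in range(pop_size)]
        ranked = sorted(population + children, key=sp_score, reverse=True)
        population = ranked[:pop_size]
    return drop_empty_columns(population[0])

if __name__ == "__main__":
    start = ["ACG-T", "A-CGT", "ACGT-"]
    result = refine(start)
    print(sp_score(start), sp_score(result))
    print("\n".join(result))
```

    Because selection is elitist, the refined alignment scores at least as well as the input under the stand-in objective, and dropping all-gap columns mirrors the reduction in alignment length reported above.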

  16. The genomic landscape of fibrolamellar hepatocellular carcinoma: whole genome sequencing of ten patients

    PubMed Central

    Darcy, David G.; Chiaroni-Clarke, Rachel; Murphy, Jennifer M.; Honeyman, Joshua N.; Bhanot, Umesh; LaQuaglia, Michael P.; Simon, Sanford M.

    2015-01-01

    Fibrolamellar hepatocellular carcinoma is a rare, malignant liver tumor that often arises in the otherwise normal liver of adolescents and young adults. Previous studies have focused on biomarkers and comparisons to traditional hepatocellular carcinoma, and have yielded little data on the underlying pathophysiology. We performed whole genome sequencing on paired tumor and normal samples from 10 patients to identify recurrent mutations and structural variations that could predispose to oncogenesis. There are relatively few coding, somatic mutations in this cancer, putting it on the low end of the mutational spectrum. Aside from a previously described heterozygous deletion on chromosome 19 that encodes for a functional, chimeric protein, there were no other recurrent structural variations that contribute to the tumor genotype. The lack of a second-hit mutation in the genomic landscape of fibrolamellar hepatocellular carcinoma makes the DNAJB1-PRKACA fusion protein the best target for diagnostic and therapeutic advancements. The mutations, altered pathways and structural variants that characterized fibrolamellar hepatocellular carcinoma were distinct from those in hepatocellular carcinoma, further defining it as a distinct carcinoma. PMID:25605237

  17. The Diploid Genome Sequence of an Individual Human

    Microsoft Academic Search

    Samuel Levy; Granger Sutton; Pauline C. Ng; Lars Feuk; Aaron L. Halpern; Brian P. Walenz; Nelson Axelrod; Jiaqi Huang; Ewen F. Kirkness; Gennady Denisov; Yuan Lin; Jeffrey R. MacDonald; Andy Wing Chun Pang; Mary Shago; Timothy B. Stockwell; Alexia Tsiamouri; Vineet Bafna; Vikas Bansal; Saul A. Kravitz; Dana A. Busam; Karen Y. Beeson; Tina C. McIntosh; Karin A. Remington; Josep F. Abril; John Gill; Jon Borman; Yu-Hui Rogers; Marvin E. Frazier; Stephen W. Scherer; Robert L. Strausberg; J. Craig Venter

    2007-01-01

    Presented here is a genome sequence of an individual human. It was produced from ~32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison

  18. An automated annotation tool for genomic DNA sequences using GeneScan and BLAST

    Microsoft Academic Search

    Andrew M. Lynn; Chakresh Kumar Jain; K. Kosalai; Pranjan Barman; Nupur Thakur; Harish Batra; Alok Bhattacharya

    2001-01-01

    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated annotation of genome DNA sequences.

  19. Projector 2: contig mapping for efficient gap-closure of prokaryotic genome sequence assemblies

    Microsoft Academic Search

    Sacha A. F. T. Van Hijum; Aldert L. Zomer; Oscar P. Kuipers; Jan Kok

    2005-01-01

    With genome sequencing efforts increasing exponentially, valuable information accumulates on genomic content of the various organisms sequenced. Projector 2 uses (un)finished genomic sequences of an organism as a template to infer linkage information for a genome sequence assembly of a related organism being sequenced. The remaining gaps between contigs for which no linkage information is present can

  20. Mining for single nucleotide polymorphisms in pig genome sequence data

    PubMed Central

    Kerstens, Hindrik HD; Kollers, Sonja; Kommadath, Arun; del Rosario, Marisol; Dibbits, Bert; Kinders, Sylvia M; Crooijmans, Richard P; Groenen, Martien AM

    2009-01-01

    Background Single nucleotide polymorphisms (SNPs) are ideal genetic markers due to their high abundance and the highly automated way in which SNPs are detected and SNP assays are performed. The number of SNPs identified in the pig thus far is still limited. Results A total of 4.8 million whole genome shotgun sequences obtained from the NCBI trace-repository with center name "SDJVP" and project name "Sino-Danish Pig Genome Project" were analysed for the presence of SNPs. Available BAC and BAC-end sequences and their naming and mapping information, all obtained from the Sanger Institute FTP site, served as a rough assembly of a reference genome. In 1.2 Gb of pig genome sequence, we identified 98,151 SNPs in which one of the sequences in the alignment represented the polymorphism and 6,374 SNPs in which two sequences represented an identical polymorphism. To benchmark the SNP identification method, 163 SNPs, in which the polymorphism was represented twice in the sequence alignment, were selected and tested on a panel of three purebred boar lines and wild boar. Of these 163 in silico identified SNPs, 134 were shown to be polymorphic in our animal panel. Conclusion This SNP identification method, which mines for SNPs in publicly available porcine shotgun sequence repositories, provides thousands of high quality SNPs. Benchmarking in an animal panel showed that more than 80% of the predicted SNPs represented true genetic variation. PMID:19126189
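
    A minimal sketch of the support-based filtering described in this record, assuming gapless alignments given as (start, sequence) pairs against a reference string. The function names and the min_support parameter are hypothetical, and the authors' actual pipeline is not specified in the abstract. The last function reproduces the benchmark arithmetic (134 of 163 tested SNPs confirmed).

```python
from collections import Counter

def candidate_snps(reference, reads, min_support=2):
    """List (position, ref_base, alt_base, support) for non-reference bases
    seen at least min_support times in the same alignment column.

    reads is a list of (start, sequence) pairs, each a gapless alignment
    to the reference (a simplifying assumption for this sketch).
    """
    columns = [Counter() for _ in reference]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                columns[pos][base] += 1
    snps = []
    for pos, counts in enumerate(columns):
        for base, n in sorted(counts.items()):
            if base != reference[pos] and n >= min_support:
                snps.append((pos, reference[pos], base, n))
    return snps

def validation_rate(confirmed, tested):
    """Fraction of tested in silico SNPs confirmed as polymorphic."""
    return confirmed / tested

if __name__ == "__main__":
    ref = "ACGTACGT"
    reads = [(0, "ACGA"), (1, "CGAAC"), (2, "GTAC")]
    print(candidate_snps(ref, reads))
    print(round(validation_rate(134, 163), 3))
```

    Setting min_support to 2 corresponds to the stricter class in the abstract, where two sequences represent an identical polymorphism; 134/163 is about 0.82, consistent with the reported "more than 80%".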

  1. Genomic Sequence or Signature Tags (GSTs) from the Genome Group at Brookhaven National Laboratory (BNL)

    DOE Data Explorer

    Dunn, John J.; McCorkle, Sean R.; Praissman, Laura A.; Hind, Geoffrey; Van der Lelie, Daniel; Bahou, Wadie F.; Gnatenko, Dmitri V.; Krause, Maureen K.

    Genomic Signature Tags (GSTs) are the products of a method we have developed for identifying and quantitatively analyzing genomic DNAs. The DNA is initially fragmented with a type II restriction enzyme. An oligonucleotide adaptor containing a recognition site for MmeI, a type IIS restriction enzyme, is then used to release 21-bp tags from fixed positions in the DNA relative to the sites recognized by the fragmenting enzyme. These tags are PCR-amplified, purified, concatenated and then cloned and sequenced. The tag sequences and abundances are used to create a high resolution GST sequence profile of the genomic DNA. [Quoted from Genomic Signature Tags (GSTs): A System for Profiling Genomic DNA, Dunn, John J.; McCorkle, Sean R.; Praissman, Laura A.; Hind, Geoffrey; Van der Lelie, Daniel; Bahou, Wadie F.; Gnatenko, Dmitri V.; Krause, Maureen K., Revised 9/13/2002]
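
    The tag-profiling idea can be sketched in silico as follows. This is an illustration only: the fragmenting enzyme's recognition site is assumed to be CATG (the abstract does not name the enzyme), the tag is assumed to be the 21 bases immediately downstream of each site, and only one strand is scanned. The function name extract_tags and its parameters are hypothetical.

```python
from collections import Counter

def extract_tags(genome, site="CATG", tag_len=21):
    """Count fixed-length tags found immediately downstream of each site.

    site and the downstream orientation are assumptions made for this
    sketch; tags that would run past the end of the sequence are skipped.
    """
    tags = Counter()
    pos = genome.find(site)
    while pos != -1:
        start = pos + len(site)
        tag = genome[start:start + tag_len]
        if len(tag) == tag_len:
            tags[tag] += 1
        pos = genome.find(site, pos + 1)
    return tags

if __name__ == "__main__":
    genome = "CATG" + "A" * 21 + "CATG" + "C" * 21 + "CATG" + "A" * 21
    for tag, count in extract_tags(genome).most_common():
        print(tag, count)
```

    The resulting counter pairs each tag sequence with its abundance, which is the information the abstract describes as the GST sequence profile.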

  2. The Pinus taeda genome is characterized by diverse and highly diverged repetitive sequences

    E-print Network

    2010-01-01

    after the first plant genome sequence was completed [1], ... of the genome sequence of the flowering plant Arabidopsis ... genome reference sequence would fill a great evolutionary gap, but it ... * Correspondence: dbneale@ucdavis.edu Department of Plant

  3. Application of full mitochondrial genome sequencing using 454 GS FLX pyrosequencing

    Microsoft Academic Search

    Martin Mikkelsen; Eszter Rockenbauer; Andrea Wächter; Liane Fendt; Bettina Zimmermann; Walther Parson; Sandra Abel Nielsen; Tom Gilbert; Eske Willerslev; Niels Morling

    2009-01-01

    The GS FLX pyrosequencing platform using parallel tagged sequencing was tested on 10 Somali individuals for sequencing of the complete mitochondrial genome. The amplicons were sequenced twice with increasing coverage to establish the minimum coverage needed to produce reliable sequence reads. The genome sequences were compared to control region sequences previously obtained with Sanger sequencing and 49 SNPs in

  4. Genome Sequence of the Pea Aphid Acyrthosiphon pisum

    PubMed Central

    2010-01-01

    Aphids are important agricultural pests and also biological models for studies of insect-plant interactions, symbiosis, virus vectoring, and the developmental causes of extreme phenotypic plasticity. Here we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple published genomes of holometabolous insects. Pea aphids are host-plant specialists, they can reproduce both sexually and asexually, and they have coevolved with an obligate bacterial symbiont. Here we highlight findings from whole genome analysis that may be related to these unusual biological features. These findings include discovery of extensive gene duplication in more than 2000 gene families as well as loss of evolutionarily conserved genes. Gene family expansions relative to other published genomes include genes involved in chromatin modification, miRNA synthesis, and sugar transport. Gene losses include genes central to the IMD immune pathway, selenoprotein utilization, purine salvage, and the entire urea cycle. The pea aphid genome reveals that only a limited number of genes have been acquired from bacteria; thus the reduced gene count of Buchnera does not reflect gene transfer to the host genome. The inventory of metabolic genes in the pea aphid genome suggests that there is extensive metabolite exchange between the aphid and Buchnera, including sharing of amino acid biosynthesis between the aphid and Buchnera. The pea aphid genome provides a foundation for post-genomic studies of fundamental biological questions and applied agricultural problems. PMID:20186266

  5. CONTRIBUTIONS OF GENOME SEQUENCING TO UNDERSTANDING THE BIOLOGY OF HELICOBACTER PYLORI

    Microsoft Academic Search

    Zhongming Ge; Diane E. Taylor

    1999-01-01

    Abstract About half of the world's population carries Helicobacter pylori, a gram-negative, spiral bacterium that colonizes the human stomach. The link between H. pylori and ulceration, as well as its association with the development of both gastric cancer and mucosa-associated lymphoid tissue lymphoma in humans, is a serious public health concern. The publication of the genome sequences of two strains of H. pylori gives rise to

  6. The Scientific Drunk and the Lamppost: Massive Sequencing Efforts in Cancer Discovery and Treatment

    NSDL National Science Digital Library

    Michael B. Yaffe (American Association for the Advancement of Science; Science Signaling REV)

    2013-04-02

    The massive resources devoted to genome sequencing of human tumors have produced important data sets for the cancer biology community. Paradoxically, however, these studies have revealed very little new biology. Despite this, additional resources in the United States are slated to continue such work and to expand similar efforts in genome sequencing to mouse tumors. It may be that scientists are “addicted” to the large amounts of data that can be relatively easily obtained, even though these data seem unlikely, on their own, to unveil new cancer treatment options or result in the ultimate goal of a cancer cure. Rather than using more tumor genetic sequences, a better strategy for identifying new treatment options may be to develop methods for analyzing the signaling networks that underlie cancer development, progression, and therapeutic resistance at both a personal and systems-wide level.

  7. Plasmodium knowlesi Genome Sequences from Clinical Isolates Reveal Extensive Genomic Dimorphism

    PubMed Central

    Millar, Scott B.; Sanderson, Theo; Otto, Thomas D.; Lu, Woon Chan; Krishna, Sanjeev; Rayner, Julian C.; Cox-Singh, Janet

    2015-01-01

    Plasmodium knowlesi is a newly described zoonosis that causes malaria in the human population that can be severe and fatal. The study of P. knowlesi parasites from human clinical isolates is relatively new and, in order to obtain maximum information from patient sample collections, we explored the possibility of generating P. knowlesi genome sequences from archived clinical isolates. Our patient sample collection consisted of frozen whole blood samples that contained excessive human DNA contamination and, in that form, were not suitable for parasite genome sequencing. We developed a method to reduce the amount of human DNA in the thawed blood samples in preparation for high throughput parasite genome sequencing using Illumina HiSeq and MiSeq sequencing platforms. Seven of fifteen samples processed had sufficiently pure P. knowlesi DNA for whole genome sequencing. The reads were mapped to the P. knowlesi H strain reference genome and an average mapping of 90% was obtained. Genes with low coverage were removed leaving 4623 genes for subsequent analyses. Previously we identified a DNA sequence dimorphism on a small fragment of the P. knowlesi normocyte binding protein xa gene on chromosome 14. We used the genome data to assemble full-length Pknbpxa sequences and discovered that the dimorphism extended along the gene. An in-house algorithm was developed to detect SNP sites co-associating with the dimorphism. More than half of the P. knowlesi genome was dimorphic, involving genes on all chromosomes and suggesting that two distinct types of P. knowlesi infect the human population in Sarawak, Malaysian Borneo. We use P. knowlesi clinical samples to demonstrate that Plasmodium DNA from archived patient samples can produce high quality genome data. We show that analyses, of even small numbers of difficult clinical malaria isolates, can generate comprehensive genomic information that will improve our understanding of malaria parasite diversity and pathobiology. 
PMID:25830531

  8. Plasmodium knowlesi genome sequences from clinical isolates reveal extensive genomic dimorphism.

    PubMed

    Pinheiro, Miguel M; Ahmed, Md Atique; Millar, Scott B; Sanderson, Theo; Otto, Thomas D; Lu, Woon Chan; Krishna, Sanjeev; Rayner, Julian C; Cox-Singh, Janet

    2015-01-01

    Plasmodium knowlesi is a newly described zoonosis that causes malaria in the human population that can be severe and fatal. The study of P. knowlesi parasites from human clinical isolates is relatively new and, in order to obtain maximum information from patient sample collections, we explored the possibility of generating P. knowlesi genome sequences from archived clinical isolates. Our patient sample collection consisted of frozen whole blood samples that contained excessive human DNA contamination and, in that form, were not suitable for parasite genome sequencing. We developed a method to reduce the amount of human DNA in the thawed blood samples in preparation for high throughput parasite genome sequencing using Illumina HiSeq and MiSeq sequencing platforms. Seven of fifteen samples processed had sufficiently pure P. knowlesi DNA for whole genome sequencing. The reads were mapped to the P. knowlesi H strain reference genome and an average mapping of 90% was obtained. Genes with low coverage were removed leaving 4623 genes for subsequent analyses. Previously we identified a DNA sequence dimorphism on a small fragment of the P. knowlesi normocyte binding protein xa gene on chromosome 14. We used the genome data to assemble full-length Pknbpxa sequences and discovered that the dimorphism extended along the gene. An in-house algorithm was developed to detect SNP sites co-associating with the dimorphism. More than half of the P. knowlesi genome was dimorphic, involving genes on all chromosomes and suggesting that two distinct types of P. knowlesi infect the human population in Sarawak, Malaysian Borneo. We use P. knowlesi clinical samples to demonstrate that Plasmodium DNA from archived patient samples can produce high quality genome data. We show that analyses, of even small numbers of difficult clinical malaria isolates, can generate comprehensive genomic information that will improve our understanding of malaria parasite diversity and pathobiology. 
PMID:25830531

  9. Building a model: developing genomic resources for common milkweed ( Asclepias syriaca ) with low coverage genome sequencing

    Microsoft Academic Search

    Shannon CK Straub; Mark Fishbein; Tatyana Livshultz; Zachary Foster; Matthew Parks; Kevin Weitemier; Richard C Cronn; Aaron Liston

    2011-01-01

    Background Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development

  10. Mapping the Human Reference Genome's Missing Sequence by Three-Way Admixture in Latino Genomes

    E-print Network

    McCarroll, Steve

    Giulio Genovese, Robert E. Handsaker, Heng Li, Eimear E. Kenny, and Steven A. McCarroll. A principal obstacle to completing maps and analyses of the human genome involves

  11. Modeling cancer metabolism on a genome scale

    PubMed Central

    Yizhak, Keren; Chaneton, Barbara; Gottlieb, Eyal; Ruppin, Eytan

    2015-01-01

    Cancer cells have fundamentally altered cellular metabolism that is associated with their tumorigenicity and malignancy. In addition to the widely studied Warburg effect, several new key metabolic alterations in cancer have been established over the last decade, leading to the recognition that altered tumor metabolism is one of the hallmarks of cancer. Deciphering the full scope and functional implications of the dysregulated metabolism in cancer requires both the advancement of a variety of omics measurements and the advancement of computational approaches for the analysis and contextualization of the accumulated data. Encouragingly, while the metabolic network is highly interconnected and complex, it is at the same time probably the best characterized cellular network. Accordingly, this review discusses the challenges that genome-scale modeling of cancer metabolism has been facing. We survey several recent studies demonstrating the first strides that have been made, testifying to the value of this approach in portraying a network-level view of cancer metabolism and in identifying novel drug targets and biomarkers. Finally, we outline a few new steps that may further advance this field. PMID:26130389

  12. Genome sequence of Thermofilum pendens reveals an exceptional loss of biosynthetic pathways without genome reduction

    Microsoft Academic Search

    Nikos Kyrpides; Jason Rodriguez; Dwi Susanti; Claudia Reich; Luke E. Ulrich; James G. Elkins; Kostas Mavromatis; Athanasios Lykidis; Matt Nolan; Linda S. Thompson; Alla L. Lapidus; Alex Copeland; Igor B Zhulin; Chris Detter; Biswarup Mukhopadhyay; James Bristow; William Whitman

    2008-01-01

    We report the complete genome of Thermofilum pendens, a deep-branching, hyperthermophilic member of the order Thermoproteales within the archaeal kingdom Crenarchaeota. T. pendens is a sulfur-dependent, anaerobic heterotroph isolated from a solfatara in Iceland. It is an extracellular commensal, requiring an extract of Thermoproteus tenax for growth, and the genome sequence reveals that biosynthetic pathways for purines, most amino acids,

  13. The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics

    Microsoft Academic Search

    Lincoln D. Stein; Zhirong Bao; Darin Blasiar; Thomas Blumenthal; Michael R. Brent; Nansheng Chen; Asif Chinwalla; Laura Clarke; Chris Clee; Avril Coghlan; Alan Coulson; Peter D'Eustachio; David H. A. Fitch; Lucinda A. Fulton; Robert E. Fulton; Sam Griffiths-Jones; Todd W. Harris; LaDeana W. Hillier; Ravi Kamath; Patricia E. Kuwabara; Elaine R. Mardis; Marco A. Marra; Tracie L. Miner; Patrick Minx; James C. Mullikin; Robert W. Plumb; Jane Rogers; Jacqueline E. Schein; Marc Sohrmann; John Spieth; Jason E. Stajich; Chaochun Wei; David Willey; Richard K. Wilson; Richard Durbin; Robert H. Waterston

    2003-01-01

    The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome

  14. The ClinSeq Project: Piloting large-scale genome sequencing for research in genomic medicine

    Microsoft Academic Search

    Leslie G. Biesecker; James C. Mullikin; Flavia M. Facio; Clesson Turner; Praveen F. Cherukuri; Robert W. Blakesley; Gerard G. Bouffard; Peter S. Chines; Pedro Cruz; Nancy F. Hansen; Jamie K. Teer; Baishali Maskeri; Alice C. Young; Teri A. Manolio; Alexander F. Wilson; Toren Finkel; Paul Hwang; Andrew Arai; Alan T. Remaley; Vandana Sachdev; Robert Shamburek; Richard O. Cannon; Eric D. Green

    2009-01-01

    ClinSeq is a pilot project to investigate the use of whole-genome sequencing as a tool for clinical research. By piloting the acquisition of large amounts of DNA sequence data from individual human subjects, we are fostering the development of hypothesis-generating approaches for performing research in genomic medicine, including the exploration of issues related to the genetic architecture of

  15. Characterization of HPV and host genome interactions in primary head and neck cancers

    PubMed Central

    Parfenov, Michael; Pedamallu, Chandra Sekhar; Gehlenborg, Nils; Freeman, Samuel S.; Danilova, Ludmila; Bristow, Christopher A.; Lee, Semin; Hadjipanayis, Angela G.; Ivanova, Elena V.; Wilkerson, Matthew D.; Protopopov, Alexei; Yang, Lixing; Seth, Sahil; Song, Xingzhi; Tang, Jiabin; Ren, Xiaojia; Zhang, Jianhua; Pantazi, Angeliki; Santoso, Netty; Xu, Andrew W.; Mahadeshwar, Harshad; Wheeler, David A.; Haddad, Robert I.; Jung, Joonil; Ojesina, Akinyemi I.; Issaeva, Natalia; Yarbrough, Wendell G.; Hayes, D. Neil; Grandis, Jennifer R.; El-Naggar, Adel K.; Meyerson, Matthew; Park, Peter J.; Chin, Lynda; Seidman, J. G.; Hammerman, Peter S.; Kucherlapati, Raju; Ally, Adrian; Balasundaram, Miruna; Birol, Inanc; Bowlby, Reanne; Butterfield, Yaron S.N.; Carlsen, Rebecca; Cheng, Dean; Chu, Andy; Dhalla, Noreen; Guin, Ranabir; Holt, Robert A.; Jones, Steven J.M.; Lee, Darlene; Li, Haiyan I.; Marra, Marco A.; Mayo, Michael; Moore, Richard A.; Mungall, Andrew J.; Robertson, A. Gordon; Schein, Jacqueline E.; Sipahimalani, Payal; Tam, Angela; Thiessen, Nina; Wong, Tina; Protopopov, Alexei; Santoso, Netty; Lee, Semin; Parfenov, Michael; Zhang, Jianhua; Mahadeshwar, Harshad S.; Tang, Jiabin; Ren, Xiaojia; Seth, Sahil; Haseley, Psalm; Zeng, Dong; Yang, Lixing; Xu, Andrew W.; Song, Xingzhi; Pantazi, Angeliki; Bristow, Christopher; Hadjipanayis, Angela; Seidman, Jonathan; Chin, Lynda; Park, Peter J.; Kucherlapati, Raju; Akbani, Rehan; Casasent, Tod; Liu, Wenbin; Lu, Yiling; Mills, Gordon; Motter, Thomas; Weinstein, John; Diao, Lixia; Wang, Jing; Fan, You Hong; Liu, Jinze; Wang, Kai; Auman, J. Todd; Balu, Saianand; Bodenheimer, Tom; Buda, Elizabeth; Hayes, D. 
Neil; Hoadley, Katherine A.; Hoyle, Alan P.; Jefferys, Stuart R.; Jones, Corbin D.; Kimes, Patrick K.; Marron, J.S.; Meng, Shaowu; Mieczkowski, Piotr A.; Mose, Lisle E.; Parker, Joel S.; Perou, Charles M.; Prins, Jan F.; Roach, Jeffrey; Shi, Yan; Simons, Janae V.; Singh, Darshan; Soloway, Mathew G.; Tan, Donghui; Veluvolu, Umadevi; Walter, Vonn; Waring, Scot; Wilkerson, Matthew D.; Wu, Junyuan; Zhao, Ni; Cherniack, Andrew D.; Hammerman, Peter S.; Tward, Aaron D.; Pedamallu, Chandra Sekhar; Saksena, Gordon; Jung, Joonil; Ojesina, Akinyemi I.; Carter, Scott L.; Zack, Travis I.; Schumacher, Steven E.; Beroukhim, Rameen; Freeman, Samuel S.; Meyerson, Matthew; Cho, Juok; Chin, Lynda; Getz, Gad; Noble, Michael S.; DiCara, Daniel; Zhang, Hailei; Heiman, David I.; Gehlenborg, Nils; Voet, Doug; Lin, Pei; Frazer, Scott; Stojanov, Petar; Liu, Yingchun; Zou, Lihua; Kim, Jaegil; Lawrence, Michael S.; Sougnez, Carrie; Lichtenstein, Lee; Cibulskis, Kristian; Lander, Eric; Gabriel, Stacey B.; Muzny, Donna; Doddapaneni, HarshaVardhan; Kovar, Christie; Reid, Jeff; Morton, Donna; Han, Yi; Hale, Walker; Chao, Hsu; Chang, Kyle; Drummond, Jennifer A.; Gibbs, Richard A.; Kakkar, Nipun; Wheeler, David; Xi, Liu; Ciriello, Giovanni; Ladanyi, Marc; Lee, William; Ramirez, Ricardo; Sander, Chris; Shen, Ronglai; Sinha, Rileen; Weinhold, Nils; Taylor, Barry S.; Aksoy, B. Arman; Dresdner, Gideon; Gao, Jianjiong; Gross, Benjamin; Jacobsen, Anders; Reva, Boris; Schultz, Nikolaus; Sumer, S. 
Onur; Sun, Yichao; Chan, Timothy; Morris, Luc; Stuart, Joshua; Benz, Stephen; Ng, Sam; Benz, Christopher; Yau, Christina; Baylin, Stephen B.; Cope, Leslie; Danilova, Ludmila; Herman, James G.; Bootwalla, Moiz; Maglinte, Dennis T.; Laird, Peter W.; Triche, Timothy; Weisenberger, Daniel J.; Van Den Berg, David J.; Agrawal, Nishant; Bishop, Justin; Boutros, Paul C.; Bruce, Jeff P; Byers, Lauren Averett; Califano, Joseph; Carey, Thomas E.; Chen, Zhong; Cheng, Hui; Chiosea, Simion I.; Cohen, Ezra; Diergaarde, Brenda; Egloff, Ann Marie; El-Naggar, Adel K.; Ferris, Robert L.; Frederick, Mitchell J.; Grandis, Jennifer R.; Guo, Yan; Haddad, Robert I.; Hammerman, Peter S.; Harris, Thomas; Hayes, D. Neil; Hui, Angela BY; Lee, J. Jack; Lippman, Scott M.; Liu, Fei-Fei; McHugh, Jonathan B.; Myers, Jeff; Ng, Patrick Kwok Shing; Perez-Ordonez, Bayardo; Pickering, Curtis R.; Prystowsky, Michael; Romkes, Marjorie; Saleh, Anthony D.; Sartor, Maureen A.; Seethala, Raja; Seiwert, Tanguy Y.; Si, Han; Tward, Aaron D.; Van Waes, Carter; Waggott, Daryl M.; Wiznerowicz, Maciej; Yarbrough, Wendell; Zhang, Jiexin; Zuo, Zhixiang; Burnett, Ken; Crain, Daniel; Gardner, Johanna; Lau, Kevin; Mallery, David; Morris, Scott; Paulauskis, Joseph; Penny, Robert; Shelton, Candance; Shelton, Troy; Sherman, Mark; Yena, Peggy; Black, Aaron D.; Bowen, Jay; Frick, Jessica; Gastier-Foster, Julie M.; Harper, Hollie A.; Lichtenberg, Tara M.; Ramirez, Nilsa C.; Wise, Lisa; Zmuda, Erik; Baboud, Julien; Jensen, Mark A.

    2014-01-01

    Previous studies have established that a subset of head and neck tumors contains human papillomavirus (HPV) sequences and that HPV-driven head and neck cancers display distinct biological and clinical features. HPV is known to drive cancer by the actions of the E6 and E7 oncoproteins, but the molecular architecture of HPV infection and its interaction with the host genome in head and neck cancers have not been comprehensively described. We profiled a cohort of 279 head and neck cancers with next generation RNA and DNA sequencing and show that 35 (12.5%) tumors displayed evidence of high-risk HPV types 16, 33, or 35. Twenty-five cases had integration of the viral genome into one or more locations in the human genome with statistical enrichment for genic regions. Integrations had a marked impact on the human genome and were associated with alterations in DNA copy number, mRNA transcript abundance and splicing, and both inter- and intrachromosomal rearrangements. Many of these events involved genes with documented roles in cancer. Cancers with integrated vs. nonintegrated HPV displayed different patterns of DNA methylation and both human and viral gene expressions. Together, these data provide insight into the mechanisms by which HPV interacts with the human genome beyond expression of viral oncoproteins and suggest that specific integration events are an integral component of viral oncogenesis. PMID:25313082

  16. Sequence-Tagged Connectors: A Sequence Approach to Mapping and Scanning the Human Genome

    Microsoft Academic Search

    Gregory G. Mahairas; James C. Wallace; Kim Smith; Steven Swartzell; Ted Holzman; Andrew Keller; Ron Shaker; Jepf Furlong; Janet Young; Shaying Zhao; Mark D. Adams; Leroy Hood

    1999-01-01

    The sequence-tagged connector (STC) strategy proposes to generate sequence tags densely scattered (every 3.3 kilobases) across the human genome by arraying 450,000 bacterial artificial chromosomes (BACs) with randomly cleaved inserts, sequencing both ends of each, and preparing a restriction enzyme fingerprint of each. The STC resource, containing end sequences, fingerprints, and arrayed BACs, creates a map where the interrelationships of
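
    The tag spacing quoted above follows from simple arithmetic: each BAC contributes two end sequences, so 450,000 BACs yield 900,000 tags. The sketch below is only a rough check; it assumes a haploid human genome size of about 3 billion base pairs, a round figure that is not stated in the abstract, and the function name is illustrative.

```python
# Rough check of the STC tag density quoted in the abstract.
# GENOME_SIZE_BP is an assumed round figure, not taken from the abstract.
GENOME_SIZE_BP = 3_000_000_000
NUM_BACS = 450_000   # number of arrayed BACs, from the abstract
ENDS_PER_BAC = 2     # both ends of each BAC insert are sequenced


def mean_tag_spacing_kb(genome_bp: int, num_bacs: int, ends_per_bac: int = 2) -> float:
    """Average distance between sequence tags, in kilobases."""
    num_tags = num_bacs * ends_per_bac
    return genome_bp / num_tags / 1000


spacing_kb = mean_tag_spacing_kb(GENOME_SIZE_BP, NUM_BACS, ENDS_PER_BAC)
print(f"{spacing_kb:.1f} kb between tags")  # prints "3.3 kb between tags"
```

    Under that assumed genome size the result agrees with the "every 3.3 kilobases" figure in the abstract.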

  17. Complete mitochondrial genome sequence of Cheirotonus jansoni (Coleoptera: Scarabaeidae).

    PubMed

    Shao, L L; Huang, D Y; Sun, X Y; Hao, J S; Cheng, C H; Zhang, W; Yang, Q

    2014-01-01

    We sequenced the complete mitochondrial genome (mitogenome) of Cheirotonus jansoni (Coleoptera: Scarabaeidae), an endangered insect species from Southeast Asia. This long-legged scarab is widely collected and reared for sale, although it is rare and protected in the wild. The circular genome is 17,249 bp long and contains a typical gene complement: 13 protein-coding genes, 2 rRNA genes, 22 putative tRNA genes, and a non-coding AT-rich region. Its gene order and arrangement are identical to the common type found in most insect mitogenomes. As with all other sequenced coleopteran species, a 5-bp long TAGTA motif was detected in the intergenic space sequence located between trnS(UCN) and nad1. The atypical cox1 start codon is AAC, and the putative initiation codon for the atp8 gene appears to be GTC, instead of the frequently found ATN. By sequence comparison, the 2590-bp long non-coding AT-rich region is the second longest among the coleopterans, with two tandem repeat regions: one is 10 copies of an 88-bp sequence and the other is 2 copies of a 153-bp sequence. Additionally, the A+T content (64%) of the 13 protein-coding genes is the lowest among all sequenced coleopteran species. This newly sequenced genome aids in our understanding of the comparative biology of the mitogenomes of coleopteran species and supplies important data for the conservation of this species. PMID:24634126

  18. Whole genome sequencing in clinical and public health microbiology

    PubMed Central

    Kwong, J. C.; McCallum, N.; Sintchenko, V.; Howden, B. P.

    2015-01-01

    Summary Genomics and whole genome sequencing (WGS) have the capacity to greatly enhance knowledge and understanding of infectious diseases and clinical microbiology. The growth and availability of bench-top WGS analysers has facilitated the feasibility of genomics in clinical and public health microbiology. Given current resource and infrastructure limitations, WGS is most applicable to use in public health laboratories, reference laboratories, and hospital infection control-affiliated laboratories. As WGS represents the pinnacle for strain characterisation and epidemiological analyses, it is likely to replace traditional typing methods, resistance gene detection and other sequence-based investigations (e.g., 16S rDNA PCR) in the near future. Although genomic technologies are rapidly evolving, widespread implementation in clinical and public health microbiology laboratories is limited by the need for effective semi-automated pipelines, standardised quality control and data interpretation, bioinformatics expertise, and infrastructure. PMID:25730631

  19. Complete genome sequence of Desulfotomaculum acetoxidans type strain (5575T)

    SciTech Connect

    Spring, Stefan [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Schroder, Maren [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Gleim, Dorothea [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Sims, David [Los Alamos National Laboratory (LANL); Meincke, Linda [Los Alamos National Laboratory (LANL); Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Chen, Feng [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL); Saunders, Elizabeth H [Los Alamos National Laboratory (LANL); Brettin, Tom [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. 
Department of Energy, Joint Genome Institute; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Han, Cliff [Los Alamos National Laboratory (LANL)

    2009-01-01

    Desulfotomaculum acetoxidans Widdel and Pfennig 1977 was one of the first sulfate-reducing bacteria known to grow with acetate as sole energy and carbon source. It is able to oxidize substrates completely to carbon dioxide with sulfate as the electron acceptor, which is reduced to hydrogen sulfide. All available data about this species are based on strain 5575T, isolated from piggery waste in Germany. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a Desulfotomaculum species with validly published name. The 4,545,624 bp long single replicon genome with its 4370 protein-coding and 100 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  20. Complete genome sequence of Alicyclobacillus acidocaldarius type strain (104-IAT)

    PubMed Central

    Mavromatis, Konstantinos; Sikorski, Johannes; Lapidus, Alla; Glavina Del Rio, Tijana; Copeland, Alex; Tice, Hope; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Nolan, Matt; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D.; Chain, Patrick; Meincke, Linda; Sims, David; Chertkov, Olga; Han, Cliff; Brettin, Thomas; Detter, John C.; Wahrenburg, Claudia; Rohde, Manfred; Pukall, Rüdiger; Göker, Markus; Bristow, Jim; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C.

    2010-01-01

    Alicyclobacillus acidocaldarius (Darland and Brock 1971) is the type species of the larger of the two genera in the bacillal family ‘Alicyclobacillaceae’. A. acidocaldarius is a free-living and non-pathogenic organism, but may also be associated with food and fruit spoilage. Due to its acidophilic nature, several enzymes from this species have since long been subjected to detailed molecular and biochemical studies. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the family ‘Alicyclobacillaceae’. The 3,205,686 bp long genome (chromosome and three plasmids) with its 3,153 protein-coding and 82 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304673

  1. Complete genome sequence of Alicyclobacillus acidocaldarius type strain (104-IA).

    PubMed

    Mavromatis, Konstantinos; Sikorski, Johannes; Lapidus, Alla; Glavina Del Rio, Tijana; Copeland, Alex; Tice, Hope; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Nolan, Matt; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Chain, Patrick; Meincke, Linda; Sims, David; Chertkov, Olga; Han, Cliff; Brettin, Thomas; Detter, John C; Wahrenburg, Claudia; Rohde, Manfred; Pukall, Rüdiger; Göker, Markus; Bristow, Jim; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C

    2010-01-01

    Alicyclobacillus acidocaldarius (Darland and Brock 1971) is the type species of the larger of the two genera in the bacillal family 'Alicyclobacillaceae'. A. acidocaldarius is a free-living and non-pathogenic organism, but may also be associated with food and fruit spoilage. Due to its acidophilic nature, several enzymes from this species have since long been subjected to detailed molecular and biochemical studies. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the family 'Alicyclobacillaceae'. The 3,205,686 bp long genome (chromosome and three plasmids) with its 3,153 protein-coding and 82 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304673

  2. Whole genome sequencing in clinical and public health microbiology.

    PubMed

    Kwong, J C; McCallum, N; Sintchenko, V; Howden, B P

    2015-04-01

    Genomics and whole genome sequencing (WGS) have the capacity to greatly enhance knowledge and understanding of infectious diseases and clinical microbiology. The growth and availability of bench-top WGS analysers has facilitated the feasibility of genomics in clinical and public health microbiology. Given current resource and infrastructure limitations, WGS is most applicable to use in public health laboratories, reference laboratories, and hospital infection control-affiliated laboratories. As WGS represents the pinnacle for strain characterisation and epidemiological analyses, it is likely to replace traditional typing methods, resistance gene detection and other sequence-based investigations (e.g., 16S rDNA PCR) in the near future. Although genomic technologies are rapidly evolving, widespread implementation in clinical and public health microbiology laboratories is limited by the need for effective semi-automated pipelines, standardised quality control and data interpretation, bioinformatics expertise, and infrastructure. PMID:25730631

  3. Complete genome sequence of Halogeometricum borinquense type strain (PR3).

    PubMed

    Malfatti, Stephanie; Tindall, Brian J; Schneider, Susanne; Fähnrich, Regine; Lapidus, Alla; Labuttii, Kurt; Copeland, Alex; Glavina Del Rio, Tijana; Nolan, Matt; Chen, Feng; Lucas, Susan; Tice, Hope; Cheng, Jan-Fang; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Anderson, Iain; Pati, Amrita; Ivanova, Natalia; Mavromatis, Konstantinos; Chen, Amy; Palaniappan, Krishna; D'haeseleer, Patrik; Göker, Markus; Bristow, Jim; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Chain, Patrick

    2009-01-01

    Halogeometricum borinquense Montalvo-Rodríguez et al. 1998 is the type species of the genus, and is of phylogenetic interest because of its distinct location between the halobacterial genera Haloquadratum and Halosarcina. H. borinquense requires extremely high salt (NaCl) concentrations for growth. It can not only grow aerobically but also anaerobically using nitrate as electron acceptor. The strain described in this report is a free-living, motile, pleomorphic, euryarchaeon, which was originally isolated from the solar salterns of Cabo Rojo, Puerto Rico. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of the halobacterial genus Halogeometricum, and this 3,944,467 bp long six replicon genome with its 3937 protein-coding and 57 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304651

  4. Complete genome sequence of Arthrobacter sp. strain FB24

    SciTech Connect

    Nakatsu, C. H.; Barabote, Ravi; Thompson, Sue; Bruce, David; Detter, Chris; Brettin, T.; Han, Cliff F.; Beasley, Federico; Chen, Weimin; Konopka, Allan; Xie, Gary

    2013-09-30

    Arthrobacter sp. strain FB24 is a species in the genus Arthrobacter Conn and Dimmick 1947, in the family Micrococcaceae and class Actinobacteria. A number of Arthrobacter genome sequences have been completed because of their important role in soil, especially bioremediation. This isolate is of special interest because it is tolerant to multiple metals and it is extremely resistant to elevated concentrations of chromate. The genome consists of a 4,698,945 bp circular chromosome and three plasmids (96,488, 115,507, and 159,536 bp, a total of 5,070,478 bp), coding 4,536 proteins of which 1,257 are without known function. This genome was sequenced as part of the DOE Joint Genome Institute Program.

  5. Complete genome sequence of Klebsiella pneumoniae phage JD001.

    PubMed

    Cui, Zelin; Shen, Wenbin; Wang, Zheng; Zhang, Haotian; Me, Rao; Wang, Yanchun; Zeng, Lingbin; Zhu, Yongzhang; Qin, Jinhong; He, Ping; Guo, Xiaokui

    2012-12-01

    Klebsiella pneumoniae is a member of the family Enterobacteriaceae, opportunistic pathogens that are among the eight most prevalent infectious agents in hospitals. The emergence of multidrug-resistant strains of K. pneumoniae has become a public health problem globally. To develop an effective antimicrobial agent, we isolated a bacteriophage, named JD001, from seawater and sequenced its genome. Comparative genome analysis of phage JD001 with other K. pneumoniae bacteriophages revealed that phage JD001 has little similarity to previously published K. pneumoniae phages KP15, KP32, KP34, and phiKO2. Here we announce the complete genome sequence of JD001 and report major findings from the genomic analysis. PMID:23166250

  6. Complete genome sequence of Alicyclobacillus acidocaldarius type strain (104-IAT)

    SciTech Connect

    Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Chen, Feng [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL); Meincke, Linda [Los Alamos National Laboratory (LANL); Sims, David [Los Alamos National Laboratory (LANL); Chertkov, Olga [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Brettin, Tom [Los Alamos National Laboratory (LANL); Detter, J C [U.S. 
Department of Energy, Joint Genome Institute; Wahrenburg, Claudia [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Pukall, Rudiger [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute

    2010-01-01

    Alicyclobacillus acidocaldarius (Darland and Brock 1971) is the type species of the larger of the two genera in the bacillal family Alicyclobacillaceae. A. acidocaldarius is a free-living and non-pathogenic organism, but may also be associated with food and fruit spoilage. Due to its acidophilic nature, several enzymes from this species have since long been subjected to detailed molecular and biochemical studies. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the family Alicyclobacillaceae. The 3,205,686 bp long genome (chromosome and three plasmids) with its 3,153 protein-coding and 82 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  7. Sequencing and analysis of an Irish human genome

    PubMed Central

    2010-01-01

    Background Recent studies generating complete human sequences from Asian, African and European subgroups have revealed population-specific variation and disease susceptibility loci. Here, choosing a DNA sample from a population of interest due to its relative geographical isolation and genetic impact on further populations, we extend the above studies through the generation of 11-fold coverage of the first Irish human genome sequence. Results Using sequence data from a branch of the European ancestral tree as yet unsequenced, we identify variants that may be specific to this population. Through comparisons with HapMap and previous genetic association studies, we identified novel disease-associated variants, including a novel nonsense variant putatively associated with inflammatory bowel disease. We describe a novel method for improving SNP calling accuracy at low genome coverage using haplotype information. This analysis has implications for future re-sequencing studies and validates the imputation of Irish haplotypes using data from the current Human Genome Diversity Cell Line Panel (HGDP-CEPH). Finally, we identify gene duplication events as constituting significant targets of recent positive selection in the human lineage. Conclusions Our findings show that there remains utility in generating whole genome sequences to illustrate both general principles and reveal specific instances of human biology. With increasing access to low cost sequencing we would predict that even armed with the resources of a small research group a number of similar initiatives geared towards answering specific biological questions will emerge. PMID:20822512

  8. The International Pea Genome Sequencing Project: Sequencing and Assembly Progresses Updates

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The International Consortium for the Pea Genome Sequencing (ICPG) includes scientists from six countries around the world. Its aim is to provide a high quality reference of the pea genome to the scientific community as well as to the pea breeder community. The consortium proposed a strategy that int...

  9. Whole-genome sequencing identifies a recurrent functional synonymous mutation in melanoma.

    PubMed

    Gartner, Jared J; Parker, Stephen C J; Prickett, Todd D; Dutton-Regester, Ken; Stitzel, Michael L; Lin, Jimmy C; Davis, Sean; Simhadri, Vijaya L; Jha, Sujata; Katagiri, Nobuko; Gotea, Valer; Teer, Jamie K; Wei, Xiaomu; Morken, Mario A; Bhanot, Umesh K; Chen, Guo; Elnitski, Laura L; Davies, Michael A; Gershenwald, Jeffrey E; Carter, Hannah; Karchin, Rachel; Robinson, William; Robinson, Steven; Rosenberg, Steven A; Collins, Francis S; Parmigiani, Giovanni; Komar, Anton A; Kimchi-Sarfaty, Chava; Hayward, Nicholas K; Margulies, Elliott H; Samuels, Yardena

    2013-08-13

    Synonymous mutations, which do not alter the protein sequence, have been shown to affect protein function [Sauna ZE, Kimchi-Sarfaty C (2011) Nat Rev Genet 12(10):683-691]. However, synonymous mutations are rarely investigated in the cancer genomics field. We used whole-genome and -exome sequencing to identify somatic mutations in 29 melanoma samples. Validation of one synonymous somatic mutation in BCL2L12 in 285 samples identified 12 cases that harbored the recurrent F17F mutation. This mutation led to increased BCL2L12 mRNA and protein levels because of differential targeting of WT and mutant BCL2L12 by hsa-miR-671-5p. Protein made from mutant BCL2L12 transcript bound p53, inhibited UV-induced apoptosis more efficiently than WT BCL2L12, and reduced endogenous p53 target gene transcription. This report shows selection of a recurrent somatic synonymous mutation in cancer. Our data indicate that silent alterations have a role to play in human cancer, emphasizing the importance of their investigation in future cancer genome studies. PMID:23901115

  10. Genome Sequences of Mycobacteriophages Luchador and Nerujay.

    PubMed

    Pope, Welkin H; Ahmed, Taha; Drobitch, Marissa K; Early, David R; Eljamri, Soukaina; Kasturiarachi, Naomi S; Klonicki, Emily F; Manjooran, Daniel T; Ní Chochlain, Aífe N; Puglionesi, Andrew O; Rajakumar, Vinod; Shindle, Katherine A; Tran, Mai T; Brown, Bryony R; Churilla, Bryce M; Cohen, Karen L; Wilkes, Kellyn E; Grubb, Sarah R; Warner, Marcie H; Bowman, Charles A; Russell, Daniel A; Hatfull, Graham F

    2015-01-01

    Luchador and Nerujay are two newly isolated mycobacteriophages recovered from soil samples using Mycobacterium smegmatis. Their genomes are 53,387 bp and 53,455 bp long and have 96 and 97 predicted open reading frames, respectively. Nerujay is related to subcluster A1 phages, and Luchador represents a new subcluster, A14. PMID:26089414

  11. Genome Sequences of Mycobacteriophages Luchador and Nerujay

    PubMed Central

    Ahmed, Taha; Drobitch, Marissa K.; Early, David R.; Eljamri, Soukaina; Kasturiarachi, Naomi S.; Klonicki, Emily F.; Manjooran, Daniel T.; Ní Chochlain, Aífe N.; Puglionesi, Andrew O.; Rajakumar, Vinod; Shindle, Katherine A.; Tran, Mai T.; Brown, Bryony R.; Churilla, Bryce M.; Cohen, Karen L.; Wilkes, Kellyn E.; Grubb, Sarah R.; Warner, Marcie H.; Bowman, Charles A.; Russell, Daniel A.; Hatfull, Graham F.

    2015-01-01

    Luchador and Nerujay are two newly isolated mycobacteriophages recovered from soil samples using Mycobacterium smegmatis. Their genomes are 53,387 bp and 53,455 bp long and have 96 and 97 predicted open reading frames, respectively. Nerujay is related to subcluster A1 phages, and Luchador represents a new subcluster, A14.

  12. Genome Sequence of Sinorhizobium meliloti Rm41

    PubMed Central

    Weidner, Stefan; Baumgarth, Birgit; Göttfert, Michael; Jaenicke, Sebastian; Pühler, Alfred; Schneiker-Bekel, Susanne; Serrania, Javier; Szczepanowski, Rafael

    2013-01-01

    Sinorhizobium meliloti Rm41 nodulates alfalfa plants, forming indeterminate type nodules. It is characterized by a strain-specific K-antigen able to replace exopolysaccharides in promotion of nodule invasion. We present the Rm41 genome, composed of one chromosome, the chromid pSymB, the megaplasmid pSymA, and the nonsymbiotic plasmid pRme41a. PMID:23405285

  13. Diverse mechanisms of somatic structural variations in human cancer genomes

    PubMed Central

    Yang, Lixing; Luquette, Lovelace J.; Gehlenborg, Nils; Xi, Ruibin; Haseley, Psalm S.; Hsieh, Chih-Heng; Zhang, Chengsheng; Ren, Xiaojia; Protopopov, Alexei; Chin, Lynda; Kucherlapati, Raju; Lee, Charles; Park, Peter J.

    2013-01-01

    Summary Identification of somatic rearrangements in cancer genomes has accelerated through analysis of high-throughput sequencing data. However, characterization of complex structural alterations and their underlying mechanisms remains inadequate. Here, applying an algorithm to predict structural variations from short reads, we report a comprehensive catalog of somatic structural variations and the mechanisms generating them, using high-coverage whole-genome sequencing data from 140 patients across ten tumor types. We characterize the relative contributions of different types of rearrangements and their mutational mechanisms, find that ~20% of the somatic deletions are complex deletions formed by replication errors, and describe the differences between the mutational mechanisms in somatic and germline alterations. Importantly, we provide detailed reconstructions of the events responsible for loss of CDKN2A/B and gain of EGFR in glioblastoma, revealing that these alterations can result from multiple mechanisms even in a single genome and that both DNA double-strand breaks and replication errors drive somatic rearrangements. PMID:23663786

  14. Personalized genomic analyses for cancer mutation discovery and interpretation

    PubMed Central

    Jones, Siân; Anagnostou, Valsamo; Lytle, Karli; Parpart-Li, Sonya; Nesselbush, Monica; Riley, David R.; Shukla, Manish; Chesnick, Bryan; Kadan, Maura; Papp, Eniko; Galens, Kevin G.; Murphy, Derek; Zhang, Theresa; Kann, Lisa; Sausen, Mark; Angiuoli, Samuel V.; Diaz, Luis A.; Velculescu, Victor E.

    2015-01-01

    Massively parallel sequencing approaches are beginning to be used clinically to characterize individual patient tumors and to select therapies based on the identified mutations. A major question in these analyses is the extent to which these methods identify clinically actionable alterations and whether the examination of the tumor tissue alone is sufficient or whether matched normal DNA should also be analyzed to accurately identify tumor-specific (somatic) alterations. To address these issues, we comprehensively evaluated 815 tumor-normal paired samples from patients of 15 tumor types. We identified genomic alterations using next-generation sequencing of whole exomes or 111 targeted genes that were validated with sensitivities >95% and >99%, respectively, and specificities >99.99%. These analyses revealed an average of 140 and 4.3 somatic mutations per exome and targeted analysis, respectively. More than 75% of cases had somatic alterations in genes associated with known therapies or current clinical trials. Analyses of matched normal DNA identified germline alterations in cancer-predisposing genes in 3% of patients with apparently sporadic cancers. In contrast, a tumor-only sequencing approach could not definitively identify germline changes in cancer-predisposing genes and led to additional false-positive findings comprising 31% and 65% of alterations identified in targeted and exome analyses, respectively, including in potentially actionable genes. These data suggest that matched tumor-normal sequencing analyses are essential for precise identification and interpretation of somatic and germline alterations and have important implications for the diagnostic and therapeutic management of cancer patients. PMID:25877891
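
    The validation figures above rest on standard definitions of sensitivity, specificity, and the share of reported calls that are false positives. The sketch below spells those definitions out; the counts are hypothetical, chosen only for illustration, and are not data from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real alterations that the assay detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of unaltered positions correctly reported as unaltered."""
    return true_neg / (true_neg + false_pos)


def false_positive_share(true_pos: int, false_pos: int) -> float:
    """Fraction of all reported alterations that are false positives."""
    return false_pos / (true_pos + false_pos)


# Hypothetical counts for illustration only (not taken from the study).
tp, fn, tn, fp = 96, 4, 99_995, 5

print(f"sensitivity: {sensitivity(tp, fn):.2%}")
print(f"specificity: {specificity(tn, fp):.3%}")
print(f"false-positive share of calls: {false_positive_share(tp, fp):.2%}")
```

    The last function corresponds to the kind of quantity the abstract reports for tumor-only analysis, where false positives are expressed as a percentage of all alterations identified.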

  15. Contribution to Sequencing of the Deinococcus radiodurans Genome

    SciTech Connect

    Minton, K.W.

    1999-03-11

    The stated goal of this project was to supply The Institute for Genomic Research (TIGR) with pure DNA from the bacterium Deinococcus radiodurans R1 for purposes of complete genomic sequencing by TIGR. We subsequently decided to expand this project to include a second goal; this second goal was the development of a NotI chromosomal map of D. radiodurans R1 using Pulsed Field Gel Electrophoresis (PFGE).

  16. Prediction of probable genes by Fourier analysis of genomic sequences

    Microsoft Academic Search

    Shrish Tiwari; S. Ramachandran; Alok Bhattacharya; Sudha Bhattacharya; Ramakrishna Ramaswamy

    1997-01-01

    Motivation: The major signal in coding regions of genomic sequences is a three-base periodicity. Our aim is to use Fourier techniques to analyse this periodicity, and thereby to develop a tool to recognize coding regions in genomic DNA. Result: The three-base periodicity in the nucleotide arrangement is evidenced as a sharp peak at frequency f = 1/3 in the Fourier
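
    The three-base periodicity described in this record can be illustrated with a minimal sketch. It assumes the common indicator-sequence formulation (one 0/1 sequence per base, with the squared Fourier magnitudes summed over A, C, G and T); the record does not confirm that this is the authors' exact procedure, and the function name `spectral_power` is hypothetical.

```python
import cmath


def spectral_power(seq, freq):
    """Total Fourier power of a DNA sequence at a given frequency.

    Assumption (not taken from the record): each base is turned into a
    0/1 indicator sequence, and the squared magnitudes of the four
    Fourier coefficients are summed.
    """
    seq = seq.upper()
    total = 0.0
    for base in "ACGT":
        # Fourier coefficient of the indicator sequence for this base.
        coeff = sum(
            cmath.exp(-2j * cmath.pi * freq * n)
            for n, ch in enumerate(seq)
            if ch == base
        )
        total += abs(coeff) ** 2
    return total


if __name__ == "__main__":
    periodic = "ATG" * 30   # toy sequence with an exact period of three
    uniform = "A" * 90      # toy sequence with no period-three structure
    for name, s in (("periodic", periodic), ("uniform", uniform)):
        print(name, spectral_power(s, 1 / 3), spectral_power(s, 1 / 4))
```

    In this toy case the periodic sequence carries far more power at f = 1/3 than at f = 1/4, while the uniform sequence carries essentially none at f = 1/3. A practical coding-region detector would evaluate such a measure in sliding windows along real genomic DNA, which this sketch does not attempt.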

  17. Genome Sequence of Corynebacterium ulcerans Strain FRC11

    PubMed Central

    Benevides, Leandro de Jesus; Viana, Marcus Vinicius Canário; Mariano, Diego César Batista; Rocha, Flávia de Souza; Bagano, Priscilla Carolinne; Folador, Edson Luiz; Pereira, Felipe Luiz; Dorella, Fernanda Alves; Leal, Carlos Augusto Gomes; Carvalho, Alex Fiorini; Soares, Siomar de Castro; Carneiro, Adriana; Ramos, Rommel; Badell-Ocando, Edgar; Guiso, Nicole; Silva, Artur; Figueiredo, Henrique; Guimarães, Luis Carlos

    2015-01-01

    Here, we present the genome sequence of Corynebacterium ulcerans strain FRC11. The genome includes one circular chromosome of 2,442,826 bp (53.35% G+C content), and 2,210 genes were predicted, 2,146 of which are putative protein-coding genes, with 12 rRNAs and 51 tRNAs; 1 pseudogene was also identified. PMID:25767241

  18. Structure, sequence and expression of the hepatitis delta (δ) viral genome

    NASA Astrophysics Data System (ADS)

    Wang, Kang-Sheng; Choo, Qui-Lim; Weiner, Amy J.; Ou, Jing-Hsiung; Najarian, Richard C.; Thayer, Richard M.; Mullenbach, Guy T.; Denniston, Katherine J.; Gerin, John L.; Houghton, Michael

    1986-10-01

    Biochemical and electron microscopic data indicate that the human hepatitis δ viral agent contains a covalently closed circular and single-stranded RNA genome that has certain similarities with viroid-like agents from plants. The sequence of the viral genome (1,678 nucleotides) has been determined and an open reading frame within the complementary strand has been shown to encode an antigen that binds specifically to antisera from patients with chronic hepatitis δ viral infections.

  19. Mitochondrial genome sequence of the bluegill sunfish (Lepomis macrochirus).

    PubMed

    Li, Sheng-Jie; Cai, Lei; Bai, Jun-Jie

    2011-10-01

    The bluegill sunfish (Lepomis macrochirus) belongs to the genus Lepomis of the family Centrarchidae and is an economically important freshwater species in China. This study presents the complete mitochondrial genome of L. macrochirus, which is the first complete sequence from a sunfish species. L. macrochirus mitochondrial DNA is 16,489 bp long, with the genome organization and gene order being identical to that of the typical vertebrate. PMID:22165836

  20. Complete genome sequence of the fish pathogen Flavobacterium psychrophilum

    Microsoft Academic Search

    Mekki Boussaha; Valentin Loux; Jean-François Bernardet; Christian Michel; Brigitte Kerouault; Stanislas Mondot; Pierre Nicolas; Robert Bossy; Christophe Caron; Philippe Bessières; Jean-François Gibrat; Stéphane Claverol; Fabien Dumetz; Michel Le Hénaff; Abdenour Benmansour; Eric Duchaud

    2007-01-01

    We report here the complete genome sequence of the virulent strain JIP02/86 (ATCC 49511) of Flavobacterium psychrophilum, a widely distributed pathogen of wild and cultured salmonid fish. The genome consists of a 2,861,988–base pair (bp) circular chromosome with 2,432 predicted protein-coding genes. Among these predicted proteins, stress response mediators, gliding motility proteins, adhesins and many putative secreted proteases are probably